À Punt and MLLP (Polytechnic University of Valencia): the definitive automated transcription project?
Automated transcription may be the solution the broadcast industry needs to make 100% of its audiovisual content accessible. But can the necessary reliability, minimal latency, and stability be achieved while preserving the indispensable work of transcribers?
Vicente Fuster, head of the IT team at À Punt, recounts the latest major development promoted by the Valencian corporation: a transcription system for live and pre-recorded content, developed through an agreement with the Polytechnic University of Valencia (UPV).
Law 7/2010 of March 31 (the General Law on Audiovisual Communication, which is due to be renewed in the near future) requires television broadcasters to subtitle at least 75% of their programs and to provide two hours a week of sign language interpretation. It is quite a challenge. Although the accessibility of pre-recorded content can be addressed when enough time is available, live transcription still struggles to achieve the desired near-zero latency with a minimal margin of error. Both human transcribers and digital tools are currently used, but neither is perfect: transcribers offer much more reliable results but drift away from real time, while automated systems reduce latency but fail badly on regional languages such as Valencian, Galician, or Basque.
The À Punt team, led by Vicente Fuster and Pau Peiró under the direction of Higinio Año Sanz (director of the technology department), has struck the difficult balance between both worlds. Thanks to an agreement with the UPV, and after extensive development work in both engineering and lexical training with the channel's media and the Valencian Language Academy (AVL), the team has built what is possibly the most solid system available for transcribing a Spanish co-official language. It combines a fast and efficient AI, a light and functional integration with the broadcast systems, and the essential work of the transcription team, who correct the system's first output and perform respeaking to optimize the tool's operation.
Fuster delves into the keys to this solution, which has also had the special involvement of Alfred Costa Folgado, director of the Societat Anònima de Mitjans de Comunicació (SAMC); Empar Marco Estellés, former director of the SAMC; Mar Iglesias, acting president of the Valencian Corporation of Audiovisual Media (CVMA); Enrique Soriano Hernandez, former director of the Valencian Corporation of Audiovisual Media; and the whole team from the Polytechnic University of Valencia, led by Alfonso Juan Ciscar, professor at the UPV and director of the MLLP research group (VRAIN institute).
The context of À Punt… and Spanish television
The automated transcription and subtitling initiative launched by À Punt stems from the corporation's public service duty to keep promoting the accessibility of its content. The objective is clear: to bring every format closer to every viewer in the Valencian Community through different accessibility tools, whether audio description, subtitling, or sign language interpretation.
The development takes place in a complex global context, in which major players such as Google or Microsoft put hardly any effort into building systems capable of speech-to-text for languages like Valencian, Galician, or Basque. Moreover, given their global service strategy, these players rely on cloud or hybrid offerings; solutions that, either because the technology is not ready or because of the distance to the processing centers, do not achieve the desired immediacy.
The needs of the project were clear. It was essential to have a tool with a high success rate in Valencian (adapting to the lexical particularities that differentiate the language from Catalan) that could be integrated into the channel's entire workflow. All with the objective of subtitling audiovisual content in the shortest possible time, in a computer-assisted way, in real time, and based on artificial intelligence. The opportunity, as Fuster relates, came from a research group at the Polytechnic University of Valencia.
The first steps
À Punt continuously monitors the technology landscape, looking for tools that could optimize its processes. However, as Fuster admits, the technology is often not ready. There are developments (small initiatives from the academic world or alpha versions driven by technology companies) that address certain areas, but these do not usually come close to the minimums required by a broadcaster with the reach of À Punt.
Through various contacts, Fuster discovered that the Polytechnic University of Valencia was promoting numerous research projects, especially in the field of artificial intelligence, that could be applied (after adaptation) to the world of television. With the support of the channel's management, Fuster organized an open day at À Punt where different research groups could present their latest developments: transcoding server farms, sentiment analysis, use of AI to anticipate technical problems... Multiple lines of research were put on the table that could bring new dimensions to the broadcaster's technical operations. However, one group stood out above all: MLLP, with its work on automated transcription.
After nine months of dialogue, paperwork, and administrative hurdles, À Punt was able to sign an agreement with the UPV. Under it, both institutions commit to adapting the transcription system to television operations, and to continue researching, developing, and refining the tool.
A successful development in just one year
After those nine months, the agreement was ready to be signed in February 2020. Then Covid-19 arrived in Spain. The pandemic delayed the signing and implementation of the alliance, pushing the launch of the initiative to October 2020. The agreement, signed for two years with the option of a two-year extension, sets out a series of objectives. A year later, many of them have already been met. Among them is the integration and commissioning of the system: at the end of October the system began operating on all live broadcasts, and since mid-November it has been extended to pre-recorded programs.
Adapting the UPV solution, despite the talent of both teams, has not been simple. The teams have automated numerous processes, interconnected all the broadcast and production systems with the automated transcription system, and integrated the tool with the live API of Fingertext, from Anglatècnic (an À Punt supplier). The system has also been adapted to the needs of each program. As Fuster recalls, “it is not the same to produce a news program, in which standardized language is used and a script is available, as a live show with the added complexity that you may be speaking in Valencian and be answered in Spanish.” The system, flexible and constantly evolving, continues to extend its reach as the weeks go by.
Improving the daily life of workers
One of the keys to the automated transcription project is that it was never designed to replace À Punt's team of transcribers. On the contrary, as Fuster emphasizes, the objective was to improve their performance, so that transcribers work “with less pressure and achieving a better result.”
Transcribers therefore perform tasks closely related to the work they did before: they correct the transcription proposed by the system, adapt elements to make them compatible with the subtitling system, supervise the transcription of pre-recorded content, or perform respeaking: live voice-overs that repeat what the speakers on screen have said, used in various formats. Because the system is trained to understand the transcribers' voices, and because they speak in an isolated and controlled space, it can respond with greater fidelity in certain contexts.
Fuster highlights that the transcription team has expressed its satisfaction with this methodology. Furthermore, he considers that the previous model was not sustainable if À Punt is to meet one of its goals: subtitling 100% of its programming. “We could achieve this goal if we hired a hundred specialized transcribers, but the money we have is what it is and we cannot do it. What we have to do is increase the productivity of the processes. That is why we have applied this system,” says the IT manager at À Punt.
Success and latency
To date, the results are very promising. The success rate of automatic transcription across À Punt's overall programming in Valencian reaches 80% (compared with 55% for Google's systems), a figure that rises to more than 90% (compared with 68% for Google) on the channel's news programs.
Fuster's team and the UPV are working to improve these percentages, although they recognize the merit of the results after just one year of active work on the project.
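The article does not say how the success rate is measured. As a rough illustration only, a common convention in speech recognition is word accuracy, that is, one minus the word error rate (WER), computed by aligning the system output against a human reference transcript. The sketch below assumes that definition; the function names and the sample sentence are hypothetical and are not taken from the À Punt or MLLP system.

```python
def word_edit_distance(reference: list[str], hypothesis: list[str]) -> int:
    """Levenshtein distance counted in words (substitutions, insertions, deletions)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # reference word missing from the output
                current[j - 1] + 1,      # extra word in the output
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy = 1 - WER, clamped at 0 (assumed metric, see lead-in)."""
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    errors = word_edit_distance(ref_words, hyp_words)
    return max(0.0, 1.0 - errors / len(ref_words))


# Hypothetical example: the system drops one word of a seven-word sentence.
reference = "bon dia a tots i a totes"
hypothesis = "bon dia a tots i totes"
print(f"{word_accuracy(reference, hypothesis):.0%}")
```

Under this assumed metric, one dropped word in a seven-word sentence gives roughly 86%, which is the kind of figure that would be averaged over many hours of programming to obtain percentages like those quoted above.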
As for latency, Fuster notes that it stands at 0.8 seconds, a figure he describes as quite an achievement. The IT manager attributes this milestone to the decision to operate on premises, with entirely local servers. These servers, incidentally, can handle up to two live streams at a time “without problem”.
A promising future
The automated transcription solution has only just taken its first steps. The agreement signed between À Punt and the UPV provides for new developments that will continue to bring valuable resources to the broadcaster. Among them are the application of a new artificial intelligence to tag and catalog media in support of the documentation department, and the generation of a tag tree based on the transcriptions so that journalists can easily locate the most suitable piece to complement a report or news item.
Similarly, the À Punt team is working on an avatar that performs sign language based on the speech-to-text output of the transcription system, and on the automated generation of subtitles in Valencian, Spanish, or English from original audio in any of these three languages. This latter application, designed to “enrich educational content”, will take its first steps during 2022.
Beyond these medium-term projects, a further improvement is planned for the beginning of next year: enabling subtitles (supported by the automated transcription solution) in the live signal that À Punt offers through its website and mobile apps.
Will the system reach other broadcasters?
Numerous public broadcasters have already contacted the À Punt team to ask about the automated transcription solution. Once they see its performance, they are as surprised as they are eager to learn more, in order to assess its inclusion in their own workflows.
For now, Fuster does not know what the future of this tool will be outside À Punt. Could it be marketed by the broadcaster itself? Could the university research group distribute it and obtain an economic return from it? The IT manager at À Punt breathes a sigh of relief knowing that “fortunately” that decision is not his to make. However, he acknowledges that he would feel “proud” if the system were used by other broadcasters, since it would validate the intense work carried out by both his team and the UPV.
Likewise, Fuster believes that the success of this tool sends a clear message about the value of regional broadcasters such as À Punt: “In this context, in which some sectors proclaim that investing in regional television is throwing money away, projects like this demonstrate that our media make perfect sense. Thanks to regional television, we are carrying out projects that would never be undertaken by many private initiatives”.
A report by Sergio Julián Gómez