The Speech Technology Group wins the Albayzin 2012 competition
The competition challenged international research groups to test their language recognition algorithms. A total of 7 research groups eventually took part, from the following countries: Spain (3), Portugal (1), France (1) and China (2).
Ricardo Cordoba and Luis Fernando D'Haro, who teach at the School of Telecommunications of the Polytechnic University of Madrid, represented the Speech Technology Group and won the international language recognition competition ALBAYZIN 2012, organized by the Thematic Network on Speech Technologies as part of the activities of the international conference IberSpeech 2012.
As for the details of the competition, the organizers chose as the application domain audio clips taken from videos downloaded from a popular website. The objective was to simulate a real application in which the output of the recognition system is used to index multimedia audio content on the Internet or for machine translation. The files were also heterogeneous in duration, channel and noise conditions, number of speakers, background music, etc. Finally, as a general restriction, participants could only use the audio files provided by the organizers.
The evaluation considered two conditions: Plenty and Empty. In the Plenty condition, data were provided for both training and system development, and the systems had to recognize the following languages: Spanish, Catalan, Basque, Galician, Portuguese and English.
In the Empty condition, by contrast, only a small data set was provided, for development. This condition was designed to represent cases in which the languages to be recognized have limited resources or little available data. The goal here was to design robust algorithms to recognize the following languages: French, German, Greek and Italian.
For both conditions, two variants were also considered: one in which the audio fragment belonged to one of the proposed languages (closed), and one in which it could also belong to a language other than those initially proposed (open).
In the final results, GTH was the winner both in the primary condition of the evaluation, plenty-closed, and in the plenty-open condition.
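The difference between the closed and open variants can be illustrated with a minimal sketch. It is not the GTH system or the organizers' scoring method: it simply assumes that a recognizer has already produced one score per proposed language, and the function names, the "other" label and the threshold value are all hypothetical.

```python
from typing import Dict

# Languages of the Plenty condition, as listed in the article.
PLENTY_LANGUAGES = ["Spanish", "Catalan", "Basque", "Galician", "Portuguese", "English"]

# Hypothetical label for a language outside the proposed set.
OUT_OF_SET = "other"


def identify_closed(scores: Dict[str, float]) -> str:
    """Closed variant: always answer with the best-scoring proposed language."""
    return max(scores, key=lambda lang: scores[lang])


def identify_open(scores: Dict[str, float], threshold: float) -> str:
    """Open variant: answer "other" when even the best score is below the threshold."""
    best = identify_closed(scores)
    return best if scores[best] >= threshold else OUT_OF_SET


# Made-up scores for a single audio clip (assumption, for illustration only).
example_scores = {lang: 0.1 for lang in PLENTY_LANGUAGES}
example_scores["Basque"] = 0.4

print(identify_closed(example_scores))      # best proposed language
print(identify_open(example_scores, 0.5))   # rejected as out of set
```

In the closed variant the system must always pick one of the proposed languages, while in the open variant it may also reject the clip; a simple score threshold is only one of several ways to make that rejection decision.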
The GTH and speech recognition
Today, as never before in history, we live in what we call the "Global Village". Recent advances in mobile devices, social networks and cable TV have accelerated the way we communicate and access information available on the web. However, these advances have created a new series of major technological challenges, among them the need for systems for automatic language and speaker identification, speech recognition and machine translation.
For example, if we find a video on the net in which people speak another language and we want to watch it with subtitles in Spanish, all these technologies must be combined so that they work quickly, accurately and with minimal errors, and can therefore be used in a real system. Every component matters in this regard, but the language identification module matters most, since its failure causes all subsequent modules, and therefore the service, to fail.
For over 30 years, the Speech Technology Group (GTH) of the Higher Technical School of Telecommunication Engineering at the Polytechnic University of Madrid has worked in different areas of speech processing, including speech recognition, to which it has been able to contribute important advances and technological developments, proposing innovative techniques that reduce the problems mentioned above and improve current services.