Project Details

Kvalitativní posun v automatickém rozpoznávání jazyků s využitím streamovaných audio-médií

Project Period: 19. 1. 2006 – 19. 7. 2007

Project Type: grant

Code: 162/2005

English title

Advancing the automatic language recognition using streamed audio media

Keywords

speech processing, language identification, parallel computing, unsupervised
acquisition of speech data, streaming

Abstract

The project aims to use streamed audio on a massive scale to achieve a qualitative
improvement in the accuracy of automatic language identification (LID). The
speech processing research group at the Faculty of Information Technology, Brno
University of Technology (Speech@FIT) has a state-of-the-art LID system
based on acoustic and phonotactic modeling. For further improvement of its
accuracy, it is crucial to gather large amounts of language-specific data. Within
this project, such data will be collected from available streamed
sources (Internet radio stations), stored on-line, parameterized and processed. We will
develop software for training LID models. The resulting models and algorithms will
be evaluated in international evaluation campaigns organized by NIST and in
cooperation with Czech law enforcement agencies.

Team members

Černocký Jan, prof. Dr. Ing. (DCGM) – research leader
Kašpárek Tomáš, Ing., Ph.D. (CVT)
Matějka Pavel, Ing., Ph.D.
Schwarz Petr, Ing., Ph.D. (DCGM)