Project Details
Nové směry ve výzkumu a využití hlasových technologií
Project Period: 1. 1. 2005 – 31. 12. 2007
Project Type: grant
Code: GA102/05/0278
Agency: Czech Science Foundation
Program: Standardní projekty
voice technology;automatic speech recognition;multi-lingual systems;speaker recognition and verification;spontaneous speech recognition;accoustic-visual speech processing;automatic transcription;large speech databases;dialogue systems;prosody optimization
The proposed project follows up the previous research activities carried out in the speech processing area by the team that integrates all Czech research groups which are recently active in speech analysis, synthesis and recognition. It was established in 1996 to participate on an ambitious 6-year project supported by the GACR and later continued in another speech oriented project ending in 2002. Each of the groups involved has its own proficiency in a specific domain, which allows the consortium to work on integrated and complex tasks. In the previous years the team has created large databases of annotated speech recordings, which are now available both training and testing purposes in speech recognition domain as well as for speech synthesis. In addition, a set of powerful tools and platforms for developing own recognition and synthesis systems has been built together with several working prototypes that serve for evaluation and demonstration purposes. Based on this state and with respect to the recent trends in voice technologies, the project will focus on the investigation and implementation of algorithms that are applicable in distributed, embedded and mobile systems, in recognition engines working with very large vocabularies, in TTS modules for interactive communication and information services, in automatic transcription of broadcast news as well as in multimodal audio-visual interfaces. Primarily, the research will address specific needs of Czech.
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
Grézl František, Ing., Ph.D. (DCGM)
Chalupníček Kamil, Ing. (RG SPEECH)
Karafiát Martin, Ing., Ph.D. (DCGM)
Matějka Pavel, Ing., Ph.D. (DCGM)
Motlíček Petr, doc. Ing., Ph.D. (DCGM)
Schwarz Petr, Ing., Ph.D. (DCGM)
Szőke Igor, Ing., Ph.D. (DCGM)
2007
- BRÜMMER, N.; BURGET, L.; ČERNOCKÝ, J.; GLEMBEK, O.; GRÉZL, F.; KARAFIÁT, M.; VAN LEEUWEN, D.; MATĚJKA, P.; SCHWARZ, P.; STRASHEIM, A. Fusion of heterogeneous speaker recognition systems in the STBU submission for the NIST speaker recognition evaluation 2006. IEEE Transactions on Audio, Speech, and Language Processing, 2007, vol. 15, no. 7,
p. 2072-2084. ISSN: 1558-7916. Detail - BURGET, L.; MATĚJKA, P.; SCHWARZ, P.; GLEMBEK, O.; ČERNOCKÝ, J. Analysis of feature extraction and channel compensation in GMM speaker recognition system. IEEE Transactions on Audio, Speech, and Language Processing, 2007, vol. 15, no. 7,
p. 1979-1986. ISSN: 1558-7916. Detail - GRÉZL, F.; ČERNOCKÝ, J. TRAP-based Techniques for Recognition of Noisy Speech. Proc. 10th International Conference on Text Speech and Dialogue (TSD 2007). LNCS. Berlin: Springer Verlag, 2007.
p. 270-277. ISBN: 978-3-540-74627-0. Detail - GRÉZL, F.; KARAFIÁT, M.; ČERNOCKÝ, J. Neural network topologies and bottle neck features in speech recognition. Brno: 2007.
p. 78-82. Detail - HUBEIKA, V.; SZŐKE, I.; BURGET, L.; ČERNOCKÝ, J. Maximum Likelihood and Maximum Mutual Information Training in Gender and Age Recognition System. In Proc. 10th International Conference on Text Speech and Dialogue (TSD 2007). Pilsen: Springer Verlag, 2007.
p. 1-6. ISBN: 978-3-540-74627-0. Detail - MATĚJKA, P.; BURGET, L.; SCHWARZ, P.; GLEMBEK, O.; KARAFIÁT, M.; GRÉZL, F.; ČERNOCKÝ, J.; VAN LEEUWEN, D.; BRÜMMER, N.; STRASHEIM, A. STBU system for the NIST 2006 speaker recognition evaluation. Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007). Honolulu: IEEE Signal Processing Society, 2007.
p. 221-224. ISBN: 1-4244-0728-1. Detail - MIKOLOV, T.; OPARIN, I.; GLEMBEK, O.; BURGET, L.; KARAFIÁT, M.; ČERNOCKÝ, J. Použití mluvených korpusů ve vývoji systému pro rozpoznávání českých přednášek. Praha: Univerzita Karlova v Praze, 2007.
s. 1-5. Detail - SZŐKE, I.; BURGET, L.; KARAFIÁT, M. Combination of Word and Phoneme Approach for Spoken Term Detection. Brno: 2007.
p. 1 (1 s.). Detail - SZŐKE, I.; FAPŠO, M.; KARAFIÁT, M.; BURGET, L.; GRÉZL, F.; SCHWARZ, P.; GLEMBEK, O.; MATĚJKA, P.; KOPECKÝ, J.; ČERNOCKÝ, J. Spoken Term Detection System Based on a Combination of LVCSR and Phonetic Search. Brno: 2007.
p. 1 (1 s.). Detail
2006
- FAPŠO, M.; SMRŽ, P.; SCHWARZ, P.; SZŐKE, I.; SCHWARZ, M.; ČERNOCKÝ, J.; KARAFIÁT, M.; BURGET, L. Information Retrieval from Spoken Documents. In Proceedings of the Seventh International Conference on Intelligent Text Processing and Computational Linguistics (CICLING 2006). Mexico City: Springer Verlag, 2006.
p. 410-416. ISBN: 3-540-32205-1. Detail - HUBEIKA, V. Estimation of Gender and Age from Recorded Speech. In Proc. ACM Student Research competition 2006. Prague: Czech Technical University, 2006.
p. 25-32. ISBN: 80-01-03595-6. Detail - KARAFIÁT, M.; GRÉZL, F.; SCHWARZ, P.; BURGET, L.; ČERNOCKÝ, J. Robust heteroscedastic linear discriminant analysis and LCRC posterior features in meeting data recognition. In Proc. 3nd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 2006). LNCS 4299. Berlin: Springer Verlag, 2006.
p. 275-284. ISBN: 3-540-69267-3. Detail - MATĚJKA, P.; BURGET, L.; SCHWARZ, P.; ČERNOCKÝ, J. Brno University of Technology System for NIST 2005 Language Recognition Evaluation. In Proceedings of Odyssey 2006: The Speaker and Language Recognition Workshop. San Juan: 2006.
p. 57-64. ISBN: 1-4244-0472-X. Detail
2005
- FAPŠO, M., SCHWARZ, P., SZŐKE, I., ČERNOCKÝ, J., SMRŽ, P., BURGET, L., KARAFIÁT, M. Search Engine for Information Retrieval from Multi-modal Records. Edinburgh: 2005. Detail
- FAPŠO, M., SMRŽ, P., SCHWARZ, P., SZŐKE, I., BURGET, L., KARAFIÁT, M., ČERNOCKÝ, J. Systém pre efektívne vyhľadávanie v rečových databázach. In Sborník databázové konference DATAKON 2005. Brno: Masaryk University, 2005.
s. 323-333. ISBN: 80-210-3813-6. Detail - GRÉZL, F. Spectral plane investigation for probabilistic features for ASR. Edinburgh: 2005.
p. 82 ( p.) Detail - MATĚJKA, P. Phoneme Recognition Tuning for Language Identification System. In Proceedings of the 11th conference STUDENT EEICT 2005. Brno: Faculty of Electrical Engineering and Communication BUT, 2005.
p. 658-653. ISBN: 80-214-2890-2. Detail - MATĚJKA, P., SCHWARZ, P., ČERNOCKÝ, J., CHYTIL, P. Phonotactic Language Identification. In Proceedings of Radioelektronika 2005. Brno: Faculty of Electrical Engineering and Communication BUT, 2005.
p. 140-143. ISBN: 80-214-2904-6. Detail - SZŐKE, I. Smooth Pitch Tracker Based on Harmonic and Noise Model. In STUDENT EEICT 2005. Brno: Faculty of Information Technology BUT, 2005.
p. 673-677. ISBN: 80-214-2890-2. Detail - SZŐKE, I., SCHWARZ, P., BURGET, L., KARAFIÁT, M., MATĚJKA, P., ČERNOCKÝ, J. Phoneme Based Acoustics Keyword Spotting in Informal Continuous Speech. Lecture Notes in Computer Science, 2005, vol. 2005, no. 3658,
p. 302 ( p.) ISSN: 0302-9743. Detail