Project details
Speech processing technologies for efficient human-computer communication
Project period: 1 January 2011 – 31 December 2014
Project type: grant
Code: TA01011328
Agency: Technology Agency of the Czech Republic (Technologická agentura ČR)
Programme: ALFA Programme of Applied Research and Experimental Development
Keywords: speech recognition, electronic dictionaries, defence and security, mobile devices, dialogue systems, CRM, eLearning
The project aims to develop advanced speech recognition techniques and deploy them in practical applications: searching electronic dictionaries on mobile devices, dictating translations, security and defence, dialogue systems, customer care systems (CRM, helpdesk, etc.), and audio-visual access to teaching materials.
Hannemann Mirko, Ph.D.
Heřmanský Hynek, prof. Ing. (UPGM)
Karafiát Martin, Ing., Ph.D. (UPGM)
Ondel Lucas Antoine Francois, Mgr., Ph.D. (SSDIT)
Szőke Igor, Ing., Ph.D. (UPGM)
Žižka Josef, Ing. (UPGM)
2015
- ONDEL YANG, L.; ANGUERA, X.; LUQUE, J. MASK+: Data-Driven Regions Selection for Acoustic Fingerprinting. In Proceedings of 2015 IEEE International Conference on Acoustics, Speech and Signal Processing. South Brisbane, Queensland: IEEE Signal Processing Society, 2015. p. 335-339. ISBN: 978-1-4673-6997-8.
2014
- GLEMBEK, O.; MA, J.; MATĚJKA, P.; ZHANG, B.; PLCHOT, O.; BURGET, L.; MATSOUKAS, S. Domain Adaptation Via Within-class Covariance Correction in I-Vector Based Speaker Recognition Systems. In Proceedings of ICASSP 2014. Florence: IEEE Signal Processing Society, 2014. p. 4060-4064. ISBN: 978-1-4799-2892-7.
- KARAFIÁT, M.; GRÉZL, F.; HANNEMANN, M.; ČERNOCKÝ, J. BUT Neural Network Features for Spontaneous Vietnamese in BABEL. In Proceedings of ICASSP 2014. Florence: IEEE Signal Processing Society, 2014. p. 5659-5663. ISBN: 978-1-4799-2892-7.
- KARAFIÁT, M.; GRÉZL, F.; VESELÝ, K.; HANNEMANN, M.; SZŐKE, I.; ČERNOCKÝ, J. BUT 2014 Babel System: Analysis of adaptation in NN based systems. In Proceedings of Interspeech 2014. Singapore: International Speech Communication Association, 2014. p. 3002-3006. ISBN: 978-1-63439-435-2.
- KARAFIÁT, M.; VESELÝ, K.; SZŐKE, I.; BURGET, L.; GRÉZL, F.; HANNEMANN, M.; ČERNOCKÝ, J. BUT ASR System for BABEL Surprise Evaluation 2014. In Proceedings of 2014 Spoken Language Technology Workshop. South Lake Tahoe, Nevada: IEEE Signal Processing Society, 2014. p. 501-506. ISBN: 978-1-4799-7129-9.
- MARTÍNEZ GONZÁLEZ, D.; BURGET, L.; STAFYLAKIS, T.; LEI, Y.; KENNY, P.; LLEIDA, E. Unscented Transform For Ivector-based Noisy Speaker Recognition. In Proceedings of ICASSP 2014. Florence: IEEE Signal Processing Society, 2014. p. 4070-4074. ISBN: 978-1-4799-2892-7.
2013
- EGOROVA, E.; VESELÝ, K.; KARAFIÁT, M.; JANDA, M.; ČERNOCKÝ, J. Manual and Semi-Automatic Approaches to Building a Multilingual Phoneme Set. In Proceedings of ICASSP 2013. Vancouver: IEEE Signal Processing Society, 2013. p. 7324-7328. ISBN: 978-1-4799-0355-9.
- LEI, Y.; BURGET, L.; SCHEFFER, N. A Noise Robust I-Vector Extractor Using Vector Taylor Series For Speaker Recognition. Proceedings of ICASSP 2013. Vancouver: IEEE Signal Processing Society, 2013. p. 6788-6791. ISBN: 978-1-4799-0355-9.
- PLCHOT, O.; MATSOUKAS, S.; MATĚJKA, P.; DEHAK, N.; MA, J.; CUMANI, S.; GLEMBEK, O.; HEŘMANSKÝ, H.; MESGARANI, N.; SOUFIFAR, M.; THOMAS, S.; ZHANG, B.; ZHOU, X. Developing A Speaker Identification System For The DARPA RATS Project. Proceedings of ICASSP 2013. Vancouver: IEEE Signal Processing Society, 2013. p. 6768-6772. ISBN: 978-1-4799-0355-9.
- RATH, S.; BURGET, L.; KARAFIÁT, M.; GLEMBEK, O.; ČERNOCKÝ, J. A Region-specific Feature-space Transformation for Speaker Adaptation and Singularity Analysis of Jacobian Matrix. Proceedings of Interspeech 2013. Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013). Lyon: International Speech Communication Association, 2013. p. 1228-1232. ISBN: 978-1-62993-443-3. ISSN: 2308-457X.
- RATH, S.; POVEY, D.; VESELÝ, K.; ČERNOCKÝ, J. Improved Feature Processing for Deep Neural Networks. Proceedings of Interspeech 2013. Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013). Lyon: International Speech Communication Association, 2013. p. 109-113. ISBN: 978-1-62993-443-3. ISSN: 2308-457X.
2012
- CUMANI, S.; PLCHOT, O.; KARAFIÁT, M. Independent Component Analysis and MLLR Transforms for Speaker Identification. Proc. International Conference on Acoustics, Speech, and Signal P. Kyoto: IEEE Signal Processing Society, 2012. p. 4365-4368. ISBN: 978-1-4673-0044-5.
- DEORAS, A.; MIKOLOV, T.; KOMBRINK, S.; CHURCH, K. Approximate inference: A sampling based modeling technique to capture complex dependencies in a language model. Speech Communication, 2012, vol. 2012, no. 8, p. 1-16. ISSN: 0167-6393.
- KARAFIÁT, M.; JANDA, M.; ČERNOCKÝ, J.; BURGET, L. Region Dependent Linear Transforms in Multilingual Speech Recognition. In Proc. International Conference on Acoustics, Speech, and Signal Processing 2012. Kyoto: IEEE Signal Processing Society, 2012. p. 4885-4888. ISBN: 978-1-4673-0044-5.
- KOMBRINK, S.; MIKOLOV, T.; KARAFIÁT, M.; BURGET, L. Improving Language Models for ASR Using Translated In-domain Data. Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing. Kyoto: IEEE Signal Processing Society, 2012. p. 4405-4408. ISBN: 978-1-4673-0044-5.
- POVEY, D.; HANNEMANN, M.; BOULIANNE, G.; BURGET, L.; GHOSHAL, A.; JANDA, M.; KARAFIÁT, M.; KOMBRINK, S.; MOTLÍČEK, P.; QIAN, Y.; RIEDHAMMER, K.; VESELÝ, K.; VU, N. Generating Exact Lattices in The WFST Framework. Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing. Kyoto: IEEE Signal Processing Society, 2012. p. 4213-4216. ISBN: 978-1-4673-0044-5.
- RATH, S.; KARAFIÁT, M.; GLEMBEK, O.; ČERNOCKÝ, J. A factorized representation of FMLLR transform based on QR-decomposition. Proceedings of Interspeech 2012. Proceedings of Interspeech. Portland, Oregon: International Speech Communication Association, 2012. p. 1-4. ISBN: 978-1-62276-759-5. ISSN: 1990-9772.
- SOUFIFAR, M.; CUMANI, S.; BURGET, L.; ČERNOCKÝ, J. Discriminative Classifiers for Phonotactic Language Recognition with iVectors. Proc. International Conference on Acoustics, Speech, and Signal Processing 2012. Kyoto: IEEE Signal Processing Society, 2012. p. 4853-4856. ISBN: 978-1-4673-0044-5.
- SZŐKE, I.; FAPŠO, M.; VESELÝ, K. BUT2012 Approach for the Spoken Web Search Task at MediaEval2012. Working Notes Proceedings of the MediaEval 2012 Workshop. CEUR Workshop Proceedings. Pisa: CEUR-WS.org, 2012. p. 1-2. ISSN: 1613-0073.
- SZŐKE, I.; FAPŠO, M.; ŽIŽKA, J.; BERAN, V.; ČERNOCKÝ, J. Efficient Access to Knowledge in Audio-Visual Recordings. Proceedings of the Annual Database Conference. Prague: Technická univerzita v Košiciach, 2012. p. 57-74. ISBN: 978-80-553-1049-7.
- VESELÝ, K.; KARAFIÁT, M.; GRÉZL, F.; JANDA, M.; EGOROVA, E. The Language-Independent Bottleneck Features. Proceedings of IEEE 2012 Workshop on Spoken Language Technology. Miami: IEEE Signal Processing Society, 2012. p. 336-341. ISBN: 978-1-4673-5124-9.
2011
- DEORAS, A.; MIKOLOV, T.; CHURCH, K. A Fast Re-scoring Strategy to Capture Long-Distance Dependencies. Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing July 2011 Edinburgh, Scotland, UK. Edinburgh: Association for Computational Linguistics, 2011. p. 1116-1127. ISBN: 978-1-937284-11-4.
- GRÉZL, F. The Role of Neural Network Size in TRAP/HATS Feature Extraction. Proceedings Text, Speech and Dialogue 2011. Lecture Notes in Computer Science. LNAI 6836. Plzeň: Springer Verlag, 2011. p. 315-322. ISBN: 978-3-642-23537-5. ISSN: 0302-9743.
- GRÉZL, F.; KARAFIÁT, M. Integrating recent MLP feature extraction techniques into TRAP architecture. Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. p. 1229-1232. ISBN: 978-1-61839-270-1. ISSN: 1990-9772.
- KARAFIÁT, M.; BURGET, L.; MATĚJKA, P.; GLEMBEK, O.; ČERNOCKÝ, J. iVector-Based Discriminative Adaptation for Automatic Speech Recognition. Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011. p. 152-157. ISBN: 978-1-4673-0366-8.
- KOMBRINK, S.; MIKOLOV, T.; KARAFIÁT, M.; BURGET, L. Recurrent Neural Network based Language Modeling in Meeting Recognition. Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. p. 2877-2880. ISBN: 978-1-61839-270-1. ISSN: 1990-9772.
- MIKOLOV, T.; DEORAS, A.; KOMBRINK, S.; BURGET, L.; ČERNOCKÝ, J. Empirical Evaluation and Combination of Advanced Language Modeling Techniques. Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. p. 605-608. ISBN: 978-1-61839-270-1. ISSN: 1990-9772.
- MIKOLOV, T.; DEORAS, A.; POVEY, D.; BURGET, L.; ČERNOCKÝ, J. Strategies for Training Large Scale Neural Network Language Models. Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011. p. 196-201. ISBN: 978-1-4673-0366-8.
- MIKOLOV, T.; KOMBRINK, S.; DEORAS, A.; BURGET, L.; ČERNOCKÝ, J. RNNLM - Recurrent Neural Network Language Modeling Toolkit. Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011. p. 1-4. ISBN: 978-1-4673-0366-8.
- POVEY, D.; GHOSHAL, A.; BOULIANNE, G.; BURGET, L.; GLEMBEK, O.; GOEL, N.; HANNEMANN, M.; MOTLÍČEK, P.; QIAN, Y.; SCHWARZ, P.; SILOVSKÝ, J.; STEMMER, G.; VESELÝ, K. The Kaldi Speech Recognition Toolkit. Proceedings of ASRU 2011. Hilton Waikoloa Village Resort, Hawaii: IEEE Signal Processing Society, 2011. p. 1-4. ISBN: 978-1-4673-0366-8.
- VESELÝ, K.; KARAFIÁT, M.; GRÉZL, F. Convolutive Bottleneck Network Features for LVCSR. Proceedings of ASRU 2011. Big Island, Hawaii: IEEE Signal Processing Society, 2011. p. 42-47. ISBN: 978-1-4673-0366-8.