Project Details
Theory and Applications of Phoneme Posterior Probability Estimation in Speech Processing
Project Period: 1. 1. 2009 – 31. 12. 2011
Project Type: grant
Code: GP102/09/P635
Agency: Czech Science Foundation
Program: Doctoral grants
Keywords: speech processing, speech recognition, phoneme recognition, probabilistic features
Estimating the posterior probabilities of discrete speech units (phonemes) is of
significant importance in basic speech processing research. The estimates are
used in feature extraction (posterior features), in phonotactic models for language
recognition, in the generation of phoneme lattices for keyword spotting, and in other
applications. The goal of this project is to create a fast and reliable system
for estimating these posterior probabilities, and thereby to reduce the error
rates of the target systems. The project will deal with feature extraction,
discriminative transforms, classifier architectures, and training
techniques. Quality will be assessed mainly in international evaluations
organized by the US National Institute of Standards and Technology (NIST).
Kopecký Jiří, Bc.
Plchot Oldřich, Ing., Ph.D. (DCGM)
2012
- HAIN, T.; BURGET, L.; DINES, J.; GARNER, P.; GRÉZL, F.; EL HANNANI, A.; HUIJBREGTS, M.; KARAFIÁT, M.; LINCOLN, M.; WAN, V. Transcribing Meetings with the AMIDA System. IEEE Transactions on Audio, Speech, and Language Processing, 2012, vol. 20, no. 2, p. 486-498. ISSN: 1558-7916.
2011
- BOŘIL, H.; GRÉZL, F.; HANSEN, J. Front-End Compensation Methods for LVCSR Under Lombard Effect. Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. p. 1257-1260. ISBN: 978-1-61839-270-1. ISSN: 1990-9772.
- GRÉZL, F. The Role of Neural Network Size in TRAP/HATS Feature Extraction. Proceedings Text, Speech and Dialogue 2011. Lecture Notes in Computer Science. LNAI 6836. Plzeň: Springer Verlag, 2011. p. 315-322. ISBN: 978-3-642-23537-5. ISSN: 0302-9743.
- GRÉZL, F.; KARAFIÁT, M. Integrating recent MLP feature extraction techniques into TRAP architecture. Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. p. 1229-1232. ISBN: 978-1-61839-270-1. ISSN: 1990-9772.
- GRÉZL, F.; KARAFIÁT, M.; JANDA, M. Study of Probabilistic and Bottle-Neck Features in Multilingual Environment. Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011. p. 359-364. ISBN: 978-1-4673-0366-8.
- KOCKMANN, M.; FERRER, L.; BURGET, L.; ČERNOCKÝ, J. iVector Fusion of Prosodic and Cepstral Features for Speaker Verification. Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. p. 265-268. ISBN: 978-1-61839-270-1. ISSN: 1990-9772.
- KOMBRINK, S.; MIKOLOV, T.; KARAFIÁT, M.; BURGET, L. Recurrent Neural Network based Language Modeling in Meeting Recognition. Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. p. 2877-2880. ISBN: 978-1-61839-270-1. ISSN: 1990-9772.
- MIKOLOV, T.; DEORAS, A.; KOMBRINK, S.; BURGET, L.; ČERNOCKÝ, J. Empirical Evaluation and Combination of Advanced Language Modeling Techniques. Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. p. 605-608. ISBN: 978-1-61839-270-1. ISSN: 1990-9772.
- VESELÝ, K.; KARAFIÁT, M.; GRÉZL, F. Convolutive Bottleneck Network Features for LVCSR. Proceedings of ASRU 2011. Big Island, Hawaii: IEEE Signal Processing Society, 2011. p. 42-47. ISBN: 978-1-4673-0366-8.
2010
- GRÉZL, F.; KARAFIÁT, M. Hierarchical Neural Net Architectures for Feature Extraction in ASR. Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010). Proceedings of Interspeech. Makuhari, Chiba: International Speech Communication Association, 2010. p. 1201-1204. ISBN: 978-1-61782-123-3. ISSN: 1990-9772.
- HAIN, T.; BURGET, L.; DINES, J.; GARNER, P.; EL HANNANI, A.; HUIJBREGTS, M.; KARAFIÁT, M.; LINCOLN, M.; WAN, V. The AMIDA 2009 Meeting Transcription System. Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010). Proceedings of Interspeech. Makuhari, Chiba: International Speech Communication Association, 2010. p. 358-361. ISBN: 978-1-61782-123-3. ISSN: 1990-9772.
- SZŐKE, I.; GRÉZL, F.; ČERNOCKÝ, J.; FAPŠO, M. Acoustic keyword spotter - optimization from end-user perspective. Proceedings of the 2010 IEEE Spoken Language Technology Workshop. IEEE Catalog Number: CFP 10SLT-USB. Berkeley, California: IEEE Signal Processing Society, 2010. p. 177-181. ISBN: 978-1-4244-7902-3.
2009
- GRÉZL, F.; ČERNOCKÝ, J. Audio Surveillance through Known Event Classification. Radioengineering, 2009, vol. 18, no. 4, p. 671-675. ISSN: 1210-2512.
- GRÉZL, F.; KARAFIÁT, M.; BURGET, L. Investigation into bottle-neck features for meeting speech recognition. Proc. Interspeech 2009. Proceedings of Interspeech. Brighton: International Speech Communication Association, 2009. p. 2947-2950. ISBN: 978-1-61567-692-7. ISSN: 1990-9772.