Project Details
Speech enhancement front-end for robust automatic speech recognition with large amount of training data
Project Period: 1. 10. 2017 – 30. 9. 2018
Project Type: contract
Partner: NTT Corporation
Czech title
Parametrizace s obohacováním řeči pro robustní automatické rozpoznávání řeči s velkým objemem trénovacích dat
Keywords
speech recognition, robustness, large data, DNN embeddings
Abstract
The purpose of the Joint Research is to develop a speech enhancement front-end for robust automatic speech recognition with a large amount of training data, through cooperation between NTT and BUT. The work relies on embeddings produced by neural networks at various points in the processing chain.
Publications
2018
- DELCROIX, M.; ŽMOLÍKOVÁ, K.; KINOSHITA, K.; ARAKI, S.; OGAWA, A.; NAKATANI, T. SpeakerBeam: A New Deep Learning Technology for Extracting Speech of a Target Speaker Based on the Speaker's Voice Characteristics. NTT Technical Review, 2018, vol. 16, no. 11, p. 19-24. ISSN: 1348-3447.
- DELCROIX, M.; ŽMOLÍKOVÁ, K.; KINOSHITA, K.; OGAWA, A.; NAKATANI, T. Single Channel Target Speaker Extraction and Recognition with Speaker Beam. In Proceedings of ICASSP 2018. Calgary: IEEE Signal Processing Society, 2018. p. 5554-5558. ISBN: 978-1-5386-4658-8.
- ROHDIN, J.; SILNOVA, A.; DIEZ SÁNCHEZ, M.; PLCHOT, O.; MATĚJKA, P.; BURGET, L. End-to-End DNN Based Speaker Recognition Inspired by i-Vector and PLDA. In Proceedings of ICASSP. Calgary: IEEE Signal Processing Society, 2018. p. 4874-4878. ISBN: 978-1-5386-4658-8.
2017
- ŽMOLÍKOVÁ, K. Souhrnná výzkumná zpráva projektu "Speech enhancement front-end for robust automatic speech recognition with large amount of training data" pro rok 2017 [Summary research report of the project for 2017]. Brno: NTT Corporation, 2017.
- ŽMOLÍKOVÁ, K.; DELCROIX, M.; KINOSHITA, K.; HIGUCHI, T.; OGAWA, A.; NAKATANI, T. Learning Speaker Representation for Neural Network Based Multichannel Speaker Extraction. In Proceedings of ASRU 2017. Okinawa: IEEE Signal Processing Society, 2017. p. 8-15. ISBN: 978-1-5090-4788-8.