Project Details

Speech enhancement front-end for robust automatic speech recognition with large amount of training data

Project Period: 2. 1. 2023 – 31. 1. 2024

Project Type: contract

Partner: NTT Corporation

Czech title

Parametrizace s obohacováním řeči pro robustní automatické rozpoznávání řeči s velkým objemem trénovacích dat

Keywords

speech recognition, speaker diarization, large data, robustness

Abstract

The joint research aims to investigate and develop speech enhancement
and speaker diarization techniques for automatic speech recognition systems
trained on large amounts of data.

Team members

Diez Sánchez Mireia, M.Sc., Ph.D. (DCGM) – research leader
Černocký Jan, prof. Dr. Ing. (DCGM)
Pavlus Ján, Ing. (DCGM)
Peng Junyi (DCGM)
Švec Ján, Ing. (DCGM)

Publication Results

2023

DELCROIX, M.; TAWARA, N.; DIEZ SÁNCHEZ, M.; LANDINI, F.; SILNOVA, A.; OGAWA, A.; NAKATANI, T.; BURGET, L.; ARAKI, S. Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Dublin: International Speech Communication Association, 2023. no. 08, p. 3477-3481. ISSN: 1990-9772.