Project Details

Multi-lingualita v řečových technologiích

Project Period: 1. 1. 2020 – 31. 8. 2023

Project Type: grant

Code: LTAIN19087

Agency: Ministry of Education, Youth and Sports of the Czech Republic (Ministerstvo školství, mládeže a tělovýchovy ČR)

Program: INTER-EXCELLENCE - Subprogram INTER-ACTION 19LTAIN

English title: Multi-linguality in speech technologies
Keywords

multi-linguality, speech recognition, machine learning, data, transfer learning

Abstract

Speech data mining technologies and speech-based human-machine interfaces have
witnessed significant advances in the past decade, and numerous applications have
been successfully commercialized. However, they usually work correctly only in
favorable scenarios - in languages with abundant training data and in
relatively clean environments, such as an office or apartment. In fast-developing
large markets such as India, severe problems make the exploitation of
speech difficult: a multitude of languages (some of them with limited or missing
resources), highly noisy conditions (much business is simply done on the
streets in Indian cities), and highly variable numbers of speakers in
a conversation (from the usual two to whole families). These complicate the
development of automatic speech recognition (ASR), speaker recognition (SR) and
speaker diarization (SD, determining who spoke when). In the proposed
project, two established research institutes with a significant track record in
multi-lingual ASR, robust SR and SD, Brno University of Technology (BUT) and
IIT Madras (IIT-M), have teamed up with an important player on the Indian and
global personal electronics markets, Samsung R&D Institute India-Bangalore
(SRI-B), and propose significant advances in several speech technologies,
notably in multi-lingual low-resource ASR. While BUT and IIT-M will provide top
speech research (based, among others, on the U.S. IARPA Babel and Material
programs, victory in the IARPA ASpIRE evaluation and in the Interspeech 2018
Low Resource Speech Recognition Challenge for Indian Languages, and on the
Indian MANDI project), SRI-B will provide data and industrial guidelines and
will produce technology demonstrators.

Team members
Černocký Jan, prof. Dr. Ing. (DCGM) – research leader
Egorova Ekaterina, Ing., Ph.D.
Skácel Miroslav, Ing.
Žižka Josef, Ing. (DCGM)
