Publication Details

BCN2BRNO: ASR System Fusion for Albayzin 2020 Speech to Text Challenge

KOCOUR, M.; CÁMBARA, G.; LUQUE, J.; BONET, D.; FARRÚS, M.; KARAFIÁT, M.; VESELÝ, K.; ČERNOCKÝ, J. BCN2BRNO: ASR System Fusion for Albayzin 2020 Speech to Text Challenge. Proceedings of IberSPEECH 2021. Valladolid: International Speech Communication Association, 2021. p. 113-117.

Czech title

BCN2BRNO: Fúze ASR systémů pro Albayzin 2020 Speech to Text Challenge

Type

conference paper

Language

English

Authors

Kocour Martin, Ing. (DCGM)
CÁMBARA, G.
Luque Jordi
BONET, D.
FARRÚS, M.
Karafiát Martin, Ing., Ph.D. (DCGM)
Veselý Karel, Ing., Ph.D. (DCGM)
Černocký Jan, prof. Dr. Ing. (DCGM)

URL

https://www.isca-speech.org/archive/iberspeech_2021/kocour21_iberspeech.html

Keywords

fusion, end-to-end model, hybrid model, semisupervised, automatic speech recognition, convolutional neural network.

Abstract

This paper describes the joint effort of BUT and Telefónica Research on the development of Automatic Speech Recognition systems for the Albayzin 2020 Challenge. We compare approaches based on either hybrid or end-to-end models. In hybrid modelling, we explore the impact of a SpecAugment layer on performance. For end-to-end modelling, we used a convolutional neural network with gated linear units (GLUs). The performance of such model is also evaluated with an additional n-gram language model to improve word error rates. We further inspect source separation methods to extract speech from noisy environments (i.e. TV shows). More precisely, we assess the effect of using a neural-based music separator named Demucs. A fusion of our best systems achieved 23.33% WER in official Albayzin 2020 evaluations. Aside from techniques used in our final submitted systems, we also describe our efforts in retrieving high-quality transcripts for training.

Published

2021

Pages

113–117

Proceedings

Proceedings of IberSPEECH 2021

Conference

IberSPEECH 2021 Conference, Valladolid, ES

Publisher

International Speech Communication Association

Place

Valladolid

DOI

10.21437/IberSPEECH.2021-24

BibTeX

@inproceedings{BUT175823,
  author="KOCOUR, M. and CÁMBARA, G. and LUQUE, J. and BONET, D. and FARRÚS, M. and KARAFIÁT, M. and VESELÝ, K. and ČERNOCKÝ, J.",
  title="BCN2BRNO: ASR System Fusion for Albayzin 2020 Speech to Text Challenge",
  booktitle="Proceedings of IberSPEECH 2021",
  year="2021",
  pages="113--117",
  publisher="International Speech Communication Association",
  address="Valladolid",
  doi="10.21437/IberSPEECH.2021-24",
  url="https://www.isca-speech.org/archive/iberspeech_2021/kocour21_iberspeech.html"
}

Files

pdf kocour21_iberspeech.pdf 236 kB