Publication Details

The IWSLT 2021 BUT Speech Translation Systems

VYDANA, H.; KARAFIÁT, M.; BURGET, L.; ČERNOCKÝ, J. The IWSLT 2021 BUT Speech Translation Systems. In Proceedings of 18th International Conference on Spoken Language Translation (IWSLT). Bangkok, on-line: Association for Computational Linguistics, 2021. p. 75-83. ISBN: 978-1-7138-3378-9.
Czech title
BUT systém pro strojový překlad z řeči pro IWSLT 2021
Type
conference paper
Language
English
Authors
Hari Krishna Vydana, Martin Karafiát, Lukáš Burget, Jan Černocký
URL
https://aclanthology.org/2021.iwslt-1.7.pdf
Keywords

speech, translation

Abstract

The paper describes BUT's English to German offline speech translation (ST) systems developed for IWSLT 2021. They are based on jointly trained Automatic Speech Recognition-Machine Translation models. Their performance is evaluated on the MustC-Common test set. In this work, we study their efficiency from the perspective of having a large amount of separate ASR training data and MT training data, and a smaller amount of speech-translation training data. Large amounts of ASR and MT training data are utilized for pre-training the ASR and MT models. Speech-translation data is used to jointly optimize ASR-MT models by defining an end-to-end differentiable path from speech to translations. For this purpose, we use the internal continuous representations from the ASR-decoder as the input to the MT module. We show that speech translation can be further improved by training the ASR-decoder jointly with the MT-module using a large amount of text-only MT training data. We also show significant improvements by training an ASR module capable of generating punctuated text, rather than leaving the punctuation task to the MT module.

Published
2021
Pages
75–83
Proceedings
Proceedings of 18th International Conference on Spoken Language Translation (IWSLT)
Conference
18TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE TRANSLATION, Bangkok (on-line), TH
ISBN
978-1-7138-3378-9
Publisher
Association for Computational Linguistics
Place
Bangkok, on-line
DOI
10.18653/v1/2021.iwslt-1.7
UT WoS
000694723100007
EID Scopus
BibTeX
@inproceedings{BUT177246,
  author="Hari Krishna {Vydana} and Martin {Karafiát} and Lukáš {Burget} and Jan {Černocký}",
  title="The IWSLT 2021 BUT Speech Translation Systems",
  booktitle="Proceedings of 18th International Conference on Spoken Language Translation (IWSLT)",
  year="2021",
  pages="75--83",
  publisher="Association for Computational Linguistics",
  address="Bangkok, on-line",
  doi="10.18653/v1/2021.iwslt-1.7",
  isbn="978-1-7138-3378-9",
  url="https://aclanthology.org/2021.iwslt-1.7.pdf"
}