Publication Details
The IWSLT 2021 BUT Speech Translation Systems
Karafiát Martin, Ing., Ph.D. (DCGM)
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
Černocký Jan, prof. Dr. Ing. (DCGM)
speech, translation
The paper describes BUT's English-to-German offline speech translation (ST) systems developed for IWSLT 2021. They are based on jointly trained Automatic Speech Recognition-Machine Translation (ASR-MT) models. Their performance is evaluated on the MustC-Common test set. In this work, we study their efficiency from the perspective of having a large amount of separate ASR training data and MT training data, and a smaller amount of speech-translation training data. Large amounts of ASR and MT training data are utilized for pre-training the ASR and MT models. Speech-translation data is used to jointly optimize ASR-MT models by defining an end-to-end differentiable path from speech to translations. For this purpose, we use the internal continuous representations from the ASR decoder as the input to the MT module. We show that speech translation can be further improved by training the ASR decoder jointly with the MT module using a large amount of text-only MT training data. We also show significant improvements from training an ASR module capable of generating punctuated text, rather than leaving the punctuation task to the MT module.
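The idea of an end-to-end differentiable path can be illustrated with a toy sketch. It is not the authors' model: the scalar input, the weights, and every function name below are hypothetical, chosen only to show why feeding continuous ASR-decoder representations to the MT module lets the translation loss influence ASR parameters, whereas passing a discrete (argmax) token does not.

```python
# Toy illustration only: all values and names are hypothetical assumptions.

def asr_decoder(x, w_asr):
    """Return continuous hidden states, one score per toy vocabulary item."""
    return [w * x for w in w_asr]

def mt_from_hidden(hidden, w_mt):
    """MT module that consumes the ASR decoder's continuous representation."""
    return sum(h * w for h, w in zip(hidden, w_mt))

def mt_from_token(hidden, emb):
    """Cascade variant: pick the argmax token, then look up a fixed embedding."""
    token = max(range(len(hidden)), key=lambda i: hidden[i])
    return emb[token]

def loss_joint(w_asr, x, w_mt, target):
    """Squared error of the MT output when fed continuous hidden states."""
    return (mt_from_hidden(asr_decoder(x, w_asr), w_mt) - target) ** 2

def loss_cascade(w_asr, x, emb, target):
    """Squared error of the MT output when fed a discrete token."""
    return (mt_from_token(asr_decoder(x, w_asr), emb) - target) ** 2

def finite_diff(loss, w_asr, index, eps=1e-4):
    """Central finite-difference estimate of d(loss)/d(w_asr[index])."""
    up = list(w_asr)
    up[index] += eps
    down = list(w_asr)
    down[index] -= eps
    return (loss(up) - loss(down)) / (2 * eps)

x, target = 2.0, 1.0
w_asr = [0.5, -0.3, 0.1]   # toy ASR-decoder weights
w_mt = [1.0, 0.5, -1.0]    # toy MT weights on continuous input
emb = [0.2, 0.7, -0.4]     # toy MT embeddings for discrete tokens

# Gradient of the translation loss with respect to an ASR weight.
grad_joint = finite_diff(lambda w: loss_joint(w, x, w_mt, target), w_asr, 0)
grad_cascade = finite_diff(lambda w: loss_cascade(w, x, emb, target), w_asr, 0)

print("joint path gradient:  ", grad_joint)
print("cascade path gradient:", grad_cascade)
```

In this toy setting the joint path yields a nonzero gradient for the ASR weight, while the cascade path yields zero because a small change in the weight does not change the selected token. A real system would use neural ASR and MT modules trained with automatic differentiation rather than finite differences.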
@inproceedings{BUT177246,
author="Hari Krishna {Vydana} and Martin {Karafiát} and Lukáš {Burget} and Jan {Černocký}",
title="The IWSLT 2021 BUT Speech Translation Systems",
booktitle="Proceedings of 18th International Conference on Spoken Language Translation (IWSLT)",
year="2021",
pages="75--83",
publisher="Association for Computational Linguistics",
address="Bangkok, on-line",
doi="10.18653/v1/2021.iwslt-1.7",
isbn="978-1-7138-3378-9",
url="https://aclanthology.org/2021.iwslt-1.7.pdf"
}