Publication Details
The IWSLT 2021 BUT Speech Translation Systems
Karafiát Martin, Ing., Ph.D. (DCGM)
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
Černocký Jan, prof. Dr. Ing. (DCGM)
speech, translation
The paper describes BUT's English-to-German offline speech translation (ST)
systems developed for IWSLT 2021. They are based on jointly trained Automatic
Speech Recognition (ASR) and Machine Translation (MT) models. Their performance is evaluated
on the MustC-Common test set. In this work, we study their efficiency from the
perspective of having a large amount of separate ASR training data and MT
training data, and a smaller amount of speech translation training data. Large
amounts of ASR and MT training data are utilized for pretraining the ASR and MT
models. Speech translation data is used to jointly optimize ASR-MT models by
defining an end-to-end differentiable path from speech to translations. For this
purpose, we use the internal continuous representations from the ASR-decoder as
the input to the MT module. We show that speech translation can be further improved
by training the ASR-decoder jointly with the MT-module using a large amount of
text-only MT training data. We also show significant improvements by training an
ASR module capable of generating punctuated text, rather than leaving the
punctuation task to the MT module.
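The idea of passing continuous ASR-decoder representations to the MT module, instead of discrete 1-best tokens, can be illustrated with a toy sketch. This is not the authors' architecture: every function, parameter value, and name below is a hypothetical stand-in chosen only to show why a continuous interface gives an end-to-end differentiable path while a discrete cascade does not.

```python
import math

def asr_decoder_state(x, w):
    """Toy ASR decoder: continuous hidden state for a speech feature x (hypothetical)."""
    return [math.tanh(w * x), math.tanh(-w * x)]

def asr_token(hidden):
    """Discrete 1-best token (argmax), as a cascade would pass to MT."""
    return 0 if hidden[0] >= hidden[1] else 1

# Toy MT parameters; the values are arbitrary assumptions for illustration.
MT_EMBEDDING = [[1.0, 0.0], [0.0, 1.0]]  # token id -> embedding (cascade input)
MT_WEIGHTS = [0.5, -0.25]

def mt_loss(mt_input, target=1.0):
    """Toy MT module: linear score followed by a squared-error loss."""
    score = sum(wt * v for wt, v in zip(MT_WEIGHTS, mt_input))
    return (score - target) ** 2

def cascade_loss(x, w):
    """Cascade: the MT module sees only the embedding of the argmax token."""
    return mt_loss(MT_EMBEDDING[asr_token(asr_decoder_state(x, w))])

def joint_loss(x, w):
    """Joint model: the MT module consumes the continuous ASR state directly."""
    return mt_loss(asr_decoder_state(x, w))

def finite_diff(loss_fn, x, w, eps=1e-4):
    """Central finite-difference estimate of d(loss)/d(w) for the ASR parameter w."""
    return (loss_fn(x, w + eps) - loss_fn(x, w - eps)) / (2 * eps)

x, w = 0.8, 0.3
grad_cascade = finite_diff(cascade_loss, x, w)  # argmax blocks the training signal
grad_joint = finite_diff(joint_loss, x, w)      # MT loss reaches the ASR parameter
print("cascade gradient:", grad_cascade)
print("joint gradient:", grad_joint)
```

In this toy setting the translation loss gives no training signal to the ASR parameter through the argmax, whereas the continuous interface does, which is the property that allows speech translation data to optimize both modules jointly.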
@inproceedings{BUT177246,
author="Hari Krishna {Vydana} and Martin {Karafiát} and Lukáš {Burget} and Jan {Černocký}",
title="The IWSLT 2021 BUT Speech Translation Systems",
booktitle="Proceedings of 18th International Conference on Spoken Language Translation (IWSLT)",
year="2021",
pages="75--83",
publisher="Association for Computational Linguistics",
address="Bangkok, on-line",
doi="10.18653/v1/2021.iwslt-1.7",
isbn="978-1-7138-3378-9",
url="https://aclanthology.org/2021.iwslt-1.7.pdf"
}