Publication Details

Speech Technology for Unwritten Languages

SCHARENBORG, O.; BESACIER, L.; BLACK, A.; HASEGAWA-JOHNSON, M.; METZE, F.; NEUBIG, G.; STÜKER, S.; GODARD, P.; MÜLLER, M.; ONDEL YANG, L.; PALASKAR, S.; ARTHUR, P.; CIANNELLA, F.; DU, M.; LARSEN, E.; MERKX, D.; RIAD, R.; WANG, L.; DUPOUX, E. Speech Technology for Unwritten Languages. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, 2020, vol. 28, p. 964-975. ISSN: 2329-9290.
Czech title
Řečové technologie pro jazyky bez psané formy
Type
journal article
Language
English
Authors
SCHARENBORG, O.
BESACIER, L.
BLACK, A.
HASEGAWA-JOHNSON, M.
METZE, F.
NEUBIG, G.
STÜKER, S.
GODARD, P.
MÜLLER, M.
ONDEL YANG, L.
PALASKAR, S.
ARTHUR, P.
CIANNELLA, F.
DU, M.
LARSEN, E.
MERKX, D.
RIAD, R.
WANG, L.
DUPOUX, E.
URL
https://ieeexplore.ieee.org/document/8998182
Keywords

Speech processing, automatic speech recognition, unsupervised learning, speech
synthesis, image retrieval.

Abstract

Speech technology plays an important role in our everyday life. Among
others, speech is used for human-computer interaction, for instance for
information retrieval and on-line shopping. In the case of an unwritten language,
however, speech technology is unfortunately difficult to create, because it
cannot be created by the standard combination of pre-trained speech-to-text and
text-to-speech subsystems. The research presented in this article takes the first
steps towards speech technology for unwritten languages. Specifically, the aim of
this work was 1) to learn speech-to-meaning representations without using text as
an intermediate representation, and 2) to test the sufficiency of the learned
representations to regenerate speech or translated text, or to retrieve images
that depict the meaning of an utterance in an unwritten language. The results
suggest that building systems that go directly from speech-to-meaning and from
meaning-to-speech, bypassing the need for text, is possible.

Published
2020
Pages
964–975
Journal
IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, vol. 28, ISSN 2329-9290
DOI
10.1109/TASLP.2020.2973896
UT WoS
000522357500002
BibTeX
@article{BUT170325,
  author="SCHARENBORG, O. and BESACIER, L. and BLACK, A. and HASEGAWA-JOHNSON, M. and METZE, F. and NEUBIG, G. and STÜKER, S. and GODARD, P. and MÜLLER, M. and ONDEL YANG, L. and PALASKAR, S. and ARTHUR, P. and CIANNELLA, F. and DU, M. and LARSEN, E. and MERKX, D. and RIAD, R. and WANG, L. and DUPOUX, E.",
  title="Speech Technology for Unwritten Languages",
  journal="IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING",
  year="2020",
  volume="28",
  pages="964--975",
  doi="10.1109/TASLP.2020.2973896",
  issn="2329-9290",
  url="https://ieeexplore.ieee.org/document/8998182"
}