Publication Details

Speech Technology for Unwritten Languages

SCHARENBORG, O.; BESACIER, L.; BLACK, A.; HASEGAWA-JOHNSON, M.; METZE, F.; NEUBIG, G.; STÜKER, S.; GODARD, P.; MÜLLER, M.; ONDEL YANG, L.; PALASKAR, S.; ARTHUR, P.; CIANNELLA, F.; DU, M.; LARSEN, E.; MERKX, D.; RIAD, R.; WANG, L.; DUPOUX, E. Speech Technology for Unwritten Languages. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, 2020, vol. 2020, no. 28, p. 964-975. ISSN: 2329-9290.

Czech title

Řečové technologie pro jazyky bez psané formy

Type

journal article

Language

English

Authors

SCHARENBORG, O.
BESACIER, L.
BLACK, A.
Hasegawa-Johnson Mark
Metze Florian
NEUBIG, G.
STÜKER, S.
GODARD, P.
MÜLLER, M.
ONDEL YANG, L.
PALASKAR, S.
ARTHUR, P.
CIANNELLA, F.
DU, M.
LARSEN, E.
MERKX, D.
RIAD, R.
WANG, L.
Dupoux Emmanuel

URL

https://ieeexplore.ieee.org/document/8998182

Keywords

Speech processing, automatic speech recognition, unsupervised learning, speech synthesis, image retrieval.

Abstract

Speech technology plays an important role in our everyday life. Among others, speech is used for human-computer interaction, for instance for information retrieval and on-line shopping. In the case of an unwritten language, however, speech technology is unfortunately difficult to create, because it cannot be created by the standard combination of pre-trained speech-to-text and text-to-speech subsystems. The research presented in this article takes the first steps towards speech technology for unwritten languages. Specifically, the aim of this work was 1) to learn speech-to-meaning representations without using text as an intermediate representation, and 2) to test the sufficiency of the learned representations to regenerate speech or translated text, or to retrieve images that depict the meaning of an utterance in an unwritten language. The results suggest that building systems that go directly from speech-to-meaning and from meaning-to-speech, bypassing the need for text, is possible.

Published

2020

Pages

964–975

Journal

IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, vol. 2020, no. 28, ISSN 2329-9290

DOI

10.1109/TASLP.2020.2973896

UT WoS

000522357500002

EID Scopus

2-s2.0-85079642575

BibTeX

@article{BUT170325,
  author="SCHARENBORG, O. and BESACIER, L. and BLACK, A. and HASEGAWA-JOHNSON, M. and METZE, F. and NEUBIG, G. and STÜKER, S. and GODARD, P. and MÜLLER, M. and ONDEL YANG, L. and PALASKAR, S. and ARTHUR, P. and CIANNELLA, F. and DU, M. and LARSEN, E. and MERKX, D. and RIAD, R. and WANG, L. and DUPOUX, E.",
  title="Speech Technology for Unwritten Languages",
  journal="IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING",
  year="2020",
  volume="2020",
  number="28",
  pages="964--975",
  doi="10.1109/TASLP.2020.2973896",
  issn="2329-9290",
  url="https://ieeexplore.ieee.org/document/8998182"
}

Files

pdf scharenborg_ieee_ACM_transactions2020_08998182.pdf 2 MB