Publication Details
Unsupervised Word Segmentation from Speech with Attention
GODARD, P.
BOITO, M.
ONDEL YANG, L.
BERARD, A.
YVON, F.
VILLAVICENCIO, A.
BESACIER, L.
computational language documentation, encoder-decoder models, attentional models, unsupervised word segmentation.
We present a first attempt to perform attentional word segmentation directly from the speech signal, with the final goal to automatically identify lexical units in a low-resource, unwritten language (UL). Our methodology assumes a pairing between recordings in the UL with translations in a well-resourced language. It uses Acoustic Unit Discovery (AUD) to convert speech into a sequence of pseudo-phones that is segmented using neural soft-alignments produced by a neural machine translation model. Evaluation uses an actual Bantu UL, Mboshi; comparisons to monolingual and bilingual baselines illustrate the potential of attentional word segmentation for language documentation.
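The segmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one plausible rule: each pseudo-phone is assigned to the translation word that receives its highest attention weight, and a boundary is placed wherever that word changes. The function name `segment_by_attention`, the unit labels, and the attention values are all hypothetical.

```python
from typing import List, Sequence


def segment_by_attention(
    units: Sequence[str],
    attention: Sequence[Sequence[float]],
) -> List[List[str]]:
    """Group pseudo-phones into word-like segments.

    attention[i][j] is the (assumed) soft-alignment weight between
    pseudo-phone i and translation word j.
    """
    if len(units) != len(attention):
        raise ValueError("need exactly one attention row per pseudo-phone")

    segments: List[List[str]] = []
    previous_word = None
    for unit, row in zip(units, attention):
        # Hard-assign the unit to its most attended translation word.
        best_word = max(range(len(row)), key=row.__getitem__)
        # A change of aligned word opens a new segment (assumed rule).
        if best_word != previous_word:
            segments.append([])
            previous_word = best_word
        segments[-1].append(unit)
    return segments


# Toy input: five made-up pseudo-phones, three translation words.
toy_units = ["a12", "a7", "a3", "a3", "a9"]
toy_attention = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.7, 0.2],
    [0.1, 0.2, 0.7],
]
print(segment_by_attention(toy_units, toy_attention))
# prints [['a12', 'a7'], ['a3', 'a3'], ['a9']]
```

In this sketch the soft alignments are reduced to hard assignments before segmenting; a real system would take the attention matrix from a trained translation model rather than from hand-written values.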
@inproceedings{BUT163406,
author="GODARD, P. and BOITO, M. and ONDEL YANG, L. and BERARD, A. and YVON, F. and VILLAVICENCIO, A. and BESACIER, L.",
title="Unsupervised Word Segmentation from Speech with Attention",
booktitle="Proceedings of Interspeech 2018",
year="2018",
journal="Proceedings of Interspeech",
volume="2018",
number="9",
pages="2678--2682",
publisher="International Speech Communication Association",
address="Hyderabad",
doi="10.21437/Interspeech.2018-1308",
issn="1990-9772",
url="https://www.isca-speech.org/archive/Interspeech_2018/pdfs/1308.pdf"
}