Publication Details
Bayesian Models for Unit Discovery on a Very Low Resource Language
ONDEL YANG, L.
GODARD, P.
BESACIER, L.
LARSEN, E.
Hasegawa-Johnson Mark
SCHARENBORG, O.
Dupoux Emmanuel
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
YVON, F.
Khudanpur Sanjeev
Acoustic Unit Discovery, Low-Resource ASR, Bayesian Model, Informative Prior.
Developing speech technologies for low-resource languages has become a very active research field over the last decade. Among others, Bayesian models have shown some promising results on artificial examples but still lack in situ experiments. Our work applies state-of-the-art Bayesian models to unsupervised Acoustic Unit Discovery (AUD) in a real low-resource language scenario. We also show that Bayesian models can naturally integrate information from other resourceful languages by means of an informative prior, leading to more consistent discovered units. Finally, discovered acoustic units are used, either as the 1-best sequence or as a lattice, to perform word segmentation. Word segmentation results show that this Bayesian approach clearly outperforms a Segmental-DTW baseline on the same corpus.
@inproceedings{BUT155041,
author="ONDEL YANG, L. and GODARD, P. and BESACIER, L. and LARSEN, E. and HASEGAWA-JOHNSON, M. and SCHARENBORG, O. and DUPOUX, E. and BURGET, L. and YVON, F. and KHUDANPUR, S.",
title="Bayesian Models for Unit Discovery on a Very Low Resource Language",
booktitle="Proceedings of ICASSP 2018",
year="2018",
pages="5939--5943",
publisher="IEEE Signal Processing Society",
address="Calgary",
doi="10.1109/ICASSP.2018.8461545",
isbn="978-1-5386-4658-8",
url="https://www.fit.vut.cz/research/publication/11719/"
}