Publication Details

Regularized Subspace n-Gram Model for Phonotactic iVector Extraction

SOUFIFAR, M.; BURGET, L.; PLCHOT, O.; CUMANI, S.; ČERNOCKÝ, J. Regularized Subspace n-Gram Model for Phonotactic iVector Extraction. Proceedings of Interspeech 2013. Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013). Lyon: International Speech Communication Association, 2013. p. 74-78. ISBN: 978-1-62993-443-3. ISSN: 2308-457X.
Czech title
Regularizovaný podprostorový n-gramový model pro extrakci fonotaktických iVektorů
Type
conference paper
Language
English
Authors
Soufifar Mehdi Mohammad, Ing.
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
Plchot Oldřich, Ing., Ph.D. (DCGM)
Cumani Sandro, Ph.D.
Černocký Jan, prof. Dr. Ing. (DCGM)
URL
Keywords

Language identification, Subspace modeling, Subspace multinomial model

Abstract

This article describes an enhanced phonotactic iVector extraction model over the n-gram counts. In the first step, a subspace n-gram model is proposed to model conditional n-gram probabilities. Modeling different 3-gram histories with separate multinomial distributions shows promising results for the long-duration condition; however, we observed model over-fitting for the short-duration conditions.

Annotation

Phonotactic language identification (LID) by means of n-gram statistics and discriminative classifiers is a popular approach to the LID problem. A low-dimensional representation of the n-gram statistics allows more diverse and efficient machine learning techniques to be used in LID. Recently, we proposed the phonotactic iVector as a low-dimensional representation of the n-gram statistics. In this work, an enhanced modeling of the n-gram probabilities along with regularized parameter estimation is proposed. The proposed model consistently improves LID system performance over all conditions, by up to 15% relative to the previous state-of-the-art system. The new model also reduces the memory requirements of iVector extraction and helps to speed up subspace training. Results are presented in terms of Cavg on the NIST LRE2009 evaluation set.
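
The idea summarized in the abstract and annotation above, one multinomial distribution per n-gram history with parameters tied through a low-dimensional utterance vector, can be illustrated with a minimal sketch. This is not the authors' implementation: the softmax parameterization, the names `history_softmax`, `log_likelihood`, `regularized_objective`, and the L2 penalty weight `lam` are all assumptions made for illustration only.

```python
import math


def history_softmax(m_h, T_h, w):
    """Conditional distribution over next phonemes for ONE n-gram history.

    m_h: bias log-weights, one per phoneme (assumed parameterization)
    T_h: one row per phoneme, each row as long as w (assumed subspace matrix)
    w:   low-dimensional utterance vector (playing the role of the iVector)
    """
    logits = [m + sum(t * x for t, x in zip(row, w)) for m, row in zip(m_h, T_h)]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def log_likelihood(counts, m, T, w):
    """Sum of count-weighted log-probabilities over all histories.

    counts[h][e] is how often phoneme index e followed history h.
    Each history h has its own multinomial, as described in the abstract.
    """
    ll = 0.0
    for h, row in counts.items():
        probs = history_softmax(m[h], T[h], w)
        ll += sum(c * math.log(p) for c, p in zip(row, probs) if c > 0)
    return ll


def regularized_objective(counts, m, T, w, lam=0.1):
    """Log-likelihood minus an L2 penalty on w (hypothetical regularizer)."""
    return log_likelihood(counts, m, T, w) - 0.5 * lam * sum(x * x for x in w)


# Toy data: a single 3-gram history ("a", "b") and two possible next phonemes.
counts = {("a", "b"): [3, 1]}
m = {("a", "b"): [0.0, 0.0]}
T = {("a", "b"): [[1.0], [-1.0]]}

uniform = history_softmax(m[("a", "b")], T[("a", "b")], [0.0])
shifted = history_softmax(m[("a", "b")], T[("a", "b")], [1.0])
```

In this toy setting, a zero vector `w` leaves the distribution at its bias (uniform here), while a non-zero `w` shifts probability mass toward the first phoneme. The penalty term discourages large `w` when few counts are available, which is the kind of over-fitting on short utterances that the abstract mentions.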

Published
2013
Pages
74–78
Journal
Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013), no. 8, ISSN 2308-457X
Proceedings
Proceedings of Interspeech 2013
ISBN
978-1-62993-443-3
Publisher
International Speech Communication Association
Place
Lyon
BibTeX
@inproceedings{BUT103567,
  author="Mehdi Mohammad {Soufifar} and Lukáš {Burget} and Oldřich {Plchot} and Sandro {Cumani} and Jan {Černocký}",
  title="Regularized Subspace n-Gram Model for Phonotactic iVector Extraction",
  booktitle="Proceedings of Interspeech 2013",
  year="2013",
  journal="Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013).",
  number="8",
  pages="74--78",
  publisher="International Speech Communication Association",
  address="Lyon",
  isbn="978-1-62993-443-3",
  issn="2308-457X",
  url="http://www.fit.vutbr.cz/research/groups/speech/publi/2013/soufifar_interspeech2013_IS131171.pdf"
}