Publication Details

Employment of Subspace Gaussian Mixture Models in Speaker Recognition

MOTLÍČEK, P.; DEY, S.; MADIKERI, S.; BURGET, L. Employment of Subspace Gaussian Mixture Models in Speaker Recognition. In Proceedings of 2015 IEEE International Conference on Acoustics, Speech and Signal Processing. South Brisbane, Queensland: IEEE Signal Processing Society, 2015. p. 4445-4449. ISBN: 978-1-4673-6997-8.
Czech title
Využití podprostorových modelů Gaussovských směsí pro rozpoznávání mluvčího
Type
conference paper
Language
English
Authors
Petr Motlíček, Subhadeep Dey, Srikanth Madikeri, Lukáš Burget
URL
https://ieeexplore.ieee.org/document/7178811
Keywords

speaker recognition, i-vectors, subspace Gaussian mixture models, automatic speech recognition

Abstract

This paper presents a Subspace Gaussian Mixture Model (SGMM) approach, employed as a probabilistic generative model to estimate speaker vector representations that are subsequently used in the speaker verification task. SGMMs have already been shown to significantly outperform traditional HMM/GMMs in Automatic Speech Recognition (ASR) applications. An extension to the basic SGMM framework allows low-dimensional speaker vectors to be robustly estimated and exploited for speaker adaptation. We propose a speaker verification framework based on low-dimensional speaker vectors estimated using SGMMs, trained in an ASR manner using manual transcriptions. To test the robustness of the system, we evaluate the proposed approach against a state-of-the-art i-vector extractor on the NIST SRE 2010 evaluation set under four utterance-length conditions: 3-10 sec, 10-30 sec, 30-60 sec, and full (untruncated) utterances. Experimental results reveal that while the i-vector system performs better on utterances truncated to 3-10 sec and 10-30 sec, noticeable improvements are observed with SGMMs, especially on full-length utterances. Finally, the proposed SGMM approach exhibits complementary properties and can thus be efficiently fused with an i-vector based speaker verification system.

Published
2015
Pages
4445–4449
Proceedings
Proceedings of 2015 IEEE International Conference on Acoustics, Speech and Signal Processing
ISBN
978-1-4673-6997-8
Publisher
IEEE Signal Processing Society
Place
South Brisbane, Queensland
DOI
10.1109/ICASSP.2015.7178811
UT WoS
000427402904111
EID Scopus
BibTeX
@inproceedings{BUT119895,
  author="Petr {Motlíček} and Subhadeep {Dey} and Srikanth {Madikeri} and Lukáš {Burget}",
  title="Employment of Subspace Gaussian Mixture Models in Speaker Recognition",
  booktitle="Proceedings of 2015 IEEE International Conference on Acoustics, Speech and Signal Processing",
  year="2015",
  pages="4445--4449",
  publisher="IEEE Signal Processing Society",
  address="South Brisbane, Queensland",
  doi="10.1109/ICASSP.2015.7178811",
  isbn="978-1-4673-6997-8",
  url="https://ieeexplore.ieee.org/document/7178811"
}