Publication Details
Domain Adaptation Via Within-class Covariance Correction in I-Vector Based Speaker Recognition Systems
Ma Jeff
Matějka Pavel, Ing., Ph.D. (DCGM)
Zhang Bing
Plchot Oldřich, Ing., Ph.D. (DCGM)
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
Matsoukas Spyros (FIT)
speaker recognition, i-vectors, source normalization, LDA, inter-dataset variability compensation
In this paper, we presented a technique of within-class covariance correction for Linear Discriminant Analysis (LDA) estimation. We showed that, when a correct dataset clustering is used, adapting the within-class covariance of LDA with a low-rank between-dataset covariance matrix can lead to a significant improvement of the system: up to 70% relative in the Domain Adaptation Task, and 17.5% and 36% relative in the RATS unmatched and semi-matched tasks, respectively. The dataset clustering problem gave us an interesting direction for future research.
In this paper, we propose a technique of Within-Class Covariance Correction (WCC) for Linear Discriminant Analysis (LDA) in speaker recognition. It performs an unsupervised adaptation of LDA to an unseen data domain and/or compensates for speaker population differences among different portions of the LDA training dataset. The paper follows on from the study of source-normalization and inter-database variability compensation techniques, which deal with the multimodal distribution of i-vectors. On the DARPA RATS (Robust Automatic Transcription of Speech) task, we show that, with two hours of unsupervised data, we improve the Equal-Error Rate (EER) by 17.5% and 36% relative on the unmatched and semi-matched conditions, respectively. On the Domain Adaptation Challenge, we show up to 70% relative EER reduction, and we propose a data clustering procedure to identify the directions of the domain-based variability in the adaptation data.
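The core idea in the abstract, adding a low-rank between-dataset covariance to the LDA within-class covariance, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the scaling factor `alpha`, the equal weighting of dataset means, and the exact form S_w + alpha * S_bd are all assumptions made here for illustration.

```python
from collections import defaultdict


def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def outer(u, v):
    """Outer product of two vectors as a nested list."""
    return [[a * b for b in v] for a in u]


def add(A, B, scale=1.0):
    """Return A + scale * B for two matrices of the same shape."""
    return [[a + scale * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scatter(vectors, center):
    """Average outer product of (x - center) over the given vectors."""
    dim = len(center)
    S = [[0.0] * dim for _ in range(dim)]
    for x in vectors:
        d = [xi - ci for xi, ci in zip(x, center)]
        S = add(S, outer(d, d), 1.0 / len(vectors))
    return S


def within_class_cov(ivectors, speakers):
    """Scatter of i-vectors around their own speaker means."""
    by_spk = defaultdict(list)
    for x, s in zip(ivectors, speakers):
        by_spk[s].append(x)
    dim = len(ivectors[0])
    S_w = [[0.0] * dim for _ in range(dim)]
    for xs in by_spk.values():
        # Weight each speaker by its share of the training data.
        S_w = add(S_w, scatter(xs, mean(xs)), len(xs) / len(ivectors))
    return S_w


def between_dataset_cov(ivectors, datasets):
    """Scatter of dataset means around their average (assumed equal weights).

    Its rank is at most (number of datasets - 1), hence "low-rank".
    """
    by_ds = defaultdict(list)
    for x, d in zip(ivectors, datasets):
        by_ds[d].append(x)
    ds_means = [mean(xs) for xs in by_ds.values()]
    return scatter(ds_means, mean(ds_means))


def corrected_within_class_cov(ivectors, speakers, datasets, alpha=1.0):
    """Assumed correction: S_w + alpha * S_bd (alpha is a hypothetical knob)."""
    S_w = within_class_cov(ivectors, speakers)
    S_bd = between_dataset_cov(ivectors, datasets)
    return add(S_w, S_bd, alpha)


# Toy 2-D "i-vectors": the two datasets differ by a shift along the first axis.
ivectors = [[0, 0], [0, 2], [4, 0], [4, 2]]
speakers = ["a", "a", "b", "b"]
datasets = ["d1", "d1", "d2", "d2"]
S_corr = corrected_within_class_cov(ivectors, speakers, datasets, alpha=0.5)
```

In a full system, the corrected matrix would take the place of the ordinary within-class covariance when solving the LDA generalized eigenvalue problem, so that directions dominated by dataset shift are treated as nuisance variability. In the toy data, the within-class covariance has no variance along the first axis, and the correction adds it there.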
@inproceedings{BUT111543,
author="Ondřej {Glembek} and Jeff {Ma} and Pavel {Matějka} and Bing {Zhang} and Oldřich {Plchot} and Lukáš {Burget} and Spyros {Matsoukas}",
title="Domain Adaptation Via Within-class Covariance Correction in I-Vector Based Speaker Recognition Systems",
booktitle="Proceedings of ICASSP 2014",
year="2014",
pages="4060--4064",
publisher="IEEE Signal Processing Society",
address="Florence",
doi="10.1109/ICASSP.2014.6854359",
isbn="978-1-4799-2892-7",
url="https://www.fit.vut.cz/research/publication/10555/"
}