Publication Details

Developing A Speaker Identification System For The DARPA RATS Project

PLCHOT, O.; MATSOUKAS, S.; MATĚJKA, P.; DEHAK, N.; MA, J.; CUMANI, S.; GLEMBEK, O.; HEŘMANSKÝ, H.; MESGARANI, N.; SOUFIFAR, M.; THOMAS, S.; ZHANG, B.; ZHOU, X. Developing A Speaker Identification System For The DARPA RATS Project. Proceedings of ICASSP 2013. Vancouver: IEEE Signal Processing Society, 2013. p. 6768-6772. ISBN: 978-1-4799-0355-9.

Czech title

Vývoj systému identifikace řečníka pro DARPA RATS projekt

Type

conference paper

Language

English

Authors

Plchot Oldřich, Ing., Ph.D. (DCGM)
Matsoukas Spyros
Matějka Pavel, Ing., Ph.D.
Dehak Najim
Ma Jeff
Cumani Sandro, Ph.D.
Glembek Ondřej, Ing., Ph.D.
Heřmanský Hynek, prof. Ing., Dr. Eng. (DCGM)
Mesgarani Nima
Soufifar Mehdi Mohammad, Ing.
Thomas Samuel
Zhang Bing
Zhou Xinhui
and others

URL

http://www.fit.vutbr.cz/research/groups/speech/publi/2013/plchot_icassp2013_0006768.pdf

Keywords

speaker identification, noisy speech processing

Abstract

This paper focuses on the development of a speaker identification system for the DARPA RATS Project.

Annotation

This paper describes the speaker identification (SID) system developed by the Patrol team for the first phase of the DARPA RATS (Robust Automatic Transcription of Speech) program, which seeks to advance state-of-the-art detection capabilities on audio from highly degraded communication channels. We present results using multiple SID systems differing mainly in the algorithm used for voice activity detection (VAD) and feature extraction. We show that (a) unsupervised VAD performs as well as supervised methods in terms of downstream SID performance, (b) noise-robust feature extraction methods such as CFCCs outperform MFCC front-ends on noisy audio, and (c) fusion of multiple systems provides a 24% relative improvement in EER compared to the single best system when using a novel SVM-based fusion algorithm that uses side information such as gender, language, and channel ID.
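
For readers unfamiliar with the metric, the equal error rate (EER) and a relative improvement figure can be computed roughly as in the sketch below. This is not the authors' scoring tool: the function names, the simple threshold sweep, and all score and EER values are hypothetical and chosen only for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    A trial is accepted when its score is >= the threshold. The EER is taken
    at the threshold where the miss rate and false-alarm rate are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # Miss rate: target trials rejected at this threshold.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False-alarm rate: non-target trials accepted at this threshold.
        fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - fa)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (miss + fa) / 2
    return eer


def relative_improvement(baseline_eer, new_eer):
    """Fraction by which new_eer is lower than baseline_eer."""
    return (baseline_eer - new_eer) / baseline_eer


# Hypothetical toy scores, not taken from the paper.
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(targets, nontargets))

# Hypothetical EER values: a drop from 10.0% to 7.6% is a 24% relative gain.
print(round(relative_improvement(0.100, 0.076), 2))
```

A relative improvement is a ratio of error rates, so a 24% relative gain means the fused system's EER is 0.76 times that of the single best system. It does not mean a drop of 24 percentage points.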

Published

2013

Pages

6768–6772

Proceedings

Proceedings of ICASSP 2013

Conference

38th International Conference on Acoustics, Speech, and Signal Processing, Vancouver, CA

ISBN

978-1-4799-0355-9

Publisher

IEEE Signal Processing Society

Place

Vancouver

BibTeX

@inproceedings{BUT103484,
  author="Oldřich {Plchot} and Spyros {Matsoukas} and Pavel {Matějka} and Najim {Dehak} and Jeff {Ma} and Sandro {Cumani} and Ondřej {Glembek} and Hynek {Heřmanský} and Nima {Mesgarani} and Mehdi Mohammad {Soufifar} and Samuel {Thomas} and Bing {Zhang} and Xinhui {Zhou}",
  title="Developing A Speaker Identification System For The DARPA RATS Project",
  booktitle="Proceedings of ICASSP 2013",
  year="2013",
  pages="6768--6772",
  publisher="IEEE Signal Processing Society",
  address="Vancouver",
  isbn="978-1-4799-0355-9",
  url="http://www.fit.vutbr.cz/research/groups/speech/publi/2013/plchot_icassp2013_0006768.pdf"
}