Result Details

Fusion of heterogeneous speaker recognition systems in the STBU submission for the NIST speaker recognition evaluation 2006

BRÜMMER, N.; BURGET, L.; ČERNOCKÝ, J.; GLEMBEK, O.; GRÉZL, F.; KARAFIÁT, M.; VAN LEEUWEN, D.; MATĚJKA, P.; SCHWARZ, P.; STRASHEIM, A. Fusion of heterogeneous speaker recognition systems in the STBU submission for the NIST speaker recognition evaluation 2006. IEEE Transactions on Audio Speech and Language Processing, 2007, vol. 15, no. 7, p. 2072-2084. ISSN: 1558-7916.

Type

journal article

Language

English

Authors

Brümmer Niko
Burget Lukáš, doc. Ing., Ph.D., DCGM (FIT)
Černocký Jan, prof. Dr. Ing., DCGM (FIT)
Glembek Ondřej, Ing., Ph.D., FIT (FIT), DCGM (FIT)
Grézl František, Ing., Ph.D., FIT (FIT), DCGM (FIT)
Karafiát Martin, Ing., Ph.D., FIT (FIT), DCGM (FIT)
van Leeuwen David
Matějka Pavel, Ing., Ph.D., DCGM (FIT), UREL (FEEC)
Schwarz Petr, Ing., Ph.D., FIT (FIT), DCGM (FIT)
Strasheim Albert

Abstract

The paper describes the fusion of heterogeneous speaker recognition systems in the STBU submission for the NIST speaker recognition evaluation 2006.

Keywords

speaker recognition

URL

https://www-dev.fit.vutbr.cz/research/group/speech/public/publi/2007/brummer…

Annotation

This paper describes and discusses the "STBU" speaker recognition system, which performed well in the NIST Speaker Recognition Evaluation 2006 (SRE). STBU is a consortium of four partners: Spescom DataVoice (South Africa), TNO (The Netherlands), BUT (Czech Republic) and University of Stellenbosch (South Africa). The STBU system was a combination of three main kinds of sub-systems: (1) GMM, with short-time MFCC or PLP features, (2) GMM-SVM, using GMM mean supervectors as input to an SVM, and (3) MLLR-SVM, using MLLR speaker adaptation coefficients derived from an English LVCSR system. All sub-systems made use of supervector subspace channel compensation methods: either eigenchannel adaptation or nuisance attribute projection. We document the design and performance of all sub-systems, as well as their fusion and calibration via logistic regression. Finally, we also present a cross-site fusion that was done with several additional systems from other NIST SRE-2006 participants.
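As a rough illustration of what fusing sub-system scores via logistic regression can look like, the following is a minimal sketch, not the STBU implementation. It assumes each trial carries one score per sub-system, learns one weight per sub-system plus a bias by plain gradient descent on the unweighted logistic loss, and uses synthetic Gaussian scores. The function names (`train_fusion`, `fuse`, `make_toy_trials`), the learning rate and the toy data are all hypothetical; the paper's actual objective, priors and training data are not reproduced here.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def fuse(weights, bias, scores):
    # Fused score: affine combination of the per-system scores.
    return bias + sum(w * s for w, s in zip(weights, scores))


def train_fusion(trials, labels, lr=0.1, epochs=500):
    # trials: list of per-trial score lists (one score per sub-system).
    # labels: 1 for target trials, 0 for non-target trials.
    # Assumption: plain batch gradient descent on the mean logistic loss.
    n_sys = len(trials[0])
    n = len(trials)
    weights = [0.0] * n_sys
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_sys
        grad_b = 0.0
        for scores, y in zip(trials, labels):
            err = sigmoid(fuse(weights, bias, scores)) - y
            for k in range(n_sys):
                grad_w[k] += err * scores[k]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def make_toy_trials(n_per_class=200, n_sys=3, seed=0):
    # Hypothetical data: target scores ~ N(+1, 1), non-target ~ N(-1, 1).
    rng = random.Random(seed)
    trials, labels = [], []
    for label, mean in ((1, 1.0), (0, -1.0)):
        for _ in range(n_per_class):
            trials.append([rng.gauss(mean, 1.0) for _ in range(n_sys)])
            labels.append(label)
    return trials, labels


if __name__ == "__main__":
    trials, labels = make_toy_trials()
    weights, bias = train_fusion(trials, labels)
    print("weights:", weights, "bias:", bias)
```

The fused output of this sketch is a log-odds score relative to the class proportions of the training set; turning it into a calibrated log-likelihood ratio would require accounting for the target prior, which is omitted here.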

Published

2007

Pages

2072–2084

Journal

IEEE Transactions on Audio Speech and Language Processing, vol. 15, no. 7, ISSN 1558-7916

BibTeX

@article{BUT45307,
  author="Niko {Brümmer} and Lukáš {Burget} and Jan {Černocký} and Ondřej {Glembek} and František {Grézl} and Martin {Karafiát} and David {van Leeuwen} and Pavel {Matějka} and Petr {Schwarz} and Albert {Strasheim}",
  title="Fusion of heterogeneous speaker recognition systems in the STBU submission for the NIST speaker recognition evaluation 2006",
  journal="IEEE Transactions on Audio Speech and Language Processing",
  year="2007",
  volume="15",
  number="7",
  pages="2072--2084",
  issn="1558-7916",
  url="http://www.fit.vutbr.cz/research/groups/speech/publi/2007/brummer_stbu_t-asl_2007.pdf"
}

Projects

CARETAKER - Content Analysis and REtrieval Technologies to Apply Knowledge Extraction to massive Recording, EU, Sixth Framework programme, 027231, start: 2006-03-01, end: 2008-09-30, completed
Interactive Keyword Detector, GACR, Postdoktorandské granty, GP102/06/P383, start: 2006-01-01, end: 2008-12-31, completed
New trends in research and application of voice technology, GACR, Standardní projekty, GA102/05/0278, start: 2005-01-01, end: 2007-12-31, completed
Security-Oriented Research in Information Technology, MŠMT, Institucionální prostředky SR ČR (např. VZ, VC), MSM0021630528, start: 2007-01-01, end: 2013-12-31, running

Research groups

Speech Data Mining Research Group BUT Speech@FIT (RG SPEECH)

Departments

Department of Computer Graphics and Multimedia (DCGM)
Department of Radio Electronics (UREL)