Publication Details

BAT System Description for NIST LRE 2015

PLCHOT, O.; MATĚJKA, P.; FÉR, R.; GLEMBEK, O.; NOVOTNÝ, O.; PEŠÁN, J.; VESELÝ, K.; ONDEL YANG, L.; KARAFIÁT, M.; GRÉZL, F.; KESIRAJU, S.; BURGET, L.; BRUMMER, J.; SWART, A.; CUMANI, S.; MALLIDI, S.; LI, R. BAT System Description for NIST LRE 2015. In Proceedings of Odyssey 2016, The Speaker and Language Recognition Workshop. Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland. Bilbao: International Speech Communication Association, 2016. p. 166-173. ISSN: 2312-2846.

Czech title

Popis BAT systému pro NIST LRE 2015 evaluace

Type

conference paper

Language

English

Authors

Plchot Oldřich, Ing., Ph.D. (DCGM)
Matějka Pavel, Ing., Ph.D. (DCGM)
Fér Radek, Ing.
Glembek Ondřej, Ing., Ph.D.
Novotný Ondřej, Ing., Ph.D.
Pešán Jan, Ing. (DCGM)
Veselý Karel, Ing., Ph.D. (DCGM)
Ondel Lucas Antoine Francois, Mgr., Ph.D. (SSDIT)
Karafiát Martin, Ing., Ph.D. (DCGM)
Grézl František, Ing., Ph.D. (DCGM)
Kesiraju Santosh, Ph.D. (DCGM)
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
Brummer Johan Nikolaas Langenhoven, Dr.
Swart Albert du Preez
Cumani Sandro, Ph.D.
Mallidi Sri Harish (FIT)
Li Ruizhi

URL

http://www.odyssey2016.org/papers/pdfs_stamped/73.pdf

Keywords

BAT System Description, NIST LRE

Abstract

In this paper, we summarize our efforts in the NIST Language Recognition Evaluation (LRE) 2015, which resulted in systems providing very competitive performance. We provide both descriptions and an analysis of the systems that we included in our submission. We start with a detailed description of the datasets that we used for training and development, and we follow by describing the models and methods that were used to produce the final scores. These include the front-end (i.e., the voice activity detection and feature extraction), the back-end (i.e., the final classifier), and the calibration and fusion stages. Apart from the techniques commonly used in the field (such as i-vectors, DNN bottleneck features, NN classifiers, Gaussian back-ends, etc.), we present less common methods, such as Sequence Summarizing Neural Networks (SSNN) and Automatic Unit Discovery. We present the performance of the systems on both the Fixed condition (where participants are required to use predefined data sets only) and the Open condition (where participants are allowed to use any publicly available resource) of the NIST LRE 2015.

Annotation

In this work, we have described our efforts in the NIST LRE 2015. The most difficult part of this evaluation was dealing with the limited amount of data, and the results show that a proper analysis in this direction is necessary. We built over 20 systems for this evaluation. We experimented with a de-noising NN, automatic unit discovery, different flavors of phonotactic systems, back-ends, sizes of i-vector systems, feature sets, bottleneck (BN) features, and frame-level language classifiers. We used up to 6 systems in the fusion. Our best system reached a Cavg of 16.9% on the fixed training data condition and 13.9% (11.9% after post-evaluation analysis) on the open training data condition.

Published

2016

Pages

166–173

Journal

Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland, vol. 2016, no. 06, ISSN 2312-2846

Proceedings

Proceedings of Odyssey 2016, The Speaker and Language Recognition Workshop

Publisher

International Speech Communication Association

Place

Bilbao

DOI

10.21437/Odyssey.2016-24

EID Scopus

2-s2.0-85073213346

BibTeX

@inproceedings{BUT131004,
  author="Oldřich {Plchot} and Pavel {Matějka} and Radek {Fér} and Ondřej {Glembek} and Ondřej {Novotný} and Jan {Pešán} and Karel {Veselý} and Lucas Antoine Francois {Ondel} and Martin {Karafiát} and František {Grézl} and Santosh {Kesiraju} and Lukáš {Burget} and Johan Nikolaas Langenhoven {Brummer} and Albert du Preez {Swart} and Sandro {Cumani} and Sri Harish {Mallidi} and Ruizhi {Li}",
  title="BAT System Description for NIST LRE 2015",
  booktitle="Proceedings of Odyssey 2016, The Speaker and Language Recognition Workshop",
  year="2016",
  journal="Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland",
  volume="2016",
  number="06",
  pages="166--173",
  publisher="International Speech Communication Association",
  address="Bilbao",
  doi="10.21437/Odyssey.2016-24",
  issn="2312-2846",
  url="http://www.odyssey2016.org/papers/pdfs_stamped/73.pdf"
}