Publication Details

Sequence Summarizing Neural Networks for Spoken Language Recognition

PEŠÁN, J.; BURGET, L.; ČERNOCKÝ, J. Sequence Summarizing Neural Networks for Spoken Language Recognition. In Proceedings of Interspeech 2016. San Francisco: International Speech Communication Association, 2016. p. 3285-3289. ISBN: 978-1-5108-3313-5.

Czech title

Sekvenční sumarizační neuronové sítě pro rozpoznávání mluveného jazyka

Type

conference paper

Language

English

Authors

Pešán Jan, Ing. (DCGM)
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
Černocký Jan, prof. Dr. Ing. (DCGM)

Keywords

Sequence Summarizing Neural Network, DNN, i-vectors

Abstract

This paper explores the use of Sequence Summarizing Neural Networks (SSNNs) as a variant of deep neural networks (DNNs) for classifying sequences. In this work, they are applied to the task of spoken language recognition. Unlike other classification tasks in speech processing, where the DNN needs to produce a per-frame output, language is considered constant during an utterance. We introduce a summarization component into the DNN structure, producing one set of language posteriors per utterance. The training of the DNN is performed by an appropriately modified gradient-descent algorithm. In our initial experiments, the SSNN results are compared to a single state-of-the-art i-vector based baseline system with a similar complexity (i.e., no system fusion, etc.). For some conditions, SSNNs are able to provide performance comparable to the baseline system. A relative improvement of up to 30% is obtained with the score-level fusion of the baseline and the SSNN systems.
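The idea of a summarization component that turns per-frame activations into one set of language posteriors per utterance can be illustrated with a minimal sketch. The abstract does not say how the summarization is computed, so averaging the frame-level hidden vectors over time is an assumption here, as are the layer sizes, the tanh nonlinearity, the random weights, and every function name (`affine`, `softmax`, `utterance_posteriors`, `random_matrix`). Training, which the paper performs with a modified gradient-descent algorithm, is not shown.

```python
import math
import random


def affine(x, W, b):
    """Return W @ x + b for a vector x, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + bias
            for row, bias in zip(W, b)]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def utterance_posteriors(frames, W1, b1, W2, b2):
    """Map a variable-length list of frames to one posterior vector."""
    # Frame-level part: the same layer is applied to every frame independently.
    hidden = [[math.tanh(v) for v in affine(f, W1, b1)] for f in frames]
    # Summarization (ASSUMED to be a mean over time): T vectors become one.
    n_frames = len(hidden)
    summary = [sum(h[k] for h in hidden) / n_frames
               for k in range(len(hidden[0]))]
    # Utterance-level part: one set of language posteriors per utterance.
    return softmax(affine(summary, W2, b2))


def random_matrix(rows, cols, rng):
    """Hypothetical initializer: small uniform random weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
feat_dim, hidden_dim, n_langs = 4, 8, 3  # illustrative sizes only
W1, b1 = random_matrix(hidden_dim, feat_dim, rng), [0.0] * hidden_dim
W2, b2 = random_matrix(n_langs, hidden_dim, rng), [0.0] * n_langs

# Two utterances of different length yield outputs of the same shape.
short_utt = [[rng.gauss(0, 1) for _ in range(feat_dim)] for _ in range(5)]
long_utt = [[rng.gauss(0, 1) for _ in range(feat_dim)] for _ in range(50)]
p_short = utterance_posteriors(short_utt, W1, b1, W2, b2)
p_long = utterance_posteriors(long_utt, W1, b1, W2, b2)
print(len(p_short), len(p_long))
```

With mean pooling, the output size depends only on the number of languages, not on the utterance length, and the result does not depend on the order of the frames. Whether the paper's summarization shares these properties is not stated in the abstract.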

Annotation

This paper deals with sequence summarizing neural networks for spoken language recognition.

Published

2016

Pages

3285–3289

Proceedings

Proceedings of Interspeech 2016

Conference

Interspeech Conference, San Francisco, US

ISBN

978-1-5108-3313-5

Publisher

International Speech Communication Association

Place

San Francisco

DOI

10.21437/Interspeech.2016-764

UT WoS

000409394402038

EID Scopus

2-s2.0-84994361899

BibTeX

@inproceedings{BUT131019,
  author="Jan {Pešán} and Lukáš {Burget} and Jan {Černocký}",
  title="Sequence Summarizing Neural Networks for Spoken Language Recognition",
  booktitle="Proceedings of Interspeech 2016",
  year="2016",
  pages="3285--3289",
  publisher="International Speech Communication Association",
  address="San Francisco",
  doi="10.21437/Interspeech.2016-764",
  isbn="978-1-5108-3313-5",
  url="https://www.researchgate.net/publication/307889421_Sequence_Summarizing_Neural_Networks_for_Spoken_Language_Recognition"
}

Files

pdf pesan_interspeech2016_IS160764.pdf 234 kB