Result Details
Automatic Language Identification using Phoneme and Automatically Derived Unit Strings
MATĚJKA, P., SZŐKE, I., SCHWARZ, P., ČERNOCKÝ, J. Automatic Language Identification using Phoneme and Automatically Derived Unit Strings. In Proceedings of 7th International Conference Text, Speech and Dialogue 2004. Brno: Springer, 2004. 8 p. ISBN: 3-540-23049-1.
Type
conference paper
Language
English
Authors
Matějka Pavel, Ing., Ph.D., DCGM (FIT), UREL (FEEC)
Szőke Igor, Ing., Ph.D., FIT (FIT), DCGM (FIT)
Schwarz Petr, Ing., Ph.D., FIT (FIT), DCGM (FIT)
Černocký Jan, prof. Dr. Ing., DCGM (FIT)
Abstract
This paper presents language identification (LID) based on phonotactic modeling.
Approaches using phoneme strings and strings of units automatically derived by an Ergodic HMM
(EHMM) are compared. The phoneme recognizers were trained on 6 languages from the OGI
multi-language corpus and Czech SpeechDat-E. The LID results are obtained on 4 languages. The
results show that the Czech phoneme recognizer performs best when used for LID, and that the EHMM-derived units show promising trends.
Keywords
language identification, phoneme recognizer, speech processing, ergodic hidden Markov model
Published
2004
Pages
8
Proceedings
Proceedings of 7th International Conference Text, Speech and Dialogue 2004
Conference
International Conference on Text Speech and Dialogue, TSD 2004
ISBN
3-540-23049-1
Publisher
Springer
Place
Brno
BibTeX
@inproceedings{BUT11955,
author="Pavel {Matějka} and Igor {Szőke} and Petr {Schwarz} and Jan {Černocký}",
title="Automatic Language Identification using Phoneme and Automatically Derived Unit Strings",
booktitle="Proceedings of 7th International Conference Text, Speech and Dialogue 2004",
year="2004",
pages="8",
publisher="Springer",
address="Brno",
isbn="3-540-23049-1"
}
Projects
Data driven and anthropic coding and recognition of speech, GACR, Postdoctoral grants, GP102/02/D108, start: 2002-09-01, end: 2005-08-30, completed
Voice technologies for support of information society, GACR, Standard projects, GA102/02/0124, start: 2002-01-01, end: 2004-12-31, completed
Departments