Publication Details

Temporal processing for feature extraction in speech recognition, habilitation thesis

ČERNOCKÝ, J. Temporal processing for feature extraction in speech recognition, habilitation thesis. Brno: 2002. 80 p.
Czech title
Časové zpracování pro výpočet příznaků v rozpoznávání řeči, habilitační práce
Type
habilitation thesis
Language
English
Authors
Jan Černocký
URL
http://www.fit.vutbr.cz/~cernocky/publi/2002/habil.pdf
Keywords

speech recognition, feature extraction

Abstract

Speech recognition is a booming research field with a large number of
applications in telecommunications (especially mobile), the automotive
industry, consumer electronics, military and security, etc. Speech
recognition systems are classically built from three basic blocks:
feature extraction, acoustic matching and language modeling. While the
last two are trained on data (annotated databases for acoustics and
large speech corpora for the LM), the feature extraction block is often
neglected, and in most cases mel-frequency cepstral coefficients (MFCC)
are used. This work concentrates on two techniques that should improve the
feature extraction. The first is temporal filtering of feature
trajectories with filters derived from data by Linear Discriminant
Analysis (LDA). This technique is shown to improve the recognition
accuracy of isolated Czech words, confirming previous results on
US-English obtained by our colleagues from OGI Portland. The second part
of the work concentrates on a more revolutionary approach to feature
extraction using TRAPs (temporal patterns), whose foundations were also
laid at OGI. Several experiments were conducted on three databases
during the author's stay at OGI. Although we have shown that TRAPs are
comparable to MFCCs only on a small-vocabulary recognition task, we
believe that the combination of frequency-band processing and neural nets
will become very important in the next decade, and that these will become
standard blocks of feature extraction. A concluding chapter is included
for each method, giving directions of current and future work at both
OGI Portland and VUT Brno.

Published
2002
Pages
80
Place
Brno
BibTeX
@misc{BUT67489,
  author="Jan {Černocký}",
  title="Temporal processing for feature extraction in speech recognition, habilitation thesis",
  year="2002",
  pages="80",
  address="Brno",
  url="http://www.fit.vutbr.cz/~cernocky/publi/2002/habil.pdf",
  note="habilitation thesis"
}