Publication Details
VISUAL FEATURES FOR MULTIMODAL SPEECH RECOGNITION
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
Černocký Jan, prof. Dr. Ing. (DCGM)
speech recognition, feature extraction, parameterization, visual features, linear transforms, meeting data
This paper demonstrates the use of visual parameters extracted from video for automatic recognition of phoneme strings. Encouraged by previous work on "visually clean" data, we investigate the efficiency of these parameters under the non-ideal conditions of the meeting audio-visual data used in our experiments.
@inproceedings{BUT21499,
author="Petr {Motlíček} and Lukáš {Burget} and Jan {Černocký}",
title="VISUAL FEATURES FOR MULTIMODAL SPEECH RECOGNITION",
booktitle="Radioelektronika 2005",
year="2005",
pages="187--190",
publisher="Faculty of Electrical Engineering and Communication BUT",
address="Brno",
isbn="80-214-2904-6",
url="http://www.fit.vutbr.cz/~motlicek/publi/2005/radioel05.pdf, http://wes.feec.vutbr.cz/UREL/"
}