Publication Details

Efektivní přístup ke znalostem v audio-vizuálních záznamech

SZŐKE, I.; FAPŠO, M.; ŽIŽKA, J.; BERAN, V.; ČERNOCKÝ, J. Efektivní přístup ke znalostem v audio-vizuálních záznamech. Proceedings of the Annual Database Conference. Praha: Technická univerzita v Košiciach, 2012. s. 57-74. ISBN: 978-80-553-1049-7.
English title
Effective access for information in audio-visual recordings
Type
conference paper
Language
Czech
Authors
Igor Szőke, Michal Fapšo, Josef Žižka, Vítězslav Beran, Jan Černocký
URL
Keywords

audiovisual recording, speech-to-text, image-to-text, indexing and search, web

Abstract

The amount of audiovisual data is growing. Some of this data, such as lecture and conference recordings, contains important information. However, this information is hidden from and unreachable by standard web crawlers such as Google. This paper describes a system that makes the information available to standard text-based indexers and search engines by converting speech and video into text. The first part of the paper describes the audiovisual indexing and search system. We briefly describe the speech-to-text and slide synchronization components, followed by the indexing engine, which can index not only text but also the timing and probability of recognized speech. The second part addresses practical issues such as the user interface and customer feedback.

Published
2012
Pages
57–74
Proceedings
Proceedings of the Annual Database Conference
ISBN
978-80-553-1049-7
Publisher
Technická univerzita v Košiciach
Place
Praha
BibTeX
@inproceedings{BUT97053,
  author="Igor {Szőke} and Michal {Fapšo} and Josef {Žižka} and Vítězslav {Beran} and Jan {Černocký}",
  title="Efektivní přístup ke znalostem v audio-vizuálních záznamech",
  booktitle="Proceedings of the Annual Database Conference",
  year="2012",
  pages="57--74",
  publisher="Technická univerzita v Košiciach",
  address="Praha",
  isbn="978-80-553-1049-7",
  url="https://www.fit.vut.cz/research/publication/10172/"
}