Publication Details
Brno University of Technology at MediaEval 2011 Genre Tagging Task
genre recognition, bag of words, SIFT, local features, SVM, classification, classifier fusion
This paper briefly describes our approach to the video genre tagging task, which was part of MediaEval 2011. We focused mainly on visual and audio information and exploited metadata and automatic speech transcripts only in a very basic way. Our approach relied on classification and on classifier fusion to combine the different sources of information. We did not use any training data beyond the very small example set provided by MediaEval (only 246 videos). The best performance was achieved by metadata alone. In the submitted runs, combining metadata with the other sources of information did not improve the results; such an improvement was achieved only later, by choosing more suitable fusion weights. Excluding the metadata, audio and video gave better results than speech transcripts. Projecting the data with classifiers for 345 semantic classes from the TRECVID 2011 semantic indexing (SIN) task worked better than classifying directly from video and audio features.
@inproceedings{BUT91115,
author="Michal {Hradiš} and Ivo {Řezníček} and Kamil {Behúň}",
title="Brno University of Technology at MediaEval 2011 Genre Tagging Task",
booktitle="Working Notes Proceedings of the MediaEval 2011 Workshop",
year="2011",
journal="CEUR Workshop Proceedings",
number="9",
pages="1--2",
publisher="CEUR-WS.org",
address="Pisa, Italy",
issn="1613-0073",
url="http://ceur-ws.org/Vol-807/Hradis_BUT_Genre_me11wn.pdf"
}