DIRAC - Detection and Identification of Rare Audio-visual Cues

Czech title

Detekce a identifikace řídkých audiovizuálních podnětů - DIRAC

Type

grant

Keywords

audio, video, detection

Abstract

Today's computers can do many amazing things, but there are still many "trivial"
but important tasks they cannot do well. In particular, current information
extraction techniques perform well when event types are well represented in the
training data but often fail when they encounter information-rich, unexpected,
rare events. The DIRAC project addresses this crucial machine weakness and aims to
design and develop an environment-adaptive, autonomous artificial cognitive
system that will detect, identify, and classify possibly threatening rare events
from the information derived by multiple active, information-seeking audio-visual
sensors.

Biological organisms rely for their survival on detecting and identifying new
events. DIRAC therefore strives to combine its expertise in the physiology of the
mammalian auditory and visual cortex and in audio/visual recognition engineering
with the aim of moving the art of audio-visual machine recognition from the
classical signal processing/pattern classification paradigm to human-like
information extraction. This means, among other things, moving from
interpretation of all incoming data to reliable rejection of non-informative
inputs, from passive acquisition of a single incoming stream to active search for
the most relevant information in multiple streams, and from a system optimized
for one static environment to autonomous adaptation to new, changing environments,
thus forming a foundation for a new generation of efficient cognitive information
processing technologies.

DIRAC is an Integrated Project (IP) of the EU's Information Society Technologies
(IST) programme within the 6th Framework Programme. It runs for five years, from
January 2006 until December 2010.

The project partners come from Europe, Israel, and the USA:
Idiap Research Institute (coordinator), Eidgenössische Technische Hochschule
Zürich (CH), The Hebrew University of Jerusalem
(http://www.cs.huji.ac.il/labs/vision/) (IL), Czech Technical University
(http://cmp.felk.cvut.cz/) (CZ), Carl von Ossietzky Universität Oldenburg (DE),
Leibniz Institute for Neurobiology (DE), Katholieke Universiteit Leuven
(http://134.58.34.1/index.php) (BE), Oregon Health and Science University OGI
School of Science and Engineering (http://www.bme.ogi.edu/) (USA).

Team members

Heřmanský Hynek, prof. Ing., Dr. Eng. (DCGM) – research leader
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
Hannemann Mirko, Ph.D.
Kombrink Stefan, Dipl.-Linguist.
Mikolov Tomáš, Ing., Ph.D.

Publications

2012

KOMBRINK, S.; HANNEMANN, M.; BURGET, L. Out-of-Vocabulary Word Detection and Beyond. In Detection and Identification of Rare Audiovisual Cues. Studies in Computational Intelligence, 384. Springer-Verlag Berlin Heidelberg: Springer Verlag, 2012. p. 57-65. ISBN: 978-3-642-24033-1.

2011

DEORAS, A.; MIKOLOV, T.; KOMBRINK, S.; KARAFIÁT, M.; KHUDANPUR, S. Variational Approximation of Long-span Language Models for LVCSR. Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011. Praha: IEEE Signal Processing Society, 2011. p. 5532-5535. ISBN: 978-1-4577-0537-3.
KOMBRINK, S.; MIKOLOV, T. Recurrent Neural Network Language Modeling Applied to the Brno AMI/AMIDA 2009 Meeting Recognizer Setup. Proceedings of the 17th Conference STUDENT EEICT 2011. Volume 3. Brno: Brno University of Technology, 2011. p. 527-531. ISBN: 978-80-214-4273-3.
MIKOLOV, T.; KOMBRINK, S.; BURGET, L.; ČERNOCKÝ, J.; KHUDANPUR, S. Extensions of Recurrent Neural Network Language Model. Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011. Praha: IEEE Signal Processing Society, 2011. p. 5528-5531. ISBN: 978-1-4577-0537-3.

2010

ČERNOCKÝ, J.; SZŐKE, I.; HANNEMANN, M.; KOMBRINK, S. Word-subword based keyword spotting with implications in OOV detection. Pacific Grove: Institute of Electrical and Electronics Engineers, 2010. p. 0-0.
HANNEMANN, M.; KOMBRINK, S.; KARAFIÁT, M.; BURGET, L. Similarity Scoring for Recognizing Repeated Out-of-Vocabulary Words. Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010). Proceedings of Interspeech. Makuhari, Chiba: International Speech Communication Association, 2010. p. 897-900. ISBN: 978-1-61782-123-3. ISSN: 1990-9772.
KOMBRINK, S.; HANNEMANN, M. DIRAC D2.16 - Final system for identifying unexpected acoustic inputs (BUT). Brno: The Information Society Technologies (IST) 6th Framework programme, 2010. p. 1-19.
KOMBRINK, S.; HANNEMANN, M.; BURGET, L. Out-of-vocabulary word detection and beyond. ECML PKDD 2010 Proceedings and Journal Content. Barcelona: 2010. p. 1-8.
KOMBRINK, S.; HANNEMANN, M.; BURGET, L.; HEŘMANSKÝ, H. Recovery of Rare Words in Lecture Speech. Proc. Text, Speech and Dialogue 2010. Lecture Notes in Computer Science. Brno: Springer Verlag, 2010. p. 330-337. ISBN: 978-3-642-15759-2. ISSN: 0302-9743.
MIKOLOV, T.; KARAFIÁT, M.; BURGET, L.; ČERNOCKÝ, J.; KHUDANPUR, S. Recurrent neural network based language model. Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010). Proceedings of Interspeech. Makuhari, Chiba: International Speech Communication Association, 2010. p. 1045-1048. ISBN: 978-1-61782-123-3. ISSN: 1990-9772.

2009

BRÜMMER, N.; STRASHEIM, A.; HUBEIKA, V.; MATĚJKA, P.; BURGET, L.; GLEMBEK, O. Discriminative Acoustic Language Recognition via Channel-Compensated GMM Statistics. Proc. Interspeech 2009. Proceedings of Interspeech. Brighton: International Speech Communication Association, 2009. p. 2187-2190. ISBN: 978-1-61567-692-7. ISSN: 1990-9772.
KOMBRINK, S.; BURGET, L.; MATĚJKA, P.; KARAFIÁT, M.; HEŘMANSKÝ, H. Posterior-based Out of Vocabulary Word Detection in Telephone Speech. Proc. Interspeech 2009. Proceedings of Interspeech. Brighton: International Speech Communication Association, 2009. p. 80-83. ISSN: 1990-9772.

2008

BURGET, L.; BRÜMMER, N.; REYNOLDS, D.; KENNY, P.; PELECANOS, J.; VOGT, R.; CASTALDO, F.; DEHAK, N.; DEHAK, R.; GLEMBEK, O.; KARAM, Z.; NOECKER, J.; NA, H.; COSTIN, C.; HUBEIKA, V.; KAJAREKAR, S.; SCHEFFER, N.; ČERNOCKÝ, J. Robust Speaker Recognition Over Varying Channels. Baltimore: Johns Hopkins University, 2008. p. 0-0.
BURGET, L.; SCHWARZ, P.; MATĚJKA, P.; HANNEMANN, M.; RASTROW, A.; WHITE, C.; KHUDANPUR, S.; HEŘMANSKÝ, H.; ČERNOCKÝ, J. Combination of strongly and weakly constrained recognizers for reliable detection of OOVs. Proc. International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Las Vegas: IEEE Signal Processing Society, 2008. p. 1-4. ISBN: 1-4244-1484-9.