Speech Data Mining Research Group BUT Speech@FIT

https://speech.fit.vutbr.cz/

Publications

  • 2024

    BENEŠ, K.; KOCOUR, M.; BURGET, L. Hystoc: Obtaining Word Confidences for Fusion of End-To-End ASR Systems. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Seoul: IEEE Signal Processing Society, 2024. p. 11276-11280. ISBN: 979-8-3503-4485-1. Detail

    HAN, J.; LANDINI, F.; ROHDIN, J.; DIEZ SÁNCHEZ, M.; BURGET, L.; CAO, Y.; LU, H.; ČERNOCKÝ, J. Diacorrect: Error Correction Back-End for Speaker Diarization. In ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul: IEEE Signal Processing Society, 2024. p. 11181-11185. ISBN: 979-8-3503-4485-1. Detail

    KLEMENT, D.; DIEZ SÁNCHEZ, M.; LANDINI, F.; BURGET, L.; SILNOVA, A.; DELCROIX, M.; TAWARA, N. Discriminative Training of VBx Diarization. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Seoul: IEEE Signal Processing Society, 2024. p. 11871-11875. ISBN: 979-8-3503-4485-1. Detail

    LANDINI, F.; DIEZ SÁNCHEZ, M.; STAFYLAKIS, T.; BURGET, L. DiaPer: End-to-End Neural Diarization With Perceiver-Based Attractors. IEEE Transactions on Audio, Speech, and Language Processing, 2024, vol. 32, no. 7, p. 3450-3465. ISSN: 1558-7916. Detail

  • 2023

    BHATTACHARJEE, M.; MOTLÍČEK, P.; NIGMATULINA, I.; HELMKE, H.; OHNEISER, O.; KLEINERT, M.; EHR, H. Customization of Automatic Speech Recognition Engines for Rare Word Detection Without Costly Model Re-Training. Proceedings of the 13th SESAR Innovation Days. Seville: SESAR Joint Undertaking, 2023. p. 1-8. Detail

    BURDISSO, S.; VILLATORO-TELLO, E.; MADIKERI, S.; MOTLÍČEK, P. Node-weighted Graph Convolutional Network for Depression Detection in Transcribed Clinical Interviews. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Dublin: International Speech Communication Association, 2023. p. 3617-3621. ISSN: 1990-9772. Detail

    DELCROIX, M.; TAWARA, N.; DIEZ SÁNCHEZ, M.; LANDINI, F.; SILNOVA, A.; OGAWA, A.; NAKATANI, T.; BURGET, L.; ARAKI, S. Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Dublin: International Speech Communication Association, 2023. p. 3477-3481. ISSN: 1990-9772. Detail

    HELMKE, H.; KLEINERT, M.; AHRENHOLD, N.; EHR, H.; MÜHLHAUSEN, T.; PINSKA, E.; OHNEISER, O.; KLAMERT, L.; MOTLÍČEK, P.; PRASAD, A.; ZULUAGA-GOMEZ, J.; DOKIC, J. Automatic Speech Recognition and Understanding for Radar Label Maintenance Support Increases Safety and Reduces Air Traffic Controllers' Workload. Proceedings of ATM Seminar. Savannah, Georgia: EUROPEAN ORGANISATION FOR THE SAFETY OF AIR NAVIGATION, 2023. p. 1-11. Detail

    KAKOUROS, S.; STAFYLAKIS, T.; MOŠNER, L.; BURGET, L. Speech-Based Emotion Recognition with Self-Supervised Models Using Attentive Channel-Wise Correlations and Label Smoothing. In Proceedings of ICASSP 2023. Rhodes Island: IEEE Signal Processing Society, 2023. p. 1-5. ISBN: 978-1-7281-6327-7. Detail

    KESIRAJU, S.; BENEŠ, K.; TIKHONOV, M.; ČERNOCKÝ, J. BUT Systems for IWSLT 2023 Marathi - Hindi Low Resource Speech Translation Task. In 20th International Conference on Spoken Language Translation, IWSLT 2023 - Proceedings of the Conference. Toronto (in-person and online): Association for Computational Linguistics, 2023. p. 227-234. ISBN: 978-1-959429-84-5. Detail

    KESIRAJU, S.; SARVAŠ, M.; PAVLÍČEK, T.; MACAIRE, C.; CIUBA, A. Strategies for Improving Low Resource Speech to Text Translation Relying on Pre-trained ASR Models. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Dublin: International Speech Communication Association, 2023. p. 2148-2152. ISSN: 1990-9772. Detail

    KHALIL, D.; PRASAD, A.; MOTLÍČEK, P.; ZULUAGA-GOMEZ, J.; NIGMATULINA, I.; MADIKERI, S.; SCHUEPBACH, C. An Automatic Speaker Clustering Pipeline for the Air Traffic Communication Domain. Aerospace, 2023, vol. 10, no. 10, p. 1-14. ISSN: 2226-4310. Detail

    LANDINI, F.; DIEZ SÁNCHEZ, M.; LOZANO DÍEZ, A.; BURGET, L. Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization. In Proceedings of ICASSP 2023. Rhodes Island: IEEE Signal Processing Society, 2023. p. 1-5. ISBN: 978-1-7281-6327-7. Detail

    MAI, F.; ZULUAGA-GOMEZ, J.; PARCOLLET, T.; MOTLÍČEK, P. HyperConformer: Multi-head HyperMixer for Efficient Speech Recognition. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Dublin: International Speech Communication Association, 2023. p. 2213-2217. ISSN: 1990-9772. Detail

    MATĚJKA, P.; SILNOVA, A.; SLAVÍČEK, J.; MOŠNER, L.; PLCHOT, O.; KLČO, M.; PENG, J.; STAFYLAKIS, T.; BURGET, L. Description and Analysis of ABC Submission to NIST LRE 2022. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Dublin: International Speech Communication Association, 2023. p. 511-515. ISSN: 1990-9772. Detail

    MOŠNER, L.; PLCHOT, O.; PENG, J.; BURGET, L.; ČERNOCKÝ, J. Multi-Channel Speech Separation with Cross-Attention and Beamforming. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Dublin: International Speech Communication Association, 2023. p. 1693-1697. ISSN: 1990-9772. Detail

    MOTLÍČEK, P.; PRASAD, A.; NIGMATULINA, I.; HELMKE, H.; OHNEISER, O.; KLEINERT, M. Automatic Speech Analysis Framework for ATC Communication in HAAWAII. Proceedings of the 13th SESAR Innovation Days. Seville: SESAR Joint Undertaking, 2023. p. 1-9. Detail

    NIGMATULINA, I.; MADIKERI, S.; VILLATORO-TELLO, E.; MOTLÍČEK, P.; ZULUAGA-GOMEZ, J.; PANDIA, K.; GANAPATHIRAJU, A. Implementing contextual biasing in GPU decoder for online ASR. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Dublin: International Speech Communication Association, 2023. p. 4494-4498. ISSN: 1990-9772. Detail

    PENG, J.; PLCHOT, O.; STAFYLAKIS, T.; MOŠNER, L.; BURGET, L.; ČERNOCKÝ, J. An attention-based backend allowing efficient fine-tuning of transformer models for speaker verification. In 2022 IEEE Spoken Language Technology Workshop, SLT 2022 - Proceedings. Doha: IEEE Signal Processing Society, 2023. p. 555-562. ISBN: 978-1-6654-7189-3. Detail

    PENG, J.; PLCHOT, O.; STAFYLAKIS, T.; MOŠNER, L.; BURGET, L.; ČERNOCKÝ, J. Improving Speaker Verification with Self-Pretrained Transformer Models. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Dublin: International Speech Communication Association, 2023. p. 5361-5365. ISSN: 1990-9772. Detail

    PENG, J.; STAFYLAKIS, T.; GU, R.; PLCHOT, O.; MOŠNER, L.; BURGET, L.; ČERNOCKÝ, J. Parameter-Efficient Transfer Learning of Pre-Trained Transformer Models for Speaker Verification Using Adapters. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Rhodes Island: IEEE Signal Processing Society, 2023. p. 1-5. ISBN: 978-1-7281-6327-7. Detail

    ŘIHÁČEK, T.; NEHYBA, J.; ČEVELÍČEK, M.; POLOK, A.; MATĚJKA, P.; DOLEŽAL, P. DeePsy: Představení online nástroje pro zpětnou vazbu v psychoterapii. Psychoterapie. Masarykova univerzita AN FL, 2023, vol. 17, no. 1, p. 1-11. ISSN: 1802-3983. Detail

    SILNOVA, A.; BRUMMER, J.; SWART, A.; BURGET, L. Toroidal Probabilistic Spherical Discriminant Analysis. In Proceedings of ICASSP 2023. Rhodes Island: IEEE Signal Processing Society, 2023. p. 1-5. ISBN: 978-1-7281-6327-7. Detail

    SILNOVA, A.; SLAVÍČEK, J.; MOŠNER, L.; KLČO, M.; PLCHOT, O.; MATĚJKA, P.; PENG, J.; STAFYLAKIS, T.; BURGET, L. ABC System Description for NIST LRE 2022. Proceedings of NIST LRE 2022 Workshop. Washington DC: National Institute of Standards and Technology, 2023. p. 1-5. Detail

    SKOWRON, M.; BACKFRIED, G.; NAVAS, E.; BERZINŠ, A.; VAN, J.; DE, F.; DEMARCO, A.; POLÁK, P.; KOVÁČ, M.; POLÁK, P.; ROHDIN, J.; ROSNER, M.; SANCHEZ, J.; SARATXAGA, I.; SCHWARZ, P. Deep Dive Speech Technology. In European Language Equality. Cham: Springer Nature Switzerland AG, 2023. p. 289-312. ISBN: 978-3-031-28819-7. Detail

    STAFYLAKIS, T.; MOŠNER, L.; KAKOUROS, S.; PLCHOT, O.; BURGET, L.; ČERNOCKÝ, J. Extracting speaker and emotion information from self-supervised speech models via channel-wise correlations. In 2022 IEEE Spoken Language Technology Workshop, SLT 2022 - Proceedings. Doha: IEEE Signal Processing Society, 2023. p. 1136-1143. ISBN: 978-1-6654-7189-3. Detail

    VANDERREYDT, G.; PRASAD, A.; KHALIL, D.; MADIKERI, S.; DEMUYNCK, K.; MOTLÍČEK, P. Parameter-Efficient Tuning With Adaptive Bottlenecks For Automatic Speech Recognition. Proceedings of IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). Taipei: IEEE Signal Processing Society, 2023. p. 1-7. ISBN: 979-8-3503-0689-7. Detail

    VILLATORO-TELLO, E.; MADIKERI, S.; ZULUAGA-GOMEZ, J.; SHARMA, B.; SARFJOO, S.; NIGMATULINA, I.; MOTLÍČEK, P.; IVANOV, V.; GANAPATHIRAJU, A. Effectiveness of Text, Acoustic, and Lattice-Based Representations in Spoken Language Understanding Tasks. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Rhodes Island: IEEE Signal Processing Society, 2023. p. 1-5. ISBN: 978-1-7281-6327-7. Detail

    YU, D.; GONG, Y.; PICHENY, A.; RAMABHADRAN, B.; HAKKANI-TÜR, D.; PRASAD, R.; ZEN, H.; SKOGLUND, J.; ČERNOCKÝ, J.; BURGET, L.; MOHAMED, A. Twenty-Five Years of Evolution in Speech and Language Processing. IEEE SIGNAL PROCESSING MAGAZINE, 2023, vol. 40, no. 5, p. 27-39. ISSN: 1558-0792. Detail

    YUSUF, B.; ČERNOCKÝ, J.; SARAÇLAR, M. End-to-End Open Vocabulary Keyword Search With Multilingual Neural Representations. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, 2023, vol. 31, no. 08, p. 3070-3080. ISSN: 2329-9290. Detail

    YUSUF, B.; GOURAV, A.; GANDHE, A.; BULYKO, I. On-the-Fly Text Retrieval for end-to-end ASR Adaptation. In Proceedings of ICASSP 2023. Rhodes Island: IEEE Signal Processing Society, 2023. p. 1-5. ISBN: 978-1-7281-6327-7. Detail

    ŽMOLÍKOVÁ, K.; DELCROIX, M.; OCHIAI, T.; ČERNOCKÝ, J.; KINOSHITA, K.; YU, D. Neural Target Speech Extraction: An overview. IEEE SIGNAL PROCESSING MAGAZINE, 2023, vol. 40, no. 3, p. 8-29. ISSN: 1558-0792. Detail

    ZULUAGA-GOMEZ, J.; NIGMATULINA, I.; PRASAD, A.; MOTLÍČEK, P.; KHALIL, D.; MADIKERI, S.; TART, A.; SZŐKE, I.; LENDERS, V.; RIGAULT, M.; CHOUKRI, K. Lessons Learned in Transcribing 5000 h of Air Traffic Control Communications for Robust Automatic Speech Understanding. Aerospace, 2023, vol. 10, no. 10, p. 1-33. ISSN: 2226-4310. Detail

    ZULUAGA-GOMEZ, J.; PRASAD, A.; NIGMATULINA, I.; MOTLÍČEK, P.; KLEINERT, M. A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers. Aerospace, 2023, vol. 10, no. 5, p. 1-25. ISSN: 2226-4310. Detail

    ZULUAGA-GOMEZ, J.; PRASAD, A.; NIGMATULINA, I.; SARFJOO, S.; MOTLÍČEK, P.; KLEINERT, M.; HELMKE, H.; OHNEISER, O.; ZHAN, Q. How Does Pre-Trained Wav2Vec 2.0 Perform on Domain-Shifted ASR? an Extensive Benchmark on Air Traffic Control Communications. In IEEE Spoken Language Technology Workshop, SLT 2022 - Proceedings. Doha: IEEE Signal Processing Society, 2023. p. 205-212. ISBN: 978-1-6654-7189-3. Detail

    ZULUAGA-GOMEZ, J.; SARFJOO, S.; PRASAD, A.; NIGMATULINA, I.; MOTLÍČEK, P.; ONDŘEJ, K.; OHNEISER, O.; HELMKE, H. BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications. In IEEE Spoken Language Technology Workshop, SLT 2022 - Proceedings. Doha: IEEE Signal Processing Society, 2023. p. 633-640. ISBN: 978-1-6654-7189-3. Detail

  • 2022

    ALAM, J.; BURGET, L.; GLEMBEK, O.; MATĚJKA, P.; MOŠNER, L.; PLCHOT, O.; ROHDIN, J.; SILNOVA, A.; STAFYLAKIS, T. Development of ABC systems for the 2021 edition of NIST Speaker Recognition evaluation. Proceedings of The Speaker and Language Recognition Workshop (Odyssey 2022). Beijing: International Speech Communication Association, 2022. p. 346-353. Detail

    BASKAR, M.; HERZIG, T.; NGUYEN, D.; DIEZ SÁNCHEZ, M.; POLZEHL, T.; BURGET, L.; ČERNOCKÝ, J. Speaker adaptation for Wav2vec2 based dysarthric ASR. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Incheon: International Speech Communication Association, 2022. p. 3403-3407. ISSN: 1990-9772. Detail

    BASKAR, M.; ROSENBERG, A.; RAMABHADRAN, B.; ZHANG, Y. Reducing Domain mismatch in Self-supervised speech pre-training. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Incheon: International Speech Communication Association, 2022. p. 3028-3032. ISSN: 1990-9772. Detail

    BASKAR, M.; ROSENBERG, A.; RAMABHADRAN, B.; ZHANG, Y.; MORENO, P. Ask2Mask: Guided Data Selection for Masked Speech Modeling. IEEE J-STSP, 2022, vol. 16, no. 6, p. 1357-1366. ISSN: 1932-4553. Detail

    BLATT, A.; KOCOUR, M.; VESELÝ, K.; SZŐKE, I.; KLAKOW, D. Call-Sign Recognition and Understanding for Noisy Air-Traffic Transcripts Using Surveillance Information. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Singapore: IEEE Signal Processing Society, 2022. p. 8357-8361. ISBN: 978-1-6654-0540-9. Detail

    BOITO, M.; YUSUF, B.; ONDEL YANG, L.; VILLAVICENCIO, A.; BESACIER, L. Unsupervised Word Segmentation from Discrete Speech Units in Low-Resource Settings. In Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages. Marseille: European Language Resources Association, 2022. p. 1-9. ISBN: 979-10-95546-91-7. Detail

    BRUMMER, J.; SWART, A.; MOŠNER, L.; SILNOVA, A.; PLCHOT, O.; STAFYLAKIS, T.; BURGET, L. Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Incheon: International Speech Communication Association, 2022. p. 1446-1450. ISSN: 1990-9772. Detail

    BURGET, L.; BOJAR, O. NEUREM3 Interim Research Report. Brno: Department of Computer Graphics and Multimedia FIT BUT, 2022. p. 1-78. Detail

    DE BENITO GORRON, D.; ŽMOLÍKOVÁ, K.; TORRE TOLEDANO, D. Source Separation for Sound Event Detection in domestic environments using jointly trained models. In Proceedings of The 17th International Workshop on Acoustic Signal Enhancement (IWAENC 2022). Bamberg: IEEE Signal Processing Society, 2022. p. 1-5. ISBN: 978-1-6654-6867-1. Detail

    DELCROIX, M.; KINOSHITA, K.; OCHIAI, T.; ŽMOLÍKOVÁ, K.; SATO, H.; NAKATANI, T. Listen only to me! How well can target speech extraction handle false alarms? In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Incheon: International Speech Communication Association, 2022. p. 216-220. ISSN: 1990-9772. Detail

    DVOŘÁKOVÁ, M.; HRADIŠ, M.; ŽABIČKA, P.; KOHÚT, J.; KIŠŠ, M.; BENEŠ, K. Využití PERO OCR při přepisu rukopisů. Archivní časopis, 2022, vol. 72, no. 1, p. 14-27. ISSN: 0004-0398. Detail

    EGOROVA, E.; VYDANA, H.; BURGET, L.; ČERNOCKÝ, J. Spelling-Aware Word-Based End-to-End ASR. IEEE SIGNAL PROCESSING LETTERS, 2022, vol. 29, no. 29, p. 1729-1733. ISSN: 1558-2361. Detail

    HAN, J.; LONG, Y.; BURGET, L.; ČERNOCKÝ, J. DPCCN: Densely-Connected Pyramid Complex Convolutional Network for Robust Speech Separation and Extraction. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Singapore: IEEE Signal Processing Society, 2022. p. 7292-7296. ISBN: 978-1-6654-0540-9. Detail

    KIŠŠ, M.; KOHÚT, J.; BENEŠ, K.; HRADIŠ, M. Importance of Textlines in Historical Document Classification. In Uchida, S., Barney, E., Eglin, V. (eds) Document Analysis Systems. Lecture Notes in Computer Science. La Rochelle: Springer Nature Switzerland AG, 2022. p. 158-170. ISBN: 978-3-031-06554-5. Detail

    KOCOUR, M.; UMESH, J.; KARAFIÁT, M.; ŠVEC, J.; LOPEZ, F.; BENEŠ, K.; DIEZ SÁNCHEZ, M.; SZŐKE, I.; LUQUE, J.; VESELÝ, K.; BURGET, L.; ČERNOCKÝ, J. BCN2BRNO: ASR System Fusion for Albayzin 2022 Speech to Text Challenge. Proceedings of IberSpeech 2022. Granada: International Speech Communication Association, 2022. p. 276-280. Detail

    KOCOUR, M.; ŽMOLÍKOVÁ, K.; ONDEL YANG, L.; ŠVEC, J.; DELCROIX, M.; OCHIAI, T.; BURGET, L.; ČERNOCKÝ, J. Revisiting joint decoding based multi-talker speech recognition with DNN acoustic model. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Incheon: International Speech Communication Association, 2022. p. 4955-4959. ISSN: 1990-9772. Detail

    LANDINI, F.; LOZANO DÍEZ, A.; DIEZ SÁNCHEZ, M.; BURGET, L. From Simulated Mixtures to Simulated Conversations as Training Data for End-to-End Neural Diarization. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Incheon: International Speech Communication Association, 2022. p. 5095-5099. ISSN: 1990-9772. Detail

    LANDINI, F.; PROFANT, J.; DIEZ SÁNCHEZ, M.; BURGET, L. Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: Theory, implementation and analysis on standard tasks. COMPUTER SPEECH AND LANGUAGE, 2022, vol. 71, no. 101254, p. 1-16. ISSN: 0885-2308. Detail

    MOŠNER, L.; PLCHOT, O.; BURGET, L.; ČERNOCKÝ, J. Multi-Channel Speaker Verification with Conv-Tasnet Based Beamformer. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Singapore: IEEE Signal Processing Society, 2022. p. 7982-7986. ISBN: 978-1-6654-0540-9. Detail

    MOŠNER, L.; PLCHOT, O.; BURGET, L.; ČERNOCKÝ, J. Multisv: Dataset for Far-Field Multi-Channel Speaker Verification. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Singapore: IEEE Signal Processing Society, 2022. p. 7977-7981. ISBN: 978-1-6654-0540-9. Detail

    NADIMPALLI, V.; KESIRAJU, S.; BANKA, R.; KETHIREDDY, R.; GANGASHETTY, S. Resources and Benchmarks for Keyword Search in Spoken Audio From Low-Resource Indian Languages. IEEE Access, 2022, vol. 10, no. 2022, p. 34789-34799. ISSN: 2169-3536. Detail

    NIGMATULINA, I.; ZULUAGA-GOMEZ, J.; PRASAD, A.; SARFJOO, S.; MOTLÍČEK, P. A Two-Step Approach to Leverage Contextual Data: Speech Recognition in Air-Traffic Communications. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Singapore: IEEE Signal Processing Society, 2022. p. 6282-6286. ISBN: 978-1-6654-0540-9. Detail

    ONDEL YANG, L.; LAM-YEE-MUI, L.; KOCOUR, M.; CORRO, C.; BURGET, L. GPU-Accelerated Forward-Backward Algorithm with Application to Lattice-Free MMI. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Singapore: IEEE Signal Processing Society, 2022. p. 8417-8421. ISBN: 978-1-6654-0540-9. Detail

    ONDEL YANG, L.; YUSUF, B.; BURGET, L.; SARAÇLAR, M. Non-Parametric Bayesian Subspace Models for Acoustic Unit Discovery. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, 2022, vol. 30, no. 5, p. 1902-1917. ISSN: 2329-9290. Detail

    PENG, J.; GU, R.; MOŠNER, L.; PLCHOT, O.; BURGET, L.; ČERNOCKÝ, J. Learnable Sparse Filterbank for Speaker Verification. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Incheon: International Speech Communication Association, 2022. p. 5110-5114. ISSN: 1990-9772. Detail

    PENG, J.; ZHANG, C.; ČERNOCKÝ, J.; YU, D. Progressive contrastive learning for self-supervised text-independent speaker verification. Proceedings of The Speaker and Language Recognition Workshop (Odyssey 2022). Beijing: International Speech Communication Association, 2022. p. 17-24. Detail

    PRASAD, A.; ZULUAGA-GOMEZ, J.; MOTLÍČEK, P.; SARFJOO, S.; NIGMATULINA, I.; OHNEISER, O.; HELMKE, H. Grammar Based Speaker Role Identification for Air Traffic Control Speech Recognition. Proceedings of the 12th SESAR Innovation Days. Budapest: 2022. p. 1-9. Detail

    PRASAD, A.; ZULUAGA-GOMEZ, J.; MOTLÍČEK, P.; SARFJOO, S.; NIGMATULINA, I.; VESELÝ, K. Speech and Natural Language Processing Technologies for Pseudo-Pilot Simulator. Proceedings of the 12th SESAR Innovation Days. Budapest: 2022. p. 1-9. Detail

    SILNOVA, A.; STAFYLAKIS, T.; MOŠNER, L.; PLCHOT, O.; ROHDIN, J.; MATĚJKA, P.; BURGET, L.; GLEMBEK, O.; BRUMMER, J. Analyzing speaker verification embedding extractors and back-ends under language and channel mismatch. Proceedings of The Speaker and Language Recognition Workshop (Odyssey 2022). Beijing: International Speech Communication Association, 2022. p. 9-16. Detail

    SOLEWICZ, Y.; COHEN, N.; ROHDIN, J.; MADIKERI, S.; ČERNOCKÝ, J. Speaker recognition on mono-channel telephony recordings. Proceedings of Odyssey 2022. Beijing: International Speech Communication Association, 2022. p. 193-199. Detail

    STAFYLAKIS, T.; MOŠNER, L.; PLCHOT, O.; ROHDIN, J.; SILNOVA, A.; BURGET, L.; ČERNOCKÝ, J. Training Speaker Embedding Extractors Using Multi-Speaker Audio with Unknown Speaker Boundaries. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Incheon: International Speech Communication Association, 2022. p. 605-609. ISSN: 1990-9772. Detail

    ŠVEC, J.; ŽMOLÍKOVÁ, K.; KOCOUR, M.; DELCROIX, M.; OCHIAI, T.; MOŠNER, L.; ČERNOCKÝ, J. Analysis of impact of emotions on target speech extraction and speech separation. In Proceedings of The 17th International Workshop on Acoustic Signal Enhancement (IWAENC 2022). Bamberg: IEEE Signal Processing Society, 2022. p. 1-5. ISBN: 978-1-6654-6867-1. Detail

    YUSUF, B.; GANDHE, A.; SOKOLOV, A. Usted: Improving ASR with a Unified Speech and Text Encoder-Decoder. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Singapore: IEEE Signal Processing Society, 2022. p. 8297-8301. ISBN: 978-1-6654-0540-9. Detail

  • 2021

    BASKAR, M.; BURGET, L.; WATANABE, S.; ASTUDILLO, R.; ČERNOCKÝ, J. Eat: Enhanced ASR-TTS for Self-Supervised Speech Recognition. In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, Ontario: IEEE Signal Processing Society, 2021. p. 6753-6757. ISBN: 978-1-7281-7605-5. Detail

    BENEŠ, K.; BURGET, L. Text Augmentation for Language Models in High Error Recognition Scenario. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Brno: International Speech Communication Association, 2021. p. 1872-1876. ISSN: 1990-9772. Detail

    DELCROIX, M.; ŽMOLÍKOVÁ, K.; OCHIAI, T.; KINOSHITA, K.; NAKATANI, T. Speaker activity driven neural speech extraction. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Toronto: IEEE Signal Processing Society, 2021. p. 6099-6103. ISBN: 978-1-7281-7605-5. Detail

    EGOROVA, E.; VYDANA, H.; BURGET, L.; ČERNOCKÝ, J. Out-of-Vocabulary Words Detection with Attention and CTC Alignments in an End-to-End ASR System. In Proceedings Interspeech 2021. Proceedings of Interspeech. Brno: International Speech Communication Association, 2021. p. 2901-2905. ISSN: 1990-9772. Detail

    HELMKE, H.; KLEINERT, M.; SHETTY, S.; OHNEISER, O.; EHR, H.; PRASAD, A.; MOTLÍČEK, P.; VESELÝ, K.; ONDŘEJ, K.; SMRŽ, P.; HARFMANN, J.; WINDISCH, C. Readback Error Detection by Automatic Speech Recognition to Increase ATM Safety. In Proceedings of ATM Seminar. on-line: EUROPEAN ORGANISATION FOR THE SAFETY OF AIR NAVIGATION, 2021. p. 1-10. Detail

    HELMKE, H.; SHETTY, S.; KLEINERT, M.; OHNEISER, O.; EHR, H.; MOTLÍČEK, P.; PRASAD, A.; WINDISCH, C. Measuring Speech Recognition And Understanding Performance in Air Traffic Control Domain Beyond Word Error Rates. Proceedings of 11th SESAR Innovation Days 2021. Belgium: 2021. p. 1-8. Detail

    KARAFIÁT, M.; VESELÝ, K.; ČERNOCKÝ, J.; PROFANT, J.; NYTRA, J.; HLAVÁČEK, M.; PAVLÍČEK, T. Analysis of X-Vectors for Low-Resource Speech Recognition. In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, Ontario: IEEE Signal Processing Society, 2021. p. 6998-7002. ISBN: 978-1-7281-7605-5. Detail

    KIŠŠ, M.; BENEŠ, K.; HRADIŠ, M. AT-ST: Self-Training Adaptation Strategy for OCR in Domains with Limited Transcriptions. In Lladós J., Lopresti D., Uchida S. (eds) Document Analysis and Recognition - ICDAR 2021. Lecture Notes in Computer Science. Lausanne: Springer Nature Switzerland AG, 2021. p. 463-477. ISBN: 978-3-030-86336-4. Detail

    KLEINERT, M.; HELMKE, H.; SHETTY, S.; OHNEISER, O.; EHR, H.; PRASAD, A.; MOTLÍČEK, P.; HARFMANN, J. Automated Interpretation of Air Traffic Control Communication: The Journey from Spoken Words to a Deeper Understanding of the Meaning. In Proceedings of DASC 2021. San Antonio, Texas: Institute of Electrical and Electronics Engineers, 2021. p. 1-9. ISBN: 978-1-6654-3420-1. Detail

    KOCOUR, M.; CÁMBARA, G.; LUQUE, J.; BONET, D.; FARRÚS, M.; KARAFIÁT, M.; VESELÝ, K.; ČERNOCKÝ, J. BCN2BRNO: ASR System Fusion for Albayzin 2020 Speech to Text Challenge. Proceedings of IberSPEECH 2021. Valladolid: International Speech Communication Association, 2021. p. 113-117. Detail

    KOCOUR, M.; VESELÝ, K.; BLATT, A.; ZULUAGA-GOMEZ, J.; SZŐKE, I.; ČERNOCKÝ, J.; KLAKOW, D.; MOTLÍČEK, P. Boosting of Contextual Information in ASR for Air-Traffic Call-Sign Recognition. In Proceedings Interspeech 2021. Proceedings of Interspeech. Brno: International Speech Communication Association, 2021. p. 3301-3305. ISSN: 1990-9772. Detail

    KOCOUR, M.; VESELÝ, K.; SZŐKE, I.; KESIRAJU, S.; ZULUAGA-GOMEZ, J.; BLATT, A.; PRASAD, A.; NIGMATULINA, I.; MOTLÍČEK, P.; KLAKOW, D.; TART, A.; KOLČÁREK, P.; ČERNOCKÝ, J.; CEVENINI, C.; CHOUKRI, K.; RIGAULT, M.; LANDIS, F.; SARFJOO, S. Automatic Processing Pipeline for Collecting and Annotating Air-Traffic Voice Communication Data. In Proceedings of 9th OpenSky Symposium 2021, OpenSky Network, Brussels, Belgium. Proceedings. Brussels: MDPI, 2021. p. 1-10. ISSN: 2504-3900. Detail

    LANDINI, F.; GLEMBEK, O.; MATĚJKA, P.; ROHDIN, J.; BURGET, L.; DIEZ SÁNCHEZ, M.; SILNOVA, A. Analysis of the BUT Diarization System for Voxconverse Challenge. In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, Ontario: IEEE Signal Processing Society, 2021. p. 5819-5823. ISBN: 978-1-7281-7605-5. Detail

    LANDINI, F.; LOZANO DÍEZ, A.; BURGET, L.; DIEZ SÁNCHEZ, M.; SILNOVA, A.; ŽMOLÍKOVÁ, K.; GLEMBEK, O.; MATĚJKA, P.; STAFYLAKIS, T.; BRUMMER, J. BUT System Description for The Third DIHARD Speech Diarization Challenge. Proceedings available at Dihard Challenge Github. on-line by LDC and University of Pennsylvania: 2021. p. 1-5. Detail

    PENG, J.; QU, X.; WANG, J.; GU, R.; XIAO, J.; BURGET, L.; ČERNOCKÝ, J. ICSpk: Interpretable Complex Speaker Embedding Extractor from Raw Waveform. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Brno: International Speech Communication Association, 2021. p. 511-515. ISSN: 1990-9772. Detail

    ŘIHÁČEK, T.; MATĚJKA, P. Deep learning v psychoterapii: Strojová analýza nahrávek terapeutických sezení. E-psychologie, 2021, vol. 15, no. 3, p. 35-37. ISSN: 1802-8853. Detail

    STAFYLAKIS, T.; ROHDIN, J.; BURGET, L. Speaker embeddings by modeling channel-wise correlations. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Brno: International Speech Communication Association, 2021. p. 501-505. ISSN: 1990-9772. Detail

    SZŐKE, I.; KESIRAJU, S.; NOVOTNÝ, O.; KOCOUR, M.; VESELÝ, K.; ČERNOCKÝ, J. Detecting English Speech in the Air Traffic Control Voice Communication. In Proceedings Interspeech 2021. Proceedings of Interspeech. Brno: International Speech Communication Association, 2021. p. 3286-3290. ISSN: 1990-9772. Detail

    VYDANA, H.; KARAFIÁT, M.; BURGET, L.; ČERNOCKÝ, J. The IWSLT 2021 BUT Speech Translation Systems. In Proceedings of 18th International Conference on Spoken Language Translation (IWSLT). Bangkok, on-line: Association for Computational Linguistics, 2021. p. 75-83. ISBN: 978-1-7138-3378-9. Detail

    VYDANA, H.; KARAFIÁT, M.; ŽMOLÍKOVÁ, K.; BURGET, L.; ČERNOCKÝ, J. Jointly Trained Transformers Models for Spoken Language Translation. In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, Ontario: IEEE Signal Processing Society, 2021. p. 7513-7517. ISBN: 978-1-7281-7605-5. Detail

    WANNER, L.; KLUSCH, M.; MAVROPOULOS, A.; JAMIN, E.; MARIN PUCHADES, V.; CASAMAYOR, G.; ČERNOCKÝ, J.; EGOROVA, E. Towards a Versatile Intelligent Conversational Agent as Personal Assistant for Migrants. In The PAAMS Collection. PAAMS 2021: Advances in Practical Applications of Agents, Multi-Agent Systems, and Social Good. Lecture Notes in Computer Science. Salamanca: Springer International Publishing, 2021. p. 316-327. ISBN: 978-3-030-85739-4. ISSN: 0302-9743. Detail

    YUSUF, B.; GOK, A.; GUNDOGDU, B.; SARAÇLAR, M. End-to-End Open Vocabulary Keyword Search. In Proceedings Interspeech 2021. Proceedings of Interspeech. Brno: International Speech Communication Association, 2021. p. 4388-4392. ISSN: 1990-9772. Detail

    YUSUF, B.; ONDEL YANG, L.; BURGET, L.; ČERNOCKÝ, J.; SARAÇLAR, M. A Hierarchical Subspace Model for Language-Attuned Acoustic Unit Discovery. In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, Ontario: IEEE Signal Processing Society, 2021. p. 3710-3714. ISBN: 978-1-7281-7605-5. Detail

    ŽMOLÍKOVÁ, K.; DELCROIX, M.; BURGET, L.; NAKATANI, T.; ČERNOCKÝ, J. Integration of Variational Autoencoder and Spatial Clustering for Adaptive Multi-Channel Neural Speech Separation. In 2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings. Shenzhen - virtual: IEEE Signal Processing Society, 2021. p. 889-896. ISBN: 978-1-7281-7066-4. Detail

    ŽMOLÍKOVÁ, K.; DELCROIX, M.; RAJ, D.; WATANABE, S.; ČERNOCKÝ, J. Auxiliary Loss Function for Target Speech Extraction and Recognition with Weak Supervision Based on Speaker Characteristics. In Proceedings of 2021 Interspeech. Proceedings of Interspeech. Brno: International Speech Communication Association, 2021. p. 1464-1468. ISSN: 1990-9772. Detail

    ZULUAGA-GOMEZ, J.; NIGMATULINA, I.; PRASAD, A.; MOTLÍČEK, P.; VESELÝ, K.; KOCOUR, M.; SZŐKE, I. Contextual Semi-Supervised Learning: An Approach to Leverage Air-Surveillance and Untranscribed ATC Data in ASR Systems. In Proceedings Interspeech 2021. Proceedings of Interspeech. Brno: International Speech Communication Association, 2021. p. 3296-3300. ISSN: 1990-9772. Detail

  • 2020

    ALAM, J.; BOULIANNE, G.; BURGET, L.; DAHMANE, M.; DIEZ SÁNCHEZ, M.; GLEMBEK, O.; LALONDE, M.; LOZANO DÍEZ, A.; MATĚJKA, P.; MIZERA, P.; MOŠNER, L.; NOISEUX, C.; MONTEIRO, J.; NOVOTNÝ, O.; PLCHOT, O.; ROHDIN, J.; SILNOVA, A.; SLAVÍČEK, J.; STAFYLAKIS, T.; ST-CHARLES, P.; WANG, S.; ZEINALI, H. Analysis of ABC Submission to NIST SRE 2019 CMN and VAST Challenge. In Proceedings of Odyssey 2020 The Speaker and Language Recognition Workshop. Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland. Tokyo: International Speech Communication Association, 2020. p. 289-295. ISSN: 2312-2846. Detail

    BURGET, L.; GLEMBEK, O.; LOZANO DÍEZ, A.; MATĚJKA, P.; NOVOTNÝ, O.; PLCHOT, O.; PULUGUNDLA, B.; ROHDIN, J.; SILNOVA, A.; VESELÝ, K. BUT System Description to SdSV Challenge 2020. Proceedings of Short-duration Speaker Verification Challenge 2020 Workshop. Shanghai, on-line event of Interspeech 2020 Conference: 2020. p. 1-5. Detail

    DELCROIX, M.; OCHIAI, T.; ŽMOLÍKOVÁ, K.; KINOSHITA, K.; TAWARA, N.; NAKATANI, T.; ARAKI, S. Improving Speaker Discrimination of Target Speech Extraction With Time-Domain SpeakerBeam. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Barcelona: IEEE Signal Processing Society, 2020. p. 691-695. ISBN: 978-1-5090-6631-5. Detail

    DIEZ SÁNCHEZ, M.; BURGET, L.; LANDINI, F.; ČERNOCKÝ, J. Analysis of Speaker Diarization based on Bayesian HMM with Eigenvoice Priors. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, 2020, vol. 28, no. 1, p. 355-368. ISSN: 2329-9290. Detail

    DIEZ SÁNCHEZ, M.; BURGET, L.; LANDINI, F.; WANG, S.; ČERNOCKÝ, J. Optimizing Bayesian HMM Based X-Vector Clustering for the Second DIHARD Speech Diarization Challenge. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Barcelona: IEEE Signal Processing Society, 2020. p. 6519-6523. ISBN: 978-1-5090-6631-5. Detail

    DUNBAR, E.; KARADAYI, J.; BERNARD, M.; CAO, X.; ALGAYRES, R.; ONDEL YANG, L.; BESACIER, L.; SAKTI, S.; DUPOUX, E. The Zero Resource Speech Challenge 2020: Discovering discrete subword and word units. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Shanghai: International Speech Communication Association, 2020. p. 4831-4835. ISSN: 1990-9772. Detail

    KESIRAJU, S.; PLCHOT, O.; BURGET, L.; GANGASHETTY, S. Learning Document Embeddings Along With Their Uncertainties. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, 2020, vol. 2020, no. 28, p. 2319-2332. ISSN: 2329-9290. Detail

    KOSIBA, M.; BURGET, L. Multiwavelength classification of X-ray selected galaxy cluster candidates using convolutional neural networks. Monthly Notices of the Royal Astronomical Society, 2020, vol. 496, no. 4, p. 4141-4153. ISSN: 1365-2966. Detail

    LANDINI, F.; WANG, S.; DIEZ SÁNCHEZ, M.; BURGET, L.; MATĚJKA, P.; ŽMOLÍKOVÁ, K.; MOŠNER, L.; SILNOVA, A.; PLCHOT, O.; NOVOTNÝ, O.; ZEINALI, H.; ROHDIN, J. BUT System for the Second DIHARD Speech Diarization Challenge. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Barcelona: IEEE Signal Processing Society, 2020. p. 6529-6533. ISBN: 978-1-5090-6631-5. Detail

    LOZANO DÍEZ, A.; SILNOVA, A.; PULUGUNDLA, B.; ROHDIN, J.; VESELÝ, K.; BURGET, L.; PLCHOT, O.; GLEMBEK, O.; NOVOTNÝ, O.; MATĚJKA, P. BUT Text-Dependent Speaker Verification System for SdSV Challenge 2020. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Shanghai: International Speech Communication Association, 2020. p. 761-765. ISSN: 1990-9772. Detail

    MATĚJKA, P.; PLCHOT, O.; GLEMBEK, O.; BURGET, L.; ROHDIN, J.; ZEINALI, H.; MOŠNER, L.; SILNOVA, A.; NOVOTNÝ, O.; DIEZ SÁNCHEZ, M.; ČERNOCKÝ, J. 13 years of speaker recognition research at BUT, with longitudinal analysis of NIST SRE. COMPUTER SPEECH AND LANGUAGE, 2020, vol. 2020, no. 63, p. 1-15. ISSN: 0885-2308. Detail

    MOŠNER, L.; PLCHOT, O.; ROHDIN, J.; ČERNOCKÝ, J. Utilizing VOiCES dataset for multichannel speaker verification with beamforming. Proceedings of Odyssey 2020 The Speaker and Language Recognition Workshop. Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland. Tokyo: International Speech Communication Association, 2020. p. 187-193. ISSN: 2312-2846. Detail

    ROHDIN, J.; SILNOVA, A.; DIEZ SÁNCHEZ, M.; PLCHOT, O.; MATĚJKA, P.; BURGET, L.; GLEMBEK, O. End-to-end DNN based text-independent speaker recognition for long and short utterances. COMPUTER SPEECH AND LANGUAGE, 2020, vol. 2020, no. 59, p. 22-35. ISSN: 0885-2308. Detail

    SCHARENBORG, O.; BESACIER, L.; BLACK, A.; HASEGAWA-JOHNSON, M.; METZE, F.; NEUBIG, G.; STÜKER, S.; GODARD, P.; MÜLLER, M.; ONDEL YANG, L.; PALASKAR, S.; ARTHUR, P.; CIANNELLA, F.; DU, M.; LARSEN, E.; MERKX, D.; RIAD, R.; WANG, L.; DUPOUX, E. Speech Technology for Unwritten Languages. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, 2020, vol. 2020, no. 28, p. 964-975. ISSN: 2329-9290. Detail

    SILNOVA, A.; BRUMMER, J.; ROHDIN, J.; STAFYLAKIS, T.; BURGET, L. Probabilistic embeddings for speaker diarization. Proceedings of Odyssey 2020 The Speaker and Language Recognition Workshop. Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland. Tokyo: International Speech Communication Association, 2020. p. 24-31. ISSN: 2312-2846. Detail

    WANG, S.; ROHDIN, J.; PLCHOT, O.; BURGET, L.; YU, K.; ČERNOCKÝ, J. Investigation of SpecAugment for Deep Speaker Embedding Learning. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Barcelona: IEEE Signal Processing Society, 2020. p. 7139-7143. ISBN: 978-1-5090-6631-5. Detail

    ZEINALI, H.; LEE, K.; ALAM, J.; BURGET, L. SdSV Challenge 2020: Large-Scale Evaluation of Short-duration Speaker Verification. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Shanghai: International Speech Communication Association, 2020. p. 731-735. ISSN: 1990-9772. Detail

    ŽMOLÍKOVÁ, K.; KOCOUR, M.; LANDINI, F.; BENEŠ, K.; KARAFIÁT, M.; VYDANA, H.; LOZANO DÍEZ, A.; PLCHOT, O.; BASKAR, M.; ŠVEC, J.; MOŠNER, L.; MALENOVSKÝ, V.; BURGET, L.; YUSUF, B.; NOVOTNÝ, O.; GRÉZL, F.; SZŐKE, I.; ČERNOCKÝ, J. BUT System for CHiME-6 Challenge. Proceedings of CHiME 2020 Virtual Workshop. Barcelona: University of Sheffield, 2020. p. 1-3. Detail

    ZULUAGA-GOMEZ, J.; MOTLÍČEK, P.; ZHAN, Q.; VESELÝ, K.; BRAUN, R. Automatic Speech Recognition Benchmark for Air-Traffic Communications. In Proceedings of Interspeech 2020. Proceedings of Interspeech. Shanghai: International Speech Communication Association, 2020. p. 2297-2301. ISSN: 1990-9772. Detail

    ZULUAGA-GOMEZ, J.; VESELÝ, K.; BLATT, A.; MOTLÍČEK, P.; KLAKOW, D.; TART, A.; SZŐKE, I.; PRASAD, A.; SARFJOO, S.; KOLČÁREK, P.; KOCOUR, M.; ČERNOCKÝ, J.; CEVENINI, C.; CHOUKRI, K.; RIGAULT, M.; LANDIS, F. Automatic Call Sign Detection: Matching Air Surveillance Data with Air Traffic Spoken Communications. Proceedings of the 8th OpenSky Symposium 2020. Proceedings. Brussels: MDPI, 2020. p. 1-10. ISSN: 2504-3900. Detail

  • 2019

    ALAM, J.; BOULIANNE, G.; BURGET, L.; GLEMBEK, O.; LOZANO DÍEZ, A.; MATĚJKA, P.; MIZERA, P.; MOŠNER, L.; NOVOTNÝ, O.; PLCHOT, O.; ROHDIN, J.; SILNOVA, A.; SLAVÍČEK, J.; STAFYLAKIS, T.; WANG, S.; ZEINALI, H.; DAHMANE, M.; ST-CHARLES, P.; LALONDE, M.; NOISEUX, C.; MONTEIRO, J. ABC System Description for NIST Multimedia Speaker Recognition Evaluation 2019. Proceedings of NIST 2019 SRE Workshop. Sentosa, Singapore: National Institute of Standards and Technology, 2019. p. 1-7. Detail

    ALAM, J.; BOULIANNE, G.; GLEMBEK, O.; LOZANO DÍEZ, A.; MATĚJKA, P.; MIZERA, P.; MONTEIRO, J.; MOŠNER, L.; NOVOTNÝ, O.; PLCHOT, O.; ROHDIN, J.; SILNOVA, A.; SLAVÍČEK, J.; STAFYLAKIS, T.; WANG, S.; ZEINALI, H. ABC NIST SRE 2019 CTS System Description. Proceedings of NIST. Sentosa, Singapore: National Institute of Standards and Technology, 2019. p. 1-6. Detail

    BASKAR, M.; BURGET, L.; WATANABE, S.; KARAFIÁT, M.; HORI, T.; ČERNOCKÝ, J. Promising Accurate Prefix Boosting For Sequence-to-sequence ASR. In Proceedings of ICASSP. Brighton: IEEE Signal Processing Society, 2019. p. 5646-5650. ISBN: 978-1-5386-4658-8. Detail

    BASKAR, M.; WATANABE, S.; ASTUDILLO, R.; HORI, T.; BURGET, L.; ČERNOCKÝ, J. Semi-supervised Sequence-to-sequence ASR using Unpaired Speech and Text. In Proceedings of Interspeech. Proceedings of Interspeech. Graz: International Speech Communication Association, 2019. p. 3790-3794. ISSN: 1990-9772. Detail

    BENEŠ, K.; IRIE, K.; BECK, E.; SCHLÜTER, R.; NEY, H. Unsupervised Language Model Adaptation for Speech Recognition with no Extra Resources. Proceedings of DAGA 2019. Rostock: DEGA Head office, Deutsche Gesellschaft für Akustik, 2019. p. 954-957. ISBN: 978-3-939296-14-0. Detail

    CARTAS, A.; KOCOUR, M.; RAMAN, A.; LEONTIADIS, I.; LUQUE, J.; SASTRY, N.; NUNEZ-MARTINEZ, L.; PERINO, D.; PERALES, C. A Reality Check on Inference at Mobile Networks Edge. In Proceedings of the 2nd ACM International Workshop on Edge Systems, Analytics and Networking (EDGESYS '19). Dresden: Association for Computing Machinery, 2019. p. 54-59. ISBN: 978-1-4503-6275-7. Detail

    CHO, J.; WATANABE, S.; HORI, T.; BASKAR, M.; INAGUMA, H.; VILLALBA LOPEZ, J.; DEHAK, N. Language Model Integration Based on Memory Control for Sequence to Sequence Speech Recognition. In Proceedings of 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP). Brighton: IEEE Signal Processing Society, 2019. p. 6191-6195. ISBN: 978-1-5386-4658-8. Detail

    DELCROIX, M.; ŽMOLÍKOVÁ, K.; OCHIAI, T.; KINOSHITA, K.; ARAKI, S.; NAKATANI, T. Compact Network for SpeakerBeam Target Speaker Extraction. In Proceedings of ICASSP. Brighton: IEEE Signal Processing Society, 2019. p. 6965-6969. ISBN: 978-1-5386-4658-8. Detail

    DELCROIX, M.; ŽMOLÍKOVÁ, K.; OCHIAI, T.; KINOSHITA, K.; ARAKI, S.; NAKATANI, T. Evaluation of SpeakerBeam target speech extraction in real noisy and reverberant conditions. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF JAPAN, 2019, vol. 2019, no. 2, p. 1-2. ISSN: 0369-4232. Detail

    DIEZ SÁNCHEZ, M.; BURGET, L.; WANG, S.; ROHDIN, J.; ČERNOCKÝ, J. Bayesian HMM based x-vector clustering for Speaker Diarization. In Proceedings of Interspeech. Proceedings of Interspeech. Graz: International Speech Communication Association, 2019. p. 346-350. ISSN: 1990-9772. Detail

    INAGUMA, H.; CHO, J.; BASKAR, M.; KAWAHARA, T.; WATANABE, S. Transfer Learning Of Language-independent End-to-end ASR With Language Model Fusion. In Proceedings of ICASSP. Brighton: IEEE Signal Processing Society, 2019. p. 6096-6100. ISBN: 978-1-5386-4658-8. Detail

    KARAFIÁT, M.; BASKAR, M.; WATANABE, S.; HORI, T.; WIESNER, M.; ČERNOCKÝ, J. Analysis of Multilingual Sequence-to-Sequence Speech Recognition Systems. In Proceedings of Interspeech. Proceedings of Interspeech. Graz: International Speech Communication Association, 2019. p. 2220-2224. ISSN: 1990-9772. Detail

    MAGHSOODI, N.; SAMETI, H.; ZEINALI, H.; STAFYLAKIS, T. Speaker Recognition With Random Digit Strings Using Uncertainty Normalized HMM-Based i-Vectors. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, 2019, vol. 2019, no. 11, p. 1815-1825. ISSN: 2329-9290. Detail

    MATĚJKA, P.; PLCHOT, O.; ZEINALI, H.; MOŠNER, L.; SILNOVA, A.; BURGET, L.; NOVOTNÝ, O.; GLEMBEK, O. Analysis of BUT Submission in Far-Field Scenarios of VOiCES 2019 Challenge. In Proceedings of Interspeech. Proceedings of Interspeech. Graz: International Speech Communication Association, 2019. p. 2448-2452. ISSN: 1990-9772. Detail

    MOŠNER, L.; PLCHOT, O.; ROHDIN, J.; BURGET, L.; ČERNOCKÝ, J. Speaker Verification with Application-Aware Beamforming. In IEEE Automatic Speech Recognition and Understanding Workshop - Proceedings (ASRU). Sentosa, Singapore: IEEE Signal Processing Society, 2019. p. 411-418. ISBN: 978-1-7281-0306-8. Detail

    MOŠNER, L.; WU, M.; RAJU, A.; PARTHASARATHI, S.; KUMATANI, K.; SUNDARAM, S.; MAAS, R.; HOFFMEISTER, B. Improving Noise Robustness of Automatic Speech Recognition via Parallel Data and Teacher-student Learning. In Proceedings of ICASSP. Brighton: IEEE Signal Processing Society, 2019. p. 6475-6479. ISBN: 978-1-5386-4658-8. Detail

    NOVOTNÝ, O.; PLCHOT, O.; GLEMBEK, O.; BURGET, L. Factorization of Discriminatively Trained i-Vector Extractor for Speaker Recognition. In Proceedings of Interspeech. Proceedings of Interspeech. Graz: International Speech Communication Association, 2019. p. 4330-4334. ISSN: 1990-9772. Detail

    NOVOTNÝ, O.; PLCHOT, O.; GLEMBEK, O.; BURGET, L.; MATĚJKA, P. Discriminatively Re-trained i-Vector Extractor For Speaker Recognition. In Proceedings of 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP). Brighton: IEEE Signal Processing Society, 2019. p. 6031-6035. ISBN: 978-1-5386-4658-8. Detail

    NOVOTNÝ, O.; PLCHOT, O.; GLEMBEK, O.; ČERNOCKÝ, J.; BURGET, L. Analysis of DNN Speech Signal Enhancement for Robust Speaker Recognition. COMPUTER SPEECH AND LANGUAGE, 2019, vol. 2019, no. 58, p. 403-421. ISSN: 0885-2308. Detail

    ONDEL YANG, L.; LI, R.; SELL, G.; HEŘMANSKÝ, H. Deriving Spectro-temporal Properties of Hearing from Speech Data. In Proceedings of ICASSP. Brighton: IEEE Signal Processing Society, 2019. p. 411-415. ISBN: 978-1-5386-4658-8. Detail

    ONDEL YANG, L.; VYDANA, H.; BURGET, L.; ČERNOCKÝ, J. Bayesian Subspace Hidden Markov Model for Acoustic Unit Discovery. In Proceedings of Interspeech 2019. Proceedings of Interspeech. Graz: International Speech Communication Association, 2019. p. 261-265. ISSN: 1990-9772. Detail

    ROHDIN, J.; STAFYLAKIS, T.; SILNOVA, A.; ZEINALI, H.; BURGET, L.; PLCHOT, O. Speaker Verification Using End-To-End Adversarial Language Adaptation. In Proceedings of ICASSP 2019. Brighton: IEEE Signal Processing Society, 2019. p. 6006-6010. ISBN: 978-1-5386-4658-8. Detail

    STAFYLAKIS, T.; ROHDIN, J.; PLCHOT, O.; MIZERA, P.; BURGET, L. Self-supervised speaker embeddings. In Proceedings of Interspeech. Proceedings of Interspeech. Graz: International Speech Communication Association, 2019. p. 2863-2867. ISSN: 1990-9772. Detail

    SUBRAMANIAN, A.; WANG, X.; BASKAR, M.; WATANABE, S.; TANIGUCHI, T.; TRAN, D.; FUJITA, Y. Speech Enhancement Using End-to-End Speech Recognition Objectives. In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. New Paltz, NY: IEEE Signal Processing Society, 2019. p. 234-238. ISBN: 978-1-7281-1123-0. Detail

    SZŐKE, I.; SKÁCEL, M.; MOŠNER, L.; PALIESEK, J.; ČERNOCKÝ, J. Building and Evaluation of a Real Room Impulse Response Dataset. IEEE J-STSP, 2019, vol. 13, no. 4, p. 863-876. ISSN: 1932-4553. Detail

    WANG, S.; ROHDIN, J.; BURGET, L.; PLCHOT, O.; QIAN, Y.; YU, K.; ČERNOCKÝ, J. On the Usage of Phonetic Information for Text-independent Speaker Embedding Extraction. In Proceedings of Interspeech. Proceedings of Interspeech. Graz: International Speech Communication Association, 2019. p. 1148-1152. ISSN: 1990-9772. Detail

    YANG, J.; ONDEL YANG, L.; MANOHAR, V.; HEŘMANSKÝ, H. Towards Automatic Methods to Detect Errors in Transcriptions of Speech Recordings. In Proceedings of ICASSP. Brighton: IEEE Signal Processing Society, 2019. p. 3747-3751. ISBN: 978-1-5386-4658-8. Detail

    ZEINALI, H.; BURGET, L.; ROHDIN, J.; STAFYLAKIS, T.; ČERNOCKÝ, J. How To Improve Your Speaker Embeddings Extractor in Generic Toolkits. In Proceedings of 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP). Brighton: IEEE Signal Processing Society, 2019. p. 6141-6145. ISBN: 978-1-5386-4658-8. Detail

    ZEINALI, H.; ČERNOCKÝ, J.; BURGET, L. A multi purpose and large scale speech corpus in Persian and English for speaker and speech Recognition: the DeepMine database. In IEEE Automatic Speech Recognition and Understanding Workshop - Proceedings (ASRU). Sentosa, Singapore: IEEE Signal Processing Society, 2019. p. 397-402. ISBN: 978-1-7281-0306-8. Detail

    ZEINALI, H.; STAFYLAKIS, T.; ATHANASOPOULOU, G.; ROHDIN, J.; GKINIS, I.; BURGET, L.; ČERNOCKÝ, J. Detecting Spoofing Attacks Using VGG and SincNet: BUT-Omilia Submission to ASVspoof 2019 Challenge. In Proceedings of Interspeech. Proceedings of Interspeech. Graz: International Speech Communication Association, 2019. p. 1073-1077. ISSN: 1990-9772. Detail

    ZEINALI, H.; WANG, S.; SILNOVA, A.; MATĚJKA, P.; PLCHOT, O. BUT System Description to VoxCeleb Speaker Recognition Challenge 2019. Proceedings of The VoxCeleb Challenge Workshop 2019. Graz: 2019. p. 1-4. Detail

    ŽMOLÍKOVÁ, K.; DELCROIX, M.; KINOSHITA, K.; OCHIAI, T.; NAKATANI, T.; BURGET, L.; ČERNOCKÝ, J. SpeakerBeam: Speaker Aware Neural Network for Target Speaker Extraction in Speech Mixtures. IEEE J-STSP, 2019, vol. 13, no. 4, p. 800-814. ISSN: 1932-4553. Detail

  • 2018

    ALAM, J.; BHATTACHARYA, G.; BRUMMER, J.; BURGET, L.; DIEZ SÁNCHEZ, M.; GLEMBEK, O.; KENNY, P.; KLČO, M.; LANDINI, F.; LOZANO DÍEZ, A.; MATĚJKA, P.; MONTEIRO, J.; MOŠNER, L.; NOVOTNÝ, O.; PLCHOT, O.; PROFANT, J.; ROHDIN, J.; SILNOVA, A.; SLAVÍČEK, J.; STAFYLAKIS, T.; ZEINALI, H. ABC NIST SRE 2018 SYSTEM DESCRIPTION. Proceedings of 2018 NIST SRE Workshop. Athens: National Institute of Standards and Technology, 2018. p. 1-10. Detail

    BARTOS, A.; CIPR, T.; NELSON, D.; SCHWARZ, P.; BANOWETZ, J.; JERABEK, L. Noise-robust speech triage. Journal of the Acoustical Society of America, 2018, vol. 143, no. 4, p. 2313-2320. ISSN: 1520-8524. Detail

    BENEŠ, K.; KESIRAJU, S.; BURGET, L. i-vectors in language modeling: An efficient way of domain adaptation for feed-forward models. In Proceedings of Interspeech 2018. Proceedings of Interspeech. Hyderabad: International Speech Communication Association, 2018. p. 3383-3387. ISSN: 1990-9772. Detail

    BRUMMER, J.; SILNOVA, A.; BURGET, L.; STAFYLAKIS, T. Gaussian meta-embeddings for efficient scoring of a heavy-tailed PLDA model. In Proceedings of Odyssey 2018. Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland. Les Sables d'Olonne: International Speech Communication Association, 2018. p. 349-356. ISSN: 2312-2846. Detail

    CHO, J.; BASKAR, M.; LI, R.; WIESNER, M.; MALLIDI, S.; YALTA, N.; KARAFIÁT, M.; WATANABE, S.; HORI, T. Multilingual Sequence-to-Sequence Speech Recognition: Architecture, Transfer Learning, and Language Modeling. In Proceedings of 2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018). Athens: IEEE Signal Processing Society, 2018. p. 521-527. ISBN: 978-1-5386-4334-1. Detail

    DELCROIX, M.; ŽMOLÍKOVÁ, K.; KINOSHITA, K.; ARAKI, S.; OGAWA, A.; NAKATANI, T. SpeakerBeam: A New Deep Learning Technology for Extracting Speech of a Target Speaker Based on the Speaker's Voice Characteristics. NTT Technical Review, 2018, vol. 16, no. 11, p. 19-24. ISSN: 1348-3447. Detail

    DELCROIX, M.; ŽMOLÍKOVÁ, K.; KINOSHITA, K.; OGAWA, A.; NAKATANI, T. Single Channel Target Speaker Extraction and Recognition with Speaker Beam. In Proceedings of ICASSP 2018. Calgary: IEEE Signal Processing Society, 2018. p. 5554-5558. ISBN: 978-1-5386-4658-8. Detail

    DIEZ SÁNCHEZ, M.; BURGET, L.; MATĚJKA, P. Speaker Diarization based on Bayesian HMM with Eigenvoice Priors. In Proceedings of Odyssey 2018. Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland. Les Sables d'Olonne: International Speech Communication Association, 2018. p. 147-154. ISSN: 2312-2846. Detail

    DIEZ SÁNCHEZ, M.; LANDINI, F.; BURGET, L.; ROHDIN, J.; SILNOVA, A.; ŽMOLÍKOVÁ, K.; NOVOTNÝ, O.; VESELÝ, K.; GLEMBEK, O.; PLCHOT, O.; MOŠNER, L.; MATĚJKA, P. BUT system for DIHARD Speech Diarization Challenge 2018. In Proceedings of Interspeech 2018. Proceedings of Interspeech. Hyderabad: International Speech Communication Association, 2018. p. 2798-2802. ISSN: 1990-9772. Detail

    EGOROVA, E.; BURGET, L. Out-of-Vocabulary Word Recovery Using FST-Based Subword Unit Clustering in a Hybrid ASR System. In Proceedings of ICASSP 2018. Calgary: IEEE Signal Processing Society, 2018. p. 5919-5923. ISBN: 978-1-5386-4658-8. Detail

    GODARD, P.; BOITO, M.; ONDEL YANG, L.; BERARD, A.; YVON, F.; VILLAVICENCIO, A.; BESACIER, L. Unsupervised Word Segmentation from Speech with Attention. In Proceedings of Interspeech 2018. Proceedings of Interspeech. Hyderabad: International Speech Communication Association, 2018. p. 2678-2682. ISSN: 1990-9772. Detail

    KARAFIÁT, M.; BASKAR, M.; SZŐKE, I.; MALENOVSKÝ, V.; VESELÝ, K.; GRÉZL, F.; BURGET, L.; ČERNOCKÝ, J. BUT OpenSAT 2017 speech recognition system. In Proceedings of Interspeech 2018. Proceedings of Interspeech. Hyderabad: International Speech Communication Association, 2018. p. 2638-2642. ISSN: 1990-9772. Detail

    KARAFIÁT, M.; BASKAR, M.; VESELÝ, K.; GRÉZL, F.; BURGET, L.; ČERNOCKÝ, J. Analysis of Multilingual BLSTM Acoustic Model on Low and High Resource Languages. In Proceedings of ICASSP 2018. Calgary: IEEE Signal Processing Society, 2018. p. 5789-5793. ISBN: 978-1-5386-4658-8. Detail

    LOZANO DÍEZ, A.; PLCHOT, O.; MATĚJKA, P.; GONZALEZ-RODRIGUEZ, J. DNN Based Embeddings for Language Recognition. In Proceedings of ICASSP 2018. Calgary: IEEE Signal Processing Society, 2018. p. 5184-5188. ISBN: 978-1-5386-4658-8. Detail

    LOZANO DÍEZ, A.; PLCHOT, O.; MATĚJKA, P.; NOVOTNÝ, O.; GONZALEZ-RODRIGUEZ, J. Analysis of DNN-based Embeddings for Language Recognition on the NIST LRE 2017. In Proceedings of Odyssey 2018 The Speaker and Language Recognition Workshop. Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland. Les Sables d'Olonne: International Speech Communication Association, 2018. p. 39-46. ISSN: 2312-2846. Detail

    MOŠNER, L.; PLCHOT, O.; MATĚJKA, P.; NOVOTNÝ, O.; ČERNOCKÝ, J. Dereverberation and Beamforming in Robust Far-Field Speaker Recognition. In Proceedings of Interspeech 2018. Proceedings of Interspeech. Hyderabad: International Speech Communication Association, 2018. p. 1334-1338. ISSN: 1990-9772. Detail

    NOVOTNÝ, O.; MATĚJKA, P.; PLCHOT, O.; GLEMBEK, O. On the use of DNN Autoencoder for Robust Speaker Recognition. Brno: Faculty of Information Technology BUT, 2018. p. 1-5. Detail

    NOVOTNÝ, O.; PLCHOT, O.; MATĚJKA, P.; MOŠNER, L.; GLEMBEK, O. On the use of X-vectors for Robust Speaker Recognition. Proceedings of Odyssey 2018. Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland. Les Sables d'Olonne: International Speech Communication Association, 2018. p. 168-175. ISSN: 2312-2846. Detail

    ONDEL YANG, L.; GODARD, P.; BESACIER, L.; LARSEN, E.; HASEGAWA-JOHNSON, M.; SCHARENBORG, O.; DUPOUX, E.; BURGET, L.; YVON, F.; KHUDANPUR, S. Bayesian Models for Unit Discovery on a Very Low Resource Language. In Proceedings of ICASSP 2018. Calgary: IEEE Signal Processing Society, 2018. p. 5939-5943. ISBN: 978-1-5386-4658-8. Detail

    PLCHOT, O.; MATĚJKA, P.; NOVOTNÝ, O.; CUMANI, S.; LOZANO DÍEZ, A.; SLAVÍČEK, J.; DIEZ SÁNCHEZ, M.; GRÉZL, F.; GLEMBEK, O.; KAMSALI VEERA, M.; SILNOVA, A.; BURGET, L.; ONDEL YANG, L.; KESIRAJU, S.; ROHDIN, J. Analysis of BUT-PT Submission for NIST LRE 2017. In Proceedings of Odyssey 2018 The Speaker and Language Recognition Workshop. Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland. Les Sables d'Olonne: International Speech Communication Association, 2018. p. 47-53. ISSN: 2312-2846. Detail

    PULUGUNDLA, B.; BASKAR, M.; KESIRAJU, S.; EGOROVA, E.; KARAFIÁT, M.; BURGET, L.; ČERNOCKÝ, J. BUT system for low resource Indian language ASR. In Proceedings of Interspeech 2018. Proceedings of Interspeech. Hyderabad: International Speech Communication Association, 2018. p. 3182-3186. ISSN: 1990-9772. Detail

    ROHDIN, J.; SILNOVA, A.; DIEZ SÁNCHEZ, M.; PLCHOT, O.; MATĚJKA, P.; BURGET, L. End-to-End DNN Based Speaker Recognition Inspired by i-Vector and PLDA. In Proceedings of ICASSP. Calgary: IEEE Signal Processing Society, 2018. p. 4874-4878. ISBN: 978-1-5386-4658-8. Detail

    RYANT, N.; BERGELSON, E.; CHURCH, K.; CRISTIA, A.; DU, J.; GANAPATHY, S.; KHUDANPUR, S.; KOWALSKI, D.; KRISHNAMOORTHY, M.; KULSHRESHTA, R.; LIBERMAN, M.; LU, Y.; MACIEJEWSKI, M.; METZE, F.; PROFANT, J.; SUN, L.; TSAO, Y.; YU, Z. Enhancement and Analysis of Conversational Speech: JSALT 2017. In Proceedings of ICASSP 2018. Calgary: IEEE Signal Processing Society, 2018. p. 5154-5158. ISBN: 978-1-5386-4658-8. Detail

    SILNOVA, A.; BRUMMER, J.; GARCÍA-ROMERO, D.; SNYDER, D.; BURGET, L. Fast variational Bayes for heavy-tailed PLDA applied to i-vectors and x-vectors. In Proceedings of Interspeech 2018. Proceedings of Interspeech. Hyderabad: International Speech Communication Association, 2018. p. 72-76. ISSN: 1990-9772. Detail

    SILNOVA, A.; MATĚJKA, P.; GLEMBEK, O.; PLCHOT, O.; NOVOTNÝ, O.; GRÉZL, F.; SCHWARZ, P.; ČERNOCKÝ, J. BUT/Phonexia Bottleneck Feature Extractor. In Proceedings of Odyssey 2018. Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland. Les Sables d'Olonne: International Speech Communication Association, 2018. p. 283-287. ISSN: 2312-2846. Detail

    SZŐKE, I. Summary report for the research project "Škoda auto - Digital Minutes". Brno: ŠKODA AUTO a.s., 2018. p. 0-0. Detail

    VESELÝ, K.; PERALES, C.; SZŐKE, I.; LUQUE, J.; ČERNOCKÝ, J. Lightly supervised vs. semi-supervised training of acoustic model on Luxembourgish for low-resource automatic speech recognition. In Proceedings of Interspeech 2018. Proceedings of Interspeech. Hyderabad: International Speech Communication Association, 2018. p. 2883-2887. ISSN: 1990-9772. Detail

    WIESNER, M.; LIU, C.; ONDEL YANG, L.; HARMAN, C.; MANOHAR, V.; TRMAL, J.; HUANG, Z.; DEHAK, N.; KHUDANPUR, S. Automatic Speech Recognition and Topic Identification for Almost-Zero-Resource Languages. In Proceedings of Interspeech. Proceedings of Interspeech. Hyderabad: International Speech Communication Association, 2018. p. 2052-2056. ISSN: 1990-9772. Detail

    ZEINALI, H.; BURGET, L.; ČERNOCKÝ, J. Convolutional Neural Networks and X-Vector Embedding for DCASE2018 Acoustic Scene Classification Challenge. Proceedings of DCASE 2018 Workshop. Surrey: Tampere University of Technology, 2018. p. 1-5. ISBN: 978-952-15-4262-6. Detail

    ZEINALI, H.; BURGET, L.; SAMETI, H.; ČERNOCKÝ, J. Spoken Pass-Phrase Verification in the i-vector Space. In Proceedings of Odyssey 2018. Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland. Les Sables d'Olonne: International Speech Communication Association, 2018. p. 372-377. ISSN: 2312-2846. Detail

    ŽMOLÍKOVÁ, K.; DELCROIX, M.; KINOSHITA, K.; HIGUCHI, T.; NAKATANI, T.; ČERNOCKÝ, J. Optimization of Speaker-aware Multichannel Speech Extraction with ASR Criterion. In Proceedings of ICASSP 2018. Calgary: IEEE Signal Processing Society, 2018. p. 6702-6706. ISBN: 978-1-5386-4658-8. Detail

  • 2017

    BASKAR, M.; KARAFIÁT, M.; BURGET, L.; VESELÝ, K.; GRÉZL, F.; ČERNOCKÝ, J. Residual Memory Networks: Feed-forward approach to learn long-term temporal dependencies. In Proceedings of ICASSP 2017. New Orleans: IEEE Signal Processing Society, 2017. p. 4810-4814. ISBN: 978-1-5090-4117-6. Detail

    BENEŠ, K.; BASKAR, M.; BURGET, L. Residual Memory Networks in Language Modeling: Improving the Reputation of Feed-Forward Networks. In Proceedings of Interspeech 2017. Proceedings of Interspeech. Stockholm: International Speech Communication Association, 2017. p. 284-288. ISSN: 1990-9772. Detail

    DAS, A.; HASEGAWA-JOHNSON, M.; VESELÝ, K. Deep Auto-encoder Based Multi-task Learning Using Probabilistic Transcriptions. In Proceedings of Interspeech 2017. Proceedings of Interspeech. Stockholm: International Speech Communication Association, 2017. p. 2073-2077. ISSN: 1990-9772. Detail

    FÉR, R.; MATĚJKA, P.; GRÉZL, F.; PLCHOT, O.; VESELÝ, K.; ČERNOCKÝ, J. Multilingually Trained Bottleneck Features in Spoken Language Recognition. COMPUTER SPEECH AND LANGUAGE, 2017, vol. 2017, no. 46, p. 252-267. ISSN: 0885-2308. Detail

    GLEMBEK, O. Summary report for project Exploiting Language Information for Situational Awareness (ELISA) For year 2017. Brno: University of Southern California, 2017. p. 1-2. Detail

    HANNEMANN, M.; TRMAL, J.; ONDEL YANG, L.; KESIRAJU, S.; BURGET, L. Bayesian joint-sequence models for grapheme-to-phoneme conversion. In Proceedings of ICASSP 2017. New Orleans: IEEE Signal Processing Society, 2017. p. 2836-2840. ISBN: 978-1-5090-4117-6. Detail

    HIGUCHI, T.; KINOSHITA, K.; DELCROIX, M.; ŽMOLÍKOVÁ, K.; NAKATANI, T. Deep clustering-based beamforming for separation with unknown number of sources. In Proceedings of Interspeech 2017. Proceedings of Interspeech. Stockholm: International Speech Communication Association, 2017. p. 1183-1187. ISSN: 1990-9772. Detail

    KARAFIÁT, M.; BASKAR, M.; MATĚJKA, P.; VESELÝ, K.; GRÉZL, F.; BURGET, L.; ČERNOCKÝ, J. 2016 BUT Babel system: Multilingual BLSTM acoustic model with i-vector based adaptation. In Proceedings of Interspeech 2017. Proceedings of Interspeech. Stockholm: International Speech Communication Association, 2017. p. 719-723. ISSN: 1990-9772. Detail

    KESIRAJU, S.; PAPPAGARI, R.; ONDEL YANG, L.; BURGET, L.; DEHAK, N.; KHUDANPUR, S.; ČERNOCKÝ, J.; GANGASHETTY, S. Topic identification of spoken documents using unsupervised acoustic unit discovery. In Proceedings of ICASSP 2017. New Orleans: IEEE Signal Processing Society, 2017. p. 5745-5749. ISBN: 978-1-5090-4117-6. Detail

    LIU, C.; YANG, J.; SUN, M.; KESIRAJU, S.; ROTT, A.; ONDEL YANG, L.; GHAHREMANI, P.; DEHAK, N.; BURGET, L.; KHUDANPUR, S. An Empirical evaluation of zero resource acoustic unit discovery. In Proceedings of ICASSP 2017. New Orleans: IEEE Signal Processing Society, 2017. p. 5305-5309. ISBN: 978-1-5090-4117-6. Detail

    MALANDRAKIS, N.; GLEMBEK, O.; NARAYANAN, S. Extracting Situation Frames from non-English Speech: Evaluation Framework and Pilot Results. In Proceedings of Interspeech 2017. Proceedings of Interspeech. Stockholm: International Speech Communication Association, 2017. p. 2123-2127. ISSN: 1990-9772. Detail

    MATĚJKA, P. Summary report for project "Speaker REcognition" for year 2017. Brno: Phonexia s.r.o., 2017. p. 1-17. Detail

    MATĚJKA, P. Summary report for project "Robust Automatic Speech Transcription" in Year 2017. Brno: Raytheon BBN Technologies, 2017. p. 1-5. Detail

    MATĚJKA, P.; NOVOTNÝ, O.; PLCHOT, O.; BURGET, L.; DIEZ SÁNCHEZ, M.; ČERNOCKÝ, J. Analysis of Score Normalization in Multilingual Speaker Recognition. In Proceedings of Interspeech 2017. Proceedings of Interspeech. Stockholm: International Speech Communication Association, 2017. p. 1567-1571. ISSN: 1990-9772. Detail

    MATĚJKA, P.; PLCHOT, O.; NOVOTNÝ, O.; CUMANI, S.; LOZANO DÍEZ, A.; SLAVÍČEK, J.; DIEZ SÁNCHEZ, M.; GRÉZL, F.; GLEMBEK, O.; KAMSALI VEERA, M.; SILNOVA, A.; BURGET, L.; ONDEL YANG, L.; KESIRAJU, S.; ROHDIN, J. BUT-PT System Description for NIST LRE 2017. Proceedings of NIST Language Recognition Workshop 2017. Orlando, Florida: National Institute of Standards and Technology, 2017. p. 1-6. Detail

    ONDEL YANG, L.; BURGET, L.; ČERNOCKÝ, J.; KESIRAJU, S. Bayesian phonotactic language model for acoustic unit discovery. In Proceedings of ICASSP 2017. New Orleans: IEEE Signal Processing Society, 2017. p. 5750-5754. ISBN: 978-1-5090-4117-6. Detail

    PAPADOPOULOS, P.; TRAVADI, R.; VAZ, C.; MALANDRAKIS, N.; HERMJAKOB, U.; POURDAMGHANI, N.; PUST, M.; ZHANG, B.; PAN, X.; LU, D.; LIN, Y.; GLEMBEK, O.; BASKAR, M.; KARAFIÁT, M.; BURGET, L.; HASEGAWA-JOHNSON, M.; JI, H.; MAY, J.; KNIGHT, K.; NARAYANAN, S. Team ELISA System for DARPA LORELEI Speech Evaluation 2016. In Proceedings of Interspeech 2017. Proceedings of Interspeech. Stockholm: International Speech Communication Association, 2017. p. 2053-2057. ISSN: 1990-9772. Detail

    PLCHOT, O.; MATĚJKA, P.; SILNOVA, A.; NOVOTNÝ, O.; DIEZ SÁNCHEZ, M.; ROHDIN, J.; GLEMBEK, O.; BRÜMMER, N.; SWART, A.; PRIETO, J.; GARCIA PERERA, L.; BUERA, L.; KENNY, P.; ALAM, J.; BHATTACHARYA, G. Analysis and Description of ABC Submission to NIST SRE 2016. In Proceedings of Interspeech 2017. Proceedings of Interspeech. Stockholm: International Speech Communication Association, 2017. p. 1348-1352. ISSN: 1990-9772. Detail

    SILNOVA, A.; BURGET, L.; ČERNOCKÝ, J. Alternative Approaches to Neural Network based Speaker Verification. In Proceedings of Interspeech 2017. Proceedings of Interspeech. Stockholm: International Speech Communication Association, 2017. p. 1572-1575. ISSN: 1990-9772. Detail

    VESELÝ, K.; BASKAR, M.; DIEZ SÁNCHEZ, M.; BENEŠ, K. MGB-3 BUT System: Low-resource ASR on Egyptian YOUTUBE data. In Proceedings of ASRU 2017. Okinawa: IEEE Signal Processing Society, 2017. p. 368-373. ISBN: 978-1-5090-4788-8. Detail

    VESELÝ, K.; BURGET, L.; ČERNOCKÝ, J. Semi-supervised DNN training with word selection for ASR. In Proceedings of Interspeech 2017. Proceedings of Interspeech. Stockholm: International Speech Communication Association, 2017. p. 3687-3691. ISSN: 1990-9772. Detail

    ZEINALI, H.; SAMETI, H.; BURGET, L. HMM-Based Phrase-Independent i-Vector Extractor for Text-Dependent Speaker Verification. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, 2017, vol. 25, no. 7, p. 1421-1435. ISSN: 2329-9290. Detail

    ZEINALI, H.; SAMETI, H.; BURGET, L.; ČERNOCKÝ, J. Text-dependent speaker verification based on i-vectors, Neural Networks and Hidden Markov Models. COMPUTER SPEECH AND LANGUAGE, 2017, vol. 46, p. 53-71. ISSN: 0885-2308. Detail

    ŽMOLÍKOVÁ, K. Summary research report for project "Speech enhancement front-end for robust automatic speech recognition with large amount of training data" for year 2017. Brno: NTT Corporation, 2017. p. 0-0. Detail

    ŽMOLÍKOVÁ, K.; DELCROIX, M.; KINOSHITA, K.; HIGUCHI, T.; OGAWA, A.; NAKATANI, T. Learning Speaker Representation for Neural Network Based Multichannel Speaker Extraction. In Proceedings of ASRU 2017. Okinawa: IEEE Signal Processing Society, 2017. p. 8-15. ISBN: 978-1-5090-4788-8. Detail

    ŽMOLÍKOVÁ, K.; DELCROIX, M.; KINOSHITA, K.; HIGUCHI, T.; OGAWA, A.; NAKATANI, T. Speaker-aware neural network based beamformer for speaker extraction in speech mixtures. In Proceedings of Interspeech 2017. Proceedings of Interspeech. Stockholm: International Speech Communication Association, 2017. p. 2655-2659. ISSN: 1990-9772. Detail

  • 2016

    EGOROVA, E.; SERRANO, J. Semi-Supervised Training of Language Model on Spanish Conversational Telephone Speech Data. In Procedia Computer Science. Procedia Computer Science. Yogyakarta: Elsevier Science, 2016. p. 114-120. ISSN: 1877-0509. Detail

    GLEMBEK, O. Summary report for project "Exploiting Language Information for Situational Awareness (ELISA)" for year 2016. Brno: University of Southern California, 2016. p. 1-2. Detail

    GRÉZL, F.; EGOROVA, E.; KARAFIÁT, M. Study of Large Data Resources for Multilingual Training and System Porting. In Procedia Computer Science. Procedia Computer Science. Yogyakarta: Elsevier Science, 2016. p. 15-22. ISSN: 1877-0509. Detail

    GRÉZL, F.; KARAFIÁT, M. Boosting Performance on Low-resource Languages by Standard Corpora: AN ANALYSIS. In Proceedings of SLT 2016. San Diego: IEEE Signal Processing Society, 2016. p. 629-636. ISBN: 978-1-5090-4903-5. Detail

    GRÉZL, F.; KARAFIÁT, M. Bottle-Neck Feature Extraction Structures for Multilingual Training and Porting. In Procedia Computer Science. Procedia Computer Science. Yogyakarta: Elsevier Science, 2016. p. 144-151. ISSN: 1877-0509. Detail

    KARAFIÁT, M.; BASKAR, M.; MATĚJKA, P.; VESELÝ, K.; GRÉZL, F.; ČERNOCKÝ, J. Multilingual BLSTM and Speaker-Specific Vector Adaptation in 2016 BUT BABEL SYSTEM. In Proceedings of SLT 2016. San Diego: IEEE Signal Processing Society, 2016. p. 637-643. ISBN: 978-1-5090-4903-5. Detail

    KARAFIÁT, M.; BURGET, L.; GRÉZL, F.; VESELÝ, K.; ČERNOCKÝ, J. Multilingual Region-Dependent Transforms. In Proceedings of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), 2016. Shanghai: IEEE Signal Processing Society, 2016. p. 5430-5434. ISBN: 978-1-4799-9988-0. Detail

    KESIRAJU, S.; BURGET, L.; SZŐKE, I.; ČERNOCKÝ, J. Learning document representations using subspace multinomial model. In Proceedings of Interspeech 2016. San Francisco: International Speech Communication Association, 2016. p. 700-704. ISBN: 978-1-5108-3313-5. Detail

    LI, R.; MALLIDI, S.; PLCHOT, O.; BURGET, L.; DEHAK, N. Exploiting Hidden-Layer Responses of Deep Neural Networks for Language Recognition. In Proceedings of Interspeech 2016. San Francisco: International Speech Communication Association, 2016. p. 3265-3269. ISBN: 978-1-5108-3313-5. Detail

    LOPEZ-MORENO, I.; GONZALEZ-DOMINGUEZ, J.; MARTÍNEZ GONZÁLEZ, D.; PLCHOT, O.; GONZALEZ-RODRIGUEZ, J.; MORENO, P. On the use of deep feedforward neural networks for automatic language identification. COMPUTER SPEECH AND LANGUAGE, 2016, vol. 40, p. 46-59. ISSN: 0885-2308. Detail

    LOZANO DÍEZ, A.; SILNOVA, A.; MATĚJKA, P.; GLEMBEK, O.; PLCHOT, O.; PEŠÁN, J.; BURGET, L.; GONZALEZ-RODRIGUEZ, J. Analysis and Optimization of Bottleneck Features for Speaker Recognition. In Proceedings of Odyssey 2016. Proceedings of Odyssey: The Speaker and Language Recognition Workshop. Bilbao: International Speech Communication Association, 2016. p. 352-357. ISSN: 2312-2846. Detail

    MATĚJKA, P. Summary report for project "Robust Automatic Speech Transcription" in Year 2016. Brno: Raytheon BBN Technologies, 2016. p. 1. Detail

    MATĚJKA, P.; GLEMBEK, O.; NOVOTNÝ, O.; PLCHOT, O.; GRÉZL, F.; BURGET, L.; ČERNOCKÝ, J. Analysis Of DNN Approaches To Speaker Identification. In Proceedings of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), 2016. Shanghai: IEEE Signal Processing Society, 2016. p. 5100-5104. ISBN: 978-1-4799-9988-0. Detail

    NOVOTNÝ, O.; MATĚJKA, P.; GLEMBEK, O.; PLCHOT, O.; GRÉZL, F.; BURGET, L.; ČERNOCKÝ, J. Analysis of the DNN-Based SRE Systems in Multi-language Conditions. In Proceedings of SLT 2016. San Diego: IEEE Signal Processing Society, 2016. p. 199-204. ISBN: 978-1-5090-4903-5. Detail

    NOVOTNÝ, O.; MATĚJKA, P.; PLCHOT, O.; GLEMBEK, O.; BURGET, L.; ČERNOCKÝ, J. Analysis of Speaker Recognition Systems in Realistic Scenarios of the SITW 2016 Challenge. In Proceedings of Interspeech 2016. San Francisco: International Speech Communication Association, 2016. p. 828-832. ISBN: 978-1-5108-3313-5. Detail

    ONDEL YANG, L.; BURGET, L.; ČERNOCKÝ, J. Variational Inference for Acoustic Unit Discovery. In Procedia Computer Science. Procedia Computer Science. Yogyakarta: Elsevier Science, 2016. p. 80-86. ISSN: 1877-0509. Detail

    PEŠÁN, J.; BURGET, L.; ČERNOCKÝ, J. Sequence Summarizing Neural Networks for Spoken Language Recognition. In Proceedings of Interspeech 2016. San Francisco: International Speech Communication Association, 2016. p. 3285-3289. ISBN: 978-1-5108-3313-5. Detail

    PLCHOT, O.; BURGET, L.; ARONOWITZ, H.; MATĚJKA, P. Audio Enhancing With DNN Autoencoder For Speaker Recognition. In Proceedings of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), 2016. Shanghai: IEEE Signal Processing Society, 2016. p. 5090-5094. ISBN: 978-1-4799-9988-0. Detail

    PLCHOT, O.; MATĚJKA, P.; FÉR, R.; GLEMBEK, O.; NOVOTNÝ, O.; PEŠÁN, J.; VESELÝ, K.; ONDEL YANG, L.; KARAFIÁT, M.; GRÉZL, F.; KESIRAJU, S.; BURGET, L.; BRUMMER, J.; SWART, A.; CUMANI, S.; MALLIDI, S.; LI, R. BAT System Description for NIST LRE 2015. In Proceedings of Odyssey 2016, The Speaker and Language Recognition Workshop. Proceedings of Odyssey: The Speaker and Language Recognition Workshop. Bilbao: International Speech Communication Association, 2016. p. 166-173. ISSN: 2312-2846. Detail

    POPKOVÁ, A.; POVOLNÝ, F.; MATĚJKA, P.; GLEMBEK, O.; GRÉZL, F.; ČERNOCKÝ, J. Investigation of Bottle-Neck Features for Emotion Recognition. In 19th International Conference, TSD 2016, Brno, Czech Republic, September 12-16, 2016, Proceedings. Lecture Notes in Computer Science, Lecture Notes in Artificial Intelligence. Brno: International Speech Communication Association, 2016. p. 426-434. ISSN: 0302-9743. Detail

    POVOLNÝ, F.; MATĚJKA, P.; HRADIŠ, M.; POPKOVÁ, A.; OTRUSINA, L.; SMRŽ, P.; WOOD, I.; ROBIN, C.; LAMEL, L. Multimodal Emotion Recognition for AVEC 2016 Challenge. In AVEC '16 Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge. Amsterdam: Association for Computing Machinery, 2016. p. 75-82. ISBN: 978-1-4503-4516-3. Detail

    SAGHA, H.; MATĚJKA, P.; GAVRYUKOVA, M.; POVOLNÝ, F.; MARCHI, E.; SCHULLER, B. Enhancing multilingual recognition of emotion in speech by language identification. In 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION - Proceedings (INTERSPEECH 2016). Proceedings of Interspeech. San Francisco: International Speech Communication Association, 2016. p. 2949-2953. ISSN: 1990-9772. Detail

    SKÁCEL, M.; KARAFIÁT, M.; ONDEL YANG, L.; UCHYTIL, A.; SZŐKE, I. BUT Zero-Cost Speech Recognition 2016 System Description. In CEUR Workshop Proceedings. CEUR Workshop Proceedings. Hilversum: CEUR-WS.org, 2016. p. 1-3. ISSN: 1613-0073. Detail

    SZŐKE, I.; ANGUERA, X. Zero-Cost Speech Recognition Task at Mediaeval 2016. In CEUR Workshop Proceedings. CEUR Workshop Proceedings. Hilversum: CEUR-WS.org, 2016. p. 1-3. ISSN: 1613-0073. Detail

    VESELÝ, K.; WATANABE, S.; ŽMOLÍKOVÁ, K.; KARAFIÁT, M.; BURGET, L.; ČERNOCKÝ, J. Sequence Summarizing Neural Network for Speaker Adaptation. In Proceedings of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), 2016. Shanghai: IEEE Signal Processing Society, 2016. p. 5315-5319. ISBN: 978-1-4799-9988-0. Detail

    ZEINALI, H.; BURGET, L.; SAMETI, H.; GLEMBEK, O.; PLCHOT, O. Deep Neural Networks and Hidden Markov Models in i-vector-based Text-Dependent Speaker Verification. In Proceedings of Odyssey 2016, The Speaker and Language Recognition Workshop. Proceedings of Odyssey: The Speaker and Language Recognition Workshop. Bilbao: International Speech Communication Association, 2016. p. 24-30. ISSN: 2312-2846. Detail

    ZEINALI, H.; SAMETI, H.; BURGET, L.; ČERNOCKÝ, J.; MAGHSOODI, N.; MATĚJKA, P. i-vector/HMM Based Text-dependent Speaker Verification System for RedDots Challenge. In Proceedings of Interspeech 2016. San Francisco: International Speech Communication Association, 2016. p. 440-444. ISBN: 978-1-5108-3313-5. Detail

    ŽMOLÍKOVÁ, K.; KARAFIÁT, M.; VESELÝ, K.; DELCROIX, M.; WATANABE, S.; BURGET, L.; ČERNOCKÝ, J. Data selection by sequence summarizing neural network in mismatch condition training. In Proceedings of Interspeech 2016. San Francisco: International Speech Communication Association, 2016. p. 2354-2358. ISBN: 978-1-5108-3313-5. Detail

  • 2015

    ANGUERA, X.; RODRIGUEZ-FUENTES, L.; BUZO, A.; METZE, F.; SZŐKE, I.; PENAGARIKANO, M. QUESST 2014: Evaluating Query-By-Example Speech Search in a Zero-Resource Setting with Real-Life Queries. In Proceedings of 2015 IEEE International Conference on Acoustics, Speech and Signal Processing. South Brisbane, Queensland: IEEE Signal Processing Society, 2015. p. 5833-5837. ISBN: 978-1-4673-6997-8. Detail

    CUMANI, S.; PLCHOT, O.; FÉR, R. Exploiting i-vector posterior covariances for short-duration language recognition. In Proceedings of Interspeech 2015. Proceedings of Interspeech. Dresden: International Speech Communication Association, 2015. p. 1002-1006. ISBN: 978-1-5108-1790-6. ISSN: 1990-9772. Detail

    FÉR, R.; MATĚJKA, P.; GRÉZL, F.; PLCHOT, O.; ČERNOCKÝ, J. Multilingual Bottleneck Features for Language Recognition. In Proceedings of Interspeech 2015. Proceedings of Interspeech. Dresden: International Speech Communication Association, 2015. p. 389-393. ISBN: 978-1-5108-1790-6. ISSN: 1990-9772. Detail

    GLEMBEK, O.; KESIRAJU, S.; ONDEL YANG, L. Summary report for project "ELISA" in Year 2015. Brno: University of Southern California, 2015. p. 0-0. Detail

    GLEMBEK, O.; MATĚJKA, P.; BURGET, L.; SCHWARZ, P.; PEŠÁN, J.; PLCHOT, O. Voice-print transformation for migration between automatic speaker identification systems. Abstract book of the 7th European Academy of Forensic Science Conference. Prague: Criminal Police Department Prague, 2015. p. 345-345. ISBN: 978-80-260-8659-8. Detail

    GLEMBEK, O.; MATĚJKA, P.; PLCHOT, O.; PEŠÁN, J.; BURGET, L.; SCHWARZ, P. Migrating i-vectors Between Speaker Recognition Systems Using Regression Neural Networks. In Proceedings of Interspeech 2015. Proceedings of Interspeech. Dresden: International Speech Communication Association, 2015. p. 2327-2331. ISBN: 978-1-5108-1790-6. ISSN: 1990-9772. Detail

    GRÉZL, F.; KARAFIÁT, M.; VESELÝ, K.; ŽIŽKA, J. Summary report for project "Audio-visual data processing for Superlectures.com" for year 2015. Brno: ReplayWell, s. r. o., 2015. p. 0-0. Detail

    HEŘMANSKÝ, H.; BURGET, L.; COHEN, J.; DUPOUX, E.; FELDMAN, N.; GODFREY, J.; KHUDANPUR, S.; MACIEJEWSKI, M.; MALLIDI, S.; MENON, A.; OGAWA, T.; PEDDINTI, V.; ROSE, R.; STERN, R.; WIESNER, M.; VESELÝ, K. Towards Machines That Know When They Do Not Know: Summary of Work Done at 2014 FREDERICK JELINEK MEMORIAL WORKSHOP. In Proceedings of 2015 IEEE International Conference on Acoustics, Speech and Signal Processing. South Brisbane, Queensland: IEEE Signal Processing Society, 2015. p. 5009-5013. ISBN: 978-1-4673-6997-8. Detail

    HSIAO, R.; MA, J.; HARTMANN, W.; KARAFIÁT, M.; GRÉZL, F.; BURGET, L.; SZŐKE, I.; ČERNOCKÝ, J.; WATANABE, S.; CHEN, Z.; MALLIDI, S.; HEŘMANSKÝ, H.; TSAKALIDIS, S.; SCHWARTZ, R. Robust Speech Recognition in Unknown Reverberant and Noisy Conditions. In Proceedings of 2015 IEEE Automatic Speech Recognition and Understanding Workshop. Scottsdale, Arizona: IEEE Signal Processing Society, 2015. p. 533-538. ISBN: 978-1-4799-7291-3. Detail

    KARAFIÁT, M.; GRÉZL, F. Summary report for project "ASR-FR" for year 2015. Brno: Phonexia s.r.o., 2015. p. 0-0. Detail

    KARAFIÁT, M.; GRÉZL, F. Summary report for project "Delivery of acoustic data annotations, acoustic model, language model and pronunciation dictionary for the French language" for year 2015. Brno: Phonexia s.r.o., 2015. p. 0-0. Detail

    KARAFIÁT, M.; GRÉZL, F.; BURGET, L.; SZŐKE, I.; ČERNOCKÝ, J. Three ways to adapt a CTS recognizer to unseen reverberated speech in BUT system for the ASpIRE challenge. In Proceedings of Interspeech 2015. Proceedings of Interspeech. Dresden: International Speech Communication Association, 2015. p. 2454-2458. ISBN: 978-1-5108-1790-6. ISSN: 1990-9772. Detail

    KARAFIÁT, M.; GRÉZL, F.; HANNEMANN, M.; VESELÝ, K. Summary report for project "Multilingual speech recognition" in Year 2015. Brno: Raytheon BBN Technologies, 2015. p. 0-0. Detail

    MALLIDI, S.; OGAWA, T.; VESELÝ, K.; NIDADAVOLU, P.; HEŘMANSKÝ, H. Autoencoder based multi-stream combination for noise robust speech recognition. In Proceedings of Interspeech 2015. Proceedings of Interspeech. Dresden: International Speech Communication Association, 2015. p. 3551-3555. ISBN: 978-1-5108-1790-6. ISSN: 1990-9772. Detail

    MATĚJKA, P.; PLCHOT, O.; NOVOTNÝ, O.; FÉR, R. Summary report for project "Robust Automatic Speech Transcription" in Year 2015. Brno: Raytheon BBN Technologies, 2015. p. 0-0. Detail

    MOTLÍČEK, P.; DEY, S.; MADIKERI, S.; BURGET, L. Employment of Subspace Gaussian Mixture Models in Speaker Recognition. In Proceedings of 2015 IEEE International Conference on Acoustics, Speech and Signal Processing. South Brisbane, Queensland: IEEE Signal Processing Society, 2015. p. 4445-4449. ISBN: 978-1-4673-6997-8. Detail

    ONDEL YANG, L.; ANGUERA, X.; LUQUE, J. MASK+: Data-Driven Regions Selection for Acoustic Fingerprinting. In Proceedings of 2015 IEEE International Conference on Acoustics, Speech and Signal Processing. South Brisbane, Queensland: IEEE Signal Processing Society, 2015. p. 335-339. ISBN: 978-1-4673-6997-8. Detail

    PEŠÁN, J.; BURGET, L.; HEŘMANSKÝ, H.; VESELÝ, K. DNN derived filters for processing of modulation spectrum of speech. In Proceedings of Interspeech 2015. Proceedings of Interspeech. Dresden: International Speech Communication Association, 2015. p. 1908-1911. ISBN: 978-1-5108-1790-6. ISSN: 1990-9772. Detail

    SILNOVA, A.; GLEMBEK, O.; KINNUNEN, T.; MATĚJKA, P. Exploring ANN Back-Ends for i-Vector Based Speaker Age Estimation. In Proceedings of Interspeech 2015. Proceedings of Interspeech. Dresden: International Speech Communication Association, 2015. p. 3036-3040. ISBN: 978-1-5108-1790-6. ISSN: 1990-9772. Detail

    SKÁCEL, M.; SZŐKE, I. BUT QUESST 2015 System Description. In CEUR Workshop Proceedings. CEUR Workshop Proceedings. Wurzen: CEUR-WS.org, 2015. p. 1-3. ISSN: 1613-0073. Detail

    SZŐKE, I.; METZE, F.; RODRIGUEZ-FUENTES, L.; PROENCA, J.; BUZO, A.; LOJKA, M.; ANGUERA, X.; XIONG, X. Query by Example Search on Speech at Mediaeval 2015. In CEUR Workshop Proceedings. CEUR Workshop Proceedings. Wurzen: CEUR-WS.org, 2015. p. 1-3. ISSN: 1613-0073. Detail

    SZŐKE, I.; SKÁCEL, M.; ČERNOCKÝ, J.; BURGET, L. Coping with Channel Mismatch in Query-By-Example - BUT QUESST 2014. In Proceedings of 2015 IEEE International Conference on Acoustics, Speech and Signal Processing. South Brisbane, Queensland: IEEE Signal Processing Society, 2015. p. 5838-5842. ISBN: 978-1-4673-6997-8. Detail

  • 2014

    ANGUERA, X.; RODRIGUEZ-FUENTES, L.; SZŐKE, I.; BUZO, A.; METZE, F. Query-by-example Spoken Term Detection Evaluation on Low-resource Languages. Proceedings of the 4th International Workshop on Spoken Language Technologies for Under-resourced Languages SLTU-2014. St. Petersburg, Russia. St. Petersburg: International Speech Communication Association, 2014. p. 24-31. ISBN: 978-5-8088-0908-6. Detail

    ANGUERA, X.; RODRIGUEZ-FUENTES, L.; SZŐKE, I.; BUZO, A.; METZE, F. Query by Example Search on Speech at Mediaeval 2014. In CEUR Workshop Proceedings. CEUR Workshop Proceedings. Barcelona: CEUR-WS.org, 2014. p. 1-2. ISSN: 1613-0073. Detail

    BAHARI, M.; DEHAK, N.; VAN HAMME, H.; BURGET, L.; ALI, A.; GLASS, J. Non-Negative Factor Analysis of Gaussian Mixture Model Weight Adaptation for Language and Dialect Recognition. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, 2014, vol. 22, no. 7, p. 1117-1129. ISSN: 2329-9290. Detail

    CUMANI, S.; LAFACE, P.; PLCHOT, O. On the use of i-vector posterior distributions in Probabilistic Linear Discriminant Analysis. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, 2014, vol. 22, no. 4, p. 846-857. ISSN: 2329-9290. Detail

    EGOROVA, E. Multi-task Neural Networks For Speech Recognition. Proceedings of the 20th Student Conference, EEICT 2014. Volume 2. Brno: Brno University of Technology, 2014. p. 24-26. ISBN: 978-80-214-4923-7. Detail

    GLEMBEK, O.; MA, J.; MATĚJKA, P.; ZHANG, B.; PLCHOT, O.; BURGET, L.; MATSOUKAS, S. Domain Adaptation Via Within-class Covariance Correction in I-Vector Based Speaker Recognition Systems. In Proceedings of ICASSP 2014. Florence: IEEE Signal Processing Society, 2014. p. 4060-4064. ISBN: 978-1-4799-2892-7. Detail

    GRÉZL, F.; EGOROVA, E.; KARAFIÁT, M. Further Investigation into Multilingual Training and Adaptation of Stacked Bottle-neck Neural Network Structure. In Proceedings of 2014 Spoken Language Technology Workshop. South Lake Tahoe, Nevada: IEEE Signal Processing Society, 2014. p. 48-53. ISBN: 978-1-4799-7129-9. Detail

    GRÉZL, F.; KARAFIÁT, M. Adapting Multilingual Neural Network Hierarchy to a New Language. Proceedings of the 4th International Workshop on Spoken Language Technologies for Under-resourced Languages SLTU-2014. St. Petersburg, Russia, 2014. St. Petersburg: International Speech Communication Association, 2014. p. 39-45. ISBN: 978-5-8088-0908-6. Detail

    GRÉZL, F.; KARAFIÁT, M. Combination of Multilingual and Semi-Supervised Training for Under-Resourced Languages. In Proceedings of Interspeech 2014. Singapore: International Speech Communication Association, 2014. p. 820-824. ISBN: 978-1-63439-435-2. Detail

    GRÉZL, F.; KARAFIÁT, M.; VESELÝ, K. Adaptation of Multilingual Stacked Bottle-neck Neural Network Structure for New Language. In Proceedings of ICASSP 2014. Florence: IEEE Signal Processing Society, 2014. p. 7704-7708. ISBN: 978-1-4799-2892-7. Detail

    KARAFIÁT, M.; GRÉZL, F. Summary report for project "Delivery of acoustic data annotations, acoustic model, language model and pronunciation dictionary for the Spanish language" for year 2014. Brno: Phonexia s.r.o., 2014. p. 0-0. Detail

    KARAFIÁT, M.; GRÉZL, F.; HANNEMANN, M.; ČERNOCKÝ, J. BUT Neural Network Features for Spontaneous Vietnamese in BABEL. In Proceedings of ICASSP 2014. Florence: IEEE Signal Processing Society, 2014. p. 5659-5663. ISBN: 978-1-4799-2892-7. Detail

    KARAFIÁT, M.; GRÉZL, F.; VESELÝ, K.; HANNEMANN, M.; SZŐKE, I.; ČERNOCKÝ, J. BUT 2014 Babel System: Analysis of adaptation in NN based systems. In Proceedings of Interspeech 2014. Singapore: International Speech Communication Association, 2014. p. 3002-3006. ISBN: 978-1-63439-435-2. Detail

    KARAFIÁT, M.; VESELÝ, K.; SZŐKE, I.; BURGET, L.; GRÉZL, F.; HANNEMANN, M.; ČERNOCKÝ, J. BUT ASR System for BABEL Surprise Evaluation 2014. In Proceedings of 2014 Spoken Language Technology Workshop. South Lake Tahoe, Nevada: IEEE Signal Processing Society, 2014. p. 501-506. ISBN: 978-1-4799-7129-9. Detail

    LOPEZ-MORENO, I.; GONZALEZ-DOMINGUEZ, J.; MARTÍNEZ GONZÁLEZ, D.; PLCHOT, O.; GONZALEZ-RODRIGUEZ, J.; MORENO, P. Automatic Language Identification Using Deep Neural Networks. In Proceedings of ICASSP 2014. Florence: IEEE Signal Processing Society, 2014. p. 5374-5378. ISBN: 978-1-4799-2892-7. Detail

    MARTÍNEZ GONZÁLEZ, D.; BURGET, L.; STAFYLAKIS, T.; LEI, Y.; KENNY, P.; LLEIDA, E. Unscented Transform For Ivector-based Noisy Speaker Recognition. In Proceedings of ICASSP 2014. Florence: IEEE Signal Processing Society, 2014. p. 4070-4074. ISBN: 978-1-4799-2892-7. Detail

    MATĚJKA, P.; ZHANG, L.; NG, T.; MALLIDI, S.; GLEMBEK, O.; MA, J.; ZHANG, B. Neural Network Bottleneck Features for Language Identification. In Proceedings of Odyssey 2014. Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland. Joensuu: International Speech Communication Association, 2014. p. 299-304. ISSN: 2312-2846. Detail

    NG, T.; HSIAO, R.; ZHANG, L.; KARAKOS, D.; MALLIDI, S.; KARAFIÁT, M.; VESELÝ, K.; SZŐKE, I.; ZHANG, B.; NGUYEN, L.; SCHWARTZ, R. Progress in the BBN Keyword Search System for the DARPA RATS Program. In Proceedings of Interspeech 2014. Singapore: International Speech Communication Association, 2014. p. 959-963. ISBN: 978-1-63439-435-2. Detail

    PLCHOT, O.; DIEZ SÁNCHEZ, M.; SOUFIFAR, M.; BURGET, L. PLLR Features in Language Recognition System for RATS. In Proceedings of Interspeech 2014. Singapore: International Speech Communication Association, 2014. p. 3048-3051. ISBN: 978-1-63439-435-2. Detail

    SZŐKE, I.; BURGET, L.; GRÉZL, F.; ČERNOCKÝ, J.; ONDEL YANG, L. Calibration and Fusion of Query-by-example Systems - BUT SWS 2013. In Proceedings of ICASSP 2014. Florence: IEEE Signal Processing Society, 2014. p. 7899-7903. ISBN: 978-1-4799-2892-7. Detail

    SZŐKE, I.; SKÁCEL, M.; BURGET, L. BUT QUESST 2014 System Description. In CEUR Workshop Proceedings. CEUR Workshop Proceedings. Barcelona: CEUR-WS.org, 2014. p. 1-2. ISSN: 1613-0073. Detail

  • 2013

    AKBACAK, M.; BURGET, L.; WENG, W.; VAN HOUT, J. Rich System Combination For Keyword Spotting In Noisy and Acoustically Heterogeneous Audio Streams. Proceedings of ICASSP 2013. Vancouver: IEEE Signal Processing Society, 2013. p. 8267-8271. ISBN: 978-1-4799-0355-9. Detail

    ANGUERA, X.; METZE, F.; BUZO, A.; SZŐKE, I.; RODRIGUEZ-FUENTES, L. The Spoken Web Search Task. CEUR Workshop Proceedings. CEUR Workshop Proceedings. Barcelona: CEUR-WS.org, 2013. p. 1-2. ISSN: 1613-0073. Detail

    CUMANI, S.; BRUMMER, J.; BURGET, L.; LAFACE, P.; PLCHOT, O.; VASILAKAKIS, V. Pairwise Discriminative Speaker Verification in the I-Vector Space. IEEE Transactions on Audio, Speech, and Language Processing, 2013, vol. 21, no. 6, p. 1217-1227. ISSN: 1558-7916. Detail

    CUMANI, S.; PLCHOT, O.; LAFACE, P. Probabilistic Linear Discriminant Analysis Of I-Vector Posterior Distributions. Proceedings of ICASSP 2013. Vancouver: IEEE Signal Processing Society, 2013. p. 7644-7648. ISBN: 978-1-4799-0355-9. Detail

    EGOROVA, E.; VESELÝ, K.; KARAFIÁT, M.; JANDA, M.; ČERNOCKÝ, J. Manual and Semi-Automatic Approaches to Building a Multilingual Phoneme Set. In Proceedings of ICASSP 2013. Vancouver: IEEE Signal Processing Society, 2013. p. 7324-7328. ISBN: 978-1-4799-0355-9. Detail

    GRÉZL, F.; CHALUPNÍČEK, K.; KARAFIÁT, M.; VESELÝ, K. Summary report for project "Delivery of acoustic data annotations, acoustic model, language model and pronunciation dictionary for the Arabic language" for year 2013. Brno: Phonexia s.r.o., 2013. p. 0-0. Detail

    GRÉZL, F.; KARAFIÁT, M. Semi-Supervised Bootstrapping Approach For Neural Network Feature Extractor Training. Proceedings of ASRU 2013. Olomouc: IEEE Signal Processing Society, 2013. p. 470-475. ISBN: 978-1-4799-2755-5. Detail

    GRÉZL, F.; KARAFIÁT, M.; VESELÝ, K.; ŽIŽKA, J. Summary report for project "Audio-visual data processing for Superlectures.com" for year 2013. Brno: ReplayWell, s. r. o., 2013. p. 0-0. Detail

    HANNEMANN, M.; POVEY, D.; ZWEIG, G. Combining Forward and Backward Search in Decoding. Proceedings of ICASSP 2013. Vancouver: IEEE Signal Processing Society, 2013. p. 6739-6743. ISBN: 978-1-4799-0355-9. Detail

    HSIAO, R.; NG, T.; GRÉZL, F.; KARAKOS, D.; TSAKALIDIS, S.; NGUYEN, L.; SCHWARTZ, R. Discriminative Semi-supervised Training for Keyword Search in Low Resource Languages. Proceedings of ASRU 2013. Olomouc: IEEE Signal Processing Society, 2013. p. 440-445. ISBN: 978-1-4799-2755-5. Detail

    JANDA, M. Automatic Generation Of Pronunciation Dictionaries Based On Diarization. Proceedings of the 19th Conference Student EEICT 2013. Brno: Brno University of Technology, 2013. p. 228-232. ISBN: 978-80-214-4695-3. Detail

    KARAFIÁT, M.; GRÉZL, F.; HANNEMANN, M.; VESELÝ, K.; ČERNOCKÝ, J. BUT BABEL System for Spontaneous Cantonese. Proceedings of Interspeech 2013. Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013). Lyon: International Speech Communication Association, 2013. p. 2589-2593. ISBN: 978-1-62993-443-3. ISSN: 2308-457X. Detail

    KARAKOS, D.; SCHWARTZ, R.; TSAKALIDIS, S.; ZHANG, L.; RANJAN, S.; NG, T.; HSIAO, R.; NGUYEN, L.; GRÉZL, F.; HANNEMANN, M.; KARAFIÁT, M.; SZŐKE, I.; VESELÝ, K. Score Normalization and System Combination for Improved Keyword Spotting. In Proceedings of ASRU 2013. Olomouc: IEEE Signal Processing Society, 2013. p. 210-215. ISBN: 978-1-4799-2755-5. Detail

    KHOURY, E.; VESNICER, B.; FRANCO-PEDROSO, J.; DIEZ SÁNCHEZ, M.; CIPR, T.; SCHWARZ, P.; VAN LEEUWEN, D.; PETROVSKA-DELACRETAZ, D.; MATĚJKA, P.; RODRIGUEZ-FUENTES, L.; CHOLLET, G.; MARCEL, S. The 2013 Speaker Recognition Evaluation in Mobile Environment. Proceedings of Biometrics (ICB), 2013 International Conference on. Madrid: IEEE Biometric Council, 2013. p. 1-8. ISBN: 978-1-4799-0310-8. Detail

    LEI, Y.; BURGET, L.; SCHEFFER, N. A Noise Robust I-Vector Extractor Using Vector Taylor Series For Speaker Recognition. Proceedings of ICASSP 2013. Vancouver: IEEE Signal Processing Society, 2013. p. 6788-6791. ISBN: 978-1-4799-0355-9. Detail

    MCLAREN, M.; ABRASH, V.; GRACIARENA, M.; LEI, Y.; PEŠÁN, J. Improving Robustness to Compressed Speech in Speaker Recognition. Proceedings of Interspeech 2013. Lyon: International Speech Communication Association, 2013. p. 3698-3702. ISBN: 978-1-62993-443-3. Detail

    MOTLÍČEK, P.; POVEY, D.; KARAFIÁT, M. Feature And Score Level Combination Of Subspace Gaussians In LVCSR Task. Proceedings of ICASSP 2013. Vancouver: IEEE Signal Processing Society, 2013. p. 7604-7608. ISBN: 978-1-4799-0355-9. Detail

    PLCHOT, O.; BURGET, L.; SZŐKE, I. 2013 Summary report of project "Processing and analysis of speech, automatic speaker identification". Brno: Raytheon BBN Technologies, 2013. p. 0-0. Detail

    PLCHOT, O.; MATSOUKAS, S.; MATĚJKA, P.; DEHAK, N.; MA, J.; CUMANI, S.; GLEMBEK, O.; HEŘMANSKÝ, H.; MESGARANI, N.; SOUFIFAR, M.; THOMAS, S.; ZHANG, B.; ZHOU, X. Developing A Speaker Identification System For The DARPA RATS Project. Proceedings of ICASSP 2013. Vancouver: IEEE Signal Processing Society, 2013. p. 6768-6772. ISBN: 978-1-4799-0355-9. Detail

    RATH, S.; BURGET, L.; KARAFIÁT, M.; GLEMBEK, O.; ČERNOCKÝ, J. A Region-specific Feature-space Transformation for Speaker Adaptation and Singularity Analysis of Jacobian Matrix. Proceedings of Interspeech 2013. Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013). Lyon: International Speech Communication Association, 2013. p. 1228-1232. ISBN: 978-1-62993-443-3. ISSN: 2308-457X. Detail

    RATH, S.; POVEY, D.; VESELÝ, K.; ČERNOCKÝ, J. Improved Feature Processing for Deep Neural Networks. Proceedings of Interspeech 2013. Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013). Lyon: International Speech Communication Association, 2013. p. 109-113. ISBN: 978-1-62993-443-3. ISSN: 2308-457X. Detail

    SOUFIFAR, M.; BURGET, L.; PLCHOT, O.; CUMANI, S.; ČERNOCKÝ, J. Regularized Subspace n-Gram Model for Phonotactic iVector Extraction. Proceedings of Interspeech 2013. Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013). Lyon: International Speech Communication Association, 2013. p. 74-78. ISBN: 978-1-62993-443-3. ISSN: 2308-457X. Detail

    SZŐKE, I.; BURGET, L.; GRÉZL, F.; ONDEL YANG, L. BUT SWS 2013 - Massive Parallel Approach. In Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop. CEUR Workshop Proceedings. Barcelona: CEUR-WS.org, 2013. p. 1-2. ISSN: 1613-0073. Detail

    TRESADERN, P.; COOTES, T.; POH, N.; MATĚJKA, P.; HADID, A.; LÉVY, C.; MCCOOL, C.; MARCEL, S. Mobile Biometrics: Combined Face and Voice Verification for a Mobile Platform. IEEE PERVASIVE COMPUTING, 2013, vol. 12, no. 1, p. 79-87. ISSN: 1536-1268. Detail

    VESELÝ, K.; GHOSHAL, A.; BURGET, L.; POVEY, D. Sequence-discriminative Training of Deep Neural Networks. Proceedings of Interspeech 2013. Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013). Lyon: International Speech Communication Association, 2013. p. 2345-2349. ISBN: 978-1-62993-443-3. ISSN: 2308-457X. Detail

    VESELÝ, K.; HANNEMANN, M.; BURGET, L. Semi-supervised Training of Deep Neural Networks. Proceedings of ASRU 2013. Olomouc: IEEE Signal Processing Society, 2013. p. 267-272. ISBN: 978-1-4799-2755-5. Detail

    ZHILA, A.; YIH, W.; MEEK, C.; MIKOLOV, T.; ZWEIG, G. Combining Heterogeneous Models for Measuring Relational Similarity. Proceedings of NAACL-HLT 2013. Atlanta, Georgia: Association for Computational Linguistics, 2013. p. 1000-1009. ISBN: 978-1-937284-47-3. Detail

  • 2012

    BOUSQUET, P.; LARCHER, A.; MATROUF, D.; BONASTRE, J.; PLCHOT, O. Variance-Spectra based Normalization for I-vector Standard and Probabilistic Linear Discriminant Analysis. In Proceedings of Odyssey 2012: The Speaker and Language Recognition Workshop. Singapore: International Speech Communication Association, 2012. p. 157-164. ISBN: 978-981-07-3093-2. Detail

    BRUMMER, J.; CUMANI, S.; GLEMBEK, O.; KARAFIÁT, M.; MATĚJKA, P.; PEŠÁN, J.; PLCHOT, O.; SOUFIFAR, M.; DE VILLIERS, E.; ČERNOCKÝ, J. Description and analysis of the Brno276 system for LRE2011. In Proceedings of Odyssey 2012: The Speaker and Language Recognition Workshop. Singapore: International Speech Communication Association, 2012. p. 216-223. ISBN: 978-981-07-3093-2. Detail

    ČERNOCKÝ, J. Speech Data Mining at BUT Speech@FIT. Hovory s informatiky 2012. Prague: Akademie věd ČR, 2012. p. 113-114. ISBN: 978-80-87136-14-0. Detail

    CUMANI, S.; GLEMBEK, O.; BRUMMER, J.; DE VILLIERS, E.; LAFACE, P. Gender Independent Discriminative Speaker Recognition in I-Vector Space. Proc. International Conference on Acoustics, Speech, and Signal Processing. Kyoto: IEEE Signal Processing Society, 2012. p. 4361-4364. ISBN: 978-1-4673-0044-5. Detail

    CUMANI, S.; PLCHOT, O.; KARAFIÁT, M. Independent Component Analysis and MLLR Transforms for Speaker Identification. Proc. International Conference on Acoustics, Speech, and Signal Processing. Kyoto: IEEE Signal Processing Society, 2012. p. 4365-4368. ISBN: 978-1-4673-0044-5. Detail

    D'HARO, L.; GLEMBEK, O.; PLCHOT, O.; MATĚJKA, P.; SOUFIFAR, M.; CORDOBA, R.; ČERNOCKÝ, J. Phonotactic Language Recognition using i-vectors and Phoneme Posteriogram Counts. Proceedings of Interspeech 2012. Proceedings of Interspeech. Portland, Oregon: International Speech Communication Association, 2012. p. 1-4. ISBN: 978-1-62276-759-5. ISSN: 1990-9772. Detail

    DEORAS, A.; MIKOLOV, T.; KOMBRINK, S.; CHURCH, K. Approximate inference: A sampling based modeling technique to capture complex dependencies in a language model. Speech Communication, 2012, vol. 2012, no. 8, p. 1-16. ISSN: 0167-6393. Detail

    FERRER, L.; BURGET, L.; PLCHOT, O.; SCHEFFER, N. A Unified Approach for Audio Characterization and its Application to Speaker Recognition. Proceedings of Odyssey 2012, The Speaker and Language Recognition Workshop. Singapore: International Speech Communication Association, 2012. p. 317-323. ISBN: 978-981-07-3093-2. Detail

    HAIN, T.; BURGET, L.; DINES, J.; GARNER, P.; GRÉZL, F.; EL HANNANI, A.; HUIJBREGTS, M.; KARAFIÁT, M.; LINCOLN, M.; WAN, V. Transcribing Meetings with the AMIDA System. IEEE Transactions on Audio, Speech, and Language Processing, 2012, vol. 20, no. 2, p. 486-498. ISSN: 1558-7916. Detail

    JANDA, M. Grapheme Based Speech Recognition. Proceedings of the 18th Conference STUDENT EEICT 2012. Brno: Brno University of Technology, 2012. p. 441-445. ISBN: 978-80-214-4460-7. Detail

    JANDA, M.; KARAFIÁT, M.; ČERNOCKÝ, J. Dealing with Numbers in Grapheme-Based Speech Recognition. Proceedings of 15th International Conference on Text, Speech and Dialogue. Lecture Notes in Computer Science, 2012, Volume 7499. Springer-Verlag Berlin Heidelberg 2012: Springer Verlag, 2012. p. 438-445. ISBN: 978-3-642-32789-6. ISSN: 0302-9743. Detail

    KARAFIÁT, M.; JANDA, M.; ČERNOCKÝ, J.; BURGET, L. Region Dependent Linear Transforms in Multilingual Speech Recognition. In Proc. International Conference on Acoustics, Speech, and Signal Processing 2012. Kyoto: IEEE Signal Processing Society, 2012. p. 4885-4888. ISBN: 978-1-4673-0044-5. Detail

    KOMBRINK, S.; HANNEMANN, M.; BURGET, L. Out-of-Vocabulary Word Detection and Beyond. In Detection and Identification of Rare Audiovisual Cues. Studies in Computational Intelligence, 384. Springer-Verlag Berlin Heidelberg: Springer Verlag, 2012. p. 57-65. ISBN: 978-3-642-24033-1. Detail

    KOMBRINK, S.; MIKOLOV, T.; KARAFIÁT, M.; BURGET, L. Improving Language Models for ASR Using Translated In-domain Data. Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing. Kyoto: IEEE Signal Processing Society, 2012. p. 4405-4408. ISBN: 978-1-4673-0044-5. Detail

    LEI, Y.; BURGET, L.; FERRER, L.; GRACIARENA, M.; SCHEFFER, N. Towards Noise-Robust Speaker Recognition Using Probabilistic Linear Discriminant Analysis. Proc. International Conference on Acoustics, Speech, and Signal Processing. Kyoto: IEEE Signal Processing Society, 2012. p. 4253-4256. ISBN: 978-1-4673-0044-5. Detail

    LEI, Y.; BURGET, L.; SCHEFFER, N. Bilinear Factor Analysis for iVector Based Speaker Verification. Proceedings of Interspeech. Portland, Oregon: International Speech Communication Association, 2012. p. 1-4. ISBN: 978-1-62276-759-5. Detail

    MARTÍNEZ GONZÁLEZ, D.; BURGET, L.; FERRER, L.; SCHEFFER, N. Ivector-Based Prosodic System For Language Identification. Proc. International Conference on Acoustics, Speech, and Signal Processing. Kyoto: IEEE Signal Processing Society, 2012. p. 4861-4864. ISBN: 978-1-4673-0044-5. Detail

    MATĚJKA, P.; PLCHOT, O.; SOUFIFAR, M.; GLEMBEK, O.; D'HARO, L.; VESELÝ, K.; GRÉZL, F.; MA, J.; MATSOUKAS, S.; DEHAK, N. Patrol Team Language Identification System for DARPA RATS P1 Evaluation. Proceedings of Interspeech 2012. Proceedings of Interspeech. Portland, Oregon: International Speech Communication Association, 2012. p. 1-4. ISBN: 978-1-62276-759-5. ISSN: 1990-9772. Detail

    MCCOOL, C.; MARCEL, S.; MATĚJKA, P.; ČERNOCKÝ, J.; KITTLER, J.; LARCHER, A.; LÉVY, C.; MATROUF, D.; BONASTRE, J. Bi-Modal Person Recognition on a Mobile Phone: using mobile phone data. 2012 IEEE International Conference on Multimedia and Expo Workshops. Melbourne, Victoria: IEEE Computer Society, 2012. p. 635-640. ISBN: 978-1-4673-2027-6. Detail

    METZE, F.; RAJPUT, N.; ANGUERA, X.; DAVEL, M.; GRAVIER, G.; HEERDEN, C.; MANTENA, G.; MUSCARIELLO, A.; PRAHALLAD, K.; SZŐKE, I.; TEJEDOR, J. The Spoken Web Search Task at MediaEval 2011. Proc. International Conference on Acoustics, Speech, and Signal Processing. Kyoto: IEEE Signal Processing Society, 2012. p. 5165-5168. ISBN: 978-1-4673-0044-5. Detail

    MOTLÍČEK, P.; VALENTE, F.; SZŐKE, I. Improving Acoustic Based Keyword Spotting Using LVCSR Lattices. Proc. International Conference on Acoustics, Speech, and Signal Processing 2012. Kyoto: IEEE Signal Processing Society, 2012. p. 4413-4416. ISBN: 978-1-4673-0044-5. Detail

    NG, T.; ZHANG, B.; NGUYEN, L.; MATSOUKAS, S.; ZHOU, X.; MESGARANI, N.; VESELÝ, K.; MATĚJKA, P. Developing a Speech Activity Detection System for the DARPA RATS Program. Proceedings of Interspeech 2012. Proceedings of Interspeech. Portland, Oregon: International Speech Communication Association, 2012. p. 1-4. ISBN: 978-1-62276-759-5. ISSN: 1990-9772. Detail

    PLCHOT, O.; KARAFIÁT, M.; BRUMMER, J.; GLEMBEK, O.; MATĚJKA, P.; DE VILLIERS, E.; ČERNOCKÝ, J. Speaker vectors from Subspace Gaussian Mixture Model as complementary features for Language Identification. In Proceedings of Odyssey 2012, The Speaker and Language Recognition Workshop. Singapore: International Speech Communication Association, 2012. p. 330-333. ISBN: 978-981-07-3093-2. Detail

    POVEY, D.; HANNEMANN, M.; BOULIANNE, G.; BURGET, L.; GHOSHAL, A.; JANDA, M.; KARAFIÁT, M.; KOMBRINK, S.; MOTLÍČEK, P.; QIAN, Y.; RIEDHAMMER, K.; VESELÝ, K.; VU, N. Generating Exact Lattices in The WFST Framework. Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing. Kyoto: IEEE Signal Processing Society, 2012. p. 4213-4216. ISBN: 978-1-4673-0044-5. Detail

    RATH, S.; KARAFIÁT, M.; GLEMBEK, O.; ČERNOCKÝ, J. A factorized representation of FMLLR transform based on QR-decomposition. Proceedings of Interspeech 2012. Proceedings of Interspeech. Portland, Oregon: International Speech Communication Association, 2012. p. 1-4. ISBN: 978-1-62276-759-5. ISSN: 1990-9772. Detail

    SOUFIFAR, M.; CUMANI, S.; BURGET, L.; ČERNOCKÝ, J. Discriminative Classifiers for Phonotactic Language Recognition with iVectors. Proc. International Conference on Acoustics, Speech, and Signal Processing 2012. Kyoto: IEEE Signal Processing Society, 2012. p. 4853-4856. ISBN: 978-1-4673-0044-5. Detail

    SZŐKE, I.; FAPŠO, M.; VESELÝ, K. BUT2012 přístup pro Spoken Web Search úkol na MediaEval2012 [BUT2012 approach for the Spoken Web Search task at MediaEval2012]. Working Notes Proceedings of the MediaEval 2012 Workshop. CEUR Workshop Proceedings. Pisa: CEUR-WS.org, 2012. p. 1-2. ISSN: 1613-0073. Detail

    SZŐKE, I.; FAPŠO, M.; ŽIŽKA, J.; BERAN, V.; ČERNOCKÝ, J. Efektivní přístup ke znalostem v audio-vizuálních záznamech [Effective access to knowledge in audio-visual recordings]. Proceedings of the Annual Database Conference. Praha: Technická univerzita v Košiciach, 2012. p. 57-74. ISBN: 978-80-553-1049-7. Detail

    TEJEDOR, J.; FAPŠO, M.; SZŐKE, I.; ČERNOCKÝ, J.; GRÉZL, F. Comparison of methods for language-dependent and language-independent query-by-example spoken term detection. ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2012, vol. 2012, no. 30, p. 1-34. ISSN: 1046-8188. Detail

    VESELÝ, K.; KARAFIÁT, M.; GRÉZL, F.; JANDA, M.; EGOROVA, E. The Language-Independent Bottleneck Features. Proceedings of IEEE 2012 Workshop on Spoken Language Technology. Miami: IEEE Signal Processing Society, 2012. p. 336-341. ISBN: 978-1-4673-5124-9. Detail

  • 2011

    BOŘIL, H.; GRÉZL, F.; HANSEN, J. Front-End Compensation Methods for LVCSR Under Lombard Effect. Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. p. 1257-1260. ISBN: 978-1-61839-270-1. ISSN: 1990-9772. Detail

    BURGET, L.; PLCHOT, O.; CUMANI, S.; GLEMBEK, O.; MATĚJKA, P.; BRÜMMER, N. Discriminatively Trained Probabilistic Linear Discriminant Analysis for Speaker Verification. In Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011. Praha: IEEE Signal Processing Society, 2011. p. 4832-4835. ISBN: 978-1-4577-0537-3. Detail

    ČERNOCKÝ, J. MOBIO D1.3 - Annual Report. Martigny: Information and Communication Technologies (ICT) 7th Framework programme, 2011. p. 0-0. Detail

    CUMANI, S.; BRÜMMER, N.; BURGET, L.; LAFACE, P. Fast Discriminative Speaker Verification in the I-Vector Space. Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011. Praha: IEEE Signal Processing Society, 2011. p. 4852-4855. ISBN: 978-1-4577-0537-3. Detail

    DEORAS, A.; MIKOLOV, T.; CHURCH, K. A Fast Re-scoring Strategy to Capture Long-Distance Dependencies. Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing July 2011 Edinburgh, Scotland, UK. Edinburgh: Association for Computational Linguistics, 2011. p. 1116-1127. ISBN: 978-1-937284-11-4. Detail

    DEORAS, A.; MIKOLOV, T.; KOMBRINK, S.; KARAFIÁT, M.; KHUDANPUR, S. Variational Approximation of Long-span Language Models for LVCSR. Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011. Praha: IEEE Signal Processing Society, 2011. p. 5532-5535. ISBN: 978-1-4577-0537-3. Detail

    GLEMBEK, O.; BURGET, L.; BRÜMMER, N.; PLCHOT, O.; MATĚJKA, P. Discriminatively Trained i-vector Extractor for Speaker Verification. In Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. p. 137-140. ISBN: 978-1-61839-270-1. ISSN: 1990-9772. Detail

    GLEMBEK, O.; BURGET, L.; KENNY, P.; KARAFIÁT, M.; MATĚJKA, P. Simplification and optimization of I-Vector Extraction. Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011. Praha: IEEE Signal Processing Society, 2011. p. 4516-4519. ISBN: 978-1-4577-0537-3. Detail

    GRÉZL, F. The Role of Neural Network Size in TRAP/HATS Feature Extraction. Proceedings Text, Speech and Dialogue 2011. Lecture Notes in Computer Science. LNAI 6836. Plzeň: Springer Verlag, 2011. p. 315-322. ISBN: 978-3-642-23537-5. ISSN: 0302-9743. Detail

    GRÉZL, F.; KARAFIÁT, M. Integrating recent MLP feature extraction techniques into TRAP architecture. Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. p. 1229-1232. ISBN: 978-1-61839-270-1. ISSN: 1990-9772. Detail

    GRÉZL, F.; KARAFIÁT, M.; JANDA, M. Study of Probabilistic and Bottle-Neck Features in Multilingual Environment. Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011. p. 359-364. ISBN: 978-1-4673-0366-8. Detail

    KARAFIÁT, M.; BURGET, L.; MATĚJKA, P.; GLEMBEK, O.; ČERNOCKÝ, J. iVector-Based Discriminative Adaptation for Automatic Speech Recognition. Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011. p. 152-157. ISBN: 978-1-4673-0366-8. Detail

    KOCKMANN, M.; BURGET, L.; ČERNOCKÝ, J. Application of speaker- and language identification state-of-the-art techniques for emotion recognition. Speech Communication, 2011, vol. 53, no. 9, p. 1172-1185. ISSN: 0167-6393. Detail

    KOCKMANN, M.; FERRER, L.; BURGET, L.; ČERNOCKÝ, J. iVector Fusion of Prosodic and Cepstral Features for Speaker Verification. Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. p. 265-268. ISBN: 978-1-61839-270-1. ISSN: 1990-9772. Detail

    KOCKMANN, M.; FERRER, L.; BURGET, L.; SHRIBERG, E.; ČERNOCKÝ, J. Recent Progress in Prosodic Speaker Verification. Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011. Praha: IEEE Signal Processing Society, 2011. p. 4556-4559. ISBN: 978-1-4577-0537-3. Detail

    KOMBRINK, S.; MIKOLOV, T. Recurrent Neural Network Language Modeling Applied to the Brno AMI/AMIDA 2009 Meeting Recognizer Setup. Proceedings of the 17th Conference STUDENT EEICT 2011. Volume 3. Brno: Brno University of Technology, 2011. p. 527-531. ISBN: 978-80-214-4273-3. Detail

    KOMBRINK, S.; MIKOLOV, T.; KARAFIÁT, M.; BURGET, L. Recurrent Neural Network based Language Modeling in Meeting Recognition. Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. p. 2877-2880. ISBN: 978-1-61839-270-1. ISSN: 1990-9772. Detail

    MARTÍNEZ GONZÁLEZ, D.; PLCHOT, O.; BURGET, L.; GLEMBEK, O.; MATĚJKA, P. Language Recognition in iVectors Space. In Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. p. 861-864. ISBN: 978-1-61839-270-1. ISSN: 1990-9772. Detail

    MATĚJKA, P.; GLEMBEK, O.; CASTALDO, F.; ALAM, J.; PLCHOT, O.; KENNY, P.; BURGET, L.; ČERNOCKÝ, J. Full-covariance UBM and Heavy-tailed PLDA in I-Vector Speaker Verification. In Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011. Praha: IEEE Signal Processing Society, 2011. p. 4828-4831. ISBN: 978-1-4577-0537-3. Detail

    MIKOLOV, T.; DEORAS, A.; KOMBRINK, S.; BURGET, L.; ČERNOCKÝ, J. Empirical Evaluation and Combination of Advanced Language Modeling Techniques. Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. p. 605-608. ISBN: 978-1-61839-270-1. ISSN: 1990-9772. Detail

    MIKOLOV, T.; DEORAS, A.; POVEY, D.; BURGET, L.; ČERNOCKÝ, J. Strategies for Training Large Scale Neural Network Language Models. Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011. p. 196-201. ISBN: 978-1-4673-0366-8. Detail

    MIKOLOV, T.; KOMBRINK, S.; BURGET, L.; ČERNOCKÝ, J.; KHUDANPUR, S. Extensions of Recurrent Neural Network Language Model. Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011. Praha: IEEE Signal Processing Society, 2011. p. 5528-5531. ISBN: 978-1-4577-0537-3. Detail

    MIKOLOV, T.; KOMBRINK, S.; DEORAS, A.; BURGET, L.; ČERNOCKÝ, J. RNNLM - Recurrent Neural Network Language Modeling Toolkit. Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011. p. 1-4. ISBN: 978-1-4673-0366-8. Detail

    PEŠÁN, J. Rozpoznávání mluvčího na mobilním telefonu [Speaker recognition on a mobile phone]. Proceedings of the 17th Conference Student EEICT 2011. Volume 2. Brno: Vysoké učení technické v Brně, 2011. p. 341-343. ISBN: 978-80-214-4272-6. Detail

    POVEY, D.; BURGET, L.; AGARWAL, M.; AKYAZI, P.; GHOSHAL, A.; GLEMBEK, O.; GOEL, N.; KARAFIÁT, M.; RASTROW, A.; ROSE, R.; SCHWARZ, P.; THOMAS, S. The subspace Gaussian mixture model-A structured model for speech recognition. COMPUTER SPEECH AND LANGUAGE, 2011, vol. 25, no. 2, p. 404-439. ISSN: 0885-2308. Detail

    POVEY, D.; GHOSHAL, A.; BOULIANNE, G.; BURGET, L.; GLEMBEK, O.; GOEL, N.; HANNEMANN, M.; MOTLÍČEK, P.; QIAN, Y.; SCHWARZ, P.; SILOVSKÝ, J.; STEMMER, G.; VESELÝ, K. The Kaldi Speech Recognition Toolkit. Proceedings of ASRU 2011. Hilton Waikoloa Village Resort, Hawaii: IEEE Signal Processing Society, 2011. p. 1-4. ISBN: 978-1-4673-0366-8. Detail

    POVEY, D.; KARAFIÁT, M.; GHOSHAL, A.; SCHWARZ, P. A Symmetrization of the Subspace Gaussian Mixture Model. Proceedings of 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing. Praha: IEEE Signal Processing Society, 2011. p. 4504-4507. ISBN: 978-1-4577-0537-3. Detail

    SOUFIFAR, M.; KOCKMANN, M.; BURGET, L.; PLCHOT, O.; GLEMBEK, O.; SVENDSEN, T. iVector Approach to Phonotactic Language Recognition. In Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. p. 2913-2916. ISBN: 978-1-61839-270-1. ISSN: 1990-9772. Detail

    VESELÝ, K.; KARAFIÁT, M.; GRÉZL, F. Convolutive Bottleneck Network Features for LVCSR. Proceedings of ASRU 2011. Big Island, Hawaii: IEEE Signal Processing Society, 2011. p. 42-47. ISBN: 978-1-4673-0366-8. Detail

  • 2010

    BRUMMER, J.; BURGET, L.; KENNY, P.; MATĚJKA, P.; DE VILLIERS, E.; KARAFIÁT, M.; KOCKMANN, M.; GLEMBEK, O.; PLCHOT, O.; BAUM, D.; SENOUSSAOUI, M. ABC System description for NIST SRE 2010. Proc. NIST 2010 Speaker Recognition Evaluation. Brno: National Institute of Standards and Technology, 2010. p. 1-20. Detail

    BURGET, L.; SCHWARZ, P.; AGARWAL, M.; AKYAZI, P.; FENG, K.; GHOSHAL, A.; GLEMBEK, O.; GOEL, N.; KARAFIÁT, M.; POVEY, D.; RASTROW, A.; ROSE, R.; THOMAS, S. Multilingual acoustic modeling for speech recognition based on Subspace Gaussian Mixture Models. Proc. International Conference on Acoustics, Speech, and Signal Processing. Dallas: IEEE Signal Processing Society, 2010. p. 4334-4337. ISBN: 978-1-4244-4296-6. ISSN: 1520-6149. Detail

    ČERNOCKÝ, J. MOBIO D7.4: Second report on dissemination activities. Martigny: Information and Communication Technologies (ICT) 7th Framework programme, 2010. p. 0-0. Detail

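    ČERNOCKÝ, J.; ŠEVEČKOVÁ, M. Korpusové a hlasové technologie v nové generaci elektronických slovníků - závěrečná technická zpráva [Corpus and speech technologies in a new generation of electronic dictionaries - final technical report]. Brno: Ministerstvo průmyslu a obchodu ČR, 2010. p. 0-0. Detail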

    ČERNOCKÝ, J.; SZŐKE, I.; HANNEMANN, M.; KOMBRINK, S. Word-subword based keyword spotting with implications in OOV detection. Pacific Grove: Institute of Electrical and Electronics Engineers, 2010. p. 0-0. Detail

    GHOSHAL, A.; POVEY, D.; AGARWAL, M.; AKYAZI, P.; BURGET, L.; FENG, K.; GLEMBEK, O.; GOEL, N.; KARAFIÁT, M.; RASTROW, A.; ROSE, R.; SCHWARZ, P.; THOMAS, S. A novel estimation of feature-space MLLR for full-covariance models. Proc. International Conference on Acoustics, Speech, and Signal Processing. Dallas: IEEE Signal Processing Society, 2010. p. 4310-4313. ISBN: 978-1-4244-4296-6. ISSN: 1520-6149. Detail

    GOEL, N.; THOMAS, S.; AGARWAL, M.; AKYAZI, P.; BURGET, L.; FENG, K.; GHOSHAL, A.; GLEMBEK, O.; KARAFIÁT, M.; POVEY, D.; RASTROW, A.; ROSE, R.; SCHWARZ, P. Approaches to automatic lexicon learning with limited training examples. Proc. International Conference on Acoustics, Speech, and Signal Processing. Dallas: IEEE Signal Processing Society, 2010. p. 5094-5097. ISBN: 978-1-4244-4296-6. ISSN: 1520-6149. Detail

    GRÉZL, F.; KARAFIÁT, M. Hierarchical Neural Net Architectures for Feature Extraction in ASR. Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010). Proceedings of Interspeech. Makuhari, Chiba: International Speech Communication Association, 2010. p. 1201-1204. ISBN: 978-1-61782-123-3. ISSN: 1990-9772. Detail

    HAIN, T.; BURGET, L.; DINES, J.; GARNER, P.; EL HANNANI, A.; HUIJBREGTS, M.; KARAFIÁT, M.; LINCOLN, M.; WAN, V. The AMIDA 2009 Meeting Transcription System. Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010). Proceedings of Interspeech. Makuhari, Chiba: International Speech Communication Association, 2010. p. 358-361. ISBN: 978-1-61782-123-3. ISSN: 1990-9772. Detail

    HANNEMANN, M.; KOMBRINK, S.; KARAFIÁT, M.; BURGET, L. Similarity Scoring for Recognizing Repeated Out-of-Vocabulary Words. Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010). Proceedings of Interspeech. Makuhari, Chiba: International Speech Communication Association, 2010. p. 897-900. ISBN: 978-1-61782-123-3. ISSN: 1990-9772. Detail

    JANČÍK, Z.; PLCHOT, O.; BRUMMER, J.; BURGET, L.; GLEMBEK, O.; HUBEIKA, V.; KARAFIÁT, M.; MATĚJKA, P.; MIKOLOV, T.; STRASHEIM, A.; ČERNOCKÝ, J. Data selection and calibration issues in automatic language recognition - investigation with BUT-AGNITIO NIST LRE 2009 system. In Proc. Odyssey 2010 - The Speaker and Language Recognition Workshop. Brno: International Speech Communication Association, 2010. p. 215-221. ISBN: 978-80-214-4114-9. Detail

    KARAFIÁT, M.; SZŐKE, I.; ČERNOCKÝ, J. Using Gradient Descent Optimization for Acoustics Training from Heterogeneous Data. Proc. Text, Speech and Dialogue 2010. Lecture Notes in Computer Science. LNAI 6231. Brno: Springer Verlag, 2010. p. 322-329. ISBN: 978-3-642-15759-2. ISSN: 0302-9743. Detail

    KOCKMANN, M.; BURGET, L.; ČERNOCKÝ, J. Brno University of Technology System for Interspeech 2010 Paralinguistic Challenge. Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010). Proceedings of Interspeech. Makuhari, Chiba: International Speech Communication Association, 2010. p. 2822-2825. ISBN: 978-1-61782-123-3. ISSN: 1990-9772. Detail

    KOCKMANN, M.; BURGET, L.; ČERNOCKÝ, J. Investigations into prosodic syllable contour features for speaker recognition. Proc. International Conference on Acoustics, Speech, and Signal Processing. Dallas: IEEE Signal Processing Society, 2010. p. 4418-4421. ISBN: 978-1-4244-4296-6. ISSN: 1520-6149. Detail

    KOCKMANN, M.; BURGET, L.; GLEMBEK, O.; FERRER, L.; ČERNOCKÝ, J. Prosodic Speaker Verification using Subspace Multinomial Models with Intersession Compensation. Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010). Proceedings of Interspeech. Makuhari, Chiba, Japan: International Speech Communication Association, 2010. p. 1061-1064. ISBN: 978-1-61782-123-3. ISSN: 1990-9772. Detail

    KOMBRINK, S.; HANNEMANN, M.; BURGET, L.; HEŘMANSKÝ, H. Recovery of Rare Words in Lecture Speech. Proc. Text, Speech and Dialogue 2010. Lecture Notes in Computer Science. Brno: Springer Verlag, 2010. p. 330-337. ISBN: 978-3-642-15759-2. ISSN: 0302-9743. Detail

    MARCEL, S.; MATĚJKA, P. MOBIO D6.6: Report on the MOBIO Final Prototypes. Martigny: Information and Communication Technologies (ICT) 7th Framework programme, 2010. p. 0-0. Detail

    MARCEL, S.; MCCOOL, C.; MATĚJKA, P.; ČERNOCKÝ, J.; KITTLER, J.; GLEMBEK, O.; PLCHOT, O.; JANČÍK, Z.; LARCHER, A.; LÉVY, C. On the Results of the First Mobile Biometry (MOBIO) Face and Speaker Verification Evaluation. In Recognizing Patterns in Signals, Speech, Images, and Videos. Lecture Notes in Computer Science. Istanbul: Springer Verlag, 2010. p. 210-225. ISBN: 978-3-642-17710-1. ISSN: 0302-9743. Detail

    MIKOLOV, T.; KARAFIÁT, M.; BURGET, L.; ČERNOCKÝ, J.; KHUDANPUR, S. Recurrent neural network based language model. Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010). Proceedings of Interspeech. Makuhari, Chiba: International Speech Communication Association, 2010. p. 1045-1048. ISBN: 978-1-61782-123-3. ISSN: 1990-9772. Detail

    MIKOLOV, T.; PLCHOT, O.; GLEMBEK, O.; MATĚJKA, P.; BURGET, L.; ČERNOCKÝ, J. PCA-based Feature Extraction for Phonotactic Language Recognition. In Proc. Odyssey 2010 - The Speaker and Language Recognition Workshop. Brno: International Speech Communication Association, 2010. p. 251-255. ISBN: 978-80-214-4114-9. Detail

    POVEY, D.; BURGET, L.; AGARWAL, M.; AKYAZI, P.; FENG, K.; GHOSHAL, A.; GLEMBEK, O.; GOEL, N.; KARAFIÁT, M.; RASTROW, A.; ROSE, R.; SCHWARZ, P.; THOMAS, S. Subspace Gaussian mixture models for speech recognition. Proc. International Conference on Acoustics, Speech, and Signal Processing. Dallas: IEEE Signal Processing Society, 2010. p. 4330-4333. ISBN: 978-1-4244-4296-6. ISSN: 1520-6149. Detail

    ROSE, R.; NOROUZIAN, A.; REDDY, A.; COY, A.; GUPTA, V.; KARAFIÁT, M. Subword-based spoken term detection in audio course lectures. Proc. International Conference on Acoustics, Speech, and Signal Processing. Dallas: IEEE Signal Processing Society, 2010. p. 5282-5285. ISBN: 978-1-4244-4296-6. ISSN: 1520-6149. Detail

    SANTHOSH KUMAR, C.; LI, H.; TONG, R.; MATĚJKA, P.; BURGET, L.; ČERNOCKÝ, J. Tuning phone decoders for language identification. Proc. International Conference on Acoustics, Speech, and Signal Processing 2010. Dallas: IEEE Signal Processing Society, 2010. p. 5010-5013. ISBN: 978-1-4244-4296-6. ISSN: 1520-6149. Detail

    SZŐKE, I.; ČERNOCKÝ, J.; FAPŠO, M.; ŽIŽKA, J. SPEECH@FIT LECTURE BROWSER. Proceedings of the 2010 IEEE Spoken Language Technology Workshop. IEEE Catalog Number: CFP 10SLT-USB. Berkeley, California: IEEE Signal Processing Society, 2010. p. 157-158. ISBN: 978-1-4244-7902-3. Detail

    SZŐKE, I.; GRÉZL, F.; ČERNOCKÝ, J.; FAPŠO, M. Acoustic keyword spotter - optimization from end-user perspective. Proceedings of the 2010 IEEE Spoken Language Technology Workshop. IEEE Catalog Number: CFP 10SLT-USB. Berkeley, California: IEEE Signal Processing Society, 2010. p. 177-181. ISBN: 978-1-4244-7902-3. Detail

    TEJEDOR, J.; SZŐKE, I.; FAPŠO, M. Novel Methods for Query Selection and Query Combination in Query-By-Example Spoken Term Detection. Proceedings of the ACM Multimedia 2010 International Conference. Copyright 2010 ACM 978-1-4503-0162-6/10/10. Florence: Association for Computing Machinery, 2010. p. 15-20. ISBN: 978-1-60558-933-6. Detail

    VESELÝ, K. Parallel training of neural networks for speech recognition. Proceedings of the 16th Conference STUDENT EEICT 2010. Volume 3. Brno: Brno University of Technology, 2010. p. 74-76. ISBN: 978-80-214-4078-4. Detail

    VESELÝ, K.; BURGET, L.; GRÉZL, F. Parallel Training of Neural Networks for Speech Recognition. Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010). Proceedings of Interspeech. Makuhari, Chiba: International Speech Communication Association, 2010. p. 2934-2937. ISSN: 1990-9772. Detail

    VESELÝ, K.; BURGET, L.; GRÉZL, F. Parallel Training of Neural Networks for Speech Recognition. Proc. Text, Speech and Dialogue 2010. Lecture Notes in Computer Science. LNAI 6231. Brno: Springer Verlag, 2010. p. 439-446. ISBN: 978-3-642-15759-2. ISSN: 0302-9743. Detail

    ŽIŽKA, J.; ČERNOCKÝ, J.; FAPŠO, M.; SZŐKE, I. Web-Based Lecture Browser with Speech Search. Znalosti 2010. Sborník příspěvků 9. ročníku konference. Jindřichův Hradec: Faculty of management and information, 2010. p. 287-290. ISBN: 978-80-245-1636-3. Detail

  • 2009

    BRÜMMER, N.; BURGET, L.; GLEMBEK, O.; HUBEIKA, V.; JANČÍK, Z.; KARAFIÁT, M.; MATĚJKA, P.; MIKOLOV, T.; PLCHOT, O.; STRASHEIM, A. BUT-AGNITIO System Description for NIST Language Recognition Evaluation 2009. Proceedings NIST 2009 Language Recognition Evaluation Workshop. Baltimore, Maryland, USA: National Institute of Standards and Technology, 2009. p. 1-7. Detail

    BRÜMMER, N.; STRASHEIM, A.; HUBEIKA, V.; MATĚJKA, P.; BURGET, L.; GLEMBEK, O. Discriminative Acoustic Language Recognition via Channel-Compensated GMM Statistics. Proc. Interspeech 2009. Proceedings of Interspeech. Brighton: International Speech Communication Association, 2009. p. 2187-2190. ISBN: 978-1-61567-692-7. ISSN: 1990-9772. Detail

    BURGET, L.; FAPŠO, M.; HUBEIKA, V.; GLEMBEK, O.; KARAFIÁT, M.; KOCKMANN, M.; MATĚJKA, P.; SCHWARZ, P.; ČERNOCKÝ, J. BUT system for NIST 2008 speaker recognition evaluation. Proc. Interspeech 2009. Proceedings of Interspeech. Brighton: International Speech Communication Association, 2009. p. 2335-2338. ISBN: 978-1-61567-692-7. ISSN: 1990-9772. Detail

    BURGET, L.; MATĚJKA, P.; HUBEIKA, V.; ČERNOCKÝ, J. Investigation into variants of Joint Factor Analysis for speaker recognition. Proc. Interspeech 2009. Proceedings of Interspeech. Brighton: International Speech Communication Association, 2009. p. 1263-1266. ISBN: 978-1-61567-692-7. ISSN: 1990-9772. Detail

    ČERNOCKÝ, J. MOBIO D7.3: First report on dissemination activities. Martigny: Information and Communication Technologies (ICT) 7th Framework programme, 2009. p. 0-0. Detail

    ČERNOCKÝ, J.; MATĚJKA, P.; GLEMBEK, O. MOBIO D3.4: Description and evaluation of advanced algorithms for uni-modal authentication. Martigny: Information and Communication Technologies (ICT) 7th Framework programme, 2009. p. 0-0. Detail

    DEHAK, N.; KENNY, P.; DEHAK, R.; GLEMBEK, O.; DUMOUCHEL, P.; BURGET, L.; HUBEIKA, V.; CASTALDO, F. Support vector machines and joint factor analysis for speaker verification. Proc. ICASSP 2009. Taiwan: IEEE Signal Processing Society, 2009. p. 1-4. ISBN: 978-1-4244-2354-5. Detail

    FAPŠO, M.; SZŐKE, I.; ČERNOCKÝ, J. Hlasový přístup ke korpusům - experimenty [Voice access to corpora - experiments]. Brno: Ministerstvo průmyslu a obchodu ČR, 2009. p. 0-0. Detail

    GARNER, P.; DINES, J.; HAIN, T.; EL HANNANI, A.; KARAFIÁT, M.; KORCHAGIN, D.; LINCOLN, M.; WAN, V.; ZHANG, L. Real-Time ASR from Meetings. Proc. Interspeech 2009. Proceedings of Interspeech. Brighton: International Speech Communication Association, 2009. p. 2119-2122. ISSN: 1990-9772. Detail

    GLEMBEK, O.; BURGET, L.; DEHAK, N.; BRÜMMER, N.; KENNY, P. Comparison of Scoring Methods used in Speaker Recognition with Joint Factor Analysis. Proc. ICASSP 2009. Taipei: IEEE Signal Processing Society, 2009. p. 1-4. ISBN: 978-1-4244-2354-5. Detail

    GRÉZL, F.; ČERNOCKÝ, J. Audio Surveillance through Known Event Classification. Radioengineering, 2009, vol. 18, no. 4, p. 671-675. ISSN: 1210-2512. Detail

    GRÉZL, F.; KARAFIÁT, M.; BURGET, L. Investigation into bottle-neck features for meeting speech recognition. Proc. Interspeech 2009. Proceedings of Interspeech. Brighton: International Speech Communication Association, 2009. p. 2947-2950. ISBN: 978-1-61567-692-7. ISSN: 1990-9772. Detail

    HUBEIKA, V. Speaker verification as a target-nontarget trial task. Proceedings of the 15th Conference and Competition STUDENT EEICT 2009. Brno: Faculty of Electrical Engineering and Communication BUT, 2009. p. 1-5. ISBN: 978-80-214-3870-5. Detail

    KAŠPAR, M.; PEŠÁN, J.; SZŐKE, I.; CHALUPNÍČEK, K.; ČERNOCKÝ, J. Technická zpráva k MPO projektu FT-TA3/006: Práce na Etapě 6: "Integrace" [Technical report for MPO project FT-TA3/006: Work on Stage 6: "Integration"]. Brno: Ministerstvo průmyslu a obchodu ČR, 2009. p. 0-0. Detail

    KAŠPAR, M.; ŠEVEČKOVÁ, M.; CHALUPNÍČEK, K.; ČERNOCKÝ, J. Textové a řečové korpusy [Text and speech corpora]. Brno: 2009. p. 0-0. Detail

    KOCKMANN, M.; BURGET, L.; ČERNOCKÝ, J. Brno University of Technology System for Interspeech 2009 Emotion Challenge. Proc. Interspeech 2009. Proceedings of Interspeech. Brighton: International Speech Communication Association, 2009. p. 348-351. ISSN: 1990-9772. Detail

    KOMBRINK, S.; BURGET, L.; MATĚJKA, P.; KARAFIÁT, M.; HEŘMANSKÝ, H. Posterior-based Out of Vocabulary Word Detection in Telephone Speech. Proc. Interspeech 2009. Proceedings of Interspeech. Brighton: International Speech Communication Association, 2009. p. 80-83. ISSN: 1990-9772. Detail

    MIKOLOV, T.; KOPECKÝ, J.; BURGET, L.; GLEMBEK, O.; ČERNOCKÝ, J. Neural network based language models for highly inflective languages. Proc. ICASSP 2009. Taipei: IEEE Signal Processing Society, 2009. p. 1-4. ISBN: 978-1-4244-2354-5. Detail

  • 2008

    BURGET, L.; BRÜMMER, N.; REYNOLDS, D.; KENNY, P.; PELECANOS, J.; VOGT, R.; CASTALDO, F.; DEHAK, N.; DEHAK, R.; GLEMBEK, O.; KARAM, Z.; NOECKER, J.; NA, H.; COSTIN, C.; HUBEIKA, V.; KAJAREKAR, S.; SCHEFFER, N.; ČERNOCKÝ, J. Robust Speaker Recognition Over Varying Channels. Baltimore: Johns Hopkins University, 2008. p. 0-0. Detail

    BURGET, L.; FAPŠO, M.; HUBEIKA, V.; GLEMBEK, O.; KARAFIÁT, M.; KOCKMANN, M.; MATĚJKA, P.; SCHWARZ, P.; ČERNOCKÝ, J. Brno University Of Technology - NIST 2008 SRE. Montreal: 2008. p. 1-28. Detail

    BURGET, L.; SCHWARZ, P.; MATĚJKA, P.; HANNEMANN, M.; RASTROW, A.; WHITE, C.; KHUDANPUR, S.; HEŘMANSKÝ, H.; ČERNOCKÝ, J. Combination of strongly and weakly constrained recognizers for reliable detection of OOVs. Proc. International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Las Vegas: IEEE Signal Processing Society, 2008. p. 1-4. ISBN: 1-4244-1484-9. Detail

    ČERNOCKÝ, J.; MATĚJKA, P. MOBIO D7.1: Planning of evaluation campaigns. Martigny: Information and Communication Technologies (ICT) 7th Framework programme, 2008. p. 0-0. Detail

    GLEMBEK, O.; MATĚJKA, P.; BURGET, L.; MIKOLOV, T. Advances in Phonotactic Language Recognition. Proc. Interspeech 2008. Proceedings of Interspeech. Brisbane: International Speech Communication Association, 2008. p. 1-4. ISSN: 1990-9772. Detail

    GRÉZL, F.; FOUSEK, P. Optimizing bottle-neck features for LVCSR. 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing. Las Vegas, Nevada: IEEE Signal Processing Society, 2008. p. 4729-4732. ISBN: 1-4244-1484-9. Detail

    HUBEIKA, V.; BURGET, L.; MATĚJKA, P.; SCHWARZ, P. Discriminative Training and Channel Compensation for Acoustic Language Recognition. Proc. Interspeech 2008. Proceedings of Interspeech. Brisbane: International Speech Communication Association, 2008. p. 1-4. ISSN: 1990-9772. Detail

    JANČÍK, Z. Modelování dynamiky prosodie pro rozpoznání řečníka. Proceedings of the 14th Conference STUDENT EEICT 2008. Volume 2. Brno: Fakulta elektrotechniky a komunikačních technologií VUT v Brně, 2008. p. 67-69. ISBN: 978-80-214-3615-2. Detail

    KARAFIÁT, M.; BURGET, L.; HAIN, T.; ČERNOCKÝ, J. Discriminative training of narrow band - wide band adapted systems for meeting recognition. Proc. Interspeech 2008. Proceedings of Interspeech. Brisbane: International Speech Communication Association, 2008. p. 1-4. ISSN: 1990-9772. Detail

    KOCKMANN, M.; BURGET, L. Contour modeling of prosodic and acoustic features for speaker recognition. Proc. 2008 IEEE Workshop on Spoken Language Technology. Goa: IEEE Signal Processing Society, 2008. p. 1-4. ISBN: 978-1-4244-3472-5. Detail

    KOCKMANN, M.; BURGET, L. Syllable based Feature-Contours for Speaker Recognition. Proc. 14th International Workshop on Advances in Speech Technology. Maribor: 2008. p. 1-4. Detail

    KOMBRINK, S. OOV detection in LVCSR using neural networks. Proc. STUDENT EEICT 2008. Brno: Faculty of Electrical Engineering and Communication BUT, 2008. p. 1-3. ISBN: 978-80-214-3617-6. Detail

    KOPECKÝ, J.; GLEMBEK, O.; KARAFIÁT, M. Advances in Acoustic Modeling for the Recognition of Czech. Proc. 11th International Conference on Text, Speech and Dialogue. Lecture Notes in Computer Science. Berlin: Springer Verlag, 2008. p. 357-363. ISBN: 978-3-540-87390-7. Detail

    MATĚJKA, P.; BURGET, L.; GLEMBEK, O.; SCHWARZ, P.; HUBEIKA, V.; FAPŠO, M.; MIKOLOV, T.; PLCHOT, O.; ČERNOCKÝ, J. BUT language recognition system for NIST 2007 evaluations. Proc. Interspeech 2008. Proceedings of Interspeech. Brisbane, Australia: International Speech Communication Association, 2008. p. 1-4. ISSN: 1990-9772. Detail

    MIKOLOV, T. LANGUAGE MODELS FOR AUTOMATIC SPEECH RECOGNITION OF CZECH LECTURES. Proc. STUDENT EEICT 2008. Brno: Faculty of Electrical Engineering and Communication BUT, 2008. p. 1-5. ISBN: 978-80-214-3617-6. Detail

    OPARIN, I.; GLEMBEK, O.; BURGET, L.; ČERNOCKÝ, J. Morphological random forests for language modeling of inflectional languages. Proc. 2008 IEEE Workshop on Spoken Language Technology. Goa: IEEE Signal Processing Society, 2008. p. 1-4. ISBN: 978-1-4244-3472-5. Detail

    PINTO, J.; SZŐKE, I.; PRASANNA, S.; HEŘMANSKÝ, H. Fast Approximate Spoken Term Detection from Sequence of Phonemes. The 31st Annual International ACM SIGIR Conference 20-24 July 2008, Singapore. Singapore: Association for Computing Machinery, 2008. p. 28-33. ISBN: 978-90-365-2697-5. Detail

    PLCHOT, O.; HUBEIKA, V.; BURGET, L.; SCHWARZ, P.; MATĚJKA, P. Acquisition of Telephone Data from Radio Broadcasts with Applications to Language Recognition. Proc. 11th International Conference on Text, Speech and Dialogue. Berlin: Springer Verlag, 2008. p. 477-483. ISBN: 978-3-540-87390-7. Detail

    SZŐKE, I.; BURGET, L.; ČERNOCKÝ, J.; FAPŠO, M. Sub-word modeling of out of vocabulary words in spoken term detection. Proc. 2008 IEEE Workshop on Spoken Language Technology. Goa: IEEE Signal Processing Society, 2008. p. 1-4. ISBN: 978-1-4244-3472-5. Detail

    SZŐKE, I.; FAPŠO, M.; BURGET, L.; ČERNOCKÝ, J. Hybrid word-subword decoding for spoken term detection. Proc. SSCS 2008: Speech search workshop at SIGIR. Singapore: Association for Computing Machinery, 2008. p. 1-4. ISBN: 978-90-365-2697-5. Detail

    SZŐKE, I.; FAPŠO, M.; ČERNOCKÝ, J. Hlasový přístup ke korpusům - studie. Brno: Ministerstvo průmyslu a obchodu ČR, 2008. p. 0-0. Detail

    WHITE, C.; ZWEIG, G.; BURGET, L.; SCHWARZ, P.; HEŘMANSKÝ, H. Confidence estimation, OOV detection and language ID using phone-to-word transduction and phone-level alignments. Proc. 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing. Las Vegas: IEEE Signal Processing Society, 2008. p. 1-4. ISBN: 1-4244-1484-9. Detail

  • 2007

    CHALUPNÍČEK, K.; ČERNOCKÝ, J.; KOSTKA, M.; PAVELEK, T.; VŠIANSKÝ, J. Automatické hodnocení výslovnosti. Brno: Ministerstvo průmyslu a obchodu ČR, 2007. p. 0-0. Detail

    GRÉZL, F.; ČERNOCKÝ, J. TRAP-based Techniques for Recognition of Noisy Speech. Proc. 10th International Conference on Text Speech and Dialogue (TSD 2007). LNCS. Berlin: Springer Verlag, 2007. p. 270-277. ISBN: 978-3-540-74627-0. Detail

    GRÉZL, F.; HRDLIČKA, P.; VESELÝ, K.; CHALUPNÍČEK, K.; ČERNOCKÝ, J.; KOSTKA, M.; PAVELEK, T.; VŠIANSKÝ, J. Vyhledávání slovníkových hesel hlasem. Brno: Ministerstvo průmyslu a obchodu ČR, 2007. p. 0-0. Detail

    MATĚJKA, P.; BURGET, L.; SCHWARZ, P.; GLEMBEK, O.; KARAFIÁT, M.; GRÉZL, F.; ČERNOCKÝ, J.; VAN LEEUWEN, D.; BRÜMMER, N.; STRASHEIM, A. STBU system for the NIST 2006 speaker recognition evaluation. Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007). Honolulu: IEEE Signal Processing Society, 2007. p. 221-224. ISBN: 1-4244-0728-1. Detail

  • 2004

    GRÉZL, F. Combinations of TRAP-based systems. Proc. Seventh International conference on Text, Speech and Dialogue. Brno: Faculty of Informatics MU, 2004. p. 323-330. ISBN: 3-540-23049-1. Detail

    MATĚJKA, P.; SZŐKE, I.; SCHWARZ, P.; ČERNOCKÝ, J. Automatic Language Identification using Phoneme and Automatically Derived Unit Strings. Proceedings of 7th International Conference Text, Speech and Dialogue 2004. Brno: Springer Verlag, 2004. p. 147-154. ISBN: 3-540-23049-1. Detail

    SZŐKE, I. Speech units automatically generated by ergodic hidden Markov model. Proceedings of 10th Conference and Competition STUDENT EEICT 2004. Brno: Faculty of Electrical Engineering and Communication BUT, 2004. p. 1-5. Detail

  • 2003

    ČERNOCKÝ, J. Temporal processing for feature extraction in speech recognition. In Vědecké spisy VUT. Edice Habilitační a inaugurační spisy, sv. 112. Brno: Publishing house of Brno University of Technology VUTIUM, 2003. p. 1-30. ISBN: 80-214-2395-1. Detail

    GRÉZL, F. Effect of normalization on TRAP based systems in ASR. Proc. 13th International scientific conference Radioelektronika 2003. Brno: Department of Radioelectronics FEEC BUT, 2003. p. 128-131. ISBN: 80-214-2383-8. Detail

    GRÉZL, F. Local time-frequency operators in TRAPs for speech recognition. 6th International Conference, TSD 2003 České Budějovice, Czech Republic, September 2003 Proceedings. Lecture Notes in Computer Science. České Budějovice: University of West Bohemia in Pilsen, 2003. p. 269-274. ISBN: 3-540-20024-X. ISSN: 0302-9743. Detail

    GRÉZL, F.; HEŘMANSKÝ, H. Local averaging and differentiating of spectral plane for TRAP-based ASR. Proc. EUROSPEECH 2003. European Conference EUROSPEECH. Geneva: Institute for Perceptual Artificial Intelligence, 2003. p. 0-0. ISSN: 1018-4074. Detail

    JENDERKA, P.; VÍCHA, T. Voice Activity Detection in Multimodal Meeting Manager. Proceedings of 9th Conference and Competition STUDENT EEICT 2003 Volume 3. Brno: Faculty of Electrical Engineering and Communication BUT, 2003. p. 588-592. ISBN: 80-214-2379-X. Detail

    KARAFIÁT, M.; GRÉZL, F. Using MATLAB for Analysis of TRAP system. Radioengineering, 2003, vol. 2003, no. 4, p. 38-41. ISSN: 1210-2512. Detail

    MATĚJKA, P.; SCHWARZ, P.; GRÉZL, F.; ČERNOCKÝ, J. Phoneme Classification using Temporal Patterns. Proc. 13th International scientific conference Radioelektronika 2003. Brno: Faculty of Electrical Engineering and Communication BUT, 2003. p. 1-4. ISBN: 80-214-2383-8. Detail

    MATĚJKA, P.; SCHWARZ, P.; HEŘMANSKÝ, H.; ČERNOCKÝ, J. Phoneme Recognition using Temporal Patterns. Proc. 6th International Conference Text, Speech and Dialogue, TSD2003. České Budějovice: Springer Verlag, 2003. p. 465-472. ISBN: 3-540-20024-X. Detail

    MOTLÍČEK, P. Derivation of TRAPs in Auditory Domain. Proceedings of 9th Conference and Competition STUDENT EEICT 2003. Brno: Dean Office of FEEC BUT, 2003. p. 598-602. ISBN: 80-214-2379-X. Detail

    MOTLÍČEK, P. Modeling of Spectra and Temporal Trajectories in Speech Processing. Sborník příspěvků a prezentací akce Odborné semináře 2003. REL02V. Brno: Department of Radioelectronics FEEC BUT, 2003. p. 0-0. Detail

    MOTLÍČEK, P.; ČERNOCKÝ, J. All-Pole Modeling for Definition of Speech Features in Aurora3 DSR Task. 6th International Conference, TSD 2003 České Budějovice, Czech Republic, September 2003 Proceedings. Lecture Notes in Computer Science. České Budějovice: University of West Bohemia in Pilsen, 2003. p. 295-300. ISBN: 3-540-20024-X. ISSN: 0302-9743. Detail

    MOTLÍČEK, P.; ČERNOCKÝ, J. Autoregressive Modeling based Feature Extraction for Aurora3 DSR Task. Proc. EUROSPEECH 2003. European Conference EUROSPEECH. Geneva: Institute for Perceptual Artificial Intelligence, 2003. p. 1801-1804. ISSN: 1018-4074. Detail

    MOTLÍČEK, P.; ČERNOCKÝ, J. Time-domain based Temporal Processing with Application of. Proc. EUROSPEECH 2003. European Conference EUROSPEECH. Geneva: Institute for Perceptual Artificial Intelligence, 2003. p. 821-824. ISSN: 1018-4074. Detail

    SCHWARZ, P. Would You Like To Make Your Programs Understand Human Voice?. Proceedings of 9th Conference STUDENT EEICT 2003. Brno: Faculty of Electrical Engineering and Communication BUT, 2003. p. 231-235. ISBN: 80-214-2379-X. Detail

    SCHWARZ, P.; MATĚJKA, P.; ČERNOCKÝ, J. Recognition of Phoneme Strings using TRAP Technique. Proceedings of 8th International Conference Eurospeech. European Conference EUROSPEECH. Geneva: International Speech Communication Association, 2003. p. 1-4. ISSN: 1018-4074. Detail

  • 2002

    BAUDOIN, G.; CAPMAN, F.; ČERNOCKÝ, J.; EL CHAMI, F.; CHARBIT, M.; CHOLLET, G.; PETROVSKA-DELACRETAZ, D. Advances in very low bit-rate speech coding using recognition and synthesis techniques. Lecture Notes in Computer Science, 2002, vol. 2002, no. 2448, p. 269-276. ISSN: 0302-9743. Detail

    BURGET, L.; DUPONT, S.; GARUDADRI, H.; GRÉZL, F.; HEŘMANSKÝ, H.; JAIN, P.; KAJAREKAR, S.; MORGAN, N. QUALCOMM-ICSI-OGI Features for ASR. Proc. 7th International Conference on Spoken Language Processing. Denver: International Speech Communication Association, 2002. p. 4-7. ISBN: 1-876346-42-6. Detail

    BURGET, L.; MOTLÍČEK, P.; GRÉZL, F.; JAIN, P. Distributed speech recognition. Radioengineering, 2002, vol. 2002, no. 4, p. 12-16. ISSN: 1210-2512. Detail

    GARUDADRI, H.; HEŘMANSKÝ, H.; MORGAN, N.; BENITEZ, C.; BURGET, L.; KAJAREKAR, S.; GRÉZL, F.; JAIN, P.; MOTLÍČEK, P. Distributed Voice Recognition System Utilizing Multistream Network Feature Processing. San Diego: Qualcomm, 2002. p. 0-0. Detail

    GRÉZL, F. Classifiers in speech recognition systems based on TRAPS. Proceedings of 8th Conference STUDENT EEICT 2002. Brno: Faculty of Electrical Engineering and Communication BUT, 2002. p. 74-77. ISBN: 80-214-2116-9. Detail

    GRÉZL, F.; BURGET, L.; JAIN, P.; ČERNOCKÝ, J. Improving TRAPS features using LDA. Proc. 12th International scientific conference Radioelektronika 2002. Bratislava: Slovak University of Technology in Bratislava, 2002. p. 0-0. ISBN: 80-227-1700-2. Detail

    KARAFIÁT, M.; ČERNOCKÝ, J. Context dependent Hidden Markov models in recognition of Czech. Proc. 12th International scientific conference Radioelektronika 2002. Bratislava: Slovak University of Technology in Bratislava, 2002. p. 0-0. ISBN: 80-227-1700-2. Detail

    KARAFIÁT, M.; ČERNOCKÝ, J. Differences between context dependent and context independent Hidden Markov Models for recognition of Czech. Proc. of 8th student conference STUDENT EEICT 2002. Brno: Faculty of Electrical Engineering TUB, 2002. p. 328-332. ISBN: 80-214-2116-9. Detail

    MATĚJKA, P.; ČERNOCKÝ, J. Feature gaussianization in speech recognition. Proc. 12th International scientific conference Radioelektronika 2002. Bratislava: Slovak University of Technology in Bratislava, 2002. p. 0-0. ISBN: 80-227-1700-2. Detail

    MATĚJKA, P.; SCHWARZ, P.; KARAFIÁT, M.; ČERNOCKÝ, J. Some like it Gaussian... Proc. 5th International Conference Text, Speech and Dialogue, TSD2002. Lecture notes in artificial intelligence 2448. Berlin: Springer Verlag, 2002. p. 321-324. ISBN: 3-540-44129-8. Detail

    MOTLÍČEK, P. Application of Mel-scale Filter bank for Noise Estimation in Speech Processing. 12th International Czech-Slovak Scientific conference Radioelektronika 2002. Bratislava: Slovak University of Technology in Bratislava, 2002. p. 1-4. ISBN: 80-227-1700-2. Detail

    MOTLÍČEK, P.; BURGET, L. Efficient Noise Estimation and its Application for Robust Speech Recognition. 5th International Conference, TSD 2002 Brno, Czech Republic, September 2002 Proceedings. Berlin: Springer Verlag, 2002. p. 229-236. ISBN: 3-540-44129-8. Detail

    MOTLÍČEK, P.; BURGET, L. Noise estimation for efficient speech enhancement and robust speech recognition. Proc. 7th International Conference on Spoken Language Processing. Denver: International Speech Communication Association, 2002. p. 1033-1036. ISBN: 1-876346-42-6. Detail

    SCHWARZ, P.; ČERNOCKÝ, J. Keyword detection in Czech fluent speech. Proc. 12th International scientific conference Radioelektronika 2002. Bratislava: Slovak University of Technology in Bratislava, 2002. p. 1-4. ISBN: 80-227-1700-2. Detail

  • 2001

    ČERNOCKÝ, J.; BAUDOIN, G.; PETROVSKA-DELACRETAZ, D.; CHOLLET, G. Vers une analyse acoustico-phonetique de la parole independante de la langue, basee sur ALISP. Revue Parole, 2001, vol. 2001, no. 17, p. 191-226. ISSN: 1373-1955. Detail

    HEUVEL, H.; BOUDY, J.; BAKCSI, Z.; ČERNOCKÝ, J.; GALUNOV, V.; KOCHANINA, J.; MAJEWSKI, W.; POLLÁK, P.; RUSKO, M.; SADOWSKI, J.; STARONIEWICZ, P.; TROPF, H. SpeechDat-East: Five multilingual speech databases for voice-operated teleservices completed. Proc. EUROSPEECH 2001. Aalborg: International Speech Communication Association, 2001. p. 2059-2062. ISBN: 87-90834-09-7. Detail

    MOTLÍČEK, P. Application of Re-segmentation in Very Low Bit Rate Speech Coding. Proceedings of 7th Conference STUDENT EEICT 2001. Brno: Faculty of Electrical Engineering and Communication BUT, 2001. p. 269-274. ISBN: 80-214-1860-5. Detail

    MOTLÍČEK, P.; BAUDOIN, G.; ČERNOCKÝ, J.; CHOLLET, G. Minimization of transition noise and HNM synthesis in very low bit rate speech coding. 4th International Conference, TSD 2001 Železná Ruda, Czech Republic, September 2001 Proceedings. Berlin: Springer Verlag, 2001. p. 305-312. ISBN: 3-540-42557-8. Detail
