Publication Results
-
2026
POLOK, A.; KLEMENT, D.; KOCOUR, M.; HAN, J.; LANDINI, F.; YUSUF, B.; WIESNER, M.; KHUDANPUR, S.; ČERNOCKÝ, J.; BURGET, L. DiCoW: Diarization-Conditioned Whisper for Target Speaker Automatic Speech Recognition. COMPUTER SPEECH AND LANGUAGE, 2026, vol. 95, no. 1,
p. 1-19. Detail -
2025
Alexander Polok, Jiangyu Han, Dominik Klement, Samuele Cornell, Jan Černocký, Lukáš Burget. BUT System for the MLC-SLM Challenge. ISCA: ISCA, 2025.
p. 23-27. DetailKHURANA, S.; KLEMENT, D.; LAURENT, A.; BOBOS, D.; NOVOSAD, J.; GAZDIK, P.; ZHANG, E.; HUANG, Z.; HUSSEIN, A.; MARXER, R.; MASUYAMA, Y.; AIHARA, R.; HORI, C.; GERMAIN, F.; WICHERN, G.; LE ROUX, J. Factorized RVQ-GAN For Disentangled Speech Tokenization. In Proceedings of the Annual Conference of the International Speech Communication Association Interspeech. Interspeech. Rotterdam, The Netherlands: International Speech Communication Association, 2025.
p. 3514-3518. DetailPÁLKA, P.; LANDINI, F.; KLEMENT, D.; DIEZ SÁNCHEZ, M.; SILNOVA, A.; BURGET, L.; DELCROIX, M. Joint Training of Speaker Embedding Extractor, Speech and Overlap Detection for Diarization. Palermo: IEEE Signal Processing Society, 2025.
p. 31-35. ISBN: 978-9-46-459362-4. DetailPOLOK, A.; KLEMENT, D.; WIESNER, M.; KHUDANPUR, S.; ČERNOCKÝ, J.; BURGET, L. Target Speaker ASR with Whisper. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Hyderabad: IEEE Signal Processing Society, 2025.
p. 1-5. ISBN: 979-8-3503-6874-1. DetailYAN, B.; HAMED, I.; SHIMIZU, S.; LODAGALA, V.; CHEN, W.; IAKOVENKO, O.; TALAFHA, B.; HUSSEIN, A.; POLOK, A.; CHANG, K.; KLEMENT, D.; ALTHUBAITI, S.; PENG, P.; WIESNER, M.; SOLORIO, T.; ALI, A.; KHUDANPUR, S.; WATANABE, S. CS-FLEURS: A Massively Multilingual and Code-Switched Speech Dataset. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Interspeech. Rotterdam, Nizozemí: ISCA, 2025.
p. 743-747. Detail -
2024
KLEMENT, D.; DIEZ SÁNCHEZ, M.; LANDINI, F.; BURGET, L.; SILNOVA, A.; DELCROIX, M.; TAWARA, N. Discriminative Training of VBx Diarization. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Seoul: IEEE Signal Processing Society, 2024.
p. 11871-11875. ISBN: 979-8-3503-4485-1. DetailMACIEJEWSKI, M.; KLEMENT, D.; HUANG, R.; WIESNER, M.; KHUDANPUR, S. Evaluating the Santa Barbara Corpus: Challenges of the Breadth of Conversational Spoken Language. In Proceedings of Interspeech 2024. Proceedings of Interspeech. Kos: International Speech Communication Association, 2024. no. 9,
p. 2155-2160. ISSN: 1990-9772. DetailPOLOK, A.; KLEMENT, D.; HAN, J.; SEDLÁČEK, Š.; YUSUF, B.; MACIEJEWSKI, M.; WIESNER, M.; BURGET, L. BUT/JHU System Description for CHiME-8 NOTSOFAR-1 Challenge. Proceedings of CHiME 2024 Workshop. Kos Island: International Speech Communication Association, 2024.
p. 18-22. Detail