Project Details
Contemporary Methods of Processing, Analysis, and Visualization of Multimedia and 3D Data
Project Period: 1. 3. 2023 – 28. 2. 2026
Project Type: grant
Code: FIT-S-23-8278
Agency: Brno University of Technology
Program: BUT internal projects (Vnitřní projekty VUT)
Multimedia and 3D data are important and necessary for a growing number of applications of modern computer systems, in which their use is irreplaceable. At the same time, processing such data is known to be difficult and computationally demanding, and the same holds for their visualization and analysis. Research in this area is therefore both difficult and important. The project follows on from the earlier project "Modern Methods of Processing, Analysis, and Visualization of Multimedia and 3D Data".
Bambušek Daniel, Ing. (DCGM)
Bartl Vojtěch, Ing., Ph.D. (DCGM)
Bažout David, Ing. (DCGM)
Beneš Karel, Ing. (DCGM)
Beran Vítězslav, doc. Ing., Ph.D. (DCGM)
Bobák Petr, Ing., Ph.D. (DCGM)
Brukner Jan, Ing. (DCGM)
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
Čadík Martin, doc. Ing., Ph.D. (DCGM)
Černocký Jan, prof. Dr. Ing. (DCGM)
Dobeš Petr, Ing. (DCGM)
Dočekal Martin, Ing. (DCGM)
Fajčík Martin, Ing., Ph.D. (DCGM)
Hanák Jiří, Ing. (DCGM)
Herout Adam, prof. Ing., Ph.D. (DCGM)
Hříbek David, Ing.
Chlubna Tomáš, Ing., Ph.D. (DCGM)
Chudý Peter, doc. Ing., Ph.D., MBA (FIT)
Kapinus Michal, Ing., Ph.D. (DCGM)
Karas Matej, Ing. (DCGM)
Kišš Martin, Ing. (DCGM)
Klepárník Petr, Ing., Ph.D. (DCGM)
Kocour Martin, Ing. (DCGM)
Kohút Jan, Ing. (DCGM)
Landini Federico Nicolás, Ph.D. (RG SPEECH)
Maršík Lukáš, Ing. (DCGM)
Mošner Ladislav, Ing. (DCGM)
Munzar Milan, Ing. (RG CPHOTO)
Nguyen Son Hai, Ing.
Nosko Svetozár, Ing., Ph.D. (DCGM)
Novák Jiří, Ing., Ph.D. (DCGM)
Omachtová Alena, Ing. (DCGM)
Ondřej Karel, Ing. (FIT)
Pavlus Ján, Ing. (DCGM)
Peng Junyi (DCGM)
Polášek Tomáš, Ing. (DCGM)
Reich Bořek, Ing. (DCGM)
Smrž Pavel, doc. RNDr., Ph.D. (DCGM)
Španěl Michal, doc. Ing., Ph.D. (DCGM)
Špaňhel Jakub, Ing., Ph.D. (DCGM)
Švec Ján, Ing. (DCGM)
Švec Tomáš, Ing. (DCGM)
Vlnas Michal, Ing. (DCGM)
2025
- HANÁK, J.; NOVÁK, J.; CHUDÝ, P.; BEN-ASHER, J. Cross-Entropy Method for Laser Defense Applications. Journal of Aerospace Information Systems, 2025, vol. 22, no. 1, p. 53-58. ISSN: 2327-3097.
2024
- BENEŠ, K.; KOCOUR, M.; BURGET, L. Hystoc: Obtaining Word Confidences for Fusion of End-To-End ASR Systems. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Seoul: IEEE Signal Processing Society, 2024. p. 11276-11280. ISBN: 979-8-3503-4485-1.
- BHATTACHARJEE, M.; NIGMATULINA, I.; PRASAD, A.; RANGAPPA, P.; MADIKERI, S.; MOTLÍČEK, P.; HELMKE, H.; KLEINERT, M. Contextual Biasing Methods for Improving Rare Word Detection in Automatic Speech Recognition. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Seoul: IEEE Signal Processing Society, 2024. p. 12652-12656. ISBN: 979-8-3503-4485-1.
- BOBÁK, P.; ČMOLÍK, L.; ČADÍK, M. Reinforced Labels: Multi-Agent Deep Reinforcement Learning for Point-Feature Label Placement. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2024, vol. 30, no. 9, p. 5908-5922. ISSN: 1077-2626.
- CHLUBNA, T.; MILET, T.; ZEMČÍK, P. How Capturing Camera Trajectory Distortion Affects User Experience on Looking Glass 3D Display. MULTIMEDIA TOOLS AND APPLICATIONS, 2024, vol. 2024, no. 83, p. 20265-20287. ISSN: 1573-7721.
- CHLUBNA, T.; MILET, T.; ZEMČÍK, P. Lightweight All-Focused Light Field Rendering. COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, vol. 244, no. 7, p. 7-8. ISSN: 1077-3142.
- CHLUBNA, T.; ZEMČÍK, P.; MILET, T. Efficient Random-Access GPU Video Decoding for Light-Field Rendering. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, vol. 2024, no. 102, p. 1-14. ISSN: 1047-3203.
- ESPUNA, A.; PRASAD, A.; MOTLÍČEK, P.; MADIKERI, S.; SCHUEPBACH, C. Normalising Flows for Speaker and Language Recognition Backend. Proceedings of Odyssey 2024: The Speaker and Language Recognition Workshop. Quebec: International Speech Communication Association, 2024. p. 74-80.
- HANÁK, J.; NOVÁK, J.; CHUDÝ, P. Tactical Scenario Adaptation for Pilot Training. In AIAA/IEEE Digital Avionics Systems Conference - Proceedings. San Diego: Institute of Electrical and Electronics Engineers, 2024. p. 1-7. ISBN: 979-8-3503-4961-0. ISSN: 2155-7195.
- KIŠŠ, M.; HRADIŠ, M. Self-supervised Pre-training of Text Recognizers. In Barney Smith, E.H., Liwicki, M., Peng, L. (eds) Document Analysis and Recognition - ICDAR 2024. Lecture Notes in Computer Science. Athens: Springer Nature Switzerland AG, 2024. p. 218-235. ISBN: 978-3-031-70545-8.
- KUBÍK, T.; ŠPANĚL, M. LMVSegRNN and Poseidon3D: Addressing Challenging Teeth Segmentation Cases in 3D Dental Surface Orthodontic Scans. Bioengineering, 2024, vol. 11, no. 10, p. 1-18. ISSN: 2306-5354.
- KUMAR, S.; MADIKERI, S.; NIGMATULINA, I.; VILLATORO-TELLO, E.; MOTLÍČEK, P.; PANDIA, K.; DUBAGUNTA, P.; GANAPATHIRAJU, A. Multitask Speech Recognition and Speaker Change Detection for Unknown Number of Speakers. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul: IEEE Signal Processing Society, 2024. p. 12592-12596. ISBN: 979-8-3503-4485-1.
- MACIEJEWSKI, M.; KLEMENT, D.; HUANG, R.; WIESNER, M.; KHUDANPUR, S. Evaluating the Santa Barbara Corpus: Challenges of the Breadth of Conversational Spoken Language. In Proceedings of Interspeech 2024. Proceedings of Interspeech. Kos: International Speech Communication Association, 2024. p. 2155-2160. ISSN: 1990-9772.
- NOVÁK, J.; CHUDÝ, P. Dynamic Soaring in Uncertain Wind Conditions: Polynomial Chaos Expansion Approach. In Machine Learning, Optimization, and Data Science. Lecture Notes in Computer Science. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Grasmere: Springer Nature Switzerland AG, 2024. p. 104-115. ISBN: 978-3-031-53968-8. ISSN: 0302-9743.
- NOVÁK, J.; CHUDÝ, P.; HANÁK, J. Model Predictive Control Driven Aerial Grasping with Soft Operational Constraints. In ICAS Proceedings. ICAS Proceedings. Florence: International Council of the Aeronautical Sciences, 2024. p. 1-15. ISSN: 2958-4647.
- NOVÁK, J.; HANÁK, J.; CHUDÝ, P. Hybrid Modeling Approach for Optimization Based Control of Multirotor Unmanned Aerial Vehicles. In ICAS Proceedings. ICAS Proceedings. Florence: International Council of the Aeronautical Sciences, 2024. p. 1-10. ISSN: 2958-4647.
- NOVÁK, J.; HANÁK, J.; CHUDÝ, P. Predictive Control Driven Tactical Maneuvering. In ICAS Proceedings. ICAS Proceedings. Florence: International Council of the Aeronautical Sciences, 2024. p. 1-12. ISSN: 2958-4647.
- NOVÁK, J.; HANÁK, J.; CHUDÝ, P. Reliability-Based Control System Optimization in Uncertain Conditions. In AIAA Aviation Forum and ASCEND, 2024. Las Vegas: American Institute of Aeronautics and Astronautics, 2024. p. 1-15. ISBN: 978-1-62410-716-0.
- PEŠÁN, J.; JUŘÍK, V.; KARAFIÁT, M.; ČERNOCKÝ, J. BESST Dataset: A Multimodal Resource for Speech-based Stress Detection and Analysis. In Proceedings of Interspeech 2024. Proceedings of Interspeech. Kos: International Speech Communication Association, 2024. p. 1355-1359. ISSN: 1990-9772.
- PRASAD, A.; CAROFILIS, A.; VANDERREYDT, G.; KHALIL, D.; MADIKERI, S.; MOTLÍČEK, P.; SCHUEPBACH, C. Fine-Tuning Self-Supervised Models for Language Identification Using Orthonormal Constraint. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Seoul: IEEE Signal Processing Society, 2024. p. 11921-11925. ISBN: 979-8-3503-4485-1.
- PRASAD, A.; MADIKERI, S.; KHALIL, D.; MOTLÍČEK, P.; SCHUEPBACH, C. Speech and Language Recognition with Low-rank Adaptation of Pretrained Models. In Proceedings of Interspeech. Proceedings of Interspeech. Kos Island: International Speech Communication Association, 2024. p. 2825-2829. ISSN: 1990-9772.
- RANGAPPA, P.; MUSCAT, A.; SANCHEZ-LARA, A.; MOTLÍČEK, P.; ANTONOPOULOU, M.; FOURFOURIS, I.; SKARLATOS, A.; AVGERINOS, N.; TSANGARIS, M.; KOSTKA, K. Detecting Criminal Networks via Non-Content Communication Data Analysis Techniques from the TRACY Project. Proceedings of the 15th EAI International Conference on Digital Forensics & Cyber Crime (EAI-ICDF2C24). Dubrovnik: 2024. p. 1-15.
- YUSUF, B.; BASKAR, M.; ROSENBERG, A.; RAMABHADRAN, B. Speculative Speech Recognition by Audio-Prefixed Low-Rank Adaptation of Language Models. In Proceedings of Interspeech 2024. Proceedings of Interspeech. Kos: International Speech Communication Association, 2024. p. 792-796. ISSN: 1990-9772.
2023
- APAROVICH, M.; KESIRAJU, S.; DUFKOVÁ, A.; SMRŽ, P. FIT BUT at SemEval-2023 Task 12: Sentiment Without Borders - Multilingual Domain Adaptation for Low-Resource Sentiment Classification. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023). Toronto (online): Association for Computational Linguistics, 2023. p. 1518-1524. ISBN: 978-1-959429-99-9.
- BAMBUŠEK, D.; MATERNA, Z.; KAPINUS, M.; BERAN, V.; SMRŽ, P. How Do I Get There? Overcoming Reachability Limitations of Constrained Industrial Environments in Augmented Reality Applications. In 2023 IEEE Conference on Virtual Reality and 3D User Interfaces (VR). Shanghai: Institute of Electrical and Electronics Engineers, 2023. p. 115-122. ISBN: 979-8-3503-4815-6.
- BAŘINA, D. Experimental lossless data compressor. Microprocessors and Microsystems, 2023, vol. 98, no. 4, p. 104803-104803. ISSN: 0141-9331.
- BHATTACHARJEE, M.; MOTLÍČEK, P.; NIGMATULINA, I.; HELMKE, H.; OHNEISER, O.; KLEINERT, M.; EHR, H. Customization of Automatic Speech Recognition Engines for Rare Word Detection Without Costly Model Re-Training. 13th SESAR Innovation Days 2023, SIDS 2023. Seville: SESAR Joint Undertaking, 2023. p. 1-8. ISSN: 0770-1268.
- BURDISSO, S.; VILLATORO-TELLO, E.; MADIKERI, S.; MOTLÍČEK, P. Node-weighted Graph Convolutional Network for Depression Detection in Transcribed Clinical Interviews. In Proceedings of the Annual Conference of International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Dublin: International Speech Communication Association, 2023. p. 3617-3621. ISSN: 1990-9772.
- CHLUBNA, T.; MILET, T.; ZEMČÍK, P.; KULA, M. Real-Time Light Field Video Focusing and GPU Accelerated Streaming. Journal of Signal Processing Systems for Signal Image and Video Technology, 2023, vol. 95, no. 6, p. 703-719. ISSN: 1939-8115.
- GAVRIELIDES, A.; SOPHOCLEOUS, M.; AGAPIOU, G.; LESSI, C.; ŠPAŇHEL, J.; LENDINEZ, A.; QIU, R.; LI, D. Implementing Network Applications for 5G-Enabled Robots Through the 5G-ERA Platform. In IFIP Advances in Information and Communication Technology. IFIP Advances in Information and Communication Technology. Artificial Intelligence Applications and Innovations. Cham: Springer Nature Switzerland AG, 2023. p. 55-65. ISBN: 978-3-031-34170-0. ISSN: 1868-422X.
- HANÁK, J.; CHUDÝ, P.; VLK, J. Collaborative Agents for Synthetic Tactical Training. In AIAA/IEEE Digital Avionics Systems Conference - Proceedings. Barcelona: Institute of Electrical and Electronics Engineers, 2023. p. 1-9. ISBN: 979-8-3503-3357-2. ISSN: 2155-7195.
- KHALIL, D.; PRASAD, A.; MOTLÍČEK, P.; ZULUAGA-GOMEZ, J.; NIGMATULINA, I.; MADIKERI, S.; SCHUEPBACH, C. An Automatic Speaker Clustering Pipeline for the Air Traffic Communication Domain. Aerospace, 2023, vol. 10, no. 10, p. 1-14. ISSN: 2226-4310.
- KIŠŠ, M.; HRADIŠ, M.; BENEŠ, K.; BUCHAL, P.; KULA, M. SoftCTC-semi-supervised learning for text recognition using soft pseudo-labels. International Journal on Document Analysis and Recognition, 2023, vol. 2024, no. 27, p. 177-193. ISSN: 1433-2825.
- MAI, F.; ZULUAGA-GOMEZ, J.; PARCOLLET, T.; MOTLÍČEK, P. HyperConformer: Multi-head HyperMixer for Efficient Speech Recognition. In Proceedings of the Annual Conference of International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Dublin: International Speech Communication Association, 2023. p. 2213-2217. ISSN: 1990-9772.
- MOTLÍČEK, P.; PRASAD, A.; NIGMATULINA, I.; HELMKE, H.; OHNEISER, O.; KLEINERT, M. Automatic Speech Analysis Framework for ATC Communication in HAAWAII. In SESAR Innovation Days. Seville: SESAR Joint Undertaking, 2023. p. 1-9. ISSN: 0770-1268.
- NIGMATULINA, I.; MADIKERI, S.; VILLATORO-TELLO, E.; MOTLÍČEK, P.; ZULUAGA-GOMEZ, J.; PANDIA, K.; GANAPATHIRAJU, A. Implementing contextual biasing in GPU decoder for online ASR. In Proceedings of the Annual Conference of International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Dublin: International Speech Communication Association, 2023. p. 4494-4498. ISSN: 1990-9772.
- NOVÁK, J.; CHUDÝ, P. Surrogate Modeling of Optimal Control Based Collision Avoidance System for Multirotor Unmanned Aerial Vehicles. In AIAA/IEEE Digital Avionics Systems Conference - Proceedings. Barcelona: Institute of Electrical and Electronics Engineers, 2023. p. 1-7. ISBN: 979-8-3503-3357-2. ISSN: 2155-7195.
- OMACHTOVÁ, A.; HEROUT, A.; BAMBUŠEK, D.; JUŘÍK, V. How to shoot yourself right with a smartphone? VIRTUAL REALITY, 2023, vol. 2023, no. 1, p. 1-13. ISSN: 1434-9957.
- POLÁŠEK, T.; ČADÍK, M. Predicting Photovoltaic Power Production using High-Uncertainty Weather Forecasts. APPLIED ENERGY, 2023, vol. 2023, no. 339, p. 120989-121004. ISSN: 0306-2619.
- POLÁŠEK, T.; ČADÍK, M.; KELLER, Y.; BENEŠ, B. Vision UFormer: Long-Range Monocular Absolute Depth Estimation. COMPUTERS & GRAPHICS-UK, 2023, vol. 111, no. 4, p. 180-189. ISSN: 0097-8493.
- SKOWRON, M.; BACKFRIED, G.; NAVAS, E.; BERZINŠ, A.; VAN, J.; DE, F.; DEMARCO, A.; POLÁK, P.; KOVÁČ, M.; POLÁK, P.; ROHDIN, J.; ROSNER, M.; SANCHEZ, J.; SARATXAGA, I.; SCHWARZ, P. Deep Dive Speech Technology. In European Language Equality. Cham: Springer Nature Switzerland AG, 2023. p. 289-312. ISBN: 978-3-031-28819-7.
- VANDERREYDT, G.; PRASAD, A.; KHALIL, D.; MADIKERI, S.; DEMUYNCK, K.; MOTLÍČEK, P. Parameter-Efficient Tuning With Adaptive Bottlenecks For Automatic Speech Recognition. Proceedings of IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). Taipei: IEEE Signal Processing Society, 2023. p. 1-7. ISBN: 979-8-3503-0689-7.
- VILLATORO-TELLO, E.; MADIKERI, S.; ZULUAGA-GOMEZ, J.; SHARMA, B.; SARFJOO, S.; NIGMATULINA, I.; MOTLÍČEK, P.; IVANOV, V.; GANAPATHIRAJU, A. Effectiveness of Text, Acoustic, and Lattice-Based Representations in Spoken Language Understanding Tasks. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Rhodes Island: IEEE Signal Processing Society, 2023. p. 1-5. ISBN: 978-1-7281-6327-7.
- YUSUF, B.; GOURAV, A.; GANDHE, A.; BULYKO, I. On-the-Fly Text Retrieval for end-to-end ASR Adaptation. In Proceedings of ICASSP 2023. Rhodes Island: IEEE Signal Processing Society, 2023. p. 1-5. ISBN: 978-1-7281-6327-7.
- ZULUAGA-GOMEZ, J.; PRASAD, A.; NIGMATULINA, I.; MOTLÍČEK, P.; KLEINERT, M. A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers. Aerospace, 2023, vol. 10, no. 5, p. 1-25. ISSN: 2226-4310.
2022
- BOITO, M.; YUSUF, B.; ONDEL YANG, L.; VILLAVICENCIO, A.; BESACIER, L. Unsupervised Word Segmentation from Discrete Speech Units in Low-Resource Settings. In Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages. Marseille: European Language Resources Association, 2022. p. 1-9. ISBN: 979-10-95546-91-7.
- ASHIHARA, T.; MORIYA, T.; HORIGUCHI, S.; PENG, J.; OCHIAI, T.; DELCROIX, M.; MATSUURA, K.; SATO, H. Investigation of Speaker Representation for Target-Speaker Speech Processing. Proc. 2024 IEEE Spoken Language Technology Workshop (SLT). Macao: IEEE Signal Processing Society, p. 423-430. ISBN: 979-8-3503-9225-8.
- ZULUAGA-GOMEZ, J.; VESELÝ, K.; SZŐKE, I.; BLATT, A.; MOTLÍČEK, P.; KOCOUR, M.; RIGAULT, M.; CHOUKRI, K.; PRASAD, A.; SARFJOO, S.; NIGMATULINA, I.; CEVENINI, C.; KOLČÁREK, P.; TART, A.; ČERNOCKÝ, J.; KLAKOW, D. ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications. Journal of Machine Learning Research, vol. 2, no. 1, p. 1-45. ISSN: 1533-7928.