Dissertation Topic

Detection of voice deepfakes

Academic Year: 2024/2025

Supervisor: Černocký Jan, prof. Dr. Ing.

Department: Department of Computer Graphics and Multimedia

Programs:
Information Technology (DIT) - full-time study
Information Technology (DIT) - combined study
Information Technology (DIT-EN) - full-time study
Information Technology (DIT-EN) - combined study

The work will start with getting familiar with the basics of voice deepfake detection (DFD): terminology, available techniques, data, and challenges (especially ASVspoof). It will also cover the history and the state-of-the-art techniques and tools for speaker recognition (the WeSpeaker toolkit), as well as state-of-the-art techniques and tools for personalized text-to-speech (pTTS) synthesis and voice conversion. The first task will be to reproduce one or two deepfake (DF) detection systems from ASVspoof 2021 (or a newer challenge), check that the results match the reported numbers, and study how the systems work. This will be followed by attacking the ASVspoof 2021 DFD system(s) with several up-to-date DF creation techniques. The main task of the PhD work is to propose and implement ways to detect DFs (or to support DFD), for example by (1) making the DFD system aware of genuine speech of the target speaker, (2) working on artifacts that pTTS systems may handle badly, such as breaths, (3) proposing and implementing techniques that make use of psychoacoustic findings, and (4) proposing and implementing techniques that make use of text information available from the target speaker (such as social media).