Publication Details
Assessing the Human Ability to Recognize Synthetic Speech in Ordinary Conversation
deepfake, synthetic speech, artificial intelligence, cybersecurity, deepfake detection
This work assesses the human ability to recognize synthetic speech (deepfakes). It describes an experiment in which we communicated with respondents using voice messages. We presented the respondents with a cover story about testing the user-friendliness of voice messages while secretly sending them a pre-prepared deepfake recording during the conversation. We examined their reactions, their knowledge of deepfakes, and how many could correctly identify which message was the deepfake. The results show that none of the respondents reacted in any way to the fraudulent deepfake message, and only one retrospectively admitted to noticing something unusual. On the other hand, after the nature of the experiment was revealed, 83.9% of the respondents correctly identified the voice message that contained the deepfake. Thus, although the deepfake recording was clearly identifiable among the others, no one reacted to it. In summary, we show that the human ability to recognize voice deepfakes is not at a level we can trust: it is very difficult for people to distinguish between real and fake voices, especially when they do not expect them.
@inproceedings{BUT185186,
author="Daniel {Prudký} and Anton {Firc} and Kamil {Malinka}",
title="Assessing the Human Ability to Recognize Synthetic Speech in Ordinary Conversation",
booktitle="2023 International Conference of the Biometrics Special Interest Group (BIOSIG)",
year="2023",
series="Proceedings of the 22nd International Conference of the Biometrics Special Interest Group",
pages="1--5",
publisher="Gesellschaft f{\"u}r Informatik (GI)",
address="Darmstadt",
doi="10.1109/BIOSIG58226.2023.10346006",
isbn="978-3-88579-733-3",
url="https://ieeexplore.ieee.org/document/10346006"
}