Publication Details
Assessing the Human Ability to Recognize Synthetic Speech in Ordinary Conversation
deepfake, synthetic speech, artificial intelligence, cybersecurity, deepfake detection
This work assesses the human ability to recognize synthetic speech (deepfakes).
This paper describes an experiment in which we communicated with respondents
using voice messages. We presented the respondents with a cover story about
testing the user-friendliness of voice messages while secretly sending them
a pre-prepared deepfake recording during the conversation. We examined their
reactions, their knowledge of deepfakes, and how many could correctly identify
which message was the deepfake. The results show that none of the respondents reacted in
any way to the fraudulent deepfake message, and only one retrospectively admitted
to noticing something unusual. On the other hand, after we revealed the nature
of the experiment, 83.9% of respondents correctly identified the voice message
that contained the deepfake. Thus, although the deepfake recording was clearly
identifiable among the others, no one reacted to it during the conversation.
In summary, we show that the human ability to recognize voice deepfakes is not at
a level we can trust. It is very difficult for people to distinguish between real
and fake voices, especially when they do not expect to hear them.
@inproceedings{BUT185186,
author="Daniel {Prudký} and Anton {Firc} and Kamil {Malinka}",
title="Assessing the Human Ability to Recognize Synthetic Speech in Ordinary Conversation",
booktitle="2023 International Conference of the Biometrics Special Interest Group (BIOSIG)",
year="2023",
series="Proceedings of the 22nd International Conference of the Biometrics Special Interest Group",
pages="1--5",
publisher="Gesellschaft für Informatik (GI)",
address="Darmstadt",
doi="10.1109/BIOSIG58226.2023.10346006",
isbn="978-3-88579-733-3",
url="https://ieeexplore.ieee.org/document/10346006"
}