Publication Details
Clustering Unsupervised Representations as Defense against Poisoning Attacks on Speech Commands Classification System
THEBAUD, T.
JOSHI, S.
LI, H.
ŠŮSTEK, M. (DCGM)
VILLALBA LOPEZ, J.
KHUDANPUR, S.
DEHAK, N.
poisoning attack, unsupervised representations, clustering, speech commands, defense against attacks on speech systems
Poisoning attacks entail attackers intentionally tampering with training data. In
this paper, we consider a dirty-label poisoning attack scenario on a speech
commands classification system. The threat model assumes that certain utterances
from one of the classes (the source class) are poisoned by superimposing a trigger
on them, and their labels are changed to another class selected by the attacker
(the target class). We propose a filtering defense against such an attack. First, we use
DIstillation with NO labels (DINO) to learn unsupervised representations for all
the training examples. Next, we use K-means and LDA to cluster these
representations. Finally, we keep the utterances with the most repeated label in
their cluster for training and discard the rest. For a 10% poisoned source class,
we demonstrate a drop in attack success rate from 99.75% to 0.25%. We test our
defense against a variety of threat models, including different target and source
classes, as well as trigger variations.
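The final filtering step described in the abstract, keeping only the utterances whose label matches the most frequent label in their cluster, can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name and data layout are assumptions.

```python
from collections import Counter

def filter_by_cluster_majority(cluster_ids, labels):
    """Keep indices of training examples whose label equals the
    majority label of the cluster they were assigned to; discard
    the rest as suspected poisoned samples."""
    # Majority label per cluster.
    majority = {}
    for c in set(cluster_ids):
        members = [labels[i] for i, cid in enumerate(cluster_ids) if cid == c]
        majority[c] = Counter(members).most_common(1)[0][0]
    # Keep only examples agreeing with their cluster's majority label.
    return [i for i, (cid, lab) in enumerate(zip(cluster_ids, labels))
            if lab == majority[cid]]

# Toy usage: index 2 carries a minority (suspicious) label in cluster 0
# and is discarded.
kept = filter_by_cluster_majority([0, 0, 0, 1, 1],
                                  ["yes", "yes", "up", "no", "no"])
print(kept)  # [0, 1, 3, 4]
```

In the paper's pipeline, the cluster assignments would come from K-means (with LDA) over DINO representations, and the surviving indices define the cleaned training set.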
@inproceedings{BUT187976,
author="THEBAUD, T. and JOSHI, S. and LI, H. and ŠŮSTEK, M. and VILLALBA LOPEZ, J. and KHUDANPUR, S. and DEHAK, N.",
title="Clustering Unsupervised Representations as Defense against Poisoning Attacks on Speech Commands Classification System",
booktitle="Proceedings of IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)",
year="2023",
pages="1--8",
publisher="IEEE Signal Processing Society",
address="Taipei",
doi="10.1109/ASRU57964.2023.10389650",
isbn="979-8-3503-0689-7",
url="https://ieeexplore.ieee.org/document/10389650"
}