Publication Details
Fighting Randomness With Randomness: Mitigating Optimisation Instability of Fine-Tuning Using Ensemble and Noise Regularisation
PECHER, B.
Čegiň Ján, Ing. (DCGM)
Belanec Róbert, Bc. (DCGM)
SRBA, I.
Šimko Jakub, doc. Ing., PhD. (DCGM)
Bieliková Mária, prof. Ing., Ph.D. (DCGM)
NLP in resource-constrained settings, parameter-efficient training,
data-efficient training, data augmentation, fine-tuning, mitigating randomness,
ensembling
While fine-tuning of pre-trained language models generally helps to overcome the
lack of labelled training samples, it also displays model performance
instability. This instability mainly originates from randomness in initialisation
or data shuffling. To address this, researchers either modify the training
process or augment the available samples, which typically results in increased
computational costs. We propose a new mitigation strategy, called Delayed
Ensemble with Noisy Interpolation (DENI), that leverages the strengths of
ensembling, noise regularisation and model interpolation, while retaining
computational efficiency. We compare DENI with 9 representative mitigation
strategies across 3 models, 4 tuning strategies and 7 text classification
datasets. We show that: 1) DENI outperforms the best performing mitigation
strategy (Ensemble), while using only a fraction of its cost; 2) the mitigation
strategies are beneficial for parameter-efficient fine-tuning (PEFT) methods,
outperforming full fine-tuning in specific cases; and 3) combining DENI with data
augmentation often leads to even more effective instability mitigation.
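
The abstract names three ingredients (ensembling, noise regularisation and model interpolation) but does not say how DENI combines them. The toy sketch below only illustrates those generic building blocks on plain weight vectors. All function names, default values and the order of steps are assumptions made for illustration; this is not the authors' algorithm, and a real setup would train each ensemble member before merging.

```python
import random


def add_noise(weights, sigma, rng):
    """Return a copy of `weights` with zero-mean Gaussian noise added."""
    return [w + rng.gauss(0.0, sigma) for w in weights]


def interpolate(a, b, alpha):
    """Element-wise linear interpolation: alpha * a + (1 - alpha) * b."""
    return [alpha * x + (1.0 - alpha) * y for x, y in zip(a, b)]


def average(models):
    """Element-wise mean of several weight vectors (a weight-space ensemble)."""
    n = len(models)
    return [sum(ws) / n for ws in zip(*models)]


def noisy_ensemble_step(weights, n_members=5, sigma=0.1, alpha=0.5, seed=0):
    """Hypothetical helper: perturb, average, then interpolate back.

    Assumed order of steps, chosen only for illustration:
    1. create `n_members` noisy copies of the current weights,
    2. average the copies into one merged weight vector,
    3. interpolate between the original and the merged weights.
    """
    rng = random.Random(seed)
    members = [add_noise(weights, sigma, rng) for _ in range(n_members)]
    # A real pipeline would fine-tune each member for some steps here.
    merged = average(members)
    return interpolate(weights, merged, alpha)


if __name__ == "__main__":
    start = [1.0, -2.0, 0.5]
    print(noisy_ensemble_step(start, sigma=0.1, seed=42))
```

With `sigma=0.0` the step leaves the weights unchanged, and with a fixed `seed` the result is reproducible, which makes the sketch easy to check before swapping in real model parameters.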
@inproceedings{BUT193319,
author="PECHER, B. and ČEGIŇ, J. and BELANEC, R. and SRBA, I. and ŠIMKO, J. and BIELIKOVÁ, M.",
title="Fighting Randomness With Randomness: Mitigating Optimisation Instability of Fine-Tuning Using Ensemble and Noise Regularisation",
booktitle="Findings of the Association for Computational Linguistics: EMNLP 2024",
year="2024",
pages="11005--11044",
publisher="Association for Computational Linguistics",
address="Miami",
doi="10.18653/v1/2024.findings-emnlp.644",
isbn="979-8-8917-6168-1"
}