Publication Details
Lessons Learned in Transcribing 5000 h of Air Traffic Control Communications for Robust Automatic Speech Understanding
ZULUAGA-GOMEZ, J.
NIGMATULINA, I.
Prasad Amrutha (DCGM)
Motlíček Petr, doc. Ing., Ph.D. (DCGM)
KHALIL, D.
Madikeri Srikanth
TART, A.
Szőke Igor, Ing., Ph.D. (DCGM)
LENDERS, V.
RIGAULT, M.
CHOUKRI, K.
air traffic control communications; automatic speech recognition and
understanding; OpenSky Network; callsign recognition; ADS-B data
Voice communication between air traffic controllers (ATCos) and pilots is
critical for ensuring safe and efficient air traffic control (ATC). The handling
of these voice communications requires high levels of awareness from ATCos and
can be tedious and error-prone. Recent efforts aim to integrate artificial
intelligence (AI) into ATC communications in order to lessen ATCos' workload.
However, the development of data-driven AI systems for understanding spoken
ATC communications demands large-scale annotated datasets, which are currently
lacking in the field. This paper explores the lessons learned from the ATCO2
project, which aimed to develop a unique platform to collect, preprocess, and
transcribe large amounts of ATC audio data from airspace in real time. It
reviews (i) robust automatic speech recognition (ASR), (ii) natural language
processing, (iii) English language identification, and (iv) contextual ASR
biasing with surveillance data. The pipeline developed during the ATCO2 project,
along with the open-sourcing of its data, encourages research in the ATC field,
while the full corpus can be purchased through ELDA. The ATCO2 corpora are suitable
for developing ASR systems when little or no transcribed ATC audio data
are available. For instance, the proposed ASR system trained with ATCO2 reaches
as low as 17.9% WER on public ATC datasets, which is 6.6% absolute WER better than
when training with "out-of-domain" but gold transcriptions. Finally, the release of 5000 h of
ASR-transcribed speech, covering more than 10 airports worldwide, is a step forward
towards more robust automatic speech understanding systems for ATC
communications.
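The WER figures quoted in the abstract can be unpacked with a short sketch. The Python below is illustrative only, not the ATCO2 scoring pipeline. The function name `word_error_rate` and the sample utterance are hypothetical. The 24.5% baseline is not stated in the abstract; it is inferred here by adding the 6.6% absolute gain to the reported 17.9% WER.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (cost 0 on a match)
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical ATC-style utterance (invented for illustration).
ref = "lufthansa four five six descend flight level eight zero"
hyp = "lufthansa four five six descend level eight zero"
wer = word_error_rate(ref, hyp)  # one deleted word over nine reference words

# Absolute versus relative improvement, using the abstract's numbers.
atco2_wer = 17.9                     # reported in the abstract
absolute_gain = 6.6                  # reported in the abstract
baseline_wer = atco2_wer + absolute_gain       # inferred, not reported
relative_gain = absolute_gain / baseline_wer   # fraction of baseline errors removed
```

Under this reading, "6.6% absolute" is a difference of 6.6 percentage points between the two WER values, which is a larger change than a 6.6% relative reduction would be.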
@article{BUT185576,
author="ZULUAGA-GOMEZ, J. and NIGMATULINA, I. and PRASAD, A. and MOTLÍČEK, P. and KHALIL, D. and MADIKERI, S. and TART, A. and SZŐKE, I. and LENDERS, V. and RIGAULT, M. and CHOUKRI, K.",
title="Lessons Learned in Transcribing 5000 h of Air Traffic Control Communications for Robust Automatic Speech Understanding",
journal="Aerospace",
year="2023",
volume="10",
number="10",
pages="1--33",
doi="10.3390/aerospace10100898",
issn="2226-4310",
url="https://www.mdpi.com/2226-4310/10/10/898"
}