Structured Command Extraction from ATC Communications Using Open and Fine-Tuned Language Models

Paper ID

SIDs-2025-050

Conference

SESAR Innovation Days

Year

2025

Theme

Communications, navigation and surveillance (CNS)

Keywords

Radiotelephony; speech-to-text; language model; structured command extraction

Authors

Ana Maria Mekerishvili, Junzi Sun, Patrick Jonk and Vincent de Vries

DOI

https://doi.org/10.61009/SID.2025.1.19

Abstract

Radiotelephony remains the primary medium for pilot-controller communication, yet extracting structured information from spoken exchanges is challenging. Deep learning approaches often depend on large annotated datasets, which limits their use in data-scarce environments. This study evaluates open-source Large Language Models (LLMs) for structured information extraction from ATC communications, with applications in assisting or automating pseudo-pilot tasks. We evaluate Llama 3.3 (70B) with baseline prompting and Gemma 3 (4B) in baseline and fine-tuned variants on 496 utterances from NARSIM (NLR ATC real-time simulator), NLR’s ATM simulator. Performance is assessed on human transcripts and on ASR outputs from Whisper models, with varying prompt contexts. Cross-sector generalization is tested across two ATC sectors. Using manual scoring, Llama 3.3 achieves a micro-F1 of 0.95 on human transcripts and 0.86 on fine-tuned Whisper outputs. Gemma 3 performed worse in its baseline form, but fine-tuning on a small sample led to notable improvements. The results demonstrate the potential of LLMs for ATC applications without the need for large annotated datasets.
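For readers unfamiliar with the micro-F1 figures quoted above, the following is a minimal sketch of how such a score can be computed by pooling true positives, false positives, and false negatives across all utterances. It assumes exact matching of (field, value) pairs, which is a simplification: the study reports manual scoring, so this is not the authors' evaluation procedure. The function name `micro_f1`, the field names, and the callsigns are hypothetical and chosen only for illustration.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over per-utterance sets of (field, value) pairs.

    Assumption: a predicted pair counts as correct only on an exact match
    with a gold pair from the same utterance.
    """
    tp = fp = fn = 0
    for gold_items, pred_items in zip(gold, predicted):
        gold_set, pred_set = set(gold_items), set(pred_items)
        tp += len(gold_set & pred_set)   # extracted and correct
        fp += len(pred_set - gold_set)   # extracted but wrong
        fn += len(gold_set - pred_set)   # present in gold but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: two utterances, one wrong heading in the second.
gold = [
    {("callsign", "KLM123"), ("altitude", "FL100")},
    {("callsign", "TRA45"), ("heading", "270")},
]
predicted = [
    {("callsign", "KLM123"), ("altitude", "FL100")},
    {("callsign", "TRA45"), ("heading", "240")},
]
score = micro_f1(gold, predicted)
```

Because counts are pooled before precision and recall are computed, utterances containing more commands weigh more heavily in the final score than they would under a per-utterance (macro) average.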