LuxMT Technical Report
Abstract: We introduce LuxMT, a machine translation system based on Gemma 3 27B and fine-tuned for translation from Luxembourgish (LB) into French (FR) and English (EN). To assess translation performance, we construct a novel benchmark covering LB-FR, LB-EN, and LB-FR using human-translated data from Luci, a tourist magazine about Luxembourg. Training data stems from LuxAlign, a parallel corpus of multilingual Luxembourgish news articles, and LB parliamentary transcripts augmented with Google Translate. We filter the data using LuxEmbedder, LB sentence embeddings, to remove low-equivalence segment-pairs. Overall, LuxMT's results suggest strong improvements over the Gemma 3 baseline, even for translating LB to German (DE), despite the training data not containing any DE. We also explore LuxEmbedder's potential to be used as a quality estimation metric and find strong correlations with other reference-based metrics. However, we call for further research to fully assess the metric's utility and advise using it with caution.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.