NBF at SemEval-2025 Task 5: Light-Burst Attention Enhanced System for Multilingual Subject Recommendation

Published 6 May 2025 in cs.CL | (2505.03711v1)

Abstract: We present our system submission for SemEval 2025 Task 5, which focuses on cross-lingual subject classification in the English and German academic domains. Our approach leverages bilingual data during training, employing negative sampling and a margin-based retrieval objective. We demonstrate that a dimension-as-token self-attention mechanism designed with significantly reduced internal dimensions can effectively encode sentence embeddings for subject retrieval. In the quantitative evaluation, our system achieved an average recall of 32.24% in the general setting (all subjects); under the general qualitative evaluation methods it scored 43.16% and 31.53%. It did so with minimal GPU usage, which highlights its competitive performance. Our results demonstrate that the approach captures relevant subject information under resource constraints, although there is still room for improvement.
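To make the training objective named in the abstract more concrete, the following is a minimal sketch of a margin-based retrieval loss with negative sampling. It is not the authors' implementation: the function names (`sample_negatives`, `margin_loss`), the dot-product scoring, the margin value of 0.2, and the toy two-dimensional embeddings are all assumptions chosen for illustration.

```python
import random


def dot(u, v):
    """Dot product used here as the (assumed) document-subject score."""
    return sum(a * b for a, b in zip(u, v))


def sample_negatives(positive_ids, all_ids, k, rng):
    """Draw up to k subject ids that are not gold labels for the document."""
    candidates = [s for s in all_ids if s not in positive_ids]
    return rng.sample(candidates, min(k, len(candidates)))


def margin_loss(doc_vec, pos_vecs, neg_vecs, margin=0.2):
    """Mean hinge loss over every (positive, negative) subject pair.

    A pair contributes zero once the positive subject outscores the
    negative one by at least `margin` (hypothetical default).
    """
    losses = [
        max(0.0, margin - dot(doc_vec, p) + dot(doc_vec, n))
        for p in pos_vecs
        for n in neg_vecs
    ]
    return sum(losses) / len(losses) if losses else 0.0


# Toy data (hypothetical): three subjects and one document embedding.
subjects = {
    "physics": [1.0, 0.0],
    "history": [0.0, 1.0],
    "chemistry": [0.8, 0.6],
}
doc = [0.9, 0.1]
positives = {"physics"}

rng = random.Random(0)
negatives = sample_negatives(positives, list(subjects), 2, rng)
loss = margin_loss(
    doc,
    [subjects[s] for s in positives],
    [subjects[s] for s in negatives],
)
print(negatives, loss)
```

In this toy case only the "chemistry" negative violates the margin, so it alone contributes to the loss; "history" is already separated from the positive subject by more than the margin.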


Authors (4)

Summary

The Use of LuaLaTeX and XeLaTeX in *ACL Style Formatting

The paper entitled "LuaLaTeX and XeLaTeX Template for *ACL Style Files" is a practical guide to typesetting academic papers that meet the Association for Computational Linguistics (ACL) formatting requirements with the modern TeX engines LuaLaTeX and XeLaTeX. For researchers preparing submissions to ACL conferences, it shows how to take advantage of the flexibility these engines offer.

The authors illustrate the integration of ACL style files with these TeX engines, providing examples of multilingual text rendering, a crucial feature for the natural language processing community dealing with diverse linguistic datasets. Specifically, the paper demonstrates rendering in Hindi and Arabic scripts, showcasing the engines' capabilities in handling complex scripts that require specific typographic considerations.

These multilingual examples address a broader technical challenge in NLP: datasets that span many languages are difficult to render with standard typesetting engines. LuaLaTeX and XeLaTeX offer better support for non-Latin scripts, which makes them a good fit for computational linguistics research as it covers an increasingly diverse range of languages and scripts.

Furthermore, the paper gives a concise example of citation usage within the document, promoting the standardization that scholarly communication depends on. The example reflects best practices for proper attribution and intellectual integrity in research dissemination.

The practical implications are significant for computational linguistics researchers. The template gives them an efficient way to prepare documents that contain multilingual text while remaining compatible with ACL's rigorous formatting standards. This can reduce preparation time and improve document aesthetics and readability, which in turn supports clearer presentation of research findings.

Looking forward, the adoption of LuaLaTeX and XeLaTeX in computational linguistics could be further explored beyond formatting purposes. These TeX engines could be pivotal in the development of innovative document processing techniques that incorporate emerging AI-driven typographic solutions. There is potential for future research to investigate more sophisticated applications of LuaLaTeX and XeLaTeX in NLP, evaluating their role in improving workflows related to complex linguistic data analysis and visualization.

In conclusion, this paper contributes valuable tools and insights for researchers in computational linguistics, emphasizing the importance of robust typesetting solutions in accommodating multilingual and diverse language datasets. By providing detailed instructions for using LuaLaTeX and XeLaTeX with ACL style files, the authors facilitate improved documentation standards in the field. The implications are significant for both current and future research endeavors, ensuring that textual processing keeps pace with advancements in AI and NLP.