
Multi-domain Multilingual Sentiment Analysis in Industry: Predicting Aspect-based Opinion Quadruples

Published 15 May 2025 in cs.CL (arXiv:2505.10389v1)

Abstract: This paper explores the design of an aspect-based sentiment analysis system using LLMs for real-world use. We focus on quadruple opinion extraction -- identifying aspect categories, sentiment polarity, targets, and opinion expressions from text data across different domains and languages. Using internal datasets, we investigate whether a single fine-tuned model can effectively handle multiple domain-specific taxonomies simultaneously. We demonstrate that a combined multi-domain model achieves performance comparable to specialized single-domain models while reducing operational complexity. We also share lessons learned for handling non-extractive predictions and evaluating various failure modes when developing LLM-based systems for structured prediction tasks.

Summary

The exploration of a comprehensive aspect-based sentiment analysis (ABSA) system in this paper illustrates the application of large language models (LLMs) for extracting nuanced sentiment insights across multiple domains and languages. The study's primary contribution is the implementation of a unified model capable of predicting quadruple opinion elements—aspect categories, sentiment polarity, targets, and opinion expressions—from text data. This functionality addresses the complexities inherent in analyzing multilingual datasets derived from various sectors, such as customer feedback and employee surveys.

Model Design and Implementation

The authors propose a generative modeling approach to handle the extraction of sentiment quadruples. Quadruple opinion extraction extends beyond the traditional triplet extraction by encompassing the identification of opinion expressions, thereby offering a more granular sentiment analysis. This task is particularly beneficial for industrial applications where stakeholders rely on accurate sentiment insights for decision-making in domains such as Products and Services (PS), Human Resources (HR), and Customer Relations (CR).
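As a concrete illustration, an opinion quadruple can be modeled as a small record type. The field names and the restaurant-review example below are illustrative assumptions, not the paper's internal schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OpinionQuadruple:
    aspect_category: str   # label from a domain-specific taxonomy
    polarity: str          # "positive", "negative", or "neutral"
    target: str            # span the opinion is about
    opinion: str           # span expressing the opinion

# "The pasta was delicious but the service was slow."
quads = [
    OpinionQuadruple("food quality", "positive", "pasta", "delicious"),
    OpinionQuadruple("service", "negative", "service", "slow"),
]
```

Making the record `frozen` keeps quadruples hashable, which is convenient for the set-based exact-match scoring typically used in ABSA evaluation.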

The research emphasizes the advantages of developing a multi-domain model. The model effectively integrates multiple taxonomies associated with different domains into a single framework, bypassing the need for separate models for each taxonomy. This consolidated approach not only simplifies operational logistics but also enhances the model’s capability to adapt to domain-specific variations in taxonomy dynamically. The study rigorously tests the model’s ability to simultaneously process diverse taxonomies, demonstrating its proficiency in achieving comparable performance to specialized single-domain models.
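One plausible way a single model can serve several taxonomies is to inject the active domain's label set into the prompt at inference time. The sketch below is a hedged illustration of that idea; the domain codes and category lists are invented for the example and the paper's actual prompt format may differ:

```python
# Illustrative per-domain taxonomies (assumptions, not the paper's labels).
TAXONOMIES = {
    "PS": ["price", "quality", "delivery"],
    "HR": ["work-life balance", "compensation", "management"],
    "CR": ["support responsiveness", "issue resolution"],
}

def build_prompt(domain: str, text: str) -> str:
    """Condition one shared model on a domain-specific taxonomy via the prompt."""
    labels = ", ".join(TAXONOMIES[domain])
    return (
        "Extract opinion quadruples (category, polarity, target, opinion) "
        f"from the text below. Allowed categories: {labels}.\n"
        f"Text: {text}"
    )

prompt = build_prompt("HR", "My manager is supportive.")
```

Because the taxonomy travels with the request rather than being baked into separate checkpoints, adding or revising a domain only changes the prompt-building step, not the deployed model.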

Experimental Approach

Using several large internal datasets, the researchers conducted extensive empirical analyses to validate their approach. The datasets included multilingual text data, contributing to a robust analysis framework that accommodates varying syntactic and semantic structures. Notably, the study handled non-extractive predictions and introduced a systematic post-processing method for ensuring extractive output conformity, aligning model predictions strictly with substrings present in the input text.
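A minimal sketch of such an extractive-conformity step, assuming one reasonable rule set (exact substring match, then a fuzzy fallback over same-length word windows, then rejection) rather than the paper's exact method:

```python
import difflib

def enforce_extractive(span: str, source: str, cutoff: float = 0.8):
    """Map a predicted span onto an actual substring of the source text,
    or return None when no sufficiently similar substring exists."""
    if span in source:
        return span
    # Fuzzy fallback: compare against word windows of the same length.
    words = source.split()
    n = max(1, len(span.split()))
    windows = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    matches = difflib.get_close_matches(span, windows, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

The `cutoff` threshold trades recall of near-miss predictions against the risk of snapping a hallucinated span onto unrelated text.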

For model comparison, the authors employed Qwen2.5 family models alongside a GPT-4o mini baseline. Their findings underscore the efficacy of fine-tuned smaller models over larger closed-source models in specific sentiment tasks, exemplifying scalability without compromising performance in real-world business environments.

Evaluation and Results

The evaluation hinged on both the model's structural integrity and task-specific performance. Structural evaluation assessed the adherence to JSON schema outputs, while task performance focused on the accuracy of quadruple predictions. Importantly, the research highlighted model scalability across different domains and underscored the variable difficulty levels among datasets, influenced by factors like writing style and domain specificity.
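A structural check of this kind can be sketched as follows; the required key names are assumptions for illustration, not the paper's published schema:

```python
import json

# Assumed quadruple schema for the sketch.
REQUIRED_KEYS = {"aspect_category", "polarity", "target", "opinion"}

def is_structurally_valid(raw_output: str) -> bool:
    """Output must parse as JSON and be a list of quadruple objects,
    each carrying exactly the expected keys."""
    try:
        quads = json.loads(raw_output)
    except json.JSONDecodeError:
        return False
    return (isinstance(quads, list)
            and all(isinstance(q, dict) and set(q) == REQUIRED_KEYS
                    for q in quads))
```

Separating this structural gate from task scoring lets malformed generations be counted as a distinct failure mode rather than silently lowering F1.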

The paper's results reported that fine-tuned models significantly outperformed their base configurations, with multi-domain models yielding near parity with single-domain configurations in terms of F1 scores. Remarkably, the multi-domain fine-tuned models demonstrated efficient handling of diverse classification taxonomies, suggesting their practical feasibility in dynamic business scenarios.
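Exact-match micro F1 over sets of quadruples, the standard ABSA scoring scheme, can be computed as below; the paper's precise scoring details may differ:

```python
def quadruple_f1(pred, gold):
    """F1 where a prediction counts only if all four elements match exactly."""
    pred, gold = set(pred), set(gold)
    if not pred or not gold:
        return 0.0
    tp = len(pred & gold)
    precision = tp / len(pred)
    recall = tp / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

gold = {
    ("food quality", "positive", "pasta", "delicious"),
    ("service", "negative", "service", "slow"),
}
pred = {
    ("food quality", "positive", "pasta", "delicious"),   # correct
    ("price", "negative", "bill", "expensive"),           # spurious
}
```

With one of two gold quadruples recovered and one spurious prediction, precision and recall are both 0.5, so F1 is 0.5. This all-or-nothing matching is what makes quadruple extraction a notably stricter benchmark than triplet extraction.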

Practical and Theoretical Implications

Practically, the development of a versatile, multi-domain sentiment analysis system holds significant promise for real-world applications. By consolidating domain-specific models into a single operable unit, enterprises can streamline sentiment analysis processes while reducing complexity. Theoretically, the paper contributes to the literature on ABSA by expanding the frontiers of multilingual, multi-domain sentiment analysis and learning from domain-specific data using generalized models.

Future Directions

The authors propose future exploration into language-specific prompts and enhancement of existing models through post-training methods such as Group Relative Policy Optimization (GRPO). These directions aim to further improve model performance, especially in multilingual settings and underrepresented domains, aligning with ongoing advances in AI and natural language processing.

This study exemplifies the intersection of industrial requirements and academic inquiry in sentiment analysis, paving the way for sophisticated applications of LLMs in the real-world sentiment extraction landscape.
