Deterministic AI Agent Personality Expression through Standard Psychological Diagnostics

Published 21 Mar 2025 in cs.LG, cs.AI, cs.CY, and cs.HC | (2503.17085v1)

Abstract: AI systems powered by LLMs have become increasingly prevalent in modern society, enabling a wide range of applications through natural language interaction. As AI agents proliferate in our daily lives, their generic and uniform expressiveness presents a significant limitation to their appeal and adoption. Personality expression represents a key prerequisite for creating more human-like and distinctive AI systems. We show that AI models can express deterministic and consistent personalities when instructed using established psychological frameworks, with varying degrees of accuracy depending on model capabilities. We find that more advanced models like GPT-4o and o1 demonstrate the highest accuracy in expressing specified personalities across both Big Five and Myers-Briggs assessments, and further analysis suggests that personality expression emerges from a combination of intelligence and reasoning capabilities. Our results reveal that personality expression operates through holistic reasoning rather than question-by-question optimization, with response-scale metrics showing higher variance than test-scale metrics. Furthermore, we find that model fine-tuning affects communication style independently of personality expression accuracy. These findings establish a foundation for creating AI agents with diverse and consistent personalities, which could significantly enhance human-AI interaction across applications from education to healthcare, while additionally enabling a broader range of more unique AI agents. The ability to quantitatively assess and implement personality expression in AI systems opens new avenues for research into more relatable, trustworthy, and ethically designed AI.


Summary

The paper’s main contribution is showing that AI models can express deterministic, consistent personalities, as measured by standardized Big Five and MBTI tests, with accuracy that depends on model capability.
It employs detailed quantitative assessments (MAE, RMSE, Pearson, Spearman, F1, Cohen’s κ) to compare personality expression across various AI models.
Findings highlight that enhanced reasoning and contextual motivation improve personality expression, while fine-tuning adjusts only communicative style without altering core traits.


The paper "Deterministic AI Agent Personality Expression through Standard Psychological Diagnostics" (2503.17085) examines whether AI models, specifically LLMs, can express deterministic and consistent personalities. Using two established psychological frameworks, the Big Five and Myers-Briggs assessments, it evaluates how accurately various models express a specified personality and interprets personality expression as emerging from a combination of intelligence and reasoning capabilities.

AI Models and Personality Expression

The study investigates several AI models, including GPT-4o-mini, GPT-4o, o1, and o3-mini, which differ in intelligence and reasoning capabilities, and examines how well each expresses a specified personality. The personality expression of each AI agent is quantified through two standardized psychological diagnostics: the Big Five Personality Test and the Myers-Briggs Type Indicator (MBTI). Accuracy varied across models, with GPT-4o and o1 expressing the specified personalities most accurately.

Big Five Test: The Big Five Personality Test, a widely acknowledged framework, evaluates five dimensions of personality: extraversion, agreeableness, conscientiousness, neuroticism, and openness. The study revealed that while extraversion, conscientiousness, and neuroticism were expressed with high accuracy, the expression of openness lagged, a phenomenon presumably linked to intrinsic model biases towards intellectual openness.
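To make the measurement concrete, here is a minimal sketch of how a Likert-style Big Five questionnaire is commonly scored. It is not the paper's scoring code: the item list, the 1-to-5 scale, and the function name score_big_five are assumptions for illustration. Each item belongs to one trait and is either positively or negatively keyed; negatively keyed answers are reversed before averaging.

```python
from statistics import mean

TRAITS = (
    "extraversion",
    "agreeableness",
    "conscientiousness",
    "neuroticism",
    "openness",
)


def score_big_five(responses, items, scale_max=5):
    """Average Likert answers per trait, reversing negatively keyed items.

    responses: list of ints in 1..scale_max, one per item (assumed scale).
    items: list of (trait, keyed_positive) tuples, hypothetical item key.
    """
    per_trait = {trait: [] for trait in TRAITS}
    for answer, (trait, keyed_positive) in zip(responses, items):
        # A reversed item maps 1 -> scale_max, 2 -> scale_max - 1, and so on.
        value = answer if keyed_positive else scale_max + 1 - answer
        per_trait[trait].append(value)
    # Only report traits that actually had items in this questionnaire.
    return {trait: mean(vals) for trait, vals in per_trait.items() if vals}


# Hypothetical three-item questionnaire.
example_items = [
    ("extraversion", True),
    ("extraversion", False),
    ("openness", True),
]
example_scores = score_big_five([5, 1, 3], example_items)
print(example_scores)
```

In this toy example, strongly agreeing with a positively keyed extraversion item and strongly disagreeing with a negatively keyed one both count as maximal extraversion.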

Figure 1: Big Five test outcomes as a function of the input personality type, for different AI models. Models like 4o and o1 displayed superior personality accuracy.

MBTI Test: Though the Myers-Briggs test is less scientifically rigorous, its widespread use justified its inclusion. The accuracy trends observed in the Big Five assessments echoed here, emphasizing the consistency of model-based personality expression across different frameworks.

Reasoning, Motivation, and Communication

Requiring the models to state a motivation for each test response highlighted the role of reasoning. The o1 model in particular maintained its accuracy even when required to contextualize its answers. This points to an interplay between intelligence and reasoning, in which articulating motivations helps even less capable models (e.g., GPT-4o-mini) achieve improved performance.

Figure 2: Big Five test responses show that high accuracy stems from a holistic reasoning process rather than question-by-question optimization.

The study also examined whether fine-tuning affected personality expression. The results suggest that fine-tuning changed communication style (edgier, more eccentric responses) while leaving the accuracy of personality expression unaffected. This separation of message content from delivery suggests that AI agents can maintain a consistent personality while adapting their communicative style to different contexts or audiences.

Quantitative Metrics and Performance

Accuracy of personality expression was assessed with mean absolute error (MAE), root mean square error (RMSE), Pearson and Spearman correlations, F1 score, and Cohen’s κ, each capturing a different facet of accuracy. These metrics were normalized and combined to summarize performance across tests and conditions.
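The metrics listed above have standard definitions, sketched below in plain Python. This is an illustrative implementation, not the paper's evaluation code: the function names and example data are assumptions, and the paper's normalization and combination step is omitted because the summary does not specify it. The error and correlation metrics suit continuous trait scores (Big Five), while F1 and Cohen's kappa suit categorical labels (for example, MBTI letters).

```python
import math
from collections import Counter


def mae(target, observed):
    """Mean absolute error between target and observed scores."""
    return sum(abs(t - o) for t, o in zip(target, observed)) / len(target)


def rmse(target, observed):
    """Root mean square error; penalizes large deviations more than MAE."""
    return math.sqrt(sum((t - o) ** 2 for t, o in zip(target, observed)) / len(target))


def pearson(x, y):
    """Pearson linear correlation coefficient (undefined for constant input)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def cohen_kappa(labels_a, labels_b):
    """Agreement between two label sequences, corrected for chance."""
    n = len(labels_a)
    p_observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


def f1_binary(true_labels, predicted_labels, positive):
    """F1 score for one class treated as the positive label."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical data: specified vs. measured trait scores and MBTI letters.
specified = [1, 2, 3, 4, 5]
measured = [2, 3, 4, 5, 6]
specified_letters = ["E", "I", "E", "I"]
measured_letters = ["E", "I", "I", "I"]
```

Note how the example separates the metric families: a constant offset of one point gives an MAE and RMSE of 1 but perfect Pearson and Spearman correlations, which is one reason to report both error and correlation metrics.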

Figure 3: Synthesis of model performance across tests with higher accuracy in holistic personality expression linked to models like o1.

Furthermore, response-level metrics show higher variance than test-level metrics, which suggests that personality is expressed through holistic reasoning over the whole test rather than by optimizing each answer individually. Individual answers therefore remain less predictable than overall test scores, giving the personality expression a more human-like nuance.

Implications and Future Research Directions

The deterministic framework for AI personality expression has significant implications for creating distinct, relatable AI agents, enhancing fields like personalized education, therapy, and customer interaction. By delivering personalized interactions through diverse agent personalities, AI systems could improve user engagement and trust.

However, improving the expression of dimensions like openness and extending these personality frameworks to more comprehensive and multimodal contexts remain unresolved challenges. Ethical considerations are also paramount, in particular transparency and user awareness of AI personalities.

In conclusion, the research establishes a foundational method for deterministic AI personality expression, fostering future exploration in varied AI contexts and applications, while opening new avenues for ethical AI deployment discussions.

Figure 4: Probability distribution of Big Five test scores for o1 model, indicating accuracy in expressing diverse personality traits.

The synthesis of intelligence and reasoning in AI offers promising opportunities to refine AI interactions, harnessing both systematic frameworks and fine-tuning to perfect personality expressions across diverse applications.