Papers
Topics
Authors
Recent
Search
2000 character limit reached

Zero-shot Performance of Generative AI in Brazilian Portuguese Medical Exam

Published 26 Jul 2025 in cs.CL | (2507.19885v1)

Abstract: AI has shown the potential to revolutionize healthcare by improving diagnostic accuracy, optimizing workflows, and personalizing treatment plans. LLMs and Multimodal LLMs (MLLMs) have achieved notable advancements in natural language processing and medical applications. However, the evaluation of these models has focused predominantly on the English language, leading to potential biases in their performance across different languages. This study investigates the capability of six LLMs (GPT-4.0 Turbo, LLaMA-3-8B, LLaMA-3-70B, Mixtral 8x7B Instruct, Titan Text G1-Express, and Command R+) and four MLLMs (Claude-3.5-Sonnet, Claude-3-Opus, Claude-3-Sonnet, and Claude-3-Haiku) to answer questions written in Brazilian spoken portuguese from the medical residency entrance exam of the Hospital das Cl\'inicas da Faculdade de Medicina da Universidade de S~ao Paulo (HCFMUSP) - the largest health complex in South America. The performance of the models was benchmarked against human candidates, analyzing accuracy, processing time, and coherence of the generated explanations. The results show that while some models, particularly Claude-3.5-Sonnet and Claude-3-Opus, achieved accuracy levels comparable to human candidates, performance gaps persist, particularly in multimodal questions requiring image interpretation. Furthermore, the study highlights language disparities, emphasizing the need for further fine-tuning and data set augmentation for non-English medical AI applications. Our findings reinforce the importance of evaluating generative AI in various linguistic and clinical settings to ensure a fair and reliable deployment in healthcare. Future research should explore improved training methodologies, improved multimodal reasoning, and real-world clinical integration of AI-driven medical assistance.

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.