Identify the teaching strategy LLMs use to choose the next instructional action

Determine which teaching strategy large language models use when deciding what to teach next in instructional settings. Specifically, ascertain whether their action selection is governed by learner-model-based mentalizing (e.g., Bayes-Optimal teaching via inverse planning) or by simpler model-free heuristics.

Background

The paper notes that most prior work on LLMs in education evaluates outputs and downstream learning outcomes rather than the decision process that selects instructional actions. This gap motivates assessing the underlying strategy that guides LLM teaching choices.

Human teaching studies often reveal two distinct strategies: a model-based mentalizing approach that reasons about a learner’s knowledge to maximize improvement, and model-free heuristics that rely on environmental cues. Understanding which of these (or other) strategies LLMs use is crucial for designing effective scaffolds and evaluating reliability in tutoring contexts.
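
The contrast between the two strategies can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the three hypotheses, the two candidate demonstrations `x` and `y`, their likelihood values, and the function names are all hypothetical. The model-based teacher simulates the learner's Bayesian update and picks the demonstration that most raises the learner's belief in the target; the heuristic teacher ignores the learner and picks the demonstration that looks most typical of the target.

```python
# Toy illustration only: hypotheses, demonstrations, and numbers are hypothetical.

def posterior(prior, likelihood):
    """Bayesian update of a discrete belief given P(demonstration | hypothesis)."""
    unnorm = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

def model_based_choice(actions, learner_belief, target):
    """Mentalizing teacher: simulate the learner's update for each action and
    pick the one that maximizes the learner's posterior on the target."""
    return max(actions, key=lambda a: posterior(learner_belief, actions[a])[target])

def heuristic_choice(actions, target):
    """Model-free teacher: ignore the learner's belief and pick the action
    with the highest likelihood under the target hypothesis."""
    return max(actions, key=lambda a: actions[a][target])

# The learner currently favors the wrong hypothesis "B"; the target is "A".
learner_belief = {"A": 0.1, "B": 0.8, "C": 0.1}
target = "A"

# P(demonstration | hypothesis) for two candidate demonstrations.
actions = {
    "x": {"A": 0.9, "B": 0.9, "C": 0.1},  # typical of A, but does not rule out B
    "y": {"A": 0.8, "B": 0.1, "C": 0.8},  # less typical of A, but rules out B
}

mb = model_based_choice(actions, learner_belief, target)
hf = heuristic_choice(actions, target)
print("model-based teacher picks:", mb)  # y
print("heuristic teacher picks:", hf)    # x
```

In this toy setting the two strategies disagree because the learner's misconception matters: demonstration `x` leaves the learner's belief in the target near 0.11, while `y` raises it to about 0.33. Choice patterns that diverge in this way are the kind of evidence that could, in principle, distinguish mentalizing from heuristic action selection.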

References

Consequently, it remains unclear what teaching strategy an LLM uses when deciding what to teach next.

Do Large Language Models Mentalize When They Teach? (2604.01594 - Harootonian et al., 2 Apr 2026) in Introduction (Section 1)