DuoDrama: Dual-Character AI Storytelling

Updated 6 February 2026

DuoDrama is a computational framework for interactive dual-character storytelling, uniting dialog generation, personality control, and multi-agent feedback loops.
It employs structured intention graphs, pragmatic markers, and role-based dialogue planning to achieve precise narrative control.
The system integrates multi-agent reflective workflows and ego–superego dynamics to enhance narrative divergence, emotional authenticity, and screenwriting support.

DuoDrama is a computational framework and class of interactive AI systems for the generation, refinement, and critical examination of two-character dramatic scenarios. The paradigm unites dialogic story construction, personality-driven stylistic control, and multi-agent feedback loops, with core applications in automated storytelling and screenwriting support. Three major lines of work collectively define the landscape of DuoDrama: (1) formal dialog generation pipelines grounded in story intention graphs and personality modeling (Bowden et al., 2017), (2) experience-grounded reflection workflows for collaborative screenplay critique (Tang et al., 5 Feb 2026), and (3) multi-agent LLM roleplay architectures simulating ego–superego dynamics for emergent character development (Magee et al., 2024).

1. Formal Story Representation and Dialog Generation

The foundational architecture for DuoDrama, outlined in extensions to the Monolog-to-Dialog (M2D) framework, operationalizes stories as structured intention graphs $F=(E,C,R)$. Here, $E$ enumerates plot events as predicate-argument frames, $C$ denotes the character set, and $R$ encodes inter-event and character relationships including causality, temporality, goals, and affect. Parsing involves annotation via Scheherazade, yielding deep-syntactic structures (DsyntS) that faithfully preserve predicate-argument information, event salience, and emotional transitions.

The dialog generation pipeline comprises several key stages:

Content Allocation: Sentences $s_i$ from $S=\{s_1,\ldots,s_n\}$ are probabilistically attributed to speakers according to a role-based and personality-modulated density $\alpha_j$, with $\sum_j \alpha_j=1$. For example, $\alpha_1 = 0.5 + \lambda(E_1-E_2)$ ties the first speaker's share to the extraversion differential between the two speakers, with $\lambda$ scaling the effect.
Sentence Planning: Aggregation merges adjacent sentences sharing syntactic heads, while deaggregation splits complex structures; coreference substitution controlled by recency thresholds enables pronominalization; question generation formulates micro-dialogues for elaboration and engagement.
Pragmatic Markers and Turn-Taking: Insertion of hedges, exclamations, and confirmation tags whose probabilities are parameterized by Big Five traits (e.g., $P(\text{“yeah”}|p_j) = \beta_0+\beta_1 \text{Agreeableness}_j$) modulates interactional style and conversational floor-yielding.
Surface Realization: Final DsyntS representations are linearized via grammatical surface realization engines such as RealPro (Bowden et al., 2017).

This precise representation-to-dialog conversion enables pointwise control over narrative, role, and style at each generation step.
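The allocation and marker formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the M2D implementation: the `Speaker` class, the function names, the default values of `lam`, `beta0`, and `beta1`, and the clamping to $[0,1]$ are all assumptions introduced here.

```python
import random
from dataclasses import dataclass


@dataclass
class Speaker:
    name: str
    extraversion: float   # Big Five trait in [0, 1]
    agreeableness: float  # Big Five trait in [0, 1]


def allocation_share(s1, s2, lam=0.4):
    """alpha_1 = 0.5 + lambda * (E_1 - E_2), clamped to [0, 1] (clamping is an assumption)."""
    alpha_1 = 0.5 + lam * (s1.extraversion - s2.extraversion)
    return min(1.0, max(0.0, alpha_1))


def marker_probability(speaker, beta0=0.1, beta1=0.5):
    """P("yeah" | p_j) = beta_0 + beta_1 * Agreeableness_j, clamped to [0, 1]."""
    return min(1.0, max(0.0, beta0 + beta1 * speaker.agreeableness))


def allocate(sentences, s1, s2, rng, lam=0.4):
    """Assign each sentence to a speaker and optionally prepend a confirmation tag."""
    alpha_1 = allocation_share(s1, s2, lam)
    turns = []
    for sentence in sentences:
        # Speaker 1 receives a sentence with probability alpha_1, speaker 2 otherwise.
        speaker = s1 if rng.random() < alpha_1 else s2
        if rng.random() < marker_probability(speaker):
            sentence = "Yeah, " + sentence
        turns.append((speaker.name, sentence))
    return turns


if __name__ == "__main__":
    fox = Speaker("Fox", extraversion=0.9, agreeableness=0.8)
    crow = Speaker("Crow", extraversion=0.3, agreeableness=0.2)
    story = ["The crow found some cheese.", "The fox wanted it.", "The fox flattered the crow."]
    for name, line in allocate(story, fox, crow, random.Random(0)):
        print(f"{name}: {line}")
```

Because the second speaker's share is $1-\alpha_1$, the constraint $\sum_j \alpha_j=1$ holds by construction in the two-speaker case.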

2. Multi-Agent Reflective Feedback and ExReflect Workflow

Recent developments in DuoDrama have centered on reflective screenwriting support systems that simulate both internal character immersion and detached evaluative critique. The principal method, ExReflect, is a dual-role prompting workflow inspired by Stanislavskian and Brechtian performance theories (Tang et al., 5 Feb 2026):

Experience Role: An agent enacts the inner psychological state of a character for each script line, generating “inner thoughts” through a chain-of-thought comprising intuitive reaction, episodic memory recall, goal inference, and synthesis.
Evaluation Role: The agent assumes the viewpoint of the actor portraying the character, using the just-generated inner thoughts as evidence to produce multidimensional reflective feedback. This feedback targets five axes: emotional plausibility, behavioral motivation, relational dynamics, pacing coherence, and thematic integration.
Feedback Filtering: Candidate feedback undergoes automatic self-verification for evidence grounding, expression diversity, dimensional coverage, and impact calibration.

In technical terms, the iterative core loop can be succinctly specified as:

$\begin{aligned} &\text{For each line:}\\ &\quad \text{Agent}_\text{exp} = \mathrm{ExperienceRole}(\text{line},\text{persona},\text{memory})\\ &\quad \text{InnerThoughts} = \mathrm{Enact}(\text{Agent}_\text{exp})\\ &\quad \text{ShortTermMemory} \!+\!= \text{InnerThoughts}\\ &\quad \text{Agent}_\text{eval} = \mathrm{EvaluationRole}(\text{persona}, \text{ShortTermMemory})\\ &\quad \text{Candidates} = \mathrm{GenerateFeedback}(\text{Agent}_\text{eval})\\ &\quad \text{Feedback} = \{f \in \text{Candidates} \mid \mathrm{Verify}(f)\}\\ &\quad \mathrm{Display}(\text{Feedback}) \end{aligned}$

Systemically, DuoDrama implements this workflow with per-character ExReflect agents, dual-memory modules (long-term FAISS indices and incremental short-term state), and chain-of-thought LLM orchestrations (Tang et al., 5 Feb 2026).
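The per-line loop can be sketched as follows. This is a hedged illustration rather than the published system: the class and field names are hypothetical, the two role functions are deterministic stubs standing in for LLM calls, long-term retrieval is omitted, and `verify` checks only evidence grounding, not the other three filter criteria.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Feedback:
    dimension: str  # e.g. "emotional plausibility"
    text: str
    evidence: str   # the inner thought this comment is grounded in


@dataclass
class ExReflectAgent:
    persona: str
    experience: Callable[[str, str, List[str]], str]      # (line, persona, memory) -> inner thought
    evaluate: Callable[[str, List[str]], List[Feedback]]  # (persona, memory) -> candidate feedback
    short_term_memory: List[str] = field(default_factory=list)

    def verify(self, fb: Feedback) -> bool:
        # Simplified filter: keep feedback only if it cites a recorded inner thought.
        return bool(fb.evidence) and fb.evidence in self.short_term_memory

    def step(self, line: str) -> List[Feedback]:
        thought = self.experience(line, self.persona, self.short_term_memory)
        self.short_term_memory.append(thought)            # ShortTermMemory += InnerThoughts
        candidates = self.evaluate(self.persona, self.short_term_memory)
        return [fb for fb in candidates if self.verify(fb)]


def stub_experience(line, persona, memory):
    """Stand-in for the Experience Role LLM call."""
    return f"{persona} feels cornered by: {line!r}"


def stub_evaluate(persona, memory):
    """Stand-in for the Evaluation Role LLM call; one grounded and one ungrounded candidate."""
    grounded = Feedback("emotional plausibility", "The reaction fits the pressure in this beat.", memory[-1])
    ungrounded = Feedback("pacing coherence", "Feels slow.", "")
    return [grounded, ungrounded]


if __name__ == "__main__":
    agent = ExReflectAgent("Anna", stub_experience, stub_evaluate)
    for fb in agent.step("Where were you last night?"):
        print(f"[{fb.dimension}] {fb.text}")
```

Keeping the role functions as injected callables makes it easy to swap the stubs for real model calls without changing the loop itself.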

3. Multi-Agent Ego–Superego Architectures and Character Dynamics

The Drama Machine instantiates DuoDrama through role-split LLM agents embodying “Ego” (external, public-facing) and “Superego” (internal, normative critic) subsystems (Magee et al., 2024). The technical protocol involves two parallel, recurrent processes:

Ego State ($s_t^{\text{Ego}}$): Aggregates memory embedding $m_t \in \mathbb{R}^d$, emotion vector $e_t \in \mathbb{R}^k$, and current system-prompt $p_t$ managing backstory and stance; evolves via $s_{t+1}^{\text{Ego}} = u_E(s_t^{\text{Ego}}, U_t, C_t)$, where $U_t$ is input and $C_t$ is Superego critique.
Superego State ($s_t^\text{Sup}$): Maintains constraint vector $n_t$ and recent-context embedding $c_t$; updates as $s_{t+1}^{\text{Sup}} = u_S(s_t^{\text{Sup}}, D_t, U_t)$, with $D_t$ the Ego's proposed next reply.

The coordination protocol alternates User input, Ego drafting, Superego critique (via prompt rewrite, input rewrite, or revision suggestion), Ego state update, and final response emission, as follows:

$\forall t \in \{1, \dots, T\}: \begin{cases} D_t = \mathrm{Ego.generate}(s_t^\text{Ego}, U_t)\\ C_t = \mathrm{Sup.generate}(s_t^\text{Sup}, D_t, U_t)\\ s_{t+1}^\text{Ego} = u_E(s_t^\text{Ego}, U_t, C_t)\\ R_t = \mathrm{Ego.respond}(D_t, C_t)\\ s_{t+1}^\text{Sup} = u_S(s_t^\text{Sup}, D_t, U_t) \end{cases}$
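A minimal sketch of this turn loop, under stated assumptions: the state classes use plain string lists in place of the embeddings $m_t$, $n_t$, and $c_t$, the emotion vector is omitted, every function body is a deterministic stub for an LLM call, and all names are hypothetical. The Superego update takes the user input as well as the draft, following the state definition $u_S(s_t^{\text{Sup}}, D_t, U_t)$ given above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EgoState:
    prompt: str                                       # p_t: backstory and stance
    memory: List[str] = field(default_factory=list)   # stands in for m_t


@dataclass
class SuperegoState:
    norms: List[str]                                  # stands in for n_t
    context: List[str] = field(default_factory=list)  # stands in for c_t


def ego_generate(state, user_input):
    """D_t: the Ego's draft reply (stub)."""
    return f"[{state.prompt}] reply to: {user_input}"


def sup_generate(state, draft, user_input):
    """C_t: the Superego's critique of the draft (stub)."""
    return "revise: " + "; ".join(state.norms)


def ego_respond(draft, critique):
    """R_t: the emitted response after taking the critique into account (stub)."""
    return f"{draft} ({critique})"


def update_ego(state, user_input, critique):
    """u_E: fold the input and critique into the Ego's memory."""
    return EgoState(state.prompt, state.memory + [user_input, critique])


def update_sup(state, draft, user_input, window=4):
    """u_S: keep only the most recent context items."""
    return SuperegoState(state.norms, (state.context + [user_input, draft])[-window:])


def run_dialogue(ego, sup, user_inputs):
    responses = []
    for u_t in user_inputs:
        d_t = ego_generate(ego, u_t)
        c_t = sup_generate(sup, d_t, u_t)
        ego_next = update_ego(ego, u_t, c_t)
        responses.append(ego_respond(d_t, c_t))
        sup = update_sup(sup, d_t, u_t)
        ego = ego_next
    return responses, ego, sup


if __name__ == "__main__":
    replies, _, _ = run_dialogue(EgoState("detective"), SuperegoState(["stay calm"]), ["hi", "why?"])
    print("\n".join(replies))
```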

Empirically, inclusion of the Superego agent elevates adaptive behavior, introspection, and narrative divergence by 2–3 points across standardized metrics, producing character trajectories marked by oscillation, ambiguity, and “learning” across turns (Magee et al., 2024).

4. Personality Modeling, Stylistic Control, and Dramatic Planning

Personality and linguistic style in DuoDrama systems are modulated via parameterization over Big Five psychological profiles (extraversion, agreeableness, conscientiousness, neuroticism, and openness, all $\in [0,1]$) plus dramatic/role identity (protagonist, antagonist, etc.) (Bowden et al., 2017). These parameters govern:

Allocation and density of narrative content per speaker
Probability distributions for insertion of pragmatic markers and expressive features
Lexical choice among synonym sets, e.g., sampling $w_i$ for predicate $p$ based on

$\text{score}(w_i) = \phi_1\,\text{freq}_\text{norm}(w_i) + \phi_2\,\text{length}_\text{norm}(w_i) + \phi_3\,(E_j{-}0.5)\,\text{complexity}(w_i)$

Higher-level goal-setting via a “dramatic planner” that assigns discourse motives such as conflict, reveal, or climax to dialogic turns.
Optional LLM-based style adapters that learn empirical placement distributions for markers or paraphrase style from annotated corpora.
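The lexical-choice score above can be illustrated with a short sketch. The weights in `phi`, the per-word feature values, and the softmax sampling step are assumptions made for this example; the source specifies only the linear score, not how candidates are drawn from it.

```python
import math
import random


def score(word, extraversion, phi=(1.0, -0.5, 2.0)):
    """score(w_i) = phi_1*freq_norm + phi_2*length_norm + phi_3*(E_j - 0.5)*complexity."""
    phi1, phi2, phi3 = phi
    return (phi1 * word["freq_norm"]
            + phi2 * word["length_norm"]
            + phi3 * (extraversion - 0.5) * word["complexity"])


def sample_word(candidates, extraversion, rng, temperature=1.0):
    """Draw one synonym with softmax weights over the scores (assumed sampling rule)."""
    scores = [score(w, extraversion) / temperature for w in candidates]
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]["word"]


if __name__ == "__main__":
    # Hypothetical synonym set for the predicate "say"; feature values are invented.
    synonyms = [
        {"word": "said", "freq_norm": 0.9, "length_norm": 0.2, "complexity": 0.1},
        {"word": "proclaimed", "freq_norm": 0.2, "length_norm": 0.8, "complexity": 0.9},
    ]
    rng = random.Random(0)
    for e in (0.1, 0.9):
        picks = [sample_word(synonyms, e, rng) for _ in range(1000)]
        print(f"E={e}: 'proclaimed' chosen {picks.count('proclaimed')} times out of 1000")
```

With a positive $\phi_3$, the term $(E_j-0.5)$ raises the score of complex words for speakers above the extraversion midpoint and lowers it for speakers below it.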

This personality-driven mechanism ensures that DuoDrama outputs are not merely turn-based exchanges but embody idiosyncratic voices, dynamic interpersonal balance, and unfolding dramaturgy.

5. Evaluation Methods and Empirical Outcomes

Assessment of DuoDrama comprises both large-scale quantitative studies and qualitative analyses, targeting engagement, naturalness, character arc recognition, and effectiveness in collaborative creative refinement.

Story Retelling and Dialog Generation: Judged on engagement and naturalness in multi-condition experiments (e.g., EST–Basic–Chatty) with paired $t$-tests for statistical significance. Chatty, personality-rich outputs reliably yield higher engagement ($p=0.04$) while baseline-inflected dialog is perceived as most natural due to restraint in pragmatic marker use (Bowden et al., 2017).
Screenwriting Feedback Quality: Evaluated in two-session studies with professional screenwriters, using within-participant designs, Likert scales on usability and DGs, and Wilcoxon signed-rank tests with reported effect sizes. DuoDrama’s ExReflect consistently outperforms experience-only, eval-only, and industry-standard feedback on revision motivation ($W=0$, $p=.004$, $r=.78$), specificity, emotional and relational insight, and richness of reflection (Tang et al., 5 Feb 2026).
LLM Agent Roleplay Metrics: Measured by Critic-LLM scoring on behavioral change, introspection, divergence, and adaptation, with the inclusion of intrapsychic Superego processes quantitatively increasing all metrics in both biographical and detective suspense scenarios (Magee et al., 2024).

Select error analyses indicate, for instance, that overuse of hedges can reduce perceived naturalness, and that contextually ungrounded confirmation tags may undermine dialog realism (Bowden et al., 2017).

6. Theoretical Implications and Future Research Trajectories

DuoDrama provides a computational instantiation of performance theory, synthesizing the Stanislavskian ideal of full character immersion (internal experience) with the Brechtian principle of critical distance (external evaluation). The multi-agent and reflective feedback paradigms engender both intrapsychic and intersubjective performativity, supporting conceptualizations of AI character as a dynamic, self-crafting system rather than a fixed prompt (Magee et al., 2024).

Broader applicability spans any domain requiring alternation or synthesis of situated experience and analytic oversight, such as educational and UX feedback systems. Open research directions include longitudinal deployments to observe skill development, multimodal (visual or acoustic) grounding of experience, and recursive critique architectures (hierarchical layers of Superego/Critic agents) (Tang et al., 5 Feb 2026). A plausible implication is the systematic enrichment of LLM-based agents with regulated internal conflict and performativity, yielding more tractable models of artificial subjectivity and creative co-authorship.