AI Voting Assistance in Collective Decisions
- AI voting assistance is a computational approach using LLMs and related architectures to offer both advisory support and automated ballot generation in diverse voting contexts.
- It supports various ballot formats, including approval, ranked, and cumulative voting, paired with proportional aggregation rules such as Equal Shares to enhance representational fidelity.
- The system is rigorously evaluated using metrics such as Jaccard similarity, Kendall’s Tau, and bias amplification factors to ensure alignment with human intent and mitigate systemic biases.
AI voting assistance encompasses computational systems—typically based on LLMs and related AI architectures—that facilitate, augment, or automate voting processes in collective decision-making scenarios. These systems range from AI agents that summarize choices or platforms to those that generate ballots or make voting recommendations, and recently include AI "representatives" that vote on behalf of individuals who abstain from participation. Such technologies are increasingly evaluated both for direct voting (e.g., participatory budgeting, policy elections) and for expert decision-making contexts (e.g., medical diagnostics), with a strong focus on resilience to bias, alignment with human intent, and safeguarding democratic legitimacy (Yang et al., 2024, Gu et al., 2024, Pournaras, 19 Dec 2025, Majumdar et al., 2024).
1. Definitions, Scope, and Key Use Cases
AI voting assistance consists of two principal modalities:
- Advisory: Systems provide information, summarize agendas, or generate ranked recommendations to human voters.
- Automated Representation: AI models construct and submit ballots or direct votes on behalf of absent or abstaining individuals (AI personas), using preference data, prior voting history, or survey-derived traits (Pournaras, 19 Dec 2025).
Key deployment contexts include direct democratic processes (e.g., participatory budgeting elections), high-stakes expert decisions (e.g., pathology diagnosis), and simulated proxies in low-turnout or voter-fatigue scenarios (Yang et al., 2024, Gu et al., 2024, Majumdar et al., 2024).
2. Voting Methods, Ballot Aggregation, and Algorithmic Workflows
AI voting assistants are typically integrated with multiple ballot formats and aggregation schemes. Empirical studies have benchmarked performance across:
- Approval Voting: Unlimited selection of preferred options.
- k-Approval Voting: Select exactly k items (e.g., 5-Approval).
- Cumulative (Score) Voting: Distribute points (e.g., 10 points across 24 projects).
- Ranked Voting: Explicit rank orders (with classical Borda count for aggregation).
Aggregate outcomes are produced using methods such as:
| Method | Format Supported | Aggregation Rule |
|---|---|---|
| Approval / k-Approval | Binary/limited approval | Summing votes per candidate |
| Ranked | Ordinal ranks | Borda count (rank r of m candidates earns m − r points) |
| Cumulative | Real-valued allocations | Normalization then cost-based selection |
| Proportional (Equal Shares) | Expressive/cumulative | Iterative per-capita cost sharing until budget spent |
Among these, proportional aggregation algorithms such as Equal Shares (also known as Rule X) have demonstrated superior consistency between human and AI-generated collective outcomes, especially under abstention and model bias (Majumdar et al., 2024, Pournaras, 19 Dec 2025).
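The iterative per-capita cost sharing behind Equal Shares can be sketched as follows for approval ballots. This is a simplified illustration (uniform tie-breaking, no completion step), not the exact implementation used in the cited studies:

```python
def equal_shares(voters, projects, budget):
    """Method of Equal Shares (Rule X) for approval ballots.
    voters: dict voter_id -> set of approved project ids
    projects: dict project_id -> cost
    Each voter starts with an equal share of the budget; a project is
    funded when its supporters can cover its cost, each paying the
    smallest possible equal per-capita price rho."""
    n = len(voters)
    balance = {v: budget / n for v in voters}
    winners, remaining = [], dict(projects)
    while remaining:
        best, best_rho = None, float("inf")
        for p, cost in remaining.items():
            supporters = [v for v in voters if p in voters[v]]
            if sum(balance[v] for v in supporters) < cost:
                continue  # unaffordable even if supporters pay everything
            # smallest rho with sum(min(balance[v], rho)) == cost:
            # voters with balance below rho pay their full balance
            supporters.sort(key=lambda v: balance[v])
            paid, k, rho = 0.0, len(supporters), None
            for i, v in enumerate(supporters):
                share = (cost - paid) / (k - i)
                if balance[v] >= share:
                    rho = share
                    break
                paid += balance[v]
            if rho is not None and rho < best_rho:
                best, best_rho = p, rho
        if best is None:
            break  # nothing affordable remains
        for v in voters:
            if best in voters[v]:
                balance[v] -= min(balance[v], best_rho)
        winners.append(best)
        del remaining[best]
    return winners
```

Because every funded project is paid for by its own supporters out of equal endowments, no bloc of AI-proxy ballots can spend more than its per-capita share of the budget, which is the mechanism behind the robustness results discussed below.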
3. Evaluation Metrics: Alignment, Consistency, and Diversity
Evaluation of AI voting assistance leverages quantitative measures to assess alignment with human choices, resistance to bias, and diversity in collective choice.
- Jaccard Similarity: Measures overlap between sets of selected options, J(A, B) = |A ∩ B| / |A ∪ B| (Yang et al., 2024, Majumdar et al., 2024).
- Kendall’s Tau: Assesses rank correlation between aggregate human and AI orderings, with τ = 1 for identical and τ = −1 for reversed rankings.
- Preference Diversity: Captured via mean pairwise Jaccard distance D = 1 − J, or (outside the original studies) entropy and Gini–Simpson indices.
- Consistency Recovery: Quantifies how well AI can replenish voter abstention by restoring overlap with original human outcome sets (Majumdar et al., 2024).
- Bias Amplification Factor (BAF): Evaluates whether AI increases existing outcome skews (e.g., systematic regional underrepresentation) (Pournaras, 19 Dec 2025).
- Appropriateness of Reliance: In expert-AI collaboration, appropriate reliance is measured by Relative AI Reliance (RAIR) and Relative Self-Reliance (RSR), assessing when experts accept or override correct or incorrect AI advice (Gu et al., 2024).
These metrics form the foundation of rigorous audit protocols for both system designers and electoral regulators.
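The set- and rank-based metrics above are standard and straightforward to compute; a minimal sketch (using Kendall's tau-a, without tie correction) is:

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity J(A, B) = |A ∩ B| / |A ∪ B| between option sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

def kendall_tau(rank_x, rank_y):
    """Kendall's tau-a over a common item set; rank_x, rank_y: item -> rank."""
    concordant = discordant = 0
    for i, j in combinations(list(rank_x), 2):
        s = (rank_x[i] - rank_x[j]) * (rank_y[i] - rank_y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n = len(rank_x)
    return (concordant - discordant) / (n * (n - 1) / 2)

def mean_jaccard_distance(ballots):
    """Preference diversity D: mean pairwise 1 - J over approval ballots."""
    pairs = list(combinations(ballots, 2))
    return sum(1 - jaccard(a, b) for a, b in pairs) / len(pairs)
```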
4. Model Behaviors, Biases, and Parameter Trade-offs
Extensive empirical analysis reveals emergent behaviors in LLM-based agents:
- Bias and List-Order Sensitivity: Models such as LLaMA-2 and GPT-4 have shown systematic biases (e.g., over-selection of certain project types) and pronounced sensitivity to candidate order, impacting alignment (Kendall’s τ drops from ~0.7 to –0.2 for reversed order in LLaMA-2) (Yang et al., 2024).
- Temperature-Diversity Trade-off: For GPT-4 Turbo, increasing temperature (T) from 1.0 to 1.5 increases mean preference diversity (mean Jaccard distance from 0.44→0.67), with a slight drop in alignment (τ from 0.39→0.37). LLaMA-2 does not align with human votes at any T but shows increased diversity at higher T (Yang et al., 2024).
- Chain-of-Thought (CoT) Prompting: CoT increases response explainability but offers no prediction accuracy benefit (Jaccard similarity ~0.30 with or without CoT), demonstrating its potential for transparency rather than correctness (Yang et al., 2024).
- Ensemble and Persona Engineering: Diversity and alignment can be improved via prompt ensembles (varying temperature or persona), calibrated narrative injection (based on user traits), and randomized candidate orderings. Direct demographic attributes are avoided in favor of explicit, non-sensitive preference statements (Yang et al., 2024, Majumdar et al., 2024).
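The ensemble and order-randomization mitigations above can be sketched as a wrapper around an arbitrary model call. The `model_fn` interface below is a hypothetical placeholder standing in for an LLM query, not a specific API:

```python
import random

def ensemble_ballots(model_fn, candidates, temperatures=(0.7, 1.0, 1.3),
                     runs_per_temp=3, seed=0):
    """Debiasing sketch: query the model repeatedly with a randomized
    candidate order (to break list-order sensitivity) and varied
    temperature (to diversify preferences), then pool approvals.
    model_fn(candidate_list, temperature) -> set of approved candidates
    is a hypothetical interface; the temperature grid is illustrative."""
    rng = random.Random(seed)
    counts = {c: 0 for c in candidates}
    total = 0
    for t in temperatures:
        for _ in range(runs_per_temp):
            order = candidates[:]
            rng.shuffle(order)  # randomize presentation order per query
            for c in model_fn(order, t):
                counts[c] += 1
            total += 1
    # approve a candidate only if a majority of ensemble runs approved it
    return {c for c, k in counts.items() if k > total / 2}
```

Majority pooling over shuffled orderings filters out approvals that depend on where a candidate happened to appear in the list, which is the failure mode reported for LLaMA-2 above.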
5. Resilience of Collective Choice: The Role of Proportional Representation
Recent real-world studies demonstrate that proportional methods, specifically Equal Shares, are highly robust to biases and inconsistencies introduced by AI voting assistance:
- Consistency: Under majoritarian rules, the overlap (Consistency) between human and AI-augmented winning sets declines substantially as the proportion of AI-proxy votes rises (from 1.0 to 0.68 at 50% abstention). Equal Shares maintains Consistency across all regimes (Pournaras, 19 Dec 2025, Majumdar et al., 2024).
- Representation Error (RE): Under majoritarian aggregation, RE grows linearly with AI representation (up to 0.25 at full AI replacement), but under Equal Shares, RE remains ≤0.07 across all conditions (Pournaras, 19 Dec 2025).
- Bias Amplification: BAF for spatial underfunding is 1.35 (majoritarian) versus 1.08 (Equal Shares), suggesting proportional rules temper both bias and inconsistency under AI voting agents (Pournaras, 19 Dec 2025).
- Collective Resilience to Abstention: Filling abstention with AI proxies under Equal Shares recovers >53% of lost consistency even at high abstention levels, outperforming greedy/majority aggregation by >18% on key outcome sets (Majumdar et al., 2024).
- Legitimacy and Democratic Values: Proportional methods not only improve robustness but also promote higher perceived fairness, acceptance, and representational breadth, forming a strong buffer against manipulation or delegitimation via systematic model errors or bias (Pournaras, 19 Dec 2025).
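One plausible formalization of the bias amplification factor compares how far each outcome deviates from proportional representation of a population attribute (e.g., funding share per region). The cited study's exact formula is not reproduced here; this is an illustrative sketch under that assumption:

```python
def bias_amplification_factor(human_alloc, ai_alloc, population):
    """BAF sketch: skew(outcome) = sum over groups g of
    |funded_share_g - population_share_g| (total deviation from
    proportionality); BAF = skew(AI-augmented) / skew(human-only).
    BAF > 1 means AI participation amplified the existing skew."""
    skew_human = sum(abs(human_alloc[g] - population[g]) for g in population)
    skew_ai = sum(abs(ai_alloc[g] - population[g]) for g in population)
    return skew_ai / skew_human
```

For example, if a region holding 50% of the population receives 40% of funding under human-only voting but 35% under AI-augmented voting, the skew grows from 0.2 to 0.3 and BAF = 1.5.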
6. Best Practices, Safeguards, and Recommendations
Systematic deployment of AI voting assistants emphasizes transparency, explainability, and safeguards:
- Fair Methods Integration: Vote advice and ballot construction should utilize expressive ballot formats and proportional sharing rather than single-choice pickers (Pournaras, 19 Dec 2025).
- Auditability: The system must expose prompt templates, voter preference vectors, step-level budget states of the aggregation rule, and full persona-to-vote translations for inspection (Majumdar et al., 2024, Pournaras, 19 Dec 2025).
- Human-in-the-Loop: Voters should review and revise AI-suggested allocations. System “confidence thresholds” can trigger fallback to direct input when model uncertainty is high (Pournaras, 19 Dec 2025).
- Ensemble Diversity and Privacy: Use ensembles of models, randomized traits, and mask sensitive features to diffuse model-specific bias. Direct input of sensitive demographics is avoided unless strictly consented (Yang et al., 2024, Majumdar et al., 2024).
- Metric Monitoring: Real-time dashboards should display alignment (e.g., τ, Jaccard), diversity, and bias metrics against reference populations. Interventions are triggered when diversity falls below threshold (e.g., mean Jaccard D < 0.3) (Yang et al., 2024).
- Regulatory Frameworks: Cap the proportion of AI-generated votes, certify AI voting flow for traceability (including CoT “why” features), and publish open-source code and audit logs for post-election verification (Majumdar et al., 2024).
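A metric-monitoring gate along the lines described above can be sketched in a few lines. Only the diversity threshold (D < 0.3) comes from the cited work; the alignment threshold here is an illustrative assumption:

```python
def check_metrics(tau, diversity_D, tau_min=0.3, diversity_min=0.3):
    """Monitoring sketch: return intervention alerts when alignment or
    diversity drops below threshold. diversity_min = 0.3 follows the
    text; tau_min is an assumed illustrative value."""
    alerts = []
    if diversity_D < diversity_min:
        alerts.append("diversity collapse: fall back to direct voter input")
    if tau < tau_min:
        alerts.append("alignment drift: audit prompts and candidate ordering")
    return alerts
```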
7. Extensions: Expert-AI Collaboration and Broader Impacts
In specialized domains, such as medical diagnostics, AI voting assistance is extended to majority-voting schemes among human experts, each aided by AI:
- Appropriateness of Reliance: Metrics such as RAIR and RSR quantify when experts properly accept correct AI advice or properly reject incorrect recommendations. Majority voting (k=3 experts) increases RAIR by ~9.4% and RSR by ~31.1%, translating to higher diagnostic precision and robustness, and lower variance across collective outcomes (Gu et al., 2024).
- Scalability and Generalization: Majority voting protocols generalize to other high-stakes, sparse-signal tasks involving AI, offering simplicity, anonymity, and democratic participation with minimal coordination overhead (Gu et al., 2024).
- Implications: AI voting assistance, when coupled with robust aggregation and audit, enables scalable, resilient collective decision-making while maintaining legitimacy and representational fidelity even in challenging circumstances (e.g., low turnout, expert heterogeneity, or model drift) (Gu et al., 2024, Pournaras, 19 Dec 2025, Majumdar et al., 2024).
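The expert-AI protocol above reduces to majority voting over k AI-assisted experts, with reliance scored per case. The RAIR/RSR computation below follows their usual definitions (fraction of "human wrong, AI right" cases where the expert switches, and of "human right, AI wrong" cases where the expert holds); it is a hedged sketch, not the cited study's exact protocol:

```python
def majority_vote(decisions):
    """Final decision = modal choice among k AI-assisted experts."""
    return max(set(decisions), key=decisions.count)

def rair_rsr(cases):
    """cases: list of (initial, ai_advice, final, truth) tuples.
    RAIR: of cases where the expert was initially wrong and the AI
    correct, the fraction resolved correctly (appropriate reliance).
    RSR: of cases where the expert was initially correct and the AI
    wrong, the fraction still resolved correctly (appropriate
    self-reliance). Returns None for an empty denominator."""
    ai_right = [f == t for i, a, f, t in cases if i != t and a == t]
    self_right = [f == t for i, a, f, t in cases if i == t and a != t]
    rair = sum(ai_right) / len(ai_right) if ai_right else None
    rsr = sum(self_right) / len(self_right) if self_right else None
    return rair, rsr
```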
In summary, the state-of-the-art in AI voting assistance centers on transparent, explainable, and proportionally representative workflows, integrating model diversity, robust aggregation, and ongoing metric surveillance to safeguard democratic and expert decision-making processes in the presence of both abstention and AI-induced bias (Yang et al., 2024, Gu et al., 2024, Pournaras, 19 Dec 2025, Majumdar et al., 2024).