- The paper establishes a robust computational framework for comparing legal regimes in data markets using an empirically calibrated agent-based model.
- It employs a novel LLM-driven discrete choice experiment to parameterize agent decisions and validate the simulation against real market data.
- Results indicate that buyer-shared liability yields larger welfare gains than consent or property-style relief.
Computational Policy Laboratory for Data Law: Agent-Based Modeling of Legal Regimes in Data Markets
Introduction and Motivation
This paper presents a rigorous computational framework for empirically analyzing the effects of legal regimes on data markets, addressing the persistent opacity and empirical intractability that have stymied regulatory and doctrinal progress. The authors construct a high-fidelity agent-based model (ABM) of the Chinese institutional data market, parameterized via a novel LLM-driven discrete choice experiment (DCE), to simulate and compare the welfare and market outcomes of competing legal rules—ranging from seller-centric liability to property-style carve-outs and two-sided liability splits. The approach is motivated by the structural failures of traditional empirics in data law: nonrival data, externalities, and the breakdown of price discovery, compounded by the technical fragility of anonymization and the systematic concealment of real-world transactions.
Model Architecture and Calibration
Geographical and Agent Structure
The ABM discretizes the Chinese economy into 14,526 hexagonal cells (Figure 1), each representing a potential industrial cluster. Buyer and seller agents are distributed across these cells according to empirical data on AI enterprises and hospital locations (Figure 2), stratified into capability tiers to reflect real-world heterogeneity.
Figure 1: Hexagonal grid with 20 km radius, partitioning the Chinese economy for agent-based simulation.
Figure 2: Distribution of agents, showing spatial heterogeneity in buyer and seller locations and tiers.
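The hexagonal discretization can be illustrated with a minimal sketch in axial coordinates. This is not the authors' code: it assumes that "20 km radius" means the circumradius of a pointy-top hexagon, and the function names are hypothetical.

```python
import math

HEX_RADIUS_KM = 20.0  # assumption: "radius" read as circumradius (centre to corner)


def hex_center(q, r, radius=HEX_RADIUS_KM):
    """Centre (x, y) in km of the pointy-top hexagon at axial coordinates (q, r)."""
    x = radius * math.sqrt(3) * (q + r / 2)
    y = radius * 1.5 * r
    return x, y


def hex_steps(a, b):
    """Number of cell-to-cell moves between two axial coordinates."""
    dq, dr = a[0] - b[0], a[1] - b[1]
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2


def center_distance_km(a, b, radius=HEX_RADIUS_KM):
    """Straight-line distance in km between the centres of two cells."""
    xa, ya = hex_center(a[0], a[1], radius)
    xb, yb = hex_center(b[0], b[1], radius)
    return math.hypot(xa - xb, ya - yb)
```

Under this reading, adjacent cell centres sit 20·√3 km apart (about 34.6 km), and a quantity like center_distance_km is the sort of input a distance term in the buyer's decision rule would consume.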
Agent Decision Rules
Buyers' willingness-to-pay (WTP) is modeled via a random-coefficient logit, incorporating data volume, seller reputation, buyer capability, price sensitivity, and geographic distance. Sellers' willingness-to-accept (WTA) is a function of fixed and variable costs, risk level (endogenously assigned via ordered logit based on data volume), enforcement intensity (dynamically updated based on local transaction activity), and institutional tier. The model explicitly encodes risk–enforcement interactions, ensuring that regulatory intensity and transaction scale jointly determine reservation prices.
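A rough sketch of how such decision rules might be coded is given here. The functional forms, coefficient values, and names (BuyerTaste, draw_taste, seller_wta, and so on) are illustrative assumptions, not the paper's specification; only the list of inputs follows the description above.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class BuyerTaste:
    volume: float      # weight on log data volume
    reputation: float  # weight on seller reputation
    capability: float  # weight on the buyer's own capability tier
    price: float       # magnitude of price sensitivity (kept positive)
    distance: float    # disutility per km


def draw_taste(rng):
    """Random coefficients for one buyer; means and spreads are placeholders."""
    return BuyerTaste(
        volume=rng.gauss(1.0, 0.2),
        reputation=rng.gauss(0.5, 0.1),
        capability=rng.gauss(0.3, 0.1),
        price=math.exp(rng.gauss(0.0, 0.25)),  # log-normal keeps it positive
        distance=abs(rng.gauss(0.002, 0.0005)),
    )


def non_price_utility(t, volume, reputation, capability, distance_km):
    return (t.volume * math.log(volume) + t.reputation * reputation
            + t.capability * capability - t.distance * distance_km)


def buyer_wtp(t, volume, reputation, capability, distance_km):
    """Price at which the buyer is indifferent between buying and not buying."""
    return max(0.0, non_price_utility(t, volume, reputation, capability, distance_km) / t.price)


def purchase_probability(t, price, volume, reputation, capability, distance_km):
    """Binary logit acceptance probability for a posted price."""
    v = non_price_utility(t, volume, reputation, capability, distance_km) - t.price * price
    return 1.0 / (1.0 + math.exp(-v))


def seller_wta(fixed_cost, unit_cost, volume, risk_level, enforcement,
               tier_markup=0.0, risk_w=0.5, enf_w=0.3, interaction_w=0.2):
    """Reservation price: costs plus a risk/enforcement exposure term with interaction."""
    exposure = (risk_w * risk_level + enf_w * enforcement
                + interaction_w * risk_level * enforcement)
    return fixed_cost + unit_cost * volume + exposure + tier_markup
```

A match would clear whenever buyer_wtp is at least seller_wta. The interaction term is what makes enforcement more costly for higher-risk transactions, which is the joint dependence the text describes.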
LLM-Based Discrete Choice Calibration
To overcome the absence of observable market microdata, the authors employ a DCE using DeepSeek, a frontier LLM, as a proxy for unsurveyable populations. The DCE manipulates transaction attributes (price, data scale, risk, enforcement, agent tier) in factorial design, extracting interpretable preference primitives for both buyers and sellers. Posterior estimation is performed via MCMC, yielding empirically disciplined coefficients for agent decision rules.
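The calibration loop can be sketched as a full-factorial design followed by a random-walk Metropolis sampler for a binary logit. Everything in this sketch is an assumption made for illustration: the attribute levels, the prior, the sampler settings, and above all the simulate_choices function, which generates synthetic accept/reject answers in place of real LLM responses. The summary does not specify the authors' sampler or model form.

```python
import itertools
import math
import random

# Hypothetical attribute levels; the paper's actual design is not reproduced here.
LEVELS = {
    "price": [1.0, 2.0, 3.0],
    "risk": [0.0, 1.0, 2.0],
    "enforcement": [0.0, 1.0],
}
NAMES = list(LEVELS)


def full_factorial(levels):
    """Every combination of attribute levels, one dict per choice profile."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in itertools.product(*levels.values())]


def utility(beta, profile):
    """Intercept beta[0] plus one linear term per attribute."""
    return beta[0] + sum(b * profile[n] for b, n in zip(beta[1:], NAMES))


def log_sigmoid(v):
    """Numerically stable log(1 / (1 + exp(-v)))."""
    if v >= 0:
        return -math.log1p(math.exp(-v))
    return v - math.log1p(math.exp(v))


def simulate_choices(profiles, true_beta, reps, seed=1):
    """Synthetic stand-in for LLM answers: accept with logit probability."""
    rng = random.Random(seed)
    data = []
    for prof in profiles:
        p = math.exp(log_sigmoid(utility(true_beta, prof)))
        data += [(prof, rng.random() < p) for _ in range(reps)]
    return data


def log_posterior(beta, data, prior_sd=5.0):
    """Logit log-likelihood plus an independent normal prior on each coefficient."""
    lp = -sum(b * b for b in beta) / (2 * prior_sd ** 2)
    for prof, accepted in data:
        v = utility(beta, prof)
        lp += log_sigmoid(v) if accepted else log_sigmoid(-v)
    return lp


def metropolis(data, n_iter=3000, step=0.15, seed=0):
    """Random-walk Metropolis; returns the second half of the chain."""
    rng = random.Random(seed)
    beta = [0.0] * (len(NAMES) + 1)
    current = log_posterior(beta, data)
    draws = []
    for _ in range(n_iter):
        proposal = [b + rng.gauss(0.0, step) for b in beta]
        candidate = log_posterior(proposal, data)
        if rng.random() < math.exp(min(0.0, candidate - current)):
            beta, current = proposal, candidate
        draws.append(beta)
    return draws[n_iter // 2:]


def posterior_means(draws):
    return [sum(d[i] for d in draws) / len(draws) for i in range(len(draws[0]))]
```

In use, one would build the profiles with full_factorial(LEVELS), collect one accept/reject response per profile and replicate, and pass the resulting (profile, accepted) pairs to metropolis. The posterior means then play the role of the coefficients fed into the agent decision rules.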
Validation
The ABM is externally validated against stylized facts of China's data market: coastal concentration, distance decay, and hub formation in trade arcs. It also reproduces the low share of platform-mediated exchange (≈4%), in line with external estimates. Reproducing these spatial and temporal regularities supports the model's credibility for policy counterfactuals.
Comparative Institutional Analysis
The baseline regime assigns full liability for risk and enforcement to sellers, mirroring GDPR and PIPL doctrine. Platform-mediated exchange overlays search and governance infrastructure but does not shift legal incidence. In the simulations, platform mediation yields negligible average effects on trade volume, surplus, or welfare (Figure 3), with only episodic gains in upper-tail outcomes.
Figure 3: Models with liability on sellers (t=100), showing distributional similarity between baseline and platform-mediated exchange.
Property-Style Carve-Outs and Consent Gates
Three rules externalize risk to third parties: low-risk carve-outs (anonymization exemption), informed consent (property-rule gate), and risk immunity (seller entitlement). The low-risk carve-out is non-distortive but does not systematically increase welfare. Informed consent sharply reduces market activity and welfare, while risk immunity expands trade and surplus but at the cost of increased externalities, leaving total welfare statistically unchanged (Figure 4).
Figure 4: Models with externality on third party (t=100), highlighting the trade-off between expanded trade and increased externalities.
Figure 5: Trade arcs in platform-mediated exchange (t=100), illustrating spatial patterns under platform mediation.
Figure 6: Trade arcs in low-risk carve-out (t=100), showing expanded but risk-externalized trade.
Figure 7: Trade arcs in informed consent (t=100), with reduced market activity due to consent gating.
Figure 8: Trade arcs in risk immunity (t=100), with increased trade and externalized risk.
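The trade-off between expanded trade and increased externalities comes down to simple welfare accounting: total welfare is private surplus minus harm borne by third parties. The sketch below uses made-up numbers and a hypothetical Trade record to show how a regime can double surplus yet leave welfare unchanged. It is an illustration of the accounting, not output from the paper's model.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    wtp: float            # buyer's willingness-to-pay
    wta: float            # seller's reservation price under the regime
    price: float          # agreed price, between wta and wtp
    external_harm: float  # expected harm falling on third parties


def welfare_summary(trades):
    """Private surplus, third-party harm, and net welfare for a set of trades."""
    buyer = sum(t.wtp - t.price for t in trades)
    seller = sum(t.price - t.wta for t in trades)
    harm = sum(t.external_harm for t in trades)
    return {
        "trades": len(trades),
        "surplus": buyer + seller,
        "externality": harm,
        "welfare": buyer + seller - harm,
    }


# Made-up example: the seller prices in risk under the baseline, while risk
# immunity lowers reservation prices and pushes the harm onto third parties.
baseline = [Trade(wtp=10.0, wta=6.0, price=8.0, external_harm=0.0)]
immunity = [
    Trade(wtp=10.0, wta=4.0, price=7.0, external_harm=2.0),
    Trade(wtp=6.0, wta=4.0, price=5.0, external_harm=2.0),
]
```

Here welfare_summary(immunity) reports twice the trades and twice the surplus of the baseline, but the same net welfare once the externality is subtracted.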
Buyer-Shared and Two-Sided Liability
The most robust welfare gains arise when liability for risk (and optionally enforcement) is shared with buyers, reflecting the "least-cost avoider" principle and aligning with contemporary legal doctrine (GDPR processor liability, HIPAA business associate rules). Parametric variation in the buyer's liability share reveals monotonic increases in the number of trades and traded volume, with statistically significant welfare gains under risk-only sharing (Figure 9, Figure 10). When enforcement is also shared, throughput increases but welfare gains attenuate, indicating that the added compliance costs offset part of the surplus expansion.
Figure 9: Effects of buyer-side liability splits on throughput indicators, demonstrating monotonic increases in trades and volume.
Figure 10: Effects of buyer-side liability splits on welfare, with risk-only sharing yielding positive welfare slopes.
Figure 11: Trade arcs in buyer-shared risk (t=100), showing expanded and more efficient trade patterns.
Figure 12: Trade arcs in two-sided liability split (t=100), with both risk and enforcement shared.
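The least-cost-avoider logic behind these liability splits can be shown with a toy sweep over the buyer's share. The assumption carrying the result is buyer_mitigation < 1, meaning a unit of expected risk costs the buyer less to handle than it costs the seller. That parameter, the function names, and the numbers are all hypothetical and are not taken from the paper.

```python
def joint_gain(wtp, base_cost, risk_cost, buyer_share, buyer_mitigation=0.6):
    """Gains from trade when the buyer bears `buyer_share` of expected risk cost.

    buyer_mitigation < 1 encodes the assumption that the buyer can avoid
    downstream harm more cheaply than the seller.
    """
    seller_burden = (1.0 - buyer_share) * risk_cost
    buyer_burden = buyer_share * buyer_mitigation * risk_cost
    return wtp - base_cost - seller_burden - buyer_burden


def sweep(pairs, shares, buyer_mitigation=0.6):
    """For each liability share, count the trades that clear and sum their gains."""
    rows = []
    for share in shares:
        gains = [joint_gain(w, c, r, share, buyer_mitigation) for w, c, r in pairs]
        cleared = [g for g in gains if g >= 0]
        rows.append({"share": share, "trades": len(cleared), "welfare": sum(cleared)})
    return rows


# Made-up (wtp, base_cost, risk_cost) triples for three buyer-seller pairs.
PAIRS = [(10.0, 5.0, 4.0), (8.0, 5.0, 4.0), (7.0, 5.0, 3.0)]
```

Calling sweep(PAIRS, [0.0, 0.5, 1.0]) shows trades and welfare rising weakly with the buyer's share. With buyer_mitigation set to 1.0 the split becomes a pure transfer and the gains disappear, which is why the cost asymmetry is the key assumption here.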
Theoretical and Practical Implications
The computational laboratory shows that property-style relief does not reliably raise welfare: the anonymization carve-out leaves outcomes largely unchanged, consent gates suppress trade, and risk immunity expands trade only by shifting harms onto third parties, so those harms are never internalized. In contrast, regimes that assign substantive risk to buyers, who are best positioned to mitigate downstream harms, induce efficient safeguards, increase welfare, and sustain trade. This lends simulation-based support to the legal drift toward two-sided reachability and joint liability, moving beyond intuition and doctrinal conjecture to controlled, comparative evidence.
The pipeline (fieldwork, LLM-DCE calibration, and ABM simulation) offers a reproducible methodology for legal-institutional analysis in opaque markets. It enables direct testing of policy counterfactuals, replacing armchair speculation with comparative evidence and providing actionable guidance for regulatory design.
Future Directions
The approach can be extended to other jurisdictions, data types, and institutional settings, with LLM-based calibration offering scalable access to elite or otherwise unreachable populations. Further work may integrate dynamic learning, richer network structures, and endogenous innovation in agent strategies. The framework is adaptable to evolving legal doctrines and technical safeguards, supporting ongoing empirical evaluation as the data economy matures.
Conclusion
This paper establishes a computational paradigm for data law, arguing that agent-based modeling, empirically disciplined via LLM-driven discrete choice experiments, can help resolve the epistemic impasse in legal analysis of data markets. The results show that neither consent nor property-style relief reliably raises welfare; instead, risk internalization via buyer-shared liability performs best among the regimes simulated. The methodology provides a scalable, transparent engine for comparative institutional analysis, with direct implications for regulatory design and the future of empirical legal studies in the AI economy.