Design a controlled self-play study to distill piece values

Design and execute a controlled experimental study that uses computer chess programs, preferably reinforcement-learning systems without built-in piece values, to estimate material values. The study would (i) generate positions at random and observe self-play outcomes, and (ii) run matched “what if” experiments that add or remove a single piece from randomly selected positions. The matched pairs yield a balanced dataset and mitigate confounding from player skill and position selection.

Background

Observational data from human games confound material imbalance with player skill and with how positions came to be selected. Sampled snapshots may also fail to be quiescent, meaning captures may still be pending, which can bias estimated piece values. The paper suggests distilling values from computer self-play instead, both to control this confounding and to test the causal effect of specific material changes.

The proposed design randomizes positions and applies matched interventions: a piece is added to or removed from a position, and outcomes with and without the intervention are compared to isolate the effect of the material imbalance. The paper leaves this study for further research.
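The matched comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's method: `matched_effect`, `toy_play_out`, and `remove_white_knight` are hypothetical names, a position is reduced to a single material balance in pawns, and the self-play engine is replaced by a toy logistic coin flip. In a real study `play_out` would run engine self-play from an actual board position.

```python
import math
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the toy demo is repeatable


def matched_effect(positions, intervene, play_out, games_per_arm=20):
    """Paired estimate of how `intervene` changes White's expected score.

    Each position is played out with and without the intervention, so every
    treated position has its own control and position selection cancels out.
    Returns (mean paired difference, standard error of that mean).
    """
    diffs = []
    for pos in positions:
        base = mean(play_out(pos) for _ in range(games_per_arm))
        treated = mean(play_out(intervene(pos)) for _ in range(games_per_arm))
        diffs.append(treated - base)
    return mean(diffs), stdev(diffs) / math.sqrt(len(diffs))


def toy_play_out(balance):
    """ASSUMPTION: stand-in for self-play. White wins with logistic probability
    in the material balance (White minus Black, in pawns); draws are ignored."""
    p_white = 1.0 / (1.0 + math.exp(-0.5 * balance))
    return 1.0 if rng.random() < p_white else 0.0


def remove_white_knight(balance):
    """ASSUMPTION: the intervention is modeled as a 3-pawn drop for White."""
    return balance - 3


# Randomly selected starting "positions" (toy material balances).
positions = [rng.randint(-2, 2) for _ in range(200)]
effect, se = matched_effect(positions, remove_white_knight, toy_play_out)
print(f"estimated change in White's score: {effect:.3f} (s.e. {se:.3f})")
```

Because both arms start from the same positions and use the same player, the paired difference reflects the material change alone. The 3-pawn drop is built into the toy model, so the output demonstrates only the estimator, not a piece value.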

References

We could also perform “what if” experiments by adding or removing one piece from randomly selected positions, observing outcomes with and without the piece, to balance the dataset. We leave such a study for further research.

Inferring Piece Value in Chess and Chess Variants (2509.04691 - Pav, 4 Sep 2025) in Introduction, Section 1, item 4