Ideal Linear Probes: Criteria and Identification
Characterize the properties that define an ideal linear probe β_W for a binary concept W in softmax-based representation spaces, where the concept probability is modeled as P(W=1 | λ) = σ(β_W^T λ + b_W). Develop principled procedures to identify or estimate such probes from data.
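To make the model form concrete, here is a minimal sketch of one common baseline for estimating a probe: maximum-likelihood logistic regression fitted by gradient descent. It is an illustration, not the procedure proposed in the source. The synthetic data, the function name `fit_linear_probe`, the L2 penalty, and all hyperparameters are assumptions chosen for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_linear_probe(lam, w, lr=0.5, steps=2000, l2=1e-3):
    """Estimate (beta_W, b_W) in P(W=1 | lam) = sigmoid(beta_W^T lam + b_W).

    Hypothetical helper: plain gradient descent on the mean logistic loss
    with a small L2 penalty on beta (the penalty is an assumption).
    """
    n, d = lam.shape
    beta = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        p = sigmoid(lam @ beta + b)               # predicted P(W=1 | lam)
        grad_beta = lam.T @ (p - w) / n + l2 * beta
        grad_b = np.mean(p - w)
        beta -= lr * grad_beta
        b -= lr * grad_b
    return beta, b

# Synthetic stand-in for representations lam and binary concept labels W.
rng = np.random.default_rng(0)
d, n = 8, 4000
true_beta = rng.normal(size=d)
true_b = 0.3
lam = rng.normal(size=(n, d))
w = (rng.random(n) < sigmoid(lam @ true_beta + true_b)).astype(float)

beta_hat, b_hat = fit_linear_probe(lam, w)

# Compare the estimated direction with the generating one.
cosine = beta_hat @ true_beta / (
    np.linalg.norm(beta_hat) * np.linalg.norm(true_beta)
)
accuracy = np.mean((sigmoid(lam @ beta_hat + b_hat) > 0.5) == (w == 1))
print(f"cosine(beta_hat, true_beta) = {cosine:.3f}, accuracy = {accuracy:.3f}")
```

In this toy setting the labels are generated by the same logistic model, so a generating direction exists to compare against. With real representations no such ground truth is available, and a probe that predicts well is not thereby shown to be ideal, which is the gap the problem statement points to.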
References
In general, it is unclear what makes an ideal probe or how best to identify one.
— The Information Geometry of Softmax: Probing and Steering
(2602.15293 - Park et al., 17 Feb 2026) in Section 3 (Dual Steering with a Linear Probe), paragraph following Eq. (eq:linear_probe)