Optimality of using the backbone’s native hidden size as embedding dimension

Determine whether using the transformer backbone's native hidden size as the embedding dimension is optimal for bi-encoder dense retrieval models, which encode queries and documents into single vectors and score them via inner products. Quantify how retrieval performance changes when the embedding dimension is expanded beyond, or compressed below, the backbone's hidden size.
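The bi-encoder setup described above can be sketched numerically: pooled backbone hidden states are projected to a target embedding dimension (here smaller than the native hidden size), and relevance is the inner product between query and document vectors. This is a minimal illustration, not the paper's implementation; the random projection stands in for a learned linear layer, and the sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 768    # backbone's native hidden size (e.g., BERT-base)
EMB_DIM = 128   # compressed embedding dimension (hypothetical choice)

# Hypothetical pooled backbone outputs for 1 query and 4 documents.
query_hidden = rng.standard_normal(HIDDEN)
doc_hidden = rng.standard_normal((4, HIDDEN))

# A learned linear projection would map hidden states to the target
# embedding dimension; a random matrix is used here for illustration only.
W = rng.standard_normal((HIDDEN, EMB_DIM)) / np.sqrt(HIDDEN)

q = query_hidden @ W            # query embedding, shape (EMB_DIM,)
d = doc_hidden @ W              # document embeddings, shape (4, EMB_DIM)

scores = d @ q                  # inner-product relevance scores, shape (4,)
ranking = np.argsort(-scores)   # documents ordered by decreasing score
```

Expanding rather than compressing simply means choosing `EMB_DIM` larger than `HIDDEN`; either way, index storage and scoring cost scale with `EMB_DIM`, which is why the default choice matters.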

Background

The paper highlights that, in practice, dense retrieval systems often default to using the encoder’s native hidden size (e.g., 768 for BERT-base) as the embedding dimension, despite this choice directly affecting index storage and retrieval compute. The authors explicitly note that it is unclear whether this default is optimal or how altering the dimension (via expansion or compression) changes effectiveness.

This uncertainty motivates their empirical study of embedding-dimension scaling across multiple model families and sizes, aiming to characterize performance as dimensions vary and to provide guidance beyond the commonly adopted default.

References

Despite the impact of embedding dimension on efficiency, practitioners often rely on the "native" hidden size of a transformer backbone (e.g., 768 for BERT-base), though it is unclear whether this choice is optimal or how performance is impacted as dimensions are expanded or compressed.

Scaling Laws for Embedding Dimension in Information Retrieval  (2602.05062 - Killingback et al., 4 Feb 2026) in Introduction (Section 1)