Implicit query expansion effect of pre-training prompts in ColBERT
Determine whether including the "search_query:" and "search_document:" prompt tokens during Nomic Embed contrastive pre-training acts as implicit query expansion in ColBERT late-interaction retrieval models. Quantify how much this mechanism contributes to retrieval performance relative to models trained without prompts or with only the [Q]/[D] markers.
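One way to start probing this question is to split the late-interaction (MaxSim) score into the part that comes from the prompt token positions and the part that comes from the actual query tokens. The sketch below is only an illustration, not the method of the cited paper: the function names, the toy embeddings, and the assumption that the prompt tokens occupy the first n_prompt_tokens query positions are all hypothetical. It also measures only the direct contribution of the prompt positions, not any indirect effect the prompt may have on the other token embeddings.

```python
import numpy as np


def maxsim_per_token(query_emb: np.ndarray, doc_emb: np.ndarray) -> np.ndarray:
    """Return one MaxSim term per query token.

    Each term is the maximum cosine similarity between that query token
    and any document token. The ColBERT score is the sum of these terms.
    """
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    return (q @ d.T).max(axis=1)


def split_maxsim(query_emb: np.ndarray, doc_emb: np.ndarray, n_prompt_tokens: int):
    """Split the MaxSim score into (prompt part, content part).

    Assumption: the prompt tokens are the first n_prompt_tokens rows of
    query_emb. Real tokenizers may place them differently.
    """
    terms = maxsim_per_token(query_emb, doc_emb)
    prompt_score = float(terms[:n_prompt_tokens].sum())
    content_score = float(terms[n_prompt_tokens:].sum())
    return prompt_score, content_score


# Toy example: 3 query token embeddings (the first one stands in for a
# prompt token) and a single document token embedding.
query_emb = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
doc_emb = np.array([[1.0, 0.0]])
prompt_score, content_score = split_maxsim(query_emb, doc_emb, n_prompt_tokens=1)
```

If the prompt positions carried a consistently large share of the score on relevant documents, that would be compatible with the implicit query expansion conjecture, although it would not establish it. A comparison against models trained without prompts would still be needed.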
References
We conjecture this may be a form of implicit query expansion, a mechanism that has shown very useful in the early variant of ColBERT.
— ColBERT-Zero: To Pre-train Or Not To Pre-train ColBERT models
(2602.16609 - Chaffin et al., 18 Feb 2026) in Section 3.2 (Impact of the Prompt)