Impact of unpaired misspelled query variants on training noise
Establish whether training the multilingual Siamese two-tower embedding retriever for e-commerce search on misspelled query variants, without their corresponding clean queries, introduces predominantly noise because such variants lack a well-formed semantic structure from which the model can learn.
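To make the two conditions in this question concrete, the following minimal sketch builds a "paired" training set (each clean query plus its misspelled variant) and an "unpaired" one (the variant alone). It is an illustration under stated assumptions, not the procedure used in the cited paper: the adjacent-character swap noise model, the function names `misspell` and `build_training_sets`, and the sample queries and product IDs are all hypothetical.

```python
import random


def misspell(query: str, rng: random.Random) -> str:
    """Swap one pair of differing adjacent characters (hypothetical noise model)."""
    # Only positions whose neighbours differ can change the string when swapped.
    candidates = [i for i in range(len(query) - 1) if query[i] != query[i + 1]]
    if not candidates:
        return query  # nothing to swap, e.g. "" or "aaa"
    i = rng.choice(candidates)
    chars = list(query)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def build_training_sets(pairs, seed=0):
    """Build both conditions from (clean_query, product_id) pairs.

    Returns (paired, unpaired), each a list of (query, product_id).
    """
    rng = random.Random(seed)
    paired, unpaired = [], []
    for clean, product in pairs:
        variant = misspell(clean, rng)
        # Paired condition: clean query and variant share the same label.
        paired.append((clean, product))
        paired.append((variant, product))
        # Unpaired condition: the variant appears without its clean query.
        unpaired.append((variant, product))
    return paired, unpaired


# Hypothetical sample data for illustration only.
examples = [("running shoes", "p1"), ("coffee grinder", "p2")]
paired, unpaired = build_training_sets(examples)
```

Training one retriever on each set and comparing retrieval quality would be one way to probe the conjecture. A realistic study would need a noise model that reflects actual user misspellings across languages.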
References
We conjecture that spelling variants presented without their corresponding clean queries introduce predominantly noise, as they lack a well formed semantic structure from which the model can learn.
— Mine and Refine: Optimizing Graded Relevance in E-commerce Search Retrieval
(2602.17654 - Xi et al., 19 Feb 2026) in Section 5.3 (Ablation Studies: Spelling Variation Augmentation of Training Queries)