Optimal teacher prompts for distillation
Determine which instruction prompts, when given to the instruction-following teacher embedding model Qwen3-Embedding-4B, yield the most empirically useful supervision for knowledge distillation into the student embedding models jina-embeddings-v5-text-small and jina-embeddings-v5-text-nano. The goal is to reduce ambiguity about prompt choice and improve transfer effectiveness when generating query and document embeddings.
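One way to start on this question is to rank candidate teacher prompts by a cheap proxy before running any distillation. The sketch below is a minimal illustration under stated assumptions, not the paper's procedure: `toy_embed` is a hypothetical hashed bag-of-words stand-in for the teacher (it is not Qwen3-Embedding-4B), the candidate prompt strings and the tiny labeled set are invented for illustration, and mean reciprocal rank of the teacher's own embeddings is only a proxy. The criterion the problem actually asks about is student quality after distillation.

```python
import math
import zlib
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
# Assumed interface: (instruction, text) -> embedding. A real teacher would be wrapped to match it.
EmbedFn = Callable[[str, str], Vector]


def toy_embed(instruction: str, text: str, dim: int = 64) -> Vector:
    """Hypothetical stand-in for the teacher: hashed bag-of-words over instruction + text."""
    vec = [0.0] * dim
    for token in f"{instruction} {text}".lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mean_reciprocal_rank(
    embed: EmbedFn,
    query_instruction: str,
    doc_instruction: str,
    queries: Sequence[str],
    docs: Sequence[str],
    relevant: Sequence[int],
) -> float:
    """MRR where relevant[i] is the index in docs of the one gold document for queries[i]."""
    if not queries:
        raise ValueError("need at least one query")
    doc_vecs = [embed(doc_instruction, d) for d in docs]
    total = 0.0
    for query, gold in zip(queries, relevant):
        q_vec = embed(query_instruction, query)
        scores = [cosine(q_vec, d_vec) for d_vec in doc_vecs]
        # Rank of the gold doc = 1 + number of docs scoring strictly higher.
        rank = 1 + sum(1 for s in scores if s > scores[gold])
        total += 1.0 / rank
    return total / len(queries)


def rank_prompts(
    embed: EmbedFn,
    candidates: Sequence[Tuple[str, str]],
    queries: Sequence[str],
    docs: Sequence[str],
    relevant: Sequence[int],
) -> List[Tuple[float, str, str]]:
    """Score each (query_instruction, doc_instruction) pair and sort best-first."""
    scored = [
        (mean_reciprocal_rank(embed, qi, di, queries, docs, relevant), qi, di)
        for qi, di in candidates
    ]
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Hypothetical candidate prompt pairs (not taken from the paper).
CANDIDATES = [
    ("", ""),
    ("Given a web search query, retrieve relevant passages", ""),
    ("Represent this question for retrieval", "Represent this passage for retrieval"),
]

# Invented toy evaluation set.
QUERIES = ["how do plants make food", "capital city of france"]
DOCS = [
    "Paris is the capital city of France",
    "Plants make food through photosynthesis",
    "The stock market closed higher today",
]
RELEVANT = [1, 0]

results = rank_prompts(toy_embed, CANDIDATES, QUERIES, DOCS, RELEVANT)
for score, q_inst, d_inst in results:
    print(f"{score:.3f}  query_instruction={q_inst!r}  doc_instruction={d_inst!r}")
```

To use this with a real teacher, `toy_embed` would be replaced by a wrapper that formats the instruction the way the model expects and returns its embedding. The proxy ranking would then need to be checked against student performance after distillation, since a prompt that helps the teacher retrieve well is not guaranteed to give the most transferable supervision.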
References
However, it leads to ambiguity when we do not know what instructions are empirically most useful and makes it harder for us to transfer knowledge through distillation.
— jina-embeddings-v5-text: Task-Targeted Embedding Distillation
(2602.15547 - Akram et al., 17 Feb 2026) in Section 4.1 First-Stage: Embedding Distillation