TY - GEN
T1 - Improving Instruction-Aware Retrieval with Query-Preserving Regularization
AU - Kim, Hyewon
AU - Song, Hyun Je
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2026
Y1 - 2026
N2 - Instruction-aware retrievers incorporate natural language instructions to express fine-grained retrieval constraints beyond the original query. These retrievers are typically trained using contrastive learning that considers relevance signals from both standard queries and instruction-augmented queries. However, prior instruction-aware retrievers learn instruction-augmented queries solely from document relevance signals, without explicitly preserving the semantics of the original query. As a result, instruction signals can dominate query semantics during training, leading to retrieved results that either fail to follow the instruction or are irrelevant to the original query. To address this issue, we propose a query-preserving regularization that enforces consistency between the relevance distributions induced by the original query and by the query component within the instruction-augmented query. This regularization prevents instruction signals from dominating query semantics while still allowing instructions to refine relevance estimation. Experiments on two instruction-following retrieval benchmarks demonstrate that our method improves the existing state-of-the-art instruction-aware retriever. Furthermore, our model achieves strong performance on standard retrieval tasks without instructions, in both in-domain and out-of-domain scenarios.
AB - Instruction-aware retrievers incorporate natural language instructions to express fine-grained retrieval constraints beyond the original query. These retrievers are typically trained using contrastive learning that considers relevance signals from both standard queries and instruction-augmented queries. However, prior instruction-aware retrievers learn instruction-augmented queries solely from document relevance signals, without explicitly preserving the semantics of the original query. As a result, instruction signals can dominate query semantics during training, leading to retrieved results that either fail to follow the instruction or are irrelevant to the original query. To address this issue, we propose a query-preserving regularization that enforces consistency between the relevance distributions induced by the original query and by the query component within the instruction-augmented query. This regularization prevents instruction signals from dominating query semantics while still allowing instructions to refine relevance estimation. Experiments on two instruction-following retrieval benchmarks demonstrate that our method improves the existing state-of-the-art instruction-aware retriever. Furthermore, our model achieves strong performance on standard retrieval tasks without instructions, in both in-domain and out-of-domain scenarios.
KW - Instruction Following Retrieval
KW - Instruction-Aware Retrieval
KW - Query-Preserving Regularization
UR - https://www.scopus.com/pages/publications/105035375459
U2 - 10.1007/978-3-032-21300-6_11
DO - 10.1007/978-3-032-21300-6_11
M3 - Conference paper
AN - SCOPUS:105035375459
SN - 9783032212993
T3 - Lecture Notes in Computer Science
SP - 172
EP - 187
BT - Advances in Information Retrieval - 48th European Conference on Information Retrieval, ECIR 2026, Proceedings
A2 - Campos, Ricardo
A2 - Jatowt, Adam
A2 - Lan, Yanyan
A2 - Aliannejadi, Mohammad
A2 - Bauer, Christine
A2 - MacAvaney, Sean
A2 - Anand, Avishek
A2 - Bai, Nan
A2 - Mansoury, Masoud
A2 - Ren, Zhaochun
A2 - Verberne, Suzan
PB - Springer Science and Business Media Deutschland GmbH
T2 - 48th European Conference on Information Retrieval, ECIR 2026
Y2 - 29 March 2026 through 2 April 2026
ER -