Chinese Academy of Sciences Institutional Repositories Grid
NIR-Prompt: A Multi-task Generalized Neural Information Retrieval Training Framework

Document type: Journal article

Authors: Xu, Shicheng [2,3]; Pang, Liang [3]; Shen, Huawei [2,3]; Cheng, Xueqi [1,2]
Journal: ACM TRANSACTIONS ON INFORMATION SYSTEMS
Publication date: 2024-03-01
Volume: 42; Issue: 2; Pages: 32
Keywords: neural information retrieval; dense retrieval; reranking; prompt learning
ISSN: 1046-8188
DOI: 10.1145/3626092
Abstract: Information retrieval (IR) aims to find information in a corpus that meets users' needs. Different needs correspond to different IR tasks, such as document retrieval, open-domain question answering, and retrieval-based dialogue, yet these tasks share the same schema for estimating the relationship between texts. This suggests that a good IR model can generalize to different tasks and domains. However, previous studies indicate that state-of-the-art neural information retrieval (NIR) models, e.g., pre-trained language models (PLMs), generalize poorly. This is mainly because the end-to-end fine-tuning paradigm makes the model overemphasize task-specific signals and domain biases while losing the ability to capture generalized essential signals. To address this problem, we propose a novel NIR training framework named NIR-Prompt for the retrieval and reranking stages, based on the idea of decoupling signal capturing and signal combination. NIR-Prompt uses an Essential Matching Module (EMM) to capture the essential matching signals and obtains a description of each task from a Matching Description Module (MDM). The description serves as task-adaptation information for combining the essential matching signals to adapt to different tasks. Experiments under in-domain multi-task, out-of-domain multi-task, and new-task adaptation settings show that NIR-Prompt improves the generalization of PLMs in NIR for both the retrieval and reranking stages compared with baselines.
Funding: National Key R&D Program of China [2022YFB3103700]; National Key R&D Program of China [2022YFB3103704]; National Natural Science Foundation of China (NSFC) [62276248]; Youth Innovation Promotion Association CAS [2023111]
WOS research area: Computer Science
Language: English
WOS accession number: WOS:001152702600025
Publisher: ASSOC COMPUTING MACHINERY
Source URL: [http://119.78.100.204/handle/2XEOYT63/38367]
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Articles (English)
Corresponding author: Pang, Liang
Author affiliations:
1. Chinese Acad Sci, Inst Comp Technol, CAS Key Lab Network Data Sci & Technol, Beijing, Peoples R China
2. Univ Chinese Acad Sci, Beijing, Peoples R China
3. Chinese Acad Sci, Inst Comp Technol, Data Intelligence Syst Res Ctr, Beijing, Peoples R China
Recommended citation
GB/T 7714
Xu, Shicheng,Pang, Liang,Shen, Huawei,et al. NIR-Prompt: A Multi-task Generalized Neural Information Retrieval Training Framework[J]. ACM TRANSACTIONS ON INFORMATION SYSTEMS,2024,42(2):32.
APA Xu, Shicheng,Pang, Liang,Shen, Huawei,&Cheng, Xueqi.(2024).NIR-Prompt: A Multi-task Generalized Neural Information Retrieval Training Framework.ACM TRANSACTIONS ON INFORMATION SYSTEMS,42(2),32.
MLA Xu, Shicheng,et al."NIR-Prompt: A Multi-task Generalized Neural Information Retrieval Training Framework".ACM TRANSACTIONS ON INFORMATION SYSTEMS 42.2(2024):32.

Ingest method: OAI harvesting

Source: Institute of Computing Technology

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.