Chinese Academy of Sciences Institutional Repositories Grid
Enhancing SPARQL Query Generation for Knowledge Base Question Answering Systems by Learning to Correct Triplets

Document type: Journal article

Authors: Qi, Jiexing (2); Su, Chang (2); Guo, Zhixin (2); Wu, Lyuwen (2); Shen, Zanwei (2); Fu, Luoyi (2); Wang, Xinbing (2); Zhou, Chenghu (1,2)
Journal: APPLIED SCIENCES-BASEL
Publication date: 2024-02-01
Volume: 14; Issue: 4; Pages: 19
Keywords: Knowledge Base Question Answering; Text-to-SPARQL; semantic parsing; further pretraining; Triplet Structure
DOI: 10.3390/app14041521
Corresponding author: Fu, Luoyi (yiluofu@sjtu.edu.cn)
Abstract: Generating SPARQL queries from natural language questions is challenging in Knowledge Base Question Answering (KBQA) systems. The current state-of-the-art models heavily rely on fine-tuning pretrained models such as T5. However, these methods still encounter critical issues such as triple-flip errors (e.g., (subject, relation, object) is predicted as (object, relation, subject)). To address this limitation, we introduce TSET (Triplet Structure Enhanced T5), a model with a novel pretraining stage positioned between the initial T5 pretraining and the fine-tuning for the Text-to-SPARQL task. In this intermediary stage, we introduce a new objective called Triplet Structure Correction (TSC) to train the model on a SPARQL corpus derived from Wikidata. This objective aims to deepen the model's understanding of the order of triplets. After this specialized pretraining, the model undergoes fine-tuning for SPARQL query generation, augmenting its query-generation capabilities. We also propose a method named "semantic transformation" to fortify the model's grasp of SPARQL syntax and semantics without compromising the pre-trained weights of T5. Experimental results demonstrate that our proposed TSET outperforms existing methods on three well-established KBQA datasets: LC-QuAD 2.0, QALD-9 plus, and QALD-10, establishing a new state-of-the-art performance (95.0% F1 and 93.1% QM on LC-QuAD 2.0, 75.85% F1 and 61.76% QM on QALD-9 plus, 51.37% F1 and 40.05% QM on QALD-10).
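Funding project: NSF China
WOS research areas: Chemistry; Engineering; Materials Science; Physics
Language: English
WOS accession number: WOS:001168342000001
Publisher: MDPI
Funding organization: NSF China
Source URL: http://ir.igsnrr.ac.cn/handle/311030/203188
Collection: Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences
Corresponding author: Fu, Luoyi
Author affiliations:
1. Chinese Acad Sci, Inst Geog Sci & Nat Resources Res, Beijing 100101, Peoples R China
2. Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai 200240, Peoples R China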
Recommended citation
GB/T 7714: Qi, Jiexing, Su, Chang, Guo, Zhixin, et al. Enhancing SPARQL Query Generation for Knowledge Base Question Answering Systems by Learning to Correct Triplets[J]. APPLIED SCIENCES-BASEL, 2024, 14(4): 19.
APA: Qi, Jiexing, Su, Chang, Guo, Zhixin, Wu, Lyuwen, Shen, Zanwei, ... & Zhou, Chenghu. (2024). Enhancing SPARQL Query Generation for Knowledge Base Question Answering Systems by Learning to Correct Triplets. APPLIED SCIENCES-BASEL, 14(4), 19.
MLA: Qi, Jiexing, et al. "Enhancing SPARQL Query Generation for Knowledge Base Question Answering Systems by Learning to Correct Triplets." APPLIED SCIENCES-BASEL 14.4 (2024): 19.
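
Illustrative note on the abstract: it describes triple-flip errors, where (subject, relation, object) is predicted as (object, relation, subject), and a Triplet Structure Correction (TSC) objective, but this record does not spell out how TSC training examples are constructed. The sketch below is only one plausible reading, not the authors' method: it corrupts a SPARQL query by swapping the subject and object of one triple pattern and pairs the corrupted query with the original as a correction target. The simplified term pattern and the names `flip_triple` and `make_correction_pair` are hypothetical and introduced here for illustration.

```python
import random
import re

# Assumption: a deliberately simplified notion of a SPARQL term, covering only
# variables (?x) and Wikidata-style prefixed names (wd:Q42, wdt:P106).
TERM = r"(?:\?\w+|wdt?:[QP]\d+)"
TRIPLE = re.compile(rf"({TERM})\s+({TERM})\s+({TERM})")


def flip_triple(query: str, rng: random.Random) -> str:
    """Swap subject and object in one randomly chosen triple pattern."""
    matches = list(TRIPLE.finditer(query))
    if not matches:
        return query  # nothing recognisable to corrupt
    m = rng.choice(matches)
    subj, pred, obj = m.groups()
    return query[:m.start()] + f"{obj} {pred} {subj}" + query[m.end():]


def make_correction_pair(query: str, rng: random.Random) -> tuple[str, str]:
    """Return (corrupted_input, correct_target) for a correction-style objective."""
    return flip_triple(query, rng), query


if __name__ == "__main__":
    q = "SELECT ?x WHERE { wd:Q42 wdt:P106 ?x . }"
    corrupted, target = make_correction_pair(q, random.Random(0))
    print("input :", corrupted)
    print("target:", target)
```

Under this reading, a sequence-to-sequence model would be trained to map the corrupted input back to the target, which pushes it to attend to triple order. A real pipeline would need a proper SPARQL parser rather than a regular expression.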

Ingest method: OAI harvesting

Source: Institute of Geographic Sciences and Natural Resources Research


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.