Chinese Academy of Sciences Institutional Repositories Grid
FAPM: functional annotation of proteins using multimodal models beyond structural modeling

Document type: Journal article

Authors: Xiang, Wenkai [1,5]; Xiong, Zhaoping [4]; Chen, Huan [3]; Xiong, Jiacheng [1,2]; Zhang, Wei [1,2]; Fu, Zunyun [1]; Zheng, Mingyue [1,2,5]; Liu, Bing [3]; Shi, Qian [5]
Journal: BIOINFORMATICS
Publication date: 2024-12-06
Volume: 40; Issue: 12; Pages: 12
ISSN: 1367-4803
DOI: 10.1093/bioinformatics/btae680
Abstract:
Motivation: Assigning accurate property labels to proteins, like functional terms and catalytic activity, is challenging, especially for proteins without homologs and "tail labels" with few known examples. Previous methods mainly focused on protein sequence features, overlooking the semantic meaning of protein labels.
Results: We introduce functional annotation of proteins using multimodal models (FAPM), a contrastive multimodal model that links natural language with protein sequence language. This model combines a pretrained protein sequence model with a pretrained large language model to generate labels, such as Gene Ontology (GO) functional terms and catalytic activity predictions, in natural language. Our results show that FAPM excels in understanding protein properties, outperforming models based solely on protein sequences or structures. It achieves state-of-the-art performance on public benchmarks and in-house experimentally annotated phage proteins, which often have few known homologs. Additionally, FAPM's flexibility allows it to incorporate extra text prompts, like taxonomy information, enhancing both its predictive performance and explainability. This novel approach offers a promising alternative to current methods that rely on multiple sequence alignment for protein annotation.
Availability and implementation: The online demo is at: https://huggingface.co/spaces/wenkai/FAPM_demo.
WOS keywords: LARGE-SCALE; SEQUENCE; ONTOLOGY
Funding: Shanghai Rising-Star Program [23QD1400600]; National Natural Science Foundation of China [82204278]; National Natural Science Foundation of China [T2225002]; National Key Research and Development Program of China [2022YFC3400504]
WOS research areas: Biochemistry & Molecular Biology; Biotechnology & Applied Microbiology; Computer Science; Mathematical & Computational Biology; Mathematics
WOS accession number: WOS:001373170000001
Publisher: OXFORD UNIV PRESS
Source URL: http://119.78.100.183/handle/2S10ELR8/315019
Collection: State Key Laboratory of Drug Research
Corresponding authors: Xiong, Zhaoping; Liu, Bing; Shi, Qian
Author affiliations:
1. Chinese Acad Sci, Shanghai Inst Mat Med, Drug Discovery & Design Ctr, State Key Lab Drug Res, Shanghai 201203, Peoples R China
2. Univ Chinese Acad Sci, Beijing 100049, Peoples R China
3. Xi An Jiao Tong Univ, BioBank, Affiliated Hosp 1, Xian 710061, Peoples R China
4. ProtonUnfold Technol Co, Suzhou 215000, Peoples R China
5. Lingang Lab, Shanghai 200031, Peoples R China
Recommended citation:
GB/T 7714: Xiang, Wenkai, Xiong, Zhaoping, Chen, Huan, et al. FAPM: functional annotation of proteins using multimodal models beyond structural modeling[J]. BIOINFORMATICS, 2024, 40(12): 12.
APA: Xiang, Wenkai., Xiong, Zhaoping., Chen, Huan., Xiong, Jiacheng., Zhang, Wei., ... & Shi, Qian. (2024). FAPM: functional annotation of proteins using multimodal models beyond structural modeling. BIOINFORMATICS, 40(12), 12.
MLA: Xiang, Wenkai, et al. "FAPM: functional annotation of proteins using multimodal models beyond structural modeling". BIOINFORMATICS 40.12 (2024): 12.

Ingest method: OAI harvesting

Source: Shanghai Institute of Materia Medica

