Chinese Academy of Sciences Institutional Repositories Grid (CAS IR Grid): ProALIGN: Directly Learning Alignments for Protein Structure Prediction via Exploiting Context-Specific Alignment Motifs

ProALIGN: Directly Learning Alignments for Protein Structure Prediction via Exploiting Context-Specific Alignment Motifs

Document type: Journal article


Authors	Kong, Lupeng 1,2,3; Ju, Fusong 1,2; Zheng, Wei-mou 4; Zhu, Jianwei 5; Sun, Shiwei 1,2; Xu, Jinbo 3; Bu, Dongbo 1,2
Journal	JOURNAL OF COMPUTATIONAL BIOLOGY
Publication date	2022-01-21
Pages	14
Keywords	deep learning and protein threading; protein alignment; protein structure prediction
ISSN	1066-5277
DOI	10.1089/cmb.2021.0430
Abstract	Template-based modeling (TBM), including homology modeling and protein threading, is one of the most reliable techniques for protein structure prediction. It predicts protein structure by building an alignment between the query sequence under prediction and the templates with solved structures. However, it is still very challenging to build the optimal sequence-template alignment, especially when only distantly related templates are available. Here we report a novel deep learning approach ProALIGN that can predict much more accurate sequence-template alignment. Like protein sequences consisting of sequence motifs, protein alignments are also composed of frequently occurring alignment motifs with characteristic patterns. Alignment motifs are context-specific as their characteristic patterns are tightly related to sequence contexts of the aligned regions. Inspired by this observation, we represent a protein alignment as a binary matrix (in which 1 denotes an aligned residue pair) and then use a deep convolutional neural network to predict the optimal alignment from the query protein and its template. The trained neural network implicitly but effectively encodes an alignment scoring function, which reduces inaccuracies in the handcrafted scoring functions widely used by the current threading approaches. For a query protein and a template, we apply the neural network to directly infer likelihoods of all possible residue pairs in their entirety, which could effectively consider the correlations among multiple residues. We further construct the alignment with maximum likelihood, and finally build a structure model according to the alignment. Tested on three independent data sets with a total of 6688 protein alignment targets and 80 CASP13 TBM targets, our method achieved much better alignments and 3D structure models than the existing methods, including HHpred, CNFpred, CEthreader, and DeepThreader. These results clearly demonstrate the effectiveness of exploiting the context-specific alignment motifs by deep learning for protein threading.
WOS research areas	Biochemistry & Molecular Biology; Biotechnology & Applied Microbiology; Computer Science; Mathematical & Computational Biology; Mathematics
Language	English
WOS accession number	WOS:000756282100001
Publisher	MARY ANN LIEBERT, INC
Source URL	[http://119.78.100.204/handle/2XEOYT63/18990]
Collection	Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English)
Corresponding authors	Xu, Jinbo; Bu, Dongbo
Author affiliations	1. Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing, Peoples R China; 2. Univ Chinese Acad Sci, Beijing, Peoples R China; 3. Toyota Technol Inst, Chicago, IL 60637 USA; 4. Chinese Acad Sci, Inst Theoret Phys, Beijing, Peoples R China; 5. Microsoft Res Asia, Beijing, Peoples R China
Recommended citation (GB/T 7714)	Kong, Lupeng,Ju, Fusong,Zheng, Wei-mou,et al. ProALIGN: Directly Learning Alignments for Protein Structure Prediction via Exploiting Context-Specific Alignment Motifs[J]. JOURNAL OF COMPUTATIONAL BIOLOGY,2022:14.
APA	Kong, Lupeng.,Ju, Fusong.,Zheng, Wei-mou.,Zhu, Jianwei.,Sun, Shiwei.,...&Bu, Dongbo.(2022).ProALIGN: Directly Learning Alignments for Protein Structure Prediction via Exploiting Context-Specific Alignment Motifs.JOURNAL OF COMPUTATIONAL BIOLOGY,14.
MLA	Kong, Lupeng,et al."ProALIGN: Directly Learning Alignments for Protein Structure Prediction via Exploiting Context-Specific Alignment Motifs".JOURNAL OF COMPUTATIONAL BIOLOGY (2022):14.
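
Illustrative note on the abstract: the abstract describes encoding an alignment as a binary matrix (1 marks an aligned residue pair), inferring a likelihood for every residue pair, and then constructing the maximum-likelihood alignment. The sketch below is one possible reading of that last step, not the authors' implementation. It assumes each matrix cell is an independent Bernoulli variable, so maximizing the likelihood reduces to maximizing the summed log-odds of the chosen pairs, and it assumes the alignment must be monotone (pairs strictly increase in both sequences). The function names `alignment_to_matrix` and `max_likelihood_alignment` and the toy probabilities are hypothetical.

```python
import math


def alignment_to_matrix(pairs, query_len, template_len):
    """Encode an alignment as a binary matrix; 1 marks an aligned residue pair."""
    matrix = [[0] * template_len for _ in range(query_len)]
    for i, j in pairs:
        matrix[i][j] = 1
    return matrix


def max_likelihood_alignment(prob, eps=1e-9):
    """Pick the monotone set of residue pairs with the highest likelihood.

    Assumption (not from the source): cells are independent, so the best
    alignment maximizes the sum of log(p / (1 - p)) over aligned pairs.
    """
    n, m = len(prob), len(prob[0])

    def log_odds(p):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        return math.log(p) - math.log(1.0 - p)

    # score[i][j]: best total log-odds using the first i query residues
    # and the first j template residues.
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j],                                   # skip query residue
                score[i][j - 1],                                   # skip template residue
                score[i - 1][j - 1] + log_odds(prob[i - 1][j - 1]),  # align the pair
            )

    # Trace back from the corner to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if score[i][j] == score[i - 1][j]:
            i -= 1
        elif score[i][j] == score[i][j - 1]:
            j -= 1
        else:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
    pairs.reverse()
    return pairs


# Toy pairwise likelihoods for a 3-residue query and a 3-residue template.
toy_prob = [
    [0.9, 0.1, 0.1],
    [0.1, 0.2, 0.8],
    [0.1, 0.1, 0.3],
]
toy_pairs = max_likelihood_alignment(toy_prob)
toy_matrix = alignment_to_matrix(toy_pairs, 3, 3)
```

Pairs with likelihood below 0.5 have negative log-odds, so under these assumptions the dynamic program leaves them unaligned rather than forcing a match; a real threading pipeline would likely add gap handling that this sketch omits.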

Ingestion method: OAI harvesting

Source: Institute of Computing Technology
