Chinese Academy of Sciences Institutional Repositories Grid
COMSAT: Residue contact prediction of transmembrane proteins based on support vector machines and mixed integer linear programming

Document type: Journal article

Authors: Zhang, Huiling (2); Huang, Qingsheng (2); Bei, Zhendong (1); Wei, Yanjie (2); Floudas, Christodoulos A. (3, 4)
Journal: PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS
Publication date: 2016-03-01
Volume: 84; Issue: 3; Pages: 332-348
Keywords: protein structure prediction; hybrid framework; machine learning; ab initio prediction; MILP
ISSN: 0887-3585
DOI: 10.1002/prot.24979
Abstract: In this article, we present COMSAT, a hybrid framework for residue contact prediction of transmembrane (TM) proteins, integrating a support vector machine (SVM) method and a mixed integer linear programming (MILP) method. COMSAT consists of two modules: COMSAT_SVM, which is trained mainly on position-specific scoring matrix features, and COMSAT_MILP, which is an ab initio method based on optimization models. Contacts predicted by the SVM model are ranked by SVM confidence scores, and a threshold is trained to improve the reliability of the predicted contacts. For TM proteins with no contacts above the threshold, COMSAT_MILP is used. The proposed hybrid contact prediction scheme was tested on two independent TM protein sets based on a contact definition of 14 Å between Cα-Cα atoms. First, using a rigorous leave-one-protein-out cross validation on the training set of 90 TM proteins, an accuracy of 66.8%, a coverage of 12.3%, a specificity of 99.3% and a Matthews correlation coefficient (MCC) of 0.184 were obtained for residue pairs that are at least six amino acids apart. Second, when tested on a test set of 87 TM proteins, the proposed method showed a prediction accuracy of 64.5%, a coverage of 5.3%, a specificity of 99.4% and an MCC of 0.106. COMSAT shows satisfactory results when compared with 12 other state-of-the-art predictors, and is more robust in terms of prediction accuracy as the length and complexity of TM proteins increase.
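The contact definition, the threshold-then-fallback scheme, and the reported metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`contact_map`, `hybrid_predict`, `evaluate`) are hypothetical, and the metric formulas are assumptions: the abstract does not define them, so the sketch follows the common contact-prediction convention in which accuracy is TP/(TP+FP) and coverage is TP/(TP+FN). Whether the 14 Å cutoff is strict or inclusive is also an assumption.

```python
import math

def contact_map(ca_coords, cutoff=14.0, min_separation=6):
    """Return residue pairs (i, j), i < j, whose C-alpha atoms are in contact.

    Assumptions: distance strictly below `cutoff` (in angstroms) counts as a
    contact, and only pairs at least `min_separation` residues apart are kept.
    """
    contacts = set()
    n = len(ca_coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(ca_coords[i], ca_coords[j]) < cutoff:
                contacts.add((i, j))
    return contacts

def hybrid_predict(scored_pairs, threshold, fallback):
    """Keep pairs scoring at or above `threshold`; if none, call `fallback`.

    `scored_pairs` maps (i, j) to a classifier confidence score (hypothetical
    stand-in for SVM output); `fallback` is any callable returning a set of
    pairs (hypothetical stand-in for an ab initio method).
    """
    confident = {pair for pair, score in scored_pairs.items() if score >= threshold}
    return confident if confident else fallback()

def evaluate(predicted, native, n_residues, min_separation=6):
    """Compute accuracy, coverage, specificity and MCC over eligible pairs.

    `predicted` is assumed to contain only pairs (i, j) with
    j - i >= min_separation.
    """
    # Number of residue pairs (i, j) with j - i >= min_separation.
    span = max(n_residues - min_separation, 0)
    n_pairs = span * (span + 1) // 2
    tp = len(predicted & native)
    fp = len(predicted - native)
    fn = len(native - predicted)
    tn = n_pairs - tp - fp - fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": tp / (tp + fp) if tp + fp else 0.0,     # assumed: precision
        "coverage": tp / (tp + fn) if tp + fn else 0.0,     # assumed: recall
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }
```

Under these definitions, high specificity together with low coverage, as reported in the abstract, is expected when non-contacts far outnumber contacts and only the most confident pairs are predicted.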
Funding: National Science Foundation of China [11204342]; Shenzhen Peacock Plan [KQCX20130628112914299]; Science Technology and Innovation Committee of Shenzhen Municipality [JCYJ20120615140912201]; National High Technology Research and Development Program of China [2015AA020109]
WOS research areas: Biochemistry & Molecular Biology; Biophysics
Language: English
WOS accession number: WOS:000373352100004
Publisher: WILEY-BLACKWELL
Source URL: http://119.78.100.204/handle/2XEOYT63/8443
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Articles (English)
Corresponding authors: Wei, Yanjie; Floudas, Christodoulos A.
Affiliations:
1. Chinese Acad Sci, Shenzhen Inst Adv Technol, Ctr Cloud Comp, Shenzhen 518055, Peoples R China
2. Chinese Acad Sci, Shenzhen Inst Adv Technol, Ctr High Performance Comp, Shenzhen 518055, Peoples R China
3. Texas A&M Univ, Texas A&M Energy Inst, College Stn, TX 77843 USA
4. Texas A&M Univ, Dept Chem Engn, College Stn, TX 77843 USA
Recommended citation:
GB/T 7714: Zhang, Huiling, Huang, Qingsheng, Bei, Zhendong, et al. COMSAT: Residue contact prediction of transmembrane proteins based on support vector machines and mixed integer linear programming[J]. PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2016, 84(3): 332-348.
APA: Zhang, Huiling, Huang, Qingsheng, Bei, Zhendong, Wei, Yanjie, & Floudas, Christodoulos A. (2016). COMSAT: Residue contact prediction of transmembrane proteins based on support vector machines and mixed integer linear programming. PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 84(3), 332-348.
MLA: Zhang, Huiling, et al. "COMSAT: Residue contact prediction of transmembrane proteins based on support vector machines and mixed integer linear programming." PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS 84.3 (2016): 332-348.

Ingest method: OAI harvesting

Source: Institute of Computing Technology


Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.