Chinese Academy of Sciences Institutional Repositories Grid
PredPPCrys: Accurate Prediction of Sequence Cloning, Protein Production, Purification and Crystallization Propensity from Protein Sequences Using Multi-Step Heterogeneous Feature Fusion and Selection

Document type: Journal article

Authors: Wang, Huilin1,2; Wang, Mingjun1,2; Tan, Hao3; Li, Yuan1,2; Zhang, Ziding4; Song, Jiangning1,2,3,5
Journal: PLOS ONE
Publication date: 2014-08-22
Volume: 9; Issue: 8
Abstract: X-ray crystallography is the primary approach to solve the three-dimensional structure of a protein. However, a major bottleneck of this method is the failure of multi-step experimental procedures, including sequence cloning, protein material production, purification, crystallization and ultimately structural determination, to yield diffraction-quality crystals. Accordingly, prediction of the propensity of a protein to successfully undergo these experimental procedures based on the protein sequence may help narrow down laborious experimental efforts and facilitate target selection. A number of bioinformatics methods based on protein sequence information have been developed for this purpose. However, our knowledge of the important determinants of propensity for a protein sequence to produce high diffraction-quality crystals remains largely incomplete. In practice, most of the existing methods display poorer performance when evaluated on larger and updated datasets. To address this problem, we constructed an up-to-date dataset as the benchmark, and subsequently developed a new approach termed 'PredPPCrys' using the support vector machine (SVM). Using a comprehensive set of multifaceted sequence-derived features in combination with a novel multi-step feature selection strategy, we identified and characterized the relative importance and contribution of each feature type to the prediction performance of five individual experimental steps required for successful crystallization. The resulting optimal candidate features were used as inputs to build the first-level SVM predictor (PredPPCrys I). Next, prediction outputs of PredPPCrys I were used as the input to build second-level SVM classifiers (PredPPCrys II), which led to significantly enhanced prediction performance. Benchmarking experiments indicated that our PredPPCrys method outperforms most existing procedures on both up-to-date and previous datasets. In addition, the predicted crystallization targets of currently non-crystallizable proteins were provided as compendium data, which are anticipated to facilitate target selection and design for the worldwide structural genomics consortium.
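The two-level arrangement described in the abstract (first-level SVM outputs feeding a second-level SVM) can be illustrated with a small sketch. This is not the authors' implementation. The feature groups, the toy numbers, and all function names below are hypothetical, and a minimal linear SVM trained by batch subgradient descent stands in for whatever SVM library and kernel the paper actually used, so that the example runs without dependencies. For brevity the second level is trained on in-sample first-level outputs; a real pipeline would more likely use held-out or cross-validated outputs.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Batch subgradient descent on the L2-regularised hinge loss.

    X is a list of feature vectors, y a list of labels in {-1, +1}.
    Returns (weights, bias).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]      # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:    # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def decision(model, x):
    """Signed score of x under a (weights, bias) model."""
    w, b = model
    return dot(w, x) + b

# Hypothetical toy data: two feature groups per protein (made-up numbers).
# Negative examples are mirrored positives, purely to keep the toy separable.
pos_a = [[1.0, 0.8], [0.9, 1.2], [1.1, 1.0], [0.8, 0.9]]
pos_b = [[0.5, 1.5], [1.2, 0.7], [0.9, 1.1], [1.4, 0.6]]
group_a = pos_a + [[-v for v in row] for row in pos_a]
group_b = pos_b + [[-v for v in row] for row in pos_b]
labels = [1] * 4 + [-1] * 4

# Level 1: one SVM per feature group (an assumption made for this sketch).
model_a = train_linear_svm(group_a, labels)
model_b = train_linear_svm(group_b, labels)

def level1_outputs(xa, xb):
    """Stack the level-1 decision values into a level-2 input vector."""
    return [decision(model_a, xa), decision(model_b, xb)]

# Level 2: an SVM trained on the level-1 outputs.
stacked = [level1_outputs(xa, xb) for xa, xb in zip(group_a, group_b)]
model_2 = train_linear_svm(stacked, labels)

def predict(xa, xb):
    """Final two-level prediction: +1 or -1."""
    score = decision(model_2, level1_outputs(xa, xb))
    return 1 if score >= 0 else -1

train_pred = [predict(xa, xb) for xa, xb in zip(group_a, group_b)]
```

Calling `predict([0.7, 1.0], [1.0, 0.6])` scores an unseen example through both levels. The point of the design is that the second level learns how much to trust each first-level score rather than averaging them with fixed weights.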
WOS title words: Science & Technology
Category [WOS]: Multidisciplinary Sciences
Research area [WOS]: Science & Technology - Other Topics
Keywords [WOS]: STRUCTURAL GENOMICS ; UNFOLDED STATES ; RANDOM FOREST ; WEB SERVER ; DATABASE ; PEPTIDES ; SITES ; SCALE ; MRMR
Indexed in: SCI
Language: English
WOS accession number: WOS:000341230600095
Public release date: 2014-11-23
Source URL: [http://124.16.173.210/handle/311007/434]
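Collection: Tianjin Institute of Industrial Biotechnology_Laboratory of Structural Bioinformatics and Integrative Systems Biology, Song Jiangning_Journal articles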
Author affiliations:
1.Chinese Acad Sci, Tianjin Inst Ind Biotechnol, Natl Engn Lab Ind Enzymes, Tianjin, Peoples R China
2.Chinese Acad Sci, Tianjin Inst Ind Biotechnol, Key Lab Syst Microbial Biotechnol, Tianjin, Peoples R China
3.Monash Univ, Fac Med, Dept Biochem & Mol Biol, Melbourne, Vic 3004, Australia
4.China Agr Univ, Coll Biol Sci, State Key Lab Agrobiotechnol, Beijing 100094, Peoples R China
5.Monash Univ, ARC Ctr Excellence Struct & Funct Microbial Genom, Melbourne, Vic 3004, Australia
Recommended citation
GB/T 7714
Wang, Huilin,Wang, Mingjun,Tan, Hao,et al. PredPPCrys: Accurate Prediction of Sequence Cloning, Protein Production, Purification and Crystallization Propensity from Protein Sequences Using Multi-Step Heterogeneous Feature Fusion and Selection[J]. PLOS ONE,2014,9(8).
APA Wang, Huilin,Wang, Mingjun,Tan, Hao,Li, Yuan,Zhang, Ziding,&Song, Jiangning.(2014).PredPPCrys: Accurate Prediction of Sequence Cloning, Protein Production, Purification and Crystallization Propensity from Protein Sequences Using Multi-Step Heterogeneous Feature Fusion and Selection.PLOS ONE,9(8).
MLA Wang, Huilin,et al."PredPPCrys: Accurate Prediction of Sequence Cloning, Protein Production, Purification and Crystallization Propensity from Protein Sequences Using Multi-Step Heterogeneous Feature Fusion and Selection".PLOS ONE 9.8(2014).

Ingest method: OAI harvesting

Source: Tianjin Institute of Industrial Biotechnology


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.