PredPPCrys: Accurate Prediction of Sequence Cloning, Protein Production, Purification and Crystallization Propensity from Protein Sequences Using Multi-Step Heterogeneous Feature Fusion and Selection
Document type: Journal article
Authors | Wang, Huilin1,2; Wang, Mingjun1,2; Tan, Hao3; Li, Yuan1,2; Zhang, Ziding4; Song, Jiangning1,2,3,5 |
Journal | PLOS ONE |
Publication date | 2014-08-22 |
Volume | 9, Issue: 8 |
Abstract | X-ray crystallography is the primary approach to solve the three-dimensional structure of a protein. However, a major bottleneck of this method is the failure of multi-step experimental procedures to yield diffraction-quality crystals, including sequence cloning, protein material production, purification, crystallization and ultimately, structural determination. Accordingly, prediction of the propensity of a protein to successfully undergo these experimental procedures based on the protein sequence may help narrow down laborious experimental efforts and facilitate target selection. A number of bioinformatics methods based on protein sequence information have been developed for this purpose. However, our knowledge on the important determinants of propensity for a protein sequence to produce high diffraction-quality crystals remains largely incomplete. In practice, most of the existing methods display poorer performance when evaluated on larger and updated datasets. To address this problem, we constructed an up-to-date dataset as the benchmark, and subsequently developed a new approach termed 'PredPPCrys' using the support vector machine (SVM). Using a comprehensive set of multifaceted sequence-derived features in combination with a novel multi-step feature selection strategy, we identified and characterized the relative importance and contribution of each feature type to the prediction performance of five individual experimental steps required for successful crystallization. The resulting optimal candidate features were used as inputs to build the first-level SVM predictor (PredPPCrys I). Next, prediction outputs of PredPPCrys I were used as the input to build second-level SVM classifiers (PredPPCrys II), which led to significantly enhanced prediction performance. Benchmarking experiments indicated that our PredPPCrys method outperforms most existing procedures on both up-to-date and previous datasets. In addition, the predicted crystallization targets of currently non-crystallizable proteins were provided as compendium data, which are anticipated to facilitate target selection and design for the worldwide structural genomics consortium. |
WOS subject heading | Science & Technology |
Category [WOS] | Multidisciplinary Sciences |
Research area [WOS] | Science & Technology - Other Topics |
Keywords [WOS] | STRUCTURAL GENOMICS ; UNFOLDED STATES ; RANDOM FOREST ; WEB SERVER ; DATABASE ; PEPTIDES ; SITES ; SCALE ; MRMR |
Indexed in | SCI |
Language | English |
WOS accession number | WOS:000341230600095 |
Release date | 2014-11-23 |
Source URL | [http://124.16.173.210/handle/311007/434] |
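Collection | Tianjin Institute of Industrial Biotechnology_Laboratory of Structural Bioinformatics and Integrative Systems Biology (Song Jiangning)_Journal articles |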
Author affiliations | 1. Chinese Acad Sci, Tianjin Inst Ind Biotechnol, Natl Engn Lab Ind Enzymes, Tianjin, Peoples R China 2. Chinese Acad Sci, Tianjin Inst Ind Biotechnol, Key Lab Syst Microbial Biotechnol, Tianjin, Peoples R China 3. Monash Univ, Fac Med, Dept Biochem & Mol Biol, Melbourne, Vic 3004, Australia 4. China Agr Univ, Coll Biol Sci, State Key Lab Agrobiotechnol, Beijing 100094, Peoples R China 5. Monash Univ, ARC Ctr Excellence Struct & Funct Microbial Genom, Melbourne, Vic 3004, Australia |
Recommended citation (GB/T 7714) | Wang, Huilin, Wang, Mingjun, Tan, Hao, et al. PredPPCrys: Accurate Prediction of Sequence Cloning, Protein Production, Purification and Crystallization Propensity from Protein Sequences Using Multi-Step Heterogeneous Feature Fusion and Selection[J]. PLOS ONE, 2014, 9(8). |
APA | Wang, Huilin, Wang, Mingjun, Tan, Hao, Li, Yuan, Zhang, Ziding, & Song, Jiangning. (2014). PredPPCrys: Accurate Prediction of Sequence Cloning, Protein Production, Purification and Crystallization Propensity from Protein Sequences Using Multi-Step Heterogeneous Feature Fusion and Selection. PLOS ONE, 9(8). |
MLA | Wang, Huilin, et al. "PredPPCrys: Accurate Prediction of Sequence Cloning, Protein Production, Purification and Crystallization Propensity from Protein Sequences Using Multi-Step Heterogeneous Feature Fusion and Selection". PLOS ONE 9.8 (2014). |
Ingest method: OAI harvesting
Source: Tianjin Institute of Industrial Biotechnology
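
The two-level design described in the abstract (first-level SVMs trained on sequence-derived features, whose outputs feed second-level SVMs) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear kernel, the subgradient-descent trainer, the random toy features, the label rules, the step names in `STEPS`, and every function name below are assumptions made for illustration. For brevity, the second level is trained on in-sample first-level outputs; a careful stacking setup would more likely use held-out or cross-validated outputs.

```python
import random

# Illustrative step names taken from the abstract's list of five procedures.
STEPS = ["cloning", "production", "purification", "crystallization", "structure"]

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100, seed=0):
    """Fit a linear SVM (hinge loss + L2 penalty) by stochastic subgradient
    descent. Labels must be +1 or -1. Returns (weights, bias)."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * (dot(w, X[i]) + b) < 1
            # Subgradient step on lam/2 * ||w||^2 + max(0, 1 - y * (w.x + b)).
            w = [wj - lr * (lam * wj - (y[i] * xj if violated else 0.0))
                 for wj, xj in zip(w, X[i])]
            if violated:
                b += lr * y[i]
    return w, b

def decision(model, x):
    w, b = model
    return dot(w, x) + b

def make_toy_data(n=60, n_features=4, seed=1):
    """Random stand-ins for sequence-derived features (hypothetical data)."""
    rng = random.Random(seed)
    X = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(n)]
    # Hypothetical labels: each step depends on the sign of one feature.
    Y = {step: [1 if x[k % n_features] > 0 else -1 for x in X]
         for k, step in enumerate(STEPS)}
    return X, Y

def train_two_level(X, Y):
    # Level 1: one SVM per experimental step, trained on the raw features.
    level1 = {s: train_linear_svm(X, Y[s]) for s in STEPS}
    # Level 2 input: the five level-1 decision values for each protein.
    Z = [[decision(level1[s], x) for s in STEPS] for x in X]
    level2 = {s: train_linear_svm(Z, Y[s]) for s in STEPS}
    return level1, level2

def predict_two_level(level1, level2, x):
    z = [decision(level1[s], x) for s in STEPS]
    return {s: (1 if decision(level2[s], z) >= 0 else -1) for s in STEPS}

X, Y = make_toy_data()
level1, level2 = train_two_level(X, Y)
result = predict_two_level(level1, level2, X[0])
print(result)
```

Each second-level model sees the first-level scores for all five steps, so a prediction for one step can draw on evidence about the others. That is one plausible reading of how "prediction outputs of PredPPCrys I were used as the input" to PredPPCrys II.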