Chinese Academy of Sciences Institutional Repositories Grid
Genome-wide association data classification and SNPs selection using two-stage quality-based Random Forests

Document type: Journal article

Authors: Thanh-Tung Nguyen; Huang, Joshua Zhexue; Wu, Qingyao; Thuy Thi Nguyen; Li, Mark Junjie
Journal: BMC GENOMICS
Publication date: 2014
Abstract: Background: Single-nucleotide polymorphism (SNP) selection and identification are the most important tasks in genome-wide association data analysis. The problem is difficult because genome-wide association data are very high dimensional and a large portion of the SNPs in the data are irrelevant to the disease. Advanced machine learning methods have been used successfully in genome-wide association studies (GWAS) to identify genetic variants that have relatively large effects in some common, complex diseases. Among them, the most successful is Random Forests (RF). Despite performing well in terms of prediction accuracy on some moderately sized data sets, RF still struggles in GWAS with selecting informative SNPs and building accurate prediction models. In this paper, we propose a new two-stage quality-based sampling method in random forests, named ts-RF, for SNP subspace selection in GWAS. The method first applies p-value assessment to find a cut-off point that separates informative and irrelevant SNPs into two groups. The informative SNP group is further divided into two sub-groups: highly informative and weakly informative SNPs. When sampling the SNP subspace for building the trees of the forest, only SNPs from these two sub-groups are taken into account. The feature subspaces used to split a node of a tree always contain highly informative SNPs.
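The two-stage sampling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The record does not say how the cut-off point is found or how the informative group is split, so the two p-value thresholds (`cutoff`, `strong_cutoff`), the proportional allocation between sub-groups, and the function names `partition_snps` and `sample_subspace` are all assumptions made for illustration.

```python
import random


def partition_snps(p_values, cutoff=0.05, strong_cutoff=0.001):
    """Stage 1: split SNP indices into three groups by p-value.

    Both thresholds are hypothetical placeholders; the source only
    says that a cut-off point is derived from p-value assessment.
    """
    strong, weak, irrelevant = [], [], []
    for idx, p in enumerate(p_values):
        if p >= cutoff:
            irrelevant.append(idx)   # dropped from all later sampling
        elif p < strong_cutoff:
            strong.append(idx)       # highly informative
        else:
            weak.append(idx)         # weakly informative
    return strong, weak, irrelevant


def sample_subspace(strong, weak, mtry, rng):
    """Stage 2: draw up to mtry SNPs from the two informative sub-groups.

    At least one highly informative SNP is included whenever mtry >= 1.
    The proportional split between sub-groups is an assumption.
    """
    if not strong:
        raise ValueError("no highly informative SNPs to sample from")
    total = len(strong) + len(weak)
    mtry = min(mtry, total)
    # Allocate proportionally, but force at least one strong SNP.
    n_strong = max(1, round(mtry * len(strong) / total))
    n_strong = min(n_strong, len(strong), mtry)
    n_weak = min(mtry - n_strong, len(weak))
    # Top up with strong SNPs if the weak sub-group ran short.
    n_strong = min(mtry - n_weak, len(strong))
    return rng.sample(strong, n_strong) + rng.sample(weak, n_weak)


if __name__ == "__main__":
    # Made-up p-values for eight SNPs, for demonstration only.
    p_values = [0.0001, 0.2, 0.03, 0.0005, 0.6, 0.01, 0.9, 0.04]
    strong, weak, irrelevant = partition_snps(p_values)
    print(strong, weak, irrelevant)
    print(sample_subspace(strong, weak, 3, random.Random(0)))
```

In a full random forest, a subspace like this would be redrawn at each node before the best split is chosen. Irrelevant SNPs never enter a candidate set, and every candidate set keeps at least one highly informative SNP.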
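Indexed in: SCI
Original source: http://www.biomedcentral.com/qc/1471-2164/16/S2/S5
Language: English
Source URL: http://ir.siat.ac.cn:8080/handle/172644/5964
Collection: Shenzhen Institutes of Advanced Technology_Digital Institute
Author affiliation: BMC GENOMICS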
Recommended citation
GB/T 7714
Thanh-Tung Nguyen, Huang, Joshua Zhexue, Wu, Qingyao, et al. Genome-wide association data classification and SNPs selection using two-stage quality-based Random Forests[J]. BMC GENOMICS, 2014.
APA Thanh-Tung Nguyen, Huang, Joshua Zhexue, Wu, Qingyao, Thuy Thi Nguyen, & Li, Mark Junjie. (2014). Genome-wide association data classification and SNPs selection using two-stage quality-based Random Forests. BMC GENOMICS.
MLA Thanh-Tung Nguyen, et al. "Genome-wide association data classification and SNPs selection using two-stage quality-based Random Forests". BMC GENOMICS (2014).

Ingest method: OAI harvesting

Source: Shenzhen Institutes of Advanced Technology


Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.