Chinese Academy of Sciences Institutional Repositories Grid: Finding minimum gene subsets with heuristic breadth-first search algorithm for robust tumor classification

Chinese Academy of Sciences Institutional Repositories Grid

Finding minimum gene subsets with heuristic breadth-first search algorithm for robust tumor classification

Document type: Journal article


Authors	Wang,Shu-Lin 1,2,3; Li,Xue-Ling 1; Fang,Jianwen 3
Journal	BMC Bioinformatics
Publication date	2012-07-25
Volume	13 Issue: 1 Pages: 1-26
Keywords	Gene expression profiles; Gene selection; Tumor classification; Heuristic breadth-first search; Power-law distribution
ISSN	1471-2105
DOI	10.1186/1471-2105-13-178
Abstract	Background: Previous studies on tumor classification based on gene expression profiles suggest that gene selection plays a key role in improving classification performance. Moreover, finding important tumor-related genes with the highest accuracy is a very important task because these genes might serve as tumor biomarkers, which benefits not only tumor molecular diagnosis but also drug development. Results: This paper proposes a novel gene selection method with rich biomedical meaning based on the Heuristic Breadth-first Search Algorithm (HBSA) to find as many optimal gene subsets as possible. Due to the curse of dimensionality, this type of method could suffer from over-fitting and selection bias problems. To address these potential problems, an HBSA-based ensemble classifier is constructed using a majority voting strategy from individual classifiers built on the selected gene subsets, and a novel HBSA-based gene ranking method is designed to find important tumor-related genes by measuring the significance of genes using their occurrence frequencies in the selected gene subsets. The experimental results on nine tumor datasets, including three pairs of cross-platform datasets, indicate that the proposed method can not only obtain better generalization performance but also find many important tumor-related genes. Conclusions: It is found that the frequencies of the selected genes follow a power-law distribution, indicating that only a few top-ranked genes can be used as potential diagnosis biomarkers. Moreover, the top-ranked genes leading to very high prediction accuracy are closely related to specific tumor subtypes and even hub genes. Compared with other related methods, the proposed method can achieve higher prediction accuracy with fewer genes. Moreover, these findings are further supported by analyzing the top-ranked genes in the context of individual gene function, biological pathway, and protein-protein interaction network.
Language	English
WOS accession number	BMC:10.1186/1471-2105-13-178
Publisher	BioMed Central
Source URL	[http://ir.hfcas.ac.cn:8080/handle/334002/34670]
Collection	Hefei Institutes of Physical Science_Hefei Institute of Intelligent Machines, Chinese Academy of Sciences
Corresponding author	Fang,Jianwen
Author affiliations	1. Hefei Institute of Intelligent Machines, Chinese Academy of Sciences; Intelligent Computing Laboratory 2. Hunan University; College of Information Science and Engineering 3. University of Kansas; Applied Bioinformatics Laboratory
Recommended citation GB/T 7714	Wang,Shu-Lin,Li,Xue-Ling,Fang,Jianwen. Finding minimum gene subsets with heuristic breadth-first search algorithm for robust tumor classification[J]. BMC Bioinformatics,2012,13(1):1-26.
APA	Wang,Shu-Lin,Li,Xue-Ling,&Fang,Jianwen.(2012).Finding minimum gene subsets with heuristic breadth-first search algorithm for robust tumor classification.BMC Bioinformatics,13(1),1-26.
MLA	Wang,Shu-Lin,et al."Finding minimum gene subsets with heuristic breadth-first search algorithm for robust tumor classification".BMC Bioinformatics 13.1(2012):1-26.
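
Illustrative note: the abstract describes two steps that are easy to picture in code, namely ranking genes by how often they occur across the selected gene subsets, and combining the individual classifiers by majority vote. The sketch below is only a minimal illustration of those two ideas, not the authors' implementation. The function names, the gene identifiers (G1 to G4), the class labels, and the tie-breaking rules are all hypothetical assumptions; the HBSA search itself is not reproduced here.

```python
from collections import Counter


def rank_genes_by_frequency(gene_subsets):
    """Rank genes by the number of selected subsets in which they occur.

    Assumption: each subset counts a gene at most once, and ties are
    broken alphabetically so the ordering is deterministic.
    """
    counts = Counter(gene for subset in gene_subsets for gene in set(subset))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def majority_vote(predictions):
    """Return the label predicted by the most individual classifiers.

    Assumption: a tie is resolved by taking the alphabetically first label;
    the abstract does not say how ties are handled.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    top = max(counts.values())
    return min(label for label, count in counts.items() if count == top)


# Hypothetical gene subsets, standing in for the output of a subset search.
subsets = [("G1", "G2"), ("G1", "G3"), ("G1", "G2", "G4")]
ranking = rank_genes_by_frequency(subsets)

# Hypothetical predictions for one sample, one per subset-based classifier.
vote = majority_vote(["tumor", "normal", "tumor"])
```

In this toy input, G1 occurs in all three subsets and therefore ranks first, which mirrors the abstract's point that a few frequently selected genes dominate the ranking.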

Ingest method: OAI harvesting

Source: Hefei Institutes of Physical Science

