中国科学院机构知识库网格
Chinese Academy of Sciences Institutional Repositories Grid
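A Highly Fault-Tolerant Algorithm for Mining Breast Cancer Disease Genes Based on Protein-Protein Interaction Networks (基于蛋白质相互作用网络的高容错乳腺癌疾病基因挖掘算法)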

Document type: Thesis

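Author: Nie Yaling (聂亚玲)
Degree type: Master's
Defense date: 2012-05-21
Degree-granting institution: Graduate University of Chinese Academy of Sciences (中国科学院研究生院)
Advisor: Yu Jingkai (余景开)
Keywords: breast cancer; disease genes; PPI network; phenotype data; GO functional annotation
Alternative title: Mining breast cancer related genes with a network based noise-tolerant approach
Degree discipline: Chemical Engineering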
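Abstract (translated from Chinese): Complex diseases such as breast cancer are among the major threats to human health today. Their pathogenesis and the associated disease genes are still not fully understood, so mining new breast cancer genes is an important task in breast cancer research. Most methods rely on structural and functional similarity between candidate genes and known cancer genes, and locate potential cancer genes by integrating multiple data sources, including PPI networks, GO annotations, gene expression data, and biological pathways. However, different types of data contain different levels of noise, which affects the correctness of the mining results. A breast cancer disease gene mining algorithm with high fault tolerance is therefore needed. This project builds on a protein-protein interaction (PPI) network framework and integrates multiple types of data to establish an efficient algorithm for finding new breast cancer disease genes. The main results are: 1) A fairly comprehensive human PPI network was constructed by integrating multiple data sources, and the differing topological characteristics of known disease genes and non-disease genes in the network were identified. A large number of random samples were drawn from the PPI network, and the differences between the random samples and the known disease genes were analyzed statistically over several topological attributes. The results show that node degree, node betweenness, and the size of the largest connected component significantly distinguish cancer genes from non-cancer genes (P < 2.2 × 10⁻¹⁶), which provides an important basis for discovering new, unknown cancer genes. 2) A new breast cancer disease gene mining method was proposed. It integrates multiple data sources, including PPI network topological attributes, human breast cancer related phenotype data, known breast cancer disease genes, and GO functional annotations. Quantitative analysis demonstrated that the algorithm is highly tolerant of data noise. To our knowledge, this is the first quantitative analysis of the noise tolerance of different cancer gene mining methods. 3) A preliminary exploration of mining causative breast cancer genes for individual patients. The complexity and heterogeneity of cancer require us to find the genes that play a key driving role in the onset and progression of cancer in an individual patient, in order to meet the needs of clinical personalized medicine. Through a preliminary analysis of one individual case, we found general patterns in the changes of that case's differential expression values, which provide an initial data basis for individualized diagnosis and treatment of breast cancer.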
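Result 1) above compares known disease genes against random samples drawn from the PPI network. A minimal sketch of that kind of comparison follows, restricted to node degree only. The toy edge list, the gene names, the function names, and the use of an empirical (resampling-based) p-value are all assumptions made for illustration; the abstract does not state which statistical test the thesis used.

```python
import random


def build_graph(edges):
    """Build an undirected adjacency map from (u, v) pairs, skipping self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def mean_degree(adj, genes):
    """Average node degree of the given genes (genes absent from the network count as 0)."""
    return sum(len(adj.get(g, ())) for g in genes) / len(genes)


def empirical_p_value(adj, disease_genes, n_samples=1000, seed=0):
    """Fraction of equally sized random node samples whose mean degree is
    at least as large as that of the disease genes (with add-one smoothing)."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    observed = mean_degree(adj, disease_genes)
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(nodes, len(disease_genes))
        if mean_degree(adj, sample) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_samples + 1)


# Hypothetical toy network: two hub genes, each with six low-degree partners.
toy_edges = [("H1", "H2")]
toy_edges += [("H1", f"a{i}") for i in range(6)]
toy_edges += [("H2", f"b{i}") for i in range(6)]

toy_graph = build_graph(toy_edges)
observed, p_value = empirical_p_value(toy_graph, ["H1", "H2"])
print(f"mean degree of disease genes: {observed}, empirical p-value: {p_value:.4f}")
```

The same resampling loop could be reused for other attributes named in the abstract, such as betweenness or the size of the largest connected component, by swapping out `mean_degree`.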
English abstract: Novel disease genes remain difficult to identify in most genetic diseases. Even for diseases whose molecular mechanisms are partly known, such as breast cancer, not all disease genes have been detected. Breast cancer is the most common cancer and the leading cause of cancer death among women worldwide, so mining breast cancer genes helps to understand its pathogenic mechanism and to find effective treatments. With the rapid growth of disease-related genomic and functional data, computational approaches can be used to mine for new cancer genes. Most approaches prioritize candidate genes based on their similarity to known cancer genes. Many existing methods identify potential cancer genes by integrating multiple data sources, including PPI networks, Gene Ontology (GO) annotations, gene expression profiles, and biological pathways. However, each data category carries its own noise, and hardly anyone has analyzed this problem quantitatively. There is thus a clear need to calibrate how a given method scales with respect to noise. In this study, we integrated multiple data sources and developed a novel network-based, noise-tolerant method to mine for new breast cancer genes. The major contents of the thesis are summarized as follows: 1) We constructed a comprehensive human PPI network with data collected from multiple PPI databases. We also obtained a set of breast cancer genes from the OMIM database. We found that the topological attributes of node degree, betweenness, and size of the largest connected component clearly distinguish cancer genes from non-cancer genes in the PPI network. 2) We ranked candidate genes from the PPI network using a novel noise-tolerant approach that integrates multiple data sources, including the PPI network, gene expression, prior knowledge of breast cancer, and GO annotations. We also established a quantitative analysis system to compare the performance of our method with the random walk approach when different levels of noise were added to the input data. Our method performed well in ranking breast cancer related genes and showed better noise tolerance than random walk. We believe this is the first systematic effort to quantitatively analyze the noise tolerance of different cancer gene mining methods. 3) We carried out a preliminary study on breast cancer driver genes for individuals. Personalized medicine demands that we pinpoint the real cancer driver genes for individual patients, rather than depending on related genes derived from population genetics studies. In a pilot study of one breast cancer patient's transcriptome data, we found some general patterns in the gene expression values of this individual case, which may be helpful for individualized diagnosis and treatment.
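The abstract names random walk as the baseline against which noise tolerance was measured. The sketch below illustrates only that baseline, in the common "random walk with restart" form, together with one hypothetical noise model (rewiring a fraction of edges to random node pairs). The restart probability, the noise model, the toy path graph, and all function names are assumptions; the thesis's own ranking method and its actual noise injection procedure are not described in the abstract and are not reproduced here.

```python
import random


def build_adjacency(edges):
    """Undirected adjacency map; every node that appears has at least one neighbor."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def random_walk_with_restart(adj, seeds, restart=0.3, n_iter=100):
    """Iterate p <- restart * p0 + (1 - restart) * W p, where W spreads each
    node's probability evenly over its neighbors and p0 is uniform on the seeds."""
    nodes = sorted(adj)
    seeds = [s for s in seeds if s in adj]  # seeds missing from the network are ignored
    if not seeds:
        raise ValueError("no seed gene is present in the network")
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(n_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        p = nxt
    return p


def rank_candidates(scores, seeds):
    """Non-seed genes ordered by descending score, ties broken by name."""
    candidates = [g for g in scores if g not in seeds]
    return sorted(candidates, key=lambda g: (-scores[g], g))


def perturb_edges(edges, fraction, seed=0):
    """Hypothetical noise model: replace a fraction of edges with random node pairs."""
    rng = random.Random(seed)
    nodes = sorted({n for e in edges for n in e})
    noisy = list(edges)
    n_swap = int(round(fraction * len(noisy)))
    for i in rng.sample(range(len(noisy)), n_swap):
        noisy[i] = tuple(rng.sample(nodes, 2))
    return noisy


# Hypothetical toy network: a path A-B-C-D-E with A as the only known disease gene.
path_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
clean_scores = random_walk_with_restart(build_adjacency(path_edges), ["A"])
clean_ranking = rank_candidates(clean_scores, ["A"])

noisy_edges = perturb_edges(path_edges, fraction=0.5)
noisy_scores = random_walk_with_restart(build_adjacency(noisy_edges), ["A"])
noisy_ranking = rank_candidates(noisy_scores, ["A"])
print("clean:", clean_ranking)
print("noisy:", noisy_ranking)
```

Comparing `clean_ranking` with `noisy_ranking` at increasing values of `fraction` is one simple way to quantify how a ranking method degrades with noise, for example by tracking how far known disease genes fall in the list.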
Language: Chinese
Date available: 2013-09-25
Source URL: http://ir.ipe.ac.cn/handle/122111/1857
Collection: Institute of Process Engineering_Institute (batch import)
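Recommended citation
GB/T 7714
Nie Yaling (聂亚玲). A Highly Fault-Tolerant Algorithm for Mining Breast Cancer Disease Genes Based on Protein-Protein Interaction Networks (基于蛋白质相互作用网络的高容错乳腺癌疾病基因挖掘算法)[D]. Graduate University of Chinese Academy of Sciences. 2012.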

Ingestion method: OAI harvesting

Source: Institute of Process Engineering


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.