Edge-group sparse PCA for network-guided high dimensional data analysis
Document type: Journal article
Authors | Min, Wenwen1; Liu, Juan1; Zhang, Shihua2,3,4 |
Journal | BIOINFORMATICS |
Publication date | 2018-10-15 |
Volume | 34; Issue: 20; Pages: 3479-3487 |
ISSN | 1367-4803 |
DOI | 10.1093/bioinformatics/bty362 |
Abstract | Motivation: Principal component analysis (PCA) has been widely used to deal with high-dimensional gene expression data. In this study, we proposed an Edge-group Sparse PCA (ESPCA) model by incorporating the group structure from a prior gene network into the PCA framework for dimension reduction and feature interpretation. ESPCA enforces sparsity of principal component (PC) loadings by considering the connectivity of gene variables in the prior network. We developed an alternating iterative algorithm to solve ESPCA. The key to this algorithm is solving a new k-edge sparse projection problem, and a greedy strategy has been adapted to address it. Here we adopted ESPCA for analyzing multiple gene expression matrices simultaneously. By incorporating prior knowledge, our method can overcome the drawbacks of sparse PCA and capture some gene modules with better biological interpretations. Results: We evaluated the performance of ESPCA using a set of artificial datasets and two real biological datasets (including TCGA pan-cancer expression data and ENCODE expression data), and compared its performance with PCA and sparse PCA. The results showed that ESPCA could identify more biologically relevant genes, improve their biological interpretations and reveal distinct sample characteristics. |
Funding | National Natural Science Foundation of China[11661141019] ; National Natural Science Foundation of China[61621003] ; National Natural Science Foundation of China[61422309] ; National Natural Science Foundation of China[61379092] ; Strategic Priority Research Program of the Chinese Academy of Sciences (CAS)[XDB13040600] ; Key Research Program of the Chinese Academy of Sciences[KFZD-SW-219] ; National Key Research and Development Program of China[2017YFC0908405] ; CAS Frontier Science Research Key Project for Top Young Scientist[QYZDB-SSW-SYS008] |
WOS research areas | Biochemistry & Molecular Biology ; Biotechnology & Applied Microbiology ; Computer Science ; Mathematical & Computational Biology ; Mathematics |
Language | English |
WOS accession number | WOS:000448782100008 |
Publisher | OXFORD UNIV PRESS |
Source URL | [http://ir.amss.ac.cn/handle/2S8OKBNM/31658] |
Collection | Institute of Applied Mathematics |
Corresponding authors | Liu, Juan; Zhang, Shihua |
Author affiliations | 1. Wuhan Univ, Sch Comp Sci, Wuhan 430072, Hubei, Peoples R China; 2. Chinese Acad Sci, Acad Math & Syst Sci, NCMIS, CEMS, RCSDS, Beijing 100190, Peoples R China; 3. Univ Chinese Acad Sci, Sch Math Sci, Beijing 100049, Peoples R China; 4. Chinese Acad Sci, Ctr Excellence Anim Evolut & Genet, Kunming 650223, Yunnan, Peoples R China |
Recommended citation, GB/T 7714 | Min, Wenwen, Liu, Juan, Zhang, Shihua. Edge-group sparse PCA for network-guided high dimensional data analysis[J]. BIOINFORMATICS, 2018, 34(20): 3479-3487. |
APA | Min, Wenwen, Liu, Juan, & Zhang, Shihua. (2018). Edge-group sparse PCA for network-guided high dimensional data analysis. BIOINFORMATICS, 34(20), 3479-3487. |
MLA | Min, Wenwen, et al. "Edge-group sparse PCA for network-guided high dimensional data analysis". BIOINFORMATICS 34.20 (2018): 3479-3487. |
Ingest method: OAI harvesting
Source: Academy of Mathematics and Systems Science
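Illustrative sketch: the abstract mentions an alternating iterative algorithm whose key step is a k-edge sparse projection handled by a greedy strategy, but this record does not give the formulation. The Python below is only a minimal sketch of one plausible reading, not the authors' implementation. The function names `k_edge_sparse_projection` and `espca_rank1_sketch`, the edge score (Euclidean norm of the two endpoint values), the SVD initialization and the stopping rule are all assumptions introduced here for illustration.

```python
import numpy as np

def k_edge_sparse_projection(z, edges, k):
    """Greedy heuristic (assumed): keep the k edges with the largest
    edge-group norm and zero every entry outside those edges."""
    z = np.asarray(z, dtype=float)
    edges = np.asarray(edges, dtype=int)
    # Score each edge (i, j) by the Euclidean norm of (z_i, z_j).
    scores = np.sqrt(z[edges[:, 0]] ** 2 + z[edges[:, 1]] ** 2)
    top = np.argsort(-scores, kind="stable")[:k]
    support = np.unique(edges[top])      # nodes covered by the chosen edges
    v = np.zeros_like(z)
    v[support] = z[support]
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def espca_rank1_sketch(X, edges, k, n_iter=100, tol=1e-8):
    """Alternating updates for one sparse loading vector (assumed form):
    u <- X v / ||X v||, then v <- projection of X^T u."""
    X = np.asarray(X, dtype=float)
    # Start from the ordinary leading right singular vector (plain PCA).
    v = np.linalg.svd(X, full_matrices=False)[2][0]
    for _ in range(n_iter):
        u = X @ v
        u_norm = np.linalg.norm(u)
        if u_norm == 0:
            break
        u /= u_norm
        v_new = k_edge_sparse_projection(X.T @ u, edges, k)
        converged = np.linalg.norm(v_new - v) < tol
        v = v_new
        if converged:
            break
    return v

# Toy example: 6 variables on a path-like network, keep a single edge.
edges = [(0, 1), (1, 2), (3, 4), (4, 5)]
v = k_edge_sparse_projection([3.0, 4.0, 0.0, 1.0, 0.0, 0.0], edges, k=1)
# Edge (0, 1) has the largest score (5), so v = [0.6, 0.8, 0, 0, 0, 0].
```

In this sketch, selecting whole edges rather than single variables is what ties sparsity to network connectivity: a variable can only enter the loading vector together with a neighbor, so at most 2k entries are nonzero.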