A Distributed Framework for Large-scale Protein-protein Interaction Data Analysis and Prediction Using MapReduce
Document type: Journal article
Authors | Lun Hu; Shicheng Yang; Xin Luo; Huaqiang Yuan; Khaled Sedraoui; MengChu Zhou |
Journal | IEEE/CAA Journal of Automatica Sinica |
Publication date | 2022 |
Volume | 9, Issue: 1, Pages: 160-172 |
ISSN | 2329-9266 |
Keywords | Distributed computing, large-scale prediction, machine learning, MapReduce, protein-protein interaction (PPI) |
DOI | 10.1109/JAS.2021.1004198 |
Abstract | Protein-protein interactions are of great significance for understanding the functional mechanisms of proteins. With the rapid development of high-throughput genomic technologies, massive protein-protein interaction (PPI) data have been generated, making it very difficult to analyze them efficiently. To address this problem, this paper presents a distributed framework by reimplementing one of the state-of-the-art algorithms, i.e., CoFex, using MapReduce. To do so, an in-depth analysis of its limitations is conducted from the perspectives of efficiency and memory consumption when applying it to large-scale PPI data analysis and prediction. Respective solutions are then devised to overcome these limitations. In particular, we adopt a novel tree-based data structure to reduce the heavy memory consumption caused by the huge sequence information of proteins. After that, its procedure is modified by following the MapReduce framework to perform the prediction task in a distributed manner. A series of extensive experiments have been conducted to evaluate the performance of our framework in terms of both efficiency and accuracy. Experimental results demonstrate that the proposed framework can improve the computational efficiency of CoFex by more than two orders of magnitude while retaining the same high accuracy. |
Source URL | [http://ir.ia.ac.cn/handle/173211/45982] |
Collection | Institute of Automation_Academic Journals_IEEE/CAA Journal of Automatica Sinica |
Recommended citation (GB/T 7714) | Lun Hu,Shicheng Yang,Xin Luo,et al. A Distributed Framework for Large-scale Protein-protein Interaction Data Analysis and Prediction Using MapReduce[J]. IEEE/CAA Journal of Automatica Sinica,2022,9(1):160-172. |
APA | Lun Hu,Shicheng Yang,Xin Luo,Huaqiang Yuan,Khaled Sedraoui,&MengChu Zhou.(2022).A Distributed Framework for Large-scale Protein-protein Interaction Data Analysis and Prediction Using MapReduce.IEEE/CAA Journal of Automatica Sinica,9(1),160-172. |
MLA | Lun Hu,et al."A Distributed Framework for Large-scale Protein-protein Interaction Data Analysis and Prediction Using MapReduce".IEEE/CAA Journal of Automatica Sinica 9.1(2022):160-172. |
Ingest method: OAI harvesting
Source: Institute of Automation