Chinese Academy of Sciences Institutional Repositories Grid
A Top-Down Binary Hierarchical Topic Model for Biomedical Literature

Document type: Journal article

Authors: Lin, Xiaoguang [1,2,3]; Liu, Mingxuan [1,2]; Zhang, Ju [1,2]
Journal: IEEE ACCESS
Publication date: 2020
Volume: 8; Pages: 59870-59882
Keywords: Topic model; topic hierarchy; binary modality; biomedical literature; text mining
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.2983265
Corresponding author: Zhang, Ju (zhangju@cigit.ac.cn)
Abstract: Over the past two decades, a number of advances in topic modeling have produced sophisticated models that are capable of generating topic hierarchies. In particular, hierarchical Latent Dirichlet Allocation (hLDA) builds a topic tree based on the nested Chinese Restaurant Process (nCRP) or other sampling processes to generate a topic hierarchy that allows arbitrarily large branch structures and adaptive dataset growth. In addition, hierarchical topic models based on the latent tree model, such as Hierarchical Latent Tree Analysis (HLTA), have been developed over the last five years. However, these models do not work well in cases with millions of documents and hundreds of thousands of terms. In addition, the topic trees generated by these models are always poorly interpretable, and the relationships among topics in different levels are relatively simple. The biomedical literature, including Medline abstracts, has large-scale documents in two major categories: biological laboratory research and medical clinical research. We propose a top-down binary hierarchical topic model (biHTM) for biomedical literature by iteratively applying a flat topic model and adaptively processing subtrees of the hierarchy. The biHTM topic hierarchy of complete Medline abstracts with more than 14 topic node levels shows good bimodality and interpretability. Compared to hLDA and HLTA, biHTM shows promising results in experiments assessed in terms of runtime and quality.
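The top-down idea in the abstract (apply a flat model to a document set, split it in two, then recurse into each subtree) can be illustrated with a minimal sketch. This is not the authors' biHTM implementation: the abstract does not say which flat topic model is used or how subtrees are processed adaptively. Here the flat step is replaced by a spherical 2-means clustering on bag-of-words counts, and the stopping rules (`max_depth`, `min_size`) as well as every function name are assumptions made for illustration.

```python
import math
from collections import Counter


def vectorize(doc):
    """L2-normalized bag-of-words vector for a whitespace-tokenized document."""
    counts = Counter(doc.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {w: c / norm for w, c in counts.items()}


def cosine(u, v):
    """Dot product of two sparse vectors (cosine if both are unit length)."""
    return sum(x * v.get(w, 0.0) for w, x in u.items())


def centroid(vectors):
    """Unit-length mean direction of a list of sparse vectors."""
    total = Counter()
    for v in vectors:
        for w, x in v.items():
            total[w] += x
    norm = math.sqrt(sum(x * x for x in total.values())) or 1.0
    return {w: x / norm for w, x in total.items()}


def flat_two_way_split(vectors, iters=10):
    """Stand-in for the flat model (assumption): spherical 2-means.

    Seeds deterministically with the first document and the document
    least similar to it, then alternates assignment and centroid updates.
    """
    first = vectors[0]
    farthest = min(vectors, key=lambda v: cosine(first, v))
    centers = [first, farthest]
    labels = None
    for _ in range(iters):
        new_labels = [
            0 if cosine(v, centers[0]) >= cosine(v, centers[1]) else 1
            for v in vectors
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for k in (0, 1):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centers[k] = centroid(members)
    return labels


def top_terms(vectors, k=3):
    """Highest-weight centroid terms, ties broken alphabetically."""
    c = centroid(vectors)
    return [w for w, _ in sorted(c.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]


def build_tree(docs, ids=None, depth=0, max_depth=3, min_size=2):
    """Recursively split a document set into a binary topic tree."""
    if ids is None:
        ids = list(range(len(docs)))
    vectors = [vectorize(docs[i]) for i in ids]
    node = {"docs": ids, "top_terms": top_terms(vectors), "children": []}
    # Hypothetical stopping rules: depth limit and minimum node size.
    if depth >= max_depth or len(ids) < 2 * min_size:
        return node
    labels = flat_two_way_split(vectors)
    left = [i for i, lab in zip(ids, labels) if lab == 0]
    right = [i for i, lab in zip(ids, labels) if lab == 1]
    if len(left) < min_size or len(right) < min_size:
        return node  # split too unbalanced; keep this node as a leaf
    node["children"] = [
        build_tree(docs, left, depth + 1, max_depth, min_size),
        build_tree(docs, right, depth + 1, max_depth, min_size),
    ]
    return node
```

Calling `build_tree` on a list of abstract strings returns a nested dictionary in which each node holds the document indices, a few high-weight terms, and zero or two children. In a realistic setting the two-way split would be replaced by a proper flat topic model with two topics, and the stopping rule by whatever adaptive criterion the paper defines.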
WOS research areas: Computer Science; Engineering; Telecommunications
Language: English
WOS accession number: WOS:000527413100019
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Source URL: [http://119.78.100.138/handle/2HOD01W0/10920]
Collection: Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences
Corresponding author: Zhang, Ju
Author affiliations:
1. Chinese Acad Sci, Chongqing Inst Green & Intelligent Technol, Chongqing 400714, Peoples R China
2. Univ Chinese Acad Sci, Beijing 100049, Peoples R China
3. Chinese Acad Sci, Chengdu Inst Comp Applicat, Chengdu 610041, Peoples R China
Recommended citation:
GB/T 7714: Lin, Xiaoguang, Liu, Mingxuan, Zhang, Ju. A Top-Down Binary Hierarchical Topic Model for Biomedical Literature[J]. IEEE ACCESS, 2020, 8: 59870-59882.
APA: Lin, Xiaoguang, Liu, Mingxuan, & Zhang, Ju. (2020). A Top-Down Binary Hierarchical Topic Model for Biomedical Literature. IEEE ACCESS, 8, 59870-59882.
MLA: Lin, Xiaoguang, et al. "A Top-Down Binary Hierarchical Topic Model for Biomedical Literature." IEEE ACCESS 8 (2020): 59870-59882.

Ingestion method: OAI harvesting

Source: Chongqing Institute of Green and Intelligent Technology


Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.