A LDA Feature Grouping Method for Subspace Clustering of Text Data
Document type: Conference paper
Authors | Cai, Yeshou; Chen, Xiaojun; Peng, Patrick Xiaogang; Huang, Joshua Zhexue |
Publication date | 2014 |
Conference name | 2014 Pacific Asia Workshop on Intelligence and Security Informatics, PAISI 2014 |
Conference location | Tainan, Taiwan |
Abstract | This paper proposes a feature grouping method for clustering text data. In this method, the vector space model is used to represent a set of documents. The LDA algorithm is applied to the text data to generate groups of features as topics. The topics are treated as group features, which enables the recently published subspace clustering algorithm FG-k-means to cluster high-dimensional text data with two-level features: the word level and the group level. In generating the group-level features with LDA, an entropy-based word filtering method is proposed to remove words with low probabilities in the word distribution of the corresponding topics. Experiments were conducted on three real-life text data sets to compare the new method with three existing clustering algorithms. The experimental results show that the new method improved clustering performance in comparison with the other methods. © 2014 Springer International Publishing. (20 refs) |
Indexed by | EI |
Language | English |
Source URL | [http://ir.siat.ac.cn:8080/handle/172644/6035] |
Collection | Shenzhen Institutes of Advanced Technology_Digital Institute (数字所) |
Author affiliation | 2014 |
Recommended citation (GB/T 7714) | Cai, Yeshou, Chen, Xiaojun, Peng, Patrick Xiaogang, et al. A LDA Feature Grouping Method for Subspace Clustering of Text Data[C]. In: 2014 Pacific Asia Workshop on Intelligence and Security Informatics, PAISI 2014. Tainan, Taiwan. |
Ingest method: OAI harvesting
Source: Shenzhen Institutes of Advanced Technology