Chinese Short Text Classification Based on Domain Knowledge
Document type: Conference paper
Authors | Xiao, Feng1; Yang, Shen2; Chengyong, Liu3; Wei, Liang1; Shuwu, Zhang1 |
Publication date | 2013-10 |
Conference name | International Joint Conference on Natural Language Processing |
Conference date | 2013-10-14 |
Conference location | Nagoya, Japan |
Keywords | Text Classification; Short Text; Domain Knowledge |
Abstract (English) |
People are generating more and more short texts. There is an urgent demand to classify short texts into different domains. Due to the shortness and sparseness of short texts, conventional methods based on Vector Space Model (VSM) have limitations. To tackle the data scarcity problem, we propose a new model to directly measure the correlation between a short text instance and a domain instead of representing short texts as vectors of weights. We firstly draw domain knowledge for each user-defined domain using an external corpus of longer documents. Secondly, the correlation is calculated by measuring the proportion of the overlapping part of the instance and the domain knowledge. Finally, if the correlation is greater than a threshold, the instance will be classified into the domain. Experimental results show that the classifier based on the proposed model outperforms the state-of-the-art baselines based on VSM. |
Indexed in | Other |
Proceedings | In Proceedings of the 6th International Joint Conference on Natural Language Processing (IJCNLP), pp. 859–863 |
Source URL | [http://ir.ia.ac.cn/handle/173211/11229] |
Collection | Research Center for Digital Content Technology and Services_New Media Service and Management Technology |
Author affiliations | 1. Institute of Automation, Chinese Academy of Sciences; 2. State Administration for Industry & Commerce of the People's Republic of China; 3. Information Center of General Administration of Press and Publication of PR China |
Recommended citation (GB/T 7714) | Xiao, Feng,Yang, Shen,Chengyong, Liu,et al. Chinese Short Text Classification Based on Domain Knowledge[C]. In: International Joint Conference on Natural Language Processing. Nagoya, Japan. 2013-10-14. |
Ingestion method: OAI harvesting
Source: Institute of Automation
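
The three steps named in the abstract (drawing domain knowledge from longer documents, measuring the overlap between an instance and that knowledge, and applying a threshold) can be illustrated with a minimal Python sketch. This record does not state how the paper extracts domain knowledge or defines the correlation precisely, so everything below is an assumption for illustration only: domain knowledge is taken to be the most frequent tokens of a domain's documents, the correlation is the share of an instance's distinct tokens found in that set, and texts are assumed to be already segmented into tokens. The function names, `top_k`, and the default `threshold` are hypothetical.

```python
from collections import Counter


def draw_domain_knowledge(documents, top_k=2):
    """Assumption: domain knowledge is the set of the top_k most frequent
    tokens across a domain's longer documents (each a list of tokens)."""
    counts = Counter(token for doc in documents for token in doc)
    return {token for token, _ in counts.most_common(top_k)}


def correlation(instance_tokens, domain_terms):
    """Assumed measure: proportion of the instance's distinct tokens
    that also appear in the domain knowledge."""
    tokens = set(instance_tokens)
    if not tokens:
        return 0.0
    return len(tokens & domain_terms) / len(tokens)


def classify(instance_tokens, knowledge_by_domain, threshold=0.5):
    """Return every domain whose correlation with the instance
    is strictly greater than the (hypothetical) threshold."""
    return [
        domain
        for domain, terms in knowledge_by_domain.items()
        if correlation(instance_tokens, terms) > threshold
    ]


# Toy, made-up corpora of pre-tokenized "longer documents" per domain.
corpus = {
    "sports": [["match", "team", "goal"], ["team", "goal", "coach"], ["team", "score"]],
    "finance": [["stock", "market", "bank"], ["stock", "bank", "loan"], ["stock", "fund"]],
}
knowledge = {domain: draw_domain_knowledge(docs) for domain, docs in corpus.items()}
labels = classify(["team", "goal", "tonight"], knowledge)
```

Because each domain is tested against the threshold independently, an instance can receive several labels or none, which matches the abstract's wording that an instance is classified into a domain whenever its correlation exceeds the threshold.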