Chinese Academy of Sciences Institutional Repositories Grid
Chinese Short Text Classification Based on Domain Knowledge

Document type: Conference paper

Authors: Xiao, Feng (1); Yang, Shen (2); Chengyong, Liu (3); Wei, Liang (1); Shuwu, Zhang (1)
Publication date: 2013-10
Conference name: International Joint Conference on Natural Language Processing
Conference date: 2013-10-14
Conference location: Nagoya, Japan
Keywords: Text Classification; Short Text; Domain Knowledge
Abstract (English):
People are generating more and more short texts. There is an urgent demand to classify short texts into different domains. Due to the shortness and sparseness of short texts, conventional methods based on Vector Space Model (VSM) have limitations. To tackle the data scarcity problem, we propose a new model to directly measure the correlation between a short text instance and a domain instead of representing short texts as vectors of weights. We firstly draw domain knowledge for each user-defined domain using an external corpus of longer documents. Secondly, the correlation is calculated by measuring the proportion of the overlapping part of the instance and the domain knowledge. Finally, if the correlation is greater than a threshold, the instance will be classified into the domain. Experimental results show that the classifier based on the proposed model outperforms the state-of-the-art baselines based on VSM.
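The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how domain knowledge is represented or how the overlap proportion is normalized, so the sketch assumes that domain knowledge is a plain set of terms, that the short text is already word-segmented, and that the correlation is the share of the instance's distinct tokens found in that set. The names `correlation` and `classify`, the sample term sets, and the 0.5 threshold are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def correlation(tokens: Iterable[str], domain_terms: Set[str]) -> float:
    """Share of the instance's distinct tokens that also occur in the domain term set.

    Assumption: this normalization (overlap / instance size) is one plausible
    reading of "proportion of the overlapping part"; the abstract does not fix it.
    """
    unique = set(tokens)
    if not unique:
        return 0.0
    return len(unique & domain_terms) / len(unique)


def classify(
    tokens: Iterable[str],
    domain_knowledge: Dict[str, Set[str]],
    threshold: float = 0.5,  # hypothetical value, not taken from the source
) -> List[str]:
    """Return every domain whose correlation with the instance exceeds the threshold."""
    token_list = list(tokens)
    return [
        domain
        for domain, terms in domain_knowledge.items()
        if correlation(token_list, terms) > threshold
    ]


# Hypothetical domain knowledge; in the paper it is drawn from an external
# corpus of longer documents, which this sketch does not model.
knowledge = {
    "sports": {"match", "goal", "team"},
    "finance": {"stock", "bank", "market"},
}

print(classify(["team", "goal", "tonight"], knowledge))  # ['sports']
```

Because each domain is tested against the threshold independently, an instance can be assigned to several domains or to none, which matches the per-domain decision described in the abstract.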
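Indexed by: Other
Proceedings: In Proceedings of the 6th International Joint Conference on Natural Language Processing (IJCNLP), pp. 859–863
Source URL: [http://ir.ia.ac.cn/handle/173211/11229]
Collection: Research Center for Digital Content Technology and Services_New Media Service and Management Technology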
Author affiliations:
1. Institute of Automation, Chinese Academy of Sciences
2. State Administration for Industry & Commerce of the People's Republic of China
3. Information Center of General Administration of Press and Publication of PR China
Recommended citation
GB/T 7714
Xiao, Feng, Yang, Shen, Chengyong, Liu, et al. Chinese Short Text Classification Based on Domain Knowledge[C]. In: International Joint Conference on Natural Language Processing. Nagoya, Japan. 2013-10-14.

Ingest method: OAI harvesting

Source: Institute of Automation


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.