Chinese Academy of Sciences Institutional Repositories Grid
Research of Chinese Text Classification Methods Based on Semantic Vector and Semantic Similarity

Document type: Conference paper

Authors: Song Xin; Huang Jia; Zhou Jing-Min; Chen Xi
Publication date: 2009
Conference name: 2009 International Forum on Computer Science-Technology and Applications, IFCSTA 2009
Conference date: 40879
Conference location: Chongqing, China
Keywords: Computer science; Information retrieval systems; Knowledge representation; Semantics; Vector spaces; Vectors
Pages: 187-190
Abstract: To overcome the limitations of traditional text classification approaches based on bag-of-words representation, and to incorporate linguistic knowledge and conceptual indexing into the text vector space model, we draw on two thesauri, HowNet and Tongyici Cilin (hereinafter referred to as Cilin), and describe a document with a semantic vector instead of a traditional keyword vector. The semantic vector is built by merging words with high similarity and using a single concept, rather than a series of words, to describe each semantic feature. This not only reduces the feature dimension but also adds semantic information to the vector. We also classify texts using sentence (document) similarity based on simple vector distance, and carry out three groups of experiments. The results show that accuracy rates generally improve with semantic treatment, which indicates that semantic mining is important and necessary for text classification. © 2009 IEEE.
Indexed by: EI
Conference sponsor: IITAA - International Information Technology and Applications Association
Proceedings: IFCSTA 2009 Proceedings - 2009 International Forum on Computer Science-Technology and Applications
Proceedings place of publication: United States
Language: English
ISBN: 9780769539300
Source URL: [http://124.16.136.157/handle/311060/8434]
Collection: Institute of Software_Institute of Software Library_2009 Journal/Conference Papers
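
The approach summarized in the abstract (merging similar words into a concept, then classifying by vector similarity) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `WORD_TO_CONCEPT` table is a hypothetical toy stand-in for concept groups that would be derived from HowNet or Cilin, the function names are invented for illustration, and cosine similarity against per-class centroids is an assumed choice for the "simple vector distance" the abstract mentions.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy thesaurus mapping each word to a concept label.
# A real system would derive such groups from HowNet or Cilin word similarity.
WORD_TO_CONCEPT = {
    "电脑": "COMPUTER", "计算机": "COMPUTER",
    "软件": "SOFTWARE", "程序": "SOFTWARE",
    "足球": "BALL_SPORT", "篮球": "BALL_SPORT",
    "比赛": "CONTEST", "竞赛": "CONTEST",
}


def semantic_vector(tokens):
    """Count concepts instead of surface words; unknown words stand for themselves."""
    return Counter(WORD_TO_CONCEPT.get(t, t) for t in tokens)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(labeled_docs):
    """Sum the semantic vectors of each class (cosine ignores the overall scale)."""
    centroids = {}
    for label, tokens in labeled_docs:
        centroids.setdefault(label, Counter()).update(semantic_vector(tokens))
    return centroids


def classify(tokens, centroids):
    """Return the class whose centroid is most similar to the document vector."""
    vec = semantic_vector(tokens)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Tiny pre-tokenized training set, invented for illustration only.
TRAINING = [
    ("tech", ["电脑", "软件"]),
    ("tech", ["计算机", "程序"]),
    ("sports", ["足球", "比赛"]),
    ("sports", ["篮球", "竞赛"]),
]
CENTROIDS = train_centroids(TRAINING)
```

In this sketch, four distinct words such as 电脑, 计算机, 软件, and 程序 collapse into two concept dimensions, which mirrors the dimension reduction the abstract describes. Calling `classify(["计算机", "软件"], CENTROIDS)` selects the class whose concept profile best matches the document.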
Recommended citation
GB/T 7714
Song Xin, Huang Jia, Zhou Jing-Min, et al. Research of Chinese Text Classification Methods Based on Semantic Vector and Semantic Similarity[C]. In: 2009 International Forum on Computer Science-Technology and Applications, IFCSTA 2009. Chongqing, China. 40879.

Ingestion method: OAI harvesting

Source: Institute of Software


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.