Heuristics based semantic annotation of biodiversity documents in Chinese
Document type: Journal article
Authors | DUAN Yufeng ; HEI Zhenzhen ; JU Fei ; CUI Hong |
Journal | Chinese Journal of Library and Information Science
Publication date | 2013-06-25 |
Volume | 6; Issue: 2; Pages: 33-46 |
Keywords | Heuristics based method; Leading word analysis; Taxonomic descriptions; Semantic annotation |
ISSN | 1674-3393 |
Corresponding author | DUAN Yufeng (e-mail: yfduan@infor.ecnu.edu.cn) |
Abstract | Purpose: To design an efficient, high-performance algorithm for semantic annotation of biodiversity documents in Chinese. Design/methodology/approach: The data set consists of 1,000 randomly selected documents from Flora of China. The proposed approach was evaluated comparatively against the naïve Bayes algorithm that had been developed earlier for the same purpose. Findings: Experimental results show that the heuristics based algorithm outperformed the naïve Bayes algorithm. The use of leading words helped improve annotation performance, while prioritizing rule application based on rule weights had no significant impact on algorithm performance. Research limitations: ICTCLAS was used off the shelf to identify word boundaries, without optimization for the biodiversity domain, so the tool may not have been used to its best effect. Practical implications & originality/value: The heuristics based approach, enhanced by leading word analysis, reached an F value of 0.9216, which is sufficiently accurate for practical use. |
Subject | Editing and publishing |
Funding | This work is jointly supported by the National Social Science Foundation of China (Grant No.: 11BTQ024) and the Foundation for Humanities and Social Sciences of the Chinese Ministry of Education (Grant No.: 10YJC87004) |
Original source | http://www.chinalibraries.net |
Date made public | 2013-08-08 |
Source URL | http://ir.las.ac.cn/handle/12502/6238 |
Collection | Documentation and Information Center_Journal of Data and Information Science_Chinese Journal of Library and Information Science-2013 |
Recommended citation GB/T 7714 | DUAN Yufeng, HEI Zhenzhen, JU Fei, et al. Heuristics based semantic annotation of biodiversity documents in Chinese[J]. Chinese Journal of Library and Information Science, 2013, 6(2): 33-46. |
APA | DUAN Yufeng, HEI Zhenzhen, JU Fei, & CUI Hong. (2013). Heuristics based semantic annotation of biodiversity documents in Chinese. Chinese Journal of Library and Information Science, 6(2), 33-46. |
MLA | DUAN Yufeng, et al. "Heuristics based semantic annotation of biodiversity documents in Chinese." Chinese Journal of Library and Information Science 6.2 (2013): 33-46. |
Ingest method: OAI harvesting
Source: Documentation and Information Center