Chinese Academy of Sciences Institutional Repositories Grid
Automatic Extraction of Flooding Control Knowledge from Rich Literature Texts Using Deep Learning

Document type: Journal article

Authors: Zhang, Min [4]; Wang, Juanle [2,3]
Journal: APPLIED SCIENCES-BASEL
Publication date: 2023-02-01
Volume: 13; Issue: 4
Keywords: flood control; knowledge extraction; text mining; deep learning; long tail data
ISSN: 2076-3417
DOI: 10.3390/app13042115
Document subtype: Article
Abstract: Flood control is a global problem; an increasing number of flooding disasters, induced by global climate change and extreme weather events, occur annually. Flood studies are important knowledge sources for flood risk reduction and have been recorded in the academic literature. The main objective of this paper was to acquire flood control knowledge from long-tail data of the literature by using deep learning techniques. Screening was conducted to obtain 4742 flood-related academic documents from the past two decades. Machine learning was conducted to parse the documents, and 347 sample data points from different years were collected for sentence segmentation (approximately 61,000 sentences) and manual annotation. Traditional machine learning (NB, LR, SVM, and RF) and artificial neural network-based deep learning algorithms (Bert, Bert-CNN, Bert-RNN, and ERNIE) were implemented for model training, and complete sentence-level knowledge extraction was conducted in batches. The results revealed that artificial neural network-based deep learning methods exhibit better accuracy than traditional machine learning methods, but their training time is much longer. Based on comprehensive feature extraction capability and computational efficiency, the performances of the deep learning methods were ranked as: ERNIE > Bert-CNN > Bert > Bert-RNN. With Bert as the benchmark model, its variant models each showed applicable characteristics: Bert, Bert-CNN, and Bert-RNN were good at acquiring global features, acquiring local features, and processing variable-length inputs, respectively. ERNIE uses an improved masking mechanism and corpus and therefore exhibited better performance. Finally, 124,196 usage method and 8935 quotation method sentences were obtained in batches. The proportion of method sentences in the literature showed an increasing trend over the last 20 years. Thus, as literature with more method sentences accumulates, this study lays a foundation for knowledge extraction in the future.
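WOS research areas: Chemistry; Engineering; Materials Science; Physics
WOS accession number: WOS:000938760100001
Publisher: MDPI
Source URL: [http://ir.igsnrr.ac.cn/handle/311030/190324]
Collection: State Key Laboratory of Resources and Environmental Information System_Foreign-Language Papers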
Author affiliations:
1. Jiangsu Ctr Collaborat Innovat Geog Informat Resou, Nanjing 210023, Peoples R China
2. China Pakistan Joint Res Ctr Earth Sci, Islamabad 45320, Pakistan
3. Univ Chinese Acad Sci, Coll Resources & Environm, Beijing 100049, Peoples R China
4. Chinese Acad Sci, Inst Geog Sci & Nat Resources Res, State Key Lab Resources & Environm Informat Syst, Beijing 100101, Peoples R China
Recommended citation:
GB/T 7714: Zhang, Min, Wang, Juanle. Automatic Extraction of Flooding Control Knowledge from Rich Literature Texts Using Deep Learning[J]. APPLIED SCIENCES-BASEL, 2023, 13(4).
APA: Zhang, Min, & Wang, Juanle. (2023). Automatic Extraction of Flooding Control Knowledge from Rich Literature Texts Using Deep Learning. APPLIED SCIENCES-BASEL, 13(4).
MLA: Zhang, Min, et al. "Automatic Extraction of Flooding Control Knowledge from Rich Literature Texts Using Deep Learning". APPLIED SCIENCES-BASEL 13.4 (2023).

Ingest method: OAI harvesting

Source: Institute of Geographic Sciences and Natural Resources Research


Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.