中国科学院机构知识库网格
Chinese Academy of Sciences Institutional Repositories Grid
Systematic comparison of five machine-learning models in classification and interpolation of soil particle size fractions using different transformed data

Document type: Journal article

Authors: Zhang, Mo (2,3); Shi, Wenjiao (1,2); Xu, Ziwei (4)
Journal: HYDROLOGY AND EARTH SYSTEM SCIENCES
Publication date: 2020-05-14
Volume: 24; Issue: 5; Pages: 2505-2526
ISSN: 1027-5606
DOI: 10.5194/hess-24-2505-2020
Corresponding author: Shi, Wenjiao (shiwj@lreis.ac.cn)
Abstract: Soil texture and soil particle size fractions (PSFs) play an increasingly important role in physical, chemical, and hydrological processes. Many previous studies have used machine-learning and log-ratio transformation methods for soil texture classification and soil PSF interpolation to improve the prediction accuracy. However, few reports have systematically compared their performance with respect to both classification and interpolation. Here, five machine-learning models, namely K-nearest neighbour (KNN), multilayer perceptron neural network (MLP), random forest (RF), support vector machines (SVM), and extreme gradient boosting (XGB), were combined with the original data and three log-ratio transformation methods, namely additive log ratio (ALR), centred log ratio (CLR), and isometric log ratio (ILR), to evaluate soil texture and PSFs using 640 soil samples from the Heihe River basin (HRB) in China. The results demonstrated that the log-ratio transformations decreased the skewness of soil PSF data. For soil texture classification, RF and XGB showed better performance, with a higher overall accuracy and kappa coefficient. They were also recommended for classifying imbalanced data, according to the area under the precision-recall curve (AUPRC). For soil PSF interpolation, RF delivered the best performance among the five machine-learning models, with the lowest root-mean-square error (RMSE; 15.09 % for sand, 13.86 % for silt, and 6.31 % for clay), mean absolute error (MAE; 10.65 % for sand, 9.99 % for silt, and 5.00 % for clay), Aitchison distance (AD; 0.84), and standardized residual sum of squares (STRESS; 0.61), and the highest Spearman rank correlation coefficient (RCC; 0.69 for sand, 0.67 for silt, and 0.69 for clay). STRESS was improved by using log-ratio methods, especially CLR and ILR.
Prediction maps from direct and indirect classification were similar in the middle and upper reaches of the HRB. However, indirect classification maps using log-ratio-transformed data provided more detailed information in the lower reaches of the HRB. The kappa coefficient improved markedly, by 21.3 %, when indirect rather than direct methods were used for soil texture classification. RF was recommended as the best strategy among the five machine-learning models, based on the accuracy evaluation of the soil PSF interpolation and soil texture classification, and ILR was recommended for component-wise machine-learning models without multivariate treatment, considering the constrained nature of compositional data. In addition, XGB was preferred over the other models when the trade-off between accuracy and runtime was considered. Our findings provide a reference for future work on the spatial prediction of soil PSFs and texture using machine-learning models with skewed distributions of soil PSF data over a large area.
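The three log-ratio transformations named in the abstract can be illustrated with a short NumPy sketch. This is not the authors' code: the function names, the choice of clay as the ALR denominator, and the particular orthonormal basis used for ILR are assumptions made here for illustration, and the Aitchison distance is the textbook definition, which may differ in detail from the AD metric reported in the paper. All parts are assumed strictly positive, since the logarithm of zero is undefined.

```python
import numpy as np


def closure(x):
    """Rescale each composition so that its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)


def alr(x):
    """Additive log ratio: log of each part over the last part (assumed: clay)."""
    x = closure(x)
    return np.log(x[..., :-1] / x[..., -1:])


def clr(x):
    """Centred log ratio: log of each part over the geometric mean of all parts."""
    log_x = np.log(closure(x))
    return log_x - log_x.mean(axis=-1, keepdims=True)


def ilr_basis(d):
    """One possible orthonormal basis (d x (d-1)) of the clr hyperplane."""
    v = np.zeros((d, d - 1))
    for i in range(d - 1):
        r = d - i - 1  # number of parts after part i
        v[i, i] = np.sqrt(r / (r + 1))
        v[i + 1:, i] = -1.0 / np.sqrt(r * (r + 1))
    return v


def ilr(x):
    """Isometric log ratio: clr coordinates projected onto the basis above."""
    x = np.asarray(x, dtype=float)
    return clr(x) @ ilr_basis(x.shape[-1])


def ilr_inverse(z):
    """Map ILR coordinates back to a composition that sums to 1."""
    z = np.asarray(z, dtype=float)
    return closure(np.exp(z @ ilr_basis(z.shape[-1] + 1).T))


def aitchison_distance(x, y):
    """Euclidean distance between two compositions in clr space."""
    return np.linalg.norm(clr(x) - clr(y), axis=-1)


# Hypothetical sand/silt/clay percentages, not data from the study.
observed = np.array([40.0, 40.0, 20.0])
predicted = np.array([35.0, 45.0, 20.0])

print("ALR:", alr(observed))
print("CLR:", clr(observed))
print("ILR:", ilr(observed))
print("AD :", aitchison_distance(observed, predicted))
```

Because `ilr_inverse` always returns parts that are positive and sum to 1, a model fitted separately to each ILR coordinate still yields a valid composition after back-transformation, which is one way to read the abstract's recommendation of ILR for component-wise models.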
WOS keywords: LOG-RATIO TRANSFORMATION ; SUPPORT VECTOR MACHINES ; COMPOSITIONAL DATA ; ORGANIC-CARBON ; SPATIAL PREDICTION ; PRECISION-RECALL ; REGRESSION TREE ; NEURAL-NETWORK ; TEXTURE ; REGION
Funding projects: National Key Research and Development Program of China[2017YFA0604703] ; National Natural Science Foundation of China[41771364] ; National Natural Science Foundation of China[41771111] ; Fund for Excellent Young Talents in Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences (CAS)[2016RC201] ; Youth Innovation Promotion Association, CAS[2018071] ; Investigation and Monitoring project of Ministry of Natural Resources[JCQQ191504-06] ; State Key Laboratory of Resources and Environmental Information System
WOS research areas: Geology ; Water Resources
Language: English
Publisher: COPERNICUS GESELLSCHAFT MBH
WOS accession number: WOS:000535260900001
Funding organizations: National Key Research and Development Program of China ; National Natural Science Foundation of China ; Fund for Excellent Young Talents in Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences (CAS) ; Youth Innovation Promotion Association, CAS ; Investigation and Monitoring project of Ministry of Natural Resources ; State Key Laboratory of Resources and Environmental Information System
Source URL: [http://ir.igsnrr.ac.cn/handle/311030/159536]
Collection: Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences
Author affiliations:
1. Univ Chinese Acad Sci, Coll Resources & Environm, Beijing 100049, Peoples R China
2. Chinese Acad Sci, Inst Geog Sci & Nat Resources Res, Key Lab Land Surface Pattern & Simulat, State Key Lab Resources & Environm Informat Syst, Beijing 100101, Peoples R China
3. China Univ Geosci, Sch Earth Sci & Resources, Beijing 100083, Peoples R China
4. Beijing Normal Univ, Fac Geog Sci, State Key Lab Earth Surface Proc & Resource Ecol, Beijing 100875, Peoples R China
Recommended citation
GB/T 7714
Zhang, Mo; Shi, Wenjiao; Xu, Ziwei. Systematic comparison of five machine-learning models in classification and interpolation of soil particle size fractions using different transformed data[J]. HYDROLOGY AND EARTH SYSTEM SCIENCES, 2020, 24(5): 2505-2526.
APA: Zhang, Mo; Shi, Wenjiao; & Xu, Ziwei. (2020). Systematic comparison of five machine-learning models in classification and interpolation of soil particle size fractions using different transformed data. HYDROLOGY AND EARTH SYSTEM SCIENCES, 24(5), 2505-2526.
MLA: Zhang, Mo, et al. "Systematic comparison of five machine-learning models in classification and interpolation of soil particle size fractions using different transformed data". HYDROLOGY AND EARTH SYSTEM SCIENCES 24.5 (2020): 2505-2526.

Ingest method: OAI harvesting

Source: Institute of Geographic Sciences and Natural Resources Research
