中国农业科学


基于机器学习方法的黄土区土壤质地空间分布研究——以宁夏南部为例

申哲,张认连,龙怀玉,徐爱国   

  1. 中国农业科学院农业资源与农业区划研究所,北京 100081
  • 出版日期:2021-11-11 发布日期:2021-11-11

Research on the Spatial Distribution of Soil Texture in the Loess Area Based on Machine Learning: Taking Southern Ningxia as an Example

SHEN Zhe, ZHANG RenLian, LONG HuaiYu, XU AiGuo   

  1. Institute of Agricultural Resources and Regional Planning, Chinese Academy of Agricultural Sciences, Beijing 100081
  • Published:2021-11-11 Online:2021-11-11

摘要: 【目的】基于历史数据,利用机器学习方法分析黄土区土壤质地空间变异规律及其与环境因素之间的关系。【方法】以宁夏自治区南部为研究区域,基于该区428个20世纪80年代第二次土壤普查土壤剖面点数据,采用分类回归树(CART)和随机森林(RF)两种机器学习方法,结合地形因子、土壤类型、归一化植被指数,探究与宁南地区土壤质地分布相关性较强的环境因素,并用两种机器学习方法预测该区土壤质地类型的空间分布,用剖面点验证集数据以及宁夏自治区海原县实测样点数据验证模型精度。【结果】(1)RF和CART对剖面点验证集土壤质地类型的预测正确率分别为62.36%和55.29%,接收者操作特性(Receiver Operating Characteristic,ROC)曲线下面积(Area Under ROC Curve,AUC)分别为0.7515和0.6933,对海原县122个实测样点的预测正确率分别为54.10%和48.36%,AUC分别为0.6599和0.5981,RF的预测精度高于CART。(2)该区土壤类型(ST)是与土壤质地空间分布相关性最强的环境因素,其次是高程(Ele),高程越高,土壤质地越黏重。风力作用指数(WEI)和坡度(Slo)对土壤质地的影响较小;(3)研究区土壤质地类型以轻壤土为主,空间分布格局基本呈现为南部土壤质地黏重,北部土壤质地较轻。【结论】RF更适合预测宁南地区土壤质地的空间分布,且充分利用历史数据,结合新的野外采样,可以达到预测制图的精度要求;土壤类型(ST)和高程(Ele)是与土壤质地空间分布相关性较强的环境因素。


关键词: 土壤质地, 空间分布, 因素分析, 随机森林, 宁夏

Abstract: 【Objective】Based on historical soil data, this study used machine learning to analyze the spatial variability of soil texture in the loess region and its relationship with environmental factors. 【Method】Southern Ningxia was taken as the study area. Based on 428 soil profiles from the second soil survey in the 1980s, classification and regression tree (CART), random forest (RF) and traditional statistical methods were combined with topographic factors, soil type and the normalized difference vegetation index (NDVI) to identify the main environmental factors related to soil texture type and to predict the spatial distribution of soil texture types. Model accuracy was verified against the validation set of soil profiles and against measured soil samples from Haiyuan County, Ningxia. 【Result】(1) On the validation set of soil profiles, RF and CART predicted soil texture type with accuracies of 62.36% and 55.29% and areas under the receiver operating characteristic (ROC) curve (AUC) of 0.7515 and 0.6933, respectively. On the 122 measured soil samples from Haiyuan County, the accuracies were 54.10% and 48.36% and the AUCs were 0.6599 and 0.5981, respectively, so RF was more accurate than CART on both data sets. (2) Soil type (ST) was the most important predictor variable, followed by elevation (Ele): the higher the elevation, the heavier the soil texture. The wind exposition index (WEI) and slope (Slo) had comparatively little influence. (3) Light loam was the dominant texture type, and both methods predicted heavy soil texture in the south of the study area and lighter texture in the north. 【Conclusion】RF predicted soil texture type in southern Ningxia more accurately than CART. Making full use of historical data, combined with new field sampling, can meet the accuracy requirements of digital mapping. In the loess region, soil type and elevation are the environmental factors strongly correlated with the spatial variation of soil texture.
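
The two evaluation measures reported above, classification accuracy and AUC, can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not say how AUC was extended to more than two texture classes, so the macro-averaged one-vs-rest scheme used here is an assumption, as are the function names, the class labels other than light loam, and the toy probabilities. The final loop only checks which whole-number hit counts out of 122 samples are consistent with the reported 54.10% and 48.36%; those counts are inferred, not stated in the source.

```python
from itertools import product


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the observed class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_auc(labels, scores):
    """AUC as the probability that a random positive sample outscores a
    random negative sample (ties count as one half)."""
    pos = [s for is_pos, s in zip(labels, scores) if is_pos]
    neg = [s for is_pos, s in zip(labels, scores) if not is_pos]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true, proba, classes):
    """Unweighted mean of one-vs-rest AUCs (an assumed multi-class scheme).
    proba[i][k] is the predicted score of classes[k] for sample i."""
    aucs = []
    for k, cls in enumerate(classes):
        labels = [t == cls for t in y_true]
        scores = [row[k] for row in proba]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)


# Hypothetical toy data: three texture classes and six samples.
classes = ["light loam", "medium loam", "heavy loam"]
y_true = ["light loam", "light loam", "medium loam",
          "heavy loam", "medium loam", "light loam"]
proba = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.3, 0.6, 0.1],
    [0.2, 0.3, 0.5],
    [0.5, 0.4, 0.1],
    [0.6, 0.3, 0.1],
]
# Predicted class = class with the highest score in each row.
y_pred = [classes[row.index(max(row))] for row in proba]

print("accuracy:", accuracy(y_true, y_pred))
print("macro one-vs-rest AUC:", macro_ovr_auc(y_true, proba, classes))

# Which hit counts out of 122 Haiyuan samples reproduce the reported rates?
consistent_hits = {}
for reported in (54.10, 48.36):
    consistent_hits[reported] = [k for k in range(123)
                                 if abs(100 * k / 122 - reported) < 0.005]
    print(reported, consistent_hits[reported])
```

Under these assumptions, an accuracy near 50% combined with an AUC near 0.6, as reported for CART on the Haiyuan samples, indicates a model that ranks the correct class only modestly better than chance, which is the sense in which RF outperforms it.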


Key words: soil texture, spatial distribution, factor analysis, random forest, Ningxia