Journal of Integrative Agriculture  2022, Vol. 21 Issue (8): 2422-2434    DOI: 10.1016/S2095-3119(21)63692-4
Special Issue: Agro-ecosystem & Environment - Remote sensing
Predicting soil depth in a large and complex area using machine learning and environmental correlations

LIU Feng1, 2, YANG Fei1, ZHAO Yu-guo1, 2, ZHANG Gan-lin1, 2, 3, LI De-cheng1 

1 State Key Laboratory of Soil and Sustainable Agriculture, Institute of Soil Science, Chinese Academy of Sciences, Nanjing 210008, P.R.China

2 University of the Chinese Academy of Sciences, Beijing 100049, P.R.China

3 Key Laboratory of Watershed Geographic Science, Nanjing Institute of Geography and Limnology, Chinese Academy of Sciences, Nanjing 210008, P.R.China


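This study built an ensemble machine learning model that directly estimates spatial uncertainty, namely quantile regression forest, to quantify the relationship between soil depth and environmental conditions. Combining the model with a rich set of environmental covariates, it predicted the spatial distribution of soil depth in the 140 000 km² Heihe River basin in northwestern China and estimated the spatial uncertainty of the resulting map. A total of 275 soil depth observation samples and 26 environmental covariates were used. The model achieved a predictive accuracy of R² = 0.587 and RMSE = 2.98 cm (square root scale), explaining almost 60% of the soil depth variation. The soil depth map clearly showed both the regional distribution pattern and local details of soil depth. Soils were deeper in low and flat landscape positions such as valley bottoms and plains, and shallower in high and steep positions such as hillslopes, ridges and terraces. Soils inside the oases were markedly deeper than in the desert areas outside them, the middle of an alluvial plain had markedly deeper soils than its margins, and the middle of a lacustrine plain had markedly shallower soils than its margins. High predictive uncertainty mainly occurred in areas with poor accessibility and few samples. The analysis found that pedogenic and geomorphic processes jointly shaped the spatial pattern of soil depth in the basin, with geomorphic processes playing the dominant role. This may also hold for similar "alpine mountain, plain oasis, desert Gobi" basins in other cold and arid regions of the world.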


Soil depth is critical for eco-hydrological modeling, carbon storage calculation and land evaluation. However, its spatial variation is poorly understood and rarely mapped. Predicting soil depth across a large area of complex landscapes from a limited number of sparse samples remains a challenge. This study constructed an ensemble machine learning model, i.e., quantile regression forest, to quantify the relationship between soil depth and environmental conditions. The model was then combined with a rich set of environmental covariates to predict the spatial variation of soil depth and to directly estimate the associated predictive uncertainty in the 140 000 km² Heihe River basin of northwestern China. A total of 275 soil depth observation points and 26 covariates were used. The model achieved a coefficient of determination (R²) of 0.587 and a root mean square error (RMSE) of 2.98 cm (square root scale), i.e., it explained almost 60% of the soil depth variation. The resulting soil depth map clearly exhibited regional patterns as well as local details. Relatively deep soils occurred in low-lying landscape positions such as valley bottoms and plains, while shallow soils occurred in high and steep landscape positions such as hillslopes, ridges and terraces. The oases had much deeper soils than the semi-desert areas outside them, the middle of an alluvial plain had deeper soils than its margins, and the middle of a lacustrine plain had shallower soils than its margins. Large predictive uncertainty mainly occurred in areas lacking soil survey points. Both pedogenic and geomorphic processes shaped the soil depth pattern of this basin, but the latter were dominant. These findings may be applicable to similar basins in other cold and arid regions around the world.
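
The way a quantile regression forest yields an uncertainty estimate alongside each prediction can be illustrated with a minimal sketch. It is not the authors' code, and every name and number in it is hypothetical. It assumes the standard formulation, in which each training observation is weighted by how often it shares a leaf with the query location across the trees, and quantiles are read from the resulting weighted empirical distribution of observed depths. The width of the interval between a lower and an upper quantile then serves as the measure of predictive uncertainty.

```python
"""Minimal sketch of quantile regression forest (QRF) prediction intervals.

All names and numbers below are hypothetical illustrations, not the
study's data or code.
"""


def qrf_weights(leaf_members_per_tree, n_train):
    """Average, over trees, each training sample's share of the leaf
    that the query location falls into."""
    weights = [0.0] * n_train
    for members in leaf_members_per_tree:
        share = 1.0 / len(members)  # a leaf always holds >= 1 sample
        for i in members:
            weights[i] += share
    n_trees = len(leaf_members_per_tree)
    return [w / n_trees for w in weights]


def weighted_quantile(values, weights, q):
    """Smallest value whose weighted empirical CDF reaches q."""
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0)
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative / total >= q:
            return value
    return pairs[-1][0]


def prediction_interval(values, weights, lower=0.05, upper=0.95):
    """Return (lower quantile, upper quantile, interval width)."""
    lo = weighted_quantile(values, weights, lower)
    hi = weighted_quantile(values, weights, upper)
    return lo, hi, hi - lo


# Hypothetical observed soil depths (cm) at five training sites.
depths_cm = [30.0, 60.0, 90.0, 120.0, 150.0]
# Hypothetical forest of two trees: for each tree, the indices of the
# training sites that share a leaf with the query location.
leaves = [[0, 1], [1, 2]]

weights = qrf_weights(leaves, len(depths_cm))
median = weighted_quantile(depths_cm, weights, 0.5)
low, high, width = prediction_interval(depths_cm, weights)
print(weights, median, (low, high), width)
```

In this toy case the query location shares leaves only with the three shallowest sites, so the 90% interval spans their depths. A location whose leaves mix very different depths would get a wider interval, that is, a larger predictive uncertainty.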

Keywords: digital soil mapping, spatial variation, uncertainty, machine learning, soil-landscape model, soil depth
Received: 23 March 2021   Accepted: 23 April 2021
Fund: This study was supported by the National Natural Science Foundation of China (41130530, 91325301 and 42071072).
About author: LIU Feng; Correspondence: ZHANG Gan-lin, Tel: +86-25-86881279

Cite this article: 

LIU Feng, YANG Fei, ZHAO Yu-guo, ZHANG Gan-lin, LI De-cheng. 2022. Predicting soil depth in a large and complex area using machine learning and environmental correlations. Journal of Integrative Agriculture, 21(8): 2422-2434.
