Journal of Integrative Agriculture  2020, Vol. 19 Issue (8): 2127-2136    DOI: 10.1016/S2095-3119(19)62857-1
Special Issue: Agro-ecosystem & Environment - Remote Sensing
A case-based method of selecting covariates for digital soil mapping
LIANG Peng1, 2, QIN Cheng-zhi1, 2, 3, ZHU A-xing1, 2, 3, 4, 5, HOU Zhi-wei1, 2, FAN Nai-qing1, 2, WANG Yi-jie1, 2
 
1 State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, P.R.China
2 University of Chinese Academy of Sciences, Beijing 100049, P.R.China
3 Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, School of Geography, Nanjing Normal University, Nanjing 210097, P.R.China
4 Key Laboratory of Virtual Geographic Environment, Ministry of Education, Nanjing Normal University, Nanjing 210023, P.R.China
5 Department of Geography, University of Wisconsin-Madison, Madison, WI 53706, USA
Abstract  
Selecting a proper set of covariates is one of the most important factors influencing the accuracy of digital soil mapping (DSM). Statistical and machine learning methods for selecting DSM covariates are not applicable when only limited samples are available. To address this problem, this paper proposes a case-based method that formalizes the covariate selection knowledge contained in practical DSM applications. The method trains Random Forest (RF) classifiers on DSM cases extracted from practical DSM applications, then uses the trained classifiers to determine whether each potential covariate should be used in a new DSM application. In this study, we took topographic covariates as example covariates and extracted 191 DSM cases from 56 peer-reviewed journal articles to evaluate the proposed method by leave-one-out cross-validation. Compared with a way of selecting DSM covariates commonly used by novices, the proposed method improved accuracy by more than 30% according to three quantitative evaluation indices (recall, precision, and F1-score). The proposed method could also be applied to selecting covariates in other similar geographical modeling domains, such as landslide susceptibility mapping and species distribution modeling.
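
The three evaluation indices named in the abstract can be illustrated on a single held-out case. The sketch below is not taken from the paper: it applies the standard definitions of precision, recall, and F1-score to two sets of covariates, the set a selection method recommends and the set the case actually used. The covariate names and the function name `selection_scores` are hypothetical, and the abstract does not state how the authors aggregated the indices across the 191 cases.

```python
def selection_scores(recommended, used):
    """Return (precision, recall, f1) for one case, comparing covariate sets.

    recommended: covariates proposed by a selection method (assumed input).
    used: covariates actually used in the held-out DSM case (assumed input).
    """
    recommended, used = set(recommended), set(used)
    hits = len(recommended & used)  # covariates that were both recommended and used

    # Guard against empty sets so the ratios are always defined.
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(used) if used else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical held-out case with topographic covariates.
used = {"elevation", "slope", "twi", "plan_curvature"}
recommended = {"elevation", "slope", "aspect"}

precision, recall, f1 = selection_scores(recommended, used)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Here two of the three recommended covariates were actually used, and two of the four used covariates were recommended, so precision is 2/3, recall is 1/2, and F1 is their harmonic mean, 4/7. Under leave-one-out cross-validation, each case would be held out in turn and scored this way against a recommendation learned from the remaining cases.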
 
Keywords:  digital soil mapping        covariates        case-based reasoning        Random Forest  
Received: 11 June 2019   Accepted:
Fund: This work was supported by grants from the National Natural Science Foundation of China (41431177 and 41871300), the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), China, the Innovation Project of State Key Laboratory of Resources and Environmental Information System (LREIS), China (O88RA20CYA), and the Outstanding Innovation Team in Colleges and Universities in Jiangsu Province, China.
Corresponding Author:  QIN Cheng-zhi, Tel: +86-10-64888959, E-mail: qincz@lreis.ac.cn
About author:  LIANG Peng, E-mail: liangp@lreis.ac.cn

Cite this article: 

LIANG Peng, QIN Cheng-zhi, ZHU A-xing, HOU Zhi-wei, FAN Nai-qing, WANG Yi-jie. 2020. A case-based method of selecting covariates for digital soil mapping. Journal of Integrative Agriculture, 19(8): 2127-2136.
