Scientia Agricultura Sinica ›› 2015, Vol. 48 ›› Issue (12): 2317-2326. doi: 10.3864/j.issn.0578-1752.2015.12.004

• Tillage and Cultivation · Physiology and Biochemistry · Agricultural Information Technology •

Study on Methods for Screening Spectral Characteristic Variables of Wheat Grain Protein

LI Shuan-ming1,2,3, GUO Yin-qiao1, WANG Ke-ru1,4, XIE Rui-zhi1, DAI Jian-guo2,3, XIAO Chun-hua4, LI Jing4, LI Shao-kun1,4

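    1Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081
    2College of Information Science and Technology, Shihezi University, Shihezi 832000, Xinjiang
    3Geospatial Information Engineering Research Center, Xinjiang Production and Construction Corps, Shihezi 832000, Xinjiang
    4Key Laboratory of Oasis Ecology Agriculture, Shihezi University, Shihezi 832000, Xinjiang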
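  • Received: 2014-09-30 Published: 2015-06-16 Online: 2015-06-16
  • Corresponding author: LI Shao-kun, Tel: 010-82108891; E-mail: lishaokun@caas.cn
  • About the author: LI Shuan-ming, E-mail: lsmxj737@163.com
  • Funding:
    National Key Technology R&D Program of China (2012BAH27B00)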

Research on Variable Selection of Wheat Kernel Protein Content with Near-Infrared Spectroscopy

LI Shuan-ming1,2,3, GUO Yin-qiao1, WANG Ke-ru1,4, XIE Rui-zhi1, DAI Jian-guo2,3, XIAO Chun-hua4, LI Jing4, LI Shao-kun1,4

    1Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081
    2College of Information Science and Technology, Shihezi University, Shihezi 832000, Xinjiang
    3Geospatial Information Engineering Research Center, Xinjiang Production and Construction Corps, Shihezi 832000, Xinjiang
    4Key Laboratory of Oasis Ecology Agriculture, Shihezi University, Shihezi 832000, Xinjiang
  • Received:2014-09-30 Online:2015-06-16 Published:2015-06-16

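Abstract: 【Objective】To screen the near-infrared characteristic bands of whole-kernel wheat grain protein and build an optimized model, enabling rapid, nondestructive determination of whole-kernel wheat grain protein content and providing a basis for the design of a portable field instrument for rapid measurement of wheat grain protein content. 【Method】In 2012-2013, eight winter wheat varieties with clearly different protein contents were used as test materials. Three nitrogen application rates and two irrigation levels were combined into six treatments to build a diverse sample set, and spectral data were collected for 176 wheat grain samples. Reflectance spectra of whole wheat kernels, acquired with an ASD FieldSpec Pro spectrometer over a totally reflecting background, were converted to absorbance spectra with the formula A=log(1/R). The absorbance spectra were preprocessed with S-G smoothing, multiplicative scatter correction and baseline correction to remove background noise, and cross-validated partial least squares (PLS) regression was then used to compress the characteristic bands. The protein characteristic bands selected by several methods were compared: uninformative variable elimination (UVE) combined with cross-validated PLS regression, the successive projections algorithm (SPA) combined with cross-validated PLS regression, UVE plus SPA followed by cross-validated PLS regression, UVE plus SPA followed by multiple linear regression (MLR), and UVE plus SPA followed by stepwise multiple linear regression (SMLR). The selected bands were regressed against wheat grain protein content measured by the Kjeldahl method to build and select the best prediction model. 【Result】UVE removed variables carrying no information on wheat grain protein content and compressed the raw grain spectrum from 1621 bands to 717, preserving protein information while achieving a first round of band selection. A comparison of SMLR, SPA, SPA+SMLR, and SPA+PLS+cross-validation (CV) as band-selection algorithms showed that different methods yielded different characteristic bands, and that the resulting models and their accuracy also differed markedly. When the bands were first screened by UVE, SPA was then used to eliminate the effect of collinearity among bands in the spectral matrix, and SMLR was finally used to select the 15 characteristic bands contributing most to wheat grain protein information, the resulting model had a root mean square error of prediction (RMSEP) of 0.5898 and an R2 of 0.9410, the highest prediction accuracy. 【Conclusion】Using UVE, SPA and SMLR, this study effectively compressed the spectral matrix of whole wheat kernels. The prediction model built on the selected protein characteristic bands enables nondestructive, rapid determination of whole-kernel wheat grain protein content. The model accuracy is reliable and the method is economical and effective, laying a foundation for band selection and development of a portable field instrument for measuring whole-kernel wheat grain protein.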

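Key words: characteristic spectrum, wheat, grain protein, uninformative variable elimination, successive projections algorithm, model construction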
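
The first two preprocessing steps named in the abstract, the conversion A=log(1/R) and multiplicative scatter correction, can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes a base-10 logarithm, uses the mean spectrum as the scatter-correction reference, and runs on a made-up 3 x 5 reflectance matrix. The function names are hypothetical, and S-G smoothing and baseline correction are left out to keep the example short.

```python
import numpy as np


def reflectance_to_absorbance(reflectance):
    """Convert reflectance R (0 < R <= 1) to apparent absorbance A = log10(1/R).

    The base-10 logarithm is an assumption; the abstract only writes log(1/R).
    """
    r = np.asarray(reflectance, dtype=float)
    return np.log10(1.0 / r)


def multiplicative_scatter_correction(spectra):
    """Regress each spectrum on the mean spectrum, then remove offset and slope.

    spectra: 2-D array, one row per sample and one column per band.
    """
    x = np.asarray(spectra, dtype=float)
    reference = x.mean(axis=0)  # mean spectrum used as the reference (assumption)
    corrected = np.empty_like(x)
    for i, row in enumerate(x):
        # Least-squares fit: row ~ intercept + slope * reference
        slope, intercept = np.polyfit(reference, row, deg=1)
        corrected[i] = (row - intercept) / slope
    return corrected


# Made-up reflectance values: 3 samples x 5 bands (illustration only)
reflectance = np.array([
    [0.50, 0.40, 0.25, 0.20, 0.10],
    [0.45, 0.36, 0.22, 0.18, 0.09],
    [0.55, 0.44, 0.28, 0.22, 0.11],
])
absorbance = reflectance_to_absorbance(reflectance)
corrected = multiplicative_scatter_correction(absorbance)
```

With real spectra the rows of `corrected` would then be passed on to the variable-selection step.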

Abstract: 【Objective】The objective of this study was to select the characteristic near-infrared bands for whole-kernel wheat grain protein and to build an optimized model for rapid, nondestructive determination of whole-kernel wheat grain protein content, so as to provide a basis for designing a portable field instrument for measuring wheat grain protein content. 【Method】The experiment was carried out in 2012 and 2013 with eight winter wheat varieties differing clearly in protein content. Six treatments combining three nitrogen levels and two irrigation levels were applied to obtain diverse sample types, and spectral data were collected for 176 wheat grain samples. The original reflectance spectra acquired with an ASD FieldSpec Pro spectrometer were transformed into absorbance spectra. After S-G smoothing, multiplicative scatter correction and baseline correction, the spectra were used to build models with cross-validated partial least squares regression, uninformative variable elimination (UVE), the successive projections algorithm (SPA), multiple linear regression (MLR), stepwise multiple linear regression (SMLR) and combinations of these methods. 【Result】UVE eliminated variables unrelated to wheat grain protein and compressed the original spectrum from 1621 to 717 wavelengths, achieving a first screening without discarding protein information. Different screening methods were then compared for refining the characteristic-band model. In the best-performing procedure, variables irrelevant to grain protein content were first removed by UVE, SPA was then used to eliminate the effects of collinearity among bands in the spectral matrix, and SMLR was finally used to screen out the 15 characteristic bands that contributed most to the whole-kernel wheat grain protein prediction model. The root mean square error of prediction (RMSEP) and R2 of this model were 0.5898 and 0.9410, respectively. 【Conclusion】The spectral matrix of whole wheat kernels can be effectively compressed using the UVE, SPA and SMLR methods, which supports rapid determination of whole-kernel wheat grain protein under field conditions. The prediction model built on the screened protein characteristic bands can achieve nondestructive and rapid determination of whole-kernel wheat grain protein content. The model is accurate, reliable and cost-effective, and lays a foundation for band selection and development of a portable field instrument for measuring whole-kernel wheat grain protein.

Key words: characteristic spectrum, wheat, kernel protein, uninformative variables elimination, successive projections algorithm, model formation
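
The SPA step and the two reported accuracy figures can be illustrated with a short sketch. It is written under stated assumptions and is not the authors' implementation: `spa_select` is a hypothetical helper that follows the usual successive-projections idea (repeatedly project all columns onto the subspace orthogonal to the last selected band and take the column with the largest remaining norm), the starting band is chosen arbitrarily, and R2 is taken here to be the coefficient of determination, which the abstract does not specify.

```python
import numpy as np


def spa_select(X, n_select, start=0):
    """Pick n_select column indices with low mutual collinearity (SPA-style)."""
    X = np.asarray(X, dtype=float)
    proj = X - X.mean(axis=0)  # column-centred working copy
    selected = [start]
    for _ in range(n_select - 1):
        v = proj[:, selected[-1]]
        # Project every column onto the subspace orthogonal to v.
        # Assumes v is not all zeros (i.e. the band is not constant).
        proj = proj - np.outer(v, v @ proj) / (v @ v)
        norms = np.linalg.norm(proj, axis=0)
        norms[selected] = -1.0  # never pick the same column twice
        selected.append(int(np.argmax(norms)))
    return selected


def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

In a full SPA run the procedure is usually repeated for every starting band and subset size, and the subset with the lowest validation error is kept; that outer loop is omitted here.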