Scientia Agricultura Sinica, 2022, Vol. 55, Issue (1): 26-35. doi: 10.3864/j.issn.0578-1752.2022.01.003

• Crop Genetics & Breeding · Germplasm Resources · Molecular Genetics •

Construction and Application of Detection Model for Amylose and Amylopectin Content in Sorghum Grains Based on Near Infrared Spectroscopy

ZHANG BeiJu, CHEN SongShu, LI KuiYin, LI LuHua, XU RuHong, AN Chang, XIONG FuMin, ZHANG Yan, DONG LiLi, REN MingJian

  1. College of Agriculture, Guizhou University/Guizhou Branch of National Wheat Improvement Center, Guiyang 550025
  • Received: 2021-06-01 Accepted: 2021-07-30 Online: 2022-01-01 Published: 2022-01-07
  • Contact: REN MingJian
  • About the author: ZHANG BeiJu, E-mail: 743665191@qq.com
  • Supported by:
    Special Project for the Construction of the Guizhou Modern Agricultural Industry Technology System for Characteristic Coarse Cereals (Qiancainong [2019] No. 15); Experimental Study on Propagation of Improved Liquor-Making Sorghum Varieties and Supporting Cultivation Techniques (700484192124); Research on Breeding of Liquor-Making Sorghum Varieties in Guizhou (GNW2020GD001)

Abstract:

【Objective】 Sorghum is one of the main raw materials for liquor brewing and animal feed, and the ratio of amylose to amylopectin content in its grain is closely related to liquor quality and feed quality. Traditional chemical methods for determining sorghum composition are not suited to high-throughput testing. Using modified partial least squares (modified PLS), spectral preprocessing, score processing, and result monitoring were applied to the near-infrared spectra of sorghum samples to build prediction models for grain amylose and amylopectin content. The aim was to obtain a fast, efficient, and low-cost detection method and to provide a basis for the genetic improvement and quality analysis of sorghum.

【Method】 From 450 sorghum accessions, 112 representative varieties were selected as the calibration and validation sets. The chemical reference values of amylose and amylopectin content in the grain of these 112 varieties were determined by the dual-wavelength method, and near-infrared spectra were collected over 850-1 048 nm. Scores (PL1) were calculated from the scanned spectral data matrix and the chemical data to explain the differences among spectra, and outlier varieties with a Mahalanobis distance (Global H, GH) greater than 3 were removed to reduce modeling error. Models were built by modified PLS regression, and different calibration models were established using different scatter and derivative treatments. The best model was selected according to the standard error of cross-validation (SECV) and the cross-validation correlation coefficient (1-VR), and result monitoring and a non-parametric test were used to evaluate its predictive performance.

【Result】 For amylose, the near-infrared prediction model had an SECV of 2.7732, a 1-VR of 0.9503, and a correlation coefficient (RSQ) of 0.9688. Bias = 0.229 < 2.7732 (SECV) × 0.6, that is, the bias was less than 0.6 times the SECV of the calibration model; the standard error of prediction (SEP) = 1.266 < 2.7732 (SECV) × 1.3 = 3.60516, that is, the SEP was less than 1.3 times the SECV of the calibration model; and 11.01 (SD) - 10.81 (SD) = 0.2 < 11.02 (SD) × 0.2 = 2.204, that is, the difference between the standard deviations (SD) of the chemical data and of the near-infrared predictions was less than 20% of the chemical-data SD. For amylopectin, the near-infrared prediction model had an SECV of 1.7516, a 1-VR of 0.8818, and an RSQ of 0.9127. Bias = -0.014 < 1.7516 (SECV) × 0.6, that is, the bias was less than 0.6 times the SECV of the calibration model; SEP = 1.316 < 1.7516 (SECV) × 1.3 = 2.27708, that is, the SEP was less than 1.3 times the SECV of the calibration model; and 5.30 - 5.29 = 0.01 < 5.30 × 0.2 = 1.06, that is, the difference between the SD of the chemical data and of the near-infrared predictions was less than 20% of the chemical-data SD. A paired two-sample non-parametric test on 30 sorghum grain samples not included in the model showed no significant difference between the measured and predicted values of amylose content or of amylopectin content (P=0.262>0.05; P=0.992>0.05).

【Conclusion】 The established near-infrared models are accurate and stable, can determine the amylose and amylopectin content of sorghum grain quickly and accurately, and can be used for the genetic improvement and quality testing of sorghum.
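The three monitoring rules quoted in the Result section (Bias below 0.6 × SECV, SEP below 1.3 × SECV, and an SD difference below 20% of the chemical-data SD) are simple arithmetic and can be checked mechanically. The following is a minimal sketch rather than the authors' software: the names `MonitorResult` and `check_monitoring` are hypothetical, and taking the absolute value of the bias is an assumption (the abstract compares the signed value directly). The abstract gives the amylose chemical-data SD as both 11.01 and 11.02; 11.01 is used here, and the outcome is the same either way.

```python
from dataclasses import dataclass


@dataclass
class MonitorResult:
    """Outcome of the three monitoring rules (hypothetical helper type)."""
    bias_ok: bool
    sep_ok: bool
    sd_ok: bool

    @property
    def passed(self) -> bool:
        return self.bias_ok and self.sep_ok and self.sd_ok


def check_monitoring(secv, bias, sep, sd_chem, sd_nir):
    """Apply the three acceptance rules quoted in the abstract."""
    return MonitorResult(
        # Assumption: the rule is applied to the magnitude of the bias.
        bias_ok=abs(bias) < 0.6 * secv,
        sep_ok=sep < 1.3 * secv,
        sd_ok=abs(sd_chem - sd_nir) < 0.2 * sd_chem,
    )


# Values reported in the abstract.
amylose = check_monitoring(secv=2.7732, bias=0.229, sep=1.266,
                           sd_chem=11.01, sd_nir=10.81)
amylopectin = check_monitoring(secv=1.7516, bias=-0.014, sep=1.316,
                               sd_chem=5.30, sd_nir=5.29)
print(amylose.passed, amylopectin.passed)  # True True
```

Keeping the three flags separate, instead of returning a single boolean, makes it easy to see which rule fails when a model is monitored on new samples.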

Key words: near infrared spectroscopy, sorghum, amylose, amylopectin, modified partial least squares (modified PLS)
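The Method section selects the best calibration model by SECV and 1-VR across different scatter and derivative treatments. The sketch below illustrates that selection step under stated assumptions: SECV is taken as the root mean square of the cross-validation residuals and 1-VR as one minus the ratio of the residual variance to the variance of the reference values, which are common definitions but not necessarily the exact formulas used by the authors' software. The function names, the treatment labels, and the numbers are all hypothetical and only serve to make the example runnable.

```python
import math
import statistics


def secv(reference, cv_predicted):
    """Root mean square of the cross-validation residuals (assumed definition)."""
    residuals = [r - p for r, p in zip(reference, cv_predicted)]
    return math.sqrt(sum(e * e for e in residuals) / len(residuals))


def one_minus_vr(reference, cv_predicted):
    """1 - (residual variance / variance of the reference values)."""
    return 1.0 - secv(reference, cv_predicted) ** 2 / statistics.pvariance(reference)


def pick_best(reference, candidates):
    """Return the treatment label whose predictions give the lowest SECV.

    `candidates` maps a treatment label to its cross-validation predictions.
    """
    return min(candidates, key=lambda name: secv(reference, candidates[name]))


# Hypothetical reference values and cross-validation predictions.
reference = [20.0, 25.0, 30.0, 35.0, 40.0]
candidates = {
    "scatter correction + 1st derivative": [21.0, 24.0, 31.0, 34.0, 41.0],
    "no treatment": [23.0, 22.0, 33.0, 32.0, 43.0],
}

best = pick_best(reference, candidates)
for name, predicted in candidates.items():
    print(name, round(secv(reference, predicted), 4),
          round(one_minus_vr(reference, predicted), 4))
print("best:", best)
```

With these definitions a lower SECV always corresponds to a higher 1-VR on the same reference set, so ranking by either statistic gives the same model.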