Scientia Agricultura Sinica (中国农业科学), 2015, Vol. 48, Issue 20: 4111-4119. doi: 10.3864/j.issn.0578-1752.2015.20.012

• Horticulture · Storage · Fresh-keeping · Processing •

基于近红外光谱的纽荷尔脐橙产地识别研究

廖秋红1,2,何绍兰1,谢让金1,钱春2,胡德玉1,2,吕强1,易时来1,郑永强1,邓烈1

 
  

  1. 中国农业科学院柑桔研究所/西南大学柑桔研究所/国家柑桔工程技术研究中心,重庆 400712
  2. 西南大学园艺园林学院,重庆 400715
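  • Received: 2015-03-15 Published: 2015-10-20 Online: 2015-10-20
  • Corresponding author: DENG Lie, E-mail: liedeng@cric.cn
  • About the author: LIAO Qiu-hong, E-mail: qiu5711@163.com
  • Funding:
    National "863" Program of China (2012AA101904), National International Science and Technology Cooperation Program (2013DFA11470), Chongqing Science and Technology Support Program (cstc2014fazktpt80015)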

Study on Producing Area Classification of Newhall Navel Orange Based on the Near Infrared Spectroscopy

LIAO Qiu-hong1,2, HE Shao-lan1, XIE Rang-jin1, QIAN Chun2, HU De-yu1,2, LÜ Qiang1, YI Shi-lai1, ZHENG Yong-qiang1, DENG Lie1   

  1. Citrus Research Institute, Southwest University/Chinese Academy of Agricultural Sciences/National Engineering Technology Research Center for Citrus, Chongqing 400712
  2. College of Horticulture and Landscape Architecture, Southwest University, Chongqing 400715
  • Received:2015-03-15 Online:2015-10-20 Published:2015-10-20

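Abstract (from the Chinese): 【Objective】China's citrus-producing areas are widely distributed and ecologically diverse, and Newhall navel orange fruits from different origins differ considerably in quality and market performance. Origin identification based on near-infrared (NIR) spectroscopy would support the identification and authentication of fruit from different citrus-producing areas.
【Method】Representative mature orchards were selected in 17 main Newhall navel orange producing areas in southern China, and 100 ripe fresh fruit samples were picked from each. A SupNIR-1500 NIR analyzer was used to collect NIR reflectance spectra from the surface of the fruit equator, the surface of the fruit shoulder, and the filtered juice, over the wavelength range 1 000 to 2 499 nm. Principal component analysis (PCA) was used to preprocess the raw spectra, extracting their characteristic information while reducing the dimensionality and noise of the data set. Drawing on artificial neural network (ANN) theory, a typical three-layer ANN recognition model was built from an input layer, a hidden layer with a nonlinear activation function, and an output layer. A support vector machine (SVM) model with a radial basis function (RBF) kernel and the spectral principal components as input was studied, and a one-to-one extended SVM model made up of 126 classifiers was built. Exploiting the natural-selection behaviour of the genetic algorithm (GA), a GA was used to select the optimal feature-gene subset from the spectral principal components as the SVM input, giving a GA-SVM model. The three models were each used to classify the NIR reflectance spectra of the filtered juice as a test of origin identification, and the best model was chosen by identification accuracy. The accuracy of that best model on the equator and shoulder reflectance spectra was then compared to determine which spectral source gives the highest accuracy.
【Result】When the three-layer ANN model was tested on the NIR spectra of the filtered juice, the best origin-identification accuracy, 81.45%, was obtained with 11 input neurons and 13 hidden neurons. For the one-to-one extended SVM model with an RBF kernel, the accuracy peaked at 86.98% with 20 principal components. For the hybrid GA-SVM model, with a population size of 200, 100 generations, a crossover probability of 0.7, and a mutation probability of 0.01, the GA selected the optimal gene subset and the identification accuracy peaked at 89.72%, higher than that of both the ANN and the SVM classification models. When the GA-SVM model was applied to the surface reflectance spectra of the fruit equator and shoulder, the highest accuracies were 80.00% and 69.00%, respectively.
【Conclusion】The GA-SVM model gave the highest origin-identification accuracy on the NIR reflectance spectra of the juice, outperforming the ANN and SVM models. Its accuracy on the equator reflectance spectra was lower than on the juice spectra but higher than on the shoulder spectra, so equator reflectance spectra can be used for non-destructive classification of fruit origin.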

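Keywords (from the Chinese): Newhall navel orange, origin identification, near-infrared spectroscopy, principal component analysis, artificial neural network, support vector machine, genetic algorithm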

Abstract: 【Objective】China's citrus-producing areas are widely distributed and ecologically diverse, so Newhall navel orange (Citrus sinensis L.) fruits from different producing areas differ considerably in quality and market value. A recognition method based on near-infrared (NIR) spectroscopy would help to identify and authenticate fruits from different producing areas. 【Method】Representative mature orchards were selected in 17 main producing areas across southern China, and 100 Newhall navel orange samples were collected from each. For each sample, NIR reflectance spectra (1 000 to 2 499 nm) were collected with a SupNIR-1500 near-infrared analyzer from the surface of the fruit equator, the surface of the fruit shoulder, and the filtered juice. The spectra were preprocessed by principal component analysis (PCA) to reduce dimensionality and noise. A classic three-layer artificial neural network (ANN) model was established with an input layer, a hidden layer with a non-linear activation function, and an output layer. A one-to-one extended support vector machine (SVM) model with 126 classifiers was established, using the radial basis function (RBF) as the kernel function and the principal components of the NIR spectra as input. A genetic algorithm (GA) was used to select the best feature subset from the principal components as input to an SVM classifier, giving a GA-optimized SVM (GA-SVM) model. The three models were used to classify the NIR spectra of the filtered juice by producing area, and the best classifier was chosen by classification accuracy. The best classifier was then tested with the NIR spectra from the fruit equator and shoulder surfaces as input, and the accuracies were compared to identify the best spectral source.
【Result】With the NIR spectra of the filtered juice as input, the three-layer ANN classifier reached a best classification accuracy of 81.45% with 11 input neurons and 13 hidden neurons. The one-to-one extended SVM classifier with the RBF kernel reached a higher accuracy of 86.98% when 20 principal components were used. The GA-SVM classifier takes the interaction among individual inputs into account by letting the GA optimize the PCA-processed features. Its classification accuracy reached 89.72% with a population size of 200, 100 generations, a crossover probability of 0.7, and a mutation probability of 0.01, surpassing both the ANN and SVM classifiers. The highest accuracy of the GA-SVM classifier was 80.00% on the spectra from the fruit equator and 69.00% on those from the shoulder, both lower than on the juice spectra. 【Conclusion】Of the three classifiers investigated, GA-SVM gave the highest accuracy. The juice spectra were the best data source for origin traceability. Accuracy on the equator spectra was lower than on the juice spectra but higher than on the shoulder spectra, so equator spectra have potential for non-destructive origin classification.
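
The one-to-one (one-vs-one) extension of a binary classifier described in the Method can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the names `OneVsOne` and `make_data` are hypothetical, and a nearest-centroid rule stands in for each binary RBF-kernel SVM so that the example runs with the Python standard library alone. With k classes the scheme trains k(k-1)/2 pairwise classifiers, which comes to 136 for 17 classes; the abstract reports 126, and the source does not explain the difference.

```python
import random
from collections import Counter
from itertools import combinations


def centroid(rows):
    """Mean vector of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class OneVsOne:
    """One-vs-one multi-class wrapper: one binary rule per pair of classes."""

    def fit(self, X, y):
        by_class = {}
        for row, label in zip(X, y):
            by_class.setdefault(label, []).append(row)
        cents = {c: centroid(rows) for c, rows in by_class.items()}
        # Assumption: each pairwise "classifier" is a nearest-centroid rule,
        # used here as a stand-in for a binary RBF-kernel SVM.
        self.pairs = [(a, cents[a], b, cents[b])
                      for a, b in combinations(sorted(cents), 2)]
        return self

    def predict_one(self, row):
        # Every pairwise rule casts one vote; the class with most votes wins.
        votes = Counter()
        for a, ca, b, cb in self.pairs:
            votes[a if sq_dist(row, ca) <= sq_dist(row, cb) else b] += 1
        return votes.most_common(1)[0][0]


def make_data(n_classes=17, per_class=20, n_features=20, seed=0):
    """Synthetic stand-in for principal-component scores of 17 origins."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(n_classes):
        center = [rng.uniform(-5, 5) for _ in range(n_features)]
        for _ in range(per_class):
            X.append([m + rng.gauss(0, 0.5) for m in center])
            y.append(label)
    return X, y


X, y = make_data()
model = OneVsOne().fit(X, y)
n_pairs = len(model.pairs)  # 17 * 16 / 2 = 136 pairwise rules
accuracy = sum(model.predict_one(r) == t for r, t in zip(X, y)) / len(y)
print(n_pairs, accuracy)
```

The accuracy printed here is measured on the synthetic training data and says nothing about the accuracies reported in the abstract; it only confirms that the voting scheme is wired correctly.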

Key words: Newhall navel orange, producing area recognition, near-infrared spectroscopy, principal component analysis, artificial neural network, support vector machine, genetic algorithm
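
The GA-based feature selection summarized in the abstract can be sketched as a search over bit masks, one bit per principal component. The defaults of the hypothetical `ga_select` below mirror the reported settings (population 200, 100 generations, crossover probability 0.7, mutation probability 0.01). Everything else is an assumption made for illustration: the abstract does not state the selection or crossover scheme, so tournament selection, single-point crossover, and elitism are used here; the data are synthetic; and a nearest-centroid rule replaces the SVM as the fitness classifier to keep the example within the standard library.

```python
import random


def make_data(rng, n_classes=4, per_class=30, n_informative=5, n_noise=15):
    """Synthetic stand-in for PCA scores: informative plus pure-noise columns."""
    centers = [[rng.uniform(-3, 3) for _ in range(n_informative)]
               for _ in range(n_classes)]
    data = []
    for label, center in enumerate(centers):
        for _ in range(per_class):
            row = [m + rng.gauss(0, 1) for m in center]
            row += [rng.gauss(0, 3) for _ in range(n_noise)]
            data.append((row, label))
    rng.shuffle(data)
    return data


def accuracy(mask, train, valid):
    """Validation accuracy of a nearest-centroid rule on the masked features."""
    cols = [i for i, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    groups = {}
    for row, label in train:
        groups.setdefault(label, []).append([row[i] for i in cols])
    cents = {c: [sum(v) / len(rows) for v in zip(*rows)]
             for c, rows in groups.items()}
    hits = 0
    for row, label in valid:
        x = [row[i] for i in cols]
        pred = min(cents, key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(x, cents[c])))
        hits += pred == label
    return hits / len(valid)


def tournament(pop, scores, rng, k=3):
    """Return a copy of the fittest of k randomly drawn individuals."""
    best = max(rng.sample(range(len(pop)), k), key=lambda i: scores[i])
    return pop[best][:]


def ga_select(train, valid, n_features, rng, pop_size=200, generations=100,
              p_cross=0.7, p_mut=0.01):
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    pop[0] = [1] * n_features  # include the full feature set as one candidate
    best_mask, best_score = None, -1.0
    for _ in range(generations):
        scores = [accuracy(m, train, valid) for m in pop]
        top = max(range(pop_size), key=lambda i: scores[i])
        if scores[top] > best_score:
            best_mask, best_score = pop[top][:], scores[top]
        children = [best_mask[:]]  # elitism: carry the best mask forward
        while len(children) < pop_size:
            a = tournament(pop, scores, rng)
            b = tournament(pop, scores, rng)
            if rng.random() < p_cross:  # single-point crossover
                cut = rng.randrange(1, n_features)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            for child in (a, b):  # flip each bit with probability p_mut
                children.append([bit ^ (rng.random() < p_mut) for bit in child])
        pop = children[:pop_size]
    return best_mask, best_score


rng = random.Random(0)
data = make_data(rng)
train, valid = data[:80], data[80:]
baseline = accuracy([1] * 20, train, valid)
# Smaller population and fewer generations than the defaults, for a quick run.
mask, score = ga_select(train, valid, 20, rng, pop_size=30, generations=15)
print("all features:", baseline, "GA subset:", score, mask)
```

Because the fitness is measured on the same validation split that guides the search, the score of the selected mask is optimistic; a separate held-out test set would be needed for an unbiased accuracy estimate.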