Scientia Agricultura Sinica ›› 2015, Vol. 48 ›› Issue (3): 449-459. doi: 10.3864/j.issn.0578-1752.2015.03.05

• AGRICULTURE INFORMATION TECHNOLOGY •

The Agricultural Price Information Acquisition Method Based on Speech Recognition

XU Jin-pu1,2, ZHU Ye-ping1   

  1. Agricultural Information Institute, Chinese Academy of Agricultural Sciences/Key Laboratory of Agri-Information Service Technology, Ministry of Agriculture, Beijing 100081
  2. Animation and Media College, Qingdao Agricultural University, Qingdao 266109, Shandong
  • Received:2014-03-03 Online:2015-01-31 Published:2015-01-31

Abstract: 【Objective】 This study applies speech recognition technology to the collection of agricultural product price information. The goal is to recognize speaker-independent, limited-vocabulary continuous Mandarin Chinese speech and to propose a robust recognition method suited to the environments in which agricultural product prices are collected. Acoustic models based on the Hidden Markov Model (HMM) were trained for these environments in order to reduce the loss of recognition rate caused by mismatch between the test and training environments and to further improve the recognition rate.

【Method】 In the data acquisition and processing stage, a transformation grammar was first built according to defined rules to cover the limited vocabulary; this grammar guided the recording of both training and test data. Price utterances were then recorded from different speakers in different environments, and a speech corpus was built in which the speech data were accurately segmented by hand. In the model training stage, continuous mixture density HMMs with a left-to-right, no-skip topology were chosen, and 39-dimensional MFCC feature vectors were extracted from the training set to train them. Monophones were first used as the recognition unit to train male, female, and mixed-gender HMMs. Because monophones are unstable and vulnerable to coarticulation, context-dependent triphones were then used as the decoding unit to retrain these HMMs. Since using triphones as the modeling unit greatly increases the number of models, decision tree clustering was used to address the resulting shortage of training samples. To build the decision trees, all initials and finals were divided into sets using phonetic knowledge: initials were grouped by manner and place of articulation, and finals by their composition and head vowel. These sets formed the basis of the binary (yes/no) questions. The number of Gaussian mixture components was then increased so that the models describe the data more accurately. In addition, to deal with convolutional noise in the communication channel, CMN and CVN (cepstral mean and variance normalization) were applied to reduce the mismatch between the test and training environments. Finally, the male and female HMMs were trained separately. In the test stage, the models produced by each of these methods were evaluated on the same test set, and the sentence recognition rate, word recognition rate, and accuracy were obtained for each method.

【Result】 Triphone models outperformed monophone models. The separate male and female HMMs both performed better than the mixed-gender acoustic models. Decision tree clustering did not significantly raise the recognition rate, but it clearly reduced the number of triphone models. Increasing the number of Gaussian mixture components improved the recognition rate at the cost of additional computation. CMN and CVN significantly improved recognition performance. In tests across different locations and speakers, the methods used improved recognition performance to varying degrees. The final recognition rate was 95.04% for male speakers and 97.62% for female speakers.

【Conclusion】 It is feasible to apply speech recognition technology to the collection of agricultural product price information. This paper proposes a method for improving the recognition rate in this task, and the experimental results show that models trained with these methods achieve good recognition performance. The approach also lays a foundation for developing an application system in the future.
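The channel-compensation step can be illustrated with a minimal sketch. A fixed channel acts as a convolution in the time domain, which becomes an approximately constant additive offset in the cepstral domain, so subtracting the per-dimension mean (CMN) removes it, and dividing by the standard deviation (CVN) equalizes the dynamic range. The code below is an illustration only, not the authors' implementation: the function name `cmvn`, the per-utterance statistics, and the two-dimensional toy frames are all assumptions, since the abstract does not say over which span the statistics were computed.

```python
from math import sqrt
from typing import List


def cmvn(frames: List[List[float]], eps: float = 1e-10) -> List[List[float]]:
    """Normalize each feature dimension to zero mean and unit variance.

    `frames` is one utterance: a list of equal-length feature vectors
    (for example 39-dimensional MFCC vectors). Statistics are computed
    over this utterance only, which is an assumption of this sketch.
    """
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    # CMN: per-dimension mean over all frames of the utterance.
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    # CVN: per-dimension (population) standard deviation.
    stds = [
        sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    # Guard against division by zero for constant dimensions.
    return [
        [(f[d] - means[d]) / max(stds[d], eps) for d in range(dim)]
        for f in frames
    ]


# Toy utterance: three frames of a hypothetical 2-dimensional feature.
utterance = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
# The same utterance with a constant offset, standing in for a channel effect.
shifted = [[x + 5.0 for x in frame] for frame in utterance]

normalized = cmvn(utterance)
normalized_shifted = cmvn(shifted)
```

In this toy case `normalized` and `normalized_shifted` agree up to floating-point rounding, which is the sense in which the normalization reduces the mismatch between recording conditions.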

Key words: speech recognition, agricultural price, information acquisition, CMVN, decision tree clustering
