Scientia Agricultura Sinica, 2014, Vol. 47, Issue 22: 4495-4505. doi: 10.3864/j.issn.0578-1752.2014.22.015

• Animal Husbandry · Veterinary Medicine •

畜禽基因组选择中贝叶斯方法及其参数优化策略

朱波1,2,王延晖1,牛红1,陈燕1,张路培1,高会江1,高雪1,李俊雅1,孙少华2   

  1. 中国农业科学院北京畜牧兽医研究所,北京 100193
  2. 河北农业大学动物科技学院,河北保定 071000
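  • Received: 2013-10-23  Revised: 2014-07-11  Published: 2014-11-16  Released: 2014-11-16
  • Corresponding authors: SUN Shao-hua, Tel: 13315252636; E-mail: shaohuasun@sina.com. LI Jun-ya, Tel: 13811568766; E-mail: jl1@iascaas.net.cn
  • About the author: ZHU Bo, Tel: 15652938847; E-mail: zhubo525@126.com
  • Supported by:
    National Modern Agricultural Industry Technology System for Beef Cattle and Yak (CARS-38); Key Technologies of Whole-Genome Selection across Multiple Beef Cattle Breeds (31272428)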

The Strategy of Parameter Optimization of Bayesian Methods for Genomic Selection in Livestock

ZHU Bo1,2, WANG Yan-hui1, NIU Hong1, CHEN Yan1, ZHANG Lu-pei1, GAO Hui-jiang1, GAO Xue1, LI Jun-ya1, SUN Shao-hua2   

  1. Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193
  2. College of Animal Science and Technology, Agricultural University of Hebei, Baoding 071000, Hebei
  • Received: 2013-10-23  Revised: 2014-07-11  Online: 2014-11-16  Published: 2014-11-16

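Abstract (translated from the Chinese): Breed selection plays a very important role in livestock and poultry breeding, and genomic selection has attracted wide attention as an emerging breeding technology. Its advantages are that it can shorten the generation interval, accelerate genetic progress, and select without relying on phenotypes. After Meuwissen proposed the concept of genomic selection in 2001, it was first applied in dairy cattle breeding; by August 2014, 34 member countries of Interbull had applied genomic selection in their national dairy cattle breeding populations. As genomic selection continues to spread, the accuracy of genomic estimated breeding values (GEBV) remains to be improved. Research on genomic selection methods is deepening, and effective models and algorithms are of great practical significance for improving GEBV accuracy. To date, 17 Bayesian methods have been proposed. This paper briefly introduces the classical BayesA and BayesB methods for genomic selection: BayesA assumes that all loci have effects, whereas BayesB assumes that only some loci have effects and that the proportion of such loci is very small; the two differ in both their assumed models and their algorithms. After Meuwissen proposed the classical Bayesian methods, other Bayesian methods emerged in rapid succession. These new methods are all based on the principles of the classical Bayesian methods and modify the assumed model and algorithm in order to optimize the model parameters. For example, BayesC optimizes the π value of the model on the basis of BayesB. BayesCπ and BayesDπ are improvements on BayesC: both assume that the effect variance is the same at every locus, whereas BayesC assumes that it differs among loci; BayesDπ further builds on BayesCπ by optimizing the scale parameter of the scaled inverse chi-square distribution that the effect variance follows. Bayes Lasso shares the idea of BayesA but assumes that marker effects follow a different distribution, the Laplace distribution, so the posterior distribution of marker effects changes accordingly. BayesRS assumes that the effect variance of each locus is allocated as a certain proportion of the total genetic variance. The other Bayesian methods likewise build on earlier work by changing the prior assumptions and optimizing the parameters of the model, in order to find the assumed model and parameters best suited to the population. The Bayesian algorithms most widely used at present are still the classical Bayesian algorithms and BayesCπ, owing to the stability of their results and their relatively high GEBV accuracy. Among these three algorithms, GEBV accuracy generally ranks BayesB > BayesCπ > BayesA, although this order does not hold for some traits. Compared with the classical Bayesian methods, parameter optimization improves GEBV accuracy to some extent. In short, building on the classical Bayesian methods, the improved Bayesian algorithms and their parameter optimization strategies aim to increase GEBV accuracy; by combining biological genetic algorithms with the actual situation of the population, they seek the most suitable assumed model and parameter optimization strategy, enriching and extending genomic selection algorithms and making GEBV a more useful reference. Because animal breeding in China lags far behind that abroad, genomic selection can accelerate livestock and poultry breeding and, in turn, help develop new lines and enrich genetic resources. The paper also reviews methodological research on and applications of genomic selection in China; given the many advantages of genomic selection, adopting whole-genome selection breeding technology is imperative. In addition, the paper discusses the main problems with Bayesian methods and their parameter optimization strategies in livestock genomic selection and the likely focus of future research, with the aim of providing a reference for more reliable and faster genomic selection algorithms.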

Keywords: genomic selection, Bayesian methods, parameter optimization

Abstract: Breed selection occupies an important position in livestock breeding. Genomic selection, a novel technology in livestock breeding, has attracted considerable attention. It can shorten the generation interval, speed up genetic progress, and select candidate individuals as breeding stock without phenotypic data. In 2001, Meuwissen proposed the concept of genomic selection, which was first applied in dairy cattle. By August 2014, 34 member countries of Interbull had applied genomic selection in their national dairy cattle breeding populations. As genomic selection is adopted more widely, the accuracy of genomic estimated breeding values still needs to be improved. Various methods of genomic selection have been proposed and more efficient models are being developed, so better models and algorithms are of great practical significance for improving that accuracy. So far, 17 Bayesian methods have been proposed. This paper briefly introduces the classical BayesA and BayesB methods for genomic selection. BayesA assumes that all loci have effects, whereas BayesB assumes that only a very small proportion of loci have effects; the two methods therefore differ in both model and algorithm. After Meuwissen proposed the classical Bayesian methods, other methods sprang up rapidly. The new Bayesian methods build on the classical ones, modifying the assumed model and algorithm in order to optimize the model parameters. For example, BayesC, which is based on BayesB, optimizes the π value in the model. BayesCπ and BayesDπ are improvements on BayesC: both assume that the marker effect variance is the same at every locus, whereas BayesC assumes that it differs among loci.
BayesDπ, which is based on BayesCπ, optimizes the scale parameter of the scaled inverse chi-square distribution followed by the effect variance. Bayes Lasso shares the idea of BayesA but assumes that marker effects follow a Laplace distribution, so the posterior distributions of the marker effects change accordingly. BayesRS assumes that the marker effect variances are allocated as given proportions of the total genetic variance. The other Bayesian methods likewise build on earlier work, changing the prior assumptions and optimizing the model parameters in order to find the model and parameters best suited to the population. At present, the most commonly used methods for genomic selection are the classical Bayesian methods and BayesCπ, which give stable results and high accuracy of genomic estimated breeding values. Among these three algorithms, accuracy generally ranks BayesB > BayesCπ > BayesA, although this order does not hold for some traits. Compared with the classical Bayesian methods, parameter optimization improves the accuracy of genomic estimated breeding values to some extent. In short, building on the classical Bayesian methods and aiming to improve that accuracy, the extended Bayesian methods and their parameter optimization strategies seek the optimal model and parameters by combining biological genetic algorithms with the actual situation of the population. They enrich and extend genomic selection algorithms and make genomic estimated breeding values a more useful reference. Because animal breeding in China lags far behind that in other countries, genomic selection can accelerate livestock and poultry breeding, help develop new breeds, and enrich China's genetic resources. The paper also reviews methodological research on genomic selection and its application in China.
Given the advantages of genomic selection, adopting whole-genome selection breeding technology is imperative. Furthermore, the main problems in current research and the key points for future studies are discussed, in the hope of providing a reference for more reliable and faster genomic selection algorithms.

Key words: genomic selection, Bayesian method, parameter optimization
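
The prior assumptions contrasted in the abstract can be illustrated with a minimal sketch that draws marker effects from each prior. This is not the authors' implementation and performs no model fitting. The function names and the values of `pi`, `nu`, `s2` and `lap_scale` are hypothetical, chosen only for illustration. BayesCπ normally treats π as unknown and estimates it from the data, whereas here it is simply passed in as a fixed number.

```python
import random


def scaled_inv_chi2(rng, nu, s2):
    """Draw one variance from a scaled inverse chi-square(nu, s2) distribution."""
    # A chi-square variate with nu degrees of freedom is Gamma(shape=nu/2, scale=2).
    return nu * s2 / rng.gammavariate(nu / 2.0, 2.0)


def draw_effects(method, n_loci, rng, pi=0.95, nu=4.0, s2=0.01, lap_scale=0.05):
    """Draw marker effects from the prior of one method (illustrative values only)."""
    # One variance shared by all loci; only the BayesCpi branch uses it.
    common_var = scaled_inv_chi2(rng, nu, s2)
    effects = []
    for _ in range(n_loci):
        if method == "BayesA":
            # Every locus has an effect, each with its own variance.
            effects.append(rng.gauss(0.0, scaled_inv_chi2(rng, nu, s2) ** 0.5))
        elif method == "BayesB":
            # A proportion pi of loci has no effect; the rest behave as in BayesA.
            if rng.random() < pi:
                effects.append(0.0)
            else:
                effects.append(rng.gauss(0.0, scaled_inv_chi2(rng, nu, s2) ** 0.5))
        elif method == "BayesCpi":
            # Same mixture as BayesB, but non-zero effects share one variance.
            if rng.random() < pi:
                effects.append(0.0)
            else:
                effects.append(rng.gauss(0.0, common_var ** 0.5))
        elif method == "BayesLasso":
            # Laplace (double exponential) prior: random sign times an exponential.
            sign = 1.0 if rng.random() < 0.5 else -1.0
            effects.append(sign * rng.expovariate(1.0 / lap_scale))
        else:
            raise ValueError("unknown method: " + method)
    return effects


if __name__ == "__main__":
    rng = random.Random(1)
    for name in ("BayesA", "BayesB", "BayesCpi", "BayesLasso"):
        drawn = draw_effects(name, 1000, rng)
        print(name, "non-zero effects:", sum(e != 0.0 for e in drawn))
```

Under these assumptions, BayesA and Bayes Lasso give every locus a non-zero effect, while BayesB and BayesCπ leave most loci at exactly zero, which is the contrast the abstract draws between "all loci have effects" and "only a small proportion of loci have effects".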