Journal of Integrative Agriculture  2022, Vol. 21 Issue (4): 1126-1136    DOI: 10.1016/S2095-3119(21)63813-3
Special Issue: Animal Science Collection
Animal Science · Veterinary Medicine
Incorporating genomic annotation into single-step genomic prediction with imputed whole-genome sequence data
TENG Jin-yan1, YE Shao-pan1, GAO Ning2, CHEN Zi-tao1, DIAO Shu-qi1, LI Xiu-jin3, YUAN Xiao-long1, ZHANG Hao1, LI Jia-qi1, ZHANG Xi-quan1, ZHANG Zhe1
1 Guangdong Laboratory of Lingnan Modern Agriculture/Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou 510642, P.R.China
2 State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510006, P.R.China
3 Guangdong Provincial Key Laboratory of Waterfowl Healthy Breeding, College of Animal Sciences and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou 510225, P.R.China
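Abstract (Chinese version)

Single-step genomic prediction is widely used in livestock and poultry breeding because it accounts for information from both genotyped and ungenotyped individuals in a single model. Population-level whole-genome sequence data are growing rapidly, and how to make better use of sequence data to improve the accuracy of single-step genomic prediction has drawn considerable attention. Studies have shown that integrating biological prior information from public databases can improve the accuracy of genomic prediction. In this study, we therefore extended the single-step genomic prediction model to explore how genomic annotation information can be integrated effectively into the model to improve its performance. A yellow-feathered broiler population was used, comprising 1 338 individuals and 23 traits, of which 895 individuals had imputed whole-genome sequence data containing 5 127 612 single nucleotide polymorphism (SNP) markers. Considering different combinations of annotation information and models, we proposed four extended single-step models and compared them with the original single-step model and a haplotype-based single-step model. For the extended single-step models integrating genomic annotation, SNP markers were mapped according to the chicken genome annotation; 3 155 524 and 94 837 SNPs were mapped to genic and exonic regions, respectively. The mapped SNPs were then used to construct genomic relationship matrices according to the rules of each model, and these matrices were used for genomic prediction. The extended single-step models outperformed the other benchmark models in 15 of the traits. Compared with the original single-step model, the predictive ability of the extended models improved by about 2.5% to 6.1%. In addition, to further improve the accuracy of single-step genomic prediction with sequence data, we studied genotyping strategies for the reference population in this population. The results showed that, in most cases, selecting individuals for genotyping evenly by family was better than random selection. In summary, this study extended genomic prediction models within the single-step framework so that they integrate sequence data and genomic annotation information, and verified that appropriate use of genomic annotation information and imputed sequence data can improve the predictive ability of single-step genomic prediction. Moreover, when sequence data are used, genotyping so as to maximize the expected relationship between the reference and candidate populations can further improve the predictive ability of the single-step model. The novelty of this study lies in integrating genomic annotation information into the single-step model through sequence data, which provides a useful reference for the effective use of sequence data in single-step genomic prediction.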




Abstract  Single-step genomic best linear unbiased prediction (ssGBLUP) is now intensively investigated and widely used in livestock breeding because it combines information from both genotyped and ungenotyped individuals in a single model.  With the increasing accessibility of whole-genome sequence (WGS) data at the population level, more attention is being paid to the use of WGS data in ssGBLUP.  The predictive ability of ssGBLUP using WGS data might be improved by incorporating biological knowledge from public databases.  Thus, we extended ssGBLUP by incorporating genomic annotation information into the model and evaluated the extended models using a yellow-feathered chicken population as an example.  The chicken population consisted of 1 338 birds with 23 traits, of which 895 birds had imputed WGS data comprising 5 127 612 single nucleotide polymorphisms (SNPs).  Considering different combinations of annotation information and models, we evaluated the original ssGBLUP, haplotype-based ssGHBLUP, and four extended ssGBLUP models incorporating genomic annotation.  Based on the chicken genome annotation (GRCg6a), 3 155 524 and 94 837 SNPs were mapped to genic and exonic regions, respectively.  Extended ssGBLUP using genic/exonic SNPs outperformed the other models in predictive ability for 15 of the 23 traits, with advantages over the original ssGBLUP ranging from 2.5% to 6.1%.  In addition, to further enhance the performance of genomic prediction with imputed WGS data, we investigated how the genotyping strategy for the reference population affects ssGBLUP in the chicken population.  Of the two strategies compared for selecting individuals to genotype in the reference population, even selection by family (SBF) performed slightly better than random selection in most situations.  
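The abstract does not spell out how the extended models build their relationship matrices, so the following is only a minimal sketch of the general idea under stated assumptions: a VanRaden (2008) type genomic relationship matrix computed from an annotation-selected SNP subset (for example genic SNPs), combined with the pedigree matrix through the standard single-step inverse described by Legarra et al. (2009) and Christensen and Lund (2010). The function names `vanraden_g` and `h_inverse`, the blending weight of 0.05, and the 0/1/2 genotype coding are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def vanraden_g(genotypes: np.ndarray) -> np.ndarray:
    """Genomic relationship matrix from 0/1/2 genotypes (individuals x SNPs).

    Assumes allele frequencies are estimated from the genotyped sample itself
    and that at least one SNP is polymorphic.
    """
    p = genotypes.mean(axis=0) / 2.0           # allele frequency per SNP
    z = genotypes - 2.0 * p                    # centre each SNP column
    denom = 2.0 * np.sum(p * (1.0 - p))        # VanRaden scaling factor
    return z @ z.T / denom


def h_inverse(a_inv, a22, g, genotyped_idx, blend=0.05):
    """Single-step H^-1 = A^-1 + [[0, 0], [0, Gw^-1 - A22^-1]].

    `blend` is an assumed weight that mixes G with A22 so that the blended
    matrix Gw is invertible; it is not a value taken from the article.
    """
    g_w = (1.0 - blend) * g + blend * a22
    h_inv = np.array(a_inv, dtype=float, copy=True)
    block = np.linalg.inv(g_w) - np.linalg.inv(a22)
    # Add the genomic correction only to the genotyped-by-genotyped block.
    h_inv[np.ix_(genotyped_idx, genotyped_idx)] += block
    return h_inv
```

A hypothetical call would first subset the genotype columns with a boolean mask of annotated SNPs, for instance `vanraden_g(genotypes[:, genic_mask])`, and pass the result to `h_inverse` together with the pedigree matrices.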
Overall, we extended genomic prediction models to comprehensively utilize WGS data and genomic annotation information within the ssGBLUP framework, and validated the idea that properly handling genomic annotation information and WGS data increases the predictive ability of ssGBLUP.  Moreover, when using WGS data, a genotyping strategy that maximizes the expected genetic relationship between the reference and candidate populations could further improve the predictive ability of ssGBLUP.  The results of this study shed light on the comprehensive use of genomic annotation information in WGS-based single-step genomic prediction.
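To make the contrast between the two genotyping strategies concrete, here is a minimal sketch that assumes each bird has a known family label and a fixed genotyping budget `n`. The round-robin rule used for even selection by family is an assumption for illustration; the abstract does not give the exact procedure. The names `select_random`, `select_by_family`, and `family_of` are hypothetical.

```python
import random
from collections import defaultdict


def select_random(ids, n, seed=0):
    """Pick n individuals uniformly at random, ignoring family structure."""
    rng = random.Random(seed)
    return sorted(rng.sample(sorted(ids), n))


def select_by_family(family_of, n, seed=0):
    """Pick n individuals so that families contribute as evenly as possible.

    `family_of` maps individual id -> family id. Families are visited in
    round-robin order, taking one randomly chosen member per visit.
    """
    if n > len(family_of):
        raise ValueError("n exceeds the number of individuals")
    rng = random.Random(seed)
    members = defaultdict(list)
    for ind, fam in family_of.items():
        members[fam].append(ind)
    for fam in members:
        members[fam].sort()            # fixed order before shuffling
        rng.shuffle(members[fam])
    chosen = []
    families = sorted(members)
    while len(chosen) < n:
        for fam in families:
            if members[fam] and len(chosen) < n:
                chosen.append(members[fam].pop())
    return sorted(chosen)
```

With equal-sized families and a budget that is a multiple of the number of families, every family contributes the same number of genotyped birds; otherwise the counts differ by at most one until small families run out of members.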

Keywords: genomic selection, prior information, sequencing data, genotype imputation, haplotype
Received: 27 October 2020   Accepted: 11 August 2021
Fund: This work was supported by the National Natural Science Foundation of China (32022078) and the Local Innovative and Research Teams Project of Guangdong Province, China (2019BT02N630). 
About author: TENG Jin-yan, E-mail: kingyan312@live.cn; Correspondence: ZHANG Zhe, Tel/Fax: +86-20-85282019, E-mail: zhezhang@scau.edu.cn

Cite this article: 

TENG Jin-yan, YE Shao-pan, GAO Ning, CHEN Zi-tao, DIAO Shu-qi, LI Xiu-jin, YUAN Xiao-long, ZHANG Hao, LI Jia-qi, ZHANG Xi-quan, ZHANG Zhe. 2022. Incorporating genomic annotation into single-step genomic prediction with imputed whole-genome sequence data. Journal of Integrative Agriculture, 21(4): 1126-1136.

