Scientia Agricultura Sinica ›› 2023, Vol. 56 ›› Issue (8): 1585-1593. doi: 10.3864/j.issn.0578-1752.2023.08.013

• ANIMAL SCIENCE·VETERINARY SCIENCE •

Comparison of Imputation Accuracy for Different Low-Density SNP Selection Strategies

LIN YuNong1,2, WANG ZeZhao2, CHEN Yan2, ZHU Bo2, GAO Xue2, ZHANG LuPei2, GAO HuiJiang2, XU LingYang2, CAI WenTao2, LI YingHao3, LI JunYa2, GAO ShuXin1

  1 College of Animal Science and Technology, Inner Mongolia University for the Nationalities, Tongliao 028042, Inner Mongolia
  2 Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193
  3 Tongliao Jingyuan Breeding Cattle Breeding LLC, Tongliao 028006, Inner Mongolia
  • Received: 2021-12-13 Accepted: 2022-03-24 Online: 2023-04-16 Published: 2023-04-23

Abstract:

【Objective】 To facilitate low-cost genomic selection in Huaxi cattle, this study made a first attempt to design a new low-density genotyping chip that supports imputation to higher-density genotypes. Representative SNP markers at different density gradients were selected from a high-density SNP chip in the Huaxi cattle reference population using two SNP selection methods. These marker sets were then imputed to high density with the same imputation parameters for subsequent genomic studies. The study compared imputation accuracy and concordance among the SNP panels and examined the effects of four factors on the imputation results: marker selection method, marker density, minor allele frequency (MAF), and reference population size. The results provide guidance on selecting low-density SNP markers for imputation in this population and aid the design of a low-density SNP chip for Huaxi cattle.

【Method】 After genotype filtering, 1,233 Huaxi cattle were randomly divided into a reference population (986) and a validation population (247). Two SNP selection strategies, equidistance (EQ) and high MAF (HM), were each used to build 16 SNP sets of different densities from the Illumina BovineHD chip in the reference population. Each of the 32 low-density sets was then imputed to the 770K density level in the validation population using Beagle (v5.1), and imputation accuracy and concordance were calculated as the mean correlation between true and imputed genotypes. Finally, the factors that influence imputation performance were analyzed.

【Result】 The number of markers in the 32 low-density SNP sets ranged from 100 to 16,000, with a maximum window of 24,176 kb and a minimum window of 151 kb. The imputation accuracy and concordance of both the EQ and HM methods increased with marker density. Imputation accuracy was highest for both methods at the 16k SNP density (r² = 0.8801 for EQ and r² = 0.8696 for the high-MAF method). When the marker density was below 11k, the imputation concordance of HM was higher than that of EQ at every density gradient; when the SNP density exceeded 11k, EQ held the advantage over HM. Imputation accuracy followed a similar pattern: HM was more accurate when the SNP density was below 10k, EQ was more accurate above 10k, and the accuracy of EQ tended to stabilize once the density exceeded 12k. Loci with high MAF were imputed more accurately. Imputation accuracy and concordance also increased with the size of the reference panel, and both were higher when the reference panel contained 600-800 animals.

【Conclusion】 In the Huaxi cattle population, imputation accuracy and concordance increased with marker density, and good imputation performance was obtained at marker densities of 10k-12k. The HM method was preferable when the marker density was below 10k, whereas the EQ method was better at higher marker densities. High-MAF loci were imputed more accurately. For imputation from low-density markers, the reference panel should contain at least 400 animals.
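
The two marker-selection strategies named in the abstract can be illustrated with a minimal Python sketch. The abstract does not spell out the exact rules, so the code assumes that the chromosome is cut into fixed-width windows, that EQ keeps the SNP nearest each window centre, and that HM keeps the SNP with the highest MAF in each window. The `Snp` type, the function names, and the toy data are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from typing import NamedTuple


class Snp(NamedTuple):
    name: str
    pos_kb: float  # position along the chromosome, in kb
    maf: float     # minor allele frequency in the reference population


def group_by_window(snps, window_kb):
    """Bin SNPs into consecutive, non-overlapping windows of width window_kb."""
    windows = defaultdict(list)
    for snp in snps:
        windows[int(snp.pos_kb // window_kb)].append(snp)
    return dict(sorted(windows.items()))


def select_eq(snps, window_kb):
    """EQ (assumed rule): per window, keep the SNP closest to the window centre."""
    chosen = []
    for idx, members in group_by_window(snps, window_kb).items():
        centre = (idx + 0.5) * window_kb
        chosen.append(min(members, key=lambda s: abs(s.pos_kb - centre)))
    return chosen


def select_hm(snps, window_kb):
    """HM (assumed rule): per window, keep the SNP with the highest MAF."""
    return [max(members, key=lambda s: s.maf)
            for members in group_by_window(snps, window_kb).values()]


# Toy data, made up for illustration only.
toy_snps = [
    Snp("s1", 10, 0.05), Snp("s2", 60, 0.10), Snp("s3", 95, 0.40),
    Snp("s4", 120, 0.45), Snp("s5", 150, 0.20), Snp("s6", 190, 0.30),
]

eq_panel = [s.name for s in select_eq(toy_snps, window_kb=100)]
hm_panel = [s.name for s in select_hm(toy_snps, window_kb=100)]
print("EQ:", eq_panel)  # ['s2', 's5']
print("HM:", hm_panel)  # ['s3', 's4']
```

Windows that contain no SNP contribute no marker in this sketch, so a real panel built this way could end up slightly smaller than the nominal target.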

Key words: imputation accuracy, low density SNP array, Chinese Simmental cattle, linkage disequilibrium, MAF

Table 1

Window size and number of SNPs at different marker densities

| No. | Window size (kb) | Number of SNPs |
|---|---|---|
| 1 | 24,176 | 100 |
| 2 | 4,835 | 500 |
| 3 | 2,417 | 1,000 |
| 4 | 805 | 3,000 |
| 5 | 483 | 5,000 |
| 6 | 403 | 6,000 |
| 7 | 345 | 7,000 |
| 8 | 302 | 8,000 |
| 9 | 268 | 9,000 |
| 10 | 242 | 10,000 |
| 11 | 220 | 11,000 |
| 12 | 201 | 12,000 |
| 13 | 186 | 13,000 |
| 14 | 173 | 14,000 |
| 15 | 161 | 15,000 |
| 16 | 151 | 16,000 |
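
The window sizes in Table 1 appear to follow from dividing a fixed total map length by the target number of SNPs. The short sketch below illustrates that arithmetic. The total length of 2,417,600 kb is an assumption inferred from the first row (24,176 kb × 100 SNPs) and is not stated in the source, so the computed windows only approximate the tabulated values (to within about 1 kb).

```python
# Assumed total map length in kb, inferred from row 1 of Table 1
# (24,176 kb window x 100 SNPs); not a figure given by the authors.
GENOME_KB = 24176 * 100


def window_size_kb(n_snps, genome_kb=GENOME_KB):
    """Approximate window width (kb) needed to place n_snps evenly spaced markers."""
    if n_snps <= 0:
        raise ValueError("n_snps must be positive")
    return genome_kb / n_snps


for n in (100, 1000, 10000, 16000):
    print(f"{n:>6} SNPs -> window of about {window_size_kb(n):.1f} kb")
```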

Fig. 1

The imputation concordance rate (A) and the imputation accuracy (B) of different selection strategies
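
The two measures plotted in Fig. 1 can be sketched as follows. This is a minimal illustration under stated assumptions: genotypes are coded as 0/1/2 allele counts, the concordance rate is taken as the proportion of imputed genotypes identical to the true ones, and imputation accuracy is taken as the squared Pearson correlation (r²) between true and imputed genotypes. The function names and the toy genotypes are hypothetical, and the study may average these measures across loci or animals in a way not shown here.

```python
from math import sqrt


def concordance_rate(true_gt, imputed_gt):
    """Proportion of genotypes that are identical after imputation."""
    if len(true_gt) != len(imputed_gt) or not true_gt:
        raise ValueError("genotype lists must be non-empty and of equal length")
    matches = sum(t == i for t, i in zip(true_gt, imputed_gt))
    return matches / len(true_gt)


def pearson_r(x, y):
    """Pearson correlation; returns NaN if either vector has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def imputation_r2(true_gt, imputed_gt):
    """Squared correlation between true and imputed genotypes."""
    r = pearson_r(true_gt, imputed_gt)
    return r * r


# Toy genotypes for one locus across eight animals (made up for illustration).
true_gt = [0, 1, 2, 1, 0, 2, 1, 1]
imputed_gt = [0, 1, 2, 1, 0, 1, 1, 2]

print(concordance_rate(true_gt, imputed_gt))  # 0.75
print(imputation_r2(true_gt, imputed_gt))     # 0.5625
```

The two measures can diverge: concordance counts every mismatch equally, whereas r² depends on the genotype variance at the locus, which is one reason results are often reported separately by MAF class.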

Fig. 2

Imputation accuracy and concordance rate by minor allele frequency for the two selection strategies at the 16k density. A: Concordance rate of the EQ selection strategy; B: Imputation accuracy of the EQ selection strategy; C: Concordance rate of the HM selection strategy; D: Imputation accuracy of the HM selection strategy. Red points indicate more identical markers, and blue points indicate fewer identical markers

Fig. 3

Mean concordance rate (A) and mean imputation accuracy (B) by minor allele frequency for the two selection strategies at the 16k density

Fig. 4

Effects of reference population size on concordance rate (A, C) and imputation accuracy (B, D)
