Scientia Agricultura Sinica


Comparison of Imputation Accuracy for Different Low-Density SNP Selection Strategies

LIN YuNong1,2, WANG ZeZhao2, CHEN Yan2, ZHU Bo2, GAO Xue2, ZHANG LuPei2, GAO HuiJiang2, XU LingYang2, CAI WenTao2, LI YingHao3, LI JunYa2*, GAO ShuXin1*

  1. 1College of Animal Science and Technology, Inner Mongolia University for the Nationalities, Tongliao 028042, Inner Mongolia; 2Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193; 3Tongliao Jingyuan Breeding Cattle Breeding LLC, Tongliao 028006, Inner Mongolia
  • Published: 2022-04-12

Abstract: 【Objective】To facilitate low-cost genomic selection in Huaxi cattle, this study represents the first attempt to design a new low-density genotyping chip that supports imputation to higher-density genotypes. Representative SNP markers at different density gradients were selected from the high-density SNP chip in the Huaxi cattle reference population using two SNP selection methods. These marker sets were then imputed to high density with the same imputation parameters for subsequent genomic studies. The study compared imputation accuracy and concordance among the SNP panels and examined the effects of four factors on the imputation results: marker selection method, marker density, minor allele frequency (MAF), and reference population size. The results offer guidance on selecting low-density SNP markers for imputation in this population, and the representative SNPs will aid in designing a low-density SNP chip for Huaxi cattle.

【Method】After genotype filtering, a total of 1,233 Huaxi cattle were randomly divided into a reference population (986) and a validation population (247). Two SNP selection strategies, one based on equidistance (EQ) and one on high MAF (HM), were each used to build 16 SNP sets of different densities from the Illumina BovineHD chip in the reference population. Each of the 32 low-density sets was then imputed to the 770K density level in the validation population using Beagle (v5.1). Imputation accuracy, calculated as the mean correlation between true and imputed genotypes, and imputation concordance were then evaluated. A comprehensive set of factors influencing imputation performance was also analyzed.

【Result】The number of markers in the 32 low-density SNP sets ranged from 100 to 16,000, with a maximum window of 24,176 kb and a minimum window of 151 kb. The imputation accuracy and concordance of both the EQ and HM methods rose with increasing marker density. The imputation accuracy of both methods was highest at the 16k SNP density (r²EQ = 0.8801, r²MAF = 0.8696). When the marker density was below 11k, the imputation concordance of HM was higher than that of EQ at every density gradient. However, when the SNP density exceeded 11k, EQ showed an advantage over HM. Similar to the concordance results, HM had higher imputation accuracy when the SNP density was below 10k, whereas EQ had higher imputation accuracy when the SNP density was above 10k, and the EQ imputation accuracy tended to stabilize once the SNP density exceeded 12k. Loci with high MAF were also imputed more accurately. Imputation accuracy and concordance increased as the reference panel grew, and both were higher when the reference panel contained 600-800 animals.

【Conclusion】In the Huaxi cattle population, imputation accuracy and concordance increased with increasing marker density, and good imputation performance could be obtained at marker densities in the 10k-12k interval. The HM method was preferred when the marker density was below 10k, and the EQ method was better at higher marker densities. High-MAF loci were imputed more accurately. When imputing from low-density markers, the reference panel should contain at least 400 animals for good imputation performance.

Key words: imputation accuracy, low-density SNP array, Chinese Simmental cattle, linkage disequilibrium, MAF
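
The abstract names the two selection strategies (EQ and HM) and the two evaluation measures but does not spell out their exact procedures, so the following is only a minimal Python sketch under stated assumptions. It assumes genotypes are coded as 0/1/2 allele counts, that EQ picks the marker nearest to each of n evenly spaced target positions, that HM keeps the highest-MAF marker in each of n equal-width windows, and that concordance is the share of genotypes imputed identically. All function names are hypothetical and are not taken from the study or from Beagle.

```python
from math import sqrt

def minor_allele_frequency(genotypes):
    """MAF of one SNP from genotypes coded as 0/1/2 copies of one allele."""
    p = sum(genotypes) / (2 * len(genotypes))
    return min(p, 1 - p)

def select_equidistant(positions, n_markers):
    """EQ sketch (assumed rule): marker nearest to each of n evenly spaced targets.

    `positions` must be sorted base-pair (or kb) coordinates on one chromosome.
    Returns indices into `positions`.
    """
    start, end = positions[0], positions[-1]
    step = (end - start) / n_markers
    chosen = []
    for k in range(n_markers):
        target = start + (k + 0.5) * step  # centre of the k-th interval
        idx = min(range(len(positions)), key=lambda i: abs(positions[i] - target))
        if idx not in chosen:  # avoid picking the same marker twice
            chosen.append(idx)
    return chosen

def select_high_maf(positions, mafs, n_markers):
    """HM sketch (assumed rule): highest-MAF marker in each of n equal windows."""
    start, end = positions[0], positions[-1]
    width = (end - start) / n_markers
    best = {}  # window number -> index of best marker seen so far
    for i, pos in enumerate(positions):
        window = min(int((pos - start) / width), n_markers - 1)
        if window not in best or mafs[i] > mafs[best[window]]:
            best[window] = i
    return [best[w] for w in sorted(best)]

def concordance(true_genotypes, imputed_genotypes):
    """Share of genotypes that were imputed exactly right (assumed definition)."""
    matches = sum(t == g for t, g in zip(true_genotypes, imputed_genotypes))
    return matches / len(true_genotypes)

def pearson_r(x, y):
    """Pearson correlation between true and imputed genotypes.

    Undefined (division by zero) if either vector is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Toy data: eight markers on one chromosome (positions in kb) with made-up MAFs.
positions = [100, 200, 300, 400, 500, 600, 700, 800]
mafs = [0.05, 0.40, 0.10, 0.45, 0.30, 0.20, 0.50, 0.15]

print(select_equidistant(positions, 2))     # indices chosen by the EQ sketch
print(select_high_maf(positions, mafs, 2))  # indices chosen by the HM sketch

# Toy true vs. imputed genotypes for one animal across six loci.
true_g = [0, 1, 2, 1, 0, 2]
imputed_g = [0, 1, 2, 2, 0, 2]
print(concordance(true_g, imputed_g), pearson_r(true_g, imputed_g))
```

On the toy data the two rules pick different markers from the same windows, which illustrates why the panels can differ in imputation performance even at equal density. The imputation step itself (Beagle v5.1 in the study) is outside the scope of this sketch.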

[1] LIN YuNong, WANG ZeZhao, CHEN Yan, ZHU Bo, GAO Xue, ZHANG LuPei, GAO HuiJiang, XU LingYang, CAI WenTao, LI YingHao, LI JunYa, GAO ShuXin. Comparison of Imputation Accuracy for Different Low-Density SNP Selection Strategies [J]. Scientia Agricultura Sinica, 2023, 56(8): 1585-1593.