Scientia Agricultura Sinica (中国农业科学) ›› 2020, Vol. 53 ›› Issue (9): 1704-1716. doi: 10.3864/j.issn.0578-1752.2020.09.002

• Special topic: Applications of the Restricted Two-Stage Multi-Locus Genome-Wide Association Analysis Method •

Application of the Restricted Two-Stage Multi-Locus Genome-Wide Association Analysis Method in Genetics and Breeding

贺建波,刘方东,王吴彬,邢光南,管荣展,盖钧镒

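  1. Soybean Research Institute, Nanjing Agricultural University/National Center for Soybean Improvement/Key Laboratory of Biology and Genetic Improvement of Soybean, Ministry of Agriculture/State Key Laboratory for Crop Genetics and Germplasm Enhancement/Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing 210095
  • Received: 2019-08-26 Accepted: 2019-11-30 Online: 2020-05-01 Published: 2020-05-13
  • Corresponding author: 盖钧镒 (JunYi GAI)
  • About the author: 贺建波 (JianBo HE), E-mail: hjbxyz@gmail.com.
  • Funding:
    National Natural Science Foundation of China (31701447); National Key R&D Program for Crop Breeding (2017YFD0101500); National Key R&D Program for Crop Breeding (2017YFD0102002); Program for Changjiang Scholars and Innovative Research Team in University (PCSIRT_17R55); 111 Project of the Ministry of Education (B08025); Fundamental Research Funds for the Central Universities (KYT201801); China Agriculture Research System for soybean, Ministry of Agriculture (CARS-04); Jiangsu Higher Education Priority Academic Program Development project; Jiangsu JCIC-MCP program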

Restricted Two-Stage Multi-Locus Genome-Wide Association Analysis and Its Applications to Genetic and Breeding Studies

JianBo HE, FangDong LIU, WuBin WANG, GuangNan XING, RongZhan GUAN, JunYi GAI

  1. Soybean Research Institute, Nanjing Agricultural University/National Center for Soybean Improvement/Key Laboratory of Biology and Genetic Improvement of Soybean (General), Ministry of Agriculture/State Key Laboratory for Crop Genetics and Germplasm Enhancement/Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing 210095
  • Received: 2019-08-26 Accepted: 2019-11-30 Online: 2020-05-01 Published: 2020-05-13
  • Contact: JunYi GAI

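Abstract (from the Chinese original):

Genome-wide association studies (GWAS) use genome-wide high-density molecular markers to detect associations between genotype and phenotype and have become the main approach to the genetic dissection of quantitative traits in plants and animals. However, earlier GWAS methods concentrated on detecting a few major QTLs, and because they rely on SNP markers with only two alleles they cannot detect the multiple alleles that are widespread in natural populations, which has limited the application of GWAS to some extent. The restricted two-stage multi-locus genome-wide association analysis method (RTM-GWAS) first uses the degree of linkage disequilibrium among genome-wide high-density SNP markers to group multiple adjacent, tightly linked SNPs into SNP linkage disequilibrium block (SNPLDB) markers that carry multiple alleles (haplotypes). Second, RTM-GWAS takes the genetic similarity coefficient matrix calculated from the SNPLDB markers as a general estimate of population structure bias and extracts eigenvectors of this matrix as model covariates to reduce the false positives caused by that bias. Finally, using the multi-allelic SNPLDB markers and a multi-locus multi-allele model, RTM-GWAS takes trait heritability as the upper limit of the phenotypic variation explained by QTLs, detects genome-wide QTLs and their multiple alleles efficiently through a two-stage analysis strategy, and ultimately builds a multi-QTL genetic model. Based on plot-level trait observations, the method can also build a multi-locus QTL-by-environment interaction model, which detects not only main-effect QTLs that interact with the environment but also QTLs that have no main effect and only interact with the environment. RTM-GWAS not only solves the problem that earlier GWAS could not estimate multiple alleles but also, by fitting multiple QTLs in a multi-locus model, improves detection power and effectively controls the inflation of false positives, opening a route to a comprehensive dissection of QTLs and their multiple alleles in natural populations. The method estimates allele effects and their relative frequencies in the population. The QTL-allele matrix built from its results represents the complete genetic constitution of the target trait in the population; it can be used for candidate gene discovery and also provides a new tool for studying the dynamics of QTLs and their multiple alleles (genes and their multiple alleles) within populations, such as population genetic differentiation and population-specific and newly emerged alleles. From the QTL-allele matrix, progeny genotypes of crosses can further be generated by computer simulation and the performance of the homozygous progeny population of each cross predicted, supporting optimal cross design and molecular design breeding. In addition, RTM-GWAS is applicable to recombinant inbred line populations derived from bi-parental crosses and to nested association mapping populations derived from multi-parental crosses, where detection power is higher because interference from population structure bias is avoided. This paper summarizes the principles and procedures of RTM-GWAS and reviews its applications in genetic and breeding research.

Key words (from the Chinese original): genome-wide association analysis, multiple alleles, SNPLDB marker, multi-locus model, QTL-allele matrix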

Abstract:

Genome-wide association studies (GWAS) use genome-wide high-density molecular markers to identify associations between genotype and phenotype and have been widely used for the genetic dissection of quantitative traits in plants and animals. However, previous GWAS methods focused on finding a handful of major loci and, being based on bi-allelic SNP markers, could not detect the multi-allelic genetic variation present in natural populations, which limited the application of GWAS. The restricted two-stage multi-locus genome-wide association analysis (RTM-GWAS) first groups multiple adjacent, tightly linked SNPs on the basis of linkage disequilibrium to form multi-allelic SNPLDB markers whose alleles are haplotypes. Second, population structure bias is estimated with the genetic similarity coefficient matrix calculated from the SNPLDB markers, and eigenvectors of this matrix are incorporated as model covariates to correct for population structure bias and reduce false positives. Finally, RTM-GWAS uses a two-stage association analysis, based on the SNPLDB markers and a multi-locus multi-allele model, to detect genome-wide QTLs and their multiple alleles efficiently, and builds the final multi-QTL genetic model with the total QTL genetic contribution restricted to the trait heritability. RTM-GWAS can also detect QTL-by-environment interaction effects from plot-based phenotype data, identifying not only main-effect QTLs but also QTLs that have only an interaction effect with the environment. RTM-GWAS solves the problem that multiple alleles were not estimable in previous GWAS, and it improves detection power and reduces the false positive rate by fitting multiple QTLs simultaneously in a multi-locus model. It provides a potential solution for a relatively thorough detection of genome-wide QTLs and their multiple alleles, and it also estimates allele effects and relative allele frequencies.
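The population structure step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes homozygous lines coded with one integer allele per marker, and it assumes a simple matching coefficient (the fraction of markers at which two lines carry the same allele) as the genetic similarity measure, which may differ from the coefficient RTM-GWAS actually uses. The function names `similarity_matrix` and `structure_covariates` are hypothetical.

```python
import numpy as np


def similarity_matrix(alleles: np.ndarray) -> np.ndarray:
    """Pairwise genetic similarity between lines.

    alleles: (n_lines, n_markers) integer allele codes, one allele per
    marker (homozygous lines assumed). The similarity used here is the
    fraction of markers with identical alleles, an assumption made for
    illustration only.
    """
    # Compare every pair of lines at every marker: shape (n, n, m).
    # Fine for a small example; large panels would need a chunked loop.
    same = alleles[:, None, :] == alleles[None, :, :]
    return same.mean(axis=2)


def structure_covariates(sim: np.ndarray, k: int) -> np.ndarray:
    """Eigenvectors of the similarity matrix with the k largest eigenvalues."""
    # eigh is appropriate because the matrix is symmetric; it returns
    # eigenvalues in ascending order, so reorder to descending.
    eigenvalues, eigenvectors = np.linalg.eigh(sim)
    order = np.argsort(eigenvalues)[::-1][:k]
    return eigenvectors[:, order]


if __name__ == "__main__":
    # Four lines, three multi-allelic markers (toy data).
    alleles = np.array([[0, 1, 2],
                        [0, 1, 2],
                        [1, 0, 2],
                        [1, 0, 0]])
    sim = similarity_matrix(alleles)
    covariates = structure_covariates(sim, k=2)
    print(sim)
    print(covariates.shape)
```

The resulting columns would enter the association model as covariates alongside the marker terms; how many eigenvectors to keep is a modelling choice that the abstract does not specify.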
From the RTM-GWAS results, a QTL-allele matrix can be constructed as a compact representation of the genetic constitution of the population and can be further used for gene discovery. The QTL-allele matrix also provides a new tool for studying the dynamic change of QTLs and their multiple alleles (genes and their multiple alleles), such as population genetic differentiation and population-specific and new alleles. Based on the QTL-allele matrix, the progeny genotypes of a cross between parental lines can be generated by computer simulation, and their phenotypes can then be predicted to assist optimal cross design and molecular design breeding. In addition, RTM-GWAS is more efficient in QTL detection for bi-parental recombinant inbred line populations and multi-parental nested association mapping populations because population structure bias is well controlled. This paper first presents the principles and procedures of the RTM-GWAS method and then describes some potential applications of RTM-GWAS in plant genetic and breeding studies.
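The idea of simulating cross progeny from a QTL-allele matrix can be sketched as follows. This is an illustration under strong simplifying assumptions, not the procedure used by the authors: QTLs are treated as unlinked (each homozygous progeny line inherits each QTL allele from either parent with probability 0.5), allele effects are purely additive, and the function name `simulate_cross` and the layout of `effects` are hypothetical.

```python
import numpy as np


def simulate_cross(parent_a, parent_b, effects, n_progeny=1000, seed=0):
    """Simulate homozygous progeny of a bi-parental cross.

    parent_a, parent_b: length n_qtl integer allele codes of the two
        parental lines (one row each of a QTL-allele matrix).
    effects: (n_qtl, max_alleles) array; effects[q, a] is the additive
        effect of allele a at QTL q.
    Assumes unlinked QTLs and additive effects (illustration only).
    Returns (genotypes, predicted_values).
    """
    rng = np.random.default_rng(seed)
    parent_a = np.asarray(parent_a)
    parent_b = np.asarray(parent_b)
    n_qtl = parent_a.size

    # For each progeny line and QTL, pick the allele of parent A or B.
    from_a = rng.random((n_progeny, n_qtl)) < 0.5
    genotypes = np.where(from_a, parent_a, parent_b)

    # Look up the effect of each inherited allele and sum over QTLs.
    predicted = effects[np.arange(n_qtl), genotypes].sum(axis=1)
    return genotypes, predicted


if __name__ == "__main__":
    # Toy allele effects for three QTLs with up to three alleles each.
    effects = np.array([[1.0, -1.0, 0.0],
                        [0.5, -0.5, 0.0],
                        [2.0, 0.0, -2.0]])
    genotypes, predicted = simulate_cross([0, 1, 2], [0, 0, 1], effects)
    print(predicted.mean(), predicted.max())
```

Summaries of the simulated distribution, such as its mean or an upper percentile, could then be compared across candidate crosses; accounting for linkage between QTLs would require recombination information that is not modelled here.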

Key words: restricted two-stage multi-locus genome-wide association analysis, multiple alleles, SNPLDB marker, multi-locus model, QTL-allele matrix