Scientia Agricultura Sinica, 2022, Vol. 55, Issue 12: 2265-2277. doi: 10.3864/j.issn.0578-1752.2022.12.001

• Crop Genetics and Breeding · Germplasm Resources · Molecular Genetics •

Genome-Wide Association Analysis of Yield Component Traits in Cotton

WANG Juan¹, MA XiaoMei¹, ZHOU XiaoFeng¹, WANG Xin¹, TIAN Qin¹, LI ChengQi², DONG ChengGuang¹

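  1. Cotton Research Institute, Xinjiang Academy of Agricultural and Reclamation Science/Northwest Inland Region Key Laboratory of Cotton Biology and Genetic Breeding, Ministry of Agriculture and Rural Affairs, Shihezi 832000, Xinjiang
  2. Life Science College, Yuncheng University, Yuncheng 044000, Shanxi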
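  • Received: 2022-01-17 Accepted: 2022-03-21 Online: 2022-06-16 Published: 2022-06-23
  • Corresponding authors: LI ChengQi, DONG ChengGuang
  • About the author: WANG Juan, E-mail: cottonwj@126.com
  • Funding:
    Young and Middle-Aged Leading Talents Program for Science and Technology Innovation of the Xinjiang Production and Construction Corps (2019CB016); Science and Technology Research Program of the Xinjiang Production and Construction Corps (2019AB021)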

Genome-Wide Association Study of Yield Component Traits in Upland Cotton (Gossypium hirsutum L.)

WANG Juan¹, MA XiaoMei¹, ZHOU XiaoFeng¹, WANG Xin¹, TIAN Qin¹, LI ChengQi², DONG ChengGuang¹

  1. Cotton Research Institute, Xinjiang Academy of Agricultural and Reclamation Science/Northwest Inland Region Key Laboratory of Cotton Biology and Genetic Breeding, Ministry of Agriculture and Rural Affairs, Shihezi 832000, Xinjiang
  2. Life Science College, Yuncheng University, Yuncheng 044000, Shanxi
  • Received: 2022-01-17 Accepted: 2022-03-21 Online: 2022-06-16 Published: 2022-06-23
  • Contact: LI ChengQi, DONG ChengGuang

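Abstract:

【Objective】 To conduct a genome-wide association study (GWAS) of cotton yield component traits, namely boll weight, lint percentage, number of bolls per plant, and seed index, in order to identify associated marker loci, elite alleles, and candidate genes, and to provide a theoretical basis for molecular breeding of cotton yield. 【Method】 Using 408 upland cotton varieties (lines) as materials and the Cotton SNP 80K chip, a GWAS based on a mixed linear model (MLM) was performed for the four yield component traits measured in six environments, to detect loci and elite alleles significantly associated with these traits. Candidate genes were then mined within the 1 Mb flanking regions of the significantly associated loci according to gene expression levels in transcriptome data. 【Result】 All four yield component traits showed wide phenotypic variation across environments; number of bolls per plant had the largest coefficient of variation (16.67%-22.66%), and the heritability of the traits ranged from 48.4% to 92.2%. Except for boll weight and lint percentage, whose correlation was not significant, all pairs of traits were significantly or highly significantly correlated. Based on the best linear unbiased prediction (BLUP) values of the phenotypic data from the six environments, GWAS detected 23 SNP loci associated with the target traits, distributed in seven genomic regions: five loci associated with boll weight, one with lint percentage, nine with number of bolls per plant, and eight with seed index. Three loci (TM21094, TM21102, and TM57382) were each associated with more than one target trait. Elite alleles were identified for the seven top SNP loci: TM21099 (TT), TM57382 (GG), TM78920 (CC), TM53448 (TT), TM59015 (AA), TM43412 (GG), and TM69770 (AA). Using transcriptome data, 158 possible candidate genes for yield formation were screened from the seven genomic regions. GO enrichment analysis and KEGG pathway analysis showed that the candidate genes belong to diverse functional categories and participate in multiple metabolic pathways. 【Conclusion】 A total of 23 SNP loci associated with yield component traits were identified in the upland cotton variety (line) population, and 158 candidate genes possibly related to yield traits were screened.

Key words: upland cotton, yield components, genome-wide association study, candidate genes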
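
For readers unfamiliar with the mixed linear model named in the Method, its standard textbook form is written out here. The abstract does not give the exact specification, so the contents of the fixed-effect design matrix $X$ (for example, population-structure covariates) and the use of a kinship matrix $K$ are assumptions about a typical MLM, not the authors' stated settings.

```latex
\begin{aligned}
y &= X\beta + Zu + e, \\
u &\sim \mathcal{N}\!\left(0,\ \sigma_g^{2} K\right), \qquad
e \sim \mathcal{N}\!\left(0,\ \sigma_e^{2} I\right)
\end{aligned}
```

Here $y$ is the vector of phenotypic values (for example, the BLUP of a trait for each accession), $\beta$ collects the fixed effects including the effect of the SNP being tested, $u$ is the vector of random polygenic effects with incidence matrix $Z$, and $e$ is the residual. A SNP is declared associated with the trait when the test of its effect in $\beta$ passes the chosen significance threshold.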

Abstract:

【Objective】A genome-wide association study (GWAS) was conducted to identify loci, elite alleles, and candidate genes associated with four yield component traits (boll weight, lint percentage, number of bolls per plant, and seed index), providing a theoretical basis for molecular breeding of cotton yield.【Method】A GWAS based on a mixed linear model (MLM) was performed on 408 upland cotton accessions grown in six environments, using the Cotton SNP 80K chip, to detect SNP loci and elite alleles significantly associated with the four traits. Candidate genes related to the target traits were then mined within the 1 Mb flanking regions of the significant SNPs on the basis of gene expression levels in transcriptome data.【Result】The four yield component traits showed wide phenotypic variation across environments; number of bolls per plant had the largest coefficient of variation (16.67%-22.66%). The heritability of the traits ranged from 48.4% to 92.2%. Correlations among traits were significant or highly significant, except between boll weight and lint percentage. Based on the best linear unbiased prediction (BLUP) values of the phenotypic data from the six environments, a total of 23 significant SNPs associated with the four traits were identified in seven genomic regions. The numbers of loci associated with boll weight, lint percentage, number of bolls per plant, and seed index were 5, 1, 9, and 8, respectively, and three loci (TM21094, TM21102, and TM57382) were each associated with more than one target trait. Elite alleles were identified for the seven top SNP loci: TM21099 (TT), TM57382 (GG), TM78920 (CC), TM53448 (TT), TM59015 (AA), TM43412 (GG), and TM69770 (AA). A total of 158 candidate genes potentially related to yield formation were selected in the seven genomic regions through analysis of gene expression patterns in RNA-Seq data. Gene Ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses indicated that the candidate genes belong to diverse functional categories and participate in multiple metabolic pathways.【Conclusion】In this study, 23 significant SNPs associated with four yield component traits were identified across 408 upland cotton accessions, and 158 candidate genes possibly related to yield traits were predicted using RNA-Seq data.

Key words: Upland cotton, yield components, genome-wide association analysis, candidate genes
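
The candidate-gene step described in the Method (searching the 1 Mb flanking regions of significant SNPs and filtering by expression) can be illustrated with a minimal Python sketch. Everything concrete in it is hypothetical: the SNP and gene names, coordinates, expression values, and the `min_expression` cut-off are invented for illustration, and treating "1 Mb" as 1 Mb on each side of the SNP is an assumption, since the abstract does not say which convention was used.

```python
from dataclasses import dataclass

WINDOW_BP = 1_000_000  # assumption: 1 Mb on each side of the SNP


@dataclass(frozen=True)
class Snp:
    name: str
    chrom: str
    pos: int  # base-pair position on the chromosome


@dataclass(frozen=True)
class Gene:
    gene_id: str
    chrom: str
    start: int
    end: int
    expression: float  # expression level from RNA-Seq (hypothetical units)


def candidate_genes(snps, genes, window_bp=WINDOW_BP, min_expression=1.0):
    """Return sorted IDs of expressed genes overlapping any SNP's flanking window."""
    hits = set()
    for snp in snps:
        lo, hi = snp.pos - window_bp, snp.pos + window_bp
        for gene in genes:
            # A gene counts if it is on the same chromosome and overlaps [lo, hi].
            overlaps = gene.chrom == snp.chrom and gene.start <= hi and gene.end >= lo
            if overlaps and gene.expression >= min_expression:
                hits.add(gene.gene_id)
    return sorted(hits)


# Hypothetical coordinates and expression values, for illustration only.
snps = [Snp("SNP_A", "A01", 5_000_000)]
genes = [
    Gene("gene_near_expressed", "A01", 5_400_000, 5_402_000, 12.3),
    Gene("gene_near_silent", "A01", 4_900_000, 4_903_000, 0.0),
    Gene("gene_far", "A01", 9_000_000, 9_004_000, 50.0),
    Gene("gene_other_chrom", "D05", 5_000_000, 5_003_000, 8.0),
]
print(candidate_genes(snps, genes))  # ['gene_near_expressed']
```

Only the gene that is both inside the window and expressed above the cut-off is kept; lowering `min_expression` to 0 would also admit the silent gene near the SNP, which shows how strongly the expression filter shapes the final candidate list.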