Scientia Agricultura Sinica ›› 2024, Vol. 57 ›› Issue (8): 1430-1443. doi: 10.3864/j.issn.0578-1752.2024.08.002

• Crop Genetics & Breeding · Germplasm Resources · Molecular Genetics •

中国棉花审定品种SSR指纹库的构建与综合评价

吴玉珍1,2, 黄龙雨1,2, 周大云1,2, 黄义文1,2, 付守阳1,2, 彭军1,2, 匡猛1,2

  1 中国农业科学院棉花研究所/棉花生物育种与综合利用全国重点实验室,河南安阳 455000
  2 三亚中国农业科学院国家南繁研究院,海南三亚 572024
  • 收稿日期:2023-11-09 接受日期:2024-01-08 出版日期:2024-04-16 发布日期:2024-04-24
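  • Corresponding authors:
    PENG Jun, Tel: 13937228796; E-mail:
    KUANG Meng, Tel: 15836313471; E-mail:
  • Contact: WU YuZhen, Tel: 13629838042; E-mail: 15959263920@163.com
  • Funding:
    Sanya Yazhou Bay Science and Technology City Special Project (SCKJ-JYRC-2022-62); Project of the National Key Laboratory of Cotton Bio-breeding and Integrated Utilization (CB2022C08); Major Project of Agricultural Biological Breeding (2022ZD04019); Joint Program of the Hainan Provincial Natural Science Foundation (SQ2021ZRLH0113)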

Construction of SSR Fingerprint Library and Comprehensive Evaluation for Approved Cotton Varieties in China

WU YuZhen1,2, HUANG LongYu1,2, ZHOU DaYun1,2, HUANG YiWen1,2, FU ShouYang1,2, PENG Jun1,2, KUANG Meng1,2

  1 Institute of Cotton Research, Chinese Academy of Agricultural Sciences/State Key Laboratory of Cotton Biology, Anyang 455000, Henan
  2 National Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences, Sanya 572024, Hainan
  • Received:2023-11-09 Accepted:2024-01-08 Published:2024-04-16 Online:2024-04-24

摘要:

【目的】棉花是一种异源四倍体作物,基因组结构复杂,常异花授粉的繁殖方式也导致棉花品种难以实现高度纯合;棉种市场缺乏有效的监管技术手段,品种多乱杂现象长期存在,严重影响纤维品质一致性。构建中国近20年棉花审定品种标准样品的DNA指纹库,探索棉花品种高通量SSR身份鉴定模式,为棉花品种真实性鉴定和新品种特异性鉴定提供依据;分析审定品种的遗传多样性和群体分化,为棉花不同生态区的适应性鉴定和培育适应新环境的品种提供理论基础。【方法】基于多重PCR技术和毛细管电泳检测方法,使用筛选得到的60个SSR标记构建1 015份棉花审定品种标准样品的DNA指纹库,通过植物品种DNA指纹库管理系统对审定品种SSR指纹进行两两比对,分析审定品种的遗传差异,筛选用于品种鉴定的核心SSR位点。利用聚类分析法和群体结构分析法分析1 015份棉花审定品种的遗传多样性,并计算群体间遗传分化指数。【结果】60个SSR标记在1 015个审定品种中共扩增出216种等位变异,平均等位变异数为3.6,平均PIC值为0.37。1 015个审定品种的SSR指纹进行两两比较,共产生513 591组结果,样品间最大差异位点数为58个。差异位点百分比主要集中在41%—70%,涉及428 115组,占83.36%;其中,差异位点百分比在51%—60%时,涉及组数最多,为197 829组,占38.52%。品种间差异位点百分比大于20%时,占所有品种两两比对组数的99%以上,差异位点百分比低于20%的比对结果只占0.58%。基于组合鉴定法,筛选一套包含10个SSR位点的核心位点组合,在1 015个品种中鉴别能力达到99%。聚类结果和群体结构分析表明,1 015个品种被清晰地划分为5个亚群,G1(n=240)为早熟棉亚群,主要分布于中国北部和西北内陆地区,该亚群品种遗传多样性最丰富,品种间平均遗传距离为0.419。G2(n=277)亚群为中熟棉亚群,分布于长江流域,该亚群杂交种较多,亚群内平均遗传距离为0.309。G3(n=109)亚群属于早熟、中熟棉亚群,分布于河北黑龙港地区,该亚群品种遗传组分相对单一,亚群内平均遗传距离在陆地棉群体中最小,仅为0.150。G4(n=254)亚群属于中早熟棉亚群,主要分布于黄河流域,群体内平均遗传距离为0.307。G5(n=37)亚群由37份海岛棉组成,群体内平均遗传距离最小,为0.149。海岛棉与陆地棉之间遗传分化水平最高,平均FST值为0.503;陆地棉群体内,G3亚群与其他亚群之间遗传分化水平最高,FST值为0.193—0.242。长江流域与黄河流域相比,遗传分化水平最低,FST值为0.112。【结论】构建了1 015个中国近20年审定品种标准样品DNA指纹库,筛选了一套包含10个SSR位点的核心标记组合,可以清晰地鉴别99%以上的品种,创建了“核心位点+扩展位点”的高通量棉花鉴定模式;1 015个品种被划分为5个亚群,其中,陆地棉具有明显的地理分布特征。

关键词: 棉花, 标准样品, SSR标记, DNA指纹库, 综合评价

Abstract:

【Objective】Cotton is an allotetraploid crop with a complex genome, and because it is often cross-pollinated, its varieties rarely reach high homozygosity. The cotton seed market lacks effective technical means of supervision, and a proliferation of mixed and confusing varieties has long persisted, which seriously affects the consistency of fiber quality. This study aimed to construct a DNA fingerprint library of standard samples of cotton varieties approved in China over the past 20 years and to explore a high-throughput SSR identification model for cotton varieties, providing a basis for authenticating existing varieties and testing the distinctness of new ones. It also analyzed the genetic diversity and population differentiation of the approved varieties, providing a theoretical basis for assessing the adaptability of cotton to different ecological regions and for breeding varieties adapted to new environments. 【Method】Using multiplex PCR and capillary electrophoresis, a DNA fingerprint library of 1 015 standard samples of approved cotton varieties was constructed with 60 selected SSR markers. The SSR fingerprints of the approved varieties were compared pairwise in the plant variety DNA fingerprint library management system to analyze genetic differences among varieties and to screen core SSR loci for variety identification. Cluster analysis and population structure analysis were used to assess the genetic diversity of the 1 015 varieties, and the genetic differentiation index between populations was calculated. 【Result】The 60 SSR markers amplified 216 alleles in the 1 015 approved varieties, an average of 3.6 alleles per marker, with a mean PIC value of 0.37.
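The PIC (polymorphism information content) reported above is commonly computed from allele frequencies with the formula of Botstein et al., PIC = 1 - Σp_i² - ΣΣ 2·p_i²·p_j². The abstract does not say which formula or software was used, so the following is only a minimal sketch under that assumption; the function name `pic` and the allele sizes in the example are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pic(alleles):
    """Polymorphism information content of one marker (Botstein-style formula).

    `alleles` is a flat list of allele calls observed at the marker,
    for example fragment sizes in bp (hypothetical input format).
    """
    counts = Counter(alleles)
    total = sum(counts.values())
    freqs = [c / total for c in counts.values()]
    # Sum of squared allele frequencies.
    sum_sq = sum(p * p for p in freqs)
    # Cross term over every unordered pair of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - sum_sq - cross


# Two equally frequent alleles give PIC = 1 - 0.5 - 0.125 = 0.375.
print(pic([150, 150, 154, 154]))
# A monomorphic marker carries no information.
print(pic([150, 150, 150, 150]))
# Mean alleles per marker from the reported totals: 216 alleles over 60 markers.
print(round(216 / 60, 1))
```

Averaging `pic` over all markers would give a mean PIC comparable in kind to the 0.37 reported here, provided the same formula was used.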
Pairwise comparison of the SSR fingerprints of the 1 015 approved varieties produced 513 591 results, with a maximum of 58 differing loci between two samples. The percentage of differing loci fell mainly between 41% and 70%, covering 428 115 pairs (83.36%); the 51%-60% interval contained the most pairs, 197 829 (38.52%). More than 99% of all pairwise comparisons differed at more than 20% of loci, and only 0.58% differed at less than 20%. Using the combination identification method, a core set of 10 SSR loci was selected, with a discrimination ability of 99% among the 1 015 varieties. Cluster analysis and population structure analysis showed that the 1 015 varieties were clearly divided into five subpopulations. G1 (n=240) was an early-maturing subpopulation distributed mainly in northern China and the northwestern inland region; it had the richest genetic diversity, with an average genetic distance of 0.419 between varieties. G2 (n=277) was a medium-maturing subpopulation distributed in the Yangtze River Basin; it contained relatively many hybrids, and its average within-subpopulation genetic distance was 0.309. G3 (n=109) comprised early- and medium-maturing varieties distributed in the Heilonggang region of Hebei; its genetic composition was relatively uniform, and its average within-subpopulation genetic distance, 0.150, was the smallest among the upland cotton subpopulations. G4 (n=254) was a medium-early-maturing subpopulation distributed mainly in the Yellow River Basin, with an average within-subpopulation genetic distance of 0.307.
G5 (n=37) consisted of 37 Sea Island cotton samples and had the smallest average within-population genetic distance, 0.149. Genetic differentiation was highest between Sea Island cotton and upland cotton, with an average FST of 0.503. Within upland cotton, G3 was the most differentiated from the other subpopulations, with FST values of 0.193-0.242. Differentiation was lowest between the Yangtze River Basin and Yellow River Basin subpopulations, with an FST of 0.112. 【Conclusion】A DNA fingerprint library of standard samples of 1 015 varieties approved in China over the past 20 years was constructed. A core set of 10 SSR loci was selected that clearly distinguishes more than 99% of the varieties, and a high-throughput cotton identification model of "core loci + extended loci" was established. The 1 015 varieties were divided into five subpopulations, among which the upland cotton subpopulations showed clear geographical distribution patterns.
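The pairwise comparison described in the abstract can be illustrated with a small sketch. It is not the plant variety DNA fingerprint library management system used by the authors: the data layout (a dict of locus name to a sorted tuple of allele sizes), the function names, the rule of skipping loci missing from either sample, and the three example varieties are all assumptions made for illustration. The last two lines only re-derive the percentage shares reported in the abstract from its own counts.

```python
from itertools import combinations


def percent_different(fp_a, fp_b):
    """Percentage of shared loci at which two fingerprints differ.

    Each fingerprint maps a locus name to a genotype, stored as a sorted
    tuple of allele sizes. Loci absent from either sample are skipped
    (an assumption of this sketch).
    """
    shared = [locus for locus in fp_a if locus in fp_b]
    if not shared:
        raise ValueError("fingerprints share no loci")
    differing = sum(1 for locus in shared if fp_a[locus] != fp_b[locus])
    return 100.0 * differing / len(shared)


def pairwise_differences(fingerprints):
    """Return (name_a, name_b, percent_different) for every unordered pair."""
    names = sorted(fingerprints)
    return [
        (a, b, percent_different(fingerprints[a], fingerprints[b]))
        for a, b in combinations(names, 2)
    ]


def share_percent(count, total):
    """Share of `count` in `total`, as a percentage rounded to two decimals."""
    return round(100.0 * count / total, 2)


# Hypothetical genotypes for three varieties at four SSR loci.
example = {
    "V1": {"L1": (150, 150), "L2": (200, 204), "L3": (99, 99), "L4": (120, 120)},
    "V2": {"L1": (150, 150), "L2": (200, 200), "L3": (99, 99), "L4": (120, 124)},
    "V3": {"L1": (152, 152), "L2": (200, 200), "L3": (101, 101), "L4": (120, 124)},
}

for name_a, name_b, pct in pairwise_differences(example):
    print(name_a, name_b, pct)

# Shares recomputed from the counts given in the abstract.
print(share_percent(428115, 513591))  # pairs in the 41%-70% range
print(share_percent(197829, 513591))  # pairs in the 51%-60% range
```

In this toy data set V1 and V2 differ at two of four loci (50%), V1 and V3 at all four (100%), and V2 and V3 at two (50%). Binning such percentages over all pairs would give a distribution of the kind summarized in the results.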

Key words: cotton, standard samples, SSR markers, DNA fingerprint database, comprehensive evaluation