Scientia Agricultura Sinica ›› 2020, Vol. 53 ›› Issue (4): 695-706. doi: 10.3864/j.issn.0578-1752.2020.04.003

• Crop Genetics and Breeding · Germplasm Resources · Molecular Genetics •

Identification of SSR Loci and Marker Development in Cultivated Peanut Based on RNA-seq Data

ZhiJun XU1, Sheng ZHAO2, Lei XU1, XiaoWen HU1, DongSheng AN1, Yang LIU1

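  1 Zhanjiang Experiment Station, Chinese Academy of Tropical Agricultural Sciences/Guangdong Engineering Technology Research Center for Dryland Water-saving Agriculture, Zhanjiang 524013, Guangdong
  2 Shenzhen Agricultural Genome Research Institute, Chinese Academy of Agricultural Sciences, Shenzhen 518120, Guangdong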
  • Received: 2019-07-28 Accepted: 2019-10-21 Published: 2020-02-16 Online: 2020-03-09
  • Contact: ZhiJun XU, E-mail: zhijunxu1990@163.com
  • Funding:
    Special Fund for Science and Technology Innovation Teams of the Chinese Academy of Tropical Agricultural Sciences (1630102017002)

Discovery of Microsatellite Markers from RNA-seq Data in Cultivated Peanut (Arachis hypogaea)

ZhiJun XU1, Sheng ZHAO2, Lei XU1, XiaoWen HU1, DongSheng AN1, Yang LIU1

  1 Zhanjiang Experiment Station, Chinese Academy of Tropical Agricultural Sciences/Guangdong Engineering Technology Research Center for Dryland Water-saving Agriculture, Zhanjiang 524013, Guangdong
  2 Shenzhen Agricultural Genome Research Institute, Chinese Academy of Agricultural Sciences, Shenzhen 518120, Guangdong
  • Received: 2019-07-28 Accepted: 2019-10-21 Published: 2020-02-16 Online: 2020-03-09

Abstract:

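【Objective】To identify SSR loci in peanut RNA-seq data, characterize the distribution and structural features of SSR loci in the transcriptome, and develop SSR markers associated with peanut genes, thereby laying a foundation for mining important functional genes, studying allelic variation, and molecular marker-assisted breeding in peanut. 【Method】Using RNA-Seq data from 22 different tissue types covering the full growth period of cultivated peanut, the distribution and characteristics of SSR loci were analyzed with MISA. Gene-associated SSR primers were designed with Primer 3, and primer quality was checked with electronic PCR (e-PCR) software. Thirty-eight primer pairs were randomly chosen, synthesized, and tested for polymorphism. 【Result】A total of 19 143 SSR loci were identified from 52 280 transcripts; they were distributed across 14 084 transcripts, an occurrence frequency of 26.94%. Repeat units ranged from mononucleotide to pentanucleotide. Mononucleotide and trinucleotide SSRs were the most numerous, accounting for 39.24% and 38.40% of all loci, respectively. The dominant motifs for the five repeat-unit classes were A/T, AG/CT, AAG/CTT, AAAG/CTTT, and AACAC/GTGTT, accounting for 97.62%, 72.01%, 30.96%, 24.59%, and 16.67% of their respective classes. Repeat numbers ranged from 5 to 47. Individual SSR loci were 10-47 bp long, mostly 10-14 bp; compound SSR loci were 21-249 bp long, mostly 31-40 bp. Primers could be designed for 13 477 of the identified SSR loci. Of the corresponding transcripts, 5 020 mapped to specific genes and together contained 5 859 primer-designable SSR loci. These loci were unevenly distributed across the 20 chromosomes of the A and B genomes, with the most (484) on chromosome B03. When the gene-specific SSR primers were tested by e-PCR, the numbers of effectively amplified loci in the A. duranensis, A. ipaensis, and A. hypogaea genomes were 4 468, 4 929, and 10 188, and the numbers of effective primer pairs were 3 968 (67.74%), 4 232 (72.25%), and 5 174 (88.33%), respectively. In the A. hypogaea genome, most primer pairs amplified two loci, and 1 477 primer pairs amplified a single locus. A physical map of the SSR loci was drawn from the positions of the amplified loci in the cultivated peanut genome. Of the 38 SSR primer pairs tested, 35 (92.1%) amplified clear bands, and 11 (28.9%) amplified bands that differed between the two varieties. 【Conclusion】A total of 13 477 SSR loci suitable for marker development were identified, and 5 859 gene-associated SSR markers were developed and tested. These markers showed high amplification efficiency in the cultivated peanut genome, and a physical map of the gene-associated SSR loci was constructed.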
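The SSR screening step in the Method can be illustrated with a minimal Python sketch. This is not MISA itself and not the authors' configuration: the function name `find_ssrs` and the thresholds in `MIN_REPEATS` are hypothetical, chosen only to be consistent with the reported minimum locus length of 10 bp and minimum repeat number of 5. The sketch finds perfect (uninterrupted) SSRs only and does not merge neighbouring loci into compound SSRs.

```python
import re

# ASSUMPTION: minimum repeat counts per motif length (motif length -> repeats).
# These values are illustrative and are not taken from the paper.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count, length_bp) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n,} demands at least n further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit,
            # e.g. "AA" (really mononucleotide) or "ATAT" (really dinucleotide).
            if any(motif == motif[:d] * (unit_len // d)
                   for d in range(1, unit_len) if unit_len % d == 0):
                continue
            length = m.end() - m.start()
            hits.append((m.start(), motif, length // unit_len, length))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GGC" + "A" * 12 + "CGT" + "AG" * 7 + "TTC" + "AAG" * 5 + "C"
    for start, motif, count, length in find_ssrs(demo):
        print(f"pos={start} motif={motif} repeats={count} length={length} bp")
```

Running the script on the toy sequence reports one mononucleotide, one dinucleotide, and one trinucleotide locus. A real analysis would read transcripts from a FASTA file and apply the thresholds actually used in the study.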

Keywords: peanut, RNA-seq data, SSR loci, gene-associated SSR markers, physical map

Abstract:

【Objective】 This study aimed to identify SSRs in peanut RNA-seq data, clarify their distribution and structural characteristics, and develop gene-associated SSR markers, laying a foundation for mining important functional genes, studying allelic variation, and marker-assisted breeding in peanut. 【Method】 Published RNA-Seq data from 22 tissue types and developmental stages spanning the full development of cultivated peanut were analyzed with MISA to determine the distribution and characteristics of SSRs. Gene-associated SSR primers were designed with Primer 3.0, and their quality was checked with e-PCR 2.3.9. Thirty-eight primer pairs were randomly chosen and synthesized for polymorphism testing. 【Result】 A total of 19 143 SSRs were identified from 52 280 transcripts. They were distributed across 14 084 transcripts, so 26.94% of transcripts contained at least one SSR. Repeat units ranged from mononucleotide to pentanucleotide, with mononucleotide and trinucleotide repeats dominant, accounting for 39.24% and 38.40% of all SSR loci, respectively. The dominant motifs of the five repeat-unit classes were A/T, AG/CT, AAG/CTT, AAAG/CTTT, and AACAC/GTGTT, accounting for 97.62%, 72.01%, 30.96%, 24.59%, and 16.67% of their respective classes. Repeat numbers ranged from 5 to 47. Single SSR loci were 10-47 bp long, mostly 10-14 bp, and compound SSR loci were 21-249 bp long, mostly 31-40 bp. Primers could be designed for 13 477 of the SSR loci. Of the corresponding transcripts, 5 020 were annotated to specific genes and together contained 5 859 primer-designable SSR loci. These loci were unevenly distributed on the 20 chromosomes of the A and B genomes, with chromosome B03 carrying the most (484). In e-PCR, the primers amplified 4 468, 4 929, and 10 188 effective loci in the A. duranensis, A. ipaensis, and A. hypogaea genomes, with 3 968 (67.74%), 4 232 (72.25%), and 5 174 (88.33%) effective primer pairs, respectively. In the A. hypogaea genome, most primer pairs amplified two loci, while 1 477 primer pairs were single-locus markers. A physical map of the amplified SSR loci was drawn from their positions in the cultivated peanut genome. Of the 38 randomly synthesized primer pairs, 35 (92.1%) amplified stable, clear bands in two peanut varieties, and 11 (28.9% of the 38) amplified bands that differed between the two varieties. 【Conclusion】 In this study, 13 477 SSR loci suitable for primer design were identified, and 5 859 gene-associated SSR markers were developed and tested. These markers showed high amplification efficiency in the cultivated peanut genome, and a physical map of the gene-associated SSR loci was constructed.
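The motif classes in the Result (A/T, AG/CT, AAG/CTT, AAAG/CTTT, AACAC/GTGTT) pair each motif with its reverse complement, because an SSR read from the opposite strand is the same locus. The short Python sketch below shows one way to assign an observed motif to such a class. The function names are hypothetical, and the convention of taking the alphabetically smallest rotation on each strand is an assumption that happens to reproduce the labels used in the abstract; it is not stated in the source.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Reverse complement of a DNA motif, e.g. 'AAG' -> 'CTT'."""
    return motif.upper().translate(_COMPLEMENT)[::-1]


def smallest_rotation(motif):
    """Alphabetically smallest cyclic rotation, e.g. 'GA' -> 'AG'."""
    return min(motif[i:] + motif[:i] for i in range(len(motif)))


def motif_class(motif):
    """Label such as 'AG/CT' covering all rotations on both strands."""
    motif = motif.upper()
    forward = smallest_rotation(motif)
    reverse = smallest_rotation(reverse_complement(motif))
    # A set collapses self-complementary motifs such as 'AT' to one name.
    return "/".join(sorted({forward, reverse}))


if __name__ == "__main__":
    for m in ["T", "GA", "TC", "TTC", "CTTT", "ACACA", "AT"]:
        print(m, "->", motif_class(m))
```

Under this convention, GA, AG, TC, and CT all fall into the AG/CT class, which is why the per-class percentages in the abstract refer to motif pairs rather than to single sequences.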

Key words: peanut, RNA-seq data, SSR loci, gene-associated SSR markers, physical map