Discovery of Microsatellite Markers from RNA-seq Data in Cultivated Peanut (Arachis hypogaea)

ZhiJun XU1, Sheng ZHAO2, Lei XU1, XiaoWen HU1, DongSheng AN1, Yang LIU1

  1 Zhanjiang Experiment Station, Chinese Academy of Tropical Agricultural Sciences/Guangdong Engineering Technology Research Center for Dryland Water-saving Agriculture, Zhanjiang 524013, Guangdong
  2 Shenzhen Agricultural Genome Research Institute, Chinese Academy of Agricultural Sciences, Shenzhen 518120, Guangdong
  • Received: 2019-07-28 Accepted: 2019-10-21 Online: 2020-02-16 Published: 2020-03-09
  • Contact: Yang LIU


【Objective】 This study aimed to identify simple sequence repeats (SSRs) in peanut RNA-seq data, clarify their distribution and structural characteristics, and develop gene-associated SSR markers. The results may lay a foundation for mining important functional genes in peanut, studying allelic variation, and molecular marker-assisted breeding.

【Method】 Published RNA-seq data from 22 tissue types and developmental stages of cultivated peanut, representing its full development, were used to analyze the distribution and characteristics of SSRs with MISA software. Gene-associated SSR primers were designed with Primer 3.0, and their quality was checked with e-PCR 2.3.9. Thirty-eight primer pairs were randomly selected and synthesized for polymorphism testing.

【Result】 A total of 19 143 SSRs were identified from 52 280 transcripts; they were distributed in 14 084 transcripts, a frequency of 26.94%. Among mono- to pentanucleotide repeats, mononucleotide and trinucleotide were the dominant repeat unit types, accounting for 39.24% and 38.40% of all SSR loci, respectively. The dominant motifs of each repeat unit type were A/T, AG/CT, AAG/CTT, AAAG/CTTT, and AACAC/GTGTT, accounting for 97.62%, 72.01%, 30.96%, 24.59%, and 16.67% of the corresponding repeat unit type, respectively. Repeat units were repeated 5-47 times. The length of single SSR loci ranged from 10 to 47 bp, mainly concentrated at 10-14 bp, and the length of compound SSR loci ranged from 21 to 249 bp, mainly concentrated at 31-40 bp. Of all the SSRs, 13 477 were suitable for developing SSR markers; 5 020 of the corresponding transcript sequences were annotated to specific genes and contained 5 859 SSR marker loci. These SSRs were unevenly distributed across the 20 chromosomes of the A and B genomes, with chromosome B03 carrying the most SSR loci (484). Electronic PCR amplified 4 468, 4 929, and 10 188 effective loci in the genomes of A. duranensis, A. ipaensis, and A. hypogaea, with 3 968 (67.74%), 4 232 (72.25%), and 5 174 (88.33%) effective markers, respectively. In the A. hypogaea genome, most SSR primer pairs amplified 2 loci, while 1 477 primer pairs were single-locus markers. A physical map of the amplified SSR loci was drawn according to their positions in the cultivated peanut genome. Of the randomly synthesized primers, 35 pairs (92.1%) amplified stable, clear bands in two peanut varieties, and 11 of these pairs (28.9%) amplified bands that differed between the two varieties.

【Conclusion】 In this study, 13 477 SSRs suitable for primer design were identified, and 5 859 gene-associated SSR markers were developed and tested. The markers showed high amplification efficiency in the cultivated peanut genome, and a physical map of the gene-associated SSRs was constructed.

Key words: peanut, RNA-seq data, SSR loci, gene-associated SSR markers, physical map
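The SSR search described in the abstract can be illustrated with a minimal Python sketch. It is not MISA and does not reproduce the study's settings: the minimum repeat counts below are an assumption inferred from Table 2 (mononucleotides first appear at 10 repeats, dinucleotides at 6, tri- to pentanucleotides at 5), the function name `find_ssrs` is hypothetical, and only perfect (non-compound) repeats are reported.

```python
import re

# Minimum repeat counts per motif length. ASSUMPTION: inferred from Table 2,
# not taken from the study's actual MISA configuration.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AA"
            # (really a mononucleotide) or "ATAT" (really a dinucleotide).
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GGC" + "A" * 12 + "CGT" + "AG" * 7 + "TTC" + "AAG" * 5 + "C"
    for start, motif, count in find_ssrs(demo):
        print(start, motif, count)
```

On the toy sequence in the `__main__` block, the sketch reports one mononucleotide, one dinucleotide, and one trinucleotide repeat. Dedicated tools additionally handle compound SSRs and reverse-complement motif classes such as AG/CT, which this sketch leaves out.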

Table 1

Distribution of RNA-seq SSR locus characteristics in cultivated peanut

| Repeat unit type | Number of motif types | SSR number | Ratio (%) | Distribution frequency (%) | Dominant motif | Number of dominant motif | Ratio of dominant motif (%) |
|---|---|---|---|---|---|---|---|
| Mononucleotide | 4 | 7513 | 39.24 | 14.37 | A/T | 7334 | 97.62 |
| Dinucleotide | 12 | 3926 | 20.51 | 7.51 | AG/CT | 2827 | 72.01 |
| Trinucleotide | 60 | 7351 | 38.40 | 14.06 | AAG/CTT | 2276 | 30.96 |
| Tetranucleotide | 87 | 305 | 1.59 | 0.58 | AAAG/CTTT | 75 | 24.59 |
| Pentanucleotide | 39 | 48 | 0.25 | 0.09 | AACAC/GTGTT | 8 | 16.67 |
| Total | 202 | 19143 | | 36.62 | | | |
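The two percentage columns in Table 1 can be recomputed from the counts. The sketch below assumes, based on the numbers themselves rather than on a stated definition, that "Ratio" is the share of all 19 143 SSRs and "Distribution frequency" is the number of SSRs per 100 of the 52 280 transcripts; the function name is hypothetical. Recomputed values may differ from the printed ones in the last decimal place because of rounding.

```python
# Counts copied from Table 1 and the abstract.
TOTAL_TRANSCRIPTS = 52280
SSR_COUNTS = {
    "Mononucleotide": 7513,
    "Dinucleotide": 3926,
    "Trinucleotide": 7351,
    "Tetranucleotide": 305,
    "Pentanucleotide": 48,
}
TOTAL_SSRS = sum(SSR_COUNTS.values())


def ratio_and_frequency(n_ssr):
    """Return (share of all SSRs, SSRs per 100 transcripts), in percent.

    ASSUMPTION: these definitions are inferred from the table values.
    """
    return (round(100 * n_ssr / TOTAL_SSRS, 2),
            round(100 * n_ssr / TOTAL_TRANSCRIPTS, 2))


if __name__ == "__main__":
    for unit, n in SSR_COUNTS.items():
        print(unit, *ratio_and_frequency(n))
    print("Total", TOTAL_SSRS, round(100 * TOTAL_SSRS / TOTAL_TRANSCRIPTS, 2))
```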

Table 2

Repetition times and distribution frequency of each SSR repeat unit in cultivated peanut

| Repeat unit type | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | ≥15 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Mononucleotide | | | | | | 2985 | 1519 | 943 | 657 | 468 | 941 |
| Dinucleotide | | 1446 | 896 | 640 | 475 | 340 | 115 | 14 | | | |
| Trinucleotide | 4352 | 2108 | 761 | 130 | | | | | | | |
| Tetranucleotide | 246 | 59 | | | | | | | | | |
| Pentanucleotide | 48 | | | | | | | | | | |
| Total | 4646 | 3613 | 1657 | 770 | 475 | 3325 | 1634 | 957 | 657 | 468 | 941 |
| Distribution frequency (%) | 24.27 | 18.87 | 8.66 | 4.02 | 2.48 | 17.37 | 8.54 | 5.00 | 3.43 | 2.44 | 4.92 |

Fig. 1

Distribution of SSR motif length

Table 3

Statistics of gene-associated SSRs suitable for primer design

| Chromosome | Single loci | Compound loci | Total SSR loci | Genes containing a single locus | Genes containing multiple loci | Total genes | Average density of SSR loci |
|---|---|---|---|---|---|---|---|
| A01 | 252 | 9 | 261 | 136 | 37 | 217 | 1.21 |
| A02 | 170 | 15 | 185 | 145 | 13 | 172 | 1.08 |
| A03 | 380 | 17 | 397 | 220 | 53 | 335 | 1.19 |
| A04 | 222 | 18 | 240 | 160 | 24 | 212 | 1.13 |
| A05 | 314 | 13 | 327 | 179 | 48 | 277 | 1.18 |
| A06 | 229 | 11 | 240 | 160 | 26 | 213 | 1.13 |
| A07 | 175 | 11 | 186 | 110 | 24 | 160 | 1.16 |
| A08 | 255 | 13 | 268 | 146 | 36 | 225 | 1.19 |
| A09 | 265 | 16 | 281 | 168 | 35 | 242 | 1.16 |
| A10 | 198 | 18 | 216 | 148 | 22 | 193 | 1.12 |
| B01 | 273 | 16 | 289 | 177 | 34 | 250 | 1.16 |
| B02 | 264 | 15 | 279 | 157 | 36 | 236 | 1.18 |
| B03 | 451 | 33 | 484 | 328 | 48 | 430 | 1.13 |
| B04 | 323 | 21 | 344 | 169 | 49 | 281 | 1.22 |
| B05 | 291 | 15 | 306 | 201 | 33 | 270 | 1.13 |
| B06 | 293 | 16 | 309 | 152 | 45 | 253 | 1.22 |
| B07 | 281 | 20 | 301 | 190 | 31 | 261 | 1.15 |
| B08 | 249 | 15 | 264 | 135 | 39 | 219 | 1.21 |
| B09 | 339 | 17 | 356 | 188 | 52 | 298 | 1.19 |
| B10 | 309 | 17 | 326 | 182 | 44 | 276 | 1.18 |
| Whole genome | 5533 | 326 | 5859 | 3451 | 729 | 5020 | 1.17 |
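The density column of Table 3 appears to be the total number of SSR loci divided by the total number of genes. This reading is an assumption checked against a few rows, not a definition given in the source, and the helper name below is hypothetical. A short sketch of the arithmetic:

```python
def ssr_density(single_loci, compound_loci, total_genes):
    """Return (total SSR loci, loci per gene rounded to two decimals).

    ASSUMPTION: density = (single + compound loci) / total genes, inferred
    from the table values rather than stated in the source.
    """
    total_loci = single_loci + compound_loci
    return total_loci, round(total_loci / total_genes, 2)


if __name__ == "__main__":
    # Rows copied from Table 3: single loci, compound loci, total genes.
    rows = {
        "A02": (170, 15, 172),
        "B03": (451, 33, 430),
        "Whole genome": (5533, 326, 5020),
    }
    for name, values in rows.items():
        print(name, *ssr_density(*values))
```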

Table 4

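Statistics of gene-associated SSR primers amplified in peanut genomes by e-PCR

| DNA template | Amplified loci | Effective amplified loci | Primer pairs amplifying 1 locus | Primer pairs amplifying 2 loci | Primer pairs amplifying 3 loci | Primer pairs amplifying >3 loci | Number of effective primer pairs |
|---|---|---|---|---|---|---|---|
| A. duranensis | 4760 | 4468 | | | | | |
| A. ipaensis | 5264 | 4929 | | | | | |
| A. hypogaea | 10818 | 10188 | | | | | |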
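Electronic PCR amounts to locating primer binding sites in a genome sequence and counting the products. The Python sketch below illustrates the idea only; it is not e-PCR 2.3.9. As stated assumptions, it requires exact primer matches, searches only one strand orientation, pairs each forward site with the nearest downstream reverse site, and uses a hypothetical `max_len` product-size limit. The sequences and function names are invented for illustration.

```python
def revcomp(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def in_silico_pcr(template, fwd, rev, max_len=1000):
    """Return (start, end) coordinates of exact-match products.

    ASSUMPTIONS: exact matches only, forward primer on the given strand,
    nearest downstream reverse site only. Real e-PCR tolerates mismatches
    and gaps and checks both orientations.
    """
    template, fwd, rev = template.upper(), fwd.upper(), rev.upper()
    rev_site = revcomp(rev)  # how the reverse primer appears on this strand
    products = []
    start = template.find(fwd)
    while start != -1:
        site = template.find(rev_site, start + len(fwd))
        if site != -1:
            end = site + len(rev_site)
            if end - start <= max_len:
                products.append((start, end))
        start = template.find(fwd, start + 1)
    return products


if __name__ == "__main__":
    # Toy locus: forward site, an (AG)6 repeat, then the reverse site.
    locus = "TTT" + "ACGTAC" + "AG" * 6 + "GGTTCA" + "TTT"
    # Two copies stand in for the two subgenomes of a tetraploid.
    print(in_silico_pcr(locus * 2, "ACGTAC", "TGAACC"))
```

With two copies of the toy locus, the primer pair yields two products, which mirrors the observation that most primer pairs amplified 2 loci in the A. hypogaea genome.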

Fig. 2

Amplification site distribution of gene-associated SSR markers in peanut genome

Fig. 3

Physical map of SSR markers in peanut genome. The red line represents peanut genes, the blue line represents SSR loci located on the positive strand, and the green line represents SSR loci located on the negative strand.

Fig. 4

QTL analysis of bacterial wilt resistance in peanut. The red line represents peanut genes, the blue line represents SSR loci located on the positive strand, and the green line represents SSR loci located on the negative strand.

Fig. 5

Amplification of randomly selected SSR markers in Yuanza 9102 and Huayu 910 (partial results). M: Marker; Y: Yuanza 9102; H: Huayu 910. 13-24: SSR markers A05D8UNY-1-1, A0686JNN-1-1, A06R0Z7V-1-1, A06RB83Y-1-1, A076T298-1-1, A07F0Y1Z-1-1, A07F4DXF-1-1, A08648HW-2-1, A089H31H-1-1, A08A0WFX-1-1, A091GS41-1-1, A09D2340-1-1

