中国农业科学 (Scientia Agricultura Sinica), 2023, Vol. 56, Issue (2): 217-235. doi: 10.3864/j.issn.0578-1752.2023.02.002

• Crop Genetics and Breeding · Germplasm Resources · Molecular Genetics •

基于转录组SNP构建油茶主要品种资源的分子身份证

林萍,王开良,姚小华,任华东

  1. 中国林业科学研究院亚热带林业研究所/浙江省林木育种技术研究重点实验室,杭州 311400
  • 收稿日期:2022-08-25 接受日期:2022-10-24 出版日期:2023-01-16 发布日期:2023-02-07
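  • Contact: LIN Ping, Tel: 0571-63320229; E-mail: linping80@126.com.
  • Funding:
    National "13th Five-Year" Special Project for the Survey of Basic Scientific and Technological Resources (2019FY100801); Zhejiang Province Major Science and Technology Project for the Breeding of New Forest Tree Varieties (2021C02070-2)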

Development of DNA Molecular ID in Camellia oleifera Germplasm Based on Transcriptome-Wide SNPs

LIN Ping, WANG KaiLiang, YAO XiaoHua, REN HuaDong

  1. Research Institute of Subtropical Forestry, Chinese Academy of Forestry/Key Laboratory of Tree Breeding of Zhejiang Province, Hangzhou 311400
  • Received:2022-08-25 Accepted:2022-10-24 Published:2023-01-16 Online:2023-02-07

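Abstract (Chinese version):

【Objective】In recent years the Camellia oleifera industry has developed rapidly, and C. oleifera has become one of China's four major oil crops. Improved cultivars keep emerging, but their quality is uneven, and cases of one name covering different material, or of one material carrying different names, occur from time to time. This study established a single-nucleotide polymorphism (SNP) marker database for C. oleifera cultivar resources, screened key SNP loci, developed DNA fingerprints, and constructed molecular IDs, in order to provide molecular-level technical support for cultivar identification and traceability. 【Method】RNA was extracted from immature seeds of 221 accessions of common C. oleifera cultivar resources and subjected to transcriptome sequencing. With the diploid genome of 南荣油茶 (C. oleifera var. ‘Nanyongensis’) as the reference, SNP loci were identified and genotyped across the accessions. The SNP data were used to analyze the genetic diversity of the population and its subpopulations, including the observed heterozygosity, expected heterozygosity, and polymorphism information content (PIC) of the SNP loci. Core SNP loci were screened and verified by Sanger sequencing; once the optimal combination of loci was obtained, it was combined with the basic information of each accession to construct the molecular IDs. 【Result】A total of 1 849 953 high-quality SNP loci were detected in the C. oleifera transcriptomes. Population genetic diversity analysis gave an observed heterozygosity of 0.2966, an expected heterozygosity of 0.2462, a fixation index of -0.2048, a PIC of 0.2073, and a minor allele frequency of 0.1648. Genetic differentiation among the subpopulations was small, gene flow was high, and most of the variation lay within subpopulations. Based on PIC, linkage disequilibrium (LD) decay distance, and other parameters, 31 highly polymorphic core loci were selected from all SNP loci; Sanger sequencing of 8 of these core loci confirmed a genotyping accuracy above 91.36%. DNA fingerprints composed of the core loci distinguished all tested accessions. Combining the DNA fingerprints with the basic information of the accessions produced a molecular ID consisting of 66 digits. 【Conclusion】Based on PIC, LD, and other indicators of the SNP markers, 31 core SNP loci were selected that precisely distinguish all tested accessions. The DNA fingerprints built from these 31 loci were combined with basic attributes of each accession (origin, resource type, and subpopulation assignment) to construct a unique molecular ID for every accession, and the corresponding bar codes and QR codes were generated.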

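Key words (Chinese version): common Camellia oleifera, cultivar identification, single-nucleotide polymorphism, fingerprint, molecular ID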

Abstract:

【Objective】Camellia oleifera is a traditional woody oil plant that has been widely cultivated in China. To facilitate the protection and precise management of C. oleifera cultivars and to avoid homonyms and synonyms among them, a single-nucleotide polymorphism (SNP) marker database of C. oleifera cultivars was established, and a set of core SNPs was selected to construct a molecular fingerprint and ID for each cultivar. 【Method】RNA was extracted from developing seeds of 221 C. oleifera clones and RNA-seq was performed. Using the genome sequence of C. oleifera var. ‘Nanyongensis’ as the reference, high-quality SNPs were identified and the accessions were genotyped. The genetic diversity of the C. oleifera population and its subpopulations was then analyzed from the SNP data, including the observed heterozygosity, expected heterozygosity, and polymorphism information content (PIC) of the SNPs. The SNP loci were further filtered by polymorphism and location to obtain the optimal combination of core SNP loci, which were verified by Sanger sequencing. A fingerprint was formed for each clone from its genotypes at the core SNPs, and the molecular ID of each clone was constructed by combining this fingerprint with the clone's basic information. 【Result】A total of 1 849 953 high-quality SNP loci were obtained from the C. oleifera transcriptomes. The average observed heterozygosity, expected heterozygosity, fixation index, PIC, and minor allele frequency of the population were 0.2966, 0.2462, -0.2048, 0.2073, and 0.1648, respectively. Genetic differentiation among the subpopulations was minor, with a high level of gene flow, and most of the variation lay within subpopulations. After filtering by PIC, linkage disequilibrium (LD) decay distance, and other parameters, 31 core SNP loci were selected. Genotypes at eight of the core loci were further checked by Sanger sequencing, and the verification rates were all above 91.36%. All C. oleifera clones used in this study could be distinguished by the DNA fingerprints built from the 31 core SNPs. Based on the fingerprint of the 31 SNP markers and the basic information of each clone, a molecular ID consisting of 66 digits was formed for each clone. 【Conclusion】Based on the polymorphism information of the SNP markers, 31 core SNP loci were selected that accurately distinguish all C. oleifera clones tested. DNA fingerprints of the 221 clones were constructed from these 31 SNP markers. A unique molecular identity code was constructed for each germplasm from its DNA fingerprint and from serial codes converted from the clone's basic information. Finally, bar codes and quick response (QR) codes were generated as the molecular ID, which can be read quickly by code-scanning equipment.
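The per-locus diversity statistics named in the abstract can be illustrated with a small sketch. It assumes textbook definitions that the abstract itself does not spell out: expected heterozygosity He = 1 - Σp², PIC after Botstein et al., and fixation index F = 1 - Ho/He. The function name `snp_locus_stats` and the input layout (one two-letter call per accession, missing calls removed) are hypothetical. Under that definition of F, the reported means give 1 - 0.2966/0.2462 ≈ -0.2047, close to the reported -0.2048; the negative sign indicates more heterozygotes than expected under random mating.

```python
from collections import Counter


def snp_locus_stats(genotypes):
    """Summary statistics for one biallelic SNP.

    genotypes: list of two-letter calls such as "AG", one per accession,
    with missing calls already removed (an assumption of this sketch).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotype calls")
    counts = Counter(allele for call in genotypes for allele in call)
    if len(counts) > 2:
        raise ValueError("this sketch handles biallelic SNPs only")
    freqs = [c / (2 * n) for c in counts.values()]

    # Observed heterozygosity: share of accessions with two different alleles.
    ho = sum(call[0] != call[1] for call in genotypes) / n
    # Expected heterozygosity under Hardy-Weinberg proportions.
    he = 1.0 - sum(p * p for p in freqs)
    # PIC; for two alleles this reduces to He - 2 * p^2 * q^2.
    pic = he - sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    # Minor allele frequency (0 when the locus is monomorphic).
    maf = min(freqs) if len(freqs) == 2 else 0.0
    # Fixation index; undefined when the locus is monomorphic.
    f = 1.0 - ho / he if he > 0 else float("nan")
    return {"Ho": ho, "He": he, "PIC": pic, "MAF": maf, "F": f}


if __name__ == "__main__":
    # Toy locus: 2 AA, 6 AG, 2 GG accessions (made-up data, not from the study).
    print(snp_locus_stats(["AA"] * 2 + ["AG"] * 6 + ["GG"] * 2))
```

For the toy locus, both alleles have frequency 0.5, so He is 0.5 while Ho is 0.6, and F comes out near -0.2. Population-level values like those in the abstract would be averages of such per-locus figures over all SNPs.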

Key words: Camellia oleifera, cultivar identification, single-nucleotide polymorphism (SNP), DNA fingerprint, molecular ID
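The idea of turning a 31-SNP fingerprint plus basic attributes into a 66-digit ID can be sketched as follows. The abstract states only the total length and the kinds of information combined; the layout used here (a 4-digit attribute prefix followed by two digits per SNP, 4 + 62 = 66), the allele-to-digit mapping, and all function names are assumptions chosen for illustration, not the authors' actual coding scheme.

```python
# Hypothetical allele-to-digit mapping; the paper's real coding is not given here.
ALLELE_DIGIT = {"A": "1", "C": "2", "G": "3", "T": "4"}
N_CORE_SNPS = 31  # number of core loci reported in the abstract


def fingerprint(genotypes):
    """Encode 31 two-letter genotype calls as a 62-digit string."""
    if len(genotypes) != N_CORE_SNPS:
        raise ValueError(f"expected {N_CORE_SNPS} genotype calls")
    # Sort the alleles within each call so "GA" and "AG" encode identically.
    return "".join(
        ALLELE_DIGIT[allele] for call in genotypes for allele in sorted(call)
    )


def molecular_id(origin, resource_type, subpopulation, genotypes):
    """Assumed layout: 2 digits origin, 1 digit type, 1 digit subpopulation,
    then the 62-digit fingerprint, for 66 digits in total."""
    if not (0 <= origin <= 99 and 0 <= resource_type <= 9 and 0 <= subpopulation <= 9):
        raise ValueError("attribute code out of range for the assumed layout")
    return f"{origin:02d}{resource_type}{subpopulation}" + fingerprint(genotypes)


def all_distinct(ids):
    """True when every accession received a different ID."""
    return len(set(ids)) == len(ids)


if __name__ == "__main__":
    # Made-up accession: attribute codes 3 / 1 / 2 and 31 heterozygous A/G calls.
    example = molecular_id(3, 1, 2, ["GA"] * N_CORE_SNPS)
    print(example, len(example))
```

A check like `all_distinct` over the IDs of all accessions mirrors the abstract's claim that the core loci separate every clone. The resulting digit string is what a bar code or QR code generator would then take as its payload.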