Scientia Agricultura Sinica ›› 2012, Vol. 45 ›› Issue (7): 1246-1256. doi: 10.3864/j.issn.0578-1752.2012.07.002

• Crop Genetics and Breeding · Germplasm Resources · Molecular Genetics •

Transcriptome Analysis of Sesame Development

 WEI Li-Bin, MIAO Hong-Mei, ZHANG Hai-Yang

  1. Henan Sesame Research Center, Henan Academy of Agricultural Sciences, Zhengzhou 450002
  • Received: 2011-09-29 Publication date: 2012-04-01 Release date: 2011-12-26
  • Corresponding author: ZHANG Hai-Yang, Tel: 0371-65715936; E-mail: zhy@hnagri.org.cn
  • About the authors: WEI Li-Bin, E-mail: libinwei2008@yahoo.com.cn. MIAO Hong-Mei, E-mail: miaohongmei@yahoo.com.cn. WEI Li-Bin and MIAO Hong-Mei contributed equally to this work.
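  • Supported by:

    National Key Basic Research Program of China ("973" Program) (2011CB109304); Earmarked Fund for the Modern Agro-industry Technology Research System (CARS-15)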

Transcriptomic Analysis of Sesame Development

 WEI Li-Bin, MIAO Hong-Mei, ZHANG Hai-Yang

  1. Henan Sesame Research Center, Henan Academy of Agricultural Sciences, Zhengzhou 450002
  • Received: 2011-09-29 Online: 2012-04-01 Published: 2011-12-26

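Abstract: 【Objective】 To systematically characterize the transcriptome of sesame development and seed formation and to enrich the available sesame transcriptome data. 【Method】 Six sesame samples (whole plants of five different sesame varieties and one sample of seeds at different developmental stages) were used to construct transcriptome sequencing libraries, which were subjected to Illumina paired-end RNA-seq and bioinformatics analysis. 【Result】 A total of 12.69 Gb of raw data and 8.80 Gb of clean data were obtained. De novo assembly yielded 26 837 transcripts longer than 100 bp (http://www.ncbi.nlm.nih.gov/genbank/TSA.html, accession numbers JP631635-JP668414); the total transcript length was 18.35 Mb, the mean length 683 bp and the N50 length 1 006 bp. Annotation showed that 25 331 transcripts had homologous matches, while 1 506 transcripts had no matching sequences (no hits) and may be sesame-specific gene sequences. Using the COG and GO functional classification tools, the annotated transcripts were assigned to 24 and 42 functional categories, respectively, covering many physiological and biochemical processes such as material and energy metabolism, signal transduction, transcriptional regulation and defense responses. By comparing transcript sequences and expression levels among the materials, it was preliminarily determined that 1 277 sequences were down-regulated more than 10-fold during seed formation and 990 sequences were expressed only in sesame plants and not in seeds; 660 sequences were up-regulated more than 10-fold during seed formation, and 296 sequences may be specifically associated with seed formation. 【Conclusion】 High-throughput sequencing was used to study the transcriptomes of a wild sesame species and several cultivated varieties at the budding stage, together with the seed formation process, revealing the overall expression characteristics of the sesame developmental transcriptome. Along with a large number of sesame unigene sequences, a set of gene sequences with important functions in sesame growth, development and seed formation was obtained. These results provide rich data resources for in-depth studies of the function and regulation of genes related to sesame growth and seed development, and for the development of sesame molecular markers.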
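The assembly statistics reported in the abstract (transcript count, total length, mean length, N50) can be illustrated with a minimal Python sketch. It assumes the common definition of N50, namely the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length. The function names `n50` and `assembly_stats` and the toy lengths are hypothetical and are not taken from the paper, whose actual pipeline is not described here.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def assembly_stats(lengths, min_length=100):
    """Summarize sequences strictly longer than min_length bp."""
    kept = [n for n in lengths if n > min_length]
    if not kept:
        raise ValueError("no sequence is longer than min_length")
    return {
        "count": len(kept),
        "total_bp": sum(kept),
        "mean_bp": sum(kept) / len(kept),
        "n50_bp": n50(kept),
    }


# Toy lengths, invented for illustration only.
toy_lengths = [90, 150, 200, 400, 650, 1000]
stats = assembly_stats(toy_lengths)
print(stats)
```

On the toy input, the 90 bp sequence is filtered out, mirroring the "longer than 100 bp" cutoff mentioned in the abstract; the remaining five sequences total 2 400 bp, and the two longest (1 000 bp and 650 bp) already cover half of that, so the sketch reports an N50 of 650 bp.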

Keywords: sesame, development, transcriptome sequencing, bioinformatics analysis

Abstract: 【Objective】 To enrich sesame transcriptome data, transcriptome sequencing and bioinformatics analysis of sesame growth, development and seed formation were performed in this study. 【Method】 Six transcriptome sequencing libraries for developing sesame plants and seeds were constructed and sequenced using Illumina RNA sequencing, and the global transcriptome information was then analyzed. 【Result】 After removal of adaptor sequences, duplicate sequences and low-quality reads from the original 12.69 Gb of Solexa sequencing data, 8.80 Gb of usable transcriptome data were obtained. De novo assembly yielded 26 837 uni-transcripts longer than 100 bp (http://www.ncbi.nlm.nih.gov/genbank/TSA.html, GenBank ID: JP631635-JP668414). The total scaffold sequence length reached 18.35 Mb, with an N50 of 1 006 bp and an average uni-transcript length of 683 bp. Annotation indicated that 25 331 transcripts had homologs in the public protein database, whereas 1 506 sequences had no hits and might be sesame-specific. With COG and GO functional classification, the annotated uni-transcripts were grouped into 24 and 42 functional categories, respectively, covering material and energy metabolism, signaling, transcriptional regulation, defense reactions and other processes. Furthermore, compared with the plant growth and development transcripts, 1 277 sequences were expressed at more than 10-fold lower levels during seed formation, and 990 uni-transcripts were detected only in plants and not in seeds. In addition, 660 sequences showed more than 10-fold higher expression during seed development, of which 296 appeared to be seed-specific. 【Conclusion】 With transcriptome sequencing of Sesamum radiatum and several cultivars of S. indicum L., this study provides a global view of the sesame developmental transcriptome, and thousands of transcript sequences with important functions were obtained for future research on gene expression and regulation during sesame growth and development.
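The plant-versus-seed comparison summarized above (more than 10-fold lower, more than 10-fold higher, plant-only, seed-specific) can be sketched as a simple classification rule. This is only an illustration under stated assumptions: the abstract does not name the expression measure or the exact criteria, so the normalized expression values, the separate handling of zero expression, and the names `classify_transcript` and `toy_expression` are all hypothetical.

```python
from collections import Counter


def classify_transcript(plant, seed, fold=10.0):
    """Label one transcript from its plant and seed expression values.

    Assumes both values are non-negative and already normalized
    (for example RPKM-like values); this is an assumption, not the
    paper's stated method.
    """
    if plant < 0 or seed < 0:
        raise ValueError("expression values must be non-negative")
    if plant == 0 and seed == 0:
        return "not_expressed"
    if seed == 0:
        return "plant_only"      # detected in plants, absent from seeds
    if plant == 0:
        return "seed_only"       # candidate seed-specific transcript
    if seed > fold * plant:
        return "up_in_seed"      # more than `fold` times higher in seeds
    if plant > fold * seed:
        return "down_in_seed"    # more than `fold` times lower in seeds
    return "unchanged"


# Toy values (plant, seed), invented for illustration only.
toy_expression = {
    "t1": (120.0, 8.0),
    "t2": (35.0, 0.0),
    "t3": (2.0, 90.0),
    "t4": (0.0, 14.0),
    "t5": (20.0, 25.0),
}

labels = {name: classify_transcript(p, s) for name, (p, s) in toy_expression.items()}
summary = Counter(labels.values())
print(labels)
print(summary)
```

Comparing `seed > fold * plant` rather than dividing avoids a division by zero and keeps the "more than 10-fold" threshold strict. Whether the published counts treat zero-expression transcripts as a separate class or as a subset of the 10-fold groups is not clear from the abstract, so the sketch keeps them separate.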

Key words: sesame (Sesamum indicum L., Sesamum radiatum), development, RNA sequencing, bioinformatics analysis