Scientia Agricultura Sinica, 2019, Vol. 52, Issue 21: 3713-3732. doi: 10.3864/j.issn.0578-1752.2019.21.001

• Crop Genetics and Breeding · Germplasm Resources · Molecular Genetics •

Identification and Characterization of Expansin Genes in Upland Cotton

ZHANG QiYan1, LEI ZhongPing2, SONG Yin1, HAI JiangBo1, HE DaoHua1

  1. College of Agronomy, Northwest A&F University, Yangling 712100, Shaanxi
  2. College of Life Sciences, Northwest A&F University, Yangling 712100, Shaanxi
  • Received: 2019-04-29 Accepted: 2019-06-18 Online: 2019-11-01 Published: 2019-11-12
  • Corresponding author: HE DaoHua
  • About the authors: ZHANG QiYan, E-mail: qiyanzhang318@163.com | LEI ZhongPing, E-mail: zhpinglei@nwafu.edu.cn. ZHANG QiYan and LEI ZhongPing contributed equally to this work.
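  • Funding:
    National Key Research and Development Program of China (2018YFD0100300); Earmarked Fund for the Modern Agro-industry Technology Research System (CARS-15-44); Province-wide Germplasm Resources Conservation and Utilization (20171010000004)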

Identification and Characterization of the Expansin Gene Family in Upland Cotton (Gossypium hirsutum)

ZHANG QiYan1, LEI ZhongPing2, SONG Yin1, HAI JiangBo1, HE DaoHua1

  1. College of Agronomy, Northwest A&F University, Yangling 712100, Shaanxi
  2. College of Life Sciences, Northwest A&F University, Yangling 712100, Shaanxi
  • Received: 2019-04-29 Accepted: 2019-06-18 Online: 2019-11-01 Published: 2019-11-12
  • Contact: DaoHua HE

Abstract (translated from the Chinese version):

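【Objective】 Expansins are important components of the cell wall and play important roles in plant growth and development and in responses to stress. This study systematically identified the Expansin gene family of upland cotton at the whole-genome level and, through bioinformatics and expression-pattern analyses, aimed to lay a foundation for revealing the functions of expansin genes in cotton growth and development and for their subsequent use. 【Method】 BLAST and HMMER were used to search the upland cotton genome and identify members of the expansin gene family; ClustalW, MEGA, MCScanX, ProtParam, MEME, SignalP, Euk-mPLoc, FancyGene, DnaSP and other software were used for bioinformatics analysis of the gene and protein sequences. RNA-seq data were used to analyze the expression patterns of the expansin genes and the expression differences between homoeologous genes, and qRT-PCR was used to validate the expression profiles of selected expansin genes. 【Result】 The upland cotton genome contains 46 EXPA genes, 8 EXPB genes, 6 EXLA genes and 12 EXLB genes, for a total of 72 Expansin genes; the number of expansin members in tetraploid upland cotton is almost twice that in the diploid cotton species (Gossypium arboreum and G. raimondii). Except for chromosomes GhA02 and GhD06, every chromosome carries a varying number of expansin genes (2 to 4), and the homoeologous chromosomes GhA08 and GhD08 carry 5 and 8 expansin genes, respectively. The phylogenetic tree shows that the members of each subfamily cluster together, and that most terminal branches consist of four genes from the four (sub)genomes of the three species, for example Cotton_A_28454/Gh_A03G0885/Gh_D02G1269/Gorai.005G142200 in the EXPA subfamily; the four genes are collinear with one another. Subcellular localization analysis indicates that all upland cotton expansins are extracellular. Gene structure analysis shows that the expansin genes consist of 3 to 5 exons, that the exon-intron structure is highly conserved in evolution and consistent with the diversity of the amino acid sequences, and that codon usage bias is present in the exons. RNA-seq data show that different genes are expressed specifically under different spatial and temporal conditions: for example, GhEXPA19A and GhEXPA19D are expressed at much higher levels than other genes in fiber at 10 DPA and 20 DPA, and GhEXPA24D is expressed at relatively high levels in several tissues (such as cotyledons, new leaves, old leaves and bracts). Homoeologous genes show different expression patterns, indicating functional divergence and complementation between them. The qRT-PCR results agree broadly with the RNA-seq data: for example, GhEXLA3A and GhEXLA3D are highly expressed during the elongation stage of fiber development, and GhEXPA19D and GhEXLA2D are actively expressed in the ovule at 3 DPA. 【Conclusion】 The upland cotton genome contains 72 expansin genes, which show consistent structural diversity and evolutionary conservation at the DNA and amino acid levels and distinct expression patterns at the transcript level, indicating functional divergence and complementation among the members of the family.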

Key words (translated from the Chinese version): upland cotton, expansin, gene family, bioinformatics, gene expression

Abstract:

【Objective】 Expansins are a group of non-enzymatic proteins found in the plant cell wall, with important roles in plant growth and development and in biotic and abiotic stress responses. To date, no systematic study of the molecular characterization, phylogeny and expression profiling of the upland cotton Expansin gene family has been conducted. In this study, a genome-wide identification, characterization and expression analysis of the Expansin gene family in upland cotton was performed. 【Method】 The members of the Expansin gene family in the upland cotton genome were identified with the bioinformatics tools BLAST and HMMER, and were further analysed with a combination of bioinformatics software, including ClustalW, MEGA, MCScanX, ProtParam, MEME, SignalP, Euk-mPLoc, FancyGene and DnaSP. The spatiotemporal expression patterns of the upland cotton Expansin gene family, and the differential expression of some Expansin homoeologs during different stages of growth, were determined from publicly available RNA-seq data. The expression patterns of some candidate Expansin genes were further validated by qRT-PCR. 【Result】 In the allotetraploid upland cotton, 72 expansin-coding genes are identified, approximately twice as many as in the two diploid cotton species (Gossypium arboreum and G. raimondii), and these Expansin-coding genes are grouped into four subfamilies: 46 α-expansins (EXPAs), 8 β-expansins (EXPBs), 6 Expansin-like As (EXLAs), and 12 Expansin-like Bs (EXLBs). With the exception of the two chromosomes GhA02 and GhD06, Expansin-coding genes are unevenly distributed across the chromosomes, with 2 to 4 genes per chromosome, while the homoeologous chromosomes GhA08 and GhD08 harbor 5 and 8 genes, respectively. The phylogenetic tree reveals that the members of the same subfamily cluster together. In most cases, four Expansin members from the four (sub-)genomes of the three cotton species (G. hirsutum, G. arboreum and G. raimondii) tend to cluster together within a given clade; for example, the EXPA subfamily members Cotton_A_28454/Gh_A03G0885/Gh_D02G1269/Gorai.005G142200, which are located on collinear blocks, are clustered into one clade. Computational prediction of subcellular localization indicates that all the Expansin proteins are extracellular. The exon-intron structure analysis reveals that the upland cotton Expansin-coding genes typically consist of 3 to 5 exons interrupted by multiple introns, share an evolutionarily conserved exon-intron structure (consistent with the diversity of the amino acid sequences), and show codon usage bias. RNA-seq data show that different Expansin-coding genes are expressed in a stage- and tissue-specific manner during development. For example, transcripts of GhEXPA19A and GhEXPA19D are far more abundant than those of other Expansin-coding genes in the fiber at 10 days post anthesis (DPA) and 20 DPA. GhEXPA24D is highly expressed in several tissues, including cotyledons, new leaves, old leaves and bracts. Homoeologous genes exhibit different expression profiles, indicating functional divergence and complementation. The qRT-PCR results are consistent with the RNA-seq data, showing the same expression trends for each Expansin-coding gene tested. For instance, GhEXLA3A and GhEXLA3D are highly expressed during the fiber elongation stage, and GhEXPA19D and GhEXLA2D are highly expressed in the ovule at 3 DPA. 【Conclusion】 The upland cotton genome contains 72 Expansin-coding genes, which encode proteins exhibiting the same structural diversity and evolutionary conservation as the coding DNA sequences of the expansins, and which display diverse and dynamic expression patterns, implying functional conservation and divergence among the members of the cotton Expansin gene family.

Key words: G. hirsutum, Expansin, gene family, bioinformatics, gene expression
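
As an informal illustration of the gene nomenclature used in the abstract, the following Python sketch splits names such as GhEXPA19A into subfamily, member number and subgenome, groups homoeologous pairs, and checks that the four reported subfamily counts sum to 72. The naming pattern ("Gh" + subfamily + number + subgenome letter A or D) is an assumption inferred from the six gene names quoted above, and the function names are hypothetical; this is not the authors' pipeline.

```python
import re
from collections import defaultdict

# Assumed naming pattern, inferred only from the names quoted in the abstract:
# "Gh" + subfamily (EXPA, EXPB, EXLA, EXLB) + member number + subgenome (A or D).
NAME_PATTERN = re.compile(r"^Gh(EXPA|EXPB|EXLA|EXLB)(\d+)([AD])$")


def parse_gene_name(name):
    """Split a gene name into (subfamily, member number, subgenome)."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized gene name: {name}")
    subfamily, number, subgenome = match.groups()
    return subfamily, int(number), subgenome


def group_homoeologs(names):
    """Group names sharing a subfamily and member number, keyed by subgenome."""
    groups = defaultdict(dict)
    for name in names:
        subfamily, number, subgenome = parse_gene_name(name)
        groups[(subfamily, number)][subgenome] = name
    return dict(groups)


# Subfamily sizes as reported in the abstract.
SUBFAMILY_COUNTS = {"EXPA": 46, "EXPB": 8, "EXLA": 6, "EXLB": 12}
TOTAL = sum(SUBFAMILY_COUNTS.values())

# Gene names quoted in the abstract.
GENES = ["GhEXPA19A", "GhEXPA19D", "GhEXPA24D",
         "GhEXLA3A", "GhEXLA3D", "GhEXLA2D"]

if __name__ == "__main__":
    print("total expansin genes:", TOTAL)
    for key, members in sorted(group_homoeologs(GENES).items()):
        print(key, members)
```

In this sketch, a group holding both an A and a D entry (for example GhEXPA19A and GhEXPA19D) corresponds to a homoeologous pair named in the abstract, while a group with a single entry only means that the abstract mentions one of the two copies.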