Scientia Agricultura Sinica, 2020, Vol. 53, Issue 24: 5027-5038. doi: 10.3864/j.issn.0578-1752.2020.24.006

• Plant Protection •

Genome-Wide Prediction and Analysis of Secreted Proteins in Lasiodiplodia theobromae

邢启凯1, 李铃仙1, 曹阳2, 张玮1, 彭军波1, 燕继晔1, 李兴红1

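  1. Institute of Plant and Environment Protection, Beijing Academy of Agriculture and Forestry Sciences/Beijing Key Laboratory of Environment Friendly Management on Fruit Diseases and Pests in North China, Beijing 100097
  2. School of Biological Engineering, Dalian University of Technology, Dalian 116024, Liaoning
  • Received: 2020-02-14 Accepted: 2020-03-20 Online: 2020-12-16 Published: 2020-12-28
  • Corresponding author: LI XingHong
  • About the author: XING QiKai, E-mail: qikaixing@163.com
  • Funding:
    National Natural Science Foundation of China (31801686); Beijing Natural Science Foundation (6184041); Science and Technology Innovation Capacity Building Program of the Beijing Academy of Agriculture and Forestry Sciences (KJCX20190406)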

Prediction and Analysis of Candidate Secreted Proteins from the Genome of Lasiodiplodia theobromae

XING QiKai1, LI LingXian1, CAO Yang2, ZHANG Wei1, PENG JunBo1, YAN JiYe1, LI XingHong1

  1. Institute of Plant and Environment Protection, Beijing Academy of Agriculture and Forestry Sciences/Beijing Key Laboratory of Environment Friendly Management on Fruit Diseases and Pests in North China, Beijing 100097
  2. School of Biological Engineering, Dalian University of Technology, Dalian 116024, Liaoning
  • Received: 2020-02-14 Accepted: 2020-03-20 Online: 2020-12-16 Published: 2020-12-28
  • Contact: XingHong LI

Abstract (translated from the Chinese):

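【Objective】Lasiodiplodia theobromae is an important plant-pathogenic fungus with a worldwide distribution. It causes severe grapevine canker (Botryosphaeria dieback), which impairs the quality of fruit trees and causes heavy economic losses. This study predicts and analyzes the secreted proteins encoded across the L. theobromae genome and establishes their basic characteristics, laying a foundation for research on how the pathogen's secreted proteins contribute to disease.【Method】Based on the published whole-genome sequence of L. theobromae, typical secreted proteins were screened with the signal peptide prediction software SignalP v5.0, the transmembrane structure analysis software TMHMM v2.0, the organelle localization software ProtComp v9.0, the GPI-anchor prediction software big-PI Fungal Predictor, and the subcellular localization software TargetP v2.0. The length, amino acid usage frequency, and cleavage sites of the N-terminal signal peptides of the secreted proteins were analyzed statistically. Based on protein sequence homology, the secretome proteins were functionally annotated with BLASTP to predict their biological functions. The signal peptides of selected secreted proteins were tested for activity in an invertase-deficient yeast secretion system. qRT-PCR was used to measure the expression of the selected secreted protein genes during L. theobromae infection of grapevine.【Result】A total of 552 potential secreted proteins with typical signal peptides were identified among the proteins encoded by the L. theobromae genome, accounting for 4.3% of all predicted proteins; their lengths were concentrated in the range of 101 to 400 aa. Statistical analysis showed that signal peptide lengths clustered at 18 to 20 aa, with 20 aa the most common length. Alanine was the most frequently used amino acid in the signal peptides; non-polar, hydrophobic amino acids were used most often, making up 60.2% of all residues. The amino acids at positions -3 to -1 of the signal peptides were relatively conserved, and the cleavage site was of the A-X-A type, which can be recognized and cleaved by type I signal peptidase (Sp I). Of the secreted proteins, 336 had functional annotations, concentrated mainly in enzymes related to cell wall degradation and in pathogenicity-related proteins, and these proteins differed in molecular weight, isoelectric point, and aliphatic index. The invertase-deficient yeast secretion system confirmed that the signal peptides of all 9 selected secreted proteins had secretory activity. qRT-PCR showed that expression of the selected secreted protein genes changed during the early stage of infection.【Conclusion】Bioinformatic analysis predicted a total of 552 classical secreted proteins from the whole genome of L. theobromae. Their signal peptides span a wide range of lengths, and non-polar, hydrophobic amino acids are used most frequently. Functional annotations are concentrated mainly in enzymes that degrade cell wall components, necrosis-inducing proteins associated with pathogenic infection, and chitin-binding proteins.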

Key words: grapevine canker, Lasiodiplodia theobromae, bioinformatics, secreted protein, signal peptide, expression pattern
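
The screening step described in the Method can be pictured as a conjunction of per-tool calls. The following Python sketch is only an illustration under stated assumptions: the `Prediction` fields, the pass/fail criteria (signal peptide present, no transmembrane helix, extracellular localization, no GPI anchor, secretory-pathway call), and the protein IDs are hypothetical, since the abstract does not give the thresholds or output formats the authors used. The share calculation uses the two counts reported in the abstract (552 secreted proteins out of 12 902 predicted proteins).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Per-protein summary of the five predictors (hypothetical field names)."""
    protein_id: str
    has_signal_peptide: bool  # assumed summary of the SignalP call
    tm_helices: int           # assumed TMHMM helix count
    extracellular: bool       # assumed summary of the ProtComp call
    gpi_anchored: bool        # assumed summary of the big-PI Fungal Predictor call
    secretory_pathway: bool   # assumed summary of the TargetP call


def is_classical_secreted(p: Prediction) -> bool:
    """Assumed criteria: every tool must agree the protein is secreted."""
    return (
        p.has_signal_peptide
        and p.tm_helices == 0
        and p.extracellular
        and not p.gpi_anchored
        and p.secretory_pathway
    )


def screen(predictions):
    """Return the IDs of proteins that pass all five filters."""
    return [p.protein_id for p in predictions if is_classical_secreted(p)]


def secretome_share(n_secreted: int, n_total: int) -> float:
    """Percentage of the proteome predicted to be secreted, to one decimal."""
    return round(100.0 * n_secreted / n_total, 1)


if __name__ == "__main__":
    # Invented toy records, not data from the study.
    toy = [
        Prediction("prot_A", True, 0, True, False, True),
        Prediction("prot_B", True, 2, True, False, True),   # membrane protein
        Prediction("prot_C", True, 0, True, True, True),    # GPI-anchored
    ]
    print(screen(toy))
    print(secretome_share(552, 12902))
```

Requiring agreement among all tools trades sensitivity for specificity, which suits a candidate list intended for later experimental validation.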

Abstract:

【Objective】Lasiodiplodia theobromae is an important phytopathogenic fungus with a worldwide distribution. It causes severe Botryosphaeria dieback on a wide range of woody plants, reducing crop quality and causing substantial economic losses. This study aimed to predict and analyze the candidate secreted proteins encoded in the L. theobromae genome and to clarify their basic characteristics, laying a foundation for studying the pathogenic mechanism of secreted proteins in this pathogen.【Method】The 12 902 protein sequences of L. theobromae published in a previous study were analyzed with the signal peptide prediction algorithm SignalP v5.0, the transmembrane helix prediction algorithm TMHMM v2.0, the subcellular localization prediction algorithm ProtComp v9.0, the GPI-anchoring site prediction algorithm big-PI Fungal Predictor, and the subcellular localization prediction algorithm TargetP v2.0. The length of the N-terminal signal peptide, the amino acid usage frequency, and the cleavage site of the predicted secreted proteins were analyzed statistically. Based on protein sequence homology, the predicted secreted proteins were functionally annotated with the BLASTP program. The secretory activity of the signal peptides of selected secreted proteins was tested with a yeast invertase secretion assay. Expression patterns of the selected secreted protein genes during L. theobromae infection were analyzed by qRT-PCR.【Result】A total of 552 candidate secreted proteins were predicted, accounting for 4.3% of the proteins encoded in the L. theobromae genome. Most of the secreted proteins were 101 to 400 aa long. Signal peptide lengths were concentrated at 18 to 20 aa, with 20 aa the most common length. Alanine was the most frequent amino acid in the signal peptides, and non-polar, hydrophobic amino acids were used most often, accounting for 60.2% of all residues. The amino acids at positions -3 to -1 of the signal peptides were relatively conserved, and the cleavage site belonged to the A-X-A type, which can be recognized and cleaved by type I signal peptidase (Sp I). Of the secreted proteins, 336 had a predicted function, mostly as cell wall-degrading enzymes or virulence-associated proteins. The candidate secreted proteins also differed in molecular weight, isoelectric point, and aliphatic index. Finally, the predicted signal peptides of 9 putative L. theobromae secreted proteins were confirmed to have secretory activity in a yeast invertase secretion assay. qRT-PCR analysis showed that expression of the selected secreted protein genes changed during the early stage of host infection.【Conclusion】A total of 552 candidate secreted proteins of L. theobromae were predicted with a set of bioinformatic algorithms. The signal peptides vary widely in length, and non-polar, hydrophobic amino acids are used most frequently in them. The secreted proteins characterized in this study fall mainly into enzymes that degrade cell wall components, necrosis-inducing proteins, and chitin-binding proteins, which may play an important role in the pathogenic mechanism of L. theobromae.

Key words: Botryosphaeria dieback, Lasiodiplodia theobromae, bioinformatics, secreted protein, signal peptide, expression pattern
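
The signal peptide statistics reported in the abstract (length distribution, residue usage, share of non-polar hydrophobic residues, and the A-X-A pattern at positions -3 to -1) can be computed along the following lines. This Python sketch is an illustration, not the authors' procedure: the set of residues counted as non-polar and hydrophobic is an assumed textbook grouping, the A-X-A check is the strictest reading of the rule (alanine at -3 and -1), and the two example sequences are invented.

```python
from collections import Counter

# Assumption: a common textbook grouping of non-polar, hydrophobic residues.
# The abstract does not say which grouping the authors used.
NONPOLAR_HYDROPHOBIC = set("AVLIPFWM")


def length_distribution(signal_peptides):
    """Count how many signal peptides have each length."""
    return Counter(len(sp) for sp in signal_peptides)


def residue_usage(signal_peptides):
    """Count each amino acid across all signal peptides."""
    return Counter("".join(signal_peptides))


def hydrophobic_fraction(signal_peptides):
    """Fraction of residues that belong to the assumed hydrophobic set."""
    usage = residue_usage(signal_peptides)
    total = sum(usage.values())
    hydrophobic = sum(n for aa, n in usage.items() if aa in NONPOLAR_HYDROPHOBIC)
    return hydrophobic / total if total else 0.0


def matches_axa(signal_peptide):
    """True if positions -3 and -1 before the cleavage site are both alanine."""
    return (
        len(signal_peptide) >= 3
        and signal_peptide[-3] == "A"
        and signal_peptide[-1] == "A"
    )


if __name__ == "__main__":
    # Invented example signal peptides, not sequences from the study.
    peptides = ["MKLLSLAAVALA", "MRFSTLLTGSQG"]
    print(length_distribution(peptides))
    print(residue_usage(peptides).most_common(3))
    print(hydrophobic_fraction(peptides))
    print([matches_axa(sp) for sp in peptides])
```

Each input string is assumed to be the signal peptide only, ending at the residue immediately before the predicted cleavage site, so the last three characters correspond to positions -3 to -1.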