Journal of Integrative Agriculture
toGC: a pipeline to correct gene model for functional excavation of dark GPCRs in Phytophthora sojae
Min Qiu1, 2, 3*, Chun Yan1*, Huaibo Li1, Haiyang Zhao1, Siqun Tu1, Yaru Sun1, Saijiang Yong1, Ming Wang1, 2, 3#, Yuanchao Wang1, 2, 3#

1 Sanya Institute of Nanjing Agricultural University, Department of Plant Pathology, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China

2 The Key Laboratory of Plant Immunity, Nanjing Agricultural University, Nanjing, Jiangsu, China

3 Key Laboratory of Soybean Disease and Pest Control (Ministry of Agriculture and Rural Affairs), Nanjing Agricultural University, Nanjing, Jiangsu, China

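Abstract (translated from the Chinese)  The accuracy of genome annotation is crucial for subsequent studies of gene function; however, conventional high-throughput gene annotation methods inevitably produce some mispredicted gene models. These errors cause gene sequences to be wrongly extended or truncated, which poses challenges for downstream functional analysis. The traditional approach of correcting sequences by cloning is time-consuming and labor-intensive, and a convenient alternative has been lacking. To fill this gap, we developed the toGC pipeline, which integrates genome annotation with transcriptome datasets to correct mispredicted gene models. We first retrieved 20 published Phytophthora sojae genes with cloned sequences and found that about 40% of them had gene model errors. Using toGC, all of the discrepancies between the annotated and cloned sequences of these genes could be corrected, giving an accuracy of nearly 100%. We then applied toGC to the GPCR-bigram gene family of P. sojae, which is predicted to have 42 members in the genome but lacks experimental validation. With toGC, we identified 32 GPCR-bigram genes whose annotations were inconsistent with the toGC-corrected sequences. Notably, for 5 of these genes (GPCR-TKL9, GPCR-TKL15, GPCR-PDE3, GPCR-AC3, and GPCR-AC4), the annotated and toGC-corrected sequences differed very substantially. We obtained the actual sequences of these 5 genes by cloning, and sequencing showed that all of them matched the corrected sequences, further confirming the reliability of toGC. More importantly, we discovered two novel GPCR-bigram genes (GPCR-AC3 and GPCR-AC4) that had previously been mispredicted as a single gene. CRISPR/Cas9-mediated knockout experiments confirmed that GPCR-AC4 is involved in oospore production, whereas knockout of GPCR-AC3 had no effect on oospores, further confirming their status as two independent genes. In addition, we confirmed the reliability of toGC in Phytophthora capsici and Pythium ultimum. Our results highlight the utility of toGC for gene model correction, facilitating studies of biological function and offering potential applications in analyses of other species.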

Abstract  The accuracy of genomic annotation is crucial for subsequent functional investigations; however, computational protocols used in high-throughput annotation of open reading frames (ORFs) can introduce inconsistencies. These inconsistencies, which lead to non-uniform extension or truncation of sequence ends, pose challenges for downstream analyses. Existing strategies to rectify them are time-consuming and labor-intensive, and dedicated approaches are lacking. To address this gap, we developed toGC, a tool that integrates genomic annotation with RNA-seq datasets to rectify annotation inconsistencies. Using toGC, we achieved nearly 100% accuracy in correcting inconsistencies in published Phytophthora sojae ORFs. We applied this pipeline to the GPCR-bigram gene family, which was predicted to have 42 members in the P. sojae genome but lacked experimental validation. By employing toGC, we identified 32 GPCR-bigram ORFs with inconsistencies between previous annotations and toGC-corrected sequences. Notably, 5 of these genes (GPCR-TKL9, GPCR-TKL15, GPCR-PDE3, GPCR-AC3, and GPCR-AC4) showed substantial inconsistencies. Gene cloning confirmed the effectiveness of toGC, as the sequences obtained through cloning matched those annotated by toGC. Importantly, we discovered two novel GPCRs (GPCR-AC3 and GPCR-AC4), which were previously mispredicted as a single gene. CRISPR/Cas9-mediated knockout experiments revealed the involvement of GPCR-AC4, but not GPCR-AC3, in oospore production, further confirming their status as two separate genes. Beyond P. sojae, the toGC pipeline also proved reliable in Phytophthora capsici and Pythium ultimum, which underlines its robustness. Our findings highlight the utility of toGC for reliable gene model correction, facilitating investigations into biological functions and offering potential applications in diverse species analyses.
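The kind of inconsistency the abstract describes (an annotated ORF that is extended or truncated relative to what the transcript supports) can be illustrated with a minimal Python sketch. This is not the toGC implementation, which this page does not describe in detail. It assumes, purely for illustration, that a transcript assembled from RNA-seq reads is already available as a DNA string, and the helper names `longest_orf` and `classify` are hypothetical.

```python
# Illustrative sketch only; NOT the toGC pipeline.
# Assumptions: the transcript is an RNA-seq-derived DNA string on the forward
# strand, and the annotated ORF is the coding sequence taken from the annotation.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(transcript: str) -> str:
    """Return the longest ATG-to-stop ORF found in the three forward frames."""
    seq = transcript.upper()
    best = ""
    for frame in range(3):
        start = None  # index of the first ATG seen since the last stop codon
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None
    return best


def classify(annotated: str, supported: str) -> str:
    """Compare an annotated ORF with the transcript-supported ORF."""
    annotated, supported = annotated.upper(), supported.upper()
    if annotated == supported:
        return "consistent"
    if supported and supported in annotated:
        return "annotation over-extended"
    if annotated and annotated in supported:
        return "annotation truncated"
    return "different gene model"


if __name__ == "__main__":
    transcript = "CCATGAAATTTGGGTAACC"  # toy sequence, not real data
    supported = longest_orf(transcript)
    print(classify("ATGCCCATGAAATTTGGGTAA", supported))
```

A real pipeline would also have to handle introns, both strands, and read-level evidence; the sketch only shows the comparison step on toy sequences.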
Keywords:  gene model correction; transcriptome; open reading frames; G-protein coupled receptors
Online: 25 April 2024  
Fund: This work was supported by grants to Min Qiu and Ming Wang from the National Natural Science Foundation of China (32100160 and 32100044), a grant to Ming Wang from the Jiangsu “Innovative and Entrepreneurial Talent” program (JSSCRC2021510), and a grant to Yuanchao Wang from the Chinese Modern Agricultural Industry Technology System (CARS-004-PS14).
About author:  Min Qiu, E-mail: minqiu@njau.edu.cn; Chun Yan, E-mail: 2022202060@stu.njau.edu.cn; #Correspondence: Yuanchao Wang, E-mail: wangyc@njau.edu.cn; Ming Wang, E-mail: mwang@njau.edu.cn. *These authors contributed equally to this article.

Cite this article: 

Min Qiu, Chun Yan, Huaibo Li, Haiyang Zhao, Siqun Tu, Yaru Sun, Saijiang Yong, Ming Wang, Yuanchao Wang. 2024. toGC: a pipeline to correct gene model for functional excavation of dark GPCRs in Phytophthora sojae. Journal of Integrative Agriculture, Doi:10.1016/j.jia.2024.03.077
