Journal of Integrative Agriculture  2026, Vol. 25 Issue (1): 150-156    DOI: 10.1016/j.jia.2024.03.077
Plant Protection
toGC: A pipeline to correct gene model for functional excavation of dark GPCRs in Phytophthora sojae

Min Qiu1, 2, 3*, Chun Yan1*, Huaibo Li1, Haiyang Zhao1, Siqun Tu1, Yaru Sun1, Saijiang Yong1, Ming Wang1, 2, 3#, Yuanchao Wang1, 2, 3#

1 Sanya Institute of Nanjing Agricultural University, Department of Plant Pathology, Nanjing Agricultural University, Nanjing 210095, China

2 The Key Laboratory of Plant Immunity, Nanjing Agricultural University, Nanjing 210095, China

3 Key Laboratory of Soybean Disease and Pest Control (Ministry of Agriculture and Rural Affairs), Nanjing Agricultural University, Nanjing 210095, China

 Highlights 
Developed toGC, a new tool that corrects errors in gene annotations using RNA-seq data.
Experimentally validated toGC's accuracy, discovering two novel GPCR genes misannotated as one.
Demonstrated toGC's broad utility by applying it successfully to multiple oomycete species.
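Abstract (translated from Chinese)
The accuracy of genome annotation is critical for subsequent studies of gene function. However, conventional high-throughput gene annotation methods inevitably produce some erroneous gene model predictions. These gene model errors cause gene sequences to be incorrectly extended or truncated, which poses challenges for downstream functional analysis. The traditional approach of correcting sequences by cloning is time-consuming and labor-intensive, so a convenient method is lacking. To fill this gap, we developed the toGC pipeline, which integrates genome annotation with transcriptome datasets to correct gene model prediction errors. We first retrieved 20 published Phytophthora sojae genes with cloned sequences and found that approximately 40% of them had gene model errors. Applying the toGC pipeline, we then found that every inconsistency between the annotated and cloned sequences of these genes could be corrected, giving an accuracy of nearly 100%. We subsequently applied toGC to the GPCR-bigram gene family of P. sojae, which is predicted to have 42 members in the P. sojae genome but lacks experimental validation. Using toGC, we identified 32 GPCR-bigram genes whose annotated sequences were inconsistent with the toGC-corrected sequences. Notably, for 5 of these genes (GPCR-TKL9, GPCR-TKL15, GPCR-PDE3, GPCR-AC3, and GPCR-AC4), the annotated sequences differed very substantially from the toGC-corrected sequences. We then obtained the actual sequences of these 5 genes by gene cloning; sequencing showed that all of them matched the corrected sequences, further confirming the reliability of the toGC pipeline. More importantly, we also discovered two new GPCR-bigram genes (GPCR-AC3 and GPCR-AC4) that had previously been mispredicted as a single gene. CRISPR/Cas9-mediated gene knockout experiments confirmed that GPCR-AC4 is involved in oospore production, whereas knockout of GPCR-AC3 had no effect on oospores, further confirming their status as two independent genes. In addition, we further verified the reliability of the toGC pipeline in Phytophthora capsici and Pythium ultimum. Our results highlight the utility of the toGC pipeline for gene model correction, facilitate research into biological function, and offer potential applications in analyses of different species.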


Abstract  

The accuracy of genomic annotation is crucial for subsequent functional investigations; however, computational protocols used in high-throughput annotation of open reading frames (ORFs) can introduce inconsistencies. These inconsistencies, which lead to non-uniform extension or truncation of sequence ends, pose challenges for downstream analyses. Existing strategies to rectify these inconsistencies are time-consuming and labor-intensive, and no dedicated approach is available. To address this gap, we developed toGC, a tool that integrates genomic annotation with RNA-seq datasets to rectify annotation inconsistencies. Using toGC, we achieved an accuracy of nearly 100% in correcting inconsistencies in published Phytophthora sojae ORFs. We applied this pipeline to the GPCR-bigram gene family, which was predicted to have 42 members in the P. sojae genome but lacked experimental validation. By employing toGC, we identified 32 GPCR-bigram ORFs with inconsistencies between previous annotations and toGC-corrected sequences. Notably, 5 of these genes (GPCR-TKL9, GPCR-TKL15, GPCR-PDE3, GPCR-AC3, and GPCR-AC4) showed substantial inconsistencies. Experimental validation confirmed the effectiveness of toGC, as sequences obtained through cloning matched those corrected by toGC. Importantly, we discovered two novel GPCRs (GPCR-AC3 and GPCR-AC4), which were previously mispredicted as a single gene. CRISPR/Cas9-mediated knockout experiments revealed the involvement of GPCR-AC4 but not GPCR-AC3 in oospore production, further confirming their status as two separate genes. Beyond P. sojae, toGC also performed reliably in Phytophthora capsici and Pythium ultimum, which further supports the robustness of the pipeline. Our findings highlight the utility of toGC for reliable gene model correction, facilitating investigations into biological functions and offering potential applications in diverse species analyses.
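The kind of inconsistency described in the abstract (an annotated ORF that is extended or truncated relative to what the transcript supports) can be illustrated with a minimal Python sketch. This is not the toGC implementation, which this page does not describe. The function names `longest_orf` and `compare_models`, the forward-strand-only search, and the "longest ATG-to-stop ORF" rule are all assumptions made purely for illustration.

```python
# Illustrative sketch only: NOT the toGC algorithm. All names and the
# "longest ORF on the forward strand" heuristic are assumptions.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(transcript: str) -> str:
    """Return the longest ATG-to-stop ORF on the forward strand (stop included)."""
    seq = transcript.upper()
    best = ""
    for frame in range(3):
        start = None  # index of the first ATG seen since the last stop in this frame
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None
    return best


def compare_models(annotated_cds: str, transcript: str) -> str:
    """Classify an annotated CDS against the ORF supported by a transcript."""
    annotated = annotated_cds.upper()
    supported = longest_orf(transcript)
    if not supported:
        return "no_orf_in_transcript"
    if annotated == supported:
        return "consistent"
    if annotated in supported:
        return "annotation_truncated"  # annotation is shorter than the supported ORF
    if supported in annotated:
        return "annotation_extended"   # annotation is longer than the supported ORF
    return "discordant"


if __name__ == "__main__":
    transcript = "CCATGAAATTTGGGTAACC"  # toy transcript with one ORF
    print(compare_models("ATGAAATTTGGGTAA", transcript))
    print(compare_models("ATGCCCATGAAATTTGGGTAA", transcript))
```

A real pipeline would also have to handle introns, both strands, and read-level evidence, none of which this toy comparison attempts.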

Keywords: gene model correction; transcriptome; open reading frames; G-protein coupled receptors
Received: 01 January 2024   Accepted: 21 February 2024 Online: 27 March 2024  
Fund: This work was supported by the grants to Min Qiu and Ming Wang from the National Natural Science Foundation of China (32100160 and 32100044), the grants to Ming Wang from the Jiangsu “Innovative and Entrepreneurial Talent” Program, China (JSSCRC2021510), and the grants to Yuanchao Wang from the Chinese Modern Agricultural Industry Technology System (CARS-004-PS14).  
About author:  Min Qiu, E-mail: minqiu@njau.edu.cn; Chun Yan, E-mail: 2022202060@stu.njau.edu.cn; #Correspondence Yuanchao Wang, E-mail: wangyc@njau.edu.cn; Ming Wang, E-mail: mwang@njau.edu.cn * These authors contributed equally to this study.

Cite this article: 

Min Qiu, Chun Yan, Huaibo Li, Haiyang Zhao, Siqun Tu, Yaru Sun, Saijiang Yong, Ming Wang, Yuanchao Wang. 2026. toGC: A pipeline to correct gene model for functional excavation of dark GPCRs in Phytophthora sojae. Journal of Integrative Agriculture, 25(1): 150-156.
