Journal of Integrative Agriculture  2026, Vol. 25 Issue (5): 1927-1938    DOI: 10.1016/j.jia.2025.03.024
Core germplasm construction of tea plant populations based on genome-wide SNP and catechins in Shaanxi Province, China

Xinyu Wang1*, Xiufeng Li2*, Dan Chen1*, Jingwen Gao1, Shuangqian Hao1, He Zhang1, Ziyan Zhao1, Mengwei Shen1, Huirui Chen1, Fuqiang Qi1, Keyi Zhang1, Haozhe Zhou1, Yanjun Xi2, Jie Zhou1, Youben Yu1#, Qingshan Xu1#

1 College of Horticulture, Northwest A&F University, Yangling 712100, China

2 Hanzhong Agricultural Technology Extension and Training Center, Hanzhong 723000, China

 Highlights 
A core germplasm representing the full genetic and metabolic diversity of tea plants was constructed by integrating genome-wide SNP and targeted metabolite data.
The effectiveness of this core collection for genetic discovery was validated, as it successfully replicated all marker-trait associations identified in the original population.
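Abstract (translated from the Chinese version)

Genetic diversity plays a key role in genetic research and breeding, and core germplasm is an important resource for capturing that diversity. At present, core germplasm of tea plants is constructed mainly from phenotypic data or molecular markers; however, effectively constructing core germplasm for plant breeding programs requires several factors to be considered together. In this study, 320 tea germplasm accessions were collected and their single-nucleotide polymorphism (SNP) and metabolite data were analyzed. Based on 2,118,060 high-quality SNP markers, the analysis showed that tea germplasm resources in Shaanxi Province are rich in genetic diversity, as indicated by relatively high observed heterozygosity (Ho=0.340), expected heterozygosity (He=0.327), minor allele frequency (MAF=0.229), and polymorphic information content (PIC=0.268). In addition, a relatively high genetic diversity index (H′=1.902) indicated significant metabolic variation. Phylogenetic analysis divided the 320 accessions into six clusters, reflecting the influence of geographical factors on the genetic diversity of tea plants. On the basis of the genetic and metabolic data, we developed a core collection of 106 accessions intended to effectively represent the genetic, metabolic, population, and regional diversity of the original germplasm. Genome-wide association analysis of the core collection successfully validated the marker-trait associations found in the original collection. This study provides a theoretical basis for the conservation and management of tea plant germplasm resources.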



Abstract  

Genetic diversity is crucial to genetic research and crop breeding, and core collections are important resources for capturing this diversity. To date, core germplasm collections of tea plants have been constructed mainly from phenotypic data or molecular markers; however, constructing an effective core collection for plant breeding programs requires several aspects to be considered together. In this study, we collected 320 tea germplasm accessions and analyzed their single-nucleotide polymorphism (SNP) and metabolite data. Abundant genetic diversity was inferred from the mean values of observed heterozygosity (Ho=0.340), expected heterozygosity (He=0.327), minor allele frequency (MAF=0.229), and polymorphic information content (PIC=0.268), calculated from 2,118,060 high-quality SNP markers. A mean genetic diversity index (H′) of 1.902 suggested significant metabolic variation. Phylogenetic analysis grouped the 320 accessions into six clusters, reflecting the influence of geographical origin on genetic diversity. Based on the genetic and metabolic data, a preliminary core collection of 106 accessions was developed that effectively represents most of the original panel's molecular, metabolic, population, and regional diversity. Genome-wide association studies of the core panel successfully replicated the marker-trait associations found in the original panel. This study contributes to the conservation and management of tea plant germplasm.
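
The diversity statistics quoted in the abstract can be illustrated with a small sketch. It is not the authors' pipeline: it assumes a biallelic SNP coded as alternate-allele counts (0, 1, 2) per accession, uses the simple uncorrected estimator He = 2pq and the standard biallelic form of PIC, and computes a Shannon-type index H′ from class counts. The function names `snp_diversity` and `shannon_index`, the genotype coding, and the idea of binning a metabolite trait into classes are illustrative assumptions, since the abstract does not state how these values were computed.

```python
import math


def snp_diversity(genotypes):
    """Diversity statistics for one biallelic SNP (illustrative sketch).

    genotypes: alternate-allele counts per accession (0, 1 or 2);
    None marks a missing call. This coding is an assumption.
    """
    calls = [g for g in genotypes if g is not None]
    n = len(calls)
    if n == 0:
        raise ValueError("no called genotypes at this SNP")

    p = sum(calls) / (2 * n)          # alternate-allele frequency
    q = 1.0 - p                       # reference-allele frequency
    ho = sum(1 for g in calls if g == 1) / n   # observed heterozygosity
    he = 2 * p * q                    # expected heterozygosity, 1 - p^2 - q^2
    maf = min(p, q)                   # minor allele frequency
    pic = he - 2 * (p ** 2) * (q ** 2)  # biallelic PIC: 1 - p^2 - q^2 - 2 p^2 q^2
    return {"Ho": ho, "He": he, "MAF": maf, "PIC": pic}


def shannon_index(class_counts):
    """Shannon-type diversity index H' = -sum(p_i * ln p_i).

    class_counts: number of accessions falling into each class of a
    trait (for example, a binned metabolite content; binning is assumed).
    """
    total = sum(class_counts)
    if total == 0:
        raise ValueError("class counts sum to zero")
    return -sum((c / total) * math.log(c / total) for c in class_counts if c > 0)


if __name__ == "__main__":
    # Toy data only; these are not values from the study.
    print(snp_diversity([0, 1, 1, 2, None]))
    print(shannon_index([5, 5]))
```

Averaging the per-SNP values over all markers would give panel-level means comparable in kind to those reported, although estimator details (for example, small-sample corrections to He) may differ from what the study used.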

Keywords: tea plant; core collection; genetic diversity; SNPs; catechin index
Received: 30 September 2024; Accepted: 23 March 2025; Online: 27 March 2025
Fund: 

This work was supported by the National Natural Science Foundation of China (32472795), the Natural Science Basic Research Program of Shaanxi, China (2024JC-YBMS-145; 2021JQ-162), the China Agriculture Research System of MOF and MARA (CARS-19), the Agricultural Special Fund Project of Shaanxi Province, China (2024NYGG009), and Chinese Universities Scientific Fund (2452023481). 

About the authors: # Correspondence: Qingshan Xu, E-mail: xuqingshan@nwsuaf.edu.cn; Youben Yu, E-mail: yyben@163.com. * These authors contributed equally to this study.

Cite this article: 

Xinyu Wang, Xiufeng Li, Dan Chen, Jingwen Gao, Shuangqian Hao, He Zhang, Ziyan Zhao, Mengwei Shen, Huirui Chen, Fuqiang Qi, Keyi Zhang, Haozhe Zhou, Yanjun Xi, Jie Zhou, Youben Yu, Qingshan Xu. 2026. Core germplasm construction of tea plant populations based on genome-wide SNP and catechins in Shaanxi Province, China. Journal of Integrative Agriculture, 25(5): 1927-1938.

