Scientia Agricultura Sinica ›› 2019, Vol. 52 ›› Issue (4): 579-590. doi: 10.3864/j.issn.0578-1752.2019.04.001

• Crop Genetics and Breeding · Germplasm Resources · Molecular Genetics •

Detection of Selection Signals Based on Genome-Wide FST and XP-EHH in Temperate and Tropical Maize Populations

YANG YuXin, ZOU Cheng

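  1. Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081
  • Received: 2018-10-30 Accepted: 2018-12-09 Online: 2019-02-16 Published: 2019-02-27
  • Corresponding author: ZOU Cheng
  • About the author: YANG YuXin, yyx0719@126.com
  • Supported by:
    National Key Research and Development Program of China (2016YFD0100303); General Program of the National Natural Science Foundation of China (31371638)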

Genome-Wide Detection of Selection Signal in Temperate and Tropical Maize Populations with Use of FST and XP-EHH

YANG YuXin, ZOU Cheng

  1. Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081
  • Received: 2018-10-30 Accepted: 2018-12-09 Online: 2019-02-16 Published: 2019-02-27
  • Contact: Cheng ZOU

Abstract (Chinese version):

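【Objective】 Maize originated in the tropics and, after natural and artificial selection, is now widely grown in temperate regions. Flowering is the central stage of maize growth and development and the main adaptive trait in the spread of tropical maize to temperate environments. This study identifies the genomic regions selected during maize domestication and mines candidate flowering genes, providing data to support maize population improvement and the dissection of the genetic mechanism of flowering. 【Method】 First, haplotype data from 30 temperate and 21 tropical maize inbred lines were analyzed separately. Variant sites with high missing rates or low allele frequencies were filtered out to obtain high-quality SNP (single nucleotide polymorphism) markers, and SnpEff was used to predict the functional effects of the polymorphic sites in the temperate and tropical populations. Second, high-quality SNP markers present in both temperate and tropical maize were retained, and principal component analysis (PCA) of the genotype data was performed to determine population structure. The fixation index (FST) and cross population extended haplotype homozygosity (XP-EHH) were then used to analyze the distribution of selection signals between the temperate and tropical populations, with the top 1% of FST and XP-EHH values as the thresholds for identifying selected sites. Genes under selection between the two populations were obtained by functional annotation of these SNPs. The agriGO tool was used for functional enrichment analysis of the candidate domestication genes. Relevant bioinformatics databases were used to annotate the candidate genes and to further identify candidate flowering genes selected during maize domestication. 【Result】 Analysis of the high-depth sequencing SNPs showed 14 123 408 SNPs in the tropical population and 8 791 673 SNPs in the temperate population; the identified SNPs were located mainly in intergenic regions. A total of 204 752 SNP markers were present in both populations. PCA showed that temperate and tropical maize separate clearly into two groups. The top 1% threshold of the FST selection signal was 0.3593, identifying 557 candidate domestication genes; the top 1% threshold of the XP-EHH selection signal was 3.2681, identifying 1 913 candidate genes. Several candidate genes are closely related to flowering regulation in maize, including ZmCCT9, COL1, and GRMZM2G387528. ZmCCT9 represses expression of the flowering gene ZCN8, causing a late-flowering phenotype under long-day conditions, and is an important flowering regulator; COL1 interacts with the flowering-promoting FT protein and accelerates flowering for adaptation to long-day environments; functional annotation of GRMZM2G387528 indicates that it is a phytochrome-interacting factor that interacts with the photoperiod gene ZmphyB1. 【Conclusion】 The tropical maize population has higher genetic polymorphism. A series of candidate genes involved in the differentiation of tropical and temperate maize were identified, with a focus on genes related to the regulation of flowering.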
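The SNP quality filter described in the Method (dropping sites with a high missing rate or a low allele frequency) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the genotype encoding (alternate-allele counts 0/1/2, `None` for a missing call), the function names, and the cutoffs `max_missing=0.2` and `min_maf=0.05` are all assumptions, since the abstract does not state the thresholds.

```python
from typing import Optional, Sequence

# Assumed encoding: alternate-allele count (0, 1, 2), or None for a missing call.
Genotype = Optional[int]


def missing_rate(calls: Sequence[Genotype]) -> float:
    """Fraction of samples with no genotype call at this SNP."""
    return sum(g is None for g in calls) / len(calls)


def minor_allele_frequency(calls: Sequence[Genotype]) -> float:
    """Minor allele frequency among the called (diploid) genotypes."""
    called = [g for g in calls if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def keep_snp(
    calls: Sequence[Genotype],
    max_missing: float = 0.2,  # hypothetical cutoff, not from the paper
    min_maf: float = 0.05,     # hypothetical cutoff, not from the paper
) -> bool:
    """Keep a SNP only if it passes both the missingness and the MAF filter."""
    return (
        missing_rate(calls) <= max_missing
        and minor_allele_frequency(calls) >= min_maf
    )


if __name__ == "__main__":
    # Toy SNPs across five samples; the identifiers are made up.
    snps = {
        "snp_a": [0, 0, 2, 2, None],        # common variant, one missing call
        "snp_b": [0, 0, 0, 0, 0],           # monomorphic
        "snp_c": [None, None, None, 0, 2],  # mostly missing
    }
    kept = [name for name, calls in snps.items() if keep_snp(calls)]
    print(kept)
```

In practice this kind of filter is usually applied to a VCF file with an established tool rather than hand-written code; the sketch only makes the two criteria explicit.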

Key words (Chinese version): maize, selection signal, fixation index, cross population extended haplotype homozygosity, flowering genes

Abstract:

【Objective】 Maize was first domesticated in tropical areas but, after natural and artificial selection, is now cultivated widely in temperate regions. Flowering time is a key component of the growth period and a major adaptive trait in the spread of maize from tropical to temperate conditions. Identifying the genomic regions selected during adaptation to temperate zones, and the flowering-time genes among them, could provide a molecular basis for improving maize and for dissecting its flowering mechanism. 【Method】 We analyzed the haplotype data of 30 temperate and 21 tropical maize inbred lines. High-quality SNP (single nucleotide polymorphism) markers were obtained by filtering out variants with high missing rates or low allele frequencies, and were annotated with SnpEff. Principal component analysis (PCA) of the genotype data was performed to assess the population structure of the samples. Using the high-quality SNP markers present in both the tropical and temperate populations, we calculated selection signals with the fixation index (FST) and cross population extended haplotype homozygosity (XP-EHH) methods. The top 1% of values for each statistic was used as the threshold for identifying candidate selected sites. Candidate genes under selection were identified from the annotation of these SNPs, and their functions were further characterized by GO enrichment analysis with agriGO. Bioinformatics databases with relevant maize data were then used to identify flowering-time genes among the candidates. 【Result】 From the high-depth resequencing data, we found 14 123 408 SNPs in the tropical population and 8 791 673 in the temperate population, located mainly in intergenic regions. A total of 204 752 high-quality SNPs were present in both populations. PCA indicated that temperate and tropical maize separate into two groups. The top 1% thresholds for FST and XP-EHH were 0.3059 and 3.2681, and 557 and 1 913 candidate genes were identified by the FST and XP-EHH methods, respectively. Several candidate genes are closely related to the regulation of flowering time, including ZmCCT9, COL1, and GRMZM2G387528. ZmCCT9 is an important flowering-time regulator that negatively regulates the floral activator gene ZCN8, causing a late-flowering phenotype under long-day conditions. COL1 interacts with the FT protein to promote flowering in adaptation to the long-day environment. Functional annotation of GRMZM2G387528 indicates that it is a phytochrome-interacting factor that interacts with the photoperiod gene ZmphyB1. 【Conclusion】 Tropical maize had higher genetic diversity than temperate maize. We predicted a series of genes under selection during adaptation from tropical to temperate conditions and further examined those involved in flowering.
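The FST scan and the top 1% cutoff described in the Method can be illustrated with a short sketch. The abstract does not name the FST estimator used, so the code below assumes the basic heterozygosity-based definition, FST = (H_T - H_S) / H_T, weighted by sample size; the function names and toy values are hypothetical. XP-EHH requires phased haplotypes and a genetic map and is not reproduced here, but the same ranking cutoff would apply to its scores.

```python
from typing import List, Sequence


def fst_two_populations(p1: float, n1: int, p2: float, n2: int) -> float:
    """Heterozygosity-based FST = (H_T - H_S) / H_T at one biallelic SNP.

    p1, p2 are alternate-allele frequencies in the two populations;
    n1, n2 are the sample sizes, used as weights (an assumption).
    """
    n = n1 + n2
    p_bar = (n1 * p1 + n2 * p2) / n
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # site is monomorphic across both populations
    h_within = (n1 * 2.0 * p1 * (1.0 - p1) + n2 * 2.0 * p2 * (1.0 - p2)) / n
    return (h_total - h_within) / h_total


def top_percent_threshold(values: Sequence[float], top_percent: int = 1) -> float:
    """Smallest value that still falls within the top `top_percent` percent."""
    if not values:
        raise ValueError("values must not be empty")
    ranked = sorted(values, reverse=True)
    k = max(1, len(ranked) * top_percent // 100)  # number of sites to keep
    return ranked[k - 1]


def select_outliers(values: Sequence[float], top_percent: int = 1) -> List[int]:
    """Indices of sites whose statistic reaches the top-percent cutoff."""
    cutoff = top_percent_threshold(values, top_percent)
    return [i for i, v in enumerate(values) if v >= cutoff]


if __name__ == "__main__":
    # Toy allele frequencies for 30 temperate and 21 tropical lines.
    temperate = [0.90, 0.50, 0.10, 0.95]
    tropical = [0.20, 0.50, 0.15, 0.05]
    scores = [fst_two_populations(a, 30, b, 21) for a, b in zip(temperate, tropical)]
    print(scores, select_outliers(scores, top_percent=25))
```

An empirical top 1% cutoff ranks sites relative to the genome-wide distribution rather than testing each site against a null model, so it always returns about 1% of sites regardless of how strong selection actually was.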

Key words: maize, selection signal, fixation index, cross population extended haplotype homozygosity, flowering time genes