Scientia Agricultura Sinica (中国农业科学), 2024, Vol. 57, Issue 14: 2889-2900. doi: 10.3864/j.issn.0578-1752.2024.14.015

• Animal Science · Veterinary Science •

二代基因组测序鉴别狮头鹅拷贝数变异及其与体重体尺关联

张力允1, 黄智荣1, 杨柳2, 陈俊鹏3, 林祯平3, 黄红艳1, 伍仲平1, 张续勐1, 田允波1, 黄运茂1, 李秀金1

  1 仲恺农业工程学院动物科技学院/广东省水禽健康养殖科技创新平台,广州 510225
  2 中国农业科学院深圳农业基因组研究所,广东深圳 518120
  3 广东省汕头市白沙禽畜原种研究所,广东汕头 515821
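  • Received: 2023-08-29 Accepted: 2024-06-07 Published: 2024-07-16 Online: 2024-07-24
  • Corresponding authors:
    HUANG YunMao
    LI XiuJin
  • Contact: ZHANG LiYun, E-mail: zhangliyun@zhku.edu.cn
  • Funding:
    Key-Area Research and Development Program of Guangdong Province (2020B020222003); Guangdong Province Rural Revitalization Strategy Special Project, Seed Industry Revitalization Action (2023XDY00001)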

Identification of Copy Number Variation and Its Association with Body Weight and Size of Lion-Head Geese by Next-Generation Sequencing

ZHANG LiYun1, HUANG ZhiRong1, YANG Liu2, CHEN JunPeng3, LIN ZhenPing3, HUANG HongYan1, WU ZhongPing1, ZHANG XuMeng1, TIAN YunBo1, HUANG YunMao1, LI XiuJin1

  1 College of Animal Science & Technology, Zhongkai University of Agriculture and Engineering/Science & Technology Innovation Platform of Waterfowl Health Breeding in Guangdong, Guangzhou 510225
  2 Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, Guangdong
  3 Baisha Poultry and Livestock Origin Research Institute, Shantou 515821, Guangdong
  • Received: 2023-08-29 Accepted: 2024-06-07 Published: 2024-07-16 Online: 2024-07-24

Abstract:

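【Background】 Copy number variation (CNV), widely reported as a deletion or duplication of 50 bp to 5 Mb in length, can alter gene expression and thereby affect animal growth and development. CNV is closely associated with economically important traits in livestock and poultry and is an important class of molecular genetic marker. The Lion-Head goose, one of the largest goose breeds in the world, originated in Raoping, Guangdong, and is the raw material for Guangdong marinated goose. However, no genome-wide association study of CNV with body weight and body size in Lion-Head geese has been reported to date.

【Objective】 To identify CNVs and copy number variation regions (CNVRs) and their genomic distribution in Lion-Head geese from next-generation genome sequencing data and, through association analysis between CNVs and body weight and body size traits, to find CNVs and candidate genes that significantly affect these traits, providing a reference for subsequent molecular breeding research in Lion-Head geese.

【Method】 A total of 111 Lion-Head geese (20 males and 91 females) were collected from the Baisha Poultry and Livestock Origin Research Institute in Shantou. All geese were raised and managed under uniform standards. Body weight and body size were measured for all 111 geese; the body size traits comprised nine indices, including body oblique length, chest depth, and chest width. All 111 geese were also sequenced by next-generation genome sequencing (5×). Sequencing data were quality-controlled with SOAPnuke, reads were aligned with the BWA module of Speedseq, and structural variations (SVs) were detected with the LUMPY and CNVnator modules of Speedseq; CNVs were then selected from the SVs. CNVs were genotyped with SVtools, and the association between the genotyped CNVs and the body weight and body size traits was analyzed with a single-marker mixed model. A chromosome-wide significance level (0.05 divided by the number of CNVs on the chromosome) was used as the threshold for declaring a CNV significantly associated with a trait. Significant CNVs and their 50 kb upstream and downstream flanking regions were annotated to identify candidate genes for body weight and body size. The R package CNVrd2 was used for linkage disequilibrium (LD) analysis between chromosome-wide significant CNVs and chromosome-wide significant SNPs located less than 1 Mb apart.

【Result】 Across the 111 geese, 99 158 CNVs were detected, comprising 94 560 deletions and 4 598 duplications. The mean CNV length was 11 858 bp, and most CNVs (74.06%) were 50 to 1 000 bp long. A total of 5 225 CNVRs were identified, comprising 5 029 deletion-type, 110 duplication-type, and 86 mixed-type regions. The mean CNVR length was 7 136 bp, and most CNVRs (81.03%) were 50 to 1 000 bp long. Functional annotation showed that 46.92% of CNVRs lay in intergenic regions, 10.30% upstream of genes, and 9.35% downstream of genes. A total of 6 217 CNVs were accurately genotyped. Association analysis of these CNVs with the 10 body weight and body size traits detected 55 chromosome-wide significant CNVs, which were annotated to 45 candidate genes. Of these 45 genes, 10, including SETD2, UBR7, and G2E3, affected two or more traits. Chromosome-wide significant CNVs affected body weight and body size independently of chromosome-wide significant SNPs (r² < 0.02).

【Conclusion】 Using next-generation genome sequencing, this study reports for the first time the genomic distribution of CNVs and CNVRs in Lion-Head geese and the association of CNVs with body weight and body size. Of the 45 candidate genes found to affect body weight and body size, 11 have been reported to be related to growth signalling pathways in livestock and poultry: SETD2, UBR7, ASB1, and HDAC4 are involved in muscle proliferation, differentiation, and metabolism; G2E3, P3C2B, NOVA1, and PDE1B are involved in adipogenesis and obesity; ILKAP is related to the regulation of growth factors; KIF1B is involved in bone metabolism; and ZFP37 is involved in glycogen metabolism. These findings lay a foundation for dissecting the molecular genetic mechanisms of growth performance and for identifying molecular markers in Lion-Head geese.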
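The chromosome-wide threshold described in the Method section (0.05 divided by the number of CNVs on a chromosome) can be illustrated with a minimal Python sketch. The function names, the tuple layout, the toy P-values, and the strict `<` comparison are assumptions made for illustration; the abstract does not give the authors' implementation.

```python
from collections import Counter


def chromosome_thresholds(cnv_chroms, alpha=0.05):
    """Map each chromosome to alpha / (number of CNVs tested on it)."""
    counts = Counter(cnv_chroms)
    return {chrom: alpha / n for chrom, n in counts.items()}


def significant_cnvs(results, alpha=0.05):
    """Return IDs of CNVs whose P-value is below their chromosome's threshold.

    `results` is a list of (cnv_id, chromosome, p_value) tuples
    (a hypothetical layout chosen for this sketch).
    """
    thresholds = chromosome_thresholds([chrom for _, chrom, _ in results], alpha)
    return [cnv_id for cnv_id, chrom, p in results if p < thresholds[chrom]]


# Toy association results, not data from the study.
results = [
    ("cnv1", "chr1", 0.001),
    ("cnv2", "chr1", 0.02),
    ("cnv3", "chr1", 0.30),
    ("cnv4", "chr1", 0.012),
    ("cnv5", "chr2", 0.02),
    ("cnv6", "chr2", 0.04),
]

# chr1 carries 4 CNVs (threshold 0.0125); chr2 carries 2 (threshold 0.025).
print(significant_cnvs(results))  # ['cnv1', 'cnv4', 'cnv5']
```

Because the divisor is the CNV count on each chromosome rather than the genome-wide count, the same P-value (0.02 in the toy data) can pass on a chromosome with few CNVs and fail on one with many.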

Keywords: Lion-Head goose, body weight and body size, copy number variation, candidate gene

Abstract:

【Background】 Many studies have reported that copy number variation (CNV), a deletion or duplication of 50 bp to 5 Mb in length, can affect gene expression. CNV is closely associated with economically important traits of livestock and is a promising class of molecular marker. The lion-head goose is one of the largest goose breeds in the world; it originated in Raoping, Guangdong Province, and is the raw material for Guangdong marinated goose. So far, no genome-wide association study of CNV with body weight and size in lion-head geese has been reported.

【Objective】 This study aimed to identify CNVs and CNV regions (CNVRs) in lion-head geese from next-generation genome sequencing data, and to detect CNVs and candidate genes significantly affecting body weight and size through association analysis, so as to provide a reference for molecular breeding of lion-head geese.

【Method】 A total of 111 lion-head geese, 20 males and 91 females, were collected from the Baisha Poultry and Livestock Origin Research Institute in Shantou. All geese were raised and managed under uniform standards. Body weight and body size traits were measured for all 111 geese; the body size traits comprised nine indices, including body oblique length, chest depth, and chest width. Next-generation genome sequencing data (5×) were generated from blood samples of these geese. SOAPnuke was used for quality control of the sequencing data. The BWA module of Speedseq was used for alignment, and the LUMPY and CNVnator modules of Speedseq were used to detect structural variations (SVs), from which CNVs were selected. SVtools was used to genotype the CNVs, and the association between CNVs and body weight and size traits was analyzed with a single-marker mixed model. CNVs significantly associated with a trait were identified at the chromosome-wide significance level (0.05 divided by the number of CNVs on the chromosome). The significant CNVs, together with their 50 kb upstream and downstream regions, were then annotated to identify candidate genes for body weight and size. The R package CNVrd2 was used to analyze linkage disequilibrium (LD) between chromosome-wide significant CNVs and chromosome-wide significant SNPs less than 1 Mb apart.

【Result】 In the 111 lion-head geese, 99 158 CNVs were detected, including 94 560 deletions and 4 598 duplications. The average CNV length was 11 858 bp, and most CNVs (74.06%) were between 50 bp and 1 kb long. A total of 5 225 CNVRs were detected, comprising 5 029 loss, 110 gain, and 86 mixed types. The average CNVR length was 7 136 bp, and most CNVRs (81.03%) were between 50 bp and 1 kb long. Functional annotation showed that 46.92% of CNVRs were located in intergenic regions, 10.30% upstream of genes, and 9.35% downstream of genes. A total of 6 217 CNVs were accurately genotyped and used for association analysis. Across the 10 body weight and size traits, 55 CNVs exceeded the chromosome-wide significance level, and 45 candidate genes were annotated from these 55 CNVs. Of the 45 candidate genes, 10, including SETD2, UBR7, and G2E3, influenced two or more traits. Chromosome-wide significant CNVs affected body weight and size traits independently of chromosome-wide significant SNPs (r² < 0.02).

【Conclusion】 Using next-generation genome sequencing data, this study reports for the first time the genomic distribution of CNVs and CNVRs in lion-head geese and the association between CNVs and body weight and size. A total of 45 candidate genes influencing body weight and size traits were found, of which 11 have been reported to be related to signalling pathways of animal growth: SETD2, UBR7, ASB1, and HDAC4 are involved in muscle proliferation, differentiation, and metabolism; G2E3, P3C2B, NOVA1, and PDE1B are involved in adipogenesis and obesity; ILKAP is involved in regulating growth factors; KIF1B is involved in bone metabolism; and ZFP37 is involved in glycogen metabolism. These results lay a foundation for analyzing the molecular genetic mechanisms of growth performance and for identifying molecular markers in lion-head geese.
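As a rough illustration of the r² statistic quoted in the Result section, the sketch below computes the squared Pearson correlation between CNV genotypes and SNP genotypes, both coded as per-individual counts. This is a simplification and not the procedure implemented in CNVrd2; the function name, the genotype coding, and the toy genotypes are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length genotype vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # r^2 is undefined when either marker shows no variation.
        raise ValueError("a monomorphic marker has no defined correlation")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


# Toy genotypes for six individuals (copy-number class and SNP allele count).
cnv_genotypes = [0, 1, 2, 1, 0, 2]
snp_in_full_ld = [0, 1, 2, 1, 0, 2]
print(r_squared(cnv_genotypes, snp_in_full_ld))  # 1.0

# Two markers that vary independently of each other in the sample.
print(r_squared([0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0
```

An r² near zero, as reported for the significant CNV and SNP pairs, means that the SNP genotype carries almost no information about the CNV genotype, so the two signals cannot be explained by one tagging the other.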

Key words: lion-head goose, body weight and size traits, CNV, candidate gene
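The CNV definition used in the abstract (deletions or duplications of 50 bp to 5 Mb) suggests a simple filter for picking CNVs out of a set of SV calls. The sketch below is one possible reading; the record layout, the `DEL`/`DUP` type labels, the inclusive length bounds, and the toy calls are assumptions, and the authors' actual filtering criteria are not given in the abstract.

```python
from typing import NamedTuple


class SV(NamedTuple):
    """A structural variant call (hypothetical minimal record)."""
    chrom: str
    start: int
    end: int
    svtype: str


MIN_LEN = 50          # 50 bp lower bound from the abstract's CNV definition
MAX_LEN = 5_000_000   # 5 Mb upper bound from the abstract's CNV definition


def is_cnv(sv):
    """Keep deletions and duplications whose length lies within the bounds."""
    length = sv.end - sv.start
    return sv.svtype in {"DEL", "DUP"} and MIN_LEN <= length <= MAX_LEN


# Toy SV calls, not data from the study.
calls = [
    SV("chr1", 1_000, 1_040, "DEL"),   # 40 bp: too short
    SV("chr1", 5_000, 5_800, "DEL"),   # 800 bp deletion: kept
    SV("chr2", 100, 20_100, "DUP"),    # 20 kb duplication: kept
    SV("chr2", 300, 900, "INV"),       # inversion: not a CNV
    SV("chr3", 0, 6_000_000, "DUP"),   # 6 Mb: too long
]

cnvs = [sv for sv in calls if is_cnv(sv)]
print(len(cnvs))  # 2
```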