Scientia Agricultura Sinica ›› 2016, Vol. 49 ›› Issue (2): 348-360. doi: 10.3864/j.issn.0578-1752.2016.02.015

• Animal Husbandry · Veterinary Science · Resource Insects •

Identification of Candidate Genes for Hematological Traits by Integrating Gene Expression Profiling and Genome-Wide Association Study in a Porcine Model

XU Pan, ZHANG Zhen, ZHANG Feng, YANG Bin, DUAN Yan-yu

  1. State Key Laboratory for Pig Genetics Improvement and Production Technology, Jiangxi Agricultural University, Nanchang 330045
  • Received: 2015-02-09 Online: 2016-01-16 Published: 2016-01-16
  • Corresponding author: DUAN Yan-yu, E-mail: yanyuduan@hotmail.com
  • About the author: XU Pan, E-mail: panxu_nj@hotmail.com
  • Supported by:
    National foundation grant (Youth Program) (31301950); Natural Science Foundation of Jiangxi Province (2010GQN0045)

Abstract: 【Objective】To identify candidate genes for hematological traits in a White Duroc × Erhualian F2 resource population by integrating digital gene expression profiling with a genome-wide association study (GWAS).

【Method】Pigs from the White Duroc × Erhualian F2 resource population were slaughtered at 240 ± 3 days. Blood was collected in anticoagulant tubes, and a set of hematological parameters was measured with a whole blood analyzer. A total of 1 020 F2 pigs were genotyped with the Illumina porcine 60K SNP chip. Individuals with a genotype-missing rate > 10% and a Mendelian error rate > 5% were removed. SNPs with a call rate < 95%, a minor allele frequency < 5%, or a Hardy-Weinberg equilibrium (HWE) P value < 5 × 10⁻⁶, and putatively autosomal SNPs that were linked to the sex chromosomes, were excluded. Liver samples from 502 F2 pigs were subjected to digital gene expression profiling on an Illumina GA II sequencer. The raw tags were filtered to obtain clean tags, which were then aligned to a reference tag database. Clean tags that mapped uniquely to the reference gene sequences were defined as unambiguous clean tags, and their counts were normalized to give the expression level of each transcript. Expression levels were further transformed to log2 values. Transcripts expressed in fewer than 20% of individuals were discarded. Phenotypic and gene expression traits were adjusted for sex, batch and kinship using the polygenic function of the R package GenABEL. Correlations between gene expression levels and phenotypic traits were evaluated on the residuals using Spearman's correlation coefficient in R, with a conservative threshold of P < 0.0005 to account for multiple testing. The positions of the detected expression quantitative trait loci (eQTL) were plotted against the positions of the genes they affected. We also searched for eQTL within the 5.0 Mb region of the peak SNPs from our previous GWAS and performed an integrated analysis of the eQTL and GWAS results. Gene Ontology and KEGG pathway enrichment analyses were performed with the DAVID online tool, and the gene co-expression network was constructed with the GeneMANIA online tool.

【Result】A total of 20 108 liver transcripts from 502 F2 pigs passed quality control. At the conservative threshold of P < 0.0005, 259 transcripts were associated with hemoglobin (HGB), red blood cell count (RBC), hematocrit (HCT), mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH) or white blood cell count (WBC). Thirty-four transcripts were associated with two or more phenotypic traits. eQTL mapping of the trait-associated transcripts at P < 10⁻⁵ identified 304 eQTL, including 35 cis-eQTL and 120 trans-eQTL; each transcript was mapped to one to six eQTL. The cis-eQTL for MCH and MCV overlapped on Sus scrofa chromosome 8 (SSC8). SSC7 harbored the largest number of eQTL, most of which were trans-eQTL. KIT was identified as a candidate gene by eQTL mapping. Gene Ontology and KEGG pathway enrichment analyses prioritized five candidate genes: KIT, PSEN2 and TFRC for red blood cell traits, and THBS1 and CYR61 for white blood cell traits. RPS10 was additionally identified as a candidate gene for white blood cell traits by integrating the previous GWAS data, eQTL mapping and the gene co-expression network.

【Conclusion】By combining eQTL mapping in the White Duroc × Erhualian F2 resource population with previous GWAS data, we identified KIT, PSEN2 and TFRC as candidate genes for red blood cell traits, and THBS1, CYR61 and RPS10 as candidate genes for white blood cell traits.
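The SNP-level filters listed in the Method section (call rate, minor allele frequency, HWE P value) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The abstract does not say which HWE test was used, so a 1-df chi-square test is assumed here, and each threshold is treated as sufficient on its own to exclude a SNP. The function names and the genotype coding (0, 1 or 2 copies of one allele, `None` for a missing call) are hypothetical, and the sex-chromosome linkage filter is not covered.

```python
import math
from typing import Optional, Sequence


def hwe_chi2_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    ASSUMPTION: the abstract does not name the HWE test; chi-square is
    used here only because it needs nothing beyond the standard library.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic SNP: no departure from HWE can be tested
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))


def snp_passes_qc(
    genotypes: Sequence[Optional[int]],
    min_call_rate: float = 0.95,  # from the abstract: call rate < 95% excluded
    min_maf: float = 0.05,        # from the abstract: MAF < 5% excluded
    min_hwe_p: float = 5e-6,      # from the abstract: HWE P < 5e-6 excluded
) -> bool:
    """Return True if one SNP survives all three filters.

    genotypes holds, per animal, the number of copies of allele B
    (0, 1 or 2), or None for a missing call (hypothetical coding).
    """
    called = [g for g in genotypes if g is not None]
    if not genotypes or len(called) / len(genotypes) < min_call_rate:
        return False
    n_aa, n_ab, n_bb = called.count(0), called.count(1), called.count(2)
    freq_b = (n_ab + 2 * n_bb) / (2 * len(called))
    if min(freq_b, 1.0 - freq_b) < min_maf:
        return False
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= min_hwe_p
```

In practice such filters are usually applied with dedicated genetics software; the sketch only makes the three thresholds concrete.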

Key words: pig, hematological traits, gene expression profiling, expression quantitative trait loci, genome-wide association study, candidate gene
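The two positional steps described in the abstract, labelling each eQTL as cis or trans and collecting the eQTL that fall near a GWAS peak SNP, can be sketched as follows. This is an illustration under stated assumptions, not the authors' method. The abstract gives no distance cut-off for calling an eQTL cis, so the 1 Mb value below is purely hypothetical. The 5.0 Mb figure comes from the abstract, but reading it as 5.0 Mb on either side of the peak is an assumption. The `Locus` type, the function names and all coordinates in the usage example are made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Locus:
    """A genomic position: chromosome label plus base-pair coordinate."""
    chrom: str
    pos_bp: int


# ASSUMPTION: the abstract does not define the cis window; 1 Mb is a placeholder.
CIS_WINDOW_BP = 1_000_000
# 5.0 Mb is taken from the abstract; treating it as +/- 5 Mb is an assumption.
GWAS_WINDOW_BP = 5_000_000


def classify_eqtl(eqtl: Locus, gene: Locus, cis_window_bp: int = CIS_WINDOW_BP) -> str:
    """Label an eQTL 'cis' if it lies close to its target gene, else 'trans'."""
    same_chrom = eqtl.chrom == gene.chrom
    if same_chrom and abs(eqtl.pos_bp - gene.pos_bp) <= cis_window_bp:
        return "cis"
    return "trans"


def eqtl_near_gwas_peak(
    eqtls: Iterable[Locus], peak: Locus, window_bp: int = GWAS_WINDOW_BP
) -> List[Locus]:
    """Return the eQTL on the peak's chromosome within window_bp of the peak SNP."""
    return [
        e for e in eqtls
        if e.chrom == peak.chrom and abs(e.pos_bp - peak.pos_bp) <= window_bp
    ]


# Usage with made-up coordinates (not real gene or SNP positions).
gene = Locus("8", 43_000_000)
label_near = classify_eqtl(Locus("8", 43_400_000), gene)
label_far = classify_eqtl(Locus("7", 43_400_000), gene)
peak = Locus("7", 35_000_000)
hits = eqtl_near_gwas_peak(
    [Locus("7", 31_000_000), Locus("7", 41_000_000), Locus("8", 35_000_000)], peak
)
```

Keeping both window sizes as parameters makes it easy to check how sensitive the cis/trans counts and the GWAS overlap are to the chosen cut-offs.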