JIA-2018-09
SONG Hui et al. Journal of Integrative Agriculture 2018, 17(9): 2074–2081

3.3. Identification of optimal codon

We identified 30 optimal codons that encode 18 amino acids in M. truncatula using the ΔRSCU method (Table 1). Except for AGG (Arg) and TTG (Leu), these optimal codons end with A or T, which is consistent with the RSCU result. Combining the RSCU and optimal codon analyses, we identified 26 optimal codons that are also high-frequency codons and 4 optimal codons that are not, whereas 28 codons are neither high-frequency nor optimal.

To define the factors that determine optimal codons, we selected Fop as the evaluation index for a correlation analysis. Fop was not correlated with either CDS length or genomic DNA (exon and intron) length (Table 2). Significant positive correlations were observed between Fop and the GC content of the different CDS and genomic DNA sequences (Table 2). These results indicate that the GC content across the three codon positions had similar effects on optimal codon usage. Moreover, optimal codon usage is associated with higher GC content in CDSs and intronic sequences (Table 2). There was a significant positive correlation between Fop and CAI (r=0.76, P<0.01). High CAI values indicate high levels of gene expression. This finding is consistent with previous studies, which showed that optimal codons were used in highly expressed genes under the impact of natural selection (Ingvarsson 2007; Qiu et al. 2011a). In general, low ENC values indicate strong CUB. In this study, Fop and ENC exhibited a significant positive correlation (r=0.21, P<0.01), indicating that highly optimal codons have low CUB.

3.4. Correlation between gene length, GC content and gene expression

Various factors have been examined for their association with gene expression, including GC content, intron size, and protein sequence length (Rao et al. 2011; Williford and Demuth 2012; De La Torre et al. 2015).
Based on EST data analysis, gene expression intensity in M. truncatula was not correlated with CDS or genomic DNA length, or with the GC content of either the CDS or the genomic DNA sequences (Table 2). These results indicate that neither CDS length nor GC content influences gene expression in M. truncatula.

4. Discussion

Studies of codon usage in plants have lagged well behind those in prokaryotic models, largely because comparatively few plant genomes have been completely sequenced. Hordeum vulgare, Nicotiana tabacum, Pisum sativum, T. aestivum, and Z. mays were among the first plants investigated for codon usage, using their EST or partial genome sequences (Fennoy and Bailey-Serres 1993; Kawabe and Miyashita 2003). Following the completion of A. thaliana genome sequencing in 2000, an increasing number of plant genome sequences have become available. So far, codon usage has been analysed for several sequenced model plants, including A. thaliana, Brachypodium distachyon, and O. sativa (Morton and Wright 2007; Qiu et al. 2011b; Liu et al. 2015). However, codon usage patterns in M. truncatula remain unexamined.

In this study, we analysed codon usage patterns in M. truncatula using 39531 CDSs. Our results suggest that: (1) natural selection acts on the codon usage pattern in M. truncatula; (2) for 18 out of 20 amino acids, the optimal codons characteristically end with A or T; (3) optimal codons are more widely present in genes with higher GC content; and (4) gene expression intensity is not correlated with either gene length or GC content.

In Populus and Arabidopsis, tRNA abundance is positively correlated with optimal codon usage (Wright et al. 2004; Ingvarsson 2007). However, we found that nine optimal codons (TTT, TAT, CAT, AAT, AAA, GAT, AGT, TGT and GGT) with high RSCU are associated with a low abundance of the corresponding tRNA in M. truncatula (Table 1).
Williford and Demuth (2012) explained this phenomenon through two hypotheses: (1) Codon-anticodon recognition heavily depends on post-transcriptional modifications of tRNA sequences. It has been confirmed that nucleotide A is always modified into I (inosine), and nucleotide U at the first anticodon position experiences extensive changes that could expand or restrict the number of recognized codons (Agris et al. 2007); (2) codons that correspond to highly

Fig. 2 Effective number of codons (ENC) plot (ENC against GC3). The ENC values shown in this plot were generated using CodonW. The figure was generated using Origin 9.0. The continuous curve indicates the relationship between ENC and GC3s values under neutral selection. Each dot indicates a gene. GC3, GC content at the third position of synonymous codons.
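
The continuous curve in an ENC plot such as Fig. 2 is conventionally drawn from Wright's (1990) expectation, ENC = 2 + s + 29/(s² + (1 − s)²), where s is the GC3s value expressed as a fraction. The paper does not state the formula used, so the following Python sketch assumes this standard form. The function names `expected_enc` and `enc_deviation` are hypothetical and are not part of CodonW or Origin.

```python
def expected_enc(gc3s: float) -> float:
    """Expected ENC when codon usage is shaped only by GC3s composition.

    Assumes Wright's (1990) formula; gc3s is a fraction in [0, 1],
    not a percentage.
    """
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


def enc_deviation(observed_enc: float, gc3s: float) -> float:
    """Relative distance of a gene below the expected curve.

    Hypothetical helper: positive values mean the observed ENC is lower
    than expected from GC3s alone, i.e. the gene lies below the curve.
    """
    expected = expected_enc(gc3s)
    return (expected - observed_enc) / expected


if __name__ == "__main__":
    # Points along the expected curve, from AT-rich to GC-rich third positions.
    for s in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"GC3s={s:.2f}  expected ENC={expected_enc(s):.2f}")
```

Genes that fall well below this curve have a lower ENC than base composition alone would predict, which is usually read as a sign that factors other than mutation bias, such as selection, contribute to their codon usage.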