Influence of the combination of last sense codon and stop codon on expression efficiency of the green fluorescent protein gene in Escherichia coli with the expression vector pKK223-3

This paper examines the influence of the combination of the last sense codon and the stop codon on expression efficiency. For the last sense codon CCG, the relative fluorescence intensity (RFI) of GFP(CCG) was 2.1-fold with the stop codon UAA, but only 1.1-fold when the stop codon was changed from UAA to UAG. For the last sense codon TAG, the RFI of GFP(TAG) with the stop codon UAG was stronger than that with the stop codons UAA and UGA.


INTRODUCTION
Green fluorescent protein (GFP) was discovered as a companion protein to aequorin, the well-known chemiluminescent protein from the jellyfish Aequorea [1]. GFP is composed of 238 amino acids and is therefore a rather small protein, with a molecular weight of roughly 27 kDa [2]. Excitation at 396 nm results in an emission maximum at 508 nm [3]. The discovery of GFP started a revolution in molecular biology, and many different mutants have been engineered over the last few years [4][5]. Owing to its favourable features, GFP rapidly became a popular tool in various applications in biological research. During the last decade, it has been introduced into a wide range of organisms, including bacteria and yeasts [6][7].
Meanwhile, Escherichia coli (E. coli) is a convenient host for protein expression and one of the organisms of choice for producing recombinant proteins in high quantities at low cost; it has become the most popular expression platform. An E. coli-derived gene has become one of the most widely used. Its popularity is due to its lack of interference with plant metabolism, the ease and sensitivity of the assay, and its stability both in vivo and in vitro [8][9][10][11][12][13][14][15][16][17].
A gene contains a specific sequence of nucleotides that encodes a specific sequence of amino acids. If the gene is mutated, the amino acid sequence may change, which in turn may alter the structure and function of the protein [18]. Understanding the relationship between gene sequence and protein expression helps to clarify the principles of gene expression and to control protein production effectively. GFP and some of its mutants have many useful properties, such as real-time detection, no disruption of or toxicity to the host cells, no requirement for cofactors, and the feasibility of fusion with target proteins, and they have become some of the most widely used fluorescent probes in cell biology and molecular biology [19][20][21].
Fluorescent proteins have proven to be excellent tools for live cell imaging. A mutation that replaces the serine at position 65 with threonine dramatically increased the intensity of green fluorescence [22]. The influence of codons on protein expression has been studied extensively. Codon usage bias is considered to influence the elongation rate because each codon is decoded at a different rate [23][24].
The C-terminal region contains all the information required for efficient translocation and can therefore be used as a signal sequence for recombinant protein targeting [25]. In addition, the sequence context motif surrounding the ATG initiation codon (ATG-context) is an important factor that increases protein production [26][27][28][29][30][31][32]. Introduction of the rare codons GCC, CGG, and ACC for alanine, arginine, and threonine reduced GFP production 2.1-, 3.3-, and 1.7-fold compared with the favored codons GCU, CGU, and ACA, respectively [33].
In our previous research, my collaborators and I inserted each of the 64 possible codons as the last sense codon immediately upstream (5') of the stop codon of the GFP gene and studied the influence of the last sense codon on expression in E. coli [34]. In the present work, to study the influence of the combination of last sense codon and stop codon on the expression efficiency of the GFP gene in E. coli with the expression vector pKK223-3, I changed the stop codon from UAA to UAG and UGA and selected 18 last sense codons for each stop codon.
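The size of the design space described above can be sketched in a few lines of Python. This is only an illustration: the helper `pair_with_stops` and the four-codon `example_subset` are hypothetical, since the full list of 18 selected codons is not given in this passage. The sketch enumerates the 64 possible triplets, picks out those that share their spelling with a stop codon (such as TAG), and pairs a set of last sense codons with the three stop codons.

```python
from itertools import product

BASES = "ACGT"
STOP_CODONS = ("TAA", "TAG", "TGA")  # DNA spelling of UAA, UAG, UGA

# All 64 triplets that could occupy the last sense codon position.
all_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# Triplets that share their spelling with a stop codon (e.g. TAG).
stop_like = [codon for codon in all_codons if codon in STOP_CODONS]


def pair_with_stops(last_sense_codons, stop_codons=STOP_CODONS):
    """Pair every chosen last sense codon with every stop codon."""
    return [(codon, stop) for codon in last_sense_codons for stop in stop_codons]


# Hypothetical subset: only some of the 18 selected codons are named in the text.
example_subset = ["CCG", "TAG", "GGA", "CTA"]
pairs = pair_with_stops(example_subset)
print(len(all_codons), stop_like, len(pairs))
```

With 18 last sense codons and three stop codons, the same pairing gives 54 constructs.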

Construction of Plasmids and Cloning Vectors
To study the effect of the combination of last sense codon and stop codon on the expression efficiency of the GFP gene, 18 GFP genes with distinctive features were selected, such as those with higher, lower, or medium fluorescence intensities, or those whose last sense codon had the same sequence as a stop codon. NNN denotes the inserted last sense codon (18 kinds). The stop codon was changed from UAA to UGA and UAG (Figure 1). pKK223-3 was used as the expression vector in this study.
To examine the relationship between the last sense codon and protein expression efficiency, polymerase chain reaction (PCR) was used to remove these last sense codons from the modified GFP genes. PCR was performed at 94 °C for 10 min, followed by 25 cycles of 94 °C for 45 s, 48 °C for 1 min, and 68 °C for 45 s. The principle of the PCR is shown in Figure 2. After PCR, the plasmids were constructed by introducing the modified GFP genes into the expression vector. E. coli JM109 was transformed with the ligation mixture, and the resulting clones were analyzed by sequencing.
PCR primers were purchased from Sigma-Aldrich (Tokyo, Japan). Restriction enzymes (EcoRI, HindIII), a PCR amplification kit, and T4 DNA ligase were purchased from TaKaRa (Otsu, Japan). We inserted the last sense codons by PCR, obtained DNA fragments by restriction enzyme digestion, and recombined them by ligation and transformation. The flow diagram of the experiment is shown in Figure 3.
With pKK223-3 as the expression vector, 18 last sense codons were chosen and inserted into the GFP gene by PCR (Figure 2). The same 5' primer was used in all reactions:
GFP EcoRI Primer (5' Primer): 5'-CCCGAATTCTTTAACTTTAGGAAACACAATTCATGAGTAAAGGAGAAGAACTT-3'
The 3' primers are shown in Table 1. The NNN part was the stop codon (UAA, UGA, UAG). The shaded parts indicate the restriction enzyme sites corresponding to the primer description.
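As a quick sanity check on the 5' primer, the following Python sketch locates the EcoRI recognition site (GAATTC) and the start codon, then translates the coding part that follows. It assumes that the space in the printed primer sequence is a line-break artifact. The function name `describe_primer` and the reduced codon table are illustrative choices, not part of the original protocol.

```python
# 5' primer from the text; the space in the printed sequence is assumed
# to be a line-break artifact and has been removed.
FIVE_PRIME_PRIMER = "CCCGAATTCTTTAACTTTAGGAAACACAATTCATGAGTAAAGGAGAAGAACTT"
ECORI_SITE = "GAATTC"

# Only the codons needed for this primer, taken from the standard genetic code.
CODON_TABLE = {"ATG": "M", "AGT": "S", "AAA": "K",
               "GGA": "G", "GAA": "E", "CTT": "L"}


def describe_primer(primer, site=ECORI_SITE):
    """Return (site position, ATG position, translated peptide) for a primer."""
    site_start = primer.find(site)
    # Look for the first ATG downstream of the restriction site.
    atg_start = primer.find("ATG", site_start + len(site))
    coding = primer[atg_start:]
    usable = len(coding) - len(coding) % 3  # drop any incomplete codon
    codons = [coding[i:i + 3] for i in range(0, usable, 3)]
    peptide = "".join(CODON_TABLE.get(codon, "?") for codon in codons)
    return site_start, atg_start, peptide


site_pos, atg_pos, peptide = describe_primer(FIVE_PRIME_PRIMER)
print(site_pos, atg_pos, peptide)
```

Positions are zero-based. Under the stated assumption, the primer carries the EcoRI site near its 5' end and the first codons of the GFP reading frame at its 3' end.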

Measurement of GFP Relative Fluorescence Intensity in E. coli
GFP fusion fluorescence intensity is an excellent indicator of over-expression potential [35]. Because fluorescence is one of the most convenient ways to follow protein expression and purification [36], fluorescence intensity was used to analyze the expression efficiency of the proteins. The cells were cultivated in LB medium supplemented with 0.1 mg/mL ampicillin and 40 µM isopropyl-β-D-thiogalactopyranoside (IPTG) at 37 °C for 18 h [37]. The absorbance of the E. coli culture was measured at 600 nm, and the fluorescence intensity of the same culture at 508 nm (excitation at 396 nm) was measured three times with a Gemini fluorescence microplate reader (Nihon Molecular Devices). The expression efficiency of the GFP gene was compared using the fluorescence intensity divided by the absorbance at 600 nm. The RFI was calculated by formula (1).
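A minimal Python sketch of this calculation follows. Formula (1) itself is not reproduced in this text, so the sketch assumes that the RFI is the fluorescence-per-absorbance value of a variant divided by that of wild-type GFP, which is consistent with the fold values reported relative to wild type. The function names and the readings are made up for illustration and are not data from this study.

```python
from statistics import mean


def normalized_fluorescence(fluorescence_readings, a600):
    """Mean fluorescence (ex 396 nm / em 508 nm) divided by culture A600."""
    return mean(fluorescence_readings) / a600


def relative_fluorescence_intensity(sample, wild_type):
    """Assumed RFI: normalized fluorescence of sample over that of wild type."""
    return normalized_fluorescence(*sample) / normalized_fluorescence(*wild_type)


# Made-up triplicate readings and A600 values, for illustration only.
wild_type = ([1000.0, 1020.0, 980.0], 2.0)
variant = ([1500.0, 1490.0, 1510.0], 2.0)

rfi = relative_fluorescence_intensity(variant, wild_type)
print(f"RFI = {rfi:.2f}-fold")
```

Dividing by the absorbance at 600 nm corrects for differences in cell density between cultures before the variant is compared with the wild type.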

SDS-PAGE Analysis of Extracts of E. coli Carrying GFP Genes with Inserted Last Sense Codons
After the fluorescence intensities of the GFP variants had been determined, the GFP genes were expressed as proteins, the quantities of the proteins were compared, and the influence of the last sense codon and stop codon on protein expression was analyzed. To make the comparison easier, sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) was performed. E. coli JM109 was transformed with the expression vector carrying the GFP gene and cultured in 20 mL of LB medium in the presence of 40 µM IPTG. DNase was added to the solution, which was incubated at 37 °C for 1 h to remove the remaining DNA. The insoluble fraction was separated by centrifugation (4 °C, 13,000 rpm, 10 min), and the soluble proteins were analyzed by SDS-PAGE.

Comparison of the three stop codons without insertion of a last sense codon using the expression vector pKK223-3
There are three stop codons (UAA, UAG, and UGA). The fluorescence intensities of the wild-type GFP genes carrying each of the three stop codons were determined under the same conditions. The order of fluorescence intensity was wild(UGA) > wild(UAA) > wild(UAG). The comparison is shown in Figure 4.

Comparison of the RFI when the stop codon was changed using the expression vector pKK223-3
Figure 5 shows that the last sense codon TCG had been inserted into the GFP gene; the stop codon UAA is marked with a square.
As the results show (Figure 4), for most last sense codons the change in fluorescence intensity was small when the stop codon was replaced. However, some exceptions were observed. For example, for the last sense codon CCG, the RFI of GFP(CCG) was 2.1-fold with the stop codon UAA but only 1.1-fold when the stop codon was changed from UAA to UAG. For the last sense codon TAG, the RFI of GFP(TAG) with the stop codon UAG was stronger than that with the stop codons UAA and UGA.
Compared with wild-type GFP, some GFP genes had consistently stronger or weaker RFIs with all three stop codons. For example, with the expression vector pKK223-3, GFP(CCG) and GFP(GGA) always had stronger fluorescence intensities than wild-type GFP, and GFP(GTT), GFP(TTG), and GFP(CTA) always had weaker fluorescence intensities, with all three stop codons (UAA, UAG, and UGA).
Last sense codons encoding the same amino acid gave GFP genes with nearly the same RFIs. For example, GTT and GTG encode valine, ACC and ACA encode threonine, and AGT and AGC encode serine; within each pair, the fluorescence intensities of the corresponding GFP genes were nearly the same.
There were also exceptions. For example, CCC, CCA, and CCG all encode proline; GFP(CCC) and GFP(CCA) had similar fluorescence intensities, but the RFI of GFP(CCG) was particularly strong (2.1-fold) when the stop codon was UAA.
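The grouping of last sense codons by encoded amino acid can be sketched in Python as follows. The table is the subset of the standard genetic code covering the codons named above; the function name `group_by_amino_acid` is a hypothetical helper introduced only for this illustration.

```python
from collections import defaultdict

# Subset of the standard genetic code covering the codons named in the text.
AMINO_ACID = {
    "GTT": "Val", "GTG": "Val",
    "ACC": "Thr", "ACA": "Thr",
    "AGT": "Ser", "AGC": "Ser",
    "CCC": "Pro", "CCA": "Pro", "CCG": "Pro",
}


def group_by_amino_acid(codons):
    """Group last sense codons by the amino acid they encode."""
    groups = defaultdict(list)
    for codon in codons:
        groups[AMINO_ACID[codon]].append(codon)
    return dict(groups)


synonymous_groups = group_by_amino_acid(AMINO_ACID)
for amino_acid, codons in synonymous_groups.items():
    print(amino_acid, codons)
```

Comparing RFIs within each group separates effects of the encoded amino acid from effects of the codon itself, which is how the proline codon CCG stands out from CCC and CCA.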
Recognition of stop codons by release factors 1 (RF-1) and 2 (RF-2) leads to peptide chain termination during translation. The stop codons UAG and UAA are recognized by RF-1, while UAA and UGA are recognized by RF-2. The differences among stop codons in their influence on expression efficiency might be caused by the change of release factor; this requires further verification.
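The recognition pattern stated above can be written as a small lookup table. The Python sketch below encodes which release factor recognizes which stop codon and reports the factors shared by two stop codons; the function names are hypothetical and serve only to illustrate the hypothesis that a stop codon swap may change the release factor involved.

```python
# Release factor recognition as stated in the text.
RELEASE_FACTORS = {
    "UAA": ("RF-1", "RF-2"),
    "UAG": ("RF-1",),
    "UGA": ("RF-2",),
}


def release_factors_for(stop_codon):
    """Return the release factors for a stop codon given as RNA or DNA."""
    codon = stop_codon.upper().replace("T", "U")
    return RELEASE_FACTORS[codon]


def shared_factors(stop_a, stop_b):
    """Release factors that recognize both stop codons."""
    return sorted(set(release_factors_for(stop_a)) & set(release_factors_for(stop_b)))


print(shared_factors("UAA", "UAG"))
print(shared_factors("UAG", "UGA"))
```

Changing UAA to UAG removes RF-2 from the set of possible terminating factors, and changing UAA to UGA removes RF-1, while UAG and UGA share no release factor at all.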

SDS-PAGE of soluble proteins extracted from E. coli
To confirm that the RFI was related to the expression efficiency of the GFP gene, SDS-PAGE of soluble proteins extracted from E. coli was performed. The SDS-PAGE result with the expression vector pKK223-3 is shown in Figure 7. Three last sense codons, CCG, AGT, and CTA, were selected, and the RFIs with the three stop codons were compared. As shown in Figure 6, the RFIs were: CCG(UAA) 2.

Conclusions
In this study, the influence of the combination of last sense codon and stop codon on the expression efficiency of the GFP gene in E. coli was investigated. In addition to the last sense codon, the stop codon affected gene expression. When the stop codon was changed, the fluorescence intensities of most GFP genes with inserted last sense codons did not change. Some GFP genes, such as GFP(CCG) and GFP(GGA), always had stronger expression efficiencies; when these last sense codons were inserted, the quantity of protein increased. These results confirm that the combination of stop codon and last sense codon influenced the expression efficiency of the GFP gene. By inserting last sense codons, we changed the fluorescence intensities of some GFP variants, which means that this approach could be used to increase or decrease protein expression efficiency. In future research, we expect to insert codons at different sites of the GFP gene and to compare the effects of these insertions on the GFP variants, especially variants that greatly increase protein quantity, with the ultimate aim of achieving artificially controlled protein expression.