Expressing recombinant proteins with peptide or protein tags is widely used in protein research for three main reasons: ease of purification from a large pool of host proteins, enhanced solubility of the protein, and localization studies. Important points to consider when choosing an expression vector are the compatibility of the tag sequence with the desired protein, codon usage, the inclusion of linker sequences and protease cleavage sites, and the impact of the tag on the nature of the desired protein. Tags range from large proteins such as Maltose Binding Protein (MBP) to small peptides such as hexahistidine (6xHis). A tag can be added to either the N-terminus or the C-terminus of the desired protein.
Some tags are used to enhance protein solubility, such as MBP, Glutathione S-Transferase (GST), Small Ubiquitin-related Modifier (SUMO) and ubiquitin. Other tags are useful for affinity purification: His-tagged proteins are purified on metal affinity columns charged with nickel or cobalt; GST-tagged proteins are purified on immobilized glutathione resin; MBP-tagged proteins are purified on amylose agarose beads; and, similarly, proteins fused to a chitin-binding domain are purified on chitin resin. AviTag is biotinylated enzymatically so that it binds streptavidin, whereas SBP tag and Strep-tag bind streptavidin directly. Localization studies of some proteins can be simplified by using fluorescent tags such as green fluorescent protein (GFP, EGFP). Myc tag and FLAG tag are non-fluorescent tags that are widely used for detecting and localizing proteins by Western blotting, immunocytochemistry, immunohistochemistry, immunoprecipitation, ChIP assays, etc. Small epitope tags such as E-tag, FLAG tag, HA tag, Myc tag, V5 tag and VSV tag are likewise used for antibody recognition.
The main drawback of tagged protein expression is that the tag may alter both the functional and the structural characteristics of the protein. Tag removal by proteases can leave extra amino acids on the N-terminal side of the protein, which may alter protein function. The choice of tag therefore matters. Some proteins do not require tag removal, so they can be cloned with any tag depending on the requirements for yield, solubility and affinity purification. Therapeutic proteins, however, should have the native sequence, and even a single extra amino acid is not acceptable, so they should be cloned only with tags that can be completely removed in the final purification step. Protein and peptide tags are often connected to the desired protein through short protease cleavage sequences. Proteases such as thrombin (Leu-Val-Pro-Arg✂Gly-Ser) and TEV protease (ENLYFQ✂G or ENLYFQ✂S) are widely used to cleave these recognition sequences, but they leave one or two extra amino acids on the N-terminus of the desired protein. For complete removal of the tag without any extra amino acid, the following options can be considered:
1. Enterokinase cleaves after the lysine of its recognition sequence (DDDDK✂), separating the tag from the native desired protein.
2. Factor Xa cleaves after the arginine of its recognition sequence (Ile-(Glu or Asp)-Gly-Arg✂), producing the native protein.
3. SUMO tag removal by SUMO protease generates the native N-terminus of the protein.
4. A ubiquitin tag enhances solubility and can easily be removed by ubiquitin hydrolase, leaving the native N-terminus of the protein.
5. Self-cleaving tags have recently gained importance because they reduce the need for expensive proteases for tag removal. Depending on the tag, self-cleavage can be induced by a pH change, by thiols, or by adding calcium ions, triglycine, or metal ions such as manganese. Intein tag, SrtAc protein and FrpC protein are a few examples.
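The difference between proteases that leave extra residues and those that release a native N-terminus can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: the recognition motifs are the simplified consensus sequences quoted above, the His-tagged fusion constructs and the target sequence `MKTAYIAKQR` are hypothetical, and the helper names `cleave` and `leftover_residues` are invented for this illustration. Real protease specificity (secondary sites, site accessibility) is not modelled.

```python
import re

# Simplified consensus sites from the text, written as
# (residues before the cut, residues required after the cut).
# An empty second element means nothing specific is required downstream.
PROTEASE_SITES = {
    "thrombin": ("LVPR", "GS"),
    "TEV": ("ENLYFQ", "[GS]"),
    "enterokinase": ("DDDDK", ""),
    "factor_xa": ("I[ED]GR", ""),
}


def cleave(fusion, protease):
    """Cut at the first recognition site; return (tag_fragment, released_protein)."""
    before, after = PROTEASE_SITES[protease]
    pattern = before + (f"(?={after})" if after else "")
    match = re.search(pattern, fusion)
    if match is None:
        raise ValueError(f"no {protease} site found in sequence")
    cut = match.end()  # the scissile bond lies right after the 'before' motif
    return fusion[:cut], fusion[cut:]


def leftover_residues(released, native):
    """Return the extra N-terminal residues left on the released protein."""
    if not released.endswith(native):
        raise ValueError("released protein does not end with the native sequence")
    return released[: len(released) - len(native)]


# Hypothetical target protein and His-tagged fusion constructs (illustration only).
NATIVE = "MKTAYIAKQR"
CONSTRUCTS = {
    "thrombin": "MHHHHHH" + "LVPRGS" + NATIVE,
    "TEV": "MHHHHHH" + "ENLYFQG" + NATIVE,
    "enterokinase": "MHHHHHH" + "DDDDK" + NATIVE,
    "factor_xa": "MHHHHHH" + "IEGR" + NATIVE,
}

for name, fusion in CONSTRUCTS.items():
    _, released = cleave(fusion, name)
    extra = leftover_residues(released, NATIVE)
    print(name, "leaves", repr(extra) if extra else "no extra residues")
```

In this sketch thrombin and TEV protease leave residues from the downstream half of their sites on the released protein, while enterokinase and Factor Xa cut at the end of their sites and release the native sequence, which is why they are the ones listed for therapeutic-grade constructs.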
References:
1. Catanzariti A-M, Soboleva TA, Jans DA, Board PG, Baker RT. An efficient system for high-level expression and easy purification of authentic recombinant proteins. Protein Science. 2004;13(5):1331-1339.
2. Frey S, Görlich D. A new set of highly efficient, tag-cleaving proteases for purifying recombinant proteins. Journal of Chromatography A. 2014;1337:95-105.