Frameshift Mutation

A frameshift mutation (also called a framing error or a reading frame shift) is a genetic mutation caused by indels (insertions or deletions) of a number of nucleotides in a DNA sequence that is not divisible by three. Because codons are read as triplets during gene expression, such an insertion or deletion can alter the reading frame (the grouping of the codons), resulting in a completely different translation from the original. The earlier in the sequence the deletion or insertion occurs, the more altered the protein is. [1] A frameshift mutation is not the same as a single-nucleotide polymorphism, in which a single nucleotide is replaced rather than inserted or removed. A frameshift mutation will in general cause the codons after the mutation to be read as coding for different amino acids. It will also change the first stop codon (“UAA”, “UGA” or “UAG”) encountered in the sequence. The polypeptide being made may therefore be unusually short or unusually long, and most likely will not be functional.
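The divisibility rule above can be illustrated with a short sketch. The helper names `is_frameshift` and `codons` and the example sequence are made up for illustration; the sketch considers only a single indel inside a coding sequence and ignores splicing and other biological context.

```python
def is_frameshift(indel_length: int) -> bool:
    """Return True if an indel of this many nucleotides shifts the reading frame.

    Hypothetical helper: the downstream frame is preserved only when the
    number of inserted or deleted nucleotides is a multiple of three.
    """
    if indel_length <= 0:
        raise ValueError("indel length must be a positive number of nucleotides")
    return indel_length % 3 != 0


def codons(seq: str) -> list[str]:
    """Split a sequence into consecutive triplets, dropping an incomplete tail."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


# Made-up mRNA: AUG GCU GAA UUU UAA
original = "AUGGCUGAAUUUUAA"
# Insert a single C after the first codon (an assumed position).
inserted = original[:3] + "C" + original[3:]

print(is_frameshift(1), is_frameshift(3), is_frameshift(32))
print(codons(original))
print(codons(inserted))
```

In this made-up example every codon after the insertion point is regrouped, and the new grouping happens to contain a UGA stop codon earlier than the original UAA.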

Frameshift mutations are evident in severe genetic diseases such as Tay-Sachs disease; they increase susceptibility to certain cancers and to classes of familial hypercholesterolemia; and in 1997 [3] a frameshift mutation was linked to resistance to infection by the HIV retrovirus. Frameshift mutations have been proposed as a source of biological novelty, as with the alleged creation of nylonase; however, this interpretation is controversial. A study by Negoro et al. (2006) [4] found that a frameshift mutation was unlikely to have been the cause, and that nylonase instead resulted from a two-amino-acid substitution in the active site of an ancestral esterase.


The information contained in DNA determines the function of proteins in the cells of all organisms. Transcription and translation allow this information to be communicated in order to make proteins. However, an error in reading this information can cause a protein to function incorrectly and eventually lead to disease, even though the cell incorporates a variety of corrective measures.

Central dogma

In 1956 Francis Crick described the flow of genetic information from DNA to a specific arrangement of amino acids in a protein as the central dogma. [1] For a cell to function properly, proteins must be produced accurately for structural and catalytic activities. An incorrectly made protein can have a detrimental effect on cell viability, and in most cases the resulting abnormal cellular functions make the higher organism unhealthy. To ensure that the genome passes its information on successfully, proofreading mechanisms such as exonucleases and mismatch repair systems are incorporated into DNA replication. [1]

Transcription and translation

After DNA replication, the reading of a selected segment of genetic information is accomplished by transcription. [1] The nucleotides containing the genetic information are now on a single-stranded messenger template called mRNA. The mRNA associates with a subunit of the ribosome and interacts with an rRNA. The genetic information carried in the codons of the mRNA is then read (decoded) by the anticodons of tRNAs. As each codon (triplet) is read, amino acids are joined together until a stop codon (UAG, UGA or UAA) is reached. At this point the polypeptide (protein) has been synthesized and is released. [1] For every 1000 amino acids incorporated into a protein, no more than one is incorrect. This fidelity of codon recognition, which preserves the proper reading frame, is achieved by proper base pairing at the ribosome A site, by the GTP hydrolysis activity of EF-Tu (a form of kinetic stability), and by a proofreading mechanism as EF-Tu is released. [1]

Frameshifting can also occur during translation, producing different proteins from overlapping open reading frames, such as the Gag-Pol-Env retroviral proteins. This is quite common in viruses and also occurs in bacteria and yeast (Farabaugh, 1996). Reverse transcriptase, in contrast to RNA polymerase II, is considered a strong source of frameshift mutations: in experiments, RNA polymerase II accounted for only 3–13% of all frameshift mutations. In prokaryotes the error rate that induces frameshift mutations is only in the range of 0.0001 to 0.00001. [5]

Several biological processes help to prevent or correct frameshift mutations. Reverse mutations can occur that change the mutated sequence back to the original wild-type sequence. Another possibility for correction is a suppressor mutation, which offsets the effect of the original mutation by creating a secondary mutation that shifts the sequence back so that the correct amino acids are read. Guide RNA can also be used to insert or remove uridines in mRNA after transcription, restoring the correct reading frame. [1]
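The compensating effect of a suppressor mutation can be sketched as follows. The sequence, the positions of the two mutations, and the helper name `codons` are made up for illustration: a one-base insertion shifts the frame, and a one-base deletion a few codons downstream shifts it back, so that only the codons between the two sites differ from the wild type.

```python
def codons(seq: str) -> list[str]:
    """Split a sequence into consecutive triplets, dropping an incomplete tail."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


# Made-up wild-type mRNA: AUG GCC AAA CGU ACC GGA UUU UAA
wild_type = "AUGGCCAAACGUACCGGAUUUUAA"

# Original mutation: insert one A at index 4 (assumed position), shifting the frame.
shifted = wild_type[:4] + "A" + wild_type[4:]

# Suppressor mutation: delete one base at index 10 of the shifted sequence,
# which brings the downstream codons back into the original frame.
suppressed = shifted[:10] + shifted[11:]

# Indices of codons that still differ from the wild type after suppression.
changed = [
    i
    for i, (a, b) in enumerate(zip(codons(wild_type), codons(suppressed)))
    if a != b
]

print(codons(wild_type))
print(codons(shifted))
print(codons(suppressed))
print(changed)
```

In this sketch the shifted sequence loses its stop codon entirely, while the suppressed sequence regains the original codons from the fifth codon onward. Whether the few altered amino acids in between are tolerated depends on the protein.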

Codon-triplet importance

A codon is a set of three nucleotides, a triplet that codes for a certain amino acid. The first codon establishes the reading frame, and each new codon begins immediately after the previous one. The amino acid backbone sequence of a protein is defined by contiguous triplets. [6] Codons are key to the translation of genetic information for the synthesis of proteins. The reading frame is set when translation of the mRNA begins and is maintained as the ribosome reads from one triplet to the next. The reading of the genetic code is subject to three rules that govern the codons in mRNA. First, codons are read in the 5′ to 3′ direction. Second, codons are non-overlapping and the message has no gaps. The last rule, as stated above, is that the message is translated in a fixed reading frame.
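The three reading rules can be sketched in a few lines. This is a minimal illustration, not a model of the ribosome: it assumes an uppercase RNA string containing only A, C, G and U, takes the first AUG as the start of the reading frame, and uses the standard genetic code. The function name `translate` and the example sequences are made up.

```python
from itertools import product

# Standard genetic code, with codons ordered by base U, C, A, G at each position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA into one-letter amino acids.

    Rule 1: read 5' to 3' (left to right in the string).
    Rule 2: codons do not overlap and no bases are skipped (step of 3).
    Rule 3: the frame fixed by the first AUG is kept until a stop codon.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # UAA, UAG or UGA
            break
        protein.append(amino_acid)
    return "".join(protein)


normal = "GGAUGGCCAAACGUUAAGG"          # GG AUG GCC AAA CGU UAA GG
deleted = normal[:5] + normal[6:]       # remove one G just after the start codon

print(translate(normal))   # MAKR
print(translate(deleted))  # MPNVK
```

With the single-base deletion, every amino acid after the start codon changes and the original stop codon is no longer read in frame, so translation in this made-up example runs to the end of the sequence.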


Frameshift mutations may occur randomly or may be caused by external stimuli, and they can be detected by several different methods. Frameshifts are just one type of mutation that can lead to incomplete or incorrect proteins, but they account for a significant percentage of errors in DNA.

Genetic or environmental

A frameshift is a genetic mutation at the level of nucleotide bases. Why and how frameshift mutations occur is still under investigation. One environmental study examined UV-induced frameshift mutations produced by DNA polymerases deficient in 3′ → 5′ exonuclease activity. To study frameshifts, the normal sequence 5′ GTC GTT TTA CAA 3′ was changed to GTC GTT C TTA CAA (MIDC) or to GTC GTT T TTA CAA (MIDT). E. coli Pol I Kf and T7 DNA polymerase mutant enzymes devoid of 3′ → 5′ exonuclease activity produced UV-induced revertants at a higher frequency than their exonuclease-proficient counterparts. The data indicate that the loss of proofreading activity increases the frequency of UV-induced frameshifts. [7]



The effects of neighboring bases and secondary structure on the frequency of frameshift mutations have been investigated in depth using fluorescence. DNA tagged fluorescently by means of base analogs allows local changes in a DNA sequence to be studied. [8] In studies on the effects of primer strand length, an equilibrium mixture of four hybridization conformations was observed when the template bases were looped out as a bulge, i.e. a structure flanked by duplex DNA on either side. In contrast, a double-loop structure with an unusual unstacked DNA conformation at its downstream edge was observed when the extruded bases were located at the primer–template junction, indicating that misalignments can be modified by neighboring DNA secondary structure. [9]


Sanger sequencing and pyrosequencing are two methods that have been used to detect frameshift mutations; however, the data generated are likely not to be of the highest quality. Still, 1.96 million indels that do not overlap with other databases have been identified through Sanger sequencing. When a frameshift mutation is observed, it is compared against the Human Genome Mutation Database (HGMD) to determine whether the mutation has a deleterious effect. This is done by looking at four features: first, the ratio between the affected and conserved DNA; second, the location of the mutation relative to the transcript; third, the ratio of conserved to affected amino acids; and finally, the distance of the indel from the end of the exon. [10]

Massively parallel sequencing is a newer method that can be used for mutation detection. With this method, up to 17 gigabases can be sequenced at a time, in contrast to Sanger sequencing, which is limited to about 1 kilobase. Several technologies are available to perform this test, and it is being considered for use in clinical applications. [11] When testing for different carcinomas, current methods allow only one gene to be examined at a time. Massively parallel sequencing can test for mutations associated with different types of cancer at once, rather than requiring multiple specific tests. [12] One experiment to determine the accuracy of this newer sequencing method tested 21 genes and had no false positive calls for frameshift mutations. [13]


A US patent (5,958,684) in 1999 by Leeuwen details methods and reagents for the diagnosis of diseases caused by or associated with genes with somatic mutations giving rise to frameshift mutations. Methods include providing a tissue or fluid sample and performing gene analysis for frameshift mutations or proteins from this type of mutation. The nucleotide sequence of the suspected gene is provided from published gene sequences or from cloning and sequencing of the suspected gene. The amino acid sequence encoded by the gene is then predicted. [14]


Despite the rules governing the genetic code and the various mechanisms present in the cell to ensure the correct transfer of genetic information during DNA replication and translation, mutations still occur, and frameshift mutation is not the only type. There are at least two other recognized types of point mutation, specifically missense mutations and nonsense mutations. [1] A frameshift mutation can substantially change the coding capacity (genetic information) of the message. [1] Small insertions or deletions (those of less than 20 base pairs) make up 24% of the mutations that appear in currently recognized genetic disease. [10]

Frameshift mutations are more common in repeat regions of DNA. One reason for this is slippage of the polymerase enzyme in repetitive regions, which allows mutations to enter the sequence. [15] Experiments can be run to determine the frequency of frameshift mutations by adding or removing a predetermined number of nucleotides. Some experiments have added four base pairs (the +4 experiments), while a team at Emory University compared the frequency of mutations produced by adding and by removing a base pair. No difference in frequency was found between base pair additions and deletions. There is, however, a difference in the resulting protein.

Huntington’s disease is one of nine codon reiteration disorders caused by polyglutamine expansion mutations, which also include spino-cerebellar ataxia (SCA) 1, 2, 6, 7 and 3, spinobulbar muscular atrophy and dentatorubral-pallidoluysian atrophy. There may be a link between diseases caused by polyglutamine expansion mutations and those caused by polyalanine expansion mutations, through frameshifting of the original SCA3 gene product from the CAG/polyglutamine-encoding frame to the GCA/polyalanine-encoding frame. Ribosomal slippage during translation of the SCA3 protein has been proposed as the mechanism for this shift from the polyglutamine-encoding to the polyalanine-encoding frame. A dinucleotide deletion or a single nucleotide insertion within the polyglutamine tract of huntingtin exon 1 would shift the CAG (polyglutamine-encoding) frame by +1 (a +1 frameshift) to the GCA (polyalanine-encoding) frame. [16]


Frameshift mutations are at least part of the cause of many diseases. Knowing which mutations are prevalent can also aid in disease diagnosis. Attempts are currently being made to use frameshift mutations beneficially in the treatment of diseases by changing the reading frame of the amino acids.


Frameshift mutations are known to be a factor in colorectal cancer as well as in other cancers with microsatellite instability. As stated earlier, frameshift mutations are more likely to occur in regions of repeat sequence. When DNA mismatch repair does not correct the addition or removal of bases, these mutations are more likely to be pathogenic. This may be partly because the tumor is not told to stop growing. Experiments in yeast and bacteria help to show characteristics of microsatellites that may contribute to defective DNA mismatch repair. These include the length of the microsatellite, the makeup of the genetic material and how pure the repeats are. Experimental results indicate that longer microsatellites have higher rates of frameshift mutations. Flanking DNA may also contribute to frameshift mutations. [17] In prostate cancer a frameshift mutation alters the open reading frame (ORF) and prevents apoptosis, which leads to uncontrolled growth of the tumor. While environmental factors contribute to prostate cancer progression, there is also a genetic component. During testing of coding regions to identify mutations, 116 genetic variants were discovered, including 61 frameshift mutations. [18] More than 500 mutations in the BRCA1 gene on chromosome 17 play a role in the development of breast and ovarian cancer, and many of them are frameshifts. [19]

Crohn’s disease

Crohn’s disease is associated with the NOD2 gene. The mutation is the insertion of a cytosine at position 3020. This leads to a premature stop codon, truncating the protein that would normally be produced. When the protein is able to form normally, it responds to bacterial lipopolysaccharides, whereas the 3020insC mutation prevents the protein from being responsive. [20]

Cystic fibrosis

Cystic fibrosis (CF) is a disease based on mutations in the CF transmembrane conductance regulator (CFTR) gene. More than 1500 mutations have been identified, but not all cause disease. [21] Most cases of cystic fibrosis are the result of the ∆F508 mutation, which deletes an entire amino acid. Two frameshift mutations are of interest in the diagnosis of CF: CF1213delT and CF1154-insTC. Both of these mutations usually occur in association with at least one other mutation. Both cause a slight decrease in lung function and occur in about 1% of patients tested. These mutations were identified through Sanger sequencing. [22]


CCR5 is one of the cell entry co-factors associated with HIV, most often involved with nonsyncytium-inducing strains, and is most apparent in HIV patients as opposed to AIDS patients. A 32 base pair deletion in CCR5 has been identified as a mutation that negates the likelihood of HIV infection. This region of the open reading frame (ORF) contains a frameshift mutation that leads to a premature stop codon and, in turn, to a loss of HIV-coreceptor function in vitro. CCR5-1 is considered the wild type and CCR5-2 the mutant allele. People heterozygous for the CCR5 mutation were less susceptible to developing HIV. In one study, despite high exposure to the HIV virus, no one homozygous for the CCR5 mutation tested positive for HIV. [3]

Tay-Sachs disease

Tay-Sachs disease is a fatal disease affecting the central nervous system. It is most often found in infants and young children. Progression of the disease begins in the womb, but symptoms do not appear until about 6 months of age. There is no cure for the disease. [23] Mutations in the β-hexosaminidase A (Hex A) gene are known to affect Tay-Sachs onset, with 78 mutations of different types described, 67 of which are known to cause disease. Most of the mutations observed (65/78) are single base substitutions or SNPs; there are also 11 deletions (1 large and 10 small) and 2 insertions. Of the observed mutations, 8 are frameshifts: 6 deletions and 2 insertions. A 4 base pair insertion in exon 11 is observed in 80% of Tay-Sachs cases in the Ashkenazi Jewish population. Frameshift mutations lead to an early stop codon, which is known to play a role in the disease in infants. Delayed-onset disease appears to be caused by 4 different mutations, one being a 3 base pair deletion. [24]

Smith-Magenis syndrome

Smith–Magenis syndrome (SMS) is a complex syndrome that includes intellectual disability, sleep disturbances, behavioral problems, and a variety of craniofacial, skeletal, and visceral anomalies. Most SMS cases have a common deletion of ~3.5 Mb that encompasses the retinoic acid induced-1 (RAI1) gene. Other cases show variability in the SMS phenotype not previously shown for RAI1 mutations, including hearing loss, absence of self-injurious behavior, and mild global delay. Sequencing of RAI1 revealed mutation of a heptameric C-tract (CCCCCCC) in exon 3, resulting in frameshift mutations. Of the seven reported frameshift mutations in RAI1 occurring in poly C-tracts, four cases (~57%) occur in this heptameric C-tract. The results indicate that this heptameric C-tract is a preferential hotspot for insertions/deletions (SNindels) and therefore a primary target for analysis in patients suspected of having mutations in RAI1. [25]
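Single-base repeats such as the heptameric C-tract described above can be located with a simple scan. The following is a minimal sketch: the function name `homopolymer_tracts`, the minimum-length threshold and the example sequence are assumptions made for illustration, and the sequence is not taken from RAI1.

```python
import re


def homopolymer_tracts(seq: str, min_length: int = 6) -> list[tuple[int, str, int]]:
    """Return (start index, base, length) for each run of one repeated base.

    Hypothetical helper: a run is reported when the same base (A, C, G or T)
    occurs at least `min_length` times in a row.
    """
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    # One base followed by at least (min_length - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_length - 1))
    return [(m.start(), m.group(1), len(m.group(0))) for m in pattern.finditer(seq)]


# Made-up DNA fragment containing a run of seven Cs and a run of six As.
example = "ATGGCA" + "C" * 7 + "GTT" + "A" * 6 + "GC"

print(homopolymer_tracts(example))
print(homopolymer_tracts(example, min_length=7))
```

A scan like this only flags candidate regions; whether a given tract is actually mutated still has to be established by sequencing.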

Hypertrophic cardiomyopathy

Hypertrophic cardiomyopathy is the most common cause of sudden death in young people, including trained athletes, and is caused by mutations in the genes encoding proteins of the cardiac sarcomere. Mutations in the troponin C gene (TNNC1) are a rare genetic cause of hypertrophic cardiomyopathy. A recent study indicated that a frameshift mutation (c.363dupG or p.Gln122AlafsX30) in troponin C was the cause of hypertrophic cardiomyopathy (and sudden cardiac death) in a 19-year-old male. [26]


It is rare to find a cure for diseases caused by frameshift mutations, and research on this is ongoing. One example is primary immunodeficiency (PID), an inherited condition that can lead to an increase in infections. There are 120 genes and 150 mutations that play a role in primary immunodeficiencies. The standard treatment is currently gene therapy, but this is a highly risky treatment and can often lead to other diseases, such as leukemia. Gene therapy procedures involve modifying the zinc finger nuclease fusion protein, cleaving both ends of the mutation, which in turn removes it from the sequence. Antisense-oligonucleotide-mediated exon skipping is another possibility, used for Duchenne muscular dystrophy. This process allows the mutation to be passed over so that the rest of the sequence remains in frame and the function of the protein is retained. It does not, however, cure the disease; it only treats the symptoms, and it is practical only in structural proteins or other repetitive genes. A third form of repair is revertant mosaicism, which occurs naturally through a reverse mutation or a mutation at a second site that corrects the reading frame. This reversion can occur by intragenic recombination, mitotic gene conversion, second-site DNA slipping or site-specific reversion. It is possible in several diseases, such as X-linked severe combined immunodeficiency (SCID), Wiskott-Aldrich syndrome and Bloom syndrome. There are no medications or other pharmacogenomic methods that help with PID. [27]

A European patent (EP1369126A1) in 2003 by Bork records a method for the prevention of cancers and for the curative treatment of cancers and pre-cancers such as DNA mismatch repair (MMR) deficient sporadic tumors and tumors associated with HNPCC. The idea is to use immunotherapy with combinatorial mixtures of tumor-specific frameshift mutation-derived peptides to elicit a cytotoxic T-cell response specifically directed against tumor cells.