The last twenty years have been marked by a veritable explosion in sequencing technology. The completion of the Human Genome Project in 2003 was the crowning jewel of this burgeoning genomics revolution. The amount of information coming from this relatively new branch of science is mind-boggling and grows with each passing day.
Interesting observations about the non-coding DNA in our genome have come out of this mass of genomic data. Less than 2% of the more than 3 billion nucleotides in our genome encode the proteins that make up a human being. This leaves a large question as to what exactly the other 98% of our genome is up to. Large parts (roughly 50%) are known as “junk DNA” with no accepted role, although new research is beginning to shed light on the functions of this DNA. The remainder of our genome is composed of long and short repeated sequences, transposons, retrotransposons and the topic of today’s article: endogenous retroviruses.
These elements are not human; they are fully viral in origin. This means that our genome is not ours alone: in every cell of our bodies we carry the DNA of many viruses that infected our ancestors.
An endogenous retrovirus is a retrovirus that has managed to infect a germ line cell (sperm or egg), integrate into that cell's DNA, and subsequently be passed to future generations in the DNA of the host. Roughly 8% of our genome is made up of human endogenous retroviruses, or HERVs for short [1]. Retroviruses are capable of invading our genomes thanks to the viral enzyme integrase. Integrase takes the DNA copy of the viral RNA and inserts this chunk of viral DNA into our own genome, creating what is known as a provirus.
It is in our cells’ best interest to inactivate these proviruses as soon as possible, because over time the odds of an integration event that disrupts a critical gene increase. Not only this, but an active provirus can produce infectious virus that can then leave the cell, infect other cells, and integrate into their DNA in turn. Cells have several ways to inactivate these elements, but I would like to focus specifically on the APOBEC3G protein in humans, how this protein has inactivated HERVs in our own genome, and how this discovery may be applied to ongoing retroviral pandemics such as HIV.
There is a body of evidence suggesting that critical host factors are involved in the inactivation of incoming retroviruses. APOBEC3G (A3G) can inactivate a virus only if it is packaged into the virion, which means that the parent virus must be produced in a cell that expresses A3G. When a virion carrying A3G infects another cell, A3G does not alter the viral RNA itself. Instead it acts while that RNA is being reverse transcribed, deaminating cytosine (C) to uracil (U) in the newly made single-stranded DNA copy, preferentially at sites that correspond to runs of G in the viral genome. Because U pairs with adenine (A) rather than guanine (G), the finished proviral DNA carries a shift from G to A at each of these sites. The tryptophan codon, TGG, is especially vulnerable: a G-to-A change at either of its two G positions replaces tryptophan with a stop codon (TAG or TGA). A premature stop codon truncates the protein being read out and may effectively prevent the integrated provirus from producing infectious virus. These defective proviruses are not cleared from the genome, however, and remain there in inactive form. If they have integrated into the germ line, they can then be transmitted vertically through the population for generations.
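To make the effect of G-to-A changes on tryptophan codons concrete, here is a small Python sketch. It is a toy model, not a simulation of the real enzyme: the sequence is invented, the function names are my own, and it assumes every G followed by another G is converted to A, which is far more thorough than anything that happens in a cell.

```python
# Toy illustration only: hypothetical sequence, hypothetical helper names.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def hypermutate_gg(seq):
    """Turn every G that is immediately followed by another G into A."""
    bases = list(seq)
    for i in range(len(seq) - 1):
        # Decide from the original sequence so earlier edits do not hide later sites.
        if seq[i] == "G" and seq[i + 1] == "G":
            bases[i] = "A"
    return "".join(bases)

def first_stop(seq):
    """Return the codon index of the first in-frame stop codon, or None."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return None

# Made-up reading frame: Met Trp Pro Trp Lys
original = "ATGTGGCCATGGAAA"
mutated = hypermutate_gg(original)

print(original, first_stop(original))
print(mutated, first_stop(mutated))
```

In this example both TGG codons become TAG, so a reading frame that had no stop codon now ends at its second codon.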

So why are these HERVs important if they don’t do anything? They are important because they act as a “fossil record” of retroviruses that infected our ancestors and were successfully inactivated by our own defenses. One of the major aims of modern research is to come up with effective treatments for HIV and other retroviruses. If we could somehow harness the function of A3G, we could potentially use it, or a process derived from it, to defend cells against productive HIV infection.
Preliminary work in this area has already begun. A paper published in the Journal of Virology in 2008 described how a group generated a consensus sequence from a subset of related HERVs in our genome that first integrated in the human lineage millions of years ago [2]. This consensus sequence proved capable of generating infectious virus when introduced into cells, making this the first HERV that could be studied as a live virus in the laboratory. Think the virus version of “Jurassic Park” minus the frog DNA.
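The idea behind a consensus sequence can be sketched in a few lines of Python. This is only an illustration of the general principle, not necessarily the procedure the authors used: it assumes the copies are already aligned to the same length, takes a simple majority vote at each position, and uses invented sequences and an invented function name.

```python
from collections import Counter

def consensus(aligned):
    """Return the most common base at each column of pre-aligned sequences."""
    if len(set(map(len, aligned))) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*aligned))

# Hypothetical damaged copies of one ancestral sequence; each carries a different mutation.
copies = [
    "ATGTGGCCATGG",
    "ATGTAGCCATGG",
    "ATGTGGCCATAG",
    "ATGTGGTCATGG",
]

ancestor = consensus(copies)
print(ancestor)
```

Because each copy was damaged at a different position, the majority at every position recovers the undamaged sequence even though one of the four copies happens to match it only by chance.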
With the ancient virus in hand, they compared the sequence of the functional virus with those of the inactivated copies in our genome. What they saw was a high prevalence of these G-to-A mutations, or hypermutation, in the genomes of fossilized HERVs, indicating that the host protein A3G effectively inactivated these viral genomes through error catastrophe and the introduction of stop codons. Further study of this host defense is necessary, as the precise mechanism of A3G action is not yet fully understood. Still, these are interesting developments in the hunt for effective treatments to stop modern epidemics. Further research in this area could lead to novel therapeutics that could be added to current HAART regimens and help millions of infected people.
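The comparison itself amounts to tallying which base changes separate the reconstructed virus from each fossil copy. Here is a minimal Python sketch of that bookkeeping, assuming the two sequences are already aligned and the same length; the sequences and the function name are made up for illustration.

```python
from collections import Counter

def substitution_counts(reference, copy):
    """Count each reference>copy base change between two aligned sequences."""
    return Counter(f"{r}>{c}" for r, c in zip(reference, copy) if r != c)

# Hypothetical reconstructed sequence and one hypothetical fossil copy.
reference = "ATGTGGCCATGGAAA"
fossil = "ATGTAGCCATAGAAC"

counts = substitution_counts(reference, fossil)
g_to_a_share = counts["G>A"] / sum(counts.values())
print(counts, g_to_a_share)
```

A copy in which G-to-A changes make up an unusually large share of all differences is the kind of signature that points to A3G rather than to ordinary random mutation.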
References:
1. Bannert, N. & Kurth, R. Retroelements and the human genome: new perspectives on an old relation. Proceedings of the National Academy of Sciences of the United States of America 101 Suppl, 14572–9 (2004).
2. Lee, Y. N., Malim, M. H. & Bieniasz, P. D. Hypermutation of an ancient human retrovirus by APOBEC3G. Journal of Virology 82, 8762–70 (2008).
[Featured image from Flickr user Can H., used under a Creative Commons license]