The frustrated gene, the frustrated writer

Recently, my grad student and I published an article that articulates a number of ideas that have been bouncing around my brain for quite some time, “Non-Darwinian Molecular Biology”.

While writing this piece, I wanted to bring up some great conceptual ideas that I had read in a perspective on the relationship between the evolution of elaborate gene expression machinery in eukaryotes and selfish DNA.

Before we get to that article, I first want to delve into how the nucleus evolved.

Contrary to what you might have come across, the nucleus is not there to “store” your DNA. After all, prokaryotes have DNA and no nucleus. To understand what the nucleus is all about, it is useful to compare life forms whose cells have nuclei (i.e. eukaryotes) and those that don’t (i.e. prokaryotes). One big difference is how these two life forms manage their RNA. Eukaryotes make their RNA in one compartment, the nucleoplasm, and translate it into protein in a second compartment, the cytoplasm. In contrast, cells that lack nuclei conduct both of these activities in the same compartment at the same time, so as an mRNA is being made it is already being translated into new proteins. This is not the case in our nucleated cells, and the big question is why.

Part of the answer was laid down by two fundamental papers from 2006 by Martin & Koonin and López-García & Moreira. They pieced together the steps that led to the evolution of the common ancestor of all eukaryotic cells. The idea is that it arose from the merger of two cells which became entangled in a symbiotic relationship. One cell, likely of archaeal origin, gave rise to our nuclear genome* while the other cell, likely an alpha-proteobacterium, gave rise to our mitochondria. Since these mitochondrial progenitors lived inside the archaea-like cell, they would occasionally release their DNA, which would get absorbed into the archaea-like genome. This allowed the transfer of new genes and selfish DNA parasites (known as group II introns) from the alpha-proteobacteria into our nuclear genome. The selfish DNA elements then multiplied and copied themselves until they were everywhere, including in the middle of most of our nuclear genes. This was not a problem at first. When one of these selfish DNA entities was in the middle of a gene and the gene was transcribed into mRNA, the group II intron (which, remember, is at this point embedded in a longer mRNA) would fold up and splice itself out, leaving behind a spliced mRNA that no longer had any trace of the intron. The spliced-out group II intron RNA would then reverse the splicing reaction, but into a new location in the genome, and then use a second enzyme, reverse transcriptase, to convert itself into DNA. Yes, these were nasty little critters. Eventually, as the number of introns increased, we think that these parasites started to parasitize each other. It turns out that defective group II introns could be spliced out of mRNAs in “trans” by other fully operational group II intron RNAs. Over time, the former group became our current-day introns, while the latter group became professional trans-splicing machinery, what we call the spliceosome. Yes, our spliceosome is an RNA enzyme that evolved from a parasite.

Since newly made mRNAs need to be spliced before they can be properly translated, splicing was confined to the nucleoplasm while translation took place in a separate region, the cytoplasm. Although this sounds like the compartmentalization of the cell took place after the proliferation of splicing, it remains possible that the compartmentalization occurred first and permitted splicing to proliferate. It is also possible that they evolved hand-in-hand. The presence of a nucleus likely also changed how our genome evolved (in a later post, I’ll write about “global” and “local” solutions as evolutionary strategies) and promoted the proliferation of other selfish DNA parasites, what we call transposable elements. Today we see the product of this: a genome littered with the dead remains of selfish DNA, something that we now call junk DNA.

Okay, let’s get back to the perspective that discusses the relationship between the evolution of elaborate gene expression machinery in eukaryotes and selfish DNA.

This perspective attempts to address the question of why gene expression is complicated in eukaryotes and simple in prokaryotes. It’s not just mRNA splicing and nuclear export. Eukaryotic DNA is packed into what is called chromatin using proteins (histones) that undergo elaborate modifications. These modifications are highly dynamic and have very complicated relationships with other cellular machineries, especially with the enzymes that copy DNA into RNA, with enzymes that process RNA, such as the spliceosome, and with others that use RNA to silence parts of the genome, such as the RITS complex. The author of this perspective rightly points out that much of this elaborate and inefficient gene expression pathway likely arose due to the long historical battle that we have had with parasites - both selfish DNA and viruses.

I wanted to cite this article as its message was very much akin to the ideas we were writing about. The problem was, I could not track this article down. I was somewhat confident that this perspective was published in Cell, and that it came out a while ago. I tried PubMed, Google, and Google Scholar. I flipped through back issues. I tried using the search function on the www.cell.com website (note to Elsevier - it sucks). All these attempts failed. I just could not find it. The article discusses very high-level ideas that are hard to distill down into a few keywords. So I gave up. Our paper was accepted and is now published, and we never ended up citing the perspective.

As you likely know, last weekend I decided that I needed to start blogging again. I cobbled together this site, posted my first entry and sent out a few notifications over Twitter and Facebook. That’s when I saw that Hiten Madhani “liked” my tweet about my new blog. In an instant, I recognized his name as that of the author of the long-searched-for perspective**. Within 2 minutes I was able to track it down: “The Frustrated Gene: Origins of Eukaryotic Gene Expression”. Needless to say, I highly recommend this perspective.

*- Contrary to popular belief, our nuclear genome has a greater number of genes of eubacterial origin than of archaeal origin. We likely obtained these from alpha-proteobacteria, and possibly from a second endosymbiotic event, although this remains controversial.

**-I also know of Hiten Madhani as the postdoc supervisor of my next door neighbour, Marc Meneghini.