A Matter of Interpretation
(February 10th, 2015) “Functional” - what does it mean? According to the Oxford Dictionary, “functional” is defined as “of or having a special activity, purpose or task”. DNA is functional, too. But in what sense remains a matter of controversy. A new paper adds more fuel to the fire.
In February 2001, the Human Genome Project bewildered the scientific world with its announcement that only a tiny fraction (1.5%) of the human genome consists of genes (ca. 20,000) coding for proteins. What about the remaining 98.5%? Back then, functional DNA was defined as “encoding proteins” and hence the vast majority of our genome did not serve a purpose: it was junk DNA. With the discovery of promoters, enhancers and terminators, this view gradually changed, and nowadays we believe that even the three-dimensional structure of DNA influences protein production and thus has a function. So, did we get rid of all the junk? Does every sequence in our genome have a special activity, purpose or task?
The Encyclopedia of DNA Elements, better known as ENCODE, provides some answers. Following the Human Genome Project, it tried to sort out the junk DNA muddle by systematically mapping regions of transcription, transcription factor association, chromatin structure, and histone modification, in particular outside the well-studied protein-coding regions. Nine years and $400 million later, the ENCODE researchers proudly proclaimed that 80% of the human genome is now “associated with at least one biochemical function”.
But not everyone agreed, especially not with ENCODE’s definition of “functional”. Dan Graur, bioinformatician at the University of Houston, for instance, publicly and sharply criticised ENCODE’s findings. Recently, he published his own suggestion for a new DNA classification. According to Graur, biochemical activity in the cell does not necessarily make a DNA sequence functional. To make his point, he used a vivid analogy: “Following a collision between a car and a pedestrian, a car's hood would be ascribed the ‘function’ of harming the pedestrian, while the pedestrian would have the ‘function’ of denting the car's hood.” In other words, not every biochemical activity is necessarily biologically meaningful. Therefore, Graur suggests distinguishing the usage of a genomic element (like the car hitting a pedestrian) from the reason for its existence (a car is for driving). He calls the latter the selected effect function: “That is, a sequence is functional, if it is maintained in the genome by natural selection because of its function.”
Graur’s classification differs considerably from the classical view. Bioinformatician Sanne Nygaard from the University of Copenhagen thinks that this is because Graur and the ENCODE researchers come from different backgrounds. “What is considered ‘functional’ DNA may depend on the processes being studied. The classification system suggested by Graur et al. makes a lot of sense for those studying evolutionary biology, but may be less optimal for other researchers.”
If this already sounds complicated, wait: Graur has more. DNA is to be divided into four groups. The first split is into “functional” and “rubbish” DNA. The selected-effect “functional” DNA is further divided into “literal” and “indifferent” DNA. “Literal” means that the order of nucleotides is under selection pressure, regardless of whether it encodes a gene or an untranscribed region. “Indifferent” means that a genomic region is important but there is no selection pressure on the DNA sequence itself. Think of it as a stand-in: you just need someone in this position, it doesn’t matter who it is.
Then, there’s the second category, the rubbish DNA without selected-effect function. Here, Graur and colleagues see a delicate difference between “junk” and “garbage” DNA. Junk might come in handy at some point, like a “garage full of junk” that you don’t want to throw away. Who knows, maybe in 10,000 years or so, you will need this DNA sequence to protect you against atomic radiation. But “garbage” is really something you should dump if you want to belong to the fittest. Therefore, there is active selection against these sequences in the genome; it just doesn’t work 100%, and we will have to live with that garbage for a few thousand more years or so.
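For readers who think in code, the four groups can be sketched as a small decision tree. This is only an illustration of the scheme as described here, not anything taken from Graur's paper: the names `DnaClass`, `classify` and `is_functional`, as well as the three yes/no parameters, are hypothetical, and in reality deciding whether a sequence is under selection is the hard part.

```python
from enum import Enum


class DnaClass(Enum):
    """The four groups described above (names are illustrative)."""
    LITERAL = "literal"          # functional, nucleotide order under selection
    INDIFFERENT = "indifferent"  # functional, but the exact sequence does not matter
    JUNK = "junk"                # rubbish, neither selected for nor against
    GARBAGE = "garbage"          # rubbish, actively selected against


def classify(maintained_by_selection: bool,
             sequence_under_selection: bool,
             selected_against: bool) -> DnaClass:
    """Assign one of the four groups from three hypothetical yes/no answers."""
    if maintained_by_selection:
        # Selected-effect "functional" DNA: literal or indifferent.
        if sequence_under_selection:
            return DnaClass.LITERAL
        return DnaClass.INDIFFERENT
    # "Rubbish" DNA without selected-effect function: junk or garbage.
    if selected_against:
        return DnaClass.GARBAGE
    return DnaClass.JUNK


def is_functional(dna_class: DnaClass) -> bool:
    """True for the two selected-effect 'functional' groups."""
    return dna_class in (DnaClass.LITERAL, DnaClass.INDIFFERENT)
```

In this sketch, a spacer whose presence matters but whose sequence does not would come out of `classify(True, False, False)` as “indifferent”, while a region that selection neither keeps nor removes would be “junk”.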
It all makes sense, theoretically, but putting Graur’s new classification into practice reveals a problem, as Nygaard points out: “In my own type of work, a typical requirement is that each base in a genome has to be put into a specific category. Suppose there is a spacer sequence of 100 bp, this would be 'indifferent' DNA. But suppose that increasing the spacer to 500 bp has a slight deleterious effect, how do I label that? It is not clear to me how well the system would work in practice.”
So, the classification of DNA remains a controversial topic. But biologists and bioinformaticians can make the best of this situation in their everyday lab life. When characterising an unknown protein, basically every kind of information that is stored in databases is useful, giving hints about that protein’s function. At this point, one probably doesn’t care too much whether DNA is “literal” or “indifferent”, or perhaps even "zombie" DNA.