I just finished reading a paper out this month in PNAS, “Specific expression of long noncoding RNAs in the mouse brain.” From the looks of it, the paper has stirred up an interesting discussion in the science blogosphere about the term “Junk DNA.” Before I get to that discussion, let me give a brief synopsis of the paper and my thoughts on it (and a link to an ncRNA database at the end). Mercer et al. looked at long non-coding RNAs (ncRNAs), a class of transcripts more than 200 bp in length that make up around 1% of the genome. Looking at the Allen Brain Atlas (ABA), they found in situ hybridization (ISH) data for many long ncRNAs. Filtering these, they found “1,328 of the >20,000 probes in the ABA targeted transcripts that lacked significant protein-coding potential, including many previously characterized functional ncRNAs.” To quote:
The ABA mapped the ISH images to a common anatomical framework that enables the localization and relative quantification of transcript expression in major neuroanatomical brain regions. We used this relative quantified expression data to associate expression of ncRNAs to specific neuroanatomical structures.
And what did they find?
Among the >20,000 catalogued in the ABA, we identified 849 long transcripts with little or no protein-coding potential that were expressed in the adult mouse brain. Many of these ncRNAs showed regionally enriched expression profiles similar to that observed for protein-coding mRNAs. In addition, viewing the ncRNAs in their genomic context revealed potential functional implications of their expression profiles, particularly with respect to ncRNAs associated with well characterized neurological genes.
Are “these the product of transcriptional noise + RNA Processing noise” as Alex Palazzo asks? I’m not so sure that is the case here. The authors of the paper make a good argument: several detailed studies have shown such ncRNAs being trafficked to specific subcellular loci, suggesting that they are not artifacts. The authors then looked further at a subset of ncRNAs expressed in Purkinje cells and found a diversity of subcellular localization inconsistent with artifactual transcription.
I agree with the authors. This study does seem to suggest that many of these ncRNA transcripts have a possible function (though no function has yet been assigned). The research is pretty solid, and the results are compelling and suggestive.
Of course, this has started an entire discussion in the science blogosphere about “Junk DNA.” What a loaded term. It is an oversimplification of a complex concept, and it needs to be “framed” better. Greg Laden accidentally stepped into it when he wrote about this paper:
“Junk DNA” story is largely a myth, as you probably already know. DNA does not have to code for one of the few tens of thousands of proteins or enzymes known for any given animal, for example, to have a function.
To the second sentence I would answer, “of course not, we knew that.” As Dr. Gregory of Genomicron lists, researchers have long known and acknowledged functions for non-coding DNA. That’s not the question. To the first sentence, my answer is no, I didn’t know that “Junk DNA” was a myth. Of course, Laden has responded to the criticism of his post, his basic defense being that:
There is a conception that I believe is generally held by the public, science teachers, interested parties, etc. that the genome can be classed into two categories: DNA sequences that ultimately code for proteins and DNA sequences that are “junk” … have no function whatsoever.
If that is the general conception among the public, science teachers, etc., then it is indeed a “myth.” That myth needs to be destroyed. So, let’s not use the term “Junk.” I agree with Dr. Gregory:
“Junk DNA”, which originally was coined in reference to now-functionless gene duplicates (i.e., true broken-down “junk”), is now used as “a catch-all phrase for chromosomal sequences with no apparent function” (Moore 1996). Its current usage also implies a lack of function which is accurate by definition for pseudogenes in regard to protein-coding, but which does not hold for all non-coding elements. The term has deviated from or outgrown its original use, and its continued invocation is non-neutral in its expression – and generation – of conceptual biases.
It promotes the erroneous idea that there are two kinds of DNA: coding and junk, functional and non-functional.
As I found out in my own Ph.D. studies, “non-protein-coding” DNA is quite diverse. I studied retrotransposable elements. I have to admit, I was once an adaptationist when it came to retroposons. I had a difficult time at first grasping that such a huge part of the genome had no function for the organism. After more study and thought, I came to the conclusion that retroposons are “selfish” elements that, as a class, have no intrinsic function in the genome but are rather parasitic. Does this make them “junk”? No, not in the originally coined meaning, and not particularly in how the term is used now. Are they “non-coding”? No, they code for reverse transcriptase and other proteins. Are they non-functional? Yes and no. They are non-functional for the organism in the way a tick is for me, but quite functional when it comes to the tick’s own existence.
There are also a lot of sequences in the genome that are ‘throwoffs,’ pseudogenes and the like. This is DNA that has no function for the genome or for itself, and it can be considered like the ‘junk’ I throw in the basement of my house. I haven’t used it in years; it once might have had a function, but it doesn’t now. That I might go back into my basement some day and find a new function for it (as I’ve done recently) doesn’t mean that it has an intrinsic function now. It is still junk.
And of course there is a lot of DNA, like perhaps the DNA encoding these ncRNAs, that has a function in the genome that hasn’t been determined yet. I think what we are finding, and have found, is that the classes of DNA in our genome are quite diverse: protein-coding, regulatory, scaffolding, parasitic, purely unnecessary throw-off junk and so much more. I am sure we are going to find functions for DNA sequences to which we hadn’t ascribed any before. Some 20,000 protein-coding genes need a lot of help to make an organism as complex as a mouse or human. That said, there is a hell of a lot of sequence there that we can show to have no ‘function’ in the genome.
ncRNAs might, as a class, have a function. If they do, perhaps the NONCODE database of ncRNAs, reported in this year’s NAR Database issue that I was just perusing, can help (unfortunately, as I write this, I haven’t been able to access NONCODE).
1. Mercer, T., Dinger, M., Sunkin, S., Mehler, M., Mattick, J. (2008). Specific expression of long noncoding RNAs in the mouse brain. Proceedings of the National Academy of Sciences, 105(2), 716-721.
2. NONCODE v2.0: decoding the non-coding. Nucleic Acids Res. 2008 Jan;36(Database issue):D170-2.