ENCODE RNA-Seq data standards–we’re gonna need ’em

I just got an important email from the ENCODE announcement mailing list at UCSC Genome Browser. I haven’t had time to go through it as I’m packing for a trip, but I think the PDF document will make some fine airplane reading!

The ENCODE Consortium has finalized ‘Standards, Guidelines and Best Practices for RNA-Seq V1.0’, as part of the Consortium’s continuing effort to generate data standards. The document is available at the ENCODE portal here:


“RNA-Seq is a directed experimental approach aimed at characterizing transcription in biological samples. This document presents a set of guidelines and standards focused on best practices for creating ‘reference quality’ transcriptome measurements.”

It was followed by a direct link to the RNA-Seq PDF document: http://encodeproject.org/ENCODE/protocols/dataStandards/ENCODE_RNAseq_Standards_V1.0.pdf

I think it’s going to be interesting to read this, as I was just considering RNA-Seq data the other day when Stephen Turner started a discussion on some hot news:

@genetics_blog Wish I could read ($ub) MT @GenomeWeb: Transcript abundance substantially disagrees btwn RNA-seq expts w/ same platform http://bit.ly/kfShZH
And we replied with this:
@OpenHelix: @genetics_blog Refers to this paper http://bit.ly/l9akCF
The paper is about technical variability in RNA-Seq data from the same samples prepared the exact same way. I think it’s going to be important to be aware of the variability in this data as we explore it. And I’m sure the ENCODE consortium folks will have a look at this paper and consider that information.

One of the great things about the ENCODE consortium working to develop standards is that these great big data sets are available to all of us to look at, and there are people charged with evaluating the technology and the methods to get the most out of them.

If you aren’t familiar with the ENCODE project and data sets, please have a look at our ENCODE tutorial materials, which are freely available because they are sponsored by the UCSC ENCODE team. We walk you through the project framework, how to find the data at UCSC, and some important aspects of interacting with it.