This week’s Tip of the Week is a bit different from some of the others that I have done in the past. I’m going to take you through parts of a document: the newly released draft of the Data Release Policy for ENCODE (go over to this page at NHGRI and get a copy of the document). I know, you expect software from us. But I will also show you a bit of software at the end, if you can stick with me for that. OK?
We’ve been talking about the ENCODE projects about once a month lately. We are hoping to raise awareness and understanding about the framework, foundations, and goals for ENCODE. That’s because a TON of genome-wide data is going to be collected and offered to researchers worldwide as this project progresses. And as we proceed I’ll be showing you how to access that data in the UCSC Genome Browser, since UCSC is the DCC (or data coordination center) to wrangle the human data around ENCODE.
However, if you are going to use ENCODE data, you need to know about the guidelines for using that data. That’s what I’ll cover today. And I’ll also give you a peek at some of the first data to come through the process at UCSC on the test server*. It is a sample of ChIP-Seq data from HudsonAlpha that I’ll use as an example.
In short, this data policy tries to balance the needs of the users of this publicly-funded data with those of the scientists who are generating this data. The draft proposes a 9-month non-scoop window: the data providers release the data and then have 9 months to submit their manuscripts on it. In the meantime, you can look at the data and start to use it. But in general, they ask that you don’t submit a paper on it within that window without the consent of the ENCODE team. The appendix offers a couple of nice scenarios that help clarify what counts as appropriate use of the data.
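If you want to keep track of that window for a dataset you care about, the date arithmetic is easy to sketch in Python. This is only an illustration, not anything from the policy document itself: the function names and the example release date are made up, and I’m assuming that the window runs for nine calendar months from the release date and that the end date itself falls outside the window. Check the draft for how the dates are actually counted.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Move forward by whole calendar months, clamping the day to the
    last day of the target month (31 May + 9 months lands on 28/29 Feb)."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def window_end(release: date, window_months: int = 9) -> date:
    """First day after the window (assumption: nine calendar months)."""
    return add_months(release, window_months)


def in_window(release: date, today: date) -> bool:
    """True while the data is still inside the non-scoop window."""
    return release <= today < window_end(release)


if __name__ == "__main__":
    release = date(2008, 10, 1)  # hypothetical release date
    print("Window ends on:", window_end(release).isoformat())
    print("Still in window today?", in_window(release, date.today()))
```

The clamping in `add_months` is just one way to handle release dates late in a month; the policy may count the period differently.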
I hope you’ll have a look at the ENCODE draft data release policy and think about using the ENCODE data. And please give NHGRI and the ENCODE team feedback on this.
*Note on the test server: this is a sandbox for developers at UCSC. The data might not all have been QCed yet, and data here should not be considered final. But you can have a look.
There’s been some coverage of the request for comment elsewhere, too, if you want to read more about this: http://www.genomeweb.com/issues/news/149419-1.html
The UCSC Genome Browser “News” item has a link to the document as well.