As I mentioned in our last ENCODE tip, the project began with a hard look at 1% of the human genome sequence to learn as much as we could in any given region (the pilot project). The success of the pilot has led to an expansion of the project to a genome-wide scale.
Changing the scale of this project requires new ways to process, organize, and display the data, as well as to provide convenient access for analysis. The folks at the UCSC Genome Browser have been named the Data Coordination Center (DCC) team to handle this. In this tip we examine what the scale-up means for managing the data, and some of the strategies being developed to deal with it.
The goal is to get high-quality data from the ENCODE data-generating teams into your hands as quickly as possible. We hope this tip gives you some sense of how that will be accomplished.
The DCC team is automating many processes, capturing crucial aspects of the data-generating techniques, handling new types of data and visualizations, and providing status and throughput reports to the NHGRI.