As this Nature editorial says, as the human genome, and a few hundred others, were completed, the amount of data became daunting (we know that well here at OpenHelix; we deal with it every day, and every day we make it more accessible to scientists through training :). But also, importantly, even with all the data, it’s been found that we need more. As the editorial states:
By 2004, large-scale genome projects were already indicating that genome sequences, within and across species, were too similar to be able to explain the diversity of life. It was instead clear that epigenetics — those changes to gene expression caused by chemical modification of DNA and its associated proteins — could explain much about how these similar genetic codes are expressed uniquely in different cells, in different environmental conditions and at different times.
Thus is born the Human Epigenome Consortium (Nature paper, subscription required, here). You can find some of the data from the pilot project at the Sanger Institute site.
These are the beginning stages, but I believe it will prove to be quite a treasure trove of data (as if we don’t have a huge unmined dataset now). It was this last comment in the editorial that caught my eye:
.., given that epigenetic coding will be orders of magnitude more complex than genetic coding, its requirement for data crunching may be similar…
Get ready for a lot more resources and tools of greater complexity :).