I’ve talked a lot about how interested I am in seeing new visualization strategies for working with the volumes of data we have today–which are certainly not going to stop flowing in. But a more basic problem is simply locating and navigating to the data sets you might want to visualize.
TCGA–The Cancer Genome Atlas–collects large numbers of data sets on various cancers, covering different types of data: GWAS, expression, protein, and more. But it can be a challenge to keep up with the huge amounts of data that are coming in. They have a portal where you can query the underlying data sets, with many features that you might be interested in. But another group has developed a different strategy for accessing the data sets–their roadmap offers a quicker and easier way to assess, and then access, what’s available, as well as providing a more general strategy for organizing access to the files.
I’ll let the team explain with their own video:
But be sure to check out their paper, where they explain their strategy in more detail. They provide links to the queries they generate so you can explore those too. And you can consider this method for other types of data sets you might want to navigate as well.
There may be other ways you want to interact with TCGA data, and you can still access their portal for other types of queries. But this offers another way to quickly locate subsets of data sets that you might be interested in exploring with other tools.
Hat tip to Bell Eapen for the notice:
A self-updating road map of The Cancer Genome Atlas. http://t.co/GZ67P1ussR Good read
— Bell Eapen (@beapen) April 21, 2013
TCGA Roadmap Dashboard: http://tcga.github.io/Roadmap/
TCGA site portal: https://tcga-data.nci.nih.gov/tcga/
Robbins, D., Gruneberg, A., Deus, H., Tanik, M., & Almeida, J. (2013). A self-updating road map of The Cancer Genome Atlas. Bioinformatics. DOI: 10.1093/bioinformatics/btt141
The Cancer Genome Atlas (TCGA) Research Network (2008). Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature, 455(7216), 1061-1068. DOI: 10.1038/nature07385