When we do workshops at medical centers, one of the most common questions I get is about locating good resources for cancer data. We’ve talked about some of the large projects, like the ICGC. We’ve also talked about ways to stratify data sets, and one example of this was in cancer, using data from The Cancer Genome Atlas. Going forward, the ability to rapidly sequence normal vs tumor pairs should help us understand and target tumors even faster, and in some situations it will turn up entirely new leads.
But one of the really solid tools that I like to be sure to highlight for people is the COSMIC collection. It’s not new–it’s been around for a decade now. But it’s one of those types of core data resources that people really need to know about. Their long experience, their high quality curation, and their adaptations to new influxes of data volumes and data types, make them a really valuable source of information.
Reading their update paper in the 2015 NAR Database issue, I wanted to go over and refresh my memory of the features I knew, and explore some of the newer features too. There really is some serious depth over there, and I can’t touch on all of the aspects that they have in a blog post like this. But I also discovered that they’ve recently provided a number of videos to help people learn about the various tools and options.
For this week’s Video Tip of the Week, I’ll include their “overview” piece. But you should check out their Tutorials page for additional topics as well.
One feature I hadn’t realized they offer is a Genome Browser built on the JBrowse framework. There’s a separate video with some guidance on how to use that.
Their future directions section in the paper makes it clear they are preparing to handle the incoming data on this topic. They are also evaluating new tools and analyses that may be appropriate. But they commit to maintaining their strong emphasis on curation, which is music to my ears. I think quality hand curation is undervalued by end users (and sadly by funders), while being entirely critical to handling all the big data that’s coming. So get familiar with COSMIC for cancer genomics data. It will be worth your time.
Forbes S.A., D. Beare, P. Gunasekaran, K. Leung, N. Bindal, H. Boutselakis, M. Ding, S. Bamford, C. Cole, S. Ward, C. Y. Kok, et al. (2014). COSMIC: exploring the world’s knowledge of somatic mutations in human cancer, Nucleic Acids Research, 43 (D1) D805-D811. DOI: http://dx.doi.org/10.1093/nar/gku1075