Secret source code?

Over on G+, Dave Cole linked to this interesting tidbit:

Secret Source Codes Threaten Modern Science

That sounds pretty dire. But Jeremy Hsu does have a point. And it’s related to my ongoing issue with the data not being in the papers anymore either. It’s also related to the running joke around BioStar that we really need an obituary section in the NAR database or web server issues for vanished databases.

As much as I loved that recent stickleback paper, I went back to look (including the supplements) to see where the accession numbers were for all 21 of those genomes. I didn’t see them (though you can get the reference genome from a link in the browser pages). There were supposedly 5 million SNPs. I can’t find a single one in dbSNP, or any kind of link to a project ID. (I don’t expect 5 million SS IDs to be in the paper, but there should be some way to access them.) And yes, you can download some of the data from the sticklebrowser. But will that persist? For how long? I think that’s a real risk.

We can’t let the dazzling big data blind us to the need for access to the tools and the data itself, in secure and archival ways.

EDIT: here’s the link to the Science Policy Forum piece (subscription required): Shining Light into Black Boxes