A few more bits of info from the recent 3rd International Biocuration Conference: several of the speakers described the amounts of biological data coming, and soon to be coming, as beyond a tsunami – more like a tsunami of tsunamis of data. Even with the talented and creative people who are members of the biocuration community, handling all this data is challenging. Funding for staff is not going up at nearly the rate at which the data is increasing, so new and innovative mechanisms must be developed. An approach that has been in the works for years is community curation – acquiring curation directly from the expert researchers who are generating the data. But, as we have noted elsewhere on our blog, adding yet another to-do to a researcher's already daunting list is a hard sell – especially if incentives are lacking.
Another approach being taken is to create better connections between research publications and the databases that organize the data. In other words, the data would both be published AND entered into public databases in a single-step process, instead of being published first and then collected separately by the database, often well after publication. Several such approaches were discussed, including collaborations between databases (TAIR) and journals (Plant Physiology).
The project that I want to point out today is BioLit, and you can see an example implementation of it at BioLit:PDB. I’ve listed the paper describing BioLit below so you can get all the details for yourself, but briefly, BioLit leverages open access publishing on PubMed Central and ‘marks up’ the content of articles to provide metadata that describes the semantic content of the articles. BioLit will be useful to primary end-users and biocurators alike, and it will be interesting to see how this approach develops over time.
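To give a flavor of what this kind of markup involves (this is just an illustrative sketch, not BioLit's actual code – the regex and function names here are my own assumptions), one of the simpler steps is recognizing database identifiers, such as PDB entry IDs, in open-access article text so they can be attached as metadata:

```python
import re

# Illustrative sketch only, NOT BioLit's implementation: scan article text
# for PDB-style identifiers (a digit followed by three alphanumeric
# characters) so they could be attached to the article as metadata.
PDB_ID = re.compile(r"\b[1-9][a-zA-Z0-9]{3}\b")

def extract_pdb_ids(text):
    """Return the unique candidate PDB identifiers found in the text."""
    return sorted({m.group(0).upper() for m in PDB_ID.finditer(text)})

sentence = "The crystal structure (PDB entry 1TIM) was compared with 4HHB."
print(extract_pdb_ids(sentence))  # candidate IDs found in the sentence
```

A real system would of course need to filter false positives and link each identifier back to the database record, but the basic idea – turning free text into structured, database-ready metadata – is the same.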
Fink, J., Kushch, S., Williams, P., & Bourne, P. (2008). BioLit: integrating biological literature with databases. Nucleic Acids Research, 36 (Web Server issue). DOI: 10.1093/nar/gkn317