In today’s tip I want to make you aware of a tool that I think will help researchers present their own data and publications in an accurate and universally searchable way. I learned of the resource (UCSDBioLit) through an article in one of my recent BioMed Central article alert emails. This resource allows authors to mark up their own publications with XML tags AS THEY WRITE their papers, which should allow faster and more accurate semantic searching of their research.
A huge problem in science today is how to quickly search the vast literature base and accurately and efficiently find the data that you are interested in. Here at OpenHelix we focus on ways of effectively and efficiently getting information out of public databases and resources, but at the other end of the process is the ability for scientific knowledge to be curated into those resources. We have featured biocurators and the phenomenal work that they do several times in the past, but it is work that never ends and can be very labor intensive. It often involves an initial triaging of a field’s literature, some level of automatic information gathering, and then careful manual effort on the part of scientists at the resource to gather and present the information through their site. I know from personal experience that the process of reading a paper, clarifying research details with an author, and then presenting that information to the author’s satisfaction can be a very long and labor-intensive process, for both the curator AND the original author.
For years there has been discussion of ‘expert curation’, in which experts in the field author review or summary pages in a resource, as well as community curation jamborees and similar efforts. There have been fruits from many of these efforts, but in general participation is low. But who is more of an expert on the research being published than the author? If authors could/would mark up their own papers during the publication process, not only could they be assured that the mark-up would be accurate, but they would also help make their research universally searchable without the lag required for searchability through a specific resource. Thus far document mark-up has not been an easy process and has largely been deemed ‘not worth the effort’ for the level of attribution/recognition affiliated with it.
The BioMed Central article does a nice job of outlining and discussing many of these issues. It cites many other efforts and resources, and explains the authors’ motivation and the implementation of their software. A nice feature of the tool is that it includes interoperability features and shows a real commitment to conforming to existing standards of practice. The article also presents an appendix of resource addresses of other groups involved in semantic searching and literature publication. I especially like this quote from the paper:
The Word add-in presented here will assist authors in this effort using community standards and by making it possible for the author of the document, the absolute expert on the content, to do so during the authoring process and to provide this information in the original source document.
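To make the idea of author-side mark-up a bit more concrete, here is a minimal sketch in Python of what wrapping recognized terms in XML tags might look like. This is NOT the add-in's code or its output format: the `TERMS` table, the `EX:` identifiers, the `<term>` element, and the `mark_up` and `find_by_id` helpers are all made up for illustration, and a real tool would draw on published ontologies and community standards instead.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical term -> identifier table (the EX: identifiers are invented).
TERMS = {"apoptosis": "EX:0001", "kinase": "EX:0002"}

# Match any known term as a whole word, ignoring case.
PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, TERMS)) + r")\b", re.IGNORECASE
)


def mark_up(sentence: str) -> ET.Element:
    """Return a <p> element with each known term wrapped in <term id="...">."""
    para = ET.Element("p")
    last = None  # most recently added <term> element, if any
    pos = 0
    for match in PATTERN.finditer(sentence):
        plain = sentence[pos:match.start()]
        if last is None:
            para.text = plain  # text before the first tagged term
        else:
            last.tail = plain  # text between two tagged terms
        last = ET.SubElement(para, "term", id=TERMS[match.group(1).lower()])
        last.text = match.group(1)
        pos = match.end()
    rest = sentence[pos:]
    if last is None:
        para.text = rest  # no terms found: keep the sentence as plain text
    else:
        last.tail = rest
    return para


def find_by_id(para: ET.Element, term_id: str) -> list:
    """Return the text of every tagged term carrying the given identifier."""
    return [el.text for el in para.iter("term") if el.get("id") == term_id]


marked = mark_up("The kinase triggers apoptosis in these cells.")
print(ET.tostring(marked, encoding="unicode"))
print(find_by_id(marked, "EX:0002"))
```

The point of the sketch is the second helper: once terms carry identifiers, a search can match on the identifier rather than on whatever wording the author happened to use, which is the kind of semantic searching the tip is about.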
You can also find brief tutorials on using the tool at SciVee: Word Add-in for Ontology Recognition Tutorial (1 of 4): Install Process
As a note, literature mark-up and enabling are currently an active area – Mary found another literature handling resource and paper as well: Check out the tip, the articles & the tools. Tell me what you find/think. Thanks! (OH, and Happy St. Patty’s to ya!)
Fink, J., Fernicola, P., Chandran, R., Parastatidis, S., Wade, A., Naim, O., Quinn, G., & Bourne, P. (2010). Word add-in for ontology recognition: semantic enrichment of scientific literature. BMC Bioinformatics, 11(1). DOI: 10.1186/1471-2105-11-103