Tag: BioLit

Tip of the Week: Word Add-In for Ontology Recognition

17 March, 2010 (08:07) | Tip of the Week | By: Jennifer

In today’s tip I want to make you aware of a tool that I think will help researchers present their own data and publications in an accurate and universally searchable way. I learned of the resource (UCSDBioLit) through an article in one of my recent BioMed Central article alert emails. This resource allows authors to mark up their own publications with XML tags AS THEY WRITE their papers, which should allow faster and more accurate semantic searching of their research.
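To make the idea of tagging terms as you write a bit more concrete, here is a minimal sketch in Python. It is not how the Word add-in is implemented; it only illustrates the general technique of recognizing vocabulary terms in text and wrapping them in XML tags. The `term` element, its `id` attribute, the `mark_up` function and the two-entry vocabulary are all assumptions made up for this illustration.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Tiny illustrative vocabulary (an assumption, not the add-in's data):
# lower-case term label -> ontology identifier.
TERMS = {
    "apoptosis": "GO:0006915",
    "protein kinase activity": "GO:0004672",
}


def mark_up(text, terms=TERMS):
    """Wrap each recognized term in a hypothetical <term id="..."> element."""
    # Try longer labels first so multi-word terms are matched as a whole.
    labels = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(label) for label in labels) + r")\b",
        re.IGNORECASE,
    )
    pieces = []
    pos = 0
    for match in pattern.finditer(text):
        # Escape the untagged text so the result stays well-formed XML.
        pieces.append(escape(text[pos:match.start()]))
        term_id = terms[match.group(0).lower()]
        pieces.append(
            "<term id=%s>%s</term>" % (quoteattr(term_id), escape(match.group(0)))
        )
        pos = match.end()
    pieces.append(escape(text[pos:]))
    return "".join(pieces)


if __name__ == "__main__":
    print(mark_up("We measured protein kinase activity during apoptosis."))
```

A real tool would draw its terms from full community ontologies and let the author confirm or reject each suggestion; the sketch just shows why tags added at writing time make the text machine-searchable by identifier rather than by wording.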

A huge problem in science today is the ability to quickly search the vast literature base and to accurately and efficiently find the data that you are interested in. Here at OpenHelix we focus on ways of effectively and efficiently getting information out of public databases and resources, but at the other end of the process is the ability for scientific knowledge to be curated into those resources. We have featured biocurators and the phenomenal work that they do several times in the past, but it is work that never ends and can be very labor-intensive. It often involves an initial triaging of a field’s literature, some level of automatic information gathering, and then careful manual effort on the part of scientists at the resource to gather and present the information through their site. I know from personal experience that the process of reading a paper, clarifying research details with an author, and then presenting that information to the author’s satisfaction can be a very long and labor-intensive process, for both the curator AND the original author.

For years there has been discussion of ‘expert curation’, in which experts in the field author review or summary pages in a resource, of community curation jamborees, etc. There have been fruits from many of these efforts, but in general participation is low. But who is more of an expert on the research being published than the author? If authors could/would mark up their own papers during the publication process, not only could they be assured that the mark-up would be accurate, but they would also help make their research universally searchable without the lag required for searchability through a specific resource. Thus far document mark-up has not been an easy process and has largely been deemed ‘not worth the effort’ for the level of attribution/recognition affiliated with it.

The BioMed Central article does a nice job of outlining and discussing many of these issues. It cites many other efforts and resources, and explains the authors’ motivation and the implementation of their software. A nice feature of the tool is that it has interoperability features and a real commitment to conforming with existing standards of practice. The article also presents an appendix of resource addresses of other groups involved in semantic searching and literature publication. I especially like this quote from the paper:

The Word add-in presented here will assist authors in this effort using community standards and by making it possible for the author of the document, the absolute expert on the content, to do so during the authoring process and to provide this information in the original source document.

You can also find brief tutorials on using the tool at SciVee: Word Add-in for Ontology Recognition Tutorial (1 of 4): Install Process

As a note, literature mark-up and enabling are currently an active area – Mary found another literature handling resource and paper as well: Check out the tip, the articles & the tools. Tell me what you find/think. Thanks! (OH, and Happy St. Patty’s to ya!)

UCSDBioLit Reference:
Fink, J., Fernicola, P., Chandran, R., Parastatidis, S., Wade, A., Naim, O., Quinn, G., & Bourne, P. (2010). Word add-in for ontology recognition: semantic enrichment of scientific literature. BMC Bioinformatics, 11 (1). DOI: 10.1186/1471-2105-11-103

An interesting experiment I heard about at the Biocuration Conference

27 April, 2009 (10:19) | General Science, Genomics News | By: Jennifer

A few more bits of info from the recent 3rd International Biocuration Conference: several of the speakers described the amounts of biological data coming, and soon to be coming, as beyond a tsunami – more like a tsunami tsunami of data. Even with the talented and creative people who are members of the biocuration community, handling all this data is challenging. Funding for staff is not going up at nearly the rate at which the data is increasing, so new and innovative mechanisms must be developed. An approach that has been in the works for years is community curation – acquiring curation directly from the expert researchers who are generating the data. But, as we have noted elsewhere on our blog, adding yet another to-do to a researcher’s already daunting list is a hard sell – especially if incentives are lacking.

Another approach being taken is to create better connections between research publications and the databases that organize the data. In other words, the data would be both published AND entered into public databases in a single-step process, instead of being published first and then collected from the publication by the database well after publication occurs. Many of these approaches were discussed, such as collaborations between databases (TAIR) and journals (Plant Physiology).

The project that I want to point out today is BioLit, and you can see an example implementation of it at BioLit:PDB. I’ve listed the paper describing BioLit below so you can get all the details for yourself, but briefly BioLit leverages open access publishing on PubMed Central and ‘marks up’ the content of articles to provide metadata that describes the semantic content of the articles. BioLit will be useful to primary end-users and biocurators alike, and it will be interesting to see how this approach develops over time.

Fink, J., Kushch, S., Williams, P., & Bourne, P. (2008). BioLit: integrating biological literature with databases. Nucleic Acids Research, 36 (Web Server). DOI: 10.1093/nar/gkn317