Tip of the Week: Word Add-In for Ontology Recognition

In today’s tip I want to make you aware of a tool that I think will help researchers present their own data and publications in an accurate and universally searchable way. I learned of the resource (UCSDBioLit) through an article in one of my recent BioMed Central article alert emails. This resource allows authors to mark up their own publications with XML tags AS THEY WRITE their papers. This will allow faster and more accurate semantic searching of their research.

A huge problem in science today is the difficulty of quickly searching the vast literature base and accurately and efficiently finding the data that you are interested in. Here at OpenHelix we focus on ways of effectively and efficiently getting information out of public databases and resources, but at the other end of the process is the ability for scientific knowledge to be curated into those resources. We have featured biocurators and the phenomenal work that they do several times in the past, but it is work that never ends and can be very labor-intensive. It often involves an initial triaging of a field’s literature, some level of automatic information gathering, and then careful manual effort on the part of scientists at the resource to gather and present the information through their site. I know from personal experience that the process of reading a paper, clarifying research details with an author, and then presenting that information to the author’s satisfaction can be a very long and labor-intensive process, for both the curator AND the original author.

For years there has been discussion of ‘expert curation’, in which experts in the field author review or summary pages in a resource, as well as community curation jamborees, etc. There have been fruits from many of these efforts, but in general participation is low. But who is more of an expert on the research being published than the authors themselves? If authors could/would mark up their own papers during the publication process, not only could they be assured that the mark-up would be accurate, but they would also help make their research universally searchable without the lag required for searchability through a specific resource. Thus far document mark-up has not been an easy process and has largely been deemed ‘not worth the effort’ for the level of attribution/recognition affiliated with it.

The BioMed Central article does a nice job of outlining and discussing many of these issues. It cites many other efforts and resources, and explains the authors’ motivation and the implementation of their software. A nice feature of the tool is its interoperability, along with a real commitment to conforming with existing standards of practice. The article also presents an appendix of resource addresses of other groups involved in semantic searching and literature publication. I especially like this quote from the paper:

The Word add-in presented here will assist authors in this effort using community standards and by making it possible for the author of the document, the absolute expert on the content, to do so during the authoring process and to provide this information in the original source document.
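To make the idea of marking up terms with XML tags a bit more concrete, here is a minimal sketch in Python. It is NOT how the Word add-in works internally; it only uses simple case-insensitive string matching to wrap known terms in a tag that carries an ontology identifier. The function name `mark_up`, the `term` tag, its `id` attribute and the two-entry vocabulary are all my own assumptions for illustration (the identifiers are Gene Ontology IDs, but a real tool would load a full ontology and write a community-standard format).

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical mini-vocabulary (lowercase term -> ontology ID), for illustration only.
VOCABULARY = {
    "apoptosis": "GO:0006915",
    "mitochondrion": "GO:0005739",
}


def mark_up(text, vocabulary=VOCABULARY):
    """Wrap each known term in <term id="...">...</term>.

    The tag and attribute names are illustrative assumptions,
    not the format produced by the add-in.
    """
    if not vocabulary:
        return escape(text)

    # Try longer terms first so a multi-word term wins over a shorter one inside it.
    terms = sorted(vocabulary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )

    pieces = []
    last = 0
    for match in pattern.finditer(text):
        # Escape the untagged text so the result stays valid XML content.
        pieces.append(escape(text[last:match.start()]))
        term_id = vocabulary[match.group(1).lower()]
        pieces.append(
            "<term id=%s>%s</term>" % (quoteattr(term_id), escape(match.group(1)))
        )
        last = match.end()
    pieces.append(escape(text[last:]))
    return "".join(pieces)
```

Calling `mark_up("Apoptosis begins at the mitochondrion.")` returns the sentence with both terms wrapped, each carrying its identifier, while the rest of the text is left alone. The point of the sketch is only that the tags travel with the original document, so a search engine can match on the identifier rather than on the exact wording.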

You can also find brief tutorials on using the tool at SciVee: Word Add-in for Ontology Recognition Tutorial (1 of 4): Install Process

As a note, literature mark-up and enabling are currently an active area, and Mary found another literature-handling resource and paper as well. Check out the tip, the articles & the tools. Tell me what you find/think. Thanks! (OH, and Happy St. Patty’s to ya!)

UCSDBioLit Reference:
Fink, J., Fernicola, P., Chandran, R., Parastatidis, S., Wade, A., Naim, O., Quinn, G., & Bourne, P. (2010). Word add-in for ontology recognition: semantic enrichment of scientific literature. BMC Bioinformatics, 11(1). DOI: 10.1186/1471-2105-11-103

OpenHelix receives $1 million NIH grant for genomics training portal

January 21, 2008 (Seattle, WA) – Thanks to a $1 million grant, OpenHelix (www.openhelix.com) has been developing an innovative set of online tools for use by scientific researchers. The tools will greatly reduce the amount of time necessary to locate and use the vast genomics and bioinformatics resources available to scholars and scientists. Once relevant resources are located through an innovative search tool, researchers will learn how to use them with extensive tutorial suites. The SBIR (Small Business Innovation Research) grant was awarded by the National Human Genome Research Institute (Grant number 9R44HG004531).

Freely accessible genomics and bioinformatics resources

With numerous online databases and other genomics and bioinformatics resources available to scientists, the time spent identifying the best resources and using them in an efficient manner has been a challenge for even the most well-staffed organization. Much data is underutilized due to a lack of awareness of its existence. When scholars and scientists do happen to locate needed information in an online resource, they then must figure out each resource’s unique navigation methods and documentation style. Introductory training on many resources is either nonexistent or not sufficient to effectively teach users how to best use the site.

“The need for such a resource is clear in the bioinformatics area,” says Joan E. Brooks Ph.D., co-founder of Garbrook Knowledge Resources and former co-founder of Proteome, an online genomics information database company. “The OpenHelix solution will be a promising leap forward to assure the public investment in these resources is fully realized.”

Improving efficiency and effectiveness of research

While genomics resources and data continue to grow rapidly, scientists are at a disadvantage when trying to decide the best resource for them. The search and tutorial portal will enable faster completion of research projects, leading to an accelerated increase in the use and dissemination of scientific knowledge. “We are now looking at some very innovative ways to search a large number of resources, including semantic search using widely used and accepted ontologies,” said Warren Lathe, co-founder, OpenHelix Chief Scientific Officer and Principal Investigator on the grant. “The science community will be very excited about the tools we are going to offer this year.”

The groundbreaking search function will provide various methods for locating and ranking genomics resources. As they use the OpenHelix online search for their projects, scientists and other researchers will use a ranking system within the search results to filter the list that pertains to their particular needs, something not previously available. The tutorials also include training material for use in the classroom setting, giving faculty ready-made, updated material to train students. By matching researchers quickly and efficiently with the resources that are most relevant to their needs and providing training so the researcher can effectively use the resource, the grant from the NHGRI will help fulfill the promise of research breakthroughs provided by the post-genomic era.

About OpenHelix

OpenHelix, LLC (www.openhelix.com) provides the genomics knowledge you need when you need it. OpenHelix currently provides online self-run tutorials and on-site training for institutions and companies on the most powerful and popular free, web-based, publicly accessible bioinformatics resources. In addition, OpenHelix is contracted by resource providers to provide comprehensive, long-term training and outreach programs. Headquartered in Washington State, OpenHelix also has offices in San Francisco, Boston and North Carolina. Further information can be found at www.openhelix.com or by calling 1-888-861-5051.