In hunting for something to feature for this week’s tip, I noticed that Nucleic Acids Research had released their 2012 Web Server Issue back in July. As many of you might be aware, the Nucleic Acids Research journal is a forum where developers can present computational biology papers that describe the development of biologically relevant algorithms, novel usage of existing algorithms, or that report the development of biological databases & their usage. The web server issue is an annual special issue focused specifically on web-based software resources for analysis and visualization of molecular biology data.
This year marks their 10th web server issue & I decided to check it out. In order to devote full attention to the issue, I began by pouring myself a big cup of coffee in one of my favorite mugs, which somehow makes it taste better. Then I set out to enjoy the issue – every year I begin by reading the opening editorial & then the article on the bioinformatics links directory. The editorial usually explains the special emphasis for the issue (this year it is analysis of next-generation sequencing data), and is written by the executive editor of the issue, Gary Benson. For me, the editorial sets the tone of the issue, so to speak.
Next I consume the directory article, along with a couple of sips of my java. What interests me in the article is multifold. First is the discussion of trends that they see in the development of tools and resources, which is important for us here at OpenHelix. Figure 6 provides an interesting look at the categories and counts of resources from each annual issue – I am curious as to why all but one category decline in 2008. Table 1 also provides interesting data on tool trends.
I am also interested in the content of the list itself – it is a great list being developed by people that we have a lot of respect for. I was especially interested in this sentence from their article:
“The Bioinformatics Links Directory has also initiated active curation of its content, removing dead content and correcting content errors, which has resulted in more accurate although occasionally smaller counts for 2012.”
The emphasis is mine in the quote above. In my opinion this is a very important aspect of any list. If you remember, Mary posted on the idea of “Obituaries for bioinformatics tools” and started a BioStar post to collect this information. The BioStar post generated significant comment & looks like it may have helped inspire the Bioinformatics Links Directory team, judging from the comments. But it makes sense that you need not just to collect information but also to continue to maintain and filter that data so that it remains relevant – I mean if the forest is cluttered with dead wood, the useful “live trees” (ok, resources) are obscured from users, right?
The problem is that keeping any list (or documentation or tutorials, etc.) up-to-date is a hard, labor-intensive activity. Here at OpenHelix we also keep a list of biology-relevant resources that can be searched through for free, without registering, from our homepage. We currently have a summer intern culling through a list of over 5,000 resources and tools that we know of. She is eliminating duplicate entries in our database by finding and collecting alternative URLs – it is amazing how many resources have multiple entryways, each with their own URL. But different doors don’t make a different resource or utility, so we eliminate them from our list. Then we will tackle the dead resources, the listings that just go to a tiny tool internal to a main resource, or to a pre-formatted PubMed search for something.
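For the curious, here is a rough idea of what a first, purely mechanical pass at spotting those “multiple doors” could look like in Python. To be clear, this is only a sketch and not a description of how our intern actually works: the function names (`normalize_url`, `group_duplicates`), the normalization rules, and the example.org entries are all made up for illustration, and it assumes every URL includes its scheme (http:// or https://). It only catches trivially different URLs (a leading “www.”, a trailing slash, an index page); entryways that live at genuinely different addresses would still need a human to look at them.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize_url(url):
    """Reduce a URL to a rough 'host + path' key (hypothetical rules)."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    # Treat "www.example.org" and "example.org" as the same door.
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    # Treat an explicit index page as the same as its directory.
    for suffix in ("/index.html", "/index.htm"):
        if path.lower().endswith(suffix):
            path = path[: -len(suffix)]
    return host + path


def group_duplicates(entries):
    """Group (name, url) entries whose URLs share a normalized key.

    Returns only the keys that have more than one entry.
    """
    groups = defaultdict(list)
    for name, url in entries:
        groups[normalize_url(url)].append(name)
    return {key: names for key, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    # Made-up entries, purely for illustration.
    entries = [
        ("Tool A", "http://www.example.org/tool/"),
        ("Tool A (alternate link)", "http://example.org/tool/index.html"),
        ("Tool B", "http://example.org/other"),
    ]
    for key, names in group_duplicates(entries).items():
        print(key, "->", names)
```

Anything a script like this flags is only a candidate duplicate; someone still has to decide which URL is the real front door and which ones to drop.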
Creating AND maintaining a high quality list is not a trivial effort. In their paper the Bioinformatics Links Directory team describes remaining current as a “future challenge” and says:
“Although necessary to remain current and to advance the utility of the Bioinformatics Links Directory, these improvements will only prove useful if driven by the community. As a community-driven repository, everyone in the research or bioinformatics community has the opportunity to help make the collection better and more meaningful.”
I truly wish them better luck at “community curation” than many resources have had in the past, & hope they succeed. In our experience it works best with stable, sufficient funding because as they say: “you get what you pay for”.
OK, next post will be on actual resources in the web server issue, I promise!
2012 NAR Web Server Issue: http://nar.oxfordjournals.org/content/40/W1.toc
Bioinformatics Links Directory: http://bioinformatics.ca/links_directory/
OpenHelix Homepage & Search Portal: http://www.openhelix.com
Gary Benson (2012). Editorial: NUCLEIC ACIDS RESEARCH ANNUAL WEB SERVER ISSUE IN 2012 Nucleic Acids Research, 40 (W1) DOI: 10.1093/nar/gks607
Michelle D. Brazas, David Yim, Winston Yeung, & B. F. Francis Ouellette (2012). A decade of web server updates at the bioinformatics links directory: 2003–2012 Nucleic Acids Research, 40 (W1) DOI: 10.1093/nar/gks632