Gene ID converters compared

In today’s HUM-MOLGEN mailing list newsletter I spotted an interesting comparison. We get a lot of questions about how to convert IDs or how best to move from one data source to another. We’ve done some explorations of that in the past (MatchMiner is one example). This is not the sort of sexy thing that generally gets published in the literature, but it is a really nice fit for the informal literature of the newsletter/blogosphere world.

Diego Forero, an editor of HUM-MOLGEN, has assembled a comparison of several tools: Babelomics, Clone/ID converter, DAVID, g:Profiler, and MatchMiner. He started with a list of 100 Ensembl IDs and ran them through each tool to get the official HUGO nomenclature. (He does note that plenty of other conversions are possible among Ensembl, HGNC, EntrezGene, RefSeq, and UniGene, but Ensembl to HGNC was the test performed.) There was also a second test converting Affymetrix IDs to HUGO symbols. The references for the tools are provided as well.

The data is available on Scribd and you can download it yourself.  You can access the IDs and test other tools too.  Here is a sample of the outcome:


Babelomics did the best in this test. Now, I have a separate question: are they right? Just because a program provides an ID doesn’t mean it gave the right one. This is a problem I’ve seen over and over in this field. In my experience most output needs to be checked by humans. I remember one meeting where someone was describing a new tool that represented splice variants. We were all impressed, it sounded great, and then I raised my hand to ask: “But are they right?” And the tool developer said, “I don’t know.”
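To make the difference between "returned an ID" and "returned the right ID" concrete, here is a minimal Python sketch that scores one converter's output against a hand-checked reference. It is not Forero's method. The function name `score_converter` and the tool output are hypothetical, made up for illustration. The four Ensembl-to-symbol pairs in the reference are real, well-known mappings.

```python
def score_converter(reference, converted):
    """Score one tool's output against a trusted reference mapping.

    reference: dict of input ID -> correct symbol (hand-checked)
    converted: dict of input ID -> symbol returned by the tool
               (IDs the tool could not convert are simply absent or empty)

    Returns (coverage, accuracy):
      coverage = fraction of reference IDs the tool returned something for
      accuracy = fraction of those returned symbols that match the reference
    """
    # Keep only answers for IDs we can actually check, ignoring empty results.
    returned = {k: v for k, v in converted.items() if k in reference and v}
    correct = sum(
        1 for k, v in returned.items() if v.upper() == reference[k].upper()
    )
    coverage = len(returned) / len(reference) if reference else 0.0
    accuracy = correct / len(returned) if returned else 0.0
    return coverage, accuracy


# Hand-checked reference: Ensembl gene ID -> HGNC symbol.
reference = {
    "ENSG00000141510": "TP53",
    "ENSG00000012048": "BRCA1",
    "ENSG00000139618": "BRCA2",
    "ENSG00000146648": "EGFR",
}

# Hypothetical tool output (NOT real results from any converter):
# one wrong answer, one ID not converted at all.
hypothetical_tool_output = {
    "ENSG00000141510": "TP53",
    "ENSG00000012048": "BRCA2",  # deliberately wrong for the example
    "ENSG00000139618": "BRCA2",
}

coverage, accuracy = score_converter(reference, hypothetical_tool_output)
print(f"coverage={coverage:.2f} accuracy={accuracy:.2f}")
```

A tool that converts more IDs wins on coverage, but only the second number answers "are they right?", and it still depends on a human having checked the reference.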

Still, it is a useful exercise to compare these tools. And it is a great list to bookmark. But keep the correctness question in mind.

Forero’s ID converter tool comparison direct link: http://hum-molgen.org/NewsGen/08-2009/000020.html

Omes and Omics. Oh, please stop the growth of the suffixome…?

Yesterday from the Gene Ontology GoFriends mailing list I got notified about Babelomics. The creators describe Babelomics as:

Babelomics is a suite of interconnected tools oriented to the functional annotation of genome-scale experiments. One of the fields for which Babelomics is best suited is the analysis of microarray gene expression experiments. Nevertheless Babelomics is not restricted to this type of data and has been designed for facilitating the interpretation of large-scale experiments.

I think the tools sound terrific. But I just wasn’t sure I could handle another -omic…

And then my email from the HUM-MOLGEN list arrived. I found out about the Variome meeting. The Variome project aims to collect and make available information on human variation that is correct and complete. Another worthy exercise. But another -ome for the collection. I can’t take the growth of the suffixome any more….