Tag Archives: gene wiki

Friday SNPpets

Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

News bits: WikiGenes opportunity; HapMap data issue

Ok, I’m back from Thanksgiving and catching up on some emails and found a couple of news items I wanted to pass along.

WikiGenes invitation to edit a Nature Genetics paper

Here’s an interesting “experiment” I got notified about. You could potentially get authorship on this paper if you contribute to the development of this article.  Here’s the email I got from the WikiGenes mailing list–but click over for more details and you can see the article over there. I haven’t had time to check it all out yet, but as there is a deadline I wanted to mention it now.

Dear Mary,
The editor of Nature Genetics has commissioned a collaborative standards paper on Genome Wide Association Studies. An editable draft of this paper is now online at WikiGenes, http://www.wikigenes.org/GWAS.html?wpc=12

I hope this is an interesting opportunity for you, because significant contributions to this draft might get you a co-authorship on the final paper in Nature Genetics.

I would also like to use this occasion to ask you a favor.

If you like WikiGenes, please tell your friends about it. We do not have the budget of big publishers, so we depend fully on word-of-mouth publicity.

Or you could also help us by linking to WikiGenes from your website. Thank you!

Best of science,

Robert Hoffmann, PhD
Branco Weiss Fellow
WikiGenes – Evolutionary Knowledge

HapMap data in the HaploView tool

This came across from the HapMap mailing list. We tell people about using HapMap + HaploView all the time, so I wanted to mention this possible issue with some of the data:

Dear HapMap users,

Recently, there are several questions about Haploview data format errors when users tried to analyze HapMap release 28 data.  The current Haploview version (4.2) does not recognize the new individuals in release 28 and the software will generate an error similar to “Hapmap data format error: NA18876” when trying to open the data.

Haploview is developed and maintained by an organization different from HapMap.  Please contact Haploview help desk (haploview@broadinstitute.org) for questions specific to this software.


Hua Zhang, Ph.D.
dbSNP Group

A biological wikis conference: NETTAB 2010

Boy, I wish I could go to this. I would love to know where and how the successes are happening in this arena.  From the Biocurator mailing list:

Joint NETTAB 2010 and BBCC 2010 workshop Biological Wikis

November 29 – December 1, 2010

Congress Center, University of Naples “Federico II”, Naples, Italy http://www.nettab.org/2010/

The joint NETTAB and BBCC 2010 workshop on “Biological Wikis” promises to be a great meeting for all researchers involved in the exploitation of wikis in biology.

Come and discuss your ideas and doubts with such scientists as Alex Bateman, Alexander Pico, Andrew Su, Dan Bolser, Robert Hoffmann, Thomas Kelder, Mike Cariaso, Adam Godzik, Luca Toldo and many other who, we hope, will join the workshop.

It’s a great chance to follow smart tutorials and lectures on WikiPathways, WikiGenes, Semantic Wiki, PDBWiki, Gene Wiki and a proficient use of Wikipedia.

See a list of keynote speakers and tutorials at http://www.nettab.org/2010/progr.html .

There still is time to submit abstracts for posters and software demonstrations until next October 17, 2010!

The complete Call is available on-line at http://www.nettab.org/2010/call.html .

Registration is open at http://www.nettab.org/2010/rform.html .

Register within next October 29, 2010 and take profit of early registration fees.

A reduction of 20 euro applies to all fees for members of ISCB and other societies and networks.

More reductions are foreseen for PhD students.

Further information is available at http://www.nettab.org/2010/ .

Looking forward to seeing you soon in Naples.

Paolo Romano

I’ll be watching for tweets and meeting reports, please!

Community Annotation; Beyond Reference Genomes

I’m catching up with some mailing lists and news and I came across this interesting tidbit from our friends in the GMOD community. We are huge supporters of curation by humans for a couple of reasons: 1) we know the quality of that information can be the best and it captures so much of the information biologists need beyond sequence info; and 2) some of us have done curation and we know that it’s underappreciated but far from trivial :) .

We’ve also followed attempts by various groups to get the wider community involved to do a couple of things–to get authoritative and active researchers to put in stuff they know, and to reduce the burden on the overwhelmed professionals.  There have been a variety of ambitious attempts to get people involved in curation. UCSC has a wiki they rolled out. Some journals required Wiki updates. Seeding Wikipedia with some information and encouraging community input has been attempted. Separate new wikis on some topics have been initiated like WikiPathways.  We have been “skeptical optimists” about how some of these would go. We understand the need–but we know that end users of data are busy, they don’t get any work-related credit or time to do this sort of thing, and sometimes they don’t understand the finer points of curation.  But we like to see how the efforts work out and we’d like to see success.

Well–I’ve seen some results on various efforts that you ought to see. The GMOD community recently had a Community Annotation meeting that brought several groups together to discuss their experiences and outcomes. I’m not going to give it away–you need to go read it. One group had a 90% success rate with a strategy they attempted!!  Some groups are using curation as a student project. Others report on things they’ve tried that haven’t had as much success. Anyway: it’s all very interesting to know about–what works and what doesn’t.  And what about communities that don’t have a MOD (model organism database)? They touched on that too.

There was another meeting too that took on a separate topic: Post Reference Genome Tools.  The premise is this:

How are we going to visualize and exploit (or even cope with) the world three years from now, when small labs may be able to fully sequence 500 individuals or species (or more) in a month? How can we visualize and link together 500, 1000, or 10,000 genomes? Many existing tools assume a reference genome. Will a reference make sense in the future, or will it hold us back?

We know a lot about the volume of data that we’ve already got that so many people aren’t aware of.  As I was just saying the other day: the data’s not in the papers anymore. It’s in these databases and it’s up to you and me to find and deal with it.  But how will the data providers offer it to you? These folks are thinking about this–and it may alter the way you interact with the data.  Again, I’m not giving it away: go read the report.

Thanks to the GMOD community for doing these reports. They are nice to have, and for those of us who can’t be there they offer a really helpful look inside.

Quick links to the reports:

Community Annotation Satellite Meeting Report

Post Reference Genome Tools Satellite Meeting Report

GBrowse: http://gmod.org/wiki/Gbrowse

Tip of the Week: UCSC wiki annotations


In the continuing effort to get scientists and researchers to annotate and curate data and to capture the huge amount of knowledge available, UCSC Genome Browser has added a wiki annotation track to the browser. It’s not the first effort, of course: GeneWiki is an effort, with mixed results so far, to annotate gene function information as a community exercise using Wikipedia. Some journals are requiring wiki entries, and several databases have opened wikis for curation. Wikis could be a solution for capturing the exponentially increasing amount of data, or they could be just another place for adding confusion… or both. I suspect that out of the plethora of wikis becoming available for annotation and curation of genomic data, something will stick and find that Goldilocks balance of a dedicated community, ease of use, and the other aspects that will be needed for this to work.

Perhaps UCSC Genome Browser has that balance. It remains to be seen, but let’s get started. Today’s tip introduces the new wiki track in the UCSC Genome Browser.

Paper compares interaction databases

I wish I had more time to go into this paper in more detail–but I wanted to let you know that the paper is out there now.  It came in my recent Nature Methods in paper version, and if I weren’t crazy busy on a very cool project that we hope to launch this week I’d go deeper….

The paper is:  Literature-curated protein interaction datasets by Cusick et al. Nature Methods 6, 39–46 (2009); doi:10.1038/nmeth.1284

I knew from the abstract that it was going to cause some conflama. And I was right.  Soon after, an article in BioInform addressed some of the issues.  It requires a subscription, but here’s the title and the link if you do have one:  Study Finding Erroneous Protein-Protein Interactions in Curated Databases Stirs Debate, by Vivien Marx.

This paper gets at a question that people ask us all the time–how do I know which database to use for X purpose?  So if your question is which database to use for protein interactions, you should read this paper and consider the points they make.   They don’t compare all protein interaction databases, of course–but for those they do examine (IntAct, DIP, MINT) they provide informative comparisons that you should consider for any database.  What does it contain?  What is it missing?  They have some nice Venn diagrams to illustrate the content.  The one I used here is just a representation of that and doesn’t attempt to be accurately proportional; go to the paper to see the real ones.
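If you want to ask the “what does it contain, what is it missing” question of your own downloads, here is a minimal Python sketch of the kind of counting that sits behind a Venn diagram. Everything in it is an assumption for illustration: the database names (`db_a`, `db_b`, `db_c`), the protein identifiers, and the interaction pairs are made-up toy data, not real content from any database, and this is not the method used in the paper.

```python
def normalize(pairs):
    """Treat each interaction as unordered, so (A, B) equals (B, A)."""
    return {frozenset(pair) for pair in pairs}


def venn_counts(datasets):
    """Count interactions by the exact set of databases that contain them."""
    counts = {}
    all_items = set().union(*datasets.values())
    for item in all_items:
        members = frozenset(
            name for name, items in datasets.items() if item in items
        )
        counts[members] = counts.get(members, 0) + 1
    return counts


# Hypothetical toy data: three made-up databases with made-up protein IDs.
datasets = {
    "db_a": normalize([("P1", "P2"), ("P2", "P3"), ("P4", "P5")]),
    "db_b": normalize([("P2", "P1"), ("P3", "P6")]),
    "db_c": normalize([("P1", "P2"), ("P5", "P4"), ("P7", "P8")]),
}

counts = venn_counts(datasets)

# Print one line per Venn region, e.g. which databases share how many pairs.
for members, n in sorted(counts.items(), key=lambda kv: sorted(kv[0])):
    print(", ".join(sorted(members)), "->", n)
```

Normalizing each pair to an unordered set first matters: otherwise the same interaction written as A-B in one source and B-A in another would look like two different records.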

Our position is that you should use all of them, of course  :)  Project goals and funding issues, species specialties, scope…all of this impacts what will be in a database.  (In fact, please go to MINT and support their funding by signing their protest of funding cuts).

One point embedded in the paper caught my attention, though.  One major curation issue was that the species designation of the protein in the interactions was not clear.   I know sometimes this is a problem with the original source paper.  Sometimes it is a curation issue.  But this worries me because of the concern I raised with Wikipedia gene entries.  I made the point that there was no way to distinguish between human genes and mouse genes of the same name (MEF2/Mef2).  This could be true of similar genes in other species too–where the gene might not even be the same gene, just a naming coincidence. I can see it has arisen again.  But if we expect to rely on Wikification projects like Gene Wiki for more and more, I think that would need to be addressed.
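To make the naming problem concrete, here is a small Python sketch. It assumes a hypothetical record layout of (species, symbol, note), and the two records are placeholders built from the MEF2/Mef2 example above, not real database entries. It shows how a lookup keyed on the symbol alone lumps the two species together, while a key that includes the species keeps them apart.

```python
from collections import defaultdict

# Hypothetical toy records: (species, gene symbol, placeholder note).
RECORDS = [
    ("Homo sapiens", "MEF2", "human entry"),
    ("Mus musculus", "Mef2", "mouse entry"),
]


def index_by_symbol(records):
    """Naive index on the case-folded symbol only, so species collide."""
    index = defaultdict(list)
    for species, symbol, note in records:
        index[symbol.upper()].append((species, symbol, note))
    return index


def index_by_species_and_symbol(records):
    """Index on (species, symbol) so same-named genes stay distinct."""
    return {
        (species, symbol.upper()): note
        for species, symbol, note in records
    }


naive = index_by_symbol(RECORDS)
qualified = index_by_species_and_symbol(RECORDS)

# Symbols that map to more than one species under the naive index.
ambiguous = {
    symbol: hits
    for symbol, hits in naive.items()
    if len({species for species, _, _ in hits}) > 1
}

print("Ambiguous symbols:", sorted(ambiguous))
print("Human only:", qualified[("Homo sapiens", "MEF2")])
```

The design point is simply that the species has to be part of the key. Capitalization conventions alone (MEF2 versus Mef2) are too fragile to carry that information through a wiki page title or a curated record.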

Gene Wiki?

PLoS Biology has an article out today entitled “A Gene Wiki for Community Annotation of Gene Function.” The article describes the authors’ attempts to create a comprehensive gene wiki of gene functions by ‘seeding’ Wikipedia with a foundation of ‘stub’ articles with information from existing databases (such as Entrez Gene). This foundation would then be built upon in Wikipedia fashion by community editing.



Bioinformatics and Genomics sometimes (always?) bring together two very different groups: biologists and computer scientists. They are often biologists who know something about computers and computer scientists who know something about biology, and sometimes they are computational biologists who do both. We (OpenHelix scientists) train biologists who want to use genomics tools that computational biologists (or a team of computer scientists and biologists) have developed. Sometimes those biologists want to do more and sometimes computer scientists need to learn a bit of biology. So, in that vein…


Web strolling finds

ScienceRoll links to a search engine for Radiology, links to a post about a “Gene Wiki” project, from which I re-find the excellent blog by Deepak Singh. From there I find this interesting resource: FreeBase, which is different from Wikipedia (it doesn’t have ‘articles’, it has stats), which reminds me of that Google project I mentioned earlier and leads me to GoogleBase.

It’s all about finding that info!