News bits: WikiGenes opportunity; HapMap data issue

OK, I’m back from Thanksgiving, catching up on email, and I found a couple of news items I wanted to pass along.

WikiGenes invitation to edit a Nature Genetics paper

Here’s an interesting “experiment” I was notified about: you could potentially get authorship on a Nature Genetics paper if you contribute to the development of the draft article. Here’s the email I got from the WikiGenes mailing list, but click over for more details and to see the article itself. I haven’t had time to check it all out yet, but as there is a deadline I wanted to mention it now.

Dear Mary,
The editor of Nature Genetics has commissioned a collaborative standards paper on Genome Wide Association Studies. An editable draft of this paper is now online at WikiGenes, http://www.wikigenes.org/GWAS.html?wpc=12

I hope this is an interesting opportunity for you, because significant contributions to this draft might get you a co-authorship on the final paper in Nature Genetics.

I would also like to use this occasion to ask you a favor.

If you like WikiGenes, please tell your friends about it. We do not have the budget of big publishers, so we depend fully on word-of-mouth publicity.

Or you could also help us by linking to WikiGenes from your website. Thank you!

Best of science,

Robert Hoffmann, PhD
Branco Weiss Fellow
WikiGenes – Evolutionary Knowledge

HapMap data in the HaploView tool

This came across from the HapMap mailing list. We tell people about using HapMap + HaploView all the time, so I wanted to mention this possible issue with some of the data:

Dear HapMap users,

Recently, there are several questions about Haploview data format errors when users tried to analyze HapMap release 28 data.  The current Haploview version (4.2) does not recognize the new individuals in release 28 and the software will generate an error similar to “Hapmap data format error: NA18876” when trying to open the data.

Haploview is developed and maintained by an organization different from HapMap.  Please contact Haploview help desk (haploview@broadinstitute.org) for questions specific to this software.


Hua Zhang, Ph.D.
dbSNP Group
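
If you want to see which individuals in a downloaded file might trip this error before you open it in Haploview, here is a minimal Python sketch. It rests on assumptions that are mine, not from the email: that the header line of a HapMap genotype file has 11 metadata columns (rs#, alleles, chrom, and so on) ahead of the sample IDs, and that you have your own list of IDs that your Haploview version accepts. The `known` set and the shortened header below are made up for illustration.

```python
# Number of leading metadata columns (rs#, alleles, chrom, pos, ...) ASSUMED
# to precede the sample IDs in a HapMap genotype header line.
N_METADATA_COLUMNS = 11


def sample_ids(header_line, n_metadata=N_METADATA_COLUMNS):
    """Return the sample IDs found after the metadata columns of a header line."""
    return header_line.split()[n_metadata:]


def unrecognized_samples(header_line, known_ids, n_metadata=N_METADATA_COLUMNS):
    """List sample IDs in the header that are absent from known_ids, in file order."""
    return [s for s in sample_ids(header_line, n_metadata) if s not in known_ids]


# Hypothetical example: a shortened header and a made-up set of "known" IDs.
header = ("rs# alleles chrom pos strand assembly# center protLSID "
          "assayLSID panelLSID QCcode NA18501 NA18502 NA18876")
known = {"NA18501", "NA18502"}
print(unrecognized_samples(header, known))
```

This only flags IDs missing from whatever list you supply. It does not change the file or work around the Haploview limitation, so the Haploview help desk is still the place to go for a real fix.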

A biological wikis conference: NETTAB 2010

Boy, I wish I could go to this. I would love to know where and how the successes are happening in this arena.  From the Biocurator mailing list:

Joint NETTAB 2010 and BBCC 2010 workshop Biological Wikis

November 29 – December 1, 2010

Congress Center, University of Naples “Federico II”, Naples, Italy http://www.nettab.org/2010/

The joint NETTAB and BBCC 2010 workshop on “Biological Wikis” promises to be a great meeting for all researchers involved in the exploitation of wikis in biology.

Come and discuss your ideas and doubts with such scientists as Alex Bateman, Alexander Pico, Andrew Su, Dan Bolser, Robert Hoffmann, Thomas Kelder, Mike Cariaso, Adam Godzik, Luca Toldo and many other who, we hope, will join the workshop.

It’s a great chance to follow smart tutorials and lectures on WikiPathways, WikiGenes, Semantic Wiki, PDBWiki, Gene Wiki and a proficient use of Wikipedia.

See a list of keynote speakers and tutorials at http://www.nettab.org/2010/progr.html .

There still is time to submit abstracts for posters and software demonstrations until next October 17, 2010!

The complete Call is available on-line at http://www.nettab.org/2010/call.html .

Registration is open at http://www.nettab.org/2010/rform.html .

Register within next October 29, 2010 and take profit of early registration fees.

A reduction of 20 euro applies to all fees for members of ISCB and other societies and networks.

More reductions are foreseen for PhD students.

Further information is available at http://www.nettab.org/2010/ .

Looking forward to seeing you soon in Naples.

Paolo Romano

I’ll be watching for tweets and meeting reports, please!

Community Annotation; Beyond Reference Genomes

I’m catching up with some mailing lists and news and I came across this interesting tidbit from our friends in the GMOD community. We are huge supporters of curation by humans for a couple of reasons: 1) we know the quality of that information can be the best and it captures so much of the information biologists need beyond sequence info; and 2) some of us have done curation and we know that it’s underappreciated but far from trivial :) .

We’ve also followed attempts by various groups to get the wider community involved, with two goals: to get authoritative and active researchers to put in what they know, and to reduce the burden on the overwhelmed professionals.  There have been a variety of ambitious attempts to get people involved in curation. UCSC has a wiki they rolled out. Some journals required Wiki updates. Seeding Wikipedia with some information and encouraging community input has been attempted. Separate new wikis on some topics, like WikiPathways, have been started.  We have been “skeptical optimists” about how some of these would go. We understand the need, but we know that end users of data are busy, they don’t get any work-related credit or time to do this sort of thing, and sometimes they don’t understand the finer points of curation.  But we like to see how the efforts work out and we’d like to see success.

Well, I’ve seen some results from various efforts that you ought to see. The GMOD community recently had a Community Annotation meeting that brought several groups together to discuss their experiences and outcomes. I’m not going to give it away; you need to go read it. One group had a 90% success rate with a strategy they attempted!!  Some groups are using curation as a student project. Others report on things they’ve tried that haven’t had as much success. Anyway, it’s all very interesting to know what works and what doesn’t.  And what about communities that don’t have a MOD (model organism database)? They touched on issues around that too.

There was another meeting too that took on a separate topic: Post Reference Genome Tools.  The premise is this:

How are we going to visualize and exploit (or even cope with) the world three years from now, when small labs may be able to fully sequence 500 individuals or species (or more) in a month? How can we visualize and link together 500, 1000, or 10,000 genomes? Many existing tools assume a reference genome. Will a reference make sense in the future, or will it hold us back?

We know a lot about the volume of data that already exists, much of which so many people aren’t aware of.  As I was just saying the other day: the data’s not in the papers anymore. It’s in these databases and it’s up to you and me to find and deal with it.  But how will the data providers offer it to you? These folks are thinking about this, and it may alter the way you interact with the data.  Again, I’m not giving it away: go read the report.

Thanks to the GMOD community for doing these reports. They are nice to have, and for those of us who can’t be there they offer a really helpful look inside.

Quick links to the reports:

Community Annotation Satellite Meeting Report

Post Reference Genome Tools Satellite Meeting Report

GBrowse: http://gmod.org/wiki/Gbrowse

Tip of the Week: PLAN2L for Arabidopsis literature

For this tip of the week we look at a text-mining tool for the Arabidopsis literature, Plan2L, or PLant ANnotation to Literature.  It has a very straightforward interface for searching the paper space, and you can search with a variety of focal points: the bibliome as a whole, or with emphasis on interactions, regulation, cell cycle, and more.  The results offer links to the PubMed abstracts, along with tabular statistics on how often the term occurs in that area of focus.  Results are color-coded as a quick graphical guide to the data of interest: green indicates a positive score and likely relevance, while red marks results that are likely not relevant. Links to other resources including the BioCreative server, WikiGenes, iHOP and TAIR are provided as well.

The current emphasis for this resource is Arabidopsis, but it would be quite useful for other species too.  If you are interested in text mining the Arabidopsis literature, I would also encourage you to compare the results with the Textpresso installation at TAIR to see what you discover in a different text-mining interface.

Plan2L site: http://zope.bioinfo.cnio.es/plan2l/plan2l.html

For their recent paper on Plan2L see: http://www.ncbi.nlm.nih.gov/pubmed/19520768 or the full article freely available in PubMedCentral:  http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=19520768