We all know and love dbSNP, and DGV, and 1000 Genomes, and HapMap, and OMIM, and the couple dozen other variation databases I can think of off the top of my head. But even though there’s a lot of stuff out there, you never know what you aren’t seeing. What *isn’t* yet stored in those resources? One new consortium suggests that there’s a lot you aren’t seeing. And they aim to make it easier to collect variation data, curate it, visualize it, and have it all in one place. The resource they are constructing is called MutaDATABASE.
MutaDATABASE is a new effort to bring together a lot of variation information that is just not getting into existing databases as it should. The group is described as “a large consortium of diagnostic testing laboratories in Europe, the United States, Australia, and Asia.” In their Nature Biotechnology correspondence they describe many of the barriers to depositing new variants in databases. Among them are lack of incentive (or lack of pressure from publishers and other organizations), difficult software interfaces for submission, privacy concerns in medical testing situations, and some desire to withhold novel variations as intellectual property. Software alone can’t overcome all of these issues, but they aim to try.
The structural organization of the consortium and contributor community that they wish to develop is described in this slide, which is like Figure 1 in the publication:
So there is a group of MutaAdministrators who oversee the project as a whole (this name makes me giggle a little bit, like something a sci-fi government might be called…). There are MutaCurators who assemble and review data on a given gene (is it really just genes? What about non-genic regions and large deletions and such? This isn’t entirely clear to me). Clinicians can give input into the curation, and MutaCircles are groups of labs that do diagnostic testing for a gene and can also discuss, submit, and evaluate data. The MutaCurator acts as the gatekeeper and is accountable for what finally appears.
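To make the curator-as-gatekeeper idea concrete, here is how I picture the flow, as a rough Python sketch. To be clear, this is my own toy model and not anything from the MutaDATABASE or MutaReporter software: the class names, fields, and the `submit`/`approve` methods are all hypothetical, and the variant, lab, and curator names are made-up placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Submission:
    """A variant report sent in by a diagnostic lab (hypothetical structure)."""
    gene: str
    variant: str        # placeholder label, not real variant data
    submitted_by: str   # name of the submitting lab
    approved: bool = False


@dataclass
class GeneRecord:
    """Toy model of one gene-specific collection with a single curator."""
    gene: str
    curator: str
    pending: list = field(default_factory=list)
    published: list = field(default_factory=list)

    def submit(self, submission: Submission) -> None:
        # Labs testing this gene can submit; nothing is public at this point.
        if submission.gene != self.gene:
            raise ValueError("submission is for a different gene")
        self.pending.append(submission)

    def approve(self, submission: Submission, reviewer: str) -> None:
        # Only this gene's curator can move a submission to the public list.
        if reviewer != self.curator:
            raise PermissionError("only the gene's curator can approve")
        self.pending.remove(submission)
        submission.approved = True
        self.published.append(submission)


record = GeneRecord(gene="L1CAM", curator="curator_a")
report = Submission(gene="L1CAM", variant="example-variant-1", submitted_by="lab_x")
record.submit(report)
record.approve(report, reviewer="curator_a")
print(len(record.pending), len(record.published))
```

The only point of the sketch is that a submission sits in a pending pile until the one accountable person for that gene signs off; how the real system handles review, discussion, and clinician input is not something I can tell from the outside.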
The gene-specific collections will be freely available online in their database, and link to disease/phenotype information associated with those variations as well. In the tip-of-the-week movie I’ll show you an example of how you might expect a gene record to look when it’s been filled out to some extent.
MutaReviews is a separate aspect that they describe this way on the web site:
MutaREVIEWS is a new “Gene review journal” published only online, which is freely available to all users. It consists of a compilation of gene review studies that describe the most common human disease genes in a standardised way and lists all observed gene variants. The variants include monogenic variants with high penetrance, rare variants with reduced penetrance and polymorphisms without clinical significance. Each gene review is edited by a specific MutaCURATOR for that gene. These gene reviews are updated every 6 months. There are 12 issues per year.
The project is certainly in its early stages. A lot of the genes I checked just haven’t been curated yet, and I understand that. I hope it works out: I do like the organization and structure, and a one-stop shop would be handy. But the “build a platform and they will come and curate” approach has had mixed success elsewhere in biology. And some of the things that need to happen for this to take off involve philosophical or possibly legal barriers that are going to vary quite a bit across the research and genetic testing world.
One thing I’d like to see them do is permit and encourage citizen science curation, both by people who have adopted personal genomics and are looking at their data, and by disease community groups who have a specific interest in these genes but often face even more barriers to contribution than researchers do. I’ve found stuff from my genome scan that I don’t really have any place to take, and there’s no way to supplement records at that provider’s site as far as I know. But maybe that’s another variation project somewhere….
Anyway, have a look at MutaDATABASE and see what you think. Or if you participate in this project and I’ve not got some part of this right, drop a note in the comments. I know it’s early in the project and I may not have all the finer points in hand from my looking around and reading.
MutaDATABASE: http://www.mutadatabase.org/ (freely available online database with the variation content)
The sample gene that’s well filled out: http://www.mutareporter.org/mutareporter/Mutadatabase.html?showgene=L1CAM#
MutaReporter: http://www.mutabase.com/index.php?option=com_content&view=article&id=48&Itemid=54 (requires a license and user subscription; but supposedly MutaDATABASE will have a submission function that does not require this software specifically, if I understood that correctly)
MutaBASE: http://www.mutabase.com (a company associated with the MutaReporter software; we have no relationship with that company)
Bale, S., Devisscher, M., Van Criekinge, W., Rehm, H., Decouttere, F., Nussbaum, R., den Dunnen, J., & Willems, P. (2011). MutaDATABASE: a centralized and standardized DNA variation database. Nature Biotechnology, 29 (2), 117-118. DOI: 10.1038/nbt.1772