OK, that’s a really broad title for an important area, and it is a problem that we are starting to see addressed more and more by genome-wide association studies (GWAS). If you came here hoping that I had solved it, I’m sorry to disappoint you. We are asked all the time where to look for this kind of information. The relatively new “Phenotypes and Disease Association Tracks” on the human UCSC Genome Browser have been popular in the training sessions we have given (look at the human browser, and check the second group of track controls on the page). You can find OMIM data in Ensembl, and you can add Morbid Map to your Map Viewer. Another resource that I just found out about is trying to get at the same types of data, and it is available both from its own interface and on the HapMap browser.
MutaGeneSys is a tool with a very simple interface at its web site, but the data is also displayed as a track in the GBrowse genome browser at HapMap. The goal was to combine three things: HapMap information, OMIM, and whole-genome marker correlation data. The news page at HapMap that describes the addition of this track says:
Predicted OMIM associations available in GBrowse
The OMIM associations track presents data from the MutaGeneSys database, which links genotype data from HapMap and whole genome association studies with the known disease variants reported by the OMIM database. Example of a region with multiple OMIM associations: Chr1:194923128..194933127
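If you want to script against regions like the example one, a minimal sketch for parsing that GBrowse-style `Chr:start..end` string might look like this. The function names are my own, not part of HapMap or MutaGeneSys, and I am assuming the coordinates are 1-based and inclusive.

```python
import re

# Pattern for a "ChrN:start..end" region string; the "chr" prefix is optional
# and matched case-insensitively.
REGION_RE = re.compile(r"^(?:chr)?(\w+):(\d+)\.\.(\d+)$", re.IGNORECASE)


def parse_region(region):
    """Return (chromosome, start, end) from a 'ChrN:start..end' string.

    Hypothetical helper for illustration only.
    """
    match = REGION_RE.match(region.strip())
    if match is None:
        raise ValueError(f"Unrecognized region string: {region!r}")
    chrom = match.group(1)
    start = int(match.group(2))
    end = int(match.group(3))
    if start > end:
        raise ValueError(f"Start is after end in region: {region!r}")
    return chrom, start, end


def region_length(region):
    """Length in bases, assuming 1-based inclusive coordinates."""
    _, start, end = parse_region(region)
    return end - start + 1


if __name__ == "__main__":
    example = "Chr1:194923128..194933127"
    print(parse_region(example))
    print(region_length(example))  # 10000 under the inclusive assumption
```

So the example region in the announcement is a 10 kb window on chromosome 1, if the inclusive-coordinate assumption holds.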
I think it is important to assemble this type of data. But I found that there was less than I expected. The OMIM records are constructed in such a way that the genotype and phenotype information as displayed on HapMap wasn’t as clear as I had hoped. And maybe I was just expecting we already knew more than we do…but on the MutaGeneSys site they say:
MutaGeneSys currently contains 906 single-marker associations and 393 two-marker associations. These are specific to population, and genotyping technology and resolution. Single-marker associations also include the trivial associations of an OMIM SNP with itself.
So when I browsed around genes that I expected to have data at HapMap, the data wasn’t there. That has to do with the criteria used to create the three-part MutaGeneSys collection. Genes that I thought for sure would show known disease links didn’t have any OMIM data there.
But if you are like me, you want to know where all these resources are so you can check them out. If this is data you are interested in, you might also try GAD, the Genetic Association Database. The MutaGeneSys team says that their effort differs from GAD because GAD doesn’t use the individual genotype data that they do.
I like the idea, I know the need is there, and I’m sure it will grow over time. The MutaGeneSys paper was released fairly recently, and with more GWAS coming along every day I know we can get there. But I don’t think we are where I want to be just yet, I’m afraid.
Stoyanovich, J., Pe’er, I. (2008). MutaGeneSys: estimating individual disease susceptibility based on genome-wide SNP array data. Bioinformatics, 24(3), 440-442. DOI: 10.1093/bioinformatics/btm587