Tag Archives: OMIM


Video Tip of the Week: ClinGen, The Clinical Genome Resource

The sequence data tsunami begins to crash into the shore, at the feet of clinicians and patients who want answers and treatment directions. But sometimes the tsunami is washing in debris. As the amount of sequence and variation information grows, some of it comes without clear evaluations of the impacts. Some of it comes with conflicting information. And some of it comes in wrong.

Attempting to wrangle the information into useful understanding and treatments with standardized descriptions, the team building the ClinGen resources published a paper last week that details their efforts. The paper describes their history and goals, and how they are moving to get to a point where they have useful information for and from patients, their doctors, testing labs, and researchers. Because of the different needs of different groups, there are several moving parts to the overall ClinGen collection.

In addition to the paper–and several related articles in this NEJM special report–there are videos on their site that tackle different aspects of the ClinGen projects. I’m going to highlight one of them here as the Tip of the Week, but you should also check out the others that are available on their webinars page or their YouTube channel. This video shows the Dosage Sensitivity Map features.

This video provides some of the history and framework for the ClinGen efforts, and then introduces one of the tools that they have made available, a dosage sensitivity map. This piece focuses on “evidence based reviews of dosage sensitivity”: they indicate haploinsufficiency for losses of regions, and triplosensitivity for duplications of regions. They describe a scoring system they use to rank structural variations (CNVs, SVs), and their curation of the evidence to support or to refute dosage sensitivity. They also note that their process is conservative, and you should keep that in mind as you consider their team’s review of the evidence. But they are definitely open to and interested in feedback, and they hope you will contact them if you have a different understanding from their posted evaluations.

To follow along with the video, use this site to explore the features of this part of the ClinGen tool set: http://www.ncbi.nlm.nih.gov/projects/dbvar/clingen/. You can also just click their example genes–for instance, the ZEB2 link shows you a typical page with the score information, links to other resources, and a genome viewer right on the page. Or you can choose to look at external browsers at NCBI, Ensembl, or UCSC. I clicked the UCSC Genome Browser one to see how it displayed, and it automatically presents tracks with the relevant ClinGen data loaded.

In other tips I’ll talk about other pieces of the infrastructure that they are building or coordinating with. Some we’ve talked about before–you can see a previous tip that included the ClinVar resource at NCBI that is foundational to the ClinGen suite and is discussed in their paper as well. They also note the importance of the data from OMIM, and how their mutual efforts are providing important feedback loops to be alerted to needed updates.  ClinGen also employs the Human Phenotype Ontology that keeps coming up at OpenHelix lately. Another important piece to this is the standards for naming variants that were recently described by the American College of Medical Genetics and Genomics (paper linked below).

ClinGen, and the various component tools within, are worth looking at, and contributing to, as we try to move more and better information to the clinic for patients and doctors to use effectively. Steven Salzberg has a take on the value of ClinGen here: 17% Of Our Genetic Knowledge Is Wrong.

It’s also very possible that some really important things will happen in the database–new submissions, changes to the status of a variant–that will occur before any papers come out about it. Or it is even possible that a paper never will come out about it. Spend some time learning about the features; I think it will be worth the time.

Quick links:

ClinGen overall project: http://clinicalgenome.org/

ClinVar: http://www.ncbi.nlm.nih.gov/clinvar/

ClinGen Dosage Sensitivity Map: http://www.ncbi.nlm.nih.gov/projects/dbvar/clingen/


Rehm, H., Berg, J., Brooks, L., Bustamante, C., Evans, J., Landrum, M., Ledbetter, D., Maglott, D., Martin, C., Nussbaum, R., Plon, S., Ramos, E., Sherry, S., & Watson, M. (2015). ClinGen — The Clinical Genome Resource New England Journal of Medicine DOI: 10.1056/NEJMsr1406261

Richards, S., Aziz, N., Bale, S., Bick, D., Das, S., Gastier-Foster, J., Grody, W., Hegde, M., Lyon, E., Spector, E., Voelkerding, K., & Rehm, H. (2015). Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology Genetics in Medicine, 17 (5), 405-423 DOI: 10.1038/gim.2015.30

Friday SNPpets

This week’s SNPpets include an interactive timeline of cancer research, NIH’s statement on cloud-stored genomics data, Facebook and genomics data, hilarious typos in some published equations, and more….

Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

Video Tip of the Week: Nowomics, set up alert feeds for new data

Yeah, I know you know. There’s a lot of genomics and proteomics data coming out every day–some of it by the traditional publication route, some of it not–and it’s only getting harder to wrangle the useful information and pull the signal from the noise. I can remember when merely looking through the (er, paper-based) table of contents of Cell and Nature would get me up to speed for a week. But increasingly, the data I need isn’t even coming through the papers.

Like everyone else, I have a variety of strategies to keep notified of different things I need to see. I use the MyNCBI stored searches to keep me posted on things that come via the NCBI system. I signed up for the new OMIM “MIM-Match” service as well. But there’s still a lot of room for new ways to collect and filter new data and information. Today’s tip focuses on a service to do that: Nowomics. This is a freely available tool to help you keep track of important new data. Here’s a quick video overview of how to see what’s going on with Nowomics.

The goal of Nowomics is to offer you an actively updated feed of relevant information on genes or topics of interest, using text mining and ontology term harvesting from a range of sources. What makes them different from MyNCBI or OMIM is the range and types of data sources they use. The user sets up some genes or Gene Ontology terms to “follow”, and the software regularly checks for changes in the source sites. You can go in and look at your feed, you can filter it for different types of data, and you can see what’s new (“latest”) or what’s being hotly chattered about (“popular”) using Altmetric strategies. For example, here’s a paper that people seemed to find worth talking about, based on the tweets and the Mendeley occurrences.

This tool is in early stages of development–if there are features you’d like to see or other sources you think would be useful, the Nowomics team is eager for feedback. You can find a link to contact them over at their site, or locate them on Facebook and Twitter. You can also learn more from their blog, and more about the philosophy and foundations of Nowomics from their slide presentation below.


Quick links:

Nowomics: http://nowomics.com/

Example gene feed: http://nowomics.com/gene/human/BRCA2


Acland A., T. Barrett, J. Beck, D. A. Benson, C. Bollin, E. Bolton, S. H. Bryant, K. Canese, D. M. Church & K. Clark (2014). Database resources of the National Center for Biotechnology Information, Nucleic Acids Research, 42 (D1) D7-D17. DOI: http://dx.doi.org/10.1093/nar/gkt1146

Online Mendelian Inheritance in Man, OMIM®. McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University (Baltimore, MD), July 22 2014. World Wide Web URL: http://omim.org/

Video Tip of the Week: list of genes associated with a disease

I am currently in Puerto Varas, Chile at an EMBO genomics workshop. The workshop is mainly for grad students and the instructors are, for the most part, alumni of the Bork group. I gave a tutorial on genomics databases.

Anyway, the last two days of the workshop are a challenge: in teams of 3-4 advised by an instructor, students are to develop a list of genes associated with epilepsy. Obviously, this could be a trivial task: just go to OMIM or GeneCards and grab a list. But this challenge requires them to go beyond that, use the available data, and make predictions. My team attempted, on my suggestion, some brainstorming techniques to ensure a more creative solution than they could come up with individually or by just jumping into normal group dynamics. It seemed to work: their solution was quite creative, and we will find out today how it fared.

That was my long way of saying that, in the process, we came across many databases of gene-disease information. Above you will find a video of rat gene-disease associations from RGD, which are of course often used to investigate human gene-disease associations.

Below you will find a list of some excellent databases and resources to find similar lists:

Genetic Association Database  http://geneticassociationdb.nih.gov/

G2D http://g2d2.ogic.ca

OMIM http://www.omim.org

Diseases http://diseases.jensenlab.org/

GeneCards http://genecards.org

DisGeNET http://ibi.imim.es/web/DisGeNET/

Several NCBI resources http://www.ncbi.nlm.nih.gov/guide/howto/find-gen-phen/

UCSC Genome Browser’s tracks for disease and phenotype http://genome.ucsc.edu

There are several others, I’m sure; if you have a favorite not on this list, please comment.
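If you do pull lists from several of these resources, one simple first step is to count how many sources support each gene. Here is a minimal sketch of that idea in Python. The source names and gene symbols are made-up placeholders, not real exports from any of the databases above, and real downloads would need their own parsing first.

```python
from collections import Counter


def rank_genes_by_support(gene_lists):
    """Return (gene, number_of_sources) pairs, best-supported genes first.

    gene_lists maps a source name to an iterable of gene symbols.
    """
    counts = Counter()
    for genes in gene_lists.values():
        # Normalize symbols and de-duplicate within a source,
        # so that each source "votes" at most once per gene.
        counts.update({g.strip().upper() for g in genes if g.strip()})
    # Sort by descending support, then alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical placeholder lists, standing in for per-database downloads.
example_lists = {
    "source_a": ["GENE1", "GENE2", "gene3"],
    "source_b": ["GENE2", "GENE3", "GENE4"],
    "source_c": ["GENE2", " GENE4 "],
}

ranked = rank_genes_by_support(example_lists)
for gene, support in ranked:
    print(gene, support)
```

Counting sources is only a crude ranking, since the databases overlap in the evidence they draw on, but it gives a quick way to see which genes turn up everywhere and which appear in only one place.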

Reference for RGD:
Laulederkind S.J.F., Hayman G.T., Wang S.J., Smith J.R., Lowry T.F., Nigam R., Petri V., de Pons J., Dwinell M.R. & Shimoyama M. (2013). The Rat Genome Database 2013–data, tools and users, Briefings in Bioinformatics, 14 (4) 520-526. DOI:

Video Tips of the Week, Annual Review 2013 (part 1)

As you may know, we’ve been doing these video tips-of-the-week for six years now. We have completed or collected around 300 little tidbit introductions to various resources through this past year, 2013. At first we had to do all of our own video intros, but as the movie technology became more accessible and more teams made their own, we were able to find a lot more that were done by the resource providers themselves. So we began to collect those as well. At the end of the year we’ve established a sort of holiday tradition: we are doing a summary post to collect them all. If you have missed any of them it’s a great way to have a quick look at what might be useful to your work.

You can see past years’ tips here: 2008 I, 2008 II, 2009 I, 2009 II, 2010 I, 2010 II, 2011 I, 2011 II, 2012 I, 2012 II, 2013 II (next week).

Annual Review VI:

January 2013:
January 2: Annual Review V part deux
January 9: The New and Improved OMIM®
January 16: InSilico DB
January 23: ZooBank and species nomenclature
January 30: ScienceGameCenter #edtech

February 2013:
February 6: MotifLab workbench for TFBS analysis
February 13: UCSC Genome Browser restriction enzyme display
February 20: ENCODE Data at UCSC (reminder)
February 27: NetGestalt

March 2013:
March 6: NCBI Genomics Workbench
March 13: FlyBase
March 20: figshare + GenoCAD = outreach
March 27: Enzyme Portal and User-Centered Design

April 2013:
April 3: Phytozome and the Peach Genome
April 10: Introductory Cheminformatics
April 17: Sharing H7N9 data at GISAID.org with EpiFlu™
April 24: Cancer Atlas roadmap

May 2013:
May 1: My Cancer Genome
May 8: Transfac (and HGMD, Proteome, etc)
May 15: Influenza Research Database (IRD)
May 22: Canary Database for sentinels of human health
May 29: QIIME for Quantitative Insights Into Microbial Ecology

June 2013:
June 5: Prezi and other nonlinear presentation methods
June 12: TrioVis for family genome data sets
June 19: ENCODE ChIP-Seq Significance Tool
June 26: InnateDB, Systems Biology of the Innate Immune Response

Video Tip of the Week: Mobile-device enabled tutorial suites

For a decade now we’ve been offering our video tutorial suites to help people learn how to use bioinformatics resources. We’ve used a couple of delivery platforms, and we’ve changed the website a few times. But we also know that people like consistency with software, and if there are going to be major changes to the behavior of something, there had better be a good reason.

We have a good reason. With the rise of mobile devices and the increasing use of them by students, our subscribers wanted us to make watching the tutorials on iPads and Androids and Surfaces more friendly. So we’re doing it.

This week’s video tip demonstrates the change to our tutorial movies that we’re rolling out. The basics are the same–each video offers details about how to use the software features at some database or tool site. We explain the display features, and the search mechanisms. We offer the video as well as the slides and some exercises to use. The only thing we’ve changed is the menu and controller options. The YouTube video here illustrates that.

So soon when you launch a tutorial video, you will have to swipe over the edges to access the menus and the slider. You can still click individual chapters, or move ahead with the controller. But those items move out of the way when you aren’t using them.

Everything else is the same. The landing pages for each tutorial suite will still have the launch buttons for all the items you need to access everything.

For subscribers, all of the suites will have this new functionality. If your site doesn’t have a subscription, you can still try it out on our sponsored training suites, such as: GenoCAD, OMIM, UCSC Genome Browser, or anything else from the “free” tutorials page: http://openhelix.com/free .

To learn more about our philosophy of training materials, you can check out our paper (below). Regular readers may already understand what we do, but if you are accessing these for the first time it might help you to know more about what we offer and how we do it.

Let us know if you have any issue with the new interface and we’ll take a look right away.

Quick link:

Free tutorials to try out: http://openhelix.com/free


Williams J.M., Mangan M.E., Perreault-Micale C., Lathe S., Sirohi N. & Lathe W.C. (2010). OpenHelix: bioinformatics education outside of a different box, Briefings in Bioinformatics, 11 (6) 598-609. DOI:

Video Tip of the Week: the new and improved OMIM®

For this week’s Tip of the Week we highlight our new tutorial on OMIM, Online Mendelian Inheritance in Man. If you haven’t looked at OMIM for a while, or if you usually only think about it as a link in some other database you use, look again. There’s more there than you realize.

OMIM is one of the first online tools I became aware of way back in my career. That shared Mac in the back of the lab, with its teeny little screen–and accessing the link to OMIM from that NCBI interface–remember that old interface? Even then OMIM was a venerable resource with an unmatched collection of human genes, traits, and phenotype data. There was a great paper that Victor McKusick wrote about his own career and his work, in which he recounts the beginnings of his human gene information collection and many other aspects of the human genetic knowledge realm. It’s a fascinating look at one guy’s path and the influences that led us to where we are today. But here’s the short history of OMIM as a computational resource:

Mendelian Inheritance in Man has been maintained on the computer since 1964. With the first print edition in 1966, it was a pioneer in computer-based publication. In the 1980s, MIM was prepared for online presentation, with a search engine that enhanced its usefulness. Online access, as OMIM, was provided generally beginning in 1987, first from the Welch Medical Library at Johns Hopkins and since December 1995 from the National Center for Biotechnology Information (NCBI) of the National Library of Medicine (27).

Because of how long OMIM® has been around and its utility and depth, it’s been incorporated into probably almost every bioinformatics resource you use around the world. I love the UCSC Genome Browser track option that you can turn on to supplement your look at genomic regions and quickly find disease-causing genes, for example. But just seeing a link to OMIM doesn’t give you the full scope of understanding of the features it offers. With the move away from the NCBI site, the OMIM team changed their interface quite a bit to offer a lot more features than they were able to before. New links to appropriate resources have been added. New ways to integrate knowledge have been provided.

Have a fresh look at OMIM today, and you can also download slides to use and exercises to perform to help you perform more complex searches and exploration of the wealth of data they have.

The training materials are freely available because they are sponsored by the OMIM team, who worked with us to create them. They are free for a limited time, so check them out soon.

Quick links:

OMIM tutorial: http://openhelix.com/OMIM

OMIM main site: http://www.omim.org (but note there are mirror sites in the US and Europe that may be good options)


McKusick, V. (2006). A 60-Year Tale of Spots, Maps, and Genes Annual Review of Genomics and Human Genetics, 7 (1), 1-27 DOI: 10.1146/annurev.genom.7.080505.115749

Amberger, J., Bocchini, C., & Hamosh, A. (2011). A new face and new challenges for Online Mendelian Inheritance in Man (OMIM®) Human Mutation, 32 (5), 564-567 DOI: 10.1002/humu.21466

Friday SNPpets

Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

What’s the Answer? (OMIM API now available)

BioStar is a site for asking, answering and discussing bioinformatics questions. We are members of the community and find it very useful. Often questions and answers arise at BioStar that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those questions and answers here in this thread. You can ask questions in this thread, or you can always join in at BioStar.

This question was raised in the last month, and there was some discussion–but there’s been a big change in the options since, and I wanted to highlight that so people know:

Question: Is OMIM no longer available as structured data?

One of our project used to query OMIM data as XML through NCBI’s efetch utility, as described here for example:

“What is the best way to interact programmatically with OMIM?”

However, it seems the service has stopped functioning a few months ago. It now simply returns the following error:

Database: omim – is not supported

I can find no mention of an update to the API on NCBI’s website or anywhere else. At the same time, the pages accessible directly on OMIM’s website offer no link to structured data (XML or otherwise) and the downloadable file, while using some specific format to delimit fields, is still far from the flexibility of the former XML files (for example, it is impossible to retrieve metadata for each reference).

Is there currently any way to regain access to OMIM data in a structured, parsable format (XML…)?

There was a lot of discussion about the changes since OMIM moved away from NCBI to its new spot at OMIM.org, but just this week I spotted a tweet from OMIM which directly answers the access issue now:

If you’ve been waiting for the OMIM API access, check it out at the Help page.
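To give a rough idea of what structured access might look like, here is a minimal Python sketch that only builds a request URL for a single entry; it does not contact any server. The base URL, the `entry` path, and the `mimNumber`, `format`, and `apiKey` parameter names are my assumptions about how the OMIM API is laid out, and the key is a placeholder, so confirm all of them against the OMIM Help page before relying on this.

```python
from urllib.parse import urlencode

# Assumption: base URL of the OMIM API; confirm on the OMIM Help page.
OMIM_API_BASE = "https://api.omim.org/api"


def build_omim_entry_url(mim_number, api_key, fmt="json"):
    """Build (but do not send) a request URL for one OMIM entry.

    The path and parameter names are assumptions for illustration only.
    """
    params = {
        "mimNumber": str(mim_number),  # assumed parameter name
        "format": fmt,                 # assumed: e.g. "json" or "xml"
        "apiKey": api_key,             # placeholder; a real key comes from OMIM
    }
    return f"{OMIM_API_BASE}/entry?{urlencode(params)}"


url = build_omim_entry_url(100100, "YOUR_API_KEY")
print(url)
```

Keeping the URL construction in one small function means that if the real parameter names differ, there is only one place to change.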

(one) Video Tip of the Week (to hold them all): Variation and Disease Databases

After again reading Daniel MacArthur’s good rundown about the state of databases of human disease-causing variation from last year (One database to hold them all), I thought it might be nice to do a tip comparing several of them. I couldn’t get it under our self-imposed 5 minute limit for our tips (and technical limit of software I’m using, but that’s about to change). But as I perused our tips and other sites, I found we and others have quite a list of how-to tips to use these databases. So in today’s tip I’ve gathered video tips for 3 of the databases listed in the linked post. Below those tips I’ll link to other how-to videos for additional human variation and disease.

The databases mentioned are OMIM, Human Gene Mutation Database (HGMD), MutaDATABASE and The Human Variome Project. There are video tips for the first three.


Last year OMIM moved to http://www.omim.org and got an entirely new interface. Mary was on top of it and did a tip on the new OMIM interface, with lots of information on the move and on OMIM in the post:

Our full tutorial on the new OMIM is coming soon.

HGMD has a public site and a by-subscription site. The latter includes access to the most current data and some added features. The publicly accessible site is out-of-date by three years. Because of HGMD restrictions, we aren’t able to do a tutorial or a tip on HGMD, but they do have an introduction video to their database:


Additionally, there is a good background page for more information.


Mary did a tip on MutaDatabase last summer:


Another excellent resource is Gen2Phen. The Gen2Phen project “aims to unify human and model organism genetic variation databases towards increasingly holistic views into Genotype-To-Phenotype (G2P) data, and to link this system into other biomedical knowledge sources via genome browser functionality.”  In that vein, they have quite an extensive list of Locus-specific databases and additional resources.

There are several other resources available for human disease variation including CGAP, dbGaP, GAD, PhenomicDB and several others. We have tutorials on all of those if you wish to check them out.

Of course there’s dbSNP :D of which we have a tutorial and tip about searching human variation.

You can find an extensive list of other resources at Human Genome Variation Society (HGVS).

And an oft-asked question on Biostar is what kind of resources are there for this kind of data. You can find answers here, here and here.