Tag Archives: ontology

Video Tip of the Week: PhenogramViz for evaluating phenotypes and CNVs

As I’ve mentioned before, once I start looking over some new tools I’m often led to others in the same arena that offer related but different features. That’s what happened when I looked at the Proband iPad app for human pedigrees. I noted that they are using important community standards, and I decided to follow those threads a bit. That led me to last week’s tip, the Human Phenotype Ontology (HPO).

HPO has been around for a while and I’ve been aware of it, but this recent re-investigation made me realize how mature it has become, and I was impressed with the amount of adoption there’s been in the genomics community in the big projects. But it also led me to some new tools that I hadn’t encountered before. This week’s tip highlights PhenogramViz–combining my appreciation for controlled vocabularies, standards, and data visualization.

The PhenogramViz team illustrates how they analyze and visualize gene-phenotype relationships

Here’s how the PhenogramViz team describes their tool:

A tool that automatically analyses and visualizes gene-to-phenotype relations for a set of genes affected by CNV of a patient and a set of HPO-terms representing the symptoms of said patient. The tool makes full use of the cross-species phenotype ontology “uberpheno” (see here).

So if you have a patient with copy-number variations in their genome, you may be able to use this tool to find the genes in that CNV segment that are associated with the patient’s phenotypes. The goal–as stated in their paper linked below–is to assist with the clinical interpretation of the genome alterations.
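To make the idea concrete, here is a minimal sketch of the two basic steps: finding the genes that overlap a CNV interval, then checking which of them are annotated with any of the patient's phenotype terms. This is not PhenogramViz's actual method or scoring (the paper linked below describes that). The gene names, coordinates, and `PHENO:` term IDs are all made up for illustration, and only exact term matches are considered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    symbol: str
    chrom: str
    start: int
    end: int
    phenotypes: frozenset  # placeholder term IDs annotated to this gene


def genes_in_cnv(genes, chrom, cnv_start, cnv_end):
    """Return genes whose interval overlaps the CNV on the same chromosome."""
    return [
        g for g in genes
        if g.chrom == chrom and g.start <= cnv_end and g.end >= cnv_start
    ]


def shared_phenotypes(genes, patient_terms):
    """Map gene symbol -> patient terms the gene is annotated with (exact matches)."""
    patient_terms = set(patient_terms)
    hits = {g.symbol: sorted(g.phenotypes & patient_terms) for g in genes}
    # Drop genes that share no term with the patient.
    return {symbol: terms for symbol, terms in hits.items() if terms}


# Hypothetical toy data: three genes, one CNV on chr1, two patient terms.
GENES = [
    Gene("GENE_A", "chr1", 100, 500, frozenset({"PHENO:0001", "PHENO:0002"})),
    Gene("GENE_B", "chr1", 800, 1200, frozenset({"PHENO:0003"})),
    Gene("GENE_C", "chr2", 100, 500, frozenset({"PHENO:0001"})),
]

candidates = genes_in_cnv(GENES, "chr1", 400, 900)
result = shared_phenotypes(candidates, ["PHENO:0001", "PHENO:0003"])
print(result)
```

A real analysis would also follow the ontology hierarchy and the cross-species annotations rather than relying on exact matches, which is where the tool earns its keep.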

The additional layer of this effort that I find useful is that they use another ontology to take this even further for supporting information. They employ the “Uberpheno” cross-species phenotype ontology to find further details in model organisms.

I’ll let you get a sense of how this works with one of their tutorial videos from their YouTube channel. They have others too–which will help you with everything from installation to analyses. I’ll embed the one that shows how you start with a list of patient symptoms or phenotypes, then load the CNVs or genes, and then click from the results list to graphical representations of the gene-phenotype relationships. Then with the Cytoscape tools you can interact with the “phenograms” in more detail. There’s no sound, but you can read the guidance in the callouts.

The videos include some abbreviations–like HPO. That’s why I talked last week about the Human Phenotype Ontology. I was prepping you for this one. And in another video (Prioritization of pathogenic CNVs) they reference the scoring strategies, which are explained further in their paper linked below (the Journal of Medical Genetics one). I would spend some time looking over how the scoring and ranking happens to understand what’s shown.

Although the focus of this is using the data for human diagnosis, I think it could also help researchers to choose a more appropriate animal model for further testing. There are lots of complaints about the unsuitability of animal models for a range of subjects–but refining those choices would also be a huge benefit. Saving resources by helping to choose the right animal model would be another worthwhile use of this tool.

Check out PhenogramViz as a bridge between genomic segments and possible phenotypes. You can try it yourself with sample files they have available on their landing page.

Quick links:

PhenogramViz: http://compbio.charite.de/contao/index.php/phenoviz.html

Cytoscape: http://cytoscape.org/

References:

Köhler, S., Doelken, S., Mungall, C., Bauer, S., Firth, H., Bailleul-Forestier, I., Black, G., Brown, D., Brudno, M., Campbell, J., FitzPatrick, D., Eppig, J., Jackson, A., Freson, K., Girdea, M., Helbig, I., Hurst, J., Jahn, J., Jackson, L., Kelly, A., Ledbetter, D., Mansour, S., Martin, C., Moss, C., Mumford, A., Ouwehand, W., Park, S., Riggs, E., Scott, R., Sisodiya, S., Vooren, S., Wapner, R., Wilkie, A., Wright, C., Vulto-van Silfhout, A., Leeuw, N., de Vries, B., Washington, N., Smith, C., Westerfield, M., Schofield, P., Ruef, B., Gkoutos, G., Haendel, M., Smedley, D., Lewis, S., & Robinson, P. (2013). The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data. Nucleic Acids Research, 42 (D1) DOI: 10.1093/nar/gkt1026

Köhler S., Doelken S.C., Ruef B.J., Bauer S., Washington N., Westerfield M., Gkoutos G., Schofield P., Smedley D., Lewis S.E., et al. (2013). Construction and accessibility of a cross-species phenotype ontology along with gene annotations for biomedical research. F1000Research, PMID: http://www.ncbi.nlm.nih.gov/pubmed/24358873

Köhler, S., Schoeneberg, U., Czeschik, J., Doelken, S., Hehir-Kwa, J., Ibn-Salem, J., Mungall, C., Smedley, D., Haendel, M., & Robinson, P. (2014). Clinical interpretation of CNVs with cross-species phenotype data Journal of Medical Genetics, 51 (11), 766-772 DOI: 10.1136/jmedgenet-2014-102633

Shannon P., Markiel A., Ozier O., Baliga N.S., Wang J.T., Ramage D., Amin N., Schwikowski B. & Ideker T. Cytoscape: a software environment for integrated models of biomolecular interaction networks., Genome research, PMID: http://www.ncbi.nlm.nih.gov/pubmed/14597658

Video Tip of the Week: Human Phenotype Ontology, HPO

Typically, our Tips-of-the-Week cover a specific software tool or feature that we think readers might like to try out. But this week’s tip is a bit different. It’s got a conceptual piece that is important, and it also references several software tools that work with this crucial concept to enable interoperability of many tools, helping us link different data types in a common framework.

Conceptually, the Human Phenotype Ontology (HPO) is much like other controlled vocabulary systems you may have used in genomics tools–like Gene Ontology, Sequence Ontology, or others that you might find at the National Center for Biomedical Ontology. We’ve covered the idea of broad parent terms, increasingly precise child terms, and standard definitions in tutorial suites. It’s important to standardize and share the same language to describe the same things among different projects and software providers. And as we move more genomics to the clinic, sharing descriptors for human phenotypes and conditions will be crucial.
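If the parent/child idea is new to you, a tiny sketch may help. It stores "is a" links from each child term to its parents and walks upward to collect every broader term. The term labels are toy examples loosely modeled on phenotype vocabulary, not copied from an actual HPO release, and the `PARENTS` table and function names are my own invention.

```python
# Toy "is a" hierarchy: each term maps to its direct parent terms.
# Labels are illustrative only, not taken from a real ontology file.
PARENTS = {
    "phenotypic abnormality": [],
    "abnormality of the eye": ["phenotypic abnormality"],
    "abnormality of the lens": ["abnormality of the eye"],
    "cataract": ["abnormality of the lens"],
}


def ancestors(term, parents=PARENTS):
    """Collect every broader term reachable by following parent links."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def is_a(term, candidate, parents=PARENTS):
    """True if `term` is the same as, or more specific than, `candidate`."""
    return term == candidate or candidate in ancestors(term, parents)


print(sorted(ancestors("cataract")))
print(is_a("cataract", "abnormality of the eye"))
```

The payoff of this structure is that a patient annotated with a precise child term can still be found by a search on any of its broader parent terms.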

The concepts and strategies are becoming mature at this point, and we now have lots of folks who agree and want to use these shared descriptors. A really nice overview of the state of phenotype descriptions and how to use them for discovery and for integration across many data resources was published earlier this year: Finding Our Way through Phenotypes. It also offers recommendations for researchers, publishers, and developers to support and use a common vocabulary.

For this week’s video, I’m highlighting a lecture by one of the authors of that paper, Peter Robinson. It’s a seminar-length video, but it covers the key conceptual features of the HPO, provides some examples of how it can be useful in translational research settings, and also describes the range of tools and databases that are using the HPO now. I think it’s worth the time to hear the whole thing. The audio is a bit uneven in parts, but you can get the crucial stuff.

The early part is about the concepts of specific terms, synonyms, and shared terms that can mean completely different things (think American football and European football). He describes the phenotype ontology. There are examples of research that leads to phenotypes that are then used as discovery and diagnostic tools. He talks about tools that utilize the HPO right now, including Phenomizer, for obtaining or exploring appropriate terms, and PhenIX (Phenotypic Interpretation of eXomes), for prioritization of candidate genes in exome sequencing data sets. There is also PhenoTips, which can help you to collect and analyze patient data (and also edit pedigrees).

Many large scale projects and key genomics tools employ the human phenotype ontology.

He also notes how tools like DECIPHER, NCBI Genetic Testing Registry, GWAS Central, and many more include the human phenotype vocabulary. It is a great sign for a project like this that it is being adopted by so many groups and tools world-wide. They’ve also worked with key large-scale projects in this arena to ensure that the vocabulary is suitable and workable, and they update it when needed. They credit OMIM and Orphanet as being crucial to their efforts as well. As part of the Monarch Initiative, there seems to be solid support going forward.

There are more tools to discuss, but I’m going to save those for another post. This one is already loaded with things you should check out, so be sure to come back for further exploration of the HPO-related tools and projects.

Quick links:

Human Phenotype Ontology: http://www.human-phenotype-ontology.org/

Phenomizer: http://compbio.charite.de/phenomizer/

PhenIX: http://compbio.charite.de/PhenIX/

PhenExplorer: http://compbio.charite.de/phenexplorer/

PhenoTips: https://phenotips.org/

Monarch Initiative: http://monarchinitiative.org/

References:
Deans A.R., Suzanna E. Lewis, Eva Huala, Salvatore S. Anzaldo, Michael Ashburner, James P. Balhoff, David C. Blackburn, Judith A. Blake, J. Gordon Burleigh, Bruno Chanet, Laurel D. Cooper, et al. (2015). Finding Our Way through Phenotypes. PLoS Biology, 13 (1) e1002033. DOI: http://dx.doi.org/10.1371/journal.pbio.1002033

Köhler, S., Doelken, S., Mungall, C., Bauer, S., Firth, H., Bailleul-Forestier, I., Black, G., Brown, D., Brudno, M., Campbell, J., FitzPatrick, D., Eppig, J., Jackson, A., Freson, K., Girdea, M., Helbig, I., Hurst, J., Jahn, J., Jackson, L., Kelly, A., Ledbetter, D., Mansour, S., Martin, C., Moss, C., Mumford, A., Ouwehand, W., Park, S., Riggs, E., Scott, R., Sisodiya, S., Vooren, S., Wapner, R., Wilkie, A., Wright, C., Vulto-van Silfhout, A., Leeuw, N., de Vries, B., Washington, N., Smith, C., Westerfield, M., Schofield, P., Ruef, B., Gkoutos, G., Haendel, M., Smedley, D., Lewis, S., & Robinson, P. (2013). The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data. Nucleic Acids Research, 42 (D1) DOI: 10.1093/nar/gkt1026

Köhler, S., Schulz, M., Krawitz, P., Bauer, S., Dölken, S., Ott, C., Mundlos, C., Horn, D., Mundlos, S., & Robinson, P. (2009). Clinical Diagnostics in Human Genetics with Semantic Similarity Searches in Ontologies The American Journal of Human Genetics, 85 (4), 457-464 DOI: 10.1016/j.ajhg.2009.09.003

Zemojtel, T., Kohler, S., Mackenroth, L., Jager, M., Hecht, J., Krawitz, P., Graul-Neumann, L., Doelken, S., Ehmke, N., Spielmann, M., Oien, N., Schweiger, M., Kruger, U., Frommer, G., Fischer, B., Kornak, U., Flottmann, R., Ardeshirdavani, A., Moreau, Y., Lewis, S., Haendel, M., Smedley, D., Horn, D., Mundlos, S., & Robinson, P. (2014). Effective diagnosis of genetic disease by computational phenotype analysis of the disease-associated genome Science Translational Medicine, 6 (252), 252-252 DOI: 10.1126/scitranslmed.3009262

Girdea, M., Dumitriu, S., Fiume, M., Bowdin, S., Boycott, K., Chénier, S., Chitayat, D., Faghfoury, H., Meyn, M., Ray, P., So, J., Stavropoulos, D., & Brudno, M. (2013). PhenoTips: Patient Phenotyping Software for Clinical and Research Use Human Mutation, 34 (8), 1057-1065 DOI: 10.1002/humu.22347

Video Tip of the Week: eGIFT, extracting gene information from text


eGIFT, as the tag line says, is a tool to extract gene information from text. It’s a tool that allows you to search for and explore terms and documents related to a gene or set of genes. There are many ways to search and explore eGIFT: find genes given a specific term, find terms related to a set of genes, and more. How does the tool do this? You can check out the user guide to find out more, but here is a brief summary from the site:

We look at PubMed references (titles and abstracts), gather those references which focus on the given gene, and automatically identify terms which are statistically more likely to be relevant to this gene than to genes in general. In order to understand the relationship between a specific iTerm and the given gene, we allow the users to see all sentences mentioning the iTerm, as well as the abstracts from which these sentences were extracted.
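As a rough illustration of what "statistically more likely to be relevant to this gene than to genes in general" can mean, here is a minimal sketch that compares how often a term appears in abstracts about one gene versus a background set. This is not eGIFT's actual scoring (see their paper for that). The abstracts, the simple frequency ratio, and the `pseudo` smoothing value are all assumptions made for the example.

```python
def doc_frequency(term, docs):
    """Fraction of documents that mention the term (case-insensitive substring)."""
    if not docs:
        return 0.0
    return sum(term.lower() in doc.lower() for doc in docs) / len(docs)


def enrichment(term, gene_docs, background_docs, pseudo=0.01):
    """Ratio of gene-specific to background frequency; above 1 suggests enrichment.

    `pseudo` is a small smoothing constant so a term absent from the
    background does not cause a division by zero.
    """
    gene_freq = doc_frequency(term, gene_docs)
    background_freq = doc_frequency(term, background_docs)
    return (gene_freq + pseudo) / (background_freq + pseudo)


# Hypothetical abstracts: three about "gene X", four about genes in general.
gene_docs = [
    "Gene X regulates apoptosis in neurons",
    "Apoptosis is blocked when gene X is deleted",
    "Gene X expression in liver cells",
]
background_docs = [
    "Cells were cultured overnight",
    "Expression in liver cells was measured",
    "Apoptosis was not observed",
    "Protein levels in cells were unchanged",
]

for term in ("apoptosis", "cells"):
    print(term, round(enrichment(term, gene_docs, background_docs), 2))
```

In this toy data "apoptosis" scores above 1 and "cells" scores below 1, which matches the intuition that a generic word like "cells" tells you little about any particular gene.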

To learn more about how this tool was put together and the calculations involved, you can check out the BMC Bioinformatics publication about it from 2010, eGIFT: Mining Gene Information from the Literature.

But, for today, take a tour of the site and some of the things you can do in today’s Tip of the Week.

Relevant Links:
eGIFT
PubMed (tutorial)
XplorMed (tutorial)
Literature & Text Mining Resource Tutorials

Tudor, C., Schmidt, C., & Vijay-Shanker, K. (2010). eGIFT: Mining Gene Information from the Literature BMC Bioinformatics, 11 (1) DOI: 10.1186/1471-2105-11-418

WhatsYourProblem to WhatsTheAnswer

Our “What’s Your Problem” post will be transitioning to a “What’s the Answer” post this week and going forward. BioStar is a site for asking, answering and discussing bioinformatics questions. We are members of the community and find it very useful. Often questions and answers arise at BioStar that are germane to our readers (end users of genomics resources). Every week we will be highlighting one of those questions and answers here in this thread. You can still ask questions in this thread, or you can always join in at BioStar.

BioStar Question of the Week:

What is a good ontology for experimental results? If I want to publish experimental results, preferably via RDFa using a standardized ontology, what would be a good source to use? I am thinking of a triple such as:
Protein X — Interacts with — Protein Y
Where the ontology would spell out “Interacts with”.

Highlighted Answer:

I would recommend formatting your data using the IMEx (International Molecular Exchange Consortium) curation guidelines. This will allow you to submit your data easily to any of the participant databases (DIP, MINT, INTACT, etc). IMEx uses the PSI (Proteomics Standards Initiative) Molecular Interactions controlled vocabulary. There is a PSI-MI XML/CV validator here.

Check out the other answers, or provide one if you have insights into the problem.

Tip of the Week: Word Add-In for Ontology Recognition

In today’s tip I want to make you aware of a tool that I think will help researchers to present their own data and publications in an accurate and universally searchable way. I learned of the resource (UCSDBioLit) through an article in one of my recent BioMed Central article alert emails. This resource allows authors to mark-up their own publications with XML tags AS THEY WRITE their papers. This will allow faster and more accurate semantic searching of their research.

A huge problem in science today is the ability to quickly search the vast literature base and to accurately and efficiently find the data that you are interested in. Here at OpenHelix we focus on ways of effectively and efficiently getting information out of public databases and resources, but at the other end of the process is the ability for scientific knowledge to be curated into those resources. We have featured biocurators and the phenomenal work that they do several times in the past, but it is work that never ends and can be very labor intensive. It often involves an initial triaging of a field’s literature, some level of automatic information gathering, and then careful manual effort on the part of scientists at the resource to gather and present the information through their site. I know from personal experience that the process of reading a paper, clarifying research details with an author, and then presenting that information to the author’s satisfaction can be a very long & labor intensive process, for both the curator AND the original author.

For years there has been discussion of ‘expert curation’ in which experts in the field author review or summary pages in a resource, or community curation jamborees, etc. And there have been fruits from many of these efforts, but in general participation is low. But who is more of an expert on the research being published than the author? If authors could/would mark up their own papers during the publication process, not only could they be assured that it would be accurate but they would help make their research universally searchable without the lag required for searchability through a specific resource. Thus far document mark-up has not been an easy process and has largely been deemed ‘not worth the effort’ for the level of attribution/recognition affiliated with it.

The BioMed Central article does a nice job of outlining and discussing many of these issues. It cites many other efforts and resources, explains their motivation and the implementation of their software. A nice feature of the tool is that there are interoperability features, and a real commitment to conforming with existing standards of practice. The article also presents an appendix of resource addresses of other groups involved in semantic searching and literature publication. I especially like this quote from the paper:

The Word add-in presented here will assist authors in this effort using community standards and by making it possible for the author of the document, the absolute expert on the content, to do so during the authoring process and to provide this information in the original source document.

You can also find brief tutorials on using the tool at SciVee: Word Add-in for Ontology Recognition Tutorial (1 of 4): Install Process

As a note, literature mark-up and enabling are currently an active area – Mary found another literature handling resource and paper as well. Check out the tip, the articles & the tools. Tell me what you find/think. Thanks! (OH, and Happy St. Patty’s to ya!)

UCSDBioLit Reference:
Fink, J., Fernicola, P., Chandran, R., Parastatidis, S., Wade, A., Naim, O., Quinn, G., & Bourne, P. (2010). Word add-in for ontology recognition: semantic enrichment of scientific literature BMC Bioinformatics, 11 (1) DOI: 10.1186/1471-2105-11-103

Tip of the Week: The National Center for Biomedical Ontology

Anyone who has either used or helped to create a database of biological information has probably come across ontological terms. In today’s tip I feature a great resource devoted to promoting the creation and proper use of ontologies. The resource is the National Center for Biomedical Ontology, and it allows users to learn about ontologies, find and use ontologies that are already in existence, and even add newly developed ontologies to the resource so others might use them.

Ontologies are basically organized sets of controlled vocabulary terms that are applied in a uniform manner across diverse collections of information. They are important because of their ability to make abstract biological terms computer searchable. They also aid in the interpretation of biological information by researchers because each term includes a definition of how and when it should be applied to biological information. In this tip I briefly touch on finding ontologies, and on the educational resources available from the NCBO and BioPortal web sites.