Tag Archives: interactions

Video Tip of the Week: BindingDB for binding affinities

Recently when I was adding videos to our SciVee collection, I noticed that there was a set of new videos about BindingDB. This database has been around for a long time, and I was surprised to realize that we hadn’t covered it yet. And understanding proteins and their binding partners only grows more important–whether those partners are other proteins or chemical compounds that can be important effectors of health and disease.

For a decade now this database has been curated and maintained to provide access to information from publications that is often not easily accessible. As their homepage says today:

BindingDB contains 832,773 binding data, for 5,765 protein targets and 362,123 small molecules.

That’s a lot of collected information available for you to investigate. You can start with a protein of interest, or a compound, or a paper, and find related information from those points. There are various other tools and entry points as well.

In addition, it is integrated with many other key resources, including PDB and UniProt, MMDB and KEGG, and more. ChEMBL links offer handy access to compounds.

You can see from their “News” that they are actively maintaining this site, and are developing new tools to offer users ways to interact with the data. But the newest feature seems to be their videos–I’ll let them show you more about how to use their site.

BindingDB: Find and view all data for a target of interest

They offer several other quick tips on ways to interact–starting with an article and obtaining the data, and more. You can access them from the end of the video in the “Related” links, or explore their SciVee set. They are also found on the homepage of BindingDB right now. So check them out if you need protein binding data. They may have what you seek.


Quick link:

BindingDB site: http://bindingdb.org/


Liu, T., Lin, Y., Wen, X., Jorissen, R.N. & Gilson, M.K. (2007). BindingDB: a web-accessible database of experimentally determined protein-ligand binding affinities, Nucleic Acids Research, 35 (Database), D198. DOI: 10.1093/nar/gkl999

Video Tips of the Week: Annual Review IV (first half of 2011)

As you may know, we’ve been doing these video tips-of-the-week for FOUR years now. We have completed around 200 little tidbit introductions to various resources. At the end of the year we’ve established a sort of holiday tradition: we are doing a summary post to collect them all. If you have missed any of them it’s a great way to have a quick look at what might be useful to your work.

You can see past years’ tips here: 2008 I, 2008 II, 2009 I, 2009 II, 2010 I, 2010 II. The summary of the second half of 2011 will be available next week here.

January 2011

January 5: SKIPPY predicting variants w/ splicing effects

January 12: Twitter in Bioinformatics. This one was much more popular than I expected!

January 19: PolyPhen, for predicting the possible effects of mutations in genes

January 26: iRefWeb + protein interaction curation

February 2011

February 2: RCSB PDB Data Distribution Summaries

February 9: SIFT, Sorting (SNPs) Intolerant From Tolerant, another tool for predicting the impact of mutations in genes.

February 16: Melina II for promoter analysis

February 23: SNPTips and viewing personal genome data. This tip is one of the most-watched ones we’ve had. Thousands of views on SciVee!

March 2011

March 2: DAnCER for disease-annotated epigenetics data

March 9: World Tour of Genomics Resources

March 16: Encyclopedia of Life

March 23: ORegAnno for regulatory annotation

March 30: MetaPhOrs, orthology and paralogy predictions

April 2011

April 6: The Taverna Project for workflows

April 13: VirusMINT, the branch of the Molecular Interaction database for viral interactions

April 20: LAMHDI for animal models

April 27: Dot Plots, Synteny at VISTA

May 2011

May 4: MycoCosm

May 11: InterMine for mining “big data”

May 18: Allen Institute’s Brain Explorer

May 25: SciVee, the YouTube of science

June 2011

June 1: New and Improved OMIM®

June 8: Converting Genome Coordinates

June 15: MutaDATABASE, a centralized and standardized DNA variation database

June 22: Update to NCBI’s Cn3D Viewer

June 29: Orphanet for Rare Disease information

What’s the answer? Cytoscape plug-ins

BioStar is a site for asking, answering and discussing bioinformatics questions. We are members of the community and find it very useful. Often questions and answers arise at BioStar that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those questions and answers here in this thread. You can ask questions in this thread, or you can always join in at BioStar.

Question of the week:

Cytoscape Plug-in for retrieving Protein-Protein Interactions

I wish to find out all interactions WITHIN a set of around 200 HUMAN proteins. The identifiers I can use are gene_name, Uniprot_accession and Uniprot_Ids. So far I tried two plugins viz. MIMI and APID2NET.

MIMI doesn’t seem to accept 200 proteins in one go, so I’ve to merge the networks. APID2NET shows too many nodes without any interactions. The STRING DB shows quite a many interactions for which MIMI/APID2NET don’t report anything.

I tried the STRING plugin too, but it looks like it can accept one ID at a time. Am I missing something here?

Can somebody recommend some good and hassle less plugins/tricks to import PPIs. I’ve to do subsequently a BINGO analysis.

Thanks in advance


Although most people in this arena will be familiar with Cytoscape, it can be challenging to know which specific plug-ins might be best for a given purpose. One of the cool things about a forum like BioStar is that there is a range of folks who have wide experience with the tools from so many different projects that often someone has a bit of guidance on things that did (or didn’t) work.

In this case someone offered a suggestion that seemed to fit the bill precisely! Check out the answers, and note the selected answer to see which it is. It’s a tool I noticed in the past partly because it was the first bioinformatics tool that I had seen that came from Cuba and I thought that was cool–BisoGenet.

But if you have other suggestions you can also offer them at BioStar.
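
For readers who already have an interaction table in hand, the core of the question (keeping only the interactions whose two partners are both in a list of interest) can be sketched in a few lines of Python. This is only an illustration: the function name, the pair format and the toy identifiers are assumptions, and none of it comes from Cytoscape, BisoGenet or any other tool mentioned above.

```python
def interactions_within(pairs, proteins_of_interest):
    """Keep only interactions whose two partners are both in the given list."""
    wanted = set(proteins_of_interest)
    kept = set()
    for a, b in pairs:
        if a in wanted and b in wanted:
            # Sort each pair so that (A, B) and (B, A) count as one interaction.
            kept.add(tuple(sorted((a, b))))
    return sorted(kept)


# Toy data: hypothetical identifiers, not real interaction records.
pairs = [("P1", "P2"), ("P2", "P1"), ("P1", "P9"), ("P3", "P2")]
my_proteins = ["P1", "P2", "P3"]
print(interactions_within(pairs, my_proteins))
```

The pair involving P9 is dropped because P9 is not in the list, and the two orientations of the P1/P2 interaction collapse into one entry.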

“What’s the Answer”

BioStar is a site for asking, answering and discussing bioinformatics questions. We are members of the community and find it very useful. Often questions and answers arise at BioStar that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those questions and answers here in this thread. You can ask questions in this thread, or you can always join in at BioStar.

Today’s question and answer is:

What is the best resource for Protein/Genetic interactions?

One of several answers:

Protein interactions: If you are interested primarily in protein interactions, look first at the IMEx consortium; all of the databases that you mention are a part of it. Their interactions are available through PSICQUIC web services, described here: http://code.google.com/p/psicquic/

Some websites combine data from multiple molecular interaction databases, e.g., Pathway Commons and IRefWeb. I know you can download a combined dataset of interactions from Pathway Commons but I haven’t tried doing this with IRefWeb.

Genetic interactions: I think that BioGRID is the only major database that currently curates genetic interactions.


What’s the answer? Open thread

BioStar is a site for asking, answering and discussing bioinformatics questions. We are members of the community and find it very useful. Often questions and answers arise at BioStar that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those questions and answers here in this thread. You can ask questions in this thread, or you can always join in at BioStar.

BioStar Question of the Week:

In silico cell modelling platform : Do you know any platform/program/programming language or libraries for in silico cell modelling? It would be great to make visualisations too.

– by Vova Naumov

Although there was no “selected answer” for this one, the top rated answer provided a long list of tools that might be useful for this purpose. One of them is Cell Designer, which I commented on over there. I took a workshop on that a couple of years back and was really impressed with the features and functionality. But there are a lot of others over there–so if this is an area that you are interested in you might want to check out the full set of answers.

Note: Cell Designer is a tool from Japan, and there may be issues with accessing this tool sometimes now–they mention some downtime on their site. But keep it in mind in the future if you can’t get to it just yet. It’s a great tool.

Tip of the Week: iRefWeb + protein interaction curation

For this week’s tip of the week I’m going to introduce iRefWeb, a resource that provides thousands of data points on protein-protein interactions.  If you follow this blog regularly, you may remember that we had a guest post from the iRefWeb team not too long ago. It was a nice overview of many of the important aspects of this tool, and I won’t go into those again here–you should check that out. Andrei knows those details quite well!

And at the time we also mentioned their webinar was coming up. We were unable to attend that, though, because we were doing workshops at The Stowers Institute. I was delighted to find that their webcast is now available to watch in full. It’s about 40 minutes long and covers much more than my 5-minute appetizer could do.  It details many practical aspects of how to use iRefWeb effectively.

Because they’ve done all the prep work for me, I don’t need to spend much time on the structural and functional features here. What I would like to do is draw your attention to a different aspect of their work. Their project draws together protein interaction data from a variety of source databases–including some of our favorites such as MINT and IntAct (for which we have training suites available for purchase).  They then used the iRefWeb processes and projects to evaluate and consider the issues around curation of protein-protein interaction data, and recently published those results. That’s what I’ll be focusing on in the post.

Every so often a database flame-war erupts in the bioinformatics community. Generally it involves someone writing a review of databases and/or their content. These evaluations are sometimes critical, sometimes not–but often what happens is that the database providers feel that their site is either misrepresented, or unfairly chastised, or at a minimum incompletely detailed on their mission and methods. I remember one flambé developed not too long ago around a paper by our old friend from our Proteome days–Mike Cusick–and his colleagues (and we talked about that here). As the OpenHelix team has been involved in plenty of software and curation teams, we know how these play out. And we have sympathy for both the authors and the database providers in these situations.

So when the iRefWeb site pointed me to their new paper I thought: oh-oh…shall I wear my asbestos pantsuit for this one???  The title is Literature curation of protein interactions: measuring agreement across major public databases.  Heh–how’s that working out for ya?

Anyway–it turns out not to need protective gear, in my opinion. Because their project brings data from several interaction database sources, they are well-positioned to collect information about the data to compare the data sets. They clearly explain their stringent criteria, and then look at the data from different papers as it is collected across different databases.

A key point is this:

On average, two databases curating the same publication agree on 42% of their interactions. The discrepancies between the sets of proteins annotated from the same publication are typically less pronounced, with the average agreement of 62%, but the overall trend is similar.

So although there is overlap, different databases have different data stored. This won’t be a surprise to most of us in bioinformatics. But I think it is something that end users need to understand. The iRefWeb team acknowledges that there are many sources of difference among data curation teams. Some curate only certain species. Some include all data from high-throughput studies, others take only high-confidence subsets of that data. And it’s fine for different teams to slice the data how they want. Users just need to be aware of this.
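
To make the idea of “agreement” between two databases a bit more concrete, here is a minimal Python sketch that compares the interaction sets two databases might record for the same publication. It uses two standard set-overlap measures (Jaccard and Sørensen–Dice) purely as an illustration; it is not a reproduction of the paper’s scoring procedure, and the database contents below are made-up toy data.

```python
def jaccard(a, b):
    """Shared items divided by all distinct items across both sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty sets are treated as fully agreeing
    return len(a & b) / len(a | b)


def dice(a, b):
    """Twice the shared items divided by the combined size of both sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Toy example: interactions two hypothetical databases curated from one paper.
db1 = {("A", "B"), ("A", "C"), ("B", "D")}
db2 = {("A", "B"), ("B", "D"), ("C", "D"), ("A", "E")}

print(f"Jaccard agreement: {jaccard(db1, db2):.2f}")
print(f"Dice agreement:    {dice(db1, db2):.2f}")
```

In this toy case the two sets share two interactions out of five distinct ones, so even databases that overlap substantially can score well below 100% agreement.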

It seems that in general there’s more agreement between curators on non-vertebrate model organism data sets than there is for vertebrates. Isoform complexity is a major problem among the hairy organisms, it turns out–and this affects how the iRefWeb team scored the data sets. And as always when curation is evaluated–the authors of papers are sometimes found to be at fault for providing some vagueness to their data sets.

The iRefWeb tools offer you a way to assess what’s available from a given paper in a straightforward manner. In their webinar, you can hear them describe that ~30 minutes in. If you use protein-protein interaction data, you should check that out.

Caveat emptor for protein-protein interaction data (well, and all data in databases, really). But iRefWeb provides an indication of what is available and what the sources are–all of it traceable to the original papers.

The paper is a nice overview of the issues, not a specific criticism of any of the sources. They note the importance of the curation standards encouraged by the Proteomics Standards Initiative–Molecular Interaction (PSI-MI) ontologies and efforts. And they use their paper to raise awareness of where there may be dragons. It seems that dragons are quite an issue for multi-protein complex data.

Your mileage may vary. If you are a data provider, you may want to have protective gear for this paper. But as someone not connected directly to any of the projects, I thought it was reasonable. And something to keep in mind as a user of data–especially as more “big data” proteomics projects start rolling out more and more data.

Quick links and References:

iRefWeb http://wodaklab.org/iRefWeb/

Their Webinar: http://www.g-sin.com/home/events/Learn_about_iRefWeb

Turinsky, A., Razick, S., Turner, B., Donaldson, I., & Wodak, S. (2010). Literature curation of protein interactions: measuring agreement across major public databases, Database, 2010. DOI: 10.1093/database/baq026

Cusick, M., Yu, H., Smolyar, A., Venkatesan, K., Carvunis, A., Simonis, N., Rual, J., Borick, H., Braun, P., Dreze, M., Vandenhaute, J., Galli, M., Yazaki, J., Hill, D., Ecker, J., Roth, F., & Vidal, M. (2009). Literature-curated protein interaction datasets, Nature Methods, 6 (1), 39-46. DOI: 10.1038/nmeth.1284

Reactome Webinar coming up; Wed Feb 2

We were on the road last week doing workshops, so this is a few days old now. But if you aren’t on the GO Friends mailing list it’s possible it’s new for you. A quick word about the GO Friends list: because so many tools rely on Gene Ontology and have some kind of GO components, there is quite a range of things that come over that mailing list. It’s not just for GO developers per se. You might want to check it out.

Anyway, what I wanted to focus on today is this notice about the upcoming Reactome webinar. There have  been BIG changes to the interface, but the underlying coolness and high quality of all those biological pathways remains intact, of course! Reactome is a tool we have loved for a long time, and we’ve coordinated with the Reactome folks around the next updates for our tutorial. We’re working on that update now.

If you want to learn more about Reactome and these new changes, there’s going to be a webinar soon. You have to register, and I’ll only give some of the details here. Head over to the GO Friends message link to see the rest.

The Ontario Genomics Institute (OGI) and the Ontario Institute for Cancer Research (OICR) are co-hosting a one hour web conference/webinar about the Reactome Pathway Database (http://www.reactome.org) – a freely available, manually-curated resource of core biological pathways. The Reactome database offers pathway data encapsulating areas of human biology ranging from basic pathways of metabolism to complex events such as GPCR signaling and apoptosis.

This follow up webinar will introduce the updated website with a more intuitive user interface and a new suite of data analysis tools. Learn to use this database through case studies from various research groups.

The presentation will be given by Dr. Robin Haw, Manager of Reactome Outreach, OICR, and will cover how to use the updated Reactome resource for:

• Browsing and searching pathway knowledge,
• Integrating network and pathway data,
• Using Pathway and Expression Analysis tools to analyze experimental datasets,
• Annotating experimental datasets with Reactome BioMart,
• Discovering network patterns related to cancer and other diseases using the Reactome Functional Interaction Network Cytoscape plug-in,
• Introducing use cases for Reactome data and analysis tools.


Go to the link for the registration details. I’ll be listening in (if we don’t schedule a workshop for that day!)

WhatsYourProblem to WhatsTheAnswer

Our “What’s Your Problem” post will be transitioning to a “What’s the Answer” post this week and going forward. BioStar is a site for asking, answering and discussing bioinformatics questions. We are members of the community and find it very useful. Often questions and answers arise at BioStar that are germane to our readers (end users of genomics resources). Every week we will be highlighting one of those questions and answers here in this thread. You can still ask questions in this thread, or you can always join in at BioStar.

BioStar Question of the Week:

What is a good ontology for experimental results: If I want to publish experimental results, preferably via RDFa using a standardized ontology, what would be a good source to use? I am thinking of a triple such as:
Protein X — Interacts with — Protein Y
Where the ontology would spell out “Interacts with”.

Highlighted Answer:

I would recommend formatting your data using the IMEx (International Molecular Exchange Consortium) curation guidelines. This will allow you to submit your data easily to any of the participant databases (DIP, MINT, INTACT, etc). IMEx uses The PSI (Proteomics Standards Initiative) Molecular Interactions controlled vocabulary. There is a PSI-MI XML/CV validator here.

Check out the other answers, or provide one if you have insights into the problem.
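
For anyone unfamiliar with the “triple” form the question describes, here is a small Python sketch that writes one subject, predicate, object statement as a line of N-Triples. Note the assumptions: the question asks about RDFa, which embeds the same kind of statement in HTML, whereas N-Triples is used here only because it is the simplest serialization to show; and every URI below is a placeholder under example.org, not a real identifier. In practice the predicate would come from a controlled vocabulary such as the PSI-MI one recommended in the answer.

```python
def to_ntriple(subject, predicate, obj):
    """Format three URIs as a single N-Triples statement."""
    return f"<{subject}> <{predicate}> <{obj}> ."


# Placeholder namespace: hypothetical URIs for illustration only.
BASE = "http://example.org/"

triple = to_ntriple(
    BASE + "protein/X",            # subject: Protein X
    BASE + "vocab/interactsWith",  # predicate: what the ontology would define
    BASE + "protein/Y",            # object: Protein Y
)
print(triple)
```

The ontology’s job is to give the middle term a shared, documented meaning so that different groups publishing “interacts with” mean the same thing.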

Guest Post: iRefWeb — Andrei Turinsky

This next post in our continuing semi-regular Guest Post series is from Andrei Turinsky, one of the developers of iRefWeb. If you are a provider of a free, publicly available genomics tool, database or resource and would like to convey something to users on our guest post feature, please feel free to contact us at wlathe AT openhelix DOT com or the contact form (write ‘guest post’ as subject heading). We welcome introductions to your resource, information on updates, highlights of little known gems or opinion pieces on the state of genomic research and databases.

What is iRefWeb?

Protein-protein interactions (PPI) have become an important tool in biomedical research. Yet the PPI data for a specific organism tend to be distributed over a number of different databases. Comparison and integration of PPI information across databases remains a challenging task.

iRefWeb (Turner et al. (2010) Database, Vol. 2010, Article ID baq023.) is a web interface to a broad integrated landscape of protein-protein interactions (PPIs). For a given gene or protein, you can access all PPI records and protein complexes, consolidated non-redundantly from ten major public databases: BIND, BioGRID, CORUM, DIP, IntAct, HPRD, MINT, MPact, MPPI and OPHID. iRefWeb also presents various supporting evidence, helping you to gauge the reliability of an interaction. Versatile search filters allow you to retrieve the PPIs with a given level of support. Other features facilitate the analysis of possible inconsistencies across PPI data and the examination of PPI statistics. The data consolidation procedure effectively combines redundant records using the iRefIndex process (Razick et al (2008) BMC Bioinformatics 9, 405.).

Figure 1: The iRefIndex process aggregated over 916,059 original PPI records from source databases, 75% of which were redundant. The consolidation merged the redundant PPIs, reducing their number four-fold (orange). Only 232,612 PPIs were non-redundant (blue)
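
As a quick sanity check on the numbers in the caption, the short Python sketch below recomputes the fold reduction and the redundant share from the two counts given. Only the two counts come from the caption; the variable names are illustrative.

```python
# Counts taken from the figure caption above.
total_records = 916_059   # original PPI records from the source databases
non_redundant = 232_612   # PPIs left after consolidation

# How many times smaller the consolidated set is than the original.
fold_reduction = total_records / non_redundant

# Share of the original records that were merged away as redundant.
redundant_share = 1 - non_redundant / total_records

print(f"{fold_reduction:.2f}-fold reduction")
print(f"{redundant_share:.1%} of records were redundant")
```

Both values line up with the caption’s rounded “four-fold” and “75%” figures.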


Friday SNPpets

Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…