Tip of the Week: From UniProt to the PSI SBKB and Back Again

It is often beneficial to visit multiple biomedical databases or resources, even if they seem to provide overlapping information, because no two resources focus on exactly the same information or present it in exactly the same way. Instead of duplicating each other’s curation efforts, databases often link out to related information at other resources. You can think of these links as “social connections”, if you want. In today’s tip I want to show you a couple of connections between protein information resources, including a new connection that really features some of the core value of the PSI’s Structural Biology Knowledgebase, or SBKB.

I begin the tip at the UniProtKB, where I search for a UniProt ID number. From the resulting protein report I first briefly show you how to link out to a corresponding RCSB PDB report, where you can find high quality protein structure information and more. If you are interested in learning more about the RCSB PDB & how to use it, please check out OpenHelix’s full, free tutorial that is sponsored by the RCSB PDB.
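If you would rather follow the UniProt-to-PDB connection from a script than by clicking, here is a minimal sketch of the idea. It assumes you have a UniProt entry in the flat-file text format, where cross-references sit on lines beginning with "DR". The sample entry, the accession P00000, and the PDB IDs 1ABC and 2XYZ are made up for illustration, and the function name pdb_cross_references is my own, not part of any UniProt or RCSB PDB tool.

```python
# Hypothetical UniProt flat-file excerpt; the accession and PDB IDs are invented.
SAMPLE_ENTRY = """\
ID   EXAMPLE_HUMAN           Reviewed;         120 AA.
AC   P00000;
DR   PDB; 1ABC; X-ray; 2.00 A; A/B=1-120.
DR   PDB; 2XYZ; NMR; -; A=5-110.
DR   GO; GO:0005524; F:ATP binding; IEA:UniProtKB-KW.
"""


def pdb_cross_references(flat_text):
    """Collect the PDB cross-references listed on the DR lines of an entry."""
    refs = []
    for line in flat_text.splitlines():
        # Only DR lines that point at the PDB are of interest here.
        if not line.startswith("DR   PDB;"):
            continue
        # Drop the 5-character line code, the trailing period, then split on ';'.
        fields = [f.strip() for f in line[5:].rstrip(".").split(";")]
        refs.append({"pdb_id": fields[1], "method": fields[2]})
    return refs


for ref in pdb_cross_references(SAMPLE_ENTRY):
    print(ref["pdb_id"], ref["method"])
```

Each PDB ID collected this way is the same identifier the link in the UniProt report takes you to at the RCSB PDB, so the list is a reasonable starting point for looking up structures one by one.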

From there I return to the UniProt report and demonstrate a new link-out option that leads to protein protocols, available materials, and information about theoretical models and predicted protein targets from the SBKB. I don’t have time to show it, but a recent update to the SBKB now allows users to search the Structural Biology Knowledgebase with a UniProt accession number. These searches provide users with additional information, including protein structure information and information about pre-released structure sequences. As with the RCSB PDB, we have a free tutorial on the SBKB that is sponsored by the Protein Structure Initiative.

As I scroll through the UniProt protein report, you will see information and links for a wide variety of bioscience resources. OpenHelix, as I’m sure many of you are aware, has tutorials on how to use many of these resources. Our tutorials on the RCSB PDB and the PSI SBKB are both free. Our tutorials on UniProt and many other resources are available through a subscription to our database of trainings or through purchase of individual access. Whether you learn the resources through our tutorials, through the references I list below, or through your own explorations of the databases, there really is an amazing amount of information available through these interlinked, publicly funded resources. Please make use of them in your research!

RCSB PDB and OpenHelix Announce an Updated Free Tutorial and Training Materials

The new tutorial reflects the many changes and enhancements on the RCSB PDB site, and includes a narrated on-line tutorial, PowerPoint slides, handouts, and exercises.

Bellevue, WA (PRWEB) April 12, 2011

The Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank (PDB) has partnered with OpenHelix to provide a revised and updated tutorial (http://www.openhelix.com/PDB) on its free web-based resource for studying biological macromolecules (http://www.pdb.org).

The RCSB PDB provides a variety of tools and resources to use to study biological macromolecules. The PDB is the single worldwide repository of experimentally-determined 3D biological structures of proteins, nucleic acids and complex assemblies. As a member of the Worldwide PDB collaboration (wwpdb.org), the RCSB PDB curates and annotates PDB data, and presents basic and advanced search, display and visualization methods to access these data.

The new tutorial reflects the many changes and enhancements on the RCSB PDB site, including a new data drill-down and data summary feature, updated ligand features such as a download page, images and binding affinity data, new report types and visualization options, among many others.

The new training materials (at http://www.openhelix.com/pdb) include an online narrated tutorial that demonstrates: basic and advanced searches, how to generate reports, the different options for exploring individual structures, and many of the research and educational resources and tools available at the RCSB PDB. The approximately 60-minute tutorial, which runs in just about any browser, can be viewed from beginning to end or navigated using chapters and forward and backward sliders.

In addition to the tutorial, RCSB PDB users can also access useful training and teaching materials, including the animated PowerPoint slides used as a basis for the tutorial, a suggested script for the slides, slide handouts, and exercises. These materials can save teachers and professors a tremendous amount of time and effort in creating classroom content.

Users can view the tutorial and download the free materials at http://www.openhelix.com/pdb.

Tip of the Week: RCSB PDB Data Distribution Summaries

In today’s tip I will feature the data distribution summaries and their drill-down features, which you can see from many RCSB PDB searches. We are in the process of updating our full tutorial sponsored by the RCSB PDB team, and as part of that effort I’ve gotten to know and appreciate this new data presentation format. Over the last five years the RCSB PDB has really been working hard at redesigning their resource to be more easily accessed by a wide variety of users. Below you will find a recent citation from the group explaining all of their updates and the logic behind them. The paper is a good read, both because I won’t have time to do more than scratch the surface of the redesign & you’ll get the details there, and because the intro gives a great glimpse into what resources are dealing with in the way of ‘data deluge’. The increase in users AND data that the RCSB PDB has experienced over the last few years is mind-boggling!

OK, back to the data distributions. To me these are really elegant ways of helping any user (the PDB is by no means just for structural biologists) come to the RCSB PDB & quickly and easily access whole categories of interesting information, and then drill down in detailed ways to the specific structure or data that they are most interested in. For example, I could begin with a keyword search for something as general as ‘kinase’. This search retrieves over 4 thousand hits, which could be quite daunting, but at the top of the report the results are displayed under categories such as Organism, Taxonomy, Experimental Method, SCOP classification and more. The subcategories under each of these categories let me know how many hits are, for example, of a mixed Polymer type, are human hits, or are alpha and beta proteins. I can mouse over any subcategory title to find out the percent of hits in this subcategory compared to all hits, or click on the title to drill down further into the data distribution for just that subcategory of results. The distribution summaries are then updated to focus specifically on the distribution of THOSE data. Using these summaries is much more intuitive than any text description that I can muster.
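For readers who think in code, the logic behind a distribution summary and its drill-down can be sketched in a few lines. This is only an illustration of the idea, not how the RCSB PDB implements it: the four hits below are invented, and the function names distribution and drill_down are hypothetical.

```python
from collections import Counter

# Invented search hits; a real 'kinase' search returns thousands of structures.
HITS = [
    {"id": "hit1", "organism": "Homo sapiens", "method": "X-ray"},
    {"id": "hit2", "organism": "Homo sapiens", "method": "NMR"},
    {"id": "hit3", "organism": "Mus musculus", "method": "X-ray"},
    {"id": "hit4", "organism": "Homo sapiens", "method": "X-ray"},
]


def distribution(hits, category):
    """Return {subcategory: (count, percent of all hits)} for one category."""
    counts = Counter(hit[category] for hit in hits)
    total = len(hits)
    return {name: (n, 100.0 * n / total) for name, n in counts.items()}


def drill_down(hits, category, value):
    """Keep only the hits that fall in one subcategory."""
    return [hit for hit in hits if hit[category] == value]


# Summary over all hits, like the top of the search report.
organism_summary = distribution(HITS, "organism")

# "Click" on the human subcategory, then summarize only THOSE hits.
human_hits = drill_down(HITS, "organism", "Homo sapiens")
human_methods = distribution(human_hits, "method")

print(organism_summary)
print(human_methods)
```

The point of the sketch is the second call to distribution: after a drill-down, the percentages are recomputed against the narrowed set of hits rather than the original search.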

My advice? Check out the tip, then check out the data distribution summaries, the drill-down utility, and all the other great features of the RCSB PDB & see how easy it is to find information on your favorite gene. Oh yeah, and be watching for us to release our full, free & newly updated tutorial on the RCSB PDB resource soon!

