Tag Archives: pubmed

Video Tip of the Week: the New PubMed Filters Sidebar

In today’s tip I am linking to a YouTube video from NCBI that briefly explains the new Filters Sidebar feature that has been added to PubMed. We first saw a tweet that the change was coming back on May 2nd, just as I was completing a total update to our full PubMed tutorial*.

I struggled with whether to hold our production team for the new sidebar, or to produce our tutorial with the plan to update it in the near future. It is always a struggle to know which is the best option, because resource changes can occur at the speed of light or on geological time scales (ok, that’s an exaggeration, but it feels that way when you want to release a wonderful, up-to-date project & something holds you up and delays publication). With PubMed I was lucky – I saw a tweet that the sidebar feature would be added “in the next week”. I asked our voice professional to put the script on hold & I paced around PubMed waiting to see what (& when) things would occur.

True to their word, the sidebar feature showed up on PubMed results on May 10th, exactly one week since I had seen the “in the next week” announcement – my THANKS to the NCBI & PubMed Teams! :) Not only did they push out their updates in a timely manner, they made a YouTube video explaining the changes & discussing where future changes are slated to go. The video is clear, and quick, so I am using it as my tip this week. I’m not sure the feature is 100% stable, as I show in the image below, and describe later in the post, but I think the change might accomplish NCBI’s goal – for more people to notice & utilize filters for their searches.

In the video the narrator states that the filters area is gone & the two default filters are permanently selected, as indicated by the check marks that can’t be “unclicked”. I’m not seeing those check marks on either the “Free full text available” link (shown) or the “Review” link, which is not in view in my image. I also see a difference as to whether I get the right filtered subsets depending on whether I am logged into My NCBI (the upper window shown in the back of the image), or not (the lower, front window). In my hands IE 9.0 & Firefox 12.0 both behave similarly in these respects.

The NCBI video doesn’t really show how results look after filters are added, but from playing with it, it looks to me like all of your filters are applied to your search & you only get one set of results, not links to various subsets. Although it is now easier to add filters to searches, if that’s how filters are going to work going forward, I think I will miss the old filters – I kind of like being able to switch between various subcategories of results without having to change my filters or rerun searches. Be sure to share your thoughts & preferences with NCBI so that they can create the best resource for their users’ needs!
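To make the difference between the two filter behaviors concrete, here is a minimal sketch. It assumes filters can be written as ordinary query terms; the tags `free full text[sb]` and `review[pt]` are my stand-ins for the two default filters, and the function names are hypothetical, not anything NCBI provides.

```python
def apply_all(base_query, filters):
    """Sidebar-style behavior: every selected filter is ANDed onto the
    search, so there is a single combined result set."""
    parts = [f"({base_query})"] + [f"({f})" for f in filters]
    return " AND ".join(parts)


def per_filter_subsets(base_query, filters):
    """Old-style behavior: one separate query (one result link) per
    filter, so you can hop between subsets without rerunning anything."""
    return {f: f"({base_query}) AND ({f})" for f in filters}


# Assumed filter tags standing in for the two default filters.
filters = ["free full text[sb]", "review[pt]"]

combined = apply_all("keratin", filters)
subsets = per_filter_subsets("keratin", filters)

print(combined)
for name, query in subsets.items():
    print(name, "->", query)
```

The first function narrows the search to records that match every filter at once, while the second keeps each subset available side by side, which is the behavior I expect to miss.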

* OpenHelix tutorial for this resource available for individual purchase or through a subscription.

Quick links:

OpenHelix Introductory Tutorial on using PubMed (soon to be updated): http://www.openhelix.com/cgi/tutorialInfo.cgi?id=70

PubMed Resource: http://www.pubmed.gov/

PubMed Reference:
Sayers, E.W., Barrett, T., Benson, D.A., Bolton, E., Bryant, S.H., Canese, K., Chetvernin, V., Church, D.M., DiCuccio, M., Federhen, S., et al. (2011). Database resources of the National Center for Biotechnology Information. Nucleic Acids Research, 40 (D1) D25. DOI: 10.1093/nar/gkr1184

Announcement of Updated Tutorial Materials: UniProt, Overview of Genome Browsers, and World Tour of Resources

As many of you know, OpenHelix specializes in helping people access and utilize the gold mine of public bioscience data in order to further research.  One of the ways that we do this is by creating materials to train people – researchers, clinicians, librarians, and anyone interested in science - on where to find data they are interested in, and how to access data at particular public databases and data repositories. We’ve got over 100 such tutorials on everything from PubMed to the Functional Glycomics Gateway (more on that later).

In addition to creating these tutorials, we also spend a lot of time keeping them accurate and up-to-date. This can be a challenge, especially when lots of databases or resources all have major releases around the same time. Our team continually assesses and updates our materials, and in this post I am happy to announce recently released updates to three of our tutorials: UniProt, World Tour, and Overview of Genome Browsers.

Our Introductory UniProt tutorial shows users how to: perform text searches at UniProt for relevant protein information, search with sequences as a starting point, understand the different types of UniProt records, and create multi-sequence alignments from protein records using Clustal.

Our Overview of Genome Browsers introduces users to Ensembl, Map Viewer, UCSC Genome Browser, the Integrated Microbial Genomes (IMG) browser, and the GBrowse software system. We also touch on WebGBrowse, JBrowse, the Integrative Genomics Viewer (IGV), the ARGO Genome Browser, the Integrated Genome Browser (IGB), GAGGLE, and the Circular Genome Viewer, or CGView.

Our World Tour of Genomics Resources is free and accessible without registration. It includes a tour of example resources, organized by categories such as Algorithms and Analysis tools, expression resources, genome browsers (both Eukaryotic and Prokaryotic/Microbial), Literature and text mining resources, and resources focused on nucleotides, proteins, pathways, disease and variation. This main discussion then leads into a discussion of how to find resources with the free OpenHelix Resource Search Portal, followed by learning to use resources with OpenHelix tutorials, and a discussion of additional methods of learning about resources.

Quick Links:

OpenHelix Introductory UniProt tutorial suite: http://www.openhelix.com/cgi/tutorialInfo.cgi?id=77

OpenHelix Overview to Genome Browsers tutorial suite: http://www.openhelix.com/cgi/tutorialInfo.cgi?id=65

Free OpenHelix World Tour of Genomics Resources tutorial suite: http://www.openhelix.com/cgi/tutorialInfo.cgi?id=119


On a Mission for Protein Information

It’s probably just the human brain’s ability to connect dots & find patterns, but it can be interesting how many “unrelated” events and information bits accumulate in my head & eventually get mulled into an idea or theory. Take, for example, a recent biotech mixer, bits from an education leadership series & a past Nature article – each “event” has been meandering in my mind and now they are finding their way out as this blog post.

OK, now the explanation: At a recent local biotech event I heard about a company (KeraNetics) purifying keratin proteins & using them to develop therapeutic and research applications. The company & their research sounded very interesting & because a lot of it is aimed at aiding wounded soldiers, it also sounded directly beneficial. The talk was short, only about 20 minutes, so there wasn’t a lot of time for details or questions. I decided I’d venture forth through many of the bioscience databases and resources that I know and love, in order to learn more about keratin.

My quest was both fun and frustrating because of the nature of the beast – keratin is “well known” (i.e. it comes up in high school academic challenge competitions ‘a lot’, according to someone in the know), but it is hard to work with (keratins are tough, insoluble, fibrous structural proteins) and it is hard to find much general information on it in your average protein database (because it is made of many different gene products, all referred to as “keratin”). I decided to begin my adventure at two of my favorite protein resources, PDB & SBKB, but I found no solved structures for keratin. Because of the way model organism databases are curated and organized, I often begin a protein search there, just to get some basic background, gene names, sequence information, etc. I (of course) found nothing other than a couple of GO terms in the Saccharomyces Genome Database (SGD), but I found hundreds of results in both Mouse Genome Informatics (MGI) (660 genomic features) and Rat Genome Database (RGD) (162 rat genes, 342 human genes). I also found gene names (Krt*), sequences and many summary annotations with references to diseases with links to OMIM. When I queried for “keratin”, OMIM gave me 180 hits, including 61 “clinical synopses”; UniProt returned 505 reviewed entries and 2,435 unreviewed entries; Entrez Protein returned 10,611 results; and PubMed returned 26,430 articles, 1,707 of them reviews. I sated my curiosity about KeraNetics’ research by using a PubMed advanced search for keratin in the abstract or title & the PI’s name as author (search = “(keratin[Title/Abstract]) AND Van Dyke[Author]”).
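For anyone who would rather script that last search than click through the advanced search page, here is a minimal sketch that rebuilds the same fielded query and turns it into a request URL for NCBI's E-utilities `esearch` service. The helper names are mine and purely illustrative, and the snippet only builds the URL; it does not contact NCBI.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Base address of the E-utilities esearch service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def fielded(term, field):
    """Attach a PubMed field tag to a term, e.g. keratin[Title/Abstract]."""
    return f"{term}[{field}]"


def build_esearch_url(query, retmax=20):
    """Return an esearch URL for a PubMed query (hypothetical helper)."""
    params = {"db": "pubmed", "term": query, "retmode": "json", "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Same search as in the post: keratin in title/abstract, Van Dyke as author.
query = f"({fielded('keratin', 'Title/Abstract')}) AND {fielded('Van Dyke', 'Author')}"
url = build_esearch_url(query)

print(query)
print(url)
```

Building the query from small pieces makes it easy to swap in a different protein name or author without retyping the field tags.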

I ended up with a lot of information leads that I could have hunted through, but it was a fun process in which I learned a lot about keratin. This is where the education stuff comes in. I’ve been seeing a lot of studies go by talking about reforming education to be more investigation driven, and I can totally see how that can work. “Learning” through memorization & regurgitation is dry for everyone & rough for the “memory challenged”, like me. Having a reason or curiosity to explore, with a new nugget of data or understanding lurking around each corner, the information just seems to get in better & stay longer. (OT, but thought I’d mention a related site that I found today w/ some neat stuff: Mind/Shift-How we will learn.)

And I could have done the advanced PubMed search in the beginning, but what fun would that have been? Plus there is a lot that I learned about keratin from what I didn’t find, like that there wasn’t a plethora of PDB structures for keratin proteins. That brings me to the final dot in my mullings – an article that I came across today as I worked on my reading backlog: “Too many roads not taken”. If you have a subscription to Nature you can read it, but the main point is that researchers are still largely focusing on the same set of proteins that they have been for a long time, because these are the proteins for which there are research tools (antibodies, chemical inhibitors, etc). This same sort of philosophy is fueling the Protein Structure Initiative (PSI) efforts, as described here. Anyway, I found the article interesting & agree with the authors’ general suggestions. I would, however, extend it beyond these physical research tools & say that going forward researchers need more data analysis tools, and training on how to use them – but I would, wouldn’t I? :)


  • Sierpinski P, Garrett J, Ma J, Apel P, Klorig D, Smith T, Koman LA, Atala A, & Van Dyke M (2008). The use of keratin biomaterials derived from human hair for the promotion of rapid regeneration of peripheral nerves. Biomaterials, 29 (1), 118-28 PMID: 17919720
  • Edwards, A., Isserlin, R., Bader, G., Frye, S., Willson, T., & Yu, F. (2011). Too many roads not taken. Nature, 470 (7333), 163-165 DOI: 10.1038/470163a

What’s the answer? Database anomalies

BioStar is a site for asking, answering and discussing bioinformatics questions. We are members of the community and find it very useful. Often questions and answers arise at BioStar that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those questions and answers here in this thread. You can ask questions in this thread, or you can always join in at BioStar.

The question for the week:

Incorrect/unusual entries in main databases (GenBank, UniProt, PDB)? Pierre Poulain asks, “I… advise my students to be cautious with the data they can find in these databases. To illustrate this, I found quite unusual entries in GenBank:..” and he then lists some good ones.

There were several interesting, and funny, answers including one from our own Mary,

My favorite bizarre database item was a PubMed one. This was long before that NCBI ROFL blog was created. I was searching for genes identified in the transition to gray hair. This was not useful….


This is the TITLE (note, not the abstract):

I am a 64-year-old man, and I’ve always been proud of my perfect health record. I’ve also been proud of my full head of hair, even after the gray started creeping in. Four months ago I caught pneumonia and spent eight days in the hospital (three in intensive care). It took a while, but I’m finally back to normal – except that my hair is falling out. It comes out in clumps when I shampoo or even comb it, and it’s gotten noticeably thin all over. I remember reading about Propecia in your newsletter but I don’t have the old issue. Should I try the medication?

Check out the other answers for good examples as to why the researcher should always double-check the data.

Friday SNPpets

Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

Updated Online Tutorials for NCBI resources including an NCBI Overview and PubMed and the Gene Expression Omnibus tutorials

Comprehensive tutorials on the publicly available NCBI resources enable researchers to quickly and effectively use these invaluable resources.

Seattle, WA (PRWEB) June 8, 2010 – OpenHelix today announced the availability of three updated tutorials on NCBI resources.

The National Center for Biotechnology Information, NCBI, is home to many of the most commonly used publicly available databases and tools in molecular biology today. They house such popular and widely used databases as GenBank, PubMed, GEO, Entrez Gene, Entrez Protein, and more. NCBI also produces, maintains and updates a variety of tools, like the large family of BLAST sequence similarity searching tools and the Entrez search and retrieval tools. In addition, they provide an extensive variety of services for education, news dissemination and different types of data submission. The NCBI Overview tutorial presents a broad overview of NCBI’s databases, tools, educational resources and data submission protocols. In addition to an update on this overview, OpenHelix has updated both its PubMed and GEO tutorials. PubMed is the premier search engine for biomedical literature. More than 18 million citations from life science journals can be searched through this free service. The Gene Expression Omnibus, or GEO, is a valuable resource designed to store high-throughput gene expression and molecular abundance data. These three tutorials, in conjunction with the many other up-to-date OpenHelix tutorials on NCBI resources such as BLAST, Entrez, dbSNP, MMDB, Viral resources, MapViewer and others, will give you a set of training resources to help you be efficient and effective at accessing and analyzing genome data.

The tutorial suites, available through an annual OpenHelix subscription, contain an online, narrated, multimedia tutorial, which runs in just about any browser connected to the web, along with slides with full script, handouts and exercises. With the tutorials, researchers can quickly learn to effectively and efficiently use these resources. The scripts, handouts and other materials can also be used as a reference or for training others.

These tutorials will teach users:

NCBI Overview

*to understand the basic structure of NCBI and its different types of resources
*to navigate NCBI to find the databases and analysis tools you need
*what types of educational resources are available at NCBI
*basic data submission procedures and background information
*how to search the entire NCBI site, as well as just the subset of Entrez databases

PubMed

*basic, advanced, and Boolean search methods
*additional searching methods like the Entrez Global query and the MeSH query
*tips to understand the visual cues and displays
*to use My NCBI to customize your results and save searches which can be run and emailed regularly

Gene Expression Omnibus (GEO)

*efficient ways to query GEO for specific genes or experimental designs
*how to navigate through GEO output displays to find the specific information you want
*how to navigate GEO’s complex data architecture to search GEO by specific record types

To find out more about these and over 85 other tutorial suites visit the OpenHelix Catalog and OpenHelix. Or visit the OpenHelix Blog for up-to-date information on genomics and genomics resources.

About OpenHelix
OpenHelix, LLC, (www.openhelix.com) provides a bioinformatics and genomics search and training portal, giving researchers one place to find and learn how to use resources and databases on the web. The OpenHelix Search portal searches hundreds of resources, tutorial suites and other material to direct researchers to the most relevant resources and OpenHelix training materials for their needs. Researchers and institutions can save time, budget and staff resources by leveraging a subscription to nearly 100 online tutorial suites available through the portal. More efficient use of the most relevant resources means quicker and more effective research.

Personal Genomics, clinical assessment and online resources

The Lancet paper, Clinical assessment incorporating a personal genome, has held my fascination this weekend (yes, I read it at the beach). Mary posted Friday and again Saturday on the paper and related NPR segment. It feels to me to be a seminal paper, though I do agree with Daniel at Genetic Future that there is a lot there we still don’t know. A large portion of the variation is in non-coding regions, and thus predictions and propensities are hard to come by with the available analysis. In fact, as he pointed out, many of the coding region variations have little information as to their effect on disease. I would add also that even if we get to that holy grail of $1,000 to sequence a personal genome, this kind of extensive analysis would still be time and cost-prohibitive for the vast majority of sequenced genomes.

Yet, as with all early steps in science and medicine, there are missing pieces, large gaps and huge efforts (think “space travel,” “computers,” “microwave ovens,” “internet”) that over time become inexpensive and commonplace (ok, so the first isn’t necessarily “inexpensive”). Sequencing genomes will become inexpensive before the analysis does, but both will come. And I think this paper is pointing to that future.

The other hurdle to large scale personal genomics I see (of course) is the understanding and use of the genomics and data resources. The authors use a large (and excellent, in my opinion) suite of genomics resources to obtain data and do their analysis. I’ll list them here with links in alphabetical order:

dbSNP (T)
HapMap (T)
PubMed (T)
UniProt (T)

All of these resources have a wealth of data, but even then, each tool requires a lot of analysis and familiarization. Each tool does have documentation and tutorials, and of course OpenHelix has tutorials on many of the ones mentioned (those with linked “T”s after the name). Still, this one analysis took a large number of tools and a lot of familiarization.

The paper does have a pretty good figure (figure 1) outlining the analysis process. For example, they SIFTed the genome to find gene-associated, non-synonymous, rare and novel, and disease-associated variations, and then examined those using dbSNP, HGMD, OMIM and PubMed to assess something like HFE2, which might have an association with haemochromatosis. One of my quibbles with the paper, as is often the case with these papers, is that there isn’t a good methods ‘walk-through’ using something like Galaxy or Taverna, in a history or workflow that would help reproduce the analysis.

We also have a tutorial I’d like to point you to, one that walks through a similar process and teaches users the basics of that process. You can find this tutorial here; it’s free and publicly available. The tutorial walks the user through the analysis of a gene variation, in this case in CYP2C9, that affects an individual’s response to Warfarin. There is a similar variation (different gene, affects same drug response) in the paper. The tutorial uses the NIEHS SNPs site to get an overview of the variation including SIFT and PolyPhen predictions, then goes to the UCSC Genome Browser to find an overview of the region, walks through the dbSNP information and does a quick tag SNP analysis using GVS. That tutorial is only one very small step in what will have to be an immense education into genomics and genomics resources.

That is all to point out that the paper is a fascinating first step, and as a first step suggests the gaping holes we will have in bringing personal genomics to medicine.

Ashley, E., Butte, A., Wheeler, M., Chen, R., Klein, T., Dewey, F., Dudley, J., Ormond, K., Pavlovic, A., & Morgan, A. (2010). Clinical assessment incorporating a personal genome. The Lancet, 375 (9725), 1525-1535 DOI: 10.1016/S0140-6736(10)60452-7

When databases crack you up…

If you are someone who’s spent a lot of time deep in the recesses of databases (deeper than the average end users), sometimes you find some really interesting things. Sometimes they are instructive, such as: hmm…I didn’t realize mice had a bone there until I was working with the anatomical hierarchy at Jax… Sometimes they are creepy. Buried in the MeSH hierarchy was about the most repulsive term I’d ever seen in a controlled vocabulary. I complained about this to them probably 10 years ago, and just realized it doesn’t appear in MeSH 2010, finally.

But then there are other times when a database search leaves one ROFL.  That happened some time ago when I came across this odd tidbit in a search for gray hair genes. It generated some discussion among my sphere of colleagues about other funny things we’ve come across in the databases.

Well, there’s one whole blog dedicated to the pursuit of humor in NCBI’s PubMed.  I just found out from the #scio10 tweets from the ScienceOnline2010 meeting that they have found a new home on the Discover blogs collection!

NCBI ROFL: Hello, world! (again)

Congrats to them.  If you find you need to chuckle at the literature sometimes–or need a funny sample for a presentation perhaps, check them out at their new home. They also take suggestions. So if you find something in PubMed that cracks you up, send it along.

Webcasts in PubMed?

Inigo Montoya: You keep using that word. I do not think it means what you think it means.

So I was reading the NLM Technical Bulletin for November-December this morning. (Yeah, gripping, I know–and a month late.  You know how the holidays are…).  But I came across something intriguing.  Here’s what it says:

Hmm…webcasts.  Ok.  But how are they going in?  And what are the sources?  How are they annotated?  So let’s have a look.  At PubMed I clicked on advanced searches.  Checked the “Type of Article” box.  And let the search run.
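For the curious, the same check can be sketched as a typed query instead of a checkbox. This is a rough sketch that assumes the new type is indexed under the name "webcasts" and is searchable with the `[Publication Type]` field tag; the helper names are hypothetical. It builds an E-utilities `esearch` URL with `retmax=0`, which asks only for the hit count, and shows how the count would be read from the JSON reply. Nothing here contacts NCBI.

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def count_only_url(publication_type):
    """Build an esearch URL that requests just the number of hits
    for one publication type (retmax=0 means no ID list is returned)."""
    term = f"{publication_type}[Publication Type]"
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": 0}
    return f"{ESEARCH}?{urlencode(params)}"


def hit_count(response):
    """Pull the hit count out of a parsed esearch JSON reply.
    The count arrives as a string, so convert it to an int."""
    return int(response["esearchresult"]["count"])


# "webcasts" is an assumed name for the publication type.
url = count_only_url("webcasts")
print(url)
```

Fetching that URL and passing the parsed JSON to `hit_count` would give the number of records carrying the publication type at that moment.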

I got 3. Here are my results:

I went to the first one at the publisher’s site. There’s a brief abstract-like introduction. There you can link to 2 videos. There is a patient examination, pre- and post-treatment. There’s a Word doc with the video legend. Here’s the whole legend…I’m not really sure why this needs to be delivered in a Word doc and not on that page.

Legend: A 43-year-old man with thiamine deficiency, manifested as gait and eye movement abnormalities without encephalopathy (video 1) that markedly improves following prompt diagnosis and empiric thiamine replacement (video 2).

Ok. This is useful stuff for neurologists, I’m sure. But it’s not what I would have called a “webcast”. Maybe it’s just me…I would have thought that was essentially a figure in this paper.

The second one I didn’t expect to have access to.  But when I got to the site it appeared that I did.  My German is non-existent, but I was able to decipher the word “podcast”.  So I clicked. I had access to a German podcast.  That’s cool.  But also isn’t what I would have called a “webcast”.

Ok. The next one I don’t have access to.  I can’t assess what it really is.  But I have to say 2/3 are not what I expected as webcasts…

Maybe it’s semantic.  I thought webcasts would be seminars people gave on their work, or special published items like the training materials we have, or something.  Maybe recordings from conference presentations.

I like the idea–I think we need to start thinking about ways to make these types of valuable publications/presentations available.  I was wondering how the content would be indexed by the NLM.  All of our training webcasts have the full script available as text, but I would say that’s rather uncommon.  Most cases have a title and a bare abstract at best in my experience.

What do you think?  Is that what you call webcasts?  My current assessment of this development = Idea: excellent.  Execution (currently): eh.

MacDonald, R., Stanich, P., Monrad, P., & Mateen, F. (2009). Teaching Video NeuroImages: Wernicke encephalopathy without mental status changes. Neurology, 73 (20) DOI: 10.1212/WNL.0b013e3181c1de31

Tip of the Week: NCBI Makeover!

The two earliest web-based bioinformatics resources that I can remember relying on in my career were Pedro’s List and NCBI. (For those of you who need a little nostalgia trip you can see a copy of Pedro’s list here.) There are plenty of descendants of Pedro’s list in various forms–including our recently launched resource search tool. But the National Center for Biotechnology Information (NCBI) interface has kinda been…well…comfortingly stable–for a really long time. I looked in the Wayback Machine to see what the older interfaces used to look like. I was able to find one variant from 1997 which I had forgotten about until I saw it. But then I kept looking and found the version I am most familiar with starting in 1999. If you compare 1999 to 2009 you will see essentially the same layout. Here is a comparison of the previous interfaces, and then the new one:

NCBI interfaces through the years


Well, that’s all changing now!  The NCBI is doing a MAJOR overhaul of the interfaces.  You can examine the homepage look at the  Preview site here (link may break when they move over to production with it), and you can look at the PubMed changes here, and even start using the PubMed preview site here.

This is a huge break with the past, and like all new interfaces will take a little time to get used to.  But I have to say I like the organization.  The left navigation will make finding the tools easier.  The “Popular” box will be quick access to the most frequently used items.  Highlights and news are available still as well.  There are some things I’ll miss. We liked the site map layout to explain the features in an overview sort of way, and the preview page doesn’t link to that–it links to the alphabetical list.  Might change, though.

Anyway–I think the new look is nice and effective.  Of course we’ll have to update all of our NCBI tutorials with new shots of the interfaces.  But it looks like the underlying tools don’t change much conceptually–but they may move the location of the items (like the PubMed filters).  So as soon as the interface becomes the main site and appears to be stable we’ll make our changes.

This short Tip of the Week briefly introduces the new interface to get you started thinking about how to navigate around. Check it out!

NCBI: http://www.ncbi.nlm.nih.gov/