Tag Archives: data mining

New NCBI Image Database

Mary recently brought up a paper about what we are missing when data mining papers: figures and figure legends.

Enter the NCBI Image database. This very new database includes over 3 million images found in the full-text resources (i.e., PubMed Central) at NCBI. So, I did a search for “drosophila phylogeny” and found some great images and figures. The results pull out not only the figure, but also the figure legend. I got over 200 results. The figure titles in the search results link directly to the figure, and below the legend you can see links to the full text. It’s a great start to searching figures and figure legends.
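If you would rather script a search like this than click through the web page, here is a minimal sketch. It only builds a query URL for NCBI's E-utilities `esearch` endpoint and does not send the request. Note the assumptions: it targets the `pmc` database (PubMed Central, where these figures come from) rather than the Image database itself, since I can't confirm the Image database is reachable this way, and `build_search_url` is just a name made up for this example.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities esearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_search_url(term, db="pmc", retmax=20):
    """Build an esearch URL for a free-text query (hypothetical helper).

    Assumption: db="pmc" stands in for the full-text source of the figures;
    it is not the Image database described in the post.
    """
    params = {
        "db": db,           # Entrez database to search
        "term": term,       # the search phrase, e.g. "drosophila phylogeny"
        "retmax": retmax,   # maximum number of IDs to return
        "retmode": "json",  # ask for JSON instead of the default XML
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


url = build_search_url("drosophila phylogeny")
print(url)
```

Paste the printed URL into a browser (or fetch it with your favorite HTTP library) to get back a list of matching record IDs. This returns articles rather than individual figures, so treat it as a starting point only.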

Along with this, PubMed search results are now enhanced with images from this database (if, remember, the article is in the full-text resources, but over time a lot of research published with NIH funding will go there, won’t it?). For example, go to this abstract for the paper “Text mining and manual curation of the chemical-gene-disease networks for the comparative toxicogenomics database.” Scroll down just a bit and you’ll see the figures from this paper, which have been deposited in the NCBI Image database. You can go directly to the link to all the figures or to the papers.

Of course, as stated, not all articles will have images in the database, only those deposited in PubMed Central. You’ll find that a lot of your searches won’t show this image strip because the journal isn’t deposited there. But with 3 million images and more journal articles going to PMC every day, this database and feature of PubMed could prove to be quite useful.

Hattip: APD at CTD :)

Tip of the Week: Ratmine

Ratmine is a ‘data warehouse’ that allows the user to construct queries across different areas of biological knowledge, from SNPs to pathways. It’s developed by the people at RGD and uses Intermine, a project developed for Flymine. Ratmine is part of a project between RGD, SGD and ZFIN to implement Intermine for these databases and “develop new methods of interoperability for cross-organism research.” We’ve mentioned Intermine before, and it’s also used in ModEncode. Intermine is going to have to be the subject of a later post, I think :).

This tip is actually a video done by the RGD group, and one of those gems I’ve found at SciVee in our attempts to integrate our tips at SciVee (which will be coming). We occasionally highlight a short tutorial done by someone else here in our tips, and since I’ve found this gem and just got back from vacation in Florida :)…
Btw, while you are at it, you might want to check out this interesting set of tutorials on biomedical ontologies.