Mary recently brought up a paper about what we miss when we data-mine papers: figures and figure legends.
Enter the NCBI Image database. This very new database includes over 3 million images drawn from the full-text resources at NCBI (i.e., PubMed Central). I searched for “drosophila phylogeny” and got over 200 results, including some great images and figures. Each result shows not only the figure but also its legend. The figure titles in the search results link directly to the figure, and below the legend you can see links to the full text. It’s a great start to searching figures and figure legends.
NIH funding will go there won’t they?). For example, go to the abstract for the paper “Text mining and manual curation of the chemical-gene-disease networks for the comparative toxicogenomics database.” Scroll down just a bit and you’ll see the figures from this paper, which have been deposited in the NCBI image database. From there you can follow the links directly to all the figures or to the papers.
Of course, as stated, not all articles will have images in the database, only those deposited in PubMed Central. You’ll find that many of your searches won’t show this image strip because the journal isn’t deposited there. But with 3 million images and more journal articles going to PMC every day, this database and feature of PubMed could prove to be quite useful.
Hattip: APD at CTD