What’s The Answer? (extracting data from PDFs)

BioStar is a site for asking, answering and discussing bioinformatics questions and issues. We are members of the community and find it very useful. Often questions and answers arise at BioStar that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those items or discussions here in this thread. You can ask questions in this thread, or you can always join in at BioStar.

This week’s question is something that I can imagine having broad utility, way beyond bioinformatics. But I swear, supplemental files have made me crazy on more than one occasion. They can be hundreds of pages long, with key things just buried in there. The handy options suggested in the discussion are something I will definitely use.

Forum: Large Tables in Supplementary PDF in Journal Articles

What are your opinions on this? When a table is embedded in a PDF, it can’t be correctly pasted into a spreadsheet program, especially if it spans multiple pages. This makes computational operations on the table impossible. It would be easy for the journals to have rules for providing tables as CSV or XLS.
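To make the problem concrete, here is a minimal sketch of the cleanup step that usually follows text extraction. It is not one of the tools recommended in the BioStar discussion. It assumes you already have the text of each page with the column layout preserved (for example from Poppler's `pdftotext -layout`), that columns are separated by two or more spaces, and that the header row repeats at the top of each continuation page. The function names and the sample table are hypothetical.

```python
import csv
import io
import re


def layout_text_to_rows(pages):
    """Turn layout-preserved page text into table rows.

    Assumes columns are separated by two or more spaces and that the
    header line may repeat at the top of later pages.
    """
    rows = []
    header = None
    for page in pages:
        for line in page.splitlines():
            if not line.strip():
                continue  # skip blank lines between rows
            cells = re.split(r"\s{2,}", line.strip())
            if header is None:
                header = cells  # first non-blank line is treated as the header
            elif cells == header:
                continue  # drop the repeated header on a continuation page
            rows.append(cells)
    return rows


def rows_to_csv(rows):
    """Serialize rows to a CSV string that a spreadsheet can open."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical two-page table, as layout-preserving extraction might emit it.
pages = [
    "Gene    Chr   Score\nBRCA1   17    0.92\nTP53    17    0.88\n",
    "Gene    Chr   Score\nMYC     8     0.75\n",
]
table_csv = rows_to_csv(layout_text_to_rows(pages))
print(table_csv)
```

The sketch breaks as soon as a cell contains a run of spaces, wraps onto a second line, or is empty, which is exactly the kind of mess real supplements tend to contain. That is a good argument for journals supplying the CSV in the first place.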


Thank you Dario for initiating this discussion! I learned useful stuff. Go see the options, and add more if you have them. However, you will also see that I found a supplement that I wanted to try the tools with. And that didn’t go so well…