Publishers + bioinformatics tools. Good idea.

So I was watching my twitter feed today and saw this tidbit come along:

@FN_Press: Elsevier Introduces Genome Viewer http://bit.ly/nS92lE

….The Genome Viewer utilizes a genome browser developed by NCBI (the National Center for Biotechnology Information at the National Institutes of Health). Elsevier collaborated with the NCBI as it was developing the browser, and is the first publisher to incorporate the technology into an application for viewing detailed information about the gene sequences that are mentioned in articles.

When an author of an article tags a gene sequence, Elsevier matches this gene with information in NCBI’s databases and pulls this information into the article. This allows readers of the article to get specific information about each strand by hovering over it, and also offers functionality such as flipping the strands, zooming to a sequence, or going to a specific position to define a track of interest within the sequence….

There’s nothing I love more than exploring a new genome viewer, or a clever new use of an existing one! The press release offered links to a couple of papers that were supposed to show this new feature. Um…I couldn’t find it.

Step 1: Locate genome viewer.

Step 2: Explore genome viewer.

Step 3: Write up blog post about new genome viewer.

So here we are, and I’m stuck at Step 1 still. But I’ve sort of hurdled over that to Step 3. I’m sure the folks at Elsevier are going to get me to this browser–I’ve already been in touch with them and they are going to help me out.

But in the meantime I’d like to say how cool an idea this is. I have always thought there should be more integration between the science publications and the databases. And not only because I firmly believe that, in large part, the data isn’t in the papers anymore.

We took a different approach. We recently partnered with BioMedCentral’s team to tag articles with the computational resources mentioned in their publication for which we have training. On their pages you’ll see a link to our site that looks like this, on a recent paper about CNVs in trypanosomes: http://www.biomedcentral.com/1471-2164/12/139

On our pages–such as this example of our landing page for the GBrowse tutorial–you’ll see recent papers that referred to this tool.

So if you were interested in GBrowse you could quickly see who is working with it and how, which would help you assess whether GBrowse is the right tool for your needs. And you could use our tutorial to help understand ways to explore the data from a project that uses GBrowse. In many of the “big data” projects you aren’t going to get a gene list or a gene link. You’ll need to explore the set in toto at their sites. I’ve had my ranty pants* on about this before.

Yeah. I have strong feelings about this.

I also think it’s a good idea for publishers. There’s a bit of pushback I’ve seen about subscription pricing, including this letter that has always stuck with me since I read it:

The Head of the Harvard Library System is Pissed

Profit margins of journal publishers in the fields of science, technology, and medicine recently ran to 30–40 percent; yet those publishers add very little value to the research process, and most of the research is ultimately funded by American taxpayers through the National Institutes of Health and other organizations.

I think that by adding handier access to the data in a paper, or to the tools needed to go further, publishers can add value beyond just the traditional publication.

If I could only find it…

++++++++++++++++++++++

*hat tip to Mike the Mad Biologist for my new favorite phrase, “ranty pants”

12 thoughts on “Publishers + bioinformatics tools. Good idea.”

  1. Shaun

    Hi Mary,

    When I pasted in one of the links, the browser came up directly below the abstract: http://dx.doi.org/10.1016/j.ygeno.2010.06.001

    I think this is a great idea, and it will be especially useful for ChIP-seq/RNA-seq papers. GEO now asks (but doesn’t enforce) everyone submitting ChIP-seq data to provide a WIG file describing coverage over the genome along with the raw data. When your data gets sent to the SRA, the bare-bones NCBI genome browser is added to your GEO pages. I’m really glad that the journals are now starting to source from that same browser.

    I’ve recently been making a point of including instructions in my journal submissions that guide reviewers through how to load our ChIP-seq GEO WIG files directly to the UCSC genome browser. I think it’s really powerful to allow the reviewers to assess the quality of the ChIP-seq data themselves rather than just presenting a handful of cherry-picked examples in the figures. Directly loading from GEO to UCSC provides a couple of advantages, such as maintaining reviewer anonymity and allowing the people submitting the data to keep it private until manuscript acceptance. I’m sure you guys know how to construct the UCSC custom track URL to allow sourcing directly from GEO, but let me know if not.
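For readers wondering what such a custom track URL might look like, here is a minimal sketch. It is not Shaun's actual recipe, which isn't given here. It assumes the UCSC `hgTracks` endpoint with its `db`, `position`, and `hgct_customText` URL parameters, and the track file address in the usage lines is a made-up placeholder, not a real GEO path.

```python
from urllib.parse import urlencode

# Assumption: the public UCSC Genome Browser entry point.
UCSC_HGTRACKS = "https://genome.ucsc.edu/cgi-bin/hgTracks"


def ucsc_custom_track_url(track_url, db, position=None):
    """Build a link that asks UCSC to load a remote file as a custom track.

    track_url -- address of a WIG (or other track) file that UCSC can fetch
    db        -- genome assembly name, e.g. "hg19"
    position  -- optional region to open, e.g. "chr1:1-1000"
    """
    params = {"db": db, "hgct_customText": track_url}
    if position:
        params["position"] = position
    # urlencode percent-escapes the nested URL so it survives as one parameter.
    return UCSC_HGTRACKS + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical file location, for illustration only.
    print(ucsc_custom_track_url("https://example.org/sample.wig", "hg19", "chr1:1-1000"))
```

A link like this could be pasted into a cover letter so reviewers land on the data in one click, provided the file is reachable by the UCSC servers.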

  2. Mary Post author

    Thanks Shaun–what browser are you using? None of us have found it on any variety of IE or FF, on both PC and Macs so far. Here’s what I see: http://screencast.com/t/tgTw7XbtG

    That’s an awesome point about reviewers too.

    Feel like doing a guest post on your strategy ;) ? Sounds like it would be helpful for people….

  3. Shaun Mahony

    It’s working for me on both FF5.0 and Chrome12.0. Maybe it’s an HTML5 thing? It does take a couple of seconds to load, though, so try again. Here’s what my screen looks like in Chrome:
    http://people.csail.mit.edu/mahony/screen.png

    I’d love to do a guest post on the reviewers issue; thanks for the offer! Email me to tell me when you want it, etc. Beware, though, it’s a bit of a hack!

  4. Shaun

    Just thinking — is it possible that you didn’t see it because you don’t have a journal subscription? That would be really lame if they’ve set it up that way, given that this is public data sourced from a public genome browser. Anyway, just to give you a feel for the browser, the embedded browser in the article links to this page on NCBI:
    http://1.usa.gov/pPkXEt

    and it’s the same look and feel. You’ve probably already seen this browser from GEO pages or elsewhere.

  5. Mary Post author

    Yeah, we have tried everything we can think of, including waiting…. We don’t have a subscription, this is true. Yet we can see the full text of the paper. So if it was a demo paper with open access, I would expect to see it anyway. One of us logged in to Science Direct and still didn’t see it.

    It is a good question though of how it behaves with subscription stuff. Is it like the abstract and becomes available to all, or not?

    And we’ll ping you via email.

  6. Casey Bergman

    I was able to access the new Genome Viewer on our University VPN, but not off of it, so this is likely to be a subscription access issue. Not a great start for a new feature from a company with Open Access credibility issues…

    What the Elsevier system appears to be doing is mining GenBank accession numbers from the full text, and providing an embedded link-out to the NCBI genome viewer. So in reality Elsevier have not produced any new Genome Browser functionality. Moreover, the Genbank identifier text mining appears to be primitive, lacking e.g. range expansion since the text “accession numbers GU646204 to GU646336.” in http://www.sciencedirect.com/science/article/pii/S0378111911002125 only links to two GenBank accessions (GU646204 and GU646336).

    If this kind of integration is of interest to people, I suggest they check out a project developed by Max Haeussler called text2genome (http://www.text2genome.org). text2genome extracts DNA sequences from full text and maps them to genes and genomes to provide a direct annotation of the DNA sequences in a paper, rather than an indirect annotation of GenBank accessions linked to an article. Data from text2genome is served via custom tracks to the UCSC Genome Browser and via DAS tracks to the Ensembl Genome Browser.

    We have also mashed-up publications and genomes in a custom BioMart called pubmed2ensembl (http://www.pubmed2ensembl.org), where users can check out papers linked to genes and genes linked to papers from a variety of data sources, some based on text-mining and others based on genome sequence analysis. Data in pubmed2ensembl can be displayed via the BioMart DAS tracks on the Ensembl Genome Browser as well.
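The range expansion Casey mentions in this comment can be sketched roughly as follows. This is an illustration only, not how Elsevier's or anyone else's text mining actually works. The accession pattern is a simplified assumption (one or two capital letters followed by five or six digits), and the function names and the `max_span` guard are made up for the example.

```python
import re

# Simplified assumption about what a GenBank nucleotide accession looks like.
ACCESSION = r"[A-Z]{1,2}\d{5,6}"
SINGLE = re.compile(rf"\b{ACCESSION}\b")
RANGE = re.compile(rf"\b({ACCESSION})\s*(?:to|through|-)\s*({ACCESSION})\b")


def expand_range(start, end, max_span=10000):
    """Return every accession from start to end, or just the two if unsafe."""
    prefix1, digits1 = re.fullmatch(r"([A-Z]+)(\d+)", start).groups()
    prefix2, digits2 = re.fullmatch(r"([A-Z]+)(\d+)", end).groups()
    first, last = int(digits1), int(digits2)
    same_series = prefix1 == prefix2 and len(digits1) == len(digits2)
    if not same_series or last < first or last - first > max_span:
        return [start, end]
    width = len(digits1)
    return [f"{prefix1}{n:0{width}d}" for n in range(first, last + 1)]


def find_accessions(text):
    """Collect accessions from text, expanding 'X to Y' ranges first."""
    found = []

    def add(acc):
        if acc not in found:
            found.append(acc)

    def handle_range(match):
        for acc in expand_range(match.group(1), match.group(2)):
            add(acc)
        return " "  # blank out the range so it is not counted again below

    remainder = RANGE.sub(handle_range, text)
    for acc in SINGLE.findall(remainder):
        add(acc)
    return found


if __name__ == "__main__":
    hits = find_accessions("accession numbers GU646204 to GU646336.")
    print(len(hits), hits[0], hits[-1])
```

On the sentence quoted from the paper, this picks up every accession in the run rather than only the two endpoints.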

  7. Mary Post author

    Thanks for the details Casey.

    I like the idea, but since I haven’t been able to see the implementation I haven’t drawn any real conclusions yet. I’m also wondering how updates will affect this: if the gene we are talking about today in this paper is this gene, but things change in a subsequent assembly–which one will we have in the browser? And will that be clear to users?

  8. Trey

    Casey,
    thanks for those. I was just looking over text2genome and it looks great.

    This kind of integration is really helpful when reading papers, here’s hoping publishers add more of it :)

  9. Mary Post author

    Hey thanks @Rafael! I was able to see the shots, and am eager to test drive it myself. I’m a world-class software tester, you know ;) Seriously, I love to kick the tires.

    If you have any links to documentation that I can start reading, that would be very nice too. I’m one of 10 people on the planet who actually reads documentation I think….

  10. Pingback: What’s the answer? (Publishing apps) | The OpenHelix Blog
