(de)Funding Databases

From Deepak Singh:

Scientists spend years collecting and generating increasing amounts of data. The data ranges from raw instrument data, “finished” data (e.g. a genome sequence which is constructed after aligning all the short reads from a next-gen sequencer), and annotated data, which has been marked up to add additional information. We have repositories where a lot of this data goes, RCSB, NCBI, etc. In many cases there is clarity in these destinations and for the better part, resources like RCSB and NCBI are well funded and long lived (although I am always nervous about RCSB). However, many data repositories are dependent on funding, with no guarantees that the funding will be renewed. Given the size of some of these data resources, shouldn’t we be thinking of a more sustainable model for funding? This is a general problem for infrastructure resources, given the cost and the fact that you shouldn’t be looking at these from a 3-5 year perspective. This especially baffles me when libraries come into play. Shouldn’t the timescale there be in the 10’s of years?

mndoci.com, The disconnect in funding data resources, Oct 2009

You should read the whole article.

A recent example of this is the Arabidopsis resource, TAIR.

It is an excellent resource (tutorial), but currently their homepage includes a plea for a change in the funding mechanisms for long-term research data infrastructure. Their previous NSF grant expired, and the current one has dramatically decreasing levels of funding (as you can see in the image above). They’ve been encouraged to find other funding sources (subscriber fees, etc.), and the NSF is considering other possibilities for changing its funding mechanisms.

Dealing with genomics and biology databases on a daily basis, we have seen this all too often. Funding exists to create and develop an excellent resource, but mechanisms to maintain these resources are hard to come by. As Deepak and the TAIR developers suggest, we researchers need to have a discussion about how to build a more sustainable model that keeps this data freely available and accessible to researchers for long periods of time.

8 thoughts on “(de)Funding Databases”

  1. Kristi

    Hey Trey!
    Love the new OpenHelix digs – very nice. :)
    Thanks for this blog post – very interesting (and scary) topic.

  2. Trey Post author

    Thanks Kristi! I’m glad you like them. We have some more planned :D.

    yes, it’s a bit scary (the topic, not our new digs :). Frankly though, I think we _have_ to find a good way to do this, because most of these databases and resources are indispensable!

  3. Mary

    Thanks for that link, I’ll check it out later.

    This is another case of “Free” isn’t free, as much as the idealism of that is delightful. I keep hearing how disk space is so cheap, and bandwidth keeps dropping, so everything should be Free {I’m looking at you Chris}. But all the “success” examples I hear about are things that don’t make money–twitter, skype, youtube, etc. Same thing again in the Kurzweil talk I heard at MIT last week.

    Charming idea. But I keep hearing it mostly from kids in college whose parents are paying the bills, and people who have made a bunch of money and now expound on the beauty of “free”. Or people who can contribute to open source projects because they have a grant in hand on some other project.

    It isn’t a sustainable project model or business model for the long term.

  4. swarbre

    In response to the above comment regarding when free isn’t free: when discussing biological databases, we are talking about resources which are of greater importance than YouTube, Skype, etc. If YouTube disappears, other sites will appear (some already have) which would replace the functionality. The “data” lost there is not high value; the same is not true for biological databases. I don’t think many in the biological sciences see these as resources without a cost; rather, they believe it is appropriate for important resources to have the stability and other advantages provided by state funding.

    No one in the UK sees the National Health Service as free; we just believe such an important resource should be available to everyone regardless of an individual’s ability to pay.

  5. Mary

    @swarbre: I’m all for state funding of the databases. It’s bizarre what happens now: initial funding is provided, and then (as we see clearly for TAIR) it vanishes. And then somehow these resources are supposed to fund themselves.

    That is not sustainable. These public projects are not designed to develop their own funding in any way, and aren’t even permitted to use their public funding to do the kinds of things one would need to do in order to self-fund (marketing, a word you can’t use with grant agencies). And if they did derive funding from someone else (let’s say Monsanto came in to fund TAIR), scientists would be having kittens about how TAIR was now compromised.

    But a lot of people also think these things are “free” and would refuse to pay a subscription fee. And yet they won’t lobby for the funding, either, because they think big db projects get too much grant money at the expense of bench science.

  6. Pingback: (re)Funding Databases I | The OpenHelix Blog

  7. Pingback: (re)Funding Databases II | The OpenHelix Blog

  8. Pingback: Friday SNPpets | The OpenHelix Blog
