At the recent ICSB meeting I attended a session on Standards and Repositories. This was full of people who actually care about submission standards, curation issues, and very clean and tidy data. Yes, serious geekitude. I loved it.

Some of it covered familiar ground for me. I was involved in microarray experiments in the early days and was very familiar with the MIAME stuff. I know exactly why we have to get the experimental details, meta data about the experiment, and analysis details about a given experiment. I’m on board. And I know that we need the folks who generate the data to give us the correct and full information, or those of us who make our living pulling data back out of databases would be seriously adrift.

But getting people to actually conform to the standards is another issue entirely. Some people just don’t know about the standards. Some don’t have time. Some don’t have tools. Some just don’t have the interest. Some aren’t peer pressured into doing so.

It is great when a community group has come together to create standards and propagate those to their community. But very much like the wiki efforts, this is largely a volunteer thing with little reward and little respect as a use of one’s time. Although everyone wants standards, they can be contentious and undervalued even within a research community.

When granting agencies and publishers put their weight behind adherence to standards, that’s very helpful. But even that often doesn’t come with much financial support, and can be fragmented in application.

But as just an end user it is really hard to know if there are standards, if they are working, and where to go to find out information on them. A new effort is trying to make that a little more transparent: MIBBI.

Susanna-Assunta Sansone (yes, the most lyrical name I’ve heard lately) gave a great talk in this session about standards, the challenges, and an attempt to solve part of this. There is a project called MIBBI, Minimum Information for Biological and Biomedical Investigations, that is a sort of metameta project for standards. MIBBI.org is a group that has been organized to wrangle some of the issues around standards.

As I understand it, the premise is that standards are great and necessary (although often not respected and valued)–but they are often being developed by different research community groups in isolation. While we often need a lot of the same info (species, chemical compounds, meta data, etc.) the groups are not talking with each other to coordinate at all. “Harmonizing” checklists would be really terrific. And we also need to accommodate downstream uses of data that we don’t even know about yet. So we have to capture as much as we can, but that has to be useful for many groups.

MIBBI is trying to be a central resource for standards. They collect them, and generate checklists that investigators, grant agencies, and publishers could use to help everyone know and adhere to the standards.

If you are a bioinformatics geek, you should know about this effort. You should check out the MIBBI site and their recent paper in Nature Biotechnology (freely available on their web site). In that paper I learned about standards groups that I hadn’t heard of before–many beyond MIAME. They also have a great table about the variety of checklists that I found really illuminating about the issues.

I think this is a great idea. As we generate more and more high-volume data in every area of biomedical research we need help to store and sift through it all. I hope it is valued. But will it take off? Will it persist? MIBBI.

    Thanks for the hugely positive write-up! Definitely bring bio-informa-geek friends and we’ll all work together towards a more efficient and efficacious bioscience :)

    Tragically (speaking as someone engaged both with this ‘meta’ project, and also with several component projects such as MIAPE), engaging with the user community (by which I mean bioscientists, without whom we are nothing, as well as developers) is always an uphill struggle (though it is definitely easier to engage with developers for some reason). If I’m honest, I think that until some policy components (from funders and publishers) really back all this stuff up (which they will within a couple of years, roughly), people won’t dedicate the time to find out about or comment on any of it, because optional = ignored for the most part (after all, they didn’t go into biological research for the love of writing up and good data management). There is also a visibility issue (i.e., we’re probably missing people who would contribute if they knew), but we’re doing all we can there; publishing willy-nilly, going to meetings and dabbling in the black arts wrt search engines.

    A slightly strange perceptional issue for ‘us’ (many of whom are also members of various ‘them’) is the significant body of opinion that this is really just a make-work scheme for parasitic bioinformaticians (I’m paraphrasing, but not much). Arguments about how EMBL-Bank started small but is now indispensable (along of course with GenBank and DDBJ) don’t wash, so there’s a real selling job too (which runs something like “How good data management practice can benefit YOU” [covered at some length in, for example, the MIAPE paper http://tinyurl.com/3grwwn, in the section on page three titled “Cui bono?”] and, spun for pharma, a minor thing I wrote a while back http://tinyurl.com/4caknw though that isn’t free, sadly). Oh and don’t try to say that such standards will help people spot sources of error in their data lol. For one, if it is someone else that spots it thanks to all the lovely metadata you provided for them…

    So, anyway, anyone who can bring to the table a critical eye, ideas, hours, or (god forbid) tame experimentalists is thoroughly welcome. Please do send a mail or something — have a look at http://www.mibbi.org/index.php/Discuss for a couple of ways to join in.

    Cheers, Chris.

    Hi Chris–thanks for those thoughts. I’m on the road doing trainings this week and won’t be able to read your references just yet, but I will.

    But let me add a thought on this: outreach is crucial. That is pretty self-serving for me to say, I know, since that is my business. What I mean is that bench researchers need to understand better why this is of value to them. We see them in our training sessions, and they are smart people. They are also busy people. They come to our training sessions to improve their efficiency of getting information they need out of databases. It might be an opportunity to talk to them about what goes in.

    I might suggest you try to reach the trainers. Give them training materials that make the case that they can use with groups they are training. Don’t just rely on uploading slides from a conference somewhere. Give more specifics about what to do and how to do it for people outside the organization.

    We do try to get outside our own circle as much as poss, but beyond the various more general media and regulatory/funding type bodies it can be difficult to find people who care one jot. We’ve had a good response from vendors (of kit, software, etc.) and several actually participate directly, but I’m not sure they count as outsiders by anyone’s estimation.

    But the trainers thing is a fantastic idea that would never have occurred to me. Once you’re back from your travel, I’d love to follow up on that on the phone or something (are you within striking distance of Cambridge?). Not to pester you specifically, more for a steer generally, if you wouldn’t mind.

    The only issue I see is that, presumably, the subjects you and others cover when training are dictated from on high somewhere? Or do some have more flexibility about what subjects they cover? I suppose if the audience wants it (whether they know they do or not), then it’s legit?


    I’m near the _other_ Cambridge, actually :) But we can touch base next week–I’ll email and we can pick a time. I have some thoughts. I think there is some flexibility, actually.

    Gotta catch a flight now! TTYL.

