Video Tip of the Week: BioRxiv, A preprint server for biology

Open access to scientific research has been advocated for a long time, even before the advent of the internet. With the internet, the movement grew. There are now many open access journals, NIH requires NIH-funded research to be open access within a year of publication, and NSF and other agencies are working on similar plans.

As part of that movement toward open access to research, the use of “preprints” has also grown. Posting a preprint allows for the fast dissemination of research and the open discussion of results before they’ve gone through peer review. Peer-reviewed research can take weeks, months and sometimes years to be publicly disseminated through publication. In a modern world of fast-developing and changing science, preprint distribution allows for much faster access to research.

The most well-known preprint server is arXiv. Started in 1991 at the Los Alamos National Laboratory and later moved to Cornell University Library, arXiv allows for the open access preprint dissemination of physics, mathematics and computer science research. By many standards, it has been a success in getting research quickly and openly discussed.

There have been, and still are, attempts at preprint servers for biological research: Nature Precedings (which ceased accepting new manuscripts in 2012), PeerJ Preprints and others.

That a traditional publisher such as Nature (Nature Precedings) made a foray into an open access preprint repository speaks to the need and demand for such a service. The number of biological research preprints in arXiv has grown rapidly. Just last year, a case was made in PLOS Biology for just such a server for life sciences research by Desjardins-Proulx et al.:

The first and most often discussed advantage of open preprints is speed. The time between submission and the official publication of a manuscript can be measured in months, sometimes in years. For all this time, the research is known only to a select few: colleagues, editors, and reviewers. Thus, the science cannot be used, discussed, or reviewed by the wider scientific community. In a recent blog post, C. Titus Brown noted how posting a paper on arXiv quickly led to a citation (arXiv papers can be cited), and his research was used by another researcher. The current system of hiding manuscripts before acceptance poses problems for both scientists and publishers. Manuscripts that are unknown cannot be used and thus take more time to be cited. It has been shown that high-energy physics, with its high arXiv submission rate, has the highest immediacy among physics and mathematics.

And now we have it. Above you will find the promo video for a new life sciences open access preprint server: bioRxiv. Science has an introductory post about it from November 2013 (2 days after it was announced).

bioRxiv is housed and run by Cold Spring Harbor Laboratory. Like arXiv, it is an open access preprint server, and it has similar rules. You can learn more about the specifics (such as journal preprint policies) on the about page. Articles can be on almost any life sciences topic, from biochemistry to zoology, and from other fields, such as physics, if the research has direct relevance to life science. It will not, however, publish medical research such as clinical trials.

Articles are placed into three categories:

Articles in bioRxiv are categorized as New Results, Confirmatory Results, or Contradictory Results. New Results describe an advance in a field. Confirmatory Results largely replicate and confirm previously published work, whereas Contradictory Results largely replicate experimental approaches used in previously published work but the results contradict and/or do not support it.

The biological research community has asked for it, and here it is. Currently, there are only 200 or so manuscripts submitted, and a quick search for ‘retrovirus’ brings up only 3 results. But bioRxiv is only 6 months old. Keep an eye on it or, better yet, test it out and submit.

New to me old news: Dryad data repository

Recently, some major journals implemented new data archiving policies, including the American Naturalist:

“The American Naturalist requires authors to deposit the data associated with accepted papers in a public archive. For gene sequence data and phylogenetic trees, deposition in GenBank or TreeBASE, respectively, is required. There are many possible archives that may suit a particular data set, including the Dryad repository for ecological and evolutionary biology data (http://datadryad.org). All accession numbers for GenBank, TreeBASE, and Dryad must be included in accepted manuscripts before they go to Production. Any impediments to data sharing should be brought to the attention of the editors at the time of submission.”

Re: Dryad. This data repository is new to me. Though they are not particularly ‘data-rich’ at the moment (just over 1,000 data files), the idea and purpose behind Dryad seem to have been a long time coming. We have huge data repositories for sequence, structure and the like, but there are so many types of published data that reside only at individual journals… or on someone’s hard drive. Dryad’s purpose is to be a repository for these kinds of evolutionary and ecological data (among others).

It appears that the repository project only got started (or funded, anyway) about two years ago, though I could be wrong, and they’ve already made some headway. Nature journals now list Dryad as a recommended data repository. And as linked and quoted above, several other large journals now either require or recommend Dryad as a repository.

There is a quick video on how to submit data here. The search capabilities are a bit limited (for example, once you search you can’t alter the original search term without starting all over), but I’m sure with time and funding this will change.

Definitely check it out.

Friday SNPpets

Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

Open Access Publishing… funding it

Five universities, Harvard, Cornell, Dartmouth, UC Berkeley and MIT, have formed a compact to support open-access publishing by funding publishing fees. Many open access journals, because they do not charge readers, use the model of charging for publication. This could be a barrier to publishing in an OA journal, so the compact:

…supports equity of the business models by committing each university to the timely establishment of durable mechanisms for underwriting reasonable publication fees for open-access journal articles written by its faculty for which other institutions would not be expected to provide funds.

Open access isn’t free. Someone has to pay for it: the provider, the user, or someone else under some other model. I, personally, like the idea behind open access publishing of research. I believe it can be one model, among several, to make access to research free and available to help advance further research. Two years ago, the Consolidated Appropriations Act made it a requirement that NIH-funded research be published in such a manner that it was open after a certain period of time, a boon to open access publishing. Publishers of all stripes are attempting to develop ways to make research available to researchers, and to pay for it.

I look forward to seeing how these universities underwrite those fees and which other universities join the compact.

Adventures in publishing

A new open access journal, Ideas in Ecology and Evolution, has, well, opened. It’s published at Queen’s University in Canada.

“So?” you ask, “there are lots of journals starting up all the time.”

This one is different. It’s experimenting with a lot of things (ok, so there seem to be a lot of journals experimenting with the model lately). The subject matter is not research per se, but ideas. Having been to my share of ecology and evolution conferences and discussions, I can see this journal has opened itself up to some quite lovely discussions.

As explained by Bob O’Hara, there are some interesting review process experiments going on here too. Authors pay to get their ideas published, reviewers are paid, and reviewers are not anonymous and get to publish their views of the article as a companion piece. Bob discusses the issues we’ve all heard about the pros of anonymity (and they are valid ones), but this might work in this case. I also agree with Bob on one point: this structure (reviewers publishing their views) will indeed increase discussion, but I too would like to see some mechanism for a broader discussion. As it is designed now, it will be like watching TV pundits arguing the finer points of health policy, which I guess is informative, but I’d prefer something that opens the discussion of the article to more people. Something like PLoS has, which I think would actually work better in a journal of ideas like this.

Well, we’ll see. Right now there is nothing there but the editorial. I’ll be watching though.

hat tip: Coturnix

Open Access Publishing

If you haven’t already seen it, open-access publishing either just made a jump backward or forward. The not-so-open-access publisher Springer recently bought BioMed Central, the open access publisher. Open access publishing took a huge leap with the passage of a law last year that requires NIH-funded research to be open access and deposited in PubMed Central within 12 months of publication. The law has met resistance, though. Perhaps Springer saw the writing on the wall, so to speak, and decided that buying BioMed Central was a good move in a world where open-access publishing seems to be gaining ground. Or…? According to the BioMed Central FAQ about the buy, BioMed Central publishing will remain 100% open access.

Happy Open Access day! (Oct 14th)

According to this BioMed Central blog, tomorrow is Open Access day – how shall we celebrate? Maybe read an extra open access article or two? Attend one of the events? Blogging on Oct 14th about the importance of open access to you could win you a ‘bag of swag’ in the PLoS ‘Synchroblogging competition – get writing this weekend’. Another option is to join the BioMed Central Facebook fan club. As you can see, there are LOTS of ways to support Open Access – let us know what you do to celebrate this important movement.

Database "openness"

We train on publicly available databases and resources. For our purposes in deciding when to develop training, the definition is relatively straightforward: Can the academic researcher access the data without cost or license restriction? If the answer is yes, our next step is to determine if we can develop training materials based on the resource without cost or license restriction and to ask the providers specifically for permission to do so. We ask permission for several reasons: to let the developer know what we are doing, to verify the restrictions or lack thereof, to build good relationships, etc.

That first decision, “is it publicly available?”, would seem a relatively clear-cut criterion, but we have found that it isn’t always. There are several problems. Often, the ‘terms of use’ or copyright documentation is difficult to find on the web site or non-existent. Even when it is available, the terms, language and restrictions can vary quite a bit across databases, countries and even within a resource at times. Determining what “publicly available” is and which resource fits that definition can be less than simple, to say the least.

Open Access Evolution

Dr. Eisen at UC Davis has started a new blog theme on his “Tree of Life” blog called “Open Evolution” (open access publications, open source programs, etc) and has started with open access journals. He has listed a few open access journals (and there’s a good discussion in the comments about the difference between ‘open access’ and ‘free online access’ journals) and is asking if anyone knows of any others. He hasn’t asked for it yet, but I’ve got some ideas for open source/access phylogeny analysis programs and/or databases. I’ll post a few of those in the coming week or so, but for now here is a link to a list of such programs (some on this list I’m not sure are open source, I’ll cull these later too).

Navigating the literature

We have a slide we like to present at some trainings showing the rise in the amount of raw sequence data and number of complete genomes over the last 18 years. There is another slide we show that indicates the rise of the number of databases and analysis tools over the years as listed in the annual database issue of NAR. The number has been doubling every 4 years.
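To put “doubling every 4 years” in more familiar terms, here is a minimal sketch of the arithmetic, assuming smooth exponential growth (a simplification; the real counts from the NAR database issue will be lumpier). The function names are made up for illustration only.

```python
def annual_growth_rate(doubling_years):
    """Yearly fractional growth implied by a given doubling time."""
    return 2 ** (1 / doubling_years) - 1


def growth_factor(years, doubling_years):
    """How many times larger the count is after `years` of steady growth."""
    return 2 ** (years / doubling_years)


if __name__ == "__main__":
    # Doubling time of 4 years, as quoted for databases and tools
    print(f"yearly growth: {annual_growth_rate(4):.1%}")
    print(f"factor after 8 years: {growth_factor(8, 4):.0f}x")
```

Under that assumption, a 4-year doubling time works out to roughly 19% growth per year, or a fourfold increase every 8 years.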

Well, there is another slide we can show too, and this shows the growth of the number of abstract entries into PubMed over the last 20 years (from Hunter and Cohen, 2006). Like data and databases, the number of research articles published and indexed just keeps getting larger. This increase in number is both a bane and a boon to researchers. Of course, not only is the number of papers indexed growing; the amount of text is growing too (open access, etc.) and is about to grow even more with the signing of the new open access act. Searching, mining and making sense of all this literature is going to be a challenge; it is a challenge now.

