Tip of the Week: Introduction to R Statistical Software (with video)

This week’s video tip is different from our usual tips in several ways. First, you won’t hear me: this webinar was done by Heather Merk of Ohio State. We also usually highlight web-based tools, and this presentation on R statistical computing tools relies on the command line. And it’s longer than our usual tips, because this introduction to R covers a lot of detail. Note: this first YouTube video is just 1 of 7 that they have created from the whole webinar. You can watch them in series at YouTube, or go over to the webinar page at the UMass Extension site for all of them, plus links to slide downloads, the sample data sets you can use, and references and helpful links to many other resources.

But if you have been hearing about biologists who are using R for their work, or you’ve been seeing it mentioned in papers, and you’ve been wondering how and why to get started with R, this will be worth your time. It’s a gentle introduction to where to get R, how to start interacting with it, some tips on formatting your stuff, and links to additional resources for help using R.

R is both a programming language and an analysis environment for statistical computing. It has broad utility and is being widely adopted in biology and bioinformatics. This webinar focuses on plant breeding data, but you’ll see that it could be a terrific analysis platform for many of the data types you might use.

Although, as I mentioned, this material focuses on using R via the command line, you can also interact with R on the web. In our Galaxy workshops many people were delighted to see that R tools had been incorporated into the Galaxy interface; you can see them at the bottom of the tools column on the left. Ross Lazarus of Harvard has been integrating R features into the Galaxy interface. The overall goal of Galaxy is to create a web-based toolkit that gives researchers access to many tools they would otherwise have to use at separate sites or install themselves. The reference below illustrates multiple strategies in Galaxy, and Ross was one of its authors.

R is certainly not just for biologists, though. This recent tweet from a bioinformatics practitioner illustrates its range:

That just illustrates the range of things that are possible. There is probably something you ought to be doing in R to analyze your data. Have a look at the webinar and see if it would be useful to you.


Quick links:

Webinar full details page: Introduction to R Statistical Software: Application to Plant Breeding Webinar presented by Heather Merk.

R Project for Statistical Computing: http://www.r-project.org/

R Genetics + Galaxy: http://rgenetics.org/trac/rgalaxy

RStudio: http://www.rstudio.com/


For numerous helpful documents about R, and specifically about its applications in biology, see the "reference + external links" section of the webinar page.

Tip of the Week: RGenetics at Galaxy

About 6-7 months ago, Mary mentioned that R-Genetics analysis was coming to Galaxy. Well, it has now arrived and is available at the public Galaxy site. The old Rgenetics site links to the new one and to the information about using Galaxy as a wraparound interface for the Rgenetics project tools. Today’s tip just points you to the tool and gives you a quick overview of what is there. You’ll need to do some exploring to learn to use it! Of course, we have our publicly available Galaxy tutorial to get you started.

(Oh, and I point you to this tutorial on analyzing Desmond Tutu’s SNPs using Galaxy, which I thought was interesting.)