Video Tip of the Week: DNA Subway

At a recent training workshop on the UCSC Genome Browser, I spoke with an educator who is using a custom local installation of the browser to work with students on bioinformatics lessons. It’s a project called the Genomics Education Partnership at WashU, and students learn by annotating regions of the genome with bioinformatics tools. You can see the team’s installation of the browser here. It sounded like an enjoyable and effective method, and a useful one for students.

So the other day when I was exploring some of the resources available from the iPlant Collaborative, their very cool DNA Subway project reminded me of that annotation-based teaching method. It’s another strategy to educate students with genome annotation tools, but I also think some scientists might want to use it beyond formal educational settings. It’s not new (I can remember reading about it in the past), but looking at it again with fresh eyes after that other conversation was worthwhile. And they’ve added new features since I last explored.

Student annotation projects are widespread, and there are probably numerous different successful strategies that local folks have implemented to set this up. But I suspect that more folks who are teaching bioinformatics might find the workflow infrastructure of the DNA Subway system a useful mechanism to use themselves, rather than setting up their own. So this week’s video tip of the week highlights the DNA Subway. Oh–and by the way: just because it’s at iPlant doesn’t mean it’s restricted to plants. You can go over there and see the various species options.

The providers of the Subway describe it as:

“DNA Subway makes high-level genome analysis broadly available to students and educators and provides easy access to the types of data and informatics tools that drive modern biology. Using the intuitive metaphor of a subway map, DNA Subway organizes research-grade bioinformatics analysis tools into logical workflows and presents them in an appealing interface.”

I thought this was a really effective way to conceptualize the tasks that need to occur on a project. And it’s integrated with the tools you need at each “stop” to accomplish the tasks. The new “green line” in Beta that they have created isn’t shown in the video, but you should have a look at the site. It’s got tools for NGS RNA-seq data analysis, integrating the Tuxedo workflow protocol that includes TopHat, Bowtie, and Cufflinks, and is a really good thing for students to be exposed to. If you go over to the DNA Subway site itself and choose the “green line” to explore, you can see more information.

I can’t seem to embed their video, so I’d recommend you look at the larger size version on a separate page, and to go over and have a look for yourself at the DNA Subway.

Go over to their site by clicking on the image to access the video.

What’s the Answer? Alternatives to Galaxy

BioStar is a site for asking, answering and discussing bioinformatics questions. We are members of the community and find it very useful. Often questions and answers arise at BioStar that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those questions and answers here in this thread. You can ask questions in this thread, or you can always join in at BioStar.

This week’s highlighted question is

“What are the alternatives to Galaxy for wrapping a command line tool in GUI?”

Or in other words, what workflow systems are out there in addition to Galaxy (a great tool, but sometimes people need something different :).  The answers to this question will help both bioinformaticists who create tools and biologists who use them, giving the former alternatives for doing this if need be and the latter other workflow systems to try out.

Several were highlighted, including Taverna, Yabi and Knime, and a list was provided from Wikipedia. Check out the answers for more examples.

Tip of the Week: BioCatalogue for finding web services

A couple of years back at a conference I was introduced to BioCatalogue. It seemed to me to be a really useful idea: locate bioinformatics tools and databases that are web-accessible and that also offer web services, so you can access the tool or server without going through the site’s main web interface. There are some introductions to the concept of web services out there; a few are written for beginners, but most are aimed at programmers. Essentially a web service is a kind of back door into the tool that lets you pull out the information you need in the way that you want, not constrained by the main user interface.
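If you have never used a web service, here is a minimal Python sketch of the idea: instead of clicking through a web form, a script builds a query URL and reads back structured data. The endpoint `https://example.org/api/genes` and its `symbol` and `species` parameters are made up for illustration. They are not the API of BioCatalogue or of any real provider, so check a service’s own documentation for the actual URL and parameters.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint (an assumption for this sketch, not a real service).
BASE_URL = "https://example.org/api/genes"


def build_query_url(base_url, **params):
    """Return base_url plus the parameters encoded as a query string.

    Parameters are sorted by name so the resulting URL is predictable.
    """
    return base_url + "?" + urlencode(sorted(params.items()))


def fetch_json(url, timeout=10):
    """Request the URL and decode the response body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Build the request a script would send; no web form involved.
url = build_query_url(BASE_URL, symbol="BRCA1", species="human")
print(url)  # https://example.org/api/genes?species=human&symbol=BRCA1

# Against a real service you would then call:
# records = fetch_json(url)
```

The call to `fetch_json` is left commented out because the example address does not host a real service.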

BioCatalogue is a curated collection of these web services. The creators of BioCatalogue provide the framework and perform some of the collection and annotation, but they also enable the user community to bring in web services and annotate them as well. This means that you can use BioCatalogue to find and learn more about the services, and you can feed back into the system as well if you join the community. If you are a software provider you can register your service there, so more people can locate you and learn about your project. Another really nice aspect of BioCatalogue is that they monitor the services. As we know at OpenHelix, plenty of times a tool you have accessed in the past is suddenly unavailable. Sometimes the cause is an intermittent server problem, but sometimes it is a longer-term issue. BioCatalogue regularly checks the status of the tools, so you can have confidence that a tool has been up and seems stable.

The Web Server issue (see the 2009 issue here) of Nucleic Acids Research provides a wealth of information about useful servers with bioinformatics tools. And there’s a paper for the 2010 Server issue about BioCatalogue that will offer more details on the background (linked below). In this week’s movie I can only briefly introduce the site and the features available. Check out the paper from the BioCatalogue team, and explore the documentation wiki to learn more about the features and functions that are provided.

Now, these web services are not for everyone. For many people the main user interface will still be the best mechanism to access a tool. But if you need more advanced or customized queries, or if you want to create inflows into your own tools, or if you want to use some of the cool workflow software that’s out there now (such as Galaxy or Taverna), web services may be right for you.

Check out BioCatalogue  (and remember the -ue spelling!) http://www.biocatalogue.org/

Bhagat, J., Tanoh, F., Nzuobontane, E., Laurent, T., Orlowski, J., Roos, M., Wolstencroft, K., Aleksejevs, S., Stevens, R., Pettifer, S., Lopez, R., & Goble, C. (2010). BioCatalogue: a universal catalogue of web services for the life sciences. Nucleic Acids Research. DOI: 10.1093/nar/gkq394

Tip of the Week: Sharing your analysis process

We’ve introduced Galaxy (http://www.usegalaxy.org) before in the Tip of the Week section and have shown you one useful thing you could do with it, and now we also have a free tutorial and training materials that introduce you to the basic use of the tool. In today’s tip of the week, I’m going to show you workflows. The Workflows feature was in beta until recently, so it isn’t in the first version of our tutorial (though it will be in the second). It’s a great feature, so I wanted to introduce it here. Workflows allow you to set up an automated process that takes your data through a preset series of manipulation and analysis steps. This can be very useful if there is a process you run a lot and you don’t want to repeat each step by hand every time, or if you create an analysis process you’d like to share with a colleague. (I’d also like to point out that Galaxy has a good number of other short screencasts of tasks you might want to check out after doing the tutorial.)
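To make the "preset series of steps" idea concrete, here is a small Python sketch. It is not Galaxy’s API or how Galaxy stores workflows. It only illustrates the concept: each step takes the output of the previous one, and the saved list of steps can be re-run on new data or handed to a colleague. The step functions and the sample intervals are hypothetical.

```python
def run_workflow(steps, data):
    """Apply each step, in order, to the output of the previous step."""
    for step in steps:
        data = step(data)
    return data


# Hypothetical steps operating on (chromosome, start, end) intervals.
def keep_chr1(intervals):
    """Filter: keep only intervals on chr1."""
    return [iv for iv in intervals if iv[0] == "chr1"]


def sort_by_start(intervals):
    """Sort intervals by their start coordinate."""
    return sorted(intervals, key=lambda iv: iv[1])


def lengths(intervals):
    """Compute the length of each interval."""
    return [end - start for _, start, end in intervals]


# The "workflow" is just the ordered list of steps, defined once.
workflow = [keep_chr1, sort_by_start, lengths]

intervals = [("chr1", 500, 900), ("chr2", 100, 200), ("chr1", 100, 250)]
print(run_workflow(workflow, intervals))  # [150, 400]
```

Running the same `workflow` list on a different set of intervals repeats every step without redoing them by hand, which is the convenience a workflow gives you.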

Are you ready to create a workflow?

Yesterday I attended the final session of the ICSB conference that I could fit into my schedule: a session on web services in systems biology. (I would link to the description but the ICSB server is down while I write this…) There were several tools covered that I will address later (including one of our old favorites: Reactome. And Esther Schmidt showed me a trick to accomplish some teeny little thing that was making me crazy….Yea Esther!). But I wanted to get you thinking about using tools in workflow pipelines. This is not just for giant sequencing projects anymore!

Although there are a variety of tools that can let the average user in on this handy strategy, yesterday we heard specifically about the Taverna project. Taverna will let you pull re-usable modules of analysis tools into a series of actions that you can perform on your lists, or favorite sequences, or genomic regions, or whole genomes…and annotate, analyze, and process. Don’t be daunted by the look of that project page. We can help you to understand what to do and how to do it. But start to think about the series of things you might be doing from website-to-website as you do your research on genes of interest. Can you imagine a way to streamline that and set up a re-usable protocol to do that? I’ll bet you can….

More later on these types of services. But I’m off to Copenhagen today and won’t be online much until next week. Enjoy your weekend! Scandinavians seem to really understand the purpose of the weekend…