Tag Archives: education

Video Tip of the Week: DNA Subway

At a recent training workshop on the UCSC Genome Browser, I spoke with an educator who is using a custom local installation of the browser to work with students on bioinformatics lessons. It’s a project called the Genomics Education Partnership at WashU, and students learn by annotating regions of the genome with bioinformatics tools. You can see the team’s installation of the browser here. It sounded like an enjoyable, effective method that is genuinely useful to students.

So the other day when I was exploring some of the resources available from the iPlant Collaborative, I was reminded of the annotation educational method by their very cool DNA Subway project. It’s another strategy to educate students with genome annotation tools–but I also think there might be some scientists who might want to use it beyond formal educational settings. It’s not new–I can remember reading about it in the past, but looking at it again with fresh eyes after that other conversation was worthwhile. And they’ve added new features since I last explored.

Student annotation projects are widespread, and there are probably numerous different successful strategies that local folks have implemented to set this up. But I suspect that more folks who are teaching bioinformatics might find the workflow infrastructure of the DNA Subway system a useful mechanism to use themselves, rather than setting up their own. So this week’s video tip of the week highlights the DNA Subway. Oh–and by the way: just because it’s at iPlant doesn’t mean it’s restricted to plants. You can go over there and see the various species options.

The providers of the Subway describe it as:

“DNA Subway makes high-level genome analysis broadly available to students and educators and provides easy access to the types of data and informatics tools that drive modern biology. Using the intuitive metaphor of a subway map, DNA Subway organizes research-grade bioinformatics analysis tools into logical workflows and presents them in an appealing interface.”

I thought this was a really effective way to conceptualize the tasks that need to occur on a project. And it’s integrated with the tools you need at each “stop” to accomplish the tasks. The new “green line” in Beta that they have created isn’t shown in the video, but you should have a look at the site. It’s got tools for NGS RNA-seq data analysis, integrating the Tuxedo workflow protocol that includes TopHat, Bowtie, and Cufflinks, and is a really good thing for students to be exposed to. If you go over to the DNA Subway site itself and choose the “green line” to explore, you can see more information.

I can’t seem to embed their video, so I’d recommend you look at the larger size version on a separate page, and to go over and have a look for yourself at the DNA Subway.

Go over to their site by clicking on the image to access the video.

Quick links:

DNA Subway main description page: http://www.iplantcollaborative.org/discover/dna-subway

DNA Subway installation: http://dnasubway.iplantcollaborative.org/

DNA Subway video tour (larger size): http://dnasubway.iplantcollaborative.org/files/tour/index.html


Goff S.A., Vaughn M., McKay S., Lyons E., Stapleton A.E., Gessler D., Matasci N., Wang L., Hanlon M., Lenards A. et al. (2011). The iPlant Collaborative: Cyberinfrastructure for Plant Biology. Frontiers in Plant Science, 2, 34.

Trapnell C., Roberts A., Goff L., Pertea G., Kim D., Kelley D.R., Pimentel H., Salzberg S.L., Rinn J.L. & Pachter L. (2012). Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nature Protocols, 7 (3), 562-578.

On a Mission for Protein Information

It’s probably just the human brain’s ability to connect dots & find patterns, but it can be interesting how many “unrelated” events and information bits accumulate in my head & eventually get mulled into an idea or theory. Take, for example, a recent biotech mixer, bits from an education leadership series & a past Nature article – each “event” has been meandering in my mind and now they are finding their way out as this blog post.

OK, now the explanation: At a recent local biotech event I heard about a company (KeraNetics) purifying keratin proteins & using them to develop therapeutic and research applications. The company & their research sounded very interesting & because a lot of it is aimed at aiding wounded soldiers, it also sounded directly beneficial. The talk was short, only about 20 minutes, so there wasn’t a lot of time for details or questions. I decided I’d venture forth through many of the bioscience databases and resources that I know and love, in order to learn more about keratin.

My quest was both fun and frustrating because of the nature of the beast – keratin is “well known” (i.e. it comes up in high school academic challenge competitions ‘a lot’, according to someone in the know), but it is hard to work with (i.e. tough, insoluble, fibrous structural proteins), and it is hard to find much general information on it in your average protein database (because it is made of many different gene products, all referred to as “keratin”). I decided to begin my adventure at two of my favorite protein resources, PDB & SBKB, but I found no solved structures for keratin. Because of the way model organism databases are curated and organized, I often begin a protein search there, just to get some basic background, gene names, sequence information, etc. I (of course) found nothing other than a couple of GO terms in the Saccharomyces Genome Database (SGD), but I found hundreds of results in both Mouse Genome Informatics (MGI) (660 genomic features) and Rat Genome Database (RGD) (162 rat genes, 342 human genes). I also found gene names (Krt*), sequences and many summary annotations with references to diseases with links to OMIM. When I queried for “keratin”, OMIM returned 180 hits, including 61 “clinical synopses”; UniProt returned 505 reviewed entries and 2,435 unreviewed entries; Entrez Protein returned 10,611 results; and PubMed returned 26,430 articles, including 1,707 reviews. I sated my curiosity about KeraNetics’ research by using a PubMed advanced search for keratin in the abstract or title & the PI’s name as author (search = “(keratin[Title/Abstract]) AND Van Dyke[Author]”).
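For anyone who would rather script that last PubMed search than click through the advanced search form, here is a minimal sketch in Python. It only builds the query string quoted above and a matching URL for NCBI's E-utilities ESearch service; it does not send any request. The function names and the `retmax` value of 20 are my own choices for illustration, not anything I used in the search described above.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities ESearch service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(word: str, author: str) -> str:
    """Combine a title/abstract word and an author name using PubMed field tags."""
    return f"({word}[Title/Abstract]) AND {author}[Author]"


def build_esearch_url(query: str, retmax: int = 20) -> str:
    """Return an ESearch URL for the query (retmax=20 is an arbitrary example value)."""
    params = {"db": "pubmed", "term": query, "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


query = build_query("keratin", "Van Dyke")
url = build_esearch_url(query)
print(query)
print(url)
```

Pasting the printed URL into a browser should return an XML list of matching PubMed IDs, though the number of hits will differ from what I saw, since PubMed keeps growing.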

I ended up with a lot of information leads that I could have hunted through, but it was a fun process in which I learned a lot about keratin. This is where the education stuff comes in. I’ve been seeing a lot of studies go by talking about reforming education to be more investigation driven, and I can totally see how that can work. “Learning” through memorization & regurgitation is dry for everyone & rough for the “memory challenged”, like me. Having a reason or curiosity to explore, with a new nugget of data or understanding lurking around each corner, the information just seems to get in better & stay longer. (OT, but thought I’d mention a related site that I found today w/ some neat stuff: Mind/Shift-How we will learn.)

And I could have done the advanced PubMed search in the beginning, but what fun would that have been? Plus there is a lot that I learned about keratin from what I didn’t find, like that there wasn’t a plethora of PDB structures for keratin proteins. That brings me to the final dot in my mullings – an article that I came across today as I worked on my reading backlog: “Too many roads not taken”. If you have a subscription to Nature you can read it, but the main point is that researchers are still largely focusing on the same set of proteins that they have been for a long time, because these are the proteins for which there are research tools (antibodies, chemical inhibitors, etc). This same sort of philosophy is fueling the Protein Structure Initiative (PSI) efforts, as described here. Anyway, I found the article interesting & agree with the authors’ general suggestions. I would however extend it beyond these physical research tools & say that going forward researchers need more data analysis tools, and training on how to use them – but I would, wouldn’t I? :)


  • Sierpinski P, Garrett J, Ma J, Apel P, Klorig D, Smith T, Koman LA, Atala A, & Van Dyke M (2008). The use of keratin biomaterials derived from human hair for the promotion of rapid regeneration of peripheral nerves. Biomaterials, 29 (1), 118-28 PMID: 17919720
  • Edwards, A., Isserlin, R., Bader, G., Frye, S., Willson, T., & Yu, F. (2011). Too many roads not taken. Nature, 470 (7333), 163-165 DOI: 10.1038/470163a

News about the Integrated Microbial Genomes (IMG) resource

I’ve got a few news items regarding IMG, or Integrated Microbial Genomes, from the DOE Joint Genome Institute. The first item is that their Sept 2010 release occurred this week. IMG is now on version 3.2, has updated features and a bunch of new/revised genomes. I’ve begun updating our tutorial & will let you know when that is released. It’s not the craziest level of tool changes that I’ve seen from this group, but dang, they SURE don’t rest on their laurels! They are constantly changing and improving their interface and database.

If you are involved in microbial research and haven’t already checked out this powerful resource, I strongly suggest that you do. We’ve been training on this resource since 2006 and really believe in its value, which seems to increase with each of their releases. Mary & Trey presented an IMG workshop at NIH recently and it was surprising how many of their researchers were not aware of IMG. We hear that pretty often and it is too bad, because it has so much to offer the microbial community and others as well.

The second item is that IMG has an annotation tool specifically designed for undergraduate education. Iddo Friedberg describes this as ‘Way cool’ in a recent tweet. The program/interface is named the “Integrated Microbial Genomes Annotation Collaboration Toolkit (IMG-ACT)”, and is somewhat associated with the “Interpret a GEBA Genome for Education” project from JGI. “GEBA” stands for Genomic Encyclopedia of Bacteria and Archaea. Both efforts are aimed at encouraging undergraduate research in microbial genome annotation, which might lead to the ‘alternative science career’ as a biocurator!

You can read all about the tool in their PLoS Biology article “Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum”, or see a tour of the program/interface here. The tour makes the interface seem a bit clunky to me, but well thought out with lots of solutions to problems/issues often associated with undergraduate classes. The paper really provides a nice overview of the concept, collaborations, and initial outcomes of the 2008-2009 program.

Sign-ups are occurring for the 2011-2012 version of the program. The time frame is as follows:

Timeline to Participate:
1. Apply to be part of the 2011-2012 team by Monday, November 5, 2010 (download the application)
2. After acceptance, attend the workshop at the JGI (January 2011)
3. Implement in 2011-2012 academic year

as can be seen at the bottom of this page.

IMG-ACT Reference:
Ditty, J., Kvaal, C., Goodner, B., Freyermuth, S., Bailey, C., Britton, R., Gordon, S., Heinhorst, S., Reed, K., Xu, Z., Sanders-Lorenz, E., Axen, S., Kim, E., Johns, M., Scott, K., & Kerfeld, C. (2010). Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum. PLoS Biology, 8 (8) DOI: 10.1371/journal.pbio.1000448

Briefings in Bioinformatics – our education paper is available now

Back in April I happened to mention that we (OpenHelix) were writing a paper on informal sources of bioinformatics education (in a Friday SNPets item) and we were asked to announce when the paper came out. Well, we got word late last week that the article has been published. The article appears in a special issue of Briefings in Bioinformatics that is devoted to bioinformatics education. I’m not sure if all the articles in the issue are available yet, but it looks like several are in the journal’s Advanced Access area. Bioinformatics education is an area (obviously) that OpenHelix cares deeply about & we are anxiously awaiting our copies of the full issue so we can read all the articles, but I digress…

The title “OpenHelix: bioinformatics education outside of a different box” (if you hit a paywall, or have trouble accessing it, we will gladly send a reprint. Just email the corresponding author, Jennifer, listed in the abstract, or ask via our contact link. -Trey) was a cool suggestion from one of the article’s reviewers – my original title was much tamer (ok, more boring). Regardless of the final title, what we wanted to do in the article is to discuss informal sources of bioinformatics education. By education we mean acquiring applicable information that allows a researcher to operate within the field of bioinformatics. By informal we mean outside of traditional, credit-based classes and degrees. Essentially we provide a bit of the knowledge and know-how that we’ve gathered over years of working with hundreds of resources, thousands of workshop attendees, and countless online contacts about where a researcher, or librarian, or whoever can turn for various informational needs in the field of bioinformatics.

Our contention is that not everyone needs to program in order to manage and manipulate their biological data these days. There are SO many fine publicly available databases, algorithms, tools and more that it is just a matter of awareness and training for anyone to be able to reformat and analyze their personal data sets. We maintain that:

…bioinformatics education needs to do a minimum of four things:

1. raise awareness of the available resources
2. enable researchers to find and evaluate resource functionality
3. lower the barrier between awareness and use of a resource
4. support the continuing educational needs of regular resource users

In the paper we walk through each of these – we first describe example needs associated with the point, and then cover possible informal resources that meet the needs. The article includes tables of resources and links to them, and many, many references. We really hope that it is a very useful resource in the field of bioinformatics education. I am already looking forward to contributing to the next special education issue, both to hone my writing skills and to extend the information we can provide readers. Please do comment, email, whatever and let us know about the resources that you use, what you learned from the article, etc. Oh, here’s the citation info:
Williams, J., Mangan, M., Perreault-Micale, C., Lathe, S., Sirohi, N., & Lathe, W. (2010). OpenHelix: bioinformatics education outside of a different box. Briefings in Bioinformatics DOI: 10.1093/bib/bbq026

SciVee group

I’ve mentioned this before, but as I am trying to get this week’s tip ready, I thought I’d remind our readers that we have a community over at SciVee (YouTube for science :): Genomics Resource Training. We post all our tips there now, and we add videos from other users that train people on genomics resources. We have about 2-3 dozen videos in our community now. Come on over and join!

Education at NCBI

I’d like to point out the new NCBI Education page. There is a lot there that you might want to check out. NCBI will be, starting this fall, offering a series of two-day training courses they are calling Discovery Workshops. Two years ago they ended the NCBI Field Guide workshops, so this seems to be a welcome change.

There are also webinars. Our research suggests that webinars are not particularly popular, so I’m curious how these turn out. There are also ‘how-to’ guides, documentation, community, and teacher resources. It’s quite a nice site with lots of things to check out.

I’d also like to point out the “recommended links” section. There are lots of links to additional educational resources like the Cold Spring Harbor’s Dolan DNA Learning Center and much more. And, incidentally :), a link to our own free tutorials, which was very nice to see. You might want to check those out; we have over 10 including PDB, SGKB, UCSC Genome Browser, Galaxy, several model organism databases, and more.

Happy National Lab Day (tomorrow)!

Speaking of outreach, I just saw a tweet from the NSF reminding me that tomorrow is the culmination of National Lab Day!

From the press release:

The National Science Foundation has launched a Web-based resource for scientists seeking guidance on how to effectively interact with K-12 teachers as part of National Lab Day, a grassroots effort to invigorate science education.

National Lab Day is a volunteer effort designed to form local “communities of support” around science, technology, engineering and mathematics (STEM) teachers and to connect them with STEM professionals who will share their expertise as well as their excitement and passion for their disciplines.

Point those teachers and kids to these resources. Scientists–volunteer for projects. Let’s get ‘em trained up in science. We’re gonna need ‘em. I wish I had heard about the volunteer opportunities earlier. But it looks like some of them are still ongoing, as this isn’t supposed to be limited to one specific day. Maybe there’s something there…

Science Rocks!

That was the word I got back from a kid who was assembling a plastic model of a virus that was donated to his science class. I also got this fabulous diagram of a plant cell from another kid. There were some funny and sweet thank you notes in the mail for me as one of the donors to the project. The project was called “Fill Our Display Case” and you can see the kids building the models in the photos over there.

A friend on another blog has been running a weekly project to find worthy science classroom projects on the DonorsChoose site, and to try to raise funds for these classes. A lot of people complain about science education today–but don’t know what to do about it, and she’s found a way to make a difference directly to teachers who want to change it up.

Some people use the holidays as a time to focus their giving to meritorious projects and causes. If you are considering holiday donations please think about the DonorsChoose projects. You get to pick projects you like, and you can see the delight of the teachers and kids who benefit. It just struck me what a great way it is to find science educators who need support, and directly touch the lives of kids with science! And now I have very nice artwork to hang in my office :)

Tip of the Week: The National Center for Biomedical Ontology

Anyone who has either used or helped to create a database of biological information has probably come across ontological terms. In today’s tip I feature a great resource devoted to promoting the creation and proper use of ontologies. The resource is the National Center for Biomedical Ontology, and it allows users to learn about ontologies, find and use ontologies that are already in existence, and even add newly developed ontologies to the resource so others might use them.

Ontologies are basically organized sets of controlled vocabulary terms that are applied in a uniform manner across diverse collections of information. They are important because of their ability to make abstract biological terms computer searchable. They also aid in the interpretation of biological information by researchers because each term includes a definition of how and when it should be applied to biological information. In this tip I briefly touch on finding ontologies, and on the educational resources available from the NCBO and BioPortal web sites.
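To make the idea a bit more concrete, here is a minimal Python sketch of a toy ontology: each term has an ID, a name, a definition, and links to its parent terms, and a search for a broad term also picks up records annotated with its more specific children. Everything in it is made up for illustration, including the `EX:` term IDs, the definitions, the gene names, and the helper functions; none of it comes from NCBO, BioPortal, or a real ontology.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Term:
    term_id: str
    name: str
    definition: str                                    # how and when the term should be applied
    parents: List[str] = field(default_factory=list)   # IDs of broader ("is a") terms


# A hypothetical three-term ontology, from broad to specific.
ONTOLOGY: Dict[str, Term] = {
    "EX:0001": Term("EX:0001", "cytoskeleton", "Example definition of a broad term."),
    "EX:0002": Term("EX:0002", "intermediate filament",
                    "Example definition of a narrower term.", ["EX:0001"]),
    "EX:0003": Term("EX:0003", "keratin filament",
                    "Example definition of the most specific term.", ["EX:0002"]),
}


def ancestors(term_id: str, ontology: Dict[str, Term]) -> List[str]:
    """Walk the parent links and return every broader term, nearest first."""
    seen: List[str] = []
    stack = list(ontology[term_id].parents)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(ontology[current].parents)
    return seen


def annotated_with(term_id: str, annotations: Dict[str, str],
                   ontology: Dict[str, Term]) -> List[str]:
    """Return records annotated with term_id itself or with any more specific term."""
    return sorted(
        record for record, annotated_term in annotations.items()
        if annotated_term == term_id or term_id in ancestors(annotated_term, ontology)
    )


# Hypothetical annotations: record name -> term ID.
ANNOTATIONS = {"geneA": "EX:0003", "geneB": "EX:0001", "geneC": "EX:0002"}

print(ancestors("EX:0003", ONTOLOGY))
print(annotated_with("EX:0002", ANNOTATIONS, ONTOLOGY))
```

The point of the sketch is the second function: because the terms are organized rather than free text, a search for “intermediate filament” can find a record that was only ever labeled “keratin filament”.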

Tip of the Week: Genomic Encyclopedia of Bacteria & Archaea (GEBA)

Because it is summer, and because of a strangely slow connection and some other factors, I am embedding a talk from Doug Ramsey (posted on SciVee) on the GEBA project at JGI (instead of doing a tip myself :). The GEBA project recognizes that many, if not most, of the bacterial and archaeal genomes that have been sequenced to date have some relevance to human disease or other human interest. This of course is reasonable, but it also leads to big gaps in our knowledge of bacterial evolution and genomics, knowledge that would help us better understand those genomes that we find relevant and knowledge that in and of itself can be quite interesting and potentially useful. View the talk to learn more about this project to sequence 100 phylogenetically diverse bacterial and archaeal genomes.

I’m also posting this as an introduction to JGI’s Adopt a Genome project. This project allows student groups to adopt and study a bacterium in the GEBA project and hopefully add to our knowledge and annotations of the genome while learning. The students can then annotate the adopted genome by using IMG-ACT.