Tag Archives: RNA-Seq


Video Tip of the Week: UCSC Genome Browser Exon-only Mode

The team at UCSC Genome Browser continues to update their resources and offer new ways to find and visualize features of interest to researchers. One of the newer features is the “multi-region” option. When it first launched, I did a tip on how to use it, covering some of the things I noticed while testing it pre-launch. But now the folks at UCSC have their own video on the exon-only display that you might also find useful.

One of the things illustrated here is how handy the exon-only mode is for exploring RNA-Seq data. The video uses a great ENCODE data set as an example, and if you haven’t been using that collection, it’s a good reminder of the kinds of things you can still find in that resource. This extensive data set also shows how much easier it is to look at different isoforms in the new exon-only mode.

So have a look at this display option if you haven’t before, especially at how it can help you see transcript differences. If you aren’t familiar with the ENCODE data being used, you can also see our training on it, which will help you understand how to use that data and the filtering features that appear in this video.

Special note: I have updated the UCSC Intro slides to include the new Gateway strategies as well. So download those slides for the latest look. 


Disclosure: UCSC Genome Browser tutorials are freely available because UCSC sponsors us to do training and outreach on the UCSC Genome Browser.

Quick links:

UCSC Genome Browser: http://genome.ucsc.edu

UCSC Genome Browser training materials: http://openhelix.com/ucsc

ENCODE: http://www.openhelix.com/ENCODE2


Speir, M., Zweig, A., Rosenbloom, K., Raney, B., Paten, B., Nejad, P., Lee, B., Learned, K., Karolchik, D., Hinrichs, A., Heitner, S., Harte, R., Haeussler, M., Guruvadoo, L., Fujita, P., Eisenhart, C., Diekhans, M., Clawson, H., Casper, J., Barber, G., Haussler, D., Kuhn, R., & Kent, W. (2016). The UCSC Genome Browser database: 2016 update. Nucleic Acids Research, 44 (D1). DOI: 10.1093/nar/gkv1275

The ENCODE Project Consortium (2012). An integrated encyclopedia of DNA elements in the human genome. Nature, 489 (7414), 57-74. DOI: 10.1038/nature11247


Video Tip of the Week: expVIP, an Expression, Visualization, and Integration Platform

As I mentioned last week, I am watching a lot of farmers on Twitter talk about this year’s North American growing season. To get a taste of that yourself, try searching for #Plant16 + wheat. This is where the rubber of tractor tires and plant genomics hits the…well…rows. And just coincidentally I saw a story about this new plant genomics research tool–actually in the farming media.

It’s kind of nice to see plant bioinformatics get some recognition beyond the bioinformatics nerd community. The piece “New online tool helps predict gene expression in food crops” did a pretty good job of talking about the features of the expVIP tool, and I was eager to have a look.

expVIP stands for expression Visualization and Integration Platform. Although the emphasis here is plant data, it can be used for any species. A good summary of their project is taken from their paper (linked below):

expVIP takes an input of RNA-seq reads (from single or multiple studies), quantifies expression per gene using the fast pseudoaligner kallisto (Bray et al., 2015) and creates a database containing the expression and sample information.

And it can handle polyploid species–try that on some of the tools aimed at human genomics! They illustrate this with some wheat samples from a number of different studies. And then they use the metadata about the studies, such as tissues and treatment conditions, to show how it works with some great sorting and filtering options. They created a version of this for you to interact with on the web: Wheat Expression Browser. But you can create your own data collections with their tools, aimed at your species or topics of interest.
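
To make the pipeline quoted above a little more concrete, here is a minimal Python sketch of its last step: taking kallisto's tab-separated abundance output and storing it in a small database next to the sample information. It assumes kallisto has already been run (for example with `kallisto quant -i index.idx -o out_dir reads_1.fastq reads_2.fastq`). The SQLite tables, the `load_abundance` helper, and the sample values are all hypothetical illustrations, not expVIP's actual schema or code.

```python
import csv
import io
import sqlite3

# Hypothetical excerpt in the column layout of kallisto's abundance.tsv
# (target_id, length, eff_length, est_counts, tpm). Values are made up.
ABUNDANCE_TSV = (
    "target_id\tlength\teff_length\test_counts\ttpm\n"
    "transcript_A\t1200\t1021.5\t340\t12.5\n"
    "transcript_B\t800\t621.5\t15\t0.9\n"
)


def load_abundance(conn, sample, tissue, tsv_text):
    """Store one sample's metadata and its per-target TPM values.

    The two-table layout is an assumption for illustration only.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples (name TEXT PRIMARY KEY, tissue TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS expression "
        "(sample TEXT, target_id TEXT, tpm REAL)"
    )
    conn.execute("INSERT INTO samples VALUES (?, ?)", (sample, tissue))
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = [(sample, r["target_id"], float(r["tpm"])) for r in reader]
    conn.executemany("INSERT INTO expression VALUES (?, ?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
n_loaded = load_abundance(conn, "sample_1", "leaf", ABUNDANCE_TSV)
```

Once several samples are loaded this way, joining `expression` to `samples` gives expression values that can be sliced by tissue or any other metadata column you choose to record.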

This week’s Video Tip of the Week is their sample of how this Wheat Expression Browser works. Although you see the wheat data here, it’s just an example of how it can work with any species you’d like to examine.

I followed along and tried what they were showing in the video, and I found it to be a really slick and impressive way to explore the data. The dynamic filtering and sorting were really nice. You can customise the filtering/sorting/etc for the visualizations with the metadata that’s useful to your research. So you could set the tissue types, or treatment conditions, or whatever you want, and filter around to look at the expression with those. They go on to show that their strategies for comparing genes in different situations seemed to reflect known biology in disease and abiotic stress conditions.
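
If you want a feel for what metadata-driven filtering amounts to outside the browser, here is a rough Python sketch. It keeps only the samples that match some metadata criteria and then averages expression within each group. The sample records, field names, and TPM values are made up for illustration, and the two helper functions are hypothetical, not part of expVIP.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples: metadata plus a TPM value for one gene (made-up numbers).
SAMPLES = [
    {"study": "study_1", "tissue": "leaf", "treatment": "control", "tpm": 10.0},
    {"study": "study_1", "tissue": "leaf", "treatment": "drought", "tpm": 30.0},
    {"study": "study_2", "tissue": "root", "treatment": "control", "tpm": 4.0},
    {"study": "study_2", "tissue": "leaf", "treatment": "drought", "tpm": 50.0},
]


def filter_samples(samples, **criteria):
    """Keep samples whose metadata matches every key=value criterion."""
    return [s for s in samples if all(s.get(k) == v for k, v in criteria.items())]


def mean_tpm_by(samples, field):
    """Group samples by one metadata field and average TPM within each group."""
    groups = defaultdict(list)
    for s in samples:
        groups[s[field]].append(s["tpm"])
    return {key: mean(values) for key, values in groups.items()}


# Restrict to leaf tissue, then compare treatments across studies.
leaf_only = filter_samples(SAMPLES, tissue="leaf")
leaf_by_treatment = mean_tpm_by(leaf_only, "treatment")
```

Swapping `tissue="leaf"` for any other field, or grouping by `"study"` instead of `"treatment"`, mirrors the kind of re-slicing the interactive filters let you do with a click.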

So their pipeline for gene matching, as well as the tools to explore and visualize RNA-Seq data, offers a great way to look at data that you generate yourself, or that you mine from existing submitted data that might not yet be well organized and available in a handy database.

Quick links:

Wheat expression browser: www.wheat-expression.com

expVIP at GitHub: https://github.com/homonecloco/expvip-web


Philippa Borrill, Ricardo Ramirez-Gonzalez, & Cristobal Uauy (2016). expVIP: a customisable RNA-seq data analysis and visualisation platform. Plant Physiology, 170, 2172-2186. DOI: 10.1104/pp.15.01667

Friday SNPpets

This week’s SNPpets include RNAMiner for mining RNA-seq data and MarkerMiner for angiosperms, who qualifies to be a bioinformatician, how to attract women to scitech careers, dangers of default parameters, 10 simple rules to win a Nobel Prize, and more….

Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…


Genomics Impact on Infectious Disease (with video)

As part of the Genomics in Medicine Lecture Series from NHGRI, Jonathan Zenilman gave a lecture on the ways that new genome technology has helped clinicians diagnose, manage, and treat infectious disease. This lecture series is delivering a number of videos on various intersections of clinical medicine and genomics, not just on the basic research in the field.

The opportunities to study previously murky microbial situations, including unculturable organisms and mixed colonies of microbes that can invade wounds, were really interesting (but thinking about the community in a brain abscess was…er…not suitable for lunchtime viewing perhaps). Not only were these situations nearly impossible to understand before, but clinicians couldn’t tell whether there were resistant organisms in there. So it seriously affected treatments.

An interesting data point: in brain abscesses, using standard culturing techniques, they identified 22 bugs. Using PCR they found 72! Some were unknown, too. One patient had 16 strains. That’s quite a battle.

One side effect of these new strategies, though, is that it freaks out hospital administrators. Because of the increased sensitivity of the tests, there are suddenly a lot more organisms reflected on their reports.

The new techniques are really going to help in the treatment of chronic wounds. One important point was that checking the RNA-seq data is key, because it’s important to know which transcriptomes are active. Dead bugs complicate the analysis, so knowing which ones are currently alive and affecting the wound is crucial.

Another important outcome of this work would be getting pointers to more pathogen-directed therapies. Broad-spectrum treatments are causing problems of their own, and having more precise ways to target the bad bugs would really be worthwhile.

I’ve attached a sample of the kinds of data that Zenilman and colleagues have published on the types of work he describes in this lecture, but you can find many more examples. I wanted to choose an open access example though, so this is the one I include.


Price, L., Liu, C., Melendez, J., Frankel, Y., Engelthaler, D., Aziz, M., Bowers, J., Rattray, R., Ravel, J., Kingsley, C., Keim, P., Lazarus, G., & Zenilman, J. (2009). Community Analysis of Chronic Wound Bacteria Using 16S rRNA Gene-Based Pyrosequencing: Impact of Diabetes and Antibiotics on Chronic Wound Microbiota. PLoS ONE, 4 (7). DOI: 10.1371/journal.pone.0006462

ENCODE RNA-Seq data standards–we’re gonna need ’em

I just got an important email from the ENCODE announcement mailing list at UCSC Genome Browser. I haven’t had time to go through the document yet as I’m packing for a trip, but I think the PDF will make some fine airplane reading!

The ENCODE Consortium has finalized ‘Standards, Guidelines and Best Practices for RNA-Seq V1.0’, as part of the Consortium’s continuing effort to generate data standards. The document is available at the ENCODE portal here:


“RNA-Seq is a directed experimental approach aimed at characterizing transcription in biological samples. This document presents a set of guidelines and standards focused on best practices for creating ‘reference quality’ transcriptome measurements.”

It was followed by a direct link to the RNA-Seq PDF document: http://encodeproject.org/ENCODE/protocols/dataStandards/ENCODE_RNAseq_Standards_V1.0.pdf

I think it’s going to be interesting to read this, as I was just considering RNA-Seq data the other day when Stephen Turner started a discussion on some hot news:

@genetics_blog Wish I could read ($ub) MT @GenomeWeb: Transcript abundance substantially disagrees btwn RNA-seq expts w/ same platform http://bit.ly/kfShZH

And we replied with this:

@OpenHelix: @genetics_blog Refers to this paper http://bit.ly/l9akCF

The paper is about technical variability in RNA-Seq data from the same samples prepared the exact same way. I think it’s going to be important to be aware of the variability in this data as we explore it. And I’m sure the ENCODE consortium folks will have a look at this paper and consider that information.

One of the great things about the ENCODE consortium working to develop standards is that these great big data sets are available for all of us to look at, and there are people charged with evaluating the technology and the methods to get the most out of them.

If you aren’t familiar with the ENCODE project and data sets, please have a look at our ENCODE tutorial materials, which are freely available because they are sponsored by the UCSC ENCODE team. We cover the project framework, how to find the data, and some important aspects of interacting with it.