At the recent (and excellent) Beyond the Genome 2010 conference, Len Pennacchio gave a talk about the VISTA Enhancer Browser that reminded me how much I have always liked this project. It’s the kind of project I’d do if I had a lab: it combines the computational data we’ve been accumulating with developmental biology bench techniques to produce cool new insights into the function of conserved regions of the genome that we previously didn’t know much about.
The foundation of the project is that we now have genomic sequences from a number of species that we can compare. The VISTA suite offers several ways to perform these comparative genomics analyses and provides really nice visualizations of the data (we’ve got a free tutorial sponsored by VISTA that you can watch to see how it works). You can see peaks of high conservation across multiple species, which suggest there’s something important going on in that region. When those peaks fall outside of the gene region per se, it’s not always obvious what the sequence represents, but the idea is that they may be cis-regulatory elements. So the Enhancer Browser team clones out those regions and hooks them to reporter constructs. The constructs are placed into mouse oocytes, which are then put into pseudopregnant mice, and the embryos are examined on day 11 to see if there is an interesting pattern of expression of the reporter. Now, these experiments are subject to limitations: only one time point is examined, so earlier or later activity is not known. It’s also possible that the site where the construct integrates has affected expression (in positive or negative ways), but they examine multiple embryos for each construct to work around that position effect.
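To make the "peaks of conservation outside genes" idea a bit more concrete, here is a minimal Python sketch of a sliding-window identity scan over a pairwise alignment. This is not VISTA's actual code or API. The function names, the 100 bp window, and the 70% identity cutoff are all assumptions I picked for illustration, and the input is assumed to be two already-aligned sequences of equal length with `-` for gaps.

```python
# Illustrative sketch only: names, window size, and threshold are assumptions,
# not the VISTA implementation.

def conserved_windows(seq_a, seq_b, window=100, min_identity=0.7):
    """Return (start, end, identity) for each window at or above min_identity.

    seq_a and seq_b are two rows of a pairwise alignment (equal length,
    '-' marks a gap). Gap columns never count as matches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    hits = []
    for start in range(0, len(seq_a) - window + 1):
        end = start + window
        matches = sum(
            1 for a, b in zip(seq_a[start:end], seq_b[start:end])
            if a == b and a != "-"
        )
        identity = matches / window
        if identity >= min_identity:
            hits.append((start, end, identity))
    return hits


def merge_windows(hits):
    """Merge overlapping or touching windows into (start, end) regions."""
    merged = []
    for start, end, _identity in hits:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def outside_genes(regions, genes):
    """Keep regions that do not overlap any (start, end) gene interval."""
    return [
        (r_start, r_end)
        for r_start, r_end in regions
        if all(r_end <= g_start or r_start >= g_end for g_start, g_end in genes)
    ]


if __name__ == "__main__":
    # Toy alignment: the first 10 columns are identical, the rest mostly differ.
    human = "ACGTACGTACGTACGTACGT"
    mouse = "ACGTACGTACTTTTTTTTTT"
    regions = merge_windows(conserved_windows(human, mouse, window=5, min_identity=1.0))
    # Hypothetical gene annotation in alignment coordinates.
    print(outside_genes(regions, genes=[(12, 20)]))
```

The regions that survive the gene filter are the candidates you would clone into a reporter construct. Real pipelines work on multi-species alignments and proper genome coordinates, so treat this as a picture of the logic only.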
This data is accumulated and becomes available in the Enhancer Browser. You can search by genes of interest to see if a region near your favorite gene has been examined. Or you can browse by tissue/localization pattern if there’s a particular structure in the embryo you’re interested in. To get a quick sense of the kind of things you can find, take a look at the handy Gallery set of images. These different ways to search and browse the data are what I’ll be introducing in the Tip movie this week.
But they also “enhanced” this project by adding another technique to the process. Beyond the computational identification of conserved regions, they also began to do ChIP-Seq to pull down sequences that are bound to the p300 protein in various tissues of the embryo. That’s illustrated nicely in Figure 1 of the ChIP-seq paper (Visel et al., 2009). They obtain the sequence of those pieces and put those into eggs as well, and the rest of the process is similar. So the starting point is different: these are protein-bound sequences to start with, from a given tissue. But this approach also seems to be identifying working elements that can influence spatial and temporal expression of the reporter constructs. They say it has increased their success in finding working elements by 5x to 16x.
So I think this is a great way to use computational techniques and bench work in a pretty-big-data way. It’s not easy to do the mouse benchwork part so it’s not quite as big as a pure sequencing foray. But it’s exactly the kind of project I’d design if I had access to a lab. I have a different topic I’d be interested in, but the same kinds of strategies would be useful for that as well.
Anyway–explore the Enhancer Browser to learn more about these possible regulatory elements.
Visel, A., Blow, M., Li, Z., Zhang, T., Akiyama, J., Holt, A., Plajzer-Frick, I., Shoukry, M., Wright, C., Chen, F., Afzal, V., Ren, B., Rubin, E., & Pennacchio, L. (2009). ChIP-seq accurately predicts tissue-specific activity of enhancers. Nature, 457 (7231), 854-858. DOI: 10.1038/nature07730
Visel, A., Minovitsky, S., Dubchak, I., & Pennacchio, L. (2007). VISTA Enhancer Browser: a database of tissue-specific human enhancers. Nucleic Acids Research, 35 (Database). DOI: 10.1093/nar/gkl822