In the series of talks from the Current Topics in Genome Analysis course from NHGRI, Laura Elnitski spoke on regulation and epigenetics. I’ll include some of my notes below, but be sure to check out the whole talk when you have a chance–and the slides are available for download from the CTGA page.
Dr. Elnitski frames the talk by pointing out that we’ve been focusing on the roughly 2% of the genome that consists of protein-coding genes, but that a lot more is going on outside of it, and that there is much left to learn about other aspects of genome regulation. One of the papers she uses to illustrate this shows how much of the trait-associated variation we know about lies outside of protein-coding regions (Hindorff et al. 2009; around 11 minutes). That paper analyzed the disease/trait-associated SNPs (TASs) in the NIH GWAS catalog and found that “88% of TASs were intronic (45%) or intergenic (43%)”. If that’s the case, you need to evaluate the effects of these variants differently than you would a variant that changes a protein.
Due to this fact–that it’s not just proteins we need to be looking at–Elnitski says, “So throughout this talk we’ll take a look at functional categories of the genome, to further explain the steps you might consider to ascertain function at these GWAS sites.”
One way to evaluate a region that contains a non-coding variant is to consider its evolutionary relationships. How conserved is this tidbit in other species? Laura describes how phastCons and GERP can help you analyze that (around 21 minutes). These tools use different approaches to find constrained elements. You can also use knowledge of regions that have accelerated rates of change to suss out interesting features (she used the opposable thumb and the foot/ankle region in bipeds as interesting examples of that sort of change; around 26 minutes).
Another type of landscape feature she described was enhancer signatures. She offered a nice diagrammatic view of what these look like around a region, to convey possible enhancer function (around 32 minutes). Her representation of the histone code could probably help people who are trying to use the ENCODE data tracks at UCSC to visualize that. In slide 63 she shows what the pattern of codes in an active promoter might look like, and after that the key differences in enhancers and what repressed regions look like (around 1hr). I found that really helpful.
One point she stressed, though: epigenetic patterns are very cell-type specific. Be sure to look at various cell types, and tread carefully with conclusions if your cell type of interest has not been evaluated yet (around 36 minutes). [As a side note, I worry about this particularly because cranks are misusing the features of epigenetics: they are already going out and telling people they can fix everything wrong with their health by affecting their epigenetics. Now, let's say you claim to treat diabetes or autism with your detox epi-fix. What is the impact on other cell types, exactly??]
She also goes on to explain how these features rely on the 3D structure of the nucleus, looping interactions, and the packing of the chromosomes, with some nice guidance on how to think about this and on the types of techniques used to assess it. And just after I watched this, a paper came out describing more of this topology with the Hi-C strategy that she referenced.
It’s also important to consider that splicing defects can have consequences that wouldn’t be obvious just from looking at coding sequence per se. Although a substitution might be synonymous and not change an amino acid, it could still affect splicing. The SKIPPY tool that was developed by her group (and that Jennifer highlighted as a Tip of the Week) was suggested as a way to explore this (around 47 minutes).
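To make the point about synonymous substitutions concrete, here is a minimal Python sketch of my own (it is not from the talk, and it is not how SKIPPY works). It uses the standard genetic code to check whether a single-base change in a coding sequence alters the amino acid. The function name `classify_substitution` and the toy sequence are hypothetical. A result of "synonymous" only means the protein sequence is unchanged; it says nothing about whether splicing is affected, which is exactly why a dedicated tool is needed.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(cds, pos, alt):
    """Classify a single-base substitution in an in-frame coding sequence.

    cds: coding DNA sequence starting at the first base of a codon
    pos: 0-based position of the substituted base
    alt: the new base
    Returns (reference amino acid, altered amino acid, label).
    """
    start = (pos // 3) * 3                 # first base of the affected codon
    ref_codon = cds[start:start + 3]
    offset = pos - start                   # position of the change within the codon
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    label = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return ref_aa, alt_aa, label


# Toy sequence (hypothetical): ATG GGT TAA
toy_cds = "ATGGGTTAA"
print(classify_substitution(toy_cds, 5, "C"))  # GGT -> GGC, third codon position
print(classify_substitution(toy_cds, 4, "A"))  # GGT -> GAT, second codon position
```

The first change leaves glycine as glycine, so a coding-sequence view alone would call it harmless, even though a change like this could still disrupt a splicing signal in the exon.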
This talk was a useful guide to thinking about non-coding genomic features to consider for your research. There were helpful graphics and tools provided. Have a look–it’s worth your time.
Woolfe, A., Mullikin, J., & Elnitski, L. (2010). Genomic features defining exonic variants that modulate splicing. Genome Biology, 11 (2). DOI: 10.1186/gb-2010-11-2-r20
Hindorff, L., Sethupathy, P., Junkins, H., Ramos, E., Mehta, J., Collins, F., & Manolio, T. (2009). Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proceedings of the National Academy of Sciences, 106 (23), 9362-9367. DOI: 10.1073/pnas.0903103106