A pretty unusual week. Cutting-edge science as well as access to medical literature that’s hundreds of years old. Predictions about the future. New ways to use one of my favorite visualization tools (UpSet) and influential visualizations from the past. Veterans’ genomes and minion genomes. And more….
Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…
As you may know, we’ve been doing these video tips-of-the-week for FOUR years now. We have completed around 200 little tidbit introductions to various resources. At the end of the year we’ve established a sort of holiday tradition: we do a summary post to collect all the tips from the past year, 2011 (yep, it’s 2012 now). If you have missed any of them it’s a great way to have a quick look at what might be useful to your work.
Epigenetics and epigenomics are becoming more exciting areas of investigation, and we are seeing more requests for database resources to support them, and for the sources of data from these types of experiments. If you aren’t aware of these investigations at this point, check out their entries in the Talking Glossary of Genetic Terms:
Epigenetics: Epigenetics is an emerging field of science that studies heritable changes caused by the activation and deactivation of genes without any change in the underlying DNA sequence of the organism. The word epigenetics is of Greek origin and literally means over and above (epi) the genome.
Epigenome: The term epigenome is derived from the Greek word epi which literally means “above” the genome. The epigenome consists of chemical compounds that modify, or mark, the genome in a way that tells it what to do, where to do it, and when to do it. Different cells have different epigenetic marks. These epigenetic marks, which are not part of the DNA itself, can be passed on from cell to cell as cells divide, and from one generation to the next.
And for the talking part–you can hear Dr. Laura Elnitski talk about these in more detail–have a listen at each entry. And just today an article providing an epigenetics primer appeared in my inbox: Epigenetics: A Primer.
These intriguing–and sometimes puzzling–chromatin modification (CM) signals and leads that are being unveiled in many labs and projects now are becoming more widely available in different databases. For this week’s tip of the week I’ll introduce DAnCER: Disease-Annotated Chromatin Epigenetics Resource, one of the tools that is organizing this type of data and enabling additional explorations. You can find DAnCER here: http://wodaklab.org/dancer/
In the associated publication, the DAnCER team describes other useful resources that provide epigenetics data. These include ChromDB, ChromatinDB (for yeast), and the Human Histone Modification Database (HHMD), among others. I’m also aware of other sources. A few months back I introduced the NCBI Epigenomics resource as my tip-of-the-week. (At that time I promised that when the publication became available I’d mention it–that’s now at the bottom in the references section below.) There’s also quite a bit of this data flowing into the UCSC Genome Browser ENCODE DCC, including–may I add–some data from the very cool Elnitski bi-directional promoter studies. You can find similar data types via the modENCODE project as well.
So, there are lots of resources out there. Each provider has different projects, species, goals, displays, etc. But the group that developed DAnCER wanted to fill a niche they didn’t see available already: linking these epigenetic changes to possible disease association data. Here’s how they describe their work:
Our research effort therefore strives to explore CM-related genes in the context of their protein-interaction network, their partnership in multi-protein complexes and cellular pathways, as well as their gene expression profiles….
They are well-suited to linking this kind of information. You may remember our previous explorations and discussions of iRefWeb. The kind of network and interaction data that they assemble in that context can be brought to the chromatin-modification arena. The point is that you can take steps beyond the modifications you know about, to explore their neighborhood of interactions, and potentially unearth important disease relationships from that.
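To make the neighborhood idea above a bit more concrete, here is a minimal sketch in Python. It is not how DAnCER or iRefWeb work internally, and it does not use any of their data or interfaces. The gene names, the interaction map, and the disease annotation are all made-up placeholders, and the function names are hypothetical. It only illustrates the general approach: start from a chromatin-modification gene, walk outward a limited number of interaction steps, and report which genes in that neighborhood carry a disease annotation.

```python
from collections import deque

# Toy, made-up interaction network (undirected). All names are placeholders,
# not real genes and not data taken from DAnCER or iRefWeb.
INTERACTIONS = {
    "CM_GENE_1": {"GENE_A", "GENE_B"},
    "GENE_A": {"CM_GENE_1", "GENE_C"},
    "GENE_B": {"CM_GENE_1"},
    "GENE_C": {"GENE_A", "GENE_D"},
    "GENE_D": {"GENE_C"},
}

# Hypothetical disease annotations keyed by gene name.
DISEASE_ANNOTATIONS = {"GENE_C": ["hypothetical disease X"]}


def neighborhood(graph, start, max_steps):
    """Breadth-first search: map each gene within max_steps of start to its distance."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        gene = queue.popleft()
        if distances[gene] == max_steps:
            continue  # do not expand past the step limit
        for partner in sorted(graph.get(gene, ())):
            if partner not in distances:
                distances[partner] = distances[gene] + 1
                queue.append(partner)
    del distances[start]  # the starting gene is not its own neighbor
    return distances


def disease_links(graph, annotations, start, max_steps):
    """Return {gene: (distance, diseases)} for annotated genes near start."""
    near = neighborhood(graph, start, max_steps)
    return {g: (d, annotations[g]) for g, d in near.items() if g in annotations}


if __name__ == "__main__":
    print(disease_links(INTERACTIONS, DISEASE_ANNOTATIONS, "CM_GENE_1", 2))
```

In this toy network the annotated gene only shows up once you look two steps out, which is the point of going beyond the direct interaction partners.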
The data includes several species, so evolutionary conservation can also be explored.
So if you find that you are interested in exploring chromatin modifications, and want to take that data further, check out DAnCER, and the other tools and projects that are providing this type of information. If you have used the iRefWeb interface, you’ll see some similarities in structure. Search options with many filters are available. Color-coded and sortable results are provided. Links to gene details within the Wodak lab tools and external links are offered. On the gene pages at DAnCER you’ll have many types of annotations, including Gene Ontology descriptions, evidence type and references, neighbors, and protein domain information as well. And besides the texty-table based stuff, you can choose to load up the interactive network/interaction graphic, just like with the iRefWeb tool.
There’s a lot of opportunity to learn things from this tool. Try it out.
Turinsky, A., Turner, B., Borja, R., Gleeson, J., Heath, M., Pu, S., Switzer, T., Dong, D., Gong, Y., On, T., Xiong, X., Emili, A., Greenblatt, J., Parkinson, J., Zhang, Z., & Wodak, S. (2010). DAnCER: Disease-Annotated Chromatin Epigenetics Resource. Nucleic Acids Research, 39 (Database). DOI: 10.1093/nar/gkq857
Fingerman, I., McDaniel, L., Zhang, X., Ratzat, W., Hassan, T., Jiang, Z., Cohen, R., & Schuler, G. (2010). NCBI Epigenomics: a new public resource for exploring epigenomic data sets. Nucleic Acids Research, 39 (Database). DOI: 10.1093/nar/gkq1146
As you may know, we’ve been doing tips-of-the-week for three years now. We have completed around 150 little tidbit introductions to various resources. At the end of the year we’ve established a sort of holiday tradition: we are doing a summary post to collect them all. If you have missed any of them it’s a great way to have a quick look at what might be useful to your work.
We spend a lot of time talking about sequence data: where to find it, how to analyze it, etc. But we are seeing more and more data that comes from epigenomics projects. Recently a tweet from NCBI got me to look at their Epigenomics site again: http://www.ncbi.nlm.nih.gov/epigenomics
Their definition of epigenetics is:
What is Epigenetics?
Interest in epigenetics has exploded in recent years, but the central question it aims to answer has been with us for decades: how do the many cell types of the body maintain drastically different gene expression patterns while sharing exactly the same DNA?
Epigenetics refers to a gene activity state that may be stable over long periods of time, persist through many cell divisions, or even be inherited through several generations, all without any change to the primary DNA sequence (Roloff and Nuber 2005, Ng and Gurdon 2008, Probst, et al. 2009).
This tip of the week will take a look at access to the data. I’ll be taking a look at what happens when you use the Sample Browser as a starting point to see some of the data via browsing. You can do more complex and custom queries with the Advanced Query form, which looks like other query building tools at NCBI. I won’t have time to cover that, but I wanted you to know it was available.
For my example I just chose the top sample that was in the list at the time I did this tip. And it was fortuitous for a couple of reasons. First, it was exactly the kind of paper that I was talking about in my recent post (The data isn’t in the papers anymore, you know.) This paper (referenced below) has a huge volume of data. It looks at 39 types of histone modifications, and looks at them genome-wide. There’s no way to publish all that as figures in this paper. There are summary figures, but not individual ones for that data collection. You’d have to visualize this yourself elsewhere. The second reason it was cool is that the data perfectly validates some of the data I’ve been using to develop the ENCODE project tutorial we’ve just created with the UCSC ENCODE team.
Anyway–check out the NCBI Epigenomics resource for a great way to visualize data on this topic. Data that you will not find in the papers.
Wang, Z., Zang, C., Rosenfeld, J., Schones, D., Barski, A., Cuddapah, S., Cui, K., Roh, T., Peng, W., Zhang, M., & Zhao, K. (2008). Combinatorial patterns of histone acetylations and methylations in the human genome. Nature Genetics, 40 (7), 897-903. DOI: 10.1038/ng.154
Currently there isn’t a reference for NCBI Epigenomics. I contacted the Help Desk to be sure, and they told me it’s been submitted but isn’t out yet. I’ll update this when that reference becomes available.