Today’s tip of the week is a quick introduction to ChromoHub. ChromoHub is an annotated phylogeny of chromatin-mediated signaling genes. As the ChromoHub site says, these are “genes involved in writing, reading and erasing the histone code.” These epigenetic modifiers are emerging as target classes for future drug therapies.
ChromoHub maps annotations for these genes onto their phylogeny, where the researcher can find a wealth of information, ranging from cancer data, SNPs, protein structures and protein-protein interactions to PubMed and funding information.
Today’s tip introduces you to the tool and shows how to add and view the annotations. There is a lot more at ChromoHub. You can suggest data that the developers have missed, and you can download the information, alignment files, images and more.
ChromoHub was developed by SGC, the Structural Genomics Consortium. This is a private-public partnership that supports discovery of new medicines through open access research. ChromoHub is just one of the tools and resources developed by the consortium.
To find out more about the resource, check out the links and reference below.
Who can resist a nice cup of eggnog for the holidays (especially with added brandy)? I know I can’t. I make my grandpa’s recipe every December, and only in December, considering that it uses tons of sugar, eggs, heavy cream and alcohol and that 1/2 & 1/2 is the lightest ingredient.
Oh, but that’s not what this tip is about. It’s about eggNOG, a database of orthologous groups of genes. We’ve mentioned eggNOG several times before, but only in passing or in relation (orthologous? :D) to another database or tool. Today, in perfect timing for the season, I thought I’d do a quick tip to introduce it.
eggNOG is a relatively straightforward database to use, but it has a wealth of information you might want to check out. As the recent paper in NAR states:
Orthologous relationships form the basis of most comparative genomic and metagenomic studies and are essential for proper phylogenetic and functional analyses…. Orthology, defined as homology via speciation, is a crucial concept in evolutionary biology and is essential for disciplines such as comparative genomics, metagenomics and phylogenomics. The concepts of orthology and paralogy, with the latter being defined as homology via duplication, have been used as a foundation to introduce the concept of clusters of orthologous groups: proteins that have evolved from a single ancestral sequence existing in the last common ancestor (LCA) of the species that are being compared, through a series of speciation and duplication events. Orthologous groups (OGs) have proven useful for functional analyses and the annotation of newly sequenced genomes as orthologs tend to have equivalent functions.
721 801 orthologous groups, encompassing a total of 4 396 591 genes…. from 1133 species.
For more about orthologous groups, methods used and pros and cons of methodology, you might want to check out the paper referenced below. They’ve included several informative and helpful reviews and references.
Right now, take a quick tour of what eggNOG can offer.
Powell, S., Szklarczyk, D., Trachana, K., Roth, A., Kuhn, M., Muller, J., Arnold, R., Rattei, T., Letunic, I., Doerks, T., Jensen, L., von Mering, C., & Bork, P. (2011). eggNOG v3.0: orthologous groups covering 1133 organisms at 41 different taxonomic ranges. Nucleic Acids Research. DOI: 10.1093/nar/gkr1060
Today’s tip will revisit the database and redo a search that was done in the tip from 2009, this time using a protein search instead of a category search.
Gnad, F., Gunawardena, J., & Mann, M. (2010). PHOSIDA 2011: the posttranslational modification database. Nucleic Acids Research, 39 (Database). DOI: 10.1093/nar/gkq1159
In our ongoing pursuit of up-to-date tutorials, I’ve been tracking changes that are occurring at resources and planning our updates accordingly. Protein resources are especially going to keep me out of trouble this summer, because their developers and curators have been busy! I’ve compiled a short synopsis below, and would appreciate comments on any other resources you know about, or want to brag about!
I featured the ExPASy list of proteomic tools in a past tip. As of Tuesday this list is no longer being kept up-to-date, but the ExPASy resource has been expanded beyond being “just” a proteomics resource and is now the new SIB Bioinformatics Resource Portal. According to its developers, the portal:
“provides access to scientific databases and software tools in different areas of life sciences including proteomics, genomics, phylogeny, systems biology, population genetics, transcriptomics etc. … On this portal you find resources from many different SIB groups as well as external institutions.”
And never fear, there is still an up-to-date list of proteomics tools found here.
I mentioned in my tip last week that NCBI’s MMDB has undergone an update & I’ll be updating our tutorial on it soon.
BioStar is a site for asking, answering and discussing bioinformatics questions. We are members of the community and find it very useful. Often questions and answers arise at BioStar that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those questions and answers here in this thread. You can ask questions in this thread, or you can always join in at BioStar.
recognise and reward scientists working in Europe who have made a major contribution to the advancement of science through the use of computational methods.
It was awarded to Peer Bork for his work on the human microbiome. Peer definitely deserves it, as does his lab. The science and scientists that come from the Bork group are stellar. OK, so I have a personal interest in this: I worked in his lab for 4 years, from 1999 to 2003. It was one of the best experiences (scientific and personal) of my life. Also, BioByte Solutions, started by a Bork lab researcher, has helped put together our new free database and resource search (which we’ll be introducing next week).
Congratulations Peer! Now, what is he going to do with that 368,000 dollars?!
I’m going to admit that I know little about acetylation as a regulatory mechanism. After reading through the paper, though, I found this quite an interesting find, and it suggests to me that genomics has a lot to offer in advancing our understanding of regulation and evolution.
Three things jumped out at me though.
The first is minor. The authors use the term acetylome. You can now add that to the huge list of -omics terms to keep straight :D.
The second is that they use STRING to complete an analysis of networked interactions of the proteins discovered in their study and the processes where they are found, as you can see in their figure.
I did my postdoc and some later research in the lab (Peer Bork, EMBL) that developed STRING, and I’ve created a tutorial on it, so any time it’s used, I’m interested :D. So, I went to Methods and Materials to see how the analysis was done. Though there was a decent explanation of the process, it was not enough for me to recreate the analysis. This is not a criticism of the paper or the authors, but of how papers are being published. More and more, papers include genomics analysis, but rarely are these reported in the research paper in the detail needed to easily reproduce the analysis. Projects like Galaxy (publicly available tutorial) and Taverna are filling that void, so I’d like to see more Methods and Materials sections include analysis histories and workflows. It definitely would help in the advancement of science.
This week I’m going to introduce a tool that searches a whole bunch of resources for you with one single click. Harvester, from the Karlsruhe Institute of Technology, offers a really simple interface for searching. If your species is one of the ones collected in their search, you will find that Harvester will enable you to search a slew of databases with just one query–NCBI, UCSC, MINT, STRING and many others. The results will provide quick links to some databases, and some results pages will be embedded in one big web page that you can scroll down and overview really quickly. The embedded pages aren’t just summary text–they are the actual database pages in situ! You can see them and interact with them just as if you were on that site doing the search.
This 3-minute movie introduces you to Harvester. If you quickly need a summary of what’s in all the databases they collect, it is a very handy tool. It does remind me of a Swiss army knife: not earth-shatteringly novel from an algorithmic perspective, but many useful tools pulled together in one place. Try it out!
Bioinformatics.org is a great organization and web site (disclosure: I’ve taught an online course with them :D). They regularly offer online courses in bioinformatics that lean toward theory and analysis (whereas ours focus on the use and access of resources). If you need bringing up to speed on protein-protein interactions, there is room in next week’s course on that subject.
We have training in several protein-protein interaction resources, such as STRING and, soon, MINT, so this bioinformatics course seems a nice complement. To learn more about the course, follow me under the fold…
In the previous post I briefly mentioned a paper coming out of the Bork lab at EMBL.
The lab just made public a new tool: STITCH, “a resource to explore known and predicted interactions of chemicals and proteins.” This is a sister project to STRING, a great tool for exploring the interactions of proteins.