Video Tip of the Week: eggNOG for the holidays (or to explore orthologous genes)

Who can resist a nice cup of eggnog for the holidays (especially with added brandy)? I know I can’t. I make my grandpa’s recipe every December and, considering it uses tons of sugar, eggs, heavy cream and alcohol, and that half-and-half is the lightest ingredient, only in December.

Oh, that’s not what this tip is about. It’s about eggNOG, a database of orthologous groups of genes. We’ve mentioned eggNOG several times before, but only in passing or in relation (orthologous? :D) to another database or tool. Today, in perfect timing for the season, I thought I’d do a quick tip to introduce eggNOG.

eggNOG is brought to you by the same research group that developed a lot of other excellent tools such as SMART (protein domains), STRING (protein-protein interactions), STITCH (protein-chemical interactions), iTOL and so much more. Of course they do some fascinating research too.

eggNOG is a relatively straightforward database to use, but it has a wealth of information you might want to check out. As the recent paper in NAR states:

Orthologous relationships form the basis of most comparative genomic and metagenomic studies and are essential for proper phylogenetic and functional analyses…. Orthology, defined as homology via speciation, is a crucial concept in evolutionary biology and is essential for disciplines such as comparative genomics, metagenomics and phylogenomics. The concepts of orthology and paralogy, with the latter being defined as homology via duplication, have been used as a foundation to introduce the concept of clusters of orthologous groups: proteins that have evolved from a single ancestral sequence existing in the last common ancestor (LCA) of the species that are being compared, through a series of speciation and duplication events. Orthologous groups (OGs) have proven useful for functional analyses and the annotation of newly sequenced genomes  as orthologs tend to have equivalent functions.
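To make the orthology/paralogy distinction in that quote a bit more concrete, here is a minimal Python sketch. It is my own toy illustration, not eggNOG's pipeline or anything from the paper: the gene tree, the node names and the `relationship` helper are all hypothetical. It assumes you already have a gene tree whose internal nodes are labelled as speciation or duplication events, and it classifies two genes by the event at their last common ancestor in that tree.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A gene-tree node. Leaves are genes; internal nodes carry an event label."""
    name: str
    event: Optional[str] = None  # "speciation" or "duplication"; None for leaves
    children: List["Node"] = field(default_factory=list)


def path_to(root: Node, leaf_name: str) -> Optional[List[Node]]:
    """Return the list of nodes from root down to the named leaf, or None."""
    if not root.children:
        return [root] if root.name == leaf_name else None
    for child in root.children:
        sub = path_to(child, leaf_name)
        if sub is not None:
            return [root] + sub
    return None


def relationship(root: Node, gene_a: str, gene_b: str) -> str:
    """Classify two distinct genes by the event at their last common ancestor."""
    if gene_a == gene_b:
        raise ValueError("need two different genes")
    path_a, path_b = path_to(root, gene_a), path_to(root, gene_b)
    if path_a is None or path_b is None:
        raise KeyError("gene not found in tree")
    lca = None
    for x, y in zip(path_a, path_b):
        if x is y:
            lca = x  # still on the shared part of both paths
        else:
            break
    # Homology via speciation -> orthologs; via duplication -> paralogs.
    return "orthologs" if lca.event == "speciation" else "paralogs"


# Hypothetical toy tree: one ancestral duplication gives gene families A and B,
# and each family is then split by the human/mouse speciation.
toy_tree = Node("dup", "duplication", [
    Node("spec_A", "speciation", [Node("human_A"), Node("mouse_A")]),
    Node("spec_B", "speciation", [Node("human_B"), Node("mouse_B")]),
])

print(relationship(toy_tree, "human_A", "mouse_A"))
print(relationship(toy_tree, "human_A", "human_B"))
```

In this toy tree, human_A and mouse_A diverged at a speciation node, so they come out as orthologs, while human_A and human_B trace back to the duplication, so they come out as paralogs. Real orthologous-group construction works on far messier data, so treat this only as a picture of the definitions.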

eggNOG contains:

721 801 orthologous groups, encompassing a total of 4 396 591 genes…. from 1133 species.

For more about orthologous groups, methods used and pros and cons of methodology, you might want to check out the paper referenced below. They’ve included several informative and helpful reviews and references.

Right now, take a quick tour of what eggNOG can offer.

Tip of the Week: A year in tips III (last half of 2010)

As you may know, we’ve been doing tips-of-the-week for three years now. We have completed around 150 little tidbit introductions to various resources. At the end of the year we’ve established a sort of holiday tradition: we are doing a summary post to collect them all. If you have missed any of them it’s a great way to have a quick look at what might be useful to your work.

Here are the tips from the first half of the year, and below you will find the tips from the last half of 2010 (you can see past years’ tips here: 2008 I, 2008 II, 2009 I, 2009 II):


July 7: Mint for Protein Interactions, an introduction to MINT to study protein-protein interactions
July 14: Introduction to Changes to NCBI’s Protein Database, as it states :D
July 21: 1000 Genome Project Browser, 1000 Genomes project has pilot data out, this is the browser.
July 28: R Genetics at Galaxy, the Galaxy analysis and workflow tool added R genetics analysis tools.


August 4: YeastMine, SGD adds an InterMine capability to their database search.
August 11: Gaggle Genome Browser, a tool to allow for the visualization of genomic data, part of the “gaggle components”
August 18: Brenda, comprehensive enzyme information.
August 25: Mouse Genomic Pathology, unlike other tips, this is not a video but rather a detailed introduction to a new website.


September 1: Galaxy Pages, an introduction to the new community documentation and sharing capability at Galaxy.
September 8: Varitas. A Plaid Database. A resource that integrates human variation data such as SNPs and CNVs.
September 15: CircuitsDB for TF/miRNA/gene regulation networks.
September 21: Pathcase for pathway data.
September 29: Comparative Toxicogenomics Database (CTD), VennViewer. A new tool to create Venn diagrams to compare associated datasets for genes, diseases or chemicals.


October 6: BioExtract Server, a server that allows researchers to store data, analyze data and create workflows of data.
October 13: NCBI Epigenomics, “Beyond the Genome” NCBI’s site for information and data on epigenetics.
October 20: Comparing Microbial Databases including IMG, UCSC Microbial and Archaeal browsers, CMR and others.
October 27: iTOL, interactive tree of life


November 3: VISTA Enhancer Browser, explore possible regulatory elements with comparative genomics.
November 10: Getting canonical gene info from the UCSC Browser. Need one gene version to ‘rule them all’?
November 17: ENCODE Data in the UCSC Genome Browser, an entire 35 minute tutorial on the ENCODE project.
November 24: FLink. A tool that links items in one NCBI database to another in a meaningful and weighted manner.


December 1: PhylomeDB. A database of gene phylogenies of many species.
December 8: BioGPS for expression data and more.
December 15: RepTar, a database of miRNA target sites.

Peer Bork wins 2009 award

The Royal Society and Académie des sciences Microsoft Award was won by Peer Bork this year. The award is funded by Microsoft (250,000 euro) and is given to

recognise and reward scientists working in Europe who have made a major contribution to the advancement of science through the use of computational methods.

It was awarded to Peer Bork for his work on the human microbiome. Peer definitely deserves it, as does his lab. The science and scientists that come from the Bork group are stellar. OK, so I have a personal interest in this: I worked in his lab for 4 years, from 1999 to 2003. It was one of the best experiences (science and personal) of my life. Also, BioByte Solutions, started by a Bork lab researcher, has helped put together our new free database and resource search (which we’ll be introducing next week).

Congratulations Peer! Now, what is he going to do with that 368,000 dollars?!

And let me use this opportunity to point out some of the great tools and databases developed by the Bork group:
STRING: Analysis of known and predicted protein-protein interactions in all known genomes (OpenHelix Tutorial, by subscription)
STITCH: Database of known and predicted interactions of chemicals and proteins.
SMART: Domain analysis (OpenHelix Tutorial, by subscription)
iTOL: An online tool for the display and manipulation of phylogenetic trees.
XplorMed: Data mining in MEDLINE (OpenHelix Tutorial, by subscription)

And a whole lot more.