Video Tip of the Week: Microbiome Resources From JGI

Just over a month ago an issue of Nature had two articles from the Human Microbiome Project Consortium – you may have seen them, or noticed the Friday SNPets items we had on them. I promised myself that I’d read the articles (which I did), and that I’d visit my old friends IMG (Integrated Microbial Genomes) & IMG/M (IMG with Microbiome Samples) to see what is new at these powerful microbial genome resources. In today’s tip I decided to take you along on my visit, because I found that IMG now has a resource dedicated to the analysis of genomes related to the Human Microbiome Project, called the Integrated Microbial Genomes-Human Microbiome Project, or IMG/HMP. We visit both IMG/M (briefly) and IMG/HMP in today’s tip.

When I referred to IMG as an old friend, I really do feel that way – our tutorial on IMG* was one of my first projects for OpenHelix. I was new, and IMG was new, having been released in March of 2005, only a few months before I created our tutorial (on their June 2005 release, if I am remembering correctly). They have grown into such an extensive, powerful resource. To give you an idea of how fast they have grown & developed, our current IMG tutorial is version 12 and I’ll be working on version 13 as soon as I finish updating our SGD tutorial. When we first created our IMG/M tutorial*, metagenomes were a relatively new concept and the resource included a total of 24 microbiome samples – now it has over 1000!

But enough with the nostalgia, let’s get to the resources! :) IMG/M integrates metagenome data with isolate microbial genome sequences from the Integrated Microbial Genomes (IMG) system to enable the analysis of phylogenetic composition and functional or metabolic potential of the aggregate genomes (metagenomes) in microbial communities (microbiomes). Genomes generated as part of the Human Microbiome Project (HMP) are included in IMG/M from RefSeq via IMG. IMG/M resources allow users to analyze metagenomes, genomes, genes and functions by making lists of items and then manipulating them in “analysis carts”. Metagenomes can also be analyzed using the tools provided on their ‘Metagenome Details’ page. These options are explained in much more detail than I can cover here in the IMG/M reference that I cite below. I also link to the most recent IMG publication, since an understanding of IMG is essential to understanding any IMG/M-based resource.

* OpenHelix tutorial for this resource available for individual purchase or through a subscription.

Quick Links:
Integrated Microbial Genomes (IMG): http://img.jgi.doe.gov/cgi-bin/pub/main.cgi

Integrated Microbial Genomes with Microbiomes (IMG/M): http://img.jgi.doe.gov/cgi-bin/m/main.cgi

Integrated Microbial Genomes-Human Microbiome Project (IMG/HMP): http://www.hmpdacc-resources.org/imgm_hmp/

OpenHelix Introductory Tutorial on IMG: http://www.openhelix.com/cgi/tutorialInfo.cgi?id=54

OpenHelix Introductory Tutorial on IMG/M: http://www.openhelix.com/cgi/tutorialInfo.cgi?id=24

Victor M. Markowitz, I-Min A. Chen, Ken Chu, Ernest Szeto, Krishna Palaniappan, Yuri Grechkin, Anna Ratner, Biju Jacob, Amrita Pati, Marcel Huntemann, Konstantinos Liolios, Ioanna Pagani, Iain Anderson, Konstantinos Mavromatis, Natalia N. Ivanova, & Nikos C. Kyrpides (2012). IMG/M: the integrated metagenome data management and comparative analysis system. Nucl. Acids Res., 40 DOI: 10.1093/nar/gkr975

Victor M. Markowitz, I-Min A. Chen, Krishna Palaniappan, Ken Chu, Ernest Szeto, Yuri Grechkin, Anna Ratner, Biju Jacob, Jinghua Huang, Peter Williams, Marcel Huntemann, Iain Anderson, Konstantinos Mavromatis, Natalia N. Ivanova, & Nikos C. Kyrpides (2012). IMG: the integrated microbial genomes database and comparative analysis system. Nucl. Acids Res., 40 DOI: 10.1093/nar/gkr1044

NAR database issue (always a treasure trove)

The advance access release of most of the NAR database issue articles is out. As usual, this database issue includes a wealth of new and updated data repositories and analysis tools. We’ll be writing up additional, more extensive blog posts on it and doing some tips of the week over the next couple of months, but I thought I’d highlight the issue and some of the reports:

There are a lot of updates to many of the databases we know and love (links go to the full-text articles): UCSC Genome Browser, Ensembl, UniProt, MINT, SMART, WormBase, Gene Ontology, ENCODE, KEGG, UCSC Archaeal Browser, IMG/M, DBTSS, InterPro and others (we have tutorials on all those listed here).

And, as an indication of the explosion of data available (itself a subject of a database issue article, SRA), there are a lot of new(ish) databases on specific datatypes such as MINAS, a database of metal ions in nucleic acids (nice name :D); doRiNA, a database of RNA interactions in post-transcriptional regulation; BitterDB, a database of bitter compounds and well over 100 more.

And I’ll give a special shout-out to my former PI at EMBL, because I can: Peer Bork’s group has 4 databases listed in the issue: eggNOG, SMART, STITCH and OGEE (and he and a couple of group members are on the InterPro paper as well).

This is going to be a wealth of information to wade through!

UCSC Genome Browser: http://genome.ucsc.edu
Ensembl: http://www.ensembl.org/
UniProt: http://www.uniprot.org/
MINT: http://mint.bio.uniroma2.it/mint/
SMART: http://smart.embl.de/
WormBase: http://www.wormbase.org/
Gene Ontology: http://www.geneontology.org/
ENCODE: http://genome.ucsc.edu/ENCODE/
KEGG: http://www.kegg.jp
UCSC Archaeal Browser: http://archaea.ucsc.edu/
IMG: http://img.jgi.doe.gov/cgi-bin/w/main.cgi
DBTSS: http://dbtss.hgc.jp/
InterPro: http://www.ebi.ac.uk/interpro




Tip of the Week: A year in tips III (last half of 2010)

As you may know, we’ve been doing tips-of-the-week for three years now. We have completed around 150 little tidbit introductions to various resources. At the end of the year we’ve established a sort of holiday tradition: we are doing a summary post to collect them all. If you have missed any of them it’s a great way to have a quick look at what might be useful to your work.

Here are the tips from the first half of the year, and below you will find the tips from the last half of 2010 (you can see past years’ tips here: 2008 I, 2008 II, 2009 I, 2009 II):


July 7: Mint for Protein Interactions, an introduction to MINT to study protein-protein interactions
July 14: Introduction to Changes to NCBI’s Protein Database, as it states :D
July 21: 1000 Genome Project Browser, 1000 Genomes project has pilot data out, this is the browser.
July 28: R Genetics at Galaxy, the Galaxy analysis and workflow tool added R genetics analysis tools.


August 4: YeastMine, SGD adds an InterMine capability to their database search.
August 11: Gaggle Genome Browser, a tool to allow for the visualization of genomic data, part of the “gaggle components”
August 18: Brenda, comprehensive enzyme information.
August 25: Mouse Genomic Pathology, unlike other tips, this is not a video but rather a detailed introduction to a new website.


September 1: Galaxy Pages, an introduction to the new community documentation and sharing capability at Galaxy.
September 8: Varietas. A Plaid Database. A resource that integrates human variation data such as SNPs and CNVs.
September 15: CircuitsDB for TF/miRNA/gene regulation networks.
September 21: Pathcase for pathway data.
September 29: Comparative Toxicogenomics Database (CTD), VennViewer. A new tool to create Venn diagrams to compare associated datasets for genes, diseases or chemicals.


October 6: BioExtract Server, a server that allows researchers to store data, analyze data and create workflows of data.
October 13: NCBI Epigenomics, “Beyond the Genome” NCBI’s site for information and data on epigenetics.
October 20: Comparing Microbial Databases including IMG, UCSC Microbial and Archaeal browsers, CMR and others.
October 27: iTOL, interactive tree of life


November 3: VISTA Enhancer Browser, explore possible regulatory elements with comparative genomics.
November 10: Getting canonical gene info from the UCSC Browser. Need one gene version to ‘rule them all’?
November 17: ENCODE Data in the UCSC Genome Browser, an entire 35-minute tutorial on the ENCODE project.
November 24: FLink. A tool that links items in one NCBI database to another in a meaningful and weighted manner.


December 1: PhylomeDB. A database of gene phylogenies of many species.
December 8: BioGPS for expression data and more.
December 15: RepTar, a database of miRNA target sites.

Tip of the Week: Comparing Microbial Databases

A few weeks ago a commenter asked me to compare IMG (Integrated Microbial Genomes) to the UCSC Microbial Genome browser. I’ve been exploring & thinking since then & am going to give a very brief comparison of those two resources in today’s tip & I’ll expand the comparison to other resources here in the text of this post.

IMG extends capabilities

DOE JGI extended and updated the content of IMG and IMG/M recently as this linked press release shows (here is another), so today I’d like to just make a quick post to highlight that OpenHelix has a tutorial on IMG/M that is sponsored by JGI and thus free.

Metagenomics is a huge new field and IMG/M has some excellent tools and data!

JGI's Sequencing Plans for 2009

Just saw in today’s GenomeWeb Daily News email that the Joint Genome Institute has announced its sequencing plans for 2009. It includes both genomes and metagenomes. You can read the GenomeWeb Daily News article, or the whole JGI announcement of projects which, according to JGI Director Eddy Rubin:

 “The scientific and technological advances enabled by the information that we generate from these selections promise to take us faster and further down the path toward clean, renewable transportation fuels while affording us a more comprehensive understanding of the global carbon cycle”.

 I for one am looking forward to exploring the new information as soon as it is available in IMG and IMG/M!

Metagenomics making the big Times (as in NY)

Metagenomics, which is really a new area of study (barely a decade old) in comparison to most biological areas of research, is already making it into the mainstream press. The New York Times had an article yesterday entitled “Bacteria thrive in the inner Elbow, No Harm Done” (you’ll need a free registration to read that). The article quotes from metagenomic studies that show that our inner elbows contain a unique microbiome of species, even in comparison to our upper arms, though it covers more metagenomic studies than just that. As the article states:

The research is part of the human microbiome project, microbiome meaning the entourage of all microbes that live in people.
The project is an ambitious government-financed endeavor to catalog the typical bacterial colonies that inhabit each niche in the human ecosystem.

That’d be this project. The article does a decent job of explaining why this project is helpful, but doesn’t really explain how this approach of metagenomics is different or how it’s done (in fact, it never says the word “metagenomics”). Oh well, can’t have everything.

For your edification: there is last year’s report “The New Science of Metagenomics” from the National Research Council (you can read it free online, or purchase it as a book). Also, of course, there are at least two extensive databases of metagenomic data, IMG/M (free tutorial) and Camera.

Tip of the Week: Functional Abundance Profile Searching in Genomes

Wow, how can you resist blogging about an article whose title begins “Dissecting biological ‘dark matter’…”? I found the article while working on a new ‘sponsored’ (read ‘free’) tutorial on the Integrated Microbial Genomes with Microbiome Samples (or IMG/M) resource. (I’m hoping the tutorial will be released in the near future & I’ll post when it is available.) IMG/M is a microbial resource that specializes in the analysis of metagenomes. Metagenomes are becoming hot – we’ve blogged about them in the past, as have lots of others. According to the article I found, “biological dark matter” refers to our paltry knowledge & understanding of the Earth’s microbial diversity. The article reports a method for isolating individual bacterial cells from the microbiome of the human mouth, but for my contribution to understanding microbial diversity, I want to give you a tip on using the IMG/M resource to select protein families in genomes based on their relative abundance. Click the image above to view the tip, or follow the links in this post to learn more on your own.
Marcy, Y., Ouverney, C., Bik, E.M., Losekann, T., Ivanova, N., Martin, H.G., Szeto, E., Platt, D., Hugenholtz, P., Relman, D.A., Quake, S.R. (2007). Inaugural Article: Dissecting biological “dark matter” with single-cell genetic analysis of rare and uncultivated TM7 microbes from the human mouth. Proceedings of the National Academy of Sciences, 104(29), 11889-11894. DOI: 10.1073/pnas.0704662104
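
If you are curious what “selecting protein families based on their relative abundance” means in practice, here is a minimal Python sketch of the general idea. It is not IMG/M’s actual implementation: the function names, the `min_ratio` threshold and the counts are all made up for illustration, and it simply assumes you already have a count of genes per protein family for each genome or metagenome.

```python
from collections import Counter


def relative_abundance(family_counts):
    """Convert raw gene counts per protein family into fractions of the total."""
    total = sum(family_counts.values())
    if total == 0:
        return {}
    return {family: count / total for family, count in family_counts.items()}


def enriched_families(query, reference, min_ratio=2.0):
    """Return families whose relative abundance in `query` is at least
    `min_ratio` times that in `reference` (hypothetical selection rule).
    Families absent from the reference are always reported."""
    query_fractions = relative_abundance(query)
    reference_fractions = relative_abundance(reference)
    hits = []
    for family, query_fraction in query_fractions.items():
        reference_fraction = reference_fractions.get(family, 0.0)
        if reference_fraction == 0.0 or query_fraction / reference_fraction >= min_ratio:
            hits.append(family)
    return sorted(hits)


# Made-up gene counts per protein family, for illustration only.
metagenome = Counter({"PF00005": 30, "PF00072": 10, "PF00528": 60})
isolate = Counter({"PF00005": 50, "PF00072": 40, "PF00528": 10})

print(enriched_families(metagenome, isolate))
```

Working with fractions rather than raw counts matters because genomes and metagenomes differ a lot in size, so a family can have more genes in one sample simply because that sample is bigger.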