This week’s tip of the week is on Gemini which is the acronym for “GENome MINing.” Unlike most of the tips we give every week, this one is a software package. But, it is does use and integrate with many internet databases such as dbSNP, ENCODE, UCSC, ClinVar and KEGG. It’s also a freely available, open source tool and quite a useful software package that gives the researcher the ability to create quite complex queries based on genotypes, inheritance patterns, etc. The above 12 minute clip is a talk given at a conference that gives a introduction of the science behind the tool.
The abstract from the recent paper from the developers gives a good introduction concerning the functionality of the tool:
Modern DNA sequencing technologies enable geneticists to rapidly identify genetic variation among many human genomes. However, isolating the minority of variants underlying disease remains an important, yet formidable challenge for medical genetics. We have developed GEMINI (GEnome MINIng), a flexible software package for exploring all forms of human genetic variation. Unlike existing tools, GEMINI integrates genetic variation with a diverse and adaptable set of genome annotations (e.g., dbSNP, ENCODE, UCSC, ClinVar, KEGG) into a unified database to facilitate interpretation and data exploration. Whereas other methods provide an inflexible set of variant filters or prioritization methods, GEMINI allows researchers to compose complex queries based on sample genotypes, inheritance patterns, and both pre-installed and custom genome annotations. GEMINI also provides methods for ad hoc queries and data exploration, a simple programming interface for custom analyses that leverage the underlying database, and both command line and graphical tools for common analyses. We demonstrate GEMINI’s utility for exploring variation in personal genomes and family based genetic studies, and illustrate its ability to scale to studies involving thousands of human samples. GEMINI is designed for reproducibility and flexibility and our goal is to provide researchers with a standard framework for medical genomics.
If you’d like to learn more, there is some pretty good documentation of the software package here.
While I’m at it, and totally unrelated except it’s human genomics, there is this slideshare presentation of the ‘current’ state of personal genomics. Current is in quotes because the slideshare is actually from 3 years ago, but there is a lot of good information in there. Anyone know of a more up-to-date slide set or extensive intro to the current state of personal genomics science similar to this?
(tutorials are linked below for those tools in bold above)
Paila U, Chapman BA, Kirchner R, & Quinlan AR (2013). GEMINI: Integrative Exploration of Genetic Variation and Genome Annotations. PLoS computational biology, 9 (7) PMID: 23874191