Tip of the Week: Gemini, exploration of genetic variation

This week’s tip of the week is on GEMINI, which is short for “GEnome MINIng.” Unlike most of the tips we give every week, this one is a software package. But it does use and integrate with many internet databases such as dbSNP, ENCODE, UCSC, ClinVar and KEGG. It’s also a freely available, open source tool that gives the researcher the ability to create quite complex queries based on genotypes, inheritance patterns, etc.  The above 12 minute clip is a talk given at a conference that gives an introduction to the science behind the tool.

The abstract from the developers’ recent paper gives a good introduction to the functionality of the tool:

Modern DNA sequencing technologies enable geneticists to rapidly identify genetic variation among many human genomes. However, isolating the minority of variants underlying disease remains an important, yet formidable challenge for medical genetics. We have developed GEMINI (GEnome MINIng), a flexible software package for exploring all forms of human genetic variation. Unlike existing tools, GEMINI integrates genetic variation with a diverse and adaptable set of genome annotations (e.g., dbSNP, ENCODE, UCSC, ClinVar, KEGG) into a unified database to facilitate interpretation and data exploration. Whereas other methods provide an inflexible set of variant filters or prioritization methods, GEMINI allows researchers to compose complex queries based on sample genotypes, inheritance patterns, and both pre-installed and custom genome annotations. GEMINI also provides methods for ad hoc queries and data exploration, a simple programming interface for custom analyses that leverage the underlying database, and both command line and graphical tools for common analyses. We demonstrate GEMINI’s utility for exploring variation in personal genomes and family based genetic studies, and illustrate its ability to scale to studies involving thousands of human samples. GEMINI is designed for reproducibility and flexibility and our goal is to provide researchers with a standard framework for medical genomics.
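To make the idea of "complex queries based on sample genotypes, inheritance patterns, and annotations" a bit more concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not GEMINI itself: the table, the column names, the genotype labels, the gene names and the three rows are all made up for illustration, and the real tool's schema and commands should be taken from its documentation.

```python
import sqlite3

# Toy stand-in for a variant database. The table layout, column names and
# genotype labels are illustrative assumptions, not GEMINI's documented schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE variants (
        chrom TEXT, pos INTEGER, ref TEXT, alt TEXT,
        gene TEXT, in_dbsnp INTEGER, clinvar_sig TEXT,
        gt_child TEXT, gt_mother TEXT, gt_father TEXT
    )
    """
)

# Three invented variants for one family trio (GENE_A..C are placeholders).
rows = [
    ("chr1", 1000, "A", "G", "GENE_A", 1, "benign", "het", "het", "hom_ref"),
    ("chr2", 2000, "C", "T", "GENE_B", 0, None, "het", "hom_ref", "hom_ref"),
    ("chr3", 3000, "G", "A", "GENE_C", 0, "pathogenic", "hom_alt", "het", "het"),
]
conn.executemany("INSERT INTO variants VALUES (?,?,?,?,?,?,?,?,?,?)", rows)


def candidate_de_novo(db):
    """Variants not in dbSNP where the child is het and both parents are hom_ref."""
    cur = db.execute(
        """
        SELECT chrom, pos, gene FROM variants
        WHERE in_dbsnp = 0
          AND gt_child = 'het'
          AND gt_mother = 'hom_ref'
          AND gt_father = 'hom_ref'
        ORDER BY chrom, pos
        """
    )
    return cur.fetchall()


def candidate_recessive(db):
    """Variants where the child is hom_alt and both parents are het carriers."""
    cur = db.execute(
        """
        SELECT chrom, pos, gene FROM variants
        WHERE gt_child = 'hom_alt'
          AND gt_mother = 'het'
          AND gt_father = 'het'
        ORDER BY chrom, pos
        """
    )
    return cur.fetchall()


print(candidate_de_novo(conn))
print(candidate_recessive(conn))
```

The point of the sketch is only that once variants, genotypes and annotations sit in one table, an inheritance pattern plus an annotation filter becomes a single query rather than a chain of scripts.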

If you’d like to learn more, there is some pretty good documentation of the software package here.

While I’m at it, and totally unrelated except it’s human genomics, there is this slideshare presentation of the ‘current’ state of personal genomics. Current is in quotes because the slideshare is actually from 3 years ago, but there is a lot of good information in there. Anyone know of a more up-to-date slide set or extensive intro to the current state of personal genomics science similar to this?


Relevant Links:

GEMINI Software package
UCSC Genome Browser

(tutorials are linked below for those tools in bold above)

Relevant Reference:

Paila U, Chapman BA, Kirchner R, & Quinlan AR (2013). GEMINI: Integrative Exploration of Genetic Variation and Genome Annotations. PLoS computational biology, 9 (7) PMID: 23874191

Google DNA Maps (spoof…er…I think…)

I had no intention of posting for a few days because of the holiday, but this was just too funny to pass up. I had to watch it a couple of times to catch everything; even the crawls at the bottom are hysterical.

Hat tip to Casey Bergman for retweeting this–I might have missed it otherwise.

@jandot: Google DNA Maps - hilarious, well, kind of …  http://youtu.be/mOgTVx9ge5M

Ethics committee report on genome sequencing and privacy

Well, this changes my morning. It’s a 150-page report.

Get the report itself here: Privacy and Progress in Whole Genome Sequencing

Press or blog coverage (I’ll update if I see others):

Nature: US ethics panel reports on DNA sequencing and privacy

Science: President’s Ethics Panel Urges New Protections for Whole Genome Data

AP: Bioethics panel urges more gene privacy protection

USA Today: Panel: Protect patients who use whole genome sequencing

AARP: How private is your genetic code? Less so than you might think.


Quick note: Misha Angrist interview on Skeptically Speaking tonight

Hey folks–just a quick reminder of an interview with Misha this Sunday evening (North America). Other times found here: Event Time Announcer. If you can’t make it, the recording goes up a bit later in the week and you can check it out then.

Here’s the blurb from Skeptically Speaking, on Google+ (and you can find me and Misha in that conversation too).

Live, Sunday at 6 pm MT, we’ll discuss DNA, genetics, and personal genomics with Dr. +Misha Angrist, Assistant Professor at the Duke Institute for Genome Sciences & Policy, and author of Here Is a Human Being: At the Dawn of Personal Genomics. Email questions to live@skepticallyspeaking.com, or join us live in the chat!

To listen live go here: Skeptically Speaking #143 Here is a Human Being.

What would bioinformatics professionals do with their personal genome? “I simply don’t want to know.”

Over the long holiday weekend I noticed an interesting item in my twitter feed. A number of people were pointing to the post entitled: My Genome Via E-mail by David Ewing Duncan. Some of you may be familiar with David’s writing and his big project called “Experimental Man”. He has been exploring all sorts of biomedical tests and investigations about his body, making him probably the most deeply studied case of personalized medicine at this point.

Well, he has also taken to genomics as part of this, of course. And now he’s one of the people in the Personal Genome Project and has his full genome sequence in hand. Well, sort of. He has it, but he’s asking for guidance on what to do with it:

This is an appeal: Send me you ideas for how best to interpret my newly sequenced complete genome!

Now, as an exercise over a year ago I thought this through. I have no expectation of having my genome any time soon–but it’s a question people ask me and I thought it was fun to think about. I reviewed that post the other day and I still think that’s what I’d do:

1) Assessment and QC

2) Build a personal genome browser with various tracks, including a literature track for personally curating stuff interesting or relevant to me

3) Look closer at specific medically relevant genes. I know this is looking under the flashlight, but the most knowledge and anything actionable would probably be in this set. I’d also look specifically into family issues (like that allergy/eczema stuff I found in my 23andMe data) and try to learn things there.
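Just to show what I mean by steps 1 and 3, here is a rough Python sketch. Everything in it is an assumption for illustration: the Variant record, the quality and depth cutoffs, and the gene names are placeholders, and a real pipeline would read calls from a variant file and use carefully chosen thresholds and gene lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One variant call; the fields are a simplified, hypothetical record."""
    chrom: str
    pos: int
    gene: str
    quality: float  # call quality score
    depth: int      # read depth at the site


def passes_qc(v, min_quality=30.0, min_depth=10):
    """Step 1: keep only calls above illustrative quality and depth cutoffs."""
    return v.quality >= min_quality and v.depth >= min_depth


def triage(variants, genes_of_interest):
    """Return (QC-passing calls, the subset falling in genes of interest)."""
    kept = [v for v in variants if passes_qc(v)]
    flagged = [v for v in kept if v.gene in genes_of_interest]  # step 3
    return kept, flagged


# Invented example calls; GENE_X and friends are placeholders, not real genes.
calls = [
    Variant("chr1", 1000, "GENE_X", quality=50.0, depth=30),
    Variant("chr2", 2000, "GENE_Y", quality=12.0, depth=40),  # fails quality
    Variant("chr3", 3000, "GENE_Z", quality=45.0, depth=4),   # fails depth
    Variant("chr4", 4000, "GENE_W", quality=60.0, depth=25),
]
kept, flagged = triage(calls, genes_of_interest={"GENE_X", "GENE_Z"})
print(len(kept), [v.gene for v in flagged])
```

Step 2, the personal browser with custom tracks, is a tooling choice rather than a few lines of code, so it is left out here.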

But I also thought I’d like to know what some of my peers in bioinformatics/genomics would do. As you may know if you follow this blog, we participate in discussions at BioStar. The participants there are active in genomics research around the world, and they are super-users of the tools of the trade in this field. Who better to ask? So I posted a question asking what they would do with their personal genome sequence. I offered my skeletal workflow as an example, and expected some thoughts on what they would do.

What would you do with your personal genome data? is my question over there.

To my surprise, the top rated answer at this time says this:

I may be in minority but I’ll say this: right now I simply don’t want to know – Did you ever notice how genomic variation never correlates with good news. It seems there is only bad news. There are no SNPs for happiness, friendship or love….

Um. Ok. That’s one way to approach this. I was surprised, really–I didn’t expect the question to become philosophical. I really wanted a workflow.

For most of the weekend the second rated answer was this:

Whatever you do with it should be up to you to decide, use it for your personalized medicine if you wish. So my sole recommendation is: Keep the data private and well protected and encrypted! Decide in an informed way, whom you grant access to them….

There were real concerns about the security and misuse of this data.

There are a couple of other interesting answers as well. I have to say it was fascinating. It wasn’t what I expected–but it was illuminating for me. I haven’t always been the most enthusiastic participant in the personal genomics debate, as I have real concerns about security and misuse of the information, and the current utility. But it’s certainly coming whether we are ready or not, and I really wanted to know what people would do with it in a concrete way if they had it. And I thought bioinformatics/genomics professionals would have the best leads on this.

“right now I simply don’t want to know”

I’m considering adding a bounty to my question over there. You can add some of your own points to the question for encouragement to obtain an answer. And I’ll still be the highest ranked identified female over there–so I can afford the points.

If you have some thoughts and want to join BioStar, and if you give me a decent workflow, I may award the points to you!  Anyone? Bueller? Bueller? Anyone?

Myriad patent decision: Genoscenti will be disappointed.

So it appears that the decision is now out on the case against Myriad’s patents on the BRCA diagnostic testing. I haven’t read the decision yet, but I saw a great stream of tidbits coming from @genomicslawyer, Dan Vorhaus. Check out the whole thing, starting around here and working upwards:

@genomicslawyer: $MYGN opinion is out: bit.ly/qe3bZm CAFC full reversal on gene patents, partial reversal on diagnostic patents.

This made me laugh, as it is quite true and I’m sure we’ll be seeing the Genoscenti discuss this very soon:

RT @matthewherper: @genomicslawyer as expected. Genoscenti will be disappointed.

I think this is probably the best assessment of it so far though:

RT @genomicslawyer: Plenty of material here for critics/fans of gene patents, SCOTUS & Congress. Little chance this is last word.

I’d put money on this, for sure….

UPDATES with other coverage: UPDATE 1-Myriad can patent breast cancer genes – court

RT @NatureNews: US Court upholds Myriad’s breast cancer gene patents http://goo.gl/fb/kNrNN

NHGRI tweets a link to the NYT story: RT @genome_gov: U.S. fed. appeals ct. affirms co.’s right to patent genes used as basis for genetic test of ovarian/breast cancers. http://qoo.ly/6he

RT @GENbio: If you missed today’s important court ruling on #Myriad’s patents on BRCA genes click here. http://bit.ly/pN04SW

RT @ScienceInsider U.S. Appeals Court Backs Gene Patents in Myriad Case http://bit.ly/mVCrm9

UPDATE: The detailed analysis begins to roll in. You should check out this post from @genomicslawyer Dan Vorhaus and John Conley, on this decision:  Pigs Return to Earth: Federal Circuit Reinstates Most—But Not All—of Myriad’s Patents

Must see: Richard Resnick TEDxBoston talk

Recently my twitter feed was burning up at the live presentation at TEDxBoston by Richard Resnick. I had caught most of it (thanks to the tweeps who sent word), and scribbled down a few notes. But mostly I wrote in my notebook that I needed to seek out this talk on the web later and review it. Just found it–and you need to watch it.

Richard Resnick gives an excellent presentation of the state of genome sequencing today–the rate and the increase–with just a couple of minutes and some well-done graphics. But he quickly moves to what this means for medicine today.

He describes a typical case of a woman’s cancer and the medical interventions for it over the last few years–and says that this will look like bloodletting to us in the near future because it’s so primitive. He explains how her genome and that of her cancer were examined to discover issues.

He relates the story of the twins who were discovered to have a treatable condition based on their genome sequencing, after suffering for years with unknown problems. And the story of Nick, who Matthew Herper called The First Child Saved By DNA Sequencing.

“The prospect of using the genome as a universal diagnostic is upon us today.”

He talks about how this can give all of us extra years of health. But he also turns to how this impacts the planet, including food production which is being affected by this technology too–and says:

Now look, as long as we continue to increase the population, we’re going to have to continue to grow and eat genetically modified foods. And that’s the only position I’ll take today.

Next he places the genomic revolution into personal context and consumer uses and social implications. He shows an application he had for life insurance which *specifically* demands to know if you have had a personal genomics test done. (By the way, the US GINA legislation does not prevent discrimination based on your genome for this kind of insurance–a lot of people don’t realize that).

The excitement of the time we live in, with appropriate warnings about the implications, is really well conveyed in this talk. And he asks everyone watching to wake up and influence the genomic revolution we are in.

It’s just over 11 minutes long. Watch it. Srsly. Worth your time.

In case the embed doesn’t work, or to watch embiggened, go here: http://tedxtalks.ted.com/video/TEDxBoston-Richard-Resnick-Th-2

TEDxBoston – Richard Resnick – The Next Hot Commodity of Genome Sequences

George Church at TEDMED, many thoughts on personal genomics

This is a talk from TEDMED in October of 2010, but I just watched it and it is relevant for the state of play in this field today.

For people who don’t know Church, he’s one of the leaders of Knome, Inc (one of the personal genomics companies) and also of the Personal Genome Project (PGP). And you can check out his medical and genomic details as part of that project as well. But he has wide connections and influence in this arena, and it’s worth hearing his perspective on personal genomics. He sets it up talking about “synthetic personal genomes” and mentions how he’s a mutant. He moves on to talk about phenomes, and even how green chemistry is coming together with genomics.

He talked about the AnAge database, which was new to me: the animal ageing and longevity database–and how they are exploring the genomes of long lived organisms for information. (I always love a new database…check out my profile for what I think of them….) He highlighted the story of that child whose genome was sequenced that I talked about before, and how it demonstrates that sequencing personal genomes is right in the clinic today. He also speaks about the epigenome toward the end. It’s a lot to cover, and I’m not sure it’s the most accessible talk for novices, and it ends a bit abruptly (I imagine he has enough material to go on for about 7 days straight, but TED limits everyone to 15 minutes). But man–I wrote down 5 different thought + action items out of this that I have to go work on right now…

My favorite part was when he said this–about the environment and health, which he says is a frequently asked question: “If health is mostly environmental, why do we need genetics?” And I love the way he framed the answer–that the point of personal genomics is not to say “here’s your genetic destiny and get used to it” but instead to determine “what’s the ideal environment for your genome”. “And it’s not one-size-fits-all.” <–THIS is a point I keep trying to make with people who are trying to deny any influence of genetics on health. Of course environment influences biology–that’s pretty much what biology is about. But your genome constrains the response it can give–and you can’t force the same lifestyle, food, and clean living strategies on everyone because each person may respond differently.


Hat tip to @fbfukushima for the retweet of the original:

Video @TEDMED – Pushing back against skepticism, George Church talks about the bright future of personal genomics.

SNPTips update (1.1)

I did a tip of the week on SNPTips a few months ago (more information there). It’s a great add-on for viewing your genomic data while browsing databases and web sites. They’ve moved to version 1.1, which brings two nice new features, a compatibility update, and some bug fixes:
*You can now use your deCODEme data, in addition to the 23andMe support they started with.
*You can use SNPTips even without raw data to view SNPs on a page.
*It’s been updated for Firefox 4.x.

You can check out our previous tip here (which still applies :).

SNPTips landing page at 5am Solutions.