Video Tip of the Week: Your DNA, Your Say. GA4GH wants to hear from you. And you.

The Global Alliance for Genomics and Health (GA4GH) has come up a few times on our blog. The last time we highlighted them for a tip, it was about their Beacon tool. The idea of the Beacon is that it could interrogate a database but in a very subtle way, without needing access to the entire sequence information of a patient. It would ask a simple yes/no question about a given sequence variant–and if a “yes” came back, then a researcher could go through the process of getting proper access to protected patient data.

So it was a way to keep people from pawing through data that they don’t need. And yet it could still connect people who might benefit from research with researchers who need information.

But certainly issues of patient or donor privacy are hot topics. More and more data will come in from large projects, or from diagnostic samples, and cancer vs normal tissue comparisons, and we are going to struggle with the access vs. privacy matters for a while. The general public is only now becoming aware of the impacts. But we certainly need people to understand and we’ll want them to contribute to expanding our knowledge about health and disease.

That’s why the folks associated with GA4GH, the Wellcome Trust, and the Wellcome Genome Campus are eager to engage the public on their feelings on use of genomic sequence data. They have launched a project called “Your DNA Your Say”[PDF], in the form of a survey with videos to help understand where people are on this issue. Here’s the intro video to entice you to answer the survey:

I answered the survey because I do have concerns about access to information that will help us drive the science forward, as well as about the potential for misuse of the information. But I would like them to hear from as many people as possible, so that we can understand the barriers to research and donation that are looming. Have your say. And spread the word.

You can learn more about their ideas in a variety of publications–I’ll link to one below, but there are other publications and more details about the overall projects and individual tools at the GA4GH web site.

Quick links:

Survey site: YourDNAYourSay.org

GA4GH: http://genomicsandhealth.org/


Video Tip of the Week: PhenomeCentral

Silos. This is a big problem for us with human genome data from individuals. We’re getting sequences, but they are locked up in various ways. David Haussler’s talk at the recent Global Alliance for Genomics and Health (GA4GH) meeting emphasized this barrier, and also talked about ways they are looking to work around the legal, social, and institutional barriers that we’ve created. He talked about Beacon, which I highlighted recently as a Tip of the Week. But there are other strategies needed to connect physicians and patients with other folks who might help them get to answers. Heidi Rehm’s talk provided information about a possible tool for this: PhenomeCentral.

Unfortunately, the videos aren’t uploaded to YouTube; you have to go to the June 10 Meeting page and obtain them from there. The one that contained the information on PhenomeCentral is the one called “Matchmaker Exchange”.

The mission of PhenomeCentral, according to their site, is:

PhenomeCentral is a repository for secure data sharing targeted to clinicians and scientists working in the rare disorder community. PhenomeCentral encourages global scientific collaboration while respecting the privacy of patients profiled in this centralized database.

Certainly people in bioinformatics are familiar with the really crucial information from OMIM and Orphanet. But these are aggregators of information, not patient-specific. There may be lists of features of a condition, but how they appear in a given patient’s situation might vary.

What this new strategy will do is let doctors and researchers take the phenotype and genotype data (you can upload VCF files), and make predictions about the genes involved. They also have ways to “matchmake” possibly similar disease manifestations. This project is part of the larger “MatchMaker Exchange” collection (Note: MME is not a dating site…it’s also still under development). But the idea is that with patient details one could search for matches with other similar patients (depending on the privacy level of the records, of course). It sounded to me like a kind of BLAST for medical conditions (they didn’t call it that). But it also has ways to semantically link phenotype concepts, because they might be entered differently by different evaluating physicians, yet be the same type of issue underneath. That Human Phenotype Ontology (HPO) that I’ve covered a couple of times lately enables this.
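To illustrate the idea of semantically linking phenotype descriptions, here is a toy Python sketch. It is not PhenomeCentral's algorithm: the synonym table, the `TERM:` identifiers (stand-ins for real ontology terms such as HPO IDs), and the simple Jaccard overlap score are all assumptions made for illustration.

```python
# Hypothetical synonym table: free-text labels mapped to made-up term IDs.
# A real system would use an ontology such as HPO instead of this dictionary.
SYNONYM_TO_TERM = {
    "seizures": "TERM:001",
    "epileptic fits": "TERM:001",
    "short stature": "TERM:002",
    "small for age": "TERM:002",
    "hearing loss": "TERM:003",
}


def normalize(labels):
    """Map free-text phenotype labels to a set of shared term IDs."""
    terms = set()
    for label in labels:
        key = label.strip().lower()
        if key in SYNONYM_TO_TERM:
            terms.add(SYNONYM_TO_TERM[key])
    return terms


def jaccard(a, b):
    """Overlap score between two term sets (0.0 means nothing shared)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Two physicians describe similar patients with different wording.
patient_a = normalize(["Seizures", "Short stature"])
patient_b = normalize(["epileptic fits", "small for age", "hearing loss"])
similarity = jaccard(patient_a, patient_b)
```

Because both descriptions are reduced to the same term IDs before comparison, the differently worded entries still count as a match.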

They have 3 levels of privacy settings included: private, matchable (where you can find it in a search, but it’s not wide open to everyone), and public.
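One way to picture the three settings is as a small Python sketch. The class and function names here are hypothetical and are not taken from the PhenomeCentral software; the sketch only assumes that private records never appear in a search, matchable records can be found by a similarity search, and public records are open to browsing.

```python
from dataclasses import dataclass
from enum import Enum


class Privacy(Enum):
    PRIVATE = "private"
    MATCHABLE = "matchable"
    PUBLIC = "public"


@dataclass
class PatientRecord:
    record_id: str
    privacy: Privacy


def can_be_matched(record):
    """Matchable and public records may surface in a similarity search."""
    return record.privacy in (Privacy.MATCHABLE, Privacy.PUBLIC)


def can_be_browsed(record):
    """Only public records are open to everyone."""
    return record.privacy is Privacy.PUBLIC


# Made-up records, one per privacy level.
records = [
    PatientRecord("P1", Privacy.PRIVATE),
    PatientRecord("P2", Privacy.MATCHABLE),
    PatientRecord("P3", Privacy.PUBLIC),
]
matchable_ids = [r.record_id for r in records if can_be_matched(r)]
browsable_ids = [r.record_id for r in records if can_be_browsed(r)]
```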

So although I used the GA4GH talk as a launching point to learn more about the features and conceptual parts of the PhenomeCentral software, I also came across this other webinar that was more specific about the software features (which is what I typically prefer for our tips, the specific software tools). The Genetic Alliance is a patient-centric group interested in answers for genetic and genome-variant medical situations, actively working with advocacy groups and researchers to bridge the needs of both. In their webinar series last year they included PhenomeCentral.

What I didn’t realize from the GA4GH overview was that there are additional tools, including a pedigree tool in the PhenoTips part. A lot of people find our blog by searching for pedigree tools, so I wanted to be sure to mention that specifically. You can try it out by entering fake data in the playground over there, and accessing the Pedigree Tool from that record. This was also handy for me because I didn’t create a login for the main PhenomeCentral site due to the privacy issues.

So have a look at PhenomeCentral. And from the GA4GH video I learned that there is a special journal issue coming up in the fall that will have papers related to these projects. So I’ll link to the PhenoTips publication below now, but when more references become available for this tool or project I’ll add them in. I expect there will be metrics about algorithms in use and other technical details that are important for fully evaluating the tool.

Quick links:

PhenomeCentral: https://phenomecentral.org/

PhenoTips: https://phenotips.org/ (has the playground + pedigree tool)

GA4GH videos: http://genomicsandhealth.org/news-events/events/june-10th-meeting-presentations

Friday SNPpets

This week’s SNPpets include several new tools that I want to examine, including functional annotations in a couple of different ways. But other stuff includes designing DTC consumer products, NLM’s future directions, a new Genomics subreddit, and stunning phylogenetics representations, among other interesting reads.

Video Tip of the Week: Beacon, to locate genome variants of potential clinical significance

This week’s Video Tip of the Week follows on last week’s chatter about the Internet of DNA. As I mentioned then, the Beacon tool we touched on was going to get more coverage. So this week’s video is provided by the Beacon team, part of the larger Global Alliance for Genomics and Health project (GA4GH).

I’ve touched on some of the GA4GH work in the past. I heard more about a very interesting piece of it from David Haussler at the recent TRICON meeting.

D. Haussler, slide from TRICON talk.


The talk was called “Stable Reference Structures for Human Genome Analysis” and it was important for me to see this. I’ve been wrestling with some of the literature (linked below) that describes ways to represent genome variations among massive numbers of humans. It really helped me to hear it described and shown as cartoons on slides that were less like equations. And how this will play out in graphs and visualizations with software tools is of particular interest to me.

So one branch of the Data Working Group of the GA4GH is tasked with how to represent the variations as multiple paths through a graph, instead of the one linear reference genome we think of today. It has to accommodate many types of variations–inversions, deletions, duplications, as well as just SNPs. So, as the kids say today, it’s complicated. But we have to figure it out. Stay tuned, I’m sure we’ll be talking more about this in the years to come.


Beacon is like SETI for genome variations.

Another branch of this project is tasked with trying to figure out how to share genomic data among all the international producers of this data. If we can’t share the data, we won’t be able to look at the variations among humans and learn from them, never mind display them. This has additional layers of social and legal complexity we are just beginning to face. As a first pass at sharing this data, a “Beacon” system has been implemented to help researchers locate variations of interest to them.

You should read up on the whole Beacon philosophy and see its current implementation at their site. From what I gather, it is a minimal way to share genome information, without incurring privacy and consent barriers that might be hit if you were pulling down a whole genome. You can query any site that implements a Beacon to ask: do you have a variation at this position? And the Beacons can respond with “yes” or “no”. If there are useful variations, you can then pursue them from there, and if you need access to more you can go through the channels then. But at least you’ve possibly found some needles in some haystacks that you might not have known about otherwise.
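The yes/no exchange can be sketched in a few lines of Python. This is an in-memory illustration of the idea, not the GA4GH Beacon API: the `Beacon` class, its `query` method, and the variant data are all made up for the example.

```python
class Beacon:
    """Toy beacon: answers only whether a given allele has been observed."""

    def __init__(self, name, variants):
        # variants: iterable of (chromosome, position, allele) tuples.
        self.name = name
        self._variants = set(variants)

    def query(self, chromosome, position, allele):
        # The caller learns True or False and nothing else: no genomes,
        # no sample identifiers, no counts.
        return (chromosome, position, allele) in self._variants


# Made-up data held privately by one hypothetical site.
beacon = Beacon("example-site", [("3", 100735, "A"), ("7", 2000, "T")])

found = beacon.query("3", 100735, "A")
not_found = beacon.query("3", 100735, "G")
```

The point of the design is that the underlying variant set never leaves the site; a researcher who gets a "yes" still has to go through the proper access channels to see anything more.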

The Beacon team has done a short video explaining this. It has no audio, just explanatory text with the graphics. Marc Fiume gave me permission to embed it here.

The “Beacon of Beacons” aggregates the query to send it out to all the known Beacons. You can use it today to search for this kind of data. The video also notes that you can cloak the name of the institution to protect patient privacy.
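As a rough sketch of that aggregation step, the Python below sends one question to several toy beacons and collects the answers. The function name, the site names, the data, and the way cloaked institutions are relabeled are all assumptions for illustration; they do not describe how the real Beacon of Beacons is built.

```python
def query_all(beacons, chromosome, position, allele, cloaked=()):
    """Ask every beacon the same yes/no question and collect the answers.

    beacons: dict mapping site name to a set of
             (chromosome, position, allele) tuples.
    cloaked: site names that should be hidden in the results.
    """
    results = {}
    anonymous_count = 0
    for name, variants in beacons.items():
        answer = (chromosome, position, allele) in variants
        if name in cloaked:
            # Replace the institution's name with a neutral label.
            anonymous_count += 1
            label = f"anonymous beacon {anonymous_count}"
        else:
            label = name
        results[label] = answer
    return results


# Made-up sites and variants.
beacons = {
    "site-a": {("3", 100735, "A")},
    "site-b": {("1", 500, "C")},
    "site-c": {("3", 100735, "A"), ("2", 42, "G")},
}

answers = query_all(beacons, "3", 100735, "A", cloaked=("site-c",))
```

Here `answers` reports a yes from `site-a`, a no from `site-b`, and a yes from a beacon whose name is withheld.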

I have been more acutely concerned about genomic privacy issues than some of my cohorts in this arena. And I fully accept that there will not be privacy–what I want is protection from misuse of the information, which I find lacking in the US legal framework right now. That said, I think that Beacon is a nice work-around for that. If I had a variant of concern, I could ping these other sites to see if others had it. Or vice-versa. But the framework under which the donor of that material provided the data would not be pierced. This makes total sense to me, and I can accept this strategy.

Sharing the genomic data from sequenced individuals is going to be tricky and complex. But I’m keen to see the GA4GH group tackle it. I like several of the directions that I’ve seen so far. But right now–check out Beacon. Implement one if you have this kind of data, and let’s see if it works.

Quick links:

Global Alliance for Genomics and Health: http://genomicsandhealth.org/

Beacon (project details page): http://ga4gh.org/#/beacon

Beacon of Beacons (where you would do a search): http://ga4gh.org/#/beacon/bob


During David Haussler’s talk, he also referenced these papers:

What’s The Answer? (Internet of DNA)

This week’s highlighted discussion tackles the “Internet of DNA”, a story I picked last week in my SNPpets post, which has bubbled up elsewhere. And Biostar folks look at the more technical implications of “A global network of millions of genomes….”

Biostars is a site for asking, answering and discussing bioinformatics questions and issues. We are members of the Biostars community and find it very useful. Often questions and answers arise at Biostars that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those items or discussions here in this thread. You can ask questions in this thread, or you can always join in at Biostars.

This week’s discussion comes as part of an interesting week on the personalized medicine front. A whole bunch of things are coming together–the US getting a Chief Data Scientist who talks about bioinformatics, The NEJM talking about training physicians to deal with medical genomics issues, and the “Internet of DNA” getting out into the popular science media realm. So have a look at what bioinformatics nerds made of this, and what their thoughts are:

Forum: A global network of millions of genomes could be medicine’s next great advance. | Beacon

Internet of DNA

A global network of millions of genomes could be medicine’s next great advance.

Availability: 1-2 years

Noah is a six-year-old suffering from a disorder without a name. This year, his physicians will begin sending his genetic information across the Internet to see if there’s anyone, anywhere, in the world like him.


Do you think this will happen within 2 years?


This is the technical implementation I think  that they are talking about:

The Beacon project is a project to test the willingness of international sites to share genetic data in the simplest of all technical contexts. It is defined as a simple public web service that any institution can implement as a service. The service is designed merely to accept a query of the form “Do you have any genomes with an ‘A’ at position 100,735 on chromosome 3” (or similar data) and responds with one of “Yes” or “No.” A site offering this service is called a “beacon”.


So it just a federated query over multiple large genomics (+ phenotypes) data sets. Full genomes are not centralized, or moved, so privacy is less of a concern.


And please, contribute your own thoughts over there. We need to be having this discussion. Also, watch for more on this Beacon….

Video Tip of the Week: Google Genomics, API and GAbrowse

This week’s video tip comes to us from Google–it’s about their participation in the “Global Alliance for Genomics and Health” coalition. Global Alliance is aimed at developing genomic data standards for interoperability, and they’ve been working on creating the framework (some background links below in the references will provide further details). It has over 170 members, and one of these members is Google. Although Google talked about this earlier this year when they joined this group, more recently pieces have begun to emerge about the directions and specific tools. Google’s efforts made the mainstream news recently in their announcement about working on a project to examine genomic data associated with autism.

Although this video doesn’t talk about a single specific tool like we usually cover, it provides more detail about this framework for building tools which is important to understand. And in this video I learned about a new browser developed under this project that I did have a quick look at, and I’ll add below.

The browser that they reference is called GAbrowse–I assume that means Global Alliance browse–but there’s not a lot of detail. Their “about” dialog box says this:

GABrowse is a sample application designed to demonstrate the capabilities of the GA4GH API v0.1.

Currently, you can view data from Google, NCBI and EBI.

  • Use the button on the left to select a Readset or Callset.
  • Once loaded, choose a chromosome and zoom or drag the main graph to explore Read data.
  • Individual bases will appear once you zoom in far enough.

The code for this application is in GitHub and is a work in progress. Patches welcome!

I kicked the tires a bit, but it’s clearly not fully fleshed out at this point. When I tried to zoom up from the nucleotide level it went up a bit, but eventually you hit a point that says “This zoom level is coming soon!” So certainly there’s more to come, and a lot more functionality that would be necessary. But it’s early. And it’s just a demo. I have no idea if it’s intended to become a stand-alone public browser.

So if you are interested in the issue of cross-compatibility of human genomic data (and as far as I can tell this is all human-centric, I’d like to see a wider conversation on this), it’s probably worth knowing what Google is offering here. You should also be aware of what the Global Alliance is working on. Below I’ve added some of the publications and media I’ve seen about their efforts.

Hat tip to Can Holyavkin on Google+ for the link to the video.  https://plus.google.com/u/0/114690993717100405711/posts/gwNy5E7E6Vb?cfem=1

Quick links:

Global Alliance for Genomics and Health: http://genomicsandhealth.org/

Google genomics: https://developers.google.com/genomics/

GAbrowse: http://gabrowse.appspot.com

