Tag Archives: nhgri

DNA Day approaches (& when did the day start changing?)

DNA Day is approaching. On April 25th, nearly 60 years ago, Watson and Crick published their paper describing the double-helix structure of DNA. DNA Day is celebrated internationally to raise the public’s understanding of genetics, DNA and genomics.

You can also check out many sites this month (and through the year) to learn more about genetics and DNA. If you know of any others, please comment and I’ll add them to our list.

The University of Utah has a great site, “Learn Genetics”, with a DNA Day rundown.

The research journal Nature has a list of links and topics celebrating the 50th anniversary, including a free archive of the papers published by Watson, Crick, Franklin and others.

The National Human Genome Research Institute (NHGRI) celebrates National DNA Day every year (though inexplicably on a different day in April each year*) with events, chats and more.

*I also have a question for our readers: why has NHGRI been celebrating DNA Day on different days the last couple of years? I always understood DNA Day to be April 25th because that was the day the Watson/Crick paper was published. NHGRI’s National DNA Day celebration was on April 25th for the years 2007, 2008 and 2009, but in 2010 it was April 23rd and this year it’s April 15th. I kind of understand last year: like Veterans Day or any other holiday that falls on a Sunday, it’s nice to move it to a day when people pay attention. Last year April 25 fell on a Sunday, so they moved it to a Friday. And it’s not like NHGRI is alone; ASHG, 23andMe and others are doing DNA Day on April 15th this year (well, 23andMe had their special yesterday). What’d I miss?
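If you want to double-check the calendar reasoning yourself, here is a minimal Python sketch. It only uses the standard library; the helper name `weekday_name` and the `DAY_NAMES` list are made up for this illustration and have nothing to do with how NHGRI actually picks the date.

```python
from datetime import date

# date.weekday() returns 0 for Monday through 6 for Sunday,
# so this list maps that index to a readable name.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def weekday_name(year, month=4, day=25):
    """Return the weekday name for a date (defaults to April 25 of `year`)."""
    return DAY_NAMES[date(year, month, day).weekday()]


if __name__ == "__main__":
    # The anniversary date for each year mentioned in the post.
    for year in range(2007, 2012):
        print(year, "April 25:", weekday_name(year))

    # The dates the celebration was actually moved to.
    print("2010 April 23:", weekday_name(2010, 4, 23))
    print("2011 April 15:", weekday_name(2011, 4, 15))
```

Running it shows April 25, 2010 as a Sunday and April 23, 2010 as a Friday, which fits the "move it off the weekend" explanation for last year. It does not explain this year, though: April 25, 2011 is a Monday.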

NHGRI Symposium – Viewable Online Now

Image from NHGRI Decade of Human Genome Sequence Symposium intro

Just over a week ago, while Mary & Trey were presenting our live UCSC and ENCODE workshops, I was back home enjoying the live feed of the “A Decade with the Human Genome Sequence: Charting a Course for Genomic Medicine” symposium from NHGRI. I missed a few of the live sessions due to family obligations, but really enjoyed the rest of it, both for the historic perspectives that it gave and for the battle cry going forward that it provided. A lot of “heavy” genomics hitters were included, both in the audience and as speakers.

This morning I noticed that the videos and most slides from the day are now available, & I thought I’d share the link with you. I can see several reasons to watch this series of talks – to help celebrate the human genome project, out of curiosity, for a historic perspective, or even to learn what types of things NHGRI & even NIH are likely to lean their funding might towards. The speakers did touch on some guiding themes for going forward, but from my perspective the symposium really was more about celebrating what has been – how little we knew 10+ years ago, examples of translational successes, and (especially) how far sequencing technologies have come. Eric Lander said something like: if you can sequence, you should in your research, because sequencing is cheap and getting cheaper each day. What was touched on a lot less in the symposium talks is exactly WHAT to DO with all this cheap sequence data. And I don’t even mean the ‘data deluge, where do we store it’ part of the issue, which is of huge concern. I mean on a concrete level, once I personally get sequence, or my lab gets sequence, how do we learn from that and move our research forward? Sean Eddy gave a great talk that touched on the incredible things that his and his wife’s labs are doing, but I don’t think every researcher that takes Lander’s advice has the wherewithal and know-how of Eddy.

In the panel discussion moderated by Sharon Terry, Stephen Sherry of NCBI discussed his experience with being the ‘family interpreter’ of their 23andMe genome data. I’ve heard similar stories of figuring out “what to do with it” from Trey, as he has blogged about in the past. And these are people “in the know” who have lots of databases and tools available to them. What is the “average” personal genomics consumer going to do with their genome? There were some pointed audience questions and micro blog comments about data analysis. Someone pointed out that as sequencing costs go down, analysis costs go up. There were no real answers given by the symposium speakers to this question. Yeah, ok, it was a celebration & time was short. But this is a HARD question to answer! How are we going to “translate” all our cheap sequence “information” into health knowledge?

If you only plan on pondering one of these symposium talks, I’d suggest Maynard Olson’s closing remarks. For me it’s like one of those college lectures where, if you didn’t realize who he is or listen to his ideas, you might think you were in for a nap. But upon listening, and most importantly, reflecting, you realize that the ideas were so big & rapid fire that you were probably only chewing on a third of it & still having a mouthful. I cannot possibly do justice to his talk without just quoting the whole thing, so I’ll just recommend you watch with your thinking cap on. I personally keyed in on his phrase “radical integration”. I am absolutely biased but I’d like to think that what we do here at OpenHelix is helping with that integration. We may not be getting people’s genomic information directly into their medical charts, or helping build the technology to move towards electronic medical records, but we are helping to integrate by bridging the understanding gaps between software developers and end users, between the experts in one field and the resources available in related, but as-yet not integrated fields. We are providing outreach every day to help integrate researchers and resources across research areas. So are lots of other groups – resource providers who understand the importance of outreach, institutions that believe in the value of resource training, science librarians and bioinformatics groups who work diligently to further the efforts of researchers at their institutions. We often describe OpenHelix as acting as a bridge, and in his talk Francis Collins uses a bridge metaphor for spanning the “valley of death” between Fundamental Knowledge and the Application of fundamental knowledge.

What I want is a realization and appreciation by funding agencies, universities, and conference organizers alike of the importance and impact of resource outreach on our ability to integrate sequence data into daily science and health – you gotta be able to understand it & analyze it to use it! Ok, nuff said for this rant. The symposium videos are cool for a lot of reasons – if you’ve got some down-time I’d suggest you check them out.

Oh yeah, and if you do watch or read the full strategic plan, leave your comments & thoughts in one of the many areas that they are providing:
A Decade with the Human Genome Sequence (YouTube playlist of the symposium)
The (NHGRI) Strategic Plan (comments at the bottom of the page)
NIH Feedback page (more for new National Center for Advancing Translational Sciences (NCATS) feedback than NHGRI or genome feedback)

Tip of the Week: SKIPPY predicting variants w/ splicing effects

More and more disease-causing mutations are being identified in exonic splicing regulatory sequences (ESRs). These disease effects can result from ESR mutations that cause exon skipping in functionally diverse genes. In today’s tip I’d like to introduce you to a tool designed to detect exon variants that modulate splicing. The tool is named SKIPPY and has been developed and is maintained by groups in the Genomic Functional Analysis research section of the NHGRI.

At the end of the post I cite a very well-written paper describing the development of SKIPPY, as well as the background on why the tool was developed. I won’t have time to go into all those details, but if you are interested the paper is freely available from Genome Biology. The site also has nice, clear documentation and example inputs – which I will use as my examples. Splicing can be modulated in a variety of ways, including the loss or gain of exonic splicing enhancers (ESEs) or silencers (ESSs). Variants accomplishing either of those are referred to as splice-affecting genome variants, or SAVs. Not all of the abbreviations are explained on the results page, as you will see in the tip, but all are explained in detail in the SKIPPY publication, and in the ‘Methods and Interpretations’ and ‘Quick Reference and Tutorial’ areas of the site.

I first found the tool because it was mentioned in a nice review entitled “Using Bioinformatics to predict the functional impact of SNVs”, which is a paper that reviews mechanisms by which point mutations can affect function, describes many of the algorithms and resources available & provides some sage advice. I’ll say more about it in a later post. For now, check out the tip & the SKIPPY resource, and if you use the site please let us know what you think.

Woolfe, A., Mullikin, J., & Elnitski, L. (2010). Genomic features defining exonic variants that modulate splicing. Genome Biology, 11 (2). DOI: 10.1186/gb-2010-11-2-r20

Cline, M., & Karchin, R. (2010). Using bioinformatics to predict the functional impact of SNVs. Bioinformatics. DOI: 10.1093/bioinformatics/btq695

I can haz outreach? Nobody speaks for the end users.

Recently there was much buzz in the #bioinformatics twittersphere over this blog post by Sean Eddy: The next five years of computational genomics at NHGRI

It is a very nice post about some exciting prospects for the future.  The idea of planning “explicitly for sustainable exponential growth” is wise.  There will be no abatement of the flow of data at this point–it’s no longer a big bolus of one species’ data, or one type of project.  The taps are wide open now, and we just keep adding more taps.

I also love the idea of “democratization“.  In part, it includes:

….To enable individual investigators to make effective use of large datasets, we must create an effective infrastructure of data, hardware, and software. NHGRI has extensive experience in big data, and can lead and catalyze across the NIH….

Now, I know this is a snippet of some thoughts–there may be more to it in the actual planning meetings on this.  But it pushed my buttons because it sounds a lot like what we always hear about big data projects: build it and they will come.

It got a little better in another segment:

Spur better software development. Traditional academia and funding mechanisms do not reward the development of robust, well-documented research software; at the same time, the history of commercial software viability in a narrow, rapidly-moving research area like computational genomics is not at all encouraging….

Well-documented research software.  Sigh.  We probably read more documentation than most people. And even the good documentation can be brutal. Dated. And not particularly effective. But still–if nothing else, please reward time spent on documentation….

But what is missing for me from this–and not just this, but most of these big data types of projects–is a real commitment to outreach and support for end users.  Formal, organized, supported, rewarded, outreach.  Sometimes there is a place to write to with questions.  But we probably send in more questions to projects than most people too–and the success rate for answers varies widely.  But even when we get good answers–that’s not enough.

I know funding is hard.  We can’t fund everything.  Database and software projects have to struggle to even persist.  Curation is frequently not valued enough.  And often curators are expected to do outreach as just one of their tasks…which pushes outreach even further down the priority list.  But without dedicated outreach–formal, quality, active outreach–databases and software projects won’t have as many users, and not many effective users.  Which will make funding agencies wonder if they should keep supporting them.  Which…well, you can see where this spiral goes….

What bugs me, I guess, is essentially this: Nobody speaks for the end users. There’s really no one in these types of meetings who speaks for the consumers of this software and this data.  I mean people who aren’t directly attached to the data production and management.  The project teams think they are thinking about the users.  They really want users.  But ur not doin’ it rite.

I would like to see outreach and end user support valued, required, and really done right.  No matter how  much hardware and documentation you throw at these projects, if people 1) don’t know it exists, and 2) have no idea how to use it, the project will not yield all the results that it could. A marker paper is nice.  But it’s not sufficient, folks. And it’s nice to have the high-end team members talk at conferences. But that reaches only a tiny subset of the users or potential users.  And another thing about that: a lot of times people are hesitant to ask what sound like naive questions to the high-end representatives of these projects.  I’m jes’ sayin.

Yes, this is fairly self-serving for me to say.  But we see the users when we do outreach.  They crave it.  They love it.  We’ve been lucky to be a part of some great projects that do outreach right.  We have seen it work.  It should be Standard Operating Procedure on software and database projects.  Not an afterthought.


I had a Basset Hound growing up. His name was Useless, Useless S. Grunt. Well, actually it was formally Ulysses S. Grant because the US Kennel Club wouldn’t accept Useless S. Grunt as a name as they felt it was too demeaning. Not sure if they felt it was demeaning to the dog or to the president, but that’s neither here nor there, is it?

So, you ask, what made me think of that long-passed sweet dog that tripped over its too-long ears with its too-short legs? It turns out that researchers have found the genetic cause of those short legs in Basset Hounds (and Dachshunds and other breeds).

As NHGRI’s press release states:

In a study published in the advance online edition of the journal Science, the researchers led by NHGRI’s Elaine Ostrander, Ph.D., examined DNA samples from 835 dogs, including 95 with short legs. Their survey of more than 40,000 markers of DNA variation uncovered a genetic signature exclusive to short-legged breeds. Through follow-up DNA sequencing and computational analyses, the researchers determined the dogs’ disproportionately short limbs can be traced to one mutational event in the canine genome – a DNA insertion – that occurred early in the evolution of domestic dogs.

The insertion turns out to be a retrogene, which of course I also find interesting in that I studied retrotransposable elements. Reverse transcriptase has this habit of reverse transcribing RNA into DNA which can get reinserted back into the genome (hence processed pseudogenes of course).

The study is interesting for two reasons (other than because I had a Basset Hound and studied the evolution of retroelements ;). First, it gives us a further clue into evolutionary events that lead to large changes in morphology, and into the role of retrotranscription in them. Second, it gives us a clue into possible human conditions.

For more about the dog genome, you can read our several posts about it, go to NCBI’s dog genome home site (or UCSC or Ensembl and other browsers) and read the paper (needs a subscription of course, it’s in Science). It’s an interesting read so far (I want to find some time to read it more fully; perhaps Useless doesn’t live up to his name... he didn’t really even then :D).

Tip of the Week: PhenX Toolkit for GWAS Phenotype and Exposure Studies


Today’s tip is on a new resource brought to you by the National Human Genome Research Institute, or NHGRI. The resource is PhenX Toolkit version 2.1, which was released on May 22, 2009. The PhenX Toolkit provides protocols for taking standardized measurements of research subjects’ physical characteristics and their environmental exposures. You can browse for protocols by domain or measurement type, or search for protocols. If you register, you are also able to collect sets of reports. These can be saved for each of your projects, or for later modification. I’ll introduce you (briefly) to all of this and more in this tip.

I learned about this new resource from a GenomeWeb Daily News article in which they published NHGRI’s press release.

Pointing us out at Genome.gov :)

NHGRI recently pointed out our new set of tutorials on model organism databases (funded mainly by NHGRI :) on their home page, genome.gov. Always nice to be recognized :D.

And it gives me the opportunity to again point out that we do indeed have seven publicly available tutorials and training materials (slides, exercises, etc) on model organism databases including SGD, RGD, MGI, WormBase, FlyBase and ZFIN… and a seventh on GBrowse, a generic genome browser used by some of these and other genome databases.

Check them out (and fill out the new poll to the left :D).

Happy Birthday Chuck!

NHGRI asks if Darwin is relevant today….and guess what the answer is? :)

You can go here for a page devoted to the festivities: http://genome.gov/27529500

You can launch the video there if it doesn’t work here:


My favorite part of the video is when Leslie Biesecker takes us from Darwin–>software, of course.  Later on he also talks about how important evolutionary concepts are to our interpretation of health and disease.  I mean, I know you guys get this–but I think it is the piece that makes me craziest about the people who want to deny evolution and its relevance today.

Couldn’t they have found at least 1 woman to interview, though?  I saw them in the background….I know they were there…

"Genetic Town Halls" report is available

I’m very interested in public policy and genetics. There are a number of threads that I was following along those lines. On the actual legislation I was watching the GINA efforts, and participating where I could. I was reading an article on the downstream effects of that today (Two Cheers for GINA, by McGuire and Majumder, Genome Med 2009, 1:6 doi:10.1186/gm6).  One sentence sums up my feeling on GINA–we absolutely needed some protection, but other problems in our health care and insurance systems will persist…

If significant sections of the public focus on these gaps in US policy, reluctant to enter the genomic era without a blanket guarantee against harm, GINA may fail to live up to the hopes of its supporters.

There was also a series of public meetings about biobanking and genetics research that I was following (Town hall meetings on genes + environment studies).  I wish I could have participated in these town halls to get a sense of the room full of people interested in this topic–but none of them were near me.  However, the report on these was just released and you can get the summary of the outcomes from the sessions:

Center releases report on genetic town hall series

….Most participants felt that the biobank should go forward, and more than half indicated they were likely to participate in it if asked. Among the issues participants weighed in on were privacy protections for participants and concerns about possible misuse of information collected, the nature of the proposed study’s consent agreement, and the ability to get individual research results back from the study….

If you go to the DNApolicy.org site you can download the report in PDF form.  It is clear that the participants were concerned about discrimination based on the information–especially by insurers, but also law enforcement.  And this is despite the passage of GINA during this timeframe.  There are privacy concerns in general, too.  And the potential for misuse for “nefarious” purposes.  They also saw the benefits–research and new knowledge, new medications, increased precision for treatments.

Thanks to the Genetics & Public Policy Center folks for the report.  Thanks to the Genetic Alliance Policy Bulletin mailing list for the heads-up.

NHGRI wants your input

Man, it seems like everyone wants to hear from us these days….Here’s another request that came via The Genetic Alliance mailing list:

Dear Colleagues,

The National Human Genome Research Institute has embarked on a long-range planning process focused on the future of human genome research. To kick-start a conversation among our community, we have posted three white papers on our website to start the conversation on the following topics: diagnostics, preventive medicine, and pharmacogenomics; therapeutics; and education and community engagement.

These white papers are available for viewing and comment at http://www.genome.gov/About/Planning. We invite your review and comment in two phases. Phase 1, open now, will collect community thoughts solely on the questions posed in the white papers, aimed at ensuring we are asking the right questions. Phase 1 will continue through January 30, 2009. Once the questions are refined, Phase 2 will commence and collect community input regarding how best to answer the questions, probably starting in mid-February 2009 and continuing through mid-April 2009. Other white papers on other topics may be added as the process continues.

To stimulate discussion, comments received will be anonymously posted for viewing. Comments received through this white-paper process will be used to generate topics for further planning activities and workshops, which will be held in 2009 and 2010.

We encourage you to participate in this important discussion and look forward to your input. Since we would like this to be an inclusive process, please share this announcement with any colleagues who may be interested in participating.


Alan E. Guttmacher, M.D.

Acting Director

National Human Genome Research Institute, NIH