Tag Archives: nhgri

List of GWAS studies

They are still working on the recorded version of the NHGRI GWAS seminar that we attended last week, but I wanted to point you to a useful web page they mentioned. It is a collection of GWAS studies with the top 5 SNPs from each listed, as long as they met a certain threshold.

“As of 11/24/08, this table includes 202 publications and 435 SNPs,” according to the Catalog of Published Genome-Wide Association Studies.

So if you are interested in GWAS data this is a nice collection of that literature. It also comes as an Excel doc you can download.

The traits they cover are quite a range–from freckles to diabetes to bipolar disorder and many more. I think I would like to take some of these data over to the UCSC Genome Browser’s Genome Graphs feature, where you can visualize the data on a handy genome graphic. To get this figure, here’s what I did:

1. Took the GWAS Excel file.

2. Pulled out the rs IDs for the SNPs. Some cells had to be fixed because they held a series of comma-delimited SNPs. I moved each SNP to its own cell.

3. Cleaned up any entries that were not rs IDs. I ended up with 480 SNPs. I left the duplicates for now.

4. Created a plain text file of these SNPs. I gave each one a value of 1 just for the purposes of the Genome Graphs software. I just wanted to see all these SNPs on the genome in one graphic. The Genome Graphs tool tells me:

Loaded 12351941 elements from snp126 table for mapping.
Mapped 479 of 480 (99.8%) of markers
These data are now available in the drop-down menus on the main page for graphing
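The cleanup steps above could also be scripted rather than done by hand in the spreadsheet. Here is a minimal sketch, not what I actually did: it assumes the SNP column has already been read out of the Excel file as a list of strings, and that Genome Graphs accepts one marker and one value per line separated by whitespace. The cell contents, the function names, and the `gwas_snps.txt` file name are all made up for illustration.

```python
import re

# Pattern for a dbSNP reference SNP identifier such as "rs12345".
RS_ID = re.compile(r"^rs\d+$")


def extract_rs_ids(cells):
    """Split comma-delimited cells and keep only well-formed rs IDs.

    Duplicates are kept on purpose, as in step 3 above.
    """
    rs_ids = []
    for cell in cells:
        for token in cell.split(","):
            token = token.strip()
            if RS_ID.match(token):
                rs_ids.append(token)
    return rs_ids


def to_marker_lines(rs_ids, value=1):
    """Format one 'marker value' pair per line (assumed upload format)."""
    return [f"{rs_id} {value}" for rs_id in rs_ids]


def write_marker_file(path, lines):
    """Write the marker lines to a plain text file for upload."""
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")


# Hypothetical cell contents standing in for the SNP column of the spreadsheet.
example_cells = ["rs1111111", "rs2222222, rs3333333", "NR", "rs1111111"]
example_ids = extract_rs_ids(example_cells)
example_lines = to_marker_lines(example_ids)

if __name__ == "__main__":
    write_marker_file("gwas_snps.txt", example_lines)
```

Keeping the duplicates at this stage means the file still reflects how often a SNP shows up in the table, even though the graph only needs each marker once.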

Off we go…Here are my SNPs on the genome graph–the SNPs are teeny blue dots. Ok, I don’t know what it means either. I just wanted a sense of what was coming out of all the GWAS studies and where they actually were on the genome. I would like to take another look at the data; this was just a quick pass. I’m intrigued by the SNPs that come up in multiple studies, and I’m curious about what those genes do. Hmmm…..
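For that second look, one quick way to find the SNPs that come up more than once is to count the rs IDs before removing duplicates. This is only a sketch with made-up IDs and a hypothetical `repeated_snps` helper. Note that a repeat in the list only means the SNP was listed more than once in the table, which is a hint that it appears in multiple studies rather than proof.

```python
from collections import Counter


def repeated_snps(rs_ids, min_count=2):
    """Return (rs_id, count) pairs seen at least min_count times, most frequent first."""
    counts = Counter(rs_ids)
    return [(rs_id, n) for rs_id, n in counts.most_common() if n >= min_count]


# Hypothetical IDs; in practice this would be the full list with duplicates kept.
sample_ids = [
    "rs1111111", "rs2222222", "rs1111111",
    "rs3333333", "rs2222222", "rs1111111",
]
repeats = repeated_snps(sample_ids)

for rs_id, n in repeats:
    print(f"{rs_id}\t{n}")
```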


Genes for Complex Traits in the Domestic Dog

The title of the next seminar in the NHGRI webinar series is just a teaser; I don’t have any more information about it right now. It is scheduled for Thursday, January 8, 2009, at 1pm ET.

It was posted on the webinar I attended yesterday on GWAS studies so I took a screen shot of it.

If this is a topic that interests you, watch the webinar website. You do need to register ahead of time for these, and an email comes with login information specific for you.

I’ll have more thoughts on the GWAS one later, but I wanted you to be able to put this on your calendar and save the date if it is something you might want to see.

NHGRI webinar coming up: GWAS

The NHGRI has an ongoing webinar series running–check out the upcoming one and the previous ones here.

I just got an email announcement that they have scheduled that GWAS one for Nov. 20, 2pm ET, and the email has a bit more content description:

The upcoming webinar is entitled, “Genome-Wide Association Studies: The Basics of the science and related policies”. Laura Rodriguez, Acting Director of the Office of Policy, Communications, and Education at NHGRI, will host the webinar and also speak about data sharing policies related to GWAS studies. Teri Manolio, Director of the Office of Population Genomics will report the latest on GWAS results, and what they can tell us about genomics and health. We will then take questions from participants.

I’m curious about the recent removal of GWAS data from the public sphere, and to see what they expect will happen afterwards.

They’ll have sign up information posted soon, they say. Keep an eye out if you are interested because you do have to shoot them an email to get registered.

ENCODE wants your input on the data release policy

This week’s Tip of the Week is a bit different than some of the others that I have done in the past. I’m going to take you through parts of a document–the newly released draft of the Data Release Policy for ENCODE (go over to this page at NHGRI and get a copy of the document). I know–you expect software from us. But I will also show you a bit of software at the end, if you can stick with me for that. OK?

We’ve been talking about the ENCODE projects about once a month lately. We are hoping to raise awareness and understanding about the framework, foundations, and goals for ENCODE. That’s because a TON of genome-wide data is going to be collected and offered to researchers worldwide as this project progresses. And as we proceed I’ll be showing you how to access that data in the UCSC Genome Browser, since UCSC is the DCC (or data coordination center) to wrangle the human data around ENCODE.

However, if you are going to use ENCODE data, you need to know about the guidelines for using that data. That’s what I’ll cover today. And I’ll also give you a peek at some of the first data to come through the process at UCSC on the test server*. It is a sample of ChIP-Seq data from HudsonAlpha that I’ll use as an example.

In short, this data policy tries to balance the needs of the users of this publicly-funded data with those of the scientists who are generating this data. They are proposing a 9-month non-scoop window: the providers will release the data and have 9 months to submit their manuscripts on it. In the meantime, you can look at the data and start to use it. But in general, they ask that you don’t submit a paper without the consent of the ENCODE team in that window. The appendix offers a couple of nice scenarios about the appropriate use of the data, which helps to clarify this.

I hope you’ll have a look at the ENCODE draft data release policy and think about using the ENCODE data. And please give NHGRI and the ENCODE team feedback on this.

*Note on the test server: this is a sandbox for developers at UCSC, the data might not all have been QCed yet, and data here should not be considered final. But you can have a look.

There’s been some coverage of the request for comment elsewhere, too, if you want to read more about this: http://www.genomeweb.com/issues/news/149419-1.html

The UCSC Genome Browser “News” item has a link to the document as well.

NHGRI webinars–first one this Thursday.

Just got this notice from the Genetic Alliance newsletter:


You can learn more at the NHGRI site (http://www.genome.gov/27527023). This first one appears to be:

All About the Genetic Information Non-Discrimination Act of 2008 (GINA)

What is GINA? How will it affect me? How will I – and my family – be protected? Join NHGRI Deputy Director Alan Guttmacher, NHGRI Health Policy Analyst M.K. Holohan, and President and CEO of the Genetic Alliance Sharon Terry on July 17, 2008 at 1 p.m. Eastern to learn about this ground-breaking act that was passed into law in May 2008.

It does look like you have to register, so be sure to check out the NHGRI site and reply to that email for details.

I can’t attend–someone let me know how it goes! I wish I could hear it, I’ve been following this legislation for a long time….Maybe it will be available as a recording later, I can’t tell.

Tip of the Week: Human Genome Structural Variation Viewer

Looking at the NHGRI News feed recently, I noticed this story (below) about a new genomic data collection that intrigued me. I found out about a new resource that I wanted to share as this week’s Tip of the Week. So this ~4 minute movie discusses my path to the Human Genome Structural Variation resource and a quick look at some of the data. But the paper was so influential on my thinking about the genome that I wanted to cover that in more detail in text form as well. So for a quick hit, watch the movie. For more detail, check out the text and links below. Quick trip to the database: http://hgsv.washington.edu

Researchers Produce First Sequence Map of Large-Scale Structural Variation in Human Genome


….Other recently created maps, such as the HapMap, have catalogued the patterns of small-scale variations in the genome that involve single DNA letters, or bases. However, the scientific community has been eagerly awaiting the creation of additional types of maps in light of findings that larger scale differences account for a great deal of the common genetic variation among individuals and between populations, and may account for a significant fraction of disease. While previous work has identified structural variation in the human genome, a sequence-based map provides much finer resolution and location information….

I spend a lot of time thinking about the official or “reference” human genome sequence. This sequence–the one that was released to all that fanfare a few years back–is a composite of several people. Rather like a “generic” genome.


Town hall meetings on genes + environment studies

I was pretty intrigued by this brief notice I saw on GenomeWeb:

GPPC to Hold Touring Town Halls on Large-Scale Cohort Studies

The National Human Genome Research Institute wants to know how to get thousands of Americans from a wide swath of social, regional, economical, and ethnic groups to participate in a series of public meetings related to a proposed large cohort study on the role of genes and environment in health.

I would be interested in attending something like this to see what’s going on, and to hear what the public thinks about this. But none of them are near me. Cities listed are: Kansas City, Mo.; Jackson, Miss.; Middletown and Philadelphia, Pa.; Phoenix, Ariz.; and Portland, Oregon.

I would also attend to raise awareness of an issue that makes me nuts–that we have been unable to move the GINA legislation forward for so so long. The NYT did a pretty nice article this week about the issues surrounding genetic privacy IRL–in real life–today. Of course, Congress already heard this in testimony from Francis Collins and people who are affected by this now. This is only going to increase as more genes are linked to disease, and that data is growing very quickly.

But I wanted to know more about these town halls. The organizer appears to be the Genetics and Public Policy Center, out of Johns Hopkins. Their notice on these meetings is available, and you can register:

The town hall events will be held on March 8 in Kansas City, MO; April 5 in Phoenix, AZ; April 19 in Jackson, MS; April 24 in Portland, OR; and May 13 in Philadelphia, PA. Members of the public can register to attend by calling Erin Wiley at (202) 374-0840 or online.

Their notice also links to a bit more info on the content (biobanks seem emphasized).

I would love to hear any reports from these meetings once they actually happen. I was hoping the Philly one would overlap with a training we are doing in Philly, but it doesn’t….hmm….

OpenHelix receives $1 million NIH grant for genomics training portal

January 21, 2008 (Seattle, WA) – Thanks to a $1 million grant, OpenHelix (www.openhelix.com) has been developing an innovative set of online tools for use by scientific researchers. The tools will greatly reduce the amount of time necessary to locate and use the vast genomics and bioinformatics resources available to scholars and scientists. Once relevant resources are located through an innovative search tool, researchers will learn how to use them with extensive tutorial suites. The SBIR (Small Business Innovation Research) grant was awarded by the National Human Genome Research Institute (Grant number 9R44HG004531).

Freely accessible genomics and bioinformatics resources

With numerous online databases and other genomics and bioinformatics resources available to scientists, the time spent identifying the best resources and using them in an efficient manner has been a challenge for even the most well-staffed organization. Much data is underutilized due to a lack of awareness of its existence. When scholars and scientists do happen to locate needed information in an online resource, they then must figure out each resource’s unique navigation methods and each documentation style. Introductory training on many resources is either nonexistent or not sufficient to effectively teach users how to best use the site.

“The need for such a resource is clear in the bioinformatics area,” says Joan E. Brooks Ph.D., co-founder of Garbrook Knowledge Resources and former co-founder of Proteome, an online genomics information database company. “The OpenHelix solution will be a promising leap forward to assure the public investment in these resources is fully realized.”

Improving efficiency and effectiveness of research

While genomics resources and data continue to grow rapidly, scientists are at a disadvantage when trying to decide the best resource for them. The search and tutorial portal will enable faster completion of research projects, leading to an accelerated increase in the use and dissemination of scientific knowledge. “We are now looking at some very innovative ways to search a large number of resources, including semantic search using widely used and accepted ontologies,” said Warren Lathe, co-founder, OpenHelix Chief Scientific Officer and Principal Investigator on the grant. “The science community will be very excited about the tools we are going to offer this year.”

The groundbreaking search function will provide various methods for locating and ranking genomics resources. As they use the OpenHelix online search for their projects, scientists and other researchers will use a ranking system within the search results to filter the list that pertains to their particular needs, something not previously available. The tutorials also include training material for use in the classroom setting, giving faculty ready-made, updated material to train students. By matching researchers quickly and efficiently with the resources that are most relevant to their needs and providing training so the researcher can effectively use the resource, the grant from the NHGRI will help fulfill the promise of research breakthroughs provided by the post-genomic era.

About OpenHelix

OpenHelix, LLC, (www.openhelix.com) provides the genomics knowledge you need when you need it. OpenHelix currently provides online self-run tutorials and on-site training for institutions and companies on the most powerful and popular free, web-based, publicly accessible bioinformatics resources. In addition, OpenHelix is contracted by resource providers to provide comprehensive, long-term training and outreach programs. Headquartered in Washington State, OpenHelix also has offices in San Francisco, Boston and North Carolina. Further information can be found on www.openhelix.com or by calling 1-888-861-5051.

Don't know pup's papa or the mutt's mama?

My nephew, my doctor and my contractor all got new pet dogs from the pound in the last couple months. All three dogs are the sweetest dogs you can imagine, but the owners have no idea what mix of breeds they are. They can only guess.

Well, actually, they could have a DNA test done.