Tag Archives: hapmap

Important announcement from HapMap about data archiving

This is from the HapMap team for the human data (not the green HapMap I referenced recently). Older data is going to be removed from the HapMap.org browser and from BioMart. It will still be available in the FTP archive, but I thought a heads-up was in order for folks who might not be on the mailing list:

Beginning July 1st, HapMap data releases prior to October 2008 (or release #24) will no longer be available via the HapMap Genome Browser and HapMart utility.

The following three data releases will continue to be served via the HapMap Genome Browser:

* HapMap Release #24 — Latest HapMap (phase I+II) data release


* HapMap3 Draft #2 (hapmap3_r2) — Latest HapMap3 (phase III) data release


* HapMap Release #27 — Latest merged HapMap (phase I+II+III) data release


The following data release will continue to be served via HapMart:
* HapMap Release #27 — Latest merged HapMap (phase I+II+III) data release


All other data releases will still be available for FTP download:


If you aren’t on the mailing list you can check it out and sign up here: http://osceola.cshl.edu/mailman/listinfo/announcements

Tip of the Week: F-SNP

There are a lot of databases to search for SNP data: HapMap, dbSNP, SeattleSNPs, Genome Variation Server and many more. I’m going to add one more to your data mining arsenal: F-SNP. F-SNP (described more fully here in the 2008 NAR Database issue),

provides integrated information about the functional effects of SNPs obtained from 16 bioinformatics tools and databases. The functional effects are predicted and indicated at the splicing, transcriptional, translational, and post-translational level. As such, the F-SNP database helps identify and focus on SNPs with potential pathological effect to human health.

…as they say in the introduction. It looks to be a good first stop to find SNPs of functional relevance. The databases they pull from to get their information include several I’ve mentioned above and also the UCSC Genome database, Ensembl, SIFT and PolyPhen predictions and more. I’ve given a quick intro in the tip this week on how to get functional SNP information from F-SNP.

1001 Genomes: plant researchers raise by 1

There is plenty of buzz out there for the big data biology projects–but usually the focus is the human data (with a few token model organisms thrown in).  But this week plant researchers renewed the call for big plant data.  I’m totally on board with that.

The 1000 Genomes project to obtain more human variation information is well underway, funded, and has companies supporting it.  And that’s great–I’m all for this too!  But as someone who survives largely on the kindness of plants I want more plant research going on.  I want to see this funded and supported.  And as we face increasing stresses on resources from limitations like oil and water supplies to wacky climate conditions and environmental consequences I think we could well afford to spend less time gazing at our human genomic navels and devote more attention to the plants.

There is already some work on this Arabidopsis project.  The first paper with data on this effort came out last fall.  But the researchers are still having to go out and lobby for this project.  A new opinion piece in Genome Biology calls out for awareness and support for this effort.

They have already done a first generation green HapMap.  The paper last fall illustrated the feasibility of the project by looking at the reference Col-0 (Columbia) and the Bur-0 and Tsu-1 strains.  The paper presents the process and compares their own pipeline software (SHORE, which they developed) with another package (MAQ).  They have a GBrowse installation that presents the data (and you can get free training on GBrowse here to effectively use the site).  They also provide data to TAIR.

I think this is important and I hope it gets the same level of support and respect that 1000 humans will get.

1001 Genomes main site: http://1001genomes.org/

1001 Genomes GBrowse: http://gbrowse.weigelworld.org/cgi-bin/gbrowse/ath_reseq_1001/

Clark, R., Schweikert, G., Toomajian, C., Ossowski, S., Zeller, G., Shinn, P., Warthmann, N., Hu, T., Fu, G., Hinds, D., Chen, H., Frazer, K., Huson, D., Scholkopf, B., Nordborg, M., Ratsch, G., Ecker, J., & Weigel, D. (2007). Common Sequence Polymorphisms Shaping Genetic Diversity in Arabidopsis thaliana. Science, 317 (5836), 338-342. DOI: 10.1126/science.1138632

Ossowski, S., Schneeberger, K., Clark, R., Lanz, C., Warthmann, N., & Weigel, D. (2008). Sequencing of natural strains of Arabidopsis thaliana with short reads. Genome Research, 18 (12), 2024-2033. DOI: 10.1101/gr.080200.108

Weigel, D., & Mott, R. (2009). The 1001 Genomes Project for Arabidopsis thaliana. Genome Biology, 10 (5). DOI: 10.1186/gb-2009-10-5-107

Tip of the Week: Visualizing GWAS with HapMap tools

We are seeing a lot of interest in visualizing GWAS data lately.  We cover this a bit in our UCSC Genome Browser tutorial.  And we recently did a pretty popular post on a quick look at the NHGRI GWAS catalog data using the UCSC Genome Graphs tool.

But as I was looking at the HapMap tools again recently, I noticed that they have a tool for this as well.  So today’s tip examines that tool for visualizing the NHGRI GWAS catalog data, and having a look at the GBrowse view of this data in genomic regions with the HapMap context.  In this movie I load up one of the sample data sets and move from that GWAS karyogram visualization to the HapMap GBrowse view.  Click the image to view the movie.

How to pick a genome database platform

I was reading a newsletter I get from Biotechniques, and their WebWatch often has some fun items. (You may need to get a free login to see the WebWatch.) This week they referred to the MaizeGDB database in the post Amaizing Base. Although I had been aware of MaizeGDB before, it was a nice reminder to go over and have a look to see what’s new.

When I went over there I was intrigued by the new browser they are about to launch (in mid-October). The link says “coming soon” and I went to check out the information there.

Currently that link goes to a page that describes their move to a more sequence-centric representation of their data. It was a fascinating look at their decision process to move to a new browser platform and what they decided to do. For database geeks like me, seeing their ranking of the importance of various features was very compelling.

And what they decided? GBrowse!

We have a tutorial available on GBrowse. Usually we do tutorials on specific sites, but as we kept seeing GBrowse over and over at different sites we created a tutorial for that. It helps me to understand the underlying basic browser when I visit any site that employs it. Even though the wrappings and the data types will vary at different sites, understanding how it works makes it much easier to use at any new site that uses it. HapMap, MGI, WormBase, FlyBase, TAIR, Watson’s personal genome, and a whole bunch of other sites use the GBrowse software.

Looking forward to checking out the MaizeGDB GBrowse version when it launches!

Tip of the Week: Human Genome Structural Variation Viewer

Looking at the NHGRI News feed recently, I noticed this story (below) about a new genomic data collection that intrigued me. I found out about a new resource that I wanted to share as this week’s Tip of the Week. So this ~4 minute movie discusses my path to the Human Genome Structural Variation resource and a quick look at some of the data. But the paper was so influential on my thinking about the genome that I wanted to cover that in more detail in text form as well. So for a quick hit, watch the movie. For more detail, check out the text and links below.  Quick trip to the database: http://hgsv.washington.edu

Researchers Produce First Sequence Map of Large-Scale Structural Variation in Human Genome


….Other recently created maps, such as the HapMap, have catalogued the patterns of small-scale variations in the genome that involve single DNA letters, or bases. However, the scientific community has been eagerly awaiting the creation of additional types of maps in light of findings that larger scale differences account for a great deal of the common genetic variation among individuals and between populations, and may account for a significant fraction of disease. While previous work has identified structural variation in the human genome, a sequence-based map provides much finer resolution and location information….

I spend a lot of time thinking about the official or “reference” human genome sequence. This sequence–the one that was released to all that fanfare a few years back–is a composite of several people. Rather like a “generic” genome.


Finding History in the Genome

We are starting a little bit of genetic genealogy in our household. I’ve always been an avid genealogist, and with an adopted child we’ve found it interesting and helpful to delve a bit deeper into our heritages in a way we couldn’t have 10 years ago. So, I’m a bit more aware lately of studies of historical and genetic links to our backgrounds…

A study in this month’s American Journal of Human Genetics suggests that the history of the spread of Islam and the crusades can be found in the genomes of the male population in Lebanon. The study found that…

926 Lebanese men were typed with Y-chromosomal SNP and STR markers, and unusually, male genetic variation within Lebanon was found to be more strongly structured by religious affiliation than by geography.

Their hypothesis was that migrations within historical times contributed to these differences. The data came from the Genographic Public Participation Project (National Geographic). Their conclusions?


HapMap Sudoku

Ummm…sometimes you are poking around the web sites of these databases and you find some odd stuff. Today I found a HapMap Sudoku. There are some word combinations that you think will never come out of your mouth (or fingertips in this case). And then–lo and behold–an unusual pair of things you like combine to be something you didn’t expect.

So, if you are interested, you can go to the HapMap Tutorials page on their site and click the link for HapMap Sudoku. It is a PowerPoint slide (not sure why; a PDF would be fine). Might be fun to do at lunch or on a plane one of these days.

What would other fun database games be? We already saw GenBank word search. The whole field is a giant logic puzzle. In the end I suppose it is all the game of Life.

A taste of OpenHelix

The bloggers here at OpenHelix and some of our family and friends decided to do the taste tests. You know the ones. You probably did them in your genetics class. I used them in my introductory biology class at CCSF years ago and had hundreds of the test strips left. So, we thought we’d distribute them to the bloggers and families here and see what the results were. The test strips are for sodium benzoate, PTC and thiourea. There is also a control strip of no taste (but paper). I numbered the strips and sent them to the bloggers and families (so they wouldn’t know what they were tasting, control or otherwise). And here are the results (and some database links to more about the genetics of taste):


Tip of the Week: One search to rule them all

Ok, well not exactly (wouldn’t that be nice). What do the Ensembl, Gramene, Reactome, WormBase, HapMap and RGD databases all have in common? (other than “.org” ;)) They all have a search mechanism powered by the same software, BioMart. In this week’s tip we take a real quick look at how these and other databases use BioMart, and briefly show you that the steps are the same everywhere (choose a dataset, filter the dataset, list the attributes to show) and that you can search some of them all in the same place.
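For the curious, here is a rough sketch of how those three steps (choose dataset, filter dataset, list attributes) map onto the kind of XML query that BioMart interfaces can generate. This is only an illustration in Python: the helper `build_biomart_query` is made up for this post, and the dataset, filter, and attribute names are placeholder assumptions that will differ from mart to mart. It only builds the query text and does not contact any server; check the names offered by the particular BioMart site you are using.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes):
    """Assemble a BioMart-style XML query string (hypothetical helper).

    dataset    -- step 1: the dataset to search
    filters    -- step 2: dict of filter name -> value that narrows the dataset
    attributes -- step 3: list of columns to show in the results
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "1",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    # Step 1: choose the dataset
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Step 2: filter the dataset
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    # Step 3: list the attributes to show
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# Placeholder names, assumed for illustration only
example_query = build_biomart_query(
    dataset="hsapiens_gene_ensembl",
    filters={"chromosome_name": "21"},
    attributes=["ensembl_gene_id", "external_gene_name"],
)
print(example_query)
```

The point is just that the same three-part shape applies whichever BioMart-powered site you visit; only the names you plug in change.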