We’re all Neanderthal now, and I can analyze that…

If you haven’t been reading the news, the draft sequence of the Neandertal genome (or is it Neanderthal? Spell check won’t take the former) was released and published in Science today. There is a lot of fascinating stuff over there, and I’m still reading it. Of course the big news, the stuff that’s flying through the news, is that non-African genomes are 1-4% Neanderthal. This seems to conclusively settle the question: yes, we are a little bit Neanderthal, and we didn’t simply replace them, we absorbed them with some interbreeding. Perhaps not so completely as that, but there was definitely some admixture going on. As Razib of Gene Expression points out, it’s fascinating to watch how quickly, in the face of data, the paradigm has shifted (great post and discussion, you should read it).

As Razib points out, and as you can read in the announcement at UCSC, the UCSC Genome Browser now has the draft data up in the hg18 genome assembly, including the coding region allelic differences data, selective sweep data, etc. With the Neanderthal data now in the UCSC Genome Browser and other data sources, we can pull that data apart and analyze it (and you know I’ll be putting my personal genome in a comparative track whenever I get it. Just curious, ya know).

(By the way, there is an interesting photo you might want to check out; it’s copyrighted, so I won’t post it here. There’s an interesting story there about how our illustrations of Neanderthals have evolved over the years to be more ‘humanizing’ as we learn that they made tools, had culture and now… are part of our ancestry.)

I am itching to go play there and see what I can see, as I am sure many scientists are. It’s also fascinating to be in this world of huge amounts of data coming quickly. I think a lot of paradigms will be shifting for a while.

UCSC Main site: http://genome.ucsc.edu and click the Neandertal button in the left navigation.

Ancient Genomes: Neanderthal

So, yesterday was the 200th anniversary of Darwin’s birth. There were lots of festivities and NPR stories surrounding the day, including a few announcements, like UCSC announcing their v200 browser code a day early so as to coincide (they couldn’t resist the coincidence :)). Another apropos announcement was that researchers at the Max Planck Institute for Evolutionary Anthropology have finished the draft sequence of the Neanderthal genome. Since only about 63% of the genome is actually covered (3.7 billion bp sequenced against the 3.2 billion bp genome, with duplications), when one announces a “draft” can be a bit arbitrary, so the 200th anniversary of the birth of the man who wrote “The Descent of Man, and Selection in Relation to Sex” is as good a time as any. And we are learning a few things, like that Neanderthals might have had the physical ability for language, but couldn’t stand milk as adults (it didn’t agree with their digestion). A draft and the accompanying research are expected to be published at the end of this year. We’ll report on that of course, and link to any browsers they might be setting up :D. Ancient genomes are teaching us some things.
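Why doesn’t 3.7 billion bp of sequence cover all of a 3.2 billion bp genome? Here is a rough back-of-the-envelope sketch, assuming reads land uniformly at random along the genome (the classic Lander-Waterman idealization). That assumption is mine, not something from the announcement, and ancient DNA almost certainly doesn’t behave that nicely. The function name is just for illustration.

```python
import math

def expected_fraction_covered(bases_sequenced: float, genome_size: float) -> float:
    """Idealized fraction of a genome hit at least once.

    Assumes reads are scattered uniformly at random, so the number of
    reads covering any one base is roughly Poisson with mean equal to
    the average depth. This is a simplification, not the method used
    by the Max Planck group.
    """
    depth = bases_sequenced / genome_size   # average depth, here about 1.16x
    return 1.0 - math.exp(-depth)           # P(base covered at least once)

# Figures quoted in the post: 3.7 billion bp sequenced, 3.2 billion bp genome
fraction = expected_fraction_covered(3.7e9, 3.2e9)
print(f"idealized coverage: {fraction:.1%}")
```

Under that idealization you’d expect somewhere in the high 60s percent covered, in the same ballpark as the roughly 63% quoted. The point is just that about 1x of sequence leaves a lot of the genome untouched, because many bases get hit more than once while others get missed entirely.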

Speaking of which, the Exploratorium, an excellent science museum in my fair city, has a great exhibit (on site and online) on the ‘how we know things’ and how science works. This exhibit is specifically on the origins of humans and Neanderthal DNA and the research at Max Planck figures prominently.

Tip of the Week: Homophila

(Click either graphic to see the tip of the week movie.) It’s not Halloween yet, but I thought I’d get us started in the mood by introducing you to a database that has some obvious references to the movie “The Fly” (the 1958 version is the only one really worth watching :)). OK, so the database doesn’t actually help you turn humans into flies; that’s a few years away (that’s a joke of course). No, this is one of those resources that does one thing and does it well. It’s very straightforward and simple… it takes human disease genes and sequences found in OMIM and finds the homologs in the Drosophila melanogaster genome. The name of the database is Homophila. From the results you can find the links to the data and go from there. It’s a simple function that can be very useful. Give it a try.

Broad's newly released Genomics Data Viewer IGV

From the Genome-Technology mailing list I found out about this software release from the Broad Institute:

Broad Institute Makes Genomics Data Viewer Public

By a GenomeWeb staff reporter

NEW YORK (GenomeWeb News) – The Broad Institute of MIT and Harvard has created a genomics informatics tool that will allow researchers to visualize genomic information, and has made it publicly available for free, Broad said today….

So of course I went to check it out. Because I love new software! You can check it out yourself here:

Integrative Genomics Viewer: http://www.broad.mit.edu/igv/

There is a quick start introduction and a movie you can watch where someone demonstrates some clicks (no audio, or if there was I didn’t get any). A quick registration gives you access. A little Java downloading and you are off to the races. There is a sample data set to get you started.

My first question was: what genomes can I see? Lo and behold–the FAQ says:

Answer: Sequence is read from the genome on a server at the Broad. For sequences to appear, you must be connected to the internet, the server must be available, and the genome that you have selected must be on the server. As of July 2008, the server provides sequence for the following genomes: hg17, hg18, mm8, and mm9.

At first I thought it was a tool to pull in your own genomes and view stuff, but it appears to rely on what’s on their server. I haven’t dug enough yet, though, so I’m not certain that’s the final answer. But if you are using one of those genomes, I could see some real utility in pulling in your data as tracks and viewing it alongside the reference sequence.

Looks nice to me. I’ll be checking it out some more and I’ll let you know what I find. Feel free to add your own reviews!

Tip of the Week: Predominately Expressed Genes

So, let’s say you need to find genes that are not only highly expressed in a tissue of a species of interest, but predominately expressed in that tissue (not highly expressed in other tissues). There are, like with any question, several ways to go about it, but in today’s tip of the week, I’m going to show you how to do it using the Gene Sorter. This is a tool brought to you by the same people who do the UCSC Genome Browser (Golden Path). The basic purpose of the tool is to take a gene of interest and sort other genes in the genome (which ones to sort can be filtered) based on some type of similarity (name, GO terms, expression, protein similarity, etc). What we are going to use it for in this tip is a little different: we don’t have a gene of interest, we are looking for interesting genes. There is a free tutorial and training on the UCSC tools including the Gene Sorter (UCSC Advanced Topics; download the slides, view/read the “Gene Sorter” section and do the exercises for the advanced topics). Know of other ways to do this? Suggest them in the comments!
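If you already have expression values in hand, the same “high here, low everywhere else” idea can be sketched in a few lines of Python. To be clear, this is not how the Gene Sorter works internally; it’s just a minimal illustration of the filter logic. The function name, the thresholds and the toy gene/tissue values are all made up for the example.

```python
from typing import Dict, List

def predominantly_expressed(
    expression: Dict[str, Dict[str, float]],
    tissue: str,
    min_level: float = 2.0,   # hypothetical cutoff for "highly expressed"
    min_margin: float = 1.0,  # hypothetical gap required over every other tissue
) -> List[str]:
    """Return genes that are high in `tissue` and clearly lower elsewhere.

    `expression` maps gene name -> {tissue name -> expression value}.
    Results are ordered by how far the target tissue exceeds the
    next-highest tissue, largest gap first.
    """
    hits = []
    for gene, levels in expression.items():
        target = levels.get(tissue)
        if target is None or target < min_level:
            continue  # not measured, or not highly expressed in the target tissue
        others = [value for name, value in levels.items() if name != tissue]
        margin = target - max(others) if others else target
        if margin < min_margin:
            continue  # also high somewhere else, so not predominant
        hits.append((margin, gene))
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [gene for _, gene in hits]

# Toy, invented values (think log ratios); real data would come from
# whatever expression set you export.
toy = {
    "GENE_A": {"liver": 3.5, "brain": 0.2, "heart": -0.1},
    "GENE_B": {"liver": 3.0, "brain": 2.8, "heart": 2.9},
    "GENE_C": {"liver": 2.4, "brain": 0.9, "heart": 1.0},
    "GENE_D": {"liver": 0.1, "brain": 3.2, "heart": 0.0},
}
print(predominantly_expressed(toy, "liver"))
```

GENE_B is the case to watch: it’s high in liver but nearly as high everywhere else, so it gets dropped. That’s the difference between “highly expressed” and “predominately expressed.”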

Pedigree drawing software

A question that came across the MGI mailing list this weekend was: “Is there a good, easy to use template/program for drawing pedigrees that someone could recommend?”

So far the suggestions include:
Pedigree Draw: http://www.pedigree-draw.com/ A mac-only tool available for purchase.

Pedigree Viewer: http://www-personal.une.edu.au/~bkinghor/pedigree.htm is a free tool, for Windows.

Another suggestion came for HaploPainter: http://haplopainter.sourceforge.net/html/index.html This one is for installation on Windows/Linux, but can also be installed on Macs with a bit of awareness about the installation process. The diagram on the HaploPainter page looked really nice, so I went to check out the paper. Thiele and Nürnberg were challenged by some genome-wide scans with over 10,000 SNPs that they wanted to display. They created this software to solve their problem, and released it for others’ use as well. Looks like it could be really useful–we will have to test it out.

If anyone has other suggestions, add them here and I can send them back to the mailing list–or, of course, you can sign up yourself!

EDIT: here’s another nice looking program submitted to the MGI mailing list: http://eyegene.ophthy.med.umich.edu/madeline/index.php Madeline 2.0

Progeny: this company has software that people have told us they like. I just realized they had a free pedigree tool that you can check out: http://www.progenygenetics.com/students/

I just remembered another project that I had heard about–the Surgeon General’s Family Health project has a tool for family medical history pedigrees: https://familyhistory.hhs.gov/ My Family Health Portrait.

PediDraw–found another one, and I’ll keep adding them as I find them.  This is a web-based tool.  The paper accompanying this one is available in PubMedCentral.

Thiele, H. and Nürnberg, P. (2004). HaploPainter: a tool for drawing pedigrees with complex haplotypes. Bioinformatics, 21(8), 1730-1732. DOI: 10.1093/bioinformatics/bth488

Acceleration of human adaptive evolution

I’ve been following a fascinating conversation about this paper: Recent acceleration of human adaptive evolution by John Hawks et al. In this paper the authors found that:

Genomic surveys in humans identify a large amount of recent positive selection. Using the 3.9-million HapMap SNP dataset, we found that selection has accelerated greatly during the last 40,000 years.

There is an interesting discussion going on in the blogosphere.