and we love our model organisms! A little ditty about sticklebacks from a lab:
You know, when the catfish genome is complete, that will be a cool addition to our “Yet another Genome” posts (which I should make a regular series or some update somewhere). Till the genome is complete, you can view and analyze catfish genomic data at cBARBEL, reported in this week’s NAR advance access: Catfish Breeder And Researcher Bioinformatics Entry Location. Among other tools and schema, they use GBrowse (we do have a free tutorial) to compare the data to the Zebrafish genome.
Mark this database (as with many others) as one whose acronyms were created to fit the name. Barbels are the whiskers on a catfish and cBARBEL stands for “Catfish Breeder and Researcher Bioinformatics Entry Location.” See, I was thinking more along the lines of “Catfish Breeder And Researcher Research Entry Location” or Catfish Barrel, but that is too culturally obscure and specific, isn’t it? cBARBEL is good.
All kidding aside, it’s a great start to another agriculturally important model organism database.
Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…
If you haven’t been reading the news, the draft sequence of the Neandertal (or is it Neanderthal? spell check won’t take the former) was released and published in Science today. There is a lot of fascinating stuff over there. Still reading it. Of course the big news, the stuff that’s flying through the news, is that non-African genomes are 1-4% Neanderthal. This seems to conclusively settle the question: yes, we are a little bit Neanderthal, and we didn’t replace them, we absorbed them with some interbreeding. Perhaps not so completely as that, but there was definitely some admixture going on. As Razib of Gene Expression points out, it’s fascinating to watch how quickly, in the face of data, the paradigm has shifted (great post and discussion, you should read it).
As Razib points out, and as you can read in the announcement at UCSC, the UCSC Genome Browser now has the draft data up in the hg18 genome assembly, including the coding region allelic differences data, selective sweep data, etc. With the Neanderthal data now being in the UCSC Genome Browser and other data sources, we can pull apart that data and analyze it (and you know I’ll be putting my personal genome in a comparative track if I ever get it. Just curious, ya know).
(btw, there is an interesting photo, copyrighted… so I won’t post it here, that you might want to check out. There’s an interesting story there: how our illustrations of Neanderthals have evolved over the years to be more ‘humanizing’ as we learn that they made tools, had culture and now… are part of our ancestry…)
I am itching to go play there and see what I can see, as I am sure many scientists are. It’s also fascinating to be in this world of huge amounts of data coming quickly. I think a lot of paradigms will be shifting for a while.
UCSC Main site: http://genome.ucsc.edu and click the Neandertal button in the left navigation.
This next post in our continuing semi-regular Guest Post series is from Eric Lyons, of CoGe at the University of California, Berkeley. If you are a provider of a free, publicly available genomics tool, database or resource and would like to convey something to users on our guest post feature, please feel free to contact us at wlathe AT openhelix DOT com.
Thanks both for the prior CoGe post (editor’s note: a tip of the week on CoGe) and the invitation to write a bit about CoGe. Since most people are probably not familiar with CoGe, let me begin with how it is designed:
CoGe’s architecture and philosophy: Solve a problem once
CoGe is a web-based platform for comparative genomics and consists of many interconnected web-based tools. The entire system is hooked up to a database that can store any version of any genome in any state of assembly from any organism (currently ~9000 genomes from ~8000 organisms). Each of CoGe’s tools is designed to do one task (e.g. search and display information about a genome, compare two genomes and generate syntenic dotplots, search any number of genomes for similar sequence, manage a list of genes, etc.), and the tools are linked to one another. This means that there is no predefined analysis workflow. Instead, people can begin exploring a genome of interest, compare it to what they want, find something interesting, explore that, find something else, explore that, etc. People anywhere in the world can perform computationally intense analyses by clicking a few buttons on a web page and letting our servers crunch away on whatever genomes we have currently loaded in our system. Since each tool is web-based, links are used to move from tool to tool, which creates an easy way to save an analysis for future work or to send it to a colleague. This also has the benefit that as we develop new tools to solve a specific problem, we can generalize the solution, plug it into CoGe’s database and connect it to the pre-existing tool set. Overall, this gives us an easy way to expand CoGe’s functionality.
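To make the "links save an analysis" idea concrete, here is a minimal sketch of how a tool's settings can live entirely in a URL. The base URL, the tool name and the parameter names below are all hypothetical, chosen for illustration; they are not CoGe's actual endpoints or query parameters.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical base URL and parameter names, for illustration only;
# these are NOT CoGe's real endpoints or query parameters.
BASE_URL = "https://example.org/tools"


def analysis_link(tool, **params):
    """Encode a tool name and its settings as a shareable URL."""
    # Sort the settings so the same analysis always yields the same link.
    query = urlencode(sorted(params.items()))
    return f"{BASE_URL}/{tool}?{query}"


def read_link(url):
    """Recover the tool name and its settings from a saved link."""
    parsed = urlparse(url)
    tool = parsed.path.rsplit("/", 1)[-1]
    # parse_qs returns a list per key; keep the single value we stored.
    params = {key: values[0] for key, values in parse_qs(parsed.query).items()}
    return tool, params


# Save an analysis as a link, then restore it later (or on a colleague's machine).
link = analysis_link("dotplot", genome_a=101, genome_b=202)
tool, params = read_link(link)
print(link)
print(tool, params)
```

Because everything needed to rerun the analysis is in the link itself, bookmarking or emailing it is enough to pick the work back up, which is the property described above.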
to assemble a genomic zoo—a collection of DNA sequences representing the genomes of 10,000 vertebrate species, approximately one for every vertebrate genus. The trajectory of cost reduction in DNA sequencing suggests that this project will be feasible within a few years. Capturing the genetic diversity of vertebrate species would create an unprecedented resource for the life sciences and for worldwide conservation efforts.
10,000 vertebrate genomes. That’s a lot. In fact, that’s about 50-fold greater than the number currently in progress (that list is chordata, at 208, including multiple genomes from one species, humans), and nearly 500-fold the number of complete vertebrate genomes available. An ambitious goal, to say the least. The participants are many, including the coordinators David Haussler (Howard Hughes Medical Institute, UCSC), Stephen O’Brien (Laboratory of Genomic Diversity, National Cancer Institute) and Oliver Ryder (Institute of Conservation Research, San Diego Zoo).
The three coordinators’ institutes suggest some of the particular benefits that they hope to get out of this project: medical data, evolutionary data, conservation efforts. I do believe such a project will indeed bring many of those benefits. I remember the arguments 20 years ago about the Human Genome Project (too expensive, too ambitious, benefits won’t be commensurate, big science pushing out basic research). I think it’s arguable that the worst fears were not realized and that there have been a number of benefits already, with more soon to come. 10k vertebrate genomes now seems feasible and beneficial.
Of course, one could hope there’d be a plant project and an invertebrate project down the pike?
And, it goes without saying, if you thought there were database funding, accessibility, usability, etc. issues now…
It being summer, with a strangely slow connection and some other factors, I am embedding a talk from Doug Ramsey (posted on SciVee) on the GEBA project at JGI (instead of doing a tip myself). The GEBA project recognizes that many, if not most, of the bacterial and archaeal genomes that have been sequenced to date have some relevance to human disease or other human interest. This of course is reasonable, but it also leads to big gaps in our knowledge of bacterial evolution and genomics, knowledge that would help us better understand those genomes that we find relevant and knowledge that in and of itself can be quite interesting and potentially useful. View the talk to learn more about this project to sequence 100 phylogenetically diverse bacterial and archaeal genomes.
I’m also posting this as an introduction to JGI’s Adopt a Genome project. This project allows student groups to adopt and study a bacterium in the GEBA project and hopefully add to our knowledge and annotations of the genome while learning. The students can then annotate the adopted genome by using IMG-ACT.
Hmm, that sounds like magic mushrooms. But that’s not what I’m talking about.
Recently, I had the opportunity to give a short workshop at the Genetics and Genomics of Infectious Disease conference in Singapore. It went well, and I learned a lot. Afterwards, my family came out to SE Asia for a vacation, but because of the way frequent flyer miles are, I had 5 days of free time. Malaysian Borneo, here I come! I spent three days hiking through the Sarawak jungle with a new-found British friend. It was fascinating. One of the most fascinating aspects for me was the stars at my feet.
So, yesterday was the 200th anniversary of Darwin’s birth. There were lots of festivities and NPR stories surrounding that day, including a few announcements, like UCSC announcing their v200th browser code a day early so as to coincide (they couldn’t resist the coincidence). Another apropos announcement was that researchers at the Max Planck Institute for Evolutionary Anthropology have finished the draft sequence of the Neanderthal genome. Since only about 63% of the genome is actually covered (3.7 billion bp sequenced against the 3.2 billion bp genome, with duplications), when one announces a “draft” can be a bit arbitrary, so the 200th anniversary of the man who wrote “The Descent of Man, and Selection in Relation to Sex” is as good a time as any. And we are learning a few things, like that Neanderthals might have had the physical ability for language, but couldn’t stand milk as adults (it didn’t agree with their digestion). It is expected that a draft and research will be published at the end of this year. We’ll report on that of course, and link to any browsers they might be setting up. Ancient genomes are teaching us some things.
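As a back-of-the-envelope aside on how 3.7 billion bp of sequence can leave a 3.2 billion bp genome only partly covered, here is a small sketch. It assumes the idealized Lander-Waterman (Poisson) model, where reads land uniformly at random; that model is my assumption, not something from the announcement, and real ancient DNA is messier.

```python
import math


def expected_fraction_covered(bases_sequenced, genome_size):
    """Expected fraction of a genome hit by at least one read.

    Assumes reads land uniformly at random (idealized Lander-Waterman /
    Poisson model). With mean depth c = bases_sequenced / genome_size,
    the chance that a given base is missed entirely is exp(-c).
    """
    depth = bases_sequenced / genome_size
    return 1.0 - math.exp(-depth)


# Figures quoted in the post: 3.7 billion bp against a 3.2 billion bp genome.
depth = 3.7e9 / 3.2e9
fraction = expected_fraction_covered(3.7e9, 3.2e9)
print(f"mean depth: {depth:.2f}x, expected fraction covered: {fraction:.1%}")
```

Even this idealized model leaves roughly a third of the genome untouched at just over 1x depth, which is in the same neighborhood as the 63% quoted for the Neanderthal draft.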
Speaking of which, the Exploratorium, an excellent science museum in my fair city, has a great exhibit (on site and online) on ‘how we know things’ and how science works. This exhibit is specifically on the origins of humans and Neanderthal DNA, and the research at Max Planck figures prominently.