When I was doing my Ph.D. in the ancient days of Sanger method sequencing, reading in the results with one hand on the keyboard while following the GATCs on the read (and going to the lab in the snow uphill both ways), my purpose for slogging through all that was to eventually get a phylogeny of the sequences of the retrotransposable elements I was studying. Why did I want that phylogeny? Because I was comparing the phylogeny of the retroelements to that of the species in which they reside. We were attempting to determine whether these retroelements were stable within the taxa lineage (they are) or whether promiscuous horizontal transfer was occurring. We did those comparisons, but it would have been nice to have a ‘cophylogeny reconstruction’ program :D.
Similar comparisons of phylogenies are often necessary: host-parasite studies, coevolution, etc. Jane is a software package (free with registration) that uses a heuristic approach, “running a genetic algorithm with an internal fitness function that is evaluated using a dynamic programming algorithm.” It can often give an optimal solution for the cophylogeny you are studying. Jane was developed in the research group of Ran Libeskind-Hadas at Harvey Mudd College and you can read more about the algorithm and approach here. They also have an extensive written tutorial.
In these tips we usually focus on web interfaces to tools, but I liked this package (and it’s free) and wanted to play around with it, so today I’ll walk you through a very quick intro to downloading and getting started with the tool.
Quick Links:
Jane
Jane Tutorial
CoPhylogeny Reconstruction
TreeMap (another cophylogeny reconstruction software)
CopyCat (yet another)
Book Chapter on Cophylogeny and reconstruction
Conow, C., Fielder, D., Ovadia, Y., & Libeskind-Hadas, R. (2010). Jane: a new tool for the cophylogeny reconstruction problem. Algorithms for Molecular Biology, 5 (1) DOI: 10.1186/1748-7188-5-16
Who can resist a nice cup of eggnog for the holidays (especially with added brandy)? I know I can’t. I make my grandpa’s recipe every December and, considering it uses tons of sugar, eggs, heavy cream and alcohol and that 1/2 & 1/2 is the lightest ingredient, only in December.
Oh, that’s not what this tip is about. It’s about the database of orthologous groups of genes, eggNOG. We’ve mentioned eggNOG before several times, but only in passing or in relation (orthologous? :D) to another database or tool. Today, in perfect timing for the season, I thought I’d do a quick tip to introduce eggNOG.
eggNOG is brought to you by the same research group that developed a lot of other excellent tools such as SMART (protein domains), STRING (protein-protein interactions), STITCH (protein-chemical interactions), iTOL and so much more. Of course they do some fascinating research too.
eggNOG is a relatively straightforward database to use, but it has a wealth of information you might want to check out. As the recent paper in NAR states:
Orthologous relationships form the basis of most comparative genomic and metagenomic studies and are essential for proper phylogenetic and functional analyses…. Orthology, defined as homology via speciation, is a crucial concept in evolutionary biology and is essential for disciplines such as comparative genomics, metagenomics and phylogenomics. The concepts of orthology and paralogy, with the latter being defined as homology via duplication, have been used as a foundation to introduce the concept of clusters of orthologous groups: proteins that have evolved from a single ancestral sequence existing in the last common ancestor (LCA) of the species that are being compared, through a series of speciation and duplication events. Orthologous groups (OGs) have proven useful for functional analyses and the annotation of newly sequenced genomes as orthologs tend to have equivalent functions.
721 801 orthologous groups, encompassing a total of 4 396 591 genes…. from 1133 species.
For more about orthologous groups, methods used and pros and cons of methodology, you might want to check out the paper referenced below. They’ve included several informative and helpful reviews and references.
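To make the speciation-versus-duplication definition from the quote concrete, here is a minimal Python sketch. It is not how eggNOG builds its groups; it only illustrates the textbook definition on a toy gene tree whose internal nodes are already labeled with the event they represent. The `Node` class, the gene names, and the tree itself are all made up for illustration.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Node:
    """A node in a toy, already-reconciled gene tree (hypothetical structure)."""
    name: str
    event: str = "leaf"  # "speciation", "duplication", or "leaf"
    children: list = field(default_factory=list)


def leaves(node):
    """Return the names of all genes (leaves) below a node."""
    if not node.children:
        return [node.name]
    return [name for child in node.children for name in leaves(child)]


def classify_pairs(node, result=None):
    """Label each gene pair by the event at its last common ancestor."""
    if result is None:
        result = {}
    # Genes taken from two different child subtrees have this node as their LCA.
    for left, right in combinations(node.children, 2):
        relation = "orthologs" if node.event == "speciation" else "paralogs"
        for a in leaves(left):
            for b in leaves(right):
                result[frozenset((a, b))] = relation
    for child in node.children:
        classify_pairs(child, result)
    return result


# Toy tree (assumed): an ancient duplication produced gene copies A and B,
# then each copy was inherited by human and mouse through speciation.
root = Node("dup", "duplication", [
    Node("spA", "speciation", [Node("human_A"), Node("mouse_A")]),
    Node("spB", "speciation", [Node("human_B"), Node("mouse_B")]),
])

pairs = classify_pairs(root)
for pair, relation in sorted(pairs.items(), key=lambda item: sorted(item[0])):
    print(sorted(pair), relation)
```

In this toy tree, human_A and mouse_A come out as orthologs because their last common ancestor is a speciation node, while human_A and mouse_B are paralogs because they trace back to the duplication, even though they sit in different species.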
Right now, take a quick tour of what eggNOG can offer.
Powell, S., Szklarczyk, D., Trachana, K., Roth, A., Kuhn, M., Muller, J., Arnold, R., Rattei, T., Letunic, I., Doerks, T., Jensen, L., von Mering, C., & Bork, P. (2011). eggNOG v3.0: orthologous groups covering 1133 organisms at 41 different taxonomic ranges Nucleic Acids Research DOI: 10.1093/nar/gkr1060
Just a quick comment gathered from a link Mary showed me. I have had my run-ins with reviewers of papers and grants. Some reviewers definitely have an agenda and, as humans, all reviewers have subjective biases. I could tell you stories. But, for the most part, even given the few huge frustrations, the review process has made my research and papers better and tighter science.
Peer review is not without its problems; there are entire blogs (like Peer-to-Peer) and communities discussing it. And of course, a lot of substandard research is published, as possibly evidenced by the recent discussion of the arsenic-eating bacteria paper. I’ve read my quota of really bad research and spurious conclusions from peer-reviewed journals. But again, it’s the sum-total of the peer review system (and the subsequent discussion, more research, rebuttal publications) that has obviously created excellent science and the advancement of our understanding of biology heretofore.
Nature Precedings is not an alternative peer-review, it’s a place to put research before peer-reviewed publication to invite discussion, spur further research and claim priority. But I’ve seen it pointed to as the part of an alternative. Yet, it’s papers like this (and two others by the same author) that make me realize that peer-review is a necessary purgatory. I won’t spend the time eviscerating the issues of this research, they are legion and it’s not worth my time. I can imagine the casual reader of Nature Precedings might come across it and see all the biolingo and think it’s legitimate research, but really that doesn’t concern me. But, the author of these articles uses them to lend legitimacy to his main thesis: “genome data proves false the theory of evolution.” He does this in a press release (http: //www.prweb.com/releases/theory/genome/prweb4896744.htm .. I won’t link so as not to give this any more web legitimacy, but you can take the space out if you want to see it) where he links to all three “publications”:
Using modern genomics, Dr. Senapathy and his team’s work, showed how the abundance and diversity of life on earth originated directly in the prebiotic environment. They have presented the results in three scientific publications in Nature Precedings: publication 1, publication 2, publication 3.
Research shows that modern genome data completely uproots the evolution model.
He uses the well-deserved respect of the brand “Nature” and the sleight of hand of calling them “publications” (when all they are is pre-prints with no peer review or review of the science) to lend legitimacy to a counterfeit conclusion.
With all the ‘woo’ in this world, I would suspect that peer-review or some other rigorous solution (which I haven’t yet seen) is more necessary than ever to move science forward.
Edit by Mary: I was just watching some journalists discover this story on MuckRack (http://muckrack.com/sci ) and here’s what they said:
This next post in our continuing semi-regular Guest Post series is from Eric Lyons, of CoGe at the University of California, Berkeley. If you are a provider of a free, publicly available genomics tool, database or resource and would like to convey something to users on our guest post feature, please feel free to contact us at wlathe AT openhelix DOT com.
Thanks both for the prior CoGe post (editor’s note: a tip of the week on CoGe) and the invitation to write a bit about CoGe. Since most people are probably not familiar with CoGe, let me begin with how it is designed:
CoGe’s architecture and philosophy: Solve a problem once
CoGe is a web-based platform for comparative genomics and consists of many interconnected web-based tools. The entire system is hooked up to a database that can store any version of any genome in any state of assembly from any organism (currently ~9000 genomes from ~8000 organisms). Each of CoGe’s tools is designed to do one task (e.g. search and display information about a genome, compare two genomes and generate syntenic dotplots, search any number of genomes for similar sequence, manage a list of genes, etc.), and the tools are linked to one another. This means that there is no predefined analysis workflow. Instead, people can begin exploring a genome of interest, compare it to what they want, find something interesting, explore that, find something else, explore that, etc. People anywhere in the world can perform computationally intense analyses by clicking a few buttons on a web page and letting our servers crunch away on whatever genomes we have currently loaded in our system. Since each tool is web-based, links are used to move from tool to tool, which creates an easy way to save an analysis for future work or to send to a colleague. This also has the benefit that as we develop new tools to solve a specific problem, we can generalize the solution, plug it into CoGe’s database and connect it to its pre-existing tool set. Overall, this gives us an easy way to expand CoGe’s functionality.
Today’s tip is on Genomicus. Genomicus is a great tool to visualize gene duplication, synteny and genome evolution. The search and display interfaces are quite straightforward, and there are lots of great features (viewing ancestral gene information, links out to resources, different views of phylogenies, etc) in the tool. This video is only a short introduction. You can delve deeper into the tool with the help and documentation, including an 11 minute video.
There is also a recent (advance access) paper in the journal “Bioinformatics” that will give you a lot more detail on how the database and tool works and what is there.
Muffato, M., Louis, A., Poisnel, C., & Roest Crollius, H. (2010). Genomicus: a database and a browser to study gene synteny in modern and ancestral genomes Bioinformatics DOI: 10.1093/bioinformatics/btq079
You will also notice today the video is a SciVee embed. We are trying out a new way to post and share our tips. SciVee allows us to not only post on our blog, but for you to share the tip with others and also for scientists in the SciVee community to view the tips. This is only a test. We will be working with this for the next couple weeks to find the best way to post and share. Soon, we hope to share these on Facebook and Youtube also. If the video is not high enough quality for you (SciVee and other video sharing sites by necessity reduce size), you can try out the entire mpeg4 version at this link.
Can’t let the day go by without acknowledging the 200th anniversary of the day Charles Darwin was born. Arguably one of the most brilliant scientists ever to grace this planet. I have to agree with Razib. I just recently reread On the Origin of Species and, like every time I’ve read it (4th time now, I think), I am struck by how amazingly perceptive and prescient the man was.
Though I know it was fictionalized and there are of course some quibbles with the portrayal, I enjoyed the new movie, Creation, about the writing of that book. If anything, it’s spurred me on to learn more about the man and his life and to read Voyage of the Beagle. Which, surprisingly, since I’ve read others like Descent of Man, Variation under Domestication, and Expression of Emotions, I’ve never read.
In the movie, Darwin tells his children, particularly his daughter, different stories about his voyage and adventures as bedtime stories. Listening to those and watching the reenactments got me interested in reading the book. And I have a vague idea about rewriting it as a children’s adventure book, I think my daughter would like it. I’ll just chalk that up to one of those “in the future” projects (or maybe someone already did it?)
I had a Basset Hound growing up. His name was Useless, Useless S. Grunt. Well, actually it was formally Ulysses S. Grant because the US Kennel Club wouldn’t accept Useless S. Grunt as a name as they felt it was too demeaning. Not sure if they felt it was demeaning to the dog or to the president, but that’s neither here nor there is it?
So, you ask, what made me think of that long-passed sweet dog that tripped over its too-long ears with its too-short legs? It turns out that they found the genetic cause of those short legs in Basset Hounds (and Dachshunds and other breeds).
As NHGRI’s press release states:
In a study published in the advance online edition of the journal Science, the researchers led by NHGRI’s Elaine Ostrander, Ph.D., examined DNA samples from 835 dogs, including 95 with short legs. Their survey of more than 40,000 markers of DNA variation uncovered a genetic signature exclusive to short-legged breeds. Through follow-up DNA sequencing and computational analyses, the researchers determined the dogs’ disproportionately short limbs can be traced to one mutational event in the canine genome – a DNA insertion – that occurred early in the evolution of domestic dogs.
The insertion turns out to be a retrogene, which of course I also find interesting in that I studied retrotransposable elements. Reverse transcriptase has this habit of reverse transcribing RNA into DNA which can get reinserted back into the genome (hence processed pseudogenes of course).
The study is interesting for two reasons (other than because I had a Basset Hound and studied the evolution of retroelements ;): it gives us a further clue into evolutionary events that lead to large changes in morphology and the role of retrotranscription, and it gives us a clue into possible human conditions.
For more about the dog genome, you can read our several posts about it, go to NCBI’s dog genome home site (or UCSC or Ensembl and other browsers) and read the paper (needs a subscription of course, it’s in Science). It’s an interesting read so far (I want to find some time to read it more fully; perhaps Useless doesn’t live up to his name.. he didn’t really even then :D).
So, yesterday was the 200th anniversary of Darwin’s birth. Lots of festivities and NPR stories surrounded that day, including a few announcements like UCSC announcing their v200th browser code a day early so as to coincide (they couldn’t resist the coincidence :)). Another apropos announcement was that researchers at the Max Planck Institute for Evolutionary Anthropology have finished the draft sequence of the Neanderthal genome. Since only about 63% of the genome is actually covered (3.7 billion bp of sequence against the 3.2 billion bp genome, with duplications), when one announces a “draft” can be a bit arbitrary, so the 200th anniversary of the man who wrote “The Descent of Man, and Selection in Relation to Sex” is as good a time as any. And we are learning a few things, like that Neanderthals might have had the physical ability for language, but couldn’t stand milk as adults (it didn’t agree with their digestion). It is expected that a draft and research will be published at the end of this year. We’ll report on that of course, and link to any browsers they might be setting up :D. Ancient genomes are teaching us some things.
Speaking of which, the Exploratorium, an excellent science museum in my fair city, has a great exhibit (on site and online) on the ‘how we know things’ and how science works. This exhibit is specifically on the origins of humans and Neanderthal DNA and the research at Max Planck figures prominently.
NHGRI asks if Darwin is relevant today….and guess what the answer is?
You can go here for a page devoted to the festivities: http://genome.gov/27529500
You can launch the video there if it doesn’t work here:
My favorite part of the video is when Leslie Biesecker takes us from Darwin–>software, of course. Later on he also talks about how important evolutionary concepts are to our interpretation of health and disease. I mean, I know you guys get this–but I think it is the piece that makes me craziest about the people who want to deny evolution and its relevance today.
Couldn’t they have found at least 1 woman to interview, though? I saw them in the background….I know they were there…
A new open access journal, Ideas in Ecology and Evolution, has, well, opened. It’s published at Queen’s University in Canada.
“So?” you ask, “there are lots of journals starting up all the time”.
This one is different. It’s experimenting with a lot of things (ok, so there seem to be a lot of journals experimenting with the model lately). The subject matter is not research per se, but ideas. Having been to my share of ecology and evolution conferences and discussions, I can see this journal has opened itself up to some quite lovely discussions.
As explained by Bob O’Hara, there are some interesting review process experiments going on here too. Authors pay to get their ideas published, reviewers are paid, reviewers are not anonymous and they get to publish their views of the article as a companion piece. Bob discusses the issues we’ve all heard about the pros of anonymity (and they are valid ones), but this might work in this case. I also agree with Bob on one point: this structure (reviewers publishing their views) will indeed increase discussion, but I too would like to see some mechanism for a broader discussion of the article. As it is designed now, it will be like watching TV pundits arguing the finer points of health policy, which I guess is informative. Something like what PLoS has would, I think, actually work better in a journal of ideas like this.
Well, we’ll see. Right now there is nothing there but the editorial. I’ll be watching though.
hat tip: Coturnix