Ensembl has been teasing an update for a while now: version 51. It’s not out yet :). But today the Ensembl blog has a bit of a preview. The release has a new design, some new stuff under the hood to make it run better, a new configuration panel, and some tracks that will behave differently (more options). They give a screenshot of the new interface; it’s not particularly large, but you get the gist of it at least. No word on timing other than “the web team are working hard to tidy up the loose ends of the release!,” so we assume that will be soon. Looking forward to it.
DOE JGI extended and updated the content of IMG and IMG/M recently as this linked press release shows (here is another), so today I’d like to just make a quick post to highlight that OpenHelix has a tutorial on IMG/M that is sponsored by JGI and thus free.
Metagenomics is a huge new field and IMG/M has some excellent tools and data!
Well, I was reading this press item about a team of scientists from the University of Melbourne and Baylor College of Medicine who are sequencing the genome of Helicoverpa armigera, aka cotton bollworm, corn earworm, tobacco budworm... you get the picture: it’s a major agricultural pest. One thing caught my eye: they expect it to be completed in four months. These projects are getting shorter and shorter.
Anyway, I did a search on that species (which brings me to a question: why can I find it in the NCBI taxonomy browser and Wikipedia, but not the Encyclopedia of Life?), which led me to an interesting database I’d never seen before: Pherobase, a database of pheromones. It appears to be a one-man operation (El-Sayed, AM) but extensive nonetheless. I don’t know enough about this field to review it for scientific merit (though there does seem to be a decent amount of info), but thought I’d put it out there. Amazing where internet searches can semi-randomly lead you.
Say you’ve got a bacterial genome you just sequenced in your spare time (hey, the way technology is going, it’s not far off) and you need a quick and dirty annotation to get you started. There are several tools out there to do that: predict genes, annotate regions, etc. In this tip I’d like to show you one that you might not have thought of but that could be a useful starting point. It’s GATU (Genome Annotation Transfer Utility) at VBRC. As the name suggests, it doesn’t do any major gene prediction. What it does is take your genome, compare it to a closely related, already characterized genome (the closer the better, of course), and transfer the annotation from the characterized genome to yours. GATU comes from a viral resource (VBRC), but it works just as well with bacterial genomes, something that might not have been obvious and that puts another tool in your belt.
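To make the idea of annotation transfer concrete, here is a minimal Python sketch. It is not how GATU works internally (GATU's method is not described here); it is a toy that assumes each annotated feature appears as an exact copy in the new genome. The `Feature` class, the `transfer_annotations` function, and the short sequences are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """A named region on a sequence (hypothetical helper type)."""
    name: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def transfer_annotations(reference, features, target):
    """Copy each feature to `target` if its exact sequence occurs there.

    Toy assumption: features are conserved exactly. A real tool would use
    similarity search so that diverged genes can still be found.
    """
    transferred = []
    for feature in features:
        feature_seq = reference[feature.start:feature.end]
        position = target.find(feature_seq)  # first exact occurrence, or -1
        if position != -1:
            transferred.append(
                Feature(feature.name, position, position + len(feature_seq))
            )
    return transferred


# Made-up reference genome with two annotated "genes".
reference = "TTGACCATGAAACCCGGGTAAGGCTAATGCCCTTTTGATT"
features = [Feature("geneA", 6, 21), Feature("geneB", 26, 38)]

# Made-up, closely related genome with no annotation yet.
target = "CCATGAAACCCGGGTAAGGAGGCTAATGCCCTTTTGACC"

for feature in transfer_annotations(reference, features, target):
    print(feature.name, feature.start, feature.end)
```

Exact matching keeps the sketch short, which is also why "the closer the better" matters: the more the two genomes have diverged, the more a naive transfer like this one would miss.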
This week’s tip introduces a nice feature and tool of the Viral Bioinformatics Resource Center (VBRC). There are a lot of great tools at the VBRC to search and analyze hundreds of viral genomes. Most, if not all, of the tools can be used for searching and analyzing bacterial genomes also. The tool we are introducing in this tip is Base by Base. This tip actually came from a question from one of our readers in our weekly “WYP” feature a few weeks back. Reader Azalea asked:
I’m looking for a pairwise sequence alignment tool which can anchor specific nucleotides to be arbitrarily aligned. I just hope to fix certain positions to be aligned, which will change the whole alignment.
Chris Upton at VBRC suggested Base by Base. I’ve had the opportunity to use Base by Base and it’s a useful tool for working with pairwise alignments (could probably be used for any two sequences, not just bacterial and viral) and looks like a tool that Azalea might be able to use. Today’s tip shows you quickly how to add two sequences, align part by hand and select another region to align by algorithm (choice of T-Coffee, ClustalW or MUSCLE).
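For readers curious what "anchoring" an alignment means in practice, here is a small Python sketch of one way to do it: pin the chosen positions together, then align the stretches between the pins with an ordinary global alignment (Needleman-Wunsch). This is not Base by Base's code, and the function names, scoring values, and example sequences are all assumptions made for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Plain global alignment of two strings; returns both gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def anchored_align(a, b, anchors):
    """Align a and b so that a[i] sits in the same column as b[j]
    for every (i, j) in anchors (0-based, increasing in both sequences)."""
    out_a, out_b = [], []
    prev_i, prev_j = 0, 0
    for i, j in sorted(anchors):
        if i < prev_i or j < prev_j:
            raise ValueError("anchors must be strictly increasing in both sequences")
        # Align the free stretch before this anchor, then pin the anchor pair.
        seg_a, seg_b = needleman_wunsch(a[prev_i:i], b[prev_j:j])
        out_a += [seg_a, a[i]]
        out_b += [seg_b, b[j]]
        prev_i, prev_j = i + 1, j + 1
    # Align whatever remains after the last anchor.
    seg_a, seg_b = needleman_wunsch(a[prev_i:], b[prev_j:])
    out_a.append(seg_a)
    out_b.append(seg_b)
    return "".join(out_a), "".join(out_b)


# Force position 3 of the first sequence to align with position 2 of the second.
aligned_a, aligned_b = anchored_align("ACGTTGCA", "ACTTGGCA", [(3, 2)])
print(aligned_a)
print(aligned_b)
```

Because each stretch between anchors is aligned on its own, fixing a single position can shift everything around it, which is the "change the whole alignment" effect Azalea describes.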
We’ve been playing with puzzles here the last while… amino acid spelling, sudoku, word search (and well, real science puzzles too), so I thought I’d try a bit of a crossword puzzle. Funny, I like creating them, I just don’t play them very much. Hopefully that won’t show too much here and it’s fun (for you crossword people out there). This is a “genome crossword.” The answers are species (common name, not scientific) for which the genome is complete or in progress. Some of the references might be obscure, but that’s the nature of crosswords, isn’t it? A hint if you are just stumped: Google and Wikipedia are your friends. You can try the crossword online here, or you can download the genome crossword pdf here. I’d love to hear if you tried it and how I could improve my crossword building skills :D. I’m an amateur.
A paper published today in PLoS One reports on research that shows the feasibility of taking a gene or genomic region from an extinct species, inserting it into the genome of an extant species, and resurrecting the function of the extinct species’ DNA in transgenic mice. The extinct species was the Tasmanian tiger or Thylacine (that links to the Wikipedia page; anyone want to become the curator for the EOL page, which is pretty minimal at this point?) and the ‘surrogate’ species was Mus musculus.
And, as the abstract says,
While other studies have examined extinct coding DNA function in vitro, this is the first example of the restoration of extinct non-coding DNA and examination of its function in vivo. Our method using transgenesis can be used to explore the function of regulatory and protein-coding sequences obtained from any extinct species in an in vivo model system, providing important insights into gene evolution and diversity.
It is a fascinating piece of research.
It’s a fun day here at OpenHelix :). We noticed that we were getting searches for things like “word search for non infectious diseases” and “Protein word search,” apparently due to this earlier post about searching AA sequences for real words. So we thought we’d run with it :). Using this site, I’ve created a word search using a few (30) of the completed eukaryotic genomes (species’ common names) you can find on GOLD (a list of completed and ongoing genome projects, great resource). So, try it: test your knowledge of which genomes are completed and your ability to see words in weird places. If you want, you can download a pdf of the word search puzzle here (genomes word search) to mark up. Now, I haven’t listed the species names on the file or here, you know… to make it a bit more challenging. But if you really need to see which words you are looking for you can see them here, and if you just need the key, it’s at the bottom of the linked page.
So, when you are waiting for that experiment or on the bus home, have at it. We might do these occasionally for the fun of it.
Today’s tip of the week introduces you to Gramene, a great database of grass genomes including rice, corn, oats, millet, wheat and others. The database is full of serious data and genomic analysis tools for the grasses, but today we are going to show you something fun you could do with Gramene… plan your dinner. After quickly showing you that there is a wealth of data on a large number of species, we’ll point you to some information on how to cook those genomes.
We’ve written before about the feel of ‘a genome a day’ around here. RPM at Evolgen points to a paper that, he suggests, supports his prediction (from last year) that “de novo sequencing of whole eukaryotic genomes may be a thing of the past.” Perhaps he is correct, though for the moment there are still quite a large number of de novo sequencing projects for eukaryotic genomes in the pipeline. He suggests that, as this paper has done, sequencing projects will “use 454 to sequence cDNA libraries.” There is a loss of data in not sequencing the non-transcribed part of the genome, but as the abstract of the paper he points to says:
We conclude that 454 sequencing, when performed to provide sufficient coverage depth, allows de novo transcriptome assembly and a fast, cost-effective, and reliable method for development of functional genomic tools for nonmodel species. This development narrows the gap between approaches based on model organisms with rich genetic resources vs. species that are most tractable for ecological and evolutionary studies.
There is a lot of interesting discussion in the comments to his post.