Friday SNPpets

This week’s SNPpets cover a range of issues. Attempting to community-curate bioinformatics tools, a new paper on the UCSC Genome Browser’s features, iPlant now reborn as CyVerse, errors in databases, personalized diagnosis from the UK 100,000 Genomes project and the re-launch of their PanelApp, and some new plant tools: AgroPortal and 30 plants going into OrthoDB. Also, read those thesis submissions carefully. There may be an easter egg.

Video Tips of the Week: Annual Review IV, 2nd half

As you may know, we’ve been doing these video tips-of-the-week for FOUR years now. We have completed around 200 little tidbit introductions to various resources from last year, 2011 (yep, it’s 2012 now). At the end of the year we’ve established a sort of holiday tradition: we are doing a summary post to collect them all. If you have missed any of them it’s a great way to have a quick look at what might be useful to your work.

You can see past years’ tips here: 2008 I, 2008 II, 2009 I, 2009 II, 2010 I, 2010 II. The summary of the first half of 2011 is available from last week.

July 6: Prioritizing genes using the Gene Prioritization Portal

July 13: PolySearch, searching many databases at once

July 20: Human Epigenomics Visualization Hub

August 3: SNPexp, correlation between SNPs and gene expression 

August 10: CompaGB for comparing genome browser software

August 17: CoGe, comparing genomes revisited

August 24: Domain Draw for quick motif diagrams

September 7: Plant comparative genomics using Plaza

September 14: phiGENOME for bacteriophage genome exploration

September 21: Getting flanking sequences of genomic locations

October 5: VnD resource for genetic variation and drug information

October 12: Track Hubs in UCSC Genome Browser

October 19: Mitochondrial Transcriptome GBrowser 

November 2: MizBee Synteny Browser

November 9: The new database of genomic variants: DGV2

November 16: MapMi, automated mapping of microRNA loci

November 23: BioMart's new central portal

November 30: Phosida, a post-translational modification database

December 7: VarSifter, for identifying key sequence variations

December 14: Big changes to NCBI's genome resources

December 21: eggNOG for the Holidays (or to explore orthologous genes)

December 28: Video Tips of the Week: Annual Review IV (first half of 2011)

Tip of the Week: Plant comparative genomics using Plaza

Plaza, a resource for plant comparative genomics, has a lot more than meets the eye at first. Currently the database has comparative tools and data for nearly two dozen plants including monocots, dicots, mosses and algae. There are some obvious tools and data from the homepage, but I suggest you take a look at the documentation and tutorials; you’ll find there is a lot more once you start delving into it. In this tip, I’m going to walk through getting the phylogeny of a gene family from a specific clade to illustrate that there is a lot here that you might not see at first.

Plaza: http://bioinformatics.psb.ugent.be/plaza/

For grass comparative genomics you might also want to check out Gramene (tutorial, subscription) and for general comparative genomics, VISTA (tutorial, free) is an excellent resource also.

Proost, S., Van Bel, M., Sterck, L., Billiau, K., Van Parys, T., Van de Peer, Y., & Vandepoele, K. (2009). PLAZA: A Comparative Genomics Resource to Study Gene and Genome Evolution in Plants. The Plant Cell, 21 (12), 3718-3731. DOI: 10.1105/tpc.109.071506

Choosing a genome browser for your organism…

There are a number of genome browsers out there–we’ve covered that a number of times.  And there are always new ones coming along.  With the onslaught of sequence data we’re about to get from high-throughput sequencing, more and more research groups, communities, and individuals are going to need to choose a genome browser to use to display their data.

One time I stumbled across the survey results for a group that was choosing a new platform to display their community’s data: MaizeGDB.  I wrote about it then because I thought it was interesting, and because I know people are facing this pretty regularly now.  We get asked.  But since that time they have progressed, implemented, and they wrote up their experience.  It’s now been published in Database.

It’s a pretty straightforward paper.  They describe their needs and their assessment of the resources their community had and used.  They surveyed likely users to see what they wanted, and how they felt about the pieces that already existed.  One piece they specifically noted: when asked, many users did not say they used Ensembl, but the Ensembl software was the foundation of one of the items they did say they used.  MaizeGDB writes:

This result shows that users may not be aware of the underlying browser software that the various web sites use.

Ah, yeah.  Here’s another thing this shows: database end users are definitely not thinking about browser software the same way database developers are.  And I do not mean end users are stupid.  They just do not think about this stuff the way software providers think they do.  We keep trying to tell providers this.  It’s not always well received.

So anyway, they move on to assess the candidates for their new implementation.  They focus on Ensembl, GBrowse, Map Viewer, UCSC Genome Browser, and xGDB.  They describe the framework, possibilities, and limitations of each for their purposes.  I think this is a nice look at the various options that lots of people considering the issue should find useful.  They also acknowledge that other browsers have since come along, or may still come along in the future, that could be considered, but at the time these were the focus.

They go on to describe their implementation experience.  They seem pleased with it.  And they highlight one of their favorite pieces, a Locus Lookup tool, that they have added as well.  It sounds like it’s serving their community really nicely.

This is a highly useful paper for the people in the market for genome browsers.  It’s not for everyone, for sure.  Well, at least not yet.  But your day is coming. You’ll need a browser eventually….

You can check out their GBrowse implementation at: http://gbrowse.maizegdb.org/

And if you are interested you can see our free GBrowse training suite here: http://www.openhelix.com/gbrowse

Sen, T., Harper, L., Schaeffer, M., Andorf, C., Seigfried, T., Campbell, D., & Lawrence, C. (2010). Choosing a genome browser for a Model Organism Database: surveying the Maize community. Database, 2010. DOI: 10.1093/database/baq007

Andorf, C., Lawrence, C., Harper, L., Schaeffer, M., Campbell, D., & Sen, T. (2010). The Locus Lookup tool at MaizeGDB: identification of genomic regions in maize by integrating sequence information with physical and genetic maps. Bioinformatics, 26 (3), 434-436. DOI: 10.1093/bioinformatics/btp556

EDIT: added links to a couple of older blog posts, should have had them in before….

Happy Memorial Day (and gardening) to you this weekend!

Summer is rapidly approaching and I’m so looking forward to a nice long Memorial Day weekend with outdoor cookouts and plenty of time for gardening. Those of us New Englanders who have endured a long, hard winter really appreciate ending our hibernation and spending time outside in the spring and summer. Gardening is one of my favorite activities, and in this region we are strongly advised to wait until Memorial Day to do the majority of our planting. But after hearing that one of my colleagues had just come down with poison ivy, I began to wonder why these plants so often get in the way of enjoying our short season of outdoor life.

Poison ivy, oak and sumac have always been a very annoying part of growing up in New England. They are plants that I never had too many fond thoughts of. Yet, I never really knew much at all about them – other than the itchy, irritating red rash they cause – that is. I decided to do a little digging, reasoning that they must have some redeeming, or at least interesting, biological qualities. After all, it seems that they are only protecting themselves against all of us herbivores. They can’t exactly run away from us, so they have to keep us at bay somehow. Their defense mechanism seems quite clever actually.

A quick check in Wikipedia revealed that poison ivy is a member of the Anacardiaceae family of flowering plants. To my surprise cashew and pistachio plants are also members of this same family. Apparently not all members of this plant family are skin irritants at least! The reaction you get from poison ivy is due to contact with urushiol, a very potent oil found in the sap. In fact, only about 1 nanogram is needed to cause a rash (as little as ¼ of an ounce is said to be necessary to cause a rash on every person on earth). The rash, or Toxicodendron dermatitis, is a result of the immune system’s delayed hypersensitivity response – i.e., the reaction may take hours or days to develop. Interestingly, about 20% of the population is not allergic to urushiol. They can wander through poison ivy indefinitely and have no problems (the genetic variations responsible for this trait are certain to be an interesting topic for future work in the genomics and immunology fields). Another surprising fact was that many animals don’t have any type of allergic reaction to urushiol. Deer, goats, horses and cattle are fine with these poisonous plants. In fact, one of the suggested ways to get rid of poison ivy is to get a goat. This seems to be another very interesting genetics of immunity issue – how and why do some animals manage to not only evade these plants, but thrive on them? As more complete genomes are resolved, the genes, SNPs, or genetic variations in general that are responsible will be uncovered, and we should all be enlightened.


Tip of the week: Sol Genomics Network

Aside from a short stint at the ASHG meeting, where it is all about the human genome with a smidge of attention to the microbes that hang around with us, I’m back and I’m focusing on plant resources again.  Recently I began to explore the Sol Genomics Network site, and that will be the focus of this tip of the week.

Sol Genomics Network focuses on “Solanaceae as model system for diversity” as they describe themselves.  And they aim to link genotypes to phenotypes for a collection of plant species.  Currently the species information found at this site includes: tomato, potato, eggplant, pepper, petunia, tobacco, and coffee.  Not all of them have browsers available here, but there are maps for several, and there are links to other sources that may provide more information about the projects, clone collections, and additional details. They are also developing a breeder’s toolbox and they’d like to have some feedback on the needs of the community on that.

We will take a look at their tomato browser today, which is implemented in GBrowse, the Generic Genome Browser from the GMOD project tool kit that supports so many species and data types–and if you want some help using GBrowse you should see our freely available tutorial on that.

The site also includes a number of outreach activities for students at varying levels–including a lab exercise for the high school level, a word find puzzle for youngsters with these species (we like puzzles here), and the fun and interactive animated series with a sequencing puzzle where you generate a small assembly with some sample BAC fragments (ok, they are really small BACs, but you get the point).   I know a lot of, ah, mature scientists who could stand to work with the concept of the assembly to grok that a bit better, actually….

Go directly to the BAC assembly sequencing puzzle here if you don’t have time for the whole tip of the week:  http://bti.cornell.edu/multimedia/puzzleComplete.html

Sol Genomics Network site directly: http://solgenomics.net/

More Solanaceae resources: http://solanaceae.plantbiology.msu.edu/

Plants at ScienceBlogs! Woot!

I really enjoy reading ScienceBlogs.  There are high quality science communicators over there.  And I get to read current stuff in my field, and it’s a nice place to read some of the other fields too–with a lower barrier of entry than trying to read physics papers, for example.

But one thing I thought was missing was representation of plant science.  So I wrote to them a while back requesting a plant science blog.  Now, I’m not saying that tipped the balance, but I’m not afraid to ask for stuff I want :)  And now there is one.

Pam Ronald–whose work I’ve written about before–is now one of the SciBlings!  I’m just tickled.   Her blog, Tomorrow’s Table, is moving over there.  It’s still early, and you can’t seem to get there from the front page of ScienceBlogs yet.  But you know how moving to a new place goes…. If you want to see some of her work as an introduction, you can watch the talk she gave at The Long Now recently.

I learned about this from Biofortified, another great place for plant science blogging.  I also read Genetic Maize regularly.  And every day I read agro.biodiver.se.  Check ‘em out if you are a fan of plant science.

Tip of the Week: TARGeT

Today’s tip is on TARGeT. TARGeT is, as the title of the paper in this year’s NAR issue states, “a web-based pipeline for retrieving and characterizing gene and transposable element families from genomic sequences.” There are several things you can do at TARGeT. Its main function, using BLAST, PHI-BLAST, MUSCLE and TreeBest, is to quickly obtain gene and transposon families from a query sequence. The tip today is a quick intro to the tool and a search on an R1 non-LTR transposon.