Tag: plants

Choosing a genome browser for your organism…

26 April, 2010 (15:26) | Genomics Research, Genomics Resource News | By: Mary

There are a number of genome browsers out there–we’ve covered that a number of times.  And there are always new ones coming along.  With the onslaught of sequence data we’re about to get from high-throughput sequencing, more and more research groups, communities, and individuals are going to need to choose a genome browser to use to display their data.

One time I stumbled across the survey results for a group that was choosing a new platform to display their community’s data: MaizeGDB.  I wrote about it then because I thought it was interesting, and because I know people are facing this pretty regularly now.  We get asked.  But since that time they have progressed, implemented their choice, and written up their experience.  It’s now been published in Database.

It’s a pretty straightforward paper.  They describe their needs and their assessment of the resources their community had and used.  They surveyed likely users to see what they wanted, and how they felt about the pieces that already existed.  One piece they specifically noted–when asked, many users did not say they used Ensembl, but the Ensembl software was the foundation of one of the items they did say they used.  MaizeGDB writes:

This result shows that users may not be aware of the underlying browser software that the various web sites use.

Ah, yeah.  Here’s another thing this shows: database end users are definitely not thinking about browser software the same way database developers are.  And I do not mean end users are stupid.  They just do not think about this stuff the way software providers think they do.  We keep trying to tell providers this.  It’s not always well received.

So anyway, they move on to assess the candidates for their new implementation.  They focus on Ensembl, GBrowse, Map Viewer, UCSC Genome Browser, and xGDB.  They describe the framework, possibilities, and limitations of each for their purposes.  I think this is a nice look at the various options that lots of people considering the issue should find useful.  They also acknowledge that other browsers have come along since, or may still come along, that could be considered, but at the time these were the focus.

They go on to describe their implementation experience.  They seem pleased with it.  And they highlight one of their favorite pieces, a Locus Lookup tool, that they have added as well.  It sounds like it’s serving their community really nicely.

This is a highly useful paper for the people in the market for genome browsers.  It’s not for everyone, for sure.  Well, at least not yet.  But your day is coming. You’ll need a browser eventually….

You can check out their GBrowse implementation at: http://gbrowse.maizegdb.org/

And if you are interested you can see our free GBrowse training suite here: http://www.openhelix.com/gbrowse

References:
Sen, T., Harper, L., Schaeffer, M., Andorf, C., Seigfried, T., Campbell, D., & Lawrence, C. (2010). Choosing a genome browser for a Model Organism Database: surveying the Maize community. Database, 2010. DOI: 10.1093/database/baq007

Andorf, C., Lawrence, C., Harper, L., Schaeffer, M., Campbell, D., & Sen, T. (2010). The Locus Lookup tool at MaizeGDB: identification of genomic regions in maize by integrating sequence information with physical and genetic maps. Bioinformatics, 26 (3), 434-436. DOI: 10.1093/bioinformatics/btp556

EDIT: added links to a couple of older blog posts, should have had them in before….

Tip of the week: Sol Genomics Network

28 October, 2009 (08:25) | Genomics Research, Genomics Resource News, New Resource, Tip of the Week | By: Mary

Aside from a short stint at the ASHG meeting, where it is all about the human genome with a smidge of attention to the microbes that hang around with us, I’m back and I’m focusing on plant resources again.  Recently I began to explore the Sol Genomics Network site, and that will be the focus of this tip of the week.

Sol Genomics Network focuses on “Solanaceae as model system for diversity” as they describe themselves.  And they aim to link genotypes to phenotypes for a collection of plant species.  Currently the species information found at this site includes: tomato, potato, eggplant, pepper, petunia, tobacco, and coffee.  Not all of them have browsers available here, but there are maps for several, and there are links to other sources that may provide more information about the projects, clone collections, and additional details. They are also developing a breeder’s toolbox and they’d like to have some feedback on the needs of the community on that.

We will take a look at their tomato browser today, which is implemented in GBrowse, the Generic Genome Browser from the GMOD project tool kit that supports so many species and data types–and if you want some help using GBrowse you should see our freely available tutorial on that.

The site also includes a number of outreach activities for students at varying levels–including a lab exercise for the high school level, a word find puzzle for youngsters with these species (we like puzzles here), and the fun and interactive animated series with a sequencing puzzle where you generate a small assembly with some sample BAC fragments (ok, they are really small BACs, but you get the point).   I know a lot of, ah, mature scientists who could stand to work with the concept of the assembly to grok that a bit better, actually….

Go directly to the BAC assembly sequencing puzzle here if you don’t have time for the whole tip of the week:  http://bti.cornell.edu/multimedia/puzzleComplete.html

Sol Genomics Network site directly: http://solgenomics.net/

More Solanaceae resources: http://solanaceae.plantbiology.msu.edu/

Plants at ScienceBlogs! Woot!

19 October, 2009 (09:48) | Genomics Research, Genomics Resource News | By: Mary

I really enjoy reading ScienceBlogs.  There are high quality science communicators over there.  And I get to read current stuff in my field, and it’s a nice place to read some of the other fields too–with a lower barrier of entry than trying to read physics papers, for example.

But one thing I thought was missing was representation of plant science.  So I wrote to them a while back requesting a plant science blog.  Now, I’m not saying that tipped the balance, but I’m not afraid to ask for stuff I want :)   And now there is one.

Pam Ronald–whose work I’ve written about before–is now one of the SciBlings!  I’m just tickled.   Her blog, Tomorrow’s Table, is moving over there.  It’s still early, and you can’t seem to get there from the front page of ScienceBlogs yet.  But you know how moving to a new place goes….  If you want to see some of her work as an introduction, you can watch the talk she gave at The Long Now recently.

I learned about this from Biofortified, another great place for plant science blogging.  I also read Genetic Maize regularly.  And every day I read agro.biodiver.se.  Check ‘em out if you are a fan of plant science.

Tip of the Week: TARGeT

9 September, 2009 (12:08) | Tip of the Week | By: Trey

Today’s tip is on TARGeT. TARGeT is, as the title of the paper in this year’s NAR issue states, “a web-based pipeline for retrieving and characterizing gene and transposable element families from genomic sequences.” There are several things you can do at TARGeT. Using BLAST, PHI-BLAST, MUSCLE and TreeBest, the main function of TARGeT is to quickly obtain gene and transposon families from a query sequence. The tip today is a quick intro to the tool and a search on an R1 non-LTR transposon.

Tip of the Week: PLAN2L for Arabidopsis literature

19 August, 2009 (09:22) | Genomics Research, New Resource, Tip of the Week | By: Mary

For this tip of the week we look at a text-mining tool for the Arabidopsis literature, Plan2L, or PLant ANnotation to Literature.  It has a very straightforward interface that permits searching of the paper space, and you can do that with a variety of focal points: the bibliome as a whole, or with emphasis on interactions, regulation, cell cycle, and more.  The results offer links to the PubMed abstracts, and tabular results of the statistics of the term occurrence in that area of focus.  Green results indicate positive scores and likely relevance, while red results are likely to be non-relevant, giving a graphical guide to quickly finding the data of interest. Links to other resources including the BioCreative server, WikiGenes, iHOP and TAIR are provided as well.

The current emphasis for this resource is Arabidopsis, but it would be quite useful for other species too.  If you are interested in text mining Arabidopsis I would also encourage you to compare the results with the Textpresso installation at TAIR to see what you discover in a different text miner interface as well.

Plan2L site: http://zope.bioinfo.cnio.es/plan2l/plan2l.html

For their recent paper on Plan2L see: http://www.ncbi.nlm.nih.gov/pubmed/19520768 or the full article freely available in PubMedCentral:  http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=19520768

A Tree is Barcoded in Brooklyn

4 August, 2009 (09:55) | General Science, Genomics Research, Genomics Resource News | By: Mary

Scrolling through some of my regular podcasts the other day I came across this tidbit about bioinformatics growing in New York (among other things, of course!):

Barcoding Plant DNA (I hope the embed of the audio file works, first time I’m trying that…)

It is a discussion with Dr. Damon Little, a curator of bioinformatics from the New York Botanical Garden.  The focus of the discussion is the recent publication of the CBOL Plant Working Group which has settled on the regions that will be used for barcoding plants.

If you aren’t familiar with barcoding efforts yet, you can check out Jennifer’s prior post with some background and great links.  Essentially a small snippet of DNA sequence is used to (hopefully) uniquely identify a given species.  This can be stored in a database–Dr. Little of the NY Botanical Garden refers to GenBank at NCBI, but there are other sites as well.  I was just reading about the web interface for barcoding called iBarcode.org for analyzing and managing this sort of data.
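To make the idea concrete, here is a toy sketch in Python of what matching a barcode might look like: compare a query snippet against a handful of reference snippets and report the closest one. Everything in it is an assumption for illustration. The sequences and the species_A/B/C labels are invented, the helper names are hypothetical, and real barcoding work relies on proper alignment and curated reference databases rather than a position-by-position comparison like this.

```python
from typing import Dict, Tuple

# Toy illustration of barcode matching. The reference sequences and species
# labels are invented for this sketch; they are not real barcode records.
REFERENCES: Dict[str, str] = {
    "species_A": "ATGTCACCACAAACAGAGAC",
    "species_B": "ATGTCACCACAAACGGAAAC",
    "species_C": "ATGTCTCCTCAAACAGAGAC",
}


def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must already be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def best_match(query: str, references: Dict[str, str]) -> Tuple[str, float]:
    """Return the reference label with the highest identity to the query, and that identity."""
    scored = {name: percent_identity(query, seq) for name, seq in references.items()}
    name = max(scored, key=scored.get)
    return name, scored[name]


if __name__ == "__main__":
    query = "ATGTCACCACAAACAGAGAC"
    print(best_match(query, REFERENCES))
```

The "(hopefully)" above is the catch: if two species share an identical snippet, a lookup like this cannot tell them apart, which is why the choice of barcode region matters so much.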

The Consortium for the Barcode Of Life Plant Working Group summary press release of this work can be found here.   The paper that describes the work is Open Access in PNAS here.  The paper describes the genes that had been candidates for the barcode, and the ones that were selected (rbcL + matK).  They described primer selection and sequencing results for the series they examined.  They evaluate which ones meet the barcoding standard criteria and provide the selections.  They use MUSCLE to examine the sequence alignments.

This is an excellent effort on many fronts.  Just assessing and cataloging biodiversity is useful itself, but this can also help to identify plants that are claimed to be used in food or medicine products to see if that is what’s really in there.  It can help combat poaching of protected species–for example, it can identify wood harvested that shouldn’t have been taken for lumber.

Glad to see this work moving forward and getting out in front of the public!

Related links

Podcast direct page: http://www.wnyc.org/shows/lopate/episodes/2009/07/29/segments/137623

NYBG: http://www.nybg.org/

Barcode blog: http://phe.rockefeller.edu/barcode/blog/

Scientific American article on the topic: http://www.scientificamerican.com/blog/60-second-science/post.cfm?id=botanists-agree-on-dna-barcode-for-2009-07-29

Consortium for the Barcode of Life (CBOL): http://www.barcoding.si.edu/

References
CBOL Plant Working Group (2009). A DNA barcode for land plants. PNAS, 106 (31), 12794-12797. DOI: 10.1073/pnas.0905845106

Singer, G., & Hajibabaei, M. (2009). iBarcode.org: web-based molecular biodiversity analysis. BMC Bioinformatics, 10 (Suppl 6). DOI: 10.1186/1471-2105-10-S6-S14

Edgar, R. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 32 (5), 1792-1797. DOI: 10.1093/nar/gkh340

BREAD: rising money for science?

30 June, 2009 (12:09) | General Science, Genomics News, Genomics Research | By: Mary

I keep an eye on a lot of mailing lists.  Usually they are the ones for database or software resources in our field.  But I also keep an eye on some funding ones.  We aren’t always eligible, but it also helps us to get a sense of the directions that projects are going.

Yesterday I saw one that surprised me on several levels.  It is called BREAD funding.  BREAD stands for Basic Research to Enable Agricultural Development.  I found this one interesting because:

1. It is a joint project between NSF and The Gates Foundation.  Maybe there are other federal funding projects that involve private foundations like this.  But I haven’t seen them.

The National Science Foundation (NSF) and the Bill & Melinda Gates Foundation (BMGF) are partnering to support a new research program to be administered by NSF. The objective of the BREAD Program is to support innovative scientific research designed to address key constraints to smallholder agriculture in the developing world

2. It is giving money for plant genomics in agriculture.  Cool! Among the possible directions for the research:

  • New strategies for creating resistance to major diseases and pests that affect plants, animals or insects of agricultural importance, and that have major impact in broad regions of the developing world.

3. It actually uses the phrase “climate change” and calls it a threat.  And acknowledges several things that I don’t think the last administration was serious about at all:

  • Novel approaches to using the genetic diversity of plants, microbes, or animals to enhance the ability of small-scale farmers to adapt to emerging threats of global climate change, emerging diseases, and the rising costs of energy.

Anyway, I’m delighted to see basic research on plants in agriculture in Africa and Asia getting some attention.  I was pleased when I heard Hillary Clinton refer to this recently, but I was waiting for someone to show me the money. And what do you know–they did.

This specific grant: http://www.nsf.gov/pubs/2009/nsf09566/nsf09566.htm?govDel=USNSF_25

More on the BREAD program: http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=503285

Press release on the NSF + BMGF partnership:  http://www.eurekalert.org/pub_releases/2009-03/nsf-nsf033009.php

Biodiversity databases

22 June, 2009 (10:24) | General Science, Genomics Research | By: Mary

I know sometimes I joke about “another day, another genome” as it seems like we can check off another genome daily.  And as the next-gen technology spreads further that’s going to be even more common.  It’s gotten me thinking a lot about which species ought to be done.  And how will sequencing research teams choose?

The folks at the Agricultural Biodiversity Weblog have me intrigued on a bunch of resources that are not the ones that most bioinformatics folks in my sphere have focused on.  I mean, I know why we focus so much effort on model organisms and the big food species like at Gramene and PlantGDB, and I support that.  But when you start thinking about the other organisms that we rely on so much–in the big agriculture way and the small agriculture way–I think we need to bring those animals and plants into the herd :)   And we can soon.

Their recent post Linking up livestock databases was the one that prompted this post.  But they write a lot of things I like (especially about plant genetic resources) and really have me wondering and reading, and thinking about how to raise awareness on the other valuable species.

The livestock post pointed to several nice resources that I was unaware of before.  In an article by Eildert Groeneveld in the Globaldiv Newsletter the focus is animals, and he offers several nice links. Check out the diverse sheep at the Heritage Sheep Breeds web site.  Check out the species in the Central Documentation for Animal Biological Diversity in Germany here.  Or the breed data collection at Oklahoma State–have you ever seen goats like those Altai Mountain goats?  Wowsa.    How about the Domestic Animal Diversity Information System or DAD-IS?  There are other great links as well in the newsletter–check ‘em out.  Another thrust of the article is linking up individual data with breed data via the EFABIS project to enhance the knowledge, and you can learn more about that in the newsletter.

Anyway, there are some really fascinating variations here.  Understanding them would be a great project for folks with a next-gen sequencer waiting for input.   Have a look.  And celebrate rare breeds.  We are going to need them in times of climate change.

Growing plant scientists

15 June, 2009 (11:19) | General Science, Genomics Research | By: Mary

I’m fascinated by all the genomes I see–and I’m delighted to see plant scientists raising awareness for that research.  So this weekend it was nice to see some commentary about support for plant scientists on the political blogs.  From an article by Hillary Clinton on Huffpo:

Attacking Hunger at Its Roots

4. We will expand knowledge and training by supporting R&D and cultivating the next generation of plant scientists.

I hope that’s true–and that the funding comes through for that.  But how nice to see an administration say out loud that they support science–and plant science specifically.

The article was also related to the work of Dr. Gebisa Ejeta who sleuthed out a way to create a plant resistant to Striga, a very tricksy parasitic weed that was seriously impacting sorghum farmers in Africa.  Congrats to Dr. Ejeta who won the World Food Prize for this work.  More like Ejeta!

Hat tip to the Biodiversity Weblog post that started me reading on this compelling work.

Phytozome

8 June, 2009 (20:32) | Genomics Resource News | By: Trey

A newly enhanced database and resource called Phytozome is available to researchers. Phytozome is intended as a hub of genomic data for plants of interest in biofuel research, and is a joint project of the DOE JGI and UC Berkeley’s Center for Integrative Genomics. As a recent press release states,

The gene families available in Phytozome, defined at several evolutionarily significant epochs, provide a framework for the transfer of functional information to important biofuel and agricultural crops from model plant systems, as well as allowing users to explore land plant evolution.

This release is v. 4 and includes the genomes of 14 plants, ranging from green algae to Arabidopsis and corn. The resource uses GBrowse (free tutorial and training materials) as its genome browser, BioMart for advanced searching, and has BLAST capability. I find Gramene a bit more extensive than Phytozome, but the focus of each (biofuel plants for Phytozome, agricultural grains for Gramene) is different, and Phytozome is becoming quite extensive.

I remember going to a DOE/JGI users conference last year and being quite impressed with the research going on in biofuel, and also more sobered by the obstacles, both technological and practical (use of food-producing land, etc.), that we face. With rising gas prices and temperatures, we can’t ask for too much information!