Tag Archives: GenBank

Picnics in Peril: Bees, Mosquitoes and Zombie Ants!

My husband loves spending time on our deck & mosquitoes love spending time chewing on me. Maybe that’s why I’ve been taking notice of research on the “Bugs of Summer” a bit more lately. Or maybe there’s just a bunch more of it right now, but I’ve seen what seems like an unusually large number of articles on bees, mosquitoes, and even “zombie ants”.

It’s a given that I’ve got to begin with the zombie ants. I mean, look how much traffic the CDC’s “Zombie Apocalypse” generated. The article is entitled “Behavioral mechanisms and morphological symptoms of zombie ants dying from fungal infection” and it is in the May 9th issue of BMC Ecology, which is open access & therefore free to read. (An amusing side note: the ad on the abstract page when I looked was for Terminix.) The article is an “extended phenotype” study in which ants (Camponotus leonardi) infected by a fungus (Ophiocordyceps unilateralis s.l.) display altered phenotypes: they descend from a Thai rain forest canopy, bite leaf veins, and then die lock-jawed on the vein. The authors explain their use of the term zombies:

The term zombie ants underlines that, while the manipulated individual may look like an ant, it represents a fungal genome expressing fungal behavior through the body of an ant.

The authors followed 21 zombies, all with confirmed fungal infections, and observed changes in their activity and morphological changes in their head cavities. I generally don’t read ecology research but I think I’ve seen the process featured in nature programs that I’ve watched & the paper explained it well. I would like to read further research on the changes in gene expression, etc. that would help complete the story of the process.

I’ve been aware of bee issues, such as Colony Collapse Disorder, for a while now – it is hard not to be when even popular television such as Dr. Who references it – but it was an article in an alumni magazine that really got me interested in the subject. Turns out last fall a tornado hit Ohio State’s Ohio Agricultural Research and Development Center (OARDC) and did thousands of dollars of damage to the honey bee program’s inventory and equipment, including a comprehensive collection of hive and queen bee production boxes. Today I came across an article describing the Honey Bee PeptideAtlas. I mentioned the PeptideAtlas as an item in a Friday SNPets post, but didn’t go into it further. The article is in BMC Genomics, is free to read, and has the title: “A honey bee (Apis mellifera L.) PeptideAtlas crossing castes and tissues.” It has been a long time since I spent much time with a genome that isn’t associated with a fairly well established proteome, but that was the state (according to the authors) of the bee genome before they built the bee PeptideAtlas. The authors have built the Honey Bee PeptideAtlas on the backbone of the PeptideAtlas, which they describe as providing

“a central, stable resource for mass spectrometry data supporting protein identification information for several species.”

They collected a large set of MS/MS data from A. mellifera, matched peaks to peptides & peptides to a comprehensive protein set compiled from RefSeq, GenBank and Gnomon predictions. The atlas was then constructed & functional predictions were made using BLAST2GO. The Honey Bee PeptideAtlas contains over 3000 proteins and 27,000 peptides, and should be a valuable resource for bee proteogenomic studies.

The third “Bug of Summer”, the mosquito, is one that I experience often but do not understand well. My husband & I can be sitting next to each other & he can apparently not get a single bite while I am covered with big itchy bumps – why do they like me so well? I’ve always assumed that it was because of temperature – my skin often feels warmer than his. According to two articles in Nature: “Malaria: Mosquitoes bamboozled” (editor’s summary) and “Ultra-prolonged activation of CO2-sensing neurons disorients mosquitoes” (both require a subscription), there may be another reason – carbon dioxide. Apparently exhaled carbon dioxide is the most important cue for guiding the long-distance host-seeking flight of blood-feeding female mosquitoes, so perhaps my breathing somehow creates a higher CO2 trail. The authors used Drosophila melanogaster as a model organism to study compounds that might inhibit the CO2 detection machinery in mosquitoes. They were able to identify a cocktail of compounds that so overactivated the CO2-sensing neurons of three disease-carrying mosquitoes (Anopheles gambiae, Aedes aegypti and Culex quinquefasciatus) that it greatly reduced their ability to navigate the CO2 trail to a potential blood meal. Unfortunately, according to the authors, the volatile odorants reported in the paper

“…have undesirable safety profiles at high concentrations and are not ideal for human use without further testing.”

I also think such “long range” repellents will need to be paired with nearer-range effectors. Mosquitoes may find you through the CO2 in your breath, but once they are in range they seem to no longer need that cue and will bite you anywhere that will itch massively – like around your toenails, on your ankles & at the back of your knees – I might be biased from experience though…

PS, here’s another mosquito paper for you: Dissecting gene expression in mosquito

PPS, anyone know if the chemicals that commercial “mosquito removal” treatment companies use harm “good bugs”, hummingbirds, etc? Inquiring, concerned & itchy minds would like to know…

For good results, try several databases

Just wanted to point out this paper recently published in BMC Genomics: Integrating multiple genome annotation databases i… [BMC Genomics. 2010]

Often in our trainings we are asked which annotations or databases are best. Our stock and, frankly, accurate answer is that it depends on what you are looking for, your personal preferences and more. This paper concludes, at least when it comes to zebrafish transcript data, that pulling annotations from several databases instead of one increases your chances of getting complete data. Might sound obvious to some, but it’s always good to see the data, and to point it out.
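To make the idea concrete, here is a minimal Python sketch of what "pulling from several databases" can look like. It is not the paper's method: the database names (`db_a`, `db_b`), the gene IDs and the annotation terms are all made-up placeholders, and the merge is a plain set union.

```python
from collections import defaultdict


def merge_annotations(sources):
    """Union the annotation terms for each gene across all sources.

    `sources` maps a (hypothetical) database name to a dict of
    gene ID -> set of annotation terms.
    """
    merged = defaultdict(set)
    for annotations in sources.values():
        for gene_id, terms in annotations.items():
            merged[gene_id].update(terms)
    return dict(merged)


def coverage(annotations, genes_of_interest):
    """Fraction of the genes of interest with at least one annotation."""
    annotated = sum(1 for gene in genes_of_interest if annotations.get(gene))
    return annotated / len(genes_of_interest)


# Placeholder data, invented purely for illustration.
sources = {
    "db_a": {"gene1": {"kinase"}, "gene2": {"transporter"}},
    "db_b": {"gene2": {"membrane"}, "gene3": {"transcription factor"}},
}
genes = ["gene1", "gene2", "gene3", "gene4"]

print(coverage(sources["db_a"], genes))             # one database alone
print(coverage(merge_annotations(sources), genes))  # both databases merged
```

With this toy data the single database annotates two of the four genes, and the merged set annotates three. Real annotation sets would also need their identifiers mapped to a common scheme before a union like this makes sense.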

Tip of the Week: NCBI Makeover!

The two earliest web-based bioinformatics resources that I can remember relying on in my career were Pedro’s List and NCBI. (For those of you who need a little nostalgia trip you can see a copy of Pedro’s list here.) There are plenty of descendants of Pedro’s list in various forms–including our recently launched resource search tool. But the National Center for Biotechnology Information (NCBI) interface has kinda been…well…comfortingly stable–for a really long time. I looked in the Wayback Machine to see what the older interfaces used to look like. I was able to find one variant from 1997 which I had forgotten about until I saw it. But then I kept looking and found the version I am most familiar with starting in 1999. If you compare 1999 to 2009 you will see essentially the same layout. Here is a comparison of the previous interfaces, and then the new one:

NCBI interfaces through the years


Well, that’s all changing now!  The NCBI is doing a MAJOR overhaul of the interfaces.  You can examine the homepage look at the  Preview site here (link may break when they move over to production with it), and you can look at the PubMed changes here, and even start using the PubMed preview site here.

This is a huge break with the past, and like all new interfaces will take a little time to get used to.  But I have to say I like the organization.  The left navigation will make finding the tools easier.  The “Popular” box will be quick access to the most frequently used items.  Highlights and news are available still as well.  There are some things I’ll miss. We liked the site map layout to explain the features in an overview sort of way, and the preview page doesn’t link to that–it links to the alphabetical list.  Might change, though.

Anyway–I think the new look is nice and effective. Of course we’ll have to update all of our NCBI tutorials with new shots of the interfaces. It looks like the underlying tools don’t change much conceptually, though the location of some items (like the PubMed filters) may move. So as soon as the interface becomes the main site and appears to be stable we’ll make our changes.

This short Tip of the Week briefly introduces the new interface to get you started thinking about how to navigate around. Check it out!

NCBI: http://www.ncbi.nlm.nih.gov/

A Tree is Barcoded in Brooklyn

Scrolling through some of my regular podcasts the other day I came across this tidbit about bioinformatics growing in New York (among other things, of course!):

Barcoding Plant DNA (I hope the embed of the audio file works, first time I’m trying that…)

It is a discussion with Dr. Damon Little, a curator of bioinformatics from the New York Botanical Garden.  The focus of the discussion is the recent publication of the CBOL Plant Working Group which has settled on the regions that will be used for barcoding plants.

If you aren’t familiar with barcoding efforts yet, you can check out Jennifer’s prior post with some background and great links.  Essentially a small snippet of DNA sequence is used to (hopefully) uniquely identify a given species.  This can be stored in a database–Dr. Little of the NY Botanical Garden refers to GenBank at NCBI, but there are other sites as well.  I was just reading about the web interface for barcoding called iBarcode.org for analyzing and managing this sort of data.
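For a rough feel of how a barcode lookup works, here is a toy Python sketch. The assumptions are all mine: the reference sequences and species names are invented, the sequences are taken to be already aligned and of equal length, and the 95% identity cutoff is an arbitrary illustration. Real barcode identification runs against curated databases with proper alignment or search tools.

```python
def percent_identity(a, b):
    """Percent of matching positions between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length (pre-aligned)")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def identify(query, reference, min_identity=95.0):
    """Return (best matching name, identity), or (None, identity) if too low.

    `reference` maps a species name to its barcode sequence.
    `min_identity` is an arbitrary cutoff chosen for this example.
    """
    best_name, best_score = None, 0.0
    for name, barcode in reference.items():
        score = percent_identity(query, barcode)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < min_identity:
        return None, best_score
    return best_name, best_score


# Invented reference barcodes, for illustration only.
reference = {
    "species A": "ACGTACGTACGTACGTACGT",
    "species B": "ACGTTCGTACGAACGTACCT",
}

print(identify("ACGTACGTACGTACGTACGA", reference))
```

The "(hopefully)" in the paragraph above matters here: if two species share a nearly identical barcode region, a lookup like this cannot tell them apart, which is why the choice of region was worth a working group.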

The Consortium for the Barcode Of Life Plant Working Group summary press release of this work can be found here. The paper that describes the work is Open Access in PNAS here. The paper describes the genes that had been candidates for the barcode, and the ones that were selected (rbcL + matK). The authors describe primer selection and sequencing results for the series they examined, evaluate which candidates meet the barcoding standard criteria, and provide the selections. They use MUSCLE to examine the sequence alignments.

This is an excellent effort on many fronts.  Just assessing and cataloging biodiversity is useful itself, but this can also help to identify plants that are claimed to be used in food or medicine products to see if that is what’s really in there.  It can help combat poaching of protected species–for example, it can identify wood harvested that shouldn’t have been taken for lumber.

Glad to see this work moving forward and getting out in front of the public!

Related links

Podcast direct page: http://www.wnyc.org/shows/lopate/episodes/2009/07/29/segments/137623

NYBG: http://www.nybg.org/

Barcode blog: http://phe.rockefeller.edu/barcode/blog/

Scientific American article on the topic: http://www.scientificamerican.com/blog/60-second-science/post.cfm?id=botanists-agree-on-dna-barcode-for-2009-07-29

Consortium for the Barcode of Life (CBOL): http://www.barcoding.si.edu/

CBOL Plant Working Group (2009). A DNA barcode for land plants PNAS, 106 (31), 12794-12797 DOI: 10.1073/pnas.0905845106

Singer, G., & Hajibabaei, M. (2009). iBarcode.org: web-based molecular biodiversity analysis BMC Bioinformatics, 10 (Suppl 6) DOI: 10.1186/1471-2105-10-S6-S14

Edgar, R. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput Nucleic Acids Research, 32 (5), 1792-1797 DOI: 10.1093/nar/gkh340

Tip of the Week: File Format conversion

Many of us have worked with DNA and protein sequences in several different formats: GenBank and FASTA, to name two broadly used ones, but there are many others. Different tools and databases will often require different formats. More often than not, converting from one format to another isn’t too much of a problem: the database will do it for you, or there will be some help documentation. But this isn’t always the case. There are several ways you can convert formats. For example, Galaxy has some limited ability to do this and some databases allow you to export sequence in one of several formats, but often you’ll need a bit more help. ReadSeq is a publicly available software package (downloadable from that link) that will do just that. You could download and install it, but EBI also has a web interface for ReadSeq (as do some other services). Today’s tip is for those of you somewhat new to sequence formats (or even some of us who aren’t) who need a quick web interface for converting formats.
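If you are curious what such a conversion involves under the hood, here is a bare-bones Python sketch that turns GenBank flat-file text into FASTA. It is not how ReadSeq works, and it handles only a simplified subset of the format: it reads the first line of DEFINITION, the first ACCESSION value, and the sequence lines between ORIGIN and `//`. The record below is a made-up example.

```python
import re


def genbank_to_fasta(genbank_text, width=60):
    """Convert GenBank flat-file text (one or more records) to FASTA.

    Simplified sketch: multi-line DEFINITION fields are truncated to
    their first line and all feature annotation is ignored.
    """
    records = []
    accession, definition, seq_parts = "", "", []
    in_origin = False
    for line in genbank_text.splitlines():
        if line.startswith("ACCESSION"):
            fields = line.split()
            accession = fields[1] if len(fields) > 1 else ""
        elif line.startswith("DEFINITION"):
            definition = line[len("DEFINITION"):].strip()
        elif line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("//"):
            # End of record: emit a header line plus the wrapped sequence.
            sequence = "".join(seq_parts)
            header = f">{accession} {definition}".rstrip()
            chunks = [sequence[i:i + width] for i in range(0, len(sequence), width)]
            records.append("\n".join([header] + chunks))
            accession, definition, seq_parts = "", "", []
            in_origin = False
        elif in_origin:
            # Sequence lines look like "   1 acgtacgtac gtacgtacgt";
            # drop the position numbers and the spaces.
            seq_parts.append(re.sub(r"[^A-Za-z]", "", line))
    return "\n".join(records) + "\n"


# A made-up record, for illustration only.
example = """LOCUS       TEST0001                  24 bp    DNA     linear   UNA 01-JAN-2000
DEFINITION  Hypothetical example sequence.
ACCESSION   TEST0001
ORIGIN
        1 acgtacgtac gtacgtacgt acgt
//
"""

print(genbank_to_fasta(example), end="")
```

For anything beyond a quick look, a maintained tool such as ReadSeq or the database's own export option is the safer route, since real GenBank records have many more fields and edge cases than this sketch covers.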

Treasure Hunts

Thought I’d recommend a little fun treasure hunt using GenBank. It’s a fun (if you are a biology and database geek like me) project that will hone some skills using GenBank, introduce you to a nice tool called “Blink” and maybe find some interesting anomalies. It’s all outlined (in several blog posts) by Sandra at her blog “Discovering Biology in a Digital World” (great blog btw) in a post entitled “A general method and good student project for finding interesting anomalies in GenBank.” I’m having fun with it, and I’ll tell you (and her) if I find something.

Speaking of finding things, here are some interesting things I’ve come across randomly lately.

I came across this interesting book the other day. It isn’t ‘databases’ or ‘genomics’ per se (ok, so not at all really), but it’s a look at what geological footprint humans and human civilization will have on the Earth some 100 million years from now: what some alien traveler might find from this “Anthropocene” period we are in. Read more about it here and listen to the interview.

PLoS Computational Biology has a paper introducing a new database you might want to check out: mouseNet. As stated at the database site, it is a “functional network for laboratory mouse based on integration of diverse genetic and genomic data…to… predict novel functional assignments and network components.”

Wikification of GenBank

Speaking of GenBank’s 25th, a few weeks ago Science had a news piece “Proposal to ‘Wikify’ GenBank Meets Stiff Resistance.” Apparently, those in the mycology research community have found many inaccuracies in the GenBank records and wish to see a change that would allow annotations to be made by the community:

a scheme like those used in herbaria and museums, where specimens often have multiple annotations: listing original and new entries side by side. It would be a community operation, like Wikipedia, in which the users themselves update and add information, but not anonymously.

But the idea is meeting resistance from GenBank’s managers.

GenBank's 25th Anniversary (Highlights)

I know the liveblogging is hard to read–it is really a reflection of my notes as the talks were progressing. I’m going to clean them up a bit, but mostly I’m going to leave them in case people need a quick summary of what was discussed. The videocasts for the talks are still going to be available, but they are unfortunately in giant many-hour chunks with no guidance as to what exactly is in there, or where in the recording you might find it.

So I’m going to highlight a few things here that I found especially interesting (and indicate where in the videocast you might find it). Of course, you may have other areas of interest and find other things you prefer. Feel free to watch all of them! Details of my choices below, and the approximate time on the video. You can move the slider to get to that approximate time point.

Liveblogging the GenBank 25th Anniversary II

I’m preparing to liveblog this event again today, internets permitting:

GenBank: Celebrating 25 years of Service at NCBI: http://www.tech-res.com/GenBank25/ official announcement.

The agenda is here: http://www.tech-res.com/GenBank25/agenda.html

There is a link to a videocast of the event from the Celebration link, supposedly:

View event: You will be able to view the event at http://videocast.nih.gov when the event is live.

Will try to update as often as I can, if I have decent wireless and power.

Session Chair: Steven Salzberg.


Liveblogging the GenBank 25th Anniversary

I’m preparing to liveblog this event, internets permitting:

GenBank: Celebrating 25 years of Service at NCBI: http://www.tech-res.com/GenBank25/ official announcement.

The agenda is here: http://www.tech-res.com/GenBank25/agenda.html

Not being a married person, I didn’t know which one this was. I had to look it up. This is Silver. I can’t think of a decent gift, so I’m not bringing one. Maybe they are registered somewhere??

There is a link to a videocast of the event from the Celebration link, supposedly:

View event: You will be able to view the event at http://videocast.nih.gov when the event is live.
Air date: Monday, April 07, 2008, 9:00:00 AM

Will try to update as often as I can, if I have decent wireless and power.

Welcome remarks

Michael Gottesman: GenBank one of the major accomplishments of the NIH. Major reasons for success: 1. timely, visionary idea. Already a protein seq database (Dayhoff), need for nucleotides as well. 2. International cooperation from the beginning. Support from other US organisations as well. Stable foundation at NIH has been important. 3. Contributions of researchers providing the data have been a third key. 4. Technology improvements in sequencing and comparison algorithms. 5. Move from contract basis to NCBI/NLM provided stable support.
