Tag: GenBank

For good results, try several databases

22 January, 2010 (15:41) | Genomics Research, Genomics Resource News | By: Trey

Just wanted to point out this paper recently published in BMC Genomics: Integrating multiple genome annotation databases i… [BMC Genomics. 2010]

Often in our trainings we are asked which annotations or databases are best. Our stock and, frankly, accurate answer is that it depends on what you are looking for, your personal preferences and more. This paper concludes, at least when it comes to zebrafish transcript data, that pulling annotations from several databases instead of one gives you a better chance of getting complete data. Might sound obvious to some, but it’s always good to see the data, and to point it out.

Tip of the Week: NCBI Makeover!

7 October, 2009 (07:45) | Genomics Resource News, Tip of the Week | By: Mary

The two earliest web-based bioinformatics resources that I can remember relying on in my career were Pedro’s List and NCBI. (For those of you who need a little nostalgia trip you can see a copy of Pedro’s list here.) There are plenty of descendants of Pedro’s list in various forms–including our recently launched resource search tool. But the National Center for Biotechnology Information (NCBI) interface has kinda been…well…comfortingly stable–for a really long time. I looked in the Wayback Machine to see what the older interfaces used to look like. I was able to find one variant from 1997 which I had forgotten about until I saw it. But then I kept looking and found the version I am most familiar with starting in 1999. If you compare 1999 to 2009 you will see essentially the same layout. Here is a comparison of the previous interfaces, and then the new one:

[Figure: NCBI interfaces through the years]

[Figure: the new NCBI interface, 2009]

Well, that’s all changing now! The NCBI is doing a MAJOR overhaul of the interfaces. You can examine the new homepage look at the preview site here (link may break when they move over to production with it), and you can look at the PubMed changes here, and even start using the PubMed preview site here.

This is a huge break with the past, and like all new interfaces will take a little time to get used to. But I have to say I like the organization. The left navigation will make finding the tools easier. The “Popular” box will give quick access to the most frequently used items. Highlights and news are still available as well. There are some things I’ll miss. We liked the site map layout for explaining the features in an overview sort of way, and the preview page doesn’t link to that–it links to the alphabetical list. Might change, though.

Anyway–I think the new look is nice and effective. Of course we’ll have to update all of our NCBI tutorials with new shots of the interfaces. It looks like the underlying tools don’t change much conceptually, though the location of some items (like the PubMed filters) may move. So as soon as the interface becomes the main site and appears to be stable we’ll make our changes.

This short Tip of the Week briefly introduces the new interface to get you started thinking about how to navigate around. Check it out!

NCBI: http://www.ncbi.nlm.nih.gov/

A Tree is Barcoded in Brooklyn

4 August, 2009 (09:55) | General Science, Genomics Research, Genomics Resource News | By: Mary

Scrolling through some of my regular podcasts the other day I came across this tidbit about bioinformatics growing in New York (among other things, of course!):

Barcoding Plant DNA (I hope the embed of the audio file works, first time I’m trying that…)

It is a discussion with Dr. Damon Little, a curator of bioinformatics from the New York Botanical Garden.  The focus of the discussion is the recent publication of the CBOL Plant Working Group which has settled on the regions that will be used for barcoding plants.

If you aren’t familiar with barcoding efforts yet, you can check out Jennifer’s prior post with some background and great links.  Essentially a small snippet of DNA sequence is used to (hopefully) uniquely identify a given species.  This can be stored in a database–Dr. Little of the NY Botanical Garden refers to GenBank at NCBI, but there are other sites as well.  I was just reading about the web interface for barcoding called iBarcode.org for analyzing and managing this sort of data.
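To make the idea a little more concrete, here is a toy Python sketch of a barcode lookup: compare a query snippet against a small reference table and report the closest match. Everything in it is an assumption for illustration. The sequences are invented (not real rbcL or matK data), the species labels are placeholders, the function names are made up, and a simple position-by-position comparison stands in for the alignment-based methods that real barcoding tools use.

```python
def fraction_identity(a, b):
    """Fraction of matching positions over the shorter of two sequences."""
    length = min(len(a), len(b))
    if length == 0:
        return 0.0
    # zip stops at the shorter sequence, so only shared positions are compared
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / length


def identify(query, references):
    """Return (label, identity) for the reference most similar to the query."""
    best_label, best_score = None, -1.0
    for label, seq in references.items():
        score = fraction_identity(query.upper(), seq.upper())
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Invented toy "barcodes" for illustration; not real sequence data.
TOY_REFERENCES = {
    "species A": "ATGTCACCACAAACAGAGAC",
    "species B": "ATGTCACCACAAACGGAAAC",
    "species C": "ATGGCTCCTCAAACAGATAC",
}

label, score = identify("atgtcaccacaaacggaaac", TOY_REFERENCES)
print(label, round(score, 2))
```

The "(hopefully)" above is the catch: if two species share the same snippet, a lookup like this cannot tell them apart, which is why the choice of barcode region matters so much.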

The Consortium for the Barcode Of Life Plant Working Group summary press release of this work can be found here. The paper that describes the work is Open Access in PNAS here. The paper describes the genes that had been candidates for the barcode, and the ones that were selected (rbcL + matK). They describe primer selection and sequencing results for the series they examined, evaluate which candidates meet the barcoding standard criteria, and provide the selections. They use MUSCLE to examine the sequence alignments.

This is an excellent effort on many fronts. Just assessing and cataloging biodiversity is useful in itself, but this can also help to identify plants that are claimed to be used in food or medicine products to see if that is what’s really in there. It can help combat poaching of protected species–for example, it can identify wood that shouldn’t have been harvested for lumber.

Glad to see this work moving forward and getting out in front of the public!

Related links

Podcast direct page: http://www.wnyc.org/shows/lopate/episodes/2009/07/29/segments/137623

NYBG: http://www.nybg.org/

Barcode blog: http://phe.rockefeller.edu/barcode/blog/

Scientific American article on the topic: http://www.scientificamerican.com/blog/60-second-science/post.cfm?id=botanists-agree-on-dna-barcode-for-2009-07-29

Consortium for the Barcode of Life (CBOL): http://www.barcoding.si.edu/

References
CBOL Plant Working Group (2009). A DNA barcode for land plants. PNAS, 106 (31), 12794-12797. DOI: 10.1073/pnas.0905845106

Singer, G., & Hajibabaei, M. (2009). iBarcode.org: web-based molecular biodiversity analysis. BMC Bioinformatics, 10 (Suppl 6). DOI: 10.1186/1471-2105-10-S6-S14

Edgar, R. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 32 (5), 1792-1797. DOI: 10.1093/nar/gkh340

Tip of the Week: File Format conversion

27 May, 2009 (00:48) | Genomics Research, Tip of the Week | By: Trey

Many of us have worked with DNA and protein sequences in several different formats: GenBank and FASTA, to name two widely used ones, but there are many others. Different tools and databases will often require different formats. More often than not, converting from one format to another isn’t much of a problem: the database will do it for you, or there will be some help documentation. But this isn’t always the case. There are several ways you can convert formats. For example, Galaxy has some limited ability to do this, and some databases allow you to export sequence in one of several formats, but often you’ll need a bit more help. ReadSeq is a publicly available software package (downloadable from that link) that will do just that. You could download and install it, but EBI also has a web interface for ReadSeq (as do some other services). Today’s tip is for those of you who are somewhat new to sequence formats (or even some of us who aren’t) and need a quick web interface for converting formats.
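For the curious, here is roughly what a GenBank-to-FASTA conversion involves under the hood. This is a minimal Python sketch, not how ReadSeq itself works: the function name genbank_to_fasta and the toy record are made up for illustration, and it assumes a single, simple record. It reads the name from the LOCUS line, a one-line DEFINITION, and the sequence lines between ORIGIN and the closing //. Features, multi-record files, and multi-line definitions are ignored.

```python
def genbank_to_fasta(genbank_text, width=60):
    """Convert one simple GenBank flat-file record to FASTA text.

    Hypothetical helper for illustration only; it looks at nothing
    beyond the LOCUS, DEFINITION, and ORIGIN sections.
    """
    name = "unknown"
    definition = ""
    seq_parts = []
    in_origin = False
    for line in genbank_text.splitlines():
        if line.startswith("LOCUS"):
            fields = line.split()
            if len(fields) > 1:
                name = fields[1]
        elif line.startswith("DEFINITION"):
            definition = line[len("DEFINITION"):].strip()
        elif line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("//"):
            in_origin = False
        elif in_origin:
            # Sequence lines look like "        1 gatcctccat atacaacggt ...";
            # keep the letters and drop the position numbers and spaces.
            seq_parts.append("".join(ch for ch in line if ch.isalpha()))
    sequence = "".join(seq_parts).upper()
    header = ">" + " ".join(part for part in (name, definition) if part)
    # Break the sequence into fixed-width lines, as FASTA files usually do.
    chunks = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return header + "\n" + "\n".join(chunks) + "\n"


# Made-up record for illustration; not a real GenBank entry.
TOY_RECORD = """\
LOCUS       TOY0001                   24 bp    DNA     linear   PLN 01-JAN-2009
DEFINITION  Made-up example sequence.
ORIGIN
        1 gatcctccat atacaacggt atct
//
"""

print(genbank_to_fasta(TOY_RECORD), end="")
```

Real records have plenty of wrinkles this sketch skips, which is exactly why a tested converter like ReadSeq is the safer choice for actual work.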

Treasure Hunts

26 September, 2008 (15:42) | General Science, Genomics Resource News, New Resource | By: Trey

Thought I’d recommend a little fun treasure hunt using GenBank. It’s a fun (if you are a biology and database geek like me) project that will hone some skills using GenBank, introduce you to a nice tool called ‘Blink’ and maybe find some interesting anomalies. It’s all outlined (in several blog posts) by Sandra at her blog “Discovering Biology in a Digital World” (great blog btw) in a post entitled “A general method and good student project for finding interesting anomalies in GenBank.” I’m having fun with it, and I’ll tell you (and her) if I find something.

Speaking of finding things, here are some interesting things I’ve come across randomly lately.

I came across this interesting book the other day. It isn’t ‘databases’ or ‘genomics’ per se (ok, so not at all really), but it’s a look at what geological footprint humans and human civilization will have left on the Earth some 100 million years from now, and what some alien traveler might find from this “Anthropocene” period we are in. Read more about it here and listen to the interview.

PLoS Computational Biology has a paper introducing a new database you might want to check out: mouseNet. As stated at the database site, it is a “functional network for laboratory mouse based on integration of diverse genetic and genomic data…to… predict novel functional assignments and network components.”

Wikification of GenBank

11 April, 2008 (10:46) | Genomics News, Genomics Resource News | By: Trey

Speaking of GenBank’s 25th, a few weeks ago Science had a news piece “Proposal to ‘Wikify’ GenBank Meets Stiff Resistance.” Apparently, those in the mycology research community have found many inaccuracies in the GenBank records and wish to see a change that would allow annotations to be made by the community:

a scheme like those used in herbaria and museums, where specimens often have multiple annotations: listing original and new entries side by side. It would be a community operation, like Wikipedia, in which the users themselves update and add information, but not anonymously.

But the idea is meeting resistance from GenBank’s managers.

GenBank's 25th Anniversary (Highlights)

10 April, 2008 (11:49) | General Science, Genomics News, Genomics Research, Genomics Resource News | By: Mary

I know the liveblogging is hard to read–it is really a reflection of my notes as the talks were progressing. I’m going to clean them up a bit, but mostly I’m going to leave them in case people need a quick summary of what was discussed. The videocasts for the talks are still going to be available, but they are unfortunately in giant many-hour chunks with no guidance as to what exactly is in there, or where in the video you might find it.

So I’m going to highlight a few things here that I found especially interesting (and indicate where in the videocast you might find it). Of course, you may have other areas of interest and find other things you prefer. Feel free to watch all of them! Details of my choices below, and the approximate time on the video. You can move the slider to get to that approximate time point.


Liveblogging the GenBank 25th Anniversary II

8 April, 2008 (06:37) | General Science, Genomics News, Genomics Research, Genomics Resource News | By: Mary

I’m preparing to liveblog this event again today, internets permitting:

GenBank: Celebrating 25 years of Service at NCBI: http://www.tech-res.com/GenBank25/ official announcement.

The agenda is here: http://www.tech-res.com/GenBank25/agenda.html

There is a link to a videocast of the event from the Celebration link, supposedly:
View event: You will be able to view the event at http://videocast.nih.gov when the event is live.

Will try to update as often as I can, if I have decent wireless and power.

Session Chair: Steven Salzberg.


Liveblogging the GenBank 25th Anniversary

7 April, 2008 (06:35) | General Science, Genomics News, Genomics Research, Genomics Resource News | By: Mary

I’m preparing to liveblog this event, internets permitting:

GenBank: Celebrating 25 years of Service at NCBI: http://www.tech-res.com/GenBank25/ official announcement.

The agenda is here: http://www.tech-res.com/GenBank25/agenda.html

Not being a married person, I didn’t know which one this was. I had to look it up. This is Silver. I can’t think of a decent gift, so I’m not bringing one. Maybe they are registered somewhere??

There is a link to a videocast of the event from the Celebration link, supposedly:

View event: You will be able to view the event at http://videocast.nih.gov when the event is live.
Air date: Monday, April 07, 2008, 9:00:00 AM

Will try to update as often as I can, if I have decent wireless and power.

Welcome remarks

Michael Gottesman: GenBank one of the major accomplishments of the NIH. Major reasons for success: 1. timely, visionary idea. Already a protein seq database (Dayhoff), need for nucleotides as well. 2. International cooperation from the beginning. Support from other US organisations as well. Stable foundation at NIH has been important. 3. Contributions of researchers providing the data have been a third key. 4. Technology improvements in sequencing and comparison algorithms. 5. Move from contract basis to NCBI/NLM provided stable support.


Demise of the NCBI Field Guide

3 April, 2008 (13:12) | General Science, Genomics News, Genomics Research, Genomics Resource News | By: Mary

For funding reasons, NCBI (home of PubMed, BLAST, dbSNP, OMIM and more) has cut its outreach staff and canceled all onsite training seminars, and this has to mean decreased support for online help, documentation and tutorials.

When we wrote our NIH grant, one of the models of success in the bioinformatics training area that we highlighted was the NCBI Field Guide program. For those who may be unfamiliar with it, it is a set of training modules delivered by the outreach team at NCBI. They would come to your site, cover many NCBI tools and do hands-on workshops. Another course (Enhanced Field Guide) drew science librarians and other trainers together to train them, and those folks could go back to their institutions and offer more-and-better searches and training for their constituents. We thought the Guides were a terrific group of people who were interested in people getting their hands on the myriad tools at NCBI and using them effectively. It wasn’t really a competitive situation: their remit was only for NCBI tools, and there were plenty of others out there for us to cover. In fact, many people who contacted us for training did so because their local users enjoyed the NCBI training and they wanted similar engagements for other tools.

Recently, though, the calls changed. We found we were getting calls from people who said they weren’t going to be able to get any more Field Guide trainings. NCBI is discontinuing the outreach program. Quite frankly, we were surprised. A sample of the notifications people were getting: http://www.library.uiuc.edu/blog/bicnews/archives/2008/02/ncbi_field_cour.html

Unfortunately, that tremendous training opportunity will NOT occur. Yesterday NCBI Field Guide coordinator, Peter Cooper, sent the following email:

Because of budgetary constraints, NCBI has made reductions in some of its programs, and the education programs are affected. In fact, all outreach education programs (Field Guide, Mini-courses, Structures, PubChem) are terminated effective immediately. At this point we cannot reschedule this course or accept requests for future courses of any kind. This was as much a surprise to me as it is to you. Feel free to contact me if you have questions.

The Field Course, as well as the Mini-Courses and the Structure course, has been tremendously popular and useful (see list of sites where the Field Course has been offered recently), but the NCBI budget situation will not allow NCBI to continue to travel and offer these courses for the foreseeable future.

(emphasis mine)

Here’s a link to a similar letter at another location: http://www.twu.edu/as/bio/NCBI/FieldGuide/

We’ve confirmed this with a number of people directly involved; they have laid off nearly all of the outreach team. Some got reassigned. There can hardly be anyone there to even answer emails to the helpdesk anymore—and they get lots of emails every day.

I’ve been through layoffs before, a few times. It actually feels like a punch to the gut when I hear about it anywhere else—especially among people I know. I expect layoffs at companies, though. But if there was any group that was solidly in place, going to be around for a long time, I would have thought it was the NCBI outreach team. I’m quite sorry to hear that it has been dissolved.

In this time of so many resources & so much need for increased understanding, outreach has become an integral part of a resource’s success, and fewer instructional resources are an unfortunate consequence of decreased funding for science.