Tag Archives: blast

Friday SNPpets

Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

Updated Online Tutorials for NCBI resources including an NCBI Overview and PubMed and the Gene Expression Omnibus tutorials

Comprehensive tutorials on the publicly available NCBI resources enable researchers to quickly and effectively use these invaluable resources.

Seattle, WA (PRWEB) June 8, 2010 – OpenHelix today announced the availability of three updated tutorials on NCBI resources.

The National Center for Biotechnology Information, NCBI, is home to many of the most commonly used publicly available databases and tools in molecular biology today. They house such popular and widely used databases as GenBank, PubMed, GEO, Entrez Gene, Entrez Protein, and more. NCBI also produces, maintains and updates a variety of tools, like the large family of BLAST sequence similarity searching tools and the Entrez search and retrieval tools. In addition, they provide an extensive variety of services for education, news dissemination and different types of data submission. The NCBI Overview tutorial presents a broad overview of NCBI’s databases, tools, educational resources and data submission protocols. In addition to an update on this overview, OpenHelix has updated both its PubMed and GEO tutorials. PubMed is the premier search engine for biomedical literature. More than 18 million citations from life science journals can be searched through this free service. The Gene Expression Omnibus, or GEO, is a valuable resource designed to store high-throughput gene expression and molecular abundance data. These three tutorials, in conjunction with the many other up-to-date OpenHelix tutorials on NCBI resources such as BLAST, Entrez, dbSNP, MMDB, Viral resources, MapViewer and others, will give you a set of training resources to help you be efficient and effective at accessing and analyzing genome data.

The tutorial suites, available through an annual OpenHelix subscription, contain an online, narrated, multimedia tutorial, which runs in just about any browser connected to the web, along with slides with full script, handouts and exercises. With the tutorials, researchers can quickly learn to effectively and efficiently use these resources. The scripts, handouts and other materials can also be used as a reference or for training others.

These tutorials will teach users:

NCBI Overview

*to understand the basic structure of NCBI and its different types of resources
*to navigate NCBI to find the databases and analysis tools you need
*what types of educational resources are available at NCBI
*basic data submission procedures and background information
*how to search the entire NCBI site, as well as just the subset of Entrez databases

PubMed


*basic, advanced, and Boolean search methods
*additional searching methods like the Entrez Global query and the MeSH query
*tips to understand the visual cues and displays
*to use My NCBI to customize your results and save searches which can be run and emailed regularly

Gene Expression Omnibus (GEO)

*efficient ways to query GEO for specific genes or experimental designs
*how to navigate through GEO output displays to find the specific information you want
*how to navigate GEO’s complex data architecture to search GEO by specific record types

To find out more about these and over 85 other tutorial suites visit the OpenHelix Catalog and OpenHelix. Or visit the OpenHelix Blog for up-to-date information on genomics and genomics resources.

About OpenHelix
OpenHelix, LLC, (www.openhelix.com) provides a bioinformatics and genomics search and training portal, giving researchers one place to find and learn how to use resources and databases on the web. The OpenHelix Search portal searches hundreds of resources, tutorial suites and other material to direct researchers to the most relevant resources and OpenHelix training materials for their needs. Researchers and institutions can save time, budget and staff resources by leveraging a subscription to nearly 100 online tutorial suites available through the portal. More efficient use of the most relevant resources means quicker and more effective research.


I have a vague memory of reading about COBALT a while back, but at the time it was an executable file to download and I think I put it away as “to do.”  Well, a couple days ago I was over at the NCBI BLAST site for something (tip of the week?), and noticed there was a “new” flash for COBALT. So, COBALT is now integrated as a web-tool on the NCBI site. The short description of what COBALT is, from the site:

COBALT is a multiple sequence alignment tool that finds a collection of pairwise constraints derived from conserved domain database, protein motif database, and sequence similarity, using RPS-BLAST, BLASTP, and PHI-BLAST.
Pairwise constraints are then incorporated into a progressive multiple alignment.

I haven’t tried it out yet or compared it to other multiple sequence alignment tools, but thought I’d point it out to those who haven’t yet noticed it.

Tip of the Week: TARGeT

Today’s tip is on TARGeT. TARGeT is, as the title of the paper in this year’s NAR issue states, “a web-based pipeline for retrieving and characterizing gene and transposable element families from genomic sequences.” There are several things you can do at TARGeT. Its main function is to quickly obtain gene and transposon families from a query sequence, using BLAST, PHI-BLAST, MUSCLE and TreeBest. The tip today is a quick intro to the tool and a search on an R1 non-LTR transposon.

RNAi to save the bees?

So the other day on a political board, actually, I heard about a treatment that may be helping to fight Colony Collapse Disorder (CCD) in bees. It had originally been posted by Treehugger. So the blogs have been abuzz with the news. It seems there is an Israeli company called Beeologics that has developed a product called Remembee which can knock down the IAPV (Israeli acute paralysis virus) that is purported to be one of the contributors to CCD.

The technique involves RNAi, RNA interference. Now, this is the same technology that many environmentalists decry wildly in their Hawaiian papayas. It was the same biotechnology that was awarded the Nobel Prize a few years back. If you aren’t familiar with it, there’s a reasonable intro diagram on the Nobel Prize site that covers it. Slides 3 and 4 in that graphic are helpful.

There’s a fairly goofy YouTube interview about the company and their efforts:

But that wasn’t enough for me; I wanted to read a paper about this. And PubMed to the rescue: I found a paper that appears to describe what they are doing.

According to the paper, they pulled out some sequences for the IAPV and for honeybees. They also did a GFP control for a sequence unrelated to bees or IAPV. They are required by the EPA to demonstrate that the sequences they will use to interfere with the RNA of the virus don’t match the bee genome, to eliminate off-target effects. They did this with BLAST.
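To get a feel for what that kind of off-target screen is asking, here is a minimal sketch in Python. To be clear, this is not BLAST and not what the authors ran: it only looks for exact shared stretches of a fixed length between a candidate sequence and a host sequence, on either strand, with no scoring and no tolerance for mismatches or gaps. The function names and the default window of 21 bases are my own assumptions for illustration, and the sequences are made-up toy data.

```python
def normalize(seq):
    """Uppercase a sequence and write RNA 'U' as DNA 'T' so both compare equal."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq):
    """Return the reverse complement of a sequence (A, C, G, T/U, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(normalize(seq)))


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    seq = normalize(seq)
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmers(candidate, host, k=21):
    """Return k-mers of `candidate` found exactly in `host` on either strand.

    k=21 is an illustrative assumption, not a value taken from the paper.
    """
    host_kmers = kmers(host, k) | kmers(reverse_complement(host), k)
    return kmers(candidate, k) & host_kmers


# Toy data only: a short made-up "host" and two made-up candidates.
toy_host = "ATGGCGTACGTTAGCCTAGGATCCGATTACA"
overlapping = "TTTT" + toy_host[5:17] + "CCCC"  # deliberately contains host sequence
unrelated = "ACACACACACACACACACAC"

print(sorted(shared_kmers(overlapping, toy_host, k=8)))
print(shared_kmers(unrelated, toy_host, k=8))
```

An empty result here only means there are no exact shared windows of that length. A real screen against a whole genome would use BLAST or a similar tool, which also reports imperfect matches.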

They inoculated control and experimental colonies with virus. They generated dsRNA that they fed to bees. The bees that got the IAPV infection + dsRNA IAPV had mortality curves that were slightly lower than, but parallel to, those of the untreated bees. Those that got IAPV + GFP or IAPV only showed much steeper population declines. Seemed pretty straightforward: feeding the dsRNA to bees was protective.

They acknowledge that IAPV may not be the only component of CCD, but that even improving the health of a bee colony may help it defeat other stresses too.

I haven’t found any subsequent papers on the field trials yet, that would be great to see.  But it looks like they are definitely in the field with this.  And I’m delighted to see the environmentalists appreciate the use of biotechnology.  Gives me a warm, buzzy feeling :)

Maori, E., Paldi, N., Shafir, S., Kalev, H., Tsur, E., Glick, E., & Sela, I. (2009). IAPV, a bee-affecting virus associated with Colony Collapse Disorder can be silenced by dsRNA ingestion. Insect Molecular Biology, 18 (1), 55-60. DOI: 10.1111/j.1365-2583.2009.00847.x

Teaching and annotating at the same time

A recent paper (a couple of weeks ago) in PLoS Biology from Hingamp et al. had me intrigued. Entitled Metagenome Annotation Using a Distributed Grid of Undergraduate Students, it describes a system the lecturers put together to teach bioinformatics to undergraduates using new unannotated sequences from metagenome projects. As stated in the announcement,

This method asks students to randomly pick and analyze unknown metagenomic DNA fragments from a real research sequence stockpile. The student’s mission, using Internet tools only, is to figure out from which organism the DNA comes from, and what biological function it might have. As well as gaining confidence and proficiency in bioinformatics, students experience the authentic research process of weighing the arguments, establishing prediction reliability, building hypotheses, and maintaining rigorous discourse.

The lecturers have put together a teaching-annotation procedure in a publicly accessible “annotation environment” they call “Annotathon.” This web interface walks the student through the annotation process, following the procedure you see in the figure here. Since you can join and use this interface, I thought I’d give it a test drive.


Bioinformaticist survey results

Mary wrote about this Bioinformaticist Career survey previously and now the results are out (that’s some analysis, here are the full results).

Looking through the results and discussing it briefly with others, I would say that there is nothing particularly surprising from my experience in the bioinformatics field. The survey reflects what I see today. What is notable is that many bioinformaticists in the survey have commented that they see the future of bioinformatics moving toward more integration with other fields and more cross-disciplinary needs, as illustrated by this comment by one (and many others):

“bioinformatics will be incorporated into cross-disciplinary work by scientists that will learn to use computational tools and insights as a commonplace part of their experiments, part in silico, part wet bench”

I would agree with that, though of course I could well be wrong :), and along with that will come less need for ‘bioinformaticists’ and more for biologists trained in bioinformatics. A quick look at some of the analysis is also not particularly surprising: BLAST is far and away the most cited application in use (lots of alignment and phylogeny stuff there too), as can be seen in the chart (left) on the results analysis site. Asked which web application/site they use most, it’s an interesting mixture of utility sites like Google Docs (hmm, maybe there’s something I should look at there :), Gmail, Twitter and WordPress, and biological resources like NCBI, UCSC Genome Browser, Ensembl, BLAST, Connotea and PubMed. I think the questions here might have been designed better. Those are two entirely different categories, and I would love to see the results (of course ;) for the resources if that were more explicitly asked. Perhaps next survey (btw, that those 5 biological resources made it into the graph is no surprise to me).

It will be interesting to go through this data in some more detail!

New Online Tutorials for Sequence Similarity Search Tools BLAST and FASTA

OpenHelix today announced the availability of new tutorial suites on two highly used sequence similarity search resources: BLAST and FASTA. BLAST, from the National Center for Biotechnology Information (NCBI) at NIH, and FASTA, accessed through a web interface at the European Bioinformatics Institute (EBI), are both excellent and widely used tools for finding sequence similarities for proteins and nucleic acids. The tutorial suites, available for single purchase or through a low-priced yearly subscription to all OpenHelix tutorials, contain a narrated, self-run, online tutorial, slides with full script, handouts and exercises. With the tutorials, researchers can quickly learn to effectively and efficiently use these resources. These tutorials will teach users:

  • the principles of sequence comparisons
  • basic explanations of scoring matrices and alignments
  • main features of the FASTA and BLAST algorithms
  • how to perform a sequence similarity search
  • how to view and interpret the similarity results
  • how to find additional biological information about matching sequences

To find out more about these and other tutorial suites visit the OpenHelix Tutorial Catalog and OpenHelix or visit the OpenHelix Blog for up-to-date information on genomics.

About OpenHelix
OpenHelix, LLC, (http://www.openhelix.com) provides the genomics knowledge you need when you need it. OpenHelix currently provides online self-run tutorials and on-site training for institutions and companies on the most powerful and popular free, web based, publicly accessible bioinformatics resources. In addition, OpenHelix is contracted by resource providers to provide comprehensive, long-term training and outreach programs.

So long SSAHA

One of the reasons we started this blog (and company) is that not only is the number of genomics databases, analysis tools and resources rising dramatically, they are also in constant flux. New resources are born constantly (the graph on the right shows the rise in the number of resources listed in the annual NAR database issue), current resources are continually being updated and changed, resources merge (UniProt) and some just fade away.

I bring that up because Ensembl actually is not undergoing an update, having a major update and part of it is fading away… all at the same time.

Demise of the NCBI Field Guide

For funding reasons, NCBI (home of PubMed, BLAST, dbSNP, OMIM and more) has cut their outreach staff, canceled all onsite training seminars and this has to mean decreased support for online help, documentation and tutorials.

When we wrote our NIH grant, one of the models of success in the bioinformatics training area that we highlighted was the NCBI Field Guide program. For those who may be unfamiliar with it, it is a set of training modules delivered by the outreach team at NCBI. They would come to your site, cover many NCBI tools and do hands-on workshops. Another course (Enhanced Field Guide) drew science librarians and other trainers together to train them, and those folks could go back to their institutions and offer more-and-better searches and training for their constituents. We thought the Guides were a terrific group of people who were interested in people getting their hands on the myriad tools at NCBI and using them effectively. It wasn’t really a competitive situation: their remit was only for NCBI tools, and there were plenty of others out there for us to do. In fact, many people who contacted us for training did so because their local users enjoyed the NCBI training and they wanted similar engagements for other tools.

Recently, though, the calls changed. We found we were getting calls from people who said they weren’t going to be able to get any more Field Guide trainings. NCBI is discontinuing the outreach program. Quite frankly, we were surprised. A sample of the notifications people were getting: http://www.library.uiuc.edu/blog/bicnews/archives/2008/02/ncbi_field_cour.html

Unfortunately, that tremendous training opportunity will NOT occur. Yesterday NCBI Field Guide coordinator, Peter Cooper, sent the following email:

Because of budgetary constraints, NCBI has made reductions in some of its programs, and the education programs are affected. In fact, all outreach education programs (Field Guide, Mini-courses, Structures, PubChem) are terminated effective immediately. At this point we cannot reschedule this course or accept requests for future courses of any kind. This was as much a surprise to me as it is to you. Feel free to contact me if you have questions.

The Field Course, as well as the Mini-Courses and the Structure course, has been tremendously popular and useful (see list of sites where the Field Course has been offered recently), but the NCBI budget situation will not allow NCBI to continue to travel and offer these courses for the foreseeable future.

(emphasis mine)

Here’s a link to a similar letter at another location: http://www.twu.edu/as/bio/NCBI/FieldGuide/

We’ve confirmed this with a number of people directly involved; they have laid off nearly all of the outreach team. Some got reassigned. There can hardly be anyone there to even answer emails to the helpdesk anymore—and they get lots of emails every day.

I’ve been through layoffs before, a few times. It actually feels like a punch to the gut when I hear about it anywhere else—especially among people I know. I expect layoffs at companies, though. But if there was any group that was solidly in place, going to be around for a long time, I would have thought it was the NCBI outreach team. I’m quite sorry to hear that it has been dissolved.

In this time of so many resources & so much need for increased understanding, outreach has become an integral part of a resource’s success; fewer instructional resources are an unfortunate consequence of decreased funding for science.