Tag Archives: splicing

Tip of the Week: SKIPPY predicting variants w/ splicing effects

More and more disease-causing mutations are being identified in exonic splicing regulatory sequences (ESRs). These disease effects can result from ESR mutations that cause exon skipping in functionally diverse genes. In today’s tip I’d like to introduce you to a tool designed to detect exon variants that modulate splicing. The tool is named SKIPPY and has been developed and is maintained by groups in the Genomic Functional Analysis research section of the NHGRI.

At the end of the post I cite a very well-written paper describing the development of SKIPPY, as well as the background on why the tool was developed. I won’t have time to go into all those details, but if you are interested the paper is freely available from Genome Biology. The site also has nice, clear documentation and example inputs, which I will use as my examples. Splicing can be modulated in a variety of ways, including the loss or gain of exonic splicing enhancers (ESEs) or silencers (ESSs). Variants accomplishing either of those are referred to as splice-affecting genome variants, or SAVs. Not all of the abbreviations are explained on the results page, as you will see in the tip, but all are explained in detail in the SKIPPY publication, and in the ‘Methods and Interpretations’ and ‘Quick Reference and Tutorial’ areas of the site.
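To make the "loss or gain" idea concrete, here is a toy sketch in Python. It is not SKIPPY's algorithm: the function names and the two motif sets are hypothetical placeholders I made up for illustration, and real ESE/ESS hexamer sets would have to come from the literature. It simply slides a 6-base window over the variant position and counts windows where a motif appears or disappears.

```python
def windows(seq, pos, k=6):
    """Return every k-mer in seq that covers the 0-based index pos."""
    start = max(0, pos - k + 1)
    end = min(pos, len(seq) - k)
    return [seq[i:i + k] for i in range(start, end + 1)]


def esr_changes(seq, pos, alt, enhancers, silencers):
    """Count enhancer/silencer motifs lost or gained by one base change."""
    ref_seq = seq.upper()
    alt_seq = ref_seq[:pos] + alt.upper() + ref_seq[pos + 1:]
    changes = {"ese_lost": 0, "ese_gained": 0, "ess_lost": 0, "ess_gained": 0}
    # Compare the reference and variant k-mer at each window position.
    for ref_k, alt_k in zip(windows(ref_seq, pos), windows(alt_seq, pos)):
        for label, motifs in (("ese", enhancers), ("ess", silencers)):
            if ref_k in motifs and alt_k not in motifs:
                changes[label + "_lost"] += 1
            elif alt_k in motifs and ref_k not in motifs:
                changes[label + "_gained"] += 1
    return changes


# Hypothetical placeholder motifs, NOT a curated ESE/ESS set.
toy_enhancers = {"GAAGAA"}
toy_silencers = {"TAGAAG"}

# A>T change at index 4 of a made-up exon fragment.
result = esr_changes("TTTGAAGAAGTTT", 4, "T", toy_enhancers, toy_silencers)
print(result)
```

In this made-up fragment the single change removes the one toy enhancer hexamer and creates the one toy silencer hexamer, which is the kind of combined effect that could plausibly push an exon toward skipping.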

I first found the tool because it was mentioned in a nice review entitled “Using bioinformatics to predict the functional impact of SNVs”, which reviews mechanisms by which point mutations can affect function, describes many of the algorithms and resources available, and provides some sage advice. I’ll post more on it in a later post. For now, check out the tip & the SKIPPY resource, and if you use the site please let us know what you think.

Woolfe, A., Mullikin, J., & Elnitski, L. (2010). Genomic features defining exonic variants that modulate splicing. Genome Biology, 11 (2). DOI: 10.1186/gb-2010-11-2-r20

Cline, M., & Karchin, R. (2010). Using bioinformatics to predict the functional impact of SNVs. Bioinformatics. DOI: 10.1093/bioinformatics/btq695

Gene expression and SNPs…very neat stuff

A question on the blog last week got me going through my old posts, because I was sure that I had done one on a database of SNP effects on gene expression. But it turned out that the post existed only in my memory, and was still sitting in the draft posts for the blog….

I had come across the work on Genomeweb here:

Duke Team Finds Variants Linked to Tissue-Specific Gene Expression, Splicing

A team of Duke University researchers used a genome-wide screen to find interactions between genetic variants, gene expression, and alternative splicing in blood and brain tissue. In doing so, they found extensive between-tissue differences in SNP effects — only about half of the polymorphisms had common effects in both tissues tested. The team is starting to catalogue the data on the effects that specific genetic variants have on gene expression and splicing in various tissues.

So of course I went looking for the paper and the catalog….

Tissue-Specific Genetic Control of Splicing: Implications for the Study of Complex Traits by Heinzen et al. The paper is from PLoS Biology last December, and they introduce some QTLs that were new to me: eQTLs, for expression quantitative trait loci, and sQTLs, for splicing ones. They interrogate exon-based microarrays and look for possible effects of nearby SNPs. I think this approach has some limitations that they concede (you can’t tell exactly which transcript may be affected, just that there is likely an effect on that gene’s expression). I also think that known exons do not represent the alternative splicing world completely yet. There are probably a lot of rare temporal and spatial transcription events that aren’t captured in the public databases yet, and won’t be represented in the tissue types selected. But I think it was a nice attempt to ask the question, and I’m sure more tissues will be explored over time.
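If the eQTL/sQTL idea is new to you too, here is a minimal Python sketch of the general kind of test involved. It assumes a simple additive model, where expression is regressed on the number of copies of one allele (0, 1, or 2). That model, the function name, and the numbers are my own illustration, not the analysis from the Heinzen et al. paper. For an sQTL-style question you would run the same sort of test on a single exon's signal rather than on the whole transcript.

```python
from math import sqrt


def additive_association(dosages, expression):
    """Least-squares slope and Pearson r of expression on allele dosage."""
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need matched lists with at least 3 samples")
    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    syy = sum((e - mean_e) ** 2 for e in expression)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("no variation in genotype or expression")
    slope = sxy / sxx           # expression change per extra allele copy
    r = sxy / sqrt(sxx * syy)   # strength of the linear relationship
    return slope, r


# Made-up data: six samples, allele counts and probe intensities.
toy_dosages = [0, 0, 1, 1, 2, 2]
toy_expression = [5.0, 5.2, 6.1, 5.9, 7.0, 7.2]

slope, r = additive_association(toy_dosages, toy_expression)
print(f"slope per allele: {slope:.2f}, correlation: {r:.2f}")
```

A real genome-wide screen would repeat this across many SNP and probe pairs, so it would also need a significance test and a multiple-testing correction, which this sketch leaves out.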

The resource you can explore that has this data is called SNPExpress, and the introduction states:

SNPExpress is a database and its user interface that we developed to permit interrogation of the effects of common SNPs on exon and transcript level expression, in two different human tissues: brain and PBMC ( Peripheral Blood Mononuclear Cell ).

So if this type of data is of interest to you, please check out their paper and their database.  They also have a related tool attached to SNPExpress that is a WGA viewer that might be of interest.

Specific links from post:

GenomeWeb article http://www.genomeweb.com/issues/news/151467-1.html

SNPExpress: http://people.genome.duke.edu/~dg48/SNPExpress/

WGA viewer: http://people.genome.duke.edu/~dg48/WGAViewer/

SNPs in splicing paper: http://biology.plosjournals.org/perlserv/?request=get-document&doi=10.1371/journal.pbio.1000001&ct=1

Heinzen, E., Ge, D., Cronin, K., Maia, J., Shianna, K., Gabriel, W., Welsh-Bohmer, K., Hulette, C., Denny, T., & Goldstein, D. (2008). Tissue-Specific Genetic Control of Splicing: Implications for the Study of Complex Traits PLoS Biology, 6 (12) DOI: 10.1371/journal.pbio.1000001


UPDATE: (6/13/2012) I just noticed that the links to this software weren’t working, so I checked with the team. The new link for SNPExpress is: http://compute1.lsrc.duke.edu/softwares/SNPExpress/

WGAViewer is http://compute1.lsrc.duke.edu/softwares/WGAViewer/

New and Updated Online Tutorials for ASTD, Entrez Protein and MMDB

Comprehensive tutorials on the ASTD, Entrez Protein, and MMDB databases enable researchers to quickly and effectively use these invaluable variation resources.

Seattle, WA, September 24, 2008 - OpenHelix today announced the availability of new tutorial suites on the Alternative Splicing and Transcript Diversity (ASTD) database, Entrez Protein and the Molecular Modeling Database (MMDB). ASTD is a European Bioinformatics Institute (EBI) resource for alternative splice events and transcripts for the human, mouse, and rat systems. Entrez Protein is a comprehensive database of protein information brought to you by the National Center for Biotechnology Information (NCBI). MMDB is another NCBI resource, which contains an extensive collection of three-dimensional protein structures with detailed annotation that can be used to learn about the structure and function of many proteins. Together these three tutorials give the researcher an excellent set of resources to carry their research from transcript to 3D protein structure.

The tutorial suites, available for single purchase or through a low-priced yearly subscription to all OpenHelix tutorials, contain a narrated, self-run, online tutorial, slides with full script, handouts and exercises. With the tutorials, researchers can quickly learn to effectively and efficiently use these resources. These tutorials will teach users:


ASTD

  • to perform Quick and Advanced searches
  • to navigate gene and transcript report pages
  • to predict intron/exon boundaries and likely regulatory protein binding sites
  • to search manually curated data regarding alternate splicing

Entrez Protein

  • to perform basic and advanced searches utilizing the many available tools and options
  • to understand the protein records and exploit the many internal and external links you are provided with
  • to explore some of the resources provided by the NCBI network of databases, such as “My NCBI”


MMDB

  • to search MMDB using both basic and advanced query techniques
  • to understand the detailed results you obtain
  • to visualize and manipulate structures using NCBI’s Cn3D structural viewer
  • to locate and view structurally aligned homologs

To find out more about these and other tutorial suites, visit the OpenHelix Tutorial Catalog and OpenHelix, or visit the OpenHelix Blog for up-to-date information on genomics.

About OpenHelix
OpenHelix, LLC, provides the genomics knowledge you need when you need it. OpenHelix currently provides online self-run tutorials and on-site training for institutions and companies on the most powerful and popular free, web-based, publicly accessible bioinformatics resources. In addition, OpenHelix is contracted by resource providers to provide comprehensive, long-term training and outreach programs.