Tag Archives: SGD

Tip of the Week: Yeast genome? There is an app for that!

GBrowse navigation basics tutorial from yeastgenome on Vimeo.

The Saccharomyces Genome Database (SGD) has several short video tutorials that introduce basic navigation (shown above), expression data and more. Each of these tutorials is short, 1-2 minutes, and there are 21 of them (15 on YeastMine alone). If you want to go further in depth, we also have a tutorial on the Saccharomyces Genome Database (SGD) (subscription) that is about an hour long, modular and includes exercises. We have also done tips on YeastMine and other SGD-related tools (open access). You can also find a tutorial at OpenHelix on GBrowse, which is the browser used at SGD. And there is this short 5-minute GBrowse video tutorial.

So, a yeast researcher has no lack of video training on using SGD.

But, today I wanted to introduce you to SGD’s new app for the mobile researcher: YeastGenome App. The app has some pretty decent functionality. As their FAQ enumerates:


“[Search the] Saccharomyces Genome Database by gene name or keyword to find fundamental information about your favorite gene. Browse the database by feature type and quickly view fundamental information, sequence information, Gene Ontology, interactions, phenotypes, and references associated with the terms.

Does the app have any other special features?
Yes, you can use the app to save your favorite genes in a convenient list. You can also e-mail yourself or your friends and colleagues any information you find about yeast genes in the Saccharomyces Genome Database.”

I’ve had a go at the app for a bit, and it makes browsing and searching yeast genome data pretty convenient and easy. The app was reported in this year’s Database issue of NAR, and the paper gives a good rundown of it. Don’t need convincing? Then you can go right to iTunes and get it now.

But this reminds me: I did a feature on mobile apps for genomics research last year that reviewed GeneWall, Wowser and MyGenome, and the year before that I introduced an app for “moving molecules”. This new app, and several I’ve seen since those posts, suggest that perhaps it’s time to do a new post on mobile apps available for genome research. Perhaps that will be the next tip of the week from me in a few weeks.

On a Mission for Protein Information

It’s probably just the human brain’s ability to connect dots & find patterns, but it can be interesting how many “unrelated” events and information bits accumulate in my head & eventually get mulled into an idea or theory. Take, for example, a recent biotech mixer, bits from an education leadership series & a past Nature article. Each “event” has been meandering in my mind, and now they are finding their way out as this blog post.

OK, now the explanation: At a recent local biotech event I heard about a company (KeraNetics) purifying keratin proteins & using them to develop therapeutic and research applications. The company & their research sounded very interesting & because a lot of it is aimed at aiding wounded soldiers, it also sounded directly beneficial. The talk was short, only about 20 minutes, so there wasn’t a lot of time for details or questions. I decided I’d venture forth through many of the bioscience databases and resources that I know and love, in order to learn more about keratin.

My quest was both fun and frustrating because of the nature of the beast: keratin is “well known” (i.e. it comes up in high school academic challenge competitions ‘a lot’, according to someone in the know), but it is hard to work with (i.e. tough, insoluble, fibrous structural proteins), and it is hard to find much general information on it in your average protein database (because it is made of many different gene products, all referred to as “keratin”). I decided to begin my adventure at two of my favorite protein resources, PDB & SBKB, but I found no solved structures for keratin. Because of the way model organism databases are curated and organized, I often begin a protein search there, just to get some basic background, gene names, sequence information, etc. I (of course) found nothing other than a couple of GO terms in the Saccharomyces Genome Database (SGD), but I found hundreds of results in both Mouse Genome Informatics (MGI) (660 genomic features) and Rat Genome Database (RGD) (162 rat genes, 342 human genes). I also found gene names (Krt*), sequences and many summary annotations with references to diseases with links to OMIM. When I queried for “keratin”, I got 180 hits in OMIM (including 61 “clinical synopses”), 505 reviewed entries and 2,435 unreviewed entries in UniProt, 10,611 results in Entrez Protein, and 26,430 articles (including 1,707 reviews) in PubMed. I sated my curiosity about KeraNetics’ research with a PubMed advanced search for keratin in the title or abstract & the PI’s name as author (search = “(keratin[Title/Abstract]) AND Van Dyke[Author]”).
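For anyone who would rather script that last search than click through the advanced search form, here is a minimal sketch. It assumes NCBI's public E-utilities ESearch endpoint; the helper names (`build_pubmed_query`, `build_esearch_url`) are made up for illustration, and the code only builds the request URL rather than sending anything.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (public; no request is sent in this sketch).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(keyword, author):
    """Combine a title/abstract keyword and an author name using PubMed field tags."""
    return f"({keyword}[Title/Abstract]) AND {author}[Author]"


def build_esearch_url(term, retmax=20):
    """Return an ESearch URL for a PubMed query term; the caller decides whether to fetch it."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Hypothetical helper names; the query string itself is the one used in the post.
term = build_pubmed_query("keratin", "Van Dyke")
url = build_esearch_url(term)
print(term)
print(url)
```

Pasting the printed URL into a browser should return the matching PubMed IDs, though the hit counts will have changed since this post was written.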

I ended up with a lot of information leads that I could have hunted through, but it was a fun process in which I learned a lot about keratin. This is where the education stuff comes in. I’ve been seeing a lot of studies go by talking about reforming education to be more investigation driven, and I can totally see how that can work. “Learning” through memorization & regurgitation is dry for everyone & rough for the “memory challenged”, like me. When I have a reason or curiosity to explore, with a new nugget of data or understanding lurking around each corner, the information just seems to get in better & stay longer. (OT, but thought I’d mention a related site that I found today w/ some neat stuff: Mind/Shift-How we will learn.)

And I could have done the advanced PubMed search in the beginning, but what fun would that have been? Plus there is a lot that I learned about keratin from what I didn’t find, like that there wasn’t a plethora of PDB structures for keratin proteins. That brings me to the final dot in my mullings: an article that I came across today as I worked on my reading backlog, “Too many roads not taken”. If you have a subscription to Nature you can read it, but the main point is that researchers are still largely focusing on the same set of proteins that they have been for a long time, because these are the proteins for which there are research tools (antibodies, chemical inhibitors, etc). This same sort of philosophy is fueling the Protein Structure Initiative (PSI) efforts, as described here. Anyway, I found the article interesting & agree with the authors’ general suggestions. I would, however, extend it beyond these physical research tools & say that going forward researchers need more data analysis tools, and training on how to use them. But I would, wouldn’t I? :)


  • Sierpinski, P., Garrett, J., Ma, J., Apel, P., Klorig, D., Smith, T., Koman, L. A., Atala, A., & Van Dyke, M. (2008). The use of keratin biomaterials derived from human hair for the promotion of rapid regeneration of peripheral nerves. Biomaterials, 29 (1), 118-128. PMID: 17919720
  • Edwards, A., Isserlin, R., Bader, G., Frye, S., Willson, T., & Yu, F. (2011). Too many roads not taken. Nature, 470 (7333), 163-165. DOI: 10.1038/470163a

Friday SNPpets

Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

Tip of the Week: A year in tips III (last half of 2010)

As you may know, we’ve been doing tips-of-the-week for three years now. We have completed around 150 little tidbit introductions to various resources. We’ve established a sort of holiday tradition at the end of the year: a summary post that collects them all. If you have missed any of them, it’s a great way to have a quick look at what might be useful to your work.

Here are the tips from the first half of the year, and below you will find the tips from the last half of 2010 (you can see past years’ tips here: 2008 I, 2008 II, 2009 I, 2009 II):


July 7: Mint for Protein Interactions, an introduction to MINT to study protein-protein interactions
July 14: Introduction to Changes to NCBI’s Protein Database, as it states :D
July 21: 1000 Genomes Project Browser, the 1000 Genomes project has pilot data out, and this is the browser.
July 28: R Genetics at Galaxy, the Galaxy analysis and workflow tool added R genetics analysis tools.


August 4: YeastMine, SGD adds an InterMine capability to their database search.
August 11: Gaggle Genome Browser, a tool to allow for the visualization of genomic data, part of the “gaggle components”
August 18: BRENDA, comprehensive enzyme information.
August 25: Mouse Genomic Pathology, unlike other tips, this is not a video but rather a detailed introduction to a new website.


September 1: Galaxy Pages, an introduction to the new community documentation and sharing capability at Galaxy.
September 8: Varietas. A Plaid Database. A resource that integrates human variation data such as SNPs and CNVs.
September 15: CircuitsDB for TF/miRNA/gene regulation networks.
September 21: PathCase for pathway data.
September 29: Comparative Toxicogenomics Database (CTD), VennViewer. A new tool to create Venn diagrams to compare associated datasets for genes, diseases or chemicals.


October 6: BioExtract Server, a server that allows researchers to store data, analyze data and create workflows of data.
October 13: NCBI Epigenomics, “Beyond the Genome” NCBI’s site for information and data on epigenetics.
October 20: Comparing Microbial Databases including IMG, UCSC Microbial and Archaeal browsers, CMR and others.
October 27: iTOL, interactive tree of life


November 3: VISTA Enhancer Browser, explore possible regulatory elements with comparative genomics.
November 10: Getting canonical gene info from the UCSC Browser. Need one gene version to ‘rule them all’?
November 17: ENCODE Data in the UCSC Genome Browser, an entire 35-minute tutorial on the ENCODE project.
November 24: FLink. A tool that links items in one NCBI database to another in a meaningful and weighted manner.


December 1: PhylomeDB. A database of gene phylogenies of many species.
December 8: BioGPS for expression data and more.
December 15: RepTar, a database of miRNA target sites.

Tip of the Week: YeastMine

For this week’s tip I would like to take you over to the Saccharomyces Genome Database (SGD) & from there try out the beta release of YeastMine. YeastMine is based on the InterMine open source data warehouse system. We’ve featured other incarnations of InterMine, such as RatMine from RGD in this tip and modMine (associated with the modENCODE project) in this tip, so you’ve already seen some of its capabilities. The aspect that I want to focus on when we look at YeastMine is the interoperability of InterMine resources.

Mary noticed the beta YeastMine release notice first & mentioned it to me. When I got over to SGD, not only did I see the notice on YeastMine, but also noticed that they are now linking to GeneMANIA in some of their interaction resources. I think that’s cool because soon we will be releasing a new GeneMANIA-sponsored tutorial. I’ll head back over to SGD & maybe do a tip on that too, some other day, but for now enjoy today’s tip on how to make a gene list and then link to gene homolog information on FlyMine, the FlyBase/Drosophila version of InterMine.

Tip of the Week: RatMine

RatMine is a ‘data warehouse’ that allows the user to construct queries across different areas of biological knowledge, from SNPs to pathways. It’s developed by the people at RGD and uses InterMine, a project developed for FlyMine, as part of a project between RGD, SGD and ZFIN to implement InterMine for these databases and “develop new methods of interoperability for cross-organism research.” We’ve mentioned InterMine before, and it’s also used in modENCODE. InterMine is going to have to be the subject of a later post, I think :).

This tip is actually a video done by the RGD group, one of those gems I’ve found at SciVee in our attempts to integrate our tips at SciVee (which will be coming). We will occasionally highlight a short tutorial done by someone else here in our tips, and since I’ve found this gem and just got back from vacation in Florida :)…
Btw, while you are at it, you might want to check out this interesting set of tutorials on biomedical ontologies.

Pointing us out at Genome.gov :)

NHGRI recently pointed out our new set of tutorials on model organism databases (funded mainly by NHGRI :) on their home page, genome.gov. Always nice to be recognized :D.

And it gives me the opportunity to again point out that we do indeed have seven publicly available tutorials and training materials (slides, exercises, etc) on model organism databases including SGD, RGD, MGI, WormBase, FlyBase and ZFIN… and a seventh on GBrowse, a generic genome browser used by some of these and other genome databases.

Check them out (and fill out the new poll to the left) :D.

Tip of the Week: Model Organism Database tutorials

For the tip of the week today, we’d like to point out a number of new (free to you) tutorials on model organism database resources. These seven tutorials (each includes a Flash movie tutorial, slides for downloading, exercises and handouts) were partly funded by an NHGRI grant. We just put out a press release on this, but I thought the Tip of the Week would be a great place to introduce you to these tutorials. We have seven tutorials that are (or will soon be) publicly available (this link takes you to a list and links to all these tutorials). The first four available are on GBrowse, WormBase, RGD (Rat Genome Database) and MGI (Mouse Genome Informatics). GBrowse (the tutorial linked to here) was developed by the Generic Model Organism Database (GMOD) project and is a great tool to develop genome browsers for model and research organisms. Many model organism databases use GMOD resources in full or part, including many of the ones we have tutorials on here. Three more will be coming very soon on ZFIN (Zebrafish), FlyBase (Drosophila) and SGD (yeast). Check them out :).

Free Tutorials on Model Organism Genomic Databases Released by OpenHelix

OpenHelix today announced the free availability of tutorial suites on model organism databases and resources used extensively in research. The first tutorial suites available are GBrowse, Rat Genome Database (RGD), Mouse Genome Informatics (MGI), and WormBase. To be added in the coming weeks are Zebrafish Information Network (ZFIN), FlyBase and Saccharomyces (Yeast) Genome Database (SGD).

The tutorial suites, funded in part by a grant from the National Human Genome Research Institute of the National Institutes of Health, include a self-run, narrated tutorial introducing the resource and how to use its features and functions. Each suite also includes PowerPoint slides, handouts, and exercises that can be used for reference or for training others.

One of the first tutorials available is on GBrowse, developed by the Generic Model Organism Database (GMOD) project, a popular tool used by researchers to develop genome browsers for model organisms, species of interest, and particular topics. By learning how to use this “generic” genome browser, you can leverage that knowledge to use dozens of resources devoted to a wide range of research areas.

“The OpenHelix GBrowse user tutorial is very well done and will be an excellent resource for the many research communities that use GBrowse to visualize genomic data,” said Dave Clements of the National Evolutionary Synthesis Center who runs the GMOD help desk.

Model organisms, such as yeast, mouse, rat, flies, and many others, have long been used by researchers to expand our understanding of biology and to assess the effectiveness and safety of therapies before going to human trial. Many of the genomes of these organisms have been completely sequenced, giving the scientific community even greater insight into the organisms and their relation to human biology. The genome data is now available and searchable on publicly available online databases and resources.

You can view the Model Organism tutorials at http://www.openhelix.com/model_organisms.shtml. OpenHelix provides over 60 other tutorial suites on a number of genomic databases and resources through an individual, group, or institutional subscription. Further information can be found at www.openhelix.com.

About OpenHelix
OpenHelix, LLC, (www.openhelix.com) provides the genomics knowledge you need when you need it. OpenHelix provides online self-run tutorials and on-site training for institutions and companies on the most powerful and popular free, web based, publicly accessible bioinformatics resources. In addition, OpenHelix is contracted by resource providers to provide comprehensive, long-term training and outreach programs.

Tip of the Week: cast a SPELL and find your genes!

Today’s tip is on a resource named SPELL, which stands for ‘Serial Pattern of Expression Levels Locator’. You enter a small set of gene names. The SPELL search engine will analyze expression datasets collected from GEO, ArrayExpress, SMD, etc. and tell you which datasets are most informative for your genes. It will also return a list of other genes with similar expression profiles. The results link to the original publications and to gene summaries at SGD, and provide you with a list of over-represented GO terms. I found SPELL through SGD’s Expression Connection (a wonderful resource in and of itself), and liked what I saw. Plus the name ‘SPELL’ is sort of appropriate for Halloween, which is fast approaching. Read more about the resource here, which is also referenced below.
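To give a feel for the "other genes with similar expression profiles" idea, here is a toy sketch that ranks genes by their mean Pearson correlation with a query set. This is a simplified stand-in and not SPELL's actual method, which also weighs datasets by how informative they are for the query. The gene names and expression values are made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def rank_similar_genes(profiles, query_genes):
    """Rank non-query genes by their mean correlation with the query genes."""
    scores = {}
    for gene, profile in profiles.items():
        if gene in query_genes:
            continue
        corrs = [pearson(profile, profiles[q]) for q in query_genes]
        scores[gene] = sum(corrs) / len(corrs)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical genes with made-up expression values across four conditions.
toy_profiles = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [1.5, 2.5, 3.5, 4.5],
    "GENE_D": [4.0, 3.0, 2.0, 1.0],
}

ranking = rank_similar_genes(toy_profiles, ["GENE_A", "GENE_B"])
for gene, score in ranking:
    print(gene, round(score, 3))
```

In this toy data GENE_C rises and falls with the query genes and GENE_D runs opposite to them, so GENE_C ranks first.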

M. A. Hibbs, D. C. Hess, C. L. Myers, C. Huttenhower, K. Li, O. G. Troyanskaya (2007). Exploring the functional landscape of gene expression: directed search of large microarray compendia. Bioinformatics, 23 (20), 2692-2699. DOI: 10.1093/bioinformatics/btm403