BioStar is a site for asking, answering and discussing bioinformatics questions. We are members of the community and find it very useful. Often questions and answers arise at BioStar that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those questions and answers here in this thread. You can ask questions in this thread, or you can always join in at BioStar.
Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…
I couldn’t figure out the folklore part, but this handy little interface called Matrix2png takes lists and makes a matrix diagram, which looked a bit useful to me: RT @pleonard: Wonder if authors of visualization software for protein sequencing knew it would one day be used on Danish folklore? http://bit.ly/e9HS5v [Mary]
Moving towards integrating genomics into healthcare: RT @eurogene: Combining adv informatics, genomics, consultation, eMERGE 1st step to inc genomic into routine healthcare http://bit.ly/gYtNHl Hat tip Keith Grimaldi [Mary]
This week I’m returning to the exercise wherein I look at tools that analyze lists of genes. As before, I’m taking that list of genes I created some time ago. It was generated as a list of “disease” genes from UniProt. Today I’m taking that list to another resource: DAVID. DAVID is an unfortunate name to Google for this, but it stands for: Database for Annotation, Visualization and Integrated Discovery.
I have to say I was really impressed with the speed, ease, and results of this effort. The list uploaded easily, and DAVID automatically detected the species options and was quick to set to human as the focus. It offered 3 handy viewing option buttons right away and provided informative output that would be really useful in further exploring my list. I had only chosen one of the possible options with default settings. There’s a lot more you can do with DAVID and we cover more of that in our full tutorial. But this quick start movie shows you something of the process and the outcome.
Well, actually, GOEAST young scientist. This week’s tip of the week builds on the data from my previous tip. I had generated a list of genes and I wanted to use that list at a variety of sites to analyze the features of my list. So this week I have tried that at GOEAST.
I used the first 1999 items in my UniProt disease list and uploaded them to GOEAST. The movie shows you the process and a quick look at the outcome.
This view is just a quick example of a basic list upload using their Batch tool. They say the Batch tool algorithm is somewhat different from the pre-loaded microarray gene analysis because of the way the background is calculated. It outputs the GO terms and groups the genes that fit each GO term. There are a number of other features of GOEAST that look intriguing and helpful. It looks like they handle various microarray platforms easily. They have a variety of outputs (web, text file, graphical). I couldn’t always get the graphical output to work (probably my list is too large). I also would have liked to view the list the other way: to have the list of my genes with the terms associated with each of them. I haven’t figured out if there is a way to do that so far.
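The "other way" view mentioned above, genes first with their terms attached, is just an inversion of the term-to-genes grouping. Here is a minimal Python sketch, assuming the results have already been saved into a dictionary keyed by GO term. The function name and the sample data are hypothetical illustrations typed in by hand; they are not GOEAST output and not part of its interface.

```python
from collections import defaultdict

def invert_term_map(term_to_genes):
    """Turn {GO term: [genes]} into {gene: [GO terms]}, with terms sorted."""
    gene_to_terms = defaultdict(set)
    for term, genes in term_to_genes.items():
        for gene in genes:
            gene_to_terms[gene].add(term)
    return {gene: sorted(terms) for gene, terms in gene_to_terms.items()}

# Hypothetical example data, for illustration only.
example = {
    "GO:0006915": ["TP53", "CASP3"],   # apoptotic process
    "GO:0006281": ["TP53", "BRCA1"],   # DNA repair
}

for gene, terms in sorted(invert_term_map(example).items()):
    print(gene, ", ".join(terms))
```

A set is used while collecting so that a gene listed twice under the same term is only counted once.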
I’m not going to draw many conclusions from this yet–I want to try a variety of tools and think about the features and the quality of the results. But this tool seemed to effectively group my genes into buckets with GO terms that could be helpful for an analysis.
One of the most common questions we get when we are out doing software training is: what do I do with a list of genes? People generate lists from all sorts of biomedical research forays: microarray results, database searches, literature searches, library screens, etc. The source doesn’t matter much–in the end people have this list that they need to analyze, assess, categorize, group, filter, and manage.
We’ve been looking into some tools to accomplish this. We’ve already demonstrated a few of them (Reactome SkyPainter, Gene Ontology Term Enrichment, MatchMiner…). But there are more that I want to explore. What I decided to do was to create a standard list that I’m going to use to explore and evaluate different tools. Today the tip is where I got this list and how I created it. I want to be able to refer back to this list in the upcoming “list” tips, and thought that if I explained that first it would help.
So today’s tip is obtaining a list of disease genes from UniProt. Now, you could just go to UniProt yourself and get this handy list. But I show you how to get there starting from the UniProt homepage, and what I did to filter this list to a set of unique gene symbols for disease genes in Excel. I end up with ~2500 unique symbols for disease genes that will be the input for upcoming tips.
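If you would rather not do the filtering step in Excel, the same idea can be sketched in a few lines of Python. This is only a rough equivalent under stated assumptions: it assumes you have already pulled the gene symbol column out as a list of strings, one symbol per entry, and the function name is my own invention. It does not reproduce the exact steps shown in the movie.

```python
def unique_symbols(symbols):
    """Return symbols with blanks and repeats removed, keeping first-seen order."""
    seen = set()
    unique = []
    for raw in symbols:
        symbol = raw.strip()          # drop stray spaces around the symbol
        if symbol and symbol not in seen:
            seen.add(symbol)
            unique.append(symbol)
    return unique

# Hypothetical input, for illustration only.
column = ["BRCA1", " TP53", "", "BRCA1", "CFTR ", "TP53"]
print(unique_symbols(column))
```

Matching here is exact, so symbols that differ only in capitalization would be kept as separate entries.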
In this week’s tip I wanted to talk about a tool that offers a handy way to visualize the genes in a list you might have on pathway diagrams. Reactome offers a tool called SkyPainter that allows you to enter a list of genes, which is then analyzed statistically for over-representation in certain pathways. But then–and here’s the cool part–you also get a diagram of the pathways with the over-represented genes painted on a map of their pathway universe. See–SkyPainter. Anyway, it is a tool I have liked for years and I’ve been thinking a lot more about lists of genes and pathway representations. So I wanted to share that with you. This ~4 minute movie shows you how to access SkyPainter at Reactome and get started using it. Have fun!
The question we probably hear the most from researchers is…what can I do with a giant list of genes to figure out what’s going on in there? And about once every 6 months this question comes across the Gene Ontology mailing list. This is followed by a flurry of developers who offer their cool tools for analysis purposes. There are actually quite a few different tools with different strategies out there, and they are designed for different purposes. In this tip I’m going to use the Gene Ontology consortium’s GO Term Enrichment tool as a primary example, but I’ll also point you to a list of other tools to try out.
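To give a feel for what "enrichment" means numerically, here is a small Python sketch of one common strategy, the one-sided hypergeometric test: given how many genes in the background carry a term, how surprising is it to see this many in my list? I am not claiming this is exactly what the GO Term Enrichment tool or any other specific tool computes. The function name, parameter names, and numbers below are all hypothetical.

```python
from math import comb

def enrichment_p_value(population, annotated, sample, hits):
    """P(X >= hits) when drawing `sample` genes without replacement from
    `population` genes, of which `annotated` carry the term."""
    total = comb(population, sample)
    upper = min(sample, annotated)    # cannot see more hits than either limit
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return tail / total

# Made-up numbers, for illustration only.
p = enrichment_p_value(population=20000, annotated=200, sample=100, hits=8)
print(f"p = {p:.3g}")
```

Real tools typically also correct for testing many terms at once, which this sketch leaves out.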
MatchMiner translates one type of gene ID into another type – essentially the genetic equivalent of Swahili-to-German translating software. In this tip I’ll show you how to do a translation on a list of genes, or a ‘Batch Lookup’.
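Conceptually, a batch lookup is a table lookup applied to every item in a list. Here is a minimal Python sketch of that idea, assuming you already have a translation table in hand. The function name and the tiny hand-typed table (gene symbols to Entrez Gene IDs) are hypothetical illustrations; this is not MatchMiner's interface or its data.

```python
def batch_lookup(symbols, table):
    """Pair each input ID with its translation, or None when there is no match."""
    return [(symbol, table.get(symbol)) for symbol in symbols]

# Tiny hand-typed table, for illustration only.
symbol_to_entrez = {"TP53": "7157", "BRCA1": "672", "CFTR": "1080"}

for symbol, entrez_id in batch_lookup(["TP53", "CFTR", "NOTAGENE"], symbol_to_entrez):
    print(symbol, entrez_id if entrez_id is not None else "no match")
```

Keeping the unmatched inputs in the result, rather than dropping them, makes it easy to see which IDs still need attention.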