Category Archives: What’s the Answer?

What’s the Answer? (FANTOM5 promoter atlas)

Biostars is a site for asking, answering and discussing bioinformatics questions and issues. We are members of the Biostars community and find it very useful. Often questions and answers arise at Biostars that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those items or discussions here in this thread. You can ask questions in this thread, or you can always join in at Biostars.

This week’s highlighted question at Biostars rang a bell. I’ve been meaning to take a look at this resource, but it got buried on my desk under a pile of other stuff and I forgot to get back to it.

Question: FANTOM5 Promoter Atlas

Hi All,

I went through this paper on promoter atlas in FANTOM5 http://www.nature.com/nature/journal/v507/n7493/full/nature13182.html#the-fantom5-promoter-atlas , my question is that do they have cell wise CAGE dataset or is it global , as I cannot see the cell wise CAGE expression on their website.

Thanks in advance.

Aishwarya.Kulkarni

The thing that reminded me about that paper was in the answer: Taylor Raborn noted that this can be found in the ZENBU resources. Judging by my draft folder, I started tip-of-the-week posts about ZENBU in both March and May, but other stuff came up. I really have to visit that in the new year. If people could stop developing new resources for a while, I could catch up…? Right, that will happen.

Until I have a chance to get back to it (we have our annual special summary posts over the next two weeks and other stuff already in the hopper for early next year), you’ll have to settle for the ZENBU wiki details on their genome browser.

References:
Forrest A.R.R., Hideya Kawaji, Michael Rehli, J. Kenneth Baillie, Michiel J. L. de Hoon, Vanja Haberle, Timo Lassmann, Ivan V. Kulakovskiy, Marina Lizio, Masayoshi Itoh, Robin Andersson, et al. (2014). A promoter-level mammalian expression atlas, Nature, 507 (7493) 462-470. DOI: http://dx.doi.org/10.1038/nature13182

Severin J., Marina Lizio, Jayson Harshbarger, Hideya Kawaji, Carsten O Daub, Yoshihide Hayashizaki, Nicolas Bertin & Alistair R R Forrest (2014). Interactive visualization and analysis of large-scale sequencing datasets using ZENBU, Nature Biotechnology, 32 (3) 217-219. DOI: http://dx.doi.org/10.1038/nbt.2840

What’s the Answer? (tidy data format)

This week’s highlighted post at Biostars is about “tidy data”. Ah, quite the concept. The day when data becomes tidy will be one to celebrate. Anyway, I think it’s a worthwhile discussion to have, and I’m looking forward to the comments as this develops. If you have thoughts, please bring them over there too.

Usually I highlight most of the question here, but this time some pieces (examples of format issues) are too large, so I’ll just give you the gist and send you over to Biostars to read the whole thing.

Forum: Principles of Tidy Data (Hadley Wickham) and the VCF format

Hadley Wickham, the author of ggplot and many other popular R packages, has recently published a very good paper regarding the principles of tidy data. This article introduces a new library called tidyr, and also proposes a standard for formatting and organizing data before data analysis.

I personally think that the principles proposed in the article are very good, and that they help a lot in data analysis. Some of these are already adopted by many ggplot2/plyr users, as you need a data frame in a long format in order to produce most of the plots.

My question is whether it would make sense to apply these principles to bioinformatics. In particular, if we look at the VCF format, it fails at least two of the rules mentioned in the paper:

- “3.1. Column headers are values, not variable names”  (because individuals are encoded on distinct columns)

- “3.2. Multiple variables stored in one column” (because each genotype column contains the status of one or more alleles, plus its coverage etc…

For example, let’s take the example from the 4.0 specs of VCF:

[examples here]

[More discussion of the issues within samples, so go read over there]

What do you think? Will we all convert to tidy VCF in the far future?

–Giovanni M Dall’Olio

So, tidy VCF. What do you think? Some people are already musing about it. Discuss over there.
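
To make the two complaints concrete, here is a minimal Python sketch of what “tidying” a single VCF line might look like. The header and record are modeled on the example in the VCF 4.0 spec; the function name tidy_vcf_line and the SAMPLE column are my own hypothetical choices, not part of any VCF library, and I split on whitespace for readability even though real VCF files are tab-delimited.

```python
# Illustrative only: header and record modeled on the VCF 4.0 spec example.
HEADER = "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NA00001 NA00002 NA00003"
RECORD = ("20 14370 rs6054257 G A 29 PASS NS=3;DP=14;AF=0.5;DB;H2 "
          "GT:GQ:DP:HQ 0|0:48:1:51,51 1|0:48:8:51,51 1/1:43:5:.,.")


def tidy_vcf_line(header, record):
    """Hypothetical helper: return one dict per (variant, sample) pair."""
    columns = header.lstrip("#").split()
    fields = record.split()

    # Variables that describe the variant itself (CHROM, POS, ID, REF, ALT).
    variant = dict(zip(columns[:5], fields[:5]))

    # The FORMAT column names the variables packed into each genotype column.
    format_keys = fields[8].split(":")

    rows = []
    # Rule 3.1: sample names are column headers, so turn them into values.
    for sample, genotype in zip(columns[9:], fields[9:]):
        row = dict(variant)
        row["SAMPLE"] = sample
        # Rule 3.2: unpack "GT:GQ:DP:HQ" into one column per variable.
        row.update(zip(format_keys, genotype.split(":")))
        rows.append(row)
    return rows


rows = tidy_vcf_line(HEADER, RECORD)
for row in rows:
    print(row)
```

Each of the three samples becomes its own row, and the packed GT:GQ:DP:HQ string becomes four separate columns, which is roughly the long format that ggplot2 users already expect.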

Reference:
Wickham H. (2014). Tidy Data, Journal of Statistical Software, 59 (10). http://www.jstatsoft.org/v59/i10

What’s The Answer? (missing applications, revisited)

This week’s highlighted post is actually a trip down memory lane. It floated up to the top recently because someone raised the question again:

HI all,

Since last post in this thread is almost 4 years old, I am just curious. What was already sold, what has changed and more important, Which Application Is Truly Missing In Bioinformatics today?

One of the things I see is still the need for some data format standards. Another one is related to lack of global standards how to build data analysis pipelines.

I am curious about your thoughts.

klemen

Bioinformatics moves very fast in some ways, yet in other ways the same old problems remain. It was kind of interesting to look over the things we all desired years ago, and think about where we are since then.

Original post question:

Question: Which Application Is Truly Missing In Bioinformatics?

It’s a simple & straight questions. Just think about an app that when you found it, you first thought would be – “OMG!!! That’s it” – or smth like – “I wish I could have found/written/idealized it before”. Don’t need to be a bioinformatical swiss knife or a McGuyver paper clip. Just smth that would make your life much happier/easier.

My example is quite simple. I really wish that some sort of Monte Carlo Simulator of Generic Urn Models (population genetics rlz!) just appear in the net, with a nice, clean and well documented API (written in C) and bindings for my favorite scripting languages. That’s what I really miss, right now. What’s your story?

Jarretinha

So go over and walk down memory lane. It’s an interesting way for a specialist group to keep some institutional memory to look back on, the kind of stuff that you don’t necessarily capture through the formal science routes.

What’s The Answer? (genomics is not special, stop reinventing the wheel)

This week’s highlighted Biostars post is one of the most interesting ones I’ve seen in a while. It started with a provocative premise, which drew a number of really fascinating responses and discussion. To lure you over there, here’s a tweet that captures the initial post:

(This also generated some chatter on Twitter; if you follow the time stamp you can see that.)

One of the responses resonated across the genoscenti as well:

I think those short summaries are better than me bringing the post over here like I usually do. You should read the whole thing in situ, with the responses. So just go over from the links in the tweets, or from here.

Heh. This is what’s great about forums. This is way better than what you get in the stuffy mainstream literature (with the exception of Dan Graur).

What’s The Answer? (mobile bioinformatics apps)

This week’s highlighted discussion is about mobile apps. The original post sought some suggestions on what might be a useful mobile app. I would have to say the community seemed…er…underwhelmed with the thought of mobile apps for stuff. But that said, maybe there is a killer app out there waiting to happen. Do you have any ideas on what you’d want to see on a mobile device?

Forum: Bioinformatics Mobile App

Hi Everyone,

We are in the process of creating bioinformatics mobile applications. Rather than common app we want to give app for scholars and scientist for them to access the data wherever they and whenever they want.

Please give your suggestions and recommendations to pick the area or functionalties need to be implemetned.

Thanks.

aeinsights

I thought the discussion was interesting, even if nothing came immediately to mind. Although I recently had some fun with the PDB mobile app, it was mostly to look at cool structures while I was bored in a queue somewhere. I also know that one time at a dinner party the TimeTree app came in handy for looking for a date for a last common ancestor. But I can’t think of much heavy lifting I’d want to do on a small screen. But if you have some ideas, do share them over there.

What’s The Answer? (biggest challenges)

This week’s highlighted question is a pretty broad one. There’s certainly been discussion of it there, and the original poster has also used the answers coming in to build a survey. You have the chance to answer there too, if you’d like.

Question: What are the biggest challenges bioinformaticians have with data analysis?

Dear all,

I am doing a research among bioinformaticians, and I am interested in understanding your work, the challenges, and the opportunities.

So my question is, what are the challenges bioinformaticians have with data analysis?

Thank you in advance.

Klemen

So if you are curious about the issues, or have some thoughts, bring them over.

What’s the Answer? (vintage bioinformatics)

This week’s highlighted post is actually a BLAST from the past. Although it doesn’t directly say so, it came along around the same time as the “Old Bioinformaticians” conversation. It’s the bioinformatics nerd version of your grandparents talking about walking to school, in the snow, uphill, both ways. You know? And yeah, I contributed.

But it seems that Pierre Lindenbaum turned this into a curation effort to capture some of this history. I think that’s a nice idea. And people will want these kinds of things for talks and papers sometimes, and possibly for teaching the youngsters. So if you have some of these early bioinformatics artifacts, please contribute them over there.

Forum: Vintage / unconventional pictures for Bioinformatics

I’m looking for Vintage / unconventional pictures for Bioinformatics.

Feel free to add an URL to the picture below. If you’re the owner of the picture, tell me if I you allow me to upload the picture on wikipedia commons.

Please, don’t upvote my answers.

See also: Bioinformatic Cartoon

PS:  e.g: do you have a picture of a printed version of the GCG manual somewhere ?

–Pierre Lindenbaum

Go dust off your items, and share some photos.

What’s the Answer? (openly hate R)

Although I highlighted the original post a couple of weeks back, this bioinformatics nerd “Uses This” series at Biostars has continued to be really informative and sometimes amusing. There are so many now that I can’t extract enough to give them a fair look, so you should just go read them all. Not only is it an interesting cross-section of bioinformatics folks on a bunch of different topics and species, there are really good tips on software tools you might want to know about.

But I’ll extract this piece from today’s chat with Pablo because I used it in the click-baity title:

Forum: Pablo Cingolani of snpEff uses this

What do you use to create plots and charts?
I use R for stats, plots and charts. Although I openly hate R because I think it’s one of the least intuitive programming languages in the planet (followed closely by Malbolge and BrainF***)

Heh. But they aren’t all wonky tools either–some great tips on tools like project management or even remote meeting software have come along:

Forum: Hadley Wickham of ggplot and RStudio uses this

What tools/software do not get enough recognition?
Here’s three that I love and not enough people know about:

  • Selectorgadget: if you ever do any web scraping, you will love the way it learns css/xpath selectors based on positive and negative examples.
  • iDoneThis: we use this at RStudio. It’s a great way to keep track of what you’ve achieved, and to see what your colleagues are working on.
  • appear.in: super simple video chat. No logins, just share a link, and the quality is way better than google hangouts.

Really interesting stuff. Go read “Uses This” posts.

What’s The Answer? (transmembrane protein dbs)

This week’s highlighted question drew less response than I expected. It’s a good question, and would be of major interest for folks looking for druggable targets. So I figured–yeah, there must be a site that focuses on this. But I couldn’t pull one out of my memory banks. I was hoping someone else would. Any thoughts?

Question: Are there any specialist transmembrane protein databases?

I am working almost exclusively with transmembrane proteins. Are there any databases that specialise in categorising transmembrane proteins. For example by membrane type, number of membrane spanning regions, number of non-polar helices, whether the protein is functional or structural, et cetera.

Good Gravy

Bring an answer over there if you know of one.

What’s The Answer? (what do bioinformatics folks use?)

This week’s highlighted item from Biostars is actually the first post of a new series. Inspired by “Uses This” at The Setup, an interview series that offers a quick look at what a variety of folks use to do their jobs, Istvan started asking bioinformatics professionals what tools they use for their work, plus some bonus questions.

The first in the series was Jim Robinson of IGV. But since then a number of others have been added (you can follow them with the tag or see the list underneath the first one). Istvan is also welcoming other folks to submit their own answers if you want to share what you are up to, and how you get there.

Forum: Jim Robinson of the Integrative Genomics Viewer (IGV) uses this

Based on user suggestion we launch series of posts based on ideas promoted by the Uses This website.

How are the tools that we use every day being developed? What do bioinformaticians with proven track record use to get their work done?

I have sent out a few emails and I will start posting answers as they come in. Feel free to send me candidates (or volunteer) for the interviews.

[The list of questions]

What hardware do you use?

What is your text editor?

What software do you use for your work?

What do you use to create plots and charts?

What do you consider the best language to do bioinformatics with?

What bioinformatics tools/software do not get enough recognition?

[Go over to Biostars to read Jim's answers]

Istvan Albert

Interesting stuff. And more to come. Keep checking.