Category Archives: What’s the Answer?

What’s the Answer? (bioinformatics one-liners)

As much of a fan as I am of web-based tools, there are times when the command line gets you what you need far more quickly. When I looked at this new post at Biostars it was already hugely popular 3 hours into the day. So I think this captured some attention from the field, and some of our readers might want to check out these ideas or offer their own. So, this week’s unusual highlighted question is about the command line.


Biostars is a site for asking, answering and discussing bioinformatics questions and issues. We are members of the Biostars community and find it very useful. Often questions and answers arise at Biostars that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those items or discussions here in this thread. You can ask questions in this thread, or you can always join in at Biostars.


Question: Best bioinfo one-liners?

Whereas an infinity of efficient tools exists out there, it is sometimes still quicker, for simple tasks, to execute a single Linux command. I’m starting by sharing 3 I use quite often.

##1 get the sequence length distribution from a fastq file using awk
zcat file.fastq.gz | awk 'NR%4 == 2 {lengths[length($0)]++} END {for (l in lengths) {print l, lengths[l]}}'

##2 Reverse complement a sequence (I use that a lot when I need to design primers)
echo 'ATTGCTATGCTNNNT' | rev | tr 'ACTG' 'TGAC'

##3 split a multifasta file into single ones with csplit:
csplit -z -q -n 4 -f sequence_ sequences.fasta /\>/ {*}

I may be wrong, but I’ve not found such a list in Biostars.

So, what comes to your mind? I hope this post will yield some gold nuggets ;-)

Manu Prestat

There was a lot of chatter–have a look.
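
If you want to see what the first two one-liners produce before pointing them at real data, here is a minimal sketch that runs them on a tiny made-up FASTQ. The two reads, the function names length_distribution and revcomp, the added sort, and the lowercase handling are my own assumptions for illustration; they are not part of the original post.

```shell
# Two tiny, invented reads (lengths 4 and 6) in FASTQ layout:
# header, sequence, '+', qualities.
fastq='@read1
ACGT
+
IIII
@read2
ACGTAC
+
IIIIII'

# Length distribution: sequence lines are every 4th line, starting at line 2.
# The sort is an addition, because the order of awk's "for (l in lengths)" is unspecified.
length_distribution() {
  awk 'NR % 4 == 2 { lengths[length($0)]++ }
       END { for (l in lengths) print l, lengths[l] }' | sort -n
}

# Reverse complement: reverse the string, then swap A<->T and C<->G.
# Lowercase bases are handled too (an addition); N and other symbols pass through unchanged.
revcomp() {
  rev | tr 'ACGTacgt' 'TGCAtgca'
}

printf '%s\n' "$fastq" | length_distribution
printf '%s\n' 'ATTGCTATGCTNNNT' | revcomp
```

On this toy input the first command should print "4 1" and "6 1" (length, then count), and the second should print ANNNAGCATAGCAAT. For a real gzipped file you would feed length_distribution from zcat, as in the original post.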

What’s The Answer? (brain connectome)

This week’s highlighted item lets you find answers in brains. What do the brain connections look like in 3D? I love 3D brain maps–not in a zombie manner, just in an astonishing complexity manner. And although this is a different type of computational resource than we usually explore, I thought it was interesting.


Tool: Budapest Reference Connectome: a 3D visualization tool to browse connections in the human brain

I am pleased to announce a new tool in neuroscience (connectomics): Budapest Reference Connectome, a 3D visualization tool to browse connections in the human (consensus) brain.

The connectome has 1015 nodes, but those corresponding to the same larger cerebral area are drawn at the same spot, for a cleaner view. You can use your mouse to zoom and rotate, and if you click a node you can see only connections of that node.

We welcome any comment, question and suggestion.

[couple of sample images are included over there]

Csaba Kerepesi

The idea of the “human consensus brain” makes me giggle. We could really use that. But as if we could all agree on one! That said, you have the option of using the menus over there to load up a female brain or a male brain as well as the consensus brain. I went over and tried it out.

Another neat thing about their project is that they are using Biostars as an option for support. I think that’s really neat. I’m so over mailing lists. And yet I still have to read them all the time.

Reference:
Szalkai, B., Kerepesi, C., Varga, B., & Grolmusz, V. (2015). The Budapest Reference Connectome Server v2.0. Neuroscience Letters, 595, 60-62. DOI: 10.1016/j.neulet.2015.03.071

What’s the Answer? (user friendly software)

This week’s highlighted chatter is about the never-ending quest for better ways to access and use other people’s software. I don’t think there’s anything new here, but it may be a nice reminder for developers that others want to use the things you are developing–make it easier for them to do so.

Tips for developing more user friendly bioinformatics software?

This seems to be a recurring theme: I read a cool new bioinformatics paper that develops some method for doing exactly what I want to try out on my data. I try to find the code so I can apply the method to my data. Sometimes the code is not available so I have to contact the author. Other times, the code is available but so poorly documented that I have to contact the author and ask for clarification. Most frequently, the code is available, reasonably documented, but takes some strange input format that I’m not sure how to massage my data into and I spend a lot of time just getting everything in the right format.

What are some of your tips, suggestions, or recommendations for developing more user friendly bioinformatics software? There must be industry standards that we can learn and borrow from.

JEFworks

The ensuing discussion was valuable. Good ideas, good techniques. Have a look.

What’s the Answer? (woolly mammoth ORFs)

This week’s highlighted question was interesting to me in a couple of ways. It was a good question about the recent analysis of the woolly mammoth genome, making it a nice example of post-publication discussion. But mostly I just loved the chatter about issues and challenges around extinct organisms and their sequences. We are living in the future now. And that’s so awesome.


Question: Where are the mammoth’s ORFs?

Not sure if anyone from the Swedish Museum of Natural History is on this forum but does anyone know of any plans to process the bam files from http://www.ebi.ac.uk/ena/data/view/ERP008929 into something we can actually use for looking at protein evolution? Might Ensembl eventually pick up the data for their pipeline, Emily_Ensembl? And/or the NCBI? This is not the first time journal editors have allowed a new genome paper without the genome in question being in any usable form for biologists.

“Complete Genomes Reveal Signatures of Demographic and Genetic Declines in the Woolly Mammoth”

http://www.citeulike.org/user/cdsouthan/article/13590852

cdsouthan

I loved Emily’s explanation, and this part: “…mammoths have no active transcription…”. I thought to myself, well, not yet. The Plan to Turn Elephants Into Woolly Mammoths Is Already Underway. George Church and CRISPR are on the way back to the future already.

What’s the Answer? (movies for bioinformaticians)

Previous What’s the Answer? posts that we did on something at Reddit Bioinformatics have been popular. So occasionally I’ll be highlighting threads from the bioinformatics subreddit that people might find interesting (or amusing).

On to this week’s highlighted question. When I first looked there were just a couple of ideas–but it generated a lot of interesting suggestions, two of which I have already added to my Netflix queue.


Movies for Bioinformaticians

Hello there,

I am organising a lab social for about 24 bioinformaticians. Any ideas for movies I can put to the vote?

Proxima256


There were some pretty typical suggestions that I think everyone in the field probably already knows. But one of them, The Perfect 46, which I talked about here, is a conversation-worthy film for a nerd event of this sort. Have a look at the thread. Offer your ideas.

What’s the Answer? (network analysis, plants)

This week’s question comes up on a pretty regular basis, but I always like to see what people are using for exploring the networks of their genes of interest. This was a new species, though, and I was curious to see if there was something particularly relevant to this plant.


Question: Network analysis software

I have RNA-seq data for a plant species (Sorghum) whose genome is already mapped. Any suggestions for doing network analysis? I prefer to do it without using R packages.

Thanks

mahnazkiani

If you have any other ideas, though, it would be nice to see some suggestions from the plant science side. Go comment over there.

What’s the Answer? (phylogenetic tree tools)

The previous What’s the Answer? post that we did on something at Reddit Bioinformatics was popular. It led people to some software they weren’t familiar with for editing multiple sequence alignments. So this week we’ll try another post from this subreddit that might be informative for folks interested in phylogenetic tree tools.

Suggestion for phylogenetic tree visualization tools

Hi!
I’m an MS Biology student a bit more on the in silico side of things. One of the projects I’m involved in would require me to visualize (preferably) unrooted phylogenetic trees (based on miRNA sequences). I’m looking for suggestions regarding this problem, what tools should I use?

I’ve used FigTree and UGENE before but I’d like to broaden the palette of available tools.

–submitted by tronke

There were a number of popular tools in the replies: the R packages ape and phytools (among others), FigTree, Dendroscope, Mesquite, MEGA, and Archaeopteryx. Check out the full discussion thread over there.

What’s The Answer? (proteins without genes in the dbs)

This week’s highlighted discussion offers a peek at some odd situations in public databases. Sometimes there are things missing that you can’t quite figure out. I thought the exploration of why this happens was interesting and informative about working with databases.


This week’s highlighted issue at Biostars is one of the ones that can be really mystifying to encounter. But because of the way databases are curated, sometimes there are odd situations that don’t make sense at first glance. Sometimes these are real bugs–but other times they are decisions that had to be made to accommodate some strange feature of biology that doesn’t align with a database configuration.

Question: Proteins without genes ? Is that even possible ?

Hello all,

I am looking at some mass-spec data.
I found several fragments mapping to Ig heavy chain V-II region WAH protein and want to find corresponding gene.

Example http://www.uniprot.org/uniprot/P01770

Uniprot gives the gene name as “NULL”. Is this an annotation error, or some special aspect of Ig regions I am missing? I want to map several proteins with these types of names to genes.

  • Cluster of Ig heavy chain V-I region HG3
  • Cluster of Ig heavy chain V-II region SESS
  • Cluster of Ig heavy chain V-III region BRO
  • Cluster of Ig lambda chain V-I region NEW
  • Cluster of Ig lambda chain V-II region BUR
  • Ig heavy chain V-II region WAH
  • Ig heavy chain V-III region BUT
  • Ig heavy chain V-III region GAL
  • Ig heavy chain V-III region NIE
  • Ig heavy chain V-III region WEA
  • Ig kappa chain V-I region Kue
  • Ig kappa chain V-I region Wes
  • Ig kappa chain V-III region VG (Fragment)
  • Ig lambda chain V-III region LOI
  • Ig lambda chain V-III region SH
  • Ig lambda chain V-V region DEL

How can I map these to corresponding gene names ?

Thoughts ?

Khader Shameer

Having been involved in curation, I can see how this transpired. But there was a great answer from the UniProt folks themselves in the thread. And input from others too. I thought the discussion was fascinating. Go have a look at the outcome.

What’s the Answer? (alignment editors)

This week’s highlighted question is from the Bioinformatics discussion area at Reddit. There is a range of topics discussed in that subreddit, and some of the tool-specific ones are very helpful in learning about new software.

What are some of the best multiple alignment editors that allow for manual editing?

Cross-platform/open-source would be preferred.

AtlasAnimated

There were tools I am familiar with (JalView is the one I have used the most), but I learned about a new tool that looked useful as well: AliView. It sounds as if they have provided a nice tool that manages large datasets better than existing software. As they describe it on their site:

“AliView is yet another alignment viewer and editor, but this is probably one of the fastest and most intuitive to use, not so bloated and hopefully to your liking.”

Heh. Not bloated.

Anyway, looks like it may be worth kicking the tires a bit. In the paper they note that it was related to the 1000 Plants (1kp or OneKP) project “while designing degenerate primers for a diverse set of ferns from transcriptome data”. Anders Larsson talks about what was needed for this work, and it seems like these needs are going to be common among a lot of folks doing these kinds of large-scale sequencing projects with new species. So I can see the utility of this, and would encourage folks to have a look at AliView.

Quick links:

JalView: http://www.jalview.org/

AliView: http://www.ormbunkar.se/aliview/

Reference:

Larsson A. (2014). AliView: a fast and lightweight alignment viewer and editor for large datasets, Bioinformatics, 30 (22) 3276-3278. DOI: http://dx.doi.org/10.1093/bioinformatics/btu531

What’s The Answer? (what’s next in bioinformatics?)

This week’s highlighted discussion tackles a pretty broad and open-ended issue–what’s next in bioinformatics? The answers varied, interestingly, and presented a lot of great directions. I’d love to see other people’s ideas.


Forum: what is next going to happen in bioinformatics?

In fact many people around the world are working in this domain. Some studied bioinformatics and some did not (I even see physicians doing bioinformatics). I have been reading papers from all known journals which publish biology related bioinformatics papers or pure bioinformatics. I can tell , pretty much around a topic all times. I know it is a very general question and we cannot give a great and direct answer to it. However, I would like to know which topics you think are the hot spots these days for bioinformatics?

For example, many people are doing sequencing (of course we cannot have a gold standard because “all models are wrong but some are useful”), so are these types of studies going to go on forever?

We all know that bioinformatics is only a tool and not the pure science itself. So can we think that it is a dead field, since mathematics/statistics found itself already, or is there so much left to do? If so much is left to do, what could those topics be?

I am so eager to know about your opinion

–Mo

I put down some thoughts I had, but I really enjoyed reading the others–like the long one from Francis Ouellette.