BioStar is a site for asking, answering and discussing bioinformatics questions. We are members of the community and find it very useful. Often questions and answers arise at BioStar that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those questions and answers here in this thread. You can ask questions in this thread, or you can always join in at BioStar.
Today’s featured issue: What do you do with hundreds of genomes?
ClustalW is extremely limited when it comes to multiple whole genome sequencing. I have recently just looked at mugsy which claims to be able to align a little over 30 whole genomes.
Is there a software that can align 400 whole genomes? This would be over a Gb of data.
Any help would be enormously appreciated.
It wasn’t clear at first, but it turned out this was a set of bacterial genomes. However, more and more researchers are going to want to align, analyze, and visualize enormous sets of newly sequenced genomes of all sorts, using different strategies. The number of genomes that are coming out every week continues to astound me. Just yesterday I was looking at that paper on the 10,000 birds and it boggled my mind–but not all of those genomes are fully available now, and that could affect the ultimate conclusions at this point. Of course, there are a lot of issues to consider about how to do analyses of this sort, and there is debate. It’s a debate we need to have now, though–the healthy science kind of debate. Those genomes are coming.
Anyway–check out the answers for this question, and if you have other strategies to suggest be sure to add your voice to the discussion.