Friday SNPpets

This week’s SNPpets include tidbits about the Exome Aggregation Consortium (ExAC), new RepeatMasker track at UCSC Genome Browser, sequencing undiagnosed patients, the African Genome Variation Project, Icelandic knockouts, and more….


Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

What’s the Answer? (alignment editors)

This week’s highlighted question is from the Bioinformatics discussion area at Reddit. A range of topics is discussed in that subreddit, and some of the tool-specific ones are very helpful for learning about new software.

What are some of the best multiple alignment editors that allow for manual editing?

Cross-platform/open-source would be preferred.

AtlasAnimated

Some of the tools mentioned were ones I am familiar with (JalView is the one I have used the most), but I also learned about a new tool that looked useful: AliView. It sounds as if they have provided a nice tool that manages large datasets better than existing software. As they describe it on their site:

“AliView is yet another alignment viewer and editor, but this is probably one of the fastest and most intuitive to use, not so bloated and hopefully to your liking.”

Heh. Not bloated.

Anyway, looks like it may be worth kicking the tires a bit. In the paper they note that it was developed in connection with the 1000 Plants (1kp or OneKP) project “while designing degenerate primers for a diverse set of ferns from transcriptome data”. Anders Larsson talks about what was needed for this work, and it seems like these needs are going to be common among a lot of folks doing these kinds of large-scale sequencing projects with new species. So I can see the utility of this, and would encourage folks to have a look at AliView.
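For readers who haven’t run into degenerate primers before, here is a minimal Python sketch of the idea: collapse each column of an alignment into a single IUPAC ambiguity code. This is only an illustration of the concept, not how AliView does it; the function names and the toy sequences are made up for the example, and any column containing a gap or an unrecognized character is simply marked with "-".

```python
# Illustrative sketch only: not taken from AliView or the Larsson paper.
# Maps each set of observed bases to its standard IUPAC ambiguity code.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}

# Number of concrete bases each code stands for (e.g. Y -> 2, N -> 4).
CODE_SIZE = {code: len(bases) for bases, code in IUPAC.items()}


def degenerate_consensus(aligned_seqs):
    """Return one IUPAC code per alignment column (hypothetical helper)."""
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all sequences must have the same aligned length")
    consensus = []
    for column in zip(*aligned_seqs):
        bases = frozenset(base.upper() for base in column)
        # Columns with gaps or unknown characters are flagged with '-'.
        consensus.append(IUPAC.get(bases, "-"))
    return "".join(consensus)


def degeneracy(primer):
    """Count how many concrete sequences a degenerate primer represents."""
    total = 1
    for code in primer.upper():
        if code not in CODE_SIZE:
            raise ValueError(f"not an IUPAC nucleotide code: {code!r}")
        total *= CODE_SIZE[code]
    return total


if __name__ == "__main__":
    toy_alignment = ["ATGCA", "ATGTA", "ACGCA"]  # made-up sequences
    primer = degenerate_consensus(toy_alignment)
    print(primer, degeneracy(primer))
```

In practice you would look for a stretch of the alignment where the degeneracy stays low, which is exactly the kind of thing that is easier to spot by eye in an alignment editor.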

Quick links:

JalView: http://www.jalview.org/

AliView: http://www.ormbunkar.se/aliview/

Reference:

Larsson A. (2014). AliView: a fast and lightweight alignment viewer and editor for large datasets, Bioinformatics, 30 (22) 3276-3278. DOI: http://dx.doi.org/10.1093/bioinformatics/btu531

Video Tip of the Week: Protein structure information for public outreach. Really.

This week’s tip isn’t about a specific tool–but a really interesting look at how a tool was used in the context of some general public outreach messaging. Recently I posted about Aquaria, a new tool available to let biologists explore protein structures, mutations, and domains in user-friendly ways. But an interesting example of how the information about protein structures can be used to drive understanding came from a video animation of protein accumulation in Alzheimer’s. Just have a look at the video first and enjoy it. How cool is that clathrin basket pulling the vesicle in?

Description from their brochure at the launch [PDF]:

Christopher Hammang’s “Alzheimer’s Enigma” which explores the neurons of the human brain, and reveals how normal protein breakdown processes become dysfunctional and result in plaque formation during Alzheimer’s disease.

I found out about it as I was looking at the upcoming VIZBI talks and exploring their site for other features. In the VizbiPlus section there are a number of excellent animations of molecular processes, and this video was one of them. Be sure to watch for other tweets with the #vizbi hashtag for the next few days. I bet you’ll see some amazing tools and visualizations, as always.

Recently I mentioned the longer, more comprehensive, video from the Aquaria team, but I didn’t use that for my tip–I just used the short version overview. But the longer version had this extra bonus piece of how their software had been used by this animator. Here is Christopher Hammang, creator of this video, describing how he used the Aquaria information to generate the structural model for his animation:

Often it helps people to see how someone else used a tool for a project to get a better grasp of it. And this seemed like such a compelling and unusual example, I wanted to highlight it.

So again I’ll point you to the Aquaria tool tip from earlier this month to explore more, now with an understanding of an example of its use. But I would also encourage you to have a look at the other animations coming out of VIZBI at the VizbiPlus page. I swear, the animated intestine is way cooler than you might expect. The diabetes + insulin receptor videos are really informative and helpful. A cancer video illustrates a misbehaving p53.  Go look.

Quick links:

VizbiPlus videos: http://www.vizbi.org/Plus/

VizbiPlus Poster from Hammang and team: Alzheimer’s Enigma: Putting the Pieces Together http://www.vizbi.org/Posters/2015/B08

Vizbi Posters: http://www.vizbi.org/Posters/

Aquaria: http://aquaria.ws

Reference:
O’Donoghue S.I., Kenneth S Sabir, Maria Kalemanov, Christian Stolte, Benjamin Wellmann, Vivian Ho, Manfred Roos, Nelson Perdigão, Fabian A Buske, Julian Heinrich & Burkhard Rost (2015). Aquaria: simplifying discovery and insight from protein structures, Nature Methods, 12 (2) 98-99. DOI: http://dx.doi.org/10.1038/nmeth.3258

Friday SNPpets

This week’s SNPpets include CaptureSeq, a wicked cool whiteboard for your computer (I just wish you could upload an image or a doc), finding human “knockouts”, and my favorite new tool name of the week: GOTTCHA for metagenome analysis.


Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

What’s The Answer? (what’s next in bioinformatics?)

This week’s highlighted discussion tackles a pretty broad and open-ended issue–what’s next in bioinformatics? The answers varied, interestingly, and presented a lot of great directions. I’d love to see other people’s ideas.


Biostars is a site for asking, answering and discussing bioinformatics questions and issues. We are members of the Biostars community and find it very useful. Often questions and answers arise at Biostars that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those items or discussions here in this thread. You can ask questions in this thread, or you can always join in at Biostars.


Forum: what is next going to happen in bioinformatics?

In fact many people around the world are working in this domain. some studied bioinformatics and some not (even I see physician are doing bioinformatics). I have been reading papers from all known journals which publish biology related bioinformatics papers or pure bioinformatics. I can tell , pretty much around a topic all times.  I know it is very general question and we cannot give a great and direct answer to it. However, I would like to know which topics you think are the hot spot these days for bioinformatics?

for example, many people are doing sequencing ( of course we cannot have a golden standard because “all modelling are wrong but some are useful “) so these types of studies are going to be forever?

We all know that bioinformatics is only a tool and not the pure science itself. so can we think that it is a died field since mathematics/statistics found itself already or so much left to do ? if so much left to do, what could be those topics ?

I am so eager to know about your opinion

–Mo

I put down some thoughts I had, but I really enjoyed reading the others–like the long one from Francis Ouellette.

Video Tip of the Week: Designing proteins, using Rosetta

As often happens, last week’s tip on visualizing structures led me to some more reading and thinking about creating protein structures. And although it’s important for biologists to be able to use more of the information about protein structures and variations in their work from tools like Aquaria or PDB, it’s also important for some researchers to be on the other end of the pipeline and actually making the protein structures. Further, this also leads to the possibility of better designs of novel proteins as therapeutics–for example, making antibodies like the ones that could possibly battle Ebola.

As I looked around for protein design software to highlight for a tip, it was clear to me that the level of complexity of the problems in designing proteins didn’t really lend itself to short videos. There are some introductory seminars and tutorials on the Rosetta tools, but these certainly require a bit of time to explore. Instead, I’ve decided to highlight this really nice overview on the aspects of protein design that you would have to tackle to make customized proteins.

This iBiology “Introduction to Protein Design” by David Baker is really well done. There’s also a second seminar that is more detailed about designing proteins with new functions to solve many problems in biomedical research and environmental challenges.

This seems incredibly important and useful–but certainly daunting to get started. One way to get a head start on this would be to take an intro workshop. I was recently notified about the opportunity to learn from a couple of researchers who are very skilled with the Rosetta tools–Daisuke Kuroda and Jared Adolf-Bryfogle.

I’m including in the references a nice review of the basics of computational design of antibodies by Kuroda et al. And also a paper by Adolf-Bryfogle and team that covers important aspects of the component parts of antibodies that you would need to predict structures and design new ones, which are stored in the database they’ve created. This should give you a sense of the challenges and opportunities. And give you a good foundation for the concepts.

Rosetta software has been a powerhouse of protein design for many years. It’s been a leader in the CASP competitions (Critical Assessment of protein Structure Prediction). It’s got a strong user community: Rosetta Commons. You can obtain and use the software in a variety of ways, including some servers for academic use, and one important stop would be the ROSIE servers, “The Rosetta Online Server that Includes Everyone hosts several servers for combined computer power as a free resource for academic users.”

Quick links:

ROSIE servers: http://rosie.rosettacommons.org (note there’s a specific protocol section there that covers antibodies)

Rosetta Commons: https://www.rosettacommons.org/

PyIgClassify: http://dunbrack2.fccc.edu/PyIgClassify/

Short course:  Designing Antibodies with Rosetta Sunday May 3 2015. Early registration discount ends soon.

References:

Kuroda D., H. Shirai, M. P. Jacobson & H. Nakamura (2012). Computer-aided antibody design, Protein Engineering Design and Selection, 25 (10) 507-522. DOI: http://dx.doi.org/10.1093/protein/gzs024

Adolf-Bryfogle J., Q. Xu, B. North, A. Lehmann & R. L. Dunbrack (2014). PyIgClassify: a database of antibody CDR structural classifications, Nucleic Acids Research, 43 (D1) D432-D438. DOI: http://dx.doi.org/10.1093/nar/gku1106

Rybicki E.P. (2014). Plant-based vaccines against viruses, Virology Journal, 11 (1) 205. DOI: http://dx.doi.org/10.1186/s12985-014-0205-0

Note: OpenHelix is a part of Cambridge Healthtech Institute.

World Tour of Genomics Resources II, webinar recording available

Quick update: the recent webinar we delivered, “World Tour of Genomics Resources II”, is now available as a downloadable recording. Access it here. There’s a short video preview there, but the whole thing is about an hour long.

If you want the slides and the handout with the list of resources, those are available in our previous post:  World Tour of Genomics Resources II, webinar follow-up post. We are going to convert this into a regular tutorial suite with a professional recording soon, and it will be available in our catalog then.

Friday SNPpets

This week’s SNPpets include new nomenclature guidelines from the American College of Medical Genetics and Genomics (ACMG) on what to call a variant in medical situations; a HapMap for wheat; and a spanking for vendor sites that over-promise in personal cancer testing. Oh–and an undergrad genome class that teaches sequencing.


Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

 

What’s The Answer? (beginner projects to pick up)

This week’s highlighted discussion tackles the topic of small projects for folks who are just beginning their training in bioinformatics, or possibly a career transition into a new area. It’s an issue that has come up a number of times, and this new idea for connecting students and projects is a good one, I think.


Biostars is a site for asking, answering and discussing bioinformatics questions and issues. We are members of the Biostars community and find it very useful. Often questions and answers arise at Biostars that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those items or discussions here in this thread. You can ask questions in this thread, or you can always join in at Biostars.


We’ve talked in the past about having some kind of mechanism to connect students or young coders who need projects with small tasks that need to be done. And just last week one of the searches that brought someone to our blog was for small bioinformatics projects.

So when I saw the discussion at Biostars, I was interested.

Forum: Open Source projects to contribute to

I’m a Computer Scientist/experienced developer looking to get into the field, and contributing to Open Source seems to be one of the first suggestions people make for starting out in bioinformatics.

I was wondering if anyone had any recommendations for open source software projects worth contributing to, particularly ones that might have some low hanging fruit or are in real need of help. Is there any tools you folk are using right now that really needs feature X or are you a project maintainer that needs a dig out? The difficulty I’m having is that because I’m not working with these tools day to day, I don’t have the best view of the commonly used tools and their associated problems.

So if anyone has any suggestions I’m going to try fit in some OSS contributions with my own contracting jobs/spare time bio studying. My programming background is Java, Javascript, Python, PHP and I’m learning some R at the moment while I do some coursera specialisations. I’ve done quite a bit of systems admin if the work involved server-side clustering/distributed systems etc.

Thanks

–shane

The answer with the idea for the “Pick me up!” tag struck me as a good system for this sort of thing. Maybe others could implement this kind of tag on their projects too, if they have suitable small tasks. So I thought I’d raise the awareness of that a little bit–in case someone comes to us on a search for “small academic projects in bioinformatics” again. I hope they find some. I still think it’s a need on both sides.

Video Tip of the Week: Aquaria, streamlined access to protein structures for biologists

This week’s Video Tip of the Week is Aquaria, a new resource for exploring protein structures, mutations, and similarities to other proteins. It’s a very well-designed and interactive experience for end users. It is aimed largely at biologists who could benefit from exploring the structural details of their proteins of interest, but are daunted by tools aimed at structural biologists. But for tool developers, you should also look at how this rollout went. It’s one of the best examples of a tool launch I’ve seen in this field. And I’ve seen a lot.

So first, the tool. Aquaria offers users a streamlined way to access and explore protein structures, combining the kinds of information you get from the PDB structure resources with additional details like the UniProt mutations. Currently you start with a basic search by asking for a protein by name, or PDB or UniProt ID. They have pre-calculated the relationships of proteins in PDB and Swiss-Prot to quickly offer you a structure and related proteins. The paper notes: “Currently, Aquaria contains 46 million precalculated sequence-to-structure alignments, resulting in at least one matching structure for 87% of Swiss-Prot proteins and a median of 35 structures per protein….” In addition, it lets you explore other important biological features such as InterPro domains and post-translational modifications, so you can think about how the mutations + structures + functions impact a given protein that you are interested in. As they describe it:

“We have loaded SNP data from Uniprot and Interpro so you can see where the mutations lie on your 3D model. And we have found that you may be pleasantly surprised to find your mutations clustering in 3D space!”

The Aquaria folks provided an intro video to get you started:

Another handy feature they provided is a Quick Reference Card with shortcuts to the functions [PDF]. In addition to this intro, they have a longer video as well. This is more like a typical lecture with the background, the framework, the goals of the project, and more about the underlying database.

Now, this thing about the rollout of this software project. I found it when I was looking over the talks at the upcoming VIZBI conference (Visualizing Biological Data). Every year I find there are awesome ideas that come out of VIZBI, and tools I want to explore. Among them this year is Aquaria. So I went looking for more detail, and found some of the traditional stuff. The paper (below), the press release, etc. And then I found the Reddit discussion. The Aquaria team did a Science AMA on this tool. It engaged a range of folks–some of them just fans of science who had probably never seen protein structures before. That’s fine with me–the more folks who appreciate research and learn how researchers explore proteins, the better. But others had good technical questions for the team–such as other ways to find proteins of interest with sequence searches, or integration with other tools like UCSC Genome Browser. All the answers are over there. I enjoyed the answer to the question about the name of the tool:

It seems you get the ideas we had in mind: using Aquaria lets us observe these fascinating creatures (proteins) from the natural world. Aquaria creates an artificial environment and lighting where we can observe isolated proteins; like aquarium fish, proteins are often beautiful and (usually) live in water.

I asked them about how this played out, and they had ~1000 folks visit their site as a result of this Reddit event. That was really interesting to me, and a very neat route to drive awareness.

They also provided a way to support users with one of my other favorite resources–Biostars. They created a support thread there where users can ask questions and get answers: https://www.biostars.org/t/aquaria/ I so prefer this to mailing lists, and I’m glad to see this easy method to get support. In fact, I asked something that I couldn’t quite figure out yet. (Here’s the protein I was looking at: http://aquaria.ws/P09616/7ahl/A I wanted to see all the subunits in full color; you de-select autofocus to do that, and color by chains for this version.)
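That link above also shows how Aquaria addresses a view: a UniProt accession, then a PDB ID, then a chain. If you want to generate such links for a list of your own proteins, a small Python sketch like the one below would do it. Note the assumptions: the function name is hypothetical, the pattern is inferred from that single URL rather than from any Aquaria documentation, and I am only guessing that shorter forms (accession alone, or accession plus PDB ID) resolve the same way.

```python
from urllib.parse import quote

# Base address as it appears in the post; everything else here is inferred.
AQUARIA_BASE = "http://aquaria.ws"


def aquaria_url(uniprot_id, pdb_id=None, chain=None):
    """Build an Aquaria link of the form BASE/<UniProt>/<PDB>/<chain>.

    Hypothetical helper: the path layout is an assumption based on the
    one example URL in this post (http://aquaria.ws/P09616/7ahl/A).
    """
    if not uniprot_id:
        raise ValueError("a UniProt accession is required")
    if chain is not None and pdb_id is None:
        raise ValueError("a chain only makes sense together with a PDB ID")
    parts = [uniprot_id]
    if pdb_id is not None:
        parts.append(pdb_id)
    if chain is not None:
        parts.append(chain)
    # Percent-encode each segment so odd characters cannot break the path.
    return AQUARIA_BASE + "/" + "/".join(quote(part, safe="") for part in parts)


if __name__ == "__main__":
    # Reproduces the link used in this post.
    print(aquaria_url("P09616", "7ahl", "A"))
```

This only builds links for a browser; it does not touch the features API that the Aquaria team describes.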

Also, for the developer types: they offer a way for you to interact with the Aquaria software to add your own features of interest with their API. Maybe you have new mutations you have found in some sequence you’ve obtained in your lab, for example. They are offering guidance on that here: http://bit.ly/aquaria-features. They touch on this in the longer video (~27min) if you want a bit more explanation. I suspect from the high quality support they are offering, they’d be interested to hear from you and what features you’d like to see applied to these proteins as well.

So kudos to this team for a nifty tool and really serious multi-media outreach efforts. I think it was well done on all counts. I’ll bet you Reddit reached more of the right folks than a press release ever will. PIOs take note–get your scientists on Reddit.

Quick links:

Aquaria site: http://aquaria.ws/

Reddit Science AMA: https://www.reddit.com/r/science/comments/2w2jvw/science_ama_series_we_are_dr_sean_odonoghue_and/

Biostar support thread: https://www.biostars.org/t/aquaria/

Reference:
O’Donoghue S.I., Kenneth S Sabir, Maria Kalemanov, Christian Stolte, Benjamin Wellmann, Vivian Ho, Manfred Roos, Nelson Perdigão, Fabian A Buske, Julian Heinrich & Burkhard Rost (2015). Aquaria: simplifying discovery and insight from protein structures, Nature Methods, 12 (2) 98-99. DOI: http://dx.doi.org/10.1038/nmeth.3258