Tag Archives: Clustal

Announcement of Updated Tutorial Materials: UniProt, Overview of Genome Browsers, and World Tour of Resources

As many of you know, OpenHelix specializes in helping people access and utilize the gold mine of public bioscience data in order to further research. One of the ways that we do this is by creating materials to train people – researchers, clinicians, librarians, and anyone interested in science – on where to find data they are interested in, and how to access data at particular public databases and data repositories. We’ve got over 100 such tutorials on everything from PubMed to the Functional Glycomics Gateway (more on that later).

In addition to creating these tutorials, we also spend a lot of time keeping them accurate and up-to-date. This can be a challenge, especially when lots of databases or resources all have major releases around the same time. Our team continually assesses and updates our materials, and in this post I am happy to announce recently released updates to three of our tutorials: UniProt, World Tour, and Overview of Genome Browsers.

Our Introductory UniProt tutorial shows users how to: perform text searches at UniProt for relevant protein information, search with sequences as a starting point, understand the different types of UniProt records, and create multi-sequence alignments from protein records using Clustal.

Our Overview of Genome Browsers introduces users to Ensembl, Map Viewer, UCSC Genome Browser, the Integrated Microbial Genomes (IMG) browser, and the GBrowse software system. We also touch on WebGBrowse, JBrowse, the Integrative Genomics Viewer (IGV), the ARGO Genome Browser, the Integrated Genome Browser (IGB), GAGGLE, and the Circular Genome Viewer, or CGView.

Our World Tour of Genomics Resources is free and accessible without registration. It includes a tour of example resources, organized by categories such as algorithms and analysis tools, expression resources, genome browsers (both eukaryotic and prokaryotic/microbial), literature and text mining resources, and resources focused on nucleotides, proteins, pathways, disease and variation. The tour then leads into a discussion of how to find resources with the free OpenHelix Resource Search Portal, followed by learning to use resources with OpenHelix tutorials, and a discussion of additional methods of learning about resources.

Quick Links:

OpenHelix Introductory UniProt tutorial suite: http://www.openhelix.com/cgi/tutorialInfo.cgi?id=77

OpenHelix Overview of Genome Browsers tutorial suite: http://www.openhelix.com/cgi/tutorialInfo.cgi?id=65

Free OpenHelix World Tour of Genomics Resources tutorial suite: http://www.openhelix.com/cgi/tutorialInfo.cgi?id=119


World tour of workshops, recent stop: Morocco, Africa

Trainers & organizers

Last year I had the opportunity to give a workshop in Ifrane, Morocco (UCSC Genome and Table browsers, Galaxy) at Al Akhawayn University. This year, Mary and I returned for a longer 3-day workshop at University Hassan II in Mohammadia. OpenHelix was a co-sponsor of the workshop (donating our time, materials and expertise). The workshop covered a plethora of topics, from a world tour of resources (tutorial-free), the introductory UCSC Genome Browser (tutorial-free) and ENCODE (tutorial-free) to genome variation analysis in dbSNP (tutorial-subscription) and analysis using Galaxy (tutorial-subscription). You can see the full list of topics in the Mohammadia Workshop Schedule (pdf).

As last year, we were impressed with the students (there were 117 total, about 50/50 gender ratio). English is their 3rd or 4th language in most cases; Moroccan Arabic, French or various African languages are their languages of choice. Yet they were attentive and asked very perceptive and fascinating questions. They were also very enthusiastic learners. It was a delight to teach them.

The workshop students

We’d like to thank Mohammed Bourdi at NIH, who spent large amounts of time and financial resources to organize this (and last year’s) workshop. We hope to repeat and expand these for next year and perhaps years to come. We will be looking for sponsors.

Several questions were asked at the workshop. We’d like to reiterate the answers here and seek some more answers from our readers:

* One student was looking for wheat genome resources for designing primers. The wheat genome is as yet incomplete, but there are some resources to get started:
Wheat Genome Sequencing Consortium
Gramene’s wheat resources
Wheat Genetic and Genomic Resource Center @ Kansas State
Perhaps also COGE for conserved sequences
edited to add:
CerealsDB and
James’ post on the wheat draft sequence might give some insight into that huge genome.
* Another student asked about dotplot tools:
Galaxy offers a large collection of EMBOSS tools, including dotplot analysis, as do the EMBOSS tools at EBI.

* Another question concerned finding a ‘dynamic programming’ (optimal solution) multiple sequence alignment tool as opposed to a heuristic one. The issue is the size of the search space: an exact dynamic programming solution becomes too computationally intensive as the number of sequences grows. This slide set might help with the understanding, particularly slides 1-5 and 17-22. That said, the student might want to check out MSAProps and this list at Wikipedia.
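To make the cost concrete, here is a minimal Python sketch of the pairwise case (a Needleman-Wunsch style global alignment score), plus a helper that counts the cells an exact dynamic programming table would need if all sequences were aligned at once. The function names and the scoring values (match +1, mismatch -1, gap -1) are illustrative assumptions, not taken from any particular tool.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of two sequences (Needleman-Wunsch).

    Scoring values are assumptions chosen for illustration.
    """
    # Row 0: aligning an empty prefix of `a` with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] with an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


def exact_msa_table_cells(lengths):
    """Cells in the full DP table when all sequences are aligned at once."""
    cells = 1
    for n in lengths:
        cells *= n + 1
    return cells


if __name__ == "__main__":
    print(nw_score("ACGT", "AGT"))
    print(exact_msa_table_cells([300, 300]))   # two sequences: manageable
    print(exact_msa_table_cells([300] * 10))   # ten sequences: not feasible
```

Two sequences need a table with one dimension each, which is quick to fill. Every extra sequence adds another dimension, so the table grows as the product of the sequence lengths, which is why practical tools fall back on heuristics.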

Do our readers have any other guidance on this?

Teaching moment

* Another student asked if we know how to find DC-area internships in biological sciences. Another student (a mathematician from Mali) was looking for something in the US in bioinformatics. Any ideas of programs to bring African biology students to the US or Canada?

If our Moroccan students (or anyone else) have any additional questions, please feel free to ask them here!


And a side note. Last year I had all of 3 hours to tour Fes. This year I took advantage of my trip. Mary and I spent a few days in Fes and Marrakech. My family joined us in Marrakech, and later my family and I toured for 8 days, visiting the Atlas Mountains, the Sahara and Fes. Needless to say, it was a trip of a lifetime. Morocco is a fascinating and beautiful place. I look forward to visiting again.

Gates and doors of Fes are beautiful

camel excursion to the Sahara





What’s the answer? Open thread

BioStar is a site for asking, answering and discussing bioinformatics questions. We are members of the community and find it very useful. Often questions and answers arise at BioStar that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those questions and answers here in this thread. You can ask questions in this thread, or you can always join in at BioStar.

BioStar Question of the Week:

Multiple sequence alignment of thousands of proteins

I want to track the evolution of several domains, and for doing so, I need to align and cluster 1000′s of sequences. is it possible? and what is the best software to use for that? Eventually I want to understand which is the most “basal” sequence that might lead me to the most ancient protein containing this sequence.

–by Dror

The selected answer:

“mafft --auto” is stable for up to hundreds of thousands of proteins and produces reasonable alignments: http://mafft.cbrc.jp/alignment/software/

–by avilella
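For readers who would rather script this than type it, a minimal Python sketch of calling MAFFT with the `--auto` option might look like the following. It assumes the `mafft` executable is installed and on your PATH; the function names and the file names are hypothetical placeholders.

```python
import subprocess


def build_mafft_command(fasta_path):
    """Return the argument list for `mafft --auto <input>`."""
    return ["mafft", "--auto", fasta_path]


def run_mafft(fasta_path, out_path):
    """Run MAFFT and save the alignment, which it prints on standard output."""
    with open(out_path, "w") as out:
        # check=True raises an error if mafft exits with a failure code.
        subprocess.run(build_mafft_command(fasta_path), stdout=out, check=True)


if __name__ == "__main__":
    # Placeholder file names; substitute your own FASTA file of protein sequences.
    run_mafft("domains.fasta", "domains.aligned.fasta")
```

The aligned sequences end up in the output file, ready for whatever clustering or tree-building step comes next.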

But there are a couple of other options as well, as with most bioinformatics solutions! These include a hot-off-the-press lead on the new Clustal version (Clustal Omega). Check out the others over there.

Tip of the Week: Drawing a Tree

So, you’ve got your sequences aligned using Clustal, Muscle or T-Coffee (or another program), you’ve created a tree data file using PAUP, Phylip or one of the many other algorithms out there, and now you want to draw and visualize those relationships. A good place to find tools for that is the huge list of phylogeny tools at the Department of Genome Sciences at the University of Washington (there is a lot more than just tree drawing, of course!). Many of these tools are downloadable and feature-rich, and one might suit your needs, so check them out. One that this tip points out is Phylodendron, which you can download, but which also has a web interface. This will allow you to quickly draw a tree for viewing.
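If you just want a quick look at a tree file before opening a drawing tool, a few lines of Python can print it as an indented outline. This is only a sketch and not how Phylodendron works: it assumes the tree is a well-formed Newick string (the parenthesized format that Phylip writes), it ignores branch lengths, and the function names are made up for illustration.

```python
def parse_newick(text):
    """Parse a well-formed Newick string into nested (name, children) tuples.

    Branch lengths (":0.1") are read and discarded in this sketch.
    """
    text = "".join(text.split())  # drop whitespace and line breaks
    pos = 0

    def read_label():
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in ",():;":
            pos += 1
        return text[start:pos]

    def read_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # skip "("
            children.append(read_node())
            while text[pos] == ",":
                pos += 1  # skip ","
                children.append(read_node())
            pos += 1  # skip ")"
        name = read_label()
        if pos < len(text) and text[pos] == ":":
            pos += 1
            read_label()  # branch length, ignored here
        return (name, children)

    return read_node()


def render(node, indent=0):
    """Return the tree as indented text lines, one node per line."""
    name, children = node
    lines = ["  " * indent + (name or "+")]  # "+" marks an unnamed internal node
    for child in children:
        lines.extend(render(child, indent + 1))
    return lines


if __name__ == "__main__":
    tree = parse_newick("((human:0.1,chimp:0.1):0.2,mouse:0.5);")
    print("\n".join(render(tree)))
```

Each level of indentation is one step further from the root, so sequences grouped under the same "+" share a common ancestor.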