Category Archives: Tip of the Week

Video Tip of the Week: Human Metabolome Database, HMDB

The HMDB, or Human Metabolome DataBase, is another nice collection of data and tools from the Wishart lab. Although we have mentioned it in the past, its emphasis is more on small molecules, so it isn’t something we have covered in detail. But with this new video available, I thought it was a good time to include it in our database resources for folks who might be seeking out this kind of metabolomics data.

Their overview video, which is our tip of the week, notes that the resource currently contains over 40,000 metabolites. They introduce the types of information contained within, including not only chemical names and structures, but also descriptions, taxonomies, concentrations in biological fluids, reactions and pathways, and roles in human disease. The video goes on to describe the ways to interact with the data, via browsing or searching, and more.

There’s a tremendous amount of information on the pages, with appropriate links to many other useful sources as well. Of course you can also search with sequences using BLAST, and the gene pages will offer lots of detail and links to enzyme and metabolite products that may be useful to think about. And in the never-ending search for appropriate biomarkers for medical situations, this is a really useful repository of knowledge.

While preparing for this tip, I also happened to notice a tweet from their group that was useful for folks who are trying to learn about these topics.

It appears to be a popular and effective guide to exploring translational biomarker discovery topics, with another tool that was new to me: ROCCET, ROC Curve Explorer & Tester. The tutorial with links is listed below as well. So have a look if you are interested in evaluating this type of biomarker data.
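If ROC curves are new to you too, here is a minimal sketch of the underlying idea in Python. It is not ROCCET's code or method, just a toy illustration: the concentrations, the labels, and the function names are all made up for the example. It sweeps a cutoff over a candidate biomarker and computes the area under the resulting curve (AUC) with the trapezoidal rule.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per cutoff."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Sweep the cutoff from the highest score downwards.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve, using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up concentrations of a hypothetical metabolite (illustration only).
concentrations = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
has_disease = [1, 1, 0, 1, 0, 0]  # 1 = patient, 0 = control

curve = roc_points(concentrations, has_disease)
print(round(auc(curve), 3))  # prints 0.889
```

An AUC near 1 suggests the marker separates the two groups well, and an AUC near 0.5 suggests it does no better than chance. The tutorial paper covers the many caveats that a toy example like this skips.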

Quick links:

HMDB: http://www.hmdb.ca/

ROCCET: http://www.roccet.ca/ROCCET/

References:

Wishart, D., Jewison, T., Guo, A., Wilson, M., Knox, C., Liu, Y., Djoumbou, Y., Mandal, R., Aziat, F., Dong, E., Bouatra, S., Sinelnikov, I., Arndt, D., Xia, J., Liu, P., Yallou, F., Bjorndahl, T., Perez-Pineiro, R., Eisner, R., Allen, F., Neveu, V., Greiner, R., & Scalbert, A. (2012). HMDB 3.0–The Human Metabolome Database in 2013 Nucleic Acids Research, 41 (D1) DOI: 10.1093/nar/gks1065

Xia, J., Broadhurst, D., Wilson, M., & Wishart, D. (2012). Translational biomarker discovery in clinical metabolomics: an introductory tutorial Metabolomics, 9 (2), 280-299 DOI: 10.1007/s11306-012-0482-9

Video Tip of the Week: gene.iobio for genome and variation browsing

Twitter erupted recently with some chatter about a new tool that people seemed to really like. The iobio team from the Marth lab had launched a new gene “app” on their iobio framework. Here was some of the response:

So of course I wanted to have a look. And I agree–it is a very slick tool, and fun to explore. If you want to get started, check out their gene.iobio announcement blog post for a bit of the goals and features in text form:

Gene.iobio is designed to help medical and clinical researchers hunt for disease-causing genetic variants through a combination of real-time genomic data analysis and intuitive visualization.

And they invite you to watch their intro video to get a sense of how it works. Their overview video is our tip this week:

They have other videos as well, on more specific use cases, via their YouTube channel. There is one publication I could find, too, which gives you some of the background on their goals and intentions for their software and app ecosystem (linked below). I like this summary of their goals from the paper. I think it helps you to understand these nifty and speedy tools:

We have developed and are continually expanding a web-based analysis system, iobio (http://iobio.io/), to empower all biological researchers to analyze—easily, interactively and in a visually driven manner—large biomedical data sets that are essential for their research, without onerous resource requirements.

On their blog they also talk about some of the other apps they already have–bam.iobio and vcf.iobio. They also note that they provide Docker containers to make it very easy for you to deploy your own installation if you’d like to have one. (If you are new to Docker, we’ve done tips on that too: intro, and a bioinformatics application example). And they note in their paper that they intend to have other folks use their libraries to create more apps that have similar features. So if people like this, and they seem to so far, you may see more of these tools coming along. Check ‘em out.

Quick links:

iobio main site: http://iobio.io/

gene.iobio: http://gene.iobio.io/

Reference:

Miller, C., Qiao, Y., DiSera, T., D’Astous, B., & Marth, G. (2014). bam.iobio: a web-based, real-time, sequence alignment file inspector Nature Methods, 11 (12), 1189-1189 DOI: 10.1038/nmeth.3174

Video Tip of the Week: World Tour of Genomics Resources, part II

This week’s tip is not our usual short video. We’ll connect you to our newest tutorial suite, our World Tour of Genomics Resources, part II. Our previous tour was really popular, because as much as bench researchers know about the tools they currently use, everyone realizes there are more tools out there. But many of them don’t know that there could be some very handy ones for the tasks that they have.

This time the tour discusses not only tools for which we have full tutorial suites (video, slides, handouts, exercises), but also a lot of the handy problem-solving tools that we cover in our weekly tips. Things like UpSet for exploring data relationships among sets–which scales way better than Venn diagrams for genomics data sets. Or like Slidify to make slides from RStudio directly. We won’t have full training suites on these, but people will find them really useful in their daily work.

Sometimes we will also add tips about tools for which we have suites, but that have new features. For example, although thousands of people watch our UCSC Genome Browser full trainings, we also have tips that highlight new features or tools that aren’t part of the basic intro–such as new wiggle track features, or the Genome Browser in a Box. So we help people keep current in the field this way, even with existing tools they use.

But we still adhere to the philosophy that we explained in our paper (below): raising awareness of the tools that are out there, and helping people to find and use them effectively. This World Tour illustrates that.

Quick links:

New tutorial suite: http://www.openhelix.com/worldtour2

References:
Williams, J., Mangan, M., Perreault-Micale, C., Lathe, S., Sirohi, N., & Lathe, W. (2010). OpenHelix: bioinformatics education outside of a different box Briefings in Bioinformatics, 11 (6), 598-609 DOI: 10.1093/bib/bbq026

Video Tip of the Week: Araport, Arabidopsis Portal

The recent Plant Biology 2015 conference tweets were full of delightful morsels. Some of them edible. I am very psyched to learn of the Legume Federation. Legumes are *way* at the top of my list of favorite organisms. I think it was their tweet of the Araport data that led to this week’s video tip of the week.

Araport was new to me. I had been familiar with TAIR, and I knew of some of their changes after the funding went away. But this got me looking into how the community was re-organizing to support the plant model organism data, and also providing supporting tools and new directions.

The paper (below, 2014) describes the foundations and some of the transition issues from TAIR. And it also describes some of the tools that they are making available going forward. They have a customized InterMine called ThaleMine that can help you make customized queries to mine the data. There’s a JBrowse for visual browsing of the genome data. They are also maintaining a GBrowse with the Arabidopsis data, and they have a page of browser comparisons. There’s BLAST for sequence-based searching.

But what’s also very cool is that they are also making this a framework for people to build their own apps around. This “community contributed modules” is a great idea. So tools that folks may need for their particular research directions can be built right on top of the Araport setup.

This week’s tip is the ThaleMine intro video that they have provided.

I would keep an eye out for more videos from them in the future as well. There’s also a Presentations page, and if you want more of an overall look at their foundations and plans there’s a nice overview slide deck from the ICAR 2015 conference.

Check out the Araport resources for Arabidopsis and plant genomics tools.

Quick links:

Araport: http://www.araport.org/

References:

Hanlon, M., Vaughn, M., Mock, S., Dooley, R., Moreira, W., Stubbs, J., Town, C., Miller, J., Krishnakumar, V., Ferlanti, E., & Pence, E. (2015). Araport: an application platform for data discovery Concurrency and Computation: Practice and Experience DOI: 10.1002/cpe.3542

Krishnakumar, V., Hanlon, M., Contrino, S., Ferlanti, E., Karamycheva, S., Kim, M., Rosen, B., Cheng, C., Moreira, W., Mock, S., Stubbs, J., Sullivan, J., Krampis, K., Miller, J., Micklem, G., Vaughn, M., & Town, C. (2014). Araport: the Arabidopsis Information Portal Nucleic Acids Research, 43 (D1) DOI: 10.1093/nar/gku1200

Video Tip of the Week: PathWhiz for Pathways, Part II

This week’s tip is a follow-up to the PathWhiz one featured last week. After I had finished writing that one, the second video in the series became available. It has a lot more detail on how to work with the tool.

I’m not going to go into the introduction here again; you can flip back and read last week’s piece for more of the foundation. But if you are ready to look at the more advanced guidance in this video, it’s worth the time.

Quick links:

PathWhiz: http://smpdb.ca/pathwhiz

PathWhiz legends to see the graphics: http://smpdb.ca/pathwhiz/legend

Reference:
Pon, A., Jewison, T., Su, Y., Liang, Y., Knox, C., Maciejewski, A., Wilson, M., & Wishart, D. (2015). Pathways with PathWhiz Nucleic Acids Research, 43 (W1) DOI: 10.1093/nar/gkv399

Benson G (2015). Editorial: annual Web Server Issue in 2015 Nucleic Acids Research, 43 (W1) DOI: 10.1093/nar/gkv581

Video Tip of the Week: PathWhiz for graphical appeal and computational readability

“Pathway diagrams are the road maps of biology.” This is how the folks from PathWhiz begin their recent paper. I came across it in the Nucleic Acids Research web server issue which was recently announced. The NAR database issue in January and the mid-year web server issue are perfectly timed items that I can content mine all year ’round. And I am always drawn to the tools which are offering better visualizations for data. PathWhiz is offering better road maps. So I definitely wanted to take a look.

They note that historically pathway data has been artistically rendered for print applications like papers and posters, and can be quite engaging and attractive. But actually working with pathways in computational settings can be a bit more, um, sterile–I guess I would characterize it. Their goal seems to be to merge the two: better options for graphical components, but still machine-readable for further manipulation and exploration. They summarize their goal in this hybrid approach:

PathWhiz is essentially a web server designed for the facile creation of colourful, visually pleasing and biologically accurate pathway diagrams that are machine-readable, interactive and fully web compatible.

The paper goes on to describe a lot of the foundational concepts and the implementation. There are important technical aspects covered about the formats and file types. But the best way to get a feeling for it is their intro video. You can also access that on their tutorial page and I’ll include it here.

Mid-way through this PathWhiz video 1 (~4:15), they show you the same pathway as a KEGG, Reactome, and WikiPathways visualization to give you a sense of the differences. A part II video was not yet available when I started writing this, but it has been posted since.

Look through the “legends” area to see the kind of handy diagrams you might need–molecules, membranes, or cellular organelles, or even tissues like brain or liver. Tab through the different types of graphics that are available to get a sense of how your pathways could look rendered in the PathWhiz system.

You can try it out easily too: there’s a “guest” mode where you can just kick the tires. Or you can create a login and work on some of the ones that might be useful for your work and your presentations and papers. Those can be saved and locked, but can also be cloned and expanded on by other people. You can also get a sense of what some of the more mature diagrams can look like by browsing the pathway collection. I thought this one, 17-alpha-hydroxylase deficiency (CYP17), had nice examples of the tissue (kidney) and organelles involved that quickly give you a grasp of what’s going on and where. I’ve just shown a small part of it in this image; it’s much more detailed at full size. You can zoom in to see the pathway components. And you can see from here that the details are exportable in a number of ways by clicking the “Downloads” tab.

So for better representations for humans to view, while also preserving the important functions that computational renderings can offer, PathWhiz is worth a look. Go over and try it out.

Quick links:

PathWhiz: http://smpdb.ca/pathwhiz

PathWhiz legends to see the graphics: http://smpdb.ca/pathwhiz/legend

Reference:
Pon, A., Jewison, T., Su, Y., Liang, Y., Knox, C., Maciejewski, A., Wilson, M., & Wishart, D. (2015). Pathways with PathWhiz Nucleic Acids Research, 43 (W1) DOI: 10.1093/nar/gkv399

Benson G (2015). Editorial: annual Web Server Issue in 2015 Nucleic Acids Research, 43 (W1) DOI: 10.1093/nar/gkv581

Video Tip of the Week: Introduction to the UCSC Genome Browser

This week’s tip is quite multi-media. There’s a video, as required. But there’s a traditional published paper format, too. And there’s also the free training slides and exercises from us, sponsored by the folks who create the UCSC Genome Browser. So if you prefer audio, graphics, or text–we’ve got it all in this week’s tip.

For years we’ve been doing the UCSC Genome Browser online training suites. And those materials are still available for everyone to see. But I know some people prefer to have someone walk through the stuff in a webinar or workshop. And if you are using our materials yourself for training others, it might help to hear how I present it “live”. For this week’s tip, here’s a snippet of the recent webinar I just did, with my most current slide set.

You can access the whole thing from our site here: http://www.bio-itworld.com/openhelix/introduction-to-the-ucsc-genome-browser-webinar/

But the main reason I’m highlighting this is because of our paper that’s recently been un-firewalled. We have this Current Protocols in Molecular Biology paper that we did a while back (first in 2009), and they asked us to update this last year. This updated version is now publicly available in PubMedCentral.

This paper was a fun paper to write. I like to do the step-by-step series. It forces me to really think like a new user, looking at the menus and the buttons and everything that someone new to the software might face. And if you are in a teaching situation, you could offer this paper to students to let them try these things out. You could pair it with either the webinar or our standard recording. And I think this multi-media strategy could be really effective in getting people to grasp the concepts, and also build their confidence with the tools.

I spent some time working through the paper to see if there were any serious differences since we submitted it a while ago. I will note that there are a few changes since we wrote this. For example, the former “Variations and Repeats” group has become “Variations” and “Repeats” as separate track groups. And “Literature” moved into “Phenotype and Literature”. But I don’t think that will trip up most users. Use that as a teachable moment about interfaces changing…. Also, of course, version numbers for dbSNP have changed. But again–most people can follow along, or even try the old version to see the differences.

Probably the biggest difference is the part with the evolutionary relationships. Now that there are 100 species instead of the prior 46 species version, a couple of things about that interface changed. Now you need to check “All species” instead. They don’t separate out vertebrates the way they used to.

Another interface change, in the part with the Track Hubs, could be confusing. As an introduction to hubs, Bob Kuhn wrote this part, which walks you through the basic structure of the setup of a hub. All of our text is still ok, but you can’t just get the URL like we had originally shown. Still, it walks you through the structure of the hubs without problems if you just type the URL instead of copying it.

So use our materials to teach yourself, or to teach others. We hope this offers different ways that will work for everyone.

Quick links:

UCSC Genome Browser: http://genome.ucsc.edu/

UCSC Intro Webinar: http://www.bio-itworld.com/openhelix/introduction-to-the-ucsc-genome-browser-webinar/

UCSC Intro Tutorial suites (video, with our free slides + exercises): http://www.openhelix.com/ucscintro

UCSC Advanced Tutorial suites (video, slides, exercises): http://www.openhelix.com/ucscadv

Reference:

Mangan ME, Williams JM, Kuhn RM, & Lathe WC (2014). The UCSC Genome Browser: What Every Molecular Biologist Should Know Current Protocols in Molecular Biology, 107 (19.9), 199-199 DOI: 10.1002/0471142727.mb1909s107

Disclosure: These tutorials are freely available because UCSC sponsors us to do training and outreach on the UCSC Genome Browser.

Video Tip of the Week: PhenomeCentral

Silos. This is a big problem for us with human genome data from individuals. We’re getting sequences, but they are locked up in various ways. David Haussler’s talk at the recent Global Alliance for Genomics and Health meeting (GA4GH) emphasized this barrier, and also talked about ways they are looking to work around the legal, social, and institutional barriers that we’ve created. He talked about Beacon, which I highlighted recently as a Tip of the Week. But there are other strategies needed to connect physicians and patients with other folks who might help them get to answers. Heidi Rehm’s talk provided information about a possible tool for this: PhenomeCentral.

Unfortunately, the videos aren’t uploaded to YouTube, so you have to go to the June 10 Meeting page and obtain them from there. The one that contained the information on PhenomeCentral is the one called “Matchmaker Exchange”.

The mission of PhenomeCentral, according to their site, is:

PhenomeCentral is a repository for secure data sharing targeted to clinicians and scientists working in the rare disorder community. PhenomeCentral encourages global scientific collaboration while respecting the privacy of patients profiled in this centralized database.

Certainly people in bioinformatics are familiar with the really crucial information from OMIM and Orphanet. But these are aggregators of information, not patient-specific. There may be lists of features of a condition, but how they appear in a given patient’s situation might vary.

What this new strategy will do is let doctors and researchers take the phenotype and genotype data (you can upload VCF files), and make predictions about the genes involved. They also have ways to “matchmake” possibly similar disease manifestations. This project is part of the larger “MatchMaker Exchange” collection (Note: MME is not a dating site…it’s also still under development). But the idea is that with patient details one could search for matches with other similar patients (depending on the privacy level of the records, of course). It sounded to me like a kind of BLAST for medical conditions (they didn’t call it that). But it also has ways to semantically link phenotype concepts, because they might be entered differently by different evaluating physicians, yet be the same type of issue underneath. The Human Phenotype Ontology (HPO), which I’ve covered a couple of times lately, enables this.

They have 3 levels of privacy settings included: private, matchable (where you can find it in a search, but it’s not wide open to everyone), and public.
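To make the matchmaking idea a bit more concrete, here is a minimal sketch in Python. To be clear, this is not PhenomeCentral's algorithm, which I haven't seen described in detail. Everything here is an assumption for illustration: the record layout, the function names, the cutoff, and the choice of plain Jaccard overlap between sets of HPO-style term IDs. A real system would also use the ontology structure to link related terms, which this toy version does not do. It does show how the privacy level could gate what a search is allowed to see.

```python
from enum import Enum


class Privacy(Enum):
    PRIVATE = "private"      # never returned by a search
    MATCHABLE = "matchable"  # can be found through matching
    PUBLIC = "public"        # open to everyone


def jaccard(a, b):
    """Overlap between two sets of phenotype terms, from 0.0 to 1.0."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_matches(query_terms, records, min_score=0.3):
    """Return (record id, score) pairs for searchable records, best first."""
    matches = []
    for record in records:
        if record["privacy"] is Privacy.PRIVATE:
            continue  # private records are skipped entirely
        score = jaccard(query_terms, record["terms"])
        if score >= min_score:
            matches.append((record["id"], score))
    return sorted(matches, key=lambda m: m[1], reverse=True)


# Hypothetical patient records; the term IDs are just illustrative strings.
records = [
    {"id": "P1", "privacy": Privacy.PUBLIC,
     "terms": {"HP:0001250", "HP:0000252"}},
    {"id": "P2", "privacy": Privacy.MATCHABLE,
     "terms": {"HP:0001250", "HP:0001263", "HP:0000252"}},
    {"id": "P3", "privacy": Privacy.PRIVATE,
     "terms": {"HP:0001250", "HP:0001263"}},
]

result = find_matches({"HP:0001250", "HP:0001263"}, records)
print([record_id for record_id, _ in result])  # prints ['P2', 'P1']
```

Note that P3 would be the closest match on terms alone, but it never shows up because it is marked private.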

So although I used the GA4GH talk as a launching point to learn more about the features and conceptual parts of the PhenomeCentral software, I also came across this other webinar that was more specific about the software features (which is what I typically prefer for our tips, the specific software tools). The Genetic Alliance is a patient-centric group interested in answers for genetic and genome-variant medical situations, actively working with advocacy groups and researchers to bridge the needs of both. In their webinar series last year they included PhenomeCentral.

What I didn’t realize from the GA4GH overview was that there are additional tools, including a pedigree tool in the PhenoTips part. We find a lot of people find our blog searching for pedigree tools, so I wanted to be sure to mention that specifically. You can try it out by entering fake data in the playground over there, and accessing the Pedigree Tool from that record. This was also handy for me because I didn’t create a login for the main PhenomeCentral site due to the privacy issues.

So have a look at PhenomeCentral. And from the GA4GH video I learned that there is a special journal issue coming up in the fall that will have papers related to these projects. So I’ll link to the PhenoTips publication below now, but when more references become available for this tool or project I’ll add them in. I expect there will be metrics about algorithms in use and other technical details that are important for fully evaluating the tool.

Quick links:

PhenomeCentral: https://phenomecentral.org/

PhenoTips: https://phenotips.org/ (has the playground + pedigree tool)

GA4GH videos: http://genomicsandhealth.org/news-events/events/june-10th-meeting-presentations

References:
Girdea, M., Dumitriu, S., Fiume, M., Bowdin, S., Boycott, K., Chénier, S., Chitayat, D., Faghfoury, H., Meyn, M., Ray, P., So, J., Stavropoulos, D., & Brudno, M. (2013). PhenoTips: Patient Phenotyping Software for Clinical and Research Use Human Mutation, 34 (8), 1057-1065 DOI: 10.1002/humu.22347

MorphoGraphX sample images, via: DOI: http://dx.doi.org/10.7554/eLife.05864.004

Video Tip of the Week: MorphoGraphX, morphogenesis in 4D

This week’s Video Tip of the Week covers a different aspect of bioinformatics than some of our other tips. But having been trained as a cell biologist, I do consider imaging software as an important part of the crucial software ecosystem. Also, since it’s a holiday week and traffic may be light in the US, I thought something really nice to look at was a good plan.

I found out about this software via ResearchBlogging, via The Node’s Anne-Lise Routier-Kierzkowska’s post about the work she and her team have done: MorphoGraphX: A platform for quantifying morphogenesis in 4D. It’s a nice overview of the kinds of things that this software can do, and what the origins were. I really like the backstory types of posts from researchers writing about their own work–go read that, I’m not going to replicate it here.

On the MorphoGraphX site, the other things they describe as features of their software include:

  • Shape extraction
  • Growth analysis
  • Signal quantification
  • Protein localization

The introductory video from their team is a nice overview. But you should definitely see their paper, which has additional video figures that show more of the features and the utility. There are several different video figures that are fascinating to watch. Really–go watch the paper–don’t print it. Paper or PDFs wouldn’t cut it for this story.

No audio for this video. Just lovely images with some text guidance. I don’t have the computing capacity to try it myself, nor do I have the stacks of images that I used to have. But there are many nice examples of what it can do. And Anne-Lise’s blog post speaks about what researchers are doing with it.

Quick link:

MorphoGraphX: http://www.mpipz.mpg.de/MorphoGraphX

Reference:

Barbier de Reuille, P., Routier-Kierzkowska, A., Kierzkowski, D., Bassel, G., Schüpbach, T., Tauriello, G., Bajpai, N., Strauss, S., Weber, A., Kiss, A., Burian, A., Hofhuis, H., Sapala, A., Lipowczan, M., Heimlicher, M., Robinson, S., Bayer, E., Basler, K., Koumoutsakos, P., Roeder, A., Aegerter-Wilmsen, T., Nakayama, N., Tsiantis, M., Hay, A., Kwiatkowska, D., Xenarios, I., Kuhlemeier, C., & Smith, R. (2015). MorphoGraphX: A platform for quantifying morphogenesis in 4D eLife, 4 DOI: 10.7554/eLife.05864

Video Tip of the Week: handy way to make citations quickly

This is not a typical tip–where we explore the features and details of bioinformatics tools. But it’s one of those handy little features that may make your life easier. It’s made mine better lately.

I had been using the ScienceSeeker citation generator system for creating citations that would then aggregate to either ScienceSeeker or ResearchBlogging. But ScienceSeeker’s model recently changed. And ResearchBlogging’s support and stability is…well, uneven. But I still would like my posts tagged with appropriate citations and DOIs so they can be found later with other tools and searches.

The helpful folks at ScienceSeeker offered this alternative strategy for quickly grabbing a citation. I’ve already used it a few times now. And I thought other science bloggers might also find this handy. Or anyone wanting a quick formatted cite. And then to just tag it with the DOI is simple. (But boy, I wish they had a version that had DOIs. Maybe I should ask for that.)

I’ve been using Google Scholar a lot lately because the collection is getting better, as the paper below notes, and it is becoming a bit more refined, with fewer nonsense items pulled in. In the past I was really upset to see detritus like “Mr. Happy’s Health News” in there. But I looked recently and Mr. Happy was gone. There were also some really terrible activist “reports” on biotechnology loaded with unsourced and incorrect information. I’ve seen less of that too, but I haven’t looked specifically for those of late.

But there have been many times I’ve been able to locate a PDF over there that has come in very handy. Yet I had never tried to use that software feature before to create the links. I’m glad it’s available. I just wish there was a version for blog posts with the links done up right. I checked with the Altmetric support pages to see what I need to have in the structure to be sure it gets counted, and here’s the suggested syntax, from their page “How do I ensure that my blog posts are picked up by Altmetric?”:

2. Always include links to the papers that you reference
If you blog a lot about research, the best way to make sure that your posts get picked up by Altmetric is to include a direct link to a scholarly article.

You can include a link to the journal in a variety of different formats, which include but are not limited to:

You can also link to datasets or objects that are hosted on figshare or Dryad Digital Repository, and these mentions will also be picked up by Altmetric. You can link to these objects using the DOI URLs, e.g., http://dx.doi.org/10.6084/m9.figshare.1167458.

I know it’s not that hard to add a DOI URL. But it is an extra step I didn’t have to do with the sciblogging citation generators. However, I can’t see an obvious place to offer suggestions or contact the developers. If anyone knows how to reach that team, let me know.
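In the meantime, the extra step is easy enough to script. Here is a small sketch in Python (the function names are my own, not part of any tool mentioned here) that turns a DOI into the kind of DOI URL shown in the Altmetric example above and tacks it onto a formatted citation. It assumes the link prefix is simply http://dx.doi.org/ followed by the DOI.

```python
def doi_url(doi):
    """Turn a DOI such as '10.1000/xyz' or 'DOI: 10.1000/xyz' into a DOI URL."""
    doi = doi.strip()
    # Drop a leading "doi:" label if one was copied along with the DOI.
    if doi.lower().startswith("doi:"):
        doi = doi[len("doi:"):].strip()
    return "http://dx.doi.org/" + doi


def cite_with_link(citation, doi):
    """Append the DOI URL to an already formatted citation string."""
    return f"{citation} {doi_url(doi)}"


citation = ("Harzing, A. (2013). A longitudinal study of Google Scholar "
            "coverage between 2012 and 2013 Scientometrics, 98 (1), 565-575")
print(cite_with_link(citation, "DOI: 10.1007/s11192-013-0975-y"))
```

The printed line ends with http://dx.doi.org/10.1007/s11192-013-0975-y, which is the part that a link-based tracker can pick up.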

Quick link:

Google Scholar: https://scholar.google.com/

Reference:

Harzing, A. (2013). A longitudinal study of Google Scholar coverage between 2012 and 2013 Scientometrics, 98 (1), 565-575 DOI: 10.1007/s11192-013-0975-y