Lately I’ve been keeping an eye on a lot of the tools that link individuals with sequence data, their phenotypes, and researchers/physicians who may either study or treat the associated medical issues (see MyGene2 most recently). But there’s a lot of room upstream of these kinds of patient outcomes to explore genotypes and phenotypes. This week’s Video Tip of the Week is for Genonets, offering “Analysis and Visualization of Genotype Networks”, a tool that can help to explore these relationships for pre-clinical/research scenarios as well.
A recent paper explains the goals behind their tools, and they also have a series of videos on their web site to help you get going with Genonets. I’ll put the intro video here, but be sure to click over to their “Learn Genonets” page for a lot more. There’s also a text-based tutorial you can work through which is helpful.
You can also kick the tires a bit with a sample file that’s available from their search page. Just click the checkbox to load it up and try it out. And then be sure to explore those “deep dives” videos to go further.
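If the idea of a genotype network is new to you, here is a minimal sketch of the concept in Python. It assumes the common definition in which genotypes that share a phenotype are connected when they differ at exactly one position. The sequences are made up, and this is not the Genonets code or its file format, just an illustration of the kind of graph the tool builds and analyzes.

```python
from itertools import combinations
from typing import Dict, List, Set


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def genotype_network(genotypes: List[str]) -> Dict[str, Set[str]]:
    """Link every pair of genotypes that differ by a single mutation."""
    network: Dict[str, Set[str]] = {g: set() for g in genotypes}
    for a, b in combinations(genotypes, 2):
        if len(a) == len(b) and hamming(a, b) == 1:
            network[a].add(b)
            network[b].add(a)
    return network


# Made-up genotypes, all assumed to share one phenotype.
genotypes = ["AAC", "AAG", "ATG", "TTT"]
network = genotype_network(genotypes)

for genotype in genotypes:
    print(genotype, sorted(network[genotype]))
```

In this toy example "TTT" ends up with no neighbors, which is the sort of structure (connected components, isolated genotypes) that a genotype network view makes easy to see.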
Khalid, F., Aguilar-Rodríguez, J., Wagner, A., & Payne, J. (2016). Genonets server—a web server for the construction, analysis and visualization of genotype networks Nucleic Acids Research DOI: 10.1093/nar/gkw313
The team at UCSC Genome Browser continues to update their resources and offer new ways to find and visualize features of interest to researchers. One of the newer features is the “multi-region” option. When it was first launched, I did a tip on how to use that, with some of the things that I noticed while I was testing it pre-launch. But now the folks at UCSC have their own video on the exon-only display that you might also find useful.
One of the things that is illustrated here is how the exon-only mode is handy to enhance your exploration of RNA-Seq data. It also uses a great ENCODE data set as an example, and if you haven’t been using that collection it’s a good reminder of the kinds of things you can find in that resource still. And this extensive data set shows how much easier it is to look at different isoforms in the data in this new exon-only mode.
So have a look at this display option if you haven’t before, especially how it can help you to see transcript differences. If you aren’t familiar with the ENCODE data that’s being used, you can also see our training on that which will help you to understand how to use that data and the filtering features that are also used in this video.
Special note: I have updated the UCSC Intro slides to include the new Gateway strategies as well. So download those slides for the latest look.
Disclosure: UCSC Genome Browser tutorials are freely available because UCSC sponsors us to do training and outreach on the UCSC Genome Browser.
Speir, M., Zweig, A., Rosenbloom, K., Raney, B., Paten, B., Nejad, P., Lee, B., Learned, K., Karolchik, D., Hinrichs, A., Heitner, S., Harte, R., Haeussler, M., Guruvadoo, L., Fujita, P., Eisenhart, C., Diekhans, M., Clawson, H., Casper, J., Barber, G., Haussler, D., Kuhn, R., & Kent, W. (2016). The UCSC Genome Browser database: 2016 update Nucleic Acids Research, 44 (D1) DOI: 10.1093/nar/gkv1275
The ENCODE Project Consortium (2012). An integrated encyclopedia of DNA elements in the human genome Nature, 489 (7414), 57-74 DOI: 10.1038/nature11247
Last week’s tip encouraged people to think about how their DNA may be used by various stakeholders. This could be researchers, physicians, pharmaceutical companies, and so on. But one thing it didn’t really cover–now that I think of it–was connecting with other families who may share variations that impact the health of someone in their household. If there isn’t research, or treatment, this connection alone might be worth it for some families.
The site is MyGene2, and it can help unrelated folks who have the same genome challenges connect with each other. It also can connect folks to researchers interested in the topic. But it does seem to be aimed more specifically at families seeking each other. It was recently awarded a chance to compete for the final prize in the Open Science Prize effort, and they got a funding boost to keep going.
There is a “welcome” video that they’ve made, but it’s light on the software details. Still, though, I wanted to share the information so families may find it, and researchers may want to know about this resource as well. The video isn’t embeddable, though, so you’ll have to click to view it:
You can learn more about the resources from their FAQ collection. I’ve found a couple of references (below) that provide some further information about the project [Note: the Genetics in Medicine one goes to a paywalled piece–but you can access the pre-print version PDF at bioRxiv]. As more and more families who are seeking answers will have sequencing information available, they’ll need a place to go with that. I hope they find each other, and find answers.
Chong, J., Yu, J., Lorentzen, P., Park, K., Jamal, S., Tabor, H., Rauch, A., Saenz, M., Boltshauser, E., Patterson, K., Nickerson, D., & Bamshad, M. (2015). Gene discovery for Mendelian conditions via social networking: de novo variants in KDM1A cause developmental delay and distinctive facial features Genetics in Medicine DOI: 10.1038/gim.2015.161
The Global Alliance for Genomics and Health (GA4GH) has come up a few times on our blog. The last time we highlighted them for a tip, it was about their Beacon tool. The idea of the Beacon is that it could interrogate a database but in a very subtle way, without needing access to the entire sequence information of a patient. It would ask a simple yes/no question about a given sequence variant–and if a “yes” came back, then a researcher could go through the process of getting proper access to protected patient data.
So it was a way to keep people from pawing through data that they don’t need. And yet it could still connect people who might benefit from research, with researchers who need information.
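To make the yes/no idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `ToyBeacon` class, the `has_variant` method, and the variant coordinates are made up for illustration, and this is not the GA4GH Beacon API. The point is only that the caller gets back a boolean and never sees the underlying records.

```python
from typing import NamedTuple, Set


class Variant(NamedTuple):
    """A simplified variant description (hypothetical fields)."""
    chromosome: str
    position: int
    reference: str
    alternate: str


class ToyBeacon:
    """Answers presence/absence questions without exposing stored data."""

    def __init__(self, variants: Set[Variant]) -> None:
        # The variant set stays private to the beacon.
        self._variants = set(variants)

    def has_variant(self, query: Variant) -> bool:
        """Return True if the variant is present, False otherwise."""
        return query in self._variants


# Made-up variants standing in for a protected data set.
beacon = ToyBeacon({
    Variant("1", 1000, "A", "G"),
    Variant("7", 5500, "C", "T"),
})

print(beacon.has_variant(Variant("1", 1000, "A", "G")))  # present in the toy data
print(beacon.has_variant(Variant("2", 42, "G", "C")))    # absent from the toy data
```

A researcher who gets a “yes” would then go through the proper access process for the protected data, as described above.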
But certainly issues of patient or donor privacy are hot topics. More and more data will come in from large projects, or from diagnostic samples, and cancer vs normal tissue comparisons, and we are going to struggle with the access vs. privacy matters for a while. The general public is only now becoming aware of the impacts. But we certainly need people to understand and we’ll want them to contribute to expanding our knowledge about health and disease.
That’s why the folks associated with GA4GH, the Wellcome Trust, and the Wellcome Genome Campus are eager to engage the public on their feelings on use of genomic sequence data. They have launched a project called “Your DNA Your Say”[PDF], in the form of a survey with videos to help understand where people are on this issue. Here’s the intro video to entice you to answer the survey:
I answered the survey because I do have concerns about access to information that will help us drive the science forward, as well as about the potential for misuse of the information. But I would like them to hear from as many people as possible, so that we can understand the barriers to research and donation that are looming. Have your say. And spread the word.
You can learn more about their ideas in a variety of publications–I’ll link to one below, but there are other publications and more details about the overall projects and individual tools at the GA4GH web site.
Lawler, M., Siu, L., Rehm, H., Chanock, S., Alterovitz, G., Burn, J., Calvo, F., Lacombe, D., Teh, B., North, K., & Sawyers, C. (2015). All the World’s a Stage: Facilitating Discovery Science and Improved Cancer Care through the Global Alliance for Genomics and Health Cancer Discovery, 5 (11), 1133-1136 DOI: 10.1158/2159-8290.CD-15-0821
There are many tools at NCBI, with a huge range of functions. Literature, sequence data, variations, protein structure, chemicals and bioassays, and more. It’s hard to keep track of what’s available. Their video tutorials are helping me to be aware of new tools, and new features within existing tools. For this week’s Tip of the Week, we’ll look at their recent video for ProSplign. It’s a tool that will help you align protein information to genomic sequences.
Although the Genome Workbench itself has been around for a while (we first featured it as a tip in 2013), it is constantly under development, and new features are available regularly. And although this tip focuses on how to use the ProSplign piece, if you haven’t used the Workbench much it will help you to understand how a number of tools within the Workbench can be accessed. You can also see that Splign is available in the tool list–which is another NCBI tool for a similar type of process, but with mRNA sequences as the focus.
If you want to have a text-based type of walk-through instead, there is a page that will take you through the features (see the quick links below). And there are other videos that will help you to explore the Genome Workbench features as well–there’s a handy special playlist of just those videos. Subscribe to their YouTube channel for notices of their new items.
This week’s Video Tip of the Week is actually a whole bunch of videos. Although I’ll highlight one here as our tip, there are many great talks from the recent JGI Genomics of Energy & Environment meeting. Although typically we focus on specific software tools for our tips, I think this is a nice case of also looking at the type of research done with the tools.
This is a nice example of how to make a meeting accessible for a lot of people as well, using multiple strategies. The video channel, a Storify, dropboxes of slides (below), and the agenda details can help you to decide what might be relevant for your work. For example, we’ve talked about Docker, but you can now see how it’s deployed by the folks who are talking about it here. There’s a talk with Phytozome. And much more.
For today I’ll highlight MetaSUB as one of the projects from the Mason lab. The Mason lab has participated in projects you probably heard about in the media–including swabbing the NYC subway system. You can see that data at PathoMap. MetaSUB stands for a data collection effort coming up soon, the Metagenomics & Metadesign of Subways and Urban Biomes. A global swabbing festival of the 10 busiest subways in the world (including my own–I wonder if I can do the station in my neighborhood?), to get more geospatial metagenomics maps, find antimicrobial resistance markers, and look for new biosynthetic gene clusters. It will be held on June 21, 2016–the summer solstice. It will tell us way more about our urban environments than we currently know. Maybe too much. But it’s a great idea, sure to reveal things we don’t know about our lived environment right now.
And here are the slides for the talk, as promised in the video. Mason tweets them:
He seriously did get through those 138 slides in 30 minutes. I was skeptical when I downloaded them before watching through them with the talk–but he really managed it. I was kind of out-of-breath just watching it.
He also talked about extreme environment sampling, and MetaPhlAn2 and HUMAnN2 analyses, in a later segment. The whole thing is an excellent and breezy discussion of real-world genomics and a lot of appealing stories that the public would connect with. They are also doing educational outreach with a HTGAA course (How To Grow Almost Anything). There’s some really fun stuff with the Gowanus canal (seriously), and so much opportunity just hanging around in our cities. But also–what’s growing in space. They are working on space station mold. And astronauts–the NASA twins. They are also sending up a MinION (which they checked to see would work in microgravity–see paper below).
It was a very engaging talk. From an apparently very busy guy.
Afshinnekoo, E., Meydan, C., Chowdhury, S., Jaroudi, D., Boyer, C., Bernstein, N., Maritz, J., Reeves, D., Gandara, J., Chhangawala, S., Ahsanuddin, S., Simmons, A., Nessel, T., Sundaresh, B., Pereira, E., Jorgensen, E., Kolokotronis, S., Kirchberger, N., Garcia, I., Gandara, D., Dhanraj, S., Nawrin, T., Saletore, Y., Alexander, N., Vijay, P., Hénaff, E., Zumbo, P., Walsh, M., O’Mullan, G., Tighe, S., Dudley, J., Dunaif, A., Ennis, S., O’Halloran, E., Magalhaes, T., Boone, B., Jones, A., Muth, T., Paolantonio, K., Alter, E., Schadt, E., Garbarino, J., Prill, R., Carlton, J., Levy, S., & Mason, C. (2015). Geospatial Resolution of Human and Bacterial Diversity with City-Scale Metagenomics Cell Systems, 1 (1), 72-87 DOI: 10.1016/j.cels.2015.01.001
Alexa B.R. McIntyre, Lindsay Rizzardi, Angela M Yu, Gail L. Rosen, Noah Alexander, Douglas J. Botkin, Kristen K. John, Sarah L. Castro-Wallace, Aaron S. Burton, Andrew Feinberg, & Christopher E. Mason (2015). Nanopore Sequencing in Microgravity bioRxiv DOI: 10.1101/032342
However, the main gateway page was largely the familiar look. The gateway–where you begin to do most text-based or region-based queries for a species–was mostly altered only with some additional buttons and options. And an increasingly long list of species to choose from. But now–it’s time to look again. The gateway is very different today. You’ll have faster and easier access to get started when you go to the site, and new ways to engage with the data that you want to begin to access.
There are additional details on the UCSC landing page in the News area, including credits to the development team involved. The other key pieces include some relocations of the previous button options:
Note that a few browser utilities that were previously accessed through links and buttons on the Gateway page have been moved to the top menu bar:
*Browser reset: Genome Browser > Reset All User Settings
*Track search: Genome Browser > Track Search
*Add custom tracks: My Data > Custom Tracks
*Track hubs: My Data > Track Hubs
*Configure tracks and display: Genome Browser > Configure
The UCSC team has created a short intro video to the new look. That is our Video Tip of the Week:
Of course, this means we’ll need to update our slides and exercises. We like things to stabilize a bit after a rollout to be sure things are solid. But soon we’ll include the new navigation in our materials.
The underlying ways to access the particular assembly features you need for a given genome, and the data for your tracks of interest, are unchanged. So those parts of our training materials will still help you to get the most out of your searches. We’ll let you know when we’ve made the changes to the materials as well.
Speir, M., Zweig, A., Rosenbloom, K., Raney, B., Paten, B., Nejad, P., Lee, B., Learned, K., Karolchik, D., Hinrichs, A., Heitner, S., Harte, R., Haeussler, M., Guruvadoo, L., Fujita, P., Eisenhart, C., Diekhans, M., Clawson, H., Casper, J., Barber, G., Haussler, D., Kuhn, R., & Kent, W. (2015). The UCSC Genome Browser database: 2016 update Nucleic Acids Research DOI: 10.1093/nar/gkv1275
Disclosure: UCSC Genome Browser tutorials are freely available because UCSC sponsors us to do training and outreach on the UCSC Genome Browser.
As I mentioned last week, I am watching a lot of farmers on twitter talk about this year’s North American growing season. To get a taste of that yourself, have a look at #Plant16 + wheat as a search. This is where the rubber of tractor tires and plant genomics hits the…well…rows. And just coincidentally I saw a story about this new plant genomics research tool–actually in the farming media.
expVIP stands for expression Visualization and Integration Platform. Although the emphasis here is plant data, it can be used for any species. A good summary of their project is taken from their paper (linked below):
expVIP takes an input of RNA-seq reads (from single or multiple studies), quantifies expression per gene using the fast pseudoaligner kallisto (Bray et al., 2015) and creates a database containing the expression and sample information.
And it can handle polyploid species–try that on some of the tools aimed at human genomics! They illustrate this with some wheat samples from a number of different studies. And then they use the metadata about the studies, such as tissues and treatment conditions, to show how it works with some great sorting and filtering options. They created a version of this for you to interact with on the web: Wheat Expression Browser. But you can create your own data collections with their tools, aimed at your species or topics of interest.
This week’s Video Tip of the Week is their sample of how this Wheat Expression Browser works. Although you see the wheat data here, it’s just an example of how it can work with any species you’d like to examine.
I followed along and tried what they were showing in the video, and I found it to be a really slick and impressive way to explore the data. The dynamic filtering and sorting was really nice. You can customise the filtering/sorting/etc for the visualizations with the metadata that’s useful to your research. So you could set the tissue types, or treatment conditions, or whatever you want–and filter around to look at the expression with those. They go on to show that their strategies to compare genes in different situations seemed to reflect known biology in disease and abiotic stress conditions.
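If you want a feel for what metadata-driven filtering looks like in code, here is a minimal sketch in Python. It is not expVIP’s implementation or database schema: the sample records, field names (`tissue`, `treatment`, `tpm`), gene names, and values are all made up for illustration. It just shows the general pattern of selecting samples by metadata and then summarizing a gene’s expression across the selection.

```python
from statistics import mean
from typing import Any, Dict, List, Optional

# Made-up samples: metadata plus per-gene expression values.
samples: List[Dict[str, Any]] = [
    {"id": "s1", "tissue": "leaf", "treatment": "control",
     "tpm": {"geneA": 10.0, "geneB": 2.0}},
    {"id": "s2", "tissue": "leaf", "treatment": "drought",
     "tpm": {"geneA": 30.0, "geneB": 4.0}},
    {"id": "s3", "tissue": "root", "treatment": "control",
     "tpm": {"geneA": 5.0, "geneB": 8.0}},
]


def filter_samples(records: List[Dict[str, Any]], **criteria: str) -> List[Dict[str, Any]]:
    """Keep the samples whose metadata matches every given criterion."""
    return [r for r in records
            if all(r.get(key) == value for key, value in criteria.items())]


def mean_expression(records: List[Dict[str, Any]], gene: str) -> Optional[float]:
    """Average a gene's expression over the samples that report it."""
    values = [r["tpm"][gene] for r in records if gene in r["tpm"]]
    return mean(values) if values else None


leaf_samples = filter_samples(samples, tissue="leaf")
print([s["id"] for s in leaf_samples])
print(mean_expression(leaf_samples, "geneA"))
```

Swapping `tissue="leaf"` for `treatment="control"`, or combining both, is the same kind of slicing that the browser lets you do interactively.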
So their pipeline for gene matching, as well as the tools to explore and visualize RNA-Seq data, offer a great way to look at data that you might generate yourself or you could mine from existing submitted data–but that might not be well organized and available in a handy database just yet.
Over the years I’ve started to follow a lot of farmers on twitter. It might sound odd to folks who are immersed in human genomics and disease. But I actually find the plant and animal genomics communities to be pushing tech faster and further to the hands of end-users than a lot of the clinical applications are at this point in time. And as #Plant16 rolls out to feed us, there was a lot of soybean chatter in my twittersphere.
So when SoyBase tweeted a reminder about some of their videos, I thought the timing was great. They have a YouTube channel for some videos to help users access the SoyBase data. And one of the tools they illustrate is CMap. Although we’ve touched on CMap a couple of times on the blog and in our training videos, we never featured it. It’s one of the GMOD family members that can offer you comparisons of different map coordinate data sets. But conceptually I think it’s a good idea for people to think about physical map vs sequence mapping data. And this video shows how you can examine these different representations at SoyBase.
Besides their software videos, though, SoyBase also links to a lot of other videos that help people to understand more about many aspects of soybean cultivation. Check out their wide range of topics on their Video Tutorials page. You never see how to use a two row harvester at human genomics databases, do you?
I didn’t expect to do another tip on the paths through experiments or data this week. But there must be something in the water cooler lately, and all of these different tools converged on my part of the bioinformatics ecosphere. As I was perusing my tweetdeck columns, a new tool from the folks who do the Caleydo projects offered more paths through data: Pathfinder, Visual Analysis of Paths in Graphs.
This new tool offers another way to look across relationships in data sets. Finding paths through data is only getting harder with every new data set we get, yet it keeps becoming more important to pull in the characteristics of the alternate routes while still keeping the context of the overall picture. Scaling paths is hard. So the Caleydo team aims at several key aspects of the problem with their new Pathfinder tool. The full details are in the paper (cited below), but I’ll list the points for the features they deliver here:
1. Query for paths.
2. Visualize attributes.
3. Visualize group structures in paths.
4. Rank paths.
5. Visual topology context.
6. Compare paths.
7. Group paths.
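To give a rough sense of what “query for paths” and “rank paths” mean in practice, here is a minimal sketch in Python. It is not Pathfinder’s code or its ranking method: the graph is made up, the function names are hypothetical, and ranking by path length is just the simplest criterion I could pick for illustration.

```python
from typing import Dict, List

Graph = Dict[str, List[str]]


def find_paths(graph: Graph, start: str, end: str, max_edges: int = 4) -> List[List[str]]:
    """Collect all simple paths from start to end using at most max_edges edges."""
    paths: List[List[str]] = []

    def extend(path: List[str]) -> None:
        node = path[-1]
        if node == end:
            paths.append(path)
            return
        if len(path) - 1 >= max_edges:
            return  # path is already as long as allowed
        for neighbor in graph.get(node, []):
            if neighbor not in path:  # never revisit a node
                extend(path + [neighbor])

    extend([start])
    return paths


def rank_by_length(paths: List[List[str]]) -> List[List[str]]:
    """Shortest paths first; ties are broken alphabetically."""
    return sorted(paths, key=lambda p: (len(p), p))


# A made-up undirected graph, written as an adjacency list.
graph: Graph = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C"],
}

ranked = rank_by_length(find_paths(graph, "A", "D"))
for path in ranked:
    print(" -> ".join(path))
```

Even this tiny graph has four routes from A to D, which hints at why the number of paths explodes in real data and why ranking, grouping, and comparing them needs dedicated tooling.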
In addition to clever visualization and query strategies, the team always offers a nice intro video to give you a sense of what the tool can do for you. So the new video on Pathfinder is our Video Tip of the Week.
The example used is the sets of authors on publications. But it’s easy to imagine signalling pathways, or some types of sequence variation pathways, or many other kinds of paths researchers need to represent. They have a use case example in the paper of KEGG pathways. In the video, there’s a quick look at a pathway that includes copy number variations and gene expression data as attributes that may be important for understanding the paths.
Try it out. There’s a demo site available (linked below), so you can start to think about how you could use Pathfinder to analyze data that you are interested in for your research directions.
Hat tip to Alexander Lex for the notice of the new tool:
Christian Partl, Samuel Gratzl, Marc Streit, Anne Mai Wassermann, Hanspeter Pfister, Dieter Schmalstieg, & Alexander Lex (2016). Pathfinder: Visual Analysis of Paths in Graphs Computer Graphics Forum ((EuroVis ’16)) In press.