Ok, it’s been a while since this was a regular feature. But I am still finding that I want to show some videos of science topics and software tools sometimes. So it may not be a regular feature, but I will be highlighting some videos that seem interesting to me for various reasons.
This video struck me because I recently gave a talk about the information from ancestral genomes and the influence of the DNA on us today (as well as how we visualize that). They use software that we’ve talked about before, PolyPhen and SIFT, in this analysis. And it would have been handy to have this as a resource to give out to the audience members, who were general public folks in a pub. I am impressed that a research team took this additional step of explaining their research in this way.
Dannemann, M., Prüfer, K., Wagner, A., & Kelso, J. (2017). Functional implications of Neandertal introgression in modern humans Genome Biology 18:61. DOI: 10.1186/s13059-017-1181-7
This week’s video tip was prompted by a couple of things. First, it was a tweet from FlyBase, about their video channel. It’s been a while since we’d done a FlyBase Tip of the Week, so that was enough of a reason.
But it also came just after I was thinking about the importance of the model organisms, based on the campaign by the model organism databases to save their funding. Here’s a tweet from the yeast genome database (SGD) with a plea:
I have a soft spot for model organisms, not only because of the tremendous amount of great biology they’ve provided. I was a postdoc at The Jackson Lab, and I am acutely aware of how crucial it is to have the depth of the species specialists involved in creating and maintaining the resources that are appropriate for their organism. But it’s more than just institutional knowledge and data, of course. It’s also the importance of the community of researchers working on that organism, supporting them and having their needs met in many ways with species-specific resources.
So this week’s tip highlights some features of the FlyBase tools, as a way to remind folks of the great work that’s going on at model organism databases (MODs).
NHGRI/NIH has recently advanced a plan in which the MODs will be integrated into a single combined database, along with a 30% reduction in funding for each MOD (see also these Nature and Science news stories). While increased integration will present many advantages, the plan will result in a loss of critical organism-specific datasets. The funding cut will also cripple core functions such as high quality literature curation and genome annotation, degrading the utility of the MODs. Given the large number of scientists that this policy change would affect and the importance of their work, this is a matter of extreme concern.
I have always shouted about the importance of high-quality curation. It’s so undervalued, but it’s only more and more crucial now that we are getting so much sequence data and we need the best existing knowledge to help guide us through it. Now is not the time to cut back on curation.
So if you have valued MOD data and community sites, please consider signing on to the letter of support.
Lately I’ve been keeping an eye on a lot of the tools that link individuals with sequence data, their phenotypes, and researchers/physicians who may either study or treat the associated medical issues (see MyGene2 most recently). But there’s a lot of room upstream of these kinds of patient outcomes to explore genotypes and phenotypes. This week’s Video Tip of the Week is for Genonets, offering “Analysis and Visualization of Genotype Networks”, a tool that can help to explore these relationships for pre-clinical/research scenarios as well.
A recent paper explains the goals behind their tools, and they also have a series of videos on their web site to help you get going with Genonets. I’ll put the intro video here, but be sure to click over to their “Learn Genonets” page for a lot more. There’s also a text-based tutorial you can work through which is helpful.
You can also kick the tires a bit with a sample file that’s available from their search page. Just click the checkbox to load it up and try it out. And then be sure to explore those “deep dives” videos to go further.
Khalid, F., Aguilar-Rodríguez, J., Wagner, A., & Payne, J. (2016). Genonets server—a web server for the construction, analysis and visualization of genotype networks Nucleic Acids Research DOI: 10.1093/nar/gkw313
The team at UCSC Genome Browser continues to update their resources and offer new ways to find and visualize features of interest to researchers. One of the newer features is the “multi-region” option. When it was first launched, I did a tip on how to use that, with some of the things that I noticed while I was testing it pre-launch. But now the folks at UCSC have their own video on the exon-only display that you might also find useful.
One of the things that is illustrated here is how the exon-only mode is handy to enhance your exploration of RNA-Seq data. It also uses a great ENCODE data set as an example, and if you haven’t been using that collection it’s a good reminder of the kinds of things you can find in that resource still. And this extensive data set shows how much easier it is to look at different isoforms in the data in this new exon-only mode.
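To make the idea behind an exon-only view concrete, here is a minimal sketch of how a genomic position could be mapped into a coordinate space with the introns removed. This is an illustration only and not how the UCSC Genome Browser implements the feature: the `to_exon_only` function, the half-open interval convention, and the three-exon example are all assumptions, and any padding that a real display shows around exons is ignored.

```python
from typing import List, Optional, Tuple

Exon = Tuple[int, int]  # half-open interval [start, end) in genomic coordinates


def to_exon_only(pos: int, exons: List[Exon]) -> Optional[int]:
    """Map a genomic position to its offset in an introns-removed view.

    Returns None when the position falls in an intron or outside the exons.
    Assumes the exons are sorted by start and do not overlap.
    """
    offset = 0
    for start, end in exons:
        if start <= pos < end:
            return offset + (pos - start)
        offset += end - start  # move past this exon's length
    return None


# Hypothetical three-exon transcript (made-up coordinates).
exons = [(100, 200), (1000, 1100), (5000, 5050)]
print(to_exon_only(150, exons))   # 50
print(to_exon_only(1000, exons))  # 100
print(to_exon_only(500, exons))   # None (intronic)
```

In this toy example the three exons span almost 5,000 bases of genome but only 250 bases of exon-only space, which is why isoform differences become so much easier to see side by side.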
So have a look at this display option if you haven’t before, especially how it can help you to see transcript differences. If you aren’t familiar with the ENCODE data that’s being used, you can also see our training on that which will help you to understand how to use that data and the filtering features that are also used in this video.
Special note: I have updated the UCSC Intro slides to include the new Gateway strategies as well. So download those slides for the latest look.
Disclosure: UCSC Genome Browser tutorials are freely available because UCSC sponsors us to do training and outreach on the UCSC Genome Browser.
Speir, M., Zweig, A., Rosenbloom, K., Raney, B., Paten, B., Nejad, P., Lee, B., Learned, K., Karolchik, D., Hinrichs, A., Heitner, S., Harte, R., Haeussler, M., Guruvadoo, L., Fujita, P., Eisenhart, C., Diekhans, M., Clawson, H., Casper, J., Barber, G., Haussler, D., Kuhn, R., & Kent, W. (2016). The UCSC Genome Browser database: 2016 update Nucleic Acids Research, 44 (D1) DOI: 10.1093/nar/gkv1275
The ENCODE Project Consortium (2012). An integrated encyclopedia of DNA elements in the human genome Nature, 489 (7414), 57-74 DOI: 10.1038/nature11247
Last week’s tip encouraged people to think about how their DNA may be used by various stakeholders. This could be researchers, physicians, pharmaceutical companies, and so on. But one thing it didn’t really cover–now that I think of it–was connecting with other families who may share variations that impact the health of someone in their household. If there isn’t research, or treatment, this connection alone might be worth it for some families.
The site is MyGene2, and it can help unrelated folks who have the same genome challenges connect with each other. It also can connect folks to researchers interested in the topic. But it does seem to be aimed more specifically at families seeking each other. It was recently awarded a chance to compete for the final prize in the Open Science Prize effort, and they got a funding boost to keep going.
There is a “welcome” video that they’ve made, but it’s light on the software details. Still, though, I wanted to share the information so families may find it, and researchers may want to know about this resource as well. The video isn’t embeddable, though, so you’ll have to click to view it:
You can learn more about the resources from their FAQ collection. I’ve found a couple of references (below) that provide some further information about the project [Note: the Genetics in Medicine one goes to a paywalled piece–but you can access the pre-print version PDF at bioRxiv]. As more and more families who are seeking answers will have sequencing information available, they’ll need a place to go with that. I hope they find each other, and find answers.
Chong, J., Yu, J., Lorentzen, P., Park, K., Jamal, S., Tabor, H., Rauch, A., Saenz, M., Boltshauser, E., Patterson, K., Nickerson, D., & Bamshad, M. (2015). Gene discovery for Mendelian conditions via social networking: de novo variants in KDM1A cause developmental delay and distinctive facial features Genetics in Medicine DOI: 10.1038/gim.2015.161
The Global Alliance for Genomics and Health (GA4GH) has come up a few times on our blog. The last time we highlighted them for a tip, it was about their Beacon tool. The idea of the Beacon is that it could interrogate a database but in a very subtle way, without needing access to the entire sequence information of a patient. It would ask a simple yes/no question about a given sequence variant–and if a “yes” came back, then a researcher could go through the process of getting proper access to protected patient data.
So it was a way to keep people from pawing through data that they don’t need. And yet it could still connect people who might benefit from research, with researchers who need information.
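The yes/no idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the GA4GH Beacon API: the `ToyBeacon` class, the `Variant` tuple, and the example variants are all hypothetical, and a real Beacon is a web service with its own request and response format.

```python
from typing import NamedTuple, Set


class Variant(NamedTuple):
    """A variant described by position and alleles (hypothetical minimal form)."""
    chrom: str
    pos: int
    ref: str
    alt: str


class ToyBeacon:
    """Holds variants privately and answers only presence/absence questions."""

    def __init__(self, variants: Set[Variant]) -> None:
        # The underlying records are never handed back to the caller.
        self._variants = frozenset(variants)

    def has_variant(self, chrom: str, pos: int, ref: str, alt: str) -> bool:
        """Return True if this exact allele is in the collection, else False."""
        return Variant(chrom, pos, ref.upper(), alt.upper()) in self._variants


# Made-up example data, not real patient variants.
beacon = ToyBeacon({Variant("1", 12345, "A", "G"), Variant("7", 67890, "C", "T")})
print(beacon.has_variant("1", 12345, "a", "g"))  # True
print(beacon.has_variant("1", 12345, "A", "T"))  # False
```

The only thing that crosses the boundary is a boolean. Who carries the variant, and everything else about their sequence, stays behind the access process.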
But certainly issues of patient or donor privacy are hot topics. More and more data will come in from large projects, or from diagnostic samples, and cancer vs normal tissue comparisons, and we are going to struggle with the access vs. privacy matters for a while. The general public is only now becoming aware of the impacts. But we certainly need people to understand and we’ll want them to contribute to expanding our knowledge about health and disease.
That’s why the folks associated with GA4GH, the Wellcome Trust, and the Wellcome Genome Campus are eager to engage the public on their feelings on use of genomic sequence data. They have launched a project called “Your DNA Your Say”[PDF], in the form of a survey with videos to help understand where people are on this issue. Here’s the intro video to entice you to answer the survey:
I answered the survey because I do have concerns about access to information that will help us drive the science forward, as well as about the potential for misuse of the information. But I would like them to hear from as many people as possible, so that we can understand the barriers to research and donation that are looming. Have your say. And spread the word.
You can learn more about their ideas in a variety of publications–I’ll link to one below, but there are other publications and more details about the overall projects and individual tools at the GA4GH web site.
Lawler, M., Siu, L., Rehm, H., Chanock, S., Alterovitz, G., Burn, J., Calvo, F., Lacombe, D., Teh, B., North, K., Sawyers, C., et al. (2015). All the World’s a Stage: Facilitating Discovery Science and Improved Cancer Care through the Global Alliance for Genomics and Health Cancer Discovery, 5 (11), 1133-1136 DOI: 10.1158/2159-8290.CD-15-0821
There are many tools at NCBI, with a huge range of functions. Literature, sequence data, variations, protein structure, chemicals and bioassays, and more. It’s hard to keep track of what’s available. Their video tutorials are helping me to be aware of new tools, and new features within existing tools. For this week’s Tip of the Week, we’ll look at their recent video for ProSplign. It’s a tool that will help you align protein information to genomic sequences.
Although the Genome Workbench itself has been around for a while (we first featured it as a tip in 2013), it is constantly under development, and new features are available regularly. And although this tip focuses on how to use the ProSplign piece, if you haven’t used the Workbench much it will help you to understand how a number of tools within it can be accessed. You can also see that Splign is available in the tool list–which is another NCBI tool for a similar type of process, but with mRNA sequences as the focus.
If you want to have a text-based type of walk-through instead, there is a page that will take you through the features (see the quick links below). And there are other videos that will help you to explore the Genome Workbench features as well–there’s a handy special playlist of just those videos. Subscribe to their YouTube channel for notices of their new items.
This week’s Video Tip of the Week is actually a whole bunch of videos. Although I’ll highlight one here as our tip, there are many great talks from the recent JGI Genomics of Energy & Environment meeting. Although typically we focus on specific software tools for our tips, I think this is a nice case of also looking at the type of research done with the tools.
This is a nice example of how to make a meeting accessible for a lot of people as well, using multiple strategies. The video channel, a Storify, dropboxes of slides (below), and the agenda details can help you to decide what might be relevant for your work. For example, we’ve talked about Docker, but you can now see how it’s deployed by the folks who are talking about it here. There’s a talk with Phytozome. And much more.
For today I’ll highlight MetaSUB as one of the projects from the Mason lab. The Mason lab has participated in projects you probably heard about in the media–including swabbing the NYC subway system. You can see that data at PathoMap. MetaSUB stands for Metagenomics & Metadesign of Subways and Urban Biomes, a data collection effort coming up soon. It’s a global swabbing festival of the 10 busiest subways in the world (including my own–I wonder if I can do the station in my neighborhood?), to get more geospatial metagenomics maps, find antimicrobial resistance markers, and look for new biosynthetic gene clusters. It will be held on June 21, 2016–the summer solstice. It will tell us way more about our urban environments than we currently know. Maybe too much. But it’s a great idea, sure to reveal things we don’t know about our lived environment right now.
And here are the slides for the talk, as promised in the video. Mason tweets them:
He seriously did get through those 138 slides in 30 minutes. I was skeptical when I downloaded them before watching through them with the talk–but he really managed it. I was kind of out-of-breath just watching it.
He also talked about extreme environment sampling, and MetaPhlAn2 and HUMAnN2 analyses, in a later segment. The whole thing is an excellent and breezy discussion of real-world genomics and a lot of appealing stories that the public would connect with. They are also doing educational outreach with an HTGAA course (How To Grow Almost Anything). There’s some really fun stuff with the Gowanus canal (seriously), and so much opportunity just hanging around in our cities. But also–what’s growing in space. They are working on space station mold. And astronauts–the NASA twins. They are also sending up a MinION (which they checked to see would work in microgravity–see paper below).
It was a very engaging talk. From an apparently very busy guy.
Afshinnekoo, E., Meydan, C., Chowdhury, S., Jaroudi, D., Boyer, C., Bernstein, N., Maritz, J., Reeves, D., Gandara, J., Chhangawala, S., Ahsanuddin, S., Simmons, A., Nessel, T., Sundaresh, B., Pereira, E., Jorgensen, E., Kolokotronis, S., Kirchberger, N., Garcia, I., Gandara, D., Dhanraj, S., Nawrin, T., Saletore, Y., Alexander, N., Vijay, P., Hénaff, E., Zumbo, P., Walsh, M., O’Mullan, G., Tighe, S., Dudley, J., Dunaif, A., Ennis, S., O’Halloran, E., Magalhaes, T., Boone, B., Jones, A., Muth, T., Paolantonio, K., Alter, E., Schadt, E., Garbarino, J., Prill, R., Carlton, J., Levy, S., & Mason, C. (2015). Geospatial Resolution of Human and Bacterial Diversity with City-Scale Metagenomics Cell Systems, 1 (1), 72-87 DOI: 10.1016/j.cels.2015.01.001
Alexa B.R. McIntyre, Lindsay Rizzardi, Angela M Yu, Gail L. Rosen, Noah Alexander, Douglas J. Botkin, Kristen K. John, Sarah L. Castro-Wallace, Aaron S. Burton, Andrew Feinberg, & Christopher E. Mason (2015). Nanopore Sequencing in Microgravity bioRxiv DOI: 10.1101/032342
However, the main gateway page was largely the familiar look. The gateway–where you begin to do most text-based or region-based queries for a species–was mostly altered only with some additional buttons and options. And an increasingly long list of species to choose from. But now–it’s time to look again. The gateway is very different today. You’ll have faster and easier access to get started when you go to the site, and new ways to engage with the data that you want to begin to access.
There are additional details on the UCSC landing page in the News area, including credits to the development team involved. The other key pieces include some relocations of the previous button options:
Note that a few browser utilities that were previously accessed through links and buttons on the Gateway page have been moved to the top menu bar:
*Browser reset: Genome Browser > Reset All User Settings
*Track search: Genome Browser > Track Search
*Add custom tracks: My Data > Custom Tracks
*Track hubs: My Data > Track Hubs
*Configure tracks and display: Genome Browser > Configure
The UCSC team has created a short intro video to the new look. That is our Video Tip of the Week:
Of course, this means we’ll need to update our slides and exercises. We like things to stabilize a bit after a rollout to be sure things are solid. But soon we’ll include the new navigation in our materials.
The underlying ways to access the particular assembly features you need for a given genome, and the data for your tracks of interest, are unchanged. So those parts of our training materials will still help you to get the most out of your searches. We’ll let you know when we’ve made the changes to the materials as well.
Speir, M., Zweig, A., Rosenbloom, K., Raney, B., Paten, B., Nejad, P., Lee, B., Learned, K., Karolchik, D., Hinrichs, A., Heitner, S., Harte, R., Haeussler, M., Guruvadoo, L., Fujita, P., Eisenhart, C., Diekhans, M., Clawson, H., Casper, J., Barber, G., Haussler, D., Kuhn, R., & Kent, W. (2015). The UCSC Genome Browser database: 2016 update Nucleic Acids Research DOI: 10.1093/nar/gkv1275
Disclosure: UCSC Genome Browser tutorials are freely available because UCSC sponsors us to do training and outreach on the UCSC Genome Browser.
As I mentioned last week, I am watching a lot of farmers on Twitter talk about this year’s North American growing season. To get a taste of that yourself, have a look at #Plant16 + wheat as a search. This is where the rubber of tractor tires and plant genomics hits the…well…rows. And just coincidentally I saw a story about this new plant genomics research tool–actually in the farming media.
expVIP stands for expression Visualization and Integration Platform. Although the emphasis here is plant data, it can be used for any species. A good summary of their project is taken from their paper (linked below):
expVIP takes an input of RNA-seq reads (from single or multiple studies), quantifies expression per gene using the fast pseudoaligner kallisto (Bray et al., 2015) and creates a database containing the expression and sample information.
And it can handle polyploid species–try that on some of the tools aimed at human genomics! They illustrate this with some wheat samples from a number of different studies. And then they use the metadata about the studies, such as tissues and treatment conditions, to show how it works with some great sorting and filtering options. They created a version of this for you to interact with on the web: Wheat Expression Browser. But you can create your own data collections with their tools, aimed at your species or topics of interest.
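To give a feel for the "quantify, then store with sample information" step described in the quote above, here is a minimal sketch. It is not expVIP's code or schema: the table names, the `load_sample` helper, and the gene and sample values are all made up for illustration. The one thing taken from kallisto itself is the column layout of its `abundance.tsv` output (`target_id`, `length`, `eff_length`, `est_counts`, `tpm`).

```python
import csv
import io
import sqlite3

# Made-up rows in the column layout of kallisto's abundance.tsv.
ABUNDANCE_TSV = (
    "target_id\tlength\teff_length\test_counts\ttpm\n"
    "geneA.1\t1200\t1021.5\t340\t12.5\n"
    "geneB.1\t900\t721.5\t15\t0.8\n"
)


def load_sample(conn, sample_id, tissue, tsv_text):
    """Store one sample's metadata and its per-transcript TPM values."""
    conn.execute("INSERT INTO samples VALUES (?, ?)", (sample_id, tissue))
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = [(sample_id, r["target_id"], float(r["tpm"])) for r in reader]
    conn.executemany("INSERT INTO expression VALUES (?, ?, ?)", rows)
    return len(rows)


# Hypothetical two-table schema: sample metadata plus expression values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (sample_id TEXT PRIMARY KEY, tissue TEXT)")
conn.execute("CREATE TABLE expression (sample_id TEXT, target_id TEXT, tpm REAL)")

n_loaded = load_sample(conn, "sample1", "leaf", ABUNDANCE_TSV)
print(n_loaded)  # 2
```

Keeping the sample metadata in its own table is what makes the later sorting and filtering by tissue or treatment possible across many studies.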
This week’s Video Tip of the Week is their sample of how this Wheat Expression Browser works. Although you see the wheat data here, it’s just an example of how it can work with any species you’d like to examine.
I followed along and tried what they were showing in the video, and I found it to be a really slick and impressive way to explore the data. The dynamic filtering and sorting was really nice. You can customize the filtering/sorting/etc for the visualizations with the metadata that’s useful to your research. So you could set the tissue types, or treatment conditions, or whatever you want–and filter around to look at the expression with those. They go on to show that their strategies to compare genes in different situations seemed to reflect known biology in disease and abiotic stress conditions.
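The kind of metadata-driven filtering and sorting described here can be sketched very simply. This is only an illustration of the idea, not how the Wheat Expression Browser does it: the `filter_and_sort` helper, the field names, and every value in `records` are invented for the example.

```python
from typing import Dict, List

# Made-up records: one expression value per sample, with its metadata.
records: List[Dict] = [
    {"gene": "geneA", "tissue": "leaf", "stress": "none", "tpm": 12.5},
    {"gene": "geneA", "tissue": "root", "stress": "drought", "tpm": 40.2},
    {"gene": "geneA", "tissue": "leaf", "stress": "drought", "tpm": 31.0},
    {"gene": "geneA", "tissue": "grain", "stress": "none", "tpm": 2.1},
]


def filter_and_sort(rows, sort_key="tpm", descending=True, **criteria):
    """Keep rows whose metadata match every criterion, then sort them."""
    kept = [r for r in rows if all(r.get(k) == v for k, v in criteria.items())]
    return sorted(kept, key=lambda r: r[sort_key], reverse=descending)


drought = filter_and_sort(records, stress="drought")
print([(r["tissue"], r["tpm"]) for r in drought])
# [('root', 40.2), ('leaf', 31.0)]
```

Any metadata field can act as a filter, so the same call works for tissue, treatment, or whatever else your own studies record.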
So their pipeline for gene matching, as well as the tools to explore and visualize RNA-Seq data, offer a great way to look at data that you might generate yourself or you could mine from existing submitted data–but that might not be well organized and available in a handy database just yet.