Tag Archives: UCSC Genome Browser


Video Tip of the Week: UniProt updates, now including portable BED files

UniProt is one of the core resources that provides tremendously important curated information about proteins. You will find links to UniProt in lots of other tools and databases as well, but we’ve always championed going directly there for the full range of information they offer. Their foundation remains solid, but they also continue to add new and useful features over time. Recently they had a webinar to describe some of the new things, and the recording of that webinar will be this week’s Video Tip of the Week.

The video starts with an overview of the whole UniProt site. The core of their great resource is the same, of course. UniProtKB, UniRef, and UniParc are there for various ways to look across the data. The handy Proteomes collection of the proteins in a given species is available, and they also have reference proteomes from that access point. There’s a short section in the video that’s a guide to the basic search functions.

About 9 minutes in they introduce the UniRule annotation features. When certain conditions are met, an annotation gets applied to a protein, which you can trace from the protein pages by clicking on the UniRule link for that annotation. Their software also offers a very cool way to see how and when conditions are applied. It loads a decision flow path and highlights which logic rules were used in that particular case, so you can trace it and understand how a protein got a given annotation. That’s what I illustrate in the screen shot here.

About 14 minutes in, the topic changes to the new Genome Annotation Tracks. They now offer you a way to take their annotations for a UniProtKB entry and use them with a separate genome browser. They hand you BED or BigBed files for different features. You can also load the whole thing as a Hub file to see all the sequence feature data at once. The files are species-specific, and started with human, but others are coming. You can access them from the “Downloads” area of the homepage. The video also describes a bit about the structure there. So you could take these files to Ensembl or the UCSC Genome Browser and load them, and then compare the UniProt features with the existing genomic context at those browsers. They illustrate how you can look at the “active site” annotations, but you can also look at post-translational modification sites, domains, etc. This was a feature that was new to me, and looks like a terrific idea.
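If you haven’t worked with BED files directly, here is a minimal Python sketch of what one line holds. It assumes the standard BED column order (chrom, start, end, then optional name, score, strand) and the 0-based, half-open coordinates the format uses. The sample line and feature name are made up for illustration; they are not taken from a UniProt download.

```python
from typing import NamedTuple, Optional


class BedFeature(NamedTuple):
    chrom: str
    start: int                    # 0-based, inclusive
    end: int                      # exclusive
    name: Optional[str] = None
    score: Optional[int] = None   # assumes an integer score column
    strand: Optional[str] = None


def parse_bed_line(line: str) -> Optional[BedFeature]:
    """Parse one BED line; return None for blank, comment, track, or browser lines."""
    line = line.strip()
    if not line or line.startswith(("#", "track", "browser")):
        return None
    fields = line.split()  # BED fields are separated by tabs or spaces
    if len(fields) < 3:
        raise ValueError(f"BED needs at least 3 columns, got {len(fields)}")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    name = fields[3] if len(fields) > 3 else None
    score = int(fields[4]) if len(fields) > 4 else None
    strand = fields[5] if len(fields) > 5 else None
    return BedFeature(chrom, start, end, name, score, strand)


# Hypothetical example line (not real UniProt data)
example = "chr1\t1000\t1003\texample_active_site\t0\t+"
feature = parse_bed_line(example)
print(feature.chrom, feature.start, feature.end, feature.end - feature.start)
```

Because the end coordinate is exclusive, the feature length is simply end minus start, which is handy when you compare features against other tracks.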

So even if you think you know UniProt, check out these new options for additional ways to interact with the high-quality information they provide. Good stuff.

Quick links:

UniProt: http://www.uniprot.org/


The UniProt Consortium (2014). UniProt: a hub for protein information Nucleic Acids Research, 43 (D1) DOI: 10.1093/nar/gku989


Video Tip of the Week: Send UCSC Genome Browser sequence to external tools

The folks at the UCSC Genome Browser are always adding new features, new data, and new genomes to their site. And although they use the genome-announce mailing list to get the word out, even I can miss a notice. There was news recently of a new feature associated with the graphical genome browser that I’ve been waiting for (as I had tested it prior to roll-out), but they sent the email out when I was prepping for Thanksgiving, and I didn’t catch it for a few days.

But now I’ve had a chance to see it, kick the tires more, and I want to show you how it works. The basic feature is this: you can take the genomic sequence from the window you are viewing and, with a couple of clicks, deliver it right to another tool for more helpful data or analysis. This is really handy from the graphical viewer. Before, you were limited to obtaining the sequence with the “Get DNA” option. You could get the sequence, copy/paste, and do whatever. Which is fine, and might still be right for some other tools. But now you can skip that copy/paste part and explore features of your sequence directly. If you don’t have time for the video, here’s the bullet: use the View -> In External Tools menu to get there.

View External Tools

There are a range of options. You can still jump right to NCBI Map Viewer or Ensembl to look at the same region in their browsers. But now you can select primers, see restriction sites, look at RNA characteristics, find protein domains, or examine CRISPR tools. You can look for transcription start sites or some TF binding motifs. Here’s this week’s Video Tip that shows the process.

You can still “get DNA” if you want to. And you can still use the Table Browser to send data from your custom queries to even more tools (like Galaxy and others). But we know from our workshops that most people we train spend their time in the graphic browser and these tools should help them accomplish more. You can see all of these options in our freely available training suites (linked below).

To see if there are other things you missed over the past year, check out the new NAR database issue paper from the UCSC team (linked below). It has updates about many of their accomplishments and recent features, so have a look and see if there are other useful aspects for assisting your research.

Quick links:

UCSC Genome Browser: http://genome.ucsc.edu/

Intro full training suite: http://openhelix.com/ucsc

Advanced topics training suite: http://openhelix.com/ucscadv


Speir, M., Zweig, A., Rosenbloom, K., Raney, B., Paten, B., Nejad, P., Lee, B., Learned, K., Karolchik, D., Hinrichs, A., Heitner, S., Harte, R., Haeussler, M., Guruvadoo, L., Fujita, P., Eisenhart, C., Diekhans, M., Clawson, H., Casper, J., Barber, G., Haussler, D., Kuhn, R., & Kent, W. (2015). The UCSC Genome Browser database: 2016 update Nucleic Acids Research DOI: 10.1093/nar/gkv1275

Disclosure: UCSC Genome Browser tutorials are freely available because UCSC sponsors us to do training and outreach on the UCSC Genome Browser.


Video Tip of the Week: UCSC Table Browser and Custom Tracks

This week’s video tip is longer than usual. But if you want to dig deeper into all the data that you know is coming in to the UCSC Genome Browser, you want to use the Table Browser. If you’ve only used the genome browser interface, you are missing a lot of opportunity to mine for great data.

The UCSC Genome Browser folks have been continuously adding new features and data over the years since they’ve been sponsoring our free training materials. But the look of the Table Browser hadn’t changed all that much. However, with the move to the new hg38 assembly as the default human genome, it was a good time for us to update the shots of the training materials for the Table Browser. It also gave us the chance to include track hubs, which we hadn’t covered in the previous version. And we also touch briefly on Genome Browser in a Box.

As before, we have a recorded video. And we have also included updated slides that you can use to train others. And our exercises are there for more practice, too.


We’ve also mentioned before that we have a handy paper that also can help people to get up to speed on UCSC features. Our Current Protocols paper is now un-paywalled in PMC, so if you want to supplement the video with this other piece, there’s a bit more detail on how to do hubs of your own. But you should also see the specific papers on that. And there’s more guidance in this video tip from before.

So check out the new materials for the advanced topics. And see the new preprint (linked below) for more details and upcoming features as well.

Quick links:

UCSC Genome Browser homepage: genome.ucsc.edu

Training suite: openhelix.com/ucscadv


Rosenbloom, K., Armstrong, J., Barber, G., Casper, J., Clawson, H., Diekhans, M., Dreszer, T., Fujita, P., Guruvadoo, L., Haeussler, M., Harte, R., Heitner, S., Hickey, G., Hinrichs, A., Hubley, R., Karolchik, D., Learned, K., Lee, B., Li, C., Miga, K., Nguyen, N., Paten, B., Raney, B., Smit, A., Speir, M., Zweig, A., Haussler, D., Kuhn, R., & Kent, W. (2014). The UCSC Genome Browser database: 2015 update Nucleic Acids Research, 43 (D1) DOI: 10.1093/nar/gku1177

Matthew L. Speir, Ann S. Zweig, Kate R. Rosenbloom, Brian J. Raney, Benedict Paten, Parisa Nejad, Brian T. Lee, Katrina Learned, Donna Karolchik, Angie S. Hinrichs, Steve Heitner, Rachel A. Harte, Maximilian Haeussler, Luvina Guruvadoo, Pauline A. Fujita, Christopher Eisenhart, Mark Diekhans, Hiram Clawson, Jonathan Casper, Galt P. Barber, David Haussler, Robert M. Kuhn, W. James Kent. (2015). The UCSC Genome Browser database: 2016 update bioRxiv DOI: 10.1101/027037

Mangan ME, Williams JM, Kuhn RM, & Lathe WC (2014). The UCSC Genome Browser: What Every Molecular Biologist Should Know Current Protocols in Molecular Biology., 107 (19.9), 199-199 DOI: 10.1002/0471142727.mb1909s107

Disclosure: UCSC Genome Browser tutorials are freely available because UCSC sponsors us to do training and outreach on the UCSC Genome Browser.


What’s the Answer? (keyboard shortcuts for UCSC)

This week’s solution is unusual. It didn’t come from my usual sources of questions. It came from Twitter, from the UCSC Genome Browser team. But it’s such a handy answer to a simple problem that it’s this week’s Answer post. So here’s my source of an answer to a question:

I had it in my files and forgot to try it out at first, but now I have. And I think there are several that I would probably use pretty often. But I know I use the web interface more than some people.

Anyway, try it out on some common tasks.


Video Tip of the Week: Weave, Web-based Analysis and Visualization Environment

At the recent Discovery On Target conference, a workshop on data and analytics for drug discovery contained several informative talks. This week’s Video Tip of the Week was inspired by the first speaker in that session, Georges Grinstein. Not only was the software he talked about something I wanted to examine right away (Weave)–his philosophy on visualization of data was so in line with my informal thoughts on the topic that I just connected with it immediately. But also–stay for the “living figures” down below.

Grinstein on dataviz at VIZBI.


Grinstein has been working on dataviz for a long time. And he’s been working with big data since long before big data was trendy. For some of his background and philosophy, check out this talk at a VIZBI conference. Because so many of the problems are the same across big data types, the software that he’s been working on could really be useful for the new issues facing big data in biology. But I don’t know that I’ve heard about it among the genoscenti just yet. (In this talk he also covers RadViz, a radial visualization tool that some folks might find useful. It was also mentioned in the workshop.)

One of the key things that he wanted us to take away from the workshop was that we need to offer people multiple, interactive visualizations for them to get the most out of the data. This is something I’ve been looking for quite a bit. I fell in love with an early version of the Caleydo stuff for exactly this reason. But I understand that it can be tricky.

Weave, or the Web-based Analysis and Visualization Environment, gets closer to this with super responsiveness than I’ve seen elsewhere. This week’s Video Tip is a short intro to this platform, but I’ll link you below to a longer form that you should watch if you want to dive into this tool. Here you’ll see that just by dragging a CSV file in, you can then set up a scatter plot, bar chart, parallel coordinates, a color histogram, and a table. In seconds. Really.

This brief intro doesn’t do full justice to this tool, of course. I joined the Weave-users discussion group and found a recent webinar recording that you should watch. But you’ll have to grab it from the group, because it doesn’t appear to be stored on a video platform site (search for the thread called IVPR Update on Weave Monday 3/23). It goes into more detail on the features, of course, as well as on sharing data and on reproducibility of the visualizations with the session history options.

I downloaded the Weave Desktop and ran it on my little system. I grabbed some transcription factor score data from the ENCODE project with the UCSC Table Browser, got it in csv format, pulled it in, and within seconds was looking over all the data on the X chromosome for this TFBS I was interested in. Clicking an item in my table highlighted it in my histogram. And that was just to kick the tires. According to the video, you could have had a tile of Cytoscape (because you can integrate with Cytoscape–I didn’t get that far yet though) and checked out interaction data as well. Although I mention Cytoscape because readers here probably know it, that’s just one of the linkable tools. R is embedded, and other stats tools, and you can modify your scripts right from Weave. Some of these additional features may be part of the Analyst Workstation sub-project. I couldn’t always tell which tool had which features in my early explorations.
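If you’d rather trim a Table Browser export before dragging it into Weave, something like this Python sketch would do it. The column names (`chrom`, `score`) and the sample rows are assumptions for illustration; the header in your own export depends on the table and fields you selected.

```python
import csv
import io


def filter_rows(csv_text, chrom="chrX", min_score=0.0):
    """Keep rows on one chromosome whose score is at least min_score.

    Assumes the CSV has a header row with 'chrom' and 'score' columns.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row
        for row in reader
        if row["chrom"] == chrom and float(row["score"]) >= min_score
    ]


# Invented sample rows, standing in for a real export
sample = (
    "chrom,chromStart,chromEnd,name,score\n"
    "chrX,100,200,a,500\n"
    "chr1,5,10,b,900\n"
    "chrX,300,400,c,100\n"
)
for row in filter_rows(sample, chrom="chrX", min_score=200):
    print(row["name"], row["chromStart"], row["chromEnd"], row["score"])
```

Filtering first keeps the file small, though Weave handled a whole chromosome of data for me without complaint.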

But if there’s one thing I’d like you to do after reading this post (if you read this far), it’s to look at this paper that is just out. As I was noodling on Weave, I thought to myself that it was PERFECT to create the kind of “living figures” that I want to see in more papers. Now go see Dynamic Data Visualization with Weave and Brain Choropleths. I don’t care if you aren’t interested in brain choropleths, go look at the figures. In each one, there’s a link to a Weave demo, like this:

Weave demo PLOS

Click on those demos to load them. You can be interacting with the data on the brain maps, with pre-set Weave tiles of different features of the data set for you. Open the gears icons to change the settings. Now imagine this with gene expression maps in C. elegans bodies. Or with transcription factors and scores in mouse embryos. Or Venns with big piles of GO terms (but what I really want there is UpSet anyway). Or any of a dozen other types of data we get in big data papers now that are really impossible to explore in traditional publication format. I want this for genomics papers in the future, okay?

This software has a lot of potential for analysis, visualization, and sharing of data. I can’t cover it all in a brief blog post. The Weave team has thought carefully about sharing with colleagues, reusable templates, and provenance of data, and all this is built right into this tool. If you are analyzing data for others, you can set up dashboards for them to see specific views. See their help and info docs for more details, and check out the longer videos in the forum. I think it would connect with a lot of people, and could benefit the genomics community greatly. Have a look. I think you’ll like it.

Quick links:

Weave: http://iweave.com/

GitHub: https://github.com/WeaveTeam

Weave-users discussion: https://groups.google.com/forum/#!forum/weave-users

Weave desktop: http://info.oicweave.org/projects/weave/wiki/Installer

More videos, Weave IVPR channel: https://www.youtube.com/channel/UCXJrO9cug3c7B7eRJSwZ4vQ


Patterson, D., Hicks, T., Dufilie, A., Grinstein, G., & Plante, E. (2015). Dynamic Data Visualization with Weave and Brain Choropleths PLOS ONE, 10 (9) DOI: 10.1371/journal.pone.0139453

Daniels, K., Grinstein, G., Russell, A., & Glidden, M. (2012). Properties of normalized radial visualizations Information Visualization, 11 (4), 273-300 DOI: 10.1177/1473871612439357


Video Tip of the Week: UCSC Xena System for functional and cancer genomics

When we go out and do workshops, we get a lot of requests from researchers who would like some guidance on cancer genomics tools. Our particular mission has been to aim more broadly at tools that are of wide interest and not to focus on a particular disease or condition area. But certainly the cancer genomics arena is one with a great deal of opportunity for bioinformatics-based outcomes in the near term. So I keep an eye out for tools researchers may want to explore.

When the “genomics” twitter column in my Tweetdeck dropped this new mini-review of cancer genomics tools on my desktop, I went to look right away: Data mining The Cancer Genome Atlas in the era of precision cancer medicine. TCGA is the focus of the data source they are talking about, but the tools included may have more data sets and wider utility, of course. Most of the tools described were familiar to me (cBioPortal, GDAC Firehose, UCSC Cancer Genomics Browser, canEvolve), but a couple of them were new. I had never explored the ProGeneV2 tools before. And the UZH Cancer Browser was also new to me.

Comparison of cancer genomics tools, via: Swiss Med Wkly. 2015;145:w14183


One thing that’s very helpful to me is the kind of table they provided as Table 2. It’s a comparison of the main tools they are discussing, with different features of each compared. That’s handy for choosing the tool to spend time on, depending on your own research needs.

But they also referred to another tool that was new to me, Xena. “The UCSC cancer browser will be updated in the future, with the new Xena platform for visualisation and integration with Galaxy“. I can never resist new genomics visualization tools, and as a giant fan of Galaxy, I certainly need to know more about this.

So I went to look around for some information on it, and their introductory video is this week’s Tip of the Week.

So Xena is designed to let you combine your own data with data from large public resource collections, without your data leaving your firewall and without the onerous task of pulling down all the public data and managing it locally. You can explore functional genomics data and related phenotype and clinical data. It uses the “hubs” strategy that is becoming increasingly adopted as a way to integrate across data collections. We were just talking about hubs in another recent tip if these are new to you. It supports a wide range of data types to examine and visualize. If you want to go deeper, there’s a lot more information over at the Xena homepage. They have documentation, presentation slides, and a step-by-step demo available from a recent workshop.

Certainly one of the key features appears to be that you can integrate your own research data–which might be subject to strict privacy regulations–on your own computer with all the other key information from public data providers. Increasingly researchers I talk to at workshops need this aspect very much.

So try out Xena, and explore the other tools in the cancer genomics space, to see what’s right for your research.

Hat tip to Oscar:

And you can follow Xena on twitter for news and updates: https://twitter.com/UCSCXena

Quick links:

Xena: http://xena.ucsc.edu/


Cline, M., Craft, B., Swatloski, T., Goldman, M., Ma, S., Haussler, D., & Zhu, J. (2013). Exploring TCGA Pan-Cancer Data at the UCSC Cancer Genomics Browser Scientific Reports, 3 DOI: 10.1038/srep02652

Cheng PF, Dummer R, & Levesque MP (2015). Data mining The Cancer Genome Atlas in the era of precision cancer medicine. Swiss Med Wkly, 145 DOI: 10.4414/smw.2015.14183


UCSC Genome Browser, default human genome changed

This has gone out over the announcement mailing list, and is also on their web site. But in case you aren’t checking those, it seemed important to get people to see it.

UCSC Genome Bioinformatics

14 September 2015 — Human Genome Browser default changed to GRCh38/hg38

In conjunction with the release of the new 100-species Conservation track on the hg38/GRCh38 human assembly, we have now changed the default human browser on our website from hg19 to hg38. This should not affect your current browsing sessions; if you were last looking at the hg19 (or older) browser, the Genome Browser will continue to display that assembly for you when you start it up. There are circumstances, however, in which the selected assembly can switch to the newer version. For instance, the assembly will switch to hg38 if you reset your browser defaults. If you find yourself in a situation where some of your favorite browser tracks have “disappeared”, you may want to check that you’re viewing the right assembly.

We will continue our efforts to expand the annotation track set on the hg38 browser to include many of the tracks present on previous human assemblies. In cases where it makes sense, data may be simply “lifted” from hg19 using migration tools. In many instances, however, we must rely on our data providers to generate new versions of their data on the latest assembly. We will publish these data sets as they become available.

For a summary of the new features in the GRCh38 assembly, see the overview we published in March 2014.


What’s the Answer? (cancer data visualization tools)

This week’s highlighted question was less of a question than a notice about a new tool. And because I’m always interested in exploring new visualization tools, I was interested to have a look. In addition, we are frequently asked about tools specific for cancer genomics, and I like to be able to tell people about what I’ve found.

Biostars is a site for asking, answering and discussing bioinformatics questions and issues. We are members of the community and find it very useful. Often questions and answers arise at Biostars that are germane to our readers (end users of genomics resources). Thursdays we will be highlighting one of those items or discussions here in this thread. You can ask questions in this thread, or you can always join in at Biostars.

So here’s this week’s highlighted question (well, the question mark was edited out later, but this is what it said when I grabbed it):

Tool: A new tool for cancer researcher developed by UCSC?

Another tool XENA that comes to the world of Bioinforamatics designed by UCSC. I am not sure if people are aware of it. It is developed recently and I got the notification last evening. Seems to be having a lot of potential for both data visualization and also producing quality images for publication. The paper is not yet out but the mention of the tool was done last year when the update paper of the UCSC Cancer Genomics browser was made for 2015.

The tool has got data from both TCGA and ICGC and is a powerful resource not only for public data comparing and viewing but also one can upload its own data or download the tool locally to a desktop app version and visualize it. The tool is available at the below link


The technical doc is here . Am sure it will be a great resource for the researchers and bioinformaticians across the globe. For analysis it also integrates the galaxy as well and if you format your data in a version as mentioned in help docs one can view their data as well. Enjoy and appreciate the work. Hope people would like it.


I am not involved in the work. I liked the tool a lot so thought of informing it to the community


It also generated a bit of discussion about the challenges of developing visualizations. Go have a look.


Video Tip of the Week: UCSC features for ENCODE data utilization

As noted in last week’s tip about the ENCODE DCC at Stanford, there was a workshop recently for the ENCODE project. There were a lot of folks speaking and a big room full of attendees. You should check out the full agenda and the playlist at the NHGRI site for all the videos, slides, and handouts: ENCODE 2015: Research Applications and Users Meeting.

This week I’m highlighting another video from this event. In this one, Pauline Fujita from the UCSC Genome Browser covers ways to work with ENCODE data in their browser.

Some of the talk includes intro stuff for brand new users, because there were certainly some in this workshop. If you are new to the tools, too, you can also see our free tutorial suites (below). Pauline also quickly highlights their Genome Browser in a Box virtual machine option for folks who have privacy sensitive or protected data, but only briefly. If you want some more info on that, check out our Tip of the Week on GBIB.

But soon she covers more detail on features like track hubs and how to use those (if you want to jump to that part, it begins around 20min). That extra search for items in the Track Hub is really good to know about. Also, there’s some guidance here on the types of file formats that you may want to use to structure your data, and on why you might want BED vs Wiggle, for example. For the part that addresses these formats, jump to about 33min.

Towards the end there’s coverage of the Data Integrator. The idea with this feature is that maybe you’ve got some information on a region and you have this structured as a BED file–or a number of regions–and you want to find out what else is going on in those regions. The Data Integrator can help you with that by finding overlaps among different tracks of data (around 45min). The Variant Annotation Integrator does kind of a similar thing, but for VCF files with variation information (~48min). A smidge more guidance on track hubs comes in at 50min.
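The overlap idea behind the Data Integrator is easy to sketch. To be clear, this is not UCSC’s implementation, just a plain Python illustration that assumes BED-style 0-based, half-open intervals. The regions, track items, and labels are invented.

```python
from typing import List, Tuple

# (chrom, start, end, label), with start inclusive and end exclusive
Interval = Tuple[str, int, int, str]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def find_overlaps(regions: List[Interval], track: List[Interval]) -> List[Tuple[str, str]]:
    """Return (region label, track label) pairs for every overlap.

    A simple all-against-all comparison; fine for small lists, but a real
    tool would use an index for genome-scale data.
    """
    return [(r[3], t[3]) for r in regions for t in track if overlaps(r, t)]


# Invented example data
my_regions = [("chr1", 100, 200, "region1"), ("chr2", 50, 60, "region2")]
some_track = [
    ("chr1", 150, 160, "itemA"),  # inside region1
    ("chr1", 200, 300, "itemB"),  # starts where region1 ends, so no overlap
    ("chr2", 10, 55, "itemC"),    # overlaps the start of region2
]
for region_label, item_label in find_overlaps(my_regions, some_track):
    print(region_label, item_label)
```

Note that with half-open coordinates, an item that starts exactly where your region ends does not count as overlapping.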

In our paper for Current Protocols (which is now in PubMedCentral), we talk a bit about the hubs structure too. So if it runs too quickly at the end, our paper shows some of that detail pretty much the same way. That might help you to think about how to structure them if the concept is new to you. But if you are ready to dive in, there’s a paper specifically about hubs. And there’s also more background on the browser’s tools and in the NAR database issue papers. There’s a lot of ENCODE data available to mine, and I really hope more folks can use the tools to find new insights into genomic regions they are interested in.

Quick links:

Track hubs: http://genome.ucsc.edu/cgi-bin/hgHubConnect

Data Integrator: http://genome.ucsc.edu/cgi-bin/hgIntegrator

Variant Annotation Integrator: http://genome.ucsc.edu/cgi-bin/hgVai

ENCODE features at UCSC: http://genome.ucsc.edu/ENCODE

UCSC tutorial suites:

UCSC Intro Tutorial suites (video, with our free slides + exercises): http://www.openhelix.com/ucscintro

UCSC Advanced Tutorial suites (video, slides, exercises): http://www.openhelix.com/ucscadv


Mangan ME, Williams JM, Kuhn RM, & Lathe WC (2014). The UCSC Genome Browser: What Every Molecular Biologist Should Know Current Protocols in Molecular Biology., 107 (19.9), 199-199 DOI: 10.1002/0471142727.mb1909s107

Rosenbloom, K., Armstrong, J., Barber, G., Casper, J., Clawson, H., Diekhans, M., Dreszer, T., Fujita, P., Guruvadoo, L., Haeussler, M., Harte, R., Heitner, S., Hickey, G., Hinrichs, A., Hubley, R., Karolchik, D., Learned, K., Lee, B., Li, C., Miga, K., Nguyen, N., Paten, B., Raney, B., Smit, A., Speir, M., Zweig, A., Haussler, D., Kuhn, R., & Kent, W. (2014). The UCSC Genome Browser database: 2015 update Nucleic Acids Research, 43 (D1) DOI: 10.1093/nar/gku1177

Raney, B., Dreszer, T., Barber, G., Clawson, H., Fujita, P., Wang, T., Nguyen, N., Paten, B., Zweig, A., Karolchik, D., & Kent, W. (2013). Track data hubs enable visualization of user-defined genome-wide annotations on the UCSC Genome Browser Bioinformatics, 30 (7), 1003-1005 DOI: 10.1093/bioinformatics/btt637

Disclosure: UCSC Genome Browser tutorials are freely available because UCSC sponsors us to do training and outreach on the UCSC Genome Browser.


Video Tip of the Week: ENCODE Data Coordination Center, phase 3


Image via: A User’s Guide to the Encyclopedia of DNA Elements (ENCODE). doi:10.1371/journal.pbio.1001046.g001

The ENCODE project began many years ago with a pilot phase that examined just 1% of the human genome. But this initial exploration helped the consortium participants to iron out some of the directions for later stages, including focusing on specific cell lines, techniques, and technologies in Phase 2. There have been a number of publications that came out from consortium members, but in addition to the participants’ papers, a lot of other folks have mined this data for various investigations as well. There’s still plenty of opportunity for discovery. Some people may not realize that there’s also an ENCODE phase 3 underway.

When we had a contract with the folks at UCSC Genome Browser for outreach on ENCODE, we developed materials to help people explore the data. But we hadn’t delved into it much since phase 3 began. But the other day I got a note from my NHGRI YouTube subscription (GenomeTV) that a whole workshop of ENCODE phase 3 information had been made available. So I wanted to have a look.

There is a series of video segments that correspond to this agenda from the ENCODE workshop. I’ll be highlighting one of them here, the one that introduces the features of the Phase 3 Data Coordination Center at Stanford now. But there may be others that you want to examine for your research goals as well. Another way to work through the different segments is available from the NHGRI page here: http://www.genome.gov/27561910 That page offers the slides, handouts, and exercises too.

The video is longer than our typical tips, but it’s worth seeing for the context and framework details. There’s also a section on searching and filtering, which explains how to locate precisely the things you want to find. There’s a helpful and funny analogy to searching for shoes as you would at Zappos. I’ve used the Zappos tool exactly that way, and I also like it very much. If you want more details on how their ontology structure helps them to accomplish this, check out the paper linked below. Also in the video, there’s a piece about how the metadata is structured, and what you can expect to find there.

There’s also a part about how to visualize the things you find. You end up loading them as a UCSC Genome Browser track hub, which is integrated with all the other data at UCSC. There’s another video with Pauline Fujita on the hubs which I’ll address separately later.

The playlist for the whole meeting is here. I won’t be highlighting all of them, but I may select more of them for future tips.

Quick link:

ENCODE portal: https://www.encodeproject.org/


Malladi, V., Erickson, D., Podduturi, N., Rowe, L., Chan, E., Davidson, J., Hitz, B., Ho, M., Lee, B., Miyasato, S., Roe, G., Simison, M., Sloan, C., Strattan, J., Tanaka, F., Kent, W., Cherry, J., & Hong, E. (2015). Ontology application and use at the ENCODE DCC Database, 2015 DOI: 10.1093/database/bav010

ENCODE Project Consortium (2012). An integrated encyclopedia of DNA elements in the human genome Nature, 489 (7414), 57-74 DOI: 10.1038/nature11247

ENCODE Project Consortium. (2011). A User’s Guide to the Encyclopedia of DNA Elements (ENCODE) PLoS Biology, 9 (4) DOI: 10.1371/journal.pbio.1001046

ENCODE Project Consortium (2004). The ENCODE (ENCyclopedia Of DNA Elements) Project Science, 306 (5696), 636-640 DOI: 10.1126/science.1105136