Tag Archives: cancer

A new look for the UCSC Cancer Genomics Browser

From the UCSC Genome Browser announcement mailing list:

The UCSC Cancer Genomics group has recently remodeled the interface of
their Cancer Genomics Browser (https://genome-cancer.ucsc.edu/) to make
it easier to navigate and more intuitive to display, investigate, and
analyze cancer genomics data and associated clinical information. This
tool provides access to many types of information (biological
pathways, collections of genes, genomic and clinical information) that
can be used to sort, aggregate, and perform statistical tests on a group
of samples. The Cancer Browser currently displays 473 datasets of 25
cancers from The Cancer Genome Atlas (TCGA) as well as data from the
Cancer Cell Line Encyclopedia (CCLE) and Stand Up To Cancer.

You can find more information about how to use this tool in the online
tutorial, user’s guide and FAQ. Any questions or comments should be
directed to genome-cancer@soe.ucsc.edu.

Donna Karolchik
UCSC Genome Browser Senior Engineering Manager


Friday SNPpets

Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

  • Chromothripsis – new model for some cancers? From GenomeWeb Daily News. I’m interested in seeing follow up studies on this. [Jennifer]
  • A new data source added to the BioMart Central portal: “EMAGE, a database of in situ gene expression data in the mouse embryo, has been added to BioMart Central Portal. The EMAGE website can be found at http://www.emouseatlas.org/emage/ and the EMAGE BioMart server can be found at http://biomart.emouseatlas.org/” (via the Mart-dev mailing list) [Mary]
  • Another potential outlet for scientists wanting to get involved: the Global Knowledge Initiative, whose goal is: [Jennifer]

    We build global knowledge partnerships between individuals and institutions of higher education and research. We help partners access the global knowledge, technology, and human resources needed to sustain growth and achieve prosperity for all.

  • From GenomeWeb – an announcement about MoDEL the ‘World’s Largest Protein Video Database’ – it is free for academic, not-for-profit use. I haven’t tried it at all, but it sounds like it might be cool. Let us know if you check it out! [Jennifer]
  • Announcement from the International Cancer Genome Consortium (where you can access the data using the cutting-edge BioMart build). Hat tip to @bffo: Update on ICGC website with a simplified application process for controlled access data  #bioinformatics #cancer #genomics  http://icgc.org/ [Mary]
  • Another resource for protein-protein and drug-protein interactions: PROMISCUOUS [Jennifer]
  • There’s a new Announcement mailing list for BioMart, as it gets migrated from the former EBI location.  Announce and Users lists are available–if you were on them you probably got automatically migrated. If you want to sign up, see this note:  [mart-announce] New BioMart announce and users mailing lists.  Hmm, that’s not entirely helpful as it hides the addresses you need. They are: mart-dev@ebi.ac.uk becomes users@biomart.org and mart-announce@ebi.ac.uk becomes announce@biomart.org [Mary]
  • REViGO – a resource for reducing and visualizing Gene Ontology trees, described in this paper: Supek F et al. PLoS Genet 6(6): e1001004. [Jennifer]

Tip of the Week: CircuitsDB for TF/miRNA/gene Regulation Networks

In this week’s tip I’d like to introduce you to CircuitsDB, which describes itself as:

“…a database where transcriptional and post-transcriptional (miRNA mediated) network information is fused together in order to propose and recognize non trivial regulatory combinations. “

I found out about the database from the BioMed Central article “CircuitsDB: a database of mixed microRNA/transcription factor feed-forward regulatory circuits in human and mouse“, which I cite below. I had already been thinking about miRNAs because I am slated to update our miRBase tutorial in the near future and have been reading/catching up on the latest in the field. The CircuitsDB paper by Olivier Friard et al. does a really nice job of quickly and clearly laying out the background of the project – how transcription factors have long been studied for their transcriptional regulation of protein-coding genes involved in all manner of pathways, including those of disease. It goes on to describe that the study of microRNAs, or miRNAs, is a newer field studying the post-transcriptional regulatory effects of miRNAs on protein-coding genes and their functions. Current efforts are moving to integrate the two areas of research to create more complete, and admittedly more complex, regulatory views of protein-coding genes and their effects on disease and other pathways.

The developers of CircuitsDB also very clearly describe how they have mined, analyzed and connected data from several top databases – many of which we have tutorials on, such as OMIM, miRBase, Ensembl and others – in order to create feed-forward regulatory loops, or FFLs, of TFs, affected miRNAs and ultimately affected protein-encoding genes. The image at the right is from their original paper: “Genome-wide survey of microRNA–transcription factor feed-forward regulatory circuits in human” (cited below), which reported the development of the computational framework for the mixed miRNA/TF Feed-Forward regulatory circuits that are freely available through the  CircuitsDB web interface. This original paper is available for free, with registration to RSC Publishing, and provides a detailed description of their original development, as well as access to several supplemental files.

Essentially, networks linking transcription factors and affected genes, miRNAs and affected genes, and transcription factors and miRNAs were painstakingly connected through an ab-initio oligo analysis. Support was then gained for the connections by analyzing enriched GO terms, disease connections, and previously-known connections found in other specialized resources. The CircuitsDB interface offers multiple tools. The main tool (FFL) is what I show in this tip & is the tool that searches for the networks diagrammed above. The MYC FFL is an impressive “curated database of miRNA mediated Feed Forward Loops involving MYC as Master Regulator”, and includes information on the direction of regulation, loop participants, evidence levels and more. The Transcriptional network tool allows a user to search with a miRNA & find its regulating TF, or search with a TF & find regulated genes or miRNAs. The Post-transcriptional network tool is similar, but allows searches with a miRNA or gene to find regulated genes or regulating miRNAs, respectively. So check out the tip & then check out CircuitsDB – enjoy!
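To make the idea of a mixed feed-forward loop a bit more concrete, here is a minimal Python sketch. It is not how CircuitsDB itself is built (the paper describes an oligo analysis plus several layers of supporting evidence); it only shows the final "connect the three networks" step on toy data. The function name `find_ffls` and all TF, miRNA and gene names are made up for illustration, and I am assuming the loop is defined as a TF that regulates both a miRNA and a gene, where that miRNA also regulates the same gene.

```python
from collections import defaultdict


def find_ffls(tf_to_mirna, tf_to_gene, mirna_to_gene):
    """Return sorted (tf, mirna, gene) triples that close a feed-forward loop.

    Each argument is an iterable of (regulator, target) pairs.
    A triple is reported when the TF regulates both the miRNA and the gene,
    and the miRNA also regulates that same gene.
    """
    mirnas_by_tf = defaultdict(set)
    for tf, mirna in tf_to_mirna:
        mirnas_by_tf[tf].add(mirna)

    genes_by_tf = defaultdict(set)
    for tf, gene in tf_to_gene:
        genes_by_tf[tf].add(gene)

    genes_by_mirna = defaultdict(set)
    for mirna, gene in mirna_to_gene:
        genes_by_mirna[mirna].add(gene)

    loops = []
    for tf, mirnas in mirnas_by_tf.items():
        for mirna in mirnas:
            # Genes targeted by both the TF and the miRNA close the loop.
            shared = genes_by_tf[tf] & genes_by_mirna[mirna]
            loops.extend((tf, mirna, gene) for gene in shared)
    return sorted(loops)


# Hypothetical toy networks (placeholder names, not real biology).
tf_to_mirna = [("TF_A", "miR_x"), ("TF_B", "miR_y")]
tf_to_gene = [("TF_A", "GENE_1"), ("TF_A", "GENE_2"), ("TF_B", "GENE_3")]
mirna_to_gene = [("miR_x", "GENE_1"), ("miR_y", "GENE_2")]

print(find_ffls(tf_to_mirna, tf_to_gene, mirna_to_gene))
```

In this toy example only TF_A, miR_x and GENE_1 form a loop: TF_B regulates miR_y, but the two do not share a target gene.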

Friard, O., Re, A., Taverna, D., De Bortoli, M., & Corá, D. (2010). CircuitsDB: a database of mixed microRNA/transcription factor feed-forward regulatory circuits in human and mouse BMC Bioinformatics, 11 (1) DOI: 10.1186/1471-2105-11-435

Re, A., Corá, D., Taverna, D., & Caselle, M. (2009). Genome-wide survey of microRNA–transcription factor feed-forward regulatory circuits in human Molecular BioSystems, 5 (8) DOI: 10.1039/B900177H

Tip of the Week: The Cancer Genome Workbench

In today’s tip I’d like to introduce you to the Cancer Genome Workbench, or CGWB. The workbench gathers cancer information from a wide variety of projects including Johns Hopkins University and GlaxoSmithKline Cancer Cell Line Genomic Profiling Data, NCI’s Therapeutically Applicable Research to Generate Effective Treatment (TARGET), NHGRI’s Tumor Sequencing Project (TSP), The Cancer Genome Atlas (TCGA), and the Sanger Center’s COSMIC initiative and presents the cumulative data as high-level summary visualizations. The CGWB’s genome-browser view is built on a UCSC Genome Browser backbone, for power and flexibility.

I noticed an announcement in the May 7th Nature Signaling Gateway Update email that the NCI-Nature Pathway Interaction Database – May Update was featuring a bioinformatics primer on The Cancer Genome Workbench. The primer is great & goes into much more detail about the Cancer Genome Workbench than I will be able to in this quick tip. I strongly recommend that you check out both the primer and the workbench. When I went over to the workbench to explore, I quite honestly was a bit taken aback by the complexity of the displays – the amount of data presented in their summary visualizations is somewhat intense.

I hope that in my tip movie I will be able to convince you that the small investment you will need to make to get acclimated to their images is well worth it, given the amount of data you will quickly learn to analyze. The views are so data rich that they take a bit of adjusting to – there is very little labeling (to keep displays as clean as possible) and information is provided via pop-up messages as you scroll over the display. Once I got past the intensity of the displays, I was really amazed by the scope of data visualized in CGWB displays – data on every chromosome & gene over multiple datasets/experiments, in one 2D image. As the NCI primer says, cancer is complex – really complex. Being able to see such ‘big picture’ views as those provided by the Cancer Genome Workbench is a really powerful analysis aid. I for one am impressed with this resource, which is why I’ve chosen to feature it today.

In my 5 minute tip I was only able to show you the briefest of glimpses of the CGWB landscape and heatmap views. I was not able to show you the details of either view, including a hyperlinked list of genes with the highest mutation frequencies. Nor was I able to show you the full scope of other views, which include genome browser views (based on the UCSC Genome Browser, as I mentioned earlier), correlation plots, protein domain views, 3D visualizations, as well as next-gen and trace sequence views. Check out figure 1 of the bioinformatics primer to see examples of those.

I’ve added a citation to the original CGWB publication. It was published in 2007, so it does not cover all the current functions of the workbench, but I think reading it will help give you an idea of the workbench: it goes into the goals and background of the CGWB more than the primer does, while the primer is much more up-to-date and focuses on functionality. In the paper you can also read how the authors used the workbench to analyze three public datasets, and see how it expanded their research findings.

All in all, I think the Cancer Genome Workbench is an amazing resource for cancer research. Be sure to check out the tip movie, the primer, the original CGWB publication and especially the CGWB! Thanks for joining us for this week’s tip.

Zhang, J., Finney, R., Rowe, W., Edmonson, M., Yang, S., Dracheva, T., Jen, J., Struewing, J., & Buetow, K. (2007). Systematic analysis of genetic alterations in tumors using Cancer Genome WorkBench (CGWB) Genome Research, 17 (7), 1111-1117 DOI: 10.1101/gr.5963407

Tip of the Week: International Cancer Genome Consortium

So, remember that tidal wave of data we were going to get from the human genome project?  Yeah.  That was a puddle compared to what’s coming your way now. For this week’s tip of the week I will introduce the very ambitious big data project from the International Cancer Genome Consortium (ICGC).  In addition, you’ll get your first look at the shiny new interface for BioMart!

People reading this blog know that we have made great progress on many fronts in the war on cancer.  But there’s an awful lot we don’t know yet.  The ICGC network of researchers plans to change that.  This international group of researchers has organized and standardized an effort to learn about tumors.  From their homepage:

ICGC Goal: To obtain a comprehensive description of genomic, transcriptomic and epigenomic changes in 50 different tumor types and/or subtypes which are of clinical and societal importance across the globe.

Check that out:

  • 50 tumor types.  Oh–and by the way–they will also obtain a normal tissue sample from the same individual so you can see what’s part of the normal constitution and what has changed in the tumor.
  • Hundreds of samples of that tumor type.  Except for some rare tumors, they intend to obtain 500 of each tumor.
  • More than a dozen types of cancer. Breast, lung, brain, pancreas, liver, leukemia…and on and on.
  • Genomic. Transcriptomic. Epigenomic.  Each of these is a separate data set that needs to be obtained.  Oh, and already there are simple variations (small numbers of nucleotides), CNVs, structural re-arrangements, expression data….And that’s just the initial release.

Are you overwhelmed yet?  50 x 500 x more than a dozen x 3+ types of data (and that’s just back-of-the-napkin, there’s more…)  I am daunted just thinking about the scale of this.
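If you want to play with that napkin math yourself, here is a small Python sketch. The assumptions are mine, not the ICGC’s: every one of the 50 tumor types reaches the full 500 samples, every tumor comes with one matched normal sample, and each specimen gets all three data types (genomic, transcriptomic, epigenomic). I have left out the “more than a dozen” factor, on the guess that those cancers are already counted among the 50 tumor types.

```python
# Figures taken from the bullet list above.
TUMOR_TYPES = 50
SAMPLES_PER_TYPE = 500

# Assumptions for this sketch (not official ICGC numbers).
SPECIMENS_PER_DONOR = 2   # tumor plus matched normal tissue
DATA_LAYERS = 3           # genomic, transcriptomic, epigenomic


def napkin_estimate(tumor_types=TUMOR_TYPES,
                    samples_per_type=SAMPLES_PER_TYPE,
                    specimens_per_donor=SPECIMENS_PER_DONOR,
                    data_layers=DATA_LAYERS):
    """Return (donors, specimens, data_sets) under the stated assumptions."""
    donors = tumor_types * samples_per_type
    specimens = donors * specimens_per_donor
    data_sets = specimens * data_layers
    return donors, specimens, data_sets


donors, specimens, data_sets = napkin_estimate()
print(f"{donors:,} donors, {specimens:,} specimens, {data_sets:,} data sets")
```

Under those assumptions you land at 25,000 donors, 50,000 specimens and 150,000 individual data sets, before counting the different kinds of variation reported within each data type.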

They have organized and standardized the protocols, technologies, data collection, data submissions, and more.  You should check out their marker paper for a complete description of their framework.  They are going to make 2 types of data available: open access data that is de-identified, and a controlled access data set with clinical details that you’ll have to register to access.

Do note though: the data (like all these large data projects) is subject to data usage policies that you need to be aware of.  There is a publication moratorium that enables the data submitters a window to publish their findings before others are allowed to publish.  It’s that typical balance of rapid access to data + a non-scoop window for the data providers.  Be sure to familiarize yourself with the policies if you are going to use this data.

But let’s say you are ready for it–you understand the framework, you understand the usage policies–how do you get the data?  You use the very cool new interface for BioMart to do it!  This is your first opportunity to look at the GUI developed for BioMart v 0.8.  There’s more coming, this is an early version.  But that’s how you are going to be able to build great custom queries on the underlying data and pull it down.  You may be familiar with BioMart from any number of places now (Ensembl, Gramene, FlyBase, WormBase….more).  But this is the first implementation of the new look–you are going to want to check that out.

For this week’s Tip of the Week you’ll see the ICGC site, and a quick query of the initial data that is available in the Data Coordination Center (DCC).  But this is just an appetizer.  Brace yourselves–the deluge is coming.

A Nature News article offers a nice overview, but be sure to check out the full paper for the project details.

The International Cancer Genome Consortium site: http://icgc.org/

Oh, and this made me laugh:

Be sure to contact the ICGC team if you have any questions.  They want to help you to use this data, and will be happy to answer your questions.  Personally, I’m making it a mission to help them populate the FAQ–I’ve sent in questions.  And so far the answers have been quite speedy :)

Oy. The reference is longer than the blog post.  Sigh.

Hudson (Chairperson), T., Anderson, W., Aretz, A., Barker, A., Bell, C., Bernabé, R., Bhan, M., Calvo, F., Eerola, I., Gerhard, D., Guttmacher, A., Guyer, M., Hemsley, F., Jennings, J., Kerr, D., Klatt, P., Kolar, P., Kusuda, J., Lane, D., Laplace, F., Lu, Y., Nettekoven, G., Ozenberger, B., Peterson, J., Rao, T., Remacle, J., Schafer, A., Shibata, T., Stratton, M., Vockley, J., Watanabe, K., Yang, H., Yuen, M., Knoppers (Leader), B., Bobrow, M., Cambon-Thomsen, A., Dressler, L., Dyke, S., Joly, Y., Kato, K., Kennedy, K., Nicolás, P., Parker, M., Rial-Sebbag, E., Romeo-Casabona, C., Shaw, K., Wallace, S., Wiesner, G., Zeps, N., Lichter (Leader), P., Biankin, A., Chabannon, C., Chin, L., Clément, B., de Alava, E., Degos, F., Ferguson, M., Geary, P., Hayes, D., Hudson, T., Johns, A., Kasprzyk, A., Nakagawa, H., Penny, R., Piris, M., Sarin, R., Scarpa, A., Shibata, T., van de Vijver, M., Futreal (Leader), P., Aburatani, H., Bayés, M., Bowtell, D., Campbell, P., Estivill, X., Gerhard, D., Grimmond, S., Gut, I., Hirst, M., López-Otín, C., Majumder, P., Marra, M., McPherson, J., Nakagawa, H., Ning, Z., Puente, X., Ruan, Y., Shibata, T., Stratton, M., Stunnenberg, H., Swerdlow, H., Velculescu, V., Wilson, R., Xue, H., Yang, L., Spellman (Leader), P., Bader, G., Boutros, P., Campbell, P., Flicek, P., Getz, G., Guigó, R., Guo, G., Haussler, D., Heath, S., Hubbard, T., Jiang, T., Jones, S., Li, Q., López-Bigas, N., Luo, R., Muthuswamy, L., Francis Ouellette, B., Pearson, J., Puente, X., Quesada, V., Raphael, B., Sander, C., Shibata, T., Speed, T., Stein, L., Stuart, J., Teague, J., Totoki, Y., Tsunoda, T., Valencia, A., Wheeler, D., Wu, H., Zhao, S., Zhou, G., Stein (Leader), L., Guigó, R., Hubbard, T., Joly, Y., Jones, S., Kasprzyk, A., Lathrop, M., López-Bigas, N., Francis Ouellette, B., Spellman, P., Teague, J., Thomas, G., Valencia, A., Yoshida, T., Kennedy (Leader), K., Axton, M., Dyke, S., Futreal, P., Gerhard, D., Gunter, C., Guyer, M., Hudson, T., McPherson, J., 
Miller, L., Ozenberger, B., Shaw, K., Kasprzyk (Leader), A., Stein (Leader), L., Zhang, J., Haider, S., Wang, J., Yung, C., Cross, A., Liang, Y., Gnaneshan, S., Guberman, J., Hsu, J., Bobrow (Leader), M., Chalmers, D., Hasel, K., Joly, Y., Kaan, T., Kennedy, K., Knoppers, B., Lowrance, W., Masui, T., Nicolás, P., Rial-Sebbag, E., Lyman Rodriguez, L., Vergely, C., Yoshida, T., Grimmond (Leader), S., Biankin, A., Bowtell, D., Cloonan, N., deFazio, A., Eshleman, J., Etemadmoghadam, D., Gardiner, B., Kench, J., Scarpa, A., Sutherland, R., Tempero, M., Waddell, N., Wilson, P., McPherson (Leader), J., Gallinger, S., Tsao, M., Shaw, P., Petersen, G., Mukhopadhyay, D., Chin, L., DePinho, R., Thayer, S., Muthuswamy, L., Shazand, K., Beck, T., Sam, M., Timms, L., Ballin, V., Lu (Leader), Y., Ji, J., Zhang, X., Chen, F., Hu, X., Zhou, G., Yang, Q., Tian, G., Zhang, L., Xing, X., Li, X., Zhu, Z., Yu, Y., Yu, J., Yang, H., Lathrop (Leader), M., Tost, J., Brennan, P., Holcatova, I., Zaridze, D., Brazma, A., Egevad, L., Prokhortchouk, E., Elizabeth Banks, R., Uhlén, M., Cambon-Thomsen, A., Viksna, J., Ponten, F., Skryabin, K., Stratton (Leader), M., Futreal, P., Birney, E., Borg, A., Børresen-Dale, A., Caldas, C., Foekens, J., Martin, S., Reis-Filho, J., Richardson, A., Sotiriou, C., Stunnenberg, H., Thomas, G., van de Vijver, M., van’t Veer, L., Calvo (Leader), F., Birnbaum, D., Blanche, H., Boucher, P., Boyault, S., Chabannon, C., Gut, I., Masson-Jacquemier, J., Lathrop, M., Pauporté, I., Pivot, X., Vincent-Salomon, A., Tabone, E., Theillet, C., Thomas, G., Tost, J., Treilleux, I., Calvo (Leader), F., Bioulac-Sage, P., Clément, B., Decaens, T., Degos, F., Franco, D., Gut, I., Gut, M., Heath, S., Lathrop, M., Samuel, D., Thomas, G., Zucman-Rossi, J., Lichter (Leader), P., Eils (Leader), R., Brors, B., Korbel, J., Korshunov, A., Landgraf, P., Lehrach, H., Pfister, S., Radlwimmer, B., Reifenberger, G., Taylor, M., von Kalle, C., Majumder (Leader), P., Sarin, R., Rao, T., Bhan, M., 
Scarpa (Leader), A., Pederzoli, P., Lawlor, R., Delledonne, M., Bardelli, A., Biankin, A., Grimmond, S., Gress, T., Klimstra, D., Zamboni, G., Shibata (Leader), T., Nakamura, Y., Nakagawa, H., Kusuda, J., Tsunoda, T., Miyano, S., Aburatani, H., Kato, K., Fujimoto, A., Yoshida, T., Campo (Leader), E., López-Otín, C., Estivill, X., Guigó, R., de Sanjosé, S., Piris, M., Montserrat, E., González-Díaz, M., Puente, X., Jares, P., Valencia, A., Himmelbaue, H., Quesada, V., Bea, S., Stratton (Leader), M., Futreal, P., Campbell, P., Vincent-Salomon, A., Richardson, A., Reis-Filho, J., van de Vijver, M., Thomas, G., Masson-Jacquemier, J., Aparicio, S., Borg, A., Børresen-Dale, A., Caldas, C., Foekens, J., Stunnenberg, H., van’t Veer, L., Easton, D., Spellman, P., Martin, S., Barker, A., Chin, L., Collins, F., Compton, C., Ferguson, M., Gerhard, D., Getz, G., Gunter, C., Guttmacher, A., Guyer, M., Hayes, D., Lander, E., Ozenberger, B., Penny, R., Peterson, J., Sander, C., Shaw, K., Speed, T., Spellman, P., Vockley, J., Wheeler, D., Wilson, R., Hudson (Chairperson), T., Chin, L., Knoppers, B., Lander, E., Lichter, P., Stein, L., Stratton, M., Anderson, W., Barker, A., Bell, C., Bobrow, M., Burke, W., Collins, F., Compton, C., DePinho, R., Easton, D., Futreal, P., Gerhard, D., Green, A., Guyer, M., Hamilton, S., Hubbard, T., Kallioniemi, O., Kennedy, K., Ley, T., Liu, E., Lu, Y., Majumder, P., Marra, M., Ozenberger, B., Peterson, J., Schafer, A., Spellman, P., Stunnenberg, H., Wainwright, B., Wilson, R., & Yang, H. (2010). International network of cancer genome projects Nature, 464 (7291), 993-998 DOI: 10.1038/nature08987

Tip of the Week: CellMiner from NCI

Once a month I get a great email from BioMed Central with suggestions of articles that I might want to read. In a recent edition there were lots of papers I wanted to read.  One of them was “CellMiner: a relational database and query tool for the NCI-60 cancer cell lines” by Uma T Shankavaram et al., and it is the reason for this tip of the week. The paper is very well written, clear, and points out some Excel download pitfalls I’ve struggled with in the past. I figured if their writing is so good, I’d better check out their web resource, CellMiner.

CellMiner is brought to you by the Genomics and Bioinformatics Group (GBG), Laboratory of Molecular Pharmacology (LMP), Center for Cancer Research (CCR), National Cancer Institute (NCI) and is created as a “database and query tool designed for the cancer research community to facilitate integration of the molecular datasets generated by the GBG and its collaborators on the NCI-60”. So, without further ado, please watch this tip, read the paper, and then utilize this great resource for your own scientific gains.

Uma T Shankavaram, Sudhir Varma, David Kane, Margot Sunshine, Krishna K Chary, William C Reinhold, Yves Pommier, & John N Weinstein (2009). CellMiner: a relational database and query tool for the NCI-60 cancer cell lines BMC Genomics, 10 (277) DOI: 10.1186/1471-2164-10-277

Tip of the Week: UCSC Cancer Genomics Browser


The folks associated with the UCSC Genome Browser have released a new browser and data collection called the Cancer Genomics Browser that is now available to you here:  http://genome-cancer.ucsc.edu/

They have done their own introduction to that software and data, so I’m just going to point you to their site today for this week’s Tip of the Week.  Go over there to watch the short video and get started using the site.

The paper that describes the resource is available from Nature Methods, and if you go to the supplementary materials there is even a tutorial in pdf form that you can access (even if you don’t subscribe to Nature Methods ;) )

More details about the project can be found on the job ad I saw for the project recently:

UCSC Cancer Genomics is the primary integrative bioinformatics group for the national I-SPY breast cancer trial (http://tr.nci.nih.gov/iSpy) and a key analysis group for The Cancer Genome Atlas project (http://cancergenome.nih.gov/), NCI’s flagship cancer genomics project. The UCSC Cancer Genomics Browser (http://genome-cancer.ucsc.edu) is rapidly expanding with support from a number of additional collaborations. This browser is built on the popular UCSC Genome Browser, which receives an average of 600,000 page requests per day and is accessed by 80,000 different biomedical researchers monthly, making it one of the most important and widely used web-based resources for biomedical research.


The paper:

The UCSC Cancer Genomics Browser

Jingchun Zhu, J Zachary Sanborn, Stephen Benz, Christopher Szeto, Fan Hsu, Robert M Kuhn, Donna Karolchik, John Archie, Marc E Lenburg, Laura J Esserman, W James Kent, David Haussler & Ting Wang. Nature Methods 6, 239 – 240 (2009). doi:10.1038/nmeth0409-239

caMOD 2.4 released

Just got an announcement about a new release of the caMOD Cancer Models database. This web-based resource holds information about mouse, rat, and other animal models relating to cancer research. It is also integrated with many other useful data resources.

From their description:

Retrieve information about the making of models, their genetic description, histopathology, derived cell lines, associated images, carcinogenic agents, and therapeutic trials. Links to associated publications and other resources are provided.

If you are a researcher in this field you can also submit this type of information.

The new features in this release are:

  • The integration of caMOD with caELMIR
  • Object model changes as a result of the VCDE review
  • Compliance with NCICB Technology Stack Requirements

caELMIR is a data management system for pre-clinical experimental data. It is a LIMS, or more specifically an ELIMS, for Electronic Laboratory Information Management System. This is beyond the scope of our current resource coverage, but some people finding this blog might find it useful.