UCSC Genome Bioinformatics

UCSC replaces UCSC Genes with GENCODE as default gene set

… is a big deal. And now I have to change my training materials. But I think it’s worthwhile. The GENCODE set is very extensive and the range of annotated types captures important details.

This email came from the UCSC Genome Browser announcement mailing list. I'm pasting it in full for those who aren’t on the list, or you can follow the link to the list item here:

[genome-announce] GENCODE Genes Now the Default Gene Set on the Human (GRCh38/hg38) Assembly

In a move towards standardizing on a common …

Video Tip of the Week: Biodalliance browser with HiSeq X-Ten data

… here: Catalog ID: GM12878. Conveniently, this is one of the Tier 1 cell lines from the ENCODE project too, so there is other public data out there on this cell line–which I have explored in the past and knew some things about.

There are 2 different data sets of the sequence in the download files, and one of them is available in the browser to view. I’m sure the Genoscenti will be all over the downloadable files. But because I’m always interested in new visualizations, I …

Video Tip of the Week: New UCSC “stacked” wiggle track view

… we don’t always have time to go into the details of this view, although we do explore it in the ENCODE material, because the track I’m using is one of the ENCODE data sets. I’ll use the same track in the same region as the announcement, which is shown here:

But when I first looked at this, I wasn’t sure if the peak–focus on the pink peak that represents the NHLF cell line–was meant to cover the whole area underneath or not. What I was trying to figure out is essentially this …

Protip: check the genome of your cell line. HeLa cells are “strikingly aberrant”

… I was aware of a lot of issues with the cell lines and missing or duplicated regions from the ENCODE data that was coming along some time ago: Mining the “big data” is…fascinating. And necessary.

People may be familiar with HeLa cells even if they aren’t in biomedical research because of the great book by Rebecca Skloot, The Immortal Life of Henrietta Lacks, which explored the history of these cells and the woman whose terrible cancer led to their existence.

But there were …

Spanking #ENCODE

While I was on the road last week–ironically, to do workshops including one on ENCODE data in the UCSC Genome Browser–a conflama erupted over a newly published paper that essentially spanked the ENCODE team for some of the claims they made. Some of the first notes I saw:

Wow – brutal (IMO excessive) takedown of ENCODE's "80% functional" claim: http://t.co/68xNepahm2 (via @benoitbruneau )

— Daniel MacArthur (@dgmacarthur) February 21, 2013 …

ENCODE transitions

In case you missed it, Friday evening this piece came over the ENCODE announcement mailing list:

ENCODE transitions

Today marks the end of 5 years of ENCODE whole-genome data production, and the project is now transitioning to a new phase. The newly constituted project has been announced by NHGRI in the press release here: http://www.genome.gov/27550184

….

Although UCSC will continue to participate actively in ENCODE data management, we …

ENCODE floods the news networks…

My social media is abuzz with ENCODE publications and chatter right now. Some of the things I’d recommend (besides the huge collection of papers and the Nature site, of course) or that made me laugh:

ENCODE project team leader Ewan Birney’s insights: ENCODE: My own thoughts

Guardian: Thousands of ‘genes’ found in parts of genome dismissed as junk DNA

Not Rocket Science: ENCODE: the rough guide to the human …

ENCODE data in the UCSC Genome Browser, part deux

There is some really terrific data flowing into the UCSC Genome Browser from the ENCODE project. And now we have updated our tutorials to catch you up on the tracks and strategies to explore that wealth of information.

Near the beginning of the ENCODE production phase, we created a tutorial to introduce folks to the project and the role of UCSC as the DCC–Data Coordination Center. We have decided to keep that tutorial available because it has some of the background that …

Fresh data–metrics! From ENCODE

The ENCODE project has been churning out data for a long time now. Suddenly, though, it’s become even easier to see how much data has come along. I just got an announcement from the ENCODE mailing list about their new Quality Metrics that are available.

Here’s the scoop:

The ENCODE consortium analysis working group has analyzed the quality of the data produced using a variety of …