Category Archives: General Science

Protip: check the genome of your cell line. HeLa cells are “strikingly aberrant”

This is a paper I’ve been waiting for: the analysis of the HeLa genome. I was aware of a lot of issues with the cell lines and missing or duplicated regions from the ENCODE data that was coming along some time ago: Mining the “big data” is…fascinating. And necessary.

People may be familiar with HeLa cells even if they aren’t in biomedical research because of the great book by Rebecca Skloot, The Immortal Life of Henrietta Lacks, which explored the history of these cells and the woman whose terrible cancer led to their existence.

But there were many discussions over the years about how different these cells are from actual tissues, and concerns over how representative they are for actual human research issues. Here are some:

So a new paper has been published that explores this–and it’s at the top of my reading list for later today.

Here’s the paper itself: http://www.g3journal.org/content/early/2013/03/11/g3.113.005777.abstract 

Hat tip Ward Plunet via twitter:
RT @WardPlunet: Havoc in biology’s most-used human cell line: Genome of HeLa cells sequenced for the first time http://t.co/VVpZmiwIiX .

Update: A piece from one of the paper’s authors:

Reference:

Landry JJM, Pyl PT, Rausch T, Zichner T, Tekkedil MM, Stütz AM, Jauch A, Aiyar RS, Pau G, Delhomme N, Gagneur J, Korbel JO, Huber W, & Steinmetz LM (2013). The Genomic and Transcriptomic Landscape of a HeLa Cell Line G3 DOI: 10.1534/g3.113.005777

“The Revisionaries” and the Texas Textbook Massacre

I wrote about this film when I saw it at a local festival, but I wanted to alert you (well, the US readers) that it’s going to be shown on PBS soon.

It will be on the Independent Lens show. More here, trailer, etc: http://www.pbs.org/independentlens/revisionaries/

Check your local listings here: http://www.itvs.org/television?film=revisionaries and set your DVR. You have to see how this played out, and watch out for it in your own community.

Here’s the original trailer:

The Revisionaries Trailer from Naked Edge Films on Vimeo.

Hat tip Scott Johnson on G+: https://plus.google.com/u/0/111312280150673387587/posts/TDwJFUT1FoW

Rare photo of me in the wild….

Of downtown Boston, at Tufts Medical Center, singing the praises of IMG and the Integrated Microbial Genomes resources.

I love workshops that only require a trip on the Orange Line.

Today we were doing the World Tour of Genomics Resources. Tomorrow it is UCSC Genome Browser (intro + advanced), and Thursday ENCODE. So if you want to workshop vicariously you can check out all of our tutorials on those. The slides, handouts, and exercises are all over there for you to download if you’d like.

As much as I love the online training and webinars and all, you really do get important information about the needs of folks in the room that you just don’t really get from the intertubz, and I do like to do the material live.

Enjoying the 2012 NAR Web Server Issue & a Cup of Coffee

In hunting for something to feature for this week’s tip, I noticed that Nucleic Acids Research had released their 2012 Web Server Issue back in July. As many of you might be aware, the Nucleic Acids Research journal is a forum where developers can present computational biology papers that describe the development of biologically relevant algorithms, novel usage of existing algorithms, or that report the development of biological databases & their usage. The web server issue is an annual special issue focused specifically on web-based software resources for analysis and visualization of molecular biology data.

This year marks their 10th web server issue & I decided to check it out. In order to devote full attention to the issue, I began by pouring myself a big cup of coffee in one of my favorite mugs, which somehow makes it taste better. Then I set out to enjoy the issue – every year I always begin by reading the opening editorial & then the article on the bioinformatics links directory. The editorial usually explains special emphasis for the issue (this year it is analysis of next-generation sequencing data), and is written by the executive editor of the issue, Gary Benson. For me, the editorial sets the tone of the issue, so to speak.

Next I consume the directory article, along with a couple of sips of my java. What interests me in the article is multifold. First is the discussion of trends that they see in the development of tools and resources, which is important for us here at OpenHelix. Figure 6 provides an interesting look at the categories and counts of resources from each annual issue – I am curious as to why all but one category decline in 2008. Table 1 also provides interesting data on tool trends.

I am also interested in the content of the list itself – it is a great list being developed by people that we have a lot of respect for. I was especially interested in this sentence from their article:

“The Bioinformatics Links Directory has also initiated active curation of its content, removing dead content and correcting content errors, which has resulted in more accurate although occasionally smaller counts for 2012.”

The emphasis is mine in the quote above. In my opinion this is a very important aspect of any list. If you remember, Mary posted on the idea of “Obituaries for bioinformatics tools” and started a BioStar post to collect this information. The BioStar post generated significant comment & looks like it may have helped inspire the Bioinformatics Links Directory team, from the comments. But it makes sense that you need not just to collect information but to continue to maintain and filter that data so that it remains relevant – I mean if the forest is cluttered with dead wood, the useful “live trees” (ok, resources) are obscured from users, right?

The problem is that keeping any list (or documentation or tutorials, etc.) up-to-date is a hard, labor-intensive activity. Here at OpenHelix we also keep a list of biology-relevant resources that can be searched through for free, without registering, from our homepage. We currently have a summer intern culling through a list of over 5,000 resources and tools that we know of. She is eliminating duplicate entries in our database by finding and collecting alternative URLs – it is amazing how many resources have multiple entryways, each with their own URL. But different doors don’t make a different resource or utility so we eliminate them from our list. Then we will tackle the dead resources, the listings that just go to a tiny tool internal to a main resource, or to a pre-formatted PubMed search for something.
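
For the curious, the “different doors, same resource” check can be roughed out in a few lines of Python. This is only a sketch under my own assumptions: the function names (normalize_url, group_duplicates) and the sample entries are made up for illustration and are not our actual database or process. It only catches trivial variants (http vs. https, a leading “www.”, a trailing slash); genuinely different entryways on different hosts still need a human to spot them.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize_url(url):
    """Reduce a URL to a comparison key: lowercase host minus 'www.', path minus trailing slash."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    return host + path


def group_duplicates(entries):
    """Group (name, url) pairs by normalized URL and keep only keys with more than one entry."""
    groups = defaultdict(list)
    for name, url in entries:
        groups[normalize_url(url)].append((name, url))
    return {key: items for key, items in groups.items() if len(items) > 1}


# Hypothetical sample records, not real OpenHelix data.
entries = [
    ("Example Browser", "http://www.example.org/browser/"),
    ("Example Browser (alt)", "https://example.org/browser"),
    ("Other Tool", "http://tools.example.org/other"),
]

duplicates = group_duplicates(entries)
for key, items in duplicates.items():
    print(key, "->", [name for name, _ in items])
```

Each group that comes back is a candidate for merging into a single listing, with the extra URLs kept as alternates rather than as separate resources.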

Creating AND maintaining a high quality list is not a trivial effort. In their paper the Bioinformatics Links Directory team describes remaining current as a “future challenge” and says:

“Although necessary to remain current and to advance the utility of the Bioinformatics Links Directory, these improvements will only prove useful if driven by the community. As a community-driven repository, everyone in the research or bioinformatics community has the opportunity to help make the collection better and more meaningful. “

I truly wish them better luck at “community curation” than many resources have had in the past, & hope they succeed. In our experience it works best with stable, sufficient funding because as they say: “you get what you pay for”.

OK, next post will be on actual resources in the web server issue, I promise! :)

Quick links:

2012 NAR Web Server Issue: http://nar.oxfordjournals.org/content/40/W1.toc

Bioinformatics Links Directory: http://bioinformatics.ca/links_directory/

OpenHelix Homepage & Search Portal: http://www.openhelix.com

References:
Gary Benson (2012). Editorial: Nucleic Acids Research annual Web Server Issue in 2012 Nucleic Acids Research, 40 (W1) DOI: 10.1093/nar/gks607

Michelle D. Brazas, David Yim, Winston Yeung, & B. F. Francis Ouellette (2012). A decade of web server updates at the bioinformatics links directory: 2003–2012 Nucleic Acids Research, 40 (W1) DOI: 10.1093/nar/gks632

Good reads in bioinformatics

Over the weekend we saw a nice bit of new readership based on a blog post at Homologus. It’s a list of reads that can be helpful in keeping current in various topics in bioinformatics and genomics. Have a look–and include them in your reading as well!

A Review of Bioinformatics Blogs

Some of it overlaps with Stephen Turner’s recent post that was also popular (How to Stay Current in Bioinformatics/Genomics), but it’s got some differences as well. Have a look.

In a fast-moving field, some of the best stuff is really on the blogs, forums, and other social media outlets. You really need to be connected. And I would say that bioinformatics geeks are particularly strong at some of this. They are always on their computers near twitter–and there are good papers, software tools, conferences, workshops, new features in existing software, downtime issues, and more–all being put out to the ethers by really smart and connected people who recognize quality (and sometimes mock items of lower quality in amusing ways). Some time ago I did a post on how to use Twitter in Bioinformatics if you aren’t there yet. The software interface has changed a bit but the basic features are the same.

Anyway–have a look and add some new blogs to your RSS feed or your regular route.

More Big Data to Consider: Bioimage Informatics

I’m not sure any more when I signed up for complimentary copies of Nature Methods, but just like clockwork my copy arrives each month. If you’d like to get it too, you can apply for a subscription here (Firefox seems to work better than IE, btw). This month’s issue particularly interested me because it contains a focus on Bioimage Informatics. The focus appears to be free to read online.

I found the focus just after having read the Science News article “Blast Injuries Linked to Neurodegeneration in Veterans” by Greg Miller. In Greg’s piece there is a description of a distinctive neuropathology that has been seen in athletes and military veterans who had incurred head injuries. This same distinctive pattern is seen in a mouse model of blast injury & the image of the tangles of tau protein shown in the article struck me as so interesting that I told my husband about it over dinner one night, so I already had bioimages on my mind. I am also always interested in the field of bioinformatics, both personally and as a member of the OpenHelix team.

The commentaries, in the order that they were printed, were what I read initially. The first commentary is by Gene Myers, who was also involved in early genome bioinformatics, and it provided a very interesting perspective on both the current state of bioimage informatics and on the historic use of bioimages in systems genetics. The following quote made me grin:

“The field is still in its early days, and there is no such thing as a typical bioimage informatician: they are either computer vision experts looking for new problems, classic sequence-based bioinformaticians looking for the new thing or physicists and molecular biologists whose experiments require them to bite the informatics bullet. … From my perspective, it is very reminiscent of the state of bioinformatics in the early 1980s: the exciting, somewhat chaotic free-for-all that is potentially the birth of something new.”

And the following paragraph stressing the importance of “due diligence of pilot studies” and “optimized protocols” reminded me of my days setting up a Biocore facility without enough funding for either sufficient pilot studies or optimization, which ultimately doomed the utility of the machine to my advisor and department alike. This commentary set the stage well for the rest of the articles. The other commentaries included a description of the difference in goals of the computer vision field and the bioimage informatics field, a plea for usability to be built into bioimaging software, and a historical commentary on the 25 years of NIH Image, now ImageJ.

The usability article sounded many of the same cries that we make here at OpenHelix – if you want to have usable bioscience software that IS in fact USED, at a minimum you must 1) have funding and a mandate to maintain it over the long run, 2) have motivated developers that are responsive to their users’ needs and feedback, including fixing bugs and 3) (last but absolutely not least) you must provide awareness and training on your software. And in my opinion, any old training WON’T do – it has to be high quality, up-to-date, and easier to use & absorb than your average dry documentation on programming your VCR clock (OK, I’m dating myself there, but you KNOW what I mean…) I like their suggestion that funding agencies request descriptions of how the software will be maintained and documented, and be prepared to provide funding not just for development, but also for maintenance. (Why reinvent the wheel over & over, just to let each one go flat with disrepair?)

There were also reports on specific software, such as OMERO.searcher, SimuCell, PhenoRipper, Fiji, BioImageXD, and Icy, as well as on the Broad Bioimage Benchmark Collection (BBBC), a collection of microscopy image sets available for the testing and validation of new image-analysis algorithms.

The focus then concludes with a great review of bioimaging software tools, with the goal of providing a “how to” summary of using open-source imaging software for every stage of bioimage informatics. It begins with a discussion of data acquisition & continues through data storage and workflow systems. I might tweak figure one just a bit, but it does visualize that today software is required at every stage of image analysis – from automated image attainment to image retrieval and analysis. The authors also touch on the importance of image annotation and controlled vocabularies, or ontologies. Table 1 provides a nice resource listing including software names, primary function and URL – I have some new resources to check out now! :)

Overall, I’d suggest this focus on bioimage informatics to any life scientist, whether you are analyzing images today or not – I think it provides a glimpse into an up-and-coming, exciting field.

Quick Links:
BioImageXD: http://www.bioimagexd.net/

Broad Bioimage Benchmark Collection (BBBC): http://www.broadinstitute.org/bbbc/

Fiji: http://imagej.nih.gov/ij/

Icy: http://icy.bioimageanalysis.org/

OMERO.searcher: http://murphylab.web.cmu.edu/software/searcher/

PhenoRipper: http://www.phenoripper.org/

SimuCell: http://www.SimuCell.org/

 

Reference List:
Greg Miller (2012). Blast Injuries Linked to Neurodegeneration in Veterans Science, 336 (6083), 790-791 DOI: 10.1126/science.336.6083.790

Gene Myers (2012). Why bioimage informatics matters Nature Methods, 9, 659-660 DOI: 10.1038/nmeth.2024

Anne E Carpenter, Lee Kamentsky, & Kevin W Eliceiri (2012). A call for bioimaging software usability Nature Methods, 9, 666-670 DOI: 10.1038/nmeth.2073

Kevin W Eliceiri, Michael R Berthold, Ilya G Goldberg, Luis Ibáñez, B S Manjunath, Maryann E Martone, Robert F Murphy, Hanchuan Peng, Anne L Plant, Badrinath Roysam, Nico Stuurman, Jason R Swedlow, Pavel Tomancak, & Anne E Carpenter (2012). Biological imaging software tools Nature Methods, 9, 697-710 DOI: 10.1038/nmeth.2084

UCSC Table Browser webinar follow-up post (May 24)

We’re holding our May 24th webinar today, and we find that follow-up questions are often better handled in discussions here on the blog.

If there are questions we didn’t have time to get to–or things we want to expand on with more detail–we can discuss them in this thread.

Or if you have other things you’ve been meaning to ask, let us know.

If you can’t make the webinar, the same material is covered in the training movie, slides, and exercises that are freely available, sponsored by the UCSC team: http://www.openhelix.com/ucscadv. You can also sign up to be informed of future webinars coming up on these topics, Galaxy, ENCODE and others.