Category Archives: General Science

Protip: check the genome of your cell line. HeLa cells are “strikingly aberrant”

This is a paper I’ve been waiting for: the analysis of the HeLa genome. I was aware of a lot of issues with the cell lines and missing or duplicated regions from the ENCODE data that was coming along some time ago: Mining the “big data” is…fascinating. And necessary.

People may be familiar with HeLa cells even if they aren’t in biomedical research because of the great book by Rebecca Skloot, The Immortal Life of Henrietta Lacks, which explored the history of these cells and the woman whose terrible cancer led to their existence.

But there were many discussions over the years about how different these cells are from actual tissues, and concerns over how representative they are for actual human research issues. Here are some:

So a new paper has been published that explores this–and it’s at the top of my reading list for later today.

Here’s the paper itself: 

Hat tip Ward Plunet via twitter:
RT @WardPlunet: Havoc in biology’s most-used human cell line: Genome of HeLa cells sequenced for the first time.

Update: A piece from one of the paper’s authors:


Landry JJM, Pyl PT, Rausch T, Zichner T, Tekkedil MM, Stütz AM, Jauch A, Aiyar RS, Pau G, Delhomme N, Gagneur J, Korbel JO, Huber W, & Steinmetz LM (2013). The Genomic and Transcriptomic Landscape of a HeLa Cell Line G3 DOI: 10.1534/g3.113.005777

“The Revisionaries” and the Texas Textbook Massacre

I wrote about this film when I saw it at a local festival, but I wanted to alert you (well, the US readers) that it’s going to be shown on PBS soon.

It will be on the Independent Lens show. More here, trailer, etc:

Check your local listings here: and set your DVR. You have to see how this played out, and watch out for it in your own community.

Here’s the original trailer:

The Revisionaries Trailer from Naked Edge Films on Vimeo.

Hat tip Scott Johnson on G+:

Rare photo of me in the wild….

Of downtown Boston, at Tufts Medical Center, singing the praises of IMG and the Integrated Microbial Genomes resources.

I love workshops that only require a trip on the Orange Line.

Today we were doing the World Tour of Genomics Resources. Tomorrow it is UCSC Genome Browser (intro + advanced), and Thursday ENCODE. So if you want to workshop vicariously you can check out all of our tutorials on those. The slides, handouts, and exercises are all over there for you to download if you’d like.

As much as I love the online training and webinars and all, you really do get important information about the needs of folks in the room that you just don’t really get from the intertubz, and I do like to do the material live.

Enjoying the 2012 NAR Web Server Issue & a Cup of Coffee

In hunting for something to feature for this week’s tip, I noticed that Nucleic Acids Research had released their 2012 Web Server Issue back in July. As many of you might be aware, the Nucleic Acids Research journal is a forum where developers can present computational biology papers that describe the development of biologically relevant algorithms, novel usage of existing algorithms, or that report the development of biological databases & their usage. The web server issue is an annual special issue focused specifically on web-based software resources for analysis and visualization of molecular biology data.

This year marks their 10th web server issue & I decided to check it out. In order to devote full attention to the issue, I began by pouring myself a big cup of coffee in one of my favorite mugs, which somehow makes it taste better. Then I set out to enjoy the issue – every year I begin by reading the opening editorial & then the article on the bioinformatics links directory. The editorial usually explains special emphasis for the issue (this year it is analysis of next-generation sequencing data), and is written by the executive editor of the issue, Gary Benson. For me, the editorial sets the tone of the issue, so to speak.

Next I consume the directory article, along with a couple of sips of my java. What interests me in the article is multifold. First is the discussion of trends that they see in the development of tools and resources, which is important for us here at OpenHelix. Figure 6 provides an interesting look at the categories and counts of resources from each annual issue – I am curious as to why all but one category decline in 2008. Table 1 also provides interesting data on tool trends.

I am also interested in the content of the list itself – it is a great list being developed by people that we have a lot of respect for. I was especially interested in this sentence from their article:

“The Bioinformatics Links Directory has also initiated active curation of its content, removing dead content and correcting content errors, which has resulted in more accurate although occasionally smaller counts for 2012.”

The emphasis is mine in the quote above. In my opinion this is a very important aspect of any list. If you remember, Mary posted on the idea of “Obituaries for bioinformatics tools” and started a BioStar post to collect this information. The BioStar post generated significant comment & looks like it may have helped inspire the Bioinformatics Links Directory team, from the comments. But it makes sense that you need not just to collect information but to continue to maintain and filter that data so that it remains relevant – I mean if the forest is cluttered with dead wood, the useful “live trees” (ok, resources) are obscured from users, right?

The problem is that keeping any list (or documentation or tutorials, etc.) up-to-date is a hard, labor-intensive activity. Here at OpenHelix we also keep a list of biology-relevant resources that can be searched through for free, without registering, from our homepage. We currently have a summer intern culling through a list of over 5,000 resources and tools that we know of. She is eliminating duplicate entries in our database by finding and collecting alternative URLs – it is amazing how many resources have multiple entryways, each with their own URL. But different doors don’t make a different resource or utility so we eliminate them from our list. Then we will tackle the dead resources, the listings that just go to a tiny tool internal to a main resource, or to a pre-formatted PubMed search for something.
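
For the curious, part of that duplicate hunting could be automated. The following is only a minimal sketch in Python, not the process used at OpenHelix: it assumes duplicates differ in trivial ways (scheme, a leading "www.", a trailing slash, or an index page), and the function names and the example.org URLs are hypothetical. Genuinely different entryways to the same resource would still need a human to spot them.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Reduce a URL to a rough comparison key (illustrative heuristic only)."""
    parts = urlsplit(url.strip())
    # Lowercase the host and drop a leading "www." so both spellings match.
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Ignore a trailing slash and common index pages; scheme and query are dropped.
    path = parts.path.rstrip("/")
    for index_page in ("/index.html", "/index.htm", "/index.php"):
        if path.lower().endswith(index_page):
            path = path[: -len(index_page)]
            break
    return host + path


def group_duplicates(urls):
    """Return {key: [urls]} for every key shared by more than one URL."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize_url(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}


if __name__ == "__main__":
    candidates = [
        "http://www.example.org/tool/",
        "https://example.org/tool/index.html",
        "http://example.org/other",
    ]
    # Prints the groups of URLs that collapse to the same key.
    print(group_duplicates(candidates))
```

Each group that comes back is a candidate for merging, not a verdict; someone still has to confirm the URLs really lead to the same resource.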

Creating AND maintaining a high quality list is not a trivial effort. In their paper the Bioinformatics Links Directory team describes remaining current as a “future challenge” and says:

“Although necessary to remain current and to advance the utility of the Bioinformatics Links Directory, these improvements will only prove useful if driven by the community. As a community-driven repository, everyone in the research or bioinformatics community has the opportunity to help make the collection better and more meaningful.”

I truly wish them better luck at “community curation” than many resources have had in the past, & hope they succeed. In our experience it works best with stable, sufficient funding because as they say: “you get what you pay for”.

OK, next post will be on actual resources in the web server issue, I promise! :)

Quick links:

2012 NAR Web Server Issue:

Bioinformatics Links Directory:

OpenHelix Homepage & Search Portal:

Gary Benson (2012). Editorial: NUCLEIC ACIDS RESEARCH ANNUAL WEB SERVER ISSUE IN 2012 Nucleic Acids Research, 40 (W1) DOI: 10.1093/nar/gks607

Michelle D. Brazas, David Yim, Winston Yeung, & B. F. Francis Ouellette (2012). A decade of web server updates at the bioinformatics links directory: 2003–2012 Nucleic Acids Research, 40 (W1) DOI: 10.1093/nar/gks632

Good reads in bioinformatics

Over the weekend we saw a nice bit of new readership based on a blog post at Homologus. It’s a list of reads that can be helpful in keeping current in various topics in bioinformatics and genomics. Have a look–and include them in your reading as well!

A Review of Bioinformatics Blogs

Some of it overlaps with Stephen Turner’s recent post that was also popular (How to Stay Current in Bioinformatics/Genomics), but it’s got some differences as well. Have a look.

In a fast-moving field, some of the best stuff is really on the blogs, forums, and other social media outlets. You really need to be connected. And I would say that bioinformatics geeks are particularly strong at some of this. They are always on their computers near twitter–and there are good papers, software tools, conferences, workshops, new features in existing software, downtime issues, and more–all being put out to the ethers by really smart and connected people who recognize quality (and sometimes mock items of lower quality in amusing ways).  Some time ago I did a post on how to use Twitter in Bioinformatics if you aren’t there yet. The software interface has changed a bit but the basic features are the same.

Anyway–have a look and add some new blogs to your RSS feed or your regular route.

More Big Data to Consider: Bioimage Informatics

I’m not sure any more when I signed up for complimentary copies of Nature Methods, but just like clockwork my copy arrives each month. If you’d like to get it too, you can apply for a subscription here (Firefox seems to work better than IE, btw). This month’s issue particularly interested me because it contains a focus on Bioimage Informatics. The focus appears to be free to read online.

I found the focus just after having read the Science News article “Blast Injuries Linked to Neurodegeneration in Veterans” by Greg Miller. In Greg’s piece there is a description of a distinctive neuropathology that has been seen in athletes and military veterans who had incurred head injuries. This same distinctive pattern is seen in a mouse model of blast injury & the image of the tangles of tau protein shown in the article struck me as so interesting that I told my husband about it over dinner one night, so I already had bioimages on my mind. I am also always interested in the field of bioinformatics, both personally and as a member of the OpenHelix team.

The commentaries, in the order that they were printed, were what I read initially. The first commentary is by Gene Myers, who was also involved in early genome bioinformatics, and it provided a very interesting perspective on both the current state of bioimage informatics and on the historic use of bioimages in systems genetics. The following quote made me grin:

“The field is still in its early days, and there is no such thing as a typical bioimage informatician: they are either computer vision experts looking for new problems, classic sequence-based bioinformaticians looking for the new thing or physicists and molecular biologists whose experiments require them to bite the informatics bullet. … From my perspective, it is very reminiscent of the state of bioinformatics in the early 1980s: the exciting, somewhat chaotic free-for-all that is potentially the birth of something new.”

And the following paragraph stressing the importance of “due diligence of pilot studies” and “optimized protocols” reminded me of my days setting up a Biocore facility without enough funding for either sufficient pilot studies or optimization, which ultimately doomed the utility of the machine to my advisor and department alike. This commentary set the stage well for the rest of the articles. The other commentaries included a description of the difference in goals of the computer vision field and the bioimage informatics field, a plea for usability to be built into bioimaging software, and a historical commentary on the 25 years of NIH Image, now ImageJ.

The usability article sounded many of the same cries that we make here at OpenHelix – if you want to have usable bioscience software that IS in fact USED, at a minimum you must 1) have funding and a mandate to maintain it over the long run, 2) have motivated developers that are responsive to their users’ needs and feedback, including fixing bugs and 3) (last but absolutely not least) you must provide awareness and training on your software. And in my opinion, any old training WON’T do – it has to be high quality, up-to-date, and easier to use & absorb than your average dry documentation on programming your VCR clock (OK, I’m dating myself there, but you KNOW what I mean…) I like their suggestion that funding agencies request descriptions of how the software will be maintained and documented, and be prepared to provide funding not just for development, but also for maintenance. (Why reinvent the wheel over & over, just to let each one go flat with disrepair?)

There were also reports on specific software, such as OMERO.searcher, SimuCell, PhenoRipper, Fiji, BioImageXD, and Icy, as well as on the Broad Bioimage Benchmark Collection (BBBC), a collection of microscopy image sets available for the testing and validation of new image-analysis algorithms.

The focus then concludes with a great review of bioimaging software tools, with the goal of providing a “how to” summary of using open-source imaging software for every stage of bioimage informatics. It begins with a discussion of data acquisition & continues through data storage and workflow systems. I might tweak figure one just a bit, but it does visualize that today software is required at every stage of image analysis – from automated image attainment to image retrieval and analysis. The authors also touch on the importance of image annotation and controlled vocabularies, or ontologies. Table 1 provides a nice resource listing including software names, primary function and URL – I have some new resources to check out now! :)

Overall, I’d suggest this focus on bioimage informatics to any life scientist, whether you are analyzing images today or not – I think it provides a glimpse into an up-and-coming, exciting field.

Quick Links:

Broad Bioimage Benchmark Collection (BBBC):

Reference List:
Greg Miller (2012). Blast Injuries Linked to Neurodegeneration in Veterans Science, 336 (6083), 790-791 DOI: 10.1126/science.336.6083.790

Gene Myers (2012). Why bioimage informatics matters Nature Methods, 9, 659-660 DOI: 10.1038/nmeth.2024

Anne E Carpenter, Lee Kamentsky, & Kevin W Eliceiri (2012). A call for bioimaging software usability Nature Methods, 9, 666-670 DOI: 10.1038/nmeth.2073

Kevin W Eliceiri, Michael R Berthold, Ilya G Goldberg, Luis Ibáñez, B S Manjunath, Maryann E Martone, Robert F Murphy, Hanchuan Peng, Anne L Plant, Badrinath Roysam, Nico Stuurman, Jason R Swedlow, Pavel Tomancak, & Anne E Carpenter (2012). Biological imaging software tools Nature Methods, 9, 697-710 DOI: 10.1038/nmeth.2084

UCSC Table Browser webinar follow-up post (May 24)

Our May 24th webinar is today, and we find that follow-up questions are often better handled in discussions on the blog.

If there are questions we didn’t have time to get to–or things we want to expand on with more detail–we can discuss them in this thread.

Or if you have other things you’ve been meaning to ask, let us know.

If you can’t make the webinar, the same material is covered in the training movie, slides, and exercises that are freely available, sponsored by the UCSC team: You can also sign up to be informed of future webinars coming up on these topics, Galaxy, ENCODE and others.

The Texas Textbook Massacre: review of The Revisionaries

Usually we don’t get too political on this blog. But we take a pretty clear stand on teaching evolution. And if anyone ever came to us and demanded that we “teach the controversy”, we’d laugh really hard and then move our company to a country that wasn’t completely insane. So my perspective on what happened with the revision of science curriculum and textbooks in Texas is probably not a surprise. And I’ll be describing my view of the new film The Revisionaries with that in mind.


For some people, the concept of “evolution” is something of an esoteric discussion. It doesn’t really impact most of their days. Here at OpenHelix, though, it’s sort of the foundation of what we do. At each workshop we do I stand in front of people and talk about the representation of “evolutionary relationships” that they can evaluate in the UCSC Genome Browser, or some other tool.

So when I was notified by NCSE that this new documentary film The Revisionaries, about the Texas SBOE (State Board of Education) process to review and revise the science textbooks to discredit evolution, was playing at the Boston Independent Film Festival, I wanted to see how that transpired.

I had been aware that the process had happened, and that it was being driven by that odd species of Young Earth Creationist (YEC) that you hear about if you live in Massachusetts, but rarely see in the wild. It was fascinating to learn about how the Texas process played out.

The film focuses largely on the guy who was chair of the board during the round of revisions that comprised the science pieces. Don McLeroy is a dentist and unabashed YEC. You see him in his office proselytizing to patients who can’t really talk back, ironically enough. And you see him steer the board into some dangerous and wrong conclusions about what students should be hearing in their science classes. You can see him quickly degenerate into word salad when asked to explain concepts related to the material. At the same time the board members dismiss the testimony of anyone with actual expertise in science. It’s excruciating to see.

It’s something like watching the croquet scene in Alice’s Adventures in Wonderland. And a surprise in the film is that you discover that there is a Queen of Hearts on the board too:

I pictured to myself the Queen of Hearts as a sort of embodiment of ungovernable passion – a blind and aimless Fury.
–Lewis Carroll

She comes off as much more malign and scary than Don, but I don’t want to go into more detail: you should see that play out in full, and saying any more would be something of a spoiler.

There are some heroes of sanity. Kathy Miller of the Texas Freedom Network (TFN) clearly battles daily to keep religion and misinformation out of the public schools. Eugenie Scott (NCSE) and Ron Wetherington are shown carrying the flag of reality. But from the film you learn additional background of the Texas process and system that have really stacked the deck against them.

Although the film begins with the issues in 2009 on the science components, it turns later (and chillingly) to what happened next in 2010. I was unaware that the same group was turning their attention to the Social Science and History curriculum. With the same strategies. They were doing to history what they had done to science: using their influence and bafflegab to drop Jefferson from prominence, while elevating Aquinas and Calvin. They wanted to pump up the role of the social conservative movement during the Reagan era. And to strike hip-hop from the text and replace that with “country music”. I kid you not.

It’s hard to watch this film. Luckily I was with an audience that shared my horror at the anti-science sentiments that were flying around. And because it was a film festival showing we also were able to discuss the film with the director Scott Thurman to learn more. One attendee noted that in some ways Don comes off likeable—because at least he’s being honest about exactly who he is and what he believes. And the director tells us Don likes the film. It didn’t surprise me, because you also see him enjoying the radio attack ads against him during the re-election campaign which define him exactly how he sees himself—as a young earther trying to influence the textbooks. This doesn’t exactly gel with his claims that he’s not trying to influence the board with his own beliefs. But you can see how his brain works (such as it is).

More discussion with attendees after the film was hopeful. Textbook writers and publishers were in the group I was with, commiserating on the influences of special interest groups on education. It was noted that after learning about the Texas process some schools specifically avoid the texts approved there—which is a good side effect and may reduce their influence overall. A professor from Emerson College was hoping to bring this film to his communication students to show them about the influences on publishing. There were some good ideas: how the current technology should reduce the influence of big states like Texas and enable chapters to be swapped out to be better for other school districts. On the other hand—the publisher warned—this also enables teachers to swap out what they want more easily; be careful what we wish for here.

It is certainly a film you should see if you care about education, and science, even if it’s difficult to endure. Because this is not over. And it’s time that more people, especially science defenders, were paying attention to the down-ballot races, and school committees, and other types of community efforts like these. And don’t think they’ve stopped at science class: they are going after history and social studies now as well. Experts on those topics need to step up and join the fray.

One attendee noted that people who aren’t paying attention, aren’t voting, and aren’t participating in these types of things are just as responsible as Don for what happened. And I couldn’t agree more.

Trailer. Catch it if you can. Watch for local film festivals, maybe later on DVD.