Category Archives: Genomics News

Oy. I worry about this with cell line studies a lot. Mis-IDed + contaminated.

Via NCBI Announce mailing list:

NCBI BioSample includes curated list of over 400 known misidentified and contaminated cell lines

The NCBI BioSample database now includes a curated list of over 400 known misidentified and contaminated cell lines. Scientists should check this list before they start working with a new cell line to see if that cell line is known to be misidentified.

Continuous cell lines are used widely in research as model systems for normal cellular processes and disease states. However, as noted by many (e.g. PubMed 23235867, 20143388, 19003294, 18072586, and 17522957), cell line cross-contamination or misidentification represents a serious and widespread problem, and researchers should take great care to check that their cell line is what they think it is. Cell lines can be easily mislabeled or become overgrown by cells derived from a different individual, tissue or species.

This problem is so common it is thought that thousands of misleading and potentially erroneous papers have been published using cell lines that are incorrectly identified (PubMed 20448633). The first step in combating this problem is to make sure your cell line is not on the list of known misidentified and cross-contaminated cell lines. Detailed information about how to test your cell lines is provided by the International Cell Line Authentication Committee.

NCBI BioSample curated list of misidentified and contaminated cell lines:[Attribute]

Articles on cell line cross-contamination and misidentification in PubMed mentioned above:

The International Cell Line Authentication Committee:

I also worry about SNVs and all sorts of other issues within the cell lines. When the first data was coming out on CNVs in the ENCODE cell lines, I found duplications, and homozygous and heterozygous deletions, that would have concerned me if I were working on certain pathways. If I were still doing cell biology, I’d sequence my cell line of choice before I did another experiment with it. Below I’ve linked to the PubMed reference they provided in the body.
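The "check the list first" step from the announcement can be sketched in a few lines of Python. This is only an illustration, not an NCBI tool: it assumes you have already exported the curated names into a plain list with one name per line, and the entries shown (`EXAMPLE-1`, `Demo Line 7`) are hypothetical placeholders, not real cell lines. The function names are mine as well.

```python
import re

def normalize(name: str) -> str:
    """Collapse case, spaces, and punctuation so 'Ab-12 c' and 'AB12C' compare equal."""
    return re.sub(r"[^a-z0-9]", "", name.lower())

def load_flagged(lines):
    """Map each normalized name to the name exactly as it appears in the export."""
    flagged = {}
    for line in lines:
        listed = line.strip()
        # Skip blank lines and comment lines in the (assumed) export format.
        if listed and not listed.startswith("#"):
            flagged[normalize(listed)] = listed
    return flagged

def check(name, flagged):
    """Return the listed name if `name` matches a flagged entry, else None."""
    return flagged.get(normalize(name))

# Placeholder entries only; a real run would read names exported from the curated list.
exported = ["# hypothetical export, one name per line", "EXAMPLE-1", "Demo Line 7", ""]
flagged = load_flagged(exported)

for candidate in ["example 1", "DemoLine-7", "Unlisted-9"]:
    hit = check(candidate, flagged)
    print(candidate, "->", f"listed as {hit}" if hit else "not on the list")
```

Normalizing names this aggressively catches spelling variants but can also produce false matches, so a hit is a prompt to look at the record, and a miss is not proof that a line is authentic.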


American Type Culture Collection Standards Development Organization Workgroup ASN-0002. (2010). Cell line misidentification: the beginning of the end, Nature Reviews Cancer, 10 (6) 441-448. DOI:

Cambridge Healthtech Institute Announces the Acquisition of OpenHelix

Cambridge Healthtech Institute (CHI) announced the purchase of Washington-based OpenHelix, the provider of online and onsite training on some of the most popular and powerful open-access bioinformatics resources on the web.

“Knowing how to use the latest bioinformatics tools is critical to genomics research, which will only grow in importance,” said Phillips Kuhl, President of Cambridge Healthtech Institute. “With an over ten year track record of developing and presenting training on open access bioinformatics databases and programs, OpenHelix is an instrumental service to researchers and a key addition to CHI’s family of conference and training products.”

OpenHelix will join the Cambridge Healthtech Institute as a division of Bio-IT World, a leading source of news and opinion on technology and strategic innovation in the life sciences, including drug discovery and development. “OpenHelix brings Bio-IT World an extensive and solid audience in the academic research community, as well as the opportunity to extend to our existing audience a valuable training product line,” said Lisa Scimemi, Publisher of Bio-IT World, “training that many of our readers need for themselves or their staff or students but may not be aware of.”

“We are proud of the success we have had in the past, with some of the top universities and medical schools subscribing to OpenHelix,” said Scott Lathe, CEO of OpenHelix. “Working with Bio-IT World will bring us the infrastructure, resources, and market reach we need to further grow our tutorials, subscriptions, and product offerings.”

As part of the acquisition, Scott Lathe, CEO and co-founder of OpenHelix, will become General Manager of the OpenHelix unit, and Mary Mangan, President and co-founder of OpenHelix, will become Director, Product and Content of the OpenHelix unit.

About Bio-IT World
Bio-IT World provides outstanding coverage of cutting-edge trends and technologies that impact the management and analysis of life sciences data, including next-generation sequencing, drug discovery, predictive and systems biology, informatics tools, clinical trials, and personalized medicine. Through a variety of sources, including the Weekly Update Newsletter and the Bio-IT World News Bulletins, Bio-IT World is a leading source of news and opinion on technology and strategic innovation in the life sciences, including drug discovery and development.

About Cambridge Healthtech Institute
Cambridge Healthtech Institute (CHI), founded in 1992, is the industry leader in providing superior-quality scientific information to eminent researchers and business experts from top pharmaceutical, biotech, and academic organizations. Delivering an assortment of resources such as events, reports, publications and eNewsletters, CHI’s portfolio of products includes Cambridge Healthtech Institute Conferences, Barnett Educational Services, Insight Pharma Reports, Cambridge Marketing Consultants, Cambridge Meeting Planners, Knowledge Foundation and Cambridge Healthtech Media Group, which includes Bio-IT World and Clinical Informatics News.

About OpenHelix
OpenHelix, a Washington State company, was founded in 2003 to provide training on what was then a fledgling but quickly growing market of open-access, web-based bioinformatics resources. OpenHelix has provided training and outreach services for many providers of resources, such as the UCSC Genome Browser, OMIM, and the Protein Data Bank (RCSB PDB). OpenHelix received a $1.2 million grant in 2007 to create a search engine for bioinformatics resources and to expand its tutorial suites. In 2009, it launched the subscription service to over 100 tutorial suites.

Public service announcement: Biocuration 2014 meeting

Hi folks–

Got the following email about the upcoming biocuration meeting. You like to have quality information stored in those databases you use? Thank the biocurators. Support the biocurators.


7th International Biocuration Conference
Biocuration 2014
Toronto, Canada
April 6-9, 2014
Abstracts Due: February 10, 2014


Dear Colleagues,

On behalf of the Organizing Committee, I am pleased to invite you to the 7th International Biocuration Conference in Toronto, Canada. The conference will begin at 6pm on Sunday, April 6 and run through to 5pm on Wednesday, April 9, 2014. The conference will provide a forum for trainees, biocurators, investigators, clinicians and developers of biological databases to discuss their work, promote collaboration and foster a sense of community in this very active and growing area of research. Participants from academia, government and industry interested in the methods and tools employed in curation of biological and medical data are encouraged to attend. While a number of speakers have been invited, please note that the majority of oral presentations will be drawn from openly submitted abstracts.

The primary site for the conference is Hart House, considered the cultural and ceremonial centre of the University of Toronto.

The proposed 2014 conference sessions and workshops will address the following challenges in biocuration:

Automated Function Prediction (workshop)
Big Data Curation: Dealing with supplementary data (workshop)
Biocreative Text Mining (workshop)
Biological and Clinical Ontologies (session)
Clinical Annotations (session)
Data Integration and Data Sharing (session)
Functional Annotations (session)
Microbial Informatics (session)
Phenotype (workshop)
Social Tools for publishing and curation (workshop)

Dr. Tim Hubbard, Wellcome Trust Sanger Institute
Dr. Suzanna Lewis, Lawrence Berkeley National Laboratory
Dr. Patricia Babbitt, California Institute for Quantitative Biosciences (QB3)
Dr. Lincoln Stein, Ontario Institute for Cancer Research

The registration form can be found at:

Submit your abstract for presentation (poster or talks only) via this link:

Abstract submission deadline (poster or oral presentation): February 10, 2014
Fellowships Application deadline: February 10, 2014
Notification of decision for abstract submission: February 24, 2014
Fellowships Notification of acceptance: February 24, 2014
Online registration ends: March 24, 2014.

[find this over at the registration site]

Looking forward to seeing you at Biocuration 2014.


Robin Haw

Organizing Committee
Robin Haw (Chair) Ontario Institute for Cancer Research, ON, Canada
Marc Gillespie (Co-chair) St John’s University, NY, USA
Francis Ouellette (Co-chair) Ontario Institute for Cancer Research, ON, Canada
Alex Bateman, European Bioinformatics Institute, UK
Michelle Brazas, Ontario Institute for Cancer Research, ON, Canada
Fiona Brinkman, Simon Fraser University, BC, Canada
Mike Cherry, Stanford University, CA, USA
Patricia Falzon, Ontario Institute for Cancer Research, ON, Canada
Nicole Gleed, Ontario Institute for Cancer Research, ON, Canada
Pascale Gaudet, Swiss Institute of Bioinformatics, Switzerland
Iddo Friedberg, Miami University, OH, USA
Todd Harris, Ontario Institute for Cancer Research, ON, Canada
Raja Mazumder, George Washington University, DC, USA
Dennis McCormac, Ontario Genomics Institute, ON, Canada
Ilene Mizrachi, National Center for Biotechnology Information, DC, USA
Monica C. Munoz-Torres, Lawrence Berkeley National Laboratory, CA, USA
John Parkinson, The Hospital for Sick Children, ON, Canada
Paul Thomas, University of California, CA, USA

Boston area genome geeks: free film screening


I posted a short item about that film a few days back, not expecting to hear that there was going to be free access to the screening. I thought it would require conference registration. But it looks like you can see the film without the conference attendance.

For free, I might go.


The Perfect 46. Here we go again.

At least I didn’t say GATTACA, right?

I got an ad today for the Consumer Genetics Conference that had this little teaser, for a “science factual” film.

On September 25 at the Consumer Genetics Conference, a special advance screening of The Perfect 46 will be held. This science factual feature film is about a geneticist who creates a website that pairs individuals with their ideal genetic partner for children. The Perfect 46 hopes to inspire debate about science and our world, specifically the “what ifs?” of personal genetic testing, set in the very near future. The film is without a final sound mix and is still a work in progress. A discussion of the film, the themes it raises, and viewer reactions will be held during an interactive breakout group session the following morning on Thursday, September 26 from 7:30-8:30am. Brett Ryan Bonowicz, the writer and director, will introduce the film and lead the follow-up breakout discussion.

Check out the site:

The first item in the FAQ cracked me up:

Q: How does this work?
A: The state of California already has your genome. For a nominal fee, we will run your genome alongside your partner’s genome and address your child’s likelihood for over 4000 genetic diseases.

I get the premise. And maybe this will generate useful discussion. But there are so many ways this could go wrong… We’ll have to see. There are a number of additional short videos/podcasts over on the “press” page for the movie site. I’ll just put one here, from their homepage. But I guess this is the one you should see first.

Hmm. Will this end well? I have been eager to see the genetics-related film “Decoding Annie Parker”. I’m not sure I’m eager for this one.

The Sasquatch Genome paper is out

Um, sorta. I can’t find the sequence in GenBank–ahem.

There’s much chatter right now, but nobody has actually read the paper. I am hoping a qualified science journalist takes a crack at it, because I’m unwilling to pay them (since they bought the journal) to see this thing.

EDIT to add things as they come along:

One person admits to having the paper:

I asked for the sequence accession numbers:

Konrad Karczewski apparently got a copy too, and is tweeting some initial assessment.

This is older:

DNA Day essay contest is open! 60th anniversary of the helix

Got students who are eager to tackle topics in genomics? Got some good science teachers who could coach them? Tell them about the American Society of Human Genetics (ASHG) DNA Day essay contest.

Direct link to the ASHG page in case the tweet one fails:

Over at the site there are more details about the questions to address. And the deadline is noted.


Ethics committee report on genome sequencing and privacy

Well, this changes my morning. It’s a 150-page report.

Get the report itself here: Privacy and Progress in Whole Genome Sequencing

Press or blog coverage (I’ll update if I see others):

Nature: US ethics panel reports on DNA sequencing and privacy

Science: President’s Ethics Panel Urges New Protections for Whole Genome Data

AP: Bioethics panel urges more gene privacy protection

USA Today: Panel: Protect patients who use whole genome sequencing

AARP: How private is your genetic code? Less so than you might think.


ENCODE floods the news networks…

My social media is abuzz with ENCODE publications and chatter right now. Some of the things I’d recommend (besides the huge collection of papers and Nature site, of course) or that made me laugh:

ENCODE project team leader Ewan Birney’s insights: ENCODE: My own thoughts

Guardian: Thousands of ‘genes’ found in parts of genome dismissed as junk DNA

Not Rocket Science: ENCODE: the rough guide to the human genome

NPR: Scientists Unveil ‘Google Maps’ For Human Genome

NYT: Far From ‘Junk,’ DNA Dark Matter Plays Crucial Role

NHGRI: RT @genome_gov: ENCODE, a multi-year effort by more than 440 researchers, has yielded astounding genomic insights.

NBC News: New DNA project shows us living beyond our genes

BBC Video: Human genome ‘more active than thought’

Guardian Video: What the Encode project tells us about the human genome and ‘junk DNA’ – video

BBC story: Detailed map of genome function

Science NOW: Human Genome Is Much More Than Just Genes

Ars Technica: Cataloging the controlled chaos of the human genome

Wired: New DNA Encyclopedia Attempts to Map Function of Entire Human Genome

CNN: DNA project interprets ‘book of life’

CBC: ‘Junk DNA’ has a purpose, new map of human genome reveals

The Telegraph: Worldwide army of scientists cracks the ‘junk DNA’ code

LA Times: ENCODE project sheds light on human DNA and disease

The Economist weighs in. (Hm.) The new world of DNA

Wall Street Journal: ‘Junk DNA’ Debunked

Gizmodo: The Human Genome Is Far More Complex Than Scientists Thought

Slashdot: ENCODE DNA Project: Big Data to Solve Genome Mysteries

Cosmos: Decade-long DNA project prompts ‘gene’ redefinition

Most bizarre title spin so far: “Occupy” comes to DNA: A genome for the 99 percent

Snorf: Everything??  Gigantic New Study Changes Everything We Knew About Human Genes

John Timmer (Ars Technica again): Most of what you read was wrong: how press releases rewrote scientific history

Maggie Koerth-Baker: ENCODE, the media, and what we really know about the human genome

Elizabeth Finkel: Aussie geneticist wins wager over junk DNA

Faye Flam: Skeptical Takes on Elevation of Junk DNA and Other Claims from ENCODE Project








Giggle II:

Giggle III:

GenomeTV from NHGRI:


And when you are ready to look around at the data yourself, do come back for our tutorials on ENCODE:

ENCODE Foundations (first tutorial on early ENCODE data):

ENCODE Data Available through the UCSC Genome Browser II:


[I'm going to keep this as an ongoing repository of items I'm seeing. May be edited frequently over the next week or so.]

Enjoying the 2012 NAR Web Server Issue & a Cup of Coffee

In hunting for something to feature for this week’s tip, I noticed that Nucleic Acids Research had released their 2012 Web Server Issue back in July. As many of you might be aware, the Nucleic Acids Research journal is a forum where developers can present computational biology papers that describe the development of biologically relevant algorithms, novel usage of existing algorithms, or that report the development of biological databases & their usage. The web server issue is an annual special issue focused specifically on web-based software resources for analysis and visualization of molecular biology data.

This year marks their 10th web server issue & I decided to check it out. In order to devote full attention to the issue, I began by pouring myself a big cup of coffee in one of my favorite mugs, which somehow makes it taste better. Then I set out to enjoy the issue – every year I begin by reading the opening editorial & then the article on the bioinformatics links directory. The editorial usually explains the special emphasis for the issue (this year it is analysis of next-generation sequencing data), and is written by the executive editor of the issue, Gary Benson. For me, the editorial sets the tone of the issue, so to speak.

Next I consume the directory article, along with a couple of sips of my java. What interests me in the article is multifold. First is the discussion of trends that they see in the development of tools and resources, which is important for us here at OpenHelix. Figure 6 provides an interesting look at the categories and counts of resources from each annual issue – I am curious as to why all but one category decline in 2008. Table 1 also provides interesting data on tool trends.

I am also interested in the content of the list itself – it is a great list being developed by people that we have a lot of respect for. I was especially interested in this sentence from their article:

“The Bioinformatics Links Directory has also initiated active curation of its content, removing dead content and correcting content errors, which has resulted in more accurate although occasionally smaller counts for 2012.”

The emphasis is mine in the quote above. In my opinion this is a very important aspect of any list. If you remember, Mary posted on the idea of “Obituaries for bioinformatics tools” and started a BioStar post to collect this information. The BioStar post generated significant comment & looks like it may have helped inspire the Bioinformatics Links Directory team, judging from the comments. But it makes sense that you need not just to collect information but also to maintain and filter it so that it remains relevant – I mean if the forest is cluttered with dead wood, the useful “live trees” (ok, resources) are obscured from users, right?

The problem is that keeping any list (or documentation or tutorials, etc.) up-to-date is a hard, labor-intensive activity. Here at OpenHelix we also keep a list of biology-relevant resources that can be searched through for free, without registering, from our homepage. We currently have a summer intern culling through a list of over 5,000 resources and tools that we know of. She is eliminating duplicate entries in our database by finding and collecting alternative URLs – it is amazing how many resources have multiple entryways, each with its own URL. But different doors don’t make a different resource or utility, so we eliminate them from our list. Then we will tackle the dead resources, the listings that just go to a tiny tool internal to a main resource, or to a pre-formatted PubMed search for something.
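For the curious, here is a rough Python sketch of the simplest part of that duplicate hunt: grouping entries whose URLs differ only in trivial ways. It is not how our database actually works. The normalization rules (ignore the scheme, a leading `www.`, a trailing slash, and an `index.*` page), the function names, and the `example.org` entries are all assumptions made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def url_key(url: str) -> str:
    """Reduce a URL to a rough identity key: host plus path, minus cosmetic differences."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    # Treat a default index page as the same door as its directory.
    for suffix in ("/index.html", "/index.htm", "/index.php"):
        if path.lower().endswith(suffix):
            path = path[: -len(suffix)]
            break
    return host + path

def group_entries(entries):
    """Group (name, url) pairs by their normalized URL key."""
    groups = defaultdict(list)
    for name, url in entries:
        groups[url_key(url)].append((name, url))
    return dict(groups)

# Hypothetical listings; example.org is a reserved placeholder domain.
entries = [
    ("Tool A", "http://WWW.Example.org/tool/"),
    ("Tool A (alt entry)", "https://example.org/tool/index.html"),
    ("Tool B", "http://example.org/other"),
]
groups = group_entries(entries)
duplicates = {key: items for key, items in groups.items() if len(items) > 1}

for key, items in duplicates.items():
    print(key, "has", len(items), "entries:", [name for name, _ in items])
```

A key like this only catches the easy cases. Two entryways on different hosts or paths still need a person to recognize them as the same resource, which is exactly why the job is labor-intensive.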

Creating AND maintaining a high quality list is not a trivial effort. In their paper the Bioinformatics Links Directory team describes remaining current as a “future challenge” and says:

“Although necessary to remain current and to advance the utility of the Bioinformatics Links Directory, these improvements will only prove useful if driven by the community. As a community-driven repository, everyone in the research or bioinformatics community has the opportunity to help make the collection better and more meaningful.”

I truly wish them better luck at “community curation” than many resources have had in the past, & hope they succeed. In our experience it works best with stable, sufficient funding because as they say: “you get what you pay for”.

OK, next post will be on actual resources in the web server issue, I promise! :)

Quick links:

2012 NAR Web Server Issue:

Bioinformatics Links Directory:

OpenHelix Homepage & Search Portal:

Gary Benson (2012). Editorial: NUCLEIC ACIDS RESEARCH ANNUAL WEB SERVER ISSUE IN 2012 Nucleic Acids Research, 40 (W1) DOI: 10.1093/nar/gks607

Michelle D. Brazas, David Yim, Winston Yeung, & B. F. Francis Ouellette (2012). A decade of web server updates at the bioinformatics links directory: 2003–2012 Nucleic Acids Research, 40 (W1) DOI: 10.1093/nar/gks632