Public service announcement: CAFA 2 for protein functional annotations

Just got this email on the Biocurators mailing list, wanted to spread the word:

Announcing CAFA 2: The Second Critical Assessment of protein Function Annotation

Friends and Colleagues,

We are pleased to announce the Second Critical Assessment of protein Function Annotation (CAFA) challenge. The goal of the challenge is to predict functional annotations of genes/proteins. In CAFA, the organizers provide a set of about 100,000 protein sequences, of which most are completely unannotated and some are partially annotated with respect to their function. The participants are asked to predict functional annotation of these proteins before January 15, 2014. At that time, all predictions will be stored and we will wait for 6-12 months until new annotations are available in the biomedical literature and/or major databases. The initial evaluation will be provided in July 2014, during the ISMB conference (Boston, MA). Anyone in the world is welcome to participate.

In brief:

Web site:

Prediction submission deadline: January 15, 2014

Initial evaluation: July 12, 2014 in Boston

All targets can be downloaded from the web site. The web site also contains training data; however, the participants are *not* required to use it and even if they do, they can use any additional data of their choice, including the literature. The CAFA challenge is different from many other similar challenges because not even the organizers know which of the original target sequences will be functionally annotated after the submission deadline.

The CAFA 1 experiment is described in the following paper:

P. Radivojac et al. A large-scale evaluation of computational protein function prediction. Nature Methods (2013) 10(3): 221-227.

A brief introduction to the problem for computer scientists is provided at:

The mission of the Automated Function Prediction Special Interest Group (AFP-SIG) is to bring together computational biologists who are dealing with the important problem of gene and gene product function prediction, to share ideas and create collaborations. We also aim to facilitate interactions with experimental biologists and biocurators.

We hope that AFP-SIG serves an important role in stimulating research in annotation of biological macromolecules, but also related fields.

New in CAFA 2:

In CAFA 2, we would like to evaluate the performance of protein function prediction tools/methods and also expand the CAFA challenge to include prediction of human phenotypes associated with genes and gene products. As last time, CAFA will be a part of the Automated Function Prediction Special Interest Group (AFP-SIG) meeting that will be held alongside the ISMB conference. AFP-SIG will be held as a two-day meeting in July 2014 in Boston.

About the CAFA experiment:

The problem: There are far too many proteins for which the sequence is known, but the function is not. The gap between what we know and what we do not know is growing. A major challenge in the field of bioinformatics is to predict the function of a protein from its sequence (and all other data one can find). At the same time, how can we judge how well these function prediction algorithms are performing and whether we are making progress over time?

The solution: The Critical Assessment of protein Function Annotation algorithms (CAFA) is an experiment designed to provide a large-scale assessment of computational methods dedicated to predicting protein function. We will evaluate methods in predicting the Gene Ontology (GO) terms in the categories of Molecular Function, Biological Process, and Cellular Component. In addition, predictors may use the Human Phenotype Ontology (HPO) for the human dataset. A set of protein sequences is provided by the organizers, and participants are expected to submit their predictions by the submission deadline, January 15, 2014. The predictions will be evaluated during the Automated Function Prediction (AFP) meeting, which has been approved as a Special Interest Group (SIG) meeting, at the ISMB 2014 conference (Boston, USA).

History: The first CAFA experiment was conducted in 2010-2011. Twenty-three groups submitted fifty-four algorithms for assessment. The results and most methods were published in Nature Methods and in a special supplement in BMC Bioinformatics. CAFA 1 has brought together a large group of computational predictors and, for the first time, provided us with a clear picture of the state of this important field. As with other critical assessment experiments, the aim of CAFA is to improve protein function prediction by continuously challenging groups to develop more accurate methods.

How to participate in CAFA 2?

1. Go to

2. Download target proteins, already available

3. Submit predictions on or before January 15, 2014

4. Join us at the AFP-SIG, July 11-12, 2014 in Boston for the eighth protein function prediction meeting, to hear the CAFA 2 results, to present your work, and to learn about the latest research in computational protein function prediction

More details at:

Confirmed keynote speakers:

Fiona Brinkman, Simon Fraser University, Canada

Mark Gerstein, Yale University, USA

We look forward to hearing from you!

The CAFA organizing team: Predrag Radivojac, Michal Linial, Sean Mooney and Iddo Friedberg