Tag Archives: snp

What’s the answer? (1000Genomes SNPs issues)

BioStar is a site for asking, answering and discussing bioinformatics questions. We are members of the community and find it very useful. Often questions and answers arise at BioStar that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those questions and answers here in this thread. You can ask questions in this thread, or you can always join in at BioStar.

This week’s question:

Why are there more non-synonymous SNPs than synonymous SNPs in the 1000 genomes data?

I have downloaded SNP data from the 1000 genomes project through Biomart and UCSC genome browser. These SNP data are annotated as being synonymous or non-synonymous (missense). In all textbooks it is said that the number of synonymous mutations should be much higher than non-synonymous mutations. Then why is it that I consistently observe higher number of non-synonymous SNPs for the human genome? Do you think there might be a mistake in annotating these SNPs or there is something else that I am missing?


This question generated a lot of discussion. And one of the key aspects is that you have to really pay attention to how the annotation features are provided in a database. Have a look at the chatter over there about various aspects of SNP annotations.
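As background for the question, here is a minimal sketch of what "synonymous" and "non-synonymous" mean at the codon level. It assumes the standard genetic code and simply enumerates every possible single-base change in the 61 sense codons. The function names are made up for illustration, and the tally says nothing about how often such SNPs are actually observed, since that depends on selection and on how a given database annotates them.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons in the order TTT, TTC, TTA, TTG, TCT, ...
# ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_change(codon, position, new_base):
    """Classify a single-base substitution within one codon (position is 0, 1 or 2)."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if after == before:
        return "synonymous"
    if after == "*":
        return "nonsense"
    return "missense"


def count_possible_changes():
    """Tally every possible single-base change across the 61 sense codons."""
    counts = {"synonymous": 0, "missense": 0, "nonsense": 0}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # changes that start from a stop codon are not counted
        for position in range(3):
            for new_base in BASES:
                if new_base != codon[position]:
                    counts[classify_change(codon, position, new_base)] += 1
    return counts


if __name__ == "__main__":
    print(count_possible_changes())
```

Under the standard code this enumeration gives 134 synonymous, 392 missense and 23 nonsense changes out of 549, so only about a quarter of the possible single-base changes in coding sequence are synonymous to begin with.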

Video Tip of the Week: SNPeffect 4.0

One of the most frequent questions we hear when we do workshops is: how do I find out if this SNP has an effect on my favorite protein? Well, that’s assuming it is a coding SNP. Of course, promoter SNPs and splicing SNPs and other features would be great to assess as well. Right now, though, the most mature tools are those that look at the effects of variation on the coding of the amino acids in proteins.

We’ve talked before about some tools for this, including PolyPhen2 and SIFT. Each of them will offer different algorithms and options that might help you to explore your SNPs. But another tool is available that you should check out as well: SNPeffect 4.0.

SNPeffect isn’t new–this team has been developing it for a while. But their recent paper that describes new features in the 4.0 version spurred me to have a new look at it. There are some foundational things that are important to know about the data collection in their database. It’s not just a re-hash of dbSNP–it actually relies on another source of variation data. They use the UniProt collection of human proteins as the starting point. If you haven’t used UniProt much, you might not be aware of how many of the variations identified in proteins they catalog and store (we cover this in our tutorial*). The SNPeffect team takes those variations and evaluates the impact they have on a protein with a variety of algorithms. Some of the variations will correspond to dbSNP entries–but not all of them do. You may find things here that you won’t find in dbSNP. So I would say it’s worth exploring your proteins of interest here as well.

The algorithms they use provide information on a number of features of the protein. TANGO and WALTZ assess protein aggregation and amyloid formation. LIMBO evaluates chaperone binding. Structural stability is predicted by FoldX (if a suitable structure is available). They also use SMART* and Pfam* to see if the variation occurs within domains of the protein. There are some other tools with more protein features examined as well. Check out the paper for more details.

You can also submit proteins of interest to their analysis suite from the “Submit a new SNPeffect job” links.

A new feature highlighted in their paper is the opportunity to do a Meta-analysis on groups of variations. You can explore the features of sets of variants in this way, using the different algorithms they offer.

This short video examines the pipeline, the basic interface, and a couple of sample pages. But you’ll want to go over and try a lot more to learn about your favorite proteins. There’s a lot of information that can come out of this that you might not have known before. Check it out.

*OpenHelix tutorials for these resources available for individual purchase or through a subscription

Quick links to resources discussed:

SNPeffect 4.0 http://snpeffect.switchlab.org/

PolyPhen 2 http://genetics.bwh.harvard.edu/pph2/

SIFT http://sift.jcvi.org/


De Baets, G., Van Durme, J., Reumers, J., Maurer-Stroh, S., Vanhee, P., Dopazo, J., Schymkowitz, J., & Rousseau, F. (2011). SNPeffect 4.0: on-line prediction of molecular and structural effects of protein-coding variants Nucleic Acids Research, 40 (D1) DOI: 10.1093/nar/gkr996

Video Tip of the Week: Variation Data from Ensembl

Trey introduced me to this “decent collection of video tutorials” from Ensembl, but he and Mary are currently in Morocco teaching a 3-day bioinformatics workshop & then attending the conference (yes, I am envious!). I am therefore creating this week’s tip based on the tutorials that Trey pointed me to. In today’s tip I am going to parallel a tutorial available from Ensembl on SNP information in order to both: 1) show you how you can access variation information from Ensembl and 2) compare doing these steps using Ensembl 64 (here in this video) and using Ensembl 54 (archived) (in the Ensembl video).

Bioscience resources are often continuously being developed and improved & it can be difficult to keep videos and documentation up-to-date. That’s why here at OpenHelix we work continuously to keep our materials up-to-date, with weekly tips on new features and updated tutorials as updated sites become stable.

The Ensembl video (SNPs and other Variations – 1 of 2) is quite nice & provides more detail about the actual Ensembl data than I can in my short movie, but it was done a few years ago on an older version of Ensembl. Since then the resource has been updated, and gone through several new versions of the data. I’m going to follow the same steps that are done in part one of the Ensembl SNP tutorial so that you can see examples of what’s changed & what is pretty much the same. I’d suggest you watch both videos back-to-back to get a good idea of what’s changed, and what types of variation information are available from Ensembl. From that basis I’m sure you’ll be able to watch Ensembl’s second SNP video & apply it to using the current version of Ensembl without much trouble. For more details you can refer to the most recent Ensembl paper in the NAR database  issue, which describes not just variation information but Ensembl as a whole.

Quick links:

Ensembl Browser: http://www.ensembl.org/index.html

Legacy Ensembl Browser (release 54): http://may2009.archive.ensembl.org/index.html

Ensembl tutorial, part 1 of 2: http://useast.ensembl.org/Help/Movie?id=208

Ensembl tutorial, part 2 of 2: http://useast.ensembl.org/Help/Movie?id=211

OpenHelix Ensembl tutorial materials: http://www.openhelix.eu/cgi/tutorialInfo.cgi?id=95

Ensembl Tutorial List: http://useast.ensembl.org/common/Help/Movie?db=core

Flicek, P., Aken, B., Ballester, B., Beal, K., Bragin, E., Brent, S., Chen, Y., Clapham, P., Coates, G., Fairley, S., Fitzgerald, S., Fernandez-Banet, J., Gordon, L., Graf, S., Haider, S., Hammond, M., Howe, K., Jenkinson, A., Johnson, N., Kahari, A., Keefe, D., Keenan, S., Kinsella, R., Kokocinski, F., Koscielny, G., Kulesha, E., Lawson, D., Longden, I., Massingham, T., McLaren, W., Megy, K., Overduin, B., Pritchard, B., Rios, D., Ruffier, M., Schuster, M., Slater, G., Smedley, D., Spudich, G., Tang, Y., Trevanion, S., Vilella, A., Vogel, J., White, S., Wilder, S., Zadissa, A., Birney, E., Cunningham, F., Dunham, I., Durbin, R., Fernandez-Suarez, X., Herrero, J., Hubbard, T., Parker, A., Proctor, G., Smith, J., & Searle, S. (2009). Ensembl’s 10th year Nucleic Acids Research, 38 (Database) DOI: 10.1093/nar/gkp972

Video Tip of the Week: VnD Resource for Genetic Variation and Drug Information

In today’s tip I am going to feature a resource that I found recently. I’ve been updating our dbSNP tutorial, which Mary & Trey will be presenting at workshops in Morocco, and also our free PDB tutorial, which is sponsored by the RCSB PDB team. I have therefore been thinking about protein structures and small sequence variations a lot lately. As I explored the latest Database issue of NAR looking for resources to do a tip on, I found an article describing the VnD (genetic Variation and Drug) resource, which can also be accessed at the URL www.vandd.org, according to the NAR article. The article is “VnD: a structure-centric database of disease-related SNPs and drugs“, and figure one shows a veritable Who’s Who of protein, variation and disease resources, so I had to investigate.

What I found at VnD made me sure that this was a resource that I wanted to feature in a tip. VnD is from the Korean Bioinformation Center, or KOBIC, which has a list of databases and tools that it provides. I’ll save the rest of the KOBIC resources for another post & concentrate on VnD here. Compiling data from resources such as RefSeq, OMIM, UniProt, PDB, DrugBank, dbSNP, GAD and more might have been cool enough, depending on how it was done, but VnD also does its own structure modeling analysis on how the variation affects the protein structure and drug/ligand binding.

This tip movie isn’t long enough to really show you the breadth of what is available from the VnD, but I hope it will be enough to encourage you to read the NAR article (listed below), and to check out VnD. One thing to note: don’t expect to find every dbSNP rs# over there – one that I’ve been using in our tutorial isn’t over there. They are specifically interested in variations within genes that might affect drug binding. But hey, you can’t query DrugBank with rs#s, and I’ve never seen the structure modeling done like VnD, so it is a worthy resource that you may want to investigate if you are interested in how genetic variations connect with disease and drug therapies.

Quick links:

VnD: Variations and Drugs resource -  http://vnd.kobic.re.kr:8080/VnD/index.jsp

Korean Bioinformation Center (KOBIC) – http://www.kobic.re.kr/

RCSB PDB – http://www.pdb.org

OpenHelix Tutorial on the RCSB PDB – http://www.openhelix.com/pdb

dbSNP: Short Genetic Variations, from NCBI -  http://www.ncbi.nlm.nih.gov/projects/SNP/

OpenHelix Tutorial on NCBI’s dbSNP – http://www.openhelix.com/cgi/tutorialInfo.cgi?id=39

For links to other resources and OpenHelix tutorials mentioned in this post, please see our catalog of resources – http://www.openhelix.com/cgi/tutorials.cgi

Yang, J., Oh, S., Ko, G., Park, S., Kim, W., Lee, B., & Lee, S. (2010). VnD: a structure-centric database of disease-related SNPs and drugs Nucleic Acids Research, 39 (Database) DOI: 10.1093/nar/gkq957

Tip of the Week: MutaDATABASE, a centralized and standardized DNA variation database

We all know and love dbSNP, and DGV, and 1000 Genomes, and HapMap, and OMIM, and the couple of other dozen variation databases I can think of off the top of my head. But–even though there’s a lot of stuff out there–you never know what you aren’t seeing. What *isn’t* yet stored in those resources?  One new consortium suggests that there’s a lot you aren’t seeing. And they aim to make it easier to collect variation data, curate it, visualize it, and have it all in one place. The resource they are constructing is called MutaDATABASE.

MutaDATABASE is a new effort to bring together a lot of variation information that is just not getting into existing databases as it should be. The group is described as “a large consortium of diagnostic testing laboratories in Europe, the United States, Australia, and Asia.” In their Nature Biotechnology correspondence they describe many of the barriers facing deposition of new variants in databases. Among them are lack of incentive (or lack of pressure by publishers and other organizations), challenging/difficult software interfaces for submissions, privacy concerns for medical testing situations, and some desire to withhold novel variations as intellectual property. Not all of these issues can be overcome with some software, but they aim to try.

The structural organization of the consortium and contributor community that they wish to develop is described in this slide, which is like Figure 1 in the publication:

So there is a group of MutaAdministrators who oversee the project as a whole (this name makes me giggle a little bit–like a sci-fi government might be called…). There are MutaCurators who assemble and review data on a given gene (is it really just genes? what about non-genic regions and large deletions and such–this isn’t entirely clear to me). Clinicians can give input into the curation, and MutaCircles are groups of labs that do diagnostic testing for a gene and can also discuss, submit, and evaluate data. The MutaCurator acts as the gatekeeper and is accountable for the final appearance of the record.

The gene-specific collections will be freely available online in their database, and link to disease/phenotype information associated with those variations as well. In the tip-of-the-week movie I’ll show you an example of how you might expect a gene record to look when it’s been filled out to some extent.

MutaReviews is a separate aspect that they describe this way on the web site:

MutaREVIEWS is a new “Gene review journal ” published only online, which is freely available to all users. It consists of a compilation of gene review studies that describe the most common human disease genes in a standardised way and lists all observed gene variants. The variants include monogenic variants with high penetrance, rare variants with reduced penetrance and polymorphisms without clinical significance. Each gene review is edited by a specific MutaCURATOR for that gene. These gene reviews are updated every 6 months. There are 12 issues per year.

It’s certainly in the early stages of this project. A lot of the genes I checked just haven’t been curated yet, and I understand that. I hope it works out: I do like the organization and structure, and a one-stop-shop would be handy. But the “build a platform they will come and curate” system has had mixed success elsewhere around biology. And some of the things that need to happen for this to take off are philosophical or possibly legal barriers that are going to vary quite a bit around the research and genetic testing world.

One thing I’d like to see them do is permit and encourage citizen science curation by people who are adopters of personal genomics and looking at data, and by disease community groups who have a specific interest in these genes, but have even more barriers to contribution than the researchers often do.  I’ve found stuff from my genome scan that I don’t really have any place to take, and there’s no way to supplement records at that provider’s site as far as I know. But maybe that’s another variation project somewhere….

Anyway, have a look at MutaDATABASE and see what you think. Or if you participate in this project and I’ve not got some part of this right, drop a note in the comments. I know it’s early in the project and I may not have all the finer points in hand from my looking around and reading.

MutaDATABASE: http://www.mutadatabase.org/ (freely available online database with the variation content)

The sample gene that’s well filled out: http://www.mutareporter.org/mutareporter/Mutadatabase.html?showgene=L1CAM#

MutaReporter: http://www.mutabase.com/index.php?option=com_content&view=article&id=48&Itemid=54 (required license and user subscriptions; but supposedly the MutaDATABASE will have a function to submit that does not require use of this specifically, if I understood that correctly)

MutaBASE: http://www.mutabase.com A company associated with the MutaReporter software. (We have no relationship with that company)


Bale, S., Devisscher, M., Criekinge, W., Rehm, H., Decouttere, F., Nussbaum, R., Dunnen, J., & Willems, P. (2011). MutaDATABASE: a centralized and standardized DNA variation database Nature Biotechnology, 29 (2), 117-118 DOI: 10.1038/nbt.1772

Friday SNPpets

Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

Edit: check out what one of the top searches over the last week has been:

Tip of the Week: Varietas. A plaid database.

For this week’s Tip of the Week I’ll introduce Varietas, a resource that integrates human variation information such as SNP and CNV data, and offers a handy tabular output with links to additional databases that will enable researchers to quickly explore other sources of information about the variations or regions of interest.

I think this is the first resource I’ve used from Finland. And it’s definitely the first resource I have used that is plaid. But it struck me that plaid is a pretty good conceptualization of the variations that we see in the genomes. Some are a single thread, some are larger sections, and the overlaps  between the variations we observed in the genome are important to our understanding of them as well. And the history of computation leads back to textile manufacturing, in fact. So I thought it was a pretty good concept.

But let’s explore the threads of Varietas. You can read the paper, which is linked below, but here I’ll just summarize some of the main features. First let me say the focus of this database appears to be human variation, although you wouldn’t know that very clearly from the site. As far as I could tell there wasn’t any other species data. But if you want human variation data, you’ll find a variety of threads available to you. If you check out the About page, you’ll see the source data available includes Ensembl, the NHGRI GWAS catalog, SNPedia, and GAD. These sources also provide OMIM data, HGNC nomenclature, phenotypes, and MeSH terms. And the threads out include dbSNP, PubMed, SNPedia, and WikiGenes as well. This is also summarized nicely in Figure 1 of their paper.

It’s a very straightforward interface. There is a basic search with a text box for quick searching, and you select the type of data you are starting with: SNPs, genes, keywords, or locations. And the output will be a table with the results that correspond to  your query.

If  you have larger sets of features that you want to interrogate you can use the advanced forms to enter more data.

The tabular output can be viewed on the web with all the handy links. Or you can download the data as a text file to be used in other ways.

I’ll demonstrate the sample search for the movie, but you won’t see the full range of data that’s available there. I wish they had samples for each type of search. But I found one sample that will also show CNV results: choose the Location radio button and enter this location range to see some CNV samples: 6:1234-123400
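If you want to prepare location ranges like that one in a script before pasting them into a search form, a minimal sketch of parsing the `chromosome:start-end` format might look like this. The `Region` type and `parse_region` function are made up for illustration, and treating the coordinates as 1-based and inclusive is an assumption, not something the Varietas site states.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chromosome: str
    start: int
    end: int

    def length(self):
        # Assumes 1-based, inclusive coordinates.
        return self.end - self.start + 1


# Accepts forms such as "6:1234-123400" or "chr6:1,234-123,400".
_REGION_PATTERN = re.compile(r"^(?:chr)?([0-9A-Za-z]+):([\d,]+)-([\d,]+)$")


def parse_region(text):
    """Parse a 'chromosome:start-end' string into a Region."""
    match = _REGION_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a chromosome:start-end range: {text!r}")
    chromosome = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError(f"start is greater than end in {text!r}")
    return Region(chromosome, start, end)


if __name__ == "__main__":
    region = parse_region("6:1234-123400")
    print(region, region.length())
```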

Varietas home page: http://kokki.uku.fi/bioinformatics/varietas/

PubMed record for the paper: http://www.ncbi.nlm.nih.gov/pubmed/20671203


Paananen, J., Ciszek, R., & Wong, G. (2010). Varietas: a functional variation database portal Database, 2010 DOI: 10.1093/database/baq016

Tip of the Week: Genome Variation Tour III

Today’s tip is the continuation of researching a single SNP in an individual genome. Trey will use a dbSNP rs ID to find linkage disequilibrium information between a SNP of interest and SNPs in the region easily and quickly. He uses GVS, the Genome Variation Server at the University of Washington, to analyze a dbSNP rs ID of your choice. This 3-minute screencast will show you how to use the GVS tool to quickly get this information for a wide range of populations.
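For readers who want a feel for what a linkage disequilibrium value means, here is a minimal sketch of the textbook D, D′ and r² calculations for two biallelic SNPs. It assumes you already have phased haplotype counts, which is a simplification, and it is not a description of how GVS computes its values. The function name and the example counts are made up for illustration.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D_prime, r_squared) from counts of the four two-SNP haplotypes.

    A/a are the alleles at the first SNP, B/b the alleles at the second.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes counted")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total

    # D: how far the AB haplotype frequency is from what independence predicts.
    d = p_AB - p_A * p_B

    denominator = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denominator == 0:
        raise ValueError("at least one SNP is monomorphic in this sample")
    r_squared = d * d / denominator

    # D' rescales |D| by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return d, d_prime, r_squared


if __name__ == "__main__":
    # Made-up counts: the two SNPs always travel together.
    print(ld_stats(50, 0, 0, 50))
    # Made-up counts: the two SNPs look independent.
    print(ld_stats(25, 25, 25, 25))
```

An r² near 1 means one SNP is a good stand-in for the other in that population; a value near 0 means knowing one tells you little about the other.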

Tip of the Week: International Cancer Genome Consortium

So, remember that tidal wave of data we were going to get from the human genome project?  Yeah.  That was a puddle compared to what’s coming your way now. For this week’s tip of the week I will introduce the very ambitious big data project from the International Cancer Genome Consortium (ICGC).  In addition, you’ll get your first look at the shiny new interface for BioMart!

People reading this blog know that we have made great progress on many fronts in the war on cancer.  But there’s an awful lot we don’t know yet.  The ICGC network of researchers plans to change that.  This international group of researchers has organized and standardized an effort to learn about tumors.  From their homepage:

ICGC Goal: To obtain a comprehensive description of genomic, transcriptomic and epigenomic changes in 50 different tumor types and/or subtypes which are of clinical and societal importance across the globe.

Check that out:

  • 50 tumor types.  Oh–and by the way–they will also obtain a normal tissue sample from the same individual so you can see what’s part of the normal constitution and what has changed in the tumor.
  • Hundreds of samples of that tumor type.  Except for some rare tumors, they intend to obtain 500 of each tumor.
  • More than a dozen types of cancer. Breast, lung, brain, pancreas, liver, leukemia…and on and on.
  • Genomic. Transcriptomic. Epigenomic.  Each of these is a separate data set that needs to be obtained.  Oh, and already there are simple variations (small numbers of nucleotides), CNVs, structural re-arrangements, expression data….And that’s just the initial release.

Are you overwhelmed yet?  50 x 500 x more than a dozen x 3+ types of data (and that’s just back-of-the-napkin, there’s more…)  I am daunted just thinking about the scale of this.
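Here is one way to put rough numbers on that napkin, as a sketch only. It assumes the full 500 tumors for every one of the 50 tumor types (rare tumors will have fewer), one matched normal sample per tumor, and exactly three data layers. The function and parameter names are made up for illustration, and none of the result is an official ICGC figure.

```python
def estimate_datasets(tumor_types=50, tumors_per_type=500,
                      samples_per_donor=2, data_layers=3):
    """Back-of-the-napkin count of per-sample data sets.

    samples_per_donor=2 assumes one tumor plus one matched normal sample.
    data_layers=3 assumes genomic, transcriptomic and epigenomic data.
    """
    return tumor_types * tumors_per_type * samples_per_donor * data_layers


if __name__ == "__main__":
    print(estimate_datasets())
```

With those assumptions the total comes to 150,000 per-sample data sets, before counting the different kinds of calls (simple variations, CNVs, structural rearrangements, expression) made within each one.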

They have organized and standardized the protocols, technologies, data collection, data submissions, and more.  You should check out their marker paper for a complete description of their framework.  They are going to make 2 types of data available: open access data that is de-identified.  And there is a controlled access data set with clinical details that you’ll have to register for access to.

Do note though: the data (like all these large data projects) is subject to data usage policies that you need to be aware of.  There is a publication moratorium that enables the data submitters a window to publish their findings before others are allowed to publish.  It’s that typical balance of rapid access to data + a non-scoop window for the data providers.  Be sure to familiarize yourself with the policies if you are going to use this data.

But let’s say you are ready for it–you understand the framework, you understand the usage policies–how do you get the data?  You use the very cool new interface for BioMart to do it!  This is your first opportunity to look at the GUI developed for BioMart v 0.8.  There’s more coming, this is an early version.  But that’s how you are going to be able to build great custom queries on the underlying data and pull it down.  You may be familiar with BioMart from any number of places now (Ensembl, Gramene, FlyBase, WormBase….more).  But this is the first implementation of the new look–you are going to want to check that out.

For this week’s Tip of the Week you’ll see the ICGC site, and a quick query of the initial data that is available in the Data Coordination Center (DCC).  But this is just an appetizer.  Brace yourselves–the deluge is coming.

A Nature News article offers a nice overview, but be sure to check out the full paper for the project details.

The International Cancer Genome Consortium site: http://icgc.org/

Oh, and this made me laugh:

Be sure to contact the ICGC team if you have any questions.  They want to help you to use this data, and will be happy to answer your questions.  Personally, I’m making it a mission to help them populate the FAQ–I’ve sent in questions.  And so far my answers have been quite speedy :)

Oy. The reference is longer than the blog post.  Sigh.

Hudson (Chairperson), T., Anderson, W., Aretz, A., Barker, A., Bell, C., Bernabé, R., Bhan, M., Calvo, F., Eerola, I., Gerhard, D., Guttmacher, A., Guyer, M., Hemsley, F., Jennings, J., Kerr, D., Klatt, P., Kolar, P., Kusuda, J., Lane, D., Laplace, F., Lu, Y., Nettekoven, G., Ozenberger, B., Peterson, J., Rao, T., Remacle, J., Schafer, A., Shibata, T., Stratton, M., Vockley, J., Watanabe, K., Yang, H., Yuen, M., Knoppers (Leader), B., Bobrow, M., Cambon-Thomsen, A., Dressler, L., Dyke, S., Joly, Y., Kato, K., Kennedy, K., Nicolás, P., Parker, M., Rial-Sebbag, E., Romeo-Casabona, C., Shaw, K., Wallace, S., Wiesner, G., Zeps, N., Lichter (Leader), P., Biankin, A., Chabannon, C., Chin, L., Clément, B., de Alava, E., Degos, F., Ferguson, M., Geary, P., Hayes, D., Hudson, T., Johns, A., Kasprzyk, A., Nakagawa, H., Penny, R., Piris, M., Sarin, R., Scarpa, A., Shibata, T., van de Vijver, M., Futreal (Leader), P., Aburatani, H., Bayés, M., Bowtell, D., Campbell, P., Estivill, X., Gerhard, D., Grimmond, S., Gut, I., Hirst, M., López-Otín, C., Majumder, P., Marra, M., McPherson, J., Nakagawa, H., Ning, Z., Puente, X., Ruan, Y., Shibata, T., Stratton, M., Stunnenberg, H., Swerdlow, H., Velculescu, V., Wilson, R., Xue, H., Yang, L., Spellman (Leader), P., Bader, G., Boutros, P., Campbell, P., Flicek, P., Getz, G., Guigó, R., Guo, G., Haussler, D., Heath, S., Hubbard, T., Jiang, T., Jones, S., Li, Q., López-Bigas, N., Luo, R., Muthuswamy, L., Francis Ouellette, B., Pearson, J., Puente, X., Quesada, V., Raphael, B., Sander, C., Shibata, T., Speed, T., Stein, L., Stuart, J., Teague, J., Totoki, Y., Tsunoda, T., Valencia, A., Wheeler, D., Wu, H., Zhao, S., Zhou, G., Stein (Leader), L., Guigó, R., Hubbard, T., Joly, Y., Jones, S., Kasprzyk, A., Lathrop, M., López-Bigas, N., Francis Ouellette, B., Spellman, P., Teague, J., Thomas, G., Valencia, A., Yoshida, T., Kennedy (Leader), K., Axton, M., Dyke, S., Futreal, P., Gerhard, D., Gunter, C., Guyer, M., Hudson, T., McPherson, J., 
Miller, L., Ozenberger, B., Shaw, K., Kasprzyk (Leader), A., Stein (Leader), L., Zhang, J., Haider, S., Wang, J., Yung, C., Cross, A., Liang, Y., Gnaneshan, S., Guberman, J., Hsu, J., Bobrow (Leader), M., Chalmers, D., Hasel, K., Joly, Y., Kaan, T., Kennedy, K., Knoppers, B., Lowrance, W., Masui, T., Nicolás, P., Rial-Sebbag, E., Lyman Rodriguez, L., Vergely, C., Yoshida, T., Grimmond (Leader), S., Biankin, A., Bowtell, D., Cloonan, N., deFazio, A., Eshleman, J., Etemadmoghadam, D., Gardiner, B., Kench, J., Scarpa, A., Sutherland, R., Tempero, M., Waddell, N., Wilson, P., McPherson (Leader), J., Gallinger, S., Tsao, M., Shaw, P., Petersen, G., Mukhopadhyay, D., Chin, L., DePinho, R., Thayer, S., Muthuswamy, L., Shazand, K., Beck, T., Sam, M., Timms, L., Ballin, V., Lu (Leader), Y., Ji, J., Zhang, X., Chen, F., Hu, X., Zhou, G., Yang, Q., Tian, G., Zhang, L., Xing, X., Li, X., Zhu, Z., Yu, Y., Yu, J., Yang, H., Lathrop (Leader), M., Tost, J., Brennan, P., Holcatova, I., Zaridze, D., Brazma, A., Egevad, L., Prokhortchouk, E., Elizabeth Banks, R., Uhlén, M., Cambon-Thomsen, A., Viksna, J., Ponten, F., Skryabin, K., Stratton (Leader), M., Futreal, P., Birney, E., Borg, A., Børresen-Dale, A., Caldas, C., Foekens, J., Martin, S., Reis-Filho, J., Richardson, A., Sotiriou, C., Stunnenberg, H., Thomas, G., van de Vijver, M., van’t Veer, L., Calvo (Leader), F., Birnbaum, D., Blanche, H., Boucher, P., Boyault, S., Chabannon, C., Gut, I., Masson-Jacquemier, J., Lathrop, M., Pauporté, I., Pivot, X., Vincent-Salomon, A., Tabone, E., Theillet, C., Thomas, G., Tost, J., Treilleux, I., Calvo (Leader), F., Bioulac-Sage, P., Clément, B., Decaens, T., Degos, F., Franco, D., Gut, I., Gut, M., Heath, S., Lathrop, M., Samuel, D., Thomas, G., Zucman-Rossi, J., Lichter (Leader), P., Eils (Leader), R., Brors, B., Korbel, J., Korshunov, A., Landgraf, P., Lehrach, H., Pfister, S., Radlwimmer, B., Reifenberger, G., Taylor, M., von Kalle, C., Majumder (Leader), P., Sarin, R., Rao, T., Bhan, M., 
Scarpa (Leader), A., Pederzoli, P., Lawlor, R., Delledonne, M., Bardelli, A., Biankin, A., Grimmond, S., Gress, T., Klimstra, D., Zamboni, G., Shibata (Leader), T., Nakamura, Y., Nakagawa, H., Kusuda, J., Tsunoda, T., Miyano, S., Aburatani, H., Kato, K., Fujimoto, A., Yoshida, T., Campo (Leader), E., López-Otín, C., Estivill, X., Guigó, R., de Sanjosé, S., Piris, M., Montserrat, E., González-Díaz, M., Puente, X., Jares, P., Valencia, A., Himmelbaue, H., Quesada, V., Bea, S., Stratton (Leader), M., Futreal, P., Campbell, P., Vincent-Salomon, A., Richardson, A., Reis-Filho, J., van de Vijver, M., Thomas, G., Masson-Jacquemier, J., Aparicio, S., Borg, A., Børresen-Dale, A., Caldas, C., Foekens, J., Stunnenberg, H., van’t Veer, L., Easton, D., Spellman, P., Martin, S., Barker, A., Chin, L., Collins, F., Compton, C., Ferguson, M., Gerhard, D., Getz, G., Gunter, C., Guttmacher, A., Guyer, M., Hayes, D., Lander, E., Ozenberger, B., Penny, R., Peterson, J., Sander, C., Shaw, K., Speed, T., Spellman, P., Vockley, J., Wheeler, D., Wilson, R., Hudson (Chairperson), T., Chin, L., Knoppers, B., Lander, E., Lichter, P., Stein, L., Stratton, M., Anderson, W., Barker, A., Bell, C., Bobrow, M., Burke, W., Collins, F., Compton, C., DePinho, R., Easton, D., Futreal, P., Gerhard, D., Green, A., Guyer, M., Hamilton, S., Hubbard, T., Kallioniemi, O., Kennedy, K., Ley, T., Liu, E., Lu, Y., Majumder, P., Marra, M., Ozenberger, B., Peterson, J., Schafer, A., Spellman, P., Stunnenberg, H., Wainwright, B., Wilson, R., & Yang, H. (2010). International network of cancer genome projects Nature, 464 (7291), 993-998 DOI: 10.1038/nature08987