Video Tip of the Week: The New Database of Genomic Variants – DGV2 (edited)
In today’s tip I will briefly introduce you to the beta version of the updated DGV resource. The Database of Genomic Variants, or DGV, was created in 2004 at a time early in the understanding of human structural variation, or SV, which is defined by DGV as genomic variation larger than 50bp. DGV has historically provided public access to SV data in humans who are non-diseased. In the past it both accepted direct data submissions on SV and also provided high quality curation and analysis of the data such that it was appropriate for use in biomedical studies.
We’ve had an introductory tutorial on using DGV for years, and we’ve posted on changes at DGV in the past, so we were quite interested to read in their recent newsletter that there is a newly updated beta version of the DGV resource. The increase in SV data being generated by many large-scale sequencing projects as well as individual labs, has made it difficult for the DGV to continue to collect SV data, to provide a stable and comprehensive data archive AND to manually curate it at the level they have in the past. Therefore the DGV team is now partnering with DGVa at EBI and dbVar at NCBI. DGVa and dbVar will accept SV data submissions, and will function as public data archives (PDA) and, according to the publication sited below, DGVa and dbVar will:
”...provide stable and traceable identifiers and allow for a single point of access for data collections, facilitating download and meta-analysis across studies.“
DGV will no longer accept data submissions, but will instead use accessioned SV data from the archives and focus on providing the scientific community and public at-large with a subset of the data. Again quoting from the paper referenced below:
“The main role of DGV going forward will be to curate and visualize selected studies to facilitate interpretation of SV data, including implementing the highest-level quality standards required by the clinical and diagnostic communities.“
The original DGV resource is still available while comments are collected on the updated beta site. For more information on the updated DGV I suggest you check out this documentation from the DGV team: From their FAQ – “What is the data model used for DGV2?” and from a link in their top navigation area – “DGV Beta User Tutorial“. Be sure to check out the new displays & data that’s available, and most importantly to send your comments & suggestions to the group so that they can design a resource best suited for your needs.
Original Database of Genomic Variants: http://projects.tcag.ca/variation/
New beta version of the Updated DGV: http://dgvbeta.tcag.ca/dgv/app/home
Introductory OpenHelix on Original DGV: http://www.openhelix.com/cgi/tutorialInfo.cgi?id=88
DGV Beta User Tutorial from DGV: http://dgvbeta.tcag.ca/dgv/docs/20111019-DGV_Beta_User_Tutorial.pdf
Church, D., Lappalainen, I., Sneddon, T., Hinton, J., Maguire, M., Lopez, J., Garner, J., Paschall, J., DiCuccio, M., Yaschenko, E., Scherer, S., Feuk, L., & Flicek, P. (2010). Public data archives for genomic structural variation Nature Genetics, 42 (10), 813-814 DOI: 10.1038/ng1010-813
(Free access from PubMed Central here)
Edit, March 5, 2012 – I wanted to add a clarification that we recieved through our contact link. I am pasting it in full, with permission from Margie:
We at TCAG think you did a great job on your video blog of the New Database of Genomic Variants.
I wanted to make a correction to one of your statements: “The increase in SV data (…) at the level they have in the past.”
We, the DGV team, have built a system that CAN handle the new volumes and types of SV data now being published, and we are able to curate all of these data. The reason we partnered with DGVa and dbVar was primarily to provide stable, “universal” accessions for SV data. We also work with DGVa and dbVar to define standard terminology, data types, and data exchange formats.
I just wanted to make sure it was clear that we are fully capable to handle the SV data being published now. Our reason for partnership was to foster standardized data and open data sharing across systems.
Thanks again for your blog post!