Tag Archives: genomics

Friday SNPpets

Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

  • RoBuST “has been developed as root and bulb plant community research platform for integrated analysis of root and bulb genomics data.” Cool. I’m a big fan of roots and bulbs–oh, crap,  just realized I forgot to buy carrots for the Pav Bhaji.  Will try to get them tomorrow at the farmer’s market or Faneuil.  [Mary]
  • FEAST is a sensitive local alignment program with multiple rates of evolution. An interesting project as part of a Ph.D. thesis :). I haven’t tried it yet, but from the commentary, it looks good. [Trey]
  • Because Trey often talks about the CLOCK gene, I found this set of Nature papers interesting: Editor’s Summary – Clocking on to diabetes [Jennifer]
  • From BioMed Central: CIG-DB: the database for human or mouse immunoglobulin and T cell receptor genes available for cancer studies plus a link to the actual site (free, no registration required):  CIG-DB [Jennifer]
  • announcement: GMOD Europe 2010, 13-16 September 2010, Cambridge UK [Jennifer]
  • As most parents and anyone who has watched a child over time knows, a large portion of our personalities are genetic. But like height and sexuality, they aren’t easily reduced to single (or even multiple) gene causes as this recent GWAS research is showing. [Trey]
  • There’s a site that fields questions predominantly about next-gen sequencing issues: http://i.seqanswers.com/ [Mary]

Moroccan Science

Al Akhawayn University, Ifrane Morocco

Last week I attended and taught a workshop for the Moroccan American Society for Life Sciences (Biomatec-US) at their 2nd International Workshop and 9th Annual Meeting, in Ifrane Morocco.

I was thoroughly impressed. Impressed with Morocco, Moroccan scientists and Moroccan students. I had the opportunity to interact with all three. First, the students. I taught three workshops, including a tour of genomic resources and two how-to’s for the UCSC Genome Browser and Table Browser. All were enthusiastically received. But more than that, I was impressed by the enthusiasm these students showed for genomics and bioinformatics research. After each talk and later in the day, I was barraged with questions and requests (which I love). Their enthusiasm for science matches or surpasses that of any other group of science students I’ve met in my 20+ year career in biology. In addition, I met several students with whom I was able to discuss their research a bit.

Also, I was able to discuss research in Morocco informally with several Moroccan scientists and attend a roundtable discussion about advancing Moroccan science, specifically biological and bioinformatics research. Moroccan scientists, both within and outside of Morocco, are doing world-class research, including my host of course. The researchers within Morocco and the Moroccan ‘diaspora’ of scientists (there were Moroccan scientists from the US, Europe and the Middle East there) seem to form a ripe network that, together with the enthusiasm of the students, is a great resource for that nation.

If the level of research and enthusiasm of the researchers and students are any indication, Moroccan science will be making great strides in the years to come. Of course, this isn’t anything new I’m sure, just new to me :D.

I learned (relearned) two things on this trip. The world is very small, and very big. I met several people with whom I had crossed paths before or with whom I had mutual friends. There was the Moroccan scientist I had briefly met in Germany while doing a postdoc there, and the Moroccan student who knew someone I knew from Qatar. I was asked to talk briefly at the roundtable discussion, and I mentioned a virtual African conference I had given a workshop at, and that I thought there was a Moroccan hub at that conference. Sure enough, one of the scientists at the discussion had attended my workshop (and had good words for it :D). Ok, you might say, that’s the ‘world’ of science. Well, it got down to even the woman I met in the hotel who was a Fulbright scholar doing research on Berber and Arabic music… and the man who gave me a ride from the conference the last evening, who just happened to be her Moroccan supervisor.

And it’s a huge world with a lot to discover and awe my sometimes jaded self (rarely, but I can be there). I had never heard of Argan oil before, traditionally produced from seeds collected from the feces of goats (today it’s more likely collected and processed by more modern methods :), or even considered touring the magical medina of Fes (to which I MUST return). I had no inkling of the existence of Al Akhawayn University in Ifrane, a small liberal arts school in the cool (it snows) mountains of Morocco (why do I want to keep writing that as iFrane :D?). Beautiful campus.

Street & shops in the medina of Fes, Morocco

The other thing that came to mind while attending this conference and speaking with Moroccan scientists is the potential (and unnoticed reality) of the research possibilities outside of the US-European-Japanese triangle. Of course India and China have been producing more and more great research over the years, but there are another 100 or so countries out there, with another few billion people, that have huge potential. Of course these smaller countries have always produced great scientists, but I am beginning to think that genomics and bioinformatics are starting to help smaller countries ‘leapfrog’ in biological research, much as cell phone technology allowed some developing countries to ‘leapfrog’ from traditional telephone lines (expensive, hard to do) to wireless (less expensive). Biological research has traditionally been resource intensive: labs, larger universities, equipment. Bioinformatics and genomics research, though still requiring infrastructure, has a lower barrier to entry, I believe. I made a comment in my talk, “There is no lack of data,” and it’s true. The amount of data available for analysis is staggering. The number of publicly available tools and databases is overwhelming. One doesn’t have to do “big science” in genomics (though there sure is that) to do world-class research. Thar’s research gold in them thar data hills (sorry for the reference to the California gold rush, I _do_ live in what was the center of it all). Gold that can be mined by any individual, lab or nation with a bit of education and enthusiasm.

I hope to return next year to Morocco and next year’s conference. I have a lot more to learn :D. And maybe I can teach a bit too.

Friday SNPpets

Welcome to our Friday feature link dump: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

  • A bit of a dust-up at ScienceBlogs as they added a corporate blog (Pepsi writing about nutrition science) and several of the bloggers left or threatened to leave. The Pepsi blog is no more. I hope that’s resolved, ScienceBlogs is an excellent collection of science writing. [Trey]
  • Pathway Tools Workshop 2010 held by the folks from BioCyc announced for October 21-25: http://bioinformatics.ai.sri.com/ptools10/ [Mary]
  • Animal portraiture against a white background. It’s been done before, this time with birds. It always reminds me how amazingly beautiful life can be. [Trey]
  • VectorBase announces that they have moved to the new style Ensembl browser with their current release–Yeah!  If you are interested in “Invertebrate Vectors of Human Pathogens”, this database may have species you want to know about. [Mary]
  • A good discussion about the recent ‘longevity gene’ study and its possible flaws by Razib Khan of Gene Expression [Trey]
  • Bandwidth-heavy, but really neat movies of tumor angiogenesis. You can open the Navigator menu to see the various movies listed, or you can migrate around the tumor yourself.  Hat tip to Jill!  [Mary]
  • On the GBrowse mailing list people were looking for examples of GBrowse 2.0 in action. WormBase indicated they are up to that version, and there was another research group with a species I never heard of before that also has it running: Gardnerella vaginalis.  They have compared 2 strains: one from a healthy woman, one suffering from infection. They show divergence, interestingly.  You can check out their recent publication on it from their publication tab.  A nice demonstration of how to use GBrowse for your species of interest. [Mary]

Regulating DNA (tests)

I’ve mentioned before that personal genomics seemed to have hit a tipping point. Some of the evidence of that seems to be that the FDA and other regulatory agencies have taken a heightened interest in mass-market gene and genomics tests.

That is going to be the next step in our progress towards personal genomics and medicine, and one that, if done right, will make this part of our history a successful one. To that end, the Genomics Law Report has an interesting post: “Transparency First: A Proposal for DTC Genetic Testing Regulation.” The author’s argument (make the registry mandatory, make transparency mandatory) is a good start.

There is also a debate going on (which I’m going to be a fence-sitter on for now) on whether the FDA should be governing these ‘Direct to Consumer’ tests. Decisiontree says no:

The controversy seems to have stirred the FDA to assert its authority – and that of physicians – over any and all medical metrics. As readers of The Decision Tree know, I have little patience for the argument that we need doctors as gatekeepers of our genetic information. This isn’t a drug, and this isn’t a device – it’s information about ourselves, as ordinary as our hair color or our waist size or our blood pressure – all things that we can measure and consider without a doctor’s permission.

Gene Sherpa says they got it all wrong:

This is not about getting access to your data.

Fine, you want a whole genome, go get it!

The FDA is not asking should people be able to go out and buy this. It is asking several other questions.

1. Is Interpretation of biometric data considered medicine?

2. Is DTCG analyzing biometric data and intending to give an interpretation of that data which indicates a disease a person has?

3. Should we regulate a system which has not given indication of their quality control if they are indeed intending to provide medical diagnosis?

4. Are these methods of obtaining human samples to derive biometric data for the intent of analyzing and providing information about disease considered medical devices?

All three are interesting and informative reads. Just thought I’d point them out. (Hat tip on those last two to Daily Scan.)

Personal Genomics, tipping points and a personal perspective

Please indulge a long post from a personal perspective: what genomics is about to do for _me_. This is information that many, if not all, of our readers already know. I’ve been researching and working in either experimental biology or genomics for over 20 years. Ever since the beginning of the Human Genome Project, which coincidentally started the same year I started my Ph.D. program, through my postdoctoral research at EMBL and now my work at OpenHelix, I’ve known that someday personal genomics was going to impact me, and millions of others, in a big way. Yet it has always felt like one of those decisions that I, and we as a society, didn’t have to make until we turned that corner that seemed always “just ahead.”

But now I think we’ve turned a corner. It feels, to mix metaphors, that we’ve hit a tipping point. The Human Genome Project, the mapping and sequencing of the/a human genome from 1990 to 2003, cost approximately 2,700,000,000 dollars (that’s 2.7 billion, I wanted to get all the zeros in). Celera did the genome for 300,000,000. The cost of sequencing an entire human genome has been plummeting ever since. In 2007, the cost of sequencing the genome of James Watson (co-discoverer of the structure of DNA) was about 2,000,000. Today the cost is about 10,000. Complete Genomics and other companies are on the march to quickly reduce the cost of sequencing a genome to under 1,000.

Let me graph the last 8 years for you. Mind you, this starts from the 300,000,000 number, not the 2.7 billion, because that graph would be a straight line down.

So, within a year, the cost of sequencing your, my, genome will reach 1,000. If not less. We’ve seen this coming for years now, and it’s upon us. But what does it mean?  A lot of data. But data means nothing without context and analysis. Sequencing my genome would be a waste of 1,000 dollars if I gleaned nothing from it.
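To put rough numbers on that drop, here is a minimal Python sketch that computes the fold reduction between each pair of cost milestones quoted above. The dollar figures are the ones from this post; the labels and the function name are just my own for illustration, and the last entry is the hoped-for target, not a price anyone has hit yet.

```python
# Cost milestones as quoted in this post (US dollars).
# Labels are illustrative; the final entry is a target, not an achieved price.
milestones = [
    ("Human Genome Project", 2_700_000_000),
    ("Celera", 300_000_000),
    ("James Watson's genome (2007)", 2_000_000),
    ("today", 10_000),
    ("target", 1_000),
]


def fold_reductions(points):
    """Return (from_label, to_label, fold) for each consecutive pair."""
    return [
        (a_label, b_label, a_cost / b_cost)
        for (a_label, a_cost), (b_label, b_cost) in zip(points, points[1:])
    ]


for src, dst, fold in fold_reductions(milestones):
    print(f"{src} -> {dst}: {fold:,.0f}x cheaper")

# Overall drop from the first milestone to the last one.
overall = milestones[0][1] / milestones[-1][1]
print(f"overall: {overall:,.0f}x")
```

On these figures the steps work out to 9x, 150x, 200x and 10x, which is why any graph that includes the 2.7 billion starting point looks like a cliff.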

Yet, even that seems to have turned the corner from a few tidbits of genetic information to a steady stream and the beginning of a flood.

You know you’ve turned a corner when a genomics testing company begins to offer genetic tests to the mass market through Walgreens. There’s enough context in that data to make money from it, or so they hope. You can be sure the corner is safely behind you when the FDA tells Pathway Genomics and Walgreens that they will need to hold off while it makes sense of the regulatory implications. Genomic ancestry tests are also gaining in usability… and scrutiny.

It was the recent Lancet paper on clinical analysis that seemed to be a tipping point, not for me or those in the field (genomics has been on my radar since 1988), but for society. I blogged about the paper and its use of genomics resources such as GVS, dbSNP and others. In the paper, the researchers did a thorough clinical assessment of an individual’s genome. We’ve brought down the cost of sequencing; now we are learning how much it’s going to take to assess that data from a medical point of view and, importantly, what we can learn from it.

What can we learn from it? I read this paper again, this time from a personal perspective. Could I learn something from sequencing and analyzing my genome, and if so, what? My answer came to this: yes, I could learn something, and in fact enough that I’m now convinced that as soon as that sequencing gets down to 1,000 or lower (and is a high quality sequence :), I’m going to do it.

There are three things I see from this paper that one could learn from assessing their genome: prevention, early detection and therapy. I believe the first will be, for most people, something they already know, and their genome sequence will tell them nothing new. The other two could be a wealth of information they will want, even need, to know. You’ll notice I left off ‘cure.’ I saw nothing in this paper, and nothing on the near horizon, that suggests to me that our genome sequence data will help with curing anything. Perhaps it will, just not much. Yet, the possibilities of early detection of disease and personalized drug treatment are tantalizing.

Guest Post: New features at CTD – Allan Peter Davis

This next post in our continuing semi-regular Guest Post series is from Allan Peter Davis, of the Comparative Toxicogenomics Database (CTD) at Mount Desert Island Biological Laboratory (MDIBL). If you are a provider of a free, publicly available genomics tool, database or resource and would like to convey something to users on our guest post feature, please feel free to contact us at wlathe AT openhelix DOT com.

The Comparative Toxicogenomics Database (CTD) is a free, public resource that promotes understanding about the effects of environmental chemicals on human health.  Since Trey’s original Tip of the Week about CTD, we’ve added many new features we’d like to highlight.

* The redesigned CTD homepage makes navigation easier and more intuitive.  Check out the keyword quick search box on every page, and try the “All” setting to see the scope of information available at CTD.

* A new Data Status page uses tag clouds to display the updated content for that month.

* We are particularly pleased to announce new statistical analyses of CTD data.  Chemical pages now feature enriched Gene Ontology (GO) terms, garnered from the genes that interact with a chemical.  In this release, CTD connects over 5,000 enriched GO terms to more than 4,500 chemicals.  As well, now our inferred chemical-disease relationships are also statistically scored and ranked.  Both new features will help users explore and generate testable hypotheses about the biological effects of chemicals.

* GeneComps and ChemComps discover genes or chemicals with a similar toxicogenomic profile to your molecule of interest.  Learn more about this feature in our recent publication.

* Reactome data are now also included with KEGG, for a more comprehensive view of pathways affected by chemicals.

* VennViewer and MyGeneVenn are new tools that compare datasets for chemicals, diseases, or genes (including your own gene list) using Venn diagrams to discover shared and unique information.  These two visualization tools are a nice accompaniment to our original Batch Query tool for meta-analysis.

* The FAQ section under the “Help” menu provides examples of how to maximize your experience with CTD.

* Download our Resource Guide (pdf link) to keep as a handy reference card for CTD.
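For readers who like to see an idea in code, the "shared and unique" comparison behind a two-set Venn diagram, as described for VennViewer and MyGeneVenn above, can be sketched with plain Python sets. This is only an illustration of the concept and not how CTD implements its tools; the function name and the placeholder gene symbols (GENE1, GENE2, ...) are made up.

```python
def venn_regions(a, b):
    """Split two collections into the three regions of a two-set Venn diagram."""
    a, b = set(a), set(b)
    return {
        "only_a": a - b,   # unique to the first list
        "shared": a & b,   # present in both lists
        "only_b": b - a,   # unique to the second list
    }


# Placeholder gene symbols, invented for illustration only.
chemical_genes = ["GENE1", "GENE2", "GENE3"]
my_gene_list = ["GENE2", "GENE3", "GENE4"]

regions = venn_regions(chemical_genes, my_gene_list)
for name, members in regions.items():
    print(name, sorted(members))
```

Swapping in a real gene list for either placeholder is all it takes to try this on your own data.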

From the homepage, you can also subscribe to our monthly email newsletter to keep current with CTD’s growing content and features.  You can always contact us to request curation of your favorite chemical or paper.  And with our new “Author Alert” email program, we’ll even contact you to let you know when we’ve curated data from one of your publications in CTD.

We strive to be the best possible resource of chemical-gene-disease networks for the biological community, so feedback and input from users are of great importance to us.

- Allan Peter Davis

Friday SNPpets

Welcome to our Friday feature link dump: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

Personal Genomics, clinical assessment and online resources

The Lancet paper, Clinical assessment incorporating a personal genome, has held my fascination this weekend (yes, I read it at the beach). Mary posted Friday and again Saturday on the paper and related NPR segment. It feels to me to be a seminal paper, though I do agree with Daniel at Genetic Future that there is a lot there we still don’t know. A large portion of the variation is in non-coding regions, and thus predictions and propensities are hard to come by with the available analysis. In fact, as he pointed out, many of the coding region variations have little information as to their effect on disease. I would add also that even if we get to that holy grail of $1,000 to sequence a personal genome, this kind of extensive analysis would still be time- and cost-prohibitive for the vast majority of sequenced genomes.

Yet, as with all early steps in science and medicine, there are missing pieces, large gaps and huge efforts (think “space travel,” “computers,” “microwave ovens,” “internet”) that over time become inexpensive and commonplace (ok, so the first isn’t necessarily “inexpensive”). Sequencing genomes will become inexpensive before the analysis does, but both will come. And I think this paper is pointing to that future.

The other hurdle to large-scale personal genomics I see (of course) is the understanding and use of the genomics and data resources. The authors use a large (and excellent, in my opinion) suite of genomics resources to obtain data and do their analysis. I’ll list them here with links in alphabetical order:

dbSNP (T)
HapMap (T)
PubMed (T)
UniProt (T)

All of these resources have a wealth of data, but even then, that is a lot of analysis and familiarization that is needed with each tool. Each tool does have documentation and tutorials, and of course OpenHelix has tutorials on many of the ones mentioned (those with linked “T”s after the name). Still, this one analysis took a large number of tools and familiarization.

The paper does have a pretty good figure (figure 1) outlining the analysis process. For example, they SIFTed the genome to find gene-associated, non-synonymous, rare and novel, and disease-associated variations and then examined those using dbSNP, HGMD, OMIM and PubMed to analyze something like HFE2, which might have an association with Haemochromatosis. One of my quibbles with the paper, as is often the case with these papers, is that there isn’t a good methods ‘walk-through’ using something like Galaxy or Taverna, in a history or workflow that would help reproduce the analysis.
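To make that sifting step a little more concrete, here is a rough Python sketch that tags a variant with the categories named above. Everything in it is hypothetical: the `Variant` fields, the example records and especially the rarity cutoff are my own assumptions for illustration, not the paper’s actual pipeline or thresholds.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed cutoff for calling a variant "rare"; not taken from the paper.
RARE_THRESHOLD = 0.01


@dataclass
class Variant:
    name: str
    gene: Optional[str]                # None if the variant is not in or near a gene
    nonsynonymous: bool                # changes the encoded amino acid
    allele_frequency: Optional[float]  # None if absent from reference databases
    disease_associated: bool           # has a reported disease association


def categories(v: Variant) -> List[str]:
    """Return the category labels that apply to one variant."""
    labels = []
    if v.gene is not None:
        labels.append("gene-associated")
    if v.nonsynonymous:
        labels.append("non-synonymous")
    if v.allele_frequency is None:
        labels.append("novel")
    elif v.allele_frequency < RARE_THRESHOLD:
        labels.append("rare")
    if v.disease_associated:
        labels.append("disease-associated")
    return labels


# Made-up example records.
examples = [
    Variant("var1", "GENE_A", True, 0.002, True),
    Variant("var2", None, False, None, False),
]
for v in examples:
    print(v.name, categories(v))
```

The variants that pick up several of these labels are the ones you would then go look up in resources like dbSNP, HGMD, OMIM and PubMed.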

We also have a tutorial I’d like to point you to, one that walks through a similar process and teaches users the basics of it. You can find this tutorial here; it’s free and publicly available. The tutorial walks the user through the analysis of a gene variation, in this case in CYP2C9, that affects an individual’s response to Warfarin. There is a similar variation (different gene, affects same drug response) in the paper. The tutorial uses the NIEHS SNPs site to get an overview of the variation, including SIFT and PolyPhen predictions, then goes to the UCSC Genome Browser for an overview of the region, walks through the dbSNP information and does a quick tag SNP analysis using GVS. That tutorial is only one very small step in what will have to be an immense education in genomics and genomics resources.

That is all to point out that the paper is a fascinating first step, and as a first step suggests the gaping holes we will have to fill in bringing personal genomics to medicine.

Ashley, E., Butte, A., Wheeler, M., Chen, R., Klein, T., Dewey, F., Dudley, J., Ormond, K., Pavlovic, A., & Morgan, A. (2010). Clinical assessment incorporating a personal genome The Lancet, 375 (9725), 1525-1535 DOI: 10.1016/S0140-6736(10)60452-7

Genomics resource training scholarship, sponsored by Gramene and OpenHelix

Just thought our readers might like a heads up. I quote from a recent press release:

Cold Spring Harbor Laboratory, Oregon State University and Cornell University, creators of the Gramene Resource for Comparative Plant Genomics, partner with OpenHelix to offer online training on genomic resources to encourage diversity in science.

The Resource for Comparative Grass Genomics, Gramene, sponsors a Gramene tutorial with us, which is thus free to users. Additionally, Gramene is also sponsoring a program, partially funded by NSF,  to open all OpenHelix tutorials to educational institutions serving underrepresented populations. This will give all students, faculty and staff at the institution unlimited access to a wealth of tutorials in our catalog including training on NCBI’s PubMed, Entrez, PlantGDB and over 90 other tutorials on genomic resources. This would be a great opportunity for students and researchers at the applying institution to train on genomics resources!

If you belong to a qualifying institution and would like to apply for this program, please find more information here and send us your application! The deadline for application is June 30th, 2010.

(tweeted here)

Gramene Announces Scholarships for Groups Underrepresented in Science to Learn How to Use Bioinformatics and Genomics Resources

Cold Spring Harbor Laboratory, Oregon State University and Cornell University, creators of the Gramene Resource for Comparative Plant Genomics, partner with OpenHelix to offer online training on genomic resources to encourage diversity in science.

Bellevue, WA (PRWEB) April 22, 2010 - The creators of the Gramene Resource for Comparative Grass Genomics and OpenHelix announce the availability of scholarships to colleges and universities serving underrepresented minorities for full access to over 85 online tutorial suites on bioinformatics and genomics resources. The program is partially funded by the National Science Foundation (NSF).

“An ongoing goal for Gramene, our institutions, and the NSF, has been to provide opportunities for advancement and training to underrepresented groups in science,” said Dr. Doreen Ware, of Cold Spring Harbor Laboratory and Principal Investigator of Gramene, “So we are excited to be able to offer individual and institution scholarships to an extensive and valuable catalog of online training on genomics resources.”

Recipients will have access to the OpenHelix catalog of tutorial suites on a wide range of bioinformatics and genomics resources, including Gramene, PlantGDB, NCBI tools such as Entrez Gene, BLAST and PubMed and many more. A full catalog of tutorial suites is available at http://www.openhelix.com/cgi/tutorials.cgi.

Each tutorial suite includes a 45-60 minute, online, self-run, narrated introductory tutorial on how to use a specific resource. The tutorial suite also includes PowerPoint slides, slide handouts and exercises which can be used as reference material or to build classroom content.

“The study of genomics has affected just about every area of life sciences, so learning how to access and interpret genomic data is critical to research success,” said Scott Lathe, Chief Executive Officer of OpenHelix, “With the convenience and broad accessibility of online training, we hope these scholarships will help in leveling access to this important training and further the potential and ongoing careers of the recipients.”

Institutions can apply for a scholarship for access to the tutorials at http://www.openhelix.com/cgi/scholarships.cgi. The scholarships are available to minority-serving colleges and universities. Underrepresented in science means those racial and ethnic populations that are underrepresented in biology research relative to their numbers in the general population. Individual scholarships are available to U.S. undergraduates, graduate students, post-doctoral students, faculty and staff. The application deadline is June 30, 2010, and a limited number of scholarships are available.

About Gramene
Extensive research over the past two decades has shown significant conservation of gene order within large segments of linkage groups in agriculturally important grasses such as rice, maize, sorghum, barley, oats, wheat, and rye. Grass genomes are substantially colinear at both large and short scales, opening the possibility of using syntenic relationships to rapidly isolate and characterize homologues in maize, wheat, barley and sorghum.

As an information resource, Gramene’s purpose is to provide added value to data sets available within the public sector to facilitate researchers’ ability to understand plant genomes and take advantage of genomic sequence known in one species for identifying and understanding corresponding genes, pathways and phenotypes in other plant species.

Current work is being supported by the NSF Plant Genome Research Resource grant award #0703908.

About OpenHelix
OpenHelix, LLC, (www.openhelix.com) provides the genomics knowledge you need when you need it. OpenHelix provides a bioinformatics and genomics search and training portal, giving researchers one place to find and learn how to use resources and databases on the web. More efficient use of the most relevant resources means quicker and more effective research.