Category: Genomics Research

Friday SNPpets

30 July, 2010 (09:18) | General Science, Genomics News, Genomics Research, SNPpets | By: Mary

Welcome to our Friday feature link dump: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

http://genome.hmgc.mcw.edu/
http://genome-mirror.duhs.duke.edu/
http://genome-mirror.bscb.cornell.edu/
http://genome-mirror.binf.ku.dk/
http://genome.qfab.org/

A database of protocols?

29 July, 2010 (12:23) | General Science, Genomics Research | By: Mary

When we were doing a workshop recently about a huge range of bioinformatics/genomics database resources, a young researcher came up and asked about one of the types of resources we didn’t yet include in our list–and I thought it was an excellent question.

Is there a database of protocols?

That’s such a good question–and I can see why it would be so useful to bench scientists. Yeah, I can show you all the software we cover. But there are still other needs, and searchable protocols would be very handy.

Off the top of my head I pointed her to the Current Protocols publications. I relied on those in grad school–but at that time they were in the red binder on the second shelf, and Hiram would always pull out the old chapters and put in the new ones–on actual paper!!

I also mentioned JoVE–the Journal of Visualized Experiments. Jennifer talked about this before as a tip of the week. Since then they have had to move to a subscription model, but it’s possible this young scholar’s institution does subscribe to that.

Trey remembered Open Wetware.

UPDATES: Great stuff from readers–thanks:

APD offers: NAR Methods http://nar.oxfordjournals.org/collections/index.dtl

Natalie offers: Protocol-Online http://www.protocol-online.org/ and her collected list: http://www.bib.umontreal.ca/sa/laboratoire-indispensables.htm

Jennifer suggests BioTechniques http://biotechniques.com/ and for  protein structure notes this resource: http://kb.psi-structuralgenomics.org/

_____________________________________

Do you know of others? I’ll edit the post as people bring them along. Thanks!

Moroccan Science

27 July, 2010 (16:29) | General Science, Genomics Research | By: Trey

[caption id="attachment_4895" align="alignleft" width="300" caption="Al Akhawayn University, Ifrane Morocco"][/caption]

Last week I attended and taught a workshop for the Moroccan American Society for Life Sciences (Biomatec-US) at their 2nd International Workshop and 9th Annual Meeting, in Ifrane, Morocco.

I was thoroughly impressed. Impressed with Morocco, Moroccan scientists and Moroccan students. I had the opportunity to interact with all three. First, the students. I taught three workshops, including a tour of genomic resources and two how-to’s for the UCSC Genome Browser and Table Browser. All were enthusiastically received. But more than that I was impressed by the enthusiasm these students showed for genomics and bioinformatics research. After each talk and later in the day, I was barraged with questions and requests (which I love). Their enthusiasm for science matches or surpasses any other group of science students I’ve met in my 20+ year career in biology. In addition to that, I met several students with whom I was able to discuss their research a bit.

Also, I was able to discuss research in Morocco with several Moroccan scientists informally and attend a roundtable discussion about advancing Moroccan science, specifically biological and bioinformatics research. Moroccan scientists, both within and outside of Morocco, are doing world-class research, including my host of course. The research done within Morocco and by the Moroccan ‘diaspora’ of scientists (there were Moroccan scientists from the US, Europe and the Middle East there) seems to be a ripe network that, together with the enthusiasm of the students, is a great resource for that nation.

If the level of research and enthusiasm of the researchers and students are any indication, Moroccan science will be making great strides in the years to come. Of course, this isn’t anything new I’m sure, just new to me :D .

I learned (relearned) two things on this trip. The world is very small, and very big. I met several people with whom I had crossed paths before or with whom I had mutual friends. There was the Moroccan scientist whom I briefly met in Germany while doing a postdoc there, and the Moroccan student who knew someone I knew from Qatar. I was asked to talk briefly at the roundtable discussion, and I mentioned a virtual African conference I had given a workshop at, and that I thought there was a Moroccan hub at that conference. Sure enough, one of the scientists at the discussion had attended my workshop (and had good words for it :D ). Ok, you might say, that’s the ‘world’ of science. Well, it got down to even the woman I met in the hotel who was a Fulbright scholar doing research on Berber and Arabic music… and the man who gave me a ride from the conference the last evening, who just happened to be her Moroccan supervisor.

And it’s a huge world with a lot to discover and awe my sometimes jaded self (rarely, but I can be there). I had never heard of Argan oil before, produced from seeds collected from the feces of goats, or even considered touring the magical medina of Fes (to which I MUST return). I had no inkling of the existence of Al Akhawayn University, a small liberal arts school in the cool (it snows) mountains of Morocco in Ifrane (why do I want to keep writing that as iFrane :D ? ). Beautiful campus.

[caption id="attachment_4896" align="alignright" width="225" caption="Street & shops in the medina of Fes, Morocco"][/caption]

The other thing that came to mind while attending this conference and speaking with Moroccan scientists is the potential (and unnoticed reality) of research outside of the US-European-Japanese triangle. Of course India and China have been producing more and more great research over the years, but there are another 100 or so countries out there with another few billion people and huge potential. Of course these smaller countries have always produced great scientists, but I am beginning to think that genomics and bioinformatics are starting to help smaller countries ‘leapfrog’ in biological research, much as cell phone technology allowed some developing countries to ‘leapfrog’ from traditional telephone lines (expensive, hard to do) to wireless (less expensive). Biological research has traditionally been resource intensive: labs, larger universities, equipment. Bioinformatics and genomics research, though still requiring infrastructure, has a lower barrier to entry, I believe. I made a comment in my talk, “There is no lack of data,” and it’s true. The amount of data available for analysis is staggering. The number of publicly available tools and databases is overwhelming. One doesn’t have to do “big science” in genomics (though there sure is that) to do world-class research. Thar’s research gold in them thar data hills (sorry for the reference to the California gold rush, I _do_ live in what was the center of it all). Gold that can be mined by any individual, lab or nation with a bit of education and enthusiasm.

I hope to return next year to Morocco and next year’s conference. I have a lot more to learn :D . And maybe I can teach a bit too.

Friday SNPpets

23 July, 2010 (08:46) | Genomics Research, Genomics Resource News, SNPpets | By: Mary

Welcome to our Friday feature link dump: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

Tip of the Week: 1000 Genomes Project Browser

21 July, 2010 (08:27) | Genomics Research, New Resource, Tip of the Week | By: Mary


You may have been hearing about the 1000 Genomes project–it’s one of the ongoing “big data” projects that is going to yield a great deal of variation information about the human genome. The goal is to sequence well over 1000 genomes to identify “most genetic variants that have frequencies of at least 1% in the populations studied”. They are doing this by sequencing large numbers of samples with 4x coverage. You can read more about their strategy in the About page on their web site. It also lists the anticipated sample populations.
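To get a feel for why a project would pair many samples with light (4x) coverage, here is a rough back-of-the-envelope sketch. The modeling choices are assumptions made for illustration and are not taken from the project: chromosomes are treated as independent draws, and read depth at a site is treated as Poisson-distributed around the mean coverage. The function names are made up for this example.

```python
import math


def prob_variant_sampled(freq: float, n_individuals: int) -> float:
    """Chance that a variant at allele frequency `freq` is carried by at
    least one of the 2 * n_individuals chromosomes in the sample.
    Assumes diploid samples and independent chromosomes (a simplification)."""
    chromosomes = 2 * n_individuals
    return 1.0 - (1.0 - freq) ** chromosomes


def prob_depth_at_most(k: int, mean_depth: float) -> float:
    """Probability that a site is covered by k or fewer reads, assuming
    read depth follows a Poisson distribution (also a simplification)."""
    return sum(
        math.exp(-mean_depth) * mean_depth ** i / math.factorial(i)
        for i in range(k + 1)
    )


if __name__ == "__main__":
    # A 1% variant among 1000 diploid individuals (2000 chromosomes).
    print(prob_variant_sampled(0.01, 1000))
    # Chance that one sample has no reads at all at a site with 4x mean depth.
    print(prob_depth_at_most(0, 4.0))
```

Under these assumptions a 1% variant is almost certain to be present somewhere in 1000 individuals, while any single 4x sample has no reads at roughly 1.8% of sites (exp(-4)), which is one intuition for why the evidence has to be pooled across many samples.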

In this week’s Tip of the Week I’m going to take a quick spin through their browser. (You can also download all the data, but I’ll be focusing on the browser.) They have begun to release data now, and there are 6 individual sequences available at this time.  These are part of their “pilot” studies.  You can get some details on the pilot from their about page, which links to this PDF about the samples.

They are using the Ensembl framework to display their data. So if you are familiar with using Ensembl you’ll have some facility moving around this browser.  One thing that isn’t apparent right away from the site is that you can click the Resembl link on the display to turn on a track that puts the read/coverage data on the viewer. I also liked the alignment display  of all 6 genomes–but I’m sure that’s going to get challenging to view later with more and more genomes.

In an exchange with their very helpful help desk yesterday, I got this quick summary of the samples you’ll see:

For the high coverage populations NA12891, NA12892 and NA12878 are the CEU trio, NA19238, NA19239 and NA19240 are the YRI trio both father, mother, child respectively and both children were daughters.

If you have questions about their data, be sure to go ask them for help–they were very speedy with answers for me :) .

Some of the project data has also been picked up by UCSC and you can access the same sequences in the UCSC Genome Browser in the Genome Variants track on the March 2006 human assembly. (You’ll also see Venter, Watson, and some other individual genomes there).

Quick links:

The Project: http://www.1000genomes.org/

The Browser: http://browser.1000genomes.org/

An article in Science with some background:  A Plan to Capture Human Diversity in 1000 Genomes

A Personal Genomics blog: Genomes Unzipped

12 July, 2010 (13:08) | Genomics Research | By: Mary

Sorry for the light posting this month. We’ve been going wild with workshops all over the place, and travel is such a challenge these days. Oy.  And it’s also vacation season, of course.  And the proofs came for our paper today….But today I learned about a place to do some additional reading if you are itching for something new–check out the new blog Genomes Unzipped.

I read about it over at SciBlogs where Daniel introduced it today.  Looks like it will be a nice place to go for quality discussions in this arena. Check it out.

Tip of the Week: MINT for protein interactions

7 July, 2010 (09:05) | Genomics Research, Tip of the Week | By: Mary


We’ve long been fans of the tools developed by the team responsible for MINT: Molecular INTeraction database.  MINT is a curated resource full of experimentally verified protein-protein interactions, with some great visualization options.  In addition to the main MINT interface, there are other aspects to the site that bring other types of visualization as well.  We have done a tip on MINT in the past, but we wanted to re-visit this for our SciVee collection, and also mention a handy tool called Connect. Connect can be used to enter a list of up to ~100 proteins and generate the connection map between them.

HomoMINT: this tool extends the experimentally-verified interaction collection to include inferred interactions for human, based on data from model organisms.  So this is homologous interactions, hence the name….

Domino: a look at the domains that are involved in the protein-protein interactions.

VirusMINT: this aspect of MINT explores viral proteins, including how they interact with host proteins to disrupt host physiology.

For this week’s tip I’ll focus mainly on the experimentally-verified portion of MINT and that interface, and introduce the others. You’ll see how to do a quick search, explore protein details, and then load up the network in the visualization tool.  We have a full tutorial on MINT available for subscribers for people who want to go deeper into the functionality–we can only barely touch on the features in our screencast movie limit.
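For readers who think in code, the Connect idea mentioned above (give it a list of proteins, get back the connections among them) can be pictured with a small sketch. This is only an illustration of the concept: how Connect actually builds its map is not described here, the protein names and interaction pairs are made up, and this version keeps only direct interactions where both partners are on the list.

```python
from typing import Dict, Iterable, List, Set, Tuple


def connection_map(
    proteins: List[str],
    interactions: Iterable[Tuple[str, str]],
) -> Dict[str, Set[str]]:
    """Build an undirected map of direct interactions among `proteins`.

    `interactions` is an iterable of (protein_a, protein_b) pairs, such as
    one might export from an interaction database (hypothetical input)."""
    wanted = set(proteins)
    network: Dict[str, Set[str]] = {p: set() for p in proteins}
    for a, b in interactions:
        # Keep an edge only when both partners are on the query list.
        if a in wanted and b in wanted and a != b:
            network[a].add(b)
            network[b].add(a)
    return network


if __name__ == "__main__":
    # Made-up interaction pairs, for illustration only.
    pairs = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]
    for protein, partners in connection_map(["P1", "P2", "P3"], pairs).items():
        print(protein, sorted(partners))
```

Here P4 is dropped because it is not on the query list; a real tool might well choose to show such neighbors instead.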

Edit: should have put the MINT link more clearly http://mint.bio.uniroma2.it/mint/Welcome.do Go to MINT.

Ceol, A., Chatr Aryamontri, A., Licata, L., Peluso, D., Briganti, L., Perfetto, L., Castagnoli, L., & Cesareni, G. (2009). MINT, the molecular interaction database: 2009 update Nucleic Acids Research, 38 (Database) DOI: 10.1093/nar/gkp983

We’ve got widgets

28 June, 2010 (13:57) | Genomics Research, New Resource | By: Trey

I’ve mentioned others’ widgets before. They can be very handy tools on websites and blogs to add content and useful interactive searches, etc.

Well, we now have our own. As many of our readers know, we have a genomics and bioinformatics search engine that helps the researcher find the database or analysis tool that best fits their need. Type in a term and you get a list of genomics resources ranked by relevance. In addition, you are shown in context where the term was found (the resource web site, or our tutorials or blog if it appears there). Additionally, you’ll find tutorials we’ve created on nearly 100 of them: about a dozen free to the user, like PDB, SGKB, and the UCSC Genome Browser, and another 80 or so by subscription.

Anyway, you can now put the search (which of course is publicly available) on your blog or web site using one of the widgets we’ve just had created (by the same people who helped create our database search). We have three sizes and you can find them and the code for them at this page.

You’ll also see I’ve put the smaller widget on the right column here on the blog. You can put a term in there and test it out. It will open another page with the results of our search. Try it out!

Big data specialists…yeah, but…

23 June, 2010 (15:05) | General Science, Genomics Research | By: Mary

There is a great discussion on Big Data today that I found on the twittosphere.  Hat tip to Paul Blaser on the tweet that got my attention.  I have posted a comment over there, but I decided as I was writing it that I wanted to bring it over here as well.  (I also added some links here that I couldn’t add over there since without preview I hate to not be able to test them.)

Deepak has a post up on the blog business|bytes|genes|molecules called The Biological Data Scientist.  It speaks to big data projects, and the need to have specialists in biological data to handle it.

I suspect that we do actually agree on much of the concept.  But like a lot of things, I think more downstream about the implementation of the topic on the ground.  And my thoughts on that are below, which I posted as a comment over there.

+++++++++++++++++++

Hmmm…I certainly agree with large chunks of this. But I don’t agree that this should be the domain of some kind of data scientist. Or–more specifically–it does need to have their hands on it to some extent. But I think it still needs to be accessible to the handful-of-genes bench biologists.

The idea of the multi-functional team is terrific, when it is possible.  But we see a lot of people who are not getting that kind of support from their local “bioinformatics” club–for a couple of reasons: if you have some big-data folks on site, they have their own project to worry about. They are not eager to hand-hold others on the way in to the data.  It’s not their job. It’s not what they are supported to do, and it doesn’t help them with their next grant.

If you have some kind of dedicated bioinformatics core support, the quality of the support varies widely: the kinds of things they do, the skills they have, the interest in actual support.

We have seen some great examples.  For example, it seems to me the team at CHOP in Philly provides this kind of support: in house tools to support the researchers, bringing in the right tools to add more support, training everyone up to some level so they are at least aware of what the tools can do. (Samples of CHOP tools, team, and training.)

On the other hand, we’ve been to some major institutions–many with “big data” projects–where people are getting next to zero interaction with anyone who could help them. You’d be stunned if I told you who these people are.

Then there are those who don’t even have a shot at this. People trying to keep up, and write new grants with hot new data, who are on some mid-western campus that really just doesn’t even have someone to ask. I talked to one woman once who needed a really simple thing out of the UCSC Genome Browser. It took me roughly 5 minutes to build the right query, pull the data out of the table browser, and hand it to her. I thought she was going to kiss me. She told me she had expected that to take her 6 months of benchwork.

I would hate to see this strategy create a tier of biologists who are nearly locked out of the data. Because it is also still eminently clear that we can throw a lot of big data at a project, but the crucial details require the “small people” to look closely at them. And many of them feel excluded from the club already.

Guest Post: SNAP — Andrew Johnson

22 June, 2010 (14:01) | Genomics Research, Genomics Resource News, Guest Posts, New Resource | By: Trey

This next post in our continuing semi-regular Guest Post series is from Andrew Johnson, one of the developers and the concept designer of SNAP, SNP Annotation and Proxy Search which is hosted at the Broad Institute. If you are a provider of a free, publicly available genomics tool, database or resource and would like to convey something to users on our guest post feature, please feel free to contact us at wlathe AT openhelix DOT com or the contact form (write ‘guest post’ as subject heading). We welcome introductions to your resource, information on updates, highlights of little known gems or opinion pieces on the state of genomic research and databases.

SNAP (http://www.broadinstitute.org/mpg/snap/, Johnson et al. (2008) Bioinformatics 24(24): 2938), “SNP Annotation and Proxy search”, is a flexible, web-based tool that allows anyone in the world to quickly accomplish a range of SNP-related genetics and bioinformatics tasks. This post highlights some common questions and features of SNAP, some more obscure uses, and recent and planned developments.

How did SNAP come about?

The idea for SNAP was originally sparked by GWAS analysts within a large collaborative group (the Framingham Heart Study SHARe project). This was in the pre-imputation era, when GWAS investigators from different groups using different SNP arrays often wanted to find the best proxy SNPs based on HapMap for comparison when they didn’t have common genotyped SNPs across groups. We initially implemented local programs to look up HapMap LD and also consider the presence of query and proxy SNPs on different commercial genotyping arrays. We quickly realized this was a community-wide problem as we received requests from outside collaborators, so we decided it was worth developing a public tool and approached investigators at the Broad Institute. Through collaboration with Paul de Bakker, Bob Handsaker and others at the Broad Institute we were able to add more features like plotting and build a nice, quick and accessible interface. Many people have contributed ideas, testing and improvements to SNAP, and Bob Handsaker and Pei Lin in particular continue to maintain and update SNAP.
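The proxy lookup described above (find the SNP in strongest LD with a query SNP that is also present on a given genotyping array) can be sketched in a few lines. This is a minimal illustration of the idea only and not SNAP’s actual code: the SNP identifiers, r² values, array contents, and the 0.8 threshold are all made up for the example.

```python
from typing import Iterable, Optional, Set, Tuple

# Hypothetical pairwise LD records: (snp_a, snp_b, r_squared).
LD_PAIRS = [
    ("rs0001", "rs0002", 0.95),
    ("rs0001", "rs0003", 0.82),
    ("rs0001", "rs0004", 0.40),
    ("rs0005", "rs0003", 1.00),
]


def best_proxy(
    query: str,
    ld_pairs: Iterable[Tuple[str, str, float]],
    array_snps: Set[str],
    min_r2: float = 0.8,
) -> Optional[Tuple[str, float]]:
    """Return (proxy_snp, r2) for the best proxy present on the array,
    or None if no SNP on the array reaches the r2 threshold."""
    # If the query SNP itself is genotyped on the array, no proxy is needed.
    if query in array_snps:
        return (query, 1.0)
    best: Optional[Tuple[str, float]] = None
    for a, b, r2 in ld_pairs:
        if query not in (a, b):
            continue
        partner = b if a == query else a
        if partner in array_snps and r2 >= min_r2:
            if best is None or r2 > best[1]:
                best = (partner, r2)
    return best


if __name__ == "__main__":
    # A made-up array that lacks rs0001 and rs0002.
    print(best_proxy("rs0001", LD_PAIRS, {"rs0003", "rs0004"}))
```

With the made-up data above, rs0002 would be the strongest proxy overall, but because it is not on the example array the function falls back to rs0003; that array-aware fallback is the point of the exercise.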

What do you use SNAP for the most?

The two most widely used features of SNAP are 1) SNP LD queries, and 2) plotting of LD and association data. There are a number of flexible options for these functions. Beyond these, as a SNP bioinformatics specialist, I often use SNAP to rapidly retrieve information about a list of SNPs for other uses (see specialized queries below).

What are some commonly asked questions from users of SNAP?

Click to continue reading “Guest Post: SNAP — Andrew Johnson”