<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The OpenHelix Blog &#187; snps</title>
	<atom:link href="http://blog.openhelix.eu/?tag=snps&#038;feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://blog.openhelix.eu</link>
	<description>at OpenHelix</description>
	<lastBuildDate>Thu, 09 Sep 2010 12:18:23 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Tip of the Week: RGenetics at Galaxy</title>
		<link>http://blog.openhelix.eu/?p=4899</link>
		<comments>http://blog.openhelix.eu/?p=4899#comments</comments>
		<pubDate>Wed, 28 Jul 2010 13:06:03 +0000</pubDate>
		<dc:creator>Trey</dc:creator>
				<category><![CDATA[Tip of the Week]]></category>
		<category><![CDATA[galaxy]]></category>
		<category><![CDATA[rgenetics]]></category>
		<category><![CDATA[snps]]></category>
		<category><![CDATA[variation]]></category>

		<guid isPermaLink="false">http://blog.openhelix.eu/?p=4899</guid>
		<description><![CDATA[]]></description>
			<content:encoded><![CDATA[<p><object width="480" height="400"><param name="movie" value="http://www.scivee.tv/flash/embedCast.swf" /><param name="allowfullscreen" value="true" /><param name="flashvars" value="id=19874&#038;type=3" /><embed src="http://www.scivee.tv/flash/embedCast.swf" allowfullscreen="true" width="480" height="400" flashvars="id=19874&#038;type=3"></embed></object> About 6-7 months ago, <a href="http://blog.openhelix.eu/?p=2924" target="_blank">Mary mentioned that R-Genetics analysis was coming to Galaxy</a>. Well, it has now and is available at the <a href="http://www.usegalaxy.org" target="_blank">public Galaxy site</a>. The <a href="http://old.rgenetics.org/rgenetics" target="_blank">old Rgenetics site</a> links to the new one and the information about using Galaxy as a wrap around interface for the Rgenetics project tools. Today&#8217;s tip just points you to the tool and gives you a quick overview of what is there. You&#8217;ll need to do some exploring to learn to use it! Of course, we have <a href="http://www.openhelix.com/galaxy" target="_blank">our publicly available Galaxy tutorial</a> to get you started.</p>
<p>(oh, and I point you to <a href="http://main.g2.bx.psu.edu/screencast" target="_blank">this tutorial on analyzing Desmond Tutu&#8217;s SNPs using Galaxy</a> that I thought was interesting)</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.openhelix.eu/?feed=rss2&amp;p=4899</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Guest Post: SNAP &#8212; Andrew Johnson</title>
		<link>http://blog.openhelix.eu/?p=4697</link>
		<comments>http://blog.openhelix.eu/?p=4697#comments</comments>
		<pubDate>Tue, 22 Jun 2010 18:01:07 +0000</pubDate>
		<dc:creator>Trey</dc:creator>
				<category><![CDATA[Genomics Research]]></category>
		<category><![CDATA[Genomics Resource News]]></category>
		<category><![CDATA[Guest Posts]]></category>
		<category><![CDATA[New Resource]]></category>
		<category><![CDATA[hapmap]]></category>
		<category><![CDATA[SNAP]]></category>
		<category><![CDATA[snps]]></category>

		<guid isPermaLink="false">http://blog.openhelix.eu/?p=4697</guid>
		<description><![CDATA[This next post in our continuing semi-regular Guest Post series is from Andrew Johnson, one of the developers and the concept designer of SNAP, SNP Annotation and Proxy Search which is hosted at the Broad Institute. If you are a provider of a free, publicly available genomics tool, database or resource and would like to convey [...]]]></description>
			<content:encoded><![CDATA[<p><em>This next post in our continuing semi-regular </em><a href="http://blog.openhelix.eu/?cat=7" target="_blank"><em>Guest Post series</em></a><em> is from Andrew Johnson, one of the developers and the concept designer of </em><em><a href="http://www.broadinstitute.org/mpg/snap/" target="_blank">SNAP, SNP Annotation and Proxy Search</a> which is hosted at the <a href="http://www.broadinstitute.org/" target="_blank">Broad Institute</a></em><em>. If you are a provider of a free, publicly available genomics tool, database or resource and would like to convey something to users on our guest post feature, please feel free to contact us at wlathe AT openhelix DOT com or </em><a href="http://www.openhelix.com/cgi/contact.cgi" target="_blank"><em>the contact form</em></a><em> (write &#8216;guest post&#8217; as subject heading). We welcome introductions to your resource, information on updates, highlights of little known gems or opinion pieces on the state of genomic research and databases.</em></p>
<p id="zw-11">SNAP (<a id="zw-13" href="http://www.broadinstitute.org/mpg/snap/">http://www.broadinstitute.org/mpg/snap/</a>, Johnson et al. (2008) Bioinformatics 24(24): 2938), “SNP Annotation and Proxy search”, is a flexible, web-based tool that allows anyone in the world to quickly accomplish a range of SNP-related genetics and bioinformatics tasks. This post highlights some common questions and features of SNAP, some more obscure uses, and recent and planned developments.</p>
<p id="zw-24"><strong><em>How did SNAP come about?</em></strong></p>
<p id="zw-26">The idea for SNAP was originally sparked by GWAS analysts within a large collaborative group (the Framingham Heart Study SHARe project). This was in the pre-imputation era when GWAS investigators from different groups using different SNP arrays often wanted to find the best proxy SNPs based on HapMap for comparison when they didn’t have common genotyped SNPs across groups. We initially implemented local programs to look up HapMap LD and also consider the presence of query and proxy SNPs on different commercial genotyping arrays. We quickly realized this was a community-wide problem as we received requests from outside collaborators, so we decided it was worth developing a public tool and approached investigators at the Broad Institute. Through collaboration with Paul de Bakker, Bob Handsaker and others at the Broad Institute we were able to add more features like plotting and build a nice, quick and accessible interface. Many people have contributed ideas, testing and improvements to SNAP, and Bob Handsaker and Pei Lin in particular continue to maintain and update SNAP.</p>
<p id="zw-45"><em><strong>What do you use SNAP for the most?</strong></em></p>
<p id="zw-51">The two most widely used features of SNAP are 1) SNP LD queries, and 2) plotting of LD and association data. There are a number of flexible options for these functions. Beyond these, as a SNP bioinformatics specialist, I often use SNAP to rapidly retrieve information about a list of SNPs for other uses (see specialized queries below).</p>
<p id="zw-57"><strong><em>What are some commonly asked questions from users of SNAP?</em></strong></p>
<p><strong><em><span id="more-4697"></span><br />
</em></strong></p>
<p>Many of the common questions are covered in detail in the FAQ and/or Documentation available on the website. Here are some questions I commonly receive.</p>
<p id="zw-63">How do I return all LD proxy relationships within 500kb of my query SNPs? Change the r^2 threshold to “No Limit” and leave the distance limit at “500”.</p>
<p id="zw-67">Why doesn’t my favorite SNP appear when I make a query? Occasionally query and/or expected proxy SNPs will not be found. This could be a result of an error in your representation of the query SNPid. The most likely explanation for a missing proxy SNP is that the filters you’ve selected (r^2 and distance limits) have caused the SNP not to be included. Another likely explanation is that the expected SNP is &gt;500kb away from the query SNPs (see below). Rarely, SNAP may return an alias SNPid for a proxy rather than the one you expect, since SNAP takes aliases into account in queries (see below). Finally, a SNP(s) may not be included in the HapMap release you are querying. HapMap releases 21 and 22 differ at a small number of SNPs. HapMap release 3 differs from prior releases at a greater number of SNPs. If it is important, you should try querying different HapMap releases to find a release(s) with your SNP(s) of interest. Alternatively, you can try to find genotype data separately and load it into a program such as Haploview to calculate LD metrics.</p>
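<p>As a rough illustration of how the two filters interact, here is a minimal Python sketch. It is not SNAP&#8217;s code: the function name, the record layout, the SNP IDs and the 0.8 default threshold are all assumptions made for the example, and <code>min_r2=None</code> simply stands in for the “No Limit” setting.</p>

```python
def filter_proxies(pairs, min_r2=0.8, max_distance_kb=500):
    """Keep proxy records that pass both the r^2 and the distance filter.

    min_r2=None stands in for the "No Limit" setting (assumed behaviour).
    The 0.8 default is an assumption for this sketch, not a documented value.
    """
    kept = []
    for pair in pairs:
        passes_r2 = min_r2 is None or pair["r2"] >= min_r2
        passes_distance = max_distance_kb >= pair["distance_kb"]
        if passes_r2 and passes_distance:
            kept.append(pair)
    return kept


# Hypothetical records; the IDs and values are made up for illustration.
PAIRS = [
    {"proxy": "rsA", "r2": 0.95, "distance_kb": 12},
    {"proxy": "rsB", "r2": 0.40, "distance_kb": 80},
    {"proxy": "rsC", "r2": 0.99, "distance_kb": 650},
]

if __name__ == "__main__":
    # With an r^2 threshold, the weakly linked proxy is dropped.
    print([p["proxy"] for p in filter_proxies(PAIRS)])
    # With no r^2 limit, only the distance filter applies.
    print([p["proxy"] for p in filter_proxies(PAIRS, min_r2=None)])
```

<p>In this sketch the record that lies 650kb away never appears, however strong its LD, which mirrors the missing-SNP explanations given above.</p>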
<p id="zw-100">Can I generate plots based on my own data? Yes. You can upload both your own association data (-log P) and your own LD data, as you may have generated LD estimates within your own population and/or a larger sample than available in the HapMap. If you don’t specify your own LD data SNAP uses HapMap by default. If you don’t provide chromosome and position SNAP fills these in based on HapMap. If you have de novo markers that are not in HapMap you can also include these as long as you specify the chromosome and positions and LD to the target SNP.</p>
<p id="zw-113">Why do I observe different LD estimates for the same pair of SNPs in different HapMap releases? Identical SNP pairs generally have identical, or very similar, LD estimates in different releases of HapMap. If LD estimates differ slightly it is attributable to differences in genotypes in the releases.</p>
<p id="zw-128">What if I just want to query LD among a select group of SNPs? Click the “Pairwise LD” tab. Copy and paste your SNPid list, or upload a file. Your LD queries will be limited to only your SNPs of interest rather than all HapMap SNPs that meet the filtering criteria.</p>
<p id="zw-143">What if I want to find SNPs genotyped only on a specific array or group of arrays? There is a rapid way to limit queries to specific arrays. Click the ‘+’ on the Filter By Array. Select those arrays you want.</p>
<p id="zw-146">What if I want to calculate long range LD or trans-chromosomal LD? SNAP returns results for pairwise LD between SNPs with distance up to 500kb. This is greater than the default of HapMap pre-calculated data of 250kb. In some cases users may want to assess SNPs that are further apart. A few options exist including 1) downloading the HapMap genotypes and loading into Haploview while removing the pairwise distance limitation, 2) calculating using PLINK, or 3) querying with the GLIDERS website (<a id="zw-159" href="http://mather.well.ox.ac.uk/GLIDERS/">http://mather.well.ox.ac.uk/GLIDERS/</a>). GLIDERS returns extreme long range queries on chromosomes or trans-chromosomal queries.</p>
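<p>For readers who want to see what an r^2 value measures, the sketch below computes the standard r^2 statistic, D^2 / (pA(1-pA) pB(1-pB)) with D = pAB - pA pB, from phased two-SNP haplotypes in Python. It is only an illustration under simplifying assumptions: alleles are coded 0/1, phase is known, and the function name and toy data are hypothetical. Tools such as Haploview and PLINK work from genotype data and use their own estimation procedures.</p>

```python
def ld_r2(haplotypes):
    """Return r^2 between two biallelic SNPs.

    haplotypes: list of (allele_at_snp1, allele_at_snp2) tuples,
    each allele coded 0 or 1, one tuple per phased chromosome.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haplotypes) / n           # frequency of allele 1 at SNP 1
    p_b = sum(b for _, b in haplotypes) / n           # frequency of allele 1 at SNP 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                              # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom


if __name__ == "__main__":
    # Toy data: alleles always travel together, then fully independent.
    print(ld_r2([(0, 0), (0, 0), (1, 1), (1, 1)]))
    print(ld_r2([(0, 0), (0, 1), (1, 0), (1, 1)]))
```

<p>Nothing in this calculation depends on the distance between the two SNPs, or even on whether they share a chromosome.</p>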
<p id="zw-164"><a name="OLE_LINK1"></a><a name="OLE_LINK2"></a><strong><em>What are some specialized queries I can conduct with SNAP?</em></strong></p>
<p id="zw-168">Find annotation for SNPs regardless of LD proxies. SNAP doesn’t have to be used as an LD querying tool. You can simply retrieve information about a list of SNPs. To do so load your SNPids. Under “Search Options” select Distance Limit as 0 instead of the default 500kb. With the default settings SNAP will now only return information for your query SNPs themselves. You can select additional options like GeneCruiser annotations and MAFs. This is an excellent way to rapidly answer questions like: would a SNP(s) be genotyped on my array(s) of interest? Which of my SNPs are nonsynonymous SNPs? What are the genomic coordinates for my list of SNPs in a specific genome build (just select the corresponding HapMap build – Release 21=hg17, Release 22/HapMap 3=hg18)? What are the HapMap MAFs for my list of SNPs? Of course, you can also turn on proxy querying and ask these same questions in relation to both query and/or proxy SNPs. For instance, are any of my significant GWAS SNPs in LD (r^2 &gt; 0.5) with a known nonsynonymous SNP?</p>
<p id="zw-203">Find alias or alternate SNPids for my SNPs of importance. Some people do not realize that SNPs can suffer from a historical aliasing problem just like gene names. If you are using SNPids to query bioinformatics tools or databases, or to conduct cross dataset queries, to be extra cautious you should rely on genome positions and alleles or account for potential alias IDs. SNAP allows querying to return alias SNPids. Click the tab “Map SNP IDs”. You can retrieve IDs for a SNP across all previous dbSNP builds or specify a specific build to target.</p>
<p id="zw-224"><em><strong>What are recently added features and planned future updates to SNAP?</strong></em></p>
<p id="zw-228">SNAP is in version 2.1. Recent updates have included the addition of 1) HapMap 3 release featuring 12 population groupings, 2) SNP information for 7 new commercial SNP arrays (25 arrays now listed), and 3) the ability to include HapMap major and minor alleles, frequencies and observed genotype counts in output. We welcome suggestions for additional features. Most of the added features in the past have come from suggestions by active SNAP users and testers. In the future we plan to include query options based on 1000 Genomes Project-based LD data. We also plan to keep up with additional SNP array releases as they come to our attention.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.openhelix.eu/?feed=rss2&amp;p=4697</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Tip of the Week: Genome Variation Tour II</title>
		<link>http://blog.openhelix.eu/?p=4602</link>
		<comments>http://blog.openhelix.eu/?p=4602#comments</comments>
		<pubDate>Wed, 09 Jun 2010 06:28:36 +0000</pubDate>
		<dc:creator>Trey</dc:creator>
				<category><![CDATA[Tip of the Week]]></category>
		<category><![CDATA[CYP4F2]]></category>
		<category><![CDATA[dbSNP]]></category>
		<category><![CDATA[medical genomics]]></category>
		<category><![CDATA[OMIM]]></category>
		<category><![CDATA[personal genomics]]></category>
		<category><![CDATA[snps]]></category>
		<category><![CDATA[UCSC Genome Browser]]></category>
		<category><![CDATA[variation]]></category>

		<guid isPermaLink="false">http://blog.openhelix.eu/?p=4602</guid>
		<description><![CDATA[The last tip of the week I did was Genome Variation Tour I where we started our journey following one SNP in an individual&#8217;s genome through various databases to see what we can find out about that variation. In that tip we started out by looking at a SNP in the CYP4F2 gene in the [...]]]></description>
			<content:encoded><![CDATA[<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="480" height="400" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowfullscreen" value="true" /><param name="flashvars" value="id=18402&amp;type=3" /><param name="src" value="http://www.scivee.tv/flash/embedCast.swf" /><embed type="application/x-shockwave-flash" width="480" height="400" src="http://www.scivee.tv/flash/embedCast.swf" flashvars="id=18402&amp;type=3" allowfullscreen="true"></embed></object> The <a>last tip of the week I did was Genome Variation Tour I</a> where we started our journey following one SNP in an individual&#8217;s genome through various databases to see what we can find out about that variation. In that tip we started out by looking at a SNP in the CYP4F2 gene in the <a>UCSC Genome Browser</a> and followed it to <a href="http://www.ncbi.nlm.nih.gov/projects/SNP/" target="_blank">dbSNP</a>. Today&#8217;s tip will continue our journey to <a href="http://www.ncbi.nlm.nih.gov/omim">OMIM</a> to see what information we can find there. We&#8217;ll find that this variation is clinically associated with Warfarin dosage effects and, specifically, that this individual&#8217;s C/T heterozygosity indicates an intermediate dosage for effectiveness if indeed he ever needed this drug. In some ways, your guess is as good as mine as to what we will find and what avenues we will be taking in the next few tips I&#8217;ll be doing. I&#8217;m discovering information as I go along too. I can tell you though that the next installment of the genome variation tour will take us to PubMed, perhaps to a few not particularly well-known gems of databases, and probably back to the UCSC Genome Browser to expand our look at the interactions of several variations in this individual&#8217;s genome.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.openhelix.eu/?feed=rss2&amp;p=4602</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Guest Post: WAVe &#8211; Pedro Lopes</title>
		<link>http://blog.openhelix.eu/?p=4464</link>
		<comments>http://blog.openhelix.eu/?p=4464#comments</comments>
		<pubDate>Tue, 25 May 2010 04:02:29 +0000</pubDate>
		<dc:creator>Guest</dc:creator>
				<category><![CDATA[Genomics Resource News]]></category>
		<category><![CDATA[Guest Posts]]></category>
		<category><![CDATA[New Resource]]></category>
		<category><![CDATA[GEN2PHEN]]></category>
		<category><![CDATA[genome variation]]></category>
		<category><![CDATA[guest]]></category>
		<category><![CDATA[snps]]></category>
		<category><![CDATA[WAVe]]></category>

		<guid isPermaLink="false">http://blog.openhelix.eu/?p=4464</guid>
		<description><![CDATA[This next post in our continuing semi-regular Guest Post series is from Pedro Lopes, developer of WAVe at the University of Aveiro Bioinformatic Group in Aveiro, Portugal. If you are a provider of a free, publicly available genomics tool, database or resource and would like to convey something to users on our guest post feature, [...]]]></description>
			<content:encoded><![CDATA[<p><em>This next post in our continuing semi-regular </em><a href="http://blog.openhelix.eu/?cat=7" target="_blank"><em>Guest Post series</em></a><em> is from Pedro Lopes, developer of <a href="http://bioinformatics.ua.pt/WAVe/" target="_blank">WAVe</a> at the <a href="http://bioinformatics.ua.pt/" target="_blank">University of Aveiro Bioinformatic Group</a> in Aveiro, Portugal. If you are a provider of a free, publicly available genomics tool, database or resource and would like to convey something to users on our guest post feature, please feel free to contact us at wlathe AT openhelix DOT com or <a href="http://www.openhelix.com/cgi/contact.cgi" target="_blank">the contact form</a> (write &#8216;guest post&#8217; as subject heading). We welcome introductions to your resource, information on updates, highlights of little known gems or opinion pieces on the state of genomic research and databases.</em></p>
<p>I would like to start by thanking Trey Lathe for the opportunity to promote WAVe in this great blog. After his <a href="http://blog.openhelix.eu/?p=4319" target="_blank">short tip of the week post</a>, I&#8217;ll now try to give a more detailed overview of this new application.</p>
<p><strong>What is WAVe?</strong></p>
<p><a name="_GoBack"></a>WAVe stands for Web Analysis of the Variome and is a simple application focused on centralizing access to distributed and heterogeneous locus-specific databases (LSDBs). LSDBs are an emerging type of bioinformatics application, aiming to provide gene-centric information about discovered genomic variants. In WAVe, we offer access both to LSDBs and to their variants. Moreover, we also provide access to a comprehensive list of carefully selected external resources. With this, users have, in a single application, access to gene and variation information enriched with a multitude of gene-related resources in a lightweight and easy to use web application.</p>
<p><strong>What are WAVe&#8217;s key features?</strong></p>
<p>At this early stage, WAVe&#8217;s publicly available features are related to data access. Users can easily browse through available genes, search for genes, view gene info and access each gene&#8217;s RSS feed. In <a title="WAVe" href="http://bioinformatics.ua.pt/WAVe/" target="_blank">WAVe&#8217;s entry page</a>, users simply need to start typing a gene&#8217;s HGNC-approved symbol and several suggestions will appear: accepting one of them leads directly to the gene view page. Following the<strong> </strong><a title="WAVe | View All" href="http://bioinformatics.ua.pt/WAVe/gene/*" target="_blank"><strong>view all</strong></a><a title="WAVe | View All" href="http://bioinformatics.ua.pt/WAVe/gene/*" target="_blank"><strong> </strong></a>link, users can browse all available genes or check, for each gene, how many LSDBs and variants are available.</p>
<p>To access the application data, users just need to navigate in the gene tree. Each tree node represents a distinct data type and the various leaves provide access to external applications: by clicking a leaf, the destination page is loaded in the main content area. Repeating this process, users can navigate through the dozens of listed links for each gene.</p>
<p>WAVe also offers its core data to other developers. To obtain the gene tree and its links, users just need to add the <strong><a title="WAVe | Feed | BRCA2" href="http://bioinformatics.ua.pt/WAVe/gene/BRCA2/rss" target="_blank">rss</a></strong> tag to the end of the gene address. This will output an RSS 2.0 feed that can be easily parsed by any application or added to a feed reader.</p>
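<p>As a minimal sketch of what parsing such a feed could look like, the Python example below reads the title and link of every item in an RSS 2.0 document using only the standard library. The sample feed is built in memory as a stand-in; its item titles and example.org links are hypothetical and do not reflect what WAVe actually returns. To work with a real gene feed you would download the document from the gene address with the rss tag appended and pass its text to the same function.</p>

```python
import xml.etree.ElementTree as ET


def build_sample_feed():
    """Build a tiny stand-in RSS 2.0 document (hypothetical content)."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "BRCA2"
    for title, link in [
        ("LSDB example", "http://example.org/lsdb/BRCA2"),
        ("Protein example", "http://example.org/protein/BRCA2"),
    ]:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
    return ET.tostring(rss, encoding="unicode")


def extract_links(feed_xml):
    """Return (title, link) pairs for every item in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    pairs = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        pairs.append((title, link))
    return pairs


if __name__ == "__main__":
    for title, link in extract_links(build_sample_feed()):
        print(title, link)
```

<p>Because the feed is plain RSS 2.0, the same few lines would work in any language with an XML parser.</p>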
<p><strong>How was WAVe born?</strong></p>
<p>The <a title="Genotype to Phenotype: A Holistic Solution" href="http://www.gen2phen.org" target="_blank">European GEN2PHEN project</a> is an initiative to link, as deeply as possible, data from genotype features to its phenotype counterparts. The first step consisted of an attempt to improve various genomic variation resource scenarios. This implied normalizing LSDBs (the &#8220;LSDB-in-a-box&#8221; approach, <a title="Leiden Open-source Variation Database" href="http://www.lovd.nl" target="_blank">LOVD</a>) and defining novel data models and formats for data exchanges from and to LSDBs.</p>
<p>In the long term, applying the GEN2PHEN-approved data models will enhance the creation of new services and applications to integrate and interact with the exponentially growing dataset of genomic variation data.</p>
<p>With WAVe we tried a different approach based on three questions: why wait for everyone to adopt these new formats? What will happen to legacy LSDBs that won&#8217;t adopt the new formats? How can we have an immediate solution? We have created a lightweight integration architecture, based on links to applications, and adopted a simple (yet familiar) tree-based navigation interaction to deploy a new application that can be used right now and will easily scale to integrate the foreseen data exchange formats. Technical details aside, based on a manually curated LSDB list, we can connect and integrate any kind of LSDB application, whether it is a modern LOVD application or a simple text-based legacy LSDB.</p>
<p><strong>How is it relevant?</strong></p>
<p>To demonstrate WAVe&#8217;s efficiency, let&#8217;s try to perform a simple search in our lab: Are there any LSDBs for the COL3A1 gene in the human species? And known variants? And what are the associated proteins and pathways?</p>
<p>In a WAVe-free scenario, to find COL3A1 LSDBs (if any), researchers need to google it (the main COL3A1 LSDB does not appear in the first result page) or, if they are used to it, go to the <a href="http://www.hgvs.org/" target="_blank">HGVS</a> site, go to the “<a href="http://www.hgvs.org/dblist/dblist.html" target="_blank">Databases &amp; Tools</a>” section, select “<a href="http://www.hgvs.org/dblist/glsdb.html" target="_blank">Locus-specific Mutation Databases</a>” and then search for the gene in the search box. Now for the variants, researchers just need to browse the last page they’ve just entered. How many clicks (and how much time!) does it take?</p>
<p>For protein information, researchers go to <a title="UniProt" href="http://www.uniprot.org/" target="_blank">UniProt</a> and search for <a href="http://www.uniprot.org/uniprot/?query=COL3A1&amp;sort=score" target="_blank">COL3A1</a>: that gives about 29 results. Add a filter for the human species and there are 5 results. That is good enough to go directly to <a title="UniProt - Collagen alpha-1 (III) chain protein" href="http://www.uniprot.org/uniprot/P02461" target="_blank">P02461</a> (SwissProt reviewed). However, there is now a new window/tab open. Now for pathway information, a <a title="KEGG" href="http://www.genome.jp/kegg/" target="_blank">KEGG</a> quick search for <a title="KEGG DBGET COL3A1" href="http://www.genome.jp/dbget-bin/www_bfind_sub?mode=bfind&amp;max_hit=1000&amp;locale=en&amp;serv=kegg&amp;dbkey=genes&amp;keywords=COL3A1&amp;page=1" target="_blank">COL3A1</a> lists 14 results. In the end, researchers have about 3 windows/tabs open and have made some 20 mouse clicks to obtain the desired information.</p>
<p>Using WAVe, researchers simply need to access WAVe, start typing the gene&#8217;s HGNC symbol, select <strong>COL3A1</strong> from the suggestions and access the <a title="WAVe | View | COL3A1" href="http://bioinformatics.ua.pt/WAVe/gene/COL3A1/view" target="_blank">COL3A1</a> page. Once in the page, it&#8217;s as easy as browsing in the tree&#8230; Variations? Check the variation node, they&#8217;re even grouped according to the change type. UniProt information? Check the protein node where you have direct access to SwissProt, TrEMBL, PDB, Expasy and InterPro. And I guess you get the picture. In the end, one window/tab and about 6 or 7 mouse clicks.</p>
<p><strong>Other UA.PT Bioinformatics tools</strong></p>
<p>At the University of Aveiro&#8217;s Bioinformatics research group we are mainly young and enthusiastic computer science experts, simply trying to make biology easier (at least in terms of computer applications!). Our more relevant web-based tools include <a title="MIND" href="http://bioinformatics.ua.pt/mind/" target="_blank">MIND</a> (a microarray analysis tool), <a title="GeneBrowser2" href="http://bioinformatics.ua.pt/gb2" target="_blank">GeneBrowser</a> (a gene expression tool, useful to process data gathered from systems like MIND) and <a title="QuExT" href="http://bioinformatics.ua.pt/quext/" target="_blank">QuExT</a> (a comprehensive MEDLINE mining application).</p>
<p>-Pedro Lopes</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.openhelix.eu/?feed=rss2&amp;p=4464</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Tip of the Week: WAVe, Web Analysis of the Variome</title>
		<link>http://blog.openhelix.eu/?p=4319</link>
		<comments>http://blog.openhelix.eu/?p=4319#comments</comments>
		<pubDate>Wed, 05 May 2010 04:14:21 +0000</pubDate>
		<dc:creator>Trey</dc:creator>
				<category><![CDATA[Tip of the Week]]></category>
		<category><![CDATA[databases]]></category>
		<category><![CDATA[ensembl]]></category>
		<category><![CDATA[Entrez]]></category>
		<category><![CDATA[KEGG]]></category>
		<category><![CDATA[LOVD]]></category>
		<category><![CDATA[NCBI]]></category>
		<category><![CDATA[PDB]]></category>
		<category><![CDATA[PharmaGKB]]></category>
		<category><![CDATA[Reactome]]></category>
		<category><![CDATA[snps]]></category>
		<category><![CDATA[UniProt]]></category>
		<category><![CDATA[variation]]></category>
		<category><![CDATA[WAVe]]></category>

		<guid isPermaLink="false">http://blog.openhelix.eu/?p=4319</guid>
		<description><![CDATA[Today&#8217;s Tip of the Week is a short introduction to WAVe, or the Web Analysis of the Variome. The tool was recently introduced to us, and I&#8217;ve found it a welcome introduction to the tools available to the researcher to analyze human variation. This is apropos considering the recent paper we&#8217;ve been discussing on the clinical [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.openhelix.com/downloads/jing/wave.mp4"><img class="alignleft size-medium wp-image-4320" title="wave_thumb" src="http://blog.openhelix.eu/wp-content/uploads/2010/05/wave_thumb-300x218.png" alt="" width="300" height="218" /></a>Today&#8217;s Tip of the Week is a short introduction to <a href="http://bioinformatics.ua.pt/WAVe/" target="_blank">WAVe, or the Web Analysis of the Variome</a>. The tool was recently introduced to us, and I&#8217;ve found it a welcome addition to the tools available to the researcher to analyze human variation. This is apropos considering the recent paper we&#8217;ve been discussing on the clinical assessment of a personal genome (<a href="http://blog.openhelix.eu/?p=4250" target="_blank">here</a>, <a href="http://blog.openhelix.eu/?p=4264" target="_self">here</a> and <a href="http://blog.openhelix.eu/?p=4287" target="_blank">here</a>) and that paper&#8217;s implications for personalized medicine and the use of online variation resources. WAVe also has introduced me to some additional tools I&#8217;ve either not been aware of, or haven&#8217;t used, which might be of use such as: <a href="http://www.lovd.nl" target="_blank">LOVD</a> (Leiden Open Variation Database), <a href="http://bioinformatics.ua.pt/quext_dev/" target="_blank">QuExT</a> (Query Expansion Tool, also from the same developers as WAVe), and others. Of course there is also database information pulled in from Ensembl, Reactome, KEGG, InterPro, PDB, UniProt, NCBI and many others. Take some time to check it out.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.openhelix.eu/?feed=rss2&amp;p=4319</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Tip of the Week: HapMap data in Haploview</title>
		<link>http://blog.openhelix.eu/?p=3715</link>
		<comments>http://blog.openhelix.eu/?p=3715#comments</comments>
		<pubDate>Wed, 10 Mar 2010 05:01:17 +0000</pubDate>
		<dc:creator>Trey</dc:creator>
				<category><![CDATA[Tip of the Week]]></category>
		<category><![CDATA[haplotypes]]></category>
		<category><![CDATA[haploview]]></category>
		<category><![CDATA[hapmap]]></category>
		<category><![CDATA[LD]]></category>
		<category><![CDATA[linkage disequilibrium]]></category>
		<category><![CDATA[snps]]></category>

		<guid isPermaLink="false">http://blog.openhelix.eu/?p=3715</guid>
		<description><![CDATA[HapMap has had a few minor updates to their browser, and importantly, new phase 3 data was released early last year (drafts of that data were released in 2008). Haploview, the downloaded software that allows the user to perform in depth LD and haplotype analysis, has been recently updated from version 4.1 to version 4.2. [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.openhelix.com/downloads/jing/haploview.swf"><img class="alignleft size-medium wp-image-3716" title="haploview_thumb" src="http://blog.openhelix.eu/wp-content/uploads/2010/03/haploview_thumb-300x216.png" alt="" width="300" height="216" /></a><a href="http://hapmap.ncbi.nlm.nih.gov/" target="_blank">HapMap</a> has had a few minor updates to their browser, and importantly, new phase 3 data was released early last year (<a href="http://hapmap.ncbi.nlm.nih.gov/old_news.html.en" target="_blank">drafts of that data </a>were released in 2008). <a href="http://www.broadinstitute.org/haploview/haploview-downloads" target="_blank">Haploview</a>, the downloadable software that allows the user to perform in-depth LD and haplotype analysis, has been recently updated from version 4.1 to version 4.2. Haploview can be used with user data or data downloaded from the HapMap project. However, version 4.1 did not work with phase III HapMap project data, so the user had to use phase I and II data if they wanted to use version 4.1. Version 4.2 now allows the user to use HapMap phase III data.</p>
<p>That&#8217;s a lot of versions and phases <img src='http://blog.openhelix.eu/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> . The short of it is, if you use Haploview 4.2, you can view and analyze data from any phase of the HapMap project.</p>
<p>Today&#8217;s tip briefly shows you how to download data from the HapMap project and view it in Haploview.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.openhelix.eu/?feed=rss2&amp;p=3715</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Top SNPs of the year</title>
		<link>http://blog.openhelix.eu/?p=3325</link>
		<comments>http://blog.openhelix.eu/?p=3325#comments</comments>
		<pubDate>Mon, 11 Jan 2010 20:14:05 +0000</pubDate>
		<dc:creator>Trey</dc:creator>
				<category><![CDATA[General Science]]></category>
		<category><![CDATA[hapmap]]></category>
		<category><![CDATA[SNPedia]]></category>
		<category><![CDATA[snps]]></category>

		<guid isPermaLink="false">http://blog.openhelix.eu/?p=3325</guid>
		<description><![CDATA[An interesting post from the SNPedia blog (we mentioned being able to view SNPedia SNPs in HapMap in a post last year) lists the top 10 SNPs of the year. Of course, as they mention, it&#8217;s very subjective. Because they have chosen SNPs with serious health interest, I&#8217;ll semi-frivolously (because hey, no knowledge is necessarily &#8220;frivolous&#8221;) nominate either: [...]]]></description>
			<content:encoded><![CDATA[<p>An interesting post from the SNPedia blog (we mentioned <a href="http://blog.openhelix.eu/?p=82" target="_blank">being able to view SNPedia SNPs in HapMap</a> in a post last year) lists the top 10 SNPs of the year.</p>
<p>Of course, as they mention, it&#8217;s very subjective.</p>
<p>Because they have chosen SNPs with serious health interest, I&#8217;ll semi-frivolously (because hey, no knowledge is necessarily &#8220;frivolous&#8221; <img src='http://blog.openhelix.eu/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />) nominate either:</p>
<p>The <a href="http://www.snpedia.com/index.php/Rs17822931" target="_blank">&#8220;ear wax&#8221; SNP</a>, which determines whether you have &#8216;wet or dry&#8217; earwax, only because (yes, TMI) I have both, one in each ear, so now I&#8217;m curious as to why.</p>
<p>and</p>
<p>The <a href="http://www.snpedia.com/index.php/Rs3057" target="_blank">&#8220;Perfect Musical Pitch&#8221; SNP</a>, only because my daughter and I seem to have that particular variation, and we know a few people who don&#8217;t <img src='http://blog.openhelix.eu/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> .</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.openhelix.eu/?feed=rss2&amp;p=3325</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tip of the Week: GRAIL for prioritizing SNPs</title>
		<link>http://blog.openhelix.eu/?p=3037</link>
		<comments>http://blog.openhelix.eu/?p=3037#comments</comments>
		<pubDate>Wed, 09 Dec 2009 12:45:58 +0000</pubDate>
		<dc:creator>Mary</dc:creator>
				<category><![CDATA[Genomics Research]]></category>
		<category><![CDATA[New Resource]]></category>
		<category><![CDATA[Tip of the Week]]></category>
		<category><![CDATA[entrez gene]]></category>
		<category><![CDATA[GRAIL]]></category>
		<category><![CDATA[GWAS]]></category>
		<category><![CDATA[snps]]></category>
		<category><![CDATA[text mining]]></category>
		<category><![CDATA[UCSC Genome Browser]]></category>

		<guid isPermaLink="false">http://blog.openhelix.eu/?p=3037</guid>
		<description><![CDATA[Perusing my copy of Nature Genetics last week, I was flipping through the pages and noticed an unusual graphic.  I looked at it a little closer and was convinced it was one of the Spirographs that I used to make as a kid.  (Remember those? I always liked that&#8230;.)  I looked a little bit closer [...]]]></description>
			<content:encoded><![CDATA[<div class="sticky_post"><p><a href="http://www.openhelix.com/downloads/jing/grail_snps.swf" target="_blank"><img class="size-full wp-image-3130" title="grail_snps_tip" src="http://blog.openhelix.eu/wp-content/uploads/2009/12/grail_snps_tip.jpg" alt="grail_snps_tip" width="300" align="left" /></a>Perusing my copy of Nature Genetics last week, I was flipping through the pages and noticed an unusual graphic.  I looked at it a little closer and was convinced it was one of the <a href="http://en.wikipedia.org/wiki/Spirograph" target="_blank">Spirographs</a> that I used to make as a kid.  (Remember those? I always liked that&#8230;.)  I looked a little bit closer and realized it was somewhat more informative than the Spirographs I used to draw.  This represented the relationships between genes, based on the literature.  Hmmm&#8230;.how did they do this, exactly?</p>
<p>The paper I was reading was <a href="http://www.nature.com/ng/journal/v41/n12/full/ng.479.html" target="_blank">Genetic variants at CD28, PRDM1 and CD2/CD58 are associated with rheumatoid arthritis risk</a> by Raychaudhuri et al, which was interesting enough.  I like to read the GWAS papers to see what the current techniques and strategies are, not only for the specific genes themselves.  And this paper reported the strategy that they used to prioritize their SNPs, and that they used<strong><span style="color: #ff0000;"> </span></strong><span style="text-decoration: underline;"><a href="http://www.broadinstitute.org/mpg/grail/" target="_blank"><strong><span style="color: #ff0000;">GRAIL</span></strong> </a></span>to generate the data for this graphic of gene relationships.  Check out <a href="http://www.nature.com/ng/journal/v41/n12/fig_tab/ng.479_ft.html" target="_blank">Figure 1</a> for the strategy.</p>
<p>When I saw the name GRAIL I thought&#8211;huh&#8230;.GRAIL is back with a new use?  I thought that was&#8230;ah&#8230;retired&#8230;at this point.  But this isn&#8217;t that GRAIL (<a href="http://compbio.ornl.gov/Grail-1.3/" target="_blank">http://compbio.ornl.gov/Grail-1.3/</a>, Gene Recognition and Assembly Internet Link).  This is a different GRAIL&#8211;the new one is <em><strong>Gene Relationships Among Implicated Loci</strong>. </em>So I had to go and read that paper, which is  <a href="http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1000534" target="_blank">Identifying Relationships among Genomic Disease Regions: Predicting Genes at Pathogenic SNP Associations and Rare Deletions</a> by Raychaudhuri et al.</p>
<p>This new GRAIL is all about text mining.  It is a tool that relies on statistical text mining of the literature for genes in a region and examines the relationships among those genes in the text.  The focus in their case is disease regions, but there&#8217;s no reason that you couldn&#8217;t use it for a variety of other topics.   As the authors state:</p>
<blockquote><p>Given only a collection of disease regions, GRAIL uses our text-based definition of relatedness (or alternative metrics of relatedness) to identify a subset of genes, more highly related than by chance; it also assigns a select set of keywords that suggest putative biological pathways.</p></blockquote>
<p>So you pull a set of genes out of the literature based on SNPs or locations of interest, and you can begin to assess what&#8217;s interesting in the set.   Now, the tool makes a lot of assumptions that you should be aware of if you are going to use it.  It assumes each region contains a single pathogenic gene.  I&#8217;m not sure that&#8217;s always going to be the case, but as long as you know that going in, it&#8217;s a fair assumption for this tool.  They suggest this helps to keep multigenic regions from dominating the analysis.  Fair enough, but&#8230;what if that is the interesting aspect?  Still&#8211;that&#8217;s ok as long as you know.</p>
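<p>To make the idea concrete, here is a toy sketch of text-based relatedness under the one-gene-per-region assumption. It is not GRAIL's actual statistical method: the gene names, the abstract snippets, and the use of plain word counts with cosine similarity are all assumptions made for illustration only.</p>

```python
import math
from collections import Counter


def word_vector(text):
    """Bag-of-words counts for one gene's pooled abstract text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_gene_per_region(regions, texts):
    """For each region, pick the one gene most related to genes in other regions."""
    vectors = {gene: word_vector(text) for gene, text in texts.items()}
    picks = {}
    for region, genes in regions.items():
        # Candidate partners are genes from every *other* region.
        others = [g for r, gs in regions.items() if r != region for g in gs]
        scores = {
            g: max((cosine(vectors[g], vectors[o]) for o in others), default=0.0)
            for g in genes
        }
        picks[region] = max(scores, key=scores.get)
    return picks


# Hypothetical genes and made-up "abstract" text, for illustration only.
texts = {
    "GENE_A": "cholesterol synthesis enzyme liver",
    "GENE_B": "olfactory receptor smell",
    "GENE_C": "cholesterol transport liver lipid",
    "GENE_D": "hair keratin filament",
}
regions = {"region1": ["GENE_A", "GENE_B"], "region2": ["GENE_C", "GENE_D"]}

print(best_gene_per_region(regions, texts))
```

<p>In this toy set the two cholesterol-related genes are chosen, one from each region, because their text overlaps. Forcing a single pick per region is what keeps a gene-dense region from contributing many hits, which is the trade-off discussed above.</p>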
<p>In the paper they use validated SNPs from 4 different research areas:</p>
<ul>
<li>SNPs associated with serum lipid levels: GRAIL finds genes in the cholesterol biosynthesis pathway.</li>
<li>SNPs associated with height: they identify pathways they consider plausible.</li>
<li>Crohn&#8217;s disease: they confirm associations that have been seen.</li>
<li>Schizophrenia, where they used rare deletions as the items of interest: they find related genes, many highly enriched in the CNS. This suggests the strategy may be useful not only for SNPs but for CNVs as well.</li>
</ul>
<p>Their Figure 1 nicely summarizes the strategy:</p>
<p><img class="size-full wp-image-3095" title="grail_fig1" src="http://blog.openhelix.eu/wp-content/uploads/2009/12/grail_fig1.jpg" alt="grail_fig1" width="605" height="553" /></p>
<p>One curious tweak of the data analysis was that they used the literature prior to December 2006, because right after that there was an onslaught of GWAS papers that would list a whole bunch of genes associated with regions that might be more tenuous still.  I understand this in theory, but I imagine it also eliminates more current research on genes of interest from other methods too.  I saw in the tool you could choose either pre-Dec 06 or a more up-to-date literature set.  It would be useful to try both if you use GRAIL and keep that in mind.</p>
<p>Another point to keep in mind: some genes are just not found in the abstracts, and they mention that is an issue.   So the genes you can examine are those that appeared in the abstracts and were identified properly with nomenclature, spelling, etc.  Text mining is cool, but it has a lot of limitations around those aspects, and around the use of synonyms in general. It&#8217;s not just an issue for GRAIL, but for all text mining tools at this point.</p>
<p>They also devise a way to use Gene Ontology (GO) and some expression data in GRAIL as other &#8220;relatedness&#8221; metrics.  You&#8217;ll find those available from the GRAIL tool as well. <a href="http://www.nature.com/ng/journal/v41/n12/fig_tab/ng.479_ft.html" target="_blank"><img class="size-full wp-image-3107" title="spirograph" src="http://blog.openhelix.eu/wp-content/uploads/2009/12/spirograph.jpg" alt="spirograph" width="167" height="161" align="right" /></a></p>
<p>They don&#8217;t show any spirographs in their figures in this first GRAIL paper.  The one that drew me in was <a href="http://www.nature.com/ng/journal/v41/n12/fig_tab/ng.479_ft.html" target="_blank">Figure 2 in the arthritis paper</a>.  So I went over to the software to try to generate these myself.  The outcome at this point is a web page with text and links to UCSC Genome Browser, and Entrez Gene (from the individual genes and from the keyword list&#8211;keywords collect multiple Entrez Genes).  I was a little surprised that the keyword link wasn&#8217;t to PubMed as well.  Currently it doesn&#8217;t provide the graphic, but maybe that will come along over time.  If it does I&#8217;ll be sure to mention it on the blog.</p>
<p>One final note on the paper: in the supplemental section they compare GRAIL to other tools in this arena.  If you are interested in tools, as we are here, you may find some of them interesting as well.   The tools are listed with URLs in Table S5, and the comparison outcome is in Text S1:</p>
<blockquote><p><a href="http://pcdoeglas.med.rug.nl/prioritizer/" target="_blank"><strong><em>Prioritizer </em></strong></a>[2], <a href="http://www.ogic.ca/projects/g2d_2/" target="_blank"><strong><em>Gene2Disease</em></strong></a> (<strong><em>G2D</em></strong>) [3,4,5], <a href="http://www.transvar.org/results/candi_gene/" target="_blank"><strong><em>Commonality of Functional Annotation</em></strong></a> (<strong><em>CFA</em></strong>) [6], and <a href="http://www.genetics.med.ed.ac.uk/prospectr/" target="_blank"><strong><em>Prospectr </em></strong></a>[7]. There were five supervised tools: <a href="http://homes.esat.kuleuven.be/~bioiuser/endeavour/index.php" target="_blank"><strong><em>Endeavour </em></strong></a>[8], <a href="http://www.cmbi.ru.nl/GeneSeeker/basic_form.html" target="_blank"><strong><em>GeneSeeker </em></strong></a>[9], <a href="http://www.genetics.med.ed.ac.uk/suspects/" target="_blank"><strong><em>SUSPECTS </em></strong></a>[10], <a href="http://www-micrel.deis.unibo.it/~tom/" target="_blank"><strong><em>TOM </em></strong></a>[11], and <a href="https://dsgweb.wustl.edu/hutz/candid.html" target="_blank"><strong><em>CANDID </em></strong></a>[12]</p></blockquote>
<p>So check out GRAIL and see if you find gene relationships.  But don&#8217;t forget those caveats about the genes not listed in the abstracts, or the literature coverage dates.  The software can be found here:  <a href="http://www.broad.mit.edu/mpg/grail/" target="_blank">http://www.broad.mit.edu/mpg/grail/ </a></p>
<p>I know it&#8217;s a beta.  But I think it has a lot of potential to help people sift through the results they are getting from a variety of techniques.  Check it out.</p>
<p><em>NOTE: you may find periods when you can&#8217;t run GRAIL because it puts a burden on the servers.  You should try again during off hours if you are seeing problems with getting it to run.</em> <em>This happened to me during my testing of it last week.</em></p>
<p>The list of GWAS data I used to test GRAIL came from the NHGRI catalog, which we discussed here:  <a href="http://blog.openhelix.eu/?p=670" target="_self">List of GWAS studies</a>.  I tried the straight hair SNP list, and got a <a href="http://www.broadinstitute.org/mpg/grail/results/1259952407_out.html" target="_blank">pretty interesting set of results</a> that certainly included &#8220;epidermis&#8221; and &#8220;skin&#8221; as keywords, among other things.</p>
<p>++++++++++++ Citations ++++++++++++<br />
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.jtitle=PLoS+Genetics&amp;rft_id=info%3Adoi%2F10.1371%2Fjournal.pgen.1000534&amp;rfr_id=info%3Asid%2Fresearchblogging.org&amp;rft.atitle=Identifying+Relationships+among+Genomic+Disease+Regions%3A+Predicting+Genes+at+Pathogenic+SNP+Associations+and+Rare+Deletions&amp;rft.issn=1553-7404&amp;rft.date=2009&amp;rft.volume=5&amp;rft.issue=6&amp;rft.spage=0&amp;rft.epage=&amp;rft.artnum=http%3A%2F%2Fdx.plos.org%2F10.1371%2Fjournal.pgen.1000534&amp;rft.au=Raychaudhuri%2C+S.&amp;rft.au=Plenge%2C+R.&amp;rft.au=Rossin%2C+E.&amp;rft.au=Ng%2C+A.&amp;rft.au=%2C+.&amp;rft.au=Purcell%2C+S.&amp;rft.au=Sklar%2C+P.&amp;rft.au=Scolnick%2C+E.&amp;rft.au=Xavier%2C+R.&amp;rft.au=Altshuler%2C+D.&amp;rft.au=Daly%2C+M.&amp;rfe_dat=bpr3.included=1;bpr3.tags=Biology%2CBioinformatics">Raychaudhuri, S., Plenge, R., Rossin, E., Ng, A., International Schizophrenia Consortium, Purcell, S., Sklar, P., Scolnick, E., Xavier, R., Altshuler, D., &amp; Daly, M. (2009). Identifying Relationships among Genomic Disease Regions: Predicting Genes at Pathogenic SNP Associations and Rare Deletions <span style="font-style: italic;">PLoS Genetics, 5</span> (6) DOI: <a rev="review" href="http://dx.doi.org/10.1371/journal.pgen.1000534">10.1371/journal.pgen.1000534</a></span></p>
<p><span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.jtitle=Nature+Genetics&amp;rft_id=info%3Adoi%2F10.1038%2Fng.479&amp;rfr_id=info%3Asid%2Fresearchblogging.org&amp;rft.atitle=Genetic+variants+at+CD28%2C+PRDM1+and+CD2%2FCD58+are+associated+with+rheumatoid+arthritis+risk&amp;rft.issn=1061-4036&amp;rft.date=2009&amp;rft.volume=41&amp;rft.issue=12&amp;rft.spage=1313&amp;rft.epage=1318&amp;rft.artnum=http%3A%2F%2Fwww.nature.com%2Fdoifinder%2F10.1038%2Fng.479&amp;rft.au=Raychaudhuri%2C+S.&amp;rft.au=Thomson%2C+B.&amp;rft.au=Remmers%2C+E.&amp;rft.au=Eyre%2C+S.&amp;rft.au=Hinks%2C+A.&amp;rft.au=Guiducci%2C+C.&amp;rft.au=Catanese%2C+J.&amp;rft.au=Xie%2C+G.&amp;rft.au=Stahl%2C+E.&amp;rft.au=Chen%2C+R.&amp;rft.au=Alfredsson%2C+L.&amp;rft.au=Amos%2C+C.&amp;rft.au=Ardlie%2C+K.&amp;rft.au=Barton%2C+A.&amp;rft.au=Bowes%2C+J.&amp;rft.au=Burtt%2C+N.&amp;rft.au=Chang%2C+M.&amp;rft.au=Coblyn%2C+J.&amp;rft.au=Costenbader%2C+K.&amp;rft.au=Criswell%2C+L.&amp;rft.au=Crusius%2C+J.&amp;rft.au=Cui%2C+J.&amp;rft.au=De+Jager%2C+P.&amp;rft.au=Ding%2C+B.&amp;rft.au=Emery%2C+P.&amp;rft.au=Flynn%2C+E.&amp;rft.au=Harrison%2C+P.&amp;rft.au=Hocking%2C+L.&amp;rft.au=Huizinga%2C+T.&amp;rft.au=Kastner%2C+D.&amp;rft.au=Ke%2C+X.&amp;rft.au=Kurreeman%2C+F.&amp;rft.au=Lee%2C+A.&amp;rft.au=Liu%2C+X.&amp;rft.au=Li%2C+Y.&amp;rft.au=Martin%2C+P.&amp;rft.au=Morgan%2C+A.&amp;rft.au=Padyukov%2C+L.&amp;rft.au=Reid%2C+D.&amp;rft.au=Seielstad%2C+M.&amp;rft.au=Seldin%2C+M.&amp;rft.au=Shadick%2C+N.&amp;rft.au=Steer%2C+S.&amp;rft.au=Tak%2C+P.&amp;rft.au=Thomson%2C+W.&amp;rft.au=van+der+Helm-van+Mil%2C+A.&amp;rft.au=van+der+Horst-Bruinsma%2C+I.&amp;rft.au=Weinblatt%2C+M.&amp;rft.au=Wilson%2C+A.&amp;rft.au=Wolbink%2C+G.&amp;rft.au=Wordsworth%2C+P.&amp;rft.au=Altshuler%2C+D.&amp;rft.au=Karlson%2C+E.&amp;rft.au=Toes%2C+R.&amp;rft.au=de+Vries%2C+N.&amp;rft.au=Begovich%2C+A.&amp;rft.au=Siminovitch%2C+K.&amp;rft.au=Worthington%2C+J.&amp;rft.au=Klareskog%2C+L.&amp;rft.au=Gregersen%2C+P.&amp;rft.au=Daly%2C+M.&amp;rft.au=Plenge%2C+R.&amp;rfe_dat=bpr3.included=1;bpr3.tags=Biology%2CBioinformatics%2C+Genetics">Raychaudhuri, S., Thomson, B., Remmers, E., Eyre, S., Hinks, A., Guiducci, C., Catanese, J., Xie, G., Stahl, E., Chen, R., Alfredsson, L., Amos, C., Ardlie, K., Barton, A., Bowes, J., Burtt, N., Chang, M., Coblyn, J., Costenbader, K., Criswell, L., Crusius, J., Cui, J., De Jager, P., Ding, B., Emery, P., Flynn, E., Harrison, P., Hocking, L., Huizinga, T., Kastner, D., Ke, X., Kurreeman, F., Lee, A., Liu, X., Li, Y., Martin, P., Morgan, A., Padyukov, L., Reid, D., Seielstad, M., Seldin, M., Shadick, N., Steer, S., Tak, P., Thomson, W., van der Helm-van Mil, A., van der Horst-Bruinsma, I., Weinblatt, M., Wilson, A., Wolbink, G., Wordsworth, P., Altshuler, D., Karlson, E., Toes, R., de Vries, N., Begovich, A., Siminovitch, K., Worthington, J., Klareskog, L., Gregersen, P., Daly, M., &amp; Plenge, R. (2009). Genetic variants at CD28, PRDM1 and CD2/CD58 are associated with rheumatoid arthritis risk <span style="font-style: italic;">Nature Genetics, 41</span> (12), 1313-1318 DOI: <a rev="review" href="http://dx.doi.org/10.1038/ng.479">10.1038/ng.479</a></span></p>
<p><span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.jtitle=The+American+Journal+of+Human+Genetics&amp;rft_id=info%3Adoi%2F10.1016%2Fj.ajhg.2009.10.009&amp;rfr_id=info%3Asid%2Fresearchblogging.org&amp;rft.atitle=Common+Variants+in+the+Trichohyalin+Gene+Are+Associated+with+Straight+Hair+in+Europeans&amp;rft.issn=00029297&amp;rft.date=2009&amp;rft.volume=85&amp;rft.issue=5&amp;rft.spage=750&amp;rft.epage=755&amp;rft.artnum=http%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS0002929709004649&amp;rft.au=Medland%2C+S.&amp;rft.au=Nyholt%2C+D.&amp;rft.au=Painter%2C+J.&amp;rft.au=McEvoy%2C+B.&amp;rft.au=McRae%2C+A.&amp;rft.au=Zhu%2C+G.&amp;rft.au=Gordon%2C+S.&amp;rft.au=Ferreira%2C+M.&amp;rft.au=Wright%2C+M.&amp;rft.au=Henders%2C+A.&amp;rfe_dat=bpr3.included=1;bpr3.tags=Biology%2CGenetics">Medland, S., Nyholt, D., Painter, J., McEvoy, B., McRae, A., Zhu, G., Gordon, S., Ferreira, M., Wright, M., &amp; Henders, A. (2009). Common Variants in the Trichohyalin Gene Are Associated with Straight Hair in Europeans <span style="font-style: italic;">The American Journal of Human Genetics, 85</span> (5), 750-755 DOI: <a rev="review" href="http://dx.doi.org/10.1016/j.ajhg.2009.10.009">10.1016/j.ajhg.2009.10.009</a></span></p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://blog.openhelix.eu/?feed=rss2&amp;p=3037</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Tip of the Week: Getting flanking sequence</title>
		<link>http://blog.openhelix.eu/?p=2629</link>
		<comments>http://blog.openhelix.eu/?p=2629#comments</comments>
		<pubDate>Wed, 14 Oct 2009 04:03:44 +0000</pubDate>
		<dc:creator>Trey</dc:creator>
				<category><![CDATA[Genomics Research]]></category>
		<category><![CDATA[Tip of the Week]]></category>
		<category><![CDATA[galaxy]]></category>
		<category><![CDATA[sequence]]></category>
		<category><![CDATA[snps]]></category>
		<category><![CDATA[UCSC Genome Browser]]></category>

		<guid isPermaLink="false">http://blog.openhelix.eu/?p=2629</guid>
		<description><![CDATA[In an earlier What&#8217;s Your Problem thread, a researcher had hundreds of SNP locations where they were trying to easily obtain the flanking sequence of those hundreds of SNPs without having to go to each location in the UCSC Genome Browser and eyeballing. There are probably a few ways to do this, but I found [...]]]></description>
			<content:encoded><![CDATA[<div class="sticky_post"><p><a href="http://www.openhelix.com/downloads/jing/getflank.swf"><img class="alignleft size-medium wp-image-2630" title="getflank_thumb" src="http://blog.openhelix.eu/wp-content/uploads/2009/10/getflank_thumb-300x207.png" alt="getflank_thumb" width="300" height="207" /></a>In an earlier <a href="http://blog.openhelix.eu/?p=2561" target="_self">What&#8217;s Your Problem thread</a>, a researcher had hundreds of SNP locations and wanted an easy way to obtain the flanking sequence for each one without having to visit every location in the <a href="http://genome.ucsc.edu" target="_blank">UCSC Genome Browser</a> and eyeball it. There are probably a few ways to do this, but I found that <a href="http://www.usegalaxy.org" target="_blank">Galaxy</a> was a good place to start. So this week&#8217;s tip takes two SNP locations on the human genome, obtains the flanking sequence for those locations, and returns a file that can be saved as a spreadsheet or text, or even turned back into a UCSC Genome Browser custom track that can then be uploaded, viewed and searched at UCSC. The process for individual researchers will be a bit different depending on the data and how the Excel worksheet or file is configured, but hopefully you&#8217;ll get the idea. The steps are:<br />
1. Upload your file (tab-delimited text)<br />
2. Convert file to the &#8216;interval&#8217; format<br />
3. Cut out any columns of data from the original file to save for later use.<br />
4. Get flanking chromosomal locations (then merge upstream and downstream records into one record)<br />
5. Get flanking sequence<br />
6. Paste data columns from step 3 to the data columns (chromosomal location and sequence) from step 5.</p>
<p>Voila, now you have a tab-delimited text file that can be opened in Excel, made into a custom track (in Galaxy), etc.</p>
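<p>For readers who prefer a script, the coordinate arithmetic behind step 4 can be sketched in a few lines of Python. This is only an illustration, not what Galaxy runs internally: the function names, the 50-base flank size, the SNP names, and the choice of BED-style coordinates (0-based start, end-exclusive) are all assumptions. Fetching the actual sequence (step 5) still requires a genome source and is not shown.</p>

```python
def flanking_intervals(chrom, pos, flank=50, chrom_len=None):
    """Return (upstream, downstream, merged) intervals around a 1-based SNP position.

    Intervals are BED-style tuples: (chrom, 0-based start, exclusive end).
    """
    snp_start = pos - 1              # 0-based start of the SNP base
    snp_end = pos                    # exclusive end of the SNP base
    up_start = max(0, snp_start - flank)   # do not run off the chromosome start
    down_end = snp_end + flank
    if chrom_len is not None:
        down_end = min(down_end, chrom_len)  # clip at the chromosome end if known
    upstream = (chrom, up_start, snp_start)
    downstream = (chrom, snp_end, down_end)
    merged = (chrom, up_start, down_end)     # one record spanning both flanks and the SNP
    return upstream, downstream, merged


def to_tab_delimited(rows):
    """Join rows of values into tab-delimited text, one record per line."""
    return "\n".join("\t".join(str(v) for v in row) for row in rows)


# Hypothetical SNP list: (chromosome, 1-based position, name).
snps = [("chr1", 1000, "snpA"), ("chr2", 30, "snpB")]

rows = []
for chrom, pos, name in snps:
    _, _, merged = flanking_intervals(chrom, pos, flank=50)
    rows.append(merged + (name,))   # carry the saved column along, as in steps 3 and 6

print(to_tab_delimited(rows))
```

<p>The resulting lines have chromosome, start, end and name columns, so they can be saved as a tab-delimited file and uploaded as intervals for the sequence-fetching step.</p>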
<p>Any suggestions on other methods for doing this?</p>
<p>(OpenHelix does <a href="http://blog.openhelix.eu/?page_id=57" target="_blank">training on Galaxy and UCSC Genome Browser</a>).</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://blog.openhelix.eu/?feed=rss2&amp;p=2629</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Tip of the Week: F-SNP</title>
		<link>http://blog.openhelix.eu/?p=1894</link>
		<comments>http://blog.openhelix.eu/?p=1894#comments</comments>
		<pubDate>Wed, 17 Jun 2009 05:01:30 +0000</pubDate>
		<dc:creator>Trey</dc:creator>
				<category><![CDATA[Tip of the Week]]></category>
		<category><![CDATA[dbSNP]]></category>
		<category><![CDATA[ensembl]]></category>
		<category><![CDATA[GeneSNPs]]></category>
		<category><![CDATA[gvs]]></category>
		<category><![CDATA[hapmap]]></category>
		<category><![CDATA[polymorphisms]]></category>
		<category><![CDATA[SeattleSNPs]]></category>
		<category><![CDATA[snps]]></category>
		<category><![CDATA[UCSC Genome Browser]]></category>

		<guid isPermaLink="false">http://www.openhelix.com/blog/?p=1894</guid>
		<description><![CDATA[There are a lot of databases to search for SNP data: HapMap, dbSNP, SeattleSNPs, Genome Variation Server and many more. I&#8217;m going to add one more to your data mining arsenal, F-SNP. F-SNP (described more fully here in the 2008 NAR Database issue), provides integrated information about the functional effects of SNPs obtained from [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.openhelix.com/downloads/jing/fsnp.swf" target="_blank"><img class="alignleft size-medium wp-image-1898" title="fsnp_thumb" src="http://www.openhelix.com/blog/wp-content/uploads/2009/06/fsnp_thumb-300x193.png" alt="fsnp_thumb" width="239" height="153" align="left" /></a>There are a lot of databases to search for SNP data: <a href="http://www.hapmap.org" target="_blank">HapMap</a>, <a href="http://ncbi.nlm.nih.gov/SNP" target="_blank">dbSNP</a>, SeattleSNPs, Genome Variation Server and many more. I&#8217;m going to add one more to your data mining arsenal, <a href="http://compbio.cs.queensu.ca/F-SNP/" target="_blank">F-SNP</a>. F-SNP (described more fully <a href="http://nar.oxfordjournals.org/cgi/content/abstract/gkm904v1" target="_blank">here in the 2008 NAR Database issue</a>),</p>
<blockquote><p>provides integrated information about the functional effects of SNPs obtained from <a href="http://compbio.cs.queensu.ca/F-SNP/main_files/resources.html#tools">16 bioinformatics tools and databases</a>. The functional effects are predicted and indicated at the splicing, transcriptional, translational, and post-translational level. As such, the F-SNP database helps identify and focus on SNPs with potential pathological effect to human health.</p></blockquote>
<p>&#8230;as they say in the introduction. It looks to be a good first stop to find SNPs of functional relevance. The databases they pull from to get their information include several I&#8217;ve mentioned above and also the <a href="http://genome.ucsc.edu" target="_blank">UCSC Genome database</a>, <a href="http://www.ensembl.org" target="_blank">Ensembl</a>, <a href="http://sift.jcvi.org/" target="_blank">SIFT</a> and <a href="http://genetics.bwh.harvard.edu/pph/data/index.html" target="_blank">PolyPhen</a> predictions and more. I&#8217;ve given a quick intro in the tip this week on how to get functional SNP information from F-SNP.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.openhelix.eu/?feed=rss2&amp;p=1894</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
