In this week’s tip I want to be sure you are aware of using My NCBI database filters to divide your Entrez searches into useful subsets of records. This feature has been available from My NCBI for as long as I have been using it, but last week I was reminded again that not everyone is familiar with this handy utility. My NCBI filters are available for each Entrez database and are very useful for subdividing your search results into more manageable bits, or even as a quick means of finding records that are contained in two different NCBI databases – that’s what ended up working for one of our “What’s Your Problem” commenters, as you can see in this post. Enjoy the tip, and then let us know what creative ways you put this utility to use!
That was the name of a talk that Lincoln Stein gave at the O’Reilly Bioinformatics Conference in 2003. I remember this conference really well for several reasons, including that talk….We were just starting our company (firmly in bioinformatics) and this was not exactly what I wanted to hear. It was also the first time I remember meeting Jim Kent–and it led to our interest in the UCSC Genome Browser which has since been really important for us.
It’s funny–I had recently been re-reading something else about that conference because of a friend on a non-science blog….John Sundman was telling me about this as we were discussing his book Acts of the Apostles, and we were discussing this article he also wrote in Salon about this conference.
In a strange collision of all these threads in my life, I was checking the My NCBI list I get each week and found that Lincoln had updated his thoughts on this topic in the new Genome Biology.
ABSTRACT: Bioinformatics has become too central to biology to be left to specialist bioinformaticians. Biologists are all bioinformaticians now.
Well, this was a delightful revision of the earlier prediction I must say. Although his emphasis is on folks who come with a stronger computational side and are now hybridizing that with benchwork biology, I think there are also more and more biologists–without formal computational training–who are ratcheting up their skills and becoming stronger with the tools as end users. I think there’s a really nice place in the middle for everyone. And we think it is a nice place to be. There are a lot of people in the bio pipeline who didn’t have access to the courses and programs Lincoln references in his article. We know, we see them in the UCSC trainings we give all over the place. And we are teaching them how to do custom tracks to display their own data. And they are eager to do it.
A recent sweep of the literature (courtesy of my standing My NCBI search) led me to another mitochondrial resource that I thought I would mention. MITOMASTER is the resource, and you can find details about it in this paper:
So of course I went to explore–I love a new database! Wandering around a bit, I find that they are the Molecular and Mitochondrial Medicine and Genetics Center (MAMMAG). I find this incredibly difficult to say–which is rather odd, since both my first and last name start with M you would think I would be better at this…
In any case, they are the home of a bunch of Mito features, it turns out. MitoMed, MitoMap, MitoWiki and MitoMaster. So I will now always think of it as the Mito4 place.
The page had a number of links to other mito resources that were new to me–including the FBI Forensic mtDNA database. I have to admit, I went to peek at it to see what was in there. But it requires a download. I’m just not downloading stuff from the FBI….
You have to register to access part of the site, so I did. Then I tried out Mitomaster with a sequence I pulled out of GenBank. Quite a straightforward interface. It seemed to run fast and deliver interesting information about the variants. There was a button for each variant that quickly offered a more detailed look at any changes in a given gene. Seemed like a pretty useful option. I’ll try it out more soon–but I wanted to mention it while the paper was fresh in my mind.
We have a slide we like to present at some trainings showing the rise in the amount of raw sequence data and number of complete genomes over the last 18 years. There is another slide we show that indicates the rise of the number of databases and analysis tools over the years as listed in the annual database issue of NAR. The number has been doubling every 4 years.
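To put "doubling every 4 years" into concrete numbers, here is a quick back-of-the-envelope sketch. The function name and the year spans are my own choices for illustration, not figures from the NAR issue; the only input taken from the slide is the 4-year doubling time.

```python
def growth_factor(years, doubling_time=4.0):
    """How many times larger a count becomes after `years`,
    assuming it doubles once every `doubling_time` years."""
    return 2 ** (years / doubling_time)

# Illustrative spans only (not data from the slide).
for years in (4, 8, 12, 16, 20):
    print(f"after {years:2d} years: about {growth_factor(years):.0f}x as many")
```

So if the trend simply held steady, two decades would mean roughly 32 times as many databases and tools to keep track of.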
Well, there is another slide we can show too, and this shows the growth of the number of abstract entries in PubMed over the last 20 years (from Hunter and Cohen, 2006). Like data and databases, the number of research articles published and indexed just keeps getting larger. This increase in number is both a bane and a boon to researchers. And of course it is not only the number of indexed papers that is growing: the amount of available text is growing too (open access, etc.) and is about to grow even more with the signing of the new open access act. Searching, mining and making sense of all this literature is going to be a challenge; it is a challenge now.
I don’t know how you start your Tuesdays, but I start mine with literature. And coffee. This morning, though, I wasn’t sure I had enough coffee.
Every Monday evening NCBI runs some searches for me and sends me the results. I have several saved searches set up in the My NCBI system (a lot of alliteration…). I have a search that sends me papers that match keywords such as “bioinformatics” and “distance learning” among others. It is one of the ways I keep up with new resources. I find new genomics resources, and I find papers where people have used resources to analyze the data. Both are important to me.
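If you would rather script something similar than wait for the email, a rough equivalent can be sketched with NCBI's E-utilities. To be clear, this is not how My NCBI itself works; it is a minimal sketch that only builds an ESearch URL for PubMed records added in the last week. The function name `weekly_search_url`, the choice to join the keywords with OR, and the result limit are all my assumptions.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Public E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def weekly_search_url(keywords, days=7, max_results=20):
    """Build an ESearch URL for PubMed records from the last `days` days.

    Hypothetical helper: joining keywords with OR is an assumption,
    not a description of how a My NCBI saved search is defined.
    """
    # Quote each keyword so multi-word phrases stay together.
    term = " OR ".join(f'"{k}"' for k in keywords)
    params = {
        "db": "pubmed",
        "term": term,
        "reldate": days,       # restrict to the last N days
        "datetype": "edat",    # measured by the date the record entered PubMed
        "retmax": max_results,
        "retmode": "json",
    }
    return ESEARCH_URL + "?" + urlencode(params)

url = weekly_search_url(["bioinformatics", "distance learning"])
print(url)
```

Pasting the printed URL into a browser, or fetching it from a scheduled script, should return the matching PubMed IDs, which is about as close as a few lines get to a do-it-yourself Monday evening search.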
But today in my results I found a species that I hadn’t heard of before, Pristionchus pacificus. And the title told me that I would learn about the genetics and genomics of this species, so I went to look at the abstract–it is linked right from my email, so I can hop over to NCBI.