We’ve been fans of InterMine for a long time. We did a tip-of-the-week a while ago that highlighted ways that this software can be used to mine data from big data projects of many types. The generic framework of InterMine can be customized for use at different projects–today I’ll include videos from the FlyMine installation and the YeastMine flavor–but you may find versions of this handy tool in many other places as well.
The first video is a broader overview of different types of things you can do–and although this is FlyMine, you’ll find similar behavior at the other Mines too.
This next video is more specific about a task that people need to accomplish–working with a list of genes. This example was recently produced by the YeastMine folks, but again this should work in a similar way across other Mines. You should also read the SGD blog post on it–Create, Analyze, Save: the Power of Gene Lists in YeastMine.
The other thing that I noticed about this framework is the effort of several of these model organism Mines to coordinate into this InterMOD structure. Although I am often wary of “one search to rule them all” sorts of efforts, there can be value in this as a central organizing principle as we keep adding more species genomes that may not have as well-developed communities and infrastructure to support them.
I certainly use a lot of query tools that are similar to these–like the UCSC Table Browser and BioMart. UniProt offers ways to build queries that are different but conceptually similar. Using these interfaces you can construct some clever and complex queries to extract information out of data repositories.
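To give a rough feel for what one of these constructed queries looks like under the hood, here is a minimal sketch that assembles an InterMine-style query as XML using only the Python standard library. The helper name `build_path_query`, the `genomic` model name, and the specific Gene fields are my own assumptions for illustration; check the query XML exported by the Mine you are using for the exact paths and attributes it expects. Nothing is sent to a server here.

```python
import xml.etree.ElementTree as ET


def build_path_query(root_class, views, constraints, model="genomic"):
    """Assemble an InterMine-style query XML string.

    Hypothetical helper for illustration only: field names and the model
    name are assumptions, not taken from any particular Mine.
    """
    query = ET.Element("query", {
        "model": model,
        # Output columns, each written as a path from the root class.
        "view": " ".join(f"{root_class}.{v}" for v in views),
    })
    for path, op, value in constraints:
        # Each constraint filters on one path, e.g. Gene.organism.name.
        ET.SubElement(query, "constraint", {
            "path": f"{root_class}.{path}",
            "op": op,
            "value": value,
        })
    return ET.tostring(query, encoding="unicode")


# Example: ask for identifiers and symbols of genes from one organism.
query_xml = build_path_query(
    "Gene",
    views=["primaryIdentifier", "symbol", "organism.name"],
    constraints=[("organism.name", "=", "Drosophila melanogaster")],
)
print(query_xml)
```

The point is only that a query boils down to a starting class, a set of output columns, and some constraints, which is the same mental model you use when clicking through the web query builders.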
Smith R.N., Aleksic J., Butano D., Carr A., Contrino S., Hu F., Lyne M., Lyne R., Kalderimis A., Rutherford K. et al. (2012). InterMine: a flexible data warehouse system for the integration and analysis of heterogeneous biological data. Bioinformatics (Oxford, England), DOI: 10.1093/bioinformatics/bts577
Lyne R., Smith R., Rutherford K., Wakeling M., Varley A., Guillier F., Janssens H., Ji W., Mclaren P., North P. et al. (2007). FlyMine: an integrated database for Drosophila and Anopheles genomics. Genome biology, PMID: 17615057
Balakrishnan R., Park J., Karra K., Hitz B.C., Binkley G., Hong E.L., Sullivan J., Micklem G. & Cherry J.M. (2012). YeastMine–an integrated data warehouse for Saccharomyces cerevisiae data as a multipurpose tool-kit. Database : the journal of biological databases and curation, PMID: 22434830
Sullivan J., Karra K., Moxon S.A.T., Vallejos A., Motenko H., Wong J.D., Aleksic J., Balakrishnan R., Binkley G., Harris T. et al. (2013). InterMOD: integrated data and tools for the unification of model organism research. Scientific reports, 3 (1802) PMID: 23652793