BioMart is widely used open-source data management software, with an interface that lets end users generate complex, customized queries across many types and sources of biological data. It’s part of the GMOD toolkit, and many project teams with big data have chosen BioMart to organize their data and make it available to you.
We’ve been fans of BioMart for years. It was one of the earliest software tools we described, as it was integrated into many of the sites that we covered–such as Ensembl. Eventually we broke it out into its own tutorial suite, though, as there are now dozens of groups that have built Marts of their own. Although the skin may change and the data sets that are available will vary at different sites, the underlying software features are the same. Learning to use the main BioMart portal will help you to use all of them. Until recently the list of data providers that used BioMart was on the homepage, but here’s a taste of that list from my slides:
In this video tip I’ll introduce the newly redesigned BioMart main site, and touch on some of the other versions of BioMart that you should get to know. We’ll be updating our tutorial suite with the new look soon, but most of the software functionality is the same as what we’ve already covered there (available by subscription).
There are two main versions of BioMart in circulation right now. Version 0.7 is the one that will probably be most familiar to people who have encountered BioMart at any of the genomics sites that currently have installations. But there’s a new, redesigned version 0.8 under development. It’s the one that’s used at the International Cancer Genome Consortium (ICGC.org), and there’s also a 0.8 central BioMart portal available to try out. Eventually 0.8 may replace many of the 0.7 setups, but that depends on the site, and some may stay with 0.7 for a while rather than updating. So it’s probably wise to have an idea of how to use both of them for now.
One of the features of the new BioMart interface that’s already got bioinformatics folks talking is the ID converter. Converting between different types of gene identifiers is a common problem in the field, and Steven Turner thought this was a nice aspect of the facelift: BioMart Gene ID converter.
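If you would rather script an ID conversion than click through the interface, here is a minimal sketch of how a request might be put together. It is not from the video; it assumes the XML query layout that, as far as I know, the 0.7-style "martservice" web service accepts. The function names are made up for this post, and the endpoint path, dataset name, filter name, and attribute names are assumptions that you should check against the Mart you are actually using. The code only builds the query and the URL; it does not contact any server.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumption: a 0.7-style martservice endpoint. Check the site you use.
ASSUMED_SERVICE_URL = "http://www.biomart.org/biomart/martservice"


def build_id_conversion_query(dataset, filter_name, ids, attributes):
    """Build an XML query asking `dataset` for `attributes` of the given `ids`.

    All names passed in are site-specific; nothing here is validated
    against a real Mart.
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",   # tab-separated output
        "header": "1",        # include a header row
        "uniqueRows": "1",    # collapse duplicate rows
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # The filter restricts results to the identifiers we want converted.
    ET.SubElement(ds, "Filter", {"name": filter_name, "value": ",".join(ids)})
    # Each attribute becomes one output column.
    for attr in attributes:
        ET.SubElement(ds, "Attribute", {"name": attr})
    return ET.tostring(query, encoding="unicode")


def build_request_url(base_url, query_xml):
    """URL-encode the XML into a `query` parameter on `base_url`."""
    return base_url + "?" + urlencode({"query": query_xml})


if __name__ == "__main__":
    # Illustrative values only: dataset, filter, and attribute names are
    # assumptions and differ between Marts.
    xml_text = build_id_conversion_query(
        dataset="hsapiens_gene_ensembl",
        filter_name="ensembl_gene_id",
        ids=["ENSG00000139618", "ENSG00000141510"],
        attributes=["ensembl_gene_id", "hgnc_symbol"],
    )
    print(xml_text)
    print(build_request_url(ASSUMED_SERVICE_URL, xml_text))
```

Running the script prints the XML and a URL you could paste into a browser or hand to a download tool. Swapping the filter and attribute names is how you would change which identifier type you convert from and to, assuming the Mart exposes both.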
I also wanted to note that BioMart is one of the tools that you can use at Galaxy to access large swaths of data for further analysis. At Galaxy, open the “Get Data” menu to see that BioMart is one of your options.
There was also a lot of buzz about BioMart last week when a “Virtual Issue” of the journal Database was released. It has not only an overview article about BioMart as a whole, but also articles on several of the resources that use BioMart for their data management and query interfaces. So you can see how widely useful this software is, among many different types of data providers. You can use the local installation of BioMart at a provider’s site, or you can use the main site to query any of these sources. More powerfully, from the main site you can also run queries across databases.
BioMart main site: http://www.biomart.org/
BioMart new-style Central Portal: http://central.biomart.org/
BioMart pages at GMOD: http://gmod.org/wiki/BioMart
Virtual Issue of Database on BioMart: http://www.oxfordjournals.org/our_journals/databa/biomart_virtual_issue.html
Kasprzyk, A. (2011). BioMart: driving a paradigm change in biological data management. Database, 2011. DOI: 10.1093/database/bar049
Zhang, J., Haider, S., Baran, J., Cros, A., Guberman, J., Hsu, J., Liang, Y., Yao, L., & Kasprzyk, A. (2011). BioMart: a data federation framework for large collaborative projects. Database, 2011. DOI: 10.1093/database/bar038
Guberman, J., Ai, J., Arnaiz, O., Baran, J., Blake, A., Baldock, R., Chelala, C., Croft, D., Cros, A., Cutts, R., Di Genova, A., Forbes, S., Fujisawa, T., Gadaleta, E., Goodstein, D., Gundem, G., Haggarty, B., Haider, S., Hall, M., Harris, T., Haw, R., Hu, S., Hubbard, S., Hsu, J., Iyer, V., Jones, P., Katayama, T., Kinsella, R., Kong, L., Lawson, D., Liang, Y., Lopez-Bigas, N., Luo, J., Lush, M., Mason, J., Moreews, F., Ndegwa, N., Oakley, D., Perez-Llamas, C., Primig, M., Rivkin, E., Rosanoff, S., Shepherd, R., Simon, R., Skarnes, B., Smedley, D., Sperling, L., Spooner, W., Stevenson, P., Stone, K., Teague, J., Wang, J., Wang, J., Whitty, B., Wong, D., Wong-Erasmus, M., Yao, L., Youens-Clark, K., Yung, C., Zhang, J., & Kasprzyk, A. (2011). BioMart Central Portal: an open database network for the biological community. Database, 2011. DOI: 10.1093/database/bar041
Haider, S., Ballester, B., Smedley, D., Zhang, J., Rice, P., & Kasprzyk, A. (2009). BioMart Central Portal–unified access to biological data. Nucleic Acids Research, 37 (Web Server). DOI: 10.1093/nar/gkp265