Video Tip of the Week: Sharing #H7N9 data with EpiFlu™

This week’s video Tip of the Week offers you a quick tour of GISAID’s resources and their EpiFlu™ database. This is the database you might be hearing about in the news: the one to which researchers submit the new H7N9 influenza sequence data that they are collecting. The initiative started as the “Global Initiative on Sharing Avian Influenza Data,” but it has evolved to become the “Global Initiative on Sharing All Influenza Data” to reflect a broadened mission to collect any flu data. Researchers around the world can quickly share their discoveries of any type of influenza virus and make the sequence details, as well as epidemiological and clinical data, available to other researchers, who then explore and analyze that information. Researchers from various disciplines, including veterinary science, virology, bioinformatics, epidemiology, immunology, and clinical analysis, access this information. Currently the US CDC and others are using the data to explore the development of new vaccines, antiviral drugs, and diagnostic kits, and to learn about characteristics of the virus isolates that could affect public health policy making. Even before they had access to physical virus samples to test, they could begin assessments using the EpiFlu data. In this tip I’ll show you how researchers submit to and access this global resource. My goal is to show some of the details to researchers who might want to use the information, but other people might be curious to have a look under the hood too.

For today’s tip I focus on how researchers would use the EpiFlu database, with a quick tour of some features. Recently I signed up for access to the site, which was quickly approved. I then asked the GISAID team for permission to make the video, which they also quickly granted, and I have asked them to review the movie to make sure I didn’t go out of bounds of the Data Access Agreement. As a registered user, I’m not allowed to show the general public the sequence data itself, but I will show you how researchers would obtain the details they need to take their analyses further. I did get permission to open one record to illustrate a key point in the record. (Information in that record has been published in the New England Journal of Medicine by the submission team. Reference below.) Later, you can sign up for access to see the details yourself. The data is publicly accessible, as long as you identify yourself and agree to the terms. The terms of use are not designed to be a barrier to access and research; they are in place to give us the freedom and responsibility to use the data appropriately. There have been objections to this sharing model, but going into detail on the history and development of GISAID is not the subject of this post.

The site details

You can learn more about the issues that were a catalyst for the development of GISAID from an editorial published in Nature, cited below. GISAID continues to evolve, and you can learn more about the state of the current initiative and its scientific advisors by visiting their website. Since 2010, the German government, represented by the Federal Ministry of Food, Agriculture and Consumer Protection, has been the official host of the site, and the Federal Research Institute for Animal Health is responsible for the quality of the data in GISAID.

When you are logged into the GISAID site, you’ll have access to a range of features. Related news items are posted. You can see the list of all the other registered users, and you can easily contact them from within the system for questions and collaborations. Most importantly, though, you have access to the relational database component called EpiFlu. This is where researchers can submit new sequences that they isolate. There are many fields for storing crucial metadata. The entry form offers different fields depending on the type of isolate and host. EpiFlu is where other researchers can query for the types of strains, hosts, or submission details they are interested in. These sequences and metadata can be downloaded for use with other tools. There are also some analysis tools provided in the EpiFlu interface. Sequences can be submitted for BLAST analysis or used to generate a multiple-sequence alignment with an installation of Jalview.
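For readers wondering what "use with other tools" might look like once sequences are downloaded, here is a minimal Python sketch. It assumes the download is a FASTA file whose header lines carry metadata separated by "|". That header layout, the field order, and the helper names are my own illustration, not GISAID's actual export format, and the records are made-up placeholders rather than real EpiFlu data.

```python
from typing import Dict, Iterator, List, Optional, Tuple

# Made-up placeholder records. The "name|subtype|host" header layout is an
# assumption for illustration only, not the real EpiFlu export format.
EXAMPLE_FASTA = """\
>placeholder_isolate_1|H7N9|human
ACGTACGT
ACGT
>placeholder_isolate_2|H5N1|avian
TTGGCCAA
>placeholder_isolate_3|H7N9|avian
GGGGCCCC
"""


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header: Optional[str] = None
    chunks: List[str] = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may wrap, so collect them and join later.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def select_by_subtype(text: str, subtype: str) -> Dict[str, str]:
    """Return {name: sequence} for records whose second header field matches."""
    selected: Dict[str, str] = {}
    for header, sequence in parse_fasta(text):
        fields = header.split("|")
        if len(fields) > 1 and fields[1] == subtype:
            selected[fields[0]] = sequence
    return selected


h7n9 = select_by_subtype(EXAMPLE_FASTA, "H7N9")
for name, sequence in h7n9.items():
    print(name, len(sequence))
```

With a real download you would read the file from disk instead of using the inline string, and adjust the field positions to match whatever the header actually contains. Keep in mind that the Data Access Agreement still applies to anything you do with the data.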

In speaking with folks at GISAID last week about their philosophy, I learned about new software they are working on. The GISAID group plans an EpiFlu 2.0, which they are building from scratch. That version will add features that improve connectivity with other resources and support collaboration, and it will scale better. As the deluge of sequence data from all kinds of sources continues, this will really be necessary. I don’t know what the target date is for the next version, but I’ll be keeping an eye out for that as a future tip.

For non-researchers

If you are a member of the public curious about information sources on the flu, please read this excellent guidance on sources of flu information by Maryn McKenna: The New Bird Flu, and How to Read the News About It. Not all news is the same. Let’s be careful out there.

An example of a sequence being submitted to GISAID:

One last note:

And researchers are using GISAID to help understand the behavior of H7N9 right now.


Quick links:

GISAID Foundation and EpiFlu website:

Follow GISAID on Twitter at @AIDigest:

Register for access:

BTW: I also asked how they pronounce their name—it’s like “jees-aid”, if you want to know. I had only seen it written in text form and hadn’t heard it from any sources so I wanted to be sure.

References for the video and post:

Editorial (2006). Boosting access to disease data. Nature, 442 (7106), 957. DOI: 10.1038/442957a

Butler, D. (2013). Urgent search for flu source. Nature, 496 (7444), 145-146. DOI: 10.1038/496145a

Gao, R., Cao, B., Hu, Y., Feng, Z., Wang, D., Hu, W., Chen, J., Jie, Z., Qiu, H., Xu, K., Xu, X., Lu, H., Zhu, W., Gao, Z., Xiang, N., Shen, Y., He, Z., Gu, Y., Zhang, Z., Yang, Y., Zhao, X., Zhou, L., Li, X., Zou, S., Zhang, Y., Li, X., Yang, L., Guo, J., Dong, J., Li, Q., Dong, L., Zhu, Y., Bai, T., Wang, S., Hao, P., Yang, W., Zhang, Y., Han, J., Yu, H., Li, D., Gao, G., Wu, G., Wang, Y., Yuan, Z., & Shu, Y. (2013). Human Infection with a Novel Avian-Origin Influenza A (H7N9) Virus. New England Journal of Medicine. DOI: 10.1056/NEJMoa1304459


Disclosure: OpenHelix has no financial or scientific relationship with the GISAID Foundation or EpiFlu. I merely approached them as I do many other resources to ask for permission to do a movie. I offered them the opportunity to review my materials because of the sensitivity of this issue and the desire to NOT cause any kind of international public health incident. The goal of this is to show people the insides of a database or resource that they may not be familiar with.
