Viruses Cross Borders. To Fight Them, Countries Must Let Medical Data Flow, Too
If nations could regulate viruses the way many regulate data, there would be no global pandemics. But the sad reality is that, in the midst of the worst global pandemic in living memory, many nations make it unnecessarily complicated and costly, if not illegal, for health data to cross their borders. In so doing, they are hindering critically needed medical progress.
In the COVID-19 crisis, data analytics powered by artificial intelligence (AI) is critical to identifying the exact nature of the pandemic and developing effective treatments. The technology can produce powerful insights and innovations, but only if researchers can aggregate and analyze data from populations around the globe. And that requires data to move across borders as part of international research efforts by private firms, universities, and other research institutions. Yet, some countries, most notably China, are stopping health and genomic data at their borders.
Indeed, despite the significant benefits to companies, citizens, and economies that arise from the ability to easily share data across borders, dozens of countries—across every stage of development—have erected barriers to cross-border data flows. These data-residency requirements strictly confine data within a country’s borders, a concept known as “data localization,” and many countries have especially strict requirements for health data.
China is a noteworthy offender, having created a new digital iron curtain that requires data localization for a range of data types, including health data, as part of its so-called “cyber sovereignty” strategy. A May 2019 State Council regulation required genomic data to be stored and processed locally by Chinese firms and prohibited foreign organizations from doing so. This is in service of China’s mercantilist strategy to advance its domestic life sciences industry. While there has been collaboration between U.S. and Chinese medical researchers on COVID-19, including on clinical trials for potential treatments, these restrictions mean that such collaboration won’t involve the transfer, aggregation, and analysis of Chinese personal data, which otherwise might help find a treatment or vaccine. If China truly wanted to make amends for blocking critical information during the early stages of the outbreak in Wuhan, then it should abolish this restriction and allow genomic and other health data to cross its borders.
But China is not alone in limiting data flows. Russia requires all personal data, health-related or not, to be stored locally. India’s draft data protection bill permits the government to classify any sensitive personal data as critical personal data and mandate that it be stored and processed only within the country. This would be consistent with recent debates and decisions to require localization for payments data and other types of data. And despite its leading role in pushing for the free flow of data as part of new digital trade agreements, Australia requires genomic and other data attached to personal electronic health records to be stored and processed only within its borders.
Countries also enact de facto barriers to health and genomic data transfers by making it harder and more expensive, if not impractical, for firms to transfer it overseas than to store it locally. For example, South Korea and Turkey require firms to get explicit consent from people to transfer sensitive data like genomic data overseas. Doing this for hundreds or thousands of people adds considerable costs and complexity.
And the European Union’s General Data Protection Regulation encourages data localization as firms feel pressured to store and process personal data within the EU given the restrictions it places on data transfers to many countries. This is in addition to the renewed push for local data storage and processing under the EU’s new data strategy.
Countries rationalize these steps on the basis that health data, particularly genomic data, is sensitive. But requiring health data to be stored locally does little to increase privacy or data security. The confidentiality of data does not depend on which country the information is stored in, only on the measures used to store it securely, such as encryption, and the policies and procedures firms follow in storing or analyzing the data. For example, if a nation has limits on the use of genomic data, then domestic organizations using that data face the same restrictions, whether they store the data in the country or outside of it. And if they share the data with other organizations, they must require those organizations, regardless of where they are located, to abide by the home government’s rules.
As such, policymakers need to stop treating health data differently when it comes to cross-border movement, and instead build technical, legal, and ethical protections into both domestic and international data-governance mechanisms, which together allow the responsible sharing and transfer of health and genomic data.
This is clearly possible—and needed. In February 2020, leading health researchers called for an international code of conduct for genomic data following the end of their first-of-its-kind international data-driven research project. The project used a purpose-built cloud service that stored 800 terabytes of genomic data on 2,658 cancer genomes across 13 data centers on three continents. The collaboration and use of cloud computing were transformational in enabling large-scale genomic analysis.
If policymakers want more international collaboration like this, including around COVID-19, then they should remove barriers to health data transfers and build a clear and predictable framework to clarify how data protection rules apply. That’s because, even in countries where explicit barriers don’t exist, risk aversion and legal uncertainty about health data governance make many health firms and researchers reluctant to participate in health data sharing.
Just as the global financial crisis was the catalyst for the G20 to take unprecedented global economic action, so too should COVID-19 motivate countries to do the same for global health data governance. However, while G20 health ministers have met annually since 2017 to talk about health risk and security, the prospects for action at the G20 are low, especially because China-U.S. relations are at a historic nadir and China and Russia do not allow cross-border data flows.
Given this, other leading countries that recognize the value of data-driven health collaboration, such as Japan, the United States, and Singapore, will have to shift any such initiative elsewhere. There is no reason that a group of like-minded nations, including the United States, could not establish a health data-sharing alliance. At the multilateral level, the World Health Organization, which has ramped up its work on improving digital health strategies and policies, should issue a clear and strong statement against health data localization policies.
Countries could build upon existing initiatives. For example, the Global Alliance for Genomics and Health brings together hundreds of healthcare, research, biopharmaceutical, and technology organizations to create ways to enable the responsible, voluntary, and secure sharing of genomic and health-related data. There’s also the World Economic Forum’s “Breaking Barriers to Health Data” project, which is working to build a pilot that uses federated data systems to share genomic data.
Wherever the initiative is based, a focus on pandemic data sharing and research would be the obvious place to start. Research on rare diseases, defined in the United States as those that affect fewer than 200,000 people, is another worthy candidate for special attention, given the critical need to aggregate data from the small number of patients dispersed around the world. But these would ideally be just the starting steps toward the longer-term development of a broader health data governance framework to support data-driven health research.
Two decades after the sequencing of the human genome, and during the current rise of data- and AI-driven health services, the world is only just starting to see the true potential of these technologies. But a lot of work remains to be done to realize their benefits at the global level. Following the worst pandemic in a century, policymakers will, one hopes, recognize that they would be better off in the future if researchers were able to pool their data and technological capabilities to respond more quickly and effectively to the next international health crisis, not to mention more common health conditions.