What Can Be Done to Protect Endangered Government Data?
Event Summary
The federal government has made significant strides towards making vast amounts of government data freely available to the public, and businesses, researchers, civil society groups, journalists, and many others have put open data to good use. However, recent events suggest that some open government data may be at risk. For example, in February 2017, the Department of Agriculture abruptly blocked public access to an animal abuse database used by businesses across the country; in March 2017, the Department of Health and Human Services announced it would no longer ask questions about sexual orientation and gender identity to National Survey on Older Americans Act participants, sacrificing a valuable opportunity to collect data about pressing intersectional social issues; and in October 2017, the Federal Bureau of Investigation released its annual report on crime statistics with 70 percent fewer data tables than the prior year’s report.
On February 27th, 2018, as part of Endangered Data Week, the Center for Data Innovation (CDI) hosted a panel discussion about the risks to open government data, especially in agencies that are underfunded and understaffed, and what can be done to protect this data in the years to come. Daniel Castro, director of the CDI and the panel moderator, opened the discussion by framing the issues surrounding data collection and storage. Castro described his concern that data is becoming a political issue: “people realize they can't always win in the battle of ideas because they don't have data on their side and so instead of having an honest fight over policy they're waging a war on data.”
The large-scale deletion of open government datasets that was feared at the outset of Donald Trump’s presidency, which would have signaled the end of the open data era, has not happened. Even so, the panelists agreed that subtler ways to undermine openness remain. “[Data Refuge] really raised the nation’s awareness about these valuable data assets that could be at risk,” said Denice Ross, a technology fellow at New America. “And I think the mere act of holding these Data Refuge events around the country itself probably stemmed some of the data loss we might have seen.”
John Thompson, the former director of the Census Bureau, spoke about the bureau’s push for automation and its priorities under the federal budget. When asked about his concerns about endangered data, Thompson stated that the Census Bureau has a hard time quelling public fears that respondents’ personal information could be put at risk. Regarding the citizenship question, Thompson said: “Putting that [citizenship] question on the decennial census has the risk of raising fears among certain populations that it would be difficult for the Census Bureau to countermand.”
Panelists also agreed that although some datasets can be politically charged, open data is not a partisan issue. As proof, Gavin Baker, the assistant director of government relations at the American Library Association, cited the bipartisan OPEN Government Data Act, which would require that public data be published in a machine-readable format. Castro then emphasized that this bill would allow data to be shared between government agencies for analytical purposes without making sensitive information public.
The panelists also agreed that “storytelling” (finding specific individuals who have benefitted from open data and showing how it affected them) is key to gathering support for open data initiatives.
The discussion ended with a question from moderator Daniel Castro, who asked the panelists whether, in this era of “fake news,” we are prepared ahead of time to defend data against that type of attack. In response, Patricia Kim, co-founder of Data Refuge, emphasized the importance of storytelling and of making the public aware of successful open data uses and practices.
Follow the conversation on Twitter using #datainnovation.