During the August convening of the Urban Policy Advisory Group, information and innovation officers from cities across the country discussed the need for data standards to support collaboration across departments, agencies, and cities.
As Abhi Nemani of Code for America put it, a data standard is basically a way to format a spreadsheet: a common set of headers and guidelines about how to think about data. This simple concept holds great potential for improving government in three major areas:
Open-source tools built on open standards can be significantly easier to implement, and can be applied quickly to other datasets using the same standards. Reuse brings down the cost of deployment: ideally, cities, departments, and agencies redeploy successful products rather than replicate the painstaking labor already invested by the first adopter.
Standardized data allows not only cross-city implementation, but also inter-departmental and extra-governmental collaboration. Departments accessing shared standardized data do not have to make redundant collection efforts. Non-governmental partners can help represent the data in ways useful to citizens – for example, Yelp pulling restaurant inspection scores from a data catalogue to publish alongside user reviews. Analyzing these shared datasets could reveal new insights to departments formerly trapped in silos.
Open data and open standards invite public servants and private citizens alike to build novel applications on top of government data, leading to unforeseen insights.
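To make the "common set of headers" idea concrete, here is a minimal sketch in Python. The header names, the two cities, and the legacy export are all made up for illustration and are not drawn from any published standard. It only aims to show why a shared format lets one small tool serve several publishers, while a nonconforming export is rejected before any analysis runs.

```python
import csv
import io

# Hypothetical standard: these header names are illustrative only,
# not taken from any published specification.
STANDARD_HEADERS = ["business_id", "business_name", "inspection_date", "score"]


def missing_headers(csv_text, required=STANDARD_HEADERS):
    """Return the required headers that are absent from the CSV's first row."""
    reader = csv.reader(io.StringIO(csv_text))
    header_row = next(reader, [])
    present = {name.strip() for name in header_row}
    return [name for name in required if name not in present]


def average_score(csv_text):
    """Average the 'score' column of a CSV that follows the standard."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("no data rows to average")
    return sum(float(row["score"]) for row in rows) / len(rows)


# Two made-up cities publishing in the same format: one tool serves both.
CITY_A = (
    "business_id,business_name,inspection_date,score\n"
    "1,Cafe Uno,2013-08-01,90\n"
    "2,Deli Dos,2013-08-02,80\n"
)
CITY_B = (
    "business_id,business_name,inspection_date,score\n"
    "7,Noodle Bar,2013-08-05,70\n"
)
# A made-up legacy export with its own column names fails the check.
LEGACY = "ID,Name,Date,Grade\n1,Cafe Uno,08/01/13,A\n"

if __name__ == "__main__":
    for label, text in [("City A", CITY_A), ("City B", CITY_B), ("Legacy", LEGACY)]:
        gaps = missing_headers(text)
        if gaps:
            print(label, "does not follow the standard; missing:", gaps)
        else:
            print(label, "average score:", average_score(text))
```

In this sketch the same `average_score` function works unchanged on both cities' files, which is the reuse benefit described above. The legacy file would first need its columns mapped to the shared headers, which hints at the implementation costs discussed later.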
Given this rosy picture of data standardization, why haven’t more cities and agencies gotten on board? By and large, city actors realize there is great value in supporting standardized, open data – the idea of accessing centrally developed products and cooperating with other agencies or cities through unified data warehouses is an attractive one. But they also realize the great practical difficulty in implementation.
Some of the common challenges to implementation include:
1. IT infrastructure
Taking in customer information is straightforward enough. Integrating the data with agency systems that might be several decades old is not. Overhauling the system of collecting, storing, and processing data would require an enormous investment of time and money.
2. Data & standards maintenance
Cities spend a great deal of resources collecting, reformatting, and managing data. Adopting data standards can reduce these costs, but it cannot eliminate data-related maintenance altogether. How often should data standards be revised? How often should new data be collected?
Why should one city follow the data standards of another? One might have peculiar health issues not found in another, or simply want to use similar datasets for different ends. Agreeing on shared standards necessarily means compromising for all parties involved, and it likely also means giving up some granularity in city-specific data.
4. Unwanted scrutiny
Novel overlays of 311 reporting, building inspections, and code enforcement might indeed uncover something new and interesting about a city. Such discoveries might also prompt unanticipated questions of city departments. On the flip side of transparency is unwanted scrutiny. While data standards commonly draw praise for “leveling the playing field,” they can also invite unfair comparisons among agencies with (justifiably) disparate goals.
Given the above challenges, department heads could be forgiven for being wary of investing significant resources in standardizing data right away. City governments seeking to adopt data standards and push open data initiatives need to find ways to address these challenges, and in doing so better justify the initiatives to their colleagues.
In New York City, many agencies providing health and human services have connected their datasets under an initiative called HHS-Connect. We spoke with Kristin Misner, Chief of Staff to the Deputy Mayor for Health and Human Services, about the initiative. Next week, see how data sharing is improving service delivery in New York.