The first question most open data advocates hear is, “Why?” Whether you’re trying to make the case within government or coming in from the outside, many, many advocates in our space spend a lot of time justifying open data’s potential instead of playing with its possibilities.
It’s hard to blame the data providers, however: “Open data” as a concept has really only come into public awareness over the last six years, and its benefits are not yet widely known -- or appreciated. Many of the challenges governments raise are legitimate considerations for such a new activity, and all challenges -- from the most predictable or frustrating to the most practical -- deserve attention. Experienced information freedom fighters have learned how to respond when challenged, but for those of us who are encountering new terrain and seeking guidance, the collective knowledge of our community is hard to find. It is usually trapped in email groups, discussion boards, blogs, and the memories and experiences of individuals.
This is why, back in April, we at the Sunlight Foundation were challenged by our friends at Technically Philly, a technology news and community organization, to round up the ‘Top 100 Reasons Not to Release Data’ and refute each and every one, with the goal of creating an open data dialogue resource for veteran and new voices alike.
We took the bait, and over the month of August we crowdsourced more than 50 commonly cited reasons for not releasing data, as heard by those working both inside and outside government at the federal, state, and local levels. The reasons reflected a wide variety of concerns, from cost and staff time constraints to privacy questions to the shock that APIs don’t always equal open data.
To respond to these challenges we decided once again to turn to the crowd to learn how others have responded to the concerns we’d collected. Dozens of people from many different backgrounds contributed to the project, sharing thoughts on social media, commenting on our public Google Doc, and even generating ideas on the Open Data Stack Exchange, where several threads were opened to dive deeper into specific subjects.
Using this input along with our own experience and materials from our peers at the National Neighborhood Indicators Partnership and some data warriors from the UK, we compiled a roundup of talking points to help unpack each challenge. The answers were rolled out in a 10-part blog series titled #WhyOpenData and organized by categories, as you can see below with a few examples.
Part 1: Apathy
There Are Few Public Requests For Data To Be Open/For Data In General
- Is it known that this information is being collected or maintained by the government? (If no one knows, of course no one is asking about it!)
Part 2: Confusion
How Would Someone Even Use That Kind Of Data?
- "What do you do? What does your agency/program do? Why is that important? Why was it created? The way you use or collect the data indicates some of its value. Opening the information can have a lot of impact on internal government processes, as well as for public stakeholders who need it."
Part 3: It's Hard!
I Don’t Know How To Organize On This Issue
- "Who could help us move forward with this? Let's build a coalition and create momentum in the government and in the community."
Part 4: Cost
It Would Require New Processes And Staff Training
- “There may never be a ‘perfect moment’ when opening data is easy and instant, but we can begin to explore small steps that benefit your staff and the public and ultimately open more data now and build toward a greater scale down the road.”
Part 5: Staffing Concerns
I Don’t Mind Making It Open, But I Worry Someone Else Might Object
- Identify who is generating the concern and offer to help talk with them about the reasons that opening data can be beneficial, innovative, and less difficult than they think.
Part 6: Legality
If We Publish This Data, People Might Sue Us
- If the concern is about information that could lead to lawsuits from people or businesses, make a policy that balances the public interest in accessing the data with the privacy concerns and stick to it.
Part 6, continued: Legality
It's Not Classified, But We Don't Think It Would Be Good PR To Open This
- "If you are worried about bad PR, be more proactive about good PR. Explain what the data means and why you are opening it."
Part 7: Accuracy
If We Put The Data Out There In Bulk, People Will Alter It
- People might already be misusing and misunderstanding the bits and pieces of data that are already available. Communicating the meaning of the data and working toward improved data quality and release could help clear up these misuses. In this way, bulk data can actually help clarify misunderstandings by sharing the largest scope of information available on a particular subject.
Part 8: Privacy
It's Classified Or Confidential / We Can’t Provide That Dataset Because One Part Is Classified
- "Is it possible to redact just the parts that cannot be released? Can the confidential parts be excluded, leaving something that’s still useful? There could be valuable information in the parts of the dataset that are not subject to privacy or security concerns."
Part 9: "Already" Public Data
That’s What FOIA Is For
- “If you service a lot of FOI requests, you’re in a great position to identify what data the public wants (often, the data they request the most). Sharing that information publicly and online will help reduce repetitive requests on your staff and likely will improve your relationship with the public.”
Part 10: Say What?
That Is A Good Idea, But We Don't Have An Open Data/Open Government Strategy Yet, So We Should Wait For That
- "Releasing open data doesn't have to wait for a policy or strategy. While having a strategy can be helpful for the guidance and sustainability of a government’s approach to data publishing, it doesn't hurt to start releasing data now. Experimenting with data release can help illuminate the benefits and challenges that it can bring. A policy can incorporate what you learn by starting to share data and can help you build in necessary regulations and activities you found difficult to execute without a policy or mandate."
Obviously, this series does not cover every argument, solution, or tactic for dealing with data disclosure, but it should serve as a good start for those looking to dig deeper into the barriers to opening data, no matter how complicated or commonplace. Perhaps you’ll even find that these resources will help you turn a “no” into a “maybe” or, better yet, a “yes.”
Thanks to those of you who helped make this resource possible. We hope to see it forked and remixed over time, and to watch our new fount of collective knowledge grow. To get started, join the conversation on Twitter with #WhyOpenData.