The Right Way to Regulate Algorithms

By Chris Bousquet & Stephen Goldsmith • April 3, 2018

This article originally appeared on CityLab.com.

Which public school will your child attend? How severe a sentence will you receive in the criminal justice system? Will you earn tenure as a teacher? In many cities, a new force is playing a critical role in answering these questions: algorithms.

Cities rely on algorithms to help make decisions that affect people’s lives in meaningful ways, from assessing teacher performance to predicting the likelihood of criminal re-offense. And yet, the general public knows almost nothing about how they work.

Take a recent example in New York City: The police department has begun using algorithms to help decide where to deploy officers across the city. In 2015, the New York Police Department performed a 45-day test of software company Azavea's HunchLab platform, which considers a variety of factors—from historic crime to proximity to bars and bus stops—to determine where crime is most likely to happen. In the years that followed, the NYPD pursued similar tests with a number of other companies, and while the department did not deploy any one of these tools, it drew insights from these trials to design its own predictive policing platform.

The purpose of data-driven algorithms like this one is to make policing more objective and less subject to individual bias. But many worry that bias is simply baked into the algorithms themselves. Some opponents have argued that policing algorithms will disproportionately target areas with more people of color and low-income residents because they reinforce old stereotypes: Data on patterns of past arrest rates, for example, might cause an algorithm to target low-income neighborhoods where officers were historically more likely to pick up black kids for possession. Others question whether such programs are effective at all. A study by the RAND Corporation of a similar predictive policing program in Shreveport, Louisiana, found no measurable effect on crime.

Even if these algorithms do improve policing, mistrust will continue so long as public information is lacking. Despite a recent lawsuit by NYU’s Brennan Center for Justice that required the NYPD to release parts of its correspondence with Azavea and two other companies, the public knows little about how New York’s current predictive policing works, whether it relies on tainted data, or whether it is effective at reducing crime. As it stands, residents, advocates, and researchers have little ability to evaluate these tools to determine whether they are accurate or fair. Even City Council members have struggled to understand how their own precincts make staffing decisions.

Granted, some opacity around algorithms is inevitable. Some algorithms are too complex to be communicated in a simple and satisfying way, source code is often the proprietary secret of private companies, and releasing detailed information can pose cybersecurity risks.

These caveats make algorithmic transparency tricky, but it’s worth figuring out. Governments need mechanisms for making sure algorithms are subject to the same scrutiny as other types of public decision-making. This transparency should be proactive, rather than requiring years of legal action as in the Azavea case.

This past summer, New York City Councilman James Vacca proposed a bill that was one of the first attempts to address this issue. The proposal opened a spirited debate in the city over algorithmic transparency, drawing testimony from academics and advocates eager to see radical transparency, local tech companies urging the city to remain conscious of interests in the private sector, and public leaders looking to ensure both accountability and security for algorithms. The city ultimately landed on a compromise proposal, which creates an expert task force to develop transparency recommendations for the mayor.

This bill was a good opening salvo, but now comes the more difficult task: actually developing these mechanisms for transparency. If implemented in the right way, efforts towards transparency can actually benefit cities and vendors by maximizing the value and reach of analytics work. There are a few basic requirements the task force should consider:

  • Share the motivation for using an algorithm. When a city communicates its goal—for example, predicting when and where crime will happen—residents can assess intentions, and government has a benchmark for evaluating results.
  • Explain what data went into the model and why. Revealing sources gives residents the opportunity to identify potential bias from data tainted by historically discriminatory practices.
  • Describe how developers analyzed the data. Different from publishing source code—which is both potentially dangerous and unhelpful to most residents—a description of process includes information that allows the public to understand how developers get from data to an output. For example, the New York Mayor’s Office of Data Analytics (MODA) explained that it used a Monte Carlo technique to optimize funding for a school lunch program, and outlined the steps involved in this process; a rough sketch of what such a process can look like follows this list.
  • Publish performance data. Governments should release any performance evaluations provided to them by vendors, and publish their own data on their success in achieving the policy goals they initially communicated.
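
To make the third requirement concrete: MODA has not published its code, so the sketch below is not its actual method. It is only a hypothetical illustration, written in Python, of how a Monte Carlo approach to dividing a fixed lunch budget across schools might work. All names, costs, and demand figures are invented for the example.

```python
import random

# Hypothetical inputs (invented for illustration, not MODA's actual data):
# a fixed daily budget and each school's expected daily lunch demand.
MEAL_COST = 2.50                      # dollars per meal
BUDGET = 10_000                       # dollars per day across all schools
EXPECTED_DEMAND = {"School A": 1200, "School B": 900, "School C": 1500}

def simulate_shortfall(allocation, trials=5_000):
    """Estimate the average number of unserved students per day for a given
    dollar allocation by drawing random demand around each school's mean."""
    total_shortfall = 0.0
    for _ in range(trials):
        for school, dollars in allocation.items():
            meals_funded = dollars / MEAL_COST
            # Demand fluctuates day to day; model it as +/-20% around the mean.
            demand = random.uniform(0.8, 1.2) * EXPECTED_DEMAND[school]
            total_shortfall += max(0.0, demand - meals_funded)
    return total_shortfall / trials

def random_allocation():
    """Split the budget across schools in random proportions (one candidate)."""
    weights = [random.random() for _ in EXPECTED_DEMAND]
    total = sum(weights)
    return {s: BUDGET * w / total for s, w in zip(EXPECTED_DEMAND, weights)}

# Monte Carlo search: score many random candidate allocations and keep the one
# with the smallest expected shortfall.
best_allocation, best_score = None, float("inf")
for _ in range(200):
    candidate = random_allocation()
    score = simulate_shortfall(candidate)
    if score < best_score:
        best_allocation, best_score = candidate, score

print({s: round(d) for s, d in best_allocation.items()}, round(best_score, 1))
```

Publishing even a plain-language version of a process like this—what was simulated, how many trials were run, and what was optimized—lets residents and researchers judge whether the approach is reasonable without exposing proprietary code.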

Regulators need to be careful that these requirements do not place so large a burden on cities that they would eschew data-driven strategies in favor of old-fashioned intuitive decision-making. But the creators of algorithms already have these conversations in private, so fulfilling these reporting mandates is really a matter of better documenting processes that are already happening.

Disclosing more information about algorithms is not only valuable from an ethical standpoint, but also benefits cities by expanding the reach and value of data analytics work. By making analytics processes publicly available, cities allow other agencies to find inspiration and insight to pursue similar projects, as well as criticize and improve existing efforts. Increased reliance on data and evidence, and increased transparency, are movements that can and should support each other as cities continue to push towards a tech-driven future.

This article has been updated to clarify that while HunchLab participated in a pilot with New York, the city has pursued its own efforts outside of this pilot, and that it is the city, not HunchLab, that is responsible for the city's disclosure policies.

Top photo credit YouTube.com/azavea

About the Author

Stephen Goldsmith 

Stephen Goldsmith is the Derek Bok Professor of the Practice of Urban Policy at the Harvard Kennedy School and the director of Data-Smart City Solutions at the Bloomberg Center for Cities at Harvard University. He previously served as the mayor of Indianapolis and deputy mayor of New York City.

Read Professor Goldsmith's full bio here.

About the Author

Chris Bousquet

Chris Bousquet is a PhD student in philosophy at Syracuse University. Prior to that, Chris was a Research Assistant/Writer for Data-Smart City Solutions. Chris holds a bachelor’s degree from Hamilton College.