A Catalog of Civic Data Use Cases

What kinds of operations-enhancing questions have cities asked and answered with data and analytics? The catalog below is an ongoing, regularly-updated resource for those interested in knowing what specific use cases can be addressed using more advanced data and analysis techniques.

For examples that are currently being implemented in cities across the country, you can click to expand the question to see additional information about the solution. All other examples represent potential questions that cities could work to address with data and analytics.

We welcome further submissions to the list by email. Submissions can include either current examples of how cities are addressing specific operational or policy issues with data, or ideas for how to address issues that you hope cities will one day be able to answer.

Health & Human Services

How can we improve cooperation between city services with shared clients?

Since the creation of a unified Department of Housing and Human Services (HHS), Boulder County, CO has been a testament to the benefits of holistic human services delivery. Through its integrated service delivery system, Boulder County has been able to expand the number of residents receiving services by 140%, focusing on front-end and early intervention measures to prevent more costly services in the future. Technology has been a key feature of this transition. The Department, as it exists today, was formed after a 2008 merger between the County’s housing and social services agencies. To support this effort, HHS developed an integrated service delivery system, including technological tools that allow employees to track clients’ case histories across programs, refer clients to additional program areas, and collaborate with other department caseworkers. Read more in "Boulder County Colorado: Integrated Service Delivery" by Sam Gill, Indi Dutta-Gupta, and Brendan Roach.

How livable are cities for an aging population?

Aging in place is an important consideration for city leaders, as the population of Americans over age 65 is growing rapidly. The AARP’s new Livability Index scores small, mid-size, and large cities across seven categories, including housing affordability, access to medical care, and transportation access, all important aspects of an age-friendly city.

What methods help reduce evictions for low-income residents?

A new working paper from Princeton University found that providing legal representation for low-income tenants in housing court led to a significant reduction in eviction warrants issued against the clients. By reviewing data from New York City’s Universal Access program, which guarantees legal representation to low-income NYC renters, researchers found that tenants with representation were 72 percent less likely to receive eviction warrants and had an 85 percent reduction in “monetary judgements,” or back rent owed. These effects were amplified in areas of NYC with higher non-white and non-citizen populations.

Why should a city have a chief heat officer?

As extreme weather events happen more frequently and global temperatures rise, it’s more important than ever that local leaders plan for a hotter city. This includes looking out for vulnerable populations (unhoused folks, the elderly, children, folks with chronic health conditions), changing or “greening” infrastructure (cool roofs, increased tree canopy, less pavement), and engaging the public in emergency preparedness. Tackling extreme heat cuts across agencies and departments and requires a well-coordinated and authorized approach, which is why cities in the US have begun hiring chief heat officers to guide and envision the pathways needed to accelerate solutions.


Learn more about the city of Los Angeles’ chief heat officer in this article.
  • Who is most likely to apply for a city service(s)?

  • What is the impact of providing an additional service(s) to a client already receiving one city service?

  • Which clients are most likely to apply for multiple services?

  • When clients apply for / obtain multiple services, which service do they typically apply for first?

  • Can we forecast the number of caseloads for city services?

Can data and analytics prevent lead paint poisoning?

Although lead paint was banned in the United States in the 1970s because of the harmful effects exposure can have on children in particular, some older homes may still contain remnants of the dangerous chemical. The Chicago Department of Public Health and the Center for Data Science and Public Policy at the University of Chicago (DSaPP) partnered to tackle this issue with machine learning and analytics. The goal of the partnership was to help identify homes that are most likely to still contain lead-based paint hazards. The data scientists at DSaPP built statistical models that predict possible exposure based on factors like the age of the house, historical health data on children previously exposed at certain addresses, and economic conditions of neighborhoods. The data sources used by the predictive model include blood lead level tests and home inspection records, combined with a variety of publicly available data, such as census information about neighborhoods and construction data about the size and structure of individual houses. The paper detailing the project said that the model creates a list of at-risk homes with individual risk scores and that the Chicago Department of Public Health would use this list to prioritize homes for outreach and intervention to engage at-risk families and landlords.
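
As a rough illustration of how a ranked list of at-risk homes with individual risk scores might be produced, the sketch below scores a few made-up homes with a logistic function. The feature names, weights, intercept, the 1978 construction cutoff, and the addresses are all hypothetical; they are not taken from the DSaPP model, which was fit to real blood test and inspection records.

```python
import math
from dataclasses import dataclass


@dataclass
class Home:
    address: str
    year_built: int
    prior_elevated_tests: int         # past elevated blood lead tests at this address
    neighborhood_poverty_rate: float  # share of residents in poverty, 0.0 to 1.0


# Hypothetical weights for illustration only; a real model would learn these from data.
WEIGHTS = {"built_before_1978": 1.5, "prior_elevated_tests": 0.8, "poverty_rate": 2.0}
INTERCEPT = -3.0


def risk_score(home: Home) -> float:
    """Return a score between 0 and 1 using a logistic function."""
    z = INTERCEPT
    z += WEIGHTS["built_before_1978"] * (1 if home.year_built < 1978 else 0)
    z += WEIGHTS["prior_elevated_tests"] * home.prior_elevated_tests
    z += WEIGHTS["poverty_rate"] * home.neighborhood_poverty_rate
    return 1 / (1 + math.exp(-z))


def rank_homes(homes):
    """Sort homes from highest to lowest risk score."""
    return sorted(homes, key=risk_score, reverse=True)


homes = [
    Home("100 Example St", 1925, 2, 0.35),
    Home("200 Sample Ave", 1995, 0, 0.10),
    Home("300 Demo Blvd", 1960, 0, 0.25),
]

ranked = rank_homes(homes)
for home in ranked:
    print(f"{home.address}: {risk_score(home):.2f}")
```

A health department could work down such a list from the top, which is the prioritization step the paper describes; how scores are validated and where the outreach cutoff falls are choices the sketch leaves open.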

How can cities deliver social services more efficiently?

Five years ago, New York City launched an initiative, HHS-Connect, to collect its social service data in one place. The idea is to allow clients to walk into different social service agencies without having to re-enter their information and complete duplicate paperwork. “We have a vision of a client walking into, for example, a homeless shelter and not having to reapply your information if you had already been to the public welfare office or to the Administration for Children’s Services,” said Kristin Misner, chief of staff to the deputy mayor for health and human services. Read more in "Big Data Gives a Boost to Health and Human Services" by Stephen Goldsmith.

How can data and technology help streamline service enrollment and prevent “churn”?

In Philadelphia, an initiative that originally helped low-income seniors apply for benefits over the phone has transformed into a data-driven anti-poverty program that can enroll eligible residents in 19 different services and benefits within thirty minutes. Now called BenePhilly, the program functions as a quasi-governmental agency and has data sharing agreements with the state and city. By reducing duplicative forms, opening in-person locations at trusted community organizations, and relying on powerful, open-source technology to assess eligibility, BenePhilly is transforming the process of applying for, and seamlessly receiving, benefits, and increasing trust in government assistance services. Read more about the program here.

Can we address air pollution and child health outcomes with data?

Across the globe, nitrogen dioxide (NO2) from traffic pollution causes 4 million new cases of asthma in children, according to a new study on vehicular air pollution. In response, the Environmental Defense Fund is using mobile sensors in cities like Oakland, Houston, and London to measure levels of air pollution at the block level. Mapping this information and showing how levels vary in urban areas, where different ends of the same block can have different levels of toxic pollution, helps inform policies at the city and municipal level. Recent congestion pricing proposals and low emission zones will be watched closely to see if they produce measurable changes in the pollution data.

How do we help clients leaving the criminal justice system get and keep jobs?


In 2021, the Brookings Institution and American Enterprise Institute Working Group on Criminal Justice Reform published the report A Better Path Forward for Criminal Justice, which is broken into chapters that focus on seven different aspects of the criminal justice system. The sixth and seventh chapters, Training and Employment for Correctional Populations and Prisoner Reentry, cover best practices for increasing employability, providing career and job readiness training, and coordinating pre-release employment activities for returning citizens. Highlights include the importance of secondary and postsecondary education programming in correctional institutions, the success of work-release programs, and the need to reform employment-based criminal background checks.

  • How do we help clients leaving foster care, homeless shelters, etc. get and keep jobs?

  • Which clients coming out of the juvenile justice, criminal justice, foster care, homeless services, or substance abuse systems who are placed in employment are most likely to return to city services?

  • Which interventions are most effective?

  • Which individuals and families placed into permanent housing are most at risk of returning to the homeless services system?

What are the causes of infant mortality that could be targets for intervention?

In the span of just one year, Cincinnati managed to decrease its infant mortality rate (IMR) by over 25 percent, from 13.3 deaths per 1,000 live births in 2012 down to 9.9 in 2013. To accomplish this feat, the city has leveraged relevant data to concentrate its efforts where they are most needed. Since 2007, this targeted undertaking has tracked various indicators and outcomes, such as mother’s zip code, race/ethnicity, mental health, and smoking habits, as well as the child’s birth spacing and sleeping environment. By using data to zero in on quantifiable risk factors and on at-risk communities, Cincinnati is making major strides in a difficult undertaking, the fight against infant mortality. Read more in "Using Data to Combat Infant Mortality in Cincinnati" by Victoria Kabak. Maryland and Indiana also have notable successes in this area.

Can we target outreach and intervention to those at risk of poor health outcomes?

Improved analytical capabilities allow agencies to identify and implement more effective services for clients. In Oklahoma, as part of the SoonerCare (Medicaid) program, officials analyzed patient data including comorbidity factors to identify individuals prone to poor health outcomes. Equipped with a list of at-risk Medicaid recipients, managers have worked to sign these individuals up for intensive, managed-care programs. Meanwhile, the Rhode Island Department of Children, Youth, and Families developed the Real Connections program, which analyzes data on a child’s social network. Using this analysis of existing information, the Department is able to identify mentors best suited to enable the best outcome for each child. Read more in "The Technology Opportunity for Human Services" by Sam Gill, Indi Dutta-Gupta, and Brendan Roach. Stephen Goldsmith's "The Nexus Between Data and Public Health" highlights additional examples of health data that can help policymakers improve the health of their communities.

How can we identify causes of poor air quality causing health problems for residents?

Louisville, KY partnered with Propeller Health in May 2012 to distribute 500 smart inhalers to asthmatic residents. When the devices were used, they sent time and location data to both the patient’s physician and city officials, who used the data to generate “heat maps” of emergency asthma attacks. With the help of data analysts at IBM, public health officials compared the trends against a variety of potential causes — including air quality, pollen outbreaks and traffic congestion — to strategize interventions in the most at-risk areas. Today, the project continues. The city plans to deploy bike-mounted sensors to monitor air quality along routes that are frequented by children during the summer. Read more in "Health Data Isn't Just for Hospitals" by Stephen Goldsmith and "Monitoring Air Quality and the Impacts of Pollution" by Laura Adler, which discusses additional examples of sensor-based air quality monitoring.

Argonne National Laboratory and the Chicago Department of Innovation and Technology partnered to develop the Array of Things, a citywide network of sensors mounted on lampposts. Among other uses, these sensors track the presence of a number of air pollutants, including carbon monoxide, nitrogen dioxide, ozone, and particulate matter, with plans to monitor volatile organic compounds (VOCs) in the near future. Chicago has used this data to predict air quality incidents in order to take preventative action and has released data to the public via the city’s open data portal.

Which individuals and families are most at risk of returning to the homeless services system?

There is little longitudinal data on the long-term outcomes of individuals who have experienced homelessness, but in 2018 the Homeless Policy Research Institute (HPRI) published data that looked at different homelessness interventions and measured "the ability of these interventions to help individuals avoid returning to homelessness." Based at the University of Southern California, HPRI focused this study on homelessness in Los Angeles. The researchers identified several key factors that influenced whether individuals were likely to return to homelessness after engaging with an intervention (such as rapid rehousing or transitional housing).

  1. Relationships and strong social capital reduce episodes of homelessness. Additionally, financial and emotional support help provide significant stability.

  2. Age can also predict who returns to homelessness, with older heads of households more likely to reenter emergency shelters.

  3. Individuals who leave interventions under "imposed departure from services" show some of the greatest return rates (this can include those who refused to pay rent, exhibited behavioral issues, engaged in drug use, etc).

  4. Exiting the system into stable life and housing situations is obviously an important predictor, as is the history of homelessness. Individuals who have experienced homelessness multiple times are more likely to enter into the system again, as opposed to individuals who are experiencing their first instance of homelessness. 

  5. There is also evidence to suggest that homeless people of color have a greater risk of reentry compared to other racial/ethnic groups, but there is very little quantitative data to support the qualitative evidence.

How can we predict which families are most at risk of homelessness?

New York City's Department of Homeless Services (DHS) has partnered with academics to develop customized risk assessment tools that support caseworkers in determining the best approach for each client during the screening process. DHS is also exploring more proactive approaches, partnering with the SumAll Foundation to analyze data on eviction notices to predict which cases are most likely to result in homelessness. While the pilot is still being tested in specific neighborhoods, the analytics will eventually become part of the City’s data visualization project, which allows staff to visualize neighborhood data such as shelter entries and eviction filings and can also tell caseworkers which of the thousands of households or buildings on the map are actually most at risk of shelter entry. By tailoring outreach efforts and reducing barriers to access, DHS can provide more services to more at-risk families. Read more in "Data-Driven Strategies for Reducing Homelessness" by Lyell Sakaue.

Additionally, research has shown that nonbinary, transgender, and LGBTQ youth are among the demographics most likely to experience homelessness. Across the country, about 40 percent of unhoused youth identify as LGBTQ.

How can we improve homeless prevention programs by identifying youth most at-risk for homelessness?

Homelessness is a complicated issue, and while there are many contributing factors, a few are specific to youth and can help predict homelessness. It’s important to identify these and target prevention efforts to kids and youth who are part of these populations or exposed to known risk factors.

Meta-analyses have found that adults experiencing homelessness were more likely to have had adverse childhood experiences. There is also a high risk of homelessness among youth who were in the foster care system, who are also more likely to have had adverse childhood experiences and trauma. Additionally, youth who identify as LGBTQ+ are disproportionately represented in homeless populations. Family conflict during the early teen years (ages 13 to 15) was also a predictor of homelessness in young adulthood. These studies pinpoint people and situations toward which to direct homelessness prevention efforts.

How can we improve homeless prevention programs by identifying adults and families most at-risk for homelessness?

Homelessness is a complicated issue, and many factors intersect to affect it, such as lack of affordable housing, addiction, and trauma history. There are also different types of homelessness: someone may be unhoused temporarily, episodically, or chronically. Although there are many risk factors, researchers have found commonalities that could identify folks at risk of being unhoused and shape prevention efforts.

A nationally representative meta-analysis showed that the trauma of a separation from parents/caregivers during childhood, as well as "indicators of mental illness and problems with drugs," were strongly associated with homelessness in adults. Research shows that adult veterans who experience military sexual trauma are at significantly greater risk of postdeployment homelessness compared with other veterans. Additionally, people with disabilities (physical, intellectual, and developmental) are over-represented in unhoused populations. There is also evidence of influencing community factors; for example, high-cost housing markets with insufficient affordable housing have higher rates of homelessness. These studies pinpoint people and situations to directly target with homelessness prevention efforts.

What are intervention points and levers that cities can utilize to prevent evictions?

According to Marie Claire Tran-Leung, the evictions initiative project director at the National Housing Law Project, there are several points during the eviction process where city leaders can intervene, and the earlier, the better for tenants. Too often, officials are only notified of an eviction during the filing process. However, by working directly with residents and trusted advocacy organizations, collecting and processing data from sources like pretrial programs and utility companies, and acting as a convener for rental/housing stakeholders before evictions occur, city leaders can make a significant difference in the health, safety, and quality of life of residents.
  • Which public housing residents are most likely to be placed into employment?

  • Which city services have the greatest impact on reducing entry into homeless shelters?

  • Which client characteristics indicate that a client will leave a homeless shelter without a subsidy?

  • Which clients would benefit the most from housing services?

  • Which seniors in need of services and resources currently aren't receiving them?

How can we identify pregnant women at risk for having adverse births?

Data Science for Social Good partnered with the Illinois Department of Human Services to identify women at risk for having adverse births, which are associated with negative personal, financial, and developmental outcomes for both the mother and the baby. They identified risk factors, including stress, socioeconomic factors, substance use, quality of life indicators, healthcare access, and age, that could be used to predict which women are at high risk for adverse births. The state then provided targeted programs and assistance to the women identified as high-risk. Read more on the DSSG site.

How can education data be used to identify off-track students?

Data Science for Social Good and the Mesa Public School System are using Mesa's education data to identify students who are off-track for their future plans. Using students' classes, grades, test scores, and attendance, they can predict which students are college-ready but may not apply to college, may apply only to two-year programs, or may enroll but not graduate from college. These students can then be given extra support and resources to empower them to apply, graduate, and succeed. Read more on the DSSG site.

What factors contribute to youth obesity?

A Texas law requires public schools to record fitness data on every student. Through data-sharing agreements with the school districts, Austin-based nonprofit Children's Optimal Health (COH) gathers metrics on BMI and cardiovascular fitness scores that are geo-tagged with social and economic information. COH converts de-identified person-level data to aggregate neighborhood-level maps that illuminate the conditions faced by families and children in the area, all while protecting personal information. Enhanced with other datasets, these maps tell a more complex story of the factors that influence health outcomes — from proximity to fast food restaurants to the stress of high neighborhood crime rates. Read more in "Austin Targets Youth Obesity with Neighborhood-Level Data" by Stephen Goldsmith.

How can we increase kindergarten readiness?

When Bloomberg Philanthropies launched the Mayors Challenge, issuing a call for the most innovative proposals by cities across the United States, Providence, RI Mayor Angel Taveras seized the opportunity to seek out a new, creative solution to a serious issue in his city. After reviewing a number of ideas, he and his team ultimately developed the program now known as Providence Talks. The team discovered research that shows that high-income children hear an average of 30 million more words than their low-income peers in the first three years. The seminal study of language environment by Betty Hart and Todd Risley showed that the amount of conversation children had with their parents by age three was positively associated with their IQ scores at that age, along with a host of other positive outcomes. Providence decided to apply this important finding by creating a city-led effort to close the word gap using innovative technology: devices that can record and allow measurement of the auditory environment of children. A preliminary study demonstrated that sharing feedback reports generated by the recording devices led caregivers to increase the number of words spoken to their children by 55 percent. Simply having the information about the number of words their children were hearing inspired them to talk more, read more, and interact more with their children. The children’s increased exposure to words was the single greatest predictor of improved language skills and learning readiness before entering school. Read more in "Providence Talks: Progress on Closing the 'Word Gap.'"

Where is there the greatest risk of mosquito-borne illness?

West Nile virus, an ailment once rare and relatively unknown in the United States, is now an annual danger in many suburban communities. In Suffolk County, New York, a large suburban and rural county on Long Island, officials began seeing West Nile cases in the early 2000s. The county developed a model to assess the risk of outbreak using a combination of statistical methods and geographic information systems. Through modeling, they found relationships between human West Nile cases, landscape factors, population demographics, and weather patterns. Initial results showed a complex interaction between these factors and human cases of West Nile virus. Using this hot-spot analysis, Vector Control now targets larvicide efforts in established hot-spots and uses aerial adulticide spray only where quantitative evidence supports the use of pesticides. By being strategic in the use of analytics, the agency has saved time and money, while still providing a high level of public safety. Read more in "Predictive Tools for Public Safety" by Stephen Goldsmith.

Zika is another mosquito-borne illness that is making headlines in U.S. cities. In "How U.S. Cities Can Target Zika Risk," Jon Jay analyzed data from Miami and found relationships between vacancy, poverty, and Zika outbreaks. He applied that analysis to New Orleans and Houston, suggesting that vacant housing reduction strategies could help such at-risk cities tackle the challenge of Zika.

Regulation

  • Can we determine where unsafe housing problems are unlikely to be reported through 311?

  • How can we use analytics to prioritize accessibility inspections for building alterations, and make sure they are compliant with municipal building code and state accessibility requirements?

  • Who is most likely to be guilty of financial crimes and fraud?

  • How can inspectors reduce response time to maintenance complaints?

How can we determine what businesses will have over-occupancy issues, including multiple incidents of over-crowding?

To predict overcrowded properties, Los Angeles has combined data on demand for housing in a particular neighborhood (including house prices, education, incomes, and weather) with supply estimates based on land area, topographical constraints, and construction labor wages. In areas where demand appears to outstrip supply, overcrowding is more likely. A creative program in one London borough monitors sewage flow and compares expected waste output with observed waste, and also looks to trash left on the streets and calls to pest control as overcrowding indicators. New York City tracks landlord tax payments, noise violations, and complaints as potential predictors of overcrowding. Read more in “How Can Cities be Preemptive and Effective in Preventing Overcrowding?” by Nyasha Weinberg.
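
The London comparison of expected and observed waste output can be sketched as a simple ratio check. This is only an illustration: the per-person wastewater figure, the area names, the flow numbers, and the 1.5 threshold are hypothetical and do not come from the borough's program.

```python
# Hypothetical per-person wastewater output, in liters per day; a real program
# would calibrate this figure against local measurements.
LITERS_PER_PERSON_PER_DAY = 150

# Made-up areas with registered population and measured sewage flow.
areas = [
    {"area": "Area A", "registered_residents": 1000, "observed_liters_per_day": 155_000},
    {"area": "Area B", "registered_residents": 800, "observed_liters_per_day": 210_000},
]


def overcrowding_signals(areas, ratio_threshold=1.5):
    """Flag areas whose observed flow far exceeds what registered residents would produce."""
    flagged = []
    for a in areas:
        expected = a["registered_residents"] * LITERS_PER_PERSON_PER_DAY
        ratio = a["observed_liters_per_day"] / expected
        if ratio >= ratio_threshold:
            flagged.append((a["area"], round(ratio, 2)))
    return flagged


flagged_areas = overcrowding_signals(areas)
print(flagged_areas)
```

A high ratio is only a signal; it would presumably be weighed alongside the other indicators mentioned above, such as street trash and pest control calls, before anyone is sent to inspect.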

Which businesses are likely to be underpaying their taxes? How can we increase the productivity of auditors?

In New York City, a Finance Department auditing team decided to use analytics to increase the productivity of auditors reviewing companies thought to be underpaying their taxes. The commissioner instructed his department to use sophisticated data analytics to look for patterns, identifying individuals who had businesses similar to others but who stood out as outliers on taxes paid. In so doing, the team reduced the portion of audit cases closing without change from 37 percent to 22 percent over three years. This represents a 40 percent increase in productivity for the department and a 100 percent reduction of government intrusion for the thousands of companies that would have been catapulted into the audit process, with an end result of no change on their returns. Read more in "Making Data Matter in Administrative Systems" by Stephen Goldsmith.
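
One simple way to find businesses that resemble their peers but stand out on taxes paid is to compare each business's effective tax rate with the median for its sector. The sketch below assumes that approach; the sector labels, figures, and the 50 percent threshold are hypothetical, and the source does not say which method the Finance Department actually used.

```python
from collections import defaultdict
from statistics import median

# Made-up returns; "sector" stands in for whatever defines a peer group.
returns = [
    {"business": "A", "sector": "restaurant", "revenue": 500_000, "tax_paid": 20_000},
    {"business": "B", "sector": "restaurant", "revenue": 500_000, "tax_paid": 21_000},
    {"business": "C", "sector": "restaurant", "revenue": 500_000, "tax_paid": 5_000},
    {"business": "D", "sector": "restaurant", "revenue": 500_000, "tax_paid": 19_000},
    {"business": "E", "sector": "retail", "revenue": 1_000_000, "tax_paid": 50_000},
    {"business": "F", "sector": "retail", "revenue": 1_000_000, "tax_paid": 48_000},
    {"business": "G", "sector": "retail", "revenue": 1_000_000, "tax_paid": 52_000},
]


def flag_low_payers(returns, threshold=0.5):
    """Flag businesses paying less than `threshold` times their sector's median tax rate."""
    rates_by_sector = defaultdict(list)
    for r in returns:
        rates_by_sector[r["sector"]].append(r["tax_paid"] / r["revenue"])
    sector_median = {s: median(rates) for s, rates in rates_by_sector.items()}

    flagged = []
    for r in returns:
        rate = r["tax_paid"] / r["revenue"]
        if rate < threshold * sector_median[r["sector"]]:
            flagged.append(r["business"])
    return flagged


audit_candidates = flag_low_payers(returns)
print(audit_candidates)
```

Sending auditors to outliers first is what would push down the share of audits that close without change. For reference, the drop from 37 to 22 percent is a relative reduction of about 40 percent, since (37 - 22) / 37 ≈ 0.405.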

Where are blighted properties located?

Detroit, possibly the city hit hardest by blight and vacancies, is leading the way in attacking the problem with a data-driven approach. Detroit’s Blight Removal Task Force, deploying more than 200 people over 14 weeks, has successfully surveyed more than 99 percent of the city’s 380,217 properties. Information collected onsite, including photographs, lot characteristics, condition of structures and the owner, is sent wirelessly to the operations center, where it is checked while the team is still at the property. This information has helped identify candidates for demolition and areas of elevated safety concern. Read more in "Data Helps Calculate the True Costs of Blight" by Stephen Goldsmith.

Which properties cause the most problems?

In 2011, Boston created the Problem Properties Task Force, a cross-agency committee that works to address and preempt community disorder by identifying the city’s most risk-prone and risk-causing addresses. Each of the eight contributing agencies furnishes its existing datasets, which are consolidated with data from the Mayor’s 24-Hour Hotline and then processed by the City’s High-Performance Analytic Appliance (HANA). Each member then contributes his or her department’s records and incoming public complaints to complement the analytics engine results, generating a comprehensive picture of the city’s most problematic residences. Then, using these sources of intelligence, the task force determines the appropriate action to take. This could mean increasing police surveillance, expediting enforcement proceedings by the Air Pollution Control Commission, levying charges to recoup public costs, or commencing foreclosure proceedings against a property owner with delinquent real estate taxes. Read more in "Problem Properties: A Preemptive Strategy Toward Neighborhood Stability" by Craig Campbell.

Can cities “right-size” fines and fees to get minimal delinquency and maximum income?

Yes. Lowering fines and fees leads to increased collection, according to research from the Government Finance Officers Association, the nonprofit SERVUS, and the Chicago Booth School of Business. Data shows that this method, referred to as segmenting, also increases equity, as fines and fees disproportionately affect low-income residents and communities of color. In the Data-Smart article "The Problem With One-Size-Fits-All Fines and Fees," Professor Goldsmith discusses this research and outlines the economic argument that government fines and fees operate on a demand curve.
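
To make the demand-curve argument concrete, the sketch below assumes a purely hypothetical linear relationship in which the chance that a fine is paid falls as the fine rises. Neither the functional form nor the dollar amounts come from the cited research; the point is only that expected collections can peak well below the highest fine.

```python
def payment_probability(fine, max_fine=200.0):
    """Hypothetical linear demand curve: nobody pays at max_fine, everybody pays at 0."""
    return max(0.0, 1.0 - fine / max_fine)


def expected_revenue(fine):
    """Expected collection per fine issued, given the assumed payment probability."""
    return fine * payment_probability(fine)


# Search a grid of candidate fine amounts for the one that collects the most.
candidate_fines = range(0, 201, 10)
best_fine = max(candidate_fines, key=expected_revenue)
print(best_fine, expected_revenue(best_fine))
```

Under segmenting, a city would in principle estimate a separate curve for each group of payers rather than relying on a single citywide one.
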
  • How can we prioritize annual elevator safety inspections? For example, can we predict or identify which elevators pass every year and could be outsourced to a third party?

  • Which construction / renovation projects are the highest risk / should be inspected first?

  • Which equipment (such as boilers, elevators, cranes, vehicles, etc.) is the highest risk / should be inspected first?

  • What variables affect inspector productivity and which can be most easily influenced? What distinctions can be made between inspectors who complete a high number of inspections and those who are at the bottom end?

  • Based on the relationship between inspections and violations, what building inspection regimens are most effective at preventing violations from occurring?

  • How many inter-agency inspections are conducted each year? Do they effectively detect current violations?

  • Which city debts are least likely to be paid?

  • Which taxpayers are least likely to pay?

Where is illegal grease disposal likely to be occurring?

With assistance from the Mayor’s Office of Analytics, which ran a hotspot analysis, New York City's Business Integrity Commission cross-referenced industry data on grease production with restaurant permit data and sewer back-up data from the Departments of Health (DOH) and Environmental Protection (DEP) to better target enforcement and predict illegal activity. Since launching this partnership effort with DEP in the fall of 2012, the commission has increased violations by 30% while reducing the manpower dedicated to grease enforcement by 60%. Read more in "Enforcement and Data" by Shari Hyman.
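
The cross-referencing step might look something like the following sketch, which flags permitted restaurants that have no record of licensed grease pickup and sit in a zone with repeated sewer back-ups. The dataset shapes, names, zones, and the back-up threshold are hypothetical; the source does not describe the commission's actual data model.

```python
# Made-up permit records, each tagged with a sewer zone.
restaurants = [
    {"name": "Diner One", "zone": "Z1"},
    {"name": "Cafe Two", "zone": "Z1"},
    {"name": "Grill Three", "zone": "Z2"},
]

# Hypothetical set of restaurants that appear in licensed grease hauler records.
hauler_customers = {"Cafe Two"}

# Hypothetical count of sewer back-ups reported per zone.
sewer_backups = {"Z1": 7, "Z2": 0}


def likely_illegal_disposal(restaurants, hauler_customers, sewer_backups, min_backups=3):
    """Return restaurants with no hauler record that sit in a zone with many back-ups."""
    return [
        r["name"]
        for r in restaurants
        if r["name"] not in hauler_customers
        and sewer_backups.get(r["zone"], 0) >= min_backups
    ]


targets = likely_illegal_disposal(restaurants, hauler_customers, sewer_backups)
print(targets)
```

Narrowing the field this way is how enforcement could find more violations with fewer inspectors, which is consistent with the results reported above.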

Where are there unreported incidents of food poisoning?

Diners who suffer food poisoning rarely report it through official channels, even though foodborne illness is a public health concern. However, sick, unhappy customers have incentive to vent their complaints on Yelp, a popular app and website for local business reviews. New York City’s Department of Health and Mental Hygiene recently completed a pilot project in partnership with the company aimed at identifying unreported outbreaks of foodborne illness. Working with software developers at Columbia University, city researchers converted nearly nine months of Yelp reviews into machine-readable data. They were then able to pinpoint potentially hazardous establishments by reviews that included terms such as “sick,” “vomit” or “food poisoning.” Scanning 294,000 restaurant reviews in New York, the software flagged three restaurants that together produced 16 documented illnesses. When health inspectors subsequently visited these establishments, they discovered astonishing health code violations: improperly sanitized surfaces and bare-hand contact with ready-to-eat food at the first two, and live roaches and evidence of mice at the third. Read more in "How Social Media Listening Can Improve Public Health" by Stephen Goldsmith.
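
A minimal keyword-flagging sketch, using the three terms quoted above, is shown here. The sample reviews, restaurant names, and the two-review threshold are made up, and the pilot's actual software is not described in enough detail in the source to reproduce.

```python
import re
from collections import Counter

# Terms quoted in the text; matching is case-insensitive and also catches "vomiting".
KEYWORDS = ("sick", "vomit", "food poisoning")
PATTERN = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)

# Made-up (restaurant, review text) pairs.
reviews = [
    ("Restaurant A", "Great pasta, friendly staff."),
    ("Restaurant B", "I got sick a few hours after eating here."),
    ("Restaurant B", "Pretty sure this was food poisoning."),
    ("Restaurant C", "Sick beats from the DJ, fun night!"),
]


def flag_restaurants(reviews, min_hits=2):
    """Return restaurants with at least `min_hits` reviews containing a keyword."""
    hits = Counter(name for name, text in reviews if PATTERN.search(text))
    return sorted(name for name, count in hits.items() if count >= min_hits)


flagged = flag_restaurants(reviews)
print(flagged)
```

Plain keyword matching produces false positives, as the "Restaurant C" review shows, so flagged establishments still need human review and an inspector's visit.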

Which restaurants are most likely to have code violations?

In a recently completed pilot program, Chicago used analytics to improve the process by which health inspectors identify "critical violations" in food establishments, usually related to improper food temperature. Here's how it worked: the city processed relevant data to identify predictive variables associated with violations, developed a model, ran a simulation, and then used the resulting forecast to allocate inspections in a way that prioritized likely violators. This data-optimized trial method sped up the identification of critical violations by seven days, meaning that restaurant patrons are that much less likely to contract a foodborne illness. Read more in "Chicago's Data-Powered Recipe for Food Safety" by Stephen Goldsmith.
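To see why ordering inspections by forecast risk finds violations sooner, consider the toy sketch below. It assumes a model has already produced a risk score for each establishment; the scores, names, and the one-inspection-per-day schedule are invented for illustration and do not reflect Chicago's actual model or data.

```python
def prioritize(establishments):
    """Order establishments so the highest predicted risk is inspected first."""
    return sorted(establishments, key=lambda e: e["risk"], reverse=True)


def mean_days_to_violation(schedule):
    """Average day on which actual violators are reached, at one inspection per day."""
    days = [day for day, e in enumerate(schedule, start=1) if e["violation"]]
    return sum(days) / len(days)


# Hypothetical establishments: "risk" is an assumed model output,
# "violation" is what an inspector would actually find.
establishments = [
    {"name": "A", "risk": 0.10, "violation": False},
    {"name": "B", "risk": 0.80, "violation": True},
    {"name": "C", "risk": 0.30, "violation": False},
    {"name": "D", "risk": 0.65, "violation": True},
]

baseline = mean_days_to_violation(establishments)               # original order
optimized = mean_days_to_violation(prioritize(establishments))  # risk-ranked order
print(baseline, optimized)
```

In this toy schedule the violators are reached on days 1 and 2 instead of days 2 and 4, which is the same kind of head start the pilot measured at city scale.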

How can data identify illegal tree removal?

Washington, D.C.'s Urban Forestry Administration used lidar data to identify illegal tree removal based on the original height of the trees. This vastly improved enforcement of permitting laws. Read more in "How D.C. Grew a Data-Driven Tree Strategy" by Stephen Goldsmith.

  • What city blocks need more inspection enforcement?

  • Which businesses are most likely to be violating weights and measures?

  • How can we tap social media for information on illegal businesses?

  • What property owners, architects, developers, businesses, and landlords need more regulatory enforcement?

  • How can we use social media to ensure licensees are conducting legal business?

  • How can we target stores that sell outdated food or expired baby formula?

  • Does the order of inspections (building, health, or fire) increase the rate of violation?

How can housing violations data be used to reduce blight?

The negative effects of blight and building vacancies can spread through an area in a city quickly, emphasizing the need for proactive data-driven strategies. A team of data scientists at the University of Chicago's Center for Data Science and Public Policy (DSaPP) worked with the Cincinnati Department of Buildings and Inspections to develop a predictive model that allows for early intervention by building inspectors at homes and properties most at risk of vacancy or violations. The predictive models the team at DSaPP developed combined data about home values, fire, crime, tax, census, and water shutoff information with historical inspection data to develop a list of properties prioritized by their need for inspection. The logic is that the earlier an inspector can visit a property likely to be in violation of city code, the earlier problems can be addressed, and the more likely it will be that the property is fixed as opposed to abandoned. DSaPP’s blog post detailing the project says that the traditional method, in which property inspections are triggered by citizen complaints, leads to a violation being found in 53 percent of cases. Initial results from 2015 show that using the predictive model increases the likelihood of finding a building code violation at a given property to 78 percent.

How can we encourage citizens to voluntarily comply with housing codes?

In order to address urban blight, New Orleans has used behavioral science to improve voluntary homeowner compliance. The city’s Office of Performance and Accountability (OPA) tested a new step in the code violation process whereby the city sends homeowners a letter stating that a 311 complaint has been made about their property. In the pilot, the letter resulted in a 6 percent decrease in observed violations and the city has now fully implemented the policy in the Code Enforcement Department. Read more in Katherine Hillenbrand’s piece “New Orleans Brings Data-Driven Tools to Blight Remediation.”

How do we determine the best course of action for blighted properties?

New Orleans noticed that using the traditional system for making decisions about what to do with blighted properties—tasking a single director with working through the complex list of factors and deciding the fate of homes—created a significant backlog of properties awaiting decision. In order to streamline the process, the city’s Office of Performance and Accountability (OPA) partnered with data science startup Enigma to create a machine learning algorithm called the Blight Scorecard that assigns properties a score from 0-100 to aid in the decision making process. To create the tool, the OPA manually scored over 600 test case properties on a number of factors, honed the list of criteria and determined how they affected the outcome, and then tested algorithms to see if a model could be trained on this data. Read more in “New Orleans Brings Data-Driven Tools to Blight Remediation” by Katherine Hillenbrand.
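As a rough illustration of how factor ratings might roll up into a 0-100 score, the sketch below uses a fixed weighted sum. This stands in for the trained model described above and is not the Blight Scorecard itself: the factor names, the weights, and the `blight_score` function are hypothetical, since the text does not list the actual criteria.

```python
# Hypothetical factors and weights (they sum to 1.0); the real criteria
# and their learned influence are not given in the text.
WEIGHTS = {
    "structural_damage": 0.5,
    "years_vacant": 0.3,
    "open_violations": 0.2,
}


def blight_score(ratings):
    """Combine factor ratings (each between 0.0 and 1.0) into a 0-100 score."""
    total = sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)
    return round(100 * total)


# One hypothetical property, rated by hand on each factor.
property_ratings = {
    "structural_damage": 0.8,
    "years_vacant": 0.5,
    "open_violations": 1.0,
}
score = blight_score(property_ratings)
print(score)
```

A learned model would replace the hand-picked weights with ones fitted to the manually scored test cases, but the output plays the same role: a single number that supports, rather than replaces, the decision maker.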

How can technology remove barriers to starting and maintaining small businesses?

In Miami, new business owners no longer have to endure the months-long process of in-person permitting and applications in order to found a new business. The city’s eStart initiative launched in January 2021 and is an all-digital tool that helps current and future business owners do things like apply for licenses and fill out zoning compliance forms. Users can do all of this from their phones, streamlining the process for business owners and government offices that usually dealt with duplicative forms.

Can we predict which stores sell cigarettes to youth?

A 2018 study published in JAMA Pediatrics found that retail sales of tobacco to minors are more prevalent than previously thought. Stores near schools are prime locations where youth are exposed to cigarette advertisements, and many cities are regulating the sale of tobacco products near school entrances. Additionally, certain pharmacies and convenience stores near gas stations have been cited by the FDA for the illegal sale and distribution of tobacco products to minors.

Can we use historical 911 and 311 call volumes to adequately staff and schedule their call floors?

In Tulsa, OK, the Urban Data Pioneers program is an inventive community engagement strategy that recruits community members to investigate and address pressing issues using public data. One of the most successful civic tech projects looked at data on 911 calls to help determine the highest call volume times. From this research, the call center optimized staffing by scheduling to meet the highest call demand times.
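The core of this kind of analysis is a count of calls by hour of day. The sketch below shows one minimal way to do that; the timestamps are made up, and the `calls_per_hour` and `peak_hours` helpers are hypothetical names rather than anything from the Tulsa project.

```python
from collections import Counter
from datetime import datetime


def calls_per_hour(timestamps):
    """Count calls by hour of day (0-23) from ISO 8601 timestamp strings."""
    return Counter(datetime.fromisoformat(ts).hour for ts in timestamps)


def peak_hours(timestamps, top_n=2):
    """Return the top_n busiest hours, busiest first."""
    return [hour for hour, _ in calls_per_hour(timestamps).most_common(top_n)]


# Hypothetical call log for illustration.
sample = [
    "2023-05-01T08:15:00",
    "2023-05-01T17:05:00",
    "2023-05-01T17:40:00",
    "2023-05-01T17:55:00",
    "2023-05-01T22:10:00",
    "2023-05-01T22:30:00",
]
busiest = peak_hours(sample)
print(busiest)
```

With a real call log, the same counts could be split by weekday or season before being compared against the staffing schedule.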

Which buildings are the highest risk/should be inspected first?

Through a partnership between the City of Atlanta and the Data Science for Social Good (Atlanta) program, a research team worked with the Atlanta Fire Rescue Department (AFRD) to build Firebird, a framework for identifying and prioritizing fire inspections at commercial properties. Firebird determines fire risk scores for city properties, using "machine learning, geocoding, and information visualization". The predictive risk model used available AFRD data, new data on previously uncounted commercial properties, and historic fire and inspection data. This data-driven approach helps local governments identify risky properties, manage fire department resources, and mitigate commercial property hazards. 

Mobility

  • Which indicators can help to identify areas with the greatest amount of idling?

  • How can we reduce accidents involving city vehicles? Where and when do most accidents involving city vehicles occur?

  • How should cities prepare for autonomous vehicles?

How can cities best regulate micro-mobility?

According to Keith Chen, Associate Professor of Behavioral Economics at the University of California, Los Angeles, cities should take a behaviorally-smart approach to regulating micro-mobility. Instead of broad regulations or outright bans, Chen advocates that cities make decisions that are informed by the behaviors of their residents. Cities should also utilize dynamic caps on micro-mobility vehicles and find motivating incentives for companies to align with city goals.

What is the best way to regulate curb space?

Promising results from multiple 2020 pilot studies show that data can help smooth curb usage and regulate zones and pricing. Each study was slightly different, as they were tailored to the size, existing infrastructure, and flow of each city. However, each city was able to collect data that guided when and where to open loading docks, how to accommodate residential deliveries, and how to use dynamic pricing to adjust congestion and traffic. In order to regulate the curbs, pilots like these must be conducted so that city officials have data to guide their initial decisions. Continued data collection will be important, to ensure that there aren’t adverse effects and to tweak the dynamic pricing and deliveries.

Which intersections are likely to be blocked, and when?

Intelligent transport systems (ITS) are used worldwide to predict traffic and increase safety. Intersections are one of the most difficult places to manage with ITS, since they involve so many different types of mobility (walking, biking, driving, scooting, etc.). Researchers in China and Australia developed a new approach to predicting intersection traffic using tree models. Historic and recurring events needed to be combined with "real-time spatio-temporal information related to the nonrecurring events" like construction or accidents. Using real-world data sets from Melbourne, Australia, the researchers were able to show that their new methods were better at predicting intersection traffic than older models.

How can we predict the effects of service interruptions and other disruptions on transit systems?

Back in 2011, Transport for London, the transit agency behind the London Tube, collaborated with a group from University College London to study the daily operations of the subway system via the familiar Oyster fare card. The result was a paper detailing how the commuting patterns of individuals coalesce into a massive, crowded network of movement on the Tube, resulting in congestion and strain at important system hubs. The smart card Oyster system allows researchers to collect data on the journeys of individual travelers (with the assurance of safeguards to protect customer privacy) to understand the complex dance of the metro system. The data visualization that the study produced allows researchers to model the effects of various situations on congestion patterns; now planners can determine exactly what would happen if mechanical failures were to slow trains on a particular line or cause other service problems. Read more in "Streamlining the London Tube with Data" by Nick Carney.

How can we reduce the number of traffic accidents?

A number of US cities have implemented the Vision Zero initiative, a campaign that aims to eliminate traffic fatalities through education, enforcement, and engineering. As a part of the initiative, San Jose partnered with the Department of Transportation (DOT) and used data analytics and GIS analysis to identify 14 priority corridors where most major injury accidents occur. In response, the city has adjusted deployment times for Traffic Enforcement Unit (TEU) officers to align more closely with peak times of traffic collisions and provided them with GIS maps of high-incident intersections. Read more in “San Jose Improves Traffic Safety With Data” by Kevin Miller.

What is the effect of discounted fares for low-income transit riders?

A research study from the Massachusetts Institute of Technology compared travel behaviors for a control group and a group that received 50 percent reduced travel passes. The preliminary results showed that the group receiving the discounted cards made about 30 percent more trips than the control group, and they took more trips for the purposes of health care and social services.

Can cities nudge residents to more sustainable modes of travel?

Since the transportation sector is the largest producer of greenhouse gases in the United States, many cities are interested in using behavioral economics to encourage sustainable modes of transit like bikes and subways. Nudges are non-intrusive interventions that guide people toward a desirable action, and in this article there are several examples of effective nudges that cities around the world are using to "green" their transportation sector.

How can cities drive parking innovation?

Drivers in Austin, Texas are the first in the country to have the option of using their car’s in-dashboard software to complete digital parking transactions. Not only is this more convenient for residents, it also allows the city to adjust rates based on usage and manage curb space and commercial zoning.

How can cities prioritize micromobility privacy?

A group of micromobility advocates, providers, academics, nonprofit leaders, and government officials recently released a new set of micromobility transportation privacy principles. The group spent over a year creating these Privacy Principles for Mobility Data to offer guidance to cities around privacy, transparency, and data-sharing.

What is the highway of the future?

In Georgia there is a corridor of incredible transportation innovation that is addressing sustainability, connectivity, safety, and driving technology through a creative partnership between the state department of transportation and nonprofit The Ray. This novel approach is a national model for intelligent infrastructure; learn more about this work through this interview on the Data-Smart City Pod and in this corresponding article.

How can city vehicles reduce idling and resultant pollution?

Many city vehicles are light- and medium-duty fleet vehicles, work trucks, and emergency vehicles. There are many ways cities can reduce idling, particularly with new technology. The first is by deploying hybrid vehicles, as a gas-electric hybrid does not produce exhaust when idling in queues, waiting for parking spaces, or while warming up in cold weather. Further options include retrofitting vehicles with heat recovery systems or battery power packs that allow drivers to use lights, heat, and equipment without running the engine. As an added bonus, many of these methods also reduce lifetime costs.

How can cities effectively rework procurement practices?

Procuring goods and services is a large piece of government work, and often a large piece of local budgets. Yet according to Leah Tivoli, Seattle’s director of innovation and performance, “no one really pays attention to procurement.” To address the lack of improvement and innovation in this space, the city of Seattle collaborated with the Government Performance Lab at the Harvard Kennedy School to revamp its procurement process. By using data and dashboards to identify inefficiencies, streamlining request templates for proposals and bids, and providing support for employees, Seattle has become a model for other cities.

Infrastructure

How can sensors help reduce water leaks and detect bursts?

Internet of Things technology is helping cities monitor water flow to optimize water pumping and reduce the amount of water lost to leakage. Read more in "Come Drought or High Water" by Laura Adler.

Is there a relationship between "greened" infrastructure and health?

Data from a study in Philadelphia found a positive impact from the greening of urban environments. Researchers recruited test subjects to wear heart rate monitors and walk around two clusters of vacant lots; some of the lots had received a greening treatment courtesy of the Philadelphia Horticulture Society. The other lots were the control. After reviewing the GPS and heart rate data, the researchers found that walking near the greened lots decreased subjects’ heart rates, as opposed to walking in the control lots. This study helps policymakers understand the ways in which housing and public health are intertwined, and how fairly simple infrastructure improvements can affect public health determinants.

How can we predict major bridge problems before they happen?

Time, weather, and deferred maintenance have not been kind to many of New York City's East River crossings. The Brooklyn Bridge, an engineering marvel of its time, shows its age through the cracks in the masonry vaults that support the bridge's roadway over Manhattan. Fiber-optic sensors monitor these cracks, as well as other indicators such as temperature fluctuation, to assist structural engineers in determining when the vaults will ultimately need to be replaced. Further up the East River, on the Williamsburg Bridge, a series of interferometric and fiber Bragg grating sensors (both capable of measuring light waves) monitor wire deformation and breakage on the span's century-old suspension cables. Rather than make an annual manual inspection, engineers have access to continuous data, which can tell them if an individual strand in one of the bridge's cables is about to break. Read more in "How to Save America's Crumbling Bridges" by Stephen Goldsmith.

How can we improve utility inspections?

The Santa Clara Valley Water District in Santa Clara County, CA manages a network of natural and man-made infrastructure that supplies 1.8 million residents with water. In an effort to go paperless, district field staff was armed with GIS tablets to survey waterway infrastructure, cataloging and assessing the condition of levees and other assets. These data are now fed back into the district’s asset management software, allowing the agency to not only see infrastructure conditions but to make smart decisions about future investments. According to Esri, more than 4,000 paperless inspections have been processed since 2012. Read more in "Open Data's Road to Better Transit," by Stephen Goldsmith.

How can we prioritize tree trimmings and removals?

The City of Baltimore has one of the most highly regarded urban forests in the United States. But maintaining the net tree canopy, and the myriad positive side effects of urban greenery, requires a sophisticated understanding of tree location, health, and ownership. Old satellite imagery was imprecise and couldn't be easily mapped with other city data, but in 2006 the US Forest Service worked with the University of Vermont’s spatial analysis lab to create a novel land cover map of Baltimore. This new map combined aerial imagery, high-resolution landowner data, and light-reflecting technology to provide a closer view of the city's trees. Scientists and policy-makers can now overlay other computerized records – like data on health, crime, and property ownership – with the urban forest maps to guide tree maintenance, removal, and re-planting.

How should city repairs be ranked for service?

City leaders are faced with a seemingly endless list of repairs and improvements, and deciding which to fix first is a complicated decision. For instance, sometimes 311 calls can be answered in a way that reinforces subconscious or systemic biases. In San Diego, the chief innovation officer developed an algorithm that assigns a prioritization score to each broken streetlight, so that lights near schools and high traffic areas are fixed first, which also prevents bias from creeping into the decision making.

  • How can we reduce outage rates for agency fleets without purchasing new vehicles? Under what circumstances do outages take place most frequently?

  • How can we better predict where the next street light cable failure will be?

  • Can we predict what areas have more open hydrants?

  • Where should snow removal happen first?

  • What are current refuse locations in the city, and which receive the highest amount of complaints?

How can we prevent sewer overflows and flooding?

The Green City, Clean Waters program in Philadelphia is a city-wide low-impact development approach to mitigating the city’s combined sewer overflows (CSO). The program integrates very low-tech interventions like rain barrels and street trees with very high-tech data collection and analysis. The city is addressing a problem faced by many others that began before the 1950s: a CSO system wherein there is no physical division between stormwater and the sewer system responsible for wastewater coming from homes and businesses. This means when a big storm rolls in, the system becomes overwhelmed, stormwater and wastewater mix, and a toxic effluent is discharged into waterways, degrading the environmental quality for both plants and animals as well as citizens who may live or recreate near these rivers and streams. Luckily, big data analysis and a profusion of sensors spread within the city’s sewer system provide this vital piece of the puzzle, lending some big-technology insights to what is a purposefully low-tech, low-impact approach to attacking the CSO problem. Since the program’s conception, an extensive and quantitative evaluation plan has been in place. Philadelphia pulls data from sensors throughout the system (originally purposed just to warn departments and citizens of overflows) to see if the approach is really working, and also conducts health quality tests in various bodies of water to check if there is a substantive long-term impact. Real cost comparisons can be made between different elements of the program, allowing the city to adjust its plans over time and maximize the returns of each program dollar spent. Read more in "Low-Tech Solutions Meet Data Analytics in Philadelphia's CSO Approach" by Benjamin Weinryb Grohsgal.

Elsewhere, Chicago has recently begun an initiative to use sensors to manage stormwater. In a pilot project, underground sensors in test areas are collecting data on stormwater runoff to enable targeted deployment of green infrastructure. Read more in "How a Smart City Tackles Rainfall" by Sean Thornton.

How can city infrastructure be more inclusive?

Typically, infrastructure is built for able-bodied people, but smart infrastructure and data insights can help city planners and officials build for residents who are differently-abled and/or have mobility challenges. This can include building an "age-friendly smart city," or using unbiased data to build an innovative and inclusive city for all. 

Where are potholes located?

Boston's Mayor's Office of New Urban Mechanics created a crowd-sourcing mobile app called Street Bump that helps residents improve their neighborhood streets by collecting road condition data while they drive. With Street Bump, residents' phones can report rough stretches of road to the City automatically as residents drive over them, providing the City of Boston with a useful and cost-effective way of identifying which of its streets need work. Read more in "Beyond 311" by Stephen Goldsmith.

How can data guide electric vehicle infrastructure?

Electric vehicles, or EVs, are increasing in popularity in the United States, but one hesitation for potential consumers is the lack of charging infrastructure. Recent research from Ohio State University is helping local government leaders make informed decisions about EV infrastructure, using data to guide where charging stations should be located, based on route data from the Ohio metropolitan area.

How can we better predict where the next major pavement failure will be?

Cities have begun using digital monitoring techniques for roads in order to enable more frequent and accurate assessment of infrastructure quality. Some, like Cincinnati, have pursued vehicle-based monitoring, equipping vehicles with cameras, lasers, and sensors that identify road surface issues and create georeferenced images. Other cities, including Boston, have turned to smartphone apps that use devices’ cameras and accelerometers to track road quality. These innovations allow cities to identify road quality issues and target interventions before these problems become major pavement failures. Read more in “Sensors and Smartphones: Technological Solutions for Monitoring Road Conditions” by Paul Lillehaugen.
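The accelerometer approach can be illustrated with a simple threshold test. This is a minimal sketch, not the method used by any of the apps mentioned: it assumes each reading is a hypothetical `(latitude, longitude, vertical_acceleration)` tuple with gravity already removed, and the 4.0 m/s² threshold is an arbitrary illustrative value.

```python
def detect_bumps(readings, threshold=4.0):
    """Return the (lat, lon) of readings whose vertical jolt exceeds the threshold.

    readings: iterable of (lat, lon, vertical_acceleration_m_s2) tuples,
    an assumed format with gravity already subtracted.
    """
    return [(lat, lon) for lat, lon, z in readings if abs(z) > threshold]


# Hypothetical trip log for illustration.
trip = [
    (42.3601, -71.0589, 0.3),
    (42.3602, -71.0590, -5.2),  # sharp drop, e.g. a pothole
    (42.3603, -71.0591, 0.8),
    (42.3604, -71.0592, 6.1),   # sharp rise, e.g. a raised plate
]
bumps = detect_bumps(trip)
print(bumps)
```

A single jolt could be a speed bump or a dropped phone, so a practical system would look for locations reported repeatedly by different vehicles before dispatching a crew.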

How can analytics improve routes for garbage trucks and other city vehicles?

UPS has a program in place called ORION (the On-Road Integrated Optimization and Navigation program), a sophisticated algorithm that ensures UPS vehicles take the most time- and energy-efficient routes. The program saves the company millions of miles each year, which adds up to hundreds of millions in savings and extreme reductions in CO2 emissions. The key to the program’s success is combining computational cost-cutting with functional, driver-friendly goals, like maintaining consistent routes and delivery times. This program provides a viable model for cities looking to improve routes for garbage trucks or other city vehicles. Cities including Boston, Philadelphia, and Raleigh, N.C. have begun taking steps towards route optimization for trash trucks, deploying smart bins that notify the appropriate agency when full to inform routes. Pairing this data with information on how busy a street is at a given time of day, how much a given neighborhood recycles, or where the trash goes after it's collected could make routes much more efficient. Read more in Stephen Goldsmith’s article “What a Brown Delivery Truck Could Teach Government.”
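To make the smart-bin idea concrete, the sketch below filters bins by reported fill level and then orders the pickups with a greedy nearest-neighbor rule. This is a deliberately simple heuristic chosen for illustration, not ORION's algorithm or any city's routing system; the bin names, grid coordinates, straight-line distances, and the 0.75 fill threshold are all assumptions.

```python
import math


def nearest_neighbor_route(depot, stops):
    """Greedy route: from the current position, always drive to the closest unvisited stop."""
    route, current, remaining = [], depot, dict(stops)
    while remaining:
        name = min(remaining, key=lambda n: math.dist(current, remaining[n]))
        current = remaining.pop(name)
        route.append(name)
    return route


# Hypothetical bins on a simple x/y grid, with sensor-reported fill levels.
bins = {
    "bin_1": {"xy": (0, 5), "fill": 0.90},
    "bin_2": {"xy": (1, 1), "fill": 0.20},
    "bin_3": {"xy": (4, 0), "fill": 0.80},
    "bin_4": {"xy": (0, 2), "fill": 0.95},
}
FULL = 0.75  # assumed threshold for "needs pickup"

# Only bins reporting as full are added to today's route.
full_bins = {name: b["xy"] for name, b in bins.items() if b["fill"] >= FULL}
route = nearest_neighbor_route((0, 0), full_bins)
print(route)
```

Nearest-neighbor routing is easy to reason about but rarely optimal; the point here is only that skipping empty bins shrinks the problem before any optimizer runs.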

Can infrastructure be upgraded to withstand climate change?

Often called "disaster hardening," the process of improving or repairing infrastructure - in a timely manner - helps prevent catastrophic damage from major storms, droughts, or other natural disasters that are intensified by climate change. This is common in places like the Netherlands, where flood risks are mitigated through advance preparation and care for infrastructure, rather than by repairing dams and dikes only once major damage has occurred. Billion-dollar climate change and weather disasters are costly for state, local, and federal governments, as outlined by the National Centers for Environmental Information. Timely infrastructure updates, a focus on "green" building, and conducting maintenance instead of deferring it will create job opportunities as well as protect people and places from serious future damage.

How can data help mitigate stormwater runoff?

Washington, D.C.'s Urban Forestry Administration is exploring a model combining lidar and elevation data to find the best places to strategically plant trees to mitigate stormwater. Read more in "How D.C. Grew a Data-Driven Tree Strategy" by Stephen Goldsmith.

Under what circumstances do residents throw recyclables in the trash instead of in recycling bins?

Many people who don’t recycle require additional information and messaging about the importance of recycling, proper recycling hygiene, and the importance of disposing of recyclables properly. Some information about recycling is universal, but many municipalities and states have their own specific regulations and methods; it is imperative that local and state governments have consistent, clear messaging so that residents are not only educated about recycling but are able to recycle correctly. The C40 has a helpful guide for cities on boosting recycling rates, which covers the importance of messaging but also touches upon utilizing data to set recycling goals, implementing things like door-to-door collection or material drop-offs, and incentivizing residents.

What lessons can be learned from climate vulnerable cities?

Miami is regarded as ground zero for climate change and one of the most vulnerable coastal cities in the United States. Both city and county leaders are using data to guide interventions, investments, and innovations. Learn more about how Miami local government relies on tools like sensors and data sharing in this article.

What are the benefits of urban forestry?

In addition to better air quality and temperature regulation, there are also broader benefits around infrastructure management/preservation, community engagement, and mental health improvements. Similar to the recent addition of chief heat officers, some cities are hiring resiliency or forestry positions to develop a whole-of-government approach to urban arboriculture.

In Tucson, AZ the chief resiliency officer manages a far-reaching urban forestry project that includes planting one million trees, developing a youth land stewardship initiative, and engaging residents in green space planning. And in Chattanooga, TN, city officials have adopted a GIS- and data-driven approach to greening the city, through the re-established parks department and urban forestry program.

Public Safety

How can we preempt youth violent crime?

In 2013, San Francisco began operating a real-time, web-based case management system across the Departments of Public Health (DPH), Juvenile Probation (JPD), and the Human Services Agency (HSA) to systematically identify at-risk youth who were clients of multiple city social services. Together, these agencies found that “crossover clients” of multiple systems were at strikingly increased risk of committing a serious crime. Fifty-one percent of San Franciscans involved in multiple service systems were convicted of a serious crime; a third had been served by all three agencies; and the overwhelming majority (88 percent) of these youth committed the crime more than 90 days after becoming a crossover client – a critical window during which, the analysis suggested, case workers may be able to intervene. Read more in "Getting Data to the Good Guys" by Christopher Kingsley and Stephen Goldsmith.

How can we prevent youth court-involvement?

Alicia Sasser Modestino has spent years studying the impacts of Summer Youth Employment Programs. Her research in Boston found a “30 percent reduction in property crime." This was determined by using arraignment data, and backed up by a similar result in Chicago. According to Modestino, participating in a youth summer job program "reduces the number of arraignments per youth.”

Modestino also had data for a subgroup of youth who were already court-involved, and there were even bigger improvements for that group. The reduction of violent and property crimes was “more like a 50 percent” and there was “a significant improvement in terms of recidivism — they're much less likely to commit future crimes.” 

How can we prevent violent crime?

New Orleans' NOLA for Life campaign analyzes data to determine likelihood of homicide, then targets its campaign components specifically at four neighborhoods where 40 percent of the city's homicides occur despite being home to just 19 percent of New Orleans' residents. And on an even more granular level, the campaign has sought to identify 200 New Orleans students who are most at risk for violence, with the goal of involving them in preventive programs. Read more in "How New Orleans is Winning a War Against Murder" by Stephen Goldsmith. 

How can data be used to prevent excessive use of force by police?

Working with the Charlotte-Mecklenburg Police Department in North Carolina in 2015, data scientists from the University of Chicago’s Center for Data Science and Public Policy (DSaPP) developed an Early Intervention System (EIS) to identify individual officers most at risk for adverse interactions with citizens. The team built a machine learning model that creates a ranked list of officers based on historical data and situational factors that police departments can use to target individual officers for specialized training. This provides a more effective measure for prevention against excessive use of force than simply instituting broad training programs. The DSaPP paper on this machine learning model describes a ranking of past use of force events in historical data with the designations “not justified, preventable, and sustained.” This data serves as a historical basis for identifying past officers at risk for future adverse incidents and is combined with other data on dispatch events, criminal complaints, citations, traffic stops, and arrests. The machine learning model “significantly outperformed” the existing system within the Charlotte-Mecklenburg Police Department used for targeting counseling and training. The model also provided individual risk scores and will allow the department to allocate resources more accurately and reduce unnecessary administrative tasks. The data can also be used at the dispatch level, allowing a dispatcher to recognize an officer who may be less suitable for a specific call because of the associated risk score for adverse citizen interactions.

Which offenders are most at risk of committing interpersonal (domestic) violence?

Accurate statistics around interpersonal (domestic) violence are incredibly difficult to obtain. However, in 2009 the US Department of Justice released "Practical Implications of Current Domestic Violence Research", a seminal report that provided data and education for law enforcement, prosecutors, and judges. This report identified several risk factors for abusers, including a large section on perpetrator characteristics. These included:

  1. Gender - the vast majority of perpetrators are male.
  2. Age - the median age of an abuser is 33 years old.
  3. Prior Record - although the data varies by state, in most places abusers had a prior record and/or were known to police. Having a prior record at all can also indicate likelihood of reabuse.
  4. Reabusers - Approximately 1/3 of abusers will reabuse in the short term, and more will reabuse in the long term. Studies show that for abusers who reoffend, the majority do so relatively quickly (within 2-6 months).

With this data, and other sources like the Office of Victims of Crime Intimate Partner Violence guide, local government and law enforcement can begin to understand which offenders are most at risk of committing interpersonal (domestic) violence, and know how to respond to and prevent these crimes. 

Which offenders are most at risk of recidivism?

In 2006, violent re-offenders established Philadelphia as one of the murder capitals of the United States. Philadelphia’s Adult Probation and Parole Department (APPD) oversaw 50,000 individuals, with only 295 probation officers. To manage the escalating crime, the APPD needed a systematic way of identifying the riskiest individuals and dedicating staff resources accordingly. If the APPD could accurately categorize recently paroled individuals as low-, medium-, or high-risk for potential to commit violent crime, the agency could save time and money and reduce the likelihood of violent recidivism. They turned to sociologist Richard Berk, who built a predictive engine based on tens of thousands of individual criminal records, with dozens of variables such as age, gender, previous zip code, number of previous crimes, and type of offense. This intelligent, machine-learning model enables the computer to find patterns and relationships across dozens of variables and constantly reassess those relationships as new data is added. Read more in "Predictive Tools for Public Safety" by Stephen Goldsmith.

How can data combat the crisis of missing and murdered Indigenous women and girls?

There is no accurate and reliable data about the crisis of murder and disappearance faced by Indigenous women and girls. This is due to a combination of poor data integrity, historic racism against Native communities, and a lack of protocol for these cases. However, it is possible to make changes - particularly to data practices - that would improve record keeping and help establish data-driven policies. This article, with advice from Indigenous experts, outlines steps that cities can take to address the crisis. 

Which service(s) offered to juvenile delinquents have the greatest impact in reducing recidivism?

The most effective services for justice-involved youth are ones that happen both before and after release. Research by the Office of Juvenile Justice and Delinquency Prevention found that cognitive-behavioral therapy, programs targeting youth in the 16-24 age range, and individual treatment all helped prevent recidivism. The report specifically mentioned Project BUILD in Cook County, Illinois; Operation New Hope; and the Wayne County (Michigan) Second Chance Reentry Program.

How can we anticipate where vulnerable people will need help evacuating?

The Centers for Disease Control and Prevention produced a data-driven guide for emergency managers called “Planning for an Emergency: Strategies for Identifying and Engaging At-Risk Groups,” which outlines how to identify vulnerable populations and how to best assist them during a crisis. The guidelines recommend working with the communities on preparedness planning, responding to a crisis with targeted data (that can easily be used by first responders), planning recovery around previously-identified subpopulations, and mitigating future disasters with data-driven policies.

Can we identify power outages in real time and coordinate emergency services' response?

“Virtual Power Outage Detection Using Social Sensors” is a 2015 working paper that studies how social media can detect power outages. By analyzing Twitter and other social media platforms for tweets and other data related to power outages, the researchers found that this type of social media mining was effective in identifying outages. Additionally, the site PowerOutage.US “collects, records, and aggregates live power outage data” and visualizes the information on a map of the United States. The site’s data comes from utility companies rather than consumers, but taken together these two types of data can produce a very comprehensive picture of power outages. Local emergency services could take advantage of the data to coordinate responses and provide assistance to those without power.

Where is a crime most likely to take place on a specific day?

In summer 2012, Seattle had an unexpected uptick in gun-related crimes. The city increased the number of officers patrolling the streets. As a result, the gun-related crimes decreased, but at high cost to the city. In response, the city began to consider predictive policing software. In late February of this year, Mayor Mike McGinn announced that Seattle had implemented predictive policing software in two precincts. The idea behind predictive policing is that police departments have a wealth of data that has been collected over a number of years for every neighborhood and block of a city. By using that pre-existing data, which can tell a story about past experience, police cruisers can patrol areas that match the same characteristics to prevent crimes from occurring. The software uses data from 2008 to predict potential crime and it is estimated to be twice as effective as a human data analyst working from the same information. For a cost of $73,000 for the software and an additional $45,000 per year for maintenance, the predictive policing software in Seattle will likely limit the need for additional officers on patrol and reduce the number of arrests through place-targeted patrolling and deterrence. Read more in “Seattle’s Predictive Policing Program” by Jessica Casey.

How can we identify gunshots before they are reported?

ShotSpotter works with municipalities to provide instantaneous gunfire alerts to police departments across the country. The core of ShotSpotter’s service is a wide-area acoustic surveillance system, supported by software and human ballistics experts, all focused on accurately detecting gunfire. The company mounts waterproof, watermelon-size, acoustic sensors on rooftops across a city. Networked together, an array of sensors can triangulate the incident location accurately in real time. If ten sensors detect a shot, the array can determine the incident location with a two-foot margin of error. ShotSpotter guarantees that it can accurately detect 80 percent of gunfire in coverage areas, although actual detection rates are as high as 95 percent. The technology has been implemented in 75 cities and towns across the United States, including Washington, D.C., and Milwaukee. Read more in "Predictive Tools for Public Safety," by Stephen Goldsmith.
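
The source does not describe how the sensor array computes a location. One common way to locate a sound from several microphones is to compare the times at which each sensor hears it (time difference of arrival). The sketch below illustrates that general idea with a brute-force grid search. It is not ShotSpotter's algorithm; the sensor layout, the search grid, the function names, and the speed-of-sound constant are all assumptions made for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second; approximate value in air (assumption)

def arrival_times(source, sensors):
    """Travel time of sound from a source point to each sensor position."""
    return [math.dist(source, s) / SPEED_OF_SOUND for s in sensors]

def locate(sensors, times, extent=1000, step=10):
    """Grid-search the point whose arrival-time differences best match the observed ones."""
    # Only differences relative to the first sensor matter, because the
    # moment the shot was fired is unknown.
    observed = [t - times[0] for t in times]
    best, best_err = None, float("inf")
    for x in range(0, extent + 1, step):
        for y in range(0, extent + 1, step):
            predicted = arrival_times((x, y), sensors)
            err = sum(((p - predicted[0]) - o) ** 2
                      for p, o in zip(predicted, observed))
            if err < best_err:
                best, best_err = (x, y), err
    return best

# Hypothetical sensors at the corners of a 1 km square, and a simulated shot.
sensors = [(0, 0), (1000, 0), (0, 1000), (1000, 1000)]
simulated_times = arrival_times((420, 630), sensors)
estimate = locate(sensors, simulated_times)
```

With noise-free simulated times and a source that lies on the search grid, `estimate` recovers the simulated location. Real recordings are noisy, so a production system would need a finer search and some way of rejecting bad detections.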

How can social media data help identify public safety issues?

Huntington Beach, CA is monitoring real-time social media data for keywords that suggest problems might occur in order to deploy officers. Read more in "Learning from Location" by Laura Adler.

Which commercial establishments are most likely to be victims of armed robbery?

In 2014, police in Prince George's County, MD, found themselves faced with an alarming increase in armed robberies of commercial establishments. To reduce incidents of armed robbery, police analyzed crime data and identified nine business corridors where the robberies were concentrated, and they also zeroed in on 11 7-Eleven convenience stores outside the corridors that were the most likely to be targeted. Then they drilled down further, figuring out that Tuesdays, Thursdays and Saturdays were the nights when robberies were most likely to occur. The department deployed personnel based on the times and places where robberies were most likely to happen, but didn't stop there. Message boards on roadways in the targeted areas informed motorists (and warned potential criminals) that police operations were underway. Unoccupied police cars were parked in 7-Eleven parking lots and periodically moved. During the month that the county conducted this trial in innovative policing, armed robberies were reduced 40% compared to the same period the year before. Read more in "Harnessing Data to Fight Crime in Maryland" by Charles Chieppo.
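
The drill-down described above, first by place and then by day of the week, amounts to counting incidents along two dimensions. The sketch below is a minimal illustration. The records, corridor labels, and function name are invented for the example and are not the county's data or method.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical incident records: (location label, date of robbery).
incidents = [
    ("Corridor A", date(2014, 3, 4)),   # Tuesday
    ("Corridor A", date(2014, 3, 6)),   # Thursday
    ("Corridor B", date(2014, 3, 8)),   # Saturday
    ("Corridor A", date(2014, 3, 11)),  # Tuesday
    ("Corridor C", date(2014, 3, 12)),  # Wednesday
    ("Corridor B", date(2014, 3, 13)),  # Thursday
]

def top_counts(records, key, n):
    """Count records by key(record) and return the n most common (label, count) pairs."""
    return Counter(key(r) for r in records).most_common(n)

# Where are robberies concentrated, and on which days of the week?
by_place = top_counts(incidents, lambda r: r[0], 2)
by_weekday = top_counts(incidents, lambda r: WEEKDAYS[r[1].weekday()], 3)
```

Crossing the two counts (place by weekday) would give the kind of time-and-place schedule the department used to deploy personnel.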

Which 911 calls are classified incorrectly?

The Fire Department of New York Emergency Medical Service's (EMS) historical databases, already enormous, are steadily becoming far more useful for predictive analytics and other purposes: EMS's improved ability to spot patterns and trends can have a major impact on pre-hospital care. For starters, EMS can now compare the call type assigned to a 911 contact (based on what a caller says under emotional pressure) to the disease or complaint EMS actually finds when it arrives on the scene; knowing how people tend to mis-describe what's going on can help EMS change what operators ask of callers. Better data, better call-center scripts, better patient outcomes. Read more in "Wireless EMS in New York City," by Susan Crawford.
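
Comparing the call type assigned at dispatch with what crews find on scene can be sketched as a simple tally of mismatched pairs. The records and category names below are hypothetical and only illustrate the comparison; they do not reflect FDNY data or systems.

```python
from collections import Counter

# Hypothetical records: (call type assigned at dispatch, condition found on scene).
calls = [
    ("cardiac", "cardiac"),
    ("cardiac", "anxiety"),
    ("fall", "stroke"),
    ("cardiac", "anxiety"),
    ("fall", "fall"),
]

def mismatch_summary(records):
    """Return the share of mismatched calls and a count of each (assigned, found) mismatch."""
    mismatches = Counter((a, f) for a, f in records if a != f)
    rate = sum(mismatches.values()) / len(records)
    return rate, mismatches

rate, mismatches = mismatch_summary(calls)
most_common_mismatch = mismatches.most_common(1)[0][0]
```

The most frequent mismatched pairs point to the caller descriptions that a revised call-center script might probe more carefully.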

How can ambulances respond to medical emergencies faster?

Louisville Metro Emergency Medical Services (LMEMS) has sped up its ambulances’ turnaround times (the amount of time it takes from when an ambulance unloads a patient at a hospital until the crew becomes available to respond to another service call) in two ways. The first is by recording the time intervals for each step of its emergency responses with the Computer Aided Dispatch (CAD) system. This tool not only allows them to find which steps of the emergency response contain the greatest inefficiencies, but also holds ambulance crews accountable. The other is by monitoring the real-time location of the ambulances in the field. Using this tool, they can see the activity of their ambulance fleet, and communicate with crews to help them avoid any potential backups, or find out why straggling ambulances are not up to speed. By using data to identify obstacles to ambulance speed and hold ambulance drivers more accountable, the city has reduced its average ambulance turnaround time dramatically and saved $1.4 million. Read more in "Stretch Goals" by Matthew McClellan.
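
Recording the time interval for each step of a response, as described above, comes down to subtracting consecutive timestamps. The sketch below assumes a hypothetical export of timestamped events for a single call. The event names and times are invented and do not reflect the actual CAD schema.

```python
from datetime import datetime

# Hypothetical CAD timestamps for one response, in order of occurrence.
events = [
    ("dispatched", datetime(2024, 1, 1, 10, 0)),
    ("on_scene", datetime(2024, 1, 1, 10, 8)),
    ("at_hospital", datetime(2024, 1, 1, 10, 30)),
    ("available", datetime(2024, 1, 1, 11, 5)),
]

def step_minutes(timeline):
    """Return minutes elapsed between each consecutive pair of events."""
    return {
        f"{a}->{b}": (tb - ta).total_seconds() / 60
        for (a, ta), (b, tb) in zip(timeline, timeline[1:])
    }

intervals = step_minutes(events)
# The step with the longest duration is the first place to look for delays.
slowest = max(intervals, key=intervals.get)
```

In this made-up example the longest step is the one from hospital arrival to availability, which corresponds to the turnaround time the city set out to reduce.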

Which homes are least likely to have smoke detectors?

New Orleans is looking to save lives by using data to predict which of the city’s buildings need to be equipped with smoke alarms. Using data from sources like the 2011 American Housing Survey, the 2013 American Community Survey, the 2010 Census, and NOFD, the city determined that poverty among building inhabitants, building age, and how long the residents have lived in a building are the best predictors that a structure may not have a smoke alarm installed. The city then determined that those over 65 and under 5 are most likely to die in building fires. It took the age data, added information about which areas of the city saw the highest concentration of fires over the previous five years, and mapped it. Finally, the likelihood of having a smoke alarm, residents' age, and fire-concentration data were combined to rank every zone of the city based on the need for smoke alarms. NOFD is using the data to focus its door-to-door program to install free smoke alarms. Read more in "Predicting Fire Risk: From New Orleans to a Nationwide Tool" by Katherine Hillenbrand.
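
The final step, combining three factors into a single ranking of zones, can be sketched as follows. The source does not say how the factors were weighted, so this example assumes each factor is rescaled to the 0-1 range and averaged with equal weight. The zone names, numbers, and function names are hypothetical.

```python
# Hypothetical zone data: (estimated share of homes lacking an alarm,
# share of residents under 5 or over 65, fires in the previous five years).
zones = {
    "Zone 1": (0.30, 0.20, 16),
    "Zone 2": (0.10, 0.35, 4),
    "Zone 3": (0.25, 0.15, 20),
}

def min_max(values):
    """Rescale values to the 0-1 range; a constant column maps to 0."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

def rank_zones(data):
    """Rank zones by the unweighted mean of their rescaled factors, highest need first."""
    names = list(data)
    # zip(*...) turns per-zone tuples into per-factor columns.
    columns = [min_max(col) for col in zip(*data.values())]
    scores = {
        name: sum(col[i] for col in columns) / len(columns)
        for i, name in enumerate(names)
    }
    return sorted(scores, key=scores.get, reverse=True)

ranking = rank_zones(zones)
```

Equal weighting is only one choice; a city could weight fire history or resident age more heavily, which would change the order of outreach.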

How can police dispatch be made more efficient?

The Atlanta Police Department wanted to reduce its dispatch time and improve its efficiency in employing human resources, so it turned to a team of data scientists for help. To find a solution to these problems, the team analyzed five years of data, or approximately five million dispatches. It quickly became clear to the team that the traditional notion of workload (dispatch volume) did not capture the complexity of the work observed during site visits. To weight dispatches more appropriately, the team developed a simple survey, which 30 randomly selected dispatchers completed. Weights were then applied to dispatch types using a distribution from the survey results, effectively turning the notion of workload into an index. Coupling this with several other predictors, the team was able to develop a model to test different scenarios. One scenario that has gained traction as a result of the analysis is the movement of administrative dispatches (e.g. extra job check in and check out) to a single dispatcher, which creates greater availability for other dispatchers to focus on priority dispatches. Read more in "Optimizing Atlanta’s 911 Systems with Data Science" by John Zimmerman and Jon Keen.
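
The difference between raw dispatch volume and a weighted workload index can be shown with a small example. The weights, dispatch types, and counts below are hypothetical stand-ins for the survey-derived values, which the source does not report.

```python
# Hypothetical survey-derived weights: relative effort per dispatch type.
weights = {"priority": 3.0, "routine": 1.0, "administrative": 0.5}

# Hypothetical dispatch counts handled by two dispatchers in one shift.
shift_counts = {
    "dispatcher_a": {"priority": 10, "routine": 20, "administrative": 40},
    "dispatcher_b": {"priority": 25, "routine": 10, "administrative": 0},
}

def workload_index(counts, weights):
    """Sum dispatch counts weighted by the relative effort of each type."""
    return sum(weights[kind] * n for kind, n in counts.items())

# Raw volume treats every dispatch as equal; the index does not.
volume = {d: sum(c.values()) for d, c in shift_counts.items()}
index = {d: workload_index(c, weights) for d, c in shift_counts.items()}
```

In this made-up shift, dispatcher A handles twice as many dispatches as dispatcher B but carries the lower weighted workload, which is the kind of gap that raw volume hides.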

How can cities prioritize 911 calls during times of disaster?

At the height of Hurricane Sandy, New York City’s 911 switchboard was receiving 20,000 calls an hour, many of which were not emergencies. The call volume led to slow response times and a lack of prioritization; there was no way to distinguish calls for downed tree branches from people in life-threatening situations. An important first step for future preparation is better educating citizens about what qualifies as a 911 call and what can be relegated to a non-emergency 311 call, an effort the city is undertaking now. The 311 hotline could be shifted to function primarily as a reporting mechanism, especially at times of disaster when city services and 911 phone lines are becoming overwhelmed. Even if an immediate answer isn’t guaranteed, text and data analytics of 311 texts, calls, and social media posts could allow these services to give first responders a better picture of where to focus their efforts. Citizens should be informed that any communication they send to the government would be registered in this way, bringing attention to their particular problems as well as those of their neighbors, while enhancing the entire city’s response capacity. Read more in "Getting the Lights on Faster" by Benjamin Weinryb Grohsgal and Stephen Goldsmith.

How can data sharing improve disaster response?

The Greater Cincinnati area created Raven911, a regional map-based program designed to enhance situational awareness in times of disaster. Read more in "Raven911 Gives Emergency Responders a Bird's Eye View" by Daniel Curtis.

Which streets will need to be evacuated before a flood?

Completed in 2010, Austin’s Flood Early Warning System (FEWS) combines flood maps, real-time data, and predictive modeling to make better evacuation decisions and plans in response to imminent flooding. The new system can predict which streets will become flooded and impassable up to 6 hours beforehand and map flooded areas and road closures, replacing an old system that only displayed flood danger levels of locations and often caused evacuations to take place once flooding had already occurred. Read more in "Forecasting Flooding in Austin" on Data-Smart City Solutions.

How can we predict flash floods?

In Europe, four cities are experimenting with X-band radar’s capacity to predict flash floods by counting every raindrop as part of the RainGain project. This innovation isn’t just about weather reporting: Real-time quantified rainfall data has the potential to help cities dynamically predict floods and deploy infrastructure to curb damage. Read more in "3 Ways to Optimize Urban Infrastructure" by Stephen Goldsmith.

How can we repair power outages faster after a natural disaster?

Smart grids, and particularly smart electric meters, played a promising role in improving disaster response and the speed with which power could be restored after Superstorm Sandy downed power lines across the East Coast in 2012. That role was small-scale and local, since electric utilities' conversion to smart-grid technology has been slower than desired, but the potential is there for the technology to have a much larger impact as these systems are rolled out more widely. At best, phone calls and spotty service-outage reports can slowly piece together a hazy picture of the conditions of a power network. But smart meters, programmed to send out a "last call of distress" when power is lost, can automatically report service cuts. This gives a utility company instant access to regional maps of outages, allowing it to prioritize repair-crew mobilization and begin getting service back to customers without them even having to report an outage. Additionally, smart meters can automatically report coming back online when power is restored, eliminating unnecessary calls between the utility company and customers or follow-up service-crew visits. Repair crews can move on to the next repair rather than spending time checking on their last one, increasing efficiency and reducing system repair time considerably. Read more in "Getting the Lights on Faster" by Benjamin Weinryb Grohsgal and Stephen Goldsmith.

Where will medical needs be after a natural disaster?

Direct Relief developed a social vulnerability index through demographic and housing information, and correlated those data against the constant stream of risk-assessment models generated by FEMA. Direct Relief could forecast where the medical needs would be, even before the storm made landfall. This data-driven modeling helped Direct Relief overcome the communications challenge in the first 48–72 hours after the storm. Health providers were completely out of contact—cell service and phone lines had gone down. There was no way for Direct Relief to know which providers needed assistance. With limited contact, Direct Relief used proxies, such as the electric-grid outage maps and whether local pharmacies were down in a particular area, to predict which groups needed assistance. Direct Relief volunteers were then sent to clinics in these vulnerable areas to confirm on-the-ground needs and coordinate medical-supply delivery. Read more in "Predictive Tools for Public Safety" by Stephen Goldsmith.

How can we identify individuals at risk of incarceration and deliver preventative service?

The Data-Driven Justice Initiative started as a White House mission in 2015 to break the “cycle of incarceration” and as many as 67 cities, counties, and states signed on to join the bipartisan effort. The municipalities agreed to install innovative solutions to divert potential incarcerees to mental illness or substance abuse treatment, using data analytics to target the individuals most in need of services.

The Data Science and Public Policy team (DSaPP) at the University of Chicago created an Early Intervention System (EIS) that identifies individuals at risk of contact with the justice system so that the local agencies that employ the model can provide the necessary services and interventions. The data-driven EIS analyzes a jurisdiction’s mental health, justice, emergency medical response, and social services data to identify these individuals. In 2015, DSaPP partnered with Johnson County, Kansas to install a prototype EIS, combining data from the county’s mental health services, criminal justice system, and local EMS. The data scientists at DSaPP generated a list of the 200 people most at risk of coming into contact with the criminal justice system. The list gave county officials a prioritized group most in need of intervention, with the aim of saving money on jail costs and reinvesting those funds in preventative measures.

Artificial Intelligence

Can artificial intelligence help analyze and answer 311 requests?

Yes, AI can help identify similarities or repeat issues in 311 requests as well as provide basic information to callers. For example, in Boston the Chief Information Officer Santiago Garces and his staff use ChatGPT to identify patterns in 311 reports. And in Portland, testing is underway to utilize artificial intelligence to answer and direct non-emergency calls, while decreasing hold times.

Can generative AI help local leaders award contracts and distribute funds to trustworthy vendors?

In the article "Could ChatGPT Help Cities Better Vet Potential Vendors?", Professor Stephen Goldsmith outlines how generative AI can more quickly and efficiently identify red flags and uncover repeat offenders by examining massive amounts of documentation, previous awards, and business filings.

What cities have developed guidelines on using AI?

As of fall 2023, the city of Boston has an interim AI use guideline that encourages “responsible experimentation” and offers example use cases including drafting emails. Seattle’s Interim Policy on AI has a similar function, as a short-term solution to questions about utilizing the new technology. The city of San Jose has a slightly different approach, with an AI Reviews and Algorithm Register that documents algorithms in use and facilitates reviews of proposed AI tools. NYC also has an AI Action Plan, produced by the city's Office of Technology and Innovation.

What role could AI play in reducing traffic jams?

In a large-scale traffic experiment conducted near Nashville in 2022, researchers from the CIRCLES Consortium research group deployed AI-equipped vehicles to test if they could alleviate traffic congestion caused by human behavior. These cars had advanced, AI-powered cruise control systems designed to optimize speed and traffic flow, in order to determine if a small number of these vehicles could collectively improve overall traffic efficiency by coordinating and adjusting their speeds based on real-time data. The results showed promise in reducing traffic jams, saving fuel, and making driving safer.

How should a local government think about leveling up AI usage?

In an episode of the Data-Smart City Pod, Luis Videgaray of the Massachusetts Institute of Technology’s Schwarzman College of Computing discusses the three levels for deployment of AI in a complex organization or large bureaucracy like a city government. 

Level one is straightforward, using tools and applications that already exist and are freely available. Level two is “a thin customization” that uses some of the organization's data, such as querying internal databases or government chatbots. Level three is the most transformative, and involves changing processes and embedding AI to empower individual decision-making, which requires true customization.

How can AI help manage public transportation networks?

Transport for London (TfL) is one of the busiest transportation systems in the world, with over 31 million daily journeys, and transportation officials are using big data to optimize transport services for such a large system. TfL captures 20 million daily "taps" through its ticketing system and employs data from multiple sources, such as bus location sensors, traffic signals, and cameras. Notably, TfL created a multi-modal travel dataset for buses called ODX, which improves network planning and response to disruptions in a mode of public transit that is typically more challenging to monitor. 

Big data and AI also aid in analyzing road safety trends and have led to initiatives like an interactive collision map, which shows ten years of crash data in an effort to highlight where increased attention and intervention are necessary. TfL aims to provide predictive travel information, including crowding data, to give riders a more comfortable journey in the future.

How do AI tools help include residents in public meetings?

The North Las Vegas City Council was the first jurisdiction in Nevada to incorporate real-time transcription and translation powered by AI into its public meetings. Given that nearly 42 percent of North Las Vegas residents identify as Hispanic, and nearly 38 percent speak languages other than English at home, this initiative seeks to make the government more inclusive. Using the language translation tool Wordly to translate Spanish to English, the AI-powered system provides translated transcriptions on large screens in real-time during public meetings.

Although accuracy depends on the speaker's pace and clarity, it significantly increases accessibility in real-time. Audience members can also use QR codes to access translations on their devices, and the city plans to expand the system to include more languages. This innovative approach has received positive feedback from both council members and residents, emphasizing the importance of using new technology to make diverse voices heard and understood in the community. 

Other cities are exploring similar tools to help assist residents with disabilities by providing other accessibility features such as voice commands and speech-to-text tools, or live captioning and screen-reading devices.

Does AI help with lead line replacement?

Benton Harbor, Michigan, has a dire water crisis due to unsafe levels of lead in its water supply, making it unfit for consumption, cooking, or even personal hygiene. After years of concerns from residents, the Benton Harbor Community Water Council and other advocates petitioned the U.S. Environmental Protection Agency to secure a safe water source and replace lead service lines. Locating these lead lines is a critical first step in the process, and a Michigan-based company was engaged to develop an algorithm designed to assist communities in this endeavor. The algorithm analyzes factors like a home's age, location, and proximity to water infrastructure to predict whether it has a lead service line. The goal is to expedite the replacement process and mitigate the risk of lead exposure.

How are firefighters using AI?

One area that has rapidly adopted AI tools is firefighting; cities and states across the country are relying on advanced technology like AI-enabled cameras and satellites to detect both smoke and fire, while AI models can predict likely wildfire locations before a blaze occurs. As climate change causes more extreme weather events, including droughts, AI tools that can mitigate and even prevent wildfires are imperative.
This post has been updated over time with additional examples. Most recent update was January 17, 2024.