Data-Smart City Solutions

By Data-Smart City Solutions

What kinds of operations-enhancing questions have cities asked and answered with data and analytics? The catalog below is an ongoing, regularly-updated resource for those interested in knowing what specific use cases can be addressed using more advanced data and analysis techniques.

For examples that are currently being implemented in cities across the country, you can click to expand the question to see additional information about the solution.  All other examples represent potential questions that cities could work to address with data and analytics.    

We welcome further submissions to the list by email.  Submissions can include either current examples of how cities are addressing specific operational or policy issues with data, or ideas for how to address issues that you hope cities will one day be able to answer.

Health & Human Services

  • What is the impact of providing an additional service(s) to a client already receiving one city service?
  • How can we improve cooperation between city services with shared clients?

    Since the creation of a unified Department of Housing and Human Services (HHS), Boulder County, CO has been a testament to the benefits of holistic human services delivery. Through its integrated service delivery system, Boulder County has been able to expand the number of residents receiving services by 140%, focusing on front-end and early intervention measures to prevent more costly services in the future. Technology has been a key feature of this transition. The Department, as it exists today, was formed after a 2008 merger between the County’s housing and social services agencies. To support this effort, HHS developed an integrated service delivery system, including technological tools that allow employees to track clients’ case histories across programs, refer clients to additional program areas, and collaborate with other Department caseworkers. Read more in "Boulder County Colorado: Integrated Service Delivery," by Sam Gill, Indi Dutta-Gupta, and Brendan Roach.

  • Who is most likely to apply for a city service(s)?
  • Which clients are most likely to apply for multiple services?
  • When clients apply for / obtain multiple services, which service do they typically apply for first?
  • Can we forecast the number of caseloads for city services?
  • How can cities deliver social services more efficiently?

    Five years ago, New York City launched an initiative, HHS-Connect, to collect its social service data in one place. The idea is to allow clients to walk into different social service agencies without having to re-enter their information and complete duplicate paperwork. “We have a vision of a client walking into, for example, a homeless shelter and not having to reapply your information if you had already been to the public welfare office or to the Administration for Children’s Services,” said Kristin Misner, chief of staff to the deputy mayor for health and human services. Read more in "Big Data Gives a Boost to Health and Human Services," by Stephen Goldsmith.

  • How do we help clients leaving the criminal justice system, foster care, homeless shelters, etc. get and keep jobs?
  • Which clients coming out of the juvenile justice, criminal justice, foster care, homeless services, or substance abuse systems who are placed in employment are most likely to return to city services?
  • How can we predict which homes have the highest lead levels to prevent lead poisoning incidents, especially during pregnancy?

    Data Science for Social Good partnered with the Chicago Department of Public Health to identify houses most at risk for lead contamination, based on blood tests, building records, and census data. Read more on the DSSG site.

  • Which interventions are most effective?
  • What are the causes of infant mortality that could be targets for intervention?

    In the span of just one year, Cincinnati managed to decrease its infant mortality rate (IMR) by over 25 percent, from 13.3 deaths per 1,000 live births in 2012 down to 9.9 in 2013. To accomplish this feat, the city has incorporated and leveraged relevant data to concentrate its efforts where they are most needed. Since 2007, this targeted undertaking has tracked various indicators and outcomes, such as mother’s zip code, race/ethnicity, mental health, and smoking habits, as well as the child’s birth spacing and sleeping environment. By using data to zero in on quantifiable risk factors and on at-risk communities, Cincinnati is making major strides on a difficult undertaking, the fight against infant mortality. Read more in "Using Data to Combat Infant Mortality in Cincinnati" by Victoria Kabak.
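
    The "over 25 percent" figure follows directly from the two rates quoted above. A minimal check of that arithmetic (the function name is only illustrative):

```python
# Infant mortality rates quoted above (deaths per 1,000 live births).
imr_2012 = 13.3
imr_2013 = 9.9


def percent_decrease(before: float, after: float) -> float:
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100


drop = percent_decrease(imr_2012, imr_2013)
print(f"Relative decrease: {drop:.1f}%")  # prints "Relative decrease: 25.6%"
```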

  • Can we target outreach and intervention to those at risk of poor health outcomes?

    Improved analytical capabilities allow agencies to identify and implement more effective services for clients. In Oklahoma, as part of the SoonerCare (Medicaid) program, officials analyzed patient data including comorbidity factors to identify individuals prone to poor health outcomes. Equipped with a list of at-risk Medicaid recipients, managers have worked to sign these individuals up for intensive, managed-care programs. Meanwhile, the Rhode Island Department of Children, Youth, and Families developed the Real Connections program, which analyzes data on a child’s social network. Using this analysis of existing information, the Department is able to identify mentors best suited to enable the best outcome for each child. Read more in "The Technology Opportunity for Human Services," by Sam Gill, Indi Dutta-Gupta, and Brendan Roach.

  • How can we identify causes of poor air quality causing health problems for residents?

    Louisville, KY partnered with Propeller Health in May 2012 to distribute 500 smart inhalers to asthmatic residents. When the devices were used, they sent time and location data to both the patient’s physician and city officials, who used the data to generate “heat maps” of emergency asthma attacks. With the help of data analysts at IBM, public health officials compared the trends against a variety of potential causes — including air quality, pollen outbreaks and traffic congestion — to strategize interventions in the most at-risk areas. Today, the project continues. The city plans to deploy bike-mounted sensors to monitor air quality along routes that are frequented by children during the summer. Read more in "Health Data Isn't Just for Hospitals," by Stephen Goldsmith.

  • Which individuals and families are most at risk of returning to the homeless services system?
  • Which individuals and families placed into permanent housing are most at risk of returning to the homeless services system?
  • How can we predict which families are most at risk for homelessness?

    New York City's Department of Homeless Services (DHS) has partnered with academics to develop customized risk assessment tools that support caseworkers in determining the best approach for each client during the screening process. DHS is also exploring more proactive approaches, partnering with the SumAll Foundation to analyze data on eviction notices to predict which cases are most likely to result in homelessness. While the pilot is still being tested in specific neighborhoods, the analytics will eventually become part of the City’s data visualization project, which allows staff to visualize neighborhood data such as shelter entries and eviction filings and tells caseworkers which of the thousands of households or buildings on the map are actually most at risk of shelter entry. By tailoring outreach efforts and reducing barriers to access, DHS can provide more services to more at-risk families. Read more in "Data-Driven Strategies for Reducing Homelessness," by Lyell Sakaue.

  • Which public housing residents are most likely to be placed into employment?
  • Which city services have the greatest impact on reducing entry into homeless shelters?
  • How can we identify pregnant women at risk for having adverse births?

    Data Science for Social Good partnered with the Illinois Department of Human Services to identify women at risk for having adverse births, which are associated with negative personal, financial, and developmental outcomes for both the mother and the baby. They identified risk factors, including stress, socioeconomic factors, substance use, quality of life indicators, healthcare access, and age, that could be used to predict women at high risk for adverse births. The state then provided targeted programs and assistance to the women identified as high-risk. Read more on the DSSG site.

  • How can education data be used to identify off-track students?

    Data Science for Social Good and the Mesa Public School System are using Mesa's education data to identify students who are off-track for their future plans. Using students' classes, grades, test scores, and attendances, they can predict which students are college-ready but may not be applying to college, only applying to two-year programs, or will enroll but not graduate from college. These students can then be given extra support and resources to empower them to apply, graduate, and succeed. Read more on the DSSG site.

  • Which client characteristics indicate that a client will leave a homeless shelter without a subsidy?
  • Which clients would benefit the most from housing services?
  • What factors contribute to youth obesity?

    A Texas law requires public schools to record fitness data on every student. Through data-sharing agreements with the school districts, Austin-based nonprofit Children's Optimal Health (COH) gathers metrics on BMI and cardiovascular fitness scores that are geo-tagged with social and economic information. COH converts de-identified person-level data to aggregate neighborhood-level maps that illuminate the conditions faced by families and children in the area, all while protecting personal information. Enhanced with other datasets, these maps tell a more complex story of the factors that influence health outcomes, from proximity to fast food restaurants to the stress of high neighborhood crime rates. Read more in "Austin Targets Youth Obesity with Neighborhood-Level Data," by Stephen Goldsmith.

  • How can we increase kindergarten readiness?

    When Bloomberg Philanthropies launched the Mayors Challenge, issuing a call for the most innovative proposals by cities across the United States, Providence, RI Mayor Angel Taveras seized the opportunity to seek out a new, creative solution to a serious issue in his city. After reviewing a number of ideas, he and his team ultimately developed the program now known as Providence Talks. The team discovered research that shows that high-income children hear an average of 30 million more words than their low-income peers in the first three years. The seminal study of language environment by Betty Hart and Todd Risley showed that the amount of conversation children had with their parents by age three was positively associated with their IQ scores at that age, along with a host of other positive outcomes. Providence decided to apply this important finding by creating a city-led effort to close the word gap using innovative technology: devices that can record and allow measurement of the auditory environment of children. A preliminary study demonstrated that sharing feedback reports generated by the recording devices led caregivers to increase the number of words spoken to their children by 55 percent.  Simply having the information about the number of words their children were hearing inspired them to talk more, read more, and interact more with their children. The children’s increased exposure to words was the single greatest predictor of improved language skills and learning readiness before entering school. Read more in "Providence Talks: Progress on Closing the 'Word Gap.'"

  • Where is there the greatest risk of mosquito-borne illness?

    West Nile virus, an ailment once rare and relatively unknown in the United States, is now an annual danger in many suburban communities. In Suffolk County, New York, a large suburban and rural county on Long Island, officials began seeing West Nile cases in the early 2000s. The county developed a model to assess the risk of outbreak using a combination of statistical methods and geographic information systems. Through modeling, they found relationships between human West Nile cases, landscape factors, population demographics, and weather patterns. Initial results showed a complex interaction between these factors and human cases of West Nile virus. Using this hot-spot analysis, Vector Control now targets larvicide efforts in established hot-spots and uses aerial adulticide spray only where quantitative evidence supports the use of pesticides. By being strategic in the use of analytics, the agency has saved time and money, while still providing a high level of public safety. Read more in "Predictive Tools for Public Safety," by Stephen Goldsmith.

Infrastructure

  • How can sensors help reduce water leaks and detect bursts?

    Internet of Things technology is helping cities monitor water flow to optimize water pumping and reduce the amount of water lost to leakage. Read more in "Come Drought or High Water" by Laura Adler.

  • How can we reduce the number of traffic accidents?
  • How can we predict the effects of service interruptions and other disruptions on transit systems?

    Back in 2011, Transport for London, the transit agency behind the London Tube, collaborated with a group from University College London to study the daily operations of the subway system via the familiar Oyster fare card. The result was a paper detailing how the commuting patterns of individuals coalesce into a massive, crowded network of movement on the Tube, resulting in congestion and strain at important system hubs. The smart card Oyster system allows researchers to collect data on the journeys of individual travelers (with the assurance of safeguards to protect customer privacy) to understand the complex dance of the metro system. The data visualization that the study produced allows researchers to model the effects of various situations on congestion patterns; now planners can determine exactly what would happen if mechanical failures were to slow trains on a particular line or cause other service problems. Read more in "Streamlining the London Tube with Data" by Nick Carney.

  • How can we prioritize tree trimmings and removals? 
  • How can we predict major bridge problems before they happen?

    Time, weather, and deferred maintenance have not been kind to many of New York City's East River crossings. The Brooklyn Bridge, an engineering marvel of its time, shows its age through the cracks in the masonry vaults that support the bridge's roadway over Manhattan. Fiber-optic sensors monitor these cracks, as well as other indicators such as temperature fluctuation, to assist structural engineers in determining when the vaults will ultimately need to be replaced. Further up the East River, on the Williamsburg Bridge, a series of interferometric and fiber Bragg grating sensors (both capable of measuring light waves) monitor wire deformation and breakage on the span's century-old suspension cables. Rather than make an annual manual inspection, engineers have access to continuous data, which can tell them if an individual strand in one of the bridge's cables is about to break. Read more in "How to Save America's Crumbling Bridges," by Stephen Goldsmith.

  • How can we improve utility inspections?

    The Santa Clara Valley Water District in Santa Clara County, CA manages a network of natural and man-made infrastructure that supplies 1.8 million residents with water. In an effort to go paperless, district field staff was armed with GIS tablets to survey waterway infrastructure, cataloging and assessing the condition of levees and other assets. These data are now fed back into the district’s asset management software, allowing the agency to not only see infrastructure conditions but to make smart decisions about future investments. According to Esri, more than 4,000 paperless inspections have been processed since 2012. Read more in "Open Data's Road to Better Transit," by Stephen Goldsmith.

  • How can we reduce outage rates for agency fleet without purchasing new vehicles? Under what circumstances do outages take place most frequently?
  • How can we reduce accidents involving city vehicles? Where and when do most accidents involving city vehicles occur?
  • How can we prevent sewer overflows?

    The Green City, Clean Waters program in Philadelphia is a city-wide low-impact development approach to mitigating the city’s combined sewer overflows (CSO). The program integrates very low-tech interventions like rain barrels and street trees with very high-tech data collection and analysis. The city is addressing a problem faced by many others that began before the 1950s: a CSO system wherein there is no physical division between stormwater and the sewer system responsible for wastewater coming from homes and businesses. This means when a big storm rolls in, the system becomes overwhelmed, stormwater and wastewater mix, and a toxic effluent is discharged into waterways, degrading the environmental quality for both plants and animals as well as citizens who may live or recreate near these rivers and streams. Luckily, big data analysis and a profusion of sensors spread within the city’s sewer system provide this vital piece of the puzzle, lending some big-technology insights to what is a purposefully low-tech, low-impact approach to attacking the CSO problem. Since the program’s conception, an extensive and quantitative evaluation plan has been in place. Philadelphia pulls data from sensors throughout the system (originally purposed just to warn departments and citizens of overflows) to see if the approach is really working, and also conducts health quality tests in various bodies of water to check if there is a substantive long-term impact. Real cost comparisons can be made between different elements of the program, allowing the city to adjust its plans over time and maximize the returns of each program dollar spent. Read more in "Low-Tech Solutions Meet Data Analytics in Philadelphia's CSO Approach," by Benjamin Weinryb Grohsgal.

  • How can we better predict where the next major pavement failure will be?
  • How can we better predict where the next major street light cable failure will be?
  • Can we predict what areas have more open hydrants?
  • Where should snow removal happen first?
  • Where are potholes located?

    Boston's Mayor's Office of New Urban Mechanics created a crowd-sourcing mobile app called Street Bump that helps residents improve their neighborhood streets by collecting road condition data while they drive. With Street Bump, residents' phones can report rough stretches of road to the City automatically as they drive over them, providing the City of Boston with a useful and cost-effective way of identifying which of its streets need work. Read more in "Beyond 311," by Stephen Goldsmith.

  • Which intersections are likely to be blocked, and when?
  • How can analytics improve routes for garbage trucks and other city vehicles?
  • Which indicators can help to identify areas with the greatest amounts of idling?
  • How can data help mitigate stormwater runoff?

    Washington, D.C.'s Urban Forestry Administration is exploring a model combining lidar and elevation data to find the best places to strategically plant trees to mitigate stormwater. Read more in "How D.C. Grew a Data-Driven Tree Strategy" by Stephen Goldsmith.

  • Under what circumstances do residents throw recyclables in the trash instead of in recycling bins (and how can we mitigate this in order to increase recycling diversion rates)?
  • What are the current refuse locations in the city? Which receive the highest amount of complaints?

Public Safety

  • How can data sharing improve disaster response?

    The Greater Cincinnati area created Raven911, a regional map-based program designed to enhance situational awareness in times of disaster. Read more in "Raven911 Gives Emergency Responders a Bird's Eye View" by Daniel Curtis.

  • How can we preempt youth violent crime?

    In 2013, San Francisco began operating a real-time, web-based case management system across the Departments of Public Health (DPH), Juvenile Probation (JPD), and the Human Services Agency (HSA) to systematically identify at-risk youth who were clients of multiple city social services. Together, these agencies found that “crossover clients” of multiple systems were at strikingly increased risk of committing a serious crime. 51 percent of San Franciscans involved in multiple service systems were convicted of a serious crime; a third had been served by all three agencies; and the overwhelming majority (88 percent) of these youth committed the crime more than 90 days after becoming a crossover client – a critical window during which, the analysis suggested, case workers may be able to intervene. Read more in "Getting Data to the Good Guys," by Christopher Kingsley and Stephen Goldsmith.

  • How can data identify police officers with a higher risk of negative community interactions?

    The Center for Data Science and Public Policy partnered with the White House Police Data Initiative and the Charlotte-Mecklenburg Police Department to combine public and private police data to identify officers with a higher risk of negative community interactions. Police departments will be able to use these results to provide targeted resources and training to counsel, train, or support officers that need it. Read more on the DSSG site.

  • How can we prevent violent crime?

    New Orleans' NOLA for Life campaign analyzes data to determine likelihood of homicide, then targets its campaign components specifically at four neighborhoods where 40 percent of the city's homicides occur despite being home to just 19 percent of New Orleans' residents. And on an even more granular level, the campaign has sought to identify 200 New Orleans students who are most at risk for violence, with the goal of involving them in preventive programs. Read more in "How New Orleans is Winning a War Against Murder" by Stephen Goldsmith. 

  • Which offenders are most at risk of committing domestic violence?
  • Which offenders are most at risk of recidivism?

    In 2006, violent re-offenders established Philadelphia as one of the murder capitals of the United States. Philadelphia’s Adult Probation and Parole Department (APPD) oversaw 50,000 individuals, with only 295 probation officers. To manage the escalating crime, the APPD needed a systematic way of identifying the riskiest individuals and dedicating staff resources accordingly. If the APPD could accurately categorize recently paroled individuals as low-, medium-, or high-risk for potential to commit violent crime, the agency could save time and money and reduce the likelihood of violent recidivism. They turned to sociologist Richard Berk, who built a predictive engine based on tens of thousands of individual criminal records, with dozens of variables such as age, gender, previous zip code, number of previous crimes, and type of offense. This intelligent, machine-learning model enables the computer to find patterns and relationships across dozens of variables and constantly reassess those relationships as new data is added. Read more in "Predictive Tools for Public Safety" by Stephen Goldsmith.

  • Which service(s) offered to juvenile delinquents have the greatest impact in reducing recidivism?
  • Where is a crime most likely to take place on a specific day?

    In summer 2012, Seattle had an unexpected uptick in gun-related crimes. The city increased the number of officers patrolling the streets. As a result, the gun-related crimes decreased, but at high cost to the city. In response, the city began to consider predictive policing software. In late February of this year, Mayor Mike McGinn announced that Seattle implemented predictive policing software in two precincts. The idea behind predictive policing is that police departments have a wealth of data that has been collected over a number of years for every neighborhood and block of a city. By using that pre-existing data, which can tell a story about past experience, police cruisers can patrol areas that match the same characteristics to prevent crimes from occurring. The software uses data from 2008 to predict potential crime and it is estimated to be twice as effective as a human data analyst working from the same information. At a cost of $73,000 for the software and an additional $45,000 per year for maintenance, the predictive policing software in Seattle will likely limit the need for additional officers on patrol and reduce the number of arrests through place-targeted patrolling and deterrence. Read more in “Seattle’s Predictive Policing Program” by Jessica Casey.

  • How can we identify gunshots before they are reported?

    ShotSpotter works with municipalities to provide instantaneous gunfire alerts to police departments across the country. The core of ShotSpotter’s service is a wide-area acoustic surveillance system, supported by software and human ballistics experts, all focused on accurately detecting gunfire. The company mounts waterproof, watermelon-size, acoustic sensors on rooftops across a city. Networked together, an array of sensors can triangulate the incident location accurately in real time. If ten sensors detect a shot, the array can determine the incident location with a two-foot margin of error. ShotSpotter guarantees that it can accurately detect 80 percent of gunfire in coverage areas, although actual detection rates are as high as 95 percent. The technology has been implemented in 75 cities and towns across the United States, including Washington, D.C., and Milwaukee. Read more in "Predictive Tools for Public Safety," by Stephen Goldsmith.
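
    The source does not describe ShotSpotter's actual algorithm. As a rough illustration of how a networked array of acoustic sensors can locate a sound, the sketch below uses a generic time-difference-of-arrival approach with a brute-force grid search. The sensor layout, the speed of sound, the search area, and every name in the code are assumptions made for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second (assumed: dry air near 20 °C)


def locate(sensors, arrival_times, size=100, step=1):
    """Grid-search the point whose implied emission times agree best.

    For a candidate point, each sensor implies an emission time of
    arrival_time - distance / speed. At the true source these all match,
    so we keep the grid point that minimises their spread.
    """
    best_point, best_score = None, math.inf
    for x in range(0, size + 1, step):
        for y in range(0, size + 1, step):
            implied = [
                t - math.hypot(x - sx, y - sy) / SPEED_OF_SOUND
                for (sx, sy), t in zip(sensors, arrival_times)
            ]
            mean = sum(implied) / len(implied)
            score = sum((v - mean) ** 2 for v in implied)
            if score < best_score:
                best_point, best_score = (x, y), score
    return best_point


# Hypothetical layout: four rooftop sensors at the corners of a 100 m square.
sensors = [(0, 0), (100, 0), (0, 100), (100, 100)]
true_source = (30, 70)
emitted_at = 5.0  # seconds; the solver never sees this value

# Simulate when each sensor would hear the sound.
arrival_times = [
    emitted_at + math.hypot(true_source[0] - sx, true_source[1] - sy) / SPEED_OF_SOUND
    for sx, sy in sensors
]

estimate = locate(sensors, arrival_times)
print(estimate)  # recovers the simulated source on the 1 m grid
```

    With noise-free simulated timings the search recovers the source exactly. Real deployments have to cope with timing noise, echoes, and non-gunfire sounds, which this sketch ignores.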

  • How can social media data help identify public safety issues?

    Huntington Beach, CA is monitoring real-time social media data for keywords that suggest problems might occur in order to deploy officers. Read more in "Learning from Location" by Laura Adler.

  • Which commercial establishments are most likely to be victims of armed robbery?

    In 2014, police in Prince George's County, MD, found themselves faced with an alarming increase in armed robberies of commercial establishments. To reduce incidents of armed robbery, police analyzed crime data and identified nine business corridors where the robberies were concentrated, and they also zeroed in on 11 7-Eleven convenience stores outside the corridors that were the most likely to be targeted. Then they drilled down further, figuring out that Tuesdays, Thursdays and Saturdays were the nights when robberies were most likely to occur. The department deployed personnel based on the times and places where robberies were most likely to happen, but didn't stop there. Message boards on roadways in the targeted areas informed motorists (and warned potential criminals) that police operations were underway. Unoccupied police cars were parked in 7-Eleven parking lots and periodically moved. During the month that the county conducted this trial in innovative policing, armed robberies were reduced 40% compared to the same period the year before. Read more in "Harnessing Data to Fight Crime in Maryland," by Charles Chieppo.
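
    The "drill down" by place and by day of week described above amounts to counting incidents along two dimensions. A minimal sketch, using made-up incident records rather than real county data (the corridor names are hypothetical):

```python
from collections import Counter
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Made-up incident records: (date of robbery, location). Not real data.
incidents = [
    (date(2014, 3, 4), "Corridor A"),
    (date(2014, 3, 6), "Corridor B"),
    (date(2014, 3, 8), "Corridor A"),
    (date(2014, 3, 10), "Corridor C"),
    (date(2014, 3, 11), "Corridor A"),
    (date(2014, 3, 13), "Corridor A"),
    (date(2014, 3, 15), "Corridor B"),
    (date(2014, 3, 18), "Corridor B"),
]

# Count incidents by weekday and by location.
by_day = Counter(DAY_NAMES[when.weekday()] for when, _ in incidents)
by_location = Counter(where for _, where in incidents)

print(by_day.most_common(3))       # the three busiest days of the week
print(by_location.most_common(1))  # the single busiest location
```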

  • Which 911 calls are classified incorrectly?

    The Fire Department of New York Emergency Medical Service's (EMS) historical databases, already enormous, are steadily becoming far more useful for predictive analytics and other purposes: EMS's improved ability to spot patterns and trends can have a major impact on pre-hospital care. For starters, EMS can now compare the call type assigned to a 911 contact (based on what a caller says under emotional pressure) to the disease or complaint EMS actually finds when it arrives on the scene; knowing how people tend to mis-describe what's going on can help EMS change what operators ask of callers. Better data, better call-center scripts, better patient outcomes. Read more in "Wireless EMS in New York City," by Susan Crawford.
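
    Comparing the call type assigned at dispatch with what crews actually find can be expressed as a per-type mismatch rate. The sketch below uses made-up records and category names; it is not based on FDNY's actual data model.

```python
from collections import Counter

# Made-up records: (call type assigned at dispatch, condition found on scene).
calls = [
    ("cardiac", "cardiac"),
    ("cardiac", "anxiety"),
    ("cardiac", "cardiac"),
    ("breathing", "cardiac"),
    ("breathing", "breathing"),
    ("injury", "injury"),
]

# How many calls got each assigned type, and how many of those were wrong.
totals = Counter(assigned for assigned, _ in calls)
mismatches = Counter(assigned for assigned, found in calls if assigned != found)

# Share of calls per assigned type where the on-scene finding differed.
mismatch_rate = {
    call_type: mismatches[call_type] / totals[call_type]
    for call_type in totals
}

for call_type, rate in sorted(mismatch_rate.items(), key=lambda kv: -kv[1]):
    print(f"{call_type}: {rate:.0%} classified differently on scene")
```

    Call types with the highest mismatch rates would be the natural candidates for revised call-taker questions.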

  • How can ambulances respond to medical emergencies faster?

    Louisville Metro Emergency Medical Services (LMEMS) has sped up its ambulances’ turnaround times (the amount of time it takes from when an ambulance unloads a patient at a hospital until the crew becomes available to respond to another service call) in two ways. The first is by recording the time intervals for each step of its emergency responses with the Computer Aided Dispatch (CAD) system. This tool not only allows them to find which steps of the emergency response contain the greatest inefficiencies, but also holds ambulance crews accountable. The other is by monitoring the real-time location of the ambulances in the field. Using this tool, they can see the activity of their ambulance fleet, and communicate with crews to help them avoid any potential backups, or find out why straggling ambulances are not up to speed. By using data to identify obstacles to ambulance speed and hold ambulance drivers more accountable, the city has reduced its average ambulance turnaround time dramatically and saved $1.4 million. Read more in "Stretch Goals," by Matthew McClellan.
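
    Recording a timestamp at each step makes it straightforward to see which step consumes the most time. A minimal sketch with a made-up log; the field names and the two-step breakdown ("handoff", "restock") are illustrative assumptions, not the actual LMEMS CAD schema.

```python
from datetime import datetime
from statistics import mean

FMT = "%Y-%m-%d %H:%M"

# Made-up log: one row per hospital drop-off, with a timestamp per step.
runs = [
    {"crew": "M1", "arrived": "2014-05-01 10:00",
     "handoff_done": "2014-05-01 10:12", "available": "2014-05-01 10:30"},
    {"crew": "M2", "arrived": "2014-05-01 11:00",
     "handoff_done": "2014-05-01 11:20", "available": "2014-05-01 11:50"},
    {"crew": "M1", "arrived": "2014-05-01 13:00",
     "handoff_done": "2014-05-01 13:10", "available": "2014-05-01 13:25"},
]


def minutes_between(start: str, end: str) -> float:
    """Elapsed minutes between two timestamps in FMT."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60


# Average duration of each step across all runs.
step_averages = {
    "handoff": mean(minutes_between(r["arrived"], r["handoff_done"]) for r in runs),
    "restock": mean(minutes_between(r["handoff_done"], r["available"]) for r in runs),
}
average_turnaround = mean(minutes_between(r["arrived"], r["available"]) for r in runs)
slowest_step = max(step_averages, key=step_averages.get)

print(step_averages, average_turnaround, slowest_step)
```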

  • Can we use historical 911 and 311 call volumes to adequately staff and schedule call floors at various times of the day/week/month/year?

  • Which homes are least likely to have smoke detectors?

    New Orleans is looking to save lives by using data to predict which of the city’s buildings need to be equipped with smoke alarms. Using data from sources like the 2011 American Housing Survey, the 2013 American Community Survey, the Census, and NOFD, the city determined that poverty among building inhabitants, building age, and how long the residents have lived in a building are the best predictors that a structure may not have a smoke alarm installed. The city then determined that those over 65 and under 5 are most likely to die in building fires. It took the age data, added information about which areas of the city saw the highest concentration of fires over the previous five years, and mapped it. Finally, the likelihood of having a smoke alarm, residents' age, and fire-concentration data were combined to rank every zone of the city based on the need for smoke alarms. NOFD is using the data to focus its door-to-door program to install free smoke alarms. Read more in "Where There's Smoke, There's Data," by Charles Chieppo.
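
    The source does not say how the three factors were combined into a ranking. One simple way to picture it is an equal-weight average of scaled factors per zone; the sketch below assumes exactly that, and the zone names and values are made up.

```python
# Made-up zone-level inputs, each already scaled to the 0-1 range.
zones = {
    "Zone A": {"no_alarm_likelihood": 0.60,
               "vulnerable_age_share": 0.30,
               "fire_concentration": 0.80},
    "Zone B": {"no_alarm_likelihood": 0.20,
               "vulnerable_age_share": 0.50,
               "fire_concentration": 0.40},
    "Zone C": {"no_alarm_likelihood": 0.70,
               "vulnerable_age_share": 0.10,
               "fire_concentration": 0.20},
}


def need_score(factors: dict) -> float:
    # Assumption: equal-weight average. The city's actual method may differ.
    return sum(factors.values()) / len(factors)


# Highest need first.
ranking = sorted(zones, key=lambda zone: need_score(zones[zone]), reverse=True)

for zone in ranking:
    print(f"{zone}: {need_score(zones[zone]):.2f}")
```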

  • Where will medical needs be after a natural disaster?

    Direct Relief developed a social vulnerability index through demographic and housing information, and correlated those data against the constant stream of risk-assessment models generated by FEMA. Direct Relief could forecast where the medical needs would be, even before the storm made landfall. This data-driven modeling helped Direct Relief overcome the communications challenge in the first 48–72 hours after the storm. Health providers were completely out of contact—cell service and phone lines had gone down. There was no way for Direct Relief to know which providers needed assistance. With limited contact, Direct Relief used proxies, such as the electric-grid outage maps and whether local pharmacies were down in a particular area, to predict which groups needed assistance. Direct Relief volunteers were then sent to clinics in these vulnerable areas to confirm on-the-ground needs and coordinate medical-supply delivery. Read more in "Predictive Tools for Public Safety," by Stephen Goldsmith.

  • Where are rodent infestations most likely?

    In partnership with the Event and Pattern Detection Laboratory (EPD Lab) at Carnegie Mellon University, Chicago’s Department of Innovation and Technology (DoIT) is taking a predictive approach to the “war on rats.” By using data in innovative ways to help keep rat populations down, Chicago is putting to use a new strategy that can not only enhance rodent control initiatives, but also add precision to other strategies that address a wide range of urban problems. Read more in "Using Predictive Analytics to Combat Rodents in Chicago," by Sean Thornton.

  • How can we anticipate where vulnerable people will need help evacuating?
  • Can we identify power outages in real time and coordinate emergency services' response?
  • Can crowdsourced information be used to improve the delivery of pest control services?
  • How can police dispatch be made more efficient?

    The Atlanta Police Department wanted to reduce its dispatch time and use its staff more efficiently, so it turned to a team of data scientists for help. To find a solution to these problems, the team analyzed five years of data, or approximately five million dispatches. It quickly became clear to the team that the traditional notion of workload (dispatch volume) did not capture the complexity of the work observed during site visits. To weight dispatches more appropriately, the team developed a simple survey, which was completed by 30 randomly selected dispatchers. Weights were then applied to dispatch types using a distribution from the survey results, effectively turning the notion of workload into an index. Coupling this with several other predictors, the team was able to develop a model to test different scenarios. One scenario that has gained traction as a result of the analysis is the movement of administrative dispatches (e.g. extra job check in and check out) to a single dispatcher, which creates greater availability for other dispatchers to focus on priority dispatches. Read more in "Optimizing Atlanta’s 911 Systems with Data Science," by John Zimmerman and Jon Keen.
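
    The idea of turning workload into a weighted index can be sketched as follows. This is an illustration under stated assumptions, not the Atlanta team's model: the dispatch types, the 1-to-5 survey scale, and the use of a simple mean rating as the weight are all hypothetical.

```python
from statistics import mean

# Hypothetical survey: dispatchers rate the effort of each dispatch type from 1 to 5.
survey_ratings = {
    "priority_call": [5, 4, 5, 4],
    "traffic_stop": [2, 3, 2, 3],
    "admin_check_in": [1, 1, 2, 2],
}


def dispatch_weights(ratings):
    """Use the mean survey rating as each dispatch type's weight (an assumption)."""
    return {kind: mean(scores) for kind, scores in ratings.items()}


def workload_index(counts, weights):
    """Weighted workload: number of dispatches of each type times that type's weight."""
    return sum(counts[kind] * weights[kind] for kind in counts)


weights = dispatch_weights(survey_ratings)

# Two hypothetical shifts with the same raw volume (35 dispatches each).
shift_a = {"priority_call": 10, "traffic_stop": 20, "admin_check_in": 5}
shift_b = {"priority_call": 2, "traffic_stop": 8, "admin_check_in": 25}

print(sum(shift_a.values()), workload_index(shift_a, weights))
print(sum(shift_b.values()), workload_index(shift_b, weights))
```

    Both invented shifts handle 35 dispatches, but the weighted index separates them, which is the gap that raw dispatch volume could not capture.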

  • How can cities prioritize 911 calls during times of disaster?

    At the height of Hurricane Sandy, New York City’s 911 switchboard was receiving 20,000 calls an hour, many of which were not emergencies. The call volume led to slow response times and a lack of prioritization; there was no way to distinguish calls for downed tree branches from people in life-threatening situations. An important first step for future preparation is better educating citizens about what qualifies as a 911 call and what can be relegated to a non-emergency 311 call, an effort the city is undertaking now. The 311 hotline could be shifted to function primarily as a reporting mechanism, especially at times of disaster when city services and 911 phone lines are becoming overwhelmed. Even if an immediate answer isn’t guaranteed, text and data analytics of 311 texts, calls, and social media posts could allow these services to give first responders a better picture of where to focus their efforts. Citizens should be informed that any communication they send to the government would be registered in this way, bringing attention to their particular problems as well as those of their neighbors, while enhancing the entire city’s response capacity. Read more in "Getting the Lights on Faster," by Benjamin Weinryb Grohsgal and Stephen Goldsmith.

  • Which streets will need to be evacuated before a flood?

    Completed in 2010, Austin’s Flood Early Warning System (FEWS) combines flood maps, real-time data, and predictive modeling to make better evacuation decisions and plans in response to imminent flooding. The new system can predict which streets will become flooded and impassable up to 6 hours beforehand and map flooded areas and road closures, replacing an old system that only displayed flood danger levels of locations and often caused evacuations to take place once flooding had already occurred. Read more in "Forecasting Flooding in Austin," on Data-Smart City Solutions. 

  • How can we predict flash floods?

    In Europe, four cities are experimenting with X-band radar’s capacity to predict flash floods by counting every raindrop as part of the RainGain project. This innovation isn’t just about weather reporting: Real-time quantified rainfall data has the potential to help cities dynamically predict floods and deploy infrastructure to curb damage. Read more in "3 Ways to Optimize Urban Infrastructure" by Stephen Goldsmith.

  • How can we repair power outages faster after a natural disaster?

    Smart grids, and particularly smart electric meters, played a promising role in improving disaster response and the speed with which power could be restored after Superstorm Sandy downed power lines across the East Coast in 2012. That role was small-scale and local, since electric utilities' conversion to smart-grid technology has been slower than desired, but the potential is there for the technology to have a much larger impact as these systems are rolled out more widely. At best, phone calls and spotty service-outage reports can slowly piece together a hazy picture of the conditions of a power network. But smart meters, programmed to send out a "last call of distress" when power is lost, can automatically report service cuts. This gives a utility company instant access to regional maps of outages, allowing it to prioritize repair-crew mobilization and begin getting service back to customers without them even having to report an outage. Additionally, smart meters can automatically report coming back online when power is restored, eliminating unnecessary calls between the utility company and customers or follow-up service-crew visits. Repair crews can move on to the next repair rather than spending time checking on their last one, increasing efficiency and reducing system repair time considerably. Read more in "Getting the Lights on Faster" by Benjamin Weinryb Grohsgal and Stephen Goldsmith.

Regulation

  • Can we determine where unsafe housing problems are unlikely to be reported through 311?
  • How can we use analytics to prioritize accessibility inspections for building alterations, and make sure they are compliant with municipal building code and state accessibility requirements?
  • Who is most likely to be guilty of financial crimes and fraud?
  • Which properties cause the most problems?

    In 2011, Boston created the Problem Properties Task Force, a cross-agency committee that works to address and preempt community disorder by identifying the city’s most risk-prone and risk-causing addresses. Each of the eight contributing agencies furnishes its existing datasets, which are consolidated with data from the Mayor’s 24-Hour Hotline and then processed by the City’s High-Performance Analytic Appliance (HANA). Each member then contributes his or her department’s records and incoming public complaints to complement the analytics engine results in generating a comprehensive picture of the city’s most problematic residences. Then, using these sources of intelligence, the task force determines the appropriate action to take. This could mean increasing police surveillance, expediting enforcement proceedings by the Air Pollution Control Commission, levying charges to recoup public costs, or commencing foreclosure proceedings against a property owner with delinquent real estate taxes. Read more in "Problem Properties: A Preemptive Strategy Toward Neighborhood Stability," by Craig Campbell.

  • How can inspectors reduce response time to maintenance complaints?
  • Where are blighted properties located?

    Detroit, possibly the city hit hardest by blight and vacancies, is leading the way in attacking the problem with a data-driven approach. Detroit’s Blight Removal Task Force, deploying more than 200 people over 14 weeks, has successfully surveyed more than 99 percent of the city’s 380,217 properties. Information collected onsite, including photographs, lot characteristics, condition of structures and the owner, is sent wirelessly to the operations center, where it is checked while the team is still at the property. This information has helped identify candidates for demolition and areas of elevated safety concern. Read more in "Data Helps Calculate the True Costs of Blight," by Stephen Goldsmith.

  • How can we prioritize annual elevator safety inspections? For example, can we predict or identify which elevators pass every year and could be outsourced to a third party?
  • How can we predict vacant or abandoned buildings before they reach that status? To do so, can we use court foreclosure filings, US Postal “undeliverable” data, tax information, and data outside government, such as utility bill records?
  • Where is illegal grease disposal likely to be occurring?

    With assistance from the Mayor’s Office of Analytics using a hotspot analysis, New York City's Business Integrity Commission cross-referenced industry data on grease production with restaurant permit data and sewer back-up data from the Departments of Health (DOH) and Environmental Protection (DEP) to better target enforcement and predict illegal activity. Since launching this partnership effort with DEP in the fall of 2012, the commission has achieved a 30% increase in violations while reducing the manpower dedicated to grease enforcement by 60%. Read more in "Enforcement and Data," by Shari Hyman.

  • Which construction / renovation projects are the highest risk / should be inspected first?
  • Which buildings are the highest risk / should be inspected first?
  • Which equipment (such as boilers, elevators, cranes, vehicles, etc.) is the highest risk / should be inspected first?
  • Where are there unreported incidents of food poisoning?

    Diners who suffer food poisoning rarely report it through official channels, even though foodborne illness is a public health concern. However, sick, unhappy customers have an incentive to vent their complaints on Yelp, a popular app and website for local business reviews. New York City’s Department of Health and Mental Hygiene recently completed a pilot project in partnership with the company aimed at identifying unreported outbreaks of foodborne illness. Working with software developers at Columbia University, city researchers converted nearly nine months of Yelp reviews into machine-readable data. They were then able to pinpoint potentially hazardous establishments by reviews that included terms such as “sick,” “vomit” or “food poisoning.” Scanning 294,000 restaurant reviews in New York, the software flagged three restaurants that together produced 16 documented illnesses. When health inspectors subsequently visited these establishments, they discovered astonishing health code violations: improperly sanitized surfaces and bare-hand contact with ready-to-eat food at the first two, and live roaches and evidence of mice at the third. Read more in "How Social Media Listening Can Improve Public Health," by Stephen Goldsmith.
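
    The keyword screening described above can be sketched roughly as follows. This is not the software built for the pilot: the three search terms come from the text, but the restaurant names, the review text, and the two-review threshold are hypothetical.

```python
import re
from collections import Counter

# Terms named in the text; \b keeps "sick" from matching inside longer words.
ILLNESS_TERMS = re.compile(r"\b(sick|vomit\w*|food poisoning)\b", re.IGNORECASE)


def flag_restaurants(reviews, min_hits=2):
    """Return restaurants with at least `min_hits` reviews mentioning an illness term.

    The threshold is an assumption for illustration.
    """
    hits = Counter(name for name, text in reviews if ILLNESS_TERMS.search(text))
    return sorted(name for name, count in hits.items() if count >= min_hits)


# Hypothetical reviews, invented for this example.
reviews = [
    ("Cafe Uno", "Great pasta, friendly staff."),
    ("Cafe Uno", "I got sick an hour after lunch."),
    ("Diner Dos", "Food poisoning. Two of us were vomiting all night."),
    ("Diner Dos", "My partner felt sick after the chicken."),
    ("Bistro Tres", "Sick beats on the playlist, lovely brunch."),
]
flagged = flag_restaurants(reviews)
print(flagged)
```

    The "Sick beats" review shows why a keyword match can only nominate a restaurant for follow-up: the term appears, but no illness is reported. Requiring more than one matching review filters out some of this noise, and a person still needs to read what was flagged.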

  • Which restaurants are most likely to have code violations?

    In a recently completed pilot program, Chicago used analytics to improve the process by which health inspectors identify "critical violations" in food establishments, usually related to improper food temperature. Here's how it worked: The city processed relevant data to identify the variables that predict violations, developed a model, ran a simulation and then used this forecast to allocate inspections in a way that prioritized likely violators. This data-optimized trial method sped up the process of identifying critical violations by seven days, meaning that restaurant patrons are that much less likely to contract a foodborne illness. Read more in "Chicago's Data-Powered Recipe for Food Safety" by Stephen Goldsmith.

  • What variables affect inspector productivity and which can be most easily influenced? What distinctions can be made between inspectors who complete a high number of inspections and those who are at the bottom end?
  • Based on the relationship between inspections and violations, what building inspection regimens are most effective at preventing violations from occurring?
  • How many inter-agency inspections are conducted each year? Do they effectively detect current violations?
  • Which city debts are least likely to be paid?
  • Which taxpayers are least likely to pay?
  • Which businesses are likely to be underpaying their taxes? How can we increase the productivity of auditors?

    In New York City, a Finance Department auditing team decided to use analytics to increase the productivity of auditors reviewing companies thought to be underpaying their taxes. Using sophisticated data analytics, the commissioner instructed his department to look for patterns—identifying individuals who had businesses similar to others but who stood out as outliers on taxes paid. In so doing, the team reduced the portion of audit cases closing without change: from 37 percent to 22 percent over three years. This represents a 40 percent increase in productivity for the department and a 100 percent reduction of government intrusion for the thousands of companies that would have been catapulted into the audit process, with an end result of no change on their returns. Read more in "Making Data Matter in Administrative Systems," by Stephen Goldsmith.
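
    One simple way to look for businesses that resemble their peers but stand out on taxes paid is sketched below. The Finance Department's actual method is not described in the text, so everything here is an assumption: the business names, the figures, the comparison of tax-to-revenue ratios within a sector, and the cutoff of half the sector median.

```python
from collections import defaultdict
from statistics import median


def flag_low_payers(businesses, cutoff=0.5):
    """Flag businesses whose tax-to-revenue ratio is below `cutoff` times
    the median ratio of their own sector. The cutoff is an assumption."""
    ratios_by_sector = defaultdict(list)
    for b in businesses:
        ratios_by_sector[b["sector"]].append(b["tax_paid"] / b["revenue"])
    sector_median = {s: median(r) for s, r in ratios_by_sector.items()}
    return [
        b["name"]
        for b in businesses
        if b["tax_paid"] / b["revenue"] < cutoff * sector_median[b["sector"]]
    ]


# Hypothetical businesses, invented for this example.
businesses = [
    {"name": "Restaurant A", "sector": "restaurant", "revenue": 1_000_000, "tax_paid": 80_000},
    {"name": "Restaurant B", "sector": "restaurant", "revenue": 500_000, "tax_paid": 42_500},
    {"name": "Restaurant C", "sector": "restaurant", "revenue": 800_000, "tax_paid": 20_000},
    {"name": "Restaurant D", "sector": "restaurant", "revenue": 600_000, "tax_paid": 45_000},
    {"name": "Retailer E", "sector": "retail", "revenue": 2_000_000, "tax_paid": 60_000},
    {"name": "Retailer F", "sector": "retail", "revenue": 1_500_000, "tax_paid": 48_000},
    {"name": "Retailer G", "sector": "retail", "revenue": 1_000_000, "tax_paid": 29_000},
]
audit_candidates = flag_low_payers(businesses)
print(audit_candidates)
```

    Only Restaurant C is flagged in this invented data. The retailers pay a similar share of revenue to Restaurant C, but they are in line with other retailers, so a peer-group comparison leaves them alone. Targeting of this kind is what would reduce the number of audits that close without change.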

  • What city blocks need more inspection enforcement?
  • Which businesses are most likely to be violating weights and measures?
  • How can we determine what businesses will have over-occupancy issues, including multiple incidents of over-crowding?
  • How can we tap social media for information on illegal businesses?
  • What property owners, architects, developers, businesses and landlords need more regulatory enforcement? 
  • How can we use social media to ensure licensees are conducting legal business?
  • How can data identify illegal tree removal?

    Washington, D.C.'s Urban Forestry Administration used lidar data to identify illegal tree removal based on the original height of the trees. This vastly improved enforcement of permitting laws. Read more in "How D.C. Grew a Data-Driven Tree Strategy" by Stephen Goldsmith.

  • Can we predict which stores sell cigarettes to youth?
  • How can we target stores that sell outdated food or expired baby formula?
  • Does the order of inspections (building, health, or fire) increase the rate of violation?

This post has been updated over time with additional examples.