By Nyasha Weinberg • November 14, 2016

Managing London’s transport infrastructure is no easy task, but a successful city requires well-run transportation. Each day, residents take 24 million journeys on underground trains and six million commuters ride one of 8,500 buses. Waterloo station alone serves 100 million passengers each year. With a population projected to rise by 2.35 million between 2014 and 2041, London, like many other cities worldwide, must expand capacity to meet the burgeoning demand.

The transportation that powers the city depends on back-office data on journey planning, status, disruption, construction, timetables, embarkation points, routes, lines, and fares. The huge demands placed on services and infrastructure require holistic thinking about the network. The complex system comprises underground trains, aboveground trains, buses, bikes, boats, and even a gondola. Transport for London (TfL), the agency responsible for all public transit in London, is demonstrating how cities can use data to optimize their transport systems, making travel smarter and more efficient in an effort to improve the quality of service delivered to Londoners.


The multiple dynamic parts of London’s transportation network necessitate the clever use of technology to maximize system potential. TfL collects a great deal of data through its ticketing services, and, in 2003, introduced the Oyster card, one of the first smart ticketing initiatives. The bus system went completely cash-free in 2014, after the discovery that just one percent of journeys were made using cash.

By transitioning to smart payment methods including Oyster and contactless credit/debit cards, TfL is now able to collect information on the location, time, and date of journeys. This information is linked to the card, encrypted, and then stored for eight weeks, after which links to individual passengers are deleted to protect privacy. This allows TfL to retain anonymized data for research. One benefit of storing information on journey characteristics is that it enables TfL to provide excellent customer service. Customers can search their individual travel records, identify where incorrect charging has taken place, or automatically collect refunds for journeys that were delayed.
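The eight-week rule described above can be sketched in a few lines of Python. This is an illustration only: the article does not say how TfL implements retention, and the `Journey` record, its field names, and the `anonymize_expired` function are hypothetical.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta
from typing import List, Optional

# Retention window taken from the article: eight weeks.
RETENTION = timedelta(weeks=8)


@dataclass(frozen=True)
class Journey:
    """Hypothetical journey record; field names are assumptions."""
    card_id: Optional[str]  # None once the link to the card is removed
    location: str
    tapped_at: datetime


def anonymize_expired(journeys: List[Journey], now: datetime) -> List[Journey]:
    """Drop the card link from journeys older than the retention window.

    The journey itself (location and time) is kept for research use.
    """
    return [
        replace(j, card_id=None) if now - j.tapped_at > RETENTION else j
        for j in journeys
    ]


# Example: one journey older than eight weeks, one recent.
now = datetime(2016, 11, 14)
old = Journey("card-1", "Waterloo", datetime(2016, 9, 1))
recent = Journey("card-1", "Putney Bridge", datetime(2016, 11, 1))
cleaned = anonymize_expired([old, recent], now)
```

In this sketch the older journey keeps its location and timestamp but loses its `card_id`, while the recent one is left untouched so that the customer can still look it up.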

Data from smart ticketing systems can also inform service provision. Approximating daily flows of passengers is a complex task. By collecting information on journeys on buses, trains and other forms of transport in a particular area, TfL can approximate the extent of service disruption generated by any particular event. For example, the closure of Putney Bridge, a large bridge in West London, for repairs could have triggered serious disruption. However, because the city collects data from Oyster cards, analysts were able to approximate the number of journeys taken across the bridge on a typical day (56,000) and arrange transfer facilities to ensure that customers weren’t charged twice. The city fused Oyster card and bus location data to obtain an accurate picture of pedestrian flows. Finally, those customers who had provided contact details to TfL were sent targeted emails with information on alternative routes.

Changes in services can also allow for the collection of novel data, including assessment of the success of new initiatives. TfL recently offered a ‘bus hopper’ fare to prevent passengers from being charged twice when traveling on different bus routes. Patterns in travel can be detected by combining information on the number of customers using the bus hopper fare and behavioral analysis. Lauren Sager-Weinstein, the head of analytics for TfL, explained that the agency is now using “machine learning to segment customers with different characteristics” to learn more about their behavior and plan services accordingly. Categories analyzed include visitors, people who own a car, and commuters. The city can then use this information to inform the direction of future policy.

Frequently, this work is complex. Passengers present their cards only when boarding a bus, and therefore their exit stop isn’t recorded. In conjunction with the Massachusetts Institute of Technology (MIT), TfL built an algorithm pairing information from the bus and train networks, both of which run on Oyster and contactless credit cards, to infer the exit point for passengers. Using this data, TfL can now anticipate when buses are likely to be crowded in order to change the number of buses available and change stop locations to minimize walk times. The information collected led to a complete restructuring of the network in one particular location.
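The article does not describe the algorithm itself. One widely used heuristic for this kind of problem, often called trip chaining, assumes that a passenger left the bus at the stop closest to where the same card was used next. The Python sketch below illustrates that heuristic under stated assumptions; it is not the TfL and MIT method, and the `Stop` and `Tap` records, the flat coordinate grid, and the 750-metre walking limit are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Stop:
    name: str
    x: float  # metres east on a hypothetical local grid
    y: float  # metres north on the same grid


@dataclass(frozen=True)
class Tap:
    route: str  # bus route or rail line where the card was presented
    stop: Stop  # boarding stop or entry station


def infer_alighting(
    tap: Tap,
    next_tap: Optional[Tap],
    route_stops: Dict[str, List[Stop]],
    max_walk_m: float = 750.0,  # assumed walking limit, not a TfL figure
) -> Optional[Stop]:
    """Guess the exit stop of a bus trip from the card's next tap.

    Returns None when there is no later tap, no downstream stop, or the
    nearest downstream stop is too far from the next tap to be plausible.
    """
    if next_tap is None:
        return None
    stops = route_stops[tap.route]
    # Only stops after the boarding stop can be exit points.
    downstream = stops[stops.index(tap.stop) + 1:]
    if not downstream:
        return None

    def walk(s: Stop) -> float:
        return hypot(s.x - next_tap.stop.x, s.y - next_tap.stop.y)

    nearest = min(downstream, key=walk)
    return nearest if walk(nearest) <= max_walk_m else None


# Example: a three-stop bus route followed by a tap at a nearby station.
a, b, c = Stop("A", 0, 0), Stop("B", 1000, 0), Stop("C", 2000, 0)
station = Stop("Station", 2100, 100)
routes = {"R1": [a, b, c]}
guess = infer_alighting(Tap("R1", a), Tap("rail", station), routes)
```

Here the passenger boards at stop A and next appears at a station about 140 metres from stop C, so the sketch infers C as the exit stop. If the next tap were several kilometres from every downstream stop, the function would return no inference rather than guess.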

Similar analytical work assists with capacity planning for scheduled disruptions, including large Premier League football matches, concerts, and other cultural events, as well as unscheduled changes to services such as signal failures or station closures. Preparing for such events is a crucial part of managing any large city, particularly one as vibrant as London. By mapping passengers’ travel routes using information from smart ticketing, TfL can improve both network planning and operations management. Finally, fusing all of the data has helped TfL to create innovative visualizations such as a collision map.


TfL doesn’t only generate data to study passenger behavior; it also uses data and technology to direct the repair and maintenance work that accompanies a vast network of tracks and fleet of vehicles. The city is now using mobile solutions for the management of repairs and maintenance teams. More than 500 frontline personnel are managing and documenting their work on iPhones and iPads. Moving to a mobile system makes data management more efficient by eliminating wasteful paper-based products and streamlining information management. Moreover, it saves time by removing the need to return to the head office between jobs.

Integrating technology into asset management and inspection processes also increases the quality and quantity of data on individual assets and allows TfL to use predictive analytics to try to determine where the system is likely to fail. London has already made investments in machine learning and condition monitoring to predict where replacements are needed, improving safety.

Road Data

TfL is looking to understand more about traffic on the roads, which involves analyzing the activity of vehicles outside of the agency’s control. The agency hopes this will minimize bus delays and improve service provision during rush hour. Traffic counters gather information for use by traffic ops teams who are developing a real-time view of traffic circulation around the network.

Data presentation has been an important part of the learning curve for TfL. The first front-end tool for locating buses was rolled out around five years ago. However, controllers complained about information overload. As a consequence, a separate research project was initiated to consolidate the information to provide easily discernible early indicators of delays. Encouraging the iterative creation of tools ensures that the data collected is put to use.

Third Parties

By creating a unified API, TfL has enabled the development of over 500 apps by third parties. Even though the initial task of releasing that data on a platform took a large amount of work, the results have included useful visualizations; these apps are used by 42% of Londoners to navigate the city.

Next Steps

Sager-Weinstein identified roads as an area where additional data would be helpful. At present, only limited information is collected on walkers and cyclists, and although digital counters are used to count flows in some locations, these don’t give TfL enough fine-grained detail to construct a complex picture of road utilization.

Moreover, TfL lacks information on how people are using the streets. Are they headed to work? Walking for leisure? Shopping? Developing tools to improve the knowledge of pedestrian behavior would supplement work on other aspects of the network to create a complete picture of movement around the city.

In the next 35 years, urban centers worldwide are predicted to add another 2.5 billion residents. Given the enormous transport demands created by this swelling population, it is imperative that cities like London lead efforts to develop new solutions to complex transportation problems.

Fortunately, there is much that smaller cities can learn. Implementing changes in London always proves extremely difficult given the complexity of a system whose first tracks were laid in 1863. Smaller cities with fewer foundational challenges have greater opportunities for holistic change and should look to learn from some of TfL’s successes.