Raiders of the lost data

Posted: November 30, 2018

In the late 1960s, Ohio State Professor Joseph Treiterer undertook what would become seminal research in the study of traffic and how people drive. Over fifty years later, Treiterer's work is still cited dozens of times each year. The longevity of the work is due in part to the massive undertaking of extracting vehicle trajectories (i.e., direction and speed of travel). A specially modified airplane took one photo per second of rush-hour traffic on I-71 in Columbus on one morning in July 1967. Over the next few years, his study tracked over 200 vehicles as they traveled three miles through that traffic jam, manually measuring the location of each vehicle at every second.

In today's smart mobility era, predicting other traffic participants' trajectories is a crucial task for autonomous vehicle programming.

By the time Associate Professor Benjamin Coifman joined the Ohio State faculty in 1999, Treiterer's data had attained almost mythical status in the traffic flow theory community. However, the data set was never shared with the research community and, at some point, was lost without a trace. It was not until 2017 that the situation would change.

Lizhe Li and Coifman discuss their research
"In traffic flow studies, we want to model the acceleration of individual vehicles as they respond to their respective leader. But collecting real traffic data at this resolution is challenging to say the least," Coifman stated. "At freeway speeds, a vehicle can travel 100 feet per second while potentially responding to dozens of other vehicles and we want to be able to measure these interactions with an accuracy of under one foot."
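To illustrate why sub-foot accuracy matters, consider a rough sketch of how speed and acceleration might be estimated from positions sampled once per second. This is not the method used by Coifman's team; the function name and the position values below are hypothetical, chosen only to show how a one-foot measurement error carries through to the acceleration estimate.

```python
def finite_differences(positions_ft, dt_s=1.0):
    """Estimate speeds (ft/s) and accelerations (ft/s^2) from
    positions sampled every dt_s seconds, using simple differences."""
    speeds = [(b - a) / dt_s for a, b in zip(positions_ft, positions_ft[1:])]
    accels = [(b - a) / dt_s for a, b in zip(speeds, speeds[1:])]
    return speeds, accels


# Hypothetical positions (feet) of one vehicle, one sample per second,
# starting near 100 ft/s and slowing gently.
positions = [0.0, 100.0, 198.0, 293.0, 385.0]
speeds, accels = finite_differences(positions)

# The same trajectory with a single one-foot error in the third sample.
noisy_positions = [0.0, 100.0, 199.0, 293.0, 385.0]
noisy_speeds, noisy_accels = finite_differences(noisy_positions)

# Largest change in estimated acceleration caused by the one-foot error.
max_accel_error = max(abs(a - b) for a, b in zip(accels, noisy_accels))

print("speeds:", speeds)
print("accelerations:", accels)
print("accelerations with 1 ft error:", noisy_accels)
print("max acceleration error (ft/s^2):", max_accel_error)
```

In this made-up example, a single one-foot error shifts the estimated acceleration by as much as 2 feet per second squared, which is comparable in size to the gentle braking being measured.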

A few years ago, Coifman was working with the 2005 Next Generation Simulation (NGSIM) data published by the U.S. Department of Transportation (USDOT). His team recognized major inconsistencies in the trajectory data and later determined that the image processor used to track the vehicles was faulty, yielding unrealistic trajectories. Walking in the footsteps of Treiterer, Coifman's group began the task of manually re-extracting the NGSIM trajectories from the original video. The team used an automated tracker for the majority of its work, while humans validated the trajectories with the aid of specially designed user interfaces.

Then inspiration struck. "With the tools we developed to clean the NGSIM data, why not see what we could do with Treiterer's data?" Coifman asked. Although the dataset was lost, printouts of the vehicle trajectories existed in the original reports that Treiterer submitted to the Ohio Department of Transportation (ODOT) and USDOT. Zona Kahkonen Keppler, specialist in Research Services at ODOT, quickly tracked down the reports in the agency's archives and provided the Coifman group with high-resolution scans.

The process of recovering Treiterer's trajectories was the subject of Resurrecting the Lost Vehicle Trajectories of Treiterer and Myers with New Insights into a Controversial Hysteresis, a paper recently published in Transportation Research Record: Journal of the Transportation Research Board (TRR Journal).

Coifman's team, including PhD candidate Lizhe Li, created algorithms to trace the printed trajectories, overcoming distortion in the original print with help from Dr. Wen Xiao, coauthor and colleague from Newcastle University. Once more, the team was able to rely on modern technology to complete most of the processing. Still, the researchers themselves completed tasks too complex for computers and verified the final results.

The newly recovered data finally allowed the researchers to reproduce Treiterer's analysis, which was known not only for the scale of the data collection, but also for the unexpected hysteresis (a looping pattern in the data) that it noted. University of California, Berkeley Professor Carlos Daganzo had previously stated that Treiterer's hysteresis has been "a longstanding source of speculation in the transportation literature, often used to justify questionable models." With the resurrected trajectories, Coifman's team was able to reexamine the vehicles underlying the hysteresis and quell the speculation.

Rather than arising from car-following behavior, it turns out that the enigmatic progression arose from a combination of lane change maneuvers and unremarkable transitions into and out of traffic congestion. These confounding factors would have been difficult to spot decades ago. With modern numerical analysis software, however, Coifman's team was able to examine the data efficiently. A plot that would have taken Treiterer's team weeks to produce could now be generated in under one second.

With the publication of the paper, Treiterer's trajectories were finally released to the research community after 50 years. And with the growing interest in autonomous vehicles, Coifman believes that this trajectory data set is as important as ever. "Right now we still do not fully understand how people drive, so is it realistic to think we can effectively automate traffic?" he observed.

By analyzing Treiterer's data to study how humans drive, researchers will gain critical insights into how to automate the driving process. Coifman believes that by understanding how humans follow the leader on the road, we will ultimately be able to teach autonomous vehicles to reproduce what humans do correctly while avoiding humans' problematic driving behaviors.

In August 2018, Coifman's paper was awarded the Greenshields Prize from the Transportation Research Board (TRB) Committee on Traffic Flow Theory and Characteristics. The Greenshields Prize is the highest honor given annually by the committee. Part of the National Academies of Sciences, Engineering and Medicine, TRB's mission is to provide innovative, research-based solutions to improve transportation.

by Kevin Satterfield, Dept. of Civil, Environmental and Geodetic Engineering