Despite being one of the richest countries in the world, in Qatar, digital maps are lagging behind. While the country is adding new roads and constantly improving old ones in preparation for the 2022 FIFA World Cup, Qatar isn’t a high priority for the companies that actually build out maps, like Google.
“While visiting Qatar, we’ve had experiences where our Uber driver can’t figure out how to get where he’s going, because the map is so off,” Sam Madden, a professor at MIT’s Department of Electrical Engineering and Computer Science, said in a prepared statement. “If navigation apps don’t have the right information, for things such as lane merging, this could be frustrating or worse.”
Madden’s solution? Quit waiting around for Google and feed machine learning models a whole buffet of satellite images. It’s faster, cheaper, and way easier to obtain satellite images than it is for a tech company to drive around grabbing street-view photos. The only problem: Roads can be occluded by buildings, trees, or even street signs.
So Madden, along with a team composed of computer scientists from MIT and the Qatar Computing Research Institute, came up with RoadTagger, a new piece of software that can use neural networks to automatically predict what roads look like behind obstructions. It’s able to guess how many lanes a given road has and whether it’s a highway or residential road.
RoadTagger uses a combination of two kinds of neural nets: a convolutional neural network (CNN), which is mostly used in image processing, and a graph neural network (GNN), which helps to model relationships and is useful with social networks. This system is what the researchers call “end-to-end,” meaning it’s only fed raw data and there’s no human intervention.
First, raw satellite images of the roads in question are input to the convolutional neural network. Then, the graph neural network divides up the roadway into 20-meter sections called “tiles.” The CNN pulls out relevant road features from each tile and then shares that data with the other nearby tiles. That way, information about the road is sent to each tile. If one of these is covered up by an obstruction, then, RoadTagger can look to the other tiles to predict what’s included in the one that’s obfuscated.
Parts of the roadway may only have two lanes in a given tile. While a human can easily tell that a four-lane road, shrouded by trees, may be blocked from view, a computer normally couldn’t make such an assumption. RoadTagger creates a more human-like intuition in a machine learning model, the research team says.
“Humans can use information from adjacent tiles to guess the number of lanes in the occluded tiles, but networks can’t do that,” Madden said. “Our approach tries to mimic the natural behavior of humans … to make better predictions.”
The results are impressive. In testing out RoadTagger on occluded roads in 20 U.S. cities, the model correctly counted the number of lanes 77 percent of the time and inferred the correct road types 93 percent of the time. In the future, the team hopes to include other new features, like the ability to identify parking spots and bike lanes.