Thanks to a couple of salivating canines, robot fry cooks can whip up a pretty impressive hot dog. That’s because both the dogs and the robots have a connection to reinforcement learning, a concept in human psychology that uses reinforcement, or reward, to condition a given organism’s biological responses.
A new paper in the journal Science Robotics describes the work of Boston University researchers Xiao Li, Zachary Serlin, Guang Yang, and Calin Belta. To teach the two robots, called Baxter and Jaco, to flip the beef franks, the team relied on a kind of reinforcement learning that differs from the one in psychology.
In computer science lingo, reinforcement learning is a branch of machine learning in which software learns about its surroundings by repeating trials over and over and rewarding the successful attempts. Once that’s been worked out on the software side (here, through myriad simulations), the machine should be able to perform in the real world the functions its software has gotten down pat.
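To make the trial-and-reward idea concrete, here is a minimal sketch of tabular Q-learning, one common textbook form of reinforcement learning. It is not the method from the paper: the corridor environment, the reward of 1.0 at the goal, and every name and parameter value below are assumptions chosen purely for illustration.

```python
import random


def train(episodes=500, n_states=5, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning on a toy 1-D corridor (hypothetical environment).

    The agent starts in state 0 and earns a reward of 1.0 for reaching
    the last state. Actions: 0 = step left, 1 = step right.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the length of each trial
            # Explore at random sometimes (or when undecided); otherwise
            # repeat whichever action has looked better so far.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Nudge the estimate toward the reward plus the discounted
            # value of the best action available from the next state.
            best_next = 0.0 if next_state == goal else max(q[next_state])
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = next_state
            if state == goal:
                break
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [0 if left > right else 1 for left, right in q[:-1]]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

After enough repeated trials, the learned values should favor stepping right in every state, since that is the only route to the reward. A real robot task has far larger state and action spaces, which is one reason most of the trials happen in simulation.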
This is how the team managed to guide the real robots through the task of making and assembling a hot dog. The approach builds on prior research in robotic reinforcement learning applied to human food-making tasks.
“This work is an attempt to bridge the gap between symbolic knowledge representation and reasoning with optimization-based planning while allowing the overall system to continuously and safely improve by interacting with its environment,” Boston University doctoral fellow and first author of the paper, Xiao Li, said in a press statement. “We hope that such an architecture can help us impart our knowledge and objectives to the robot, and improve our understanding of what it has learned, thus leading to more capable robotic systems.”
In the field of psychology, reinforcement learning goes back to Russian physiologist Ivan Pavlov. In his famous experiment, Pavlov trained dogs to associate the ring of a dinner bell with food. After many trials in which the bell rang just before the food arrived, the dogs began to salivate at the ding-dong of the bell alone.
So how did the Boston University scientists take advantage of reinforcement learning in the realm of machine learning? They put together a proof-of-concept task by training Baxter and Jaco to cook, assemble, and serve hot dogs.
Of course, they did much of the heavy lifting through simulation trials. These were set up through a set of formulas that the researchers used to specify and combine tasks like:
🌭“Pick up hot dog and place on the grill.”
🌭“Always avoid collisions,” to meet safety requirements.
🌭“You cannot pick up another hot dog if you are already holding one,” to incorporate general prior knowledge that we take for granted as humans.
To train Jaco and Baxter through simulation trials, the scientists relied heavily on what they call a “formal specification language” for telling the software what to do. The goal is to create easy-to-understand task descriptions.
And in the hot-dog cooking and assembling task, the authors found their formal specification language was a success. In the paper, they write that it was “easily interpretable from the beginning because the language is very similar to plain English.”
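As a rough illustration of how rules like the three listed above might be written down and combined, here is a toy sketch that checks a recorded run against "always" and "eventually" style conditions. The article does not show the researchers' actual syntax, so the `Step` fields, the operator names, and the sample runs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One time step of a simulated run (hypothetical fields)."""
    holding: bool = False    # already holding a hot dog at the start of the step
    picks_up: bool = False   # attempts a pick-up during the step
    collided: bool = False   # a collision happened during the step
    on_grill: bool = False   # a hot dog is on the grill


def always(condition):
    """The condition must hold at every step of the run."""
    return lambda run: all(condition(step) for step in run)


def eventually(condition):
    """The condition must hold at some step of the run."""
    return lambda run: any(condition(step) for step in run)


def both(*specs):
    """Combine several specifications; all of them must be satisfied."""
    return lambda run: all(spec(run) for spec in specs)


# "Pick up hot dog and place on the grill."
grill_reached = eventually(lambda s: s.on_grill)
# "Always avoid collisions."
no_collisions = always(lambda s: not s.collided)
# "You cannot pick up another hot dog if you are already holding one."
one_at_a_time = always(lambda s: not (s.holding and s.picks_up))

task = both(grill_reached, no_collisions, one_at_a_time)

good_run = [Step(picks_up=True), Step(holding=True), Step(on_grill=True)]
bad_run = [Step(picks_up=True), Step(holding=True, picks_up=True), Step(on_grill=True)]

print(task(good_run), task(bad_run))
```

Each rule stays readable on its own, and `both` stitches them into a single task description. In a learning setup, a check like this could serve as the signal for which simulated attempts count as successful, though that pairing is an assumption here rather than something the article spells out.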
Baxter and Jaco aren’t alone in the world of robotic grill masters. Sony, for its part, is working with Carnegie Mellon University to build robots with the dexterity and precision required to handle food prep, cooking, and delivery. Robotic manipulation of foodstuffs is notoriously difficult, but CMU already proved itself back in 2013 with Herb, a robot that can very carefully separate an Oreo cookie from its creamy filling.
That ties in nicely with a robot named Vincenzo, who works for Zume Pizza in Mountain View, California, to make the workplace safer for its human coworkers. Vincenzo removes the piping hot pizzas from the ovens to help reduce human injuries, which in pizza joints typically range from minor to severe burns.
And there’s also a patty-flipping robot called, of course, Flippy that mans the grill at a California fast-food joint. Flippy uses thermal imaging and 3D optics to sense when the burger is cooked to the desired temperature before flipping and removing it. Here’s hoping we get a Philly Cheesesteak bot next.