Back in 2008, theoretical physicist Stephen Hawking used a speech synthesizer program on an Apple II computer to “talk.” He had to use hand controls to work the system, which became problematic as his case of Lou Gehrig’s disease progressed. He later upgraded to a new device, called a “cheek switch,” which detected when Hawking tensed the muscle in his cheek and helped him speak, write emails, or surf the Web.
Now, neuroscientists at the University of California, San Francisco have come up with a far more advanced technology—an artificial intelligence program that can turn thoughts into text. In time, it has the potential to help millions of people with speech disabilities communicate with ease.
“We exploit the conceptual similarity of the task of decoding speech from neural activity to the task of machine translation; that is, the algorithmic translation of text from one language to another,” the scientists wrote in a new paper published in the scientific journal Nature Neuroscience.
They’ve taken an AI approach that is akin to translating text between languages. The underlying idea is the same in both cases, since the goal is to convert one sequence of arbitrary length into another, but the inputs differ: neural signals from the brain instead of text.
To test their hypothesis, the researchers ran human trials. The scientists implanted electrodes into the brains of four participants with epilepsy to monitor their speech. Each person then read sentences aloud from one of two datasets: a set of picture descriptions, composed of 30 sentences and 125 unique words, or a second, larger set, which contained 460 sentences and about 1,800 unique words.
Each participant read 50 sentences aloud multiple times, including lines like “Tina Turner is a pop singer” and “there is chaos in the kitchen.” As each person spoke, the researchers recorded their brain activity. They then fed the data into a machine learning algorithm that converted the recorded brain activity into a string of numbers encoding each sentence. In another portion of the system, those numbers were converted back into a sequence of words.
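To make the two-stage idea concrete, here is a deliberately simplified sketch in Python. It is not the researchers' model: the mean-pooling "encoder," the nearest-match "decoder," the function names, and the two-number "recordings" are all assumptions made up for illustration. It only shows the shape of the pipeline, in which a variable-length signal becomes a fixed string of numbers that is then turned back into words.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def encode(signal: Sequence[Vector]) -> Vector:
    """Collapse a variable-length recording into one fixed-length vector.

    Toy stand-in for a learned encoder: it simply averages over time.
    """
    n_frames = len(signal)
    n_features = len(signal[0])
    return [sum(frame[i] for frame in signal) / n_frames
            for i in range(n_features)]


def build_codebook(training: Dict[str, Sequence[Vector]]) -> Dict[str, Vector]:
    """Store one encoded vector per known sentence."""
    return {sentence: encode(signal) for sentence, signal in training.items()}


def decode(code: Vector, codebook: Dict[str, Vector]) -> List[str]:
    """Return the words of the known sentence whose code is closest.

    Toy stand-in for a learned decoder: nearest match by squared distance.
    """
    def squared_distance(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    best = min(codebook, key=lambda s: squared_distance(code, codebook[s]))
    return best.split()


# Hypothetical two-feature "recordings"; real neural data is far richer.
training = {
    "there is chaos in the kitchen": [[0.9, 0.1], [0.7, 0.3]],
    "the oasis was a mirage": [[0.1, 0.8], [0.3, 1.0]],
}
codebook = build_codebook(training)

# A new recording of a different length than the training examples.
new_recording = [[0.85, 0.15], [0.8, 0.2], [0.7, 0.3]]
print(" ".join(decode(encode(new_recording), codebook)))
```

A lookup like this can only ever return sentences it has already seen, which is one way to picture why the size of the sentence set matters so much.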
At the outset, the system came up with some nonsensical phrases, like “the spinach was a famous singer”; lines with improper grammar, like “several adults the kids was eaten by”; and some philosophical-sounding sentences, such as “the oasis was a mirage.” Over time, the system improved as the researchers gave it the original sentences that the participants had read aloud to compare its output against.
In one case, the system got 97 percent of the sentences correct, representing fewer errors than the average human transcriber. Still, the algorithm is only processing a small number of sentences and words compared to what a user would ultimately desire.
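For a sense of how transcription errors can be counted, one common measure is the word error rate: the number of word substitutions, insertions, and deletions needed to turn the system's output into the intended sentence, divided by the length of the intended sentence. The sketch below assumes that measure; the article does not say exactly how the researchers scored their system. Pairing the two example sentences is also an assumption made purely for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference sentence must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative pairing: four of the six words differ.
rate = word_error_rate("Tina Turner is a pop singer",
                       "the spinach was a famous singer")
print(f"{rate:.2f}")  # prints 0.67
```

A perfect transcription scores 0, and lower is better.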
The system also currently works only on speech spoken aloud, meaning those who suffer from speech disorders caused by muscle paralysis won’t benefit just yet.