You Do the Thinking, the Computer Does the Talking

For many people who are paralyzed and unable to speak, signals of what they'd like to say hide in their brains.

No one has been able to decipher those signals directly.

But three research teams recently made progress in turning data from electrodes surgically placed on the brain into computer-generated speech.

Using computational models known as neural networks, they reconstructed words and sentences that were, in some cases, intelligible to human listeners.

None of the efforts, described in papers in recent months on the preprint server bioRxiv, managed to re-create speech that people had merely imagined.

Instead, the researchers monitored parts of the brain as people either read aloud, silently mouthed speech, or listened to recordings.

But showing the reconstructed speech is understandable is "definitely exciting," says Stephanie Martin, a neural engineer at the University of Geneva in Switzerland who was not involved in the new projects.

People who have lost the ability to speak after a stroke or disease can use their eyes or make other small movements to control a cursor or select on-screen letters.

(Cosmologist Stephen Hawking tensed his cheek to trigger a switch mounted on his glasses.)

But if a brain-computer interface could re-create their speech directly, they might regain much more: control over tone and inflection, for example, or the ability to interject in a fast-moving conversation.

The hurdles are high.

"We are trying to work out the pattern of … neurons that turn on and off at different time points, and infer the speech sound," says Nima Mesgarani, a computer scientist at Columbia University.

"The mapping from one to the other is not very straightforward."

How these signals translate to speech sounds varies from person to person, so computer models must be "trained" on each individual.

And the models do best with extremely precise data, which requires opening the skull.

Researchers can do such invasive recording only in rare cases.

One is during the removal of a brain tumor, when electrical readouts from the exposed brain help surgeons locate and avoid key speech and motor areas.

Another is when a person with epilepsy is implanted with electrodes for several days to pinpoint the origin of seizures before surgical treatment.

"We have, at maximum, 20 minutes, maybe 30," for data collection, Martin says. "We're really, really limited."

The groups behind the new papers made the most of precious data by feeding the information into neural networks, which process complex patterns by passing information through layers of computational "nodes."

The networks learn by adjusting connections between nodes.

In the experiments, networks were exposed to recordings of speech that a person produced or heard and data on simultaneous brain activity.

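The training step can be pictured as a regression problem: given the brain activity at some moment, predict the audio features recorded at that same moment. Below is a minimal, hypothetical sketch in PyTorch, where the synthetic tensors `brain` and `audio` stand in for real electrode features and spectrogram frames; the actual models in the papers were considerably more elaborate.

```python
import torch
from torch import nn

# Hypothetical stand-ins: 1,000 time points of brain activity
# (128 electrode features each) paired with 32-band audio
# spectrogram frames recorded at the same moments.
brain = torch.randn(1000, 128)   # neural features per time point
audio = torch.randn(1000, 32)    # target spectrogram frames

# A small feed-forward network: layers of "nodes" whose
# connections (weights) are adjusted during training.
model = nn.Sequential(
    nn.Linear(128, 256),
    nn.ReLU(),
    nn.Linear(256, 32),
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(100):
    pred = model(brain)           # predicted spectrogram frames
    loss = loss_fn(pred, audio)   # mismatch with the real audio
    optimizer.zero_grad()
    loss.backward()               # adjust connections between nodes
    optimizer.step()
```

A final synthesis step (a vocoder, for example) would then turn predicted spectrogram frames back into audible sound.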

Mesgarani's team relied on data from five people with epilepsy.

Their network analyzed recordings from the auditory cortex (which is active during both speech and listening) as those patients heard recordings of stories and people naming digits from zero to nine.

The computer then reconstructed spoken numbers from neural data alone; when the computer "spoke" the numbers, a group of listeners named them with 75% accuracy.

Another team, led by computer scientist Tanja Schultz at the University of Bremen in Germany, relied on data from six people undergoing brain tumor surgery.

A microphone captured their voices as they read single-syllable words aloud.

Meanwhile, electrodes recorded from the brain's speech planning areas and motor areas, which send commands to the vocal tract to articulate words.

Computer scientists Miguel Angrick and Christian Herff, now with Maastricht University, trained a network that mapped electrode readouts to the audio recordings, and then reconstructed words from previously unseen brain data.

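The phrase "previously unseen" is the key test: the model is fit on one portion of the paired recordings and then asked to reconstruct audio from brain data that played no part in training. A self-contained toy sketch of that split, with plain least-squares in NumPy standing in for the actual neural network and all data synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical paired recordings: electrode features -> audio features.
brain = rng.standard_normal((1000, 64))
audio = rng.standard_normal((1000, 32))

# Fit a linear mapping on the first 800 time points only...
w, *_ = np.linalg.lstsq(brain[:800], audio[:800], rcond=None)

# ...then reconstruct audio features from held-out brain data the
# model never saw; reconstructions are judged on this portion alone.
reconstructed = brain[800:] @ w
error = np.mean((reconstructed - audio[800:]) ** 2)
print(f"held-out reconstruction error: {error:.3f}")
```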

According to a computerized scoring system, about 40% of the computer-generated words were understandable.

Finally, neurosurgeon Edward Chang and his team at the University of California, San Francisco, reconstructed entire sentences from brain activity captured from speech and motor areas while three epilepsy patients read aloud.

In an online test, 166 people heard one of the sentences and had to select it from among 10 written choices.

Some sentences were correctly identified more than 80% of the time. The researchers also pushed the model further: They used it to re-create sentences from data recorded while people silently mouthed words.

That's an important result, Herff says—"one step closer to the speech prosthesis that we all have in mind."

However, "What we're really waiting for is how [these methods] are going to do when the patients can't speak," says Stephanie Riès, a neuroscientist at San Diego State University in California who studies language production.

The brain signals when a person silently "speaks" or "hears" their voice in their head aren't identical to signals of speech or hearing.

Without external sound to match to brain activity, it may be hard for a computer even to sort out where inner speech starts and ends.

Decoding imagined speech will require "a huge jump," says Gerwin Schalk, a neuroengineer at the National Center for Adaptive Neurotechnologies at the New York State Department of Health in Albany.

"It's really unclear how to do that at all."

One approach, Herff says, might be to give feedback to the user of the brain-computer interface: If they can hear the computer's speech interpretation in real time, they may be able to adjust their thoughts to get the result they want.

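That feedback idea amounts to a closed loop: decode a short window of brain activity, play the result back at once, and let the user adapt. A purely conceptual sketch, with every name hypothetical and the trained decoder reduced to a placeholder matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.standard_normal((64, 32))  # placeholder for a trained decoder

def read_brain_window():
    # Hypothetical stand-in for the latest short window of electrode data.
    return rng.standard_normal(64)

# Conceptual closed loop: each decoded guess is played back immediately,
# so the user can hear the interpretation and adjust on the next attempt.
for step in range(10):  # stand-in for a continuous real-time loop
    audio_frame = read_brain_window() @ weights
    # play(audio_frame)  # hypothetical real-time audio output
```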

With enough training of both users and neural networks, brain and computer might meet in the middle.
