Google's AI translation is approaching human-level quality:

Google applies a neural machine translation system to Chinese-to-English translation, cutting errors by up to 85%

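Beijing time, morning of September 28: Google announced today that the web and mobile versions of Google Translate now use a new neural machine translation system for Chinese-to-English translation, a language pair for which the Translate apps currently handle about 18 million translations a day. Google is also releasing an academic paper on the method.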

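Google had previously said that it uses neural networks in Google Translate, but specifically for the real-time visual translation feature. Earlier this year, Google senior fellow Jeff Dean told VentureBeat that Google was working on integrating more deep learning into Google Translate. Today's release is the result of that work.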

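Google has built deep neural networks into a growing number of its applications, including the smart messaging app Google Allo and Inbox by Gmail, and they also help Google run its data centers more efficiently.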

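For Google Neural Machine Translation (GNMT), the company relies on eight-layer long short-term memory recurrent neural networks (LSTM RNNs). Once the networks have been sufficiently trained with the help of graphics processing units (GPUs), Google can use its recently introduced tensor processing units (TPUs) to run inference on new data.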
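
As a rough illustration of what "eight-layer LSTM" means structurally, the sketch below stacks eight LSTM layers in plain Python and pushes a three-step toy sequence through them. It is not Google's implementation: the layer sizes, the random weights, and the helper names (`make_layer`, `lstm_step`, `run_stack`) are assumptions made for illustration, and it leaves out everything else a translation system needs, such as a decoder, attention, and trained weights.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(input_size, hidden_size, rng):
    """Random weights for one LSTM layer (hypothetical sizes, untrained).

    Each of the four gates gets a weight matrix over [input, hidden] and a bias.
    """
    def matrix():
        return [[rng.uniform(-0.1, 0.1) for _ in range(input_size + hidden_size)]
                for _ in range(hidden_size)]
    return {gate: (matrix(), [0.0] * hidden_size) for gate in ("i", "f", "o", "g")}


def lstm_step(layer, x, h, c):
    """One time step of a standard LSTM cell; returns the new (h, c)."""
    xh = x + h  # concatenate the input with the previous hidden state

    def affine(gate):
        weights, biases = layer[gate]
        return [sum(w * v for w, v in zip(row, xh)) + b
                for row, b in zip(weights, biases)]

    i = [sigmoid(v) for v in affine("i")]    # input gate
    f = [sigmoid(v) for v in affine("f")]    # forget gate
    o = [sigmoid(v) for v in affine("o")]    # output gate
    g = [math.tanh(v) for v in affine("g")]  # candidate cell values
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def run_stack(layers, sequence, hidden_size):
    """Feed a sequence through the stacked layers; return top-layer outputs."""
    states = [([0.0] * hidden_size, [0.0] * hidden_size) for _ in layers]
    outputs = []
    for x in sequence:
        layer_input = x
        for idx, layer in enumerate(layers):
            h, c = lstm_step(layer, layer_input, *states[idx])
            states[idx] = (h, c)
            layer_input = h  # this layer's output feeds the layer above
        outputs.append(layer_input)
    return outputs


NUM_LAYERS = 8   # matches the "eight-layer" description in the article
EMBED_SIZE = 4   # toy value, assumed for illustration
HIDDEN_SIZE = 5  # toy value, assumed for illustration

rng = random.Random(0)
layers = [make_layer(EMBED_SIZE if k == 0 else HIDDEN_SIZE, HIDDEN_SIZE, rng)
          for k in range(NUM_LAYERS)]
# Three random vectors stand in for the embedded words of a short sentence.
sentence = [[rng.uniform(-1, 1) for _ in range(EMBED_SIZE)] for _ in range(3)]
outputs = run_stack(layers, sentence, HIDDEN_SIZE)
print(len(outputs), len(outputs[0]))
```

Each layer's hidden state becomes the input to the layer above it, which is what makes the network deep in the stacking sense. The script prints `3 5`: one 5-dimensional top-layer vector for each of the three input steps.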

Neural machine translation is not always the best option, but Google's results show an advantage in some cases.

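"Human evaluation shows that GNMT has reduced translation errors by 60% compared to our previous phrase-based system on many pairs of languages: English↔French, English↔Spanish, and English↔Chinese," the researchers write in the paper. "Additional experiments suggest the quality of the resulting translation system gets closer to that of average human translators."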

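In a blog post published today, Google Brain research scientists Quoc Le and Mike Schuster note that translation errors actually fell by 55 to 85 percent on several language pairs, measured on sentences sampled from Wikipedia and news websites with the help of bilingual human raters.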

Even so, the system is not perfect.

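"GNMT can still make significant errors that a human translator would never make, like dropping words and mistranslating proper names or rare terms, and translating sentences in isolation rather than considering the context of the paragraph or page," Le and Schuster write. "There is still a lot of work we can do to serve our users better. However, GNMT represents a significant milestone."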


Google's AI translation system is approaching human-level accuracy

But there’s still significant work to be done

Google is one of the leading providers of artificial intelligence-assisted language translation, and the company now says a new technique for doing so is vastly improving the results. The company’s AI team calls it the Google Neural Machine Translation system, or GNMT, and it initially provided a less resource-intensive way to ingest a sentence in one language and produce that same sentence in another language. Instead of digesting each word or phrase as a standalone unit, as prior methods do, GNMT takes in the entire sentence as a whole.

"The advantage of this approach is that it requires fewer engineering design choices than previous Phrase-Based translation systems," write Quoc V. Le and Mike Schuster, researchers on the Google Brain team. When the technique was first employed, it was able to match the accuracy of those existing translation systems. Over time, however, GNMT has proved capable of both producing superior results and working at the speed required of Google’s consumer apps and services. These improvements are detailed in a new paper published this week.

In some cases, Google says its GNMT system is even approaching human-level translation accuracy. That near-parity is restricted to translations between related languages, like from English to Spanish and French. However, Google is eager to gather more data for "notoriously difficult" use cases, all of which will help its system learn and improve over time thanks to machine learning techniques. So starting today, Google is using its GNMT system for 100 percent of Chinese to English machine translations in the Google Translate mobile and web apps, accounting for around 18 million translations per day.

Google admits that its approach still has a ways to go. "GNMT can still make significant errors that a human translator would never make, like dropping words and mistranslating proper names or rare terms," Le and Schuster explain, "and translating sentences in isolation rather than considering the context of the paragraph or page. There is still a lot of work we can do to serve our users better." But soon, as Google’s products and services continue vacuuming up valuable corner cases and rare phrasings, our phones may be capable of breaking down language barriers as effectively as a bilingual human being.

References

Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation


