The Art of Hyping Machine Translation

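Baidu has unveiled its in-house STACL system (Simultaneous Translation with Anticipation and Controllable Latency), which reportedly can also begin near-simultaneous translation a few seconds after the speaker starts talking, rather than waiting until a full passage has been spoken.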

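As the name suggests, the "few seconds" just mentioned is actually a controllable latency. When translating between two relatively closely related languages (French and Spanish, say), STACL can start translating with a lag of roughly one word. When the two languages differ greatly (Chinese and English, for example), the system may have to wait longer before it starts translating in order to keep accuracy acceptable.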
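To make the idea of a controllable lag concrete, here is a minimal sketch of a "wait-k" style schedule: read k source words first, then alternate between writing one target word and reading one more source word. It only simulates the order of READ and WRITE actions and does no translation. The function name `wait_k_schedule` and the action labels are illustrative assumptions, not taken from Baidu's code.

```python
from typing import List, Tuple


def wait_k_schedule(num_source: int, num_target: int, k: int) -> List[Tuple[str, int]]:
    """Simulate the READ/WRITE order of a wait-k style policy.

    Hypothetical helper for illustration only: it models when words are
    read and written, not how the target words are predicted.
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    actions: List[Tuple[str, int]] = []
    read = 0      # number of source words consumed so far
    written = 0   # number of target words emitted so far

    while written < num_target:
        # Before emitting target word `written`, the policy wants to have
        # seen k + written source words (or the whole source, if shorter).
        needed = min(k + written, num_source)
        while read < needed:
            actions.append(("READ", read))
            read += 1
        actions.append(("WRITE", written))
        written += 1

    return actions


if __name__ == "__main__":
    # A small k suits closely related languages; a larger k trades
    # latency for more context before the first output word.
    for action, index in wait_k_schedule(num_source=4, num_target=4, k=2):
        print(action, index)
```

With `k=2` the first target word is written after only two source words have been read. Raising `k` delays the first output but gives the model more context, which is the latency and accuracy trade-off described above.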

Baidu's official blog announcement describes what it calls STACL's major technical breakthrough: "We tackled this challenge using an idea inspired by human simultaneous interpreters, who routinely anticipate or predict materials that the speaker is about to cover in a few seconds into the future. However, different from human interpreters, our model does not predict the source language words in the speaker’s speech but instead directly predict the target language words in the translation."

One important point from CNBC's report: Baidu's prediction capability is based on a corpus of 2 million Chinese-English pairs.

Baidu Research Blog: Baidu Research
Engadget report: Baidu has developed its own simultaneous translation system
Official demo video: Demos for STACL (Simultaneous Translation with Int…
Paper: STACL: Simultaneous Translation with …


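The above is the recommendation I enthusiastically shared with the WeChat group 「翻译技术交流与资源共享」 (Translation Technology Exchange and Resource Sharing) on October 27, the third day after STACL was released. But the facts have once again shown Baidu's thuggish and shameless nature. Yesterday Slator published a critical report, "The Art of Hyping Machine Translation", which reveals the following: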

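Baidu's PR department packaged the press release, the research paper, and the GitHub demo page and sent them to numerous media outlets (Slator among them). Within hours, coverage was everywhere: MIT Technology Review, Engadget, CNBC, SCMP, Fortune, and more.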

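The Baidu Research GitHub demo page even gathers links to the media coverage the paper received, along with the outlets' logos. Although these reports were inaccurate, the backing of major outlets led many other media organizations to republish them.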

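As part of the PR push, Baidu gave a public demonstration of STACL at Baidu World on November 1. During the conference, the two screens flanking the main display showed the automatic speech recognition output and STACL's simultaneous translation, respectively.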

However, the simultaneous interpretation in the live broadcast was still provided by human interpreters.

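So how good is STACL's translation really? The researchers ran experiments in the English-German and Chinese-English directions. With the wait-5 model (the system waits for 5 words before it starts translating), STACL's output quality fell somewhat short of the hype. With the wait-3 model (the system waits for 3 words before it starts translating), the words the system predicted were completely wrong.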

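John Tinsley, an NMT expert and the co-founder and CEO of Iconic Translation Machines, said that we still need to be extremely cautious when judging whether any new piece of research is a "breakthrough", the word major outlets used in their coverage of STACL.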

Baidu World 2018: Baidu World 2018, full replay with English simultaneous interpretation
Slator: The Art of Hyping Machine Translation
