Dialogue System Evaluation Tasks: The Loebner Prize

The Loebner Prize is an annual competition in artificial intelligence that awards prizes to the computer programs considered by the judges to be the most human-like. The format of the competition is that of a standard Turing test. In each round, a human judge simultaneously holds textual conversations with a computer program and a human being via computer. Based upon the responses, the judge must decide which is which. [2]

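The Loebner Prize [1] was established in 1990 by the New York philanthropist Hugh Loebner (1942-2016), with the grand prize reserved for the first computer to pass the Turing test. Its purpose is to encourage research in artificial intelligence. Because the grand prize has still not been awarded, the competition continues.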

The Loebner Prize 2018

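The 2018 competition had two stages: a selection stage and a testing stage.
Selection stage: 12 systems entered. Each system had to answer 20 questions in a style similar to previous years, including at least 2 Winograd-style [3] questions (named after T. Winograd; such questions typically hinge on resolving an ambiguous reference in a sentence). Human experts then scored how human-like the responses were, ranked the systems by score, and selected the top four for the final test.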

The top four entries from the pool of entries that conform to the entry specifications will be selected as follows. Each entry will be provided with a set of 20 questions in English in a similar format to previous competitions, with at least 2 Winograd style questions. The responses from each of the AI systems will be recorded for this question set and then assessed for how human their responses are. The top 4 entries from this process will be entered into the finals of the competition at Bletchley Park.

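Testing stage: the panel consists of four judges. There are four rounds, one per system. In each round, every judge converses via computer with two entities at the same time: the dialogue system assigned to that round and a real human. After 25 minutes of questioning, each judge decides which entity is the dialogue system and which is the human. If a system fools half of the judges, its creator receives the silver medal; otherwise, the judges score the conversations and the prizes are awarded in descending order of the ranked scores.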

The contest consists of 4 rounds where in each round, the 4 judges will each interact with two entities using a computer terminal. One of these entities will be a human ‘confederate’ and the other an AI system. After 25 minutes of questioning the judge must decide which entity is the human and which is the AI. If a system can fool half the judges that it is human under these conditions, a solid Silver Medal will be awarded to the creator of that AI system. In the event that this doesn’t happen, prizes will be awarded to the creators of the AI system as follows in accordance with judges’ ranked scores:
1st place - a bronze medal and $4000
2nd place - $1500
3rd place - $1000
4th place - $500
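The award rule quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the system names, and the verdict data are hypothetical, and the rules do not say what the remaining finalists receive when a silver medal is awarded, so the sketch simply returns the silver medallists in that case.

```python
from typing import Dict, List

# Cash prizes for places 1 to 4, as listed in the 2018 rules quoted above.
CASH_PRIZES = [4000, 1500, 1000, 500]


def fooled_half(judge_verdicts: List[bool]) -> bool:
    """Return True if at least half of the judges took the system for the human.

    judge_verdicts[i] is True when judge i labelled the AI system as the human.
    """
    return 2 * sum(judge_verdicts) >= len(judge_verdicts)


def award_prizes(verdicts: Dict[str, List[bool]], ranked: List[str]) -> Dict[str, str]:
    """Apply the award rule to the four finalists.

    verdicts maps each system to its judges' verdicts; ranked lists the
    systems from best to worst according to the judges' ranked scores.
    """
    silver = [name for name, v in verdicts.items() if fooled_half(v)]
    if silver:
        # Assumption: only the silver medallists are reported in this case.
        return {name: "silver medal" for name in silver}

    awards: Dict[str, str] = {}
    for place, name in enumerate(ranked):
        prize = f"${CASH_PRIZES[place]}"
        if place == 0:
            prize = "bronze medal and " + prize
        awards[name] = prize
    return awards


# Hypothetical data: no system fools two or more of the four judges.
example_verdicts = {
    "A": [False, False, True, False],
    "B": [False, False, False, False],
    "C": [False, False, False, False],
    "D": [False, False, False, False],
}
example_awards = award_prizes(example_verdicts, ranked=["B", "A", "C", "D"])
```

With this hypothetical data no silver medal is awarded, so `example_awards` assigns the bronze medal and $4000 to system B and the remaining cash prizes to A, C, and D in that order.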

The Loebner Prize 2017

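The 2017 competition had two stages: a selection stage and a testing stage.
Selection stage: 16 systems entered. Each system answered 20 questions, experts scored how human-like the systems were, and the 4 highest-scoring systems advanced to the final test.
Testing stage: four judges each ranked the systems. The system a judge found most human-like received 4 points, and the others received 3, 2, and 1 points in decreasing order of human-likeness. Each system's final score was the sum of the points from all judges.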
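The 2017 rank-sum scoring is simple enough to work through in code. The following is a minimal sketch, not the organisers' actual tooling; the function name, the system names A to D, and the judges' rankings are all made up for illustration.

```python
from typing import Dict, List


def total_scores(judge_rankings: List[List[str]]) -> Dict[str, int]:
    """Sum rank points over all judges.

    Each inner list orders the systems from most to least human-like.
    With four systems, the first gets 4 points and the last gets 1 point.
    """
    totals: Dict[str, int] = {}
    for ranking in judge_rankings:
        n = len(ranking)
        for position, system in enumerate(ranking):
            totals[system] = totals.get(system, 0) + (n - position)
    return totals


# Hypothetical rankings from four judges.
rankings = [
    ["A", "B", "C", "D"],
    ["A", "C", "B", "D"],
    ["B", "A", "C", "D"],
    ["A", "B", "D", "C"],
]
scores = total_scores(rankings)
# A: 4+4+3+4 = 15, B: 3+2+4+3 = 12, C: 2+3+2+1 = 8, D: 1+1+1+2 = 5
final_order = sorted(scores, key=scores.get, reverse=True)
```

Each judge hands out 4 + 3 + 2 + 1 = 10 points, so with four judges the totals always sum to 40, which is a quick sanity check on any score sheet.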

The Loebner Prize 2016

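In the 2016 competition, 16 systems took part in the selection stage. Selection was scored on a 100-point scale, and the 4 highest-ranked systems advanced to the final. The final test determined only a ranking, with no numerical scores.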

Past Winners

[Figure: past Loebner Prize winners]

Mitsuku

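Mitsuku [4], developed by Steve Worswick, has so far won four bronze medals (the silver and gold medals have never been awarded). The chatbot is built on AIML [5], relies mainly on pattern matching in the style of regular expressions, and can perform simple reasoning [6].
AIML resource on GitHub: rosie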
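To give a feel for rule-based pattern matching of this kind, here is a small Python sketch. It is not AIML (which is an XML format) and it does not reproduce any of Mitsuku's rules: the `RULES` table, the `*` wildcard handling, and the replies are hypothetical and only imitate the general idea of matching a normalized input against patterns and filling a response template.

```python
import re
from typing import List, Tuple

# Hypothetical rules: a pattern with '*' as a wildcard, and a reply template
# in which '{0}' stands for the text captured by the first wildcard.
RULES: List[Tuple[str, str]] = [
    ("MY NAME IS *", "Nice to meet you, {0}."),
    ("WHAT IS YOUR NAME", "My name is Bot."),
    ("*", "Tell me more."),
]
FALLBACK = "I have nothing to say."


def normalize(text: str) -> str:
    """Uppercase the input, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^A-Z0-9\s]", "", text.upper())
    return " ".join(cleaned.split())


def compile_pattern(pattern: str) -> re.Pattern:
    """Turn a wildcard pattern into an anchored regular expression."""
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + "(.+)".join(parts) + "$")


COMPILED = [(compile_pattern(p), template) for p, template in RULES]


def respond(text: str) -> str:
    """Return the reply of the first rule whose pattern matches the input."""
    sentence = normalize(text)
    for regex, template in COMPILED:
        match = regex.match(sentence)
        if match:
            captured = [group.title() for group in match.groups()]
            return template.format(*captured)
    return FALLBACK


reply = respond("My name is Alice!")  # "Nice to meet you, Alice."
```

Rules are tried in order, so the catch-all `*` pattern goes last. A real rule base has far more categories and a more elaborate matching order, but the control flow is similar in spirit.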

Suzette, Rosette, Angela, Rose

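Several dialogue systems developed by Bruce Wilcox have placed well in the competition. All of them are built on ChatScript, which he developed [7]. ChatScript is likewise a language for building dialogue systems.
ChatScript resource on GitHub: ChatScript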

Summary

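Comparatively speaking, the Loebner Prize is one of the longer-running evaluations of dialogue systems. It inherits the format of the Turing test, requires each dialogue system to adopt a specific persona, and aims to reward systems that play that persona convincingly. Evaluation relies mainly on judges scoring how well the system plays its role; there is no objective, public standard. The systems that have done well in this evaluation are all based on heuristic rules. Historically, the contest spurred the development of frameworks for designing dialogue systems such as AIML and ChatScript, and it has played an important role in the development of chatbots.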

References:

[1] Loebner Prize
[2] Wikipedia: Loebner Prize
[3] SHRDLU human-computer dialogue system
[4] Mitsuku
[5] AIML
[6] Wikipedia: Mitsuku
[7] Wilcox B, Wilcox S. Making it real: Loebner-winning chatbot design[J]. Arbor Ciencia Pensamiento Y Cultura, 2013, 189(764):a086.
[8] ChatScript
