Rm@i, P@i, MAP, MRR

Rm@i

Let {(c_j, r_j), 1 <= j <= n} be the list of n context-response pairs from the test set. For each context c_j, we create a set of m candidate responses: one candidate is the actual response r_j, and the other m-1 are responses sampled at random from the same corpus. The m candidates are then ranked by the conversational model's output score, and Recall_m@i (R_m@i) measures how often the correct response appears in the top i results of this ranked list. The R_m@i metric is often used for the evaluation of retrieval models, as several responses may be equally “correct” given a particular context.
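As a concrete illustration, here is a minimal sketch of how R_m@i can be estimated, assuming a hypothetical `score_fn(context, response)` that returns the model's relevance score for a candidate response:

```python
import random

def recall_m_at_i(score_fn, test_pairs, corpus_responses, m=10, i=1, seed=0):
    """Estimate R_m@i: the fraction of test contexts for which the true
    response is ranked in the top i among m candidates (1 true + m-1 random)."""
    rng = random.Random(seed)
    hits = 0
    for context, true_response in test_pairs:
        # Sample m-1 distractor responses from the corpus.
        distractors = rng.sample(corpus_responses, m - 1)
        candidates = [true_response] + distractors
        # Rank the candidates by the model's score, highest first.
        ranked = sorted(candidates, key=lambda r: score_fn(context, r), reverse=True)
        if true_response in ranked[:i]:
            hits += 1
    return hits / len(test_pairs)
```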

Precision@K

Set a rank threshold K
Compute the % of relevant documents in the top K results
Ignore documents ranked lower than K

Example (original figure omitted): a ranked list of five documents where two of the top three are relevant, rank 4 is not relevant, and rank 5 is relevant:
Prec@3 of 2/3
Prec@4 of 2/4
Prec@5 of 3/5
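
A small sketch that reproduces these numbers; the 0/1 relevance pattern below is an assumption consistent with the example (the figure only fixed the Prec@K values):

```python
def precision_at_k(relevance, k):
    """Precision@K: fraction of the top-K ranked documents that are relevant.
    `relevance` is a list of 0/1 judgments in rank order; documents below K are ignored."""
    return sum(relevance[:k]) / k

# Assumed ranking with relevant documents at ranks 1, 3 and 5.
relevance = [1, 0, 1, 0, 1]
print(precision_at_k(relevance, 3))  # 2/3 ≈ 0.67
print(precision_at_k(relevance, 4))  # 2/4 = 0.50
print(precision_at_k(relevance, 5))  # 3/5 = 0.60
```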

Mean Average Precision

For a single query, Average Precision (AP) averages Prec@K over the ranks K at which relevant documents appear in the ranked list. Mean Average Precision (MAP) is then the mean of AP over all queries in the test set, so it rewards rankings that place relevant documents near the top.
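
A minimal sketch of AP and MAP, under the common convention that AP averages Prec@K over the ranks of the relevant documents that were retrieved; the relevance lists in the usage example are made up:

```python
def average_precision(relevance):
    """AP: mean of Prec@K over the ranks K where a relevant document appears
    (0 if the ranked list contains no relevant document)."""
    hits, precisions = 0, []
    for k, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(relevance_lists):
    """MAP: Average Precision averaged over all queries."""
    return sum(average_precision(r) for r in relevance_lists) / len(relevance_lists)

# Two queries: AP = (1 + 2/3 + 3/5)/3 ≈ 0.756 and AP = (1/2 + 2/3)/2 ≈ 0.583
print(mean_average_precision([[1, 0, 1, 0, 1], [0, 1, 1]]))  # ≈ 0.669
```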

MRR

Mean Reciprocal Rank (MRR) looks only at the first relevant (correct) result for each query: it is the average over all queries of 1 / rank of that first relevant result, with queries that retrieve no relevant result contributing 0.
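
A minimal sketch, with made-up relevance lists in the usage example:

```python
def mean_reciprocal_rank(relevance_lists):
    """MRR: average over queries of 1 / (rank of the first relevant result),
    counting 0 for queries with no relevant result retrieved."""
    total = 0.0
    for relevance in relevance_lists:
        for rank, rel in enumerate(relevance, start=1):
            if rel:
                total += 1.0 / rank
                break
    return total / len(relevance_lists)

# First relevant results at ranks 1, 2 and 3: MRR = (1 + 1/2 + 1/3) / 3 ≈ 0.61
print(mean_reciprocal_rank([[1, 0, 0], [0, 1, 1], [0, 0, 1]]))
```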
