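Normalized Discounted Cumulative Gain (NDCG)
NDCG is an evaluation metric for ranked results: it measures how accurately a list is ordered.
A recommender system typically returns a list of items for a given user. If the list has length K, NDCG@K measures how far that ranked list is from the user's true interaction list.
Definitions, where r(i) is the relevance score of the item at position i of the returned list: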
$$Gain = r(i)$$
$$CG@K = \sum_{i=1}^{K} r(i)$$
$$DCG@K = \sum_{i=1}^{K} \frac{r(i)}{\log_2(i+1)}$$
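To make the difference between CG@K and DCG@K concrete, here is a minimal sketch using only the standard library. The helper names `cg_at_k` and `dcg_at_k` and the two toy rankings are assumptions made for illustration, not part of the original code.

```python
import math

def cg_at_k(rels, k):
    """Cumulative gain: plain sum of the first k relevance scores."""
    return sum(rels[:k])

def dcg_at_k(rels, k):
    """Discounted cumulative gain, with 1-based positions i = 1..k."""
    return sum(r / math.log2(i + 1) for i, r in enumerate(rels[:k], start=1))

# Two hypothetical rankings of the same three items, in opposite order.
ranking_a = [3, 2, 0]
ranking_b = [0, 2, 3]

# CG ignores order, so both rankings get the same score.
print(cg_at_k(ranking_a, 3), cg_at_k(ranking_b, 3))
# DCG rewards placing the most relevant item first.
print(dcg_at_k(ranking_a, 3), dcg_at_k(ranking_b, 3))
```

CG@K is identical for both lists, while DCG@K is higher for the list that puts the most relevant item first, which is why the discount is needed to evaluate ordering.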
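DCG@K also has a second, exponential form. When the relevance score r(i) takes only the values 0 and 1, the numerator 2^{r(i)} - 1 equals 1 if the item at position i of the returned list appears in the user's true interaction list and 0 otherwise, so only the hits contribute to the sum.
$$DCG@K = \sum_{i=1}^{K} \frac{2^{r(i)} - 1}{\log_2(i+1)}$$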
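The relationship between the two forms can be checked numerically. The sketch below assumes the standard exponential gain 2^r - 1 (the same gain the `ndcg` function in this post uses); the function names and sample lists are hypothetical.

```python
import math

def dcg_linear(rels):
    """DCG with linear gain r(i)."""
    return sum(r / math.log2(i + 1) for i, r in enumerate(rels, start=1))

def dcg_exp(rels):
    """DCG with exponential gain 2^r(i) - 1."""
    return sum((2 ** r - 1) / math.log2(i + 1) for i, r in enumerate(rels, start=1))

binary = [1, 0, 1, 1, 0]   # hypothetical hit/miss list
graded = [3, 2, 0, 1]      # hypothetical graded ratings

# For 0/1 relevance, 2^r - 1 equals r, so both forms agree.
print(dcg_linear(binary), dcg_exp(binary))
# For graded relevance, the exponential form weights high ratings more heavily.
print(dcg_linear(graded), dcg_exp(graded))
```

With binary relevance the choice of form does not matter; with graded relevance the exponential form amplifies highly rated items.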
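Dividing by the ideal DCG (IDCG_u@K, the DCG@K of the best possible ordering for user u) normalizes each user's score to [0, 1]. Averaging over the set of users U gives the final metric:
$$NDCG_u@K = \frac{DCG_u@K}{IDCG_u@K}$$
$$NDCG@K = \frac{1}{|U|} \sum_{u \in U} NDCG_u@K$$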
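As a rough illustration of the per-user normalization and the average over users, the following standard-library sketch computes NDCG_u@K for two made-up users and then their mean. The `users` data and the helper names are assumptions for this example only.

```python
import math

def dcg_at_k(rels, k):
    """Linear-gain DCG over the first k positions (1-based)."""
    return sum(r / math.log2(i + 1) for i, r in enumerate(rels[:k], start=1))

def ndcg_at_k(ranked_rels, all_rels, k):
    """DCG of the model's order divided by DCG of the ideal order."""
    ideal = sorted(all_rels, reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(ranked_rels, k) / idcg if idcg > 0 else 0.0

# Hypothetical data: user -> (relevance in the model's order, all known relevance scores)
users = {
    "u1": ([3, 2, 1], [3, 2, 1, 0]),  # model order is already ideal
    "u2": ([0, 1, 3], [3, 1, 0, 0]),  # best item ranked last
}

per_user = {u: ndcg_at_k(ranked, known, 3) for u, (ranked, known) in users.items()}
mean_ndcg = sum(per_user.values()) / len(per_user)
print(per_user, mean_ndcg)
```

A user whose list is already in ideal order scores exactly 1, and the overall NDCG@K is the plain average of the per-user scores.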
import numpy as np


def ndcg(rel_true, rel_pred, p=None, form="linear"):
    """Return the normalized Discounted Cumulative Gain (nDCG).

    Args:
        rel_true (1-D array): true relevance scores of all items for one user, (n_songs,)
        rel_pred (1-D array): true relevance scores of the recommended items,
            in the predicted order, (n_pred,)
        p (int): rank cutoff; defaults to len(rel_pred)
        form (str): gain formula, 'linear' or 'exponential' ('exp')
    Returns:
        ndcg (float): normalized discounted cumulative gain in [0, 1]
    """
    # Ideal ranking: the true relevance scores in descending order.
    rel_true = np.sort(np.asarray(rel_true, dtype=float))[::-1]
    rel_pred = np.asarray(rel_pred, dtype=float)
    if p is None:
        p = len(rel_pred)
    p = min(len(rel_true), len(rel_pred), p)
    # np.arange(p) is 0-based, so position i in the formula is index + 1 and the
    # discount log2(i + 1) becomes log2(index + 2). Adding only 1 would give
    # log2(1) = 0 at the first position and a division by zero. This matches the
    # formula above, which uses 1-based positions.
    discount = 1 / np.log2(np.arange(p) + 2)
    if form == "linear":
        idcg = np.sum(rel_true[:p] * discount)
        dcg = np.sum(rel_pred[:p] * discount)
    elif form in ("exponential", "exp"):
        idcg = np.sum((2 ** rel_true[:p] - 1) * discount)
        dcg = np.sum((2 ** rel_pred[:p] - 1) * discount)
    else:
        raise ValueError("form must be 'linear' or 'exponential' ('exp')")
    # If the user has no relevant items, define nDCG as 0 instead of dividing by zero.
    return dcg / idcg if idcg > 0 else 0.0


if __name__ == "__main__":
    song_index = {'A': 0, 'B': 1, 'C': 2, 'D': 3, 'E': 4, 'F': 5, 'G': 6, 'H': 7, 'I': 8}
    user_lists = ["USER1", "USER2", "USER3"]
    relevance_true = {
        # Each user's rating of every song, indexed by song_index (A first, I last).
        # ndcg() sorts these in descending order to obtain the ideal ranking.
        "USER1": [3, 3, 2, 2, 1, 1, 0, 0, 0],
        "USER2": [3, 2, 1, 1, 2, 0, 1, 1, 1],
        "USER3": [0, 1, 0, 1, 2, 3, 3, 1, 0]
    }
    s1_prediction = {
        # Model S1's prediction: the order in which each user is expected to click
        "USER1": ['A', 'E', 'C', 'D', 'F'],
        "USER2": ['G', 'E', 'A', 'B', 'D'],
        "USER3": ['C', 'G', 'F', 'B', 'E']
    }
    s2_prediction = {
        # Model S2's prediction
        "USER1": ['A', 'B', 'C', 'G', 'E'],
        "USER2": ['B', 'A', 'G', 'E', 'F'],
        "USER3": ['E', 'G', 'F', 'B', 'I']
    }
    for user in user_lists:
        print(f'===={user}===')
        r_true = relevance_true[user]
        # Look up the true relevance of each recommended song, keeping the predicted order.
        s1_pred = [r_true[song_index[song]] for song in s1_prediction[user]]
        s2_pred = [r_true[song_index[song]] for song in s2_prediction[user]]
        print(f'S1 nDCG@5 (linear): {ndcg(r_true, s1_pred, 5, "linear")}')
        print(f'S2 nDCG@5 (linear): {ndcg(r_true, s2_pred, 5, "linear")}')
        # The exponential form below is the one generally used.
        print(f'S1 nDCG@5 (exponential): {ndcg(r_true, s1_pred, 5, "exp")}')
        print(f'S2 nDCG@5 (exponential): {ndcg(r_true, s2_pred, 5, "exp")}')
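
As a sanity check on the script above, the linear nDCG@5 of S1 for USER1 can be recomputed by hand with only the standard library. This is a sketch that copies USER1's ratings and S1's list from the example; the helper `dcg` and the variable names are introduced here for illustration.

```python
import math

# Data copied from the example: USER1's ratings and S1's top-5 list.
song_index = {'A': 0, 'B': 1, 'C': 2, 'D': 3, 'E': 4, 'F': 5, 'G': 6, 'H': 7, 'I': 8}
r_true = [3, 3, 2, 2, 1, 1, 0, 0, 0]
s1_top5 = ['A', 'E', 'C', 'D', 'F']

def dcg(rels):
    """Linear-gain DCG with 1-based positions."""
    return sum(r / math.log2(i + 1) for i, r in enumerate(rels, start=1))

# True relevance of the recommended songs, in predicted order: [3, 1, 2, 2, 1]
predicted_rels = [r_true[song_index[s]] for s in s1_top5]
# Best achievable top 5 for this user: [3, 3, 2, 2, 1]
ideal_rels = sorted(r_true, reverse=True)[:5]

ndcg_user1_s1 = dcg(predicted_rels) / dcg(ideal_rels)
print(round(ndcg_user1_s1, 4))
```

S1 ranks song E (rating 1) second where the ideal list has a song rated 3, and that single misplacement is what pulls the score below 1.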