How Not To Sort By Average Rating


Wrong ways to rate

  • Score = (Positive ratings) − (Negative ratings)

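    • item 1: 1000 ratings (600 positive, 400 negative), score 200, 60% positive
    • item 2: 10000 ratings (5500 positive, 4500 negative), score 1000, 55% positive
      Item 2 is ranked above item 1 even though a smaller share of its ratings is positive, so item 2 should not be placed ahead of item 1.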
  • Score = Average rating = (Positive ratings) / (Total ratings)

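    • item 1: 1 positive, 0 negative (100% positive)
    • item 2: 100 positive, 1 negative (about 99% positive)
      Item 1 is ranked above item 2 on the strength of a single rating, so item 1 should not be placed ahead of item 2.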
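
To make the two failure cases concrete, here is a small Python sketch that ranks the example items with each naive score. The helper names (`difference_score`, `average_score`, `rank`) are made up for illustration.

```python
# Example items from the notes above: (name, positive ratings, negative ratings).
difference_items = [("item 1", 600, 400), ("item 2", 5500, 4500)]
average_items = [("item 1", 1, 0), ("item 2", 100, 1)]


def difference_score(pos, neg):
    """Naive score 1: positive ratings minus negative ratings."""
    return pos - neg


def average_score(pos, neg):
    """Naive score 2: fraction of ratings that are positive."""
    total = pos + neg
    return pos / total if total else 0.0


def rank(items, score):
    """Return item names ordered from highest to lowest score."""
    ordered = sorted(items, key=lambda it: score(it[1], it[2]), reverse=True)
    return [name for name, _, _ in ordered]


# The difference score puts the 55%-positive item first.
print(rank(difference_items, difference_score))
# The average score puts the item with a single rating first.
print(rank(average_items, average_score))
```

Both orderings are the opposite of what a reader would want, which is the motivation for a score that accounts for sample size.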

The correct formula for rating

Score = Lower bound of Wilson score confidence interval for a Bernoulli parameter
Given the ratings observed so far, there is a 95% chance that the true fraction of positive ratings is at least what value?

equation.png

(pos: number of positive ratings; n: total number of ratings; confidence: confidence level)
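
For reference, the lower bound of the Wilson score interval is usually written as below. This is the standard textbook form, assuming \hat{p} = pos / n and z equal to the 1 − α/2 quantile of the standard normal distribution (about 1.96 for 95% confidence); the symbols \hat{p}, z and α are notation introduced here.

```latex
\[
\text{score} =
\frac{\hat{p} + \dfrac{z^{2}}{2n}
      - z\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n} + \dfrac{z^{2}}{4n^{2}}}}
     {1 + \dfrac{z^{2}}{n}},
\qquad \hat{p} = \frac{pos}{n}
\]
```

As n grows, the correction terms in z²/n shrink and the score approaches the plain average \hat{p}; for small n the bound is pulled well below \hat{p}.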
R implementation.png

SQL implementation.png
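
A minimal Python sketch of the same lower bound, assuming the standard Wilson score formula and the parameters pos, n and confidence defined above. The function name `ci_lower_bound` and the choice to return 0.0 when there are no ratings are assumptions made for this sketch.

```python
from math import sqrt
from statistics import NormalDist  # standard library, Python 3.8+


def ci_lower_bound(pos, n, confidence=0.95):
    """Lower bound of the Wilson score interval for pos positives out of n."""
    if n == 0:
        return 0.0  # assumption: unrated items get the lowest possible score
    # Two-sided interval: z is the 1 - alpha/2 quantile of the standard normal.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    phat = pos / n
    centre = phat + z * z / (2 * n)
    margin = z * sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)
    return (centre - margin) / (1 + z * z / n)


# The four example items from the notes above.
for label, pos, n in [("1/1", 1, 1), ("100/101", 100, 101),
                      ("600/1000", 600, 1000), ("5500/10000", 5500, 10000)]:
    print(label, round(ci_lower_bound(pos, n), 4))
```

With this score, 100 positive out of 101 ranks above 1 out of 1, and 600 out of 1000 ranks above 5500 out of 10000, which fixes both problem cases.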

Applications (not limited to sorting)

  • Spam detection: What percentage of people who see this item will mark it as spam?
  • Building a "best of" list: What percentage of people who see this item will mark it as "best of"?
  • Building a "most emailed" list: What percentage of people who see this page will click "Email"?

How Hacker News ranking algorithm works

rank score.png

As time passes, the score drops. A larger gravity makes the score fall faster.

python.png

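For old stories, the effect of time becomes small (the curve flattens out, and the score depends mainly on votes).
For new stories, time and votes both act on the score.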
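
The behaviour described above can be sketched in Python, assuming the commonly cited form of the Hacker News formula, score = (P − 1) / (T + 2)^G, where P is the number of points, T the age in hours and G the gravity. The function name `hn_score` and the default gravity of 1.8 are assumptions for this sketch, not values confirmed by these notes.

```python
def hn_score(votes, age_hours, gravity=1.8):
    """Rank score under the assumed form (P - 1) / (T + 2) ** G."""
    return (votes - 1) / (age_hours + 2) ** gravity


# Same vote count at increasing ages: the score keeps dropping,
# but each extra hour matters less as the story gets older.
for hours in (0, 1, 6, 24, 48):
    print(hours, round(hn_score(100, hours), 3))

# Same story and age with a larger gravity: the score is lower.
print(hn_score(100, 24, gravity=1.8) > hn_score(100, 24, gravity=2.0))
```

The drop between hour 0 and hour 1 is much larger than the drop between hour 48 and hour 49, which is the flattening seen for old stories.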

score.png

How Reddit ranking algorithms work
