Recommender notes: matrix factorization with LightFM

These are my study notes on collaborative-filtering recommender systems.

  1. Formula

    [Image: formula]
  2. Logic diagram

    [Images: logic diagram]
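  3. Understanding the principle
  • Starting from users' ratings of items, the factorization derives two things: for each user, how interested they are in each item type, and for each item, its score on each of those types. For example, suppose movies split into an action type and a romance type. A given movie might score 9 on action and 1 on romance, while a given user might score 1 on action and 9 on romance. I think of these as item-to-type features and user-to-type features.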
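To make the example concrete, here is a minimal sketch in plain Python. The two-dimensional vectors and the names `user`, `action_movie`, `romance_movie`, and `score` are hypothetical, chosen only to match the numbers above; in a trained model the latent dimensions are learned and do not come with labels such as "action" or "romance". The sketch assumes the simplest form of matrix factorization, where the predicted preference is the dot product of the user vector and the item vector.

```python
# Hypothetical latent features, in the order (action, romance).
user = [1, 9]            # cares little for action, a lot for romance
action_movie = [9, 1]    # scores high on action, low on romance
romance_movie = [1, 9]   # scores low on action, high on romance


def score(user_vec, item_vec):
    """Dot product of a user vector and an item vector."""
    return sum(u * i for u, i in zip(user_vec, item_vec))


action_score = score(user, action_movie)    # 1*9 + 9*1 = 18
romance_score = score(user, romance_movie)  # 1*1 + 9*9 = 82
print(action_score, romance_score)
```

The romance movie gets the higher score, so it would be ranked ahead of the action movie for this user.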
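  4. Using LightFM
  • LightFM is fairly simple to use: you give it the users' movie rating data, and it computes each user's score for each item.
  • The code below is pasted from the LightFM website.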
from lightfm import LightFM
from lightfm.datasets import fetch_movielens
from lightfm.evaluation import precision_at_k
import numpy as np

# Load the MovieLens 100k dataset. Only five
# star ratings are treated as positive.
data = fetch_movielens(data_home='./data', min_rating=5.0)
print(data['train'])
# Instantiate and train the model
model = LightFM(loss='warp')
model.fit(data['train'], epochs=30, num_threads=2)

# Evaluate the trained model
train_precision = precision_at_k(model, data['train'], k=5).mean()
test_precision = precision_at_k(model, data['test'], k=5).mean()

print("Train precision: %.2f" % train_precision)
print("Test precision: %.2f" % test_precision)

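def sample_recommendation(model, data, user_ids):
    # Print a few known positives and the top-scored movies for each user.

    # Number of users and movies in the training data
    n_users, n_items = data['train'].shape
    # Convert to CSR once so that per-user rows can be sliced
    train_csr = data['train'].tocsr()
    # Debug output: the full training matrix
    print(train_csr)

    for user_id in user_ids:
        # Movies this user rated positively in the training data
        known_positives = data['item_labels'][train_csr[user_id].indices]
        # Debug output: the user's sparse row and its column indices
        print(train_csr[user_id])
        print(train_csr[user_id].indices)
        # Score every movie for this user, then rank from highest to lowest
        scores = model.predict(user_id, np.arange(n_items))
        top_items = data['item_labels'][np.argsort(-scores)]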

        print("User %s" % user_id)
        print("     Known positives:")

        for x in known_positives[:3]:
            print("        %s" % x)

        print("     Recommended:")

        for x in top_items[:3]:
            print("        %s" % x)

sample_recommendation(model, data, [3, 25, 450])
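
The `precision_at_k` figures printed by the code above measure, for each user, what fraction of the top k recommended items are known positives; `.mean()` then averages over users. As a rough illustration of that definition only (this is not LightFM's implementation), here is a plain-Python sketch. The function name `precision_at_k_for_user` and the sample data are hypothetical.

```python
def precision_at_k_for_user(ranked_items, positives, k):
    """Fraction of the first k ranked items that are known positives."""
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in positives)
    return hits / k


# Hypothetical ranking for one user (best first) and that user's positives.
ranked = ["A", "B", "C", "D", "E", "F"]
liked = {"B", "E", "F"}

# "B" and "E" are in the top 5, "F" is not, so 2 of 5 are hits.
p_at_5 = precision_at_k_for_user(ranked, liked, k=5)
print("precision@5: %.2f" % p_at_5)
```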
  5. Source code and required data on GitHub
    https://github.com/wengmingdong/tf2-stu/tree/master/recommender
