Pearson Correlation Coefficient: An Example

Computing the Pearson correlation between two users:

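/**
 * Pearson correlation between two users X and Y.
 *
 * sumX:  sum of squared deviations of X's ratings from X's average rating
 * sumY:  sum of squared deviations of Y's ratings from Y's average rating
 * sumXY: sum of the products of deviations over the items that both X and Y rated
 * Similarity: sumXY / sqrt(sumX * sumY)
 *
 * The Pearson correlation coefficient of two variables is their covariance
 * divided by the product of their standard deviations:
 *   p(X,Y) = sum((xi - avg(x)) * (yi - avg(y)))
 *            / (sqrt(sum((xi - avg(x))^2)) * sqrt(sum((yi - avg(y))^2)))
 * Equivalently, with sample standard deviations s(X) and s(Y):
 *   p(X,Y) = 1/(n - 1) * sum(((xi - avg(x)) / s(X)) * ((yi - avg(y)) / s(Y)))
 *
 * The Pearson distance between X and Y is:
 *   d(X,Y) = 1 - p(X,Y)
 */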

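public double userSimilarity(int userid1, int userid2) throws MyException {
    // if (userid1 == userid2)
    //     throw new MyException("A user cannot be compared with themselves.");

    // Rating is an assumed element type: the original listing shows a raw List,
    // which would not compile with the getRating()/getItemid() calls below.
    List<Rating> list1;
    List<Rating> list2;
    double avgX;
    double avgY;
    try {
        list1 = st.getRatings(userid1);
        list2 = st.getRatings(userid2);
        avgX = st.getAvgRatings(userid1);
        avgY = st.getAvgRatings(userid2);
    } catch (SQLException e) {
        e.printStackTrace();
        // Without the ratings there is nothing to compare; continuing would
        // dereference a null list. Assumes MyException has a String constructor.
        throw new MyException("Failed to load ratings.");
    }

    double sumXY = 0, sumX = 0, sumY = 0;

    // Sum of squared deviations over all of user 1's ratings.
    for (int i = 0; i < list1.size(); i++) {
        double rating1 = list1.get(i).getRating();
        sumX += (rating1 - avgX) * (rating1 - avgX);
    }

    // Sum of squared deviations over all of user 2's ratings.
    for (int j = 0; j < list2.size(); j++) {
        double rating2 = list2.get(j).getRating();
        sumY += (rating2 - avgY) * (rating2 - avgY);
    }

    // Sum of deviation products, restricted to items rated by both users.
    for (int i = 0; i < list1.size(); i++) {
        double rating1 = list1.get(i).getRating();
        for (int j = 0; j < list2.size(); j++) {
            double rating2 = list2.get(j).getRating();
            if (list1.get(i).getItemid() == list2.get(j).getItemid()) {
                sumXY += (rating1 - avgX) * (rating2 - avgY);
            }
        }
    }

    // If either user has no ratings or rates everything identically, the
    // denominator is zero and the correlation is undefined; return 0 instead of NaN.
    double denominator = Math.sqrt(sumX * sumY);
    if (denominator == 0) {
        return 0.0;
    }
    return sumXY / denominator;
}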
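
To try the calculation without a database, the same steps can be sketched on in-memory data. This is only a minimal sketch under a few assumptions: the class name PearsonSketch, its helper methods, and the sample ratings are all hypothetical and not part of the original code, and each user's ratings are held in a map from item id to rating. A map lookup stands in for the nested loop over both lists, and the Pearson distance d(X,Y) = 1 - p(X,Y) from the comment above is included as well.

```java
import java.util.HashMap;
import java.util.Map;

public class PearsonSketch {

    /** Mean of all ratings in the map; 0.0 for an empty map. */
    static double mean(Map<Integer, Double> ratings) {
        double sum = 0.0;
        for (double r : ratings.values()) {
            sum += r;
        }
        return ratings.isEmpty() ? 0.0 : sum / ratings.size();
    }

    /** Sum of squared deviations of the ratings from the given mean. */
    static double sumSquaredDeviations(Map<Integer, Double> ratings, double avg) {
        double sum = 0.0;
        for (double r : ratings.values()) {
            sum += (r - avg) * (r - avg);
        }
        return sum;
    }

    /** Pearson similarity, following the same steps as userSimilarity above. */
    static double pearson(Map<Integer, Double> x, Map<Integer, Double> y) {
        double avgX = mean(x);
        double avgY = mean(y);
        double sumX = sumSquaredDeviations(x, avgX);
        double sumY = sumSquaredDeviations(y, avgY);

        // Only items rated by both users contribute to the numerator.
        double sumXY = 0.0;
        for (Map.Entry<Integer, Double> e : x.entrySet()) {
            Double ratingY = y.get(e.getKey());
            if (ratingY != null) {
                sumXY += (e.getValue() - avgX) * (ratingY - avgY);
            }
        }

        double denominator = Math.sqrt(sumX * sumY);
        return denominator == 0.0 ? 0.0 : sumXY / denominator;
    }

    /** Pearson distance: d(X,Y) = 1 - p(X,Y). */
    static double pearsonDistance(Map<Integer, Double> x, Map<Integer, Double> y) {
        return 1.0 - pearson(x, y);
    }

    public static void main(String[] args) {
        // Made-up ratings: item id -> rating.
        Map<Integer, Double> userA = new HashMap<>();
        userA.put(1, 1.0);
        userA.put(2, 2.0);
        userA.put(3, 3.0);

        Map<Integer, Double> userB = new HashMap<>();
        userB.put(1, 2.0);
        userB.put(2, 4.0);
        userB.put(3, 6.0);

        System.out.println(pearson(userA, userB));         // prints 1.0
        System.out.println(pearsonDistance(userA, userB)); // prints 0.0
    }
}
```

User B's ratings are exactly twice user A's, so the two are perfectly correlated and the distance is 0. Reversing one user's preferences (3, 2, 1 instead of 1, 2, 3) gives a correlation of -1 and a distance of 2.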
