python - How to calculate precision and recall for 2 lists in Python

Question

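I wrote a movie recommender system. I have a list of 20 movies that I recommended to a user and a list of the 150 movies the user actually ended up watching. How can I calculate precision and recall from these two lists in Python using sklearn?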

For example, if 10 of the movies I recommended were actually watched by the user, recall is 10/150 and precision is 10/20.

Tags: python, scikit-learn, precision-recall

From what I've read, the simplest approach is to take the intersection of the two sets.

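I assume you use some kind of identifier for each movie, so your lists shouldn't contain duplicates (for example, you probably won't recommend the same movie twice). That means you can use sets and their built-in intersection method.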

recommendations = {"movie1", "movie2", "movie3"}
saw = {"movie1", "movie2", "movie4", "movie5", "movie6"}

# Recommended movies that the user actually saw:
recommendations.intersection(saw)
# {'movie1', 'movie2'}  (set display order may vary)

# Number of recommended movies that the user saw:
movie_intersect = len(recommendations.intersection(saw))
movie_intersect
# 2

# Precision is the share of recommendations that were watched (Python 3 division):
movie_intersect / len(recommendations)
# 0.6666666666666666

# Recall is the share of watched movies that were recommended:
movie_intersect / len(saw)
# 0.4
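
To tie this back to the numbers in the question, here is a minimal sketch that wraps the same set intersection in a helper. The function name `precision_recall` and the integer movie IDs are hypothetical, chosen only to reproduce the 20 recommended / 150 watched / 10 overlapping case. Returning 0.0 for an empty list is an assumption too, not something the original answer specifies.

```python
def precision_recall(recommended, watched):
    """Return (precision, recall) for two collections of movie IDs."""
    # Converting to sets drops duplicates, so plain lists are accepted too.
    recommended, watched = set(recommended), set(watched)
    hits = len(recommended & watched)  # recommended movies the user watched
    # Assumption: report 0.0 instead of dividing by zero on empty input.
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(watched) if watched else 0.0
    return precision, recall


# Hypothetical IDs: 20 recommendations, 150 watched movies, 10 in common.
recommended = list(range(20))      # IDs 0..19
watched = list(range(10, 160))     # IDs 10..159
precision, recall = precision_recall(recommended, watched)
print(precision, recall)           # precision is 10/20, recall is 10/150
```

Note that `sklearn.metrics.precision_score` and `recall_score` expect per-item label vectors rather than two lists of IDs, so using them here would first require building binary "recommended" and "watched" indicators over the union of movies. For this use case the set-based version is probably simpler.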
