How to evaluate a hybrid recommender system?

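Problem description

I am building a hybrid recommender system with the MovieLens data. More precisely, I first built a content-based model and then a collaborative filtering (CF) approach. Finally, in the hybrid approach, I first run the content-based filtering to determine the movies we want to recommend to the user, then use the ratings predicted by SVD (the CF model) to filter and sort those recommendations.

I am also trying to evaluate all of the models, and for that I am computing the hit rate. I was able to get it for the individual models, but I am not sure how to compute it for the hybrid approach, or whether that even makes sense. Any help is greatly appreciated!

Thank you in advance!

Here is what I have (the commented-out lines are my attempt at the hit rate for the hybrid model):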

user_id = 50
df_movies = movies

def hybrid_content_svd_model(userId):
    """
    Combine the content-based and SVD-based models to recommend the user's top 10 movies.
    :param userId: userId of user
    :return: DataFrame of recommended movies with the rating predicted by the SVD model
    """
    # Titles proposed by the content-based model (assumed to be an iterable of titles)
    recommended_titles = get_recommendation_content_model(userId)
    # .copy() gives an independent frame, so adding a column does not touch df_movies
    recommended_movies = df_movies[df_movies["title"].isin(recommended_titles)].copy()
    # Score every candidate with the SVD model's predicted rating
    recommended_movies["svd_rating"] = [
        svd.predict(userId, movie_id).est for movie_id in recommended_movies["movieId"]
    ]
    # count = (recommended_movies["svd_rating"] >= 3).sum()
    # total = recommended_movies.shape[0]
    # hit_ratio = count / total
    # Keep the 10 highest-rated candidates
    return recommended_movies.sort_values("svd_rating", ascending=False).head(10)

hybrid_content_svd_model(user_id)

Here is how I compute the hit rate for CF:

import pandas as pd

def evaluation_collaborative_svd_model(userId, userOrItem):
    """
    Score the movies recommended by a collaborative model with SVD and report
    the fraction whose predicted rating is at least 3.
    :param userId: userId of user
    :param userOrItem: True for User-User similarity, False for Item-Item similarity
    :return: fraction of recommended movies with a predicted rating >= 3
    """
    if userOrItem:
        movieIdsList = getRecommendedMoviesAsperUserSimilarity(userId)
    else:
        movieIdsList = recommendedMoviesAsperItemSimilarity(userId)
    # Predicted SVD rating for every recommended movie
    movieRatingList = [[movieId, svd.predict(userId, movieId).est] for movieId in movieIdsList]
    movieIdRating = pd.DataFrame(movieRatingList, columns=['movieId', 'rating'])
    total = movieIdRating.shape[0]
    if total == 0:
        return 0.0  # nothing was recommended, so avoid dividing by zero
    # Compute the ratio once, after all predictions have been collected
    count = (movieIdRating['rating'] >= 3).sum()
    return count / total

Tags: python, recommendation-engine, recommender-systems

Solution
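One common definition of hit rate is leave-one-out: hold out one movie that each user actually rated, generate a top-N list from the remaining data, and count how often the held-out movie appears in that list. Because this definition only needs a function that returns a ranked list, it could apply to a content-based, CF, or hybrid model alike. The following is a minimal sketch under that assumption, not the asker's method; `leave_one_out_hit_rate`, `held_out`, `recommend`, and the toy data are hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, Hashable, Iterable


def leave_one_out_hit_rate(
    held_out: Dict[Hashable, Hashable],
    recommend: Callable[[Hashable], Iterable[Hashable]],
    top_n: int = 10,
) -> float:
    """Fraction of users whose held-out item appears in their top-N list.

    held_out  : hypothetical mapping user id -> the one item withheld for that user
    recommend : hypothetical callable returning a ranked iterable of item ids
    """
    if not held_out:
        return 0.0  # no users to evaluate
    hits = 0
    for user, left_out_item in held_out.items():
        # Only the first top_n recommendations count towards a hit
        top_items = list(recommend(user))[:top_n]
        if left_out_item in top_items:
            hits += 1
    return hits / len(held_out)


# Toy data (made up for illustration): ranked recommendations per user
toy_recommendations = {
    1: [10, 20, 30],
    2: [40, 50, 60],
    3: [70, 80, 90],
}
# The single item withheld for each user
toy_held_out = {1: 20, 2: 99, 3: 70}


def toy_recommend(user):
    return toy_recommendations[user]


toy_hit_rate = leave_one_out_hit_rate(toy_held_out, toy_recommend, top_n=3)
print(toy_hit_rate)  # users 1 and 3 are hits, user 2 is a miss
```

For the hybrid model in the question, `recommend` might wrap `hybrid_content_svd_model(userId)["movieId"]`, provided the held-out ratings are removed before the content and SVD models are fitted; otherwise the evaluation would leak the answer into training.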

