How to get a value from a third column when the values of the first two columns are known

Problem description

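Given the values of two other columns, I want to get the value from a third column. I want the rating for each movie and user, so that together they form a user-movie matrix.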

I have already collected the unique movie IDs and user IDs into two lists, and I tried to find the rows whose values match the ones I want:

import pandas as pd
import numpy as np
import matplotlib as plot

def main():
    df = pd.read_csv(r'/Users/ttbarack/Desktop/ratings.csv')
    #print(df)
    userIds = []
    for id in df['userId']:
        if id not in userIds:
            userIds.append(id)
    #print(userIds)
    movieIds = []
    for movie in df['movieId']:
        if movie not in movieIds:
            movieIds.append(movie)
    #print(movieIds)


    """PART 1"""


    finalList = []
    for id in userIds:
        newlist = []
        for mov in movieIds:
            newlist.append(df['rating'].where(df['userId'].values() == id and df['movieId'].values() == mov))
        finalList.append(newlist)
    print(finalList)

This is the error I get:

Traceback (most recent call last):
  File "/Users/ttbarack/PycharmProjects/Proj1/Project2.py", line 29, in <module>
    main()
  File "/Users/ttbarack/PycharmProjects/Proj1/Project2.py", line 22, in main
    newlist.append(df['rating'].where(df['userId'].values() == id and df['movieId'].values() == mov))
TypeError: 'numpy.ndarray' object is not callable

Tags: python, matrix, data-science

Solution


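The error occurs because you are calling a NumPy array as if it were a function. On a pandas Series, values is an attribute, not a method, so df['userId'].values() tries to call the array that the attribute returns. Comparing the Series directly is enough.

Use:

newlist.append(df['rating'].where((df['userId'] == id) & (df['movieId'] == mov)))

instead of

newlist.append(df['rating'].where(df['userId'].values() == id and df['movieId'].values() == mov))

Note that the two conditions are combined with & and each is wrapped in parentheses. Python's and cannot be applied to whole Series; it raises a ValueError saying that the truth value of a Series is ambiguous.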
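
If the goal is the full user-movie matrix rather than a single lookup, pandas can probably build it without the nested loops. The following is a minimal sketch, not the asker's actual data: the small DataFrame is a hypothetical stand-in for ratings.csv that reuses the column names from the question (userId, movieId, rating), and it assumes each user rated each movie at most once.

```python
import pandas as pd

# Hypothetical sample data standing in for ratings.csv
# (same column names as in the question).
df = pd.DataFrame({
    "userId":  [1, 1, 2, 3],
    "movieId": [10, 20, 10, 30],
    "rating":  [4.0, 3.5, 5.0, 2.0],
})

# Single lookup: combine the two conditions element-wise with &
# (not `and`), then select the matching rating.
mask = (df["userId"] == 1) & (df["movieId"] == 20)
single_rating = df.loc[mask, "rating"].iloc[0]

# Whole matrix: one row per user, one column per movie,
# NaN where a user has not rated a movie.
# Assumes each (userId, movieId) pair appears at most once;
# pivot raises a ValueError on duplicate pairs.
user_movie = df.pivot(index="userId", columns="movieId", values="rating")
print(user_movie)
```

If the real data does contain duplicate (userId, movieId) pairs, df.pivot_table with an aggregation such as aggfunc="mean" is one possible alternative.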
