How to handle "index out of bounds" when building a data matrix?

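Problem description

I am trying to build a utility matrix of size (n_users, n_items), but I get an "index is out of bounds" error. From the error it is clear that I am trying to reach an element outside the matrix, but I don't know how to form the matrix so that this is handled. Any suggestions would be appreciated.

Here is my code: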

## Import the required libraries
import pandas as pd
import numpy as nm
from scipy import spatial
from sklearn.metrics.pairwise import pairwise_distances
from sklearn.preprocessing import MinMaxScaler

user_artists = pd.read_csv("./user_artists.dat", sep='\t+', engine='python')
#user_artists has three features ['userID','artistID','weight']
n_users = user_artists.userID.nunique()
n_items = user_artists.artistID.nunique()
n_users,n_items
## (1892, 17632)

## Create a user-item matrix that can be used to calculate the similarity between users and items.

data_matrix = nm.zeros((n_users, n_items))
for line in user_artists.itertuples():
    data_matrix[line[1]-1, line[2]-1] = line[3]

Here is the error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-96-f3242d18985b> in <module>
      3 data_matrix = nm.zeros((n_users, n_items))
      4 for line in user_artists.itertuples():
----> 5     data_matrix[line[1]-1, line[2]-1] = line[3]

IndexError: index 18733 is out of bounds for axis 1 with size 17632

Tags: python, numpy, recommendation-engine

Solution


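The error occurs because the IDs are not contiguous: there are 17632 distinct artistID values, but the largest is at least 18734, so artistID - 1 cannot be used directly as a column index. If you are sure of your logic and just want to avoid this error (this is not a real solution, since out-of-range rows are silently skipped), do this: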

for line in user_artists.itertuples():
    try:
        data_matrix[line[1]-1, line[2]-1] = line[3]
    except IndexError:
        # Skip rows whose ID falls outside the matrix
        pass
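
To keep every row instead of skipping some, one option is to map the raw IDs to contiguous positions 0..n-1 before filling the matrix. The following is a minimal sketch of that idea, not the original poster's method: the helper name build_user_item_matrix and the toy data are made up for illustration, and it uses NumPy's np.unique with return_inverse=True to produce the mapping.

```python
import numpy as np

def build_user_item_matrix(user_ids, item_ids, weights):
    """Build a dense user-item matrix from raw, possibly non-contiguous IDs.

    Hypothetical helper for illustration only.
    """
    # np.unique returns the sorted distinct IDs and, with return_inverse=True,
    # the position (0..n-1) of every original ID within that sorted array.
    users, user_idx = np.unique(np.asarray(user_ids), return_inverse=True)
    items, item_idx = np.unique(np.asarray(item_ids), return_inverse=True)

    matrix = np.zeros((len(users), len(items)))
    # Every index is now guaranteed to be inside the matrix bounds.
    matrix[user_idx, item_idx] = weights
    return matrix, users, items

# Toy data (invented): item ID 50 is far larger than the number of distinct items (3),
# which is the same situation that triggers the IndexError in the question.
user_ids = [2, 2, 7, 9]
item_ids = [10, 50, 10, 33]
weights = [5.0, 1.0, 3.0, 2.0]

data_matrix, users, items = build_user_item_matrix(user_ids, item_ids, weights)
print(data_matrix.shape)
print(users, items)
```

With the DataFrame from the question, the inputs would presumably be user_artists.userID.to_numpy(), user_artists.artistID.to_numpy() and user_artists.weight.to_numpy(). Row i of the result then corresponds to users[i] and column j to items[j], so the returned arrays can be used to translate matrix positions back to the original IDs.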

