python - IndexError: index 20 is out of bounds for axis 1 with size 20
Problem description
I keep getting this error. The thing is, if I run it with final_target_1 at a size of 350, it runs fine; anything larger than that gives me the above error. Below is the entire code I am using. It has something to do with the maxLen parameter used in the last six lines of the code, but I can't seem to figure out where the problem is.
import pickle
import numpy as np

# load the data
context = np.load('/content/context_indexes.npy', allow_pickle=True)
final_target = np.load('/content/target_indexes.npy', allow_pickle=True)

with open('/content/dictionary.pkl', 'rb') as f:
    word_to_index = pickle.load(f)

# the indexes of the words start with 0. But when the sequences are padded later on, they too will be zeros.
# so, shift all the index values one position to the right, so that 0 is spared, and used only to pad the sequences
for i, j in word_to_index.items():
    word_to_index[i] = j + 1

# reverse dictionary
index_to_word = {}
for k, v in word_to_index.items():
    index_to_word[v] = k

final_target_1 = final_target
context_1 = context

# temp soln: reducing the size of data to be processed -
# this is where I reduce the dataset size from 1000 to 350 because it will not run with anything larger than that
# final_target_1 = final_target[:350]
# context_1 = context[:350]

maxLen = 30

# shift the indexes of the context and target arrays too
for i in final_target_1:
    for pos, j in enumerate(i):
        i[pos] = j + 1
for i in context_1:
    for pos, j in enumerate(i):
        i[pos] = j + 1

# read in the 50 dimensional GloVe embeddings
def read_glove_vecs(file):
    with open(file, 'r', encoding='utf-8') as f:
        words = set()
        word_to_vec_map = {}
        for line in f:
            line = line.strip().split()
            word = line[0]
            words.add(word)
            word_to_vec_map[word] = np.array(line[1:], dtype=np.float64)
    return words, word_to_vec_map

words, word_to_vec_map = read_glove_vecs('/content/drive/My Drive/glove.twitter.27B.50d.txt')

# since the indexes start from 1 and not 0, we add 1 to the no. of total words to get the vocabulary size
# (while initializing and populating arrays later on, this will be required)
vocab_size = len(word_to_index) + 1

# initialize the embedding matrix that will be used (50 is the GloVe vector dimension)
embedding_matrix = np.zeros((vocab_size, 50))
for word, index in word_to_index.items():
    try:
        embedding_matrix[index, :] = word_to_vec_map[word.lower()]
    except:
        continue

outs = np.zeros((context_1.shape[0], maxLen, vocab_size))
for pos, i in enumerate(final_target_1):
    for pos1, j in enumerate(i):
        if pos1 > 0:
            outs[pos, pos1 - 1, j] = 1
    if pos % 1000 == 0:
        print('{} entries completed'.format(pos))
Solution
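A likely cause, sketched below under an assumption the post does not confirm: some sequences in final_target_1 are longer than maxLen + 1 tokens. `outs` has shape (batch, maxLen, vocab_size), so writing `outs[pos, pos1 - 1, j]` fails with an IndexError on axis 1 as soon as `pos1 - 1` reaches maxLen, which only happens once the dataset is large enough to contain such a long sequence (the title reports size 20 while the posted code has maxLen = 30, so the numbers here are purely illustrative). A minimal reproduction and one possible fix, truncating each sequence before building the one-hot targets:

```python
import numpy as np

maxLen = 30
vocab_size = 100        # hypothetical small vocabulary for illustration
seq = list(range(1, 40))  # hypothetical target sequence: 39 tokens > maxLen + 1

# reproduce the error: axis 1 of outs only has maxLen positions
outs = np.zeros((1, maxLen, vocab_size))
try:
    for pos1, j in enumerate(seq):
        if pos1 > 0:
            outs[0, pos1 - 1, j] = 1  # fails once pos1 - 1 reaches maxLen
except IndexError as e:
    print(e)  # e.g. "index 30 is out of bounds for axis 1 with size 30"

# possible fix: cap every sequence at maxLen + 1 tokens before writing
outs = np.zeros((1, maxLen, vocab_size))
for pos1, j in enumerate(seq[:maxLen + 1]):
    if pos1 > 0:
        outs[0, pos1 - 1, j] = 1
```

Equivalently, the sequences could be truncated/padded to maxLen up front (e.g. with `keras.preprocessing.sequence.pad_sequences(..., maxlen=maxLen)`) before the one-hot loop runs.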