One-hot encoding: "only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices" problem

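Problem description

I'm trying to solve the following problem with one-hot encoding, but I get an error.

I'm doing image classification (detecting rectangles), and the error occurs when I try to one-hot encode the labels. Before converting to one_hot_label, the labels look like this: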

 'circle' 'circle' 'circle' 'circle' 'circle' 'circle' 'circle' 'circle'
 'pentagon' 'pentagon' 'pentagon' 'pentagon' 'pentagon' 'rectangle' ....
 'triangle' 'triangle' 'triangle' 'triangle' 'triangle' 'triangle']

I initialized T as [[0, 0, 0, 0, 0], ... , [0, 0, 0, 0, 0]] because I have 5 shapes. But at the line row[X[idx]] = 1

I get the error: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

def _change_one_hot_label(X):
    T = np.zeros((X.size, 5))
    for idx, row in enumerate(T):
        if(X[idx] == 'rectangle'):
            row[X[idx]] = 1

    return T

I don't know what to do to fix this.

Please help me. Thank you.

==================================================== ====================

I tried to solve it with the approach above (one-hot encoding).

==================================================== ===================
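I'm learning deep learning.

I'm getting an error: ufunc 'multiply' did not contain a loop with signature matching types (dtype('<U32'), dtype('<U32')) -> dtype('<U32')

I tried to solve it myself, but I need help.

I'm loading my image dataset: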

data_list = glob('dataset\\training\\*\\*.jpg')

def load_label(data_list):
    labels = []
    for path in data_list:
        labels.append(get_label_from_path(path))
    return np.array(labels)

x_batch example: [[0.00392157 0.00392157 0.00392157 ... 0.00392157 0.00392157 0.00392157] ... [0.00392157 0.00392157 0.00392157 ... 0.00392157 0.00392157 0.00392157]]
t_batch example: ['circle' 'circle' ... 'circle' 'circle']

train_size = 3 # x_train.shape[0]
batch_size = 22
for i in range(242): # iters_num = 242
   batch_mask = np.random.choice(train_size, batch_size)
   print( t_train, batch_mask )
   x_batch = x_train[batch_mask]
   t_batch = t_label[batch_mask]
   grad = network.gradient(x_batch, t_batch) # error start position

When I compute the gradient, the call goes through self.loss(x_batch, t_batch) (the parameters are x and t):

def loss(self, x, t):
        y = self.predict(x)
        return self.lastLayer.forward(y, t)

def forward(self, x, t):
        self.t = t
        self.y = softmax(x)
        self.loss = cross_entropy_error(self.y, self.t)
        return self.loss

def cross_entropy_error(y, t): 
    if y.ndim == 1:
        t = t.reshape(1, t.size)
        y = y.reshape(1, y.size)

    batch_size = y.shape[0]

    return -np.sum(t * np.log(y+1e-7)) / batch_size

The last line, return -np.sum(t*np.log(y+1e-7)) / batch_size,
raises an error: UFuncTypeError: ufunc 'multiply' did not contain a loop with signature matching types (dtype('<U32'), dtype('<U32')) -> dtype('<U32')

For example, I tried changing the labels to integers ('circle' = 0, 'rectangle' = 1), but then my network did not learn.

I don't know what I'm missing. Can someone help me?

Tags: python, deep-learning

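Solution

To solve the one-hot encoding problem, you can use the following function. It maps each label string to an integer column index first, because a NumPy array cannot be indexed with a string such as 'rectangle'.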

import numpy as np

label2idx = dict(rectangle=0, circle=1, pentagon=2)  # .. add the remaining labels here

def _change_one_hot_label(X):
    T = np.zeros((len(X), 5)).astype('int32')
    for i in range(T.shape[0]):
        label = X[i]
        T[i, label2idx[label]] = 1

    return T

_change_one_hot_label(['circle', 'rectangle', 'rectangle'])
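
The same mapping can also be written without an explicit loop over rows. The following is only a sketch: the function name `one_hot_from_labels` and the `classes` argument are illustrative and not part of the original answer, and the class list uses the four shape names visible in the question (the fifth shape is not named there, so add it yourself).

```python
import numpy as np

def one_hot_from_labels(labels, classes):
    """One-hot encode string labels (illustrative helper, not from the original answer)."""
    # Map each class name to a column index, e.g. {'circle': 0, 'pentagon': 1, ...}.
    label2idx = {name: i for i, name in enumerate(classes)}
    # Convert string labels to integer indices; an unknown label raises KeyError.
    idx = np.array([label2idx[str(label)] for label in labels], dtype=np.intp)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(len(classes), dtype=np.int32)[idx]

# Assumed class list: only these four names appear in the question's label dump.
classes = ['circle', 'pentagon', 'rectangle', 'triangle']
T = one_hot_from_labels(['circle', 'rectangle', 'rectangle'], classes)
print(T)
```

Indexing `np.eye(...)` with an integer array selects one identity row per sample, which is the one-hot vector for that sample's class.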

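As for the other problem: as you said, the variable t is an array of strings (['circle', 'rect' ..]), and you cannot multiply strings by numbers.

First, you should apply the one-hot encoding function to t.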
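
A minimal reproduction of that failure, assuming made-up values for the labels and the network output (neither array comes from the question's dataset):

```python
import numpy as np

t = np.array(['circle', 'rectangle'])                 # string labels, a '<U...' dtype
log_y = np.log(np.array([[0.7, 0.3], [0.2, 0.8]]))    # stand-in for np.log(y + 1e-7)

try:
    t * log_y            # NumPy has no 'multiply' loop for these operand types
    failed = False
except TypeError as exc:  # UFuncTypeError is a subclass of TypeError
    failed = True
    print(type(exc).__name__, exc)
```

Once t holds numbers (for example one-hot rows) instead of strings, the same multiplication works.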

def cross_entropy_error(y, t): 
    # This is not good practice, but you can place the conversion here.
    # If you have a batch preprocessing function, the conversion belongs there instead.
    t = _change_one_hot_label(t)

    if y.ndim == 1:
        t = t.reshape(1, t.size)
        y = y.reshape(1, y.size)

    batch_size = y.shape[0]
    return -np.sum(t * np.log(y+1e-7)) / batch_size
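
If you prefer to keep the loss function free of preprocessing, one option is to encode the whole label array once before the training loop and draw batches from the encoded array. The sketch below assumes a hypothetical `CLASSES` list and an `encode_labels` helper, neither of which is in the question, and it uses a constant array in place of the real softmax output.

```python
import numpy as np

# Hypothetical class list; replace it with the five shape names in your dataset.
CLASSES = ['circle', 'pentagon', 'rectangle', 'triangle']
LABEL2IDX = {name: i for i, name in enumerate(CLASSES)}

def encode_labels(labels):
    """Turn an array of string labels into a one-hot int array."""
    idx = np.array([LABEL2IDX[str(label)] for label in labels], dtype=np.intp)
    return np.eye(len(CLASSES), dtype=np.int32)[idx]

def cross_entropy_error(y, t):
    # Same formula as in the question; t is assumed to be one-hot already.
    if y.ndim == 1:
        t = t.reshape(1, t.size)
        y = y.reshape(1, y.size)
    batch_size = y.shape[0]
    return -np.sum(t * np.log(y + 1e-7)) / batch_size

# Encode once, then index batches from the encoded array.
t_label = np.array(['circle', 'triangle', 'rectangle', 'circle'])
t_onehot = encode_labels(t_label)

rng = np.random.default_rng(0)
batch_mask = rng.choice(len(t_label), 2)
t_batch = t_onehot[batch_mask]                # numeric, shape (2, 4)
y_batch = np.full((2, len(CLASSES)), 0.25)    # stand-in for the softmax output
loss = cross_entropy_error(y_batch, t_batch)
print(loss)
```

In the question's loop this would mean building `t_label` from the encoded array, so that `t_batch = t_label[batch_mask]` already contains numbers when it reaches `network.gradient`.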
