One-hot encodings in Keras without for loops

Question

I want to generate one-hot encodings for a list of sequences.

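import numpy as np
import keras  # assumption: standalone Keras; with TensorFlow, use "from tensorflow import keras"

def encode_output(sequences, vocab_size):
  y = np.zeros([sequences.shape[0], sequences.shape[1], vocab_size], dtype='int16')
  # One-hot encode one sequence (row) at a time.
  for i in range(sequences.shape[0]):
    y[i] = keras.utils.to_categorical(sequences[i], num_classes=vocab_size, dtype='int16')
  return y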

sequences is a 2-D NumPy array of token indices:

array([[  23,    4,  563, ...,    0,    0,    0],
       [3480,    3,   86, ...,    0,    0,    0],
       [   9,  930,    6, ...,    0,    0,    0],
       ...,
       [ 507, 1408,    0, ...,    0,    0,    0],
       [4447,   13,  642, ...,    0,    0,    0],
       [   1,  195, 2618, ...,    0,    0,    0]], dtype=int32)

My code works fine, but is there a way to do this without the for loop?

Tags: python, numpy, machine-learning, keras, one-hot-encoding

Answer


You can simply use array assignment:

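import numpy as np

def encode_vectorized(a, n, dtype=int):
    # Add a trailing axis of length n: one slot per class.
    out = np.zeros(a.shape + (n,), dtype=dtype)
    # Write a 1 at each position's class index (needs NumPy >= 1.15).
    np.put_along_axis(out, a[..., None], 1, axis=-1)
    return out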
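
As a quick sanity check, the function can be run on a small array and compared against another loop-free formulation. The sketch below is illustrative only: the toy input and the vocabulary size of 5 are made up, and the identity-matrix indexing (np.eye(n)[a]) is a separate, commonly used technique added here as a cross-check, not part of the answer above.

```python
import numpy as np

def encode_vectorized(a, n, dtype=int):
    out = np.zeros(a.shape + (n,), dtype=dtype)
    np.put_along_axis(out, a[..., None], 1, axis=-1)
    return out

# Hypothetical toy input: 2 sequences of length 3, vocabulary of 5 tokens.
sequences = np.array([[1, 4, 0],
                      [2, 0, 0]], dtype=np.int32)

# Passing dtype='int16' matches the dtype used in the question's loop version.
y = encode_vectorized(sequences, 5, dtype='int16')
print(y.shape)   # (2, 3, 5)
print(y[0, 1])   # [0 0 0 0 1]

# Cross-check: row k of the identity matrix is the one-hot vector for class k,
# so indexing it with the whole index array yields the same encoding.
y_eye = np.eye(5, dtype='int16')[sequences]
assert np.array_equal(y, y_eye)
```

Note that the default dtype=int typically allocates 8 bytes per element, so for a vocabulary of several thousand tokens it is probably worth passing a smaller dtype such as 'int16' or 'uint8' to keep the output array manageable.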
