
Problem description

I have a model output like this: <tf.Tensor: shape=(3,), dtype=float32, numpy=array([0.92, 0.2 , 0.77], dtype=float32)>, but I want to change the maximum value in the array to 1 and all other values to 0. If there are two or more maximum values, the output should become an all-zero array.

For example:

[0.92, 0.2, 0.77]  -> [1.0, 0.0, 0.0]
[0.92, 0.92, 0.77] -> [0.0, 0.0, 0.0]

I know how to do this with np.argmax, but I want to do it via keras.layers, because I sum the outputs afterwards and want that sum to run over binary values.
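
For reference, a plain-NumPy sketch of that rule (my assumption of what the np.argmax version mentioned above looks like):

import numpy as np

# assumed sketch of the np.argmax approach, for illustration only;
# ties produce an all-zero array
def to_binary(a):
    a = np.asarray(a)
    out = np.zeros_like(a)
    if np.count_nonzero(a == a.max()) == 1:  # only set a 1 when the max is unique
        out[np.argmax(a)] = 1.0
    return out

print(to_binary([0.92, 0.2, 0.77]))   # [1. 0. 0.]
print(to_binary([0.92, 0.92, 0.77]))  # [0. 0. 0.]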

I have made a custom layer like this, but unfortunately I cannot get it to compile:

class Amplifier(tf.keras.layers.Layer):
    def __init__(self):
        super(Amplifier, self).__init__()
        # pattern 2D matrix
        self.f = tf.constant(
            [[1., 0., 0.],
             [0., 1., 0.],
             [0., 0., 1.]], dtype='float32')

    def call(self, inputs, training=None, mask=None):
        # return index of max value
        x = tf.math.argmax(
            inputs,
            axis=None,
            output_type=tf.dtypes.int32,
            name=None)
        # get factor from f constant as 1D matrix form
        y = tf.reshape(tf.slice(self.f, [x, 0], [1, 3]), [3])
        # multiply input on the pattern matrix
        return tf.math.multiply(inputs, y) 

I get this error:

ValueError: Tried to convert 'begin' to a tensor and failed. Error: Shapes must be equal rank, but are 1 and 0. From merging shape 0 with other shapes for '{{node amplifier/Slice/packed}} = Pack[N=2, T=DT_INT32, axis=0](amplifier/ArgMax, amplifier/Slice/packed/1)' with input shapes: [3], [].

I don't know how to avoid this error.
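
As far as I can tell (my reading of the traceback, an assumption): with axis=None, tf.math.argmax defaults to axis 0, so for batched inputs of shape (batch, 3) it reduces over the batch dimension and returns a rank-1 tensor of shape [3]; packing that together with the scalar 0 into tf.slice's begin argument is exactly the [3] vs [] rank mismatch the error reports. A minimal reproduction:

import tensorflow as tf

inputs = tf.constant([[0.92, 0.2, 0.77]])  # layers see batched input: shape (1, 3)

# axis=None falls back to axis 0, so argmax reduces over the batch
# dimension and returns a rank-1 tensor of shape (3,), not a scalar
x = tf.math.argmax(inputs, axis=None, output_type=tf.dtypes.int32)
print(x.shape)  # (3,)

f = tf.eye(3)
# packing the rank-1 `x` with the scalar 0 into `begin` raises
# "Shapes must be equal rank, but are 1 and 0"
y = tf.slice(f, [x, 0], [1, 3])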

Tags: numpy, tensorflow

Solution


I think what you are asking is whether you can make your output one-hot encoded (correct?). If you just want to report this output, you can use a combination of argmax and one_hot. If you want it to be part of the gradient computation in your network, you will need a numerically stable one_hot. Try something like this:

import tensorflow as tf

from tensorflow.keras import layers
from tensorflow.keras.layers import Dense


class OneHotLayer(layers.Layer):
    def __init__(self, num_labels=3, do_numerical=False):
        super().__init__()
        # large constant used in masking softmax
        self._inf = 1e9
        # how many categories we have
        self.num_labels = num_labels
        # if we need our `one_hot` to be numerically stable
        self.do_numerical = do_numerical
        # dense layer; the last layer of our network
        self.probs = Dense(self.num_labels, activation="softmax")
        
    # this is a numerically stable `one_hot` method.
    # it looks for the max across rows, subtracts that from
    # the original tensor (across rows) and exponentiates.
    # this will make the largest value 1.0. Then we create a
    # mask and zero out all other values (again, across rows)
    # leaving us with a binary matrix.
    def _numerical_one_hot(self, x):
        m = tf.math.reduce_max(x, axis=1, keepdims=True)
        e = tf.math.exp(x - m)
        mask = tf.cast(tf.math.not_equal(e, 1.0), tf.float32)
        x = x - self._inf * mask
        return tf.nn.softmax(x, axis=1)
    
    def call(self, x):
        # do forward pass through last layer
        x = self.probs(x)
        # do we need numerical stability? If we do use the
        # op we created. If not, then take the argmax across
        # rows and then use `tf.one_hot`; return both
        ohx = self._numerical_one_hot(x) \
            if self.do_numerical \
            else tf.one_hot(tf.math.argmax(x, axis=1), self.num_labels)
        return x, ohx
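
As a quick sanity check (a sketch of my own, assuming the class above): the numerically stable branch keeps a gradient path from the one-hot output back to the Dense weights, while the argmax/tf.one_hot branch does not:

# compare gradient flow through both branches (hypothetical check,
# not part of the layer itself)
for stable in (True, False):
    layer = OneHotLayer(3, do_numerical=stable)
    xs = tf.random.normal([2, 10])
    with tf.GradientTape() as tape:
        _, ohx = layer(xs)
        # weight the columns so the stable branch's gradient
        # is not trivially zero
        loss = tf.reduce_sum(ohx * tf.constant([1.0, 2.0, 3.0]))
    grads = tape.gradient(loss, layer.trainable_variables)
    print(stable, [g is not None for g in grads])
# True [True, True]
# False [False, False]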

Data

# fake data
X = tf.random.normal([8, 10])
y = tf.random.uniform(
    shape=[8,],
    minval=0,
    maxval=3,
    dtype=tf.int64)
y = tf.one_hot(y, 3)

Test the numerically stable network

# network with stable one_hot
x_in = tf.keras.Input((10,))
x = Dense(25)(x_in)
x_probs, x_indices = OneHotLayer(3, True)(x)
#                                     ^
# ------------------------------------|

# model
model = tf.keras.Model(x_in, [x_probs, x_indices])
probs, idxs = model(X)
#      ^^^ this can be used in loss calc if you need it

print(probs)
# tf.Tensor(
# [[0.12873407 0.83047885 0.04078702]
#  [0.22919412 0.1288479  0.641958  ]
#  [0.27402356 0.35891128 0.36706516]
#  [0.1328154  0.3546107  0.51257384]
#  [0.5309519  0.10788985 0.36115825]
#  [0.35019153 0.272698   0.37711048]
#  [0.35740596 0.23807809 0.4045159 ]
#  [0.13749316 0.72042704 0.14207976]],
# shape=(8, 3), dtype=float32)

print(idxs)
# tf.Tensor(
# [[0. 1. 0.]
#  [0. 0. 1.]
#  [0. 0. 1.]
#  [0. 0. 1.]
#  [1. 0. 0.]
#  [0. 0. 1.]
#  [0. 0. 1.]
#  [0. 1. 0.]],
# shape=(8, 3), dtype=float32)

Test the network with the built-in one_hot

# network using argmax and built-in one_hot
x_in = tf.keras.Input((10,))
x = Dense(25)(x_in)
x_probs, x_indices = OneHotLayer(3, False)(x)
#                                     ^
# ------------------------------------|

# model
model = tf.keras.Model(x_in, [x_probs, x_indices])
probs, idxs = model(X)

print(probs)
# tf.Tensor(
# [[0.28931475 0.33777648 0.3729088 ]
#  [0.25985113 0.532114   0.20803489]
#  [0.4226228  0.21078317 0.36659405]
#  [0.460703   0.3534157  0.18588138]
#  [0.6028035  0.26571727 0.13147917]
#  [0.1994377  0.4208928  0.37966955]
#  [0.39812535 0.33319235 0.26868224]
#  [0.13242716 0.47491995 0.3926528 ]],
# shape=(8, 3), dtype=float32)

print(idxs)
# tf.Tensor(
# [[0. 0. 1.]
#  [0. 1. 0.]
#  [1. 0. 0.]
#  [1. 0. 0.]
#  [1. 0. 0.]
#  [0. 1. 0.]
#  [1. 0. 0.]
#  [0. 1. 0.]],
# shape=(8, 3), dtype=float32)
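
One caveat: neither branch implements the tie rule from the original question, where [0.92, 0.92, 0.77] should map to an all-zero array. A minimal sketch of how that could be added on top (hypothetical helper, assuming 2-D row-wise inputs):

# hypothetical helper, not part of the layer above: rows whose
# maximum is not unique collapse to all zeros
def one_hot_or_zero(x):
    m = tf.math.reduce_max(x, axis=1, keepdims=True)   # row-wise max
    is_max = tf.cast(tf.math.equal(x, m), tf.float32)  # 1.0 wherever the max occurs
    unique = tf.cast(
        tf.math.equal(
            tf.reduce_sum(is_max, axis=1, keepdims=True), 1.0),
        tf.float32)                                    # 1.0 only when the max is unique
    return is_max * unique

print(one_hot_or_zero(tf.constant([[0.92, 0.20, 0.77],
                                   [0.92, 0.92, 0.77]])))
# tf.Tensor(
# [[1. 0. 0.]
#  [0. 0. 0.]], shape=(2, 3), dtype=float32)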
