Using categorical cross-entropy as the loss function for optimizer.minimize - tensorflow.js

Problem description

Following (roughly) the example in the TensorFlow tutorial, the code used to train the model is:

// The weights and biases for the two dense layers.
const w1 = tf.variable(tf.randomNormal([784, 32]));
const b1 = tf.variable(tf.randomNormal([32]));
const w2 = tf.variable(tf.randomNormal([32, 10]));
const b2 = tf.variable(tf.randomNormal([10]));

function model(x) {
  return x.matMul(w1).add(b1).relu().matMul(w2).add(b2);
}
const xs = tf.data.generator(data);
const ys = tf.data.generator(labels);
// Zip the data and labels together, shuffle and batch 32 samples at a time.
const ds = tf.data.zip({xs, ys}).shuffle(100 /* bufferSize */).batch(32);

const optimizer = tf.train.sgd(0.1 /* learningRate */);
// Train for 5 epochs.
for (let epoch = 0; epoch < 5; epoch++) {
  await ds.forEachAsync(({xs, ys}) => {
    optimizer.minimize(() => {
      const predYs = model(xs);
      const loss = tf.losses.softmaxCrossEntropy(ys, predYs);
      loss.data().then(l => console.log('Loss', l));
      return loss;
    });
  });
  console.log('Epoch', epoch);
}

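The two main changes between this model and mine are that 1) I want to use categorical cross-entropy as my loss function, and 2) I am using let model = tf.sequential() instead of writing a function for my model.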

My model is constructed as:

function getModel() {
    const model = tf.sequential();
    const IMAGE_WIDTH = 28;
    const IMAGE_HEIGHT = 28;
    const IMAGE_CHANNELS = 1;  


    model.add(tf.layers.conv2d({
      inputShape: [IMAGE_WIDTH, IMAGE_HEIGHT, IMAGE_CHANNELS],
      kernelSize: 5,
      filters: 8,
      strides: 1,
      activation: 'relu',
      kernelInitializer: 'varianceScaling'
    }));

    // The MaxPooling layer acts as a sort of downsampling using max values
    // in a region instead of averaging.  
    model.add(tf.layers.maxPooling2d({poolSize: [2, 2], strides: [2, 2]}));

    // Repeat another conv2d + maxPooling stack. 
    // Note that we have more filters in the convolution.
    model.add(tf.layers.conv2d({
      kernelSize: 5,
      filters: 16,
      strides: 1,
      activation: 'relu',
      kernelInitializer: 'varianceScaling'
    }));
    model.add(tf.layers.maxPooling2d({poolSize: [2, 2], strides: [2, 2]}));

    // Now we flatten the output from the 2D filters into a 1D vector to prepare
    // it for input into our last layer. This is common practice when feeding
    // higher dimensional data to a final classification output layer.
    model.add(tf.layers.flatten());

    // Our last layer is a dense layer which has 10 output units, one for each
    // output class (i.e. 0, 1, 2, 3, 4, 5, 6, 7, 8, 9).
    const NUM_OUTPUT_CLASSES = 10;
    model.add(tf.layers.dense({
      units: NUM_OUTPUT_CLASSES,
      kernelInitializer: 'varianceScaling',
      activation: 'softmax'
    }));


    // Choose an optimizer, loss function and accuracy metric,
    // then compile and return the model
    const optimizer = tf.train.adam();
    model.compile({
      optimizer: optimizer,
      loss: 'categoricalCrossentropy',
      metrics: ['accuracy'],
    });

    return model;
}

The training code is:

let input = getNextTrainBatch(BATCH_SIZE,TRAIN_DATA_SIZE, data);
model.optimizer.minimize(() => {
        const predict = model.apply(input[0]);
        const loss = tf.metrics.categoricalCrossentropy(input[1],predict);
        return loss;
});

With this setup, running the code gives the error Uncaught (in promise) Error: Tensor is disposed. If I move model.apply(input[0]) outside of model.optimizer.minimize, I get this error instead:

Uncaught (in promise) Error: Cannot find a connection between any variable and the result of the loss function y=f(x). Please make sure the operations that use variables are inside the function f passed to minimize().

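I think the second error occurs because categorical cross-entropy lives under tf.metrics rather than tf.losses, but tf.losses has no categorical cross-entropy. I don't know what to do here, or whether I can use categorical cross-entropy for this at all.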

Note: in case the version matters, I have "@tensorflow/tfjs": "1.0.2" in my dependencies.

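Edit: I realized that I need to use tf.losses.softmaxCrossEntropy, rather than tf.metrics.categoricalCrossentropy, as the actual loss function. However, even after making this change I still get the same 2 errors as before.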
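To make the difference between the two loss functions concrete, here is a minimal plain-JavaScript sketch of the arithmetic, written without TensorFlow.js. It assumes, as I understand the API, that tf.metrics.categoricalCrossentropy works on probabilities and yields one value per example, while tf.losses.softmaxCrossEntropy takes raw logits and, with its default reduction, yields a single scalar. The function names below (softmax, categoricalCrossEntropy, perExampleLoss, meanLossFromLogits) are hypothetical helpers for illustration only, not library calls.

```javascript
// Plain-JavaScript illustration only; none of these helpers are TensorFlow.js APIs.

// Numerically stable softmax over one row of logits.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map(v => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map(v => v / sum);
}

// Categorical cross-entropy for one example: -sum_i y_i * log(p_i).
// eps guards against log(0); its value is an arbitrary choice for this sketch.
function categoricalCrossEntropy(oneHot, probs, eps = 1e-7) {
  return -oneHot.reduce(
    (acc, y, i) => acc + y * Math.log(Math.max(probs[i], eps)), 0);
}

// One loss value per example, given rows of probabilities.
function perExampleLoss(labels, probs) {
  return labels.map((y, n) => categoricalCrossEntropy(y, probs[n]));
}

// Softmax applied to logits first, then the mean over the batch: one scalar.
function meanLossFromLogits(labels, logits) {
  const losses = perExampleLoss(labels, logits.map(softmax));
  return losses.reduce((a, b) => a + b, 0) / losses.length;
}

// Two examples, three classes (made-up numbers).
const labels = [[1, 0, 0], [0, 1, 0]];
const logits = [[2.0, 1.0, 0.1], [0.5, 0.5, 0.5]];

const perExample = perExampleLoss(labels, logits.map(softmax));
const scalar = meanLossFromLogits(labels, logits);

console.log('per-example losses:', perExample);
console.log('scalar loss:', scalar);
```

The function passed to minimize has to return a scalar, so a per-example result would need to be averaged before it is returned. Also note that a model whose last layer already applies softmax outputs probabilities, not logits, so feeding it to a logits-based loss would apply softmax twice.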

Tags: javascript, machine-learning, tensorflow.js

Solution

