TensorFlow Object Detection API: train.py fails with a 'self' error

Problem description

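I have completed all the installation steps for the TensorFlow Object Detection API, and I have checked several installation guides to make sure I did everything correctly. I still get this error every time: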

Instructions for updating:
Please switch to tf.train.create_global_step
Traceback (most recent call last):
  File "train.py", line 183, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "train.py", line 179, in main
    graph_hook_fn=graph_rewriter_fn)
  File "/xxxx/models/research/object_detection/trainer.py", line 262, in train
    global_step = slim.create_global_step()
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py", line 250, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 135, in create_global_step
    return training_util.create_global_step(graph)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/training_util.py", line 143, in create_global_step
    ops.GraphKeys.GLOBAL_STEP])
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1297, in get_variable
    constraint=constraint)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1093, in get_variable
    constraint=constraint)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 439, in get_variable
    constraint=constraint)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 408, in _true_getter
    use_resource=use_resource, constraint=constraint)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 800, in _get_single_variable
    use_resource=use_resource)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 2157, in variable
    use_resource=use_resource)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 2147, in <lambda>
    previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 2130, in default_variable_creator
    constraint=constraint)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 235, in __init__
    constraint=constraint)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 337, in _init_from_args
    initial_value(), name="initial_value", dtype=dtype)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 784, in <lambda>
    shape.as_list(), dtype=dtype, partition_info=partition_info)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py", line 99, in __call__
    return array_ops.zeros(shape, dtype)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/array_ops.py", line 1601, in zeros
    output = fill(shape, constant(zero, dtype=dtype), name=name)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/framework/constant_op.py", line 214, in constant
    value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/framework/tensor_util.py", line 533, in make_tensor_proto
    append_fn(tensor_proto, proto_values)
  File "tensorflow/python/framework/fast_tensor_util.pyx", line 45, in tensorflow.python.framework.fast_tensor_util.AppendInt64ArrayToTensorProto
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/google/protobuf/internal/containers.py", line 251, in append
    self._values.append(self._type_checker.CheckValue(value))
UnboundLocalError: local variable 'self' referenced before assignment

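I am using a very small training set (15 images) just to get a feel for the whole process; I will add more later to improve accuracy. I mention it in case it matters.

I suspect something is wrong with the config file, with train.py itself (relative to my dataset), or with the .record files. As far as I can tell, though, everything looks correct.

I am using the ssd_mobilenet_v1_coco_11_06_2017 pretrained checkpoint and the ssd_mobilenet_v1_pets config.

Any hints are much appreciated. Please let me know if I have left out any necessary details.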

Tags: python, tensorflow, machine-learning, object-detection

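Solution

I ran into the same problem on Python 3.7 (macOS 10.13.4). I rechecked the paths given in the .config file, and they were correct. I was able to fix it by downgrading Python to version 2.7, following the same installation steps for the TensorFlow Object Detection API, and then searching the .config file for "schedule" and deleting the following block: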

schedule { step: 0 learning_rate: .0001 }
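If you would rather not edit the file by hand, the deletion can be scripted. This is only a minimal sketch: it assumes the block looks like the one above (a step of 0 followed by a single learning_rate value), and the helper name remove_step_zero_schedule is made up for illustration; it is not part of the Object Detection API.

```python
import re

# Matches a `schedule { step: 0 learning_rate: <value> }` block, whether it is
# written on one line or spread over several lines. Blocks with any other
# step value (for example `step: 900000`) are left alone.
STEP_ZERO_SCHEDULE = re.compile(
    r"[ \t]*schedule\s*\{\s*step:\s*0\s+learning_rate:\s*[^\s}]+\s*\}[ \t]*\n?"
)


def remove_step_zero_schedule(config_text):
    """Return config_text with every step-0 schedule block removed."""
    return STEP_ZERO_SCHEDULE.sub("", config_text)


# Illustrative input only; a real pipeline config contains much more.
sample = (
    "manual_step_learning_rate {\n"
    "  initial_learning_rate: 0.0002\n"
    "  schedule {\n"
    "    step: 0\n"
    "    learning_rate: .0002\n"
    "  }\n"
    "  schedule {\n"
    "    step: 900000\n"
    "    learning_rate: .00002\n"
    "  }\n"
    "}\n"
)
print(remove_step_zero_schedule(sample))
```

To apply it to a real config, read the file into a string, pass it through the function, and write the result back. Keep a backup copy of the original .config first.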

Then run:

python legacy/train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_pets.config

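The error was probably caused by a version mismatch: the traceback ends inside protobuf running under Python 3.7, and the problem did not occur for me after switching to Python 2.7.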
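To see which versions are actually in play before changing anything, a quick check like the one below may help. It is a sketch, not a diagnosis: it only prints the interpreter version and the installed tensorflow and protobuf versions, and the helper name installed_versions is made up for illustration. It relies on importlib.metadata, so it needs Python 3.8 or newer; on older interpreters, pip show tensorflow protobuf reports the same information.

```python
import sys
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print the interpreter version, then the two packages seen in the traceback.
print("python", sys.version.split()[0])
for name, version in installed_versions(["tensorflow", "protobuf"]).items():
    print(name, version or "not installed")
```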

