python - Class weights for balancing data in the TensorFlow Object Detection API
Problem description
I am fine-tuning an SSD object detector with the TensorFlow Object Detection API on the Open Images Dataset. My training data contains imbalanced classes, e.g.
- top (5K images)
- dress (50K images)
- etc.
I would like to add class weights to the classification loss to improve performance. How do I do that? The following section of the config file seems relevant:
loss {
  classification_loss {
    weighted_sigmoid {
    }
  }
  localization_loss {
    weighted_smooth_l1 {
    }
  }
  ...
  classification_weight: 1.0
  localization_weight: 1.0
}
How can I change the config file to add a classification loss weight per class? If not through the config file, what is the recommended way of doing this?
Solution
The API expects a weight for each object (bbox) directly in the annotation files. Given that requirement, the options for using class weights appear to be:
1) If you have a custom dataset, you can modify the annotations of each object (bbox) to include a weight field as 'object/weight'.
2) If you don't want to modify your annotations, you can just re-create the tf_records files so that they include the weights of the bboxes.
3) Modify the code of the API (which looks rather tricky to me).
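For reference, the (1.0, 0.1) weights used below follow an inverse-frequency scheme: each class is weighted by the rarest class's image count divided by its own. A minimal sketch (the counts are the ones from the question; the helper name is mine, not part of the API):

```python
# Hypothetical helper: derive per-class weights from image counts by
# inverse frequency, so the rarest class gets weight 1.0.
def inverse_frequency_weights(counts):
    rarest = min(counts.values())
    return {name: rarest / count for name, count in counts.items()}

# Counts from the question: 5K 'top' images vs. 50K 'dress' images.
weights = inverse_frequency_weights({'top': 5000, 'dress': 50000})
print(weights)  # {'top': 1.0, 'dress': 0.1}
```

Any other weighting scheme works too; the API only cares about the per-bbox weight values stored in the records.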
I decided to go with #2, so here is the code to generate such a weighted tf_records file for a custom dataset with two classes ('top', 'dress') and weights (1.0, 0.1), given a folder of xml annotation files:
import io
import hashlib
import random
import xml.etree.ElementTree as ET

import tensorflow as tf
from PIL import Image
from object_detection.utils import dataset_util

# Define the class names and their weights
class_names = ['top', 'dress', ...]
class_weights = [1.0, 0.1, ...]

def create_example(xml_file):
    tree = ET.parse(xml_file)
    root = tree.getroot()
    image_name = root.find('filename').text
    image_path = root.find('path').text
    file_name = image_name.encode('utf8')
    size = root.find('size')
    width = int(size[0].text)
    height = int(size[1].text)
    xmin = []
    ymin = []
    xmax = []
    ymax = []
    classes = []
    classes_text = []
    truncated = []
    poses = []
    difficult_obj = []
    weights = []  # Important line

    for member in root.findall('object'):
        # Normalize the bbox coordinates to [0, 1]
        xmin.append(float(member[4][0].text) / width)
        ymin.append(float(member[4][1].text) / height)
        xmax.append(float(member[4][2].text) / width)
        ymax.append(float(member[4][3].text) / height)
        difficult_obj.append(0)

        class_name = member[0].text
        class_id = class_names.index(class_name)
        weights.append(class_weights[class_id])

        if class_name == 'top':
            classes_text.append('top'.encode('utf8'))
            classes.append(1)
        elif class_name == 'dress':
            classes_text.append('dress'.encode('utf8'))
            classes.append(2)
        else:
            print('E: class not recognized!')

        truncated.append(0)
        poses.append('Unspecified'.encode('utf8'))

    full_path = image_path
    with tf.gfile.GFile(full_path, 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    if image.format != 'JPEG':
        raise ValueError('Image format not JPEG')
    key = hashlib.sha256(encoded_jpg).hexdigest()

    # Create the TFRecord Example
    example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(file_name),
        'image/source_id': dataset_util.bytes_feature(file_name),
        'image/key/sha256': dataset_util.bytes_feature(key.encode('utf8')),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature('jpeg'.encode('utf8')),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmin),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmax),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymin),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymax),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
        'image/object/difficult': dataset_util.int64_list_feature(difficult_obj),
        'image/object/truncated': dataset_util.int64_list_feature(truncated),
        'image/object/view': dataset_util.bytes_list_feature(poses),
        'image/object/weight': dataset_util.float_list_feature(weights)  # Important line
    }))
    return example

def main(_):
    weighted_tf_records_output = 'name_of_records_file.record'  # output file
    annotations_path = '/path/to/annotations/folder/*.xml'  # input annotations
    writer_train = tf.python_io.TFRecordWriter(weighted_tf_records_output)
    filename_list = tf.train.match_filenames_once(annotations_path)
    init = (tf.global_variables_initializer(), tf.local_variables_initializer())
    sess = tf.Session()
    sess.run(init)
    files = sess.run(filename_list)  # avoid shadowing the built-in `list`
    random.shuffle(files)
    for xml_file in files:
        print('-> Processing {}'.format(xml_file))
        example = create_example(xml_file)
        writer_train.write(example.SerializeToString())
    writer_train.close()
    print('-> Successfully converted dataset to TFRecord.')

if __name__ == '__main__':
    tf.app.run()
If you have another kind of annotations, the code will be very similar, but unfortunately this exact code will not work for them as-is.
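For reference, create_example above assumes PASCAL-VOC-style XML, where each <object> stores its class name as its first child and its <bndbox> as its fifth child (hence the positional member[0] / member[4] lookups). A minimal, hypothetical annotation illustrating that assumed layout, parsed the same way:

```python
import xml.etree.ElementTree as ET

# A minimal, hypothetical VOC-style annotation matching the positional
# lookups in create_example: member[0] is <name>, member[4] is <bndbox>.
xml = """
<annotation>
  <filename>img1.jpg</filename>
  <path>/data/img1.jpg</path>
  <size><width>200</width><height>100</height><depth>3</depth></size>
  <object>
    <name>top</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>20</xmin><ymin>10</ymin><xmax>120</xmax><ymax>90</ymax>
    </bndbox>
  </object>
</annotation>
"""
root = ET.fromstring(xml)
size = root.find('size')
width, height = int(size[0].text), int(size[1].text)
member = root.findall('object')[0]
print(member[0].text)                    # class name: 'top'
print(float(member[4][0].text) / width)  # normalized xmin: 0.1
```

If your annotations order the children differently, look elements up by tag (member.find('name'), member.find('bndbox')) instead of by position.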