python - TensorFlow/Keras - 预期 global_average_pooling2d_1_input 的形状为 (1, 1, 2048) 但得到的数组形状为 (7, 7, 2048)
问题描述
我对 TensorFlow 和图像分类还很陌生,所以我可能缺少关键知识,这可能就是我面临这个问题的原因。
我已经ResNet50
在 TensorFlow 中建立了一个模型,用于使用该库对犬种进行图像分类,ImageNet
并且我已经成功地训练了一个可以检测各种犬种的神经网络。
我现在想将狗的随机图像传递给我的模型,以便它吐出它认为的狗品种的输出。但是,当我运行此函数时,当它尝试执行该行时dog_breed_predictor("<file path to image>")
出现错误,我不知道如何解决这个问题。expected global_average_pooling2d_1_input to have shape (1, 1, 2048) but got array with shape (7, 7, 2048)
Resnet50_model.predict(bottleneck_feature)
这是代码。我已经提供了我认为与问题相关的所有内容。
import cv2
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from tqdm import tqdm
from sklearn.datasets import load_files
np_utils = tf.keras.utils
# define function to load train, test, and validation datasets
def load_dataset(path):
data = load_files(path)
dog_files = np.array(data['filenames'])
dog_targets = np_utils.to_categorical(np.array(data['target']), 133)
return dog_files, dog_targets
# load train, test, and validation datasets
train_files, train_targets = load_dataset('dogImages/dogImages/train')
valid_files, valid_targets = load_dataset('dogImages/dogImages/valid')
test_files, test_targets = load_dataset('dogImages/dogImages/test')
#define Resnet50 model
Resnet50_model = ResNet50(weights="imagenet")
def path_to_tensor(img_path):
#loads RGB image as PIL.Image.Image type
img = image.load_img(img_path, target_size=(224, 224))
#convert PIL.Image.Image type to 3D tensor with shape (224, 224, 3)
x = image.img_to_array(img)
#convert 3D tensor into 4D tensor with shape (1, 224, 224, 3)
return np.expand_dims(x, axis=0)
from keras.applications.resnet50 import preprocess_input, decode_predictions
def ResNet50_predict_labels(img_path):
#returns prediction vector for image located at img_path
img = preprocess_input(path_to_tensor(img_path))
return np.argmax(Resnet50_model.predict(img))
###returns True if a dog is detected in the image stored at img_path
def dog_detector(img_path):
prediction = ResNet50_predict_labels(img_path)
return ((prediction <= 268) & (prediction >= 151))
###Obtain bottleneck features from another pre-trained CNN
bottleneck_features = np.load("bottleneck_features/DogResnet50Data.npz")
train_DogResnet50 = bottleneck_features["train"]
valid_DogResnet50 = bottleneck_features["valid"]
test_DogResnet50 = bottleneck_features["test"]
###Define your architecture
Resnet50_model = tf.keras.Sequential()
Resnet50_model.add(tf.keras.layers.GlobalAveragePooling2D(input_shape=train_DogResnet50.shape[1:]))
Resnet50_model.add(tf.contrib.keras.layers.Dense(133, activation="softmax"))
Resnet50_model.summary()
###Compile the model
Resnet50_model.compile(loss="categorical_crossentropy", optimizer="rmsprop", metrics=["accuracy"])
###Train the model
checkpointer = tf.keras.callbacks.ModelCheckpoint(filepath="saved_models/weights.best.ResNet50.hdf5",
verbose=1, save_best_only=True)
Resnet50_model.fit(train_DogResnet50, train_targets,
validation_data=(valid_DogResnet50, valid_targets),
epochs=20, batch_size=20, callbacks=[checkpointer])
###Load the model weights with the best validation loss.
Resnet50_model.load_weights("saved_models/weights.best.ResNet50.hdf5")
###Calculate classification accuracy on the test dataset
Resnet50_predictions = [np.argmax(Resnet50_model.predict(np.expand_dims(feature, axis=0))) for feature in test_DogResnet50]
#Report test accuracy
test_accuracy = 100*np.sum(np.array(Resnet50_predictions)==np.argmax(test_targets, axis=1))/len(Resnet50_predictions)
print("Test accuracy: %.4f%%" % test_accuracy)
def extract_Resnet50(tensor):
from keras.applications.resnet50 import ResNet50, preprocess_input
return ResNet50(weights='imagenet', include_top=False).predict(preprocess_input(tensor))
def dog_breed(img_path):
#extract bottleneck features
bottleneck_feature = extract_Resnet50(path_to_tensor(img_path))
#obtain predicted vector
predicted_vector = Resnet50_model.predict(bottleneck_feature) #shape error occurs here
#return dog breed that is predicted by the model
return dog_names[np.argmax(predicted_vector)]
def dog_breed_predictor(img_path):
#determine the predicted dog breed
breed = dog_breed(img_path)
#display the image
img = cv2.imread(img_path)
cv_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(cv_rgb)
plt.show()
#display relevant predictor result
if dog_detector(img_path):
print("This is a dog and its breed is: " + str(breed))
elif face_detector(img_path):
print("This is a human but it looks like a: " + str(breed))
else:
print("I don't know what this is.")
dog_breed_predictor("dogImages/dogImages/train/016.Beagle/Beagle_01126.jpg")
我输入函数的图像来自用于训练模型的同一数据集——我想看看模型是否按预期工作——所以这个错误使它更加混乱。我可能做错了什么?
解决方案
感谢nessuno的帮助,我解决了这个问题。问题确实出pooling
在ResNet50
.
我上面的脚本中的以下代码:
return ResNet50(weights='imagenet',
include_top=False).predict(preprocess_input(tensor))
返回一个形状(1, 7, 7, 2048)
(诚然,我不完全理解为什么)。为了解决这个问题,我在参数中添加pooling="avg"
如下:
return ResNet50(weights='imagenet',
include_top=False,
pooling="avg").predict(preprocess_input(tensor))
这反而返回了一个形状(1, 2048)
(同样,诚然,我不知道为什么。)
但是,该模型仍需要 4-D 形状。为了解决这个问题,我在dog_breed()
函数中添加了以下代码:
print(bottleneck_feature.shape) #returns (1, 2048)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
print(bottleneck_feature.shape) #returns (1, 1, 1, 1, 2048) - yes a 5D shape, not 4.
这将返回一个形状(1, 1, 1, 1, 2048)
。出于某种原因,当我只添加了 2 个维度时,模型仍然抱怨它是 3D 形状,但当我添加了第 3 个维度时它停止了(这很奇怪,我想了解更多关于为什么会这样。)。
所以总的来说,我的dog_breed()
功能来自:
def dog_breed(img_path):
#extract bottleneck features
bottleneck_feature = extract_Resnet50(path_to_tensor(img_path))
#obtain predicted vector
predicted_vector = Resnet50_model.predict(bottleneck_feature) #shape error occurs here
#return dog breed that is predicted by the model
return dog_names[np.argmax(predicted_vector)]
对此:
def dog_breed(img_path):
#extract bottleneck features
bottleneck_feature = extract_Resnet50(path_to_tensor(img_path))
print(bottleneck_feature.shape) #returns (1, 2048)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
print(bottleneck_feature.shape) #returns (1, 1, 1, 1, 2048) - yes a 5D shape, not 4.
#obtain predicted vector
predicted_vector = Resnet50_model.predict(bottleneck_feature) #shape error occurs here
#return dog breed that is predicted by the model
return dog_names[np.argmax(predicted_vector)]
同时确保将参数pooling="avg"
添加到我对ResNet50
.
推荐阅读
- android - 为什么 Android Studio 在默认文件中显示错误
- python - 具有多个条件和不同更新的 SQLite 命令
- javascript - 防止表单提交 jQuery
- python - 当每个文件名不包含日期时,使用 Python 从一系列文件名中删除日期?
- spring-boot - 如何从 KeyCloak 获取用户组并用 Java 打印它们
- java - 为什么使用 Spring 的依赖注入而不是常规依赖注入?
- flutter - Flutter变量在更改时不会改变背景颜色?
- r - 如何将给定变量的演变添加为 R 数据表中的新变量?
- php - 如何将相同的标识列添加到不同的表?
- azure-active-directory - west-us2 区域的 Azure Log Analytics API 权限