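python - Visualizing and training a custom BlazePose model (pose estimation keypoint detection)
Question
I am currently working with a custom BlazePose model (here is the repo and code). I am having trouble visualizing its predictions. While checking the source code, I found that the model returns 3 outputs: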
model = Model(inputs=inputs, outputs=[conv99_1, sigm99_1, reshape99_2])
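I also looked at the tf.js code (https://github.com/terryky/tfjs_webgl_app/blob/master/blazepose_fullbody/tfjs_blazepose.js), but I don't understand how they visualize the points.
How can I visualize these points on an image? Also, I am trying to train with 58 keypoints instead of 39. Can you advise on that?
For a 256 x 256 x 3 input image, my output shapes are: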
(1, 128, 128, 1)
(1, 1, 1, 1)
(1, 156)
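The last shape is consistent with 39 keypoints times 4 values each (39 x 4 = 156), which matches the stride-4 slicing used in the inference code in the solution. Here is a minimal sketch of that reshaping. The per-keypoint layout (x, y first, then two further values) is an assumption, and `split_landmarks` is a hypothetical helper, not part of the repo:

```python
import numpy as np

def split_landmarks(flat, num_keypoints=39, values_per_keypoint=4):
    """Reshape a flat landmark vector into (num_keypoints, values_per_keypoint).

    Assumption: values are stored keypoint by keypoint, x and y first.
    """
    flat = np.asarray(flat, dtype=np.float32).reshape(-1)
    expected = num_keypoints * values_per_keypoint
    if flat.size != expected:
        raise ValueError(f"expected {expected} values, got {flat.size}")
    return flat.reshape(num_keypoints, values_per_keypoint)

# Stand-in for the third model output, shape (1, 156)
dummy = np.arange(156, dtype=np.float32).reshape(1, 156)
landmarks = split_landmarks(dummy[0])
xy = landmarks[:, :2]  # x, y columns only
print(landmarks.shape, xy.shape)
```

Under the same assumption, a 58-keypoint variant would need a landmark output of 58 x 4 = 232 values, so `split_landmarks(vec, num_keypoints=58)` would apply once the final layer and the training labels are resized accordingly.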
Here is the full model architecture:
Input :
==================================================================================================
input (InputLayer)              [(None, 256, 256, 3) 0
__________________________________________________________________________________________________
Output :
output_segmentation (Conv2D)    (None, 128, 128, 1)  73          conv2d_68[0][0]
__________________________________________________________________________________________________
tf_op_layer_Sigmoid (TensorFlow (None, 1, 1, 1)      0           conv2d_69[0][0]
__________________________________________________________________________________________________
tf_op_layer_ld_3d (TensorFlowOp (1, 156)             0           conv2d_70[0][0]
Solution
I developed the following inference code, which works well for me:
import os
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import load_model
import cv2

def set_env():
    """Make only the first GPU visible and enable memory growth on it."""
    gpus = tf.config.experimental.list_physical_devices('GPU')
    if gpus:
        try:
            # Restrict TensorFlow to only use the first GPU
            tf.config.experimental.set_visible_devices(gpus[0], 'GPU')
            # Currently, memory growth needs to be the same across GPUs
            for gpu in gpus:
                tf.config.experimental.set_memory_growth(gpu, True)
            logical_gpus = tf.config.experimental.list_logical_devices('GPU')
            print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
        except RuntimeError as e:
            # Memory growth must be set before GPUs have been initialized
            print(e)

def return_model(filepath):
    """Load the saved Keras model from an .h5 file."""
    model = load_model(filepath)
    model.compile()
    return model

def get_preds(model, image):
    """Run the model on a single image; returns the list of 3 outputs."""
    img = np.expand_dims(image, axis=0)  # add the batch dimension
    preds = model.predict(img)
    return preds

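def convert_preds_to_xy(preds):
    """Extract integer (x, y) pairs from the landmark output."""
    kpts = []
    # preds[2] has shape (1, 156); with a stride of 4,
    # offset 0 is read as x and offset 1 as y for each keypoint
    temp = preds[2][0]
    for x, y in zip(temp[::4], temp[1::4]):
        kpts.append((int(x), int(y)))
    return kpts
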
def infer_video(model, video=0):
    """Run inference on a video source (0 = default webcam) until 'q' is pressed."""
    cap = cv2.VideoCapture(video)
    while cap.isOpened():
        okay, frame = cap.read()
        if not okay:
            print("Can't open webcam, please try again!")
            break
        inframe = frame.copy()
        # Resize to the model input size and scale pixel values to [0, 1]
        inframe_resize = cv2.resize(inframe, (256, 256)) / 255
        preds = get_preds(model, inframe_resize)
        kpts = convert_preds_to_xy(preds)
        # Draw the skeleton on the resized frame
        for pair in POSE_PAIRS:
            cv2.line(inframe_resize, kpts[pair[0]], kpts[pair[1]], (0, 255, 0), thickness=1)
        # for point in kpts:
        #     cv2.circle(inframe_resize, point, 2, (0, 0, 255), 2)
        cv2.imshow('Inference', inframe_resize)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()

def infer_image(model, image):
    """Run inference on a single image file and show the result."""
    img = cv2.imread(image)
    if img is None:
        # cv2.imread returns None when the file cannot be read
        print('Could not read image: {}'.format(image))
        return
    # Resize to the model input size and scale pixel values to [0, 1]
    img_resize = cv2.resize(img, (256, 256)) / 255
    preds = get_preds(model, img_resize)
    kpts = convert_preds_to_xy(preds)
    # Draw the skeleton on the resized image
    for pair in POSE_PAIRS:
        cv2.line(img_resize, kpts[pair[0]], kpts[pair[1]], (0, 255, 0), thickness=1)
    # for idx, point in enumerate(kpts):
    #     cv2.circle(img_resize, point, 2, (0, 0, 255), 2)
    #     cv2.putText(img_resize, "{}".format(idx), point, cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1, lineType=cv2.LINE_AA)
    cv2.imshow('Inference Image', img_resize)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

if __name__ == '__main__':
    # Index pairs of keypoints to connect when drawing the skeleton.
    # Defined at module scope here, so the infer_* functions can read it.
    POSE_PAIRS = [(0, 1), (0, 4), (1, 2), (2, 3), (3, 7), (4, 5), (5, 6), (6, 8), (9, 10),
                  (11, 12), (12, 14), (14, 16), (16, 22), (16, 18), (16, 22), (18, 20),
                  (12, 24), (24, 26), (26, 28), (28, 32), (28, 30), (30, 32), (24, 23),
                  (11, 13), (13, 15), (15, 21), (15, 17), (15, 19), (19, 17), (11, 23),
                  (23, 25), (25, 27), (27, 29), (27, 31), (29, 31)
                  ]
    set_env()
    model = return_model('full_pose_landmark_39kp.h5')
    model.summary()
    infer_video(model)
    # infer_image(model, 'image.jpg')
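
The code above draws on the 256 x 256 resized copy. To draw on the original frame instead, the keypoints can be scaled back to the frame size. This is a rough sketch, assuming the x and y outputs are in pixel units of the 256 x 256 input (which is how the code above treats them when it casts them to int). `scale_keypoints` is a hypothetical helper, not part of the repo:

```python
def scale_keypoints(kpts, frame_width, frame_height, input_size=256):
    """Map (x, y) points from the square model input to the original frame.

    Assumption: kpts are pixel coordinates in an input_size x input_size image
    that was produced by a plain resize (no letterboxing or cropping).
    """
    sx = frame_width / input_size
    sy = frame_height / input_size
    return [(int(round(x * sx)), int(round(y * sy))) for x, y in kpts]

# Example with made-up points and a 640 x 480 frame
kpts_256 = [(128, 128), (0, 256), (64, 32)]
print(scale_keypoints(kpts_256, frame_width=640, frame_height=480))
# -> [(320, 240), (0, 480), (160, 60)]
```

In infer_video this would be called as scale_keypoints(kpts, frame.shape[1], frame.shape[0]), with the lines then drawn on frame rather than on inframe_resize.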