Pose detection on two videos simultaneously in a browser web application does not work

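Question

I wrote the following web application to perform pose detection on two videos. The idea is to show, say, a benchmark video in the first and a user video (either a pre-recorded video or their webcam) in the second, and then compare the movements of the two.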

import dash, cv2
import dash_core_components as dcc
import dash_html_components as html
import mediapipe as mp
from flask import Flask, Response

mp_drawing = mp.solutions.drawing_utils
mp_pose = mp.solutions.pose

class VideoCamera(object):
    def __init__(self, video_path):
        self.video = cv2.VideoCapture(video_path)

    def __del__(self):
        self.video.release()

    def get_frame(self):
        with mp_pose.Pose(min_detection_confidence=0.5, min_tracking_confidence=0.5) as pose:
            success, image = self.video.read()

            # Recolor image to RGB
            image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            image.flags.writeable = False
          
            # Make detection
            results = pose.process(image)
        
            # Recolor back to BGR
            image.flags.writeable = True
            image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
            
            # Render detections
            mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_pose.POSE_CONNECTIONS,
                                        mp_drawing.DrawingSpec(color=(245,117,66), thickness=2, circle_radius=2), 
                                        mp_drawing.DrawingSpec(color=(245,66,230), thickness=2, circle_radius=2) 
                                     )

            _, jpeg = cv2.imencode('.jpg', image)
            return jpeg.tobytes()


def gen(camera):
    while True:
        frame = camera.get_frame()
        yield (b'--frame\r\n'
               b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n\r\n')

server = Flask(__name__)
app = dash.Dash(__name__, server=server)

@server.route('/video_feed_1')
def video_feed_1():
    return Response(gen(VideoCamera(0)), mimetype='multipart/x-mixed-replace; boundary=frame')

@server.route('/video_feed_2')
def video_feed_2():
    return Response(gen(VideoCamera(0)), mimetype='multipart/x-mixed-replace; boundary=frame')

app.layout = html.Div([
    html.Img(src="/video_feed_1", style={'width' : '40%', 'padding': 10}),
    html.Img(src="/video_feed_2", style={'width' : '40%', 'padding': 10})
])

if __name__ == '__main__':
    app.run_server(debug=True)

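However, when I run this code, the fans on my laptop kick in and nothing is rendered in the browser. It works fine with any video, but it seems it can only handle a single video. You can remove either of the two functions video_feed_1() or video_feed_2(), and you can also replace the video path 0 (the webcam) with the path to any other video (for example, /path/to/video.mp4), and it works fine.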

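Also, when I simply display the two videos in the browser without pose detection, that works fine too. You can try this by replacing the get_frame() function in the class above with the following: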

def get_frame(self):
    success, image = self.video.read()
    ret, jpeg = cv2.imencode('.jpg', image)
    return jpeg.tobytes()

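So, how do I reduce the load on the browser while rendering the pose estimation of both videos simultaneously? And why is the load so high when rendering in the browser, when it works perfectly fine if the pose estimations are rendered in two pop-up windows (i.e., with cv.imshow(image))?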

Tags: python-3.x, opencv, flask, computer-vision, plotly-dash

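Answer

For tasks that require real-time updates, such as pose estimation, I would recommend using websockets for communication. Here is a small example in which a Quart server streams the data via websockets to a Dash frontend: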

import asyncio
import base64
import dash, cv2
import dash_html_components as html
import mediapipe as mp
import threading

from dash.dependencies import Output, Input
from quart import Quart, websocket
from dash_extensions import WebSocket

mp_drawing = mp.solutions.drawing_utils
mp_pose = mp.solutions.pose


class VideoCamera(object):
    def __init__(self, video_path):
        self.video = cv2.VideoCapture(video_path)

    def __del__(self):
        self.video.release()

    def get_frame(self):
        with mp_pose.Pose(min_detection_confidence=0.5, min_tracking_confidence=0.5) as pose:
            success, image = self.video.read()

            # Recolor image to RGB
            image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            image.flags.writeable = False

            # Make detection
            results = pose.process(image)

            # Recolor back to BGR
            image.flags.writeable = True
            image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

            # Render detections
            mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_pose.POSE_CONNECTIONS,
                                      mp_drawing.DrawingSpec(color=(245, 117, 66), thickness=2, circle_radius=2),
                                      mp_drawing.DrawingSpec(color=(245, 66, 230), thickness=2, circle_radius=2)
                                      )

            _, jpeg = cv2.imencode('.jpg', image)
            return jpeg.tobytes()


# Setup small Quart server for streaming via websocket, one for each stream.
server = Quart(__name__)
n_streams = 2


async def stream(camera, delay=None):
    while True:
        if delay is not None:
            await asyncio.sleep(delay)  # add delay if CPU usage is too high
        frame = camera.get_frame()
        await websocket.send(f"data:image/jpeg;base64, {base64.b64encode(frame).decode()}")


@server.websocket("/stream0")
async def stream0():
    camera = VideoCamera("./kangaroo.mp4")
    await stream(camera)


@server.websocket("/stream1")
async def stream1():
    camera = VideoCamera("./yoga.mp4")
    await stream(camera)


# Create small Dash application for UI.
app = dash.Dash(__name__)
app.layout = html.Div(
    [html.Img(style={'width': '40%', 'padding': 10}, id=f"v{i}") for i in range(n_streams)] +
    [WebSocket(url=f"ws://127.0.0.1:5000/stream{i}", id=f"ws{i}") for i in range(n_streams)]
)
# Copy data from websockets to Img elements.
for i in range(n_streams):
    app.clientside_callback("function(m){return m? m.data : '';}", Output(f"v{i}", "src"), Input(f"ws{i}", "message"))

if __name__ == '__main__':
    threading.Thread(target=app.run_server).start()
    server.run()
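
The string that stream() sends over each websocket is a data URI, which is why the clientside callback can assign m.data directly to the src property of an Img element. Below is a minimal sketch of that encoding step in isolation, using only the standard library. The helper names to_data_uri and from_data_uri are hypothetical and are not part of the code above.

```python
import base64


def to_data_uri(jpeg_bytes: bytes) -> str:
    """Wrap raw JPEG bytes in a data URI that an <img> src can display."""
    payload = base64.b64encode(jpeg_bytes).decode("ascii")
    return f"data:image/jpeg;base64,{payload}"


def from_data_uri(uri: str) -> bytes:
    """Recover the raw bytes from a base64 data URI (handy for debugging)."""
    header, _, payload = uri.partition(",")
    if not header.endswith(";base64"):
        raise ValueError("not a base64 data URI")
    # strip() tolerates a space after the comma, as in the f-string used by stream()
    return base64.b64decode(payload.strip())


if __name__ == "__main__":
    # Placeholder bytes only; a real frame would come from cv2.imencode(...).tobytes()
    fake_frame = b"\xff\xd8\xff\xe0 placeholder, not a real JPEG"
    uri = to_data_uri(fake_frame)
    print(uri[:40])
    assert from_data_uri(uri) == fake_frame
```

Note that base64 produces four output characters for every three input bytes, so each frame grows by roughly one third before it is sent.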

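While this solution performs significantly better (at least on my laptop), the resource usage is still high, so I added a delay parameter that lets you lower the resource usage at the expense of a reduced frame rate.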
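
When choosing a value for delay, it can help to think in terms of a target frame rate. The following is a minimal sketch under that assumption, with the camera and the websocket replaced by stand-ins so that it runs on its own. The names delay_for_fps, throttled_stream, get_frame and send are hypothetical and are not part of Quart or Dash.

```python
import asyncio
import time


def delay_for_fps(target_fps: float) -> float:
    """Seconds to sleep between frames for a given target frame rate."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    return 1.0 / target_fps


async def throttled_stream(get_frame, send, delay=None, max_frames=None):
    """Same loop shape as stream() above, plus an optional frame limit for testing."""
    sent = 0
    while max_frames is None or sent < max_frames:
        if delay is not None:
            await asyncio.sleep(delay)  # yields to the event loop and caps the rate
        await send(get_frame())
        sent += 1
    return sent


async def demo():
    received = []

    async def send(frame):  # stand-in for websocket.send
        received.append(frame)

    start = time.perf_counter()
    await throttled_stream(lambda: b"frame", send,
                           delay=delay_for_fps(50), max_frames=5)
    return len(received), time.perf_counter() - start


if __name__ == "__main__":
    count, elapsed = asyncio.run(demo())
    print(f"sent {count} frames in {elapsed:.2f}s")
```

Because the sleep is added on top of the time spent producing each frame, the real frame rate will be somewhat lower than the target. With the answer's code, the equivalent call would be something like await stream(camera, delay=delay_for_fps(15)), assuming the helper above is defined.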

[Image: example app]

