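python-3.x - Saving mp4 file paths to a CSV for training data
Problem description
I am very new to computer vision and am trying to train my first model. As a first step, I am using a label encoder to label my videos with the event they show. There are two events: accident and non-accident.
Folder structure for the images: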
Colab_Notebooks
- accident(all the .jpg frames are here)
- nonaccident(all the .jpg frames are here)
My data.csv file therefore looks like this; the code that generates it is given further below.
data.csv
image_path,target
/content/drive/MyDrive/Colab_Notebooks/accident/accident_0000638.jpg,0.0
/content/drive/MyDrive/Colab_Notebooks/nonaccident/nonaccident_0002143.jpg,1.0
/content/drive/MyDrive/Colab_Notebooks/accident/accident_0000372.jpg,0.0
/content/drive/MyDrive/Colab_Notebooks/accident/accident_0000419.jpg,0.0
/content/drive/MyDrive/Colab_Notebooks/nonaccident/nonaccident_0001675.jpg,1.0
/content/drive/MyDrive/Colab_Notebooks/accident/accident_0000307.jpg,0.0
/content/drive/MyDrive/Colab_Notebooks/accident/accident_00001099.jpg,0.0
/content/drive/MyDrive/Colab_Notebooks/accident/accident_0000940.jpg,0.0
/content/drive/MyDrive/Colab_Notebooks/accident/accident_0000892.jpg,0.0
/content/drive/MyDrive/Colab_Notebooks/accident/accident_0000805.jpg,0.0
/content/drive/MyDrive/Colab_Notebooks/accident/accident_0000232.jpg,0.0
/content/drive/MyDrive/Colab_Notebooks/accident/accident_0000255.jpg,0.0
/content/drive/MyDrive/Colab_Notebooks/accident/accident_0000840.jpg,0.0
/content/drive/MyDrive/Colab_Notebooks/accident/accident_0000974.jpg,0.0
The code I used to generate data.csv is as follows:
import os
import joblib
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelBinarizer
from tqdm import tqdm

# list every entry under the dataset root and keep only the directories
all_paths = os.listdir('/content/drive/MyDrive/Colab_Notebooks/')
folder_paths = [x for x in all_paths if os.path.isdir('/content/drive/MyDrive/Colab_Notebooks/' + x)]
print(f"Folder paths : {folder_paths}")
print(f"Number of folders: {len(folder_paths)}")
create_labels = ['accident', 'nonaccident']
data = pd.DataFrame()
image_formats = ['jpg']
labels = []
counter = 0
for i, folder_path in tqdm(enumerate(folder_paths), total=len(folder_paths)):
    # skip folders that are not one of the two event classes
    if folder_path not in create_labels:
        continue
    image_paths = os.listdir('/content/drive/MyDrive/Colab_Notebooks/' + folder_path)
    label = folder_path
    for image_path in image_paths:
        # keep only files whose extension is in image_formats
        if image_path.split('.')[-1] in image_formats:
            data.loc[counter, 'image_path'] = f"/content/drive/MyDrive/Colab_Notebooks/{folder_path}/{image_path}"
            labels.append(label)
            counter += 1
labels = np.array(labels)
# binarize the labels (two classes give a single 0/1 column)
lb = LabelBinarizer()
labels = lb.fit_transform(labels)
#print(labels)
# attach the label as the target column; this step is missing from the posted
# snippet and is inferred from the target values shown in data.csv
data['target'] = labels.ravel().astype(float)
# save as CSV file
data.to_csv('/content/drive/MyDrive/Colab_Notebooks/data.csv', index=False)
# pickle the binarized labels
print('Saving the binarized labels as pickled file')
joblib.dump(lb, '/content/drive/MyDrive/Colab_Notebooks/lb.pkl')
print(data.head(5))
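A note on the label values: in data.csv, accident maps to 0.0 and nonaccident to 1.0. That is consistent with how LabelBinarizer treats a two-class problem, where the class names are sorted and the output is a single 0/1 column rather than two one-hot columns. The following is a plain-Python sketch of that mapping, not scikit-learn's implementation; binarize_two_class is a hypothetical helper introduced only for illustration.

```python
def binarize_two_class(labels):
    """Map two string labels to 0.0/1.0, with the alphabetically later class as 1.0.

    Hypothetical helper: it only imitates the two-class behaviour described above.
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError(f"expected exactly two classes, got {classes}")
    positive = classes[1]
    targets = [1.0 if label == positive else 0.0 for label in labels]
    return classes, targets


classes, targets = binarize_two_class(["nonaccident", "accident", "accident"])
print(classes)
print(targets)
```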
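This works because the dataset shown at the top consists of .jpg frames. I would like to do the same for video clips, with this folder structure: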
Colab_Notebooks
- accident(all the .mp4 clips are here)
- nonaccident(all the .mp4 clips are here)
Expected output:
/content/drive/MyDrive/Colab_Notebooks/accident/accident_0000638.mp4,0.0
/content/drive/MyDrive/Colab_Notebooks/nonaccident/nonaccident_0002143.mp4,1.0
/content/drive/MyDrive/Colab_Notebooks/accident/accident_0000372.mp4,0.0
/content/drive/MyDrive/Colab_Notebooks/accident/accident_0000419.mp4,0.0
/content/drive/MyDrive/Colab_Notebooks/nonaccident/nonaccident_0001675.mp4,1.0
/content/drive/MyDrive/Colab_Notebooks/accident/accident_0000307.mp4,0.0
Can anyone tell me how to modify the code so that it reads video clips instead of images?
Solution
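I was able to solve this by using mp4 instead of jpg as the file format, that is, by listing 'mp4' rather than 'jpg' in image_formats so that the extension check picks up the video clips.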
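As a rough sketch of what that change amounts to, the following standard-library version walks the two class folders and writes one row per .mp4 clip. It is not the code from the question: it swaps pandas and LabelBinarizer for csv and pathlib, the helper names collect_video_rows and write_csv are hypothetical, and it assumes the target is simply the position of the folder name in CLASS_NAMES (accident = 0.0, nonaccident = 1.0, matching the expected output).

```python
import csv
from pathlib import Path

# Assumption: only the extension list differs from the image version.
VIDEO_FORMATS = {"mp4"}
# Assumption: the index of each folder name in this list is its target value.
CLASS_NAMES = ["accident", "nonaccident"]


def collect_video_rows(root, class_names=CLASS_NAMES, formats=VIDEO_FORMATS):
    """Return (path, target) tuples for every matching file in each class folder."""
    rows = []
    for target, class_name in enumerate(class_names):
        folder = Path(root) / class_name
        if not folder.is_dir():
            continue  # a missing class folder is skipped, not treated as an error
        for path in sorted(folder.iterdir()):
            extension = path.suffix.lower().lstrip(".")
            if path.is_file() and extension in formats:
                rows.append((path.as_posix(), float(target)))
    return rows


def write_csv(rows, csv_path):
    """Write the rows under the same column names as the original data.csv."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["image_path", "target"])
        writer.writerows(rows)
```

Calling collect_video_rows('/content/drive/MyDrive/Colab_Notebooks') and passing the result to write_csv would list the clips in sorted order; shuffling the rows before training is left to the caller.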