How do I create images from row values in a pandas DataFrame using PIL?

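Question

Here is my data

It is a csv file with 36 columns. I intend to convert each row into an image and store the results as a database that can be fed to a neural network.

I have already seen and tried the approach in "Converting 1d numpy array to picture" with PIL, but I don't know how to implement it on the entire data.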

import pandas as pd
import numpy as np
from PIL import Image

dataframe = pd.read_csv('https://www.dropbox.com/s/sw2p9155zgmkkl5/df22.csv?dl=1',index_col=0)
dataframe

I created a Google Colab to make it easier to try.

Data

HOME,WORK,SHOP,FREETIME,ACCOMPANY,FOOD,OTHER,AM,PM,MIDDAY,NIGHT,firsttrip_time,lasttrip_time,home_traveltime,work_traveltime,shop_traveltime,freetime_traveltime,accompany_traveltime,food_traveltime,home_traveldistance,work_traveldistance,shop_traveldistance,freetime_traveldistance,accompany_traveldistance,food_traveldistance,TRPMILES_mean,TRVL_MIN_mean,home_dweltime,work_dweltime,shop_dweltime,freetime_dweltime,accompany_dweltime,food_dweltime,AVG_VEH_CNT,TRPMILES_sum,TRVL_MIN_sum
2.0,0.0,0.0,1.0,0.0,0.0,2.0,1.0,2.0,2.0,0.0,9.0,20.0,32.5,0.0,0.0,2.0,0.0,0.0,0.72,0.0,0.0,0.01,0.0,0.0,0.58,25.4,115.0,0.0,0.0,118.0,0.0,0.0,1.0,84.22,127.0
2.0,0.0,0.0,3.0,2.0,0.0,1.0,1.0,5.0,2.0,0.0,9.0,20.0,32.5,0.0,0.0,10.0,2.5,0.0,0.72,0.0,0.0,0.26,0.01,0.0,0.37,16.88,115.0,0.0,0.0,51.67,12.5,0.0,1.0,85.22,135.0
2.0,2.0,0.0,0.0,0.0,0.0,1.0,2.0,1.0,1.0,1.0,9.0,20.0,11.5,8.5,0.0,0.0,0.0,0.0,0.19,0.12,0.0,0.0,0.0,0.0,0.14,9.4,46.0,243.0,0.0,0.0,0.0,0.0,1.0,21.0,47.0
1.0,0.0,2.0,0.0,0.0,1.0,0.0,0.0,2.0,2.0,0.0,13.0,16.0,20.0,0.0,17.5,0.0,0.0,10.0,0.17,0.0,0.07,0.0,0.0,0.03,0.09,16.25,0.0,0.0,50.0,0.0,0.0,20.0,1.0,10.0,65.0
1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,13.0,20.0,30.0,0.0,0.0,35.0,0.0,0.0,0.41,0.0,0.0,0.41,0.0,0.0,0.41,32.5,0.0,0.0,0.0,385.0,0.0,0.0,1.0,24.0,65.0
1.0,0.0,2.0,0.0,0.0,1.0,0.0,0.0,4.0,0.0,0.0,11.0,14.0,30.0,0.0,12.5,0.0,0.0,10.0,0.31,0.0,0.15,0.0,0.0,0.02,0.16,16.25,0.0,0.0,25.0,0.0,0.0,80.0,0.0,18.22,65.0
2.0,0.0,2.0,0.0,0.0,0.0,2.0,0.0,2.0,4.0,0.0,10.0,17.0,3.0,0.0,12.5,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,8.17,1.0,0.0,107.5,0.0,0.0,0.0,1.5,1.0,49.0
1.0,8.0,1.0,0.0,0.0,0.0,0.0,4.0,6.0,0.0,0.0,7.0,15.0,30.0,26.0,30.0,0.0,0.0,0.0,0.52,0.52,0.48,0.0,0.0,0.0,0.51,27.14,0.0,6.0,10.0,0.0,0.0,0.0,1.5,104.0,190.0
3.0,0.0,1.0,1.0,0.0,2.0,1.0,1.0,7.0,0.0,0.0,9.0,15.0,7.67,0.0,3.0,10.0,0.0,3.5,0.11,0.0,0.02,0.01,0.0,0.09,0.08,6.0,50.33,0.0,1.0,1.0,0.0,32.0,1.5,18.61,48.0
3.0,0.0,3.0,0.0,0.0,0.0,0.0,2.0,4.0,0.0,0.0,8.0,14.0,8.0,0.0,8.67,0.0,0.0,0.0,0.09,0.0,0.09,0.0,0.0,0.0,0.09,8.33,43.33,0.0,47.0,0.0,0.0,0.0,1.5,15.11,50.0

Tags: pandas, numpy, matplotlib, python-imaging-library

Solution


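  • Apply the approach from "How to convert a 1D array of an image into a PIL image in Python" to the DataFrame.
  • One image is created for each row.
    • Since an image is rectangular, we can make it a square of 6 x 6, because each row has 36 values.
    • This produces 214217 very small images, so they can be resized to the desired size with .resize
      • Resizing all the images will take a few minutes, depending on the target size.
  • Use .apply with axis=1 to apply the function to every row of the DataFrame.
    • .values extracts the values of row x into a numpy array of shape (36,), which can then be reshaped with .reshape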
import pandas as pd
import numpy as np
from PIL import Image 

# create the dataframe
df = pd.read_csv('https://www.dropbox.com/s/sw2p9155zgmkkl5/df22.csv?dl=1', index_col=0)

# create images
images = df.apply(lambda x: Image.fromarray(x.values.reshape(6, 6), 'L').resize((200, 200)), axis=1)

# show image 0
images[0]
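The Pillow documentation describes the mode argument of Image.fromarray as changing how the array buffer is read, not as a conversion of the data, so float64 values passed with 'L' may not map to the gray levels one expects. The following is a minimal sketch of one possible alternative, not the answer's own method: it scales each row to 0..255 and casts to uint8 before building the image. The helper name row_to_image, the per-row min-max scaling, and the random stand-in data are all assumptions made for illustration, and a plain list comprehension is used in place of .apply.

```python
import numpy as np
import pandas as pd
from PIL import Image


def row_to_image(row, side=6, size=(200, 200)):
    """Hypothetical helper: one 36-value row -> one grayscale PIL image."""
    arr = np.asarray(row, dtype=np.float64).reshape(side, side)
    low = arr.min()
    span = arr.max() - low
    if span > 0:
        # Assumption: per-row min-max scaling to the 0..255 range.
        scaled = (arr - low) / span * 255
    else:
        # A constant row becomes an all-black image.
        scaled = np.zeros_like(arr)
    img = Image.fromarray(scaled.astype(np.uint8))  # uint8 2-D array -> mode 'L'
    # NEAREST keeps the 36 cells as crisp blocks when enlarging.
    return img.resize(size, resample=Image.NEAREST)


# Stand-in data (not the real csv): 3 rows x 36 columns.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.uniform(0, 400, size=(3, 36)))

images = [row_to_image(row) for row in demo.to_numpy()]
```

Per-row scaling makes every image use the full gray range, but it also discards differences in magnitude between rows; scaling with column-wise or global minima and maxima would be an equally reasonable choice.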
  • The image below represents the first row of df:
df.iloc[0, :].values.reshape(6, 6)

array([[2.000e+00, 0.000e+00, 0.000e+00, 1.000e+00, 0.000e+00, 0.000e+00],
       [2.000e+00, 1.000e+00, 2.000e+00, 2.000e+00, 0.000e+00, 9.000e+00],
       [2.000e+01, 3.250e+01, 0.000e+00, 0.000e+00, 2.000e+00, 0.000e+00],
       [0.000e+00, 7.200e-01, 0.000e+00, 0.000e+00, 1.000e-02, 0.000e+00],
       [0.000e+00, 5.800e-01, 2.540e+01, 1.150e+02, 0.000e+00, 0.000e+00],
       [1.180e+02, 0.000e+00, 0.000e+00, 1.000e+00, 8.422e+01, 1.270e+02]])
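To make the layout of the array above explicit: reshape fills the grid row by row (NumPy's default C order), so cell (i, j) holds the value of column 6 * i + j. The short check below uses the first data row from the question; the variable names are only illustrative.

```python
import numpy as np

# First data row from the question (36 values).
row = np.array([
    2.0, 0.0, 0.0, 1.0, 0.0, 0.0, 2.0, 1.0, 2.0, 2.0, 0.0, 9.0,
    20.0, 32.5, 0.0, 0.0, 2.0, 0.0, 0.0, 0.72, 0.0, 0.0, 0.01, 0.0,
    0.0, 0.58, 25.4, 115.0, 0.0, 0.0, 118.0, 0.0, 0.0, 1.0, 84.22, 127.0,
])

grid = row.reshape(6, 6)

# Cell (i, j) of the grid is element 6 * i + j of the flat row.
i, j = 1, 5
same_value = grid[i, j] == row[6 * i + j]  # both are 9.0 (firsttrip_time)
```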

[Image: the first row of df rendered as a grayscale image]

  • The white border only comes from cutting and pasting; it is not part of the image.
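Since the stated goal is to feed the result to a neural network, it may be enough to keep the data as one stacked array instead of individual PIL images. The sketch below is an assumption about what such a pipeline could look like, not part of the answer: it scales each column to 0..1 and reshapes the whole table to (n_rows, 6, 6). The random stand-in data and the per-column scaling are illustrative choices.

```python
import numpy as np
import pandas as pd

# Stand-in data (not the real csv): 5 rows x 36 columns.
rng = np.random.default_rng(1)
demo = pd.DataFrame(rng.uniform(0, 400, size=(5, 36)))

values = demo.to_numpy(dtype=np.float32)

# Assumption: per-column min-max scaling so every feature lies in 0..1.
low = values.min(axis=0)
span = values.max(axis=0) - low
span[span == 0] = 1.0  # avoid dividing by zero for constant columns

# One 6 x 6 "image" per row, stacked along the first axis.
batch = ((values - low) / span).reshape(-1, 6, 6)
print(batch.shape)  # (5, 6, 6)
```

An array like this can be saved with np.save and loaded later, which avoids resizing and storing hundreds of thousands of image files when the enlarged pictures are only needed for inspection.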
