How to reshape data in Python

Problem description

I have a DataFrame with only one row but many columns.

I want to put every 5 columns into a new row.

The original data is in a list, which I converted to a DataFrame. I don't know whether reshaping from the list would be easier, but here is a sample list to try; the real list is much longer.

['review: I stayed around 11 days and enjoyed stay very much.', 'compound: 0.5106, ','neg: 0.0, ','neu: 0.708, ','pos: 0.292, ','review: Plans for weekend stay canceled due to Coronavirus shutdown.','compound: 0.0, ','neg: 0.0, ','neu: 1.0, ','pos: 0.0, ']

Tags: python, pandas, dataframe, stack, reshape

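Solution

It is easier to parse the list directly and then convert the result to a DataFrame.

  • For each entry, split it on ':' and add the key/value pair to a dictionary
  • Convert the dictionary to a DataFrame

Try this: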

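import pandas as pd

lst = ['review: I stayed around 11 days and enjoyed stay very much.', 'compound: 0.5106, ','neg: 0.0, ','neu: 0.708, ','pos: 0.292, ',
       'review: Plans for weekend stay canceled due to Coronavirus shutdown.','compound: 0.0, ','neg: 0.0, ','neu: 1.0, ','pos: 0.0, ']

# Collect the values for each key ('review', 'compound', ...) in a list.
dd = {}

for x in lst:
    # Split on the first ':' only, so colons inside the review text are kept.
    key, value = x.split(':', 1)
    # Drop surrounding whitespace and the trailing comma; values stay strings.
    dd.setdefault(key.strip(), []).append(value.strip().rstrip(','))

print(dd)
# Each key becomes a column and each list position becomes a row.
print(pd.DataFrame(dd).to_string(index=False))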

Output (printed by the last line):

                                                       review compound  neg    neu    pos
          I stayed around 11 days and enjoyed stay very much.   0.5106  0.0  0.708  0.292
 Plans for weekend stay canceled due to Coronavirus shutdown.      0.0  0.0    1.0    0.0
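
If every record is known to contain exactly five entries, the list can also be reshaped by position, which is closer to the "every 5 columns into a new row" wording of the question. The following is a minimal sketch under that assumption, not part of the original answer: the helper name `reshape_entries` and its `width` parameter are illustrative. It also converts the four score columns to floats with `pd.to_numeric`, since the dictionary approach above leaves every value as a string.

```python
import pandas as pd

lst = ['review: I stayed around 11 days and enjoyed stay very much.', 'compound: 0.5106, ', 'neg: 0.0, ', 'neu: 0.708, ', 'pos: 0.292, ',
       'review: Plans for weekend stay canceled due to Coronavirus shutdown.', 'compound: 0.0, ', 'neg: 0.0, ', 'neu: 1.0, ', 'pos: 0.0, ']


def reshape_entries(entries, width=5):
    """Hypothetical helper: build one DataFrame row per `width` consecutive entries."""
    # Assumption: the list length is an exact multiple of `width`.
    if len(entries) % width != 0:
        raise ValueError("number of entries is not a multiple of width")
    rows = []
    for start in range(0, len(entries), width):
        row = {}
        for entry in entries[start:start + width]:
            # Split on the first ':' only, so colons in the review text survive.
            key, value = entry.split(':', 1)
            # Remove surrounding whitespace and the trailing comma.
            row[key.strip()] = value.strip().rstrip(',')
        rows.append(row)
    return pd.DataFrame(rows)


df = reshape_entries(lst)

# Convert the score columns from strings to floats.
score_cols = ['compound', 'neg', 'neu', 'pos']
df[score_cols] = df[score_cols].apply(pd.to_numeric)

print(df.dtypes)
print(df.to_string(index=False))
```

Building each row separately means a malformed record fails at the length check instead of silently shifting values between columns, which can happen when per-key lists end up with different lengths.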
