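python-3.x - Exporting a nested dictionary with multiple values and variable keys to Excel
Problem description
This is a second attempt, following on from an earlier question. What I need is to export the following dictionary to Excel.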
{1: {'Field Cluster': ['This', 'This', 'This'],
'Exploration Block': ['Is', 'Is', 'Is'],
'Producing since': [1923.0, 1923.0, 1923.0],
'Fluids': ['A ', 'A ', 'A '],
'Reservoirs': ['Test', 'Test', 'Test'],
'Area (km2)': ['File', 'File', 'File'],
'Depth (m)': ['A\nHuge\nDepth', 'A\nHuge\nDepth', 'A\nHuge\nDepth'],
'Concession License No.': ['UNIX license', 'UNIX license', 'UNIX license'],
'License Expiry Date / Extension': ['Everlasting', 'Everlasting', 'Everlasting'],
'Working Interest with SB': ['There is one\n', 'There is one\n', 'There is one\n'],
'Government approval:': ['It is!', 'It is!', 'It is!'],
'Last study:': ['Million years ago', 'Million years ago', 'Million years ago'],
'Parameters': ['Horizon1', 'Horizon2', 'Horizon3'],
'Reservoir rock': ['First', 'Second', 'Third'],
'Net pay thickness (m)': [1.0, 21.0, 41.0],
'Avr. porosity (%)': [2.0, 22.0, 42.0],
'Average absolute permeability (mD)': [3.0, 23.0, 43.0],
'Swi (%)': [4.0, 24.0, 44.0],
'Initial pressure (at)': [5.0, 25.0, 45.0],
'Bubble Pressure (at.)': [6.0, 26.0, 46.0],
'Dew Point Pressure (at)': [7.0, 27.0, 47.0],
'Initial Solution Ratio (Stm3/m3)': [8.0, 28.0, 48.0],
'Initial Condensate Gas Ratio (g/Stm3)': [9.0, 29.0, 49.0],
'Oil density (kg/cm)': [10.0, 30.0, 50.0],
'Oil viscosity (Pb) (cP)': [11.0, 31.0, 51.0],
'Contaminants (H2S, CO2)': [12.0, 32.0, 52.0],
'Initial Oil in Place (e3 to)': [13.0, 33.0, 53.0],
'Initial NGL in Place (e3 to)': [14.0, 34.0, 54.0]},
2: {'Field Cluster': ['This fff', 'This fff', 'This fff', 'This fff'],
'Exploration Block': ['fff', 'fff', 'fff', 'fff'],
'Producing since': ['1923fff', '1923fff', '1923fff', '1923fff'],
'Fluids': ['A fff', 'A fff', 'A fff', 'A fff'],
'Reservoirs': ['Test', 'Test', 'Test', 'Test'],
'Area (km2)': ['File', 'File', 'File', 'File'],
'Depth (m)': ['A\nHuge\nDepthfff', 'A\nHuge\nDepthfff', 'A\nHuge\nDepthfff', 'A\nHuge\nDepthfff'],
'Concession License No.': ['UNIX license', 'UNIX license', 'UNIX license', 'UNIX license'],
'License Expiry Date / Extension': ['Everlastingfff', 'Everlastingfff', 'Everlastingfff', 'Everlastingfff'],
'Working Interest': ['There is one\n', 'There is one\n', 'There is one\n', 'There is one\n'],
'Gouvernment approval:': ['ffff', 'ffff', 'ffff', 'ffff'],
'Last study:': ['Million years fffff', 'Million years fffff', 'Million years fffff', 'Million years fffff'],
'Parameters': ['Horizon1', 'Horizon2', 'Horizon3', 'Horizon4'],
'Reservoir rock': ['First', 'Second', 'Third', 'Fourth'],
'Net pay thickness (m)': [1.0, 21.0, 41.0, 61.0],
'Avr. porosity (%)': [2.0, 22.0, 42.0, 62.0],
'Average absolute permeability (mD)': [3.0, 23.0, 43.0, 63.0],
'Swi (%)': [4.0, 24.0, 44.0, 64.0],
'Initial Oil in Place (e3 to)': [13.0, 33.0, 53.0, 73.0],
'Initial NGL in Place (e3 to)': [14.0, 34.0, 54.0, 74.0],
'Initial Gas (assoc.) in Place (e6 m3) sol.gas/gas cap': [15.0, 35.0, 55.0, 75.0],
'Initial Gas (non assoc.) in Place (e6 m3)': [16.0, 36.0, 56.0, 76.0],
'Primary recovery / drive mechanism\nNone': ['Wow\nA', 'Recovery\nNone', 'Mechanism\nNone', 'Nice\nNone', ''],
'Secondary recovery': ['Another one', '', '', '', ''],
'Total Wells': ['1000', '-', '-', '-', ''],
'Productive wells (oil/gas)': ['500', '-', '-', '-', ''],
'Injection wells (water/gas)': ['500', '-', '-', '-', ''],
'Rate of best producer in the field (tons / e3 Sm3/day)': ['30', '-', '-', '-', ''],
'WOW Production (Something)': ['1', 2.0, '3', '4', '']}}
The previous post received two answers. The first:
df=pd.DataFrame(d) # assuming d is the name of the dict
cols=df.columns
final=pd.concat([pd.DataFrame(df[i].dropna().tolist()) for i in cols],axis=1,keys=cols)
final.index=df.index
print(final)
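This one only works for the first nested dictionary. The key problem is that some keys are missing from the second sub-dictionary, and the values are ordered according to the key order of the first dictionary. As a result, values end up mismatched with their parameters.
The other answer is very similar. It works on a test dictionary, but not on the dictionary above: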
df=pd.DataFrame(d) # assuming d is the name of the dict
cols=df.columns
final=pd.concat([pd.DataFrame(v).T for k,v in d.items()],axis=1,sort=False,keys=d.keys())
final.index=df.index
print(final)
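With the actual dictionary, this code returns only two rows, with the parameters packed into tuples. Moreover, it only takes the second sub-dictionary into account.
In short, here is what I want. Suppose we have this small dictionary, which closely resembles the actual one: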
{1:
{'Parameter 1': ['Value 1', 'Value 2', 'Value 3'],
'Parameter 2': ['Value 11', 'Value 22', 'Value 33'],
'Parameter 3': ['Num1', 'Num2', 'Num3']},
2:
{'Parameter 1': ['Data 1', 'Data 2', 'Data 3'],
'Parameter 2': ['Data 11', 'Data 22', 'Data 33'],
'Parameter 4': ['Numb11', 'Numb22', 'Numb33']}
}
I would like to get a table like this from it:
            |               1                |               2                |
-------------------------------------------------------------------------------
Parameter 1 | Value 1  | Value 2  | Value 3  | Data 1   | Data 2   | Data 3   |
-------------------------------------------------------------------------------
Parameter 2 | Value 11 | Value 22 | Value 33 | Data 11  | Data 22  | Data 33  |
-------------------------------------------------------------------------------
Parameter 3 | Num1     | Num2     | Num3     |          |          |          |
-------------------------------------------------------------------------------
Parameter 4 |          |          |          | Numb11   | Numb22   | Numb33   |
-------------------------------------------------------------------------------
So each value lines up with its parameter, and all parameters sit in the first column without repetition.
Solution
The following does the same as the code you provided, in two fewer lines:
df_to_concat = {k: pd.DataFrame(v).transpose() for (k, v) in d.items()}
df = pd.concat(df_to_concat.values(), keys=df_to_concat.keys(), axis='columns')
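As a quick check, here is a minimal sketch that runs the two lines above on the small dictionary from the question. The dictionary literal is copied from the question; wrapping the values and keys in list() is only a precaution for older pandas versions and is not part of the original answer.

```python
import pandas as pd

# Small test dictionary copied from the question
d = {
    1: {'Parameter 1': ['Value 1', 'Value 2', 'Value 3'],
        'Parameter 2': ['Value 11', 'Value 22', 'Value 33'],
        'Parameter 3': ['Num1', 'Num2', 'Num3']},
    2: {'Parameter 1': ['Data 1', 'Data 2', 'Data 3'],
        'Parameter 2': ['Data 11', 'Data 22', 'Data 33'],
        'Parameter 4': ['Numb11', 'Numb22', 'Numb33']},
}

# One frame per top-level key: parameters become rows, list positions become columns
df_to_concat = {k: pd.DataFrame(v).transpose() for k, v in d.items()}

# Place the frames side by side; rows are aligned on the parameter name,
# so a parameter missing from one sub-dictionary yields NaN cells there
df = pd.concat(list(df_to_concat.values()),
               keys=list(df_to_concat.keys()),
               axis='columns')
print(df)
```

The columns form a two-level index (top-level key, position in the list), which matches the two header rows of the table requested in the question.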
However, the lists in your big dictionary are not all the same length, which raises the following error:
ValueError: arrays must all be same length
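The last few keys each have a trailing empty value. Once I removed those manually, the code worked. To do this programmatically, you can run something like the following before creating the DataFrames; it trims the extra trailing values from the lists that have too many items:
# Length of the shortest list within each sub-dictionary
min_length = {k: min(len(one_list) for one_list in v.values()) for k, v in d.items()}
# Truncate every list to that length so all lists in a sub-dictionary match
new_d = {}
for k, v in d.items():
    new_v = {}
    for k2, one_list in v.items():
        new_v[k2] = one_list[:min_length[k]]
    new_d[k] = new_v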
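If trimming is not desirable, a possible alternative (not the approach above, just a sketch) is to let pandas pad the shorter lists: pd.DataFrame.from_dict with orient='index' accepts lists of unequal length and fills the gaps with missing values. The helper names build_table and export_table, the sample dictionary ragged, and the output file name are hypothetical. Writing .xlsx files with to_excel assumes an Excel engine such as openpyxl is installed.

```python
import pandas as pd


def build_table(d):
    """Return one table with parameters as rows and (key, position) as columns."""
    # orient='index' turns each parameter into a row and pads short lists,
    # so lists of unequal length do not raise a ValueError
    frames = {k: pd.DataFrame.from_dict(v, orient='index') for k, v in d.items()}
    # Passing a dict to concat uses its keys as the top column level
    return pd.concat(frames, axis='columns')


def export_table(d, path):
    """Write the combined table to an Excel file (needs e.g. openpyxl)."""
    build_table(d).to_excel(path)


# Hypothetical sample with ragged lists and differing keys
ragged = {
    1: {'Parameter 1': ['Value 1', 'Value 2', 'Value 3'],
        'Parameter 3': ['Num1', 'Num2', 'Num3', '']},
    2: {'Parameter 1': ['Data 1', 'Data 2'],
        'Parameter 4': ['Numb11', 'Numb22']},
}
table = build_table(ragged)
print(table)
```

Calling export_table(d, 'fields.xlsx') on the real dictionary would then write the table to disk. The index must be kept (the default for to_excel), because pandas does not support writing multi-level columns without it.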