python-3.x - dataframe columns to multtindex dataframe
问题描述
I have the following data in an Excel sheet and I want to read it as a multiindex dataframe:
Y1 Y1 Y2 Y2
B H1 H2 H1 H2
1 80 72 79.2 84.744
2 240 216 237.6 254.232
3 40 36 39.6 42.372
4 160 144 158.4 169.488
5 240 216 237.6 254.232
6 0 0 0 0
I am reading it as:
DATA = pd.read_excel('data.xlsx',sheet_name=None)
since I am reading other sheets too.
Question 1:
This data is not read as multi-index data. How do I make it read it as multi-index? Or maybe I should read it as a dataframe and then convert it to multi-index? Current result of reading as dataframe
DATA['Load']
Y1 Y1.1 Y2 Y2.1
bus H1 H2 H1 H2
1 80 72 79.2 84.744
2 240 216 237.6 254.232
3 40 36 39.6 42.372
4 160 144 158.4 169.488
5 240 216 237.6 254.232
6 0 0 0 0
Question 2 and probably the more fundamental question:
How do I deal with multi-indexing when one or more of the indexes are on the columns side? In this example, I want to access the data by specifying B, Y, H
. I know how to work with multi-index when they are all as index, but can't get the hang of it when the indexes are on the columns.
Thank you very much for your help :)
PS:
Another sheet may look like the following:
from to x ratea
1 2 0.4 10
1 4 0.6 80
1 5 0.2 10
2 3 0.2 10
2 4 0.4 10
2 6 0.3 10
3 5 0.2 10
4 6 0.3 10
where I will set from
and to
as set (set_index(['from','to']
) to get a multi-index dataframe.
解决方案
将这样的数据帧读取到多索引用户中的标头参数pd.read_excel()
df = pd.read_excel('myFile.xlsx', header=[0,1])
Y1 Y2
B H1 H2 H1 H2
1 80 72 79.2 84.744
2 240 216 237.6 254.232
3 40 36 39.6 42.372
4 160 144 158.4 169.488
5 240 216 237.6 254.232
6 0 0 0.0 0.000
这意味着你告诉 pandas 你有两个标题行 0 和 1
在我们交谈之后:
df = pd.read_excel('Book2.xlsx', header=[0,1])
df2 = df.unstack().to_frame()
idx = df2.swaplevel(0,2).swaplevel(1,2).index.set_names(['B', 'Y', 'H'])
df2.set_index(idx, inplace=True)
0
B Y H
1 Y1 H1 80.000
2 Y1 H1 240.000
3 Y1 H1 40.000
4 Y1 H1 160.000
5 Y1 H1 240.000
6 Y1 H1 0.000
1 Y1 H2 72.000
2 Y1 H2 216.000
3 Y1 H2 36.000
4 Y1 H2 144.000
5 Y1 H2 216.000
6 Y1 H2 0.000
1 Y2 H1 79.200
2 Y2 H1 237.600
3 Y2 H1 39.600
4 Y2 H1 158.400
5 Y2 H1 237.600
6 Y2 H1 0.000
1 Y2 H2 84.744
2 Y2 H2 254.232
3 Y2 H2 42.372
4 Y2 H2 169.488
5 Y2 H2 254.232
6 Y2 H2 0.000
推荐阅读
- java - BigQuery Java API 将行转换为 Java 对象
- azure-devops - 在 Windows 机器上设置自托管代理:写错误:发生错误:未配置
- orm - CreateQueryBuilder 在数组中查找字符串?
- c# - C# 在解决方案上禁用闭包
- php - 我不能在 for 循环中形成一个数组来回显当前计数吗?(PHP)
- flutter - 为什么我应该使用带有日历而不是静态地图的 Bloc?
- vba - 我需要 Word 中的一个宏,将连续信件的每一页单独保存为 PDF
- python - 创建python objectify Element然后添加属性
- python - Python sklearn LogisticRegression ValueError: invalid __array_struct__
- powershell - PowerShell - 找不到长路径但文件存在