首页 > 解决方案 > Python/Pandas:独特的表格转置和变换

问题描述

我有一个如下图所示的 Excel 表格,其中数字“1”标记了给定商店所在的存储桶。

+-------+-------+-----------+----------+----------+------------+----------+
|       |       |           |     H    |     H    |      N     |     G    |
+-------+-------+-----------+----------+----------+------------+----------+
|       |       |           |    111   |    114   |     120    |    130   |
+-------+-------+-----------+----------+----------+------------+----------+
| Group | Code  | Name      | Bucket 1 | Bucket A | Bucket Min | Bucket B |
+-------+-------+-----------+----------+----------+------------+----------+
| H     | 11110 | Store 1   |     1    |          |            |     1    |
+-------+-------+-----------+----------+----------+------------+----------+
| N     | 11450 | Store AAA |          |          |      1     |          |
+-------+-------+-----------+----------+----------+------------+----------+
| H     | 11152 | Store BBB |          |          |      1     |     1    |
+-------+-------+-----------+----------+----------+------------+----------+
| H     | 11734 | Store XXX |          |     1    |            |          |
+-------+-------+-----------+----------+----------+------------+----------+

我想把这张桌子改成我下面的样子。

+----------+--------+------------+---------------------------------------------------+
| Category | Number | Transfer   | Received Information                              |
+----------+--------+------------+---------------------------------------------------+
| H        | 111    | Bucket 1   | Received for Store 1 (11110)                      |
+----------+--------+------------+---------------------------------------------------+
| H        | 114    | Bucket A   | Received for Store XXX (11734)                    |
+----------+--------+------------+---------------------------------------------------+
| N        | 120    | Bucket Min | Received for Store AAA (11450), Store BBB (11152) |
+----------+--------+------------+---------------------------------------------------+
| G        | 130    | Bucket B   | Received from Store 1 (11111), Store BBB (11152)  |
+----------+--------+------------+---------------------------------------------------+

这种独特的转置和变换如何在 Python/Pandas 中完成?

请注意,文本“Received for”将始终出现在“Received information”列中每个单元格的开头。

标签: pythonpandas

解决方案


我会试一试。首先,您应该使要从中读取数据的 excel 文件对 pandas 更友好,方法是为列名添加额外的行并正确命名行:

之后,此代码应为您提供所需的结果:

import pandas as pd
f = r"path/to/file.xlsx"


# %%
df = (
    pd.read_excel(f, index_col=[0, 1, 2], header=[0, 1, 2])
    # move away from multi-index
    .reset_index()
    # create a new column combining Code and Name
    .assign(NameCode=lambda x: x["Name"] + " (" + x["Code"].astype(str) + ")")
    # drop the columns we combined into the new one above
    .drop(columns=["Name", "Code"])
    # 'transpose' table to create one entry for each of the NameCode
    # entries
    .melt(id_vars=["Group", "NameCode"], value_name="deliveries")
    # drop everything with NaN
    .loc[lambda x: ~x["deliveries"].isnull()]
)
print(df)

# %%
pvt = df.groupby(by=["Category", "Number", "Transfer"]).agg(
    Received_Information=pd.NamedAgg(
        column="NameCode", aggfunc=lambda x: "Received from " + ", ".join(x.str.strip())
    )
)
print(pvt)
   Group           NameCode Category Number    Transfer  deliveries
0      H    Store 1 (11110)        H    111    Bucket 1         1.0
7      N  Store XXX (11734)        H    114    Bucket A         1.0
9      N  Store AAA (11450)        N    120  Bucket Min         1.0
10     H  Store BBB (11152)        N    120  Bucket Min         1.0
12     H    Store 1 (11110)        G    130    Bucket B         1.0
14     H  Store BBB (11152)        G    130    Bucket B         1.0
                                                          Received_Information
Category Number Transfer                                                      
G        130    Bucket B      Received from Store 1 (11110), Store BBB (11152)
H        111    Bucket 1                         Received from Store 1 (11110)
         114    Bucket A                       Received from Store XXX (11734)
N        120    Bucket Min  Received from Store AAA (11450), Store BBB (11152)

推荐阅读