Python pivot table based on multiple conditions

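Question

I previously asked this question at this link: SUMIFS in Python Jupyter.

However, I just realized that the solution does not work, because clients can switch out and switch in on different dates. So basically they have to switch out first before they can switch in.

Here is the dataframe (sorted by date):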

+---------------+--------+---------+-----------+--------+
| Switch In/Out | Client | Quality |   Date    | Amount |
+---------------+--------+---------+-----------+--------+
| Out           |      1 | B       | 15-Aug-19 |    360 |
| In            |      1 | A       | 16-Aug-19 |    180 |
| In            |      1 | B       | 17-Aug-19 |    180 |
| Out           |      1 | A       | 18-Aug-19 |    140 |
| In            |      1 | B       | 18-Aug-19 |     80 |
| In            |      1 | A       | 19-Aug-19 |     60 |
| Out           |      2 | B       | 14-Aug-19 |     45 |
| Out           |      2 | C       | 15-Aug-20 |     85 |
| In            |      2 | C       | 15-Aug-20 |    130 |
| Out           |      2 | A       | 20-Aug-19 |    100 |
| In            |      2 | A       | 22-Aug-19 |     30 |
| In            |      2 | B       | 23-Aug-19 |     30 |
| In            |      2 | C       | 23-Aug-19 |     40 |
+---------------+--------+---------+-----------+--------+

I would then create a new column that groups the rows into separate transactions.


+---------------+--------+---------+-----------+--------+------+
| Switch In/Out | Client | Quality |   Date    | Amount | Rows |
+---------------+--------+---------+-----------+--------+------+
| Out           |      1 | B       | 15-Aug-19 |    360 |    1 |
| In            |      1 | A       | 16-Aug-19 |    180 |    1 |
| In            |      1 | B       | 17-Aug-19 |    180 |    1 |
| Out           |      1 | A       | 18-Aug-19 |    140 |    2 |
| In            |      1 | B       | 18-Aug-19 |     80 |    2 |
| In            |      1 | A       | 19-Aug-19 |     60 |    2 |
| Out           |      2 | B       | 14-Aug-19 |     45 |    3 |
| Out           |      2 | C       | 15-Aug-20 |     85 |    3 |
| In            |      2 | C       | 15-Aug-20 |    130 |    3 |
| Out           |      2 | A       | 20-Aug-19 |    100 |    4 |
| In            |      2 | A       | 22-Aug-19 |     30 |    4 |
| In            |      2 | B       | 23-Aug-19 |     30 |    4 |
| In            |      2 | C       | 23-Aug-19 |     40 |    4 |
+---------------+--------+---------+-----------+--------+------+

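With this column, I can apply a pivot formula and take it from there. But how do I do this in Python? In Excel, I could just use multiple SUMIFS and compare the results, but I don't see how to do that in Python.

Thanks!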

Tags: python, pandas, pivot

Solution


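A simple solution is to iterate over the rows, apply a check (a function) to each element, and use the results as the new column; in other words, use map.

With df.index.map, the index label of each row is passed to the function as its argument, so we can look up the value at that row and the previous one and compare them. In your case, the goal is to detect each change from "In" to "Out" and keep a running counter.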

import pandas as pd

switchInOut = ["Out", "In", "In", "Out", "In", "In",
               "Out", "Out", "In", "Out", "In", "In", "In"]
df = pd.DataFrame(switchInOut, columns=['Switch In/Out'])
counter = 1  # current transaction number; reset to 1 before re-running

def changeToOut(i):
    # Start a new transaction when an "Out" row directly follows an "In" row.
    global counter
    if df["Switch In/Out"].get(i) == "Out" and df["Switch In/Out"].get(i-1) == "In":
        counter += 1
    return counter

rows = df.index.map(changeToOut)
df["Rows"] = rows
df

Result:

+----+-----------------+--------+
|    | Switch In/Out   |   Rows |
|----+-----------------+--------|
|  0 | Out             |      1 |
|  1 | In              |      1 |
|  2 | In              |      1 |
|  3 | Out             |      2 |
|  4 | In              |      2 |
|  5 | In              |      2 |
|  6 | Out             |      3 |
|  7 | Out             |      3 |
|  8 | In              |      3 |
|  9 | Out             |      4 |
| 10 | In              |      4 |
| 11 | In              |      4 |
| 12 | In              |      4 |
+----+-----------------+--------+
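
The same grouping can probably also be computed without a global counter. The following is only a minimal sketch, assuming the rows are already ordered as in the question's table: it marks every "Out" row that directly follows an "In" row, takes a cumulative sum of those marks, and then pivots the amounts per transaction. The Date column is left out for brevity, and the names starts and summary are illustrative, not from the original answer.

```python
import pandas as pd

# Data copied from the question's table (Date column omitted for brevity).
df = pd.DataFrame({
    "Switch In/Out": ["Out", "In", "In", "Out", "In", "In",
                      "Out", "Out", "In", "Out", "In", "In", "In"],
    "Client":  [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2],
    "Quality": ["B", "A", "B", "A", "B", "A", "B", "C", "C", "A", "A", "B", "C"],
    "Amount":  [360, 180, 180, 140, 80, 60, 45, 85, 130, 100, 30, 30, 40],
})

s = df["Switch In/Out"]
# A new transaction starts wherever an "Out" row directly follows an "In" row.
starts = (s == "Out") & (s.shift() == "In")
# Cumulative count of transaction starts, offset so numbering begins at 1.
df["Rows"] = starts.cumsum() + 1

# Total amount switched in and out per transaction.
summary = df.pivot_table(index="Rows", columns="Switch In/Out",
                         values="Amount", aggfunc="sum")
print(summary)
```

With the sample data, the "In" and "Out" totals in summary can then be compared row by row, which plays the role of the multiple SUMIFS in Excel.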
