sql - Find invoices that equal zero with no discount in a Pandas DataFrame
Problem description
In Pandas, I have a DataFrame like this:
| Division | Invoice | Transactions | Amount |
|----------|----------|----------------|--------|
| Europe | 10000000 | Product Charge | 1000 |
| Europe | 10001000 | Product Charge | 1000 |
| Europe | 10001000 | Discount | -500 |
| Latam | 10002000 | Product Charge | 0 |
| Latam | 10003000 | Product Charge | 1000 |
| Latam | 10003000 | Discount | -1000 |
| Europe | 10004000 | Product Charge | 500 |
| Europe | 10004000 | Discount | -500 |
| Europe | 10005000 | Product Charge | 500 |
| Europe | 10005000 | Discount | 495 |
| Latam | 10006000 | Product Charge | 0 |
| Latam | 10007000 | Product Charge | 0 |
| Latam | 10007000 | Loyalty bonus | 200 |
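I need to build a new DF that counts, per Division, the invoices whose amount is 0 and that have no discount, as follows:
| Division | Invoice | Total | Q_Invoice |
|----------|----------|-------|-----------|
| Latam | 10002000 | 0 | 1 |
| Latam | 10006000 | 0 | 1 |
In SQL I can compute it as follows, but I cannot replicate it on a Pandas DF: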
SELECT Division, Invoice, SUM (Amount) Total, COUNT (DISTINCT Invoice) Q_Invoice
FROM df
GROUP BY Division, Invoice
HAVING SUM (CASE WHEN Transactions = 'Discount' THEN 1 ELSE 0 END) = '0'
AND SUM (CASE WHEN Transactions = 'Product Charge' THEN 1 ELSE 0 END) >= '1'
AND SUM(Amount) = 0
I tried to reproduce the result above with pandasql in a Jupyter notebook, using the following, but it does not work:
import pandasql as ps
import pandas as pd
df2 = ps.sqldf ("""SELECT Division, Invoice, SUM (Amount) Total,COUNT (DISTINCT Invoice) Q_Invoice
FROM df
GROUP BY Division, Invoice
HAVING SUM (CASE WHEN Transactions = 'Discount' THEN 1 ELSE 0 END) = '0'
AND SUM (CASE WHEN Transactions = 'Product Charge' THEN 1 ELSE 0 END) >= '1'
AND SUM(Amount) = 0 """)
I don't know how to proceed; I'm new to pandas.
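One plausible cause, assuming pandasql is running on its default SQLite backend: SQLite does not convert between numbers and text when neither side of a comparison is a typed column, so an aggregate such as `SUM(...)` compared with the text literal `'0'` is never equal, and the `HAVING` clause rejects every group. The sketch below uses the standard-library `sqlite3` module instead of pandasql (a substitution made here to keep it self-contained) and rebuilds the sample table by hand; the only change to the query is that the literals are numeric.

```python
import sqlite3

import pandas as pd

# Sample data copied from the table in the question
rows = [
    ("Europe", 10000000, "Product Charge", 1000),
    ("Europe", 10001000, "Product Charge", 1000),
    ("Europe", 10001000, "Discount", -500),
    ("Latam", 10002000, "Product Charge", 0),
    ("Latam", 10003000, "Product Charge", 1000),
    ("Latam", 10003000, "Discount", -1000),
    ("Europe", 10004000, "Product Charge", 500),
    ("Europe", 10004000, "Discount", -500),
    ("Europe", 10005000, "Product Charge", 500),
    ("Europe", 10005000, "Discount", 495),
    ("Latam", 10006000, "Product Charge", 0),
    ("Latam", 10007000, "Product Charge", 0),
    ("Latam", 10007000, "Loyalty bonus", 200),
]
df = pd.DataFrame(rows, columns=["Division", "Invoice", "Transactions", "Amount"])

# Load the frame into an in-memory SQLite database as table "df"
con = sqlite3.connect(":memory:")
df.to_sql("df", con, index=False)

# Integer vs. text literal: SQLite evaluates this comparison as false (0)
int_equals_text = con.execute("SELECT 0 = '0'").fetchone()[0]
print(int_equals_text)

# Same query as in the question, with numeric literals in HAVING
query = """
SELECT Division, Invoice, SUM(Amount) AS Total,
       COUNT(DISTINCT Invoice) AS Q_Invoice
FROM df
GROUP BY Division, Invoice
HAVING SUM(CASE WHEN Transactions = 'Discount' THEN 1 ELSE 0 END) = 0
   AND SUM(CASE WHEN Transactions = 'Product Charge' THEN 1 ELSE 0 END) >= 1
   AND SUM(Amount) = 0
"""
df2 = pd.read_sql_query(query, con)
print(df2)
```

If this is indeed the cause, the same change of `'0'` and `'1'` to `0` and `1` should carry over to the `ps.sqldf` call unchanged.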
Solution
I hope I've understood your question correctly. You can .pivot_table
the DataFrame and then filter on the sum of Product Charge:
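x = df.pivot_table(
    index=["Division", "Invoice"],
    columns="Transactions",
    values="Amount",
    aggfunc=["sum", "count"],
    fill_value=0,
)
# keep invoices whose Product Charge sum is zero and whose total over all
# transaction types is also zero (the second test drops 10007000, which
# carries a 200 Loyalty bonus)
zero_charge = x[("sum", "Product Charge")].eq(0)
zero_total = x["sum"].sum(axis=1).eq(0)
x = x[zero_charge & zero_total].reset_index()
# flatten the two-level column names, e.g. ("sum", "Product Charge")
# becomes "sum_Product Charge"
x.columns = x.columns.map("_".join)
x = x.rename(
    columns={
        "Division_": "Division",
        "Invoice_": "Invoice",
        "sum_Product Charge": "Total",
        "count_Product Charge": "Q_Invoice",
    }
)[["Division", "Invoice", "Total", "Q_Invoice"]]
print(x)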
Prints:
Division Invoice Total Q_Invoice
0 Latam 10002000 0 1
1 Latam 10006000 0 1
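As an alternative sketch (not part of the answer above), the three `HAVING` conditions from the SQL query can be mirrored one by one with `groupby` and a boolean mask. The helper column names `is_discount`, `is_charge`, `n_discount` and `n_charge` are made up for this illustration, and the sample table is rebuilt by hand so the block runs on its own.

```python
import pandas as pd

# Sample data copied from the table in the question
rows = [
    ("Europe", 10000000, "Product Charge", 1000),
    ("Europe", 10001000, "Product Charge", 1000),
    ("Europe", 10001000, "Discount", -500),
    ("Latam", 10002000, "Product Charge", 0),
    ("Latam", 10003000, "Product Charge", 1000),
    ("Latam", 10003000, "Discount", -1000),
    ("Europe", 10004000, "Product Charge", 500),
    ("Europe", 10004000, "Discount", -500),
    ("Europe", 10005000, "Product Charge", 500),
    ("Europe", 10005000, "Discount", 495),
    ("Latam", 10006000, "Product Charge", 0),
    ("Latam", 10007000, "Product Charge", 0),
    ("Latam", 10007000, "Loyalty bonus", 200),
]
df = pd.DataFrame(rows, columns=["Division", "Invoice", "Transactions", "Amount"])

# Flag each row so the flags can be summed per invoice (hypothetical names)
flags = df.assign(
    is_discount=df["Transactions"].eq("Discount"),
    is_charge=df["Transactions"].eq("Product Charge"),
)

# One row per (Division, Invoice): total amount plus counts of each row type
summary = flags.groupby(["Division", "Invoice"], as_index=False).agg(
    Total=("Amount", "sum"),
    n_discount=("is_discount", "sum"),
    n_charge=("is_charge", "sum"),
)

# Same three conditions as the HAVING clause
mask = (
    summary["n_discount"].eq(0)
    & summary["n_charge"].ge(1)
    & summary["Total"].eq(0)
)

# Invoice is a grouping key, so COUNT(DISTINCT Invoice) is 1 for every group
result = (
    summary.loc[mask, ["Division", "Invoice", "Total"]]
    .assign(Q_Invoice=1)
    .reset_index(drop=True)
)
print(result)
```

Compared with the pivot, this keeps one explicit condition per `HAVING` term, which may be easier to adjust if further transaction types appear.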