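python-3.x - How to select records with a NOT EXISTS condition in a pandas DataFrame
Question
I have two DataFrames, shown below. I want to rewrite a SQL selection query that contains a NOT EXISTS condition in pandas.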
SQL
SELECT df.ORDER_NUM, df.DRIVER
FROM df
WHERE 1 = 1
  AND NOT EXISTS (
      SELECT 1
      FROM order_addition oa
      WHERE oa.Flag_Value = 'Y'
        AND df.ORDER_NUM = oa.ORDER_NUM
  )
Sample data
order_addition.head(10)
ORDER_NUM Flag_Value
22574536 Y
32459745 Y
15642314 Y
12478965 N
25845673 N
36789156 N
df.head(10)
ORDER_NUM REGION DRIVER
22574536 WEST Ravi
32459745 WEST David
15642314 SOUTH Rahul
12478965 NORTH David
25845673 SOUTH Mani
36789156 SOUTH Tim
How can I do this easily in pandas?
Answer
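IIUC, you can left-merge df2 (the question's df) with the rows of df1 (the question's order_addition) where Flag_Value equals "Y", then keep the rows where Flag_Value came back as NaN: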
result = df2.merge(df1[df1["Flag_Value"].eq("Y")], how="left", on="ORDER_NUM")
print(result[result["Flag_Value"].isnull()])
   ORDER_NUM REGION DRIVER Flag_Value
3   12478965  NORTH  David        NaN
4   25845673  SOUTH   Mani        NaN
5   36789156  SOUTH    Tim        NaN
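A variant of the same left merge can be sketched with the `indicator=True` option of `DataFrame.merge`, which marks unmatched rows explicitly rather than relying on NaN in a carried-over column. This is a minimal sketch, not the answerer's code: the frames are rebuilt from the question's sample data under the question's names (`df`, `order_addition`), and the intermediate names `flagged`, `merged` and `anti` are illustrative only.

```python
import pandas as pd

# Sample data copied from the question.
order_addition = pd.DataFrame({
    "ORDER_NUM": [22574536, 32459745, 15642314, 12478965, 25845673, 36789156],
    "Flag_Value": ["Y", "Y", "Y", "N", "N", "N"],
})
df = pd.DataFrame({
    "ORDER_NUM": [22574536, 32459745, 15642314, 12478965, 25845673, 36789156],
    "REGION": ["WEST", "WEST", "SOUTH", "NORTH", "SOUTH", "SOUTH"],
    "DRIVER": ["Ravi", "David", "Rahul", "David", "Mani", "Tim"],
})

# Keys that satisfy the subquery (Flag_Value = 'Y'); de-duplicated so the
# left merge cannot multiply rows of df.
flagged = order_addition.loc[
    order_addition["Flag_Value"] == "Y", ["ORDER_NUM"]
].drop_duplicates()

# indicator=True adds a "_merge" column; "left_only" means no match was found.
merged = df.merge(flagged, how="left", on="ORDER_NUM", indicator=True)

# Keep only the unmatched rows and the two columns the SQL query selects.
anti = merged.loc[merged["_merge"] == "left_only", ["ORDER_NUM", "DRIVER"]]
print(anti)
```

Because only the key column of the flagged rows is merged in, the result carries no extra Flag_Value column to drop afterwards.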
If your ORDER_NUM values are unique, it is even simpler:
print(df2.loc[~df2["ORDER_NUM"].isin(df1.loc[df1["Flag_Value"].eq("Y"), "ORDER_NUM"])])
   ORDER_NUM REGION DRIVER
3   12478965  NORTH  David
4   25845673  SOUTH   Mani
5   36789156  SOUTH    Tim
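For reuse, the `isin` approach can be wrapped in a small helper. The sketch below is self-contained and rebuilds the question's sample data. The function name `select_not_exists` and its parameters are hypothetical, introduced here only for illustration, and the final column selection mirrors the `SELECT ORDER_NUM, DRIVER` list of the SQL query.

```python
import pandas as pd


def select_not_exists(left, right, key, right_mask):
    """Hypothetical helper: rows of `left` whose `key` value does not
    appear among the rows of `right` selected by the boolean `right_mask`."""
    matching_keys = right.loc[right_mask, key]
    return left.loc[~left[key].isin(matching_keys)]


# Sample data copied from the question.
order_addition = pd.DataFrame({
    "ORDER_NUM": [22574536, 32459745, 15642314, 12478965, 25845673, 36789156],
    "Flag_Value": ["Y", "Y", "Y", "N", "N", "N"],
})
df = pd.DataFrame({
    "ORDER_NUM": [22574536, 32459745, 15642314, 12478965, 25845673, 36789156],
    "REGION": ["WEST", "WEST", "SOUTH", "NORTH", "SOUTH", "SOUTH"],
    "DRIVER": ["Ravi", "David", "Rahul", "David", "Mani", "Tim"],
})

# NOT EXISTS (... WHERE oa.Flag_Value = 'Y' AND df.ORDER_NUM = oa.ORDER_NUM)
unflagged = select_not_exists(
    df, order_addition, "ORDER_NUM", order_addition["Flag_Value"] == "Y"
)

# Project to the columns named in the SQL SELECT list.
result = unflagged[["ORDER_NUM", "DRIVER"]]
print(result)
```

Unlike the merge, this filter keeps the original index of `df` and never adds rows or columns, so no clean-up step is needed afterwards.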