python - I want to compare values in pandas
Problem description
I have two DataFrames. The first:
import pandas as pd
a = [['xxx', 'admin'], ['yyy', 'admin,super admin'], ['zzz', 'guest,admin,superadmin']]
df1 = pd.DataFrame(a, columns=['user', 'groups'])
The second:
b = [['xxx', 'admin,super admin'], ['www', 'admin,super admin'], ['zzz', 'guest,superadmin']]
df2 = pd.DataFrame(b, columns=['user', 'groups'])
This is the first:
  user                  groups
0  xxx                   admin
1  yyy       admin,super admin
2  zzz  guest,admin,superadmin
This is the second:
  user             groups
0  xxx  admin,super admin
1  www  admin,super admin
2  zzz   guest,superadmin
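I want to do two things:
If a user in the second DataFrame is not in the first, print it. For example: www is not in the list
If the user is in the list but the groups are not equal, print the difference. For example:
xxx
user has more: super admin
than the list
zzz
user has less: admin
than the list.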
Solution
If both DataFrames have the same index (same length, same labels) and the values need to be compared row by row:
print (df1.index.equals(df2.index))
True
# True where the users in the same row position differ
mask = df1['user'].ne(df2['user'])
# use the mask to select the 'user' column of df2
a = df2.loc[mask, 'user'].tolist()
print(a)
['www']
#join both DataFrames together
df1 = pd.concat([df1, df2], axis=1, keys=('a','b'))
df1.columns = df1.columns.map('_'.join)
#filter only same user rows
df1 = df1[~mask]
# split the group strings on ',' and convert them to sets
df1['a'] = df1['a_groups'].apply(lambda x: set(x.split(',')))
df1['b'] = df1['b_groups'].apply(lambda x: set(x.split(',')))
# set differences, joined back into strings with ', '
# a_diff: groups in b but not in a; b_diff: groups in a but not in b
df1['a_diff'] = [', '.join(x.difference(y)) for x, y in zip(df1['b'], df1['a'])]
df1['b_diff'] = [', '.join(x.difference(y)) for x, y in zip(df1['a'], df1['b'])]
print(df1)
  a_user                a_groups b_user           b_groups  \
0    xxx                   admin    xxx  admin,super admin
2    zzz  guest,admin,superadmin    zzz   guest,superadmin

                            a                     b       a_diff b_diff
0                     {admin}  {admin, super admin}  super admin
2  {admin, superadmin, guest}   {superadmin, guest}               admin
# filter by casting the diff columns to boolean; empty strings become False
b = df1.loc[df1['a_diff'].astype(bool), ['a_user', 'a_diff']]
print(b)
  a_user       a_diff
0    xxx  super admin
c = df1.loc[df1['b_diff'].astype(bool), ['a_user', 'b_diff']]
print(c)
  a_user b_diff
2    zzz  admin
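The row-by-row comparison above relies on both DataFrames listing users in the same positions. If that cannot be guaranteed, matching on the user column may be more robust. The following is only a sketch of that alternative, not the method used in the answer: it uses `isin` to find users missing from the first DataFrame and an inner `merge` on `user` to pair up the group strings. The names `missing`, `merged`, `to_set` and `report` are illustrative and do not appear in the original.

```python
import pandas as pd

# Same sample data as in the question
df1 = pd.DataFrame(
    [['xxx', 'admin'], ['yyy', 'admin,super admin'], ['zzz', 'guest,admin,superadmin']],
    columns=['user', 'groups'])
df2 = pd.DataFrame(
    [['xxx', 'admin,super admin'], ['www', 'admin,super admin'], ['zzz', 'guest,superadmin']],
    columns=['user', 'groups'])

# Users that appear in df2 but nowhere in df1, regardless of row position
missing = df2.loc[~df2['user'].isin(df1['user']), 'user'].tolist()

# Inner merge keeps only users present in both DataFrames
merged = df1.merge(df2, on='user', suffixes=('_a', '_b'))

def to_set(groups):
    # hypothetical helper: split a comma-separated string into a set
    return set(groups.split(','))

# For each shared user, record groups only in df2 ('more') and only in df1 ('less')
report = {}
for row in merged.itertuples(index=False):
    a, b = to_set(row.groups_a), to_set(row.groups_b)
    report[row.user] = {'more': sorted(b - a), 'less': sorted(a - b)}

for user in missing:
    print(f'{user} is not in the list')
for user, diff in report.items():
    if diff['more']:
        print(f"{user}\nuser has more: {', '.join(diff['more'])}\nthan the list")
    if diff['less']:
        print(f"{user}\nuser has less: {', '.join(diff['less'])}\nthan the list")
```

With the sample data this reports www as missing, super admin as the extra group for xxx, and admin as the group zzz lacks. Sorting the set differences keeps the printed order stable, since set iteration order is not guaranteed.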