Suggestions for filtering a DataFrame based on k categories

Problem description

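I have two categorical columns: the first, client, takes values such as client_abc and client_def; the second, facility, takes F1, F2, and F3. The remaining columns are numeric.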

The data looks like this:

date        client          facility    count     claim
21/3/2019   'client_abc'    F1          200       1300
22/3/2019   'client_def'    F2          400       1800
21/3/2019   'client_abc'    F3          1000      3000
22/3/2019   'client_def'    F1          380       3600
21/3/2019   'client_abc'    F2          900       900
22/3/2019   'client_def'    F3          1030      2500
21/3/2019   'client_abc'    F1          190       1700
22/3/2019   'client_def'    F2          100000    1560

For client 'abc' and facility 'F1', I want:

date        client          facility    count     claim
21/3/2019   'client_abc'    F1          200       1300
21/3/2019   'client_abc'    F1          190       1700

Likewise for 'abc' and 'F2', 'abc' and 'F3', 'def' and 'F1', 'def' and 'F2', and 'def' and 'F3'.

My attempt

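# Each comparison needs its own parentheses because & binds tighter than ==.
df_abc_f1 = df[(df["facility"] == "F1") & (df["client"] == "client_abc")]
df_def_f1 = df[(df["facility"] == "F1") & (df["client"] == "client_def")]
df_abc_f2 = df[(df["facility"] == "F2") & (df["client"] == "client_abc")]
df_def_f2 = df[(df["facility"] == "F2") & (df["client"] == "client_def")]
df_abc_f3 = df[(df["facility"] == "F3") & (df["client"] == "client_abc")]
df_def_f3 = df[(df["facility"] == "F3") & (df["client"] == "client_def")]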

How can I get the same result without knowing the values of the facility and client columns in advance?

Tags: python, pandas, dataframe, data-science

Solution


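# groupby yields one (key, sub-DataFrame) pair per combination present in the data.
for (facility, client), group_df in df.groupby(["facility", "client"]):
    # group_df holds only the rows for this (facility, client) pair.
    print(facility, client, len(group_df))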
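To make the groupby approach concrete, here is a minimal sketch that rebuilds the sample data from the question and stores each sub-DataFrame in a dictionary keyed by the (client, facility) pair. The name `subsets` is made up for illustration, and the client values are assumed to be stored without the surrounding quotes shown in the question's table.

```python
import pandas as pd

# Sample data copied from the question (client values assumed to have no quotes).
df = pd.DataFrame(
    {
        "date": ["21/3/2019", "22/3/2019", "21/3/2019", "22/3/2019",
                 "21/3/2019", "22/3/2019", "21/3/2019", "22/3/2019"],
        "client": ["client_abc", "client_def", "client_abc", "client_def",
                   "client_abc", "client_def", "client_abc", "client_def"],
        "facility": ["F1", "F2", "F3", "F1", "F2", "F3", "F1", "F2"],
        "count": [200, 400, 1000, 380, 900, 1030, 190, 100000],
        "claim": [1300, 1800, 3000, 3600, 900, 2500, 1700, 1560],
    }
)

# Build one sub-DataFrame per (client, facility) pair that occurs in the data.
# No values are hardcoded; the keys come from the columns themselves.
subsets = {key: group for key, group in df.groupby(["client", "facility"])}

# Look up a single combination by its key tuple.
abc_f1 = subsets[("client_abc", "F1")]
print(abc_f1)
```

Only combinations that actually appear in the data get an entry, so looking up a pair that never occurs raises a KeyError; `subsets.get(key)` is one way to handle that case.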
