Pandas: selecting by MultiIndex criteria together with column criteria

Problem description

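I need to query a Pandas DataFrame using multiple conditions on both the index and the columns. My data is shown below. 'Country' and 'Surname' are two separate index levels, while 'Name', 'Score', and 'Type' are columns.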

import pandas as pd

# Build the frame, then move 'Country' and 'Surname' into a two-level MultiIndex
y = pd.DataFrame({'Name': ['Nikhil', 'Ankit', 'Keval', 'Darpan', 'Rajesh', 'John', 'Lynda'],
                  'Score': [89, 92, 96, 82, 95, 98, 97],
                  'Type': ['Fat', 'Slim', 'Fat', 'Slim', 'Fat', 'Slim', 'Slim'],
                  'Country': ['India', 'USA', 'Denmark', 'Australia', 'Italy', 'China', 'Israel'],
                  'Surname': ['Sharma', 'Sharma', 'Patel', 'Shah', 'Sharma', 'Sharma', 'Sharma']}
                 ).set_index(['Country', 'Surname'])


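I want to select the rows where Country is neither India nor USA, Surname is Sharma, Score is at least 90, and Type is Slim.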

Tags: python, pandas, multi-index

Solution


To select by columns or by MultiIndex levels, you can use query and chain the conditions with and or &:

q='Country not in ["India","USA"] and Surname == "Sharma" and Score >= 90 and Type == "Slim"'

Or:

q = 'Country not in ["India","USA"] & Surname == "Sharma" & Score >= 90 & Type == "Slim"'

print(y.query(q))
                  Name  Score  Type
Country Surname                    
China   Sharma    John     98  Slim
Israel  Sharma   Lynda     97  Slim
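
As a quick check that the two spellings behave the same inside query, the sketch below rebuilds the frame from the question and compares both results. The names q_and, q_amp, res_and, and res_amp are illustrative and do not appear in the original answer.

```python
import pandas as pd

# Same data as in the question, with a two-level MultiIndex
y = pd.DataFrame({'Name': ['Nikhil', 'Ankit', 'Keval', 'Darpan', 'Rajesh', 'John', 'Lynda'],
                  'Score': [89, 92, 96, 82, 95, 98, 97],
                  'Type': ['Fat', 'Slim', 'Fat', 'Slim', 'Fat', 'Slim', 'Slim'],
                  'Country': ['India', 'USA', 'Denmark', 'Australia', 'Italy', 'China', 'Israel'],
                  'Surname': ['Sharma', 'Sharma', 'Patel', 'Shah', 'Sharma', 'Sharma', 'Sharma']}
                 ).set_index(['Country', 'Surname'])

# Hypothetical variable names: one query string per operator style
q_and = 'Country not in ["India","USA"] and Surname == "Sharma" and Score >= 90 and Type == "Slim"'
q_amp = 'Country not in ["India","USA"] & Surname == "Sharma" & Score >= 90 & Type == "Slim"'

res_and = y.query(q_and)
res_amp = y.query(q_amp)

# Both query strings should select exactly the same rows
print(res_and.equals(res_amp))
```

Index level names such as Country and Surname can be referenced in the query string just like column names, which is why no get_level_values call is needed here.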

An alternative is boolean indexing, but there the masks must be chained strictly with &:

m1 = ~y.index.get_level_values('Country').isin(["India","USA"])
m2 = y.index.get_level_values('Surname') == 'Sharma'
m3 = y['Score'].ge(90)
m4 = y['Type'].eq('Slim')
print(y[m1 & m2 & m3 & m4])
                  Name  Score  Type
Country Surname                    
China   Sharma    John     98  Slim
Israel  Sharma   Lynda     97  Slim
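
To illustrate why & is mandatory outside query, the sketch below tries to combine two of the masks with Python's and keyword, which asks for the truth value of a whole array and raises ValueError. The variable names raised and selected are illustrative only.

```python
import pandas as pd

# Same data as in the question, with a two-level MultiIndex
y = pd.DataFrame({'Name': ['Nikhil', 'Ankit', 'Keval', 'Darpan', 'Rajesh', 'John', 'Lynda'],
                  'Score': [89, 92, 96, 82, 95, 98, 97],
                  'Type': ['Fat', 'Slim', 'Fat', 'Slim', 'Fat', 'Slim', 'Slim'],
                  'Country': ['India', 'USA', 'Denmark', 'Australia', 'Italy', 'China', 'Israel'],
                  'Surname': ['Sharma', 'Sharma', 'Patel', 'Shah', 'Sharma', 'Sharma', 'Sharma']}
                 ).set_index(['Country', 'Surname'])

m1 = ~y.index.get_level_values('Country').isin(["India", "USA"])  # index-level mask
m2 = y.index.get_level_values('Surname') == 'Sharma'              # index-level mask
m3 = y['Score'].ge(90)                                            # column mask
m4 = y['Type'].eq('Slim')                                         # column mask

# `and` needs a single True/False, which a multi-element mask cannot provide
try:
    y[m3 and m4]
    raised = False
except ValueError:
    raised = True
print(raised)

# `&` combines the masks element by element, so this works
selected = y[m1 & m2 & m3 & m4]
print(selected)
```

The same applies to or versus |: in boolean indexing only the element-wise operators work, whereas query accepts either spelling.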
