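python - Mapping from a MultiIndex to multiple columns
Problem description
I have a huge DataFrame with a MultiIndex, and I want to create new columns based on parts of that index. This is what I have: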
import numpy as np
import pandas as pd

# three index levels: p1, p2, p3
arrays = [['bar', 'bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'three', 'one', 'four', 'one', 'two', 'eight', 'one', 'two'],
          ['green', 'green', 'blue', 'blue', 'black', 'black', 'orange', 'green', 'blue', 'black']]
s = pd.DataFrame(np.random.randn(10), index=arrays)
s.index.names = ['p1', 'p2', 'p3']
s
                         0
p1  p2    p3
bar one   green  -0.676472
    two   green  -0.030377
    three blue   -0.957517
baz one   blue    0.710764
    four  black   0.404377
foo one   black  -0.286358
    two   orange -1.620832
    eight green   0.316170
qux one   blue   -0.433310
    two   black   1.127754
This is what I want:
                         0  x1  x2  x3
p1  p2    p3
bar one   green   1.563381   1   0   1
    two   green   0.193622   0   0   0
    three blue    0.046728   1   0   0
baz one   blue    0.098216   0   0   0
    four  black   1.826574   0   1   0
foo one   black  -0.120856   1   1   1
    two   orange  0.605020   0   0   0
    eight green   0.693606   0   0   0
qux one   blue    0.588244   1   1   1
    two   black  -0.872104   1   1   1
Now, in pseudocode, I want:
if (p1 =='bar') & (p2 == 'one') & (p3 == 'green'): s['x1'] = 1, s['x3'] = 1
if (p1 == 'bar') & (p3 == 'blue'): s['x1'] = 1
if (p1 == 'baz') & (p3 == 'black'): s['x2'] = 1
if (p1 =='foo') & (p2 == 'one') & (p3 == 'black'): s['x1'] = 1, s['x2'] = 1, s['x3'] = 1
if (p1 == 'qux'): s['x1'] = 1, s['x2'] = 1, s['x3'] = 1
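That is, I want to assign 1 to the new x columns based on the values of the MultiIndex levels. I am looking for a vectorized approach, something like numpy.select(condition, choice), but I could not get numpy.select to set more than one column per condition.
Since I have 14 index levels, I would like my conditions to name the levels explicitly, i.e. (p1 == 'bar') & (p2 == 'one') is preferred over ['bar', 'one', ].
Any guidance would be much appreciated!
Thanks for your help!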
Solution
You can select the rows with pd.IndexSlice and set the chosen columns to 1 like this:
idx = pd.IndexSlice
# start with every flag column set to 0
s = s.assign(x1=0, x2=0, x3=0)
# idx[p1, p2, p3] selects rows; ':' matches any value in that level
s.loc[idx['bar', 'one', 'green'], ['x1', 'x3']] = 1
s.loc[idx['bar', :, 'blue'], ['x1']] = 1
s.loc[idx['baz', :, 'black'], ['x2']] = 1
s.loc[idx['foo', 'one', 'black'], ['x1', 'x2', 'x3']] = 1
s.loc[idx['qux', :, :], ['x1', 'x2', 'x3']] = 1
print(s)
                         0  x1  x2  x3
p1  p2    p3
bar one   green   0.152556   1   0   1
    two   green   0.488762   0   0   0
    three blue    0.037346   1   0   0
baz one   blue    1.903518   0   0   0
    four  black   0.589922   0   1   0
foo one   black   0.871984   1   1   1
    two   orange  0.514062   0   0   0
    eight green  -0.177246   0   0   0
qux one   blue    0.740046   1   1   1
    two   black   0.755664   1   1   1
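With 14 index levels, spelling out a full idx[...] tuple for every rule gets tedious. One possible way to keep the slicer approach while naming the levels is a small helper that builds the row key from keyword arguments. This is only a sketch: level_key is a hypothetical helper (not part of pandas), it assumes the level names are valid Python identifiers, and the frame is filled with np.arange instead of random data so the result is reproducible.

```python
import numpy as np
import pandas as pd

arrays = [['bar', 'bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'three', 'one', 'four', 'one', 'two', 'eight', 'one', 'two'],
          ['green', 'green', 'blue', 'blue', 'black', 'black', 'orange', 'green', 'blue', 'black']]
# np.arange replaces the random data (assumption, for reproducibility)
s = pd.DataFrame(np.arange(10.0), index=arrays)
s.index.names = ['p1', 'p2', 'p3']


def level_key(index, **conds):
    """Hypothetical helper: build a .loc row key from named levels.

    Levels named in conds get the requested value; every other level
    gets slice(None), which is what ':' means inside pd.IndexSlice.
    """
    unknown = set(conds) - set(index.names)
    if unknown:
        raise KeyError(f"unknown index levels: {sorted(unknown)}")
    return tuple(conds.get(name, slice(None)) for name in index.names)


s = s.assign(x1=0, x2=0, x3=0)
# equivalent to s.loc[idx['bar', :, 'blue'], ['x1']] = 1
s.loc[level_key(s.index, p1='bar', p3='blue'), ['x1']] = 1
# equivalent to s.loc[idx['qux', :, :], ['x1', 'x2', 'x3']] = 1
s.loc[level_key(s.index, p1='qux'), ['x1', 'x2', 'x3']] = 1
print(s)
```

The helper only rearranges the arguments into the tuple that pd.IndexSlice would produce, so the selection itself is still done by pandas.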
Edit: to refer to the levels by name, compare the values of each level and combine the boolean masks:
def get_i(lev, val):
    # boolean mask: True where index level `lev` equals `val`
    return s.index.get_level_values(lev) == val

s = s.assign(x1=0, x2=0, x3=0)
s.loc[get_i('p1', 'bar') & get_i('p2', 'one') & get_i('p3', 'green'), ['x1', 'x3']] = 1
s.loc[get_i('p1', 'bar') & get_i('p3', 'blue'), ['x1']] = 1
s.loc[get_i('p1', 'baz') & get_i('p3', 'black'), ['x2']] = 1
s.loc[get_i('p1', 'foo') & get_i('p2', 'one') & get_i('p3', 'black'), ['x1', 'x2', 'x3']] = 1
s.loc[get_i('p1', 'qux'), ['x1', 'x2', 'x3']] = 1
print(s)
                         0  x1  x2  x3
p1  p2    p3
bar one   green  -0.029773   1   0   1
    two   green  -1.505461   0   0   0
    three blue    1.819085   1   0   0
baz one   blue    0.645498   0   0   0
    four  black  -1.119554   0   1   0
foo one   black   1.002072   1   1   1
    two   orange -0.461030   0   0   0
    eight green  -2.565080   0   0   0
qux one   blue    0.286967   1   1   1
    two   black  -0.522340   1   1   1
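If the list of rules grows, the mask approach can also be driven from a table of rules, which comes close to the "one condition, several columns" behaviour asked for in the question. The following is a minimal sketch under a few assumptions: rules and level_mask are hypothetical names introduced here, and the frame uses np.arange instead of random data so the output is reproducible.

```python
import numpy as np
import pandas as pd

arrays = [['bar', 'bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'three', 'one', 'four', 'one', 'two', 'eight', 'one', 'two'],
          ['green', 'green', 'blue', 'blue', 'black', 'black', 'orange', 'green', 'blue', 'black']]
# np.arange replaces the random data (assumption, for reproducibility)
s = pd.DataFrame(np.arange(10.0), index=arrays)
s.index.names = ['p1', 'p2', 'p3']

# Hypothetical rule table: (required value per named level, columns to set to 1)
rules = [
    ({'p1': 'bar', 'p2': 'one', 'p3': 'green'}, ['x1', 'x3']),
    ({'p1': 'bar', 'p3': 'blue'}, ['x1']),
    ({'p1': 'baz', 'p3': 'black'}, ['x2']),
    ({'p1': 'foo', 'p2': 'one', 'p3': 'black'}, ['x1', 'x2', 'x3']),
    ({'p1': 'qux'}, ['x1', 'x2', 'x3']),
]


def level_mask(index, conds):
    """Boolean array: True where every named level equals its required value."""
    mask = np.ones(len(index), dtype=bool)
    for level, value in conds.items():
        mask &= index.get_level_values(level) == value
    return mask


s = s.assign(x1=0, x2=0, x3=0)
for conds, cols in rules:
    s.loc[level_mask(s.index, conds), cols] = 1
print(s)
```

Each rule is still one vectorized comparison per level, so the Python loop runs once per rule rather than once per row.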