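python - String replacement on MultiIndex columns

Problem description

I want to replace all values in certain columns of a MultiIndex DataFrame. I found a dirty way to do it, but I'm looking for something cleaner.

In case it helps: the data is imported from .xlsx, where I was able to strip the ',' from the first column using the thousands option.

All the numbers are strings, so I need to convert them to float or int, hence the str.replace call.

Example DataFrame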

Name    0                       1                      ...
Col     A           B           A            B         ...
0       409511  30.3%           355529   30.3%  ...
1       332276  20.3%           083684   20.3%  ...
2       138159  10.3%           570834   10.3%  ...

If I use

df['0','B']= df['0','B'].str.replace('%','').astype(float)

This works, but I don't want to do it for every column.

I have been experimenting with

df.loc[:,pd.IndexSlice[:,'B']].str.replace('%','').astype(float)

but I get the error

'DataFrame' object has no attribute 'str'

I also tried

df.loc[:,pd.IndexSlice[:,'Percent']].replace('%','')

It returns the DataFrame without an error, but the values are unchanged.

If I do

df.loc[:,pd.IndexSlice[:,'Percent']].replace('%','').astype(float)

I get: could not convert string to float: '33.3%'

I read through https://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html but found nothing about replacing values.

I also couldn't find anything in https://jakevdp.github.io/PythonDataScienceHandbook/03.05-hierarchical-indexing.html

Tags: python, pandas, dataframe

You can try pd.IndexSlice with loc, and update (note: you need regex=True)

idx = pd.IndexSlice
df.update(df.loc[:, idx[:,'B']].replace('%', '', regex=True).astype(float))

Out[1374]:
        0             1
        A     B       A     B
0  409511  30.3  355529  30.3
1  332276  20.3   83684  20.3
2  138159  10.3  570834  10.3
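
On the regex=True note: without it, DataFrame.replace only replaces cells whose entire value equals '%', which is most likely why the attempt in the question returned the frame unchanged. Here is a minimal sketch using a made-up one-column frame (the name demo and its values are illustrative, not the asker's data):

```python
import pandas as pd

# Illustrative frame: one cell contains '%' inside a longer string,
# the other cell is exactly '%'.
demo = pd.DataFrame({"B": ["30.3%", "%"]})

# Default (regex=False): only cells that are exactly '%' are replaced.
whole_cell = demo.replace("%", "")

# regex=True: '%' is treated as a pattern and removed wherever it occurs.
substring = demo.replace("%", "", regex=True)

print(whole_cell["B"].tolist())
print(substring["B"].tolist())
```

In the first result the '30.3%' cell is left alone; in the second the percent sign is stripped from it, so astype(float) can then succeed on real data.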

Or use filter and update to write the result back into df

df.update(df.filter(like='B').replace('%', '', regex=True).astype(float))

Out[1363]:
        0             1
        A     B       A     B
0  409511  30.3  355529  30.3
1  332276  20.3   83684  20.3
2  138159  10.3  570834  10.3
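
For context, the 'DataFrame' object has no attribute 'str' error comes from the fact that .str is a Series accessor, so it has to be applied one column at a time. As an alternative sketch that does not rely on update, you could loop over the matching columns and assign each converted Series back. The frame below is rebuilt from the example in the question, assuming both column levels hold strings; pct_cols is just an illustrative name.

```python
import pandas as pd

# Rebuild the example frame (assumption: all values and labels are strings).
columns = pd.MultiIndex.from_product([["0", "1"], ["A", "B"]], names=["Name", "Col"])
df = pd.DataFrame(
    [
        ["409511", "30.3%", "355529", "30.3%"],
        ["332276", "20.3%", "083684", "20.3%"],
        ["138159", "10.3%", "570834", "10.3%"],
    ],
    columns=columns,
)

# Select every column whose second level ("Col") is 'B'.
pct_cols = df.columns[df.columns.get_level_values("Col") == "B"]

# .str works on a Series, so convert each column and assign it back.
for col in pct_cols:
    df[col] = df[col].str.rstrip("%").astype(float)

print(df.dtypes)
```

Assigning whole columns replaces them outright, so the 'B' columns end up with a float dtype, while the 'A' columns are left as strings.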
