Pandas: a good way to delete cells and shift the others along the row?

Problem description

In a DataFrame, I need to delete some cells and shift the remaining cells of the same row into their place:

import pandas as pd

df = pd.DataFrame({'X0': ['anytext', 'anytext', 'anytext', 'anytext', 'anytext'],
                   'X1': ['12:40', 'boss', 'engen', '15:44', '16:01'],
                   'X2': ['anytext', '12:44', '14:06', 'anytext', 'anytext'],
                   'X3': ['anytext', 'anytext', 'anytext', 'anytext', 'anytext']})

print(df)
  
        X0     X1       X2       X3
0  anytext  12:40  anytext  anytext
1  anytext   boss    12:44  anytext
2  anytext  engen    14:06  anytext
3  anytext  15:44  anytext  anytext
4  anytext  16:01  anytext  anytext

I want to delete "boss" and "engen" and shift the other cells in those rows to the left:

        X0     X1       X2       X3
0  anytext  12:40  anytext  anytext
1  anytext  12:44  anytext      NaN
2  anytext  14:06  anytext      NaN
3  anytext  15:44  anytext  anytext
4  anytext  16:01  anytext  anytext

Tags: python, pandas

Solution


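You need to select the rows to shift. For example, the mask below tests whether the first 2 characters of X1 are numeric, using str[:2] and Series.str.isnumeric, and inverts the result with ~, so that DataFrame.shift is applied only to the rows with non-numeric values: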

m = ~df['X1'].str[:2].str.isnumeric()
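
To make the steps of this mask easier to follow, here is a minimal sketch that traces each intermediate result on the X1 values from the question. The variable names (x1, first_two, is_num) are only illustrative and are not part of the original answer.

```python
import pandas as pd

# X1 column from the question, rebuilt here so the snippet is self-contained
x1 = pd.Series(['12:40', 'boss', 'engen', '15:44', '16:01'], name='X1')

first_two = x1.str[:2]               # '12', 'bo', 'en', '15', '16'
is_num = first_two.str.isnumeric()   # True where both characters are digits
m = ~is_num                          # True marks the rows that need shifting

print(m.tolist())  # [False, True, True, False, False]
```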

Another idea for the mask, thanks @Manakin, is to test whether the value parses as a time in HH:MM format:

m = pd.to_datetime(df['X1'],format='%H:%M',errors='coerce').isna()

Alternatively, to test for exactly 2 digits, a colon, and 2 digits:

m = ~df['X1'].str.contains(r'^\d{2}:\d{2}$')

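# Shift the selected rows one cell to the left.
# Note: if X0 is still in the frame, this also moves X1 into X0;
# the output below shows only columns X1 to X3.
df[m] = df[m].shift(-1, axis=1)
print(df)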
      X1       X2       X3
0  12:40  anytext  anytext
1  12:44  anytext      NaN
2  14:06  anytext      NaN
3  15:44  anytext  anytext
4  16:01  anytext  anytext
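
The three masks above agree on the sample data but are not equivalent in general. The following sketch compares them on a few made-up values ('12abc' and '99:99' are hypothetical inputs, not from the question) to show where they can disagree; which behavior is preferable depends on the real data.

```python
import pandas as pd

# Hypothetical values chosen to show where the three masks differ
s = pd.Series(['12:40', 'boss', '12abc', '99:99'])

# True means "not a time, shift this row"
m_prefix = ~s.str[:2].str.isnumeric()
m_time = pd.to_datetime(s, format='%H:%M', errors='coerce').isna()
m_regex = ~s.str.contains(r'^\d{2}:\d{2}$')

comparison = pd.DataFrame({'value': s,
                           'prefix': m_prefix,
                           'to_datetime': m_time,
                           'regex': m_regex})
print(comparison)
```

In this sketch the prefix test accepts '12abc' as a time, the regex accepts '99:99', and only the to_datetime test rejects both, because it checks that the value is a valid hour and minute.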

If you need to modify only the columns from X1 onward and leave X0 in place, one idea is:

df=pd.DataFrame({'X0':['anytext','anytext','anytext','anytext','anytext'],
                 'X1':['12:40','boss','engen','15:44','16:01'],
                 'X2':['anytext','12:44','14:06','anytext','anytext'],
                 'X3':['anytext','anytext','anytext','anytext','anytext']}) 

m = ~df['X1'].str.contains(r'^\d{2}:\d{2}$')
# Shift only the columns from X1 to the end, for the masked rows
df.loc[m, 'X1':] = df.loc[m, 'X1':].shift(-1, axis=1)
print(df)
       X0     X1       X2       X3
0  anytext  12:40  anytext  anytext
1  anytext  12:44  anytext      NaN
2  anytext  14:06  anytext      NaN
3  anytext  15:44  anytext  anytext
4  anytext  16:01  anytext  anytext
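
If the same operation is needed in several places, the label-slice approach above could be wrapped in a small helper. This is only a sketch: the function name shift_left_where and its parameters are hypothetical, and it returns a modified copy instead of changing the frame in place.

```python
import pandas as pd

def shift_left_where(df, mask, start_col):
    """Return a copy of df where the rows selected by mask are shifted
    one cell to the left, from start_col through the last column."""
    out = df.copy()
    out.loc[mask, start_col:] = out.loc[mask, start_col:].shift(-1, axis=1)
    return out

df = pd.DataFrame({'X0': ['anytext', 'anytext', 'anytext', 'anytext', 'anytext'],
                   'X1': ['12:40', 'boss', 'engen', '15:44', '16:01'],
                   'X2': ['anytext', '12:44', '14:06', 'anytext', 'anytext'],
                   'X3': ['anytext', 'anytext', 'anytext', 'anytext', 'anytext']})

# True for the rows whose X1 value does not look like HH:MM
m = ~df['X1'].str.contains(r'^\d{2}:\d{2}$')

result = shift_left_where(df, m, 'X1')
print(result)
```

Because the helper works on a copy, the original df still contains "boss" and "engen" afterwards, which can be handy for comparing before and after.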

Another option is to convert X0 to the index first, so that it is not part of the shift:

# Starting again from the original df
df = df.set_index('X0')
m = ~df['X1'].str.contains(r'^\d{2}:\d{2}$')
df[m] = df[m].shift(-1, axis=1)
df = df.reset_index()
print(df)
        X0     X1       X2       X3
0  anytext  12:40  anytext  anytext
1  anytext  12:44  anytext      NaN
2  anytext  14:06  anytext      NaN
3  anytext  15:44  anytext  anytext
4  anytext  16:01  anytext  anytext
