python - Filter rows in a DataFrame based on highest index and value in a column
Problem description
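I have the following example. I want to keep every row where ID=5. Where several rows in a row have ID=3, I only want to keep the one highest up in the index, that is, the first row of each run.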
import pandas as pd
data = {'Profession':['Teacher', 'Banker', 'Teacher', 'Judge','lawyer','Teacher'], 'Gender':['Male','Male', 'Female', 'Male','Male','Female'],'Size':['M','M','L','S','S','M'],'ID':['5','3','3','3','5','3']}
data2={'Profession':['Doctor', 'Scientist', 'Scientist', 'Banker','Judge','Scientist'], 'Gender':['Male','Male', 'Female','Female','Male','Male'],'Size':['L','M','L','M','L','L'],'ID':['5','3','5','3','3','3']}
data3 = {'Profession':['Banker', 'Banker', 'Doctor', 'Doctor','lawyer','Teacher'], 'Gender':['Male','Male', 'Female', 'Female','Female','Male'],'Size':['S','M','S','M','L','S'],'ID':['5','3','3','3','5','3']}
data4={'Profession':['Judge', 'Judge', 'Scientist', 'Banker','Judge','Scientist'], 'Gender':['Female','Female', 'Female','Female','Female','Female'],'Size':['M','S','L','S','M','S'],'ID':['3','5','3','3','5','3']}
df = pd.DataFrame(data)
df2 = pd.DataFrame(data2)
df3 = pd.DataFrame(data3)
df4 = pd.DataFrame(data4)
DATA = pd.concat([df, df2, df3, df4])
DATA.reset_index(drop=True, inplace=True)
DATA
This is just an example. My real data has a large number of rows, so I am looking for code that also works on a much bigger DataFrame.
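Solution
You can build a boolean mask that flags every repeated row with ID 3 while leaving the first row of each run untouched.
The mask tests two conditions:
- the row's ID equals 3
- the row directly above it also has ID 3
Here are the first few rows with the mask added as a condition column: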
    Profession  Gender  Size  ID  bool_
0      Teacher    Male     M   5  False
1       Banker    Male     M   3  False  <-- fulfills the 1st condition but not the 2nd, so False
2      Teacher  Female     L   3   True  <-- fulfills conditions 1 and 2
3        Judge    Male     S   3   True  <-- fulfills conditions 1 and 2
4       lawyer    Male     S   5  False
5      Teacher  Female     M   3  False
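The condition column in the table above can be reproduced with a short sketch. It rebuilds only the first six rows of the example; the names `sample`, `is_three` and `same_as_previous` are introduced here for illustration, and the cast to integers assumes the IDs are stored as strings, as they are in the question.

```python
import pandas as pd

# First six rows of the example data (IDs are strings, as in the question).
sample = pd.DataFrame({
    'Profession': ['Teacher', 'Banker', 'Teacher', 'Judge', 'lawyer', 'Teacher'],
    'Gender': ['Male', 'Male', 'Female', 'Male', 'Male', 'Female'],
    'Size': ['M', 'M', 'L', 'S', 'S', 'M'],
    'ID': ['5', '3', '3', '3', '5', '3'],
})
sample['ID'] = sample['ID'].astype(int)

# Condition 1: the row's ID is 3.
is_three = sample['ID'].eq(3)
# Condition 2: the row directly above has the same ID
# (the first row compares against NaN, which is never equal).
same_as_previous = sample['ID'].eq(sample['ID'].shift())

sample['bool_'] = is_three & same_as_previous
print(sample)
```

Splitting the mask into two named conditions is only for readability; combining them in a single expression gives the same result.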
df = DATA
df['ID'] = df['ID'].astype(int)  # IDs are stored as strings; cast so that eq(3) can match
# True where the ID is 3 and the row directly above has the same ID
m = df['ID'].eq(3) & df['ID'].eq(df['ID'].shift())
df_new = df[~m]  # keep every row that is not flagged
print(df_new)
    Profession  Gender  Size  ID
0      Teacher    Male     M   5
1       Banker    Male     M   3
4       lawyer    Male     S   5
5      Teacher  Female     M   3
6       Doctor    Male     L   5
7    Scientist    Male     M   3
8    Scientist  Female     L   5
9       Banker  Female     M   3
12      Banker    Male     S   5
13      Banker    Male     M   3
16      lawyer  Female     L   5
17     Teacher    Male     S   3
19       Judge  Female     S   5
20   Scientist  Female     L   3
22       Judge  Female     M   5
23   Scientist  Female     S   3
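For a larger DataFrame, the same mask can be wrapped in a small helper. This is a sketch rather than part of the original answer: the function name `drop_repeated_runs` and its parameters are hypothetical. The `keep='last'` branch is an assumption about what "highest index" could mean if the bottom row of each run is wanted instead of the top one.

```python
import pandas as pd


def drop_repeated_runs(frame, column='ID', value=3, keep='first'):
    """Collapse each consecutive run of `value` in `column` to a single row.

    keep='first' keeps the top row of each run (same result as the mask above);
    keep='last' keeps the bottom row of each run instead.
    """
    if keep not in ('first', 'last'):
        raise ValueError("keep must be 'first' or 'last'")
    col = frame[column]
    # Compare with the previous row to keep the first row of a run,
    # or with the next row to keep the last one.
    neighbour = col.shift(1 if keep == 'first' else -1)
    repeated = col.eq(value) & col.eq(neighbour)
    return frame[~repeated]


# Small check on a toy frame whose IDs are already integers.
toy = pd.DataFrame({'ID': [5, 3, 3, 3, 5, 3]})
print(drop_repeated_runs(toy).index.tolist())               # [0, 1, 4, 5]
print(drop_repeated_runs(toy, keep='last').index.tolist())  # [0, 3, 4, 5]
```

Because the comparison is a vectorized `shift`, there is no Python-level loop over rows, so the approach should scale to frames with many rows.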