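python - How can I use a variable that holds a condition inside np.where, for a pandas column with list values?
Problem description
I am trying to use np.where to fill a column based on several conditions, and I want to change what happens in the else branch. I also have to call df1['matches'].fillna('[0]', inplace=True), otherwise I get a different error.
Code: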
import numpy as np
import pandas as pd

df1 = pd.read_csv('one.txt', sep='\t')
df1['matches'].fillna('[0]', inplace=True)
df1['scorehigh?'] = df1['league'].apply(lambda a: 'yes' if a == 'Active' or a == 'Super Active' else 'no')
df1['greaterthan10?'] = (['yes' if any(int(a) > 10 for a in i) else 'no'
                          for i in df1['matches'].str.findall(r'\d+')])
# Attempt to keep the else condition in a variable (this is the part that fails)
m = np.where((df1['scorehigh?'] == 'yes')) & (df1['matches'] != '[0]')
df1['Finals?'] = np.where((df1['scorehigh?'] == 'yes') & (df1['greaterthan10?'] == 'yes'), 'YES', m)
a = df1['Finals?'].value_counts()
print(a)
Error:
setting an array element with a sequence.
Input:
league matches
Active [[1, 0, 50,], [2, 0, 14,]]
Active [[1, 0, 0,], [2, 0, 4,]]
Active [[1, 0, 50,], [2, 0, 14,]]
Super Active [[1, 0, 50,], [2, 0, 14,]]
Low [[1, 0, 50,], [2, 0, 14,]]
Low [[1, 0, 5,], [2, 0, 5,]]
Low [[1, 0, 40,], [2, 0, 10,]]
Super Active
Super Active
Super Active
Super
Low
Expected output:
league matches greater_than_10?
Active [[1, 0, 50,], [2, 0, 14,]] yes
Active [[1, 0, 0,], [2, 0, 4,]] no
Active [[1, 0, 50,], [2, 0, 14,]] yes
Super Active [[1, 0, 50,], [2, 0, 14,]] yes
Low [[1, 0, 50,], [2, 0, 14,]] no
Low [[1, 0, 5,], [2, 0, 5,]] no
Low [[1, 0, 40,], [2, 0, 10,]] no
Super Active [0] no
Super Active [0] no
Super Active [0] no
Super [0] no
Low [0] no
Expected result after value_counts:
Yes: 3
No: 4
Solution
The problem is in this line:
m=np.where((df1['scorehigh?']=='yes')) & (df1['matches'] != '[0]')
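When np.where is called with only a mask and no further arguments, it returns arrays with the positions where the mask is True, not one value per row, so m cannot be used as the else branch.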
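To make the difference concrete, here is a minimal sketch of the one-argument and three-argument forms of np.where. The array `flags` is a made-up stand-in for a column such as df1['scorehigh?'], not data from the question.

```python
import numpy as np

# Hypothetical stand-in for a 'yes'/'no' column
flags = np.array(['yes', 'no', 'yes', 'no'])
mask = flags == 'yes'

# One-argument form: a tuple of index arrays, one per dimension
positions = np.where(mask)
print(type(positions))
print(positions[0])

# Three-argument form: one output element per input element
labels = np.where(mask, 'YES', 'NO')
print(labels)
```

Only the three-argument form has the same length as the DataFrame, which is what a column assignment needs.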
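# Assign the result back rather than calling fillna in place on a selected column
df1['matches'] = df1['matches'].fillna('[0]')
df1['scorehigh?'] = df1['league'].apply(lambda a: 'yes' if a == 'Active' or a == 'Super Active' else 'no')
# 'yes' when any number found in the matches string is greater than 10
df1['greaterthan10?'] = (['yes' if any(int(a) > 10 for a in i) else 'no'
                          for i in df1['matches'].str.findall(r'\d+')])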
Use a nested numpy.where that returns None when nothing matches, and use only the second mask df1['matches'] != '[0]' as the inner condition:
df1['Finals?'] = np.where((df1['scorehigh?']=='yes')&(df1['greaterthan10?'] == 'yes'), 'YES',
np.where(df1['matches'] != '[0]', 'NO', None))
Or use numpy.select:
df1['Finals?'] = np.select([(df1['scorehigh?']=='yes')& (df1['greaterthan10?'] == 'yes'),
df1['matches'] != '[0]'], ['YES', 'NO'], default=None)
print (df1)
league matches scorehigh? greaterthan10? Finals?
0 Active [[1, 0, 50,], [2, 0, 14,]] yes yes YES
1 Active [[1, 0, 0,], [2, 0, 4,]] yes no NO
2 Active [[1, 0, 50,], [2, 0, 14,]] yes yes YES
3 Super Active [[1, 0, 50,], [2, 0, 14,]] yes yes YES
4 Low [[1, 0, 50,], [2, 0, 14,]] no yes NO
5 Low [[1, 0, 5,], [2, 0, 5,]] no no NO
6 Low [[1, 0, 40,], [2, 0, 10,]] no yes NO
7 Super Active [0] yes no None
8 Super Active [0] yes no None
9 Super Active [0] yes no None
10 Super [0] no no None
11 Low [0] no no None
a=df1['Finals?'].value_counts()
print(a)
NO 4
YES 3
Name: Finals?, dtype: int64
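Since one.txt is not available here, the following is a self-contained sketch of the same two approaches on a small made-up sample. The four rows and the variable names `high`, `over10` and `has_matches` are assumptions chosen for illustration, and boolean masks replace the 'yes'/'no' helper columns.

```python
import numpy as np
import pandas as pd

# Small hypothetical sample shaped like the question's input
df = pd.DataFrame({
    'league': ['Active', 'Active', 'Low', 'Super Active'],
    'matches': ['[[1, 0, 50,], [2, 0, 14,]]',
                '[[1, 0, 0,], [2, 0, 4,]]',
                '[[1, 0, 40,], [2, 0, 10,]]',
                None],
})
df['matches'] = df['matches'].fillna('[0]')

high = df['league'].isin(['Active', 'Super Active'])
# True when any number found in the string is greater than 10
over10 = df['matches'].str.findall(r'\d+').apply(
    lambda nums: any(int(n) > 10 for n in nums))
has_matches = df['matches'] != '[0]'

# Nested np.where: the inner call supplies the else branch of the outer one
nested = np.where(high & over10, 'YES',
                  np.where(has_matches, 'NO', None))

# np.select: conditions are checked in order, the first match wins
selected = np.select([high & over10, has_matches], ['YES', 'NO'], default=None)

df['Finals?'] = nested
print(df['Finals?'].value_counts())
```

Both calls should give the same labels on this sample. value_counts skips the None rows by default, which is why rows without matches do not appear in the counts.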
If the second condition also requires scorehigh? to be 'yes', the output is different:
df1['Finals?'] = np.select([(df1['scorehigh?']=='yes')& (df1['greaterthan10?'] == 'yes'),
(df1['scorehigh?']=='yes') & (df1['matches'] != '[0]')],
['YES', 'NO'], default=None)
print (df1)
league matches scorehigh? greaterthan10? Finals?
0 Active [[1, 0, 50,], [2, 0, 14,]] yes yes YES
1 Active [[1, 0, 0,], [2, 0, 4,]] yes no NO
2 Active [[1, 0, 50,], [2, 0, 14,]] yes yes YES
3 Super Active [[1, 0, 50,], [2, 0, 14,]] yes yes YES
4 Low [[1, 0, 50,], [2, 0, 14,]] no yes None
5 Low [[1, 0, 5,], [2, 0, 5,]] no no None
6 Low [[1, 0, 40,], [2, 0, 10,]] no yes None
7 Super Active [0] yes no None
8 Super Active [0] yes no None
9 Super Active [0] yes no None
10 Super [0] no no None
11 Low [0] no no None
a=df1['Finals?'].value_counts()
print(a)
YES 3
NO 1
Name: Finals?, dtype: int64
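The effect of tightening the second condition can be seen in isolation with plain boolean arrays. This is a sketch on four hypothetical rows; the flag values are invented and only mirror the meaning of the helper columns.

```python
import numpy as np

# Hypothetical per-row flags
high        = np.array([True, True, False, True])   # scorehigh? == 'yes'
over10      = np.array([True, False, True, False])  # greaterthan10? == 'yes'
has_matches = np.array([True, True, True, False])   # matches != '[0]'

# Second condition uses only the matches mask
loose = np.select([high & over10, has_matches], ['YES', 'NO'], default=None)

# Second condition also requires a high score
strict = np.select([high & over10, high & has_matches], ['YES', 'NO'], default=None)

print(loose)
print(strict)
```

The third row has matches but not a high score, so it is labelled NO in the first variant and falls through to the default None in the second.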