python - Change a column's values to NaN based on the values of another column, inside a loop
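Question
I have a large number of columns with the suffix "mean" or "sum". Sometimes the value in a "mean" column is NaN. When that happens, I want the value in the matching "sum" column to become NaN as well. I have a lot of variables, so I need (?) to use a loop. I created a fake dataframe and added the 3 things I tried, based on similar posts on SO. Unfortunately, none of them worked.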
import numpy as np
import pandas as pd

original_data_set = pd.DataFrame(
    {
        'customerId': [1, 2],
        'usage_1_sum': [100, 200],
        'usage_1_mean': [np.nan, 100],
        'usage_2_sum': [420, 330],
        'usage_2_mean': [45, np.nan],
    }
)
print('original dataset')
original_data_set
desired_data_set = pd.DataFrame(
    {
        'customerId': [1, 2],
        'usage_1_sum': [np.nan, 200],
        'usage_1_mean': [np.nan, 100],
        'usage_2_sum': [420, np.nan],
        'usage_2_mean': [45, np.nan],
    }
)
print('desired dataset')
desired_data_set
holder_set = original_data_set.copy()
for number in range(1, 3):
    holder_set['usage_{}_sum'.format(number)] = (
        holder_set['usage_{}_sum'.format(number)]
        .where(holder_set['usage_{}_mean'.format(number)] == np.nan, np.nan)
    )
print('using an np.where statement changed all sum variables into NaN with no discretion')
holder_set
holder_set = original_data_set.copy()
for number in range(1, 3):
    conditions = [holder_set['usage_{}_mean'.format(number)] == np.nan]
    outcome = [np.nan]
    holder_set['usage_{}_sum'.format(number)] = np.select(
        conditions, outcome, default=holder_set['usage_{}_sum'.format(number)]
    )
print('using an np.select did not have any effect on the dataframe')
holder_set
holder_set = original_data_set.copy()
for number in range(1, 3):
    holder_set.loc[
        holder_set['usage_{}_mean'.format(number)] == np.nan,
        'usage_{}_sum'.format(number),
    ] = 12
print('using a loc did not have any effect on the dataframe')
holder_set
Solution
Assume the original dataframe is df:
df = pd.DataFrame({'customerId': [1, 2], 'usage_1_sum': [100, 200], 'usage_1_mean': [
np.nan, 100], 'usage_2_sum': [420, 330], 'usage_2_mean': [45, np.nan]})
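Use Series.str.endswith (available on df.columns through the .str accessor) to select the columns that end with _mean. Then, for each of those columns, use DataFrame.loc to set the corresponding _sum column to NaN in the rows where the _mean column is NaN: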
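for col in df.columns[df.columns.str.endswith('_mean')]:
    # Slice off the '_mean' suffix; rstrip('_mean') strips a set of characters, not a suffix
    sum_col = col[:-len('_mean')] + '_sum'
    df.loc[df[col].isna(), sum_col] = np.nan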
Result:
# print(df)
   customerId  usage_1_sum  usage_1_mean  usage_2_sum  usage_2_mean
0           1          NaN           NaN        420.0          45.0
1           2        200.0         100.0          NaN           NaN
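As a side note on why the three attempts in the question had no useful effect: NaN never compares equal to anything, including itself, so a mask built with == np.nan is False on every row. The sketch below illustrates that and reruns the question's loop with isna() as the condition instead. The names eq_mask, na_mask and fixed are made up for illustration, and Series.mask is just one possible way to write the replacement, not the only one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'customerId': [1, 2],
    'usage_1_sum': [100, 200],
    'usage_1_mean': [np.nan, 100],
    'usage_2_sum': [420, 330],
    'usage_2_mean': [45, np.nan],
})

# Comparing against np.nan with == is always False, even for NaN entries.
eq_mask = df['usage_1_mean'] == np.nan
# isna() is the reliable way to detect missing values.
na_mask = df['usage_1_mean'].isna()

print(eq_mask.tolist())  # [False, False]
print(na_mask.tolist())  # [True, False]

# Same loop shape as in the question, with isna() as the condition.
# Series.mask replaces values with NaN wherever the condition is True.
fixed = df.copy()
for number in range(1, 3):
    sum_col = 'usage_{}_sum'.format(number)
    mean_col = 'usage_{}_mean'.format(number)
    fixed[sum_col] = fixed[sum_col].mask(fixed[mean_col].isna())

print(fixed)
```

Because the mask from == np.nan is all False, the np.select and loc attempts changed nothing, while Series.where (which keeps values only where the condition is True) replaced every row.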