python - How to use np.where to create a new column using previous rows and a formula?

Problem description

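This is a continuation of, or at least related to, my original question, "How to use np.where to create a new column using previous rows?"

I am converting an Excel file to Python because Excel can no longer run it (or takes a long time to run over hundreds of thousands of rows). Here is the sample table I want to convert.

I apply this formula to column C: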

=IF(B2="new",A2,C3+1)

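Columns A and B are the inputs; column C is the output.

Column C is derived as follows: if B2 equals "new", the result is the value in column A; otherwise it is the value of C in the row below (C3) plus 1. Because each cell depends on the cell beneath it, the column is effectively evaluated from the bottom up.

I tried adding 1 to the code suggested previously, but it produces a different result. Here is the code.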

df['Previous'] = np.where(df['Status']=='new', df['Count'], np.nan)    
df['Previous'] = df['Previous'].bfill().astype(int) +1

This snapshot shows the result when I run the code.

Thank you very much.

Special mention to @SeaBean for helping me.

Tags: python, excel, pandas, numpy

You can use the following code:

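# Seed every row with the Count of the nearest 'new' row at or below it.
df['Previous'] = df['Count'].where(df['Status'] == 'new').bfill()
# Label blocks from the bottom up: each 'new' row starts a new group.
df['Group_ID'] = df['Status'][::-1].eq('new').cumsum()
# Within each group, walk upward from the 'new' row and add 1 per row.
df['Previous'] = df.groupby('Group_ID', as_index=False)['Previous'].transform(lambda x: (x[::-1] + (x[::-1] == x[::-1].shift()).cumsum()).sort_index()).astype(int)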


print(df)

    Count Status  Previous  Group_ID
0       4    old         4         3
1       3    old         3         3
2       2    old         2         3
3       1    new         1         3
4      40    old        13         2
5      30    old        12         2
6      20    old        11         2
7      10    new        10         2
8     400    old       103         1
9     300    old       102         1
10    200    old       101         1
11    100    new       100         1
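
As a possible simplification, the per-group lambda can be replaced by a row offset computed with `GroupBy.cumcount(ascending=False)`. The sketch below is an alternative, not part of the code above; it rebuilds the sample frame from the printed output, and the names `is_new`, `base`, `group_id` and `offset` are introduced here for illustration only. It assumes the last row of the data is a 'new' row; otherwise the trailing rows have nothing to back-fill from and the final `astype(int)` would fail on NaN.

```python
import pandas as pd

# Sample data copied from the printed output above.
df = pd.DataFrame({
    "Count": [4, 3, 2, 1, 40, 30, 20, 10, 400, 300, 200, 100],
    "Status": ["old", "old", "old", "new"] * 3,
})

is_new = df["Status"].eq("new")

# Base value for every row: the Count of the nearest 'new' row at or below it.
base = df["Count"].where(is_new).bfill()

# Group label counted from the bottom, one group per 'new' row;
# sort_index() restores the original top-to-bottom row order.
group_id = is_new[::-1].cumsum().sort_index()

# Number of rows between each row and the 'new' row that closes its group.
offset = df.groupby(group_id).cumcount(ascending=False)

df["Previous"] = (base + offset).astype(int)
print(df)
```

This should give the same Previous column as shown above, without creating a Group_ID column in the frame.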

Group_ID was created only as a helper for the computation, so you can drop it afterwards:

df = df.drop(columns='Group_ID')

print(df)

    Count Status  Previous
0       4    old         4
1       3    old         3
2       2    old         2
3       1    new         1
4      40    old        13
5      30    old        12
6      20    old        11
7      10    new        10
8     400    old       103
9     300    old       102
10    200    old       101
11    100    new       100
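
If you want to double-check the vectorized result against the Excel formula, one option is a plain loop that evaluates `=IF(B2="new",A2,C3+1)` literally, from the last row upward. This is only a minimal sketch for verification on small samples: `excel_style_fill` is a hypothetical helper name, and raising an error when the bottom row is not 'new' is my assumption, since the formula has no row below to refer to in that case.

```python
import pandas as pd


def excel_style_fill(count, status):
    """Evaluate C = IF(B == "new", A, C_below + 1) from the last row upward."""
    result = [None] * len(count)
    below = None  # value of column C in the row below the current one
    for i in range(len(count) - 1, -1, -1):
        if status[i] == "new":
            result[i] = count[i]
        elif below is None:
            # Assumption: a trailing non-'new' row has nothing to build on.
            raise ValueError(f"row {i} has no 'new' row below it")
        else:
            result[i] = below + 1
        below = result[i]
    return result


# Sample data copied from the printed output above.
df = pd.DataFrame({
    "Count": [4, 3, 2, 1, 40, 30, 20, 10, 400, 300, 200, 100],
    "Status": ["old", "old", "old", "new"] * 3,
})

df["Check"] = excel_style_fill(df["Count"].tolist(), df["Status"].tolist())
print(df)
```

The Check column can then be compared with Previous, for example with `df["Check"].equals(df["Previous"])` when both columns have the same integer dtype. The loop is much slower than the pandas version on hundreds of thousands of rows, so it is better suited to testing than to production use.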
