python - Return True only at the first numeric entry, and False in all other cases
Problem description
The dataframe I am working with looks like the table below:
COLUMN-A COLUMN-B COLUMN-C COLUMN-D
2005-12-23 2.78229429977895 2.59054751268432
2005-12-28 2.77990953370726 2.59625529291923
2005-12-29 2.77770141742004 2.60175855794512
2005-12-30 2.77565686568447 2.60706465870293
2006-01-03 2.78676377607689 2.61845788272621
2006-01-04 2.79415905904631 2.62804815466004
2006-01-05 2.79233986786484 2.63311058575101
2006-01-06 2.79065543181717 2.63799172343874
2006-01-09 2.7876513234596 2.64200075091549
2006-01-10 2.78342529650764 2.64516894228885
2006-01-11 2.77951230901599 2.64822370776439
2006-01-12 2.77877806345801 2.65256358425937
2006-01-13 2.78965376857357 2.66232574953289
2006-01-16 2.81417572440332 2.67871384606613
2006-01-17 2.83688123723998 2.69451541833616
2006-01-18 2.84923078073203 2.70556518000894
2006-01-19 2.854887762274 2.71343113557577
2006-01-20 2.86012570781281 2.72101563266667
2006-01-23 2.8620867671879 2.72693465617535
2006-01-24 2.85668033821582 2.72915676427006
2006-01-25 2.85311883059988 2.7319963852241
2006-01-27 2.84982113851717 2.73473442527192
2006-01-30 2.84098994077245 2.73458665290639
2006-01-31 2.83281290615161 2.73444416615124
2006-02-01 2.82235268854652 2.73291291585375
2006-02-02 2.79821544736977 2.72446373657389 2.31735945722146
2006-02-03 2.7903180053127 2.72328924609567 2.32165937425023
2006-02-06 2.78300555917914 2.72215675685381 2.32590335299919
2006-02-07 2.77912366526979 2.72245848891773 2.33053900014161
2006-02-08 2.77552931914827 2.72274943166327 2.33511466419111
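I am trying to write logic that returns True at the position of the first numeric entry in COLUMN-D, and False in all other cases.
This is the logic I wrote, which raises the error: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
Code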
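import pandas as pd

def has_trail_started(df, df_key):
    # True where the current value is present and the previous one is missing
    return (~pd.isnull(df[df_key])) & (pd.isnull(df[df_key].shift()))

# This line raises the ValueError: `and` needs a single bool, not a Series
if (has_trail_started(data, 'COLUMN-D') and data['has_changed_status']):
    pass  # Logic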
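Could I get some help correcting this?
Solution
Your function returns a Series, which cannot be interpreted as a single bool for the purposes of an if statement. However, you can add the "trail start info" to the df as a new column, like so: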
def has_trail_started(df, df_key):
    # Adds the flag as a new column on df (modifies df in place, returns nothing)
    df["has_trail_started"] = (~pd.isnull(df[df_key])) & (pd.isnull(df[df_key].shift()))

has_trail_started(data, 'COLUMN-D')
The df then looks like this:
COLUMN-A COLUMN-B COLUMN-C COLUMN-D has_trail_started
0 2005-12-23 2.782294 2.590548 NaN False
1 2005-12-28 2.779910 2.596255 NaN False
2 2005-12-29 2.777701 2.601759 NaN False
3 2005-12-30 2.775657 2.607065 NaN False
4 2006-01-03 2.786764 2.618458 NaN False
5 2006-01-04 2.794159 2.628048 NaN False
6 2006-01-05 2.792340 2.633111 NaN False
7 2006-01-06 2.790655 2.637992 NaN False
8 2006-01-09 2.787651 2.642001 NaN False
9 2006-01-10 2.783425 2.645169 NaN False
10 2006-01-11 2.779512 2.648224 NaN False
11 2006-01-12 2.778778 2.652564 NaN False
12 2006-01-13 2.789654 2.662326 NaN False
13 2006-01-16 2.814176 2.678714 NaN False
14 2006-01-17 2.836881 2.694515 NaN False
15 2006-01-18 2.849231 2.705565 NaN False
16 2006-01-19 2.854888 2.713431 NaN False
17 2006-01-20 2.860126 2.721016 NaN False
18 2006-01-23 2.862087 2.726935 NaN False
19 2006-01-24 2.856680 2.729157 NaN False
20 2006-01-25 2.853119 2.731996 NaN False
21 2006-01-27 2.849821 2.734734 NaN False
22 2006-01-30 2.840990 2.734587 NaN False
23 2006-01-31 2.832813 2.734444 NaN False
24 2006-02-01 2.822353 2.732913 NaN False
25 2006-02-02 2.798215 2.724464 2.317359 True
26 2006-02-03 2.790318 2.723289 2.321659 False
27 2006-02-06 2.783006 2.722157 2.325903 False
28 2006-02-07 2.779124 2.722458 2.330539 False
29 2006-02-08 2.775529 2.722749 2.335115 False
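One caveat worth checking: the shift-based expression marks every row where a number follows a missing value, so if COLUMN-D ever has a gap and then resumes, it would flag more than one row. The following is a minimal sketch, on hypothetical toy values rather than the data above, comparing that flag with one built from `first_valid_index()`, which marks only the very first numeric entry. The names `toy`, `transition` and `first_only` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical values): NaN, then numbers, then a gap, then a number.
toy = pd.DataFrame({"COLUMN-D": [np.nan, np.nan, 2.31, 2.32, np.nan, 2.33]})
col = toy["COLUMN-D"]

# Shift-based flag: True wherever a number follows a missing value.
transition = col.notna() & col.shift().isna()

# Strictly-first flag: True only at the first non-missing row.
# first_valid_index() returns None when the column has no valid entry;
# that case is not handled in this sketch.
first_idx = col.first_valid_index()
first_only = pd.Series(toy.index == first_idx, index=toy.index)

print(transition.tolist())
print(first_only.tolist())
```

For the table in the question the two flags agree, because COLUMN-D has no gaps once it starts.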
Now you can apply some logic based on this new bool column, like so:
data["extra_logic"] = data["has_trail_started"].apply(lambda x: "yay" if x else "boo")
This adds a new column whose value is a function of the has_trail_started flag.
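The original `if` also combined the flag with `data['has_changed_status']`. A possible way to express that without hitting the ValueError is to combine the two boolean Series element-wise with `&`, and only reduce to a single bool (for example with `.any()`) when a plain `if` is really needed. This is a sketch under assumptions: the `has_changed_status` values are invented, and `np.where` is swapped in as a vectorized alternative to the `apply` call above.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    "COLUMN-D": [np.nan, 2.31, 2.32],
    "has_changed_status": [True, True, False],  # hypothetical values
})

started = data["COLUMN-D"].notna() & data["COLUMN-D"].shift().isna()

# Element-wise AND of two boolean Series; Python's `and` would raise here.
both = started & data["has_changed_status"]

# Row-wise logic: choose a value per row based on the combined flag.
data["extra_logic"] = np.where(both, "yay", "boo")

# Scalar logic: reduce the Series to one bool before using `if`.
matching_rows = data.index[both].tolist()
if both.any():
    print("condition holds at rows:", matching_rows)
```

Which reduction to use (`.any()`, `.all()`, or indexing a single row) depends on what the `if` is supposed to mean for the whole column.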