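python - Writing an if condition to filter records by date
Problem description
I have several years of data on users logging in to my system. I want to analyze the users who dropped out of the system during the last three months.
I define the following categories: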
A (All) = All Accepted(Activated) users as of August 31st
N (new) = Accepted users in last three months (Jun, July, Aug)
R (returning) = Accepted before June, but logged in last three months i.e. (Jun, July, Aug)
D (dropped) = A - N - R
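The definitions above can be sketched for a single user as follows. This is only an illustration: the helper name `classify`, the sample users, and the year 2019 (taken from the thresholds in the code further down) are assumptions, not part of the real data.

```python
from datetime import date

WINDOW_START = date(2019, 6, 1)   # first day of the three-month window (assumed year)
WINDOW_END = date(2019, 8, 31)    # "as of August 31st"

def classify(activated_on, last_login):
    """Label one accepted user as "N", "R" or "D" (hypothetical helper)."""
    if activated_on >= WINDOW_START:
        return "N"  # accepted during Jun, Jul or Aug
    if last_login is not None and WINDOW_START <= last_login <= WINDOW_END:
        return "R"  # accepted earlier, but logged in during the window
    return "D"      # accepted earlier and not seen during the window

# Made-up (activation date, last login) pairs, one per accepted user
users = [
    (date(2019, 7, 2), date(2019, 8, 25)),
    (date(2018, 11, 20), date(2019, 6, 15)),
    (date(2018, 5, 5), date(2019, 3, 1)),
]
labels = [classify(activated, logged) for activated, logged in users]

# D = A - N - R, computed from the counts
dropped = len(labels) - labels.count("N") - labels.count("R")
print(labels, dropped)
```

Computing D by subtraction and counting the "D" labels directly should agree, which makes a quick sanity check.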
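The data looks like this: https://pastebin.com/Ybu9KWqk
I want to filter the data into the N and R categories, store each in a CSV file, and get the value of D.
I wrote the following logic for this.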
import pandas as pd
from collections import Counter
from datetime import datetime

df = pd.read_csv("work_hrs.csv")

new_col = []
# Users activated on or after this date are labelled new (N)
threshold_act_date = datetime.strptime("2019-6-01", '%Y-%m-%d').date()
# Users whose last login is on or after this date are labelled N or R
threshold_log_date = datetime.strptime("2019-8-21", '%Y-%m-%d').date()
# Columns 2 and 3 hold the last-login and activation timestamps
for row in df.iloc[:, [2, 3]].values:
    try:
        last_log = datetime.strptime(row[0][:10], '%Y-%m-%d').date()
        active_in = datetime.strptime(row[1][:10], '%Y-%m-%d').date()
        if last_log >= threshold_log_date:
            if active_in >= threshold_act_date:
                new_col.append("N")
            else:
                new_col.append("R")
        else:
            new_col.append("D")
    except (TypeError, ValueError):
        # Missing or malformed dates: the user was never activated
        new_col.append("not_active")
# Store the labels so the counts below can read them
df["new_stat"] = new_col

total_came = len(df) - Counter(df["status"])["unregistered"] - Counter(df["status"])["notregister"] - Counter(df["status"])["nan"]
print("Total come to our platform so far :", total_came)
dropout = total_came - Counter(df["new_stat"])["N"] - Counter(df["new_stat"])["R"]
print("The total no. of dropouts are :", dropout)
Does the code above correctly filter the dates into the N and R categories?
Expected output:
1. D which will give count of dropouts from the system in last three months
2. CSV file which contains All new (N) users in last 3 months
3. CSV file which contains all returning (R) users in last three months
Solution
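One possible approach, offered as a minimal sketch rather than a verified answer, is to let pandas parse the dates and build one boolean mask per category. The column names (`user_id`, `status`, `last_login`, `activated_on`), the inline sample rows, and the output file names are all assumptions, since the real layout is only available through the pastebin link. The window is taken as June 1 to August 31, 2019, which follows the definitions in the question rather than the `2019-8-21` login threshold used in the question's code.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample standing in for work_hrs.csv; column names are assumed.
sample = io.StringIO(
    "user_id,status,last_login,activated_on\n"
    "1,active,2019-08-25 10:00:00,2019-07-02 09:00:00\n"
    "2,active,2019-06-15 08:30:00,2018-11-20 12:00:00\n"
    "3,active,2019-03-01 14:00:00,2018-05-05 16:00:00\n"
    "4,unregistered,,\n"
)
df = pd.read_csv(sample, parse_dates=["last_login", "activated_on"])

window_start = pd.Timestamp("2019-06-01")  # first day of the three-month window
window_end = pd.Timestamp("2019-09-01")    # exclusive bound, i.e. through August 31

# A: accepted on or before August 31 (missing dates compare as False)
accepted = df["activated_on"].notna() & (df["activated_on"] < window_end)
# N: accepted inside the window
is_new = accepted & (df["activated_on"] >= window_start)
# R: accepted before the window, last login inside it
is_returning = (
    accepted
    & (df["activated_on"] < window_start)
    & (df["last_login"] >= window_start)
    & (df["last_login"] < window_end)
)

# The first matching condition wins, so remaining accepted users become "D"
df["new_stat"] = np.select(
    [is_new, is_returning, accepted], ["N", "R", "D"], default="not_active"
)

# D = A - N - R
dropouts = int(accepted.sum()) - int(is_new.sum()) - int(is_returning.sum())
print("Dropouts:", dropouts)

# One CSV per category (file names are placeholders)
df[df["new_stat"] == "N"].to_csv("new_users.csv", index=False)
df[df["new_stat"] == "R"].to_csv("returning_users.csv", index=False)
```

Comparing against parsed timestamps avoids the string slicing and the try/except in the original loop, because rows with missing dates simply fail every comparison and fall through to `not_active`.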