python - Replace values based on conditions in 2 columns using pandas
Problem description
I have a pandas DataFrame, as shown below
df1_new = pd.DataFrame({'person_id': [1, 2, 3, 4, 5],
                        'start_date': ['07/23/2377', '05/29/2477', '02/03/2177', '7/27/2277', '7/13/2077'],
                        'start_datetime': ['07/23/2377 12:00:00', '05/29/2477 04:00:00', '02/03/2177 02:00:00', '7/27/2277 05:00:00', '7/13/2077 12:00:00'],
                        'end_date': ['07/25/2377', '06/09/2477', '02/05/2177', '01/01/2000', '01/01/2000'],
                        'end_datetime': ['07/25/2377 02:00:00', '06/09/2477 04:00:00', '02/05/2177 01:00:00', '01/01/2000 00:00:00', '01/01/2000 00:00:00'],
                        'Type': ['IP', 'IP', 'OP', 'OP', 'IP']})
What I want to do is
if ((end_date contains 2000 or end_datetime contains 2000) and (type == IP)) then
    end_date = start_date + 2 days
    end_datetime = start_datetime + 2 days
else if ((end_date contains 2000 or end_datetime contains 2000) and (type == OP)) then
    end_date = start_date
    end_datetime = start_datetime
This is what I tried, but it does not produce the correct output
df['end_date'] = df['start_date'].apply(lambda x: df['start_date'] + pd.DateOffset(days=2) if (x == 'OP' and x == '01/01/2000') else df['start_date'])
df['end_datetime'] = df['start_datetime'].apply(lambda x: df['start_datetime'] + pd.DateOffset(days=2) if (x == 'OP' and x == '01/01/2000') else df['start_datetime'])
I expect my output to look like this
Solution
Here is an example. Read the comments and I think the basic approach will be clear.
from copy import deepcopy
from datetime import datetime
import pandas as pd
from dateutil.relativedelta import relativedelta
df = pd.DataFrame.from_dict({
    'person_id': [1, 2, 3, 4, 5],
    'start_date': ['07/23/2377', '05/29/2477', '02/03/2177', '7/27/2277', '7/13/2077'],
    'start_datetime': ['07/23/2377 12:00:00', '05/29/2477 04:00:00', '02/03/2177 02:00:00', '7/27/2277 05:00:00', '7/13/2077 12:00:00'],
    'end_date': ['07/25/2377', '06/09/2477', '02/05/2177', '01/01/2000', '01/01/2000'],
    'end_datetime': ['07/25/2377 02:00:00', '06/09/2477 04:00:00', '02/05/2177 01:00:00', '01/01/2000 00:00:00', '01/01/2000 00:00:00'],
    'type': ['IP', 'IP', 'OP', 'OP', 'IP']
})
def calculate_days(x):
    # parse the date strings of this row into datetime objects
    x['end_date'] = datetime.strptime(x['end_date'], '%m/%d/%Y')
    x['start_date'] = datetime.strptime(x['start_date'], '%m/%d/%Y')
    x['end_datetime'] = datetime.strptime(x['end_datetime'], '%m/%d/%Y %H:%M:%S')
    x['start_datetime'] = datetime.strptime(x['start_datetime'], '%m/%d/%Y %H:%M:%S')
    # only rows whose end date falls in the year 2000 need to change
    if not (x['end_date'].year == 2000 or x['end_datetime'].year == 2000):
        return x
    # type conditions and calculations
    if x['type'] == 'IP':
        # IP: end = start + 2 days
        x['end_date'] = x['start_date'] + relativedelta(days=2)
        x['end_datetime'] = x['start_datetime'] + relativedelta(days=2)
    elif x['type'] == 'OP':
        # OP: end = start
        x['end_date'] = deepcopy(x['start_date'])
        x['end_datetime'] = deepcopy(x['start_datetime'])
    return x
# apply our custom function
df = df.apply(calculate_days, axis=1)
print(df.head())
# person_id start_date ... end_datetime type
# 0 1 2377-07-23 00:00:00 ... 2377-07-25 02:00:00 IP
# 1 2 2477-05-29 00:00:00 ... 2477-06-09 04:00:00 IP
# 2 3 2177-02-03 00:00:00 ... 2177-02-05 01:00:00 OP
# 3 4 2277-07-27 00:00:00 ... 2277-07-27 05:00:00 OP
# 4 5 2077-07-13 00:00:00 ... 2077-07-15 12:00:00 IP
# [5 rows x 6 columns]
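As an alternative, the same rule can be expressed with boolean masks and `.loc`, so that only the affected rows are touched. The following is a minimal sketch, not the only way to do it. It assumes the columns should stay as strings in their original format, because dates beyond the year 2262 cannot be stored in pandas' default nanosecond timestamps. The helper `shift` and the mask names are hypothetical, and matching on `'/2000'` instead of `'2000'` is an assumption made so that a time component can never match by accident.

```python
from datetime import datetime, timedelta

import pandas as pd

df = pd.DataFrame({
    'person_id': [1, 2, 3, 4, 5],
    'start_date': ['07/23/2377', '05/29/2477', '02/03/2177', '7/27/2277', '7/13/2077'],
    'start_datetime': ['07/23/2377 12:00:00', '05/29/2477 04:00:00', '02/03/2177 02:00:00',
                       '7/27/2277 05:00:00', '7/13/2077 12:00:00'],
    'end_date': ['07/25/2377', '06/09/2477', '02/05/2177', '01/01/2000', '01/01/2000'],
    'end_datetime': ['07/25/2377 02:00:00', '06/09/2477 04:00:00', '02/05/2177 01:00:00',
                     '01/01/2000 00:00:00', '01/01/2000 00:00:00'],
    'type': ['IP', 'IP', 'OP', 'OP', 'IP'],
})


def shift(value, fmt, days):
    """Hypothetical helper: parse a date string, add `days`, format it back."""
    return (datetime.strptime(value, fmt) + timedelta(days=days)).strftime(fmt)


# Rows whose end_date or end_datetime carries the placeholder year 2000.
# Assumption: the year always follows a slash, so '/2000' is a safe literal match.
placeholder = (df['end_date'].str.contains('/2000', regex=False)
               | df['end_datetime'].str.contains('/2000', regex=False))
is_ip = placeholder & (df['type'] == 'IP')
is_op = placeholder & (df['type'] == 'OP')

columns = [
    ('start_date', 'end_date', '%m/%d/%Y'),
    ('start_datetime', 'end_datetime', '%m/%d/%Y %H:%M:%S'),
]
for start_col, end_col, fmt in columns:
    # IP rows: end = start + 2 days (re-formatted as a string)
    df.loc[is_ip, end_col] = df.loc[is_ip, start_col].map(lambda v: shift(v, fmt, 2))
    # OP rows: end = start, copied unchanged
    df.loc[is_op, end_col] = df.loc[is_op, start_col]

print(df[['person_id', 'end_date', 'end_datetime', 'type']])
```

With the sample data, only the last two rows change: person 4 (OP) gets `end_date` `'7/27/2277'`, and person 5 (IP) gets `end_date` `'07/15/2077'`. Rows without the placeholder year are never parsed, so their original strings are preserved exactly.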
Hope this helps.