首页 > 解决方案 > 从一个日期列创建开始和结束日期功能

问题描述

我正在尝试从一个日期列中创建一项新功能以提供开始和结束日期。

这是它的样子:

unique_id = ["001", "001", "001",
             "002",
             "003", "003"
            ]
end_dates = ["2018-10-31 12:43:03 PM", "2018-10-31 12:44:23 PM", "2018-10-31 1:01:42 PM",
             "2018-11-23 03:33:13 PM",
             "2018-11-23 04:10:45 PM", "2018-11-23 04:13:58 PM"
             
            ]
activity_class = ["step 1", "step 2", "step 3",
                  "step 1",
                  "step 1", "step 2"
                 ]

df = \
pd.DataFrame({"ID": unique_id,
              "Edit Date": end_dates,
              "Activity": activity_class
             })

df["Edit Date"] = pd.to_datetime(df["Edit Date"])

这是我希望它的样子:

unique_id = ["001", "001", "001",
             "002",
             "003", "003"
            ]


start_date = ["2018-10-31 12:43:03 PM", "2018-10-31 12:43:03 PM", "2018-10-31 12:44:23 PM",
              "2018-11-23 03:33:13 PM",
              "2018-11-23 04:10:45 PM", "2018-11-23 04:10:45 PM"
             ]

end_date = ["2018-10-31 12:43:03 PM", "2018-10-31 12:44:23 PM", "2018-10-31 1:01:42 PM",
             "2018-11-23 03:33:13 PM",
             "2018-11-23 04:10:45 PM", "2018-11-23 04:13:58 PM"
           ]

activity_class = ["step 1", "step 2", "step 3",
                  "step 1",
                  "step 1", "step 2"
                 ]

df = \
pd.DataFrame({"ID": unique_id,
              "Start_Date": start_date,
              "End_Date": end_date,
              "Activity": activity_class
             })

df["Start_Date"] = pd.to_datetime(df["Start_Date"])
df["End_Date"] = pd.to_datetime(df["End_Date"])

到目前为止我尝试了什么:

df["Start_Date"] = df["Edit Date"].shift(1).backfill()

一些规则:

标签: pythonpython-3.xpandas

解决方案


你在找groupby().shift()吗?

df['Start_Date'] = df['Edit Date']
df['End_Date'] = df.groupby('ID')['Edit Date'].shift().fillna(df['Edit Date'])

输出:

    ID           Edit Date Activity          Start_Date            End_Date
0  001 2018-10-31 12:43:03   step 1 2018-10-31 12:43:03 2018-10-31 12:43:03
1  001 2018-10-31 12:44:23   step 2 2018-10-31 12:44:23 2018-10-31 12:43:03
2  001 2018-10-31 13:01:42   step 3 2018-10-31 13:01:42 2018-10-31 12:44:23
3  002 2018-11-23 15:33:13   step 1 2018-11-23 15:33:13 2018-11-23 15:33:13
4  003 2018-11-23 16:10:45   step 1 2018-11-23 16:10:45 2018-11-23 16:10:45
5  003 2018-11-23 16:13:58   step 2 2018-11-23 16:13:58 2018-11-23 16:10:45

推荐阅读