How do I create a derived column in a CSV file using Linux/Python?

Problem description

I have a (sample) CSV file with the following columns:

PC_name,Time,Plant,Section,PC_value
35901052,2017-08-01 05:50,MIYAKONOJO,MIYAKONOJO_05,0.000
35901052,2017-08-01 05:51,MIYAKONOJO,MIYAKONOJO_05,0.000
35901052,2017-08-01 05:56,MIYAKONOJO,MIYAKONOJO_05,0.000
35901052,2017-08-01 06:01,MIYAKONOJO,MIYAKONOJO_05,0.000
35901052,2017-08-01 06:06,MIYAKONOJO,MIYAKONOJO_05,0.000

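I want a new column (shown as "New" in the sample output) based on the "Time" column, as described below.

If the timestamp falls between 6 PM (18:00) and 6 AM (06:00), the value should be "Night"; otherwise it should be "Day".

Sample output: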

PC_name,Time,Plant,Section,PC_value,New
35901052,2017-08-01 05:50,MIYAKONOJO,MIYAKONOJO_05,0.000,Night
35901052,2017-08-01 05:51,MIYAKONOJO,MIYAKONOJO_05,0.000,Night
35901052,2017-08-01 05:56,MIYAKONOJO,MIYAKONOJO_05,0.000,Night
35901052,2017-08-01 06:01,MIYAKONOJO,MIYAKONOJO_05,0.000,Day
35901052,2017-08-01 06:06,MIYAKONOJO,MIYAKONOJO_05,0.000,Day

Tags: python, linux, csv, derived-column

Solution


You can convert the column to datetime, extract the hour, and then map each hour to a label.

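import pandas as pd  # df is assumed to be the CSV already loaded, e.g. with pd.read_csv
df["Time"] = pd.to_datetime(df["Time"])
# Night covers 18:00-05:59; range(24) maps every hour (range(23) would leave hour 23 as NaN)
df["New"] = df["Time"].dt.hour.map({hour: "Night" if hour >= 18 or hour < 6 else "Day" for hour in range(24)})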

Output:

>>> df
    PC_name                Time       Plant        Section  PC_value    New
0  35901052 2017-08-01 05:50:00  MIYAKONOJO  MIYAKONOJO_05       0.0  Night
1  35901052 2017-08-01 05:51:00  MIYAKONOJO  MIYAKONOJO_05       0.0  Night
2  35901052 2017-08-01 05:56:00  MIYAKONOJO  MIYAKONOJO_05       0.0  Night
3  35901052 2017-08-01 06:01:00  MIYAKONOJO  MIYAKONOJO_05       0.0    Day
4  35901052 2017-08-01 06:06:00  MIYAKONOJO  MIYAKONOJO_05       0.0    Day
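
Since the question starts from a CSV file and expects CSV output, here is a minimal end-to-end sketch of the same idea. It makes a few assumptions that are not in the original answer: the helper name add_day_night is hypothetical, the sample rows are read from an in-memory string rather than a real file, "night" is treated as the half-open interval from 18:00 up to (but not including) 06:00, and numpy.where is used in place of the dictionary map. Reading with dtype=str keeps values such as 0.000 and the original Time text unchanged when writing back out.

```python
import io

import numpy as np
import pandas as pd

# Sample rows copied from the question; io.StringIO stands in for a real file path.
CSV_TEXT = """PC_name,Time,Plant,Section,PC_value
35901052,2017-08-01 05:50,MIYAKONOJO,MIYAKONOJO_05,0.000
35901052,2017-08-01 05:51,MIYAKONOJO,MIYAKONOJO_05,0.000
35901052,2017-08-01 05:56,MIYAKONOJO,MIYAKONOJO_05,0.000
35901052,2017-08-01 06:01,MIYAKONOJO,MIYAKONOJO_05,0.000
35901052,2017-08-01 06:06,MIYAKONOJO,MIYAKONOJO_05,0.000
"""


def add_day_night(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a "New" column holding "Night" or "Day"."""
    out = df.copy()
    # Parse into a temporary Series so the original "Time" text is left untouched.
    hour = pd.to_datetime(out["Time"]).dt.hour
    # Assumption: night is 18:00 through 05:59, so 06:00 itself counts as day.
    is_night = (hour >= 18) | (hour < 6)
    out["New"] = np.where(is_night, "Night", "Day")
    return out


# dtype=str keeps every field exactly as written in the file (e.g. "0.000").
df = pd.read_csv(io.StringIO(CSV_TEXT), dtype=str)
result = add_day_night(df)
print(result.to_csv(index=False))
```

To run this against a real file, pass its path to pd.read_csv instead of the StringIO object and give result.to_csv an output path; both paths would be your own choice.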
