Is there a pandas function to add a new column based on certain values of another column of the dataframe?

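Problem description

I am trying to create a new column in a dataframe based on the time values in another column: if the time is between 06:00:00 and 12:00:00, the value should be Morning; if it is between 12:00:00 and 15:00:00, Afternoon; and so on.

I tried a for loop with if/else statements, but my dataframe has 1549293 rows, so the loop is too slow to be practical.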

import datetime
import pandas as pd

# Boundaries between the parts of the day
times = [datetime.time(6,0,0), datetime.time(12,0,0), datetime.time(15,0,0), datetime.time(20,0,0), datetime.time(23,0,0)]

df['time']=df['start_time'].dt.time
df['day_interval']=df['time']

# Row-by-row labelling; df.at avoids the chained assignment df['day_interval'][i] = ...
for i in range(0, df.shape[0]):
    t = df['time'][i]
    if times[0] <= t < times[1]:
        df.at[i, 'day_interval'] = "Morning"
    elif times[1] <= t < times[2]:
        df.at[i, 'day_interval'] = "Afternoon"
    elif times[2] <= t < times[3]:
        df.at[i, 'day_interval'] = "Evening"
    elif times[3] <= t < times[4]:
        df.at[i, 'day_interval'] = "Night"
    elif t >= times[4]:
        df.at[i, 'day_interval'] = "Late Night"
    else:
        # Anything before the first boundary (06:00:00)
        df.at[i, 'day_interval'] = "Early Hours"

Is there any way to reduce the processing time?

Tags: python, pandas

Solution


Use pd.cut. Note that I added two extra times, 00:00:00 and 23:59:59, to your times list so that the bins cover the whole day.

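# Seven bin edges give six labelled bins; the edges are parsed the same way as s1
pd.cut(s1, bins=pd.to_datetime(pd.Series(times).astype(str), format='%H:%M:%S').tolist(), labels=['Early', 'M', 'A', 'E', 'N', 'L'])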
0    Early
1        M
Name: time, dtype: category
Categories (6, object): [Early < M < A < E < N < L]
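
To try the pd.cut call end to end, here is a minimal sketch. It assumes a start_time column of datetimes as in the question; the four sample rows are made up for illustration, and the labels are the ones from the question rather than the abbreviations used above.

```python
import datetime
import pandas as pd

# Hypothetical sample data standing in for the real 1.5M-row frame
df = pd.DataFrame({
    "start_time": pd.to_datetime([
        "2021-01-01 03:30:00",
        "2021-01-01 07:45:00",
        "2021-01-01 13:10:00",
        "2021-01-01 21:05:00",
    ])
})

# Bin edges: the question's boundaries plus 00:00:00 and 23:59:59
times = [datetime.time(0, 0, 0), datetime.time(6, 0, 0), datetime.time(12, 0, 0),
         datetime.time(15, 0, 0), datetime.time(20, 0, 0), datetime.time(23, 0, 0),
         datetime.time(23, 59, 59)]
labels = ["Early Hours", "Morning", "Afternoon", "Evening", "Night", "Late Night"]

# Parse values and edges with the same format so both land on the same dummy date
s1 = pd.to_datetime(df["start_time"].dt.strftime("%H:%M:%S"), format="%H:%M:%S")
bins = pd.to_datetime(pd.Series(times).astype(str), format="%H:%M:%S").tolist()

# One vectorized call replaces the row-by-row loop
df["day_interval"] = pd.cut(s1, bins=bins, labels=labels)
print(df)
```

The whole column is labelled in a single call, which is what removes the per-row Python overhead of the loop.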

Data setup

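# Bin edges: the original boundaries plus 00:00:00 and 23:59:59 at the ends
times = [datetime.time(0,0,0), datetime.time(6,0,0), datetime.time(12,0,0), datetime.time(15,0,0), datetime.time(20,0,0), datetime.time(23,0,0), datetime.time(23,59,59)]
# Parse the time-of-day values as datetimes so they can be compared with the bin edges
s1 = pd.to_datetime(df['time'].astype(str), format='%H:%M:%S')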
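
One detail worth checking is the bin boundaries. By default pd.cut treats bins as closed on the right, so a time of exactly 06:00:00 lands in the early bin and 00:00:00 itself falls outside the first bin, whereas the loop in the question uses >= on the lower edge. The following is a sketch of one possible way to match the loop exactly, assuming it is acceptable to bin on seconds since midnight with right=False. The helper name label_day_interval and the sample rows are hypothetical.

```python
import pandas as pd

def label_day_interval(start_time: pd.Series) -> pd.Series:
    """Label each timestamp with a part of day using half-open bins [lower, upper)."""
    # Seconds elapsed since midnight, computed for the whole column at once
    seconds = (start_time.dt.hour * 3600
               + start_time.dt.minute * 60
               + start_time.dt.second)
    # Edges at 00:00, 06:00, 12:00, 15:00, 20:00, 23:00 and 24:00, in seconds
    edges = [0, 6 * 3600, 12 * 3600, 15 * 3600, 20 * 3600, 23 * 3600, 24 * 3600]
    labels = ["Early Hours", "Morning", "Afternoon", "Evening", "Night", "Late Night"]
    # right=False makes each bin include its lower edge and exclude its upper edge
    return pd.cut(seconds, bins=edges, labels=labels, right=False)

# Hypothetical rows chosen to sit on or next to the boundaries
df = pd.DataFrame({"start_time": pd.to_datetime([
    "2021-01-01 00:00:00", "2021-01-01 06:00:00", "2021-01-01 11:59:59",
    "2021-01-01 12:00:00", "2021-01-01 23:59:59",
])})
df["day_interval"] = label_day_interval(df["start_time"])
print(df)
```

Using 24:00 (86400 seconds) as the last edge means 23:59:59 still falls inside the final bin, so no extra 23:59:59 edge is needed in this variant.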
