
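Problem description

I want to derive the meals served ("breakfast", "lunch", "dinner") from a column of opening times (for example 9:30am – 12noon, 3pm – 12midnight). Here is a sample of the DataFrame column: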

0                                10am – 1am
1                           12noon – 1:30am 
2                              9:30am – 1am
3         12noon – 3:30pm, 7pm – 12midnight
4        11am – 3:30pm, 6:30pm – 12midnight
                       ...                 
170                           11:40am – 4am
171                            7pm – 1:30am
172                            12noon – 1am
173                            6pm – 3:30am
174                              9am – 10pm

I want to replace each time range with the corresponding meals. For example, 11am – 3:30pm should become ["breakfast", "lunch"], 9am – 10pm should become ["breakfast", "lunch", "dinner"], and so on.

Tags: python, pandas, dataframe, data-cleaning

Solution

My solution:

import re

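def parse_time(t):
    # Convert e.g. "9:30am", "12noon" or "12midnight" to (hours, minutes) on a 24-hour clock
    t = t.strip().lower()
    hours = int(re.findall('^[0-9]+', t)[0])
    m = re.findall(':([0-9]+)', t)
    minutes = int(m[0]) if m else 0
    if hours == 12:
        # 12am and 12midnight are hour 0; 12pm and 12noon stay at hour 12
        if re.search('am|midnight', t):
            hours = 0
    elif 'pm' in t:
        hours += 12
    return (hours, minutes)

def get_parts(s):
    x = re.split('–|-', s)
    start_hours, start_minutes = parse_time(x[0])
    end_hours, end_minutes = parse_time(x[1])
    # Work in minutes since midnight so that half hours are compared correctly
    start = start_hours * 60 + start_minutes
    end = end_hours * 60 + end_minutes
    if end <= start:
        # The range runs past midnight (e.g. "7pm – 1:30am"), so it ends on the next day
        end += 24 * 60
    parts = []
    if start < 11 * 60: # or whenever you think breakfast ends
        parts.append("breakfast")
    if start < 15 * 60 and end > 12 * 60: # range overlaps lunch, taken here as 12:00 to 15:00
        parts.append("lunch")
    if end > 18 * 60: # range is still open after 18:00
        parts.append("dinner")
    return parts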

def get_all_parts(data):
    x = [set(get_parts(s)) for s in data.split(",")]
    return set.union(*x)

print(get_all_parts("10am-3:30pm"))
print(get_all_parts("11am - 3:30pm, 6:30pm - 12midnight"))
print(get_all_parts("10am - 11am, 5pm-7pm"))
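
The question asks for a list such as ["breakfast", "lunch"] for every row, whereas get_all_parts returns an unordered set for a single string. Below is a minimal, self-contained sketch of one way to get ordered lists for a whole column. The names MEAL_WINDOWS, to_minutes and meals_for are hypothetical, and the meal boundaries (breakfast 6:00 to 11:00, lunch 12:00 to 15:00, dinner 18:00 to 23:00) are my assumptions, not something given in the question.

```python
import re

# Meal windows in minutes since midnight. These boundaries are assumptions
# chosen for illustration; adjust them to your own definition of each meal.
MEAL_WINDOWS = [
    ("breakfast", 6 * 60, 11 * 60),
    ("lunch", 12 * 60, 15 * 60),
    ("dinner", 18 * 60, 23 * 60),
]

def to_minutes(text):
    """Convert '9:30am', '12noon' or '12midnight' to minutes since midnight."""
    match = re.match(r"(\d+)(?::(\d+))?\s*(am|pm|noon|midnight)", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised time: {text!r}")
    hours = int(match.group(1)) % 12  # a leading 12 becomes 0 before the pm/noon offset
    if match.group(3) in ("pm", "noon"):
        hours += 12
    return hours * 60 + int(match.group(2) or 0)

def meals_for(cell):
    """Return the meals covered by a cell such as '12noon – 3:30pm, 7pm – 12midnight'."""
    found = set()
    for time_range in cell.split(","):
        start_text, end_text = re.split(r"[–-]", time_range)
        start, end = to_minutes(start_text), to_minutes(end_text)
        if end <= start:
            end += 24 * 60  # the range runs past midnight
        for name, window_start, window_end in MEAL_WINDOWS:
            # Two intervals overlap when each one starts before the other ends.
            if start < window_end and end > window_start:
                found.add(name)
    # Report the meals in breakfast, lunch, dinner order.
    return [name for name, _, _ in MEAL_WINDOWS if name in found]

# Sample values taken from the column shown in the question.
column = ["10am – 1am", "12noon – 3:30pm, 7pm – 12midnight", "7pm – 1:30am", "9am – 10pm"]
result = [meals_for(cell) for cell in column]
print(result)
```

With these assumed windows the four sample rows give all three meals, ["lunch", "dinner"], ["dinner"], and all three meals again. For a pandas column the same function could be passed to Series.apply, for example df["timings"].apply(meals_for), where the column name "timings" is a placeholder. Note that the windows are only checked for the day on which a range starts, so a range that stays open until the next morning is not counted as serving breakfast.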
