Get the minimum time difference in a column based on parameters

Question

Here is my reproducible data:

raw_data = {'file': [123, 342, 223, 134, 235,233], 
            'identity': [12, 12, 12, 12,14,14], 
            'line': [1, 2, 3, 4, 5,6], 
            'date': ['10/27/2013','10/27/2013', '10/27/2013', '10/27/2013', '10/20/2013','10/20/2013'],
            'time': ['13:20:00', '13:20:30', '13:21:00', '13:21:30', '15:40:00','15:40:30']}

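Now, for given parameters, say 'identity'=12, 'date'=10/27/2013 and 'time'=13:20:21, I want to create a new DataFrame that filters on the identity and date parameters and selects the row whose time has the smallest difference from the time parameter.

For example, for the parameters 'identity'=12, 'date'=10/27/2013 and 'time'=13:20:21, the answer is: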

identity  date        time     difference
12       10/27/2013  13:20:30     9

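Tags: python, pandas

Solution

Since you didn't show us your attempt, this may not match what your code looks like, but it should give you a clear idea of how to solve it.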

import pandas as pd
from datetime import datetime

df = pd.DataFrame(raw_data)

# rows matching the identity and date parameters
cond = df['identity'] == 12
cond2 = df['date'] == '10/27/2013'

# target time to compare against
td = datetime.strptime('13:20:21', '%H:%M:%S')

# series of absolute time differences for the matching rows
time_diff = abs(df.loc[cond & cond2, 'time'].apply(lambda x: datetime.strptime(x, '%H:%M:%S') - td))

# return the row with the minimum time difference
idx = time_diff.idxmin()
out = df.loc[idx].copy()

# difference in whole seconds
out['difference'] = int(time_diff[idx].total_seconds())

Out:

date          10/27/2013
file                 342
identity              12
line                   2
time            13:20:30
difference             9
Name: 1, dtype: object
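
If you need the result as a one-row DataFrame with the columns shown in the question, the same idea can be wrapped in a function. This is only a sketch: the name `closest_row` and its parameters are made up for illustration, it uses `pd.to_datetime` instead of `datetime.strptime`, and it assumes every time string follows the `%H:%M:%S` format.

```python
import pandas as pd

raw_data = {'file': [123, 342, 223, 134, 235, 233],
            'identity': [12, 12, 12, 12, 14, 14],
            'line': [1, 2, 3, 4, 5, 6],
            'date': ['10/27/2013', '10/27/2013', '10/27/2013', '10/27/2013', '10/20/2013', '10/20/2013'],
            'time': ['13:20:00', '13:20:30', '13:21:00', '13:21:30', '15:40:00', '15:40:30']}


def closest_row(df, identity, date, time):
    """Hypothetical helper: one-row DataFrame for the closest matching time."""
    # keep only the rows that match the identity and date parameters
    subset = df[(df['identity'] == identity) & (df['date'] == date)]
    if subset.empty:
        raise ValueError('no rows match the given identity and date')

    # parse the target and the candidate times (assumes HH:MM:SS strings)
    target = pd.to_datetime(time, format='%H:%M:%S')
    diffs = (pd.to_datetime(subset['time'], format='%H:%M:%S') - target).abs()

    # label of the row with the smallest absolute difference
    idx = diffs.idxmin()

    # double brackets around idx keep the result a DataFrame, not a Series
    result = subset.loc[[idx], ['identity', 'date', 'time']].copy()
    result['difference'] = int(diffs.loc[idx].total_seconds())
    return result


df = pd.DataFrame(raw_data)
result = closest_row(df, 12, '10/27/2013', '13:20:21')
print(result)
```

Note that `idxmin` returns the first label when two rows are equally close, so a tie resolves to whichever row appears first in the DataFrame.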
