ArcPy & Python - Get Latest TWO dates, grouped by Value

Problem description

I've looked around for the last week for an answer but have only found partial ones. Being new to Python, I could really use some assistance. I have two fields in a table, [number] and [date]. The [date] field holds a date and time, for example 07/09/2018 3:30:30 PM. The [number] field is just an integer, and several rows may share the same number.

I have tried a few options to get the LATEST date for each number, and I can get it using pandas:

import arcpy
import pandas as pd
myarray = arcpy.da.FeatureClassToNumPyArray(fc, ['number', 'date'])
mydf = pd.DataFrame(myarray)
# Boolean mask: True where a row holds the latest date for its number
date_index = mydf.groupby(['number'])['date'].transform('max') == mydf['date']

However, I need the latest TWO dates. I've moved on to trying an "if" statement, because I feel arcpy.da.UpdateCursor is better suited to looping through the records and updating another field, grouping by number and returning the rows with the latest TWO dates.

The end result I would like to see is the following table, grouped by number and showing the latest two dates (example values):

Number : Date
1       7/29/2018 4:30:44 PM
1       7/30/2018 5:55:34 PM
2       8/2/2018  5:45:23 PM
2       8/3/2018  6:34:32 PM

Tags: pandas, date, arcpy

Solution


Try this.

import pandas as pd
import numpy as np

# Some data.

data = pd.DataFrame({'number': np.random.randint(3, size = 15), 'date': pd.date_range('2018-01-01', '2018-01-15')})

# Look at the data.

data

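This gives some sample data: 15 consecutive dates in January 2018, each paired with a random number from 0 to 2. Because the numbers are random, the values differ from run to run.

In the run used for this answer, the output we want to see is the 5th and 9th for number 0, the 14th and 15th for number 1, and the 6th and 12th for number 2.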

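We then group by number, grab the last two rows of each group, set the index, and sort. This relies on the rows already being in date order, as they are in the sample data.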

# Group and label the index.

last_2 = data.groupby('number').tail(2).set_index('number').sort_index()

last_2

This gives us what we expected.
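
Since tail(2) simply takes the last two rows of each group by position, data that is not already in date order needs a sort first. The following is a minimal sketch of that variation, not part of the original answer; the sample rows are hypothetical and deliberately shuffled, with values chosen to resemble the table in the question.

```python
import pandas as pd

# Hypothetical sample rows, deliberately out of date order.
data = pd.DataFrame({
    "number": [2, 1, 2, 1, 1, 2],
    "date": pd.to_datetime([
        "2018-08-03 18:34:32",
        "2018-07-30 17:55:34",
        "2018-08-01 09:00:00",
        "2018-07-28 08:15:00",
        "2018-07-29 16:30:44",
        "2018-08-02 17:45:23",
    ]),
})

# Sort by date within each number so that tail(2) really means "latest two".
latest_two = (
    data.sort_values(["number", "date"])
        .groupby("number")
        .tail(2)
        .reset_index(drop=True)
)

print(latest_two)
```

Sorting on both columns also leaves the result ordered by number and then by date, so no separate sort_index step is needed here.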

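
The question also mentions wanting to work row by row with a cursor rather than with pandas. As a rough sketch of that idea in plain Python, the function below groups (number, date) pairs and keeps the two latest dates per number. The function name and the sample rows are hypothetical; the assumption is that rows arrive as (number, date) tuples with datetime values and no nulls, which is the kind of thing a cursor over ['number', 'date'] would be expected to yield.

```python
from collections import defaultdict
from datetime import datetime
import heapq


def latest_two_per_number(rows):
    """Map each number to its two latest dates, oldest first."""
    by_number = defaultdict(list)
    for number, date in rows:
        by_number[number].append(date)
    # nlargest picks the two most recent dates; sorted() puts them oldest first.
    return {n: sorted(heapq.nlargest(2, dates)) for n, dates in by_number.items()}


# Hypothetical rows standing in for what a cursor would yield.
rows = [
    (2, datetime(2018, 8, 3, 18, 34, 32)),
    (1, datetime(2018, 7, 30, 17, 55, 34)),
    (2, datetime(2018, 8, 1, 9, 0, 0)),
    (1, datetime(2018, 7, 28, 8, 15, 0)),
    (1, datetime(2018, 7, 29, 16, 30, 44)),
    (2, datetime(2018, 8, 2, 17, 45, 23)),
]

result = latest_two_per_number(rows)
for number in sorted(result):
    for date in result[number]:
        print(number, date)
```

The resulting dictionary could then be consulted in a second pass over the table to decide which rows to flag. A number with only one row simply gets a one-item list.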

