pandas - ArcPy & Python - Get Latest TWO dates, grouped by Value
Problem description
I've looked around for the last week for an answer but have only found partial ones. Being new to Python, I could really use some assistance. I have two fields in a table, [number] and [date]. The [date] field holds a date and time, for example 07/09/2018 3:30:30 PM. The [number] field is just an integer, and several rows may share the same number.
I have tried a few options to get the LATEST date for each number, and I can get it using pandas:
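import arcpy
import pandas as pd
myarray = arcpy.da.FeatureClassToNumPyArray(fc, ['number', 'date'])
mydf = pd.DataFrame(myarray)
# Boolean mask: True where a row's date is the latest date for its number.
date_index = mydf.groupby(['number'])['date'].transform(max) == mydf['date']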
However, I need the latest TWO dates. I've moved on to trying an "if" statement, because I feel arcpy.da.UpdateCursor is better suited to looping through the records and updating another field: group by number, then return the rows with the latest TWO dates.
As an end result, I would like to see the following table, grouped by number, with the latest two dates (as examples):
Number : Date
1 7/29/2018 4:30:44 PM
1 7/30/2018 5:55:34 PM
2 8/2/2018 5:45:23 PM
2 8/3/2018 6:34:32 PM
Solution
Try this.
import pandas as pd
import numpy as np
# Some data.
data = pd.DataFrame({'number': np.random.randint(3, size = 15), 'date': pd.date_range('2018-01-01', '2018-01-15')})
# Look at the data.
data
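This gives some sample data. Because the numbers are drawn at random, every run produces different values.
In the run this answer is based on, we want the output to show the 5th and 9th dates for number 0, the 14th and 15th for number 1, and the 6th and 12th for number 2.
Then we group by number, take the last two rows of each group, set the index, and sort. This relies on the rows already being in ascending date order, which pd.date_range guarantees here.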
# Group and label the index.
last_2 = data.groupby('number').tail(2).set_index('number').sort_index()
last_2
This gives us what we expected.
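If the rows in your table are not already in date order, tail(2) will return the last two rows as stored rather than the two latest dates. Below is a minimal sketch of one way to handle that, sorting before grouping. The rows are hypothetical and only mimic the [number] and [date] layout from the question; the sketch also assumes the dates arrive as text, so it parses them first. If your DataFrame already holds real datetime values, the parsing step should not be needed.

```python
import pandas as pd

# Hypothetical rows in the question's [number]/[date] layout, deliberately out of order.
rows = [
    (1, "7/30/2018 5:55:34 PM"),
    (2, "8/2/2018 5:45:23 PM"),
    (1, "7/28/2018 9:15:00 AM"),
    (2, "8/3/2018 6:34:32 PM"),
    (1, "7/29/2018 4:30:44 PM"),
    (2, "8/1/2018 1:00:00 PM"),
]
df = pd.DataFrame(rows, columns=["number", "date"])

# Assumption: dates are text. Parse them so the sort is chronological, not alphabetical.
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y %I:%M:%S %p")

# Sort by number and date, then keep the last two rows of each number group.
latest_two = (
    df.sort_values(["number", "date"])
      .groupby("number")
      .tail(2)
      .reset_index(drop=True)
)
print(latest_two)
```

Each number keeps at most two rows, so a number that appears only once in the table comes back with that single row.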