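python - Incorrect number of records within groups when using df.groupby
Problem description
I am using a modified version of code I found to split rows into 5-second groups based on their timestamp.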
df = pd.read_csv(file_name, delimiter=',')
df['dt'] = pd.to_datetime(df['datetime'], unit='s')
for g in df.groupby(pd.Grouper(freq='5s', key='dt')):
    print(f'Start time {g[0]} has {len(g)} records within 5 secs')
But the number of records I get for each group is incorrect.
Output
Start time 2017-05-02 16:00:45 has 2 records within 5 secs
...
The sample CSV looks like this:
datetime,x,y,z,label
1493740845,0.0004,-0.0001,0.0045,bad
1493740846,0.0004,0.0006,0.0049,bad
1493740847,0.0002,0.0013,0.0044,bad
1493740848,0.0002,0.0005,0.0046,bad
1493740849,0.0006,0.0006,0.0038,bad
1493740850,0.0009,0.0002,0.0038,bad
...
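Solution
Iterating over a groupby object yields (key, group) pairs, so g is a tuple of 2 values and len(g) is always 2.
I think you can unpack the tuple into the variables name and g and then work with them as you need: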
for name, g in df.groupby(pd.Grouper(freq='5s', key='dt')):
    print(f'Start time {name} has {len(g)} records within 5 secs')
Start time 2017-05-02 16:00:45 has 5 records within 5 secs
Start time 2017-05-02 16:00:50 has 1 records within 5 secs
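To check this on the sample rows from the question, here is a minimal self-contained sketch. It inlines the CSV with io.StringIO instead of reading file_name (an assumption made only so the snippet runs on its own), and the variable names items, first, tuple_len and group_sizes are illustrative.

```python
import io
import pandas as pd

# Sample rows copied from the question, inlined so the sketch runs standalone.
csv_text = """datetime,x,y,z,label
1493740845,0.0004,-0.0001,0.0045,bad
1493740846,0.0004,0.0006,0.0049,bad
1493740847,0.0002,0.0013,0.0044,bad
1493740848,0.0002,0.0005,0.0046,bad
1493740849,0.0006,0.0006,0.0038,bad
1493740850,0.0009,0.0002,0.0038,bad
"""

df = pd.read_csv(io.StringIO(csv_text))
df['dt'] = pd.to_datetime(df['datetime'], unit='s')

# Materialise the iteration so the items can be inspected.
items = list(df.groupby(pd.Grouper(freq='5s', key='dt')))

first = items[0]
tuple_len = len(first)          # length of the (key, group) pair itself
group_sizes = [len(g) for _, g in items]  # rows in each group's DataFrame

print(type(first).__name__, tuple_len)
print(group_sizes)
```

Here tuple_len measures the pair, which is what the original loop printed, while group_sizes measures the rows in each group.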
Alternatively, keep your own loop and take the length of g[1]:
for g in df.groupby(pd.Grouper(freq='5s', key='dt')):
    print(f'Start time {g[0]} has {len(g[1])} records within 5 secs')
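If only the per-group counts are needed, the loop may not be necessary at all. The sketch below is not part of the original answer: it uses GroupBy.size() on the same grouping, again with the question's sample rows inlined via io.StringIO as an assumption so it runs on its own. The name counts is illustrative.

```python
import io
import pandas as pd

# Sample rows copied from the question, inlined so the sketch runs standalone.
csv_text = """datetime,x,y,z,label
1493740845,0.0004,-0.0001,0.0045,bad
1493740846,0.0004,0.0006,0.0049,bad
1493740847,0.0002,0.0013,0.0044,bad
1493740848,0.0002,0.0005,0.0046,bad
1493740849,0.0006,0.0006,0.0038,bad
1493740850,0.0009,0.0002,0.0038,bad
"""

df = pd.read_csv(io.StringIO(csv_text))
df['dt'] = pd.to_datetime(df['datetime'], unit='s')

# size() returns a Series indexed by each 5-second bin's start time,
# whose values are the number of rows in that bin.
counts = df.groupby(pd.Grouper(freq='5s', key='dt')).size()

for start, n in counts.items():
    print(f'Start time {start} has {n} records within 5 secs')
```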