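python - How to extract the values whose date matches the date in a specific column (in Python)?
Problem description
Consider the DataFrame built from the dictionary below. Two of its columns are 'datetime' and 'date_at_which_value_is_needed'. I want to create a new column that holds, as a list/Series, the values of the 'datetime' column whose date is the same as the value in the 'date_at_which_value_is_needed' column. Is there a way to do this without a loop?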
{'datetime': {667: Timestamp('2019-11-08 10:00:00+0000', tz='UTC'),
673: Timestamp('2019-11-08 16:00:00+0000', tz='UTC'),
679: Timestamp('2019-11-08 22:00:00+0000', tz='UTC'),
685: Timestamp('2019-11-09 04:00:00+0000', tz='UTC'),
691: Timestamp('2019-11-11 10:00:00+0000', tz='UTC'),
697: Timestamp('2019-11-11 16:00:00+0000', tz='UTC'),
703: Timestamp('2019-11-11 22:00:00+0000', tz='UTC'),
709: Timestamp('2019-11-12 04:00:00+0000', tz='UTC'),
715: Timestamp('2019-11-12 10:00:00+0000', tz='UTC'),
721: Timestamp('2019-11-12 16:00:00+0000', tz='UTC'),
727: Timestamp('2019-11-12 22:00:00+0000', tz='UTC'),
733: Timestamp('2019-11-13 04:00:00+0000', tz='UTC'),
739: Timestamp('2019-11-13 10:00:00+0000', tz='UTC'),
745: Timestamp('2019-11-13 16:00:00+0000', tz='UTC'),
751: Timestamp('2019-11-13 22:00:00+0000', tz='UTC'),
757: Timestamp('2019-11-14 04:00:00+0000', tz='UTC'),
763: Timestamp('2019-11-14 10:00:00+0000', tz='UTC'),
769: Timestamp('2019-11-14 16:00:00+0000', tz='UTC'),
775: Timestamp('2019-11-14 22:00:00+0000', tz='UTC'),
780: Timestamp('2019-11-15 04:00:00+0000', tz='UTC')},
'date_at_which_value_is_needed': {667: Timestamp('2019-11-05 00:00:00+0000', tz='UTC'),
673: Timestamp('2019-11-05 00:00:00+0000', tz='UTC'),
679: Timestamp('2019-11-05 00:00:00+0000', tz='UTC'),
685: Timestamp('2019-11-06 00:00:00+0000', tz='UTC'),
691: Timestamp('2019-11-06 00:00:00+0000', tz='UTC'),
697: Timestamp('2019-11-06 00:00:00+0000', tz='UTC'),
703: Timestamp('2019-11-06 00:00:00+0000', tz='UTC'),
709: Timestamp('2019-11-07 00:00:00+0000', tz='UTC'),
715: Timestamp('2019-11-07 00:00:00+0000', tz='UTC'),
721: Timestamp('2019-11-07 00:00:00+0000', tz='UTC'),
727: Timestamp('2019-11-07 00:00:00+0000', tz='UTC'),
733: Timestamp('2019-11-08 00:00:00+0000', tz='UTC'),
739: Timestamp('2019-11-08 00:00:00+0000', tz='UTC'),
745: Timestamp('2019-11-08 00:00:00+0000', tz='UTC'),
751: Timestamp('2019-11-08 00:00:00+0000', tz='UTC'),
757: Timestamp('2019-11-11 00:00:00+0000', tz='UTC'),
763: Timestamp('2019-11-11 00:00:00+0000', tz='UTC'),
769: Timestamp('2019-11-11 00:00:00+0000', tz='UTC'),
775: Timestamp('2019-11-11 00:00:00+0000', tz='UTC'),
780: Timestamp('2019-11-12 00:00:00+0000', tz='UTC')},
'c': {667: 64.6475,
673: 65.005,
679: 65.0075,
685: 65.0075,
691: 65.0225,
697: 65.5875,
703: 65.6,
709: 65.5625,
715: 65.355,
721: 65.475,
727: 65.425,
733: 65.0375,
739: 65.9017,
745: 66.1875,
751: 66.15,
757: 66.075,
763: 65.695,
769: 65.625,
775: 65.66,
780: 65.9525}}
For example, for the last row (index 780), the new column would contain the list:
[Timestamp('2019-11-12 04:00:00+0000', tz='UTC'), Timestamp('2019-11-12 10:00:00+0000', tz='UTC'), Timestamp('2019-11-12 16:00:00+0000', tz='UTC'), Timestamp('2019-11-12 22:00:00+0000', tz='UTC')]
Solution
Try this:
import pandas as pd
from pandas import Timestamp
data = {'datetime': {667: Timestamp('2019-11-08 10:00:00+0000', tz='UTC'),
673: Timestamp('2019-11-08 16:00:00+0000', tz='UTC'),
679: Timestamp('2019-11-08 22:00:00+0000', tz='UTC'),
685: Timestamp('2019-11-09 04:00:00+0000', tz='UTC'),
691: Timestamp('2019-11-11 10:00:00+0000', tz='UTC'),
697: Timestamp('2019-11-11 16:00:00+0000', tz='UTC'),
703: Timestamp('2019-11-11 22:00:00+0000', tz='UTC'),
709: Timestamp('2019-11-12 04:00:00+0000', tz='UTC'),
715: Timestamp('2019-11-12 10:00:00+0000', tz='UTC'),
721: Timestamp('2019-11-12 16:00:00+0000', tz='UTC'),
727: Timestamp('2019-11-12 22:00:00+0000', tz='UTC'),
733: Timestamp('2019-11-13 04:00:00+0000', tz='UTC'),
739: Timestamp('2019-11-13 10:00:00+0000', tz='UTC'),
745: Timestamp('2019-11-13 16:00:00+0000', tz='UTC'),
751: Timestamp('2019-11-13 22:00:00+0000', tz='UTC'),
757: Timestamp('2019-11-14 04:00:00+0000', tz='UTC'),
763: Timestamp('2019-11-14 10:00:00+0000', tz='UTC'),
769: Timestamp('2019-11-14 16:00:00+0000', tz='UTC'),
775: Timestamp('2019-11-14 22:00:00+0000', tz='UTC'),
780: Timestamp('2019-11-15 04:00:00+0000', tz='UTC')},
'date_at_which_value_is_needed': {667: Timestamp('2019-11-05 00:00:00+0000', tz='UTC'),
673: Timestamp('2019-11-05 00:00:00+0000', tz='UTC'),
679: Timestamp('2019-11-05 00:00:00+0000', tz='UTC'),
685: Timestamp('2019-11-06 00:00:00+0000', tz='UTC'),
691: Timestamp('2019-11-06 00:00:00+0000', tz='UTC'),
697: Timestamp('2019-11-06 00:00:00+0000', tz='UTC'),
703: Timestamp('2019-11-06 00:00:00+0000', tz='UTC'),
709: Timestamp('2019-11-07 00:00:00+0000', tz='UTC'),
715: Timestamp('2019-11-07 00:00:00+0000', tz='UTC'),
721: Timestamp('2019-11-07 00:00:00+0000', tz='UTC'),
727: Timestamp('2019-11-07 00:00:00+0000', tz='UTC'),
733: Timestamp('2019-11-08 00:00:00+0000', tz='UTC'),
739: Timestamp('2019-11-08 00:00:00+0000', tz='UTC'),
745: Timestamp('2019-11-08 00:00:00+0000', tz='UTC'),
751: Timestamp('2019-11-08 00:00:00+0000', tz='UTC'),
757: Timestamp('2019-11-11 00:00:00+0000', tz='UTC'),
763: Timestamp('2019-11-11 00:00:00+0000', tz='UTC'),
769: Timestamp('2019-11-11 00:00:00+0000', tz='UTC'),
775: Timestamp('2019-11-11 00:00:00+0000', tz='UTC'),
780: Timestamp('2019-11-12 00:00:00+0000', tz='UTC')},
'c': {667: 64.6475,
673: 65.005,
679: 65.0075,
685: 65.0075,
691: 65.0225,
697: 65.5875,
703: 65.6,
709: 65.5625,
715: 65.355,
721: 65.475,
727: 65.425,
733: 65.0375,
739: 65.9017,
745: 66.1875,
751: 66.15,
757: 66.075,
763: 65.695,
769: 65.625,
775: 65.66,
780: 65.9525}}
# Converting the dictionaries into a dataframe
datesDf = pd.DataFrame.from_dict(data)
# Selecting the date part of the datetime column
datesDf['date'] = datesDf['datetime'].apply(lambda x: x.date())
datesDf['date_needed'] = datesDf['date_at_which_value_is_needed'].apply(lambda x: x.date())
# Creating a new dataframe grouping dates by datetime
datesGrouped = datesDf.groupby('date')['datetime'].apply(list).to_frame()
# Joining original dataframe with new one after the grouping
result = datesDf.merge(datesGrouped, how='left', left_on='date_needed', right_on='date')
# Formatting the result: drop the helper columns and restore readable column names
result = result.drop(['date', 'date_needed'], axis=1).rename(columns={"datetime_x": "datetime", "datetime_y": "datetime_col"})
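The same lookup can probably also be written without a merge, by mapping each needed date onto a Series of per-day lists. The following is only a minimal sketch under some assumptions: it uses a small hypothetical four-row frame with the question's column names rather than the full data, and the names `by_day` and `datetime_col` are illustrative. Dates that have no matching timestamps end up as NaN.

```python
import pandas as pd

# Small hypothetical frame that reuses the column names from the question
df = pd.DataFrame(
    {
        "datetime": pd.to_datetime(
            ["2019-11-11 10:00", "2019-11-11 16:00",
             "2019-11-12 04:00", "2019-11-12 10:00"],
            utc=True,
        ),
        "date_at_which_value_is_needed": pd.to_datetime(
            ["2019-11-05", "2019-11-11", "2019-11-11", "2019-11-12"],
            utc=True,
        ),
    },
    index=[691, 697, 709, 715],
)

# One list of timestamps per calendar day, indexed by datetime.date objects
by_day = df.groupby(df["datetime"].dt.date)["datetime"].apply(list)

# Look up the needed date of every row; days without data become NaN
df["datetime_col"] = df["date_at_which_value_is_needed"].dt.date.map(by_day)

print(df["datetime_col"])
```

Because the new column is assigned directly to `df`, the original index labels (691, 697, ...) stay in place and no temporary helper columns need to be dropped afterwards.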