python - Pandas: converting a for loop with if/else conditions into an apply call (lambda function)
Problem description
I have the following function, which uses a for loop:
def add_CQI_iterrows(df):
    previous_row = df['Date'].astype(str)[0]
    CQI_index = 0
    series = []
    for index, row in df.iterrows():
        if row['Date'] == previous_row:
            previous_row = row['Date']
            print(CQI_index)
        else:
            CQI_index += 1
            previous_row = row['Date']
        series.append(CQI_index)
    df['CQI'] = series
    return df
I would like to find a way to convert this for loop into an apply call. Something like this (which does not work):
def add_CQI_apply(df):
    previous_row = df['Date'].astype(str)[0]
    CQI_index = 1
    series = []
    df['CQI'] = df.apply(lambda row: previous_row = row['Date'] if row['Date'] == previous_row else CQI_index += 1 and previous_row = row['Date'], axis=1)
    return df
I want to make this conversion because I would like to see how fast the apply method is, and whether apply can be vectorized over a pandas Series.
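For reference, the attempt above fails with a SyntaxError because a lambda body must be a single expression, so it cannot contain assignments or `+=`. A minimal sketch of an apply-based equivalent follows, assuming a hypothetical helper class `ChangeCounter` (not part of pandas) that remembers the previous value between calls. It relies on `Series.apply` visiting the elements in row order, and it is still a Python-level loop rather than a vectorized operation.

```python
import pandas as pd


class ChangeCounter:
    """Hypothetical helper: counts how often the value changes between calls."""

    def __init__(self):
        self.previous = None
        self.index = 0

    def __call__(self, value):
        # Increment whenever the value differs from the one seen on the previous call.
        if value != self.previous:
            self.index += 1
            self.previous = value
        return self.index


def add_CQI_apply(df):
    # Series.apply calls the counter once per element, in row order.
    df['CQI'] = df['Date'].apply(ChangeCounter())
    return df


# Small sample in the same shape as data.json.
df = pd.DataFrame({
    'Date': pd.to_datetime(
        ['9/20/2020 8:50', '9/20/2020 8:50', '9/20/2020 8:57',
         '9/20/2020 9:12', '9/20/2020 9:12'],
        format='%m/%d/%Y %H:%M',
    ),
    'UE': [1, 2, 1, 1, 5],
})
df = add_CQI_apply(df)
print(df['CQI'].tolist())  # [1, 1, 2, 3, 3]
```

A fresh `ChangeCounter()` is needed for every call to `add_CQI_apply`, since the instance carries state from one row to the next.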
Here is my data (data.json):
[
  {"Date": "9/20/2020 8:50", "UE": 1},
  {"Date": "9/20/2020 8:50", "UE": 2},
  {"Date": "9/20/2020 8:50", "UE": 3},
  {"Date": "9/20/2020 8:57", "UE": 1},
  {"Date": "9/20/2020 8:57", "UE": 8},
  {"Date": "9/20/2020 8:57", "UE": 2},
  {"Date": "9/20/2020 9:12", "UE": 1},
  {"Date": "9/20/2020 9:12", "UE": 5},
  {"Date": "9/20/2020 9:12", "UE": 3},
  {"Date": "9/20/2020 9:20", "UE": 1},
  {"Date": "9/20/2020 9:20", "UE": 4},
  {"Date": "9/20/2020 9:20", "UE": 3}
]
Finally, here is the function that loads this data:
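import numpy as np
import pandas as pd

def upload_data(file):
    df = pd.read_json(file)
    # The dates in data.json look like "9/20/2020 8:50" (month/day/year hour:minute).
    df['Date'] = pd.to_datetime(df['Date'], format="%m/%d/%Y %H:%M")
    df['CQI'] = np.nan  # placeholder column, filled in later
    return df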
Solution
df['CQI'] = (df['Date'] != df['Date'].shift()).cumsum()
In [120]: (df['Date'] != df['Date'].shift()).cumsum()
Out[120]:
0     1
1     1
2     1
3     2
4     2
5     2
6     3
7     3
8     3
9     4
10    4
11    4
Name: Date, dtype: int64
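To see why this one-liner reproduces the loop, it may help to split it into its three steps. The sketch below uses a small hand-built Series instead of data.json, and the intermediate names `previous`, `changed` and `cqi` are only illustrative.

```python
import pandas as pd

dates = pd.Series(
    pd.to_datetime(
        ['9/20/2020 8:50', '9/20/2020 8:50', '9/20/2020 8:57',
         '9/20/2020 8:57', '9/20/2020 9:12'],
        format='%m/%d/%Y %H:%M',
    ),
    name='Date',
)

# shift() moves every value down one row, so each row lines up with its predecessor.
previous = dates.shift()

# True wherever the value differs from the previous row. The first row is compared
# against NaT, which never compares equal, so it is True as well.
changed = dates != previous

# cumsum() treats True as 1 and False as 0, giving a running count of changes.
cqi = changed.cumsum()

print(changed.tolist())  # [True, False, True, False, True]
print(cqi.tolist())      # [1, 1, 2, 2, 3]
```

Because the first row always counts as a change, the numbering starts at 1, matching the output above. All three steps operate on whole columns at once, which is what makes this approach vectorized, unlike a row-by-row `iterrows` or `apply`.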