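python - A better alternative to a for loop over a DataFrame with 10 million rows?
Question
I wrote the code below and it works correctly, but I need to reduce its running time.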
for i in range(len(df)):
    try:
        if df['event_name'][i] in ['add_basket_click','remove_basket_click'] and df['event_name'][i-1]=='product_search':
            try:
                if df['event_desc'][i]['firebase_screen_id']==df['event_desc'][i-1]['firebase_screen_id']:
                    df.at[i,'search_process']=1
            except:
                pass
    except:
        pass
Here is a sample dataset:
user_id event_name event_desc
10 product_search {'firebase_previous_id': '8996730796507124997'}
10 add_basket_click {'firebase_previous_id': '8996730796507124997'}
10 start {'firebase_previous_id': '8996730796507124997'}
10 add_basket_click {'firebase_previous_id': '8996730796507124997'}
Expected output:
user_id event_name event_desc search_process
10 product_search {'firebase_previous_id': '8996730796507124997'} 0
10 add_basket_click {'firebase_previous_id': '8996730796507124997'} 1
10 start {'firebase_previous_id': '8996730796507124997'} 0
10 add_basket_click {'firebase_previous_id': '8996730796507124997'} 0
Answer
I believe you need to test the key firebase_previous_id, not firebase_screen_id, in the dictionaries stored in the event_desc column:
# True where the previous row is a product_search event
m1 = df['event_name'].shift() == 'product_search'
# True where the current row is a basket click
m2 = df['event_name'].isin(['add_basket_click', 'remove_basket_click'])
# Different defaults in s1 and s2, so two rows that both lack the key never compare equal
s1 = df['event_desc'].apply(lambda x: x.get('firebase_previous_id', 'not_m'))
s2 = df['event_desc'].apply(lambda x: x.get('firebase_previous_id', 'not_matched'))
m3 = s1 == s2.shift()
df['search_process'] = (m1 & m2 & m3).astype(int)
print(df)
user_id event_name event_desc \
0 10 product_search {'firebase_previous_id': '8996730796507124997'}
1 10 add_basket_click {'firebase_previous_id': '8996730796507124997'}
2 10 start {'firebase_previous_id': '8996730796507124997'}
3 10 add_basket_click {'firebase_previous_id': '8996730796507124997'}
search_process
0 0
1 1
2 0
3 0
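For anyone who wants to try the approach end to end, here is a minimal self-contained sketch that rebuilds the four sample rows and applies the same three masks. The helper name `flag_vectorized` and its `key` parameter are hypothetical, introduced only for illustration. It also varies the answer slightly: instead of two different sentinel defaults it extracts the key once and uses `notna()` to reject rows where the key is missing, and it adds an `isinstance` guard on the assumption that some `event_desc` values might not be dictionaries.

```python
import pandas as pd

# Sample rows copied from the question.
desc = {'firebase_previous_id': '8996730796507124997'}
df = pd.DataFrame({
    'user_id': [10, 10, 10, 10],
    'event_name': ['product_search', 'add_basket_click', 'start', 'add_basket_click'],
    'event_desc': [dict(desc) for _ in range(4)],
})


def flag_vectorized(frame, key='firebase_previous_id'):
    """Hypothetical helper: return 1 where a basket click directly follows
    a product_search row that carries the same value for `key`."""
    # Previous row is a search event (the first row compares against NaN, so False).
    prev_is_search = frame['event_name'].shift() == 'product_search'
    # Current row is one of the basket click events.
    is_basket = frame['event_name'].isin(['add_basket_click', 'remove_basket_click'])
    # Pull the id out of each dict; non-dict values or a missing key yield None.
    ids = frame['event_desc'].apply(
        lambda d: d.get(key) if isinstance(d, dict) else None
    )
    # Ids must be present and equal to the previous row's id.
    same_id = ids.notna() & (ids == ids.shift())
    return (prev_is_search & is_basket & same_id).astype(int)


df['search_process'] = flag_vectorized(df)
print(df['search_process'].tolist())  # [0, 1, 0, 0]
```

The `apply` call still runs a Python function per row, so it is probably the slowest step on 10 million rows, but the comparisons themselves work on whole columns instead of indexing the DataFrame inside a loop.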