python - Pandas lookup within groupby dataframe
Problem description
I have the DataFrame below:
import pandas as pd
data={'Name':['A','A','A','A','A','A','A','A','A','B','B','B','B','B','B','B','B','B','B','B','B','C','C','C','C','C','C','C','C'],
'Sales':['327','255','977','211','146','183','138','142','156','208','195','224','181','351','166','173','320','197','311','327','245','186','362','391','604','2320','2230','0','0'],
'Price':['10','11','12','13','14','15','16','17','18','30','31','32','33','34','35','36','37','38','39','40','41','60','61','62','63','64','65','66','67'],
'Second_highest_Sales':['','255','327','327','327','327','327','327','327','','195','208','208','224','224','224','320','320','320','327','327','','186','362','391','604','2230','2230','2230']}
data=pd.DataFrame(data)
I want to get the 'Price' that corresponds to 'Second_highest_Sales', matched against 'Sales' within each group (Name). The result would look like this:
result={
'Name':['A','A','A','A','A','A','A','A','A','B','B','B','B','B','B','B','B','B','B','B','B','C','C','C','C','C','C','C','C'],
'Sales':['327','255','977','211','146','183','138','142','156','208','195','224','181','351','166','173','320','197','311','327','245','186','362','391','604','2320','2230','0','0'],
'Price':['10','11','12','13','14','15','16','17','18','30','31','32','33','34','35','36','37','38','39','40','41','60','61','62','63','64','65','66','67'],
'Second_highest_Sales':['','255','327','327','327','327','327','327','327','','195','208','208','224','224','224','320','320','320','327','327','','186','362','391','604','2230','2230','2230'],
'2nd_Highest_Price':['','11','10','10','10','10','10','10','10','','31','30','30','32','32','32','37','37','37','40','40','','60','61','62','63','65','65','65']}
result=pd.DataFrame(result)
I tried .shift() and .lookup(), but I get an index error on the grouped DataFrame. Is there an easier way to do this than writing a custom function?
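Solution
I would:
- take the 'Second_highest_Sales' series and drop the empty values
- retrieve the corresponding names
- re-index the DataFrame by Name and Sales
- look up the price for each (Name, Second_highest_Sales) pair
- fill the new column with those values in the rows where Second_highest_Sales is defined
In code, it would look like this: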
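# Keep only the rows where Second_highest_Sales is defined
shs = data['Second_highest_Sales']
shs = shs[shs != '']
# Names of those rows (label-based, so it does not rely on a default index)
shs_names = data.loc[shs.index, 'Name']
# Look up Price by (Name, Sales); list() materializes the zip before indexing
prices = data.set_index(['Name', 'Sales']).loc[list(zip(shs_names, shs))]['Price']
result = data.copy()
result['Second_highest_Price'] = ''
# Write the looked-up prices back in row order, leaving '' elsewhere
result.loc[shs.index, 'Second_highest_Price'] = prices.values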
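Since the question asks for something simpler than a custom function, the same lookup can also be sketched as a self-merge. This is an alternative to the approach above, not part of the original answer. It uses a small sample in the same shape as the question's data, and it assumes that when a (Name, Sales) pair occurs more than once, the price of the first occurrence is the one wanted.

```python
import pandas as pd

# Small sample in the same shape as the question's frame (values kept as strings).
data = pd.DataFrame({
    'Name':  ['A', 'A', 'A', 'A', 'B', 'B', 'B'],
    'Sales': ['327', '255', '977', '211', '208', '195', '224'],
    'Price': ['10', '11', '12', '13', '30', '31', '32'],
    'Second_highest_Sales': ['', '255', '327', '327', '', '195', '208'],
})

# Lookup table: one row per (Name, Sales) pair with its Price.
# drop_duplicates guards against repeated pairs, which would otherwise
# multiply rows in the merge (assumption: the first match is wanted).
lookup = (
    data[['Name', 'Sales', 'Price']]
    .drop_duplicates(subset=['Name', 'Sales'])
    .rename(columns={'Sales': 'Second_highest_Sales',
                     'Price': 'Second_highest_Price'})
)

# A left merge keeps every original row, in the original order.
result = data.merge(lookup, on=['Name', 'Second_highest_Sales'], how='left')

# Rows whose Second_highest_Sales is '' find no match and come back as NaN.
result['Second_highest_Price'] = result['Second_highest_Price'].fillna('')

print(result)
```

The merge matches on both Name and the sales value, so a value such as '327' that appears in more than one group resolves to the price from the matching group only.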