python - Avoiding For Loops / Iterating over two tables
Problem description
I have two large pandas dataframes and am currently using nested for loops to perform the calculation described below. This is highly inefficient and takes forever to run. Does anyone have an approach that avoids iterating / for loops for the following calculation?
- I have two dataframes: source and destination.
I want to join the data from source and destination to get the following output:
The rules that I apply:
- The total of amt (38) in the source table remains the same in the output table
- There can be multiple columns present in source (like Business) that are not present in destination.
- The logic starts at the beginning of the source table and proceeds to the end.
- Cells in green have a perfect match (i.e. instrument and entity) between the source and destination tables. The Depot column is the new column in the output.
- Cells in yellow have a lower Amt in the source; the source Amt is maintained in the output.
- Cells in orange have no match in the destination, so the Amt is shown without a depot.
- Cells in blue have a lower Amt in the matching destination row, so the source row is split up.
I am able to achieve the logic above using a nested for loop. However, this is not efficient, and I was wondering if there is a more Pythonic approach that achieves this more efficiently.
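Solution
import pandas as pd
# Drop the destination columns that would clash with source, so only the remaining columns (e.g. Depot) come in via the join
destination.drop(['Instrument', 'Amt'], axis=1, inplace=True)
# Left join: every source row is kept, with destination columns attached where Entity matches
source = pd.merge(source, destination, on='Entity', how='left')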
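A left merge keeps every source row but does not by itself split an Amt across rows, so the rules listed in the question (keep the lower source Amt, leave unmatched rows without a depot, split when the destination Amt is lower) may need a sequential allocation step. The following is only a minimal sketch of one possible reading of those rules, not the answer author's method. The helper name allocate and the sample data are hypothetical; the column names Instrument, Entity, Business, Amt and Depot are taken from the question. It still iterates, but in a single pass over each table with a dictionary lookup in place of the nested loop.

```python
from collections import defaultdict, deque

import pandas as pd


def allocate(source, destination, keys=("Instrument", "Entity")):
    """Hypothetical helper: consume destination amounts in source order."""
    keys = list(keys)
    # One queue of [depot, remaining amount] per (Instrument, Entity) key.
    pools = defaultdict(deque)
    cols = keys + ["Depot", "Amt"]
    for *key, depot, amt in destination[cols].itertuples(index=False, name=None):
        pools[tuple(key)].append([depot, amt])

    rows = []
    for rec in source.to_dict("records"):
        remaining = rec["Amt"]
        pool = pools.get(tuple(rec[k] for k in keys))  # None when there is no match
        while remaining > 0 and pool:
            depot, available = pool[0]
            take = min(remaining, available)
            rows.append({**rec, "Amt": take, "Depot": depot})
            remaining -= take
            if take == available:
                pool.popleft()  # this destination row is used up
            else:
                pool[0][1] = available - take
        if remaining > 0:
            # No (more) destination amount left: keep the rest without a depot.
            rows.append({**rec, "Amt": remaining, "Depot": None})
    return pd.DataFrame(rows)


# Made-up sample data, for illustration only.
source = pd.DataFrame({
    "Instrument": ["A", "B", "C", "D"],
    "Entity": ["X", "X", "Y", "Y"],
    "Business": ["B1", "B1", "B2", "B2"],
    "Amt": [10, 5, 3, 20],
})
destination = pd.DataFrame({
    "Instrument": ["A", "B", "D"],
    "Entity": ["X", "X", "Y"],
    "Depot": ["D1", "D2", "D3"],
    "Amt": [10, 8, 12],
})

result = allocate(source, destination)
print(result)
```

In this made-up example the total of Amt in result equals the total in source, extra source columns such as Business are carried through unchanged, and the last source row is split into a part with a depot and a remainder without one.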