Can I find out the allocation request that caused my Python MemoryError?

Problem description

Context

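My small Python script uses a library to process some relatively large data. The standard algorithm for this task is a dynamic programming algorithm, so presumably the library allocates a large array "behind the scenes" to keep track of the DP's partial results. Indeed, when I give it a fairly large input, it immediately raises a MemoryError.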

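Preferably without digging deep into the library, I want to figure out whether it is worth trying this algorithm on another machine with more memory, or cutting down my input size, or whether it is a lost cause for the data size I am trying to use.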

Question

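When my Python code raises a MemoryError, is there a "top-down" way for me to investigate how much memory my code was trying to allocate when the error occurred, for example by inspecting the error object?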

Tags: python, python-3.x, error-handling, out-of-memory

Solution


You can't see this from the MemoryError exception. The exception is raised whenever a memory allocation fails, including allocations inside Python internals that have no direct connection to the code creating new Python data structures; some modules create locks or other support objects, and those operations can also fail once memory has run out.
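
As a quick check of that claim, here is a minimal sketch that triggers a MemoryError cheaply and looks at what the exception object carries. The helper name describe_memory_error is hypothetical, and the behaviour described assumes CPython, where a request this large is refused up front rather than actually exhausting RAM.

```python
import sys


def describe_memory_error():
    """Trigger a MemoryError cheaply and report what the exception object holds."""
    try:
        # A list of sys.maxsize references cannot fit in any address space,
        # so CPython (assumed here) rejects the request without using real memory.
        [None] * sys.maxsize
    except MemoryError as exc:
        return {
            "type": type(exc).__name__,
            "args": exc.args,      # no requested size is recorded here
            "message": str(exc),
        }
    return None


if __name__ == "__main__":
    print(describe_memory_error())
```

On CPython the args tuple and the message typically come back empty, so there is nothing on the object that records the size of the failed request.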

You also can't necessarily know how much memory the whole operation would need in order to succeed. If the library creates several data structures over the course of its work, the allocation that fails could be for a string used as a dictionary key (the last straw), for a copy of the whole existing data structure made for mutation, or for anything in between. The failed request says nothing about how much additional memory the remainder of the process would need.

That said, Python can give you detailed information on what memory allocations are being made, and when, and where, using the tracemalloc module. Using that module and an experimental approach, you could estimate how much memory your data set would require to complete.

The trick is to find data sets for which the process can be completed. You'd want data sets of several different sizes, so that you can measure how much memory each one requires. Take snapshots before and after with tracemalloc.take_snapshot(), compare the differences and statistics between the snapshots for those data sets, and you may be able to extrapolate how much more memory your larger data set would need. It depends, of course, on the nature of the operation and the data sets, but if there is any kind of pattern, tracemalloc is your best shot at discovering it.
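
The experiment described above could be sketched as follows. Everything except the tracemalloc calls is an assumption made for illustration: build_dp_table is a hypothetical stand-in for the library call, measure_growth and extrapolate are made-up helper names, and the power-law fit is just one simple way to extrapolate, which only makes sense if memory use really does grow as a power of the input size.

```python
import math
import tracemalloc


def build_dp_table(n):
    """Hypothetical stand-in for the library call: allocates an n x n table."""
    return [[0] * n for _ in range(n)]


def measure_growth(func, size):
    """Return (net bytes allocated, peak traced bytes) for func(size)."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        result = func(size)  # keep a reference so the data is alive in the snapshot
        after = tracemalloc.take_snapshot()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    # Sum the per-line size differences between the two snapshots.
    net = sum(diff.size_diff for diff in after.compare_to(before, "lineno"))
    del result
    return net, peak


def extrapolate(samples, target_size):
    """Fit memory ~ size**k through the first and last sample (an assumption).

    samples is a list of (size, bytes) pairs; returns (estimated bytes, k).
    """
    (n1, m1), (n2, m2) = samples[0], samples[-1]
    exponent = math.log(m2 / m1) / math.log(n2 / n1)
    return m2 * (target_size / n2) ** exponent, exponent


if __name__ == "__main__":
    samples = [(n, measure_growth(build_dp_table, n)[0]) for n in (200, 400)]
    estimate, exponent = extrapolate(samples, 10_000)
    print(f"growth exponent ~ {exponent:.2f}, estimate ~ {estimate / 1e6:.0f} MB")
```

With a real library you would replace build_dp_table with the actual call and use more than two input sizes, since two points cannot tell you whether the fitted pattern holds. The estimate covers only memory that tracemalloc can trace, so allocations made outside Python's allocators (for example by some C extensions) may not show up.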

