Deleting files using multiprocessing in Python

Problem description

I am using the following code to delete a large number of files in Python:

import os
from multiprocessing import Pool

def deleteFiles(loc):
    def Fn_deleteFiles(inp):
        [fn, loc] = [inp['fn'], inp['loc']]
        os.remove(os.path.join(loc, fn))

    p = Pool(5)
    for path, subdirs, files in os.walk(loc):
        if len(files) > 0:
            inpData = [{'fn':x, 'loc':loc} for x in files]
            p.map(Fn_deleteFiles, inpData)
    p.close()

if __name__ == '__main__':
    loc = r'C:\myDriveWithFilesToDelete'
    deleteFiles(loc)

I get the following error:

  File "C:\Program Files\Python 3.5\lib\multiprocessing\reduction.py", line 50, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'deleteFiles.<locals>.Fn_deleteFiles'

Tags: multiprocessing, python-3.5

Solution


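The problem is that you are creating a function inside another function.

The function Fn_deleteFiles(inp) is defined inside deleteFiles(loc).

This means Fn_deleteFiles(inp) is only created when deleteFiles(loc) runs.

Internally, multiprocessing.pool.Pool() uses the pickle library to transfer the function object from this Python process to the newly spawned worker processes.

However, pickle serializes a function by reference (its module and qualified name), so it cannot serialize a function that it cannot look up by name at module level, which is the case for a nested function.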

Here is a small demo that produces a similar error.

import pickle
def foo():
    def bar():
        return "Hello"
    return bar

bar = foo()

if __name__ == '__main__':
    s = pickle.dumps(bar)

This causes the same kind of error:

Traceback (most recent call last):
  File ".../stacktest.py", line 10, in <module>
    s = pickle.dumps(bar)
AttributeError: Can't pickle local object 'foo.<locals>.bar'
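For contrast with the demo above, here is a minimal sketch of which functions pickle accepts. The names top_level, make_local and can_pickle are hypothetical helpers introduced for illustration only; they are not part of the original answer. The sketch assumes it is run as a script, so that top_level can be looked up in the __main__ module.

```python
import pickle

def top_level():
    """Module-level function: pickle can refer to it by module and name."""
    return "Hello"

def make_local():
    """Return a nested function, which only exists while make_local runs."""
    def local():
        return "Hello"
    return local

def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False if pickling fails."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False

if __name__ == '__main__':
    print("top-level function picklable:", can_pickle(top_level))
    print("nested function picklable:", can_pickle(make_local()))
```

When run as a script, the first check should succeed and the second should fail, which matches the behaviour described above: only the function that lives at module level can be handed to a process pool.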

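So, to fix this error, you can use multiprocessing.pool.ThreadPool instead, because it runs the work in threads of the same process and does not pickle the function.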

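import os
from multiprocessing.pool import ThreadPool as Pool

def deleteFiles(loc):
    # A nested function is fine here: ThreadPool workers are threads, so nothing is pickled.
    def Fn_deleteFiles(inp):
        [fn, loc] = [inp['fn'], inp['loc']]
        os.remove(os.path.join(loc, fn))

    p = Pool(5)
    for path, subdirs, files in os.walk(loc):
        if len(files) > 0:
            # Use path (the directory currently being walked), not the top-level loc,
            # so that files in subdirectories are found.
            inpData = [{'fn':x, 'loc':path} for x in files]
            p.map(Fn_deleteFiles, inpData)
    p.close()
    p.join()  # wait for the worker threads to exit

if __name__ == '__main__':
    loc = 'DriveWithFilesToDelete'
    deleteFiles(loc)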

Alternatively, you can define Fn_deleteFiles(inp) outside deleteFiles(loc) to fix the problem.

Warning: for reasons I don't understand, this version hangs when run in the IDLE interpreter.

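import os
from multiprocessing import Pool

# Defined at module level so that pickle can find it by name in the worker processes.
def Fn_deleteFiles(inp):
    print("Delete", inp)
    [fn, loc] = [inp['fn'], inp['loc']]
    os.remove(os.path.join(loc, fn))

def deleteFiles(loc):
    p = Pool(5)
    for path, subdirs, files in os.walk(loc):
        if len(files) > 0:
            # Use path (the directory currently being walked), not the top-level loc,
            # so that files in subdirectories are found.
            inpData = [{'fn':x, 'loc':path} for x in files]
            p.map(Fn_deleteFiles, inpData)
    p.close()
    p.join()  # wait for the worker processes to exit

if __name__ == '__main__':
    loc = 'DriveWithFilesToDelete'
    deleteFiles(loc)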
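A possible variation on the second approach, offered only as a sketch and not as part of the original answer: collect the full paths first, then make a single map call with a module-level worker. The names collect_files, remove_file and delete_files, and the workers parameter, are hypothetical and chosen for illustration.

```python
import os
from multiprocessing import Pool

def collect_files(root):
    """Return the full path of every file under root, including subdirectories."""
    return [os.path.join(path, name)
            for path, _subdirs, files in os.walk(root)
            for name in files]

def remove_file(full_path):
    """Module-level worker, so pickle can refer to it by name."""
    os.remove(full_path)

def delete_files(root, workers=5):
    """Delete every file under root with a process pool; return the file count."""
    paths = collect_files(root)
    # The with block tears the pool down once the blocking map call has returned.
    with Pool(workers) as pool:
        pool.map(remove_file, paths)
    return len(paths)

if __name__ == '__main__':
    # Same placeholder directory name as in the examples above.
    delete_files('DriveWithFilesToDelete')
```

Passing full paths means the worker needs no dictionary unpacking, and a single map call avoids waiting on the pool once per directory. The if __name__ == '__main__' guard still matters on Windows, where worker processes import the main module again.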
