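python - BrokenPipeError when finding the learning rate or training with Fast.AI
Question
I am practicing my machine learning skills by building a CNN with Fast.AI/PyTorch on Windows. I have successfully created and initialized my dataset, but when I try to train the model or find a learning rate, I get a BrokenPipeError.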
...
learn = cnn_learner(data, models.resnet34, metrics = error_rate) #We're fine here
#Now either line of code will throw the same error.
learn.fit_one_cycle(1)
learn.lr_find()
...
This is the exact error I get.
Traceback (most recent call last):
File "<ipython-input-34-4d78bfcf8d69>", line 1, in <module>
runfile('C:/Users/.../Desktop/Homebrew AI/image_test.py', wdir='C:/Users/.../Desktop/Homebrew AI')
File "C:\Users\...\AppData\Local\Continuum\anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 786, in runfile
execfile(filename, namespace)
File "C:\Users\...\AppData\Local\Continuum\anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/Users/.../Desktop/Homebrew AI/image_test.py", line 36, in <module>
learn.lr_find()
File "C:\Users\...\AppData\Local\Continuum\anaconda3\lib\site-packages\fastai\train.py", line 32, in lr_find
learn.fit(epochs, start_lr, callbacks=[cb], wd=wd)
File "C:\Users\...\AppData\Local\Continuum\anaconda3\lib\site-packages\fastai\basic_train.py", line 200, in fit
fit(epochs, self, metrics=self.metrics, callbacks=self.callbacks+callbacks)
File "C:\Users\...\AppData\Local\Continuum\anaconda3\lib\site-packages\fastai\basic_train.py", line 99, in fit
for xb,yb in progress_bar(learn.data.train_dl, parent=pbar):
File "C:\Users\...\AppData\Local\Continuum\anaconda3\lib\site-packages\fastprogress\fastprogress.py", line 72, in __iter__
for i,o in enumerate(self._gen):
File "C:\Users\...\AppData\Local\Continuum\anaconda3\lib\site-packages\fastai\basic_data.py", line 75, in __iter__
for b in self.dl: yield self.proc_batch(b)
File "C:\Users\...\AppData\Local\Continuum\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 193, in __iter__
return _DataLoaderIter(self)
File "C:\Users\...\AppData\Local\Continuum\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 469, in __init__
w.start()
File "C:\Users\...\AppData\Local\Continuum\anaconda3\lib\multiprocessing\process.py", line 112, in start
self._popen = self._Popen(self)
File "C:\Users\...\AppData\Local\Continuum\anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users\...\AppData\Local\Continuum\anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Users\...\AppData\Local\Continuum\anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
reduction.dump(process_obj, to_child)
File "C:\Users\...\AppData\Local\Continuum\anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
I assume it might have something to do with Windows? Any help resolving this would be appreciated.
Answer
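It turns out that version 1.0.4 of PyTorch behaves oddly with multithreading on Windows (the traceback above fails at the point where the DataLoader starts its worker processes). Downgrading PyTorch to 1.0.0 solved the problem.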
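As an alternative to downgrading, a workaround often used on Windows is to avoid starting DataLoader worker processes at all and to keep the training calls behind a main guard. The sketch below is not the fix from the answer above, only an illustration under assumptions: `pick_num_workers` is a hypothetical helper, and because the question does not show how `data` was built, the fastai lines are left as comments and assume `ImageDataBunch.from_folder` with a placeholder `path`.

```python
import sys


def pick_num_workers(requested: int, platform: str = sys.platform) -> int:
    """Hypothetical helper: return 0 on Windows, otherwise the requested count.

    With num_workers=0 the DataLoader loads batches in the main process,
    so the worker-start step that fails in the traceback is never reached.
    """
    return 0 if platform.startswith("win") else requested


def main() -> None:
    workers = pick_num_workers(4)
    print(f"DataLoader worker processes to use: {workers}")
    # Assumed fastai v1 usage; the question elides how `data` was created,
    # and `path` is a placeholder for the image folder:
    # data = ImageDataBunch.from_folder(path, num_workers=workers)
    # learn = cnn_learner(data, models.resnet34, metrics=error_rate)
    # learn.lr_find()


if __name__ == "__main__":
    # The guard lets spawned child processes re-import this script on
    # Windows without re-running the training code.
    main()
```

Loading in the main process is slower than using workers, so this trades throughput for reliability. If worker processes are needed, the main guard alone may be enough when the script is run directly with `python image_test.py` rather than through an IDE's `runfile`.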