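Problem
Today I wanted to try out the PyTorch environment I had set up earlier on Windows 10. After preparing the dataset for model training, I tried to iterate over the DataLoader and got the following error.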
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
AttributeError: Can't get attribute 'MyDataset' on <module '__main__' (built-in)>

Traceback (most recent call last):
  File "D:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-5-e37105fe54f7>", line 1, in <module>
    for i, j in train_iter:
  File "D:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 279, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "D:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 719, in __init__
    w.start()
  File "D:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "D:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "D:\ProgramData\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)
  File "D:\ProgramData\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
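This looks like a multiprocessing issue. From when I was learning Python, I remember that Windows has no fork() system call, so multiprocessing needs special handling there. The part of the code involved is shown below.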
train_iter = DataLoader(dataset=train_set, batch_size=batch_size, shuffle=True, num_workers=10)
test_iter = DataLoader(dataset=test_set, batch_size=batch_size, shuffle=True, num_workers=10)
# %%
for i, j in train_iter:
    print(i)
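DataLoader's num_workers sets how many workers load data in parallel. Because of Python's GIL (global interpreter lock), threads cannot make use of multiple cores, so these workers are implemented as separate processes rather than threads. On Windows the worker processes are started with spawn instead of fork, which means the dataset object has to be pickled and sent to each worker. Judging by the AttributeError in the first traceback, the worker could not find MyDataset, which was defined in the interactive session, when it tried to unpickle it. That is roughly where the problem comes from.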
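To make the pickling step more concrete, here is a minimal sketch using only the standard library. The MyDataset class below is a hypothetical stand-in (the real class from my code is not shown here), and the snippet assumes it is run as an ordinary script. It shows that a pickled object carries only the module and class names, so whoever unpickles it must be able to import the class by that name.

```python
import pickle


class MyDataset:
    """Hypothetical stand-in for the dataset class; the real one is not shown."""

    def __init__(self, n):
        self.data = list(range(n))


# Pickling stores a reference (module name + class name) plus the instance
# state. The class body itself is NOT part of the payload.
payload = pickle.dumps(MyDataset(3))
print(MyDataset.__module__, MyDataset.__qualname__)
print(b"MyDataset" in payload)

# Unpickling in this same process works because the class can be looked up
# by name. A spawned worker that cannot find the class under that name fails
# with "Can't get attribute 'MyDataset'".
restored = pickle.loads(payload)
print(restored.data)
```

In a spawned worker the lookup happens in a fresh interpreter, which is why a class that only exists in an interactive session may not be found there.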
Solution
Option 1
Set num_workers to 0.
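As a rough sketch of Option 1, the snippet below builds a DataLoader with num_workers=0. The toy TensorDataset and the batch size are assumptions made up for illustration and stand in for the real train_set.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy data standing in for train_set from the post.
features = torch.arange(12, dtype=torch.float32).reshape(6, 2)
labels = torch.arange(6)
train_set = TensorDataset(features, labels)

# With num_workers=0 every batch is loaded in the main process, so no child
# process is started and the dataset never has to be pickled.
train_iter = DataLoader(train_set, batch_size=2, shuffle=True, num_workers=0)

batches = [(x.shape, y.shape) for x, y in train_iter]
print(len(batches))
```

The trade-off is that data loading no longer runs in parallel with training, which may be slower for datasets with expensive preprocessing.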
Option 2
Wrap the rest of your code in the following guard:
if __name__ == '__main__':
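Here is a minimal sketch of what a complete script with this guard might look like. The MyDataset class, the main() helper and the sizes are hypothetical and only for illustration. It assumes the file is run as a script (for example `python train.py`); as far as I can tell, if the dataset class is defined in an interactive IPython session, the guard alone may not be enough and the class may need to live in a separate .py module that the workers can import.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class MyDataset(Dataset):
    """Hypothetical stand-in; defined at module level so workers can import it."""

    def __init__(self, n):
        self.x = torch.arange(n, dtype=torch.float32)

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.x[idx] ** 2


def main(num_workers=2):
    # Worker processes are created when iteration over the DataLoader starts.
    train_iter = DataLoader(MyDataset(8), batch_size=4, shuffle=False,
                            num_workers=num_workers)
    seen = 0
    for x, y in train_iter:
        seen += x.shape[0]
    return seen


# Spawned workers re-import this file; the guard keeps them from running
# the training code again and trying to start workers of their own.
if __name__ == '__main__':
    print(main())
```

Keeping the class and helper functions at module level and only the entry call under the guard means the workers can still import everything they need.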
