amazon-ec2 - Error loading torch.hub.load('pytorch/fairseq', 'roberta.large.mnli') on AWS EC2
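Question
I'm trying to run some code using Torch (and the RoBERTa language model) on an EC2 instance on AWS. The compilation seems to fail. Does anyone have pointers for a fix?
Confirming that Torch is installed correctly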
import torch
a = torch.rand(5,3)
print(a)
Returns: tensor([[0.7494, 0.5213, 0.8622],...
Trying to load RoBERTa
roberta = torch.hub.load('pytorch/fairseq', 'roberta.large.mnli')
Using cache found in /home/ubuntu/.cache/torch/hub/pytorch_fairseq_master
/home/ubuntu/.local/lib/python3.8/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
return torch._C._cuda_getDeviceCount() > 0
fatal: not a git repository (or any of the parent directories): .git
running build_ext
/home/ubuntu/.local/lib/python3.8/site-packages/torch/utils/cpp_extension.py:352: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
warnings.warn(msg.format('we could not find ninja.'))
skipping 'fairseq/data/data_utils_fast.cpp' Cython extension (up-to-date)
skipping 'fairseq/data/token_block_utils_fast.cpp' Cython extension (up-to-date)
building 'fairseq.libnat' extension
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/.local/lib/python3.8/site-packages/torch/include -I/home/ubuntu/.local/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/ubuntu/.local/lib/python3.8/site-packages/torch/include/TH -I/home/ubuntu/.local/lib/python3.8/site-packages/torch/include/THC -I/usr/include/python3.8 -c fairseq/clib/libnat/edit_dist.cpp -o build/temp.linux-x86_64-3.8/fairseq/clib/libnat/edit_dist.o -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -DTORCH_EXTENSION_NAME=libnat -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
In file included from /home/ubuntu/.local/lib/python3.8/site-packages/torch/include/ATen/Parallel.h:149,
from /home/ubuntu/.local/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/utils.h:3,
from /home/ubuntu/.local/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/nn/cloneable.h:5,
from /home/ubuntu/.local/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/nn.h:3,
from /home/ubuntu/.local/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/all.h:12,
from /home/ubuntu/.local/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/torch.h:3,
from fairseq/clib/libnat/edit_dist.cpp:9:
/home/ubuntu/.local/lib/python3.8/site-packages/torch/include/ATen/ParallelOpenMP.h:84: warning: ignoring #pragma omp parallel [-Wunknown-pragmas]
84 | #pragma omp parallel for if ((end - begin) >= grain_size)
Then, after a long time, it ends with:
x86_64-linux-gnu-gcc: fatal error: Killed signal terminated program cc1plus
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
Solution
I got it working by loading the pretrained model locally instead of from the hub.
from fairseq.models.roberta import RobertaModel
roberta = RobertaModel.from_pretrained('roberta.large.mnli', 'model.pt', '/home/ubuntu/deployedapp/roberta.large')
roberta.eval()
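Before calling from_pretrained on a local directory, it can help to confirm that the checkpoint file is actually there. The following is a minimal sketch, not part of the original answer: the helper name missing_model_files is hypothetical, and the default file list contains only the model.pt checkpoint named in the call above.

```python
from pathlib import Path


def missing_model_files(model_dir, required=("model.pt",)):
    """Return the names in `required` that are not regular files in model_dir.

    Hypothetical helper for illustration; it is not part of fairseq.
    """
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]
```

For example, missing_model_files('/home/ubuntu/deployedapp/roberta.large') returns an empty list when model.pt is present in that directory, and ['model.pt'] otherwise.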
Note that I had to use an XLarge EC2 instance to run it; otherwise the process gets killed for running out of memory.
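One way to catch the memory problem before the process is killed is to check how much physical RAM the instance has. This is a rough sketch that assumes a Linux instance where os.sysconf exposes SC_PAGE_SIZE and SC_PHYS_PAGES. The function names and the 8 GiB threshold are assumptions for illustration, not figures from the answer.

```python
import os

# Hypothetical threshold for illustration; the answer does not state a number.
MIN_RAM_GIB = 8.0


def total_ram_gib():
    """Return total physical memory in GiB, using POSIX sysconf values."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    page_count = os.sysconf("SC_PHYS_PAGES")
    return page_size * page_count / 1024 ** 3


def has_enough_ram(min_gib=MIN_RAM_GIB):
    """Return True if the machine has at least `min_gib` GiB of physical RAM."""
    return total_ram_gib() >= min_gib


print(f"Total RAM: {total_ram_gib():.1f} GiB, enough: {has_enough_ram()}")
```

This reports total memory only, not free memory, so it is a coarse check of the instance size rather than a guarantee that loading will succeed.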