python - Problem running GPT2Simple
Problem description
I am trying to run this GPT2Simple example, but I get the following error:
Original stack trace for 'model/MatMul':
File "c:/Users/Jerome Ariola/Desktop/Machine Learning Projects/gpt test.py", line 32, in <module>
steps=1)
File "C:\Program Files\Python36\lib\site-packages\gpt_2_simple\gpt_2.py", line 198, in finetune
output = model.model(hparams=hparams, X=context, gpus=gpus)
File "C:\Program Files\Python36\lib\site-packages\gpt_2_simple\src\model.py", line 212, in model
logits = tf.matmul(h_flat, wte, transpose_b=True)
File "C:\Program Files\Python36\lib\site-packages\tensorflow_core\python\util\dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "C:\Program Files\Python36\lib\site-packages\tensorflow_core\python\ops\math_ops.py", line 2754, in matmul
a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
File "C:\Program Files\Python36\lib\site-packages\tensorflow_core\python\ops\gen_math_ops.py", line 6136, in mat_mul
name=name)
File "C:\Program Files\Python36\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper
op_def=op_def)
File "C:\Program Files\Python36\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "C:\Program Files\Python36\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op
attrs, op_def, compute_device)
File "C:\Program Files\Python36\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal
op_def=op_def)
File "C:\Program Files\Python36\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in __init__
self._traceback = tf_stack.extract_stack()
Here is the code, taken from https://github.com/minimaxir/gpt-2-simple
I also downgraded from TensorFlow 2.0 to TensorFlow 1.15 because of problems with tf.contrib, among other issues.
# https://github.com/minimaxir/gpt-2-simple
import gpt_2_simple as gpt2
import os
import requests

model_name = "124M"
if not os.path.isdir(os.path.join("models", model_name)):
    print(f"Downloading {model_name} model...")
    gpt2.download_gpt2(model_name=model_name)  # model is saved into current directory under /models/124M/

file_name = "shakespeare.txt"
if not os.path.isfile(file_name):
    url = "https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt"
    data = requests.get(url)
    with open(file_name, 'w') as f:
        f.write(data.text)

sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              file_name,
              model_name=model_name,
              steps=1)

gpt2.generate(sess)
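Solution
Update: I downgraded again. I originally went from TF 2.0 to TF 1.15, and have now gone to TF 1.14. I still get the same error.
Here is the full error I get (or at least the part starting where the allocator stops):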
Limit: 6696213545
InUse: 6693793536
MaxInUse: 6693795584
NumAllocs: 2032
MaxAllocSize: 268435456
2021-03-19 01:21:53.793259: W tensorflow/core/common_runtime/bfc_allocator.cc:319] ***x************************************************************************************************
2021-03-19 01:21:53.798596: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at cwise_ops_common.cc:70 : Resource exhausted: OOM when allocating tensor with shape[1,12,1024,1024] and type bool on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1356, in _do_call
return fn(*args)
File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1341, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1429, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[1,1024,3072] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node model/h11/mlp/Pow}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:/Users/Jerome Ariola/Desktop/Desktop 2021/Machine Learning Projects/Drake bot/gpt test.py", line 32, in <module>
steps=1)
File "C:\Program Files\Python36\lib\site-packages\gpt_2_simple\gpt_2.py", line 337, in finetune
opt_compute, feed_dict={context: sample_batch()})
File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\client\session.py", line 950, in run
run_metadata_ptr)
File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1173, in _run
feed_dict_tensor, options, run_metadata)
File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _do_run
run_metadata)
File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1370, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[1,1024,3072] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node model/h11/mlp/Pow (defined at C:\Program Files\Python36\lib\site-packages\gpt_2_simple\src\model.py:56) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
Errors may have originated from an input operation.
Input Source operations connected to node model/h11/mlp/Pow:
model/h11/mlp/c_fc/Reshape_2 (defined at C:\Program Files\Python36\lib\site-packages\gpt_2_simple\src\model.py:85)
Original stack trace for 'model/h11/mlp/Pow':
File "c:/Users/Jerome Ariola/Desktop/Desktop 2021/Machine Learning Projects/Drake bot/gpt test.py", line 32, in <module>
steps=1)
File "C:\Program Files\Python36\lib\site-packages\gpt_2_simple\gpt_2.py", line 198, in finetune
output = model.model(hparams=hparams, X=context, gpus=gpus)
File "C:\Program Files\Python36\lib\site-packages\gpt_2_simple\src\model.py", line 197, in model
h, present = block(h, 'h%d' % layer, past=past, hparams=hparams)
File "C:\Program Files\Python36\lib\site-packages\gpt_2_simple\src\model.py", line 158, in block
m = mlp(norm(x, 'ln_2'), 'mlp', nx*4, hparams=hparams)
File "C:\Program Files\Python36\lib\site-packages\gpt_2_simple\src\model.py", line 148, in mlp
h = gelu(conv1d(x, 'c_fc', n_state))
File "C:\Program Files\Python36\lib\site-packages\gpt_2_simple\src\model.py", line 56, in gelu
return 0.5*x*(1+tf.tanh(np.sqrt(2/np.pi)*(x+0.044715*tf.pow(x, 3))))
File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\util\dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\ops\math_ops.py", line 450, in pow
return gen_math_ops._pow(x, y, name=name)
File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 7382, in _pow
"Pow", x=x, y=y, name=name)
File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
op_def=op_def)
File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in __init__
self._traceback = tf_stack.extract_stack()
PS C:\Users\Jerome Ariola\Desktop\Desktop 2021\Machine Learning Projects>
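
The allocator statistics and the two OOM messages in the log above can be checked against each other with a little arithmetic. The sketch below is only an illustration: the numbers are copied from the log, the helper name tensor_bytes is made up for this example, and it assumes that "type float" means 4-byte float32 and "type bool" means 1 byte per element.

```python
# Values copied from the BFC allocator statistics in the log above.
limit_bytes = 6696213545   # "Limit"
in_use_bytes = 6693793536  # "InUse"

# Assumed element sizes: "float" taken as float32 (4 bytes), "bool" as 1 byte.
BYTES_PER_ELEMENT = {"bool": 1, "float": 4}


def tensor_bytes(shape, dtype):
    """Return the size in bytes of a dense tensor with the given shape and dtype."""
    count = 1
    for dim in shape:
        count *= dim
    return count * BYTES_PER_ELEMENT[dtype]


# Memory still available to the allocator when the failure was reported.
free_bytes = limit_bytes - in_use_bytes

# The two tensors named in the OOM messages.
mask_bytes = tensor_bytes([1, 12, 1024, 1024], "bool")
mlp_bytes = tensor_bytes([1, 1024, 3072], "float")

MIB = 1024 * 1024
print(f"free:           {free_bytes / MIB:.2f} MiB")
print(f"bool tensor:    {mask_bytes / MIB:.2f} MiB")
print(f"float tensor:   {mlp_bytes / MIB:.2f} MiB")
print(f"fits in free memory: {max(mask_bytes, mlp_bytes) <= free_bytes}")
```

Under those assumptions, each of the two tensors needs 12 MiB, while only about 2.3 MiB of the roughly 6.2 GiB limit was still free. This is consistent with the GPU running out of memory during fine-tuning rather than with a TensorFlow version mismatch, which may be why downgrading from 1.15 to 1.14 made no difference.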