tensorflow - Compilation failure: Detected unsupported operations when trying to compile graph get_loss_cond_1_true_88089_rewritten[]
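Problem description
When I try to use a custom CRF loss function, I get the following error on a Google Colab TPU. I checked https://cloud.google.com/tpu/docs/tensorflow-ops for the FakeParam op, and it looks like the op is available on Cloud TPU.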
InvalidArgumentError: 9 root error(s) found.
(0) Invalid argument: {{function_node __inference_train_function_104228}} Compilation failure: Detected unsupported operations when trying to compile graph get_loss_cond_1_true_88089_rewritten[] on XLA_TPU_JIT: FakeParam (No registered 'FakeParam' OpKernel for XLA_TPU_JIT devices compatible with node {{node get_loss/cond_1/FakeParam_15}} (OpKernel was found, but attributes didn't match) Requested Attributes: dtype=DT_VARIANT, shape=[]){{node get_loss/cond_1/FakeParam_15}}
[[get_loss/cond_1]]
TPU compilation failed
[[tpu_compile_succeeded_assert/_12238515605435969423/_6]]
[[tpu_compile_succeeded_assert/_12238515605435969423/_6/_279]]
Root errors (1) to (3) repeat the same message and differ only in the last tpu_compile_succeeded_assert suffix (_223, _265, _251). Root error (4) begins the same way and the rest of the output is cut off: ... [truncated]
Here is my code:
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Model
from transformers import TFAutoModel
# CRF comes from the tensorflow_addons fork linked below; this import path is assumed from that file's location.
from tensorflow_addons.layers.crf import CRF

# labels_ner, strategy, x_tr and y_tr are defined elsewhere in the notebook.
def make_model():
    input_ids_in = tf.keras.layers.Input(shape=(100,), name='input_token', dtype=tf.int32)
    input_mask_in = tf.keras.layers.Input(shape=(100,), name='input_mask', dtype=tf.int32)
    bert_model = TFAutoModel.from_pretrained("dbmdz/bert-base-turkish-cased")
    # Index 0 selects the per-token hidden states from the BERT output.
    embedding_layer = bert_model(input_ids_in, attention_mask=input_mask_in)[0]
    model = tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(50, trainable=False, return_sequences=True))(embedding_layer)
    model = tf.keras.layers.TimeDistributed(
        tf.keras.layers.Dense(len(labels_ner), activation="relu"))(model)
    crf = CRF(len(labels_ner))  # CRF layer
    out = crf(model)  # output
    model = Model([input_ids_in, input_mask_in], out)
    model.compile('adam', loss=crf.get_loss)
    print("Baseline/LSTM-CRF model built: ")
    return model

with strategy.scope():
    model = make_model()

model.fit(x_tr, np.argmax(y_tr, axis=-1), batch_size=32, epochs=5, verbose=1, validation_split=0.1)
I used the crf.py module from this tensorflow_addons fork: https://github.com/howl-anderson/addons/blob/feature/crf_layers/tensorflow_addons/layers/crf.py
Thanks.
Solution
It looks like FakeParam only supports these dtypes: {bfloat16, bool, complex64, float, int32, int64, uint32, uint64}, and does not support dtype=DT_VARIANT.
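To make the dtype mismatch concrete, here is a minimal sketch that pulls the op name and the requested dtype out of an error excerpt and checks the dtype against the supported list. The helper name `parse_unsupported_op` and the constant `SUPPORTED_FAKEPARAM_DTYPES` are hypothetical and not part of TensorFlow. The set just restates the dtypes above using TensorFlow's `DT_*` enum names.

```python
import re

# Dtypes listed above as supported by FakeParam on XLA_TPU_JIT,
# written with TensorFlow's DT_* enum names (hypothetical constant).
SUPPORTED_FAKEPARAM_DTYPES = {
    "DT_BFLOAT16", "DT_BOOL", "DT_COMPLEX64", "DT_FLOAT",
    "DT_INT32", "DT_INT64", "DT_UINT32", "DT_UINT64",
}


def parse_unsupported_op(message):
    """Return (op_name, requested_dtype) from an XLA error message, or None."""
    op = re.search(r"No registered '(\w+)' OpKernel", message)
    dtype = re.search(r"dtype=(DT_\w+)", message)
    if op is None or dtype is None:
        return None
    return op.group(1), dtype.group(1)


# Excerpt copied from the error message in the question.
error_excerpt = (
    "No registered 'FakeParam' OpKernel for XLA_TPU_JIT devices compatible "
    "with node {{node get_loss/cond_1/FakeParam_15}} (OpKernel was found, "
    "but attributes didn't match) Requested Attributes: dtype=DT_VARIANT, shape=[]"
)

op_name, requested = parse_unsupported_op(error_excerpt)
is_supported = requested in SUPPORTED_FAKEPARAM_DTYPES
print(op_name, requested, is_supported)
```

The kernel exists ("OpKernel was found"), but the requested `DT_VARIANT` is not in the supported set, which is why compilation fails even though the op is listed as available.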
Enabling automatic outside compilation in TF2 should fix this. Add this line somewhere in your code:
tf.config.set_soft_device_placement(True)
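As a rough sketch of where that line could go, the snippet below enables soft device placement before any TPU setup. Calling it this early is an assumption, since the answer only says "somewhere". The helper `connect_to_tpu` is a hypothetical name, and `tf.distribute.TPUStrategy` assumes TF 2.3 or later (older releases expose it as `tf.distribute.experimental.TPUStrategy`).

```python
import tensorflow as tf

# Enable soft device placement up front. With it on, TF2 should be able to
# run ops that lack a TPU kernel on the host CPU instead of failing to compile.
tf.config.set_soft_device_placement(True)


def connect_to_tpu():
    """Return a TPUStrategy if a TPU can be resolved, else the default strategy.

    Hypothetical helper for illustration; not part of the original answer.
    """
    try:
        resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
        tf.config.experimental_connect_to_cluster(resolver)
        tf.tpu.experimental.initialize_tpu_system(resolver)
        return tf.distribute.TPUStrategy(resolver)
    except ValueError:
        # No TPU was found (for example, on a CPU-only runtime).
        return tf.distribute.get_strategy()


# Confirm the setting took effect.
print("soft placement:", tf.config.get_soft_device_placement())
```

With this in place, `strategy = connect_to_tpu()` can feed the `with strategy.scope():` block from the question unchanged.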