python - 使用 tf.profiler.experimental.client.trace 的 TensorFlow 分析器提供空的跟踪数据
问题描述
我无法使用tf.profiler.experimental.client.trace
“请有人帮忙”收集跟踪数据吗?我在这里关注(CPU/GPU)示例用法https://www.tensorflow.org/api_docs/python/tf/profiler/experimental/client/trace看起来很简单。
tf.profiler.experimental.start
我有一个非常简单的模型,我可以使用和从中收集跟踪数据tf.profiler.experimental.stop
。
但是tf.profiler.experimental.client.trace
给了我空的跟踪数据。
我的代码如下:
import tensorflow as tf
import numpy as np
def mnist_dataset(batch_size):
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train / np.float32(255)
y_train = y_train.astype(np.int64)
train_dataset = tf.data.Dataset.from_tensor_slices(
(x_train, y_train)).shuffle(60000).repeat().batch(batch_size)
return train_dataset
batch_size = 64
dataset = mnist_dataset(batch_size)
model = tf.keras.Sequential([
tf.keras.Input(shape=(28, 28)),
tf.keras.layers.Reshape(target_shape=(28, 28, 1)),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dense(10)
])
model.compile(
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
metrics=['accuracy'])
#tf.profiler.experimental.start('./logs/tb_log')
tf.profiler.experimental.server.start(6009)
model.fit(dataset, epochs=10, steps_per_epoch=70)
tf.profiler.experimental.client.trace('grpc://localhost:6009', './logs/tbc_log', 20000)
#tf.profiler.experimental.stop()
代码贯穿历代,然后输出
2021-02-02 17:49:44.943933: I tensorflow/core/profiler/rpc/client/capture_profile.cc:198] Profiler delay_ms was 0, start_timestamp_ns set to 1612288184943887718 [2021-02-02T17:49:44.943887718+00:00]
Starting to trace for 20000 ms. Remaining attempt(s): 2
2021-02-02 17:49:44.944037: I tensorflow/core/profiler/rpc/client/remote_profiler_session_manager.cc:75] Deadline set to 2021-02-02T17:50:44.890124419+00:00 because max_session_duration_ms was 60000 and session_creation_timestamp_ns was 1612288184890124419 [2021-02-02T17:49:44.890124419+00:00]
2021-02-02 17:49:44.944197: I tensorflow/core/profiler/rpc/client/profiler_client.cc:113] Asynchronous gRPC Profile() to localhost:6009
2021-02-02 17:49:44.944316: I tensorflow/core/profiler/rpc/client/remote_profiler_session_manager.cc:96] Issued Profile gRPC to 1 clients
2021-02-02 17:49:44.944340: I tensorflow/core/profiler/rpc/client/profiler_client.cc:131] Waiting for completion.
2021-02-02 17:49:44.946274: I tensorflow/core/profiler/lib/profiler_session.cc:136] Profiler session initializing.
2021-02-02 17:49:44.947547: W tensorflow/core/profiler/lib/profiler_session.cc:144] Profiling is late (2021-02-02T17:49:44.946338176+00:00) for the scheduled start (2021-02-02T17:49:44.943887718+00:00) and will start immediately.
2021-02-02 17:49:44.947582: I tensorflow/core/profiler/lib/profiler_session.cc:155] Profiler session started.
2021-02-02 17:49:44.947660: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1365] Profiler found 2 GPUs
2021-02-02 17:49:44.949656: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcupti.so.11.0
2021-02-02 17:50:08.435260: I tensorflow/core/profiler/lib/profiler_session.cc:71] Profiler session collecting data.
2021-02-02 17:50:08.435591: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1487] CUPTI activity buffer flushed
2021-02-02 17:50:08.635192: I tensorflow/core/profiler/internal/gpu/cupti_collector.cc:228] GpuTracer has collected 0 callback api events and 0 activity events.
2021-02-02 17:50:08.648616: I tensorflow/core/profiler/rpc/profiler_service_impl.cc:67] Collecting XSpace to repository: ./logs/tbc_log/plugins/profile/2021_02_02_17_49_44/localhost_6009.xplane.pb
2021-02-02 17:50:08.650309: I tensorflow/core/profiler/lib/profiler_session.cc:172] Profiler session tear down.
2021-02-02 17:50:08.650676: W tensorflow/core/profiler/rpc/client/capture_profile.cc:133] No trace event is collected from localhost:6009
No trace event is collected. Automatically retrying.
2021-02-02 17:50:08.651046: I tensorflow/core/profiler/rpc/client/capture_profile.cc:198] Profiler delay_ms was 0, start_timestamp_ns set to 1612288208651017638 [2021-02-02T17:50:08.651017638+00:00]
Starting to trace for 20000 ms. Remaining attempt(s): 1
2021-02-02 17:50:08.651123: I tensorflow/core/profiler/rpc/client/remote_profiler_session_manager.cc:75] Deadline set to 2021-02-02T17:50:44.890124419+00:00 because max_session_duration_ms was 60000 and session_creation_timestamp_ns was 1612288184890124419 [2021-02-02T17:49:44.890124419+00:00]
2021-02-02 17:50:08.651274: I tensorflow/core/profiler/rpc/client/profiler_client.cc:113] Asynchronous gRPC Profile() to localhost:6009
2021-02-02 17:50:08.651391: I tensorflow/core/profiler/rpc/client/remote_profiler_session_manager.cc:96] Issued Profile gRPC to 1 clients
2021-02-02 17:50:08.651420: I tensorflow/core/profiler/rpc/client/profiler_client.cc:131] Waiting for completion.
2021-02-02 17:50:08.652492: I tensorflow/core/profiler/lib/profiler_session.cc:136] Profiler session initializing.
2021-02-02 17:50:08.652570: W tensorflow/core/profiler/lib/profiler_session.cc:144] Profiling is late (2021-02-02T17:50:08.652539729+00:00) for the scheduled start (2021-02-02T17:50:08.651017638+00:00) and will start immediately.
2021-02-02 17:50:08.652591: I tensorflow/core/profiler/lib/profiler_session.cc:155] Profiler session started.
2021-02-02 17:50:31.280828: I tensorflow/core/profiler/lib/profiler_session.cc:71] Profiler session collecting data.
2021-02-02 17:50:31.281134: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1487] CUPTI activity buffer flushed
2021-02-02 17:50:31.510697: I tensorflow/core/profiler/internal/gpu/cupti_collector.cc:228] GpuTracer has collected 0 callback api events and 0 activity events.
2021-02-02 17:50:31.515475: I tensorflow/core/profiler/rpc/profiler_service_impl.cc:67] Collecting XSpace to repository: ./logs/tbc_log/plugins/profile/2021_02_02_17_49_44/localhost_6009.xplane.pb
2021-02-02 17:50:31.518037: I tensorflow/core/profiler/lib/profiler_session.cc:172] Profiler session tear down.
2021-02-02 17:50:31.518440: W tensorflow/core/profiler/rpc/client/capture_profile.cc:133] No trace event is collected from localhost:6009
No trace event is collected. Automatically retrying.
2021-02-02 17:50:31.518819: I tensorflow/core/profiler/rpc/client/capture_profile.cc:198] Profiler delay_ms was 0, start_timestamp_ns set to 1612288231518793164 [2021-02-02T17:50:31.518793164+00:00]
Starting to trace for 20000 ms. Remaining attempt(s): 0
2021-02-02 17:50:31.518889: I tensorflow/core/profiler/rpc/client/remote_profiler_session_manager.cc:75] Deadline set to 2021-02-02T17:50:44.890124419+00:00 because max_session_duration_ms was 60000 and session_creation_timestamp_ns was 1612288184890124419 [2021-02-02T17:49:44.890124419+00:00]
2021-02-02 17:50:31.519021: I tensorflow/core/profiler/rpc/client/profiler_client.cc:113] Asynchronous gRPC Profile() to localhost:6009
2021-02-02 17:50:31.519124: I tensorflow/core/profiler/rpc/client/remote_profiler_session_manager.cc:96] Issued Profile gRPC to 1 clients
2021-02-02 17:50:31.519147: I tensorflow/core/profiler/rpc/client/profiler_client.cc:131] Waiting for completion.
2021-02-02 17:50:31.520067: I tensorflow/core/profiler/lib/profiler_session.cc:136] Profiler session initializing.
2021-02-02 17:50:31.520136: W tensorflow/core/profiler/lib/profiler_session.cc:144] Profiling is late (2021-02-02T17:50:31.520095781+00:00) for the scheduled start (2021-02-02T17:50:31.518793164+00:00) and will start immediately.
2021-02-02 17:50:31.520152: I tensorflow/core/profiler/lib/profiler_session.cc:155] Profiler session started.
2021-02-02 17:50:44.891412: W tensorflow/core/profiler/rpc/client/profiler_client.cc:152] Deadline exceeded: Deadline Exceeded
2021-02-02 17:50:44.891501: W tensorflow/core/profiler/rpc/client/capture_profile.cc:133] No trace event is collected from localhost:6009
2021-02-02 17:50:44.891526: W tensorflow/core/profiler/rpc/client/capture_profile.cc:145] localhost:6009 returned Deadline exceeded: Deadline Exceeded
No trace event is collected after 3 attempt(s). Perhaps, you want to try again (with more attempts?).
Tip: increase number of attempts with --num_tracing_attempts.
2021-02-02 17:50:44.891848: I tensorflow/core/profiler/lib/profiler_session.cc:172] Profiler session tear down.
Traceback (most recent call last):
File "keras_singleworker_2.py", line 37, in <module>
2021-02-02 17:50:44.893228: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1487] CUPTI activity buffer flushed
tf.profiler.experimental.client.trace('grpc://localhost:6009', './logs/tbc_log', 20000)
File "/fserver/jonathanb/miniconda3/envs/tf2.4/lib/python3.8/site-packages/tensorflow/python/profiler/profiler_client.py", line 131, in trace
_pywrap_profiler.trace(
tensorflow.python.framework.errors_impl.UnavailableError: No trace event was collected because there were no responses from clients or the responses did not have trace data.
我尝试在代码的其他位置找到 tf.profiler.experimental.server.start 和 tf.profiler.experimental.client.trace ,但没有成功。
解决方案
推荐阅读
- c++ - 返回完整的大写/小写转换字符串
- css - Bootstrap 4中li元素之间的内联管道分隔符
- c++ - 编写 C++ 程序“双倍的分数”
- visual-studio-code - 从命令行在 WSL 或 EMR 上安装 vscode 扩展
- java - Google Reflections 因扫描许多不必要的文件而减慢
- python - 向服务器 pyzmq 发送数据时出现问题
- html - 当“.top”类滚动到 150px 时,我希望“.title”类淡入淡出
- java - 对数组
在 Android 中用于过渡共享元素 - swift - Swift:在后台运行计时器并发送通知
- salt-stack - 你如何添加带有 https url 和 saltstack 的包 repo?