python - How to convert TensorFlow tensors to NumPy arrays in tf.Dataset.map() while multithreading/multiprocessing?

Problem description

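I am using TensorFlow 1.12 with eager execution. I have a computationally expensive function that I apply with ```tf.Dataset.map()```. Inside this function, which I developed by simply for-looping over the dataset in eager execution, I use ```EagerTensor.numpy()```. Because preprocessing my dataset takes too long, I now want to multithread the whole map process using ```tf.Dataset.map(myfunction, num_cores=30)```. However, ```myfunction()``` is then no longer executed eagerly. I tried two things: 1) wrapping ```myfunction()``` with ```tf.py_function```, in which case ```map()``` is no longer multithreaded, and 2) opening a ```tf.Session``` inside ```myfunction()``` and using ```.eval()``` instead of ```.numpy()``` to get my NumPy array, in which case I need to feed the input tensor (I don't understand why):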

```
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'arg0' with dtype string
     [[node arg0 (defined at /home/.../semantic_fpn.py:101)  = Placeholder[dtype=DT_STRING, shape=<unknown>]]
```

Is there an easy way for me to loop over my dataset with ```myfunction()``` (and its NumPy conversion) while multithreading/multiprocessing? I had additionally considered using a for-loop instead of ```.map()```, but I wouldn't know how to alter/write to records in a ```tf.Dataset```.
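
For illustration, the for-loop alternative could be sketched roughly as follows: run the expensive function over the raw records in a worker pool outside TensorFlow and collect the results in input order. This is only a sketch under assumptions. The body of `my_function`, the helper `preprocess_all`, and the file names are hypothetical placeholders and not the actual preprocessing code; the records are assumed to be strings because the error above reports `dtype string`.

```python
from concurrent.futures import ThreadPoolExecutor


def my_function(path):
    """Hypothetical stand-in for the expensive NumPy-based preprocessing."""
    # Placeholder work only: a real version would load `path` and
    # return the processed array for that record.
    return [float(ord(ch)) for ch in path]


def preprocess_all(paths, max_workers=4):
    """Apply my_function to every record in parallel, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        return list(pool.map(my_function, paths))


if __name__ == "__main__":
    records = ["a.png", "b.png", "c.png"]  # hypothetical file names
    features = preprocess_all(records)
    print(len(features))
```

Threads only give a speed-up when the work inside `my_function` releases the GIL, as many NumPy routines do; for pure-Python work, `ProcessPoolExecutor` offers the same `map` interface. If all results share one shape, the collected list could then be handed to something like `tf.data.Dataset.from_tensor_slices` instead of altering records in place.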

Tags: python, multithreading, numpy, tensorflow, tensor
