What's the correct way to use tf.data.Dataset.map?

Problem description

I have audio files and I'd like to make a tf.data.Dataset from their audio content (i.e. each audio file in the dataset should be represented as a vector of float values).

Here's my code:

def convert_audio_file_to_numpy_array(filepath):
  sample_rate = sox.file_info.sample_rate(filepath)
  audio, sr = librosa.load(filepath, sr=sample_rate)
  array = np.asarray(audio)
  return array

filenames_ds = tf.data.Dataset.from_tensor_slices(input_filepaths)
waveforms_ds = filenames_ds.map(convert_audio_file_to_numpy_array, num_parallel_calls=tf.data.AUTOTUNE)

This produces this error: TypeError: stat: path should be string, bytes, os.PathLike or integer, not Tensor

I'm using Dataset's map function following the pattern in this official tutorial (see the call to files_ds.map). In the tutorial, the function passed to map takes a filepath.

What am I doing differently to the official tutorial?

Tags: python, numpy, tensorflow

Solution

The problem is that the function def sample_rate(input_filepath: Union[str, Path]) -> float: expects a string or a pathlib.Path, whereas you are passing it a Tensor. (The elements of your filenames_ds are Tensors of type string.)
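A minimal sketch of why the argument is not a plain string: Dataset.map traces the mapped function as a graph, so the function receives a symbolic string Tensor rather than a Python str. The file names and the inspect helper below are made up for illustration; no files are opened.

```python
import tensorflow as tf

seen = []  # records what the mapped function actually receives

def inspect(filepath):
    # Dataset.map traces this function, so filepath is a symbolic Tensor here.
    seen.append((tf.is_tensor(filepath), isinstance(filepath, str), filepath.dtype))
    return filepath

# Hypothetical file names; they are never read from disk.
ds = tf.data.Dataset.from_tensor_slices(["a.wav", "b.wav"]).map(inspect)

# First field: is it a Tensor? Second: is it a str? Third: its dtype.
print(seen[0])
```

Any pure-Python function that calls os.stat or similar on that argument will therefore fail with the TypeError shown in the question.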

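In the TensorFlow tutorial, the data is loaded with TensorFlow functions that expect a Tensor of type string. You should check whether you can load your files with the native tf.audio functions.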
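As a rough sketch of the native route, assuming the files are 16-bit PCM WAV (the only format tf.audio.decode_wav handles) and that mono output is acceptable, the loading function can stay entirely in TensorFlow ops. The names load_wav and make_waveform_dataset are illustrative, not part of any library.

```python
import tensorflow as tf

def load_wav(filepath):
    # filepath is a scalar tf.string Tensor; tf.io.read_file accepts it directly.
    contents = tf.io.read_file(filepath)
    # decode_wav returns float32 audio of shape [samples, channels] plus the sample rate.
    audio, sample_rate = tf.audio.decode_wav(contents, desired_channels=1)
    # Drop the channel axis so each element is a 1-D vector of floats.
    return tf.squeeze(audio, axis=-1)

def make_waveform_dataset(filepaths):
    # Same map pattern as in the question, but with a function built from TensorFlow ops.
    filenames_ds = tf.data.Dataset.from_tensor_slices(filepaths)
    return filenames_ds.map(load_wav, num_parallel_calls=tf.data.AUTOTUNE)
```

With the list from the question this would be called as make_waveform_dataset(input_filepaths). Other formats (MP3, FLAC, and so on) would need a different decoder.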

Otherwise, a common workaround is to use a generator with tf.data.Dataset.from_generator, similar to the following solution:

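import librosa
import numpy as np
import sox
import tensorflow as tf

def generator_func(list_of_path):

  def convert_audio_file_to_numpy_array(filepath):
    # This runs as ordinary Python, so filepath is a plain str, not a Tensor.
    sample_rate = sox.file_info.sample_rate(filepath)
    audio, sr = librosa.load(filepath, sr=sample_rate)
    array = np.asarray(audio)
    return array

  for path in list_of_path:
    yield convert_audio_file_to_numpy_array(path)

# from_generator calls its first argument with no arguments,
# so the list of paths from the question is bound with a lambda.
ds = tf.data.Dataset.from_generator(
    lambda: generator_func(input_filepaths),
    output_signature=tf.TensorSpec(shape=(None,), dtype=tf.float32))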
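If you would rather keep the map call from the question, another commonly used option (not part of the answer above) is to wrap the Python loader in tf.numpy_function. The sketch below is only an illustration: load_waveform is a hypothetical stand-in that reads 16-bit PCM mono WAV files with the standard-library wave module, and a librosa.load call could take its place.

```python
import wave

import numpy as np
import tensorflow as tf

def load_waveform(path_bytes):
    # tf.numpy_function hands a scalar tf.string over as bytes, so decode it first.
    path = path_bytes.decode("utf-8")
    # Stand-in loader (assumption): 16-bit PCM mono WAV read via the stdlib.
    with wave.open(path, "rb") as f:
        frames = f.readframes(f.getnframes())
    # Scale int16 samples to float32 in [-1.0, 1.0).
    return (np.frombuffer(frames, dtype=np.int16) / 32768.0).astype(np.float32)

def load_with_numpy_function(filepath):
    # Run the Python loader from inside the traced map function.
    waveform = tf.numpy_function(load_waveform, [filepath], tf.float32)
    # numpy_function loses static shape information; declare a 1-D vector.
    waveform.set_shape([None])
    return waveform

def make_waveform_dataset(filepaths):
    filenames_ds = tf.data.Dataset.from_tensor_slices(filepaths)
    return filenames_ds.map(load_with_numpy_function,
                            num_parallel_calls=tf.data.AUTOTUNE)
```

The wrapped function still executes as Python, so it is not serialized with the graph and may limit how much the parallel calls actually help.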
