Inaccurate audio data plot with FFT (Python)

Problem description

I'm a beginner in Python and I'm trying to plot the full spectrum of an audio WAV file. I was given a CSV file containing frequency and magnitude values as a reference. I have been tuning the parameters to reproduce that plot, so far without success. Since the audio file is stereo, I am using the second channel for the plot.
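
One thing worth checking first is which axis holds the channels, because loaders disagree: `scipy.io.wavfile.read` returns stereo data shaped (samples, channels), while `librosa.load(..., mono=False)` returns (channels, samples). A minimal sketch of a layout-agnostic selector follows; `select_channel` is a hypothetical helper name, and it assumes the recording has more samples than channels.

```python
import numpy as np


def select_channel(samples, channel=1):
    """Return one channel as a 1-D float array.

    Hypothetical helper. Assumes the longer axis is time, which holds
    for any real recording (far more samples than channels).
    """
    samples = np.asarray(samples)
    if samples.ndim == 1:
        # Mono input: there is only one channel to return.
        return samples.astype(float)
    if samples.shape[0] < samples.shape[1]:
        # Layout (channels, samples), as returned by librosa.load(mono=False).
        return samples[channel].astype(float)
    # Layout (samples, channels), as returned by scipy.io.wavfile.read.
    return samples[:, channel].astype(float)


# Synthetic stereo data in both layouts: left = even numbers, right = odd.
stereo_samples_first = np.arange(20).reshape(10, 2)
stereo_channels_first = stereo_samples_first.T

right_a = select_channel(stereo_samples_first, channel=1)
right_b = select_channel(stereo_channels_first, channel=1)
```

With the (samples, channels) layout, plain `signal[1]` would pick the second sample pair rather than the second channel, so the FFT would see only two values.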

Here's some info about the audio file:

As you can see from the image below, the first plot is from the CSV data and the second is my result. My result is very noisy and messy, although it shares some characteristics with the CSV plot. Is there any way to get a reasonably close plot? Feel free to correct anything that is wrong in my code. Thanks.

CSV Plot vs Result Plot

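import numpy as np
import matplotlib.pyplot as plt

# `signal` holds the audio samples and `sr` the sample rate (both loaded earlier)
if signal.ndim > 1:
  # Second channel; assumes shape (channels, samples), e.g. from librosa.load(..., mono=False)
  x = signal[1]
  fft = np.fft.rfft(x)  # real input: keep only the bins from 0 Hz to sr/2
  magnitude = 20 * np.log10(np.abs(fft) / 2e-5)  # Convert to dB unit
  frequency = np.fft.rfftfreq(len(x), d=1 / sr)  # frequency of each bin in Hz

  plt.figure(figsize=(25, 5))
  plt.semilogx(frequency, magnitude, 'b')
  plt.grid(which='major')
  plt.grid(which='minor', linestyle=':')
  plt.title('Second Signal')
  plt.xlabel("Frequency (Hz)")
  plt.ylabel("SPL (dB)")

  plt.show()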

I'm pretty sure the dB conversion is wrong :(
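
For comparison, here is a minimal sketch of how a single-sided amplitude spectrum is commonly normalized before converting to dB: divide the FFT magnitude by the number of samples, double the bins that have a mirrored negative-frequency twin, and convert peak amplitude to RMS. The function name `amplitude_spectrum_db` is hypothetical, and the sketch assumes the samples are already calibrated in pascals so that `ref=2e-5` is meaningful; without calibration the curve is only correct up to a constant offset.

```python
import numpy as np


def amplitude_spectrum_db(x, sr, ref=2e-5):
    """Single-sided RMS amplitude spectrum in dB relative to `ref`.

    Hypothetical helper. Assumes `x` is a 1-D real signal whose samples
    are in the same unit as `ref` (pascals for ref=2e-5).
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / sr)
    amplitude = np.abs(np.fft.rfft(x)) / n  # normalize by signal length
    # Fold the negative frequencies in: double every bin except DC
    # (and except Nyquist, which only exists when n is even).
    if n % 2 == 0:
        amplitude[1:-1] *= 2.0
    else:
        amplitude[1:] *= 2.0
    # Treat every bin as a sinusoid: RMS = peak / sqrt(2).
    # (The DC bin therefore reads about 3 dB low.)
    rms = amplitude / np.sqrt(2.0)
    tiny = np.finfo(float).tiny  # avoid log10(0) on empty bins
    level_db = 20.0 * np.log10(np.maximum(rms, tiny) / ref)
    return freqs, level_db


# Sanity check with a synthetic 1 kHz tone of peak amplitude 1.0.
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)
freqs, level_db = amplitude_spectrum_db(tone, sr)
peak_hz = freqs[np.argmax(level_db)]
peak_db = level_db.max()
expected_db = 20.0 * np.log10((1.0 / np.sqrt(2.0)) / 2e-5)
```

Without the division by the signal length, the level grows with the duration of the recording, which would explain an offset against the reference. The noisy look is a separate issue: a single FFT over the whole file does no averaging.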

*Update: Instead of a plain FFT, I used Welch's method from SciPy. The result is pretty close but could be improved further.

import librosa
from scipy import signal

x = data[1] * (2**15)  # scale float samples (assumed in [-1.0 .. 1.0]) up to the 16-bit integer range
segment_size = sr
noverlap = segment_size // 2  # 50% overlap, as an integer number of samples
f, Pxx = signal.welch(x,                        # signal
                  fs=sr,                    # sample rate
                  nperseg=segment_size,     # segment size
                  window='hann',         # window type to use
                  nfft=segment_size,        # num. of samples in FFT
                  detrend=False,            # no detrending (DC part is kept)
                  scaling='spectrum',       # return power spectrum [V^2]
                  noverlap=noverlap)        # overlap between segments

ref = 2e-5
p = librosa.power_to_db(Pxx, ref=ref) + 4.5 # add 4.5 dB for scale factor
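
Since `scaling='spectrum'` returns power (amplitude squared), a pressure reference such as 20 µPa would normally be squared before taking `10 * log10`. Below is a minimal sketch of that conversion using only NumPy and SciPy; `welch_level_db` is a hypothetical name, and it again assumes samples calibrated in pascals, which may not hold for this file.

```python
import numpy as np
from scipy import signal


def welch_level_db(x, sr, ref=2e-5, nperseg=None):
    """Welch power spectrum converted to dB relative to `ref` squared.

    Hypothetical helper. Assumes `x` is 1-D and in the same unit as
    `ref`; the default segment length is one second of audio.
    """
    x = np.asarray(x, dtype=float)
    if nperseg is None:
        nperseg = min(int(sr), len(x))
    freqs, power = signal.welch(
        x,
        fs=sr,
        window="hann",
        nperseg=nperseg,
        noverlap=nperseg // 2,  # 50% overlap between segments
        detrend=False,          # keep the DC component
        scaling="spectrum",     # power spectrum, unit of x squared
    )
    tiny = np.finfo(float).tiny  # avoid log10(0) on empty bins
    # Power quantity: 10 * log10, with the reference squared.
    level_db = 10.0 * np.log10(np.maximum(power, tiny) / ref**2)
    return freqs, level_db


# Sanity check: a 1 kHz tone of peak amplitude 1.0 has RMS 1/sqrt(2).
sr = 8000
t = np.arange(4 * sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)
freqs, level_db = welch_level_db(tone, sr)
peak_hz = freqs[np.argmax(level_db)]
peak_db = level_db.max()
expected_db = 20.0 * np.log10((1.0 / np.sqrt(2.0)) / 2e-5)
```

If the levels come out right with this form, the extra `+ 4.5` offset and the `2**15` scaling should no longer be needed as tuning knobs; any remaining constant offset would then point to a missing calibration factor.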

CSV Plot vs Result Plot

Tags: python, audio, signals, fft

Solution

