首页 > 解决方案 > FFT time series with no change in data?

问题描述

i want to calculate a Fast Fourier Transform, i want to see the rate that network packets are send over the network. For that, i count the packets being send per second.

Some packets have a sending rate of 1Hz, this results in this time series data: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

Is there a way for perform FFT on this array? I always get 0Hz as result. Or is this not a case for a FFT? [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1] works fine..

So basically I want transform this:

input

into this:

enter image description here

标签: pythonfft

解决方案


Before digging into the details of your problem, let's illustrate some theory using time series nomenclature.

Given a certain function f(t), this can be sampled with a certain frequency Δt for a certain period of time T = n * Δt called the duration (n being the number of samples). Now you can apply the Discrete Fourier Transform (DFT) to f(t) and you obtain the frequency distribution F(ν), which is also a (complex) function with the same number of samples n. I assume you are familiar with the meaning of frequency distribution, and I will not discuss this further. The only property I would like to highlight here is that DFT is defined circularly, meaning that it is assumed that f(t) = f(t + T) for all t and analogously F(ν) = F(ν + Ν) for all ν (Ν being the frequency period), i.e. that they are both periodic.

The fundamental question now is: what is the sampling Δν and the duration Ν of ν for F(ν)? Of course, Wikipedia knows this. As you might guess, this is related to Δt and T. In particular: Δν = 1 / T and hence Ν = n * Δν = 1 / Δt.

NumPy implements DFT using the Fast Fourier Transform (FFT) algorithm, which has some nice properties with regards to F(ν), most notably it always computes the 0th frequency (i.e. F(ν = 0)) and in the absence of shifts this is also the first coefficient of F(ν). Note that because of the periodicity of F(ν) the negative frequencies are actually after the positive frequencies. To get F_ν centered around F(ν = 0) you have to use: np.fft.fftshift(). If n is even, F(ν = 0) is the first value in the right half of F_ν and extra care is required to define ν_shift correctly and represent F(ν) with negative and positive ν values.

To get a feeling of what the plots should be, please refer again to Wikipedia for notable Fourier transform function pairs.

Just to illustrate all this with code, here is some that computes DFT for the superposition of two sinusoidal waves:

import numpy as np
import matplotlib.pyplot as plt

π = np.pi

# Number of samples
n = 400

# Time uration and sampling
T = 80
Δt = T / n
print(f'Δt = {Δt}, T = {T}')
# Δt = 0.2, T = 80

# Frequency duration and sampling
Δν = 1 / T
Ν = 1 / Δt # or: n * Δν
print(f'Δν = {Δν}, Ν = {Ν}')
# Δν = 0.0125, Ν = 5.0

# Define time and frequency (endpoint=False takes care of periodicity)
t = np.linspace(0, T, n, endpoint=False)  
ν = np.linspace(0, Ν, n, endpoint=False)
# Generate a certain f(t) = A₀ sin(ν₀ * 2πt) + A₁ sin(ν₁ * 2πt)
ν_ = 1, 2
A_ = 1, 3
f_t = A_[0] * np.sin(ν_[0] * 2 * π * t) + A_[1] * np.sin(ν_[1] * 2 * π * t)

# Compute the F(ν) as the DFT(f(t))
# F(ν) ∝ A₀ (δ(ν - ν₀) + δ(ν + ν₀)) + A₁ (δ(ν - ν₁) + δ(ν + ν₁))
# δ is the [Dirac Delta function][3]
F_ν = np.fft.fft(f_t)

# Center F(ν) around ν = 0
F_ν_shift = np.fft.fftshift(F_ν)
ν_shift = ν - (Ν / 2) + (Δν / 2 if n % 2 else 0)

# Plot the result
fig, axs = plt.subplots(1, 3, figsize=(16, 4), squeeze=False)
axs[0, 0].plot(t, f_t)
axs[0, 1].plot(ν, 2.0 / n * np.abs(F_ν))
axs[0, 2].plot(ν_shift, 2.0 / n * np.abs(F_ν_shift))
axs[0, 1].axvline(ν_[0], color='green')
axs[0, 1].axvline(ν_[1], color='orange')
axs[0, 2].axvline(ν_[0], color='green')
axs[0, 2].axvline(ν_[1], color='orange')
plt.show()

generic_fft_plot

(You should probably play around with the above code to understand what FFT is doing).

Note that in all this the sampling frequency did not show up as a peak in F(ν)! It does play a role, but in effects like aliasing and spectral leakage, both related to the Shannon-Nyquist theorem.


Now, to your problem, you are comparing f_t when this is basically a constant -- whose Fourier transform is a Dirac Delta centered at ν = 0, δ(0) -- and when this is an alternating function, which could be seen as sin(ν₀ * 2πt) for ν₀ = Ν / 2 -- whose Fourier transform is (as before) proportional to a sum of Dirac's deltas at ν = ±ν₀ = ±Ν / 2.

The code below will help you visualize what is happening:

import numpy as np
import matplotlib.pyplot as plt

π = np.pi

# Number of samples
n = 20

# Time duration and sampling
T = 20
Δt = T / n

# Frequency duration and sampling
Δν = 1 / T
Ν = 1 / Δt

# Define time and frequency
t = np.linspace(0, T, n, endpoint=False)
ν = np.linspace(0, Ν, n, endpoint=False)

# Plot the result
fig, axs = plt.subplots(2, 3, figsize=(16, 8), squeeze=False)

# Plotting DFT for constant function
f_t = np.array([1 for _ in range(n)])
F_ν = np.fft.fft(f_t)
F_ν_shift = np.fft.fftshift(F_ν)
ν_shift = ν - (Ν / 2) + (Δν / 2 if n % 2 else 0)

axs[0, 0].plot(t, f_t)
axs[0, 1].plot(ν, 2.0 / n * np.abs(F_ν))
axs[0, 2].plot(ν_shift, 2.0 / n * np.abs(F_ν_shift))
axs[0, 1].axvline(0, color='green')
axs[0, 2].axvline(0, color='green')

# Plotting DFT for an alternating function
f_t = np.array([i % 2 for i in range(n)])
F_ν = np.fft.fft(f_t)
F_ν_shift = np.fft.fftshift(F_ν)
ν_shift = ν - (Ν / 2) + (Δν / 2 if n % 2 else 0)

axs[1, 0].plot(t, f_t)
axs[1, 1].plot(ν, 2.0 / n * np.abs(F_ν))
axs[1, 2].plot(ν_shift, 2.0 / n * np.abs(F_ν_shift))
axs[1, 1].axvline(0, color='green')
axs[1, 1].axvline(Ν / 2, color='orange')
axs[1, 2].axvline(0, color='green')
axs[1, 2].axvline(Ν / 2, color='orange')
axs[1, 2].axvline(-Ν / 2, color='orange')

plt.show()

fft_const_alternate

(Note that the peak at ν = +Ν / 2 is not actually there because when n is even, the positive frequencies are 1 sample less than the negative frequencies and the peak is EXACTLY at the edge.)

We can observe that there is a peak in the middle of the non-shifted FFT for the alternating series, that is not present for the constant one.

Hence, what you observe is correct, BUT your interpretation is wrong.

If we assume that the constant and the alternating f(t) are obtained at the same sampling rate, we can conclude that there is no relationship between the sampling rate and the appearance of F(ν).

On the other hand, if we assume that the constant and the alternating f(t) are two sampling of the same event with different sampling rate and the alternating one is obtained at a higher sampling rate (which is not the situation described in the above plots), then we can conclude that the emergence of the additional peaks at ν = ±Ν / 2 = ±1 / (2 Δt) is related to the insufficient sampling of f(t) when the function appeared constant. The position of these new peaks is a manifestation of aliasing which is related to the Shannon-Nyquist theorem mentioned earlier.


Finally, if you want to observe network traffic analytics the number of packets per unit of time IS already "the rate that network packets are send over the network".


推荐阅读