Wavelet analysis. Part 1

Wavelet analysis. Part 1


Introduction


Consider the discrete wavelet transform (DWT) implemented in the PyWavelets library PyWavelets 1.0.3 . PyWavelets is open source, free software licensed under the MIT license.

When processing data on a computer, a sampled version of a continuous wavelet transform can be performed, the basics of which are described in my previous article. However, setting discrete values ​​of parameters (a, b) of wavelets with an arbitrary step Δa and Δb requires a large number of calculations.

In addition, the result is an excessive number of coefficients, far exceeding the number of samples of the original signal that is not required for its reconstruction.

The discrete wavelet transform (DWT) implemented in the PyWavelets library provides enough information for both signal analysis and synthesis, being at the same time economical in the number of operations and in the required memory.

When you need to use a wavelet transform instead of a Fourier transform


Fourier transforms will work very well when the frequency spectrum is stationary. In this case, the frequencies present in the signal are independent of time, and the signal contains the xHz frequencies that are present in any place of the signal. The nonstationary signal, the worse the results will be. This is a problem, as most of the signals that we see in real life are non-stationary in nature.

The Fourier transform has a high resolution in the frequency domain, but zero resolution in the time domain. We show this in the following two examples.

Listing
  import numpy as np
 from scipy import fftpack
 from pylab import *
 N = 100,000
 dt = 1e-5
 xa = np.linspace (0, 1, num = N)
 xb = np.linspace (0, 1/4, num = N/4)
 frequencies = [4, 30, 60, 90]
 y1a, y1b = np.sin (2 * np.pi * frequencies [0] * xa), np.sin (2 * np.pi * frequencies [0] * xb)
 y2a, y2b = np.sin (2 * np.pi * frequencies [1] * xa), np.sin (2 * np.pi * frequencies [1] * xb)
 y3a, y3b = np.sin (2 * np.pi * frequencies [2] * xa), np.sin (2 * np.pi * frequencies [2] * xb)
 y4a, y4b = np.sin (2 * np.pi * frequencies [3] * xa), np.sin (2 * np.pi * frequencies [3] * xb)
 def spectrum_wavelet (y):
  Fs = 1/dt # sampling rate, Fs = 0.1 MHz
  n = len (y) # length of the signal
  k = np.arange (n)
  T = n/fs
  frq = k/T # two sides frequency range
  frq = frq [range (n//2)] # one side frequency range
  Y = fftpack.fft (y)/n # fft computing and normalization
  Y = Y [range (n//2)]/max (Y [range (n//2)])
  # plotting the data
  subplot (2, 1, 1)
  plot (k/N, y, 'b')
  ylabel ('Amplitude')
  grid ()
  # plotting the spectrum
  subplot (2, 1, 2)
  plot (frq [0: 140], abs (Y [0: 140]), 'r')
  xlabel ('Freq')
  plt.ylabel ('| Y (freq) |')
  grid ()
 y = y1a + y2a + y3a + y4a
 spectrum_wavelet (y)
 show ()
 show ()  


On this graph, all four frequencies are present in the signal for the entire duration of its action.

Listing
  import numpy as np
 from scipy import fftpack
 from pylab import *
 N = 100,000
 dt = 1e-5
 xa = np.linspace (0, 1, num = N)
 xb = np.linspace (0, 1/4, num = N/4)
 frequencies = [4, 30, 60, 90]
 y1a, y1b = np.sin (2 * np.pi * frequencies [0] * xa), np.sin (2 * np.pi * frequencies [0] * xb)
 y2a, y2b = np.sin (2 * np.pi * frequencies [1] * xa), np.sin (2 * np.pi * frequencies [1] * xb)
 y3a, y3b = np.sin (2 * np.pi * frequencies [2] * xa), np.sin (2 * np.pi * frequencies [2] * xb)
 y4a, y4b = np.sin (2 * np.pi * frequencies [3] * xa), np.sin (2 * np.pi * frequencies [3] * xb)
 def spectrum_wavelet (y):
  Fs = 1/dt # sampling rate, Fs = 0.1 MHz
  n = len (y) # length of the signal
  k = np.arange (n)
  T = n/fs
  frq = k/T # two sides frequency range
  frq = frq [range (n//2)] # one side frequency range
  Y = fftpack.fft (y)/n # fft computing and normalization
  Y = Y [range (n//2)]/max (Y [range (n//2)])
  # plotting the data
  subplot (2, 1, 1)
  plot (k/N, y, 'b')
  ylabel ('Amplitude')
  grid ()
  # plotting the spectrum
  subplot (2, 1, 2)
  plot (frq [0: 140], abs (Y [0: 140]), 'r')
  xlabel ('Freq')
  plt.ylabel ('| Y (freq) |')
  grid ()
 y = np.concatenate ([y1b, y2b, y3b, y4b])
 spectrum_wavelet (y)
 show ()
 show ()  



In this graph, the signals do not overlap in time, the side lobes are due to a gap between four different frequencies.

For two frequency spectra containing exactly the same four peaks, the Fourier transform cannot determine where these frequencies are present in the signal. The best approach for analyzing signals with a dynamic frequency spectrum is wavelet transform.

Basic Wavelet Properties


The choice of the type, and therefore the properties of the wavelet, depends on the task of analysis, for example, to determine the effective values ​​of currents in the power industry, greater accuracy is provided by the wavelets of Ingrid Daubeci of large orders. Wavelet properties can be obtained using the pywt.DiscreteContinuousWavelet () function in the following listing:

Listing
  import pywt
 from pylab import *
 from numpy import *
 discrete_wavelets = ['db5', 'sym5', 'coif5', 'haar']
 print ('discrete_wavelets-% s'% discrete_wavelets)
 st = 'db20'
 wavelet = pywt.DiscreteContinuousWavelet (st)
 print (wavelet)
 i = 1
 phi, psi, x = wavelet.wavefun (level = i)
 subplot (2, 1, 1)
 title ("The graph of the wavelet function itself is% s"% st)
 plot (x, psi, linewidth = 2, label = 'level =% s'% i)
 grid ()
 legend (loc = 'best')
 subplot (2, 1, 2)
 title ("Primitive Graph -% s"% st)
 plt.plot (x, phi, linewidth = 2, label = 'level =% s'% i)
 legend (loc = 'best')
 grid ()
 show ()  

We get:

discrete_wavelets - ['db5', 'sym5', 'coif5', 'haar']

  Wavelet db20
  Family name: Daubechies
  Short name: db
  Filters length: 40
  Orthogonal: True
  Biorthogonal: True
  Symmetry: asymmetric
  DWT: True
  CWT: False  



In a number of practical cases, it is necessary to obtain information about the center frequency of the psi wavelet — a function that is used, for example, in wavelet analysis of signals to identify defects in gears:

  import pywt
 fc = pywt.central_frequency ('haar', precision = 8)
 print (fc)
 # or so:
 scale = 1
 fс1 = pywt.scale2frequency ('haar', scale)
 print (fc1)  

  0.9961089494163424
 0.9961089494163424  

Using the center frequency $ f_ {c} $ maternal wavelet and the scale factor" a "you can convert the scale in the pseudo frequency  $ f_ {a} $ using the equation:

$ f_ {a} = \ frac {f_ {c}} {a} $

Signal Expansion Modes


Before computing a discrete wavelet transform using filter banks , it becomes necessary to extend the signal . Depending on the extrapolation method, significant artifacts can occur at the signal boundaries, leading to inaccurate DWT conversion.

PyWavelets provides several signal extrapolation methods that can be used to minimize this negative effect. To demonstrate these methods, use the following listing:

Listing demonstration of signal extension methods
  import numpy as np
 from matplotlib import pyplot as plt
 from pywt._doc_utils import boundary_mode_subplot
 # synthetic test signal
 x = 5 - np.linspace (-1.9, 1.1, 9) ** 2
 # Create a figure with one subplots per boundary mode
 fig, axes = plt.subplots (3, 3, figsize = (10, 6))
 plt.subplots_adjust (hspace = 0.5)
 axes = axes.ravel ()
 boundary_mode_subplot (x, 'symmetric', axes [0], symw = False)
 boundary_mode_subplot (x, 'reflect', axes [1], symw = True)
 boundary_mode_subplot (x, 'periodic', axes [2], symw = False)
 boundary_mode_subplot (x, 'antisymmetric', axes [3], symw = False)
 boundary_mode_subplot (x, 'antireflect', axes [4], symw = True)
 boundary_mode_subplot (x, 'periodization', axes [5], symw = False)
 boundary_mode_subplot (x, 'smooth', axes [6], symw = False)
 boundary_mode_subplot (x, 'constant', axes [7], symw = False)
 boundary_mode_subplot (x, 'zeros', axes [8], symw = False)
 plt.show ()  



The graphs show how a short signal (red) expands (black) beyond its original length.

Discrete Wavelet Transform


To demonstrate the DWT, we will use a signal with a dynamic frequency spectrum, which increases with time. The beginning of the signal contains low-frequency values, and the end of the signal contains frequencies of the shortwave range. This allows us to easily determine which part of the frequency spectrum is filtered out just by looking at the time axis:

Listing
  from pylab import *
 from numpy import *
 x = linspace (0, 1, num = 2048)
 chirp_signal = sin (250 * pi * x ** 2)
 fig, ax = subplots (figsize = (6,1))
 ax.set_title ("Signal with dynamic frequency spectrum")
 ax.plot (chirp_signal)
 show ()  




The discrete wavelet transform in PyWavelets 1.0.3 is the pywt.dwt () function, which calculates the approximating coefficients cA and the detail cD coefficients of the first level of the wavelet transform of the signal specified by the vector:

Listing the first transformation level
  import pywt
 from pylab import *
 from numpy import *
 x = linspace (0, 1, num = 2048)
 y = sin (250 * pi * x ** 2)
 st = 'sym5'
 (cA, cD) = pywt.dwt (y, st)
 subplot (2, 1, 1)
 plot (cA, 'b', linewidth = 2, label = 'cA, level-1')
 grid ()
 legend (loc = 'best')
 subplot (2, 1, 2)
 plot (cD, 'r', linewidth = 2, label = 'cD, level-1')
 grid ()
 legend (loc = 'best')
 show ()  




Listing of the fifth level of conversion
  import pywt
 from pylab import *
 from numpy import *
 x = linspace (0, 1, num = 2048)
 y = sin (250 * pi * x ** 2)
 st = 'sym5'
 (cA, cD) = pywt.dwt (y, st)
 (cA, cD) = pywt.dwt (cA, st)
 (cA, cD) = pywt.dwt (cA, st)
 (cA, cD) = pywt.dwt (cA, st)
 (cA, cD) = pywt.dwt (cA, st)
 subplot (2, 1, 1)
 plot (cA, 'b', linewidth = 2, label = 'cA, level-5')
 grid ()
 legend (loc = 'best')
 subplot (2, 1, 2)
 plot (cD, 'r', linewidth = 2, label = 'cD, level-5')
 grid ()
 legend (loc = 'best')
 show ()  



The approximation coefficients (cA) represent the output of the low-pass filter (averaging filter) DWT.The detail coefficients (cD) represent the output of the high-pass filter (differential filter) DWT.

You can use the pywt.wavedec () function to immediately calculate higher level coefficients. This function takes as input the source signal and the level and returns one set of approximation coefficients (n-th level) and n sets of detail coefficients (from 1 to n-th level). Here is an example for level five:

  from pywt import wavedec
 from pylab import *
 from numpy import *
 x = linspace (0, 1, num = 2048)
 y = sin (250 * pi * x ** 2)
 st = 'sym5'
 coeffs = wavedec (y, st, level = 5)
 subplot (2, 1, 1)
 plot (coeffs [0], 'b', linewidth = 2, label = 'cA, level-5')
 grid ()
 legend (loc = 'best')
 subplot (2, 1, 2)
 plot (coeffs [1], 'r', linewidth = 2, label = 'cD, level-5')
 grid ()
 legend (loc = 'best')
 show ()  

As a result, we get the same graphs as in the previous example. You can get the coefficients cA and cD separately:

For cA:

  import pywt
 from pylab import *
 from numpy import *
 x = linspace (0, 1, num = 2048)
 data = sin (250 * pi * x ** 2)
 coefs = pywt.downcoef ('a', data, 'db20', mode = 'symmetric', level = 1)  

For cD:

  import pywt
 from pylab import *
 from numpy import *
 x = linspace (0, 1, num = 2048)
 data = sin (250 * pi * x ** 2)
 coefs = pywt.downcoef ('d', data, 'db20', mode = 'symmetric', level = 1)
  

Filter Bank


We discussed some of the issues related to conversion levels in the previous section. However, DWT is always implemented as a filter bank in the form of a cascade of high-frequency and low-frequency filters. Filter banks are a very effective way of splitting a signal into several frequency subbands.

At the first stage with a small scale analyzing the high-frequency signal behavior. At the second stage, the scale increases by a factor of two (the frequency decreases by a factor of two), and we analyze the behavior of about half of the maximum frequency. At the third stage, the scale factor is equal to four, and we analyze the frequency behavior of about a quarter of the maximum frequency. And this continues until we reach the maximum level of decomposition.

The maximum level of decomposition can be calculated using the pywt.wavedec () function, while decomposition and detailing will look like this:

Listing
  import pywt
 from pywt import wavedec
 from pylab import *
 from numpy import *
 x = linspace (0, 1, num = 2048)
 data = sin (250 * pi * x ** 2)
 n_level = pywt.dwt_max_level (len (data), 'sym5')
 print ('Maximum decomposition level:% s'% n_level)
 x = linspace (0, 1, num = 2048)
 y = sin (250 * pi * x ** 2)
 st = 'sym5'
 coeffs = wavedec (y, st, level = 7)
 subplot (2, 1, 1)
 plot (coeffs [0], 'b', linewidth = 2, label = 'cA, level-7')
 grid ()
 legend (loc = 'best')
 subplot (2, 1, 2)
 plot (coeffs [1], 'r', linewidth = 2, label = 'cD, level-7')
 grid ()
 legend (loc = 'best')
 show ()
  


We get:

Maximum level of decomposition: 7



The decomposition stops when the signal becomes shorter than the filter length for a given sym5 wavelet. For example, suppose we have a signal with frequencies up to 1000 Hz. At the first stage, we divide our signal into low-frequency and high-frequency parts, i.e. 0-500 Hz and 500-1000 Hz. At the second stage, we take the low-frequency part and again divide it into two parts: 0-250 Hz and 250-500 Hz. In the third stage, we divided the part of 0-250 Hz into part of 0-125 Hz and part of 125-250 Hz. This continues until we reach the maximum level of decomposition.

Analyzing wav files using fft Fourier and wavelet scalars


For analysis, we will use the WebSDR file .Consider the analysis of the reduced signal by means of triang from scipy.signal and the implementation of the discrete Fourier transform in python (fft from scipy.fftpack). If the length of the fft sequence is not equal to 2n, then instead of the fast Fourier transform (fft), a discrete Fourier transform (dft) will be performed. This is how this command works.

We use the fast Fourier transform buffer according to the following scheme (numerical data for an example):

fftbuffer = np.zeros (15); create a buffer filled with zeros;
fftbuffer [: 8] = x [7:]; move the end of the signal to the first part of the buffer; fftbuffer [8:] = x [: 7] —shifts the start of the signal to the last part of the buffer; X = fft (fftbuffer) - consider the Fourier transform of the buffer filled with the signal values.

To make the phase spectrum more readable, apply phase deployment. To do this, change the line with the calculation of the phase characteristics: pX = np.unwrap (np.angle (X)).

Listing for fft analysis of a signal fragment
  import numpy as np
 from pylab import *
 from scipy import *
 import scipy.io.wavfile as wavfile
 M = 501
 hM1 = int (np.floor ((1 + M)/2))
 hM2 = int (np.floor (M/2))
 (fs, x) = wavfile.read ('WebSDR.wav')
 x1 = x [5000: 5000 + M] * np.hamming (M)
 N = 511
 fftbuffer = np.zeros ([N])
 fftbuffer [: hM1] = x1 [hM2:]
 fftbuffer [N-hM2:] = x1 [: hM2]
 X = fft (fftbuffer)
 mX = abs (X)
 pX = np.angle (X)
 suptitle ("WebSDR Radio Signal Analysis")
 subplot (3, 1, 1)
 st = 'Input Signal (WebSDR.wav)'
 plot (x, linewidth = 2, label = st)
 legend (loc = 'center')
 subplot (3, 1, 2)
 st = 'Frequency spectrum of the input signal'
 plot (mX, linewidth = 2, label = st)
 legend (loc = 'best')
 subplot (3, 1, 3)
 st = 'Phase spectrum of the input signal'
 pX = np.unwrap (np.angle (X))
 plot (pX, linewidth = 2, label = st)
 legend (loc = 'best')
 show ()  



For a comparative analysis, we will use the wavelet scalar , which can be constructed using the tree = pywt.wavedec function (signal, 'coif5' ) in matplotlib.

Listing wavelet scalars
  from pylab import *
 import pywt
 import scipy.io.wavfile as wavfile
 # Find the highest power of the two channels, which is less than or equal to the input.
 def lepow2 (x):
  return int (2 ** floor (log2 (x)))
 # Skalogram considering the MRA tree.
 def scalogram (data):
  bottom = 0
  vmin = min (map (lambda x: min (abs (x)), data))
  vmax = max (map (lambda x: max (abs (x)), data))
  gca (). set_autoscale_on (False)
  for row in range (0, len (data)):
  scale = 2.0 ** (row - len (data))
  imshow (
  array ([abs (data [row])]),
  interpolation = 'nearest',
  vmin = vmin,
  vmax = vmax,
  extent = [0, 1, bottom, bottom + scale])
  bottom + = scale
 # Download the signal, take the first channel.
 rate, signal = wavfile.read ('WebSDR.wav')
 signal = signal [0: lepow2 (len (signal))]
 tree = pywt.wavedec (signal, 'coif5')
 gray ()
 scalogram (tree)
 show ()  




Thus, scalogram gives a more detailed answer to the question of the distribution of frequencies over time, and the fast Fourier transform is responsible for the frequencies themselves. It all depends on the task, even for such a simple example.

Conclusions


  1. The rationale for using discrete wavelet transforms for dynamic signals is given.
  2. Examples of wavelet analysis using PyWavelets 1.0.3 - free open source software released under the MIT license.
  3. Considered software tools for practical use of the PyWavelets library.

Source text: Wavelet analysis. Part 1