Understanding Digital Signal Processing Chapter 3

The continuous Fourier transform is given by

$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2 \pi f t}\,\mathrm{d}t$$

DFT: In the discrete case, this becomes a sum, where $t \rightarrow n$ and $f \rightarrow m/N$.

$$X(m) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2 \pi n m / N}$$

Another way to get to the discrete Fourier transform: Pick a test frequency $m/N$. Multiply your signal data with a cosine and a sine (or a complex exponential, your choice) at that test frequency, and sum up all $N$ terms. If the original signal has a component at that frequency, the net sum will be non-zero (basically a DC term arises when multiplying signals of the same frequency). Or, in the words of the original author:

Any individual X(m) output value is nothing more than the sum of the term-by-term products, a correlation, of an input signal sample sequence with a cosine and a sinewave whose frequencies are m complete cycles in the total sample interval of N samples.
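
This correlation view is easy to check numerically. A minimal sketch (assuming NumPy; the helper `dft_bin` and the test signal are my own, not the book's):

```python
import numpy as np

N = 64
n = np.arange(N)
x = np.cos(2 * np.pi * 3 * n / N)  # 3 complete cycles in N samples

def dft_bin(x, m):
    """Correlate x with a cosine and a sine of m cycles per N samples."""
    N = len(x)
    n = np.arange(N)
    real = np.sum(x * np.cos(2 * np.pi * n * m / N))   # cosine correlation
    imag = -np.sum(x * np.sin(2 * np.pi * n * m / N))  # sine correlation
    return real + 1j * imag

X3 = dft_bin(x, 3)   # matching test frequency: large "DC" net-sum, N/2 = 32
X5 = dft_bin(x, 5)   # non-matching test frequency: the products cancel, ~0
```

For a matching test frequency the products pile up like a DC term; for a non-matching one they average away, exactly as described above.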

The (discrete) Fourier transform of a real-valued signal is symmetric, as can be seen by sending $m \rightarrow N-m$ in the equation above and then dropping the resulting $2 \pi n$ term in the exponential (since $e^{-j 2 \pi n} = 1$ for integer $n$), which leads to

$$X(N-m) = X^*(m) \quad .$$
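
A quick numerical sanity check of this conjugate symmetry (a sketch assuming NumPy; the random test signal is my own):

```python
import numpy as np

# Check X(N-m) = X*(m) for a real-valued random signal.
rng = np.random.default_rng(0)
N = 16
x = rng.standard_normal(N)        # real-valued input
X = np.fft.fft(x)
m = np.arange(1, N)               # m = 0 maps to itself
sym_ok = np.allclose(X[N - m], np.conj(X[m]))
```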

The DFT of a symmetric signal, where $x(n) = x(N-n)$, is real. You can either see this by reversing the index of summation in the DFT's definition, or intuitively: a symmetric signal can be described with cosines only, so the amplitudes of the sines are all 0.
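
Checking this numerically as well (sketch, assuming NumPy; the construction of the symmetric signal is my own):

```python
import numpy as np

# A real signal with x(n) = x(N-n) has a purely real DFT.
rng = np.random.default_rng(1)
N = 16
x = np.empty(N)
x[0] = rng.standard_normal()              # x(0) is its own mirror partner
x[1:N // 2 + 1] = rng.standard_normal(N // 2)
x[N // 2 + 1:] = x[1:N // 2][::-1]        # enforce x(n) = x(N - n)
X = np.fft.fft(x)
imag_negligible = np.allclose(X.imag, 0)  # all sine amplitudes vanish
```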

In our definition of the DFT the calculated phases of the frequency components are always relative to cosines.

If you have a real-valued signal and want to know the amplitude of one of its frequency components $f_n < f_{\mathrm{s}}/2$, you have to divide the calculated (DFT) amplitude by $N/2$. You can see why the factor is $N/2$ by considering a test signal: for a cosine sitting exactly on a bin, the correlation sums $\cos^2$ terms, and each point in time contributes with $1/2$ on average. (Again, this is only true for real-valued signals, where the missing half sits at the other end of the spectrum!) Side note: the author is back to all-positive frequencies here... There are other definitions of the DFT where the scale factor $N$ is distributed differently, but that's not important to us.
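
The $N/2$ factor in action (sketch, assuming NumPy; amplitude and bin are arbitrary choices of mine):

```python
import numpy as np

# For a real cosine exactly on a bin, the DFT magnitude is A0 * N/2,
# so dividing by N/2 recovers the amplitude.
N = 128
A0 = 0.7
n = np.arange(N)
x = A0 * np.cos(2 * np.pi * 5 * n / N)   # 5 cycles, amplitude 0.7
X = np.fft.fft(x)
amplitude = np.abs(X[5]) / (N / 2)       # the "missing half" sits in bin N-5
```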

The "fundamental frequency" or frequency resolution when computing a DFT of a signal of length $N$ and sample rate $f_{\mathrm{s}}$ is given by

$$f_{\mathrm{res}} = \frac{f_{\mathrm{s}}}{N}$$

The inverse DFT follows from the view that "the signal is just a bunch of harmonic signals added together" and that we already know the magnitudes and phases of those signals. It differs from the DFT only in the sign of the exponential and in the scaling factor $1/N$.

$$x(n) = \frac{1}{N} \sum_{m=0}^{N-1} X(m)\, e^{j 2 \pi n m / N}$$
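
A round-trip check of both definitions, written out as explicit sums rather than via `np.fft` (sketch, assuming NumPy):

```python
import numpy as np

# Forward DFT, then inverse DFT with the 1/N factor: recovers the input.
rng = np.random.default_rng(2)
N = 32
x = rng.standard_normal(N)
m = np.arange(N)
n = np.arange(N)
X = np.array([np.sum(x * np.exp(-2j * np.pi * n * k / N)) for k in m])
x_back = np.array([np.sum(X * np.exp(2j * np.pi * m * k / N)) for k in n]) / N
round_trip_ok = np.allclose(x_back.real, x) and np.allclose(x_back.imag, 0)
```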

A time-shift of the input signal merely results in a phase-shift of the DFT.
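
The shift theorem, checked numerically (sketch, assuming NumPy; the test signal and shift amount are my own): a circular delay by $k$ samples multiplies $X(m)$ by $e^{-j 2 \pi k m / N}$, so magnitudes are untouched and only phases rotate.

```python
import numpy as np

N = 64
k = 5
n = np.arange(N)
x = np.cos(2 * np.pi * 3 * n / N) + 0.5 * np.sin(2 * np.pi * 7 * n / N)
x_shifted = np.roll(x, k)                 # circular time shift by k samples
X, X_shifted = np.fft.fft(x), np.fft.fft(x_shifted)
mags_equal = np.allclose(np.abs(X), np.abs(X_shifted))
phase_ok = np.allclose(X_shifted, X * np.exp(-2j * np.pi * k * np.arange(N) / N))
```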

Leakage: If the signal frequency is not one of the DFT frequency "bins" $f_{\mathrm{bin}}(m) = m f_{\mathrm{s}}/N$, signal energy "leaks" into all bins. Example: a real cosine signal with $k \in \mathbb{R}^+$ cycles in an $N$-point sample and amplitude $A_0$ has DFT amplitudes approximately given by the sinc function

$$X(m) = \frac{A_0 N}{2} \cdot \frac{\sin[\pi(k-m)]}{\pi(k-m)} \quad .$$

This makes sense: because we only look at $N$ points, we are effectively looking at a "burst", whose Fourier transform is a sinc.
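
The sinc expression can be checked against an actual DFT (sketch, assuming NumPy; note the expression ignores the negative-frequency image, so it is only accurate near the peak):

```python
import numpy as np

N = 64
A0 = 1.0
k = 3.4                                  # not an integer -> leakage
n = np.arange(N)
x = A0 * np.cos(2 * np.pi * k * n / N)
X = np.fft.fft(x)
m = np.array([3, 4])                     # the two bins straddling k
approx = A0 * N / 2 * np.sin(np.pi * (k - m)) / (np.pi * (k - m))
close = np.allclose(np.abs(X[m]), np.abs(approx), rtol=0.1)
```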

Leakage can be surprisingly asymmetric, especially if $f_{\mathrm{signal}}$ is near 0 or $f_{\mathrm{s}}/2$, because the leaking spectrum wraps around (not: is reflected!) at these corner frequencies.

Windowing is a technique to mitigate leakage. The input signal is multiplied element-wise with a window function, for instance a triangle function or a "raised cosine" (Hanning window, better than the triangle). Tradeoff: peaks become broader, and overall you lose signal (because start and end of the signal are forced towards 0). However, it's now easier to spot small signals in the vicinity of big ones. Other window functions: Chebyshev, Kaiser.
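
The leakage reduction is easy to demonstrate (sketch, assuming NumPy; the off-bin frequency and the "far" bin I look at are arbitrary choices of mine):

```python
import numpy as np

# A raised-cosine window trades a broader peak for much lower sidelobes.
N = 64
n = np.arange(N)
x = np.cos(2 * np.pi * 3.4 * n / N)          # off-bin frequency -> leakage
X_rect = np.abs(np.fft.fft(x))               # rectangular (no) window
X_hann = np.abs(np.fft.fft(x * np.hanning(N)))
# Compare leakage far from the peak (bin 20), relative to each peak height.
rect_leak = X_rect[20] / X_rect.max()
hann_leak = X_hann[20] / X_hann.max()
windowed_is_cleaner = hann_leak < rect_leak
```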

Scalloping is somewhat related to leakage: for signal frequencies exactly between two DFT bins, the peak magnitude drops by more than 30 % ($\approx -4$ dB), because the signal leaks into other bins. Windowing mitigates this effect as well (example: $\approx -2$ dB for Hamming). Real-world signals are often wider than one bin, making this not very relevant.
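
Measuring the scalloping loss directly (sketch, assuming NumPy; bin 10 and $N = 256$ are arbitrary choices of mine):

```python
import numpy as np

# A tone exactly between two bins (k = m + 0.5) loses about 3.9 dB of
# peak magnitude relative to an on-bin tone (loss factor sinc(0.5) = 2/pi).
N = 256
n = np.arange(N)
on_bin = np.abs(np.fft.fft(np.cos(2 * np.pi * 10.0 * n / N))).max()
between = np.abs(np.fft.fft(np.cos(2 * np.pi * 10.5 * n / N))).max()
loss_db = 20 * np.log10(between / on_bin)
```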

Zero-padding is used to increase the frequency resolution by appending 0's to the sample. One case where this makes sense is if the signal really is 0 outside of the sampled interval. But you can also think about it in a different way: when zero-padding, you are basically "probing" your signal with non-integer frequencies, which is of course absolutely fine. In the extreme limit, this becomes the discrete-time Fourier transform (DTFT), which has a continuous output. Careful: when combining zero-padding with windowing, only window the original $L$ non-zero samples. Note that the calculated DFT amplitudes are still set by the number of non-zero samples $L$, so relative to the total padded length $N$ they come out smaller.
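
The "probing with non-integer frequencies" picture can be verified: the padded DFT samples the same underlying spectrum on a finer grid, so the original bins reappear at every `pad_factor`-th padded bin (sketch, assuming NumPy):

```python
import numpy as np

L, pad_factor = 32, 4
N = L * pad_factor
n = np.arange(L)
x = np.cos(2 * np.pi * 3.3 * n / L)   # off-bin tone, L samples
X_short = np.fft.fft(x)               # L-point DFT
X_padded = np.fft.fft(x, N)           # zero-padded N-point DFT
grids_agree = np.allclose(X_short, X_padded[::pad_factor])
```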

DTFT stands for "discrete-time Fourier transform" and is basically zero-padding taken to the extreme, such that the transform's output is continuous. This does not improve resolution, which, due to the width of the (rectangular or funky-looking) "input window" surrounded by 0's, is limited by sidelobes to roughly $f_{\mathrm{res}} = 1/T_{\mathrm{sample}}$.

SNR is output signal power / average noise power.

You can think of a DFT's frequency bins as narrow bandpass filters. When you increase the sample length $N$, the average noise amplitude increases by $\sqrt{N}$, but the measured amplitude of a (coherent!) signal increases by $N$. Thus, the SNR (a power ratio) increases by $\left(N/\sqrt{N}\right)^2 = N$, assuming that the power in the signal bin is dominated by the signal, not the noise. The book does not give a concise definition of processing gain, only that due to the effects described above it increases by 3 dB per doubling of $N$.
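
A Monte-Carlo sketch of the ~3 dB gain per doubling of $N$ (assuming NumPy; the amplitude, bin, and trial count are my own choices, and "SNR" here is just signal-bin power over mean noise-bin power):

```python
import numpy as np

rng = np.random.default_rng(3)

def bin_snr_db(N, trials=200):
    """Average power SNR in the signal's DFT bin for a tone in white noise."""
    n = np.arange(N)
    snrs = []
    for _ in range(trials):
        noise = rng.standard_normal(N)
        x = np.cos(2 * np.pi * 8 * n / N) + noise   # coherent tone in bin 8
        P = np.abs(np.fft.fft(x)) ** 2
        noise_bins = np.delete(P[1:N // 2], 7)      # all bins except the signal's
        snrs.append(P[8] / noise_bins.mean())
    return 10 * np.log10(np.mean(snrs))

gain_db = bin_snr_db(2048) - bin_snr_db(1024)       # roughly +3 dB
```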

The DFT of a rectangular pulse is always a (sampled) sinc function; this sampled sinc is called the Dirichlet kernel. Assuming a pulse with $K$ non-zero values of amplitude $A_0$, the peak in the DFT has a height of $A_0 K$ and a full width (to 0 amplitude) of $2N/K$. In the book's example, the author unexpectedly uses negative indices and frequencies. Trying the example with Matlab, I get the same amplitudes, but the phases are quite different, which is due to his sneaky redefinition of the DFT in equation 3-36 (the $n$ in the exponent should be shifted if he shifts the summation index as he did!).
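
Checking the peak height and first null numerically (sketch, assuming NumPy; pulse placement at the start of the frame is my own choice):

```python
import numpy as np

# K ones in an N-point frame: peak height A0*K at m = 0, first null at
# m = N/K, i.e. a full width (null to null) of 2N/K bins.
N, K, A0 = 64, 8, 1.0
x = np.zeros(N)
x[:K] = A0
X = np.abs(np.fft.fft(x))
peak_is_A0K = np.isclose(X[0], A0 * K)
first_null = np.isclose(X[N // K], 0)
```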

Further, the DFT of a rectangular pulse has a phase-factor and an amplitude factor. The amplitude factor only depends on the amplitude, sample-length, and number of (consecutive) non-zero samples. In other words, if you shift the onset of the pulse, you only get phase shifts in the frequency domain.

The frequency axis can be expressed in many different ways: in frequency $f$ with a resolution of $f_{\mathrm{s}}/N$, in normalized frequency or "cycles per sample" $f/f_{\mathrm{s}}$ with resolution $1/N$, or in radians per sample $\omega$ with resolution $2\pi/N$. Note that in the latter case $\omega$ has the non-standard dimension radians/sample!