Understanding Digital Signal Processing Chapter 3

The continuous Fourier transform is given by

$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2 \pi f t}\,\mathrm{d}t$$

DFT: In the discrete case, this becomes a sum, where $t \rightarrow n$ and $f \rightarrow m/N$.

$$X(m) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2 \pi n m / N}$$

Another way to get to the discrete Fourier transform: Pick a test frequency $m/N$. Multiply your signal data with a cosine and a sine (or a complex exponential, your choice) at that test frequency, and sum up all $N$ terms. If the original signal has a component at that frequency, the net sum will be non-zero (basically a DC term arises when multiplying signals of the same frequency). Or, in the words of the original author:

Any individual X(m) output value is nothing more than the sum of the term-by-term products, a correlation, of an input signal sample sequence with a cosine and a sinewave whose frequencies are m complete cycles in the total sample interval of N samples.
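
This correlation view is easy to check numerically. A minimal sketch (assuming NumPy; the helper `dft_bin` and the test signal are my own, not the book's):

```python
import numpy as np

N = 64
n = np.arange(N)
x = np.cos(2 * np.pi * 3 * n / N)  # 3 complete cycles in N samples

def dft_bin(x, m):
    """Correlate x with a cosine and a sine of m cycles per N samples."""
    N = len(x)
    n = np.arange(N)
    real = np.sum(x * np.cos(2 * np.pi * n * m / N))   # cosine correlation
    imag = -np.sum(x * np.sin(2 * np.pi * n * m / N))  # sine correlation
    return real + 1j * imag

X3 = dft_bin(x, 3)   # matching test frequency: large "DC" net-sum, N/2 = 32
X5 = dft_bin(x, 5)   # non-matching test frequency: the products cancel, ~0
```

For a matching test frequency the products pile up like a DC term; for a non-matching one they average away, exactly as described above.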

The (discrete) Fourier transform of a real-valued signal is symmetric, as can be seen by sending $m \rightarrow N-m$ in the equation above and then dropping the resulting $2 \pi n$ term in the exponential (since $e^{-j 2 \pi n} = 1$ for integer $n$), which leads to

$$X(N-m) = X^*(m) \quad .$$
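
A quick numerical sanity check of this conjugate symmetry (a sketch assuming NumPy; the random test signal is my own):

```python
import numpy as np

# Check X(N-m) = X*(m) for a real-valued random signal.
rng = np.random.default_rng(0)
N = 16
x = rng.standard_normal(N)        # real-valued input
X = np.fft.fft(x)
m = np.arange(1, N)               # m = 0 maps to itself
sym_ok = np.allclose(X[N - m], np.conj(X[m]))
```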

The DFT of a symmetric signal, where $x(n) = x(N-n)$, is real. You can either see this by reversing the index of summation in the DFT's definition, or intuitively: a symmetric signal can be described with cosines only, so the amplitudes of the sines are all 0.
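
Checking this numerically as well (sketch, assuming NumPy; the construction of the symmetric signal is my own):

```python
import numpy as np

# A real signal with x(n) = x(N-n) has a purely real DFT.
rng = np.random.default_rng(1)
N = 16
x = np.empty(N)
x[0] = rng.standard_normal()              # x(0) is its own mirror partner
x[1:N // 2 + 1] = rng.standard_normal(N // 2)
x[N // 2 + 1:] = x[1:N // 2][::-1]        # enforce x(n) = x(N - n)
X = np.fft.fft(x)
imag_negligible = np.allclose(X.imag, 0)  # all sine amplitudes vanish
```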

In our definition of the DFT the calculated phases of the frequency components are always relative to cosines.

If you have a real-valued signal and want to know the amplitude of one of its frequency components $f_n < f_{\mathrm{s}}/2$, you have to divide the calculated (DFT) amplitude by $N/2$. You can see why the factor is $N/2$ by considering a test signal: for a cosine sitting exactly on a bin, the correlation sums $\cos^2$ terms, and each point in time contributes with $1/2$ on average. (Again, this is only true for real-valued signals, where the missing half sits at the other end of the spectrum!) Side note: the author is back to all-positive frequencies here... There are other definitions of the DFT where the scale factor $N$ is distributed differently, but that's not important to us.
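
The $N/2$ factor in action (sketch, assuming NumPy; amplitude and bin are arbitrary choices of mine):

```python
import numpy as np

# For a real cosine exactly on a bin, the DFT magnitude is A0 * N/2,
# so dividing by N/2 recovers the amplitude.
N = 128
A0 = 0.7
n = np.arange(N)
x = A0 * np.cos(2 * np.pi * 5 * n / N)   # 5 cycles, amplitude 0.7
X = np.fft.fft(x)
amplitude = np.abs(X[5]) / (N / 2)       # the "missing half" sits in bin N-5
```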

The "fundamental frequency" or frequency resolution when computing a DFT of a signal of length $N$ and sample rate $f_{\mathrm{s}}$ is given by

$$f_{\mathrm{res}} = \frac{f_{\mathrm{s}}}{N}$$

The inverse DFT follows from the view that "the signal is just a bunch of harmonic signals added together" and that we already know the magnitudes and phases of those signals. It differs from the DFT only in the sign of the exponential and in the scaling factor $1/N$.

$$x(n) = \frac{1}{N} \sum_{m=0}^{N-1} X(m)\, e^{j 2 \pi n m / N}$$
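
A round-trip check of both definitions, written out as explicit sums rather than via `np.fft` (sketch, assuming NumPy):

```python
import numpy as np

# Forward DFT, then inverse DFT with the 1/N factor: recovers the input.
rng = np.random.default_rng(2)
N = 32
x = rng.standard_normal(N)
m = np.arange(N)
n = np.arange(N)
X = np.array([np.sum(x * np.exp(-2j * np.pi * n * k / N)) for k in m])
x_back = np.array([np.sum(X * np.exp(2j * np.pi * m * k / N)) for k in n]) / N
round_trip_ok = np.allclose(x_back.real, x) and np.allclose(x_back.imag, 0)
```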

A time-shift of the input signal merely results in a phase-shift of the DFT.
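
The shift theorem, checked numerically (sketch, assuming NumPy; the test signal and shift amount are my own): a circular delay by $k$ samples multiplies $X(m)$ by $e^{-j 2 \pi k m / N}$, so magnitudes are untouched and only phases rotate.

```python
import numpy as np

N = 64
k = 5
n = np.arange(N)
x = np.cos(2 * np.pi * 3 * n / N) + 0.5 * np.sin(2 * np.pi * 7 * n / N)
x_shifted = np.roll(x, k)                 # circular time shift by k samples
X, X_shifted = np.fft.fft(x), np.fft.fft(x_shifted)
mags_equal = np.allclose(np.abs(X), np.abs(X_shifted))
phase_ok = np.allclose(X_shifted, X * np.exp(-2j * np.pi * k * np.arange(N) / N))
```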

Leakage: If the signal frequency is not one of the DFT frequency "bins" $f_{\mathrm{bin}}(m) = m f_{\mathrm{s}}/N$, signal energy "leaks" into all bins. Example: a real cosine signal with $k \in \mathbb{R}^+$ cycles in an $N$-point sample and amplitude $A_0$ has DFT amplitudes approximately given by the sinc function

$$X(m) = \frac{A_0 N}{2} \cdot \frac{\sin[\pi(k-m)]}{\pi(k-m)} \quad .$$

This makes sense: because we only look at $N$ points, we are effectively looking at a "burst", whose Fourier transform is a sinc.
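
The sinc expression can be checked against an actual DFT (sketch, assuming NumPy; note the expression ignores the negative-frequency image, so it is only accurate near the peak):

```python
import numpy as np

N = 64
A0 = 1.0
k = 3.4                                  # not an integer -> leakage
n = np.arange(N)
x = A0 * np.cos(2 * np.pi * k * n / N)
X = np.fft.fft(x)
m = np.array([3, 4])                     # the two bins straddling k
approx = A0 * N / 2 * np.sin(np.pi * (k - m)) / (np.pi * (k - m))
close = np.allclose(np.abs(X[m]), np.abs(approx), rtol=0.1)
```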

Leakage can be surprisingly asymmetric, especially if $f_{\mathrm{signal}}$ is near 0 or $f_{\mathrm{s}}/2$, because the leaking spectrum wraps around (not: is reflected!) at these corner frequencies.

Windowing is a technique to mitigate leakage. The input signal is multiplied element-wise with a window function, for instance a triangle function or a "raised cosine" (Hanning window, better than the triangle). Tradeoff: peaks become broader, and overall you lose signal (because start and end of the signal are forced towards 0). However, it's now easier to spot small signals in the vicinity of big ones. Other window functions: Chebyshev, Kaiser.
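
The leakage reduction is easy to demonstrate (sketch, assuming NumPy; the off-bin frequency and the "far" bin I look at are arbitrary choices of mine):

```python
import numpy as np

# A raised-cosine window trades a broader peak for much lower sidelobes.
N = 64
n = np.arange(N)
x = np.cos(2 * np.pi * 3.4 * n / N)          # off-bin frequency -> leakage
X_rect = np.abs(np.fft.fft(x))               # rectangular (no) window
X_hann = np.abs(np.fft.fft(x * np.hanning(N)))
# Compare leakage far from the peak (bin 20), relative to each peak height.
rect_leak = X_rect[20] / X_rect.max()
hann_leak = X_hann[20] / X_hann.max()
windowed_is_cleaner = hann_leak < rect_leak
```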

Scalloping is somewhat related to leakage: for signal frequencies exactly between two DFT bins, the peak magnitude drops by more than 30 % ($\approx -4$ dB), because the signal leaks into other bins. Windowing mitigates this effect as well (example: $\approx -2$ dB for Hamming). Real-world signals are often wider than one bin, making this not very relevant.
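
Measuring the scalloping loss directly (sketch, assuming NumPy; bin 10 and $N = 256$ are arbitrary choices of mine):

```python
import numpy as np

# A tone exactly between two bins (k = m + 0.5) loses about 3.9 dB of
# peak magnitude relative to an on-bin tone (loss factor sinc(0.5) = 2/pi).
N = 256
n = np.arange(N)
on_bin = np.abs(np.fft.fft(np.cos(2 * np.pi * 10.0 * n / N))).max()
between = np.abs(np.fft.fft(np.cos(2 * np.pi * 10.5 * n / N))).max()
loss_db = 20 * np.log10(between / on_bin)
```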

Zero-padding is used to increase the frequency resolution by appending 0's to the sample. One case where this makes sense is if the signal really is 0 outside of the sampled interval. But you can also think about it in a different way: when zero-padding, you are basically "probing" your signal with non-integer frequencies, which is of course absolutely fine. In the extreme limit, this becomes the discrete-time Fourier transform (DTFT), which has a continuous output. Careful: when combining zero-padding with windowing, only window the original $L$ non-zero samples. Note that the calculated DFT amplitudes are still set by the number of non-zero samples $L$, so relative to the total padded length $N$ they come out smaller.
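
The "probing with non-integer frequencies" picture can be verified: the padded DFT samples the same underlying spectrum on a finer grid, so the original bins reappear at every `pad_factor`-th padded bin (sketch, assuming NumPy):

```python
import numpy as np

L, pad_factor = 32, 4
N = L * pad_factor
n = np.arange(L)
x = np.cos(2 * np.pi * 3.3 * n / L)   # off-bin tone, L samples
X_short = np.fft.fft(x)               # L-point DFT
X_padded = np.fft.fft(x, N)           # zero-padded N-point DFT
grids_agree = np.allclose(X_short, X_padded[::pad_factor])
```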

DTFT stands for "discrete-time Fourier transform" and is basically zero-padding taken to the extreme, such that the transform's output is continuous. This does not improve resolution, which, due to the width of the (rectangular or funky-looking) "input window" surrounded by 0's, is limited by sidelobes to roughly $f_{\mathrm{res}} = 1/T_{\mathrm{sample}}$.

SNR is output signal power / average noise power.

You can think of a DFT's frequency bins as narrow bandpass filters. When you increase the sample length $N$, the average noise amplitude increases by $\sqrt{N}$, but the measured amplitude of a (coherent!) signal increases by $N$. Thus, the SNR (a power ratio) increases by $\left(N/\sqrt{N}\right)^2 = N$, assuming that the power in the signal bin is dominated by the signal, not the noise. The book does not give a concise definition of processing gain, only that due to the effects described above it increases by 3 dB per doubling of $N$.
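
A Monte-Carlo sketch of the ~3 dB gain per doubling of $N$ (assuming NumPy; the amplitude, bin, and trial count are my own choices, and "SNR" here is just signal-bin power over mean noise-bin power):

```python
import numpy as np

rng = np.random.default_rng(3)

def bin_snr_db(N, trials=200):
    """Average power SNR in the signal's DFT bin for a tone in white noise."""
    n = np.arange(N)
    snrs = []
    for _ in range(trials):
        noise = rng.standard_normal(N)
        x = np.cos(2 * np.pi * 8 * n / N) + noise   # coherent tone in bin 8
        P = np.abs(np.fft.fft(x)) ** 2
        noise_bins = np.delete(P[1:N // 2], 7)      # all bins except the signal's
        snrs.append(P[8] / noise_bins.mean())
    return 10 * np.log10(np.mean(snrs))

gain_db = bin_snr_db(2048) - bin_snr_db(1024)       # roughly +3 dB
```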

The DFT of a rectangular pulse is always a (sampled) sinc function; this sampled sinc is called the Dirichlet kernel. Assuming a pulse with $K$ non-zero values of amplitude $A_0$, the peak in the DFT has a height of $A_0 K$ and a full width (to 0 amplitude) of $2N/K$. In the book's example, the author unexpectedly uses negative indices and frequencies. Trying the example with Matlab, I get the same amplitudes, but the phases are quite different, which is due to his sneaky redefinition of the DFT in equation 3-36 (the $n$ in the exponent should be shifted if he shifts the summation index as he did!).
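
Checking the peak height and first null numerically (sketch, assuming NumPy; pulse placement at the start of the frame is my own choice):

```python
import numpy as np

# K ones in an N-point frame: peak height A0*K at m = 0, first null at
# m = N/K, i.e. a full width (null to null) of 2N/K bins.
N, K, A0 = 64, 8, 1.0
x = np.zeros(N)
x[:K] = A0
X = np.abs(np.fft.fft(x))
peak_is_A0K = np.isclose(X[0], A0 * K)
first_null = np.isclose(X[N // K], 0)
```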

Further, the DFT of a rectangular pulse has a phase-factor and an amplitude factor. The amplitude factor only depends on the amplitude, sample-length, and number of (consecutive) non-zero samples. In other words, if you shift the onset of the pulse, you only get phase shifts in the frequency domain.

The frequency axis can be expressed in many different ways: in frequency $f$ with a resolution of $f_{\mathrm{s}}/N$, in normalized frequency or "cycles per sample" $f/f_{\mathrm{s}}$ with resolution $1/N$, or in radians per sample $\omega$ with resolution $2\pi/N$. Note that in the latter case $\omega$ has the non-standard dimension radians/sample!