Animations in PDF using LaTeX
There is the "animate" package for LaTeX that allows you to embed animations into PDFs. These animations are stored frame-by-frame, which can lead to quite large output files.
This is a simple examle, which just outputs the number 1 through 10 and then repeats:
\documentclass{article}
\usepackage{animate}
\begin{document}
\begin{animateinline}[autoplay,loop]{2}
\multiframe{11}{n=0+1}{
\strut\makebox[2em][r]{\n}
}
\end{animateinline}
\end{document}
It compiles nicely with pretty much any output profile built into
TeXnicCenter. The magic hapens in the \begin{animateinline}
block,
where we just write the the counter \n
into a fixed-size box. The box
size is the same for all frames, and it is determined by the size of the
first frame. If you are not careful, this can lead to clipping. As the
nice folks at tex.stackexchange pointed
out, you should use a
wider box, so that double-digit numbers won't be squeezed (this is what
\makebox
is for), and you also have to enforce a higher box (with
\strut
), because TeX chooses too small boxes by default. The last one
is a bit weird.
Units in Mathematica 9
Today I tried using units in Mathematica. Here is how Mathematica wants you to do it:
Quantity[0.5, "Speed of Light"]
The Quantity
function checks your input against an internal list of
known units and constants. This internal list is pretty big - you can
see it by running:
Quantity; QuantityUnits`Private`$UnitReplacementRules[[1, All, 1]])
If your input is not an item of that list, the Quantity
function asks
WolframAlpha to parse your input. You can tell that this is happening
when the command does not finish instantly. Fortunately, the result
seems to be cached.
You can rightclick on the output and view the "Raw Input" form of the output to see how WolframAlpha interpreted it, and how you should enter it in the future. I don't know why this is hidden so much. If there is an external parser (always subject to change!), then it should tell me how it parsed the input. In any case, the Mathematica convention seems to be to write units in full word, with a capital letter, in plural form.
You can multiply, add, and otherwise transform your "quantities" in the obvious ways, and then convert the output to SI units by running
Quantity[1, "SpeedOfLight"] + Quantity[10^8, "Meters/Seconds"] // UnitConvert
You can also convert it to any unit by giving a second argument to the
UnitConvert
function:
UnitConvert[Quantity[1, "microparsec/(fortnight)"] , "mph"] // N
But don't rely on it. It does not parse SI prefixes correctly:
UnitConvert[Quantity[1, "microparsec/(kilofortnight)"] , "SI"] // N
(This gives you something with kilograms in the output...)
Things become particularily painful once you try to use temperatures. The following snippet works:
kB = Quantity[1, "Boltzmann Constant"];
T = Quantity[300, "Kelvins"];
Energy = kB*T;
UnitConvert[Energy, "eV"]
But it took a long time to get it working. The problem is that Mathematica defines their own unit system for temperature: absolute temperatures and difference temperature. This sort of make sense, because they transform differently (due to the 0-offsets). If you enter "Kelvin" as unit, it is interpreted by WolframAlpha as "KelvinDifference", if you enter "Kelvins", it is interpreted as absolute "Kelvins". This distinction is at first hidden from the user, because both use the same symbol "K" in the standard form output. Who the fuck designs this shit?!? Further, it's highly debatable if the unit of kB really should be J/Kelvin. If you really want to make the distinction between absolute and difference temperatures, then I think you should use J/KelvinDifference as unit for kB, otherwise it will transform in the wrong way if you transform to, say, J/°C.
Overall I have to say, working with units in Mathematica is an absolute pain. The feature is not ready. I recommend against using it. The "transparent" parsing by WolframAlpha is bad news, because noone guarantees that it won't be subject to change (meaning today's input will run differently tomorrow, even though you are using the same Mathematica version.) None of the weirdness behind is well documented or indicated to the user.
The standard Google search handles units better than Mathematica: It tells the user how it parsed the input (how to enter it in the future). Plus it's free, faster, and the syntax is better.
12 principles of animation and CSS
After seeing this site about Disney's 12 basic princples of animation, I wondered if you could do something similiar just using HTML5 and CSS3. I especially like the cubic bezier functions for the timing between transformations; they make the animations look pretty alive:
Over here is a great page to play with the timings - try "easeInOutQuart", for example.
Unfortunately, all CSS transformations that are currently defined are linear in space. This is a dealbreaker for implementing Disney's 12 principles.
Still, playing with CSS transform, transitions, and animations was fun. Here some notes about my understanding of them:
You can do
transforms like
scale(x), scale(x,y), translate(x,y), ... These go into the "normal"
CSS-rules, such as #stuff:hover {transform: scale(1.5)}
.
If an element changes from one style to another, you can define how this transitionshould take place. You can exactly which CSS properties should tranisition smoothly, the length of the effect, the delay, and a "timing function" that allows non-linear timing.
Thirdly, you can define key-frame-rules for animations. Basically, using a \@keyframes rule, you specify how the element should look at certain points in time. Inside a normal styling rule, you specify how these transitions should take place. This is pretty cool, but unfortunately I was not able to figure out how to specify, for example, the timing-function seperately for each transition in the animation.
Here is an example that I played around with in jsfiddle:
/* This is the animation code. */
@keyframes example {
0% { transform: scale(1.0); }
/* 33% { transform: scale(1.5, 0.7);} */
/* 66% { transform: scale(1.0); } */
100% { transform: scale(1.5, 0.7);}
}
/* This is the element that we apply the animation to. */
div {
width: 50px;
height: 50px;
margin: 50px 50px;
background-color: #88F;
line-height: 50px;
text-align: center;
}
/* This is the element that we apply the animation to. */
div:hover {
animation-name: example;
animation-duration: 0.3s;
animation-timing-function: cubic-bezier(0.9, 0, 0.1, 1);
/* animation-timing-function: linear; */
animation-delay: 0s; /* 0 is the default */
animation-iteration-count: 2; /* 1 is the default */
animation-direction: alternate; /* normal is the default */
}
For the html, I just used one or two div's. Note to self: An indepent, jsfiddle like wordpress plugin would be nice, but I could not find anything so far.
Addendum: The example at the top of the page uses transitions (not
animations), with the following CSS added to my wordpress theme
wordpress child
theme:
#lookmabezier {
width: 50px;
height: 50px;
margin: 50px 50px;
background-color: #88F;
line-height: 50px;
text-align: center;
transition-property: transform;
transition-duration: 0.5s;
transition-timing-function: cubic-bezier(0.85, 0, 0.15, 1);
transition-delay: 0s;
}
#martinscanvas:hover #lookmabezier {
transform: translateX(200px);
transition-property: transform;
transition-duration: 0.5s;
transition-timing-function: cubic-bezier(0.85, 0, 0.15, 1);
transition-delay: 0s;
}
I used two nested divs, so that the hover on the outer div triggers the movement of the inner div. If you would trigge on the hover of the inner div, it moves out of the hover, and the animation can never finish.
Fundamentals of Arbitrary Waveform Generation Ch. 2
Note: On the previous server, someone left a comment that DSP’s work quite differently. Unfortunately, this comment got lost during the migration. Please feel free to contact me if you spot any errors.
You can think of generating a sampled signal as generating a train of -pulses, ideally filtered by a "brick wall" filter with a cut-off frequency. This brick-wall filter is a convolution with a sinc-function in the time-domain.
In reality, you don't do -pulses, but 0-order hold (-)pulses. This, in itself, is a crude low-pass-filter. You can show it has 3.92 dB attenuation at . You have to correct for this "filter" by a subsequent (analog) interpolation filter.
The interpolation filter can't be made to roll off sharply at , so aliasing-free AWG only works up to approx. .
You can further improve the output by pre-distorting the DAC-input.
The publication is a bit weak on nomenclature when describing the different sampling frequencies involved in AWG's. So I'll define three quantities: The sampling frequency of the waveform, , the sampling frequency of the DAC, , and the access rate of the memory, . All these might be quite different from each other. It's probably best to think of as variable (for sweeps and such).
There are different AWG architectures.
- True Arb: Variable , so that every sample is played, no matter what is used.
- Direct Digital Synthesis (DDS): Fixed + "phase accumulator", which tells you which sample to play.
- Interpolating DAC: is higher than the access rate of the memory, with DSP filling in the blanks. Max. freq. is still limitied by , but images are pushed up to . This makes filtering them easier. Further DAC noise is spread over a larger spectrum.
- Pseudo Interleaving DAC: Two True-Arb's running with shift that are added. This is similiar to IQ, no? Gives full access up to (almost) , but is sensitive to timing errors.
- Trueform: Looks to me like interpolating DAC, but and are not necessarily so different. Thus, the DSP specializes in filling in the blanks for . Is this marketing BS?
DDS vs. True Arb: DDS needs the time resolution of the waveform to be as good as possible. If features are shorter than , they might be skipped. A digital low-pass filter up front helps. A DDS can have , but this makes sweeping impossible.
Details about Trueform: ILt's a kind of DDS architecture where the waveform is interpolated between samples (upsampling -> FIR LP filter -> Decimation). The digital filter can be optimized (user-set) for flat frequency response or precise timing (no ringing.).
A typical vertical resolution is 10-14 bits.
The limited vertical resolution leads to quantization noise. It's bound to less than 1/2 LSB and often has a flat spectrum. But in reality, it's periodic, and if you are unlucky the period is low and you get noise peaks. Quantization noise goes down with 6dB per additional bit. The noise power density goes down with higher samplig rate (because the quantization noise amplitude is roughly constant over time).
You always have thermal noise, with a power density of -174dBm/Hz @ room temperature.
At a high enough sampling rate, quantization noise will be lower than thermal noise floor. For 12 bit, this is 12 GSa/s, for 14 bit it's 0.8 GSa/s (assuming a 0.7 Vpp, full scale signal).
Linear DAC errors are trivially corrected for. Nonlinearities lead to harmonics and intermodulation, which are all folded into the band. Dynamic DAC errors are the worst. Among these are switching glitches (different timing between different bits + overshoots). The so-called "major carry transition" from 0111111111 to 1000000000 is especially bad, even though it's only a change by LSB.
Another DAC error comes from the limited slew rate (rate of change for output; this is also limited by the output amp's BW).
Slew-rate + switching glitches lead to non-zero settling time. (There is also a fixed delay time, but this does not matter much and is subsequently often not included in the spec of the settling time.)
If you want to minimize non-linearities, it might make sense to use an amplitutde smaller than FS.
SFDR stands for spurious free dynamic range. It's usually given in dBc, sometimes only including non-harmonic spurs. SFDR dBFS is basically the same thing, but given in dBFS.
Non-harmonic spurs are caused by bleedthrough of internal oscillators (switch-mode PS; or, sometimes, by quantization noise.
You also have white-noise sources everywhere, at the very minimum the thermal noise.
Sampling-clock jitter leads to phase noise of the carrier. Other sources of phase noise include DAC sampling uncertainty, data-signal jitter, current source skew. (Even after finishing the publication, I am not sure what these last three are.)
Don't trust the SFDR specs. These are not necessarily tested for all possible carrier frequencies.
THD is total harmonic distortion. This is the r.m.s. amplitude of the first harmonics ( should be stated in the spec) divided by the signal amplitude. Usually given in %, sometimes in dBc. (Then, of course, it's the power ratio.)
SNR: Signal power/noise (floor) power with noise power integrated from 0 to , excluding spurs.
SINAD: Signal power/(noise+distortion) power.
ENOB: Effective number of bits; i.e. as many bits that quantization noise power would be as big as noise+distortion power.
Old DAC's (like, Lord Kelvin old) were just a string of resistance and lot's of switches - a string DAC.
The implementation of modern DACs is based on lots of current sources. Each current source is just a transitor biased with a constant at the gate. This causes a constant current to flow through a resistor towards a (negative) . The "upper rail" is switched with two transistors (one using signal, the other using not(signal)) between either GND or to the "DAC network". This DAC network is often just the negative input of a single OP-Amp. This OP-Amp is operated in feedback mode and tries to keep the input at GND. The feedback path is across a resistor and carries the combined current of all the current sources, such that the output is proportional to the current. (I guess a diagram would help me here, future me. Look at page 60 of the PDF.) Bottom line: The current sources pull a varying amount of current through a single resistor. The voltage across this resistor is the output.
There are two approaches for an -bit DAC: You can either have identical current sources (less nonlinearities), or weighted current sources (less complexity).
In real life, the two architectures are often mixed, using a "unit element section" (sometimes called "thermometer section") for the most significant bits (to reduce non-linearity where it matters) and a "binary weighted section" is used for the least significant bits. These are called "hybrid" or "segmented DACs".
Deglitching the signal can be done by "resampling" it. You put the DAC-ouput into a track-and-hold section. This track-and-hold-section samples the signal (again, at the same rate as the DAC) just after the glitches have died down, and then holds the signal. It's important that this section is linear. You can do something similiar with all the current sources - Agilent loves this. It's not explained very well (picture is blurry.)
Some of the advanced deglitching techniques can cause the signal to die down (Return to Zero) while the DAC is glitching. This can distort the signal, so some ARBs allow you to turn this feature off.
Another technique that helps with deglitching is dual DAC cores, offset by half a cycle. This allows distributed deglitching without the Return to Zero problem. Alternatively, you live with the Return to Zero, which reduces amplitude but improves image rejection in the other Nyquist band half. Finally, you can run the two cores indpendently and thereby generate frequencies through the entire Nyquist band . (Similiar to IQ stuff.)
After the DAC, there can be several signal paths. If the user chooses "Direct DAC" output, no further processing happens. Often, the output filters and amplifiers include a "Time domain optimisation" (no ringing, low jitter), and a "Frequency domain optimization" (high dynamic range, flatness, good image removal". For our applications, I think we want the time domain optimization.
Typical output filters are 4th-order Bessel filters. They have worse roll-off than Butterworth and Chebyshev, but better step response (no ringing).
For time optimized output, Agilent recommends to choose the filter such that the rise-time is adequate, and then set sample rate so high that images are moved into the "well-enough" attenuation band of that filter. With Bessel filter, you typically need at least 4x oversampling (instead of 2x stated by Nyquist).
Frequecy optimized signals need a filter that cancels the frequency-response of the "0-order-hold" of the DAC. These types of filters are called equalizing filters. You can also do this, to some extent, with the digital data before it is sent to the DAC, either through precomputation, or, in the case of our 33500 and 33600 (?), with the built-in DSP block "on the fly".
Fundamentals of Arbitrary Waveform Generation Ch. 1
My notes about the Agilent publication "Fundamentals of Arbitrary Waveform Generation" (2012)
Arbitrary Waveform Generators (AWG) were developed in the 1980's and compete with RF generators (possibly incl. modulation), analog function generators, and pulse train generators.
When comparing AWG's, you should look at their sampling rate, memory size, vertical resolution (bit depth of the DAC), analog BW (3dB BW of the output amp+filter), amplitude + DC-offset range, and number of channels.
Proving the convolution theorem for discrete signals
Sources: Here for the proof and this mathworks page for the destinction between linear and circular convolutions.
Assume that the signals and have lengths and , respectively, and are non-periodic. Then you typically define a linear convolution as
Some of the indices run out of bounds in this sum, so we'll tacitly assume that anything out of bounds is zero. Note that can run from 0 to ! In other words, a (non-trivial) linear convolution has a longer output than the input. Our definition nicely describes a tapped delay line. The extra samples are the input and output transients.
For periodic signals, you typically switch to a seperate definition of a convolution. Assume that and are of the same length and periodic. Then in the following definition for the circular convolution nothing can run out of bounds (everything is periodic!), and the output will have the same period as the input.
No matter if we are working with periodic signals or non-periodic signals, the discrete Fourier transform and its inverse are always defined as
So, let's get to work and prove the convolution theorem for periodic signals. We'll start with the definition of the circular convolution
and replace and with the inverse Fourier transform of and . Note that we need the periodicity here - otherwise we'd have to ensure that out-of-bound indexes lead to 0! (We'll get to that case in a bit.)
Next, we rearrange the sums such that the sum over is the innermost sum.
We can pull out the part of the exponential that does not depend on .
If , then the inner sum evaluates to . For all other cases, you can apply the formula of a geometric series and quickly convince yourself that the result is 0 (the numerator in that formula is alwasy 0 due to the 's). Thus
And this is just the inverse Fourier transform of the function . In other words, we just proved that the convolution can be expressed as the inverse Fourier transform of the product of the individual spectra.
With care, we can also apply this to the linear convolution. By appropriately 0-padding both signals, such that they both have a length of , we can treat them as periodic functions! The periodicity just doesn't matter, because the ends never meet (there is just enough zeros between them.) We can then apply the convolution theorem and see that it mostly works as expected.
But there is an important side-effect: We cannot apply a pure sine frequency at the input anymore. For any non-trivial filter, the signal has to be 0-padded at the end! So you never get a pure spectrum out - this is because the input and output transients are usually not sinusoid.
Understanding Digital Signal Processing Chapter 5
Filters are there to change the spectrum of signals. There are two broad classes: finitie impulse response (FIR) and inifite impulse response (IIR) filters.
FIR in a nutshell: An input with finite non-zero values will lead to an output with finite non-zero values. Example: Calculating a running average of the last 5 values is a low pass filter. In DSP, this is called a 5-tap tapped-delay line FIR filter... The mental model is a line having 5 unit-delays, each having a T-junction (tap) in front of them that leads to a sum-input (and a 0.2 multiplier). A more general example:
In the formula above is the number of coefficients, is the number of coefficients, and can run from 0 to . Usually, the first outputs are discarded, as this is the transient repsonse as the filter-buckets are being filled up. If any index runs out-of bounds, the corresponding coefficient is assume to equal 0. The equation is often rewritten in shorthand, using as the convolution operator.
You can characterize a filter with its impulse response. This is the output (time series!) for an input comprising a single 1 preceded and followed by zeros. For FIR filters this directly gives you the filter coefficients .
Convolution in the time domain becomes multiplication in the frequency domain (proof omitted in the book - see errata post for my idea of the proof) Don't underestimate the power of this statement: If you know the spectra of both the signal and your filter (at least one of them appropriately 0-padded in the time domain), then the output spectrum is just the scalar multiplication of the input spectra:
In general, filter effects include a transient response (at the beginning and end, where there are not enough values for the filter-coefficients), a phase shift, and a amplitude shift. During the transient, the output may not be given by the Fourier transform (I think
- because what the convolution theorem does is to cycle unused filter-coefficients to the other end). Phase and amplitude shift can directly be read from the filter's spectrum.
Question: Are the output/input transients described by the convolution theorem? The book does not give an answer to this, but if you look at the proof of the convolution theorem, you can see an important side-effect: For the convolution theorem to work, you have to define your signal and your filter-coefficients to be periodic. This is, of course, not what a tapped-delay-line filter does - it cannot know about the last few samples as the first one comes in. Therefore, in order to describe this type of filter using convolutions and the convolution theorem, you have to appropriately zero-pad both the signal and the filter coefficients, such that in a periodic world they can never meet. This means, you cannot have pure sine-waves as inputs (so no "pure" input spectrum), so the product in the frequency space won't give you a nice peak, and you can see how the input and output transients might be properly described by the convolution theorem.
For symmetrical filter coefficents, the phase shift can be expressed as a constant delay in time.
The author keeps saying that convolution works with time-reversed samples, but I don't think this is generally true. If standard order is = oldest sample = first measured sample, then in his pictures, he indeed reverses the order. In his formulas, though, the order is unchanged. Apparently, some authors do this differently (?).
You can design FIR filters with many different methods, usually starting out with a desired response in the frequency domain, then translating that to filter coefficients.
The Window Design Method can mean one of two things: 1) You choose your coefficients to form a "frequency window". You can calculate the filter coefficient algebraically, which the author describes by using negative frequencies (and a rectangular window centered around 0). Or you can calculate them using an IFFT "on the fly", but then you have to use standard (all positive) frequencies and a "half-window" at both ends. (For whatever reason, the author procedes to throw away one output coefficient and to shift the others around such that the filter coefficients are highest in the center and symmetrical). 2) You can start out with the sinc-like coefficients of an ideal window and multiply them, in the time domain (!), with window coefficients . The latter usually have a smooth fourier transform, which due to the convolution in the frequency domain, makes the total frequency response smoother.
You can leave out some of the filter coefficients to make the filter simpler, but then you get a non-ideal frequency reponse. In any case, you have to deal with ripple. Even when using all filter coefficients, you still have ripple for non-integer frequencies ("Gibb's Dilemma"). The ripple is a direct consequence of the step-like frequency window.
The window coefficients of popular functions are often given in the time domain (!) as . You then, in the time domain, multiply the window coefficients with the "sinc-like-coefficients of the "ideal" rectangle filter to get your final filter coefficients. In the frequency domain, this corresponds to a convulotion of the sinc's rectangle with the smooth function's whatever-smooth-function.
You can use smoother windows to reduce the ripple - a popular example being the Blackman window function (two cosines cleverly added to get smooth transitions). The cost is a less sharp filter.
The Chebyshev and the Kaiser window functions are other, important alternatives. They have the added benefit of an adjustable parameter that directly gives you the minimum attenuation of the stop band. The higher the attenuation, the broader the filter.
A bandpass filter is just a low-pass filter shited up in frequency. You can shift it up by multiplying it (in the time domain) with a sinusoid of appropriate frequency. (Think about the convolution of the filter's spectrum with the sinusoids delta-peak at some frequency - this just shifts the filter spectrum.) A popular choice is a bandpass-filter around , because then the sine-wave has 4 points per full cycly, and these points are . This makes the multiplication trivial. Note that the output is generally a factor two lower (your multiplying with lots of zeros!), but appears to be wider (?). I am not sure how you see the reduction of two in the convolution, though...
The same technique can be used to design a high-pass filter. You just multiply by , which is the sequency .
One automated way of calculating a filter is the Parks-McClellan exchange. You define a few points in the pass-band, and it tries to give you a result with minimal ripple. The result - claimed to be optimal - is a Chebyshev-type filter that has constant ripple throughout the stop-band (all the spurious maxima have the same height, which means that the highes maximum is a low as it gets). The math is complicated, but many filter design tools exist for it.
A supposedly very useful, special filter is the half-band FIR filter, which is a (low-pass) filter that has a pass-band gain of 1 and is point-symmetric (?) around . (It has to pass through that point.) In this special case, almost half of the filter coefficients are 0, which makes it very fast to calculate. These filters are supposedly used all over the place.
FIR filters always (?) have a linear phase response (he later says only if the coefficients are symmetric...). Intuitively, this makes sense, if you think about tapped delay line filter, or about the convolution sum "holding inputs back" by a fixed amount of samples linearly varying frequency. The result is, that if you look at the on a phasor-diagram, they always rotate by a fixed amount if you go change . But beware: The amplitude can also go negative, which gives you 180° phase jumps. When you calculate the angles, you also get 360° phase jumps because the arctan-routines typically don't know how to unwrap the phase (of course, you could teach them...).
The linear phase response means, that the phase-change per frequency bin is constant. In the general case (even when not constant?) this is called the group delay, and is given by . Assuming it's constant, you can quickly show that a constant group delay leads to a fixed time-delay of every frequency component, which means there is no (relative) phase distortion between different frequencies. A useful property for any modulated signals! It's also called "envelope delay". For symmetric coefficients and an -tap FIR filter with as the number of unit-delay elements, you can handwavingly find that the group delay (in seconds) is given by
An -th order FIR filter is just an order tapped delay line filter.
Passband gain is defined as the mean-value inside the passband around which the ripples oscillate. For small ripples, the passband gain is equal to the DC-gain, which is equal to the sum of all the FIR coefficients.
Estimating the number of FIR filter taps is somewhat guesswork, but you can guide yourself by the following formula, which gives you passband ripple of 0.1 dB for any attenuation and (fractional, relative to ) corner frequencies :
Analyzing FIR-filters can either be done by stating the DTFT as a "polynomial" (you just expand the Fourier-sum with a continuous inside, which looks like a polynomial of exp's), or you can calculate the DFT (which gives you a sampled version of the DTFT.) In the latter case, you probably want to 0-pad the original data to get a higher resolution.
Question: I'm curious what happens if you ignore the symmetry and leave out the far end coefficients in the IFFT calculation of the windowed-low pass filter.
Question: How exactly does he go to negative frequencies and back? Again, I think there is a phase factor involved, and I think he sneakily redefines the DFT as he goes along!
Understanding Digital Signal Processing Chapter 4
The Fast Fourier Transform (FFT) is a fast implementation of the DFT. It'n not an approximation! The DFT scales with , but the FFT scales with . The most popular implementation of the FFT is the radix-2 algorithm, but it's limited to .
When deciding on sampling parameters, you should sample fast enough, so that there are no significant components near or beyond (low-pass filters can also be used). Also, you have to sample long enough for the desired resolution. Keep in mind that and choose N as the next larger power of two that gives you your desired resolution.
Once you have a sample, common processing steps on the time series includes removing the mean value to kick out the (possibly overwhelmingly large) DC-component of the spectrum. Secondly, you might apply a window-function to reduce leakage - at the cost of losing resolution and overall signal. If the given sample is not a power of two, you can finally zero-pad the data. FFT results can be enhanced by averaging or by after-the-fact windowing.
Note that the interpretation of DFTs depends on wether the original signal was real-valued or complex-valued. DFTs of real-valued signals are symmetric, so only computing the first half suffices. For complex-valued signals, all FFT bins are independent and you should look at all of them! When calculating signal amplitudes, you have to divide by a factor of for real-valued signals and by for complex valued signals, and we also have to divide out a processing loss factor if we used a window function. (These factors can be looked up in literature.)
The normalized power spectrum is calculated by normalizing the peak power to 1. In this case you don't have to worry about scale factors. It is always given in dB, which helps identify low peaks next to big ones.
\[X_\mathrm{dB}(m) = 10 \log\left( \frac{\left\| X(m) \right\|^2}{\left\| X(m) \right\|_\mathrm{max}^2}\right) = 20 \log\left( \frac{\left\| X(m) \right\|}{\left\| X(m) \right\|_\mathrm{max}}\right)\]The idea of the FFT algorithm is to avoid costly complex multiplications. In a normal DFT you often repeat the same multiplication over and over. In the FFT, you split the sum of the DFT into to sums of odd and even indeces, pull out a phase factor of the odd-valued sum (called twiddle factor), and realize (after some algebraic magic) that you are now performing two -point DFTs. You can do this over and over again, until you are doing 1-point DFTs. The data flows in a "butterfly"-pattern. I have to admit I did not read this in detail - it gets boring quite fast.
The bottleneck in FFT implementations used to be the complex multiplications, but newer processors can do these in a single cycle (the author claims). The remaining shuffling of the input sequence ("bit reversal") is currently more of a bottleneck than the multiplications, but there are some tricks to speed this up. For example, if you are performing a convolution (multiplying two FFT's and then doing an IFFT), with clever algorithm design, you don't need to shuffle anything.
There are several FFT implementations that can be broadly categorized as decimation-in-frequency (better for complex input) and decimation-in-time (better for real inputs). There is also "constant-geometry", which is best for hardware implementations. Read the second half of chapter 4 for these gnarly details, if you should ever need them.