FFT

Tier: Core (Internal Utility) | No ComponentType | No Parameters

Radix-2 Cooley-Tukey Fast Fourier Transform with pre-computed twiddle factors and bit-reversal permutation.

Overview

The FFT class provides forward and inverse Fourier transforms used by frequency-domain components throughout FolioDSP. It is a pure computational utility with no parameters, snapshots, control interface, or bridge class.

The implementation uses the classic Cooley-Tukey radix-2 decimation-in-time algorithm. All twiddle factors and bit-reversal indices are pre-computed during construction, so the transform itself is a tight loop of butterfly operations with no transcendental function calls. The transform size must be a power of 2 (typical audio sizes: 256, 512, 1024, 2048, 4096).

Two interface variants are provided: real-to-complex (for transforming audio buffers) and complex-to-complex (for components that need to manipulate complex spectra directly). The inverse transform normalizes by \(1/N\).

File Locations

	Path
Header	`Sources/FolioDSP/include/FolioDSP/Core/FFT.h`
Implementation	`Sources/FolioDSP/src/Core/FFT.cpp`
Tests	`Tests/FolioDSPTests/FFTTests.swift`

API / Interface

namespace folio::dsp {

class FFT {
public:
    /// Construct with transform size (must be power of 2).
    explicit FFT(uint32_t size);

    /// Forward: N real samples -> N/2 interleaved complex bins (N floats total).
    void forward(const float* input, float* output) const;

    /// Inverse: N/2 interleaved complex bins -> N real samples (normalized by 1/N).
    void inverse(const float* input, float* output) const;

    /// Forward complex-to-complex: N complex -> N complex.
    void forwardComplex(const float* real, const float* imag,
                        float* outReal, float* outImag) const;

    /// Inverse complex-to-complex: N complex -> N complex (normalized by 1/N).
    void inverseComplex(const float* real, const float* imag,
                        float* outReal, float* outImag) const;

    uint32_t size() const;

private:
    uint32_t size_, logSize_;
    std::vector<float> twiddleReal_, twiddleImag_;  // size/2 each
    std::vector<uint32_t> bitRev_;                  // size entries

    void computeTwiddles();
    void computeBitReversal();
    void fftCore(float* real, float* imag, bool inverse) const;
};

}

Algorithm

1. Construction: Twiddle Factor Pre-computation

The discrete Fourier transform of an \(N\)-point sequence \(x[n]\) is:

\[X[k] = \sum_{n=0}^{N-1} x[n] \cdot W_N^{nk}\]

where \(W_N = e^{-2\pi i / N}\) is the \(N\)-th root of unity. The twiddle factors \(W_N^k\) for \(k \in [0, N/2)\) are pre-computed at construction:

\[W_N^k = \cos\!\left(\frac{-2\pi k}{N}\right) + i\sin\!\left(\frac{-2\pi k}{N}\right)\]

These are stored in two parallel std::vector<float> arrays of size \(N/2\) --- one for the real part, one for the imaginary part. The inverse transform uses the complex conjugate \(\overline{W_N^k}\), which simply negates the imaginary component.

2. Construction: Bit-Reversal Permutation Table

The Cooley-Tukey algorithm requires inputs in bit-reversed order. A permutation table of \(N\) entries is pre-computed, where each index \(i\) maps to its \(\log_2 N\)-bit reversal. For example, with \(N = 8\):

\(i\) (binary)	Reversed	\(\text{bitRev}[i]\)
000 (0)	000	0
001 (1)	100	4
010 (2)	010	2
011 (3)	110	6

At transform time, elements at positions \(i\) and \(\text{bitRev}[i]\) are swapped (only when \(\text{bitRev}[i] > i\), to avoid double-swapping).

3. Core Butterfly Computation

The fftCore method performs the in-place Cooley-Tukey butterfly across \(\log_2 N\) stages. At each stage \(s\):

Block size: \(B = 2^{s+1}\), half-block: \(H = 2^s\)
Twiddle stride: \(S = N / B\)

For each butterfly pair \((e, o)\) where \(e = k + j\) and \(o = e + H\):

\[t = W_{S \cdot j} \cdot Z[o]\]

\[Z'[e] = Z[e] + t\]

\[Z'[o] = Z[e] - t\]

Expanding the complex multiplication:

\[t_{\text{re}} = W_{\text{re}} \cdot Z_{\text{re}}[o] - W_{\text{im}} \cdot Z_{\text{im}}[o]\]

\[t_{\text{im}} = W_{\text{re}} \cdot Z_{\text{im}}[o] + W_{\text{im}} \cdot Z_{\text{re}}[o]\]

For the inverse transform, the imaginary part of the twiddle factor is negated (conjugated), and the output is normalized:

\[x[n] = \frac{1}{N} \cdot \text{IFFT}(X[k])\]

4. Real-to-Complex Forward Transform

The forward() method:

Copies \(N\) real input samples into the real array, zeroes the imaginary array
Applies fftCore (bit-reversal + butterflies)
Packs the first \(N/2\) bins as interleaved complex: [re[0], im[0], re[1], im[1], ...]

Only \(N/2\) bins are output because the spectrum of a real signal has conjugate symmetry: \(X[N-k] = \overline{X[k]}\).

5. Complex-to-Real Inverse Transform

The inverse() method:

Unpacks \(N/2\) interleaved complex bins into real/imaginary arrays
Mirrors the conjugate symmetry: \(X[N-k] = \overline{X[k]}\) for \(k \in [1, N/2)\)
Applies fftCore with inverse = true (conjugate twiddles + \(1/N\) normalization)
Extracts the real part as output

6. Complex-to-Complex Transforms

forwardComplex() and inverseComplex() take separate real and imaginary input arrays of size \(N\) and produce separate real and imaginary output arrays. These copy the inputs and call fftCore directly, bypassing the packing/unpacking step.

Implementation Notes

Complexity: \(O(N \log N)\) for the transform, \(O(N)\) for construction.
Memory: The forward() and inverse() methods allocate temporary std::vector buffers internally. This makes them unsuitable for per-sample real-time calling, but they are used in block-based contexts (overlap-add convolution, STFT analysis) where allocations occur once per block at rates far below sample rate.
Twiddle storage: Only \(N/2\) twiddle factors are stored. The stride-based indexing j * twiddleStride selects the correct twiddle for each butterfly at each stage without additional storage.
No SIMD: The implementation is portable scalar C++. Apple's vDSP/Accelerate is not used because FolioDSP has zero external dependencies.
Typical sizes in use: IRConvolver uses size 1024 (block convolution), SpectralFreeze uses size 4096 (STFT), OnsetDetector uses size 1024 (spectral flux analysis).

Used By

Component	FFT Size	Purpose
IRConvolver	1024	Overlap-add frequency-domain convolution
SpectralFreeze	4096	STFT analysis/resynthesis with frozen magnitudes
OnsetDetector	1024	Spectral flux computation for transient detection