CircularBuffer

Tier: Core (Internal Utility) | No ComponentType | No Parameters

Header-only power-of-2 template circular buffer with mask-based wrapping and linear interpolation.

Overview

CircularBuffer is a fixed-size ring buffer designed for real-time audio use. It stores a continuous stream of samples and provides efficient random-access reads at both integer and fractional positions. The size is a compile-time template parameter that must be a power of 2, enabling the critical optimization of replacing modulo operations with bitwise AND masking.

The buffer supports three read modes: delay-relative (read at a fractional delay of \(d\) samples behind the write head, with linear interpolation), absolute integer (direct index), and absolute fractional (with linear interpolation). Together these cover the needs of delay lines, grain readers, slice players, and loopers.

Being header-only with a std::array backing store, CircularBuffer makes zero heap allocations. The entire buffer lives inline in the owning component's memory, making it fully real-time safe.

File Locations

Item	Path
Header	`Sources/FolioDSP/include/FolioDSP/Core/CircularBuffer.h`
Tests	(tested indirectly through component tests)

API / Interface

namespace folio::dsp {

template<uint32_t Size>
class CircularBuffer {
    static_assert((Size & (Size - 1)) == 0, "Size must be power of 2");

public:
    /// Write one sample at the current write position and advance.
    void write(float sample);

    /// Read with fractional delay from the write head (linear interpolation).
    float read(float delaySamples) const;

    /// Read at an absolute index with mask-based wrapping.
    float readAbsolute(uint32_t absPos) const;

    /// Read at an absolute fractional position with linear interpolation.
    float readAbsoluteInterp(float absPos) const;

    /// Current write position (monotonically increasing, masked on access).
    uint32_t writePosition() const;

    /// Zero the entire buffer and reset write position.
    void clear();

    /// Buffer size (compile-time constant).
    static constexpr uint32_t size();

private:
    std::array<float, Size> buffer_{};
    uint32_t writePos_ = 0;
    static constexpr uint32_t kMask = Size - 1;
};

}

Algorithm

Power-of-2 Masking

The fundamental optimization: for any power-of-2 size \(N\), the mask \(M = N - 1\) allows wrapping via bitwise AND instead of modulo:

\[\text{index} = \text{pos}\ \&\ M\]

This is equivalent to \(\text{pos} \bmod N\) but compiles to a single AND instruction instead of a division. For example, with \(N = 262144\) (\(2^{18}\)):

\[M = 262143 = \texttt{0x3FFFF}\]

Any 32-bit unsigned position wraps correctly, including values far beyond the buffer size, because the upper bits are simply masked off.
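The equivalence between masking and modulo can be checked directly. The following is a minimal sketch for illustration only: `wrapIndex`, `maskMatchesModulo`, and `demoMasking` are hypothetical helpers, not part of CircularBuffer.h.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of CircularBuffer.h: wraps a position into
// [0, Size) with a bitwise AND. Only valid when Size is a power of 2.
template <uint32_t Size>
constexpr uint32_t wrapIndex(uint32_t pos) {
    static_assert(Size != 0 && (Size & (Size - 1)) == 0, "Size must be power of 2");
    return pos & (Size - 1);
}

// Compares masking against modulo for `count` consecutive positions starting
// at `start`, including positions far larger than the buffer size.
template <uint32_t Size>
bool maskMatchesModulo(uint32_t start, uint32_t count) {
    for (uint32_t k = 0; k < count; ++k) {
        const uint32_t pos = start + k;  // may wrap past 2^32, which is fine
        if (wrapIndex<Size>(pos) != pos % Size) return false;
    }
    return true;
}

// Example check with the 2^18 size used in the text (mask 0x3FFFF).
inline void demoMasking() {
    assert(wrapIndex<262144>(262144u + 7u) == 7u);
    assert(maskMatchesModulo<262144>(0xFFFFFF00u, 512u));
}
```

The second assertion deliberately runs the position across the top of the uint32_t range to show that the upper bits never influence the wrapped index.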

Write

Writing stores a sample at the masked write position and increments the write head:

\[\text{buffer}[\text{writePos}\ \&\ M] = x\]

\[\text{writePos} \mathrel{+}= 1\]

The write position is allowed to overflow uint32_t naturally. Since all reads use masking, this is safe and avoids the need for explicit wrap-around checks.
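A sketch of the write step, assuming the member layout shown in the API listing. `RingWriter` is a hypothetical stand-in written for this example, not the FolioDSP source; it exists only to show that writes stay contiguous when the position overflows.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the write path (not the FolioDSP source).
template <uint32_t Size>
struct RingWriter {
    static_assert(Size != 0 && (Size & (Size - 1)) == 0, "Size must be power of 2");
    static constexpr uint32_t kMask = Size - 1;

    std::array<float, Size> buffer{};
    uint32_t writePos = 0;

    void write(float sample) {
        buffer[writePos & kMask] = sample;  // masked store
        ++writePos;                         // unsigned wrap at 2^32 is well defined
    }
};

// Starts the write head just below 2^32 and writes across the overflow.
inline void demoWriteAcrossOverflow() {
    RingWriter<8> w;
    w.writePos = 0xFFFFFFFEu;  // masks to slot 6 of 8
    w.write(1.0f);             // lands in slot 6
    w.write(2.0f);             // lands in slot 7
    w.write(3.0f);             // writePos has wrapped to 0, lands in slot 0
    assert(w.writePos == 1u);
    assert(w.buffer[6] == 1.0f && w.buffer[7] == 2.0f && w.buffer[0] == 3.0f);
}
```

Slots 6, 7, and 0 are filled in order, which is the same sequence an ordinary wrap at the end of the buffer would produce.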

Read (Delay-Relative)

Given a fractional delay \(d\) in samples, the read position is computed relative to the write head:

\[p = \text{writePos} - d\]

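The delay is split into an integer part \(n\) and a fractional part \(f\), and two adjacent samples are read with linear interpolation:

\[n = \lfloor d \rfloor, \qquad f = d - n\]

\[i_0 = (\text{writePos} - n)\ \&\ M\]

\[i_1 = (i_0 - 1)\ \&\ M\]

\[y = \text{buffer}[i_0] + f \cdot (\text{buffer}[i_1] - \text{buffer}[i_0])\]

Note that \(i_1 = i_0 - 1\) (not \(i_0 + 1\)) because the buffer is written in forward order: the sample one position earlier in the buffer is one step further back in time, so a larger fractional delay \(f\) weights the older sample more heavily. For integer \(d\), this reduces to \(\text{buffer}[p\ \&\ M]\).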
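A minimal sketch of a delay-relative read, written as a free function so it can be tried in isolation. `readDelayed` is a hypothetical name, and the sketch assumes that the interpolation weight is the fractional part of the delay and that the delay lies in [0, Size - 1]; the actual CircularBuffer::read may be organised differently.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch, not the FolioDSP source.
// Assumptions: 0 <= delaySamples <= Size - 1, and writePos is the index of
// the NEXT slot to be written, so a delay of 1 returns the newest sample.
template <std::size_t Size>
float readDelayed(const std::array<float, Size>& buffer,
                  uint32_t writePos, float delaySamples) {
    static_assert(Size != 0 && (Size & (Size - 1)) == 0, "Size must be power of 2");
    constexpr uint32_t mask = static_cast<uint32_t>(Size - 1);

    const float whole = std::floor(delaySamples);  // integer part of the delay
    const float frac = delaySamples - whole;       // fractional part in [0, 1)

    const uint32_t i0 = (writePos - static_cast<uint32_t>(whole)) & mask;
    const uint32_t i1 = (i0 - 1u) & mask;  // one slot earlier = one sample older

    return buffer[i0] + frac * (buffer[i1] - buffer[i0]);
}

// On a ramp where buffer[k] == k, a delay d behind writePos reads writePos - d.
inline void demoReadDelayed() {
    std::array<float, 8> ramp{};
    for (uint32_t k = 0; k < 8; ++k) ramp[k] = static_cast<float>(k);
    assert(readDelayed(ramp, 5u, 2.5f) == 2.5f);
}
```

Doing the subtraction in unsigned integer arithmetic keeps the index exact for any write position; only the small fractional weight is handled in floating point.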

Read Absolute (Integer)

Direct index access with wrapping:

\[y = \text{buffer}[\text{absPos}\ \&\ M]\]

Used when a component needs to read a specific slice or grain at a known buffer position.
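As an illustration of that use case, the sketch below copies a slice that may run past the end of the backing array. `copySlice` is a hypothetical helper invented for this example and does not appear in the header.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: copies `length` samples starting at absolute position
// `start`, wrapping with the mask whenever the slice crosses the array end.
template <std::size_t Size>
void copySlice(const std::array<float, Size>& buffer,
               uint32_t start, float* out, uint32_t length) {
    static_assert(Size != 0 && (Size & (Size - 1)) == 0, "Size must be power of 2");
    constexpr uint32_t mask = static_cast<uint32_t>(Size - 1);
    for (uint32_t k = 0; k < length; ++k) {
        out[k] = buffer[(start + k) & mask];  // same rule as readAbsolute
    }
}

// A 4-sample slice starting at slot 6 of an 8-slot buffer reads 6, 7, 0, 1.
inline void demoCopySlice() {
    std::array<float, 8> ramp{};
    for (uint32_t k = 0; k < 8; ++k) ramp[k] = static_cast<float>(k);
    float out[4] = {};
    copySlice(ramp, 6u, out, 4u);
    assert(out[0] == 6.0f && out[1] == 7.0f && out[2] == 0.0f && out[3] == 1.0f);
}
```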

Read Absolute (Fractional)

Fractional position with forward-direction interpolation:

\[i_0 = \lfloor p \rfloor\ \&\ M\]

\[i_1 = (i_0 + 1)\ \&\ M\]

\[f = p - \lfloor p \rfloor\]

\[y = \text{buffer}[i_0] + f \cdot (\text{buffer}[i_1] - \text{buffer}[i_0])\]

Here \(i_1 = i_0 + 1\) because the caller is scanning forward through the buffer at a known absolute position (e.g., a grain reader scanning through recorded audio).
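The forward-interpolating read can be sketched as a free function. `readAbsoluteInterpSketch` is a hypothetical name, and the sketch assumes a non-negative position that fits in a uint32_t; the real member function may handle its input differently.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch, not the FolioDSP source.
// Assumption: 0 <= absPos < 2^32, so the float-to-uint32_t cast is defined.
template <std::size_t Size>
float readAbsoluteInterpSketch(const std::array<float, Size>& buffer, float absPos) {
    static_assert(Size != 0 && (Size & (Size - 1)) == 0, "Size must be power of 2");
    constexpr uint32_t mask = static_cast<uint32_t>(Size - 1);

    const float base = std::floor(absPos);
    const float frac = absPos - base;  // weight of the next sample, in [0, 1)

    const uint32_t i0 = static_cast<uint32_t>(base) & mask;
    const uint32_t i1 = (i0 + 1u) & mask;  // next slot in the forward direction

    return buffer[i0] + frac * (buffer[i1] - buffer[i0]);
}

// Reading at 7.5 in an 8-slot ramp blends slot 7 with slot 0 (wrapped).
inline void demoReadAbsoluteInterp() {
    std::array<float, 8> ramp{};
    for (uint32_t k = 0; k < 8; ++k) ramp[k] = static_cast<float>(k);
    assert(readAbsoluteInterpSketch(ramp, 2.5f) == 2.5f);
    assert(readAbsoluteInterpSketch(ramp, 7.5f) == 3.5f);
}
```

Because a float carries a 24-bit significand, a position above \(2^{24}\) can no longer represent a fractional part, so callers of a float-based interface like this one get sub-sample resolution only for positions below that value.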

Clear

Zeroes the entire backing array with memset and resets the write position to 0.

Implementation Notes

Real-time safe: No heap allocations. The backing std::array<float, Size> is stored inline as a member of the owning component, so it lives wherever that component itself is allocated. No locks, no system calls.
Zero-initialized: The backing array is value-initialized to zero ({}), so a freshly constructed buffer reads silence everywhere.
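static_assert enforcement: Instantiating with a size that is not a power of 2 (for example, 3 or 1000) produces a compile-time error. The check as written, (Size & (Size - 1)) == 0, also accepts Size = 0, so callers must not instantiate a zero-sized buffer.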
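Overflow safety: The 32-bit write position overflows after approximately 27 hours at 44.1 kHz (\(2^{32} / 44100 \approx 97391\) seconds). Unsigned overflow wraps to 0 by definition, and because \(2^{32}\) is a multiple of every power-of-2 Size, the masked index continues from Size - 1 to 0 without a jump, so the overflow is transparent to all reads.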
Linear interpolation only: No cubic or sinc interpolation is provided. For the use cases in FolioDSP (granular grains, delay taps, slice readers), linear interpolation provides sufficient quality at minimal cost. Components needing higher-quality interpolation (e.g., PitchShifter) use their own custom delay buffers.
Memory footprint: Each buffer consumes exactly \(\text{Size} \times 4\) bytes plus 4 bytes for the write position. The largest instance (MarkovBuffer at \(2^{20}\)) uses 4 MB.
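The overflow figures in the notes above can be reproduced with a few constants. This is an illustrative calculation only; the constant names and `overflowIsSeamless` are hypothetical and do not appear in the header.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative constants; 44.1 kHz is the sample rate quoted in the text.
constexpr double kSampleRate = 44100.0;
constexpr double kWritesUntilOverflow = 4294967296.0;  // 2^32
constexpr double kSecondsUntilOverflow = kWritesUntilOverflow / kSampleRate;
constexpr double kHoursUntilOverflow = kSecondsUntilOverflow / 3600.0;

// Hypothetical helper: the masked index runs ..., Size-2, Size-1, 0, 1, ...
// across the overflow exactly when 2^32 is a multiple of Size.
template <uint32_t Size>
constexpr bool overflowIsSeamless() {
    return Size != 0 && ((uint64_t{1} << 32) % Size) == 0;
}

static_assert(overflowIsSeamless<262144>(), "2^18 divides 2^32");
static_assert(!overflowIsSeamless<3>(), "3 does not divide 2^32");

inline void checkOverflowFigures() {
    assert(kSecondsUntilOverflow > 97391.0 && kSecondsUntilOverflow < 97392.0);
    assert(kHoursUntilOverflow > 27.0 && kHoursUntilOverflow < 27.1);
}
```

The non-power-of-2 case is included only to show why the power-of-2 restriction matters here: with a size of 3, the index sequence would jump at the overflow.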

Used By

Component	Size	Memory	Purpose
GranularEngine	\(2^{18}\) (262,144)	1 MB	Recording buffer for grain extraction (~6 seconds at 44.1 kHz)
FeedbackNetwork	\(8 \times 2^{15}\) (8 x 32,768)	8 x 128 KB	Eight delay lines for the feedback delay network
MicroLooper	\(2^{19}\) (524,288)	2 MB	Loop capture buffer (~12 seconds at 44.1 kHz)
MarkovBuffer	\(2^{20}\) (1,048,576)	4 MB	Slice storage for probabilistic reordering (~24 seconds at 44.1 kHz)
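The Memory and duration columns follow from the sample count alone. The sketch below recomputes them, assuming 4-byte floats and a 44.1 kHz sample rate; `BufferStats` and `statsFor` are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper types for checking the table above.
struct BufferStats {
    std::size_t bytes;  // sample storage only, excluding the write position
    double seconds;     // audio duration at the given sample rate
};

constexpr BufferStats statsFor(uint32_t sizeInSamples, double sampleRate = 44100.0) {
    return BufferStats{sizeInSamples * sizeof(float), sizeInSamples / sampleRate};
}

inline void checkUsedByTable() {
    assert(statsFor(1u << 18).bytes == 1048576u);  // GranularEngine: 1 MB
    assert(statsFor(1u << 15).bytes == 131072u);   // FeedbackNetwork: 128 KB per line
    assert(statsFor(1u << 19).bytes == 2097152u);  // MicroLooper: 2 MB
    assert(statsFor(1u << 20).bytes == 4194304u);  // MarkovBuffer: 4 MB
}
```

At 44.1 kHz the durations come out at about 5.9, 11.9, and 23.8 seconds for the three large buffers, which matches the rounded values in the table.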