Onset Detector
Tier: Analysis | ComponentType: 37 | Params: 3
Spectral flux transient detection with adaptive threshold, cooldown, and optional band-split analysis. Audio passes through unchanged.
Overview
OnsetDetector uses short-time Fourier analysis to detect note onsets, transients, and percussive attacks in the audio stream. It accumulates samples into a 1024-sample frame, applies a Hann window, and computes the FFT every 256 samples (hop size). The spectral flux — the sum of positive magnitude differences between consecutive frames — measures how rapidly the spectrum is changing. Sudden increases in spectral flux correspond to transient events.
An adaptive threshold based on the running median of recent flux values prevents false triggers in sustained or noisy signals. The sensitivity parameter scales this threshold: higher values require larger spectral changes to trigger an onset. A cooldown timer enforces a minimum interval between detections, preventing repeated triggering from the same event.
When Band Split is enabled, the detector also computes flux in three frequency bands (low: 0-300 Hz, mid: 300-3000 Hz, high: 3000+ Hz), each with a reduced threshold. This allows downstream systems to distinguish between bass transients, midrange attacks, and high-frequency events.
This is an analysis-only component — audio passes through unchanged. All detection results are emitted via the snapshot pipeline.
File Locations
| Item | Path |
|---|---|
| Header | Sources/FolioDSP/include/FolioDSP/Analysis/OnsetDetector.h |
| Implementation | Sources/FolioDSP/src/Analysis/OnsetDetector.cpp |
| Tests | Tests/FolioDSPTests/OnsetDetectorTests.swift |
| Bridge | Sources/FolioDSPBridge/src/FolioDSPBridge.mm (OnsetDetectorBridge) |
Parameters
| Index | Name | Description | Min | Max | Default Min | Default Max | Default | Unit |
|---|---|---|---|---|---|---|---|---|
| 0 | Sensitivity | Threshold multiplier (higher = less sensitive) | 0.1 | 10.0 | 0.5 | 5.0 | 1.5 | |
| 1 | Min Interval | Minimum time between onsets (cooldown) | 5.0 | 500.0 | 20.0 | 200.0 | 50.0 | ms |
| 2 | Band Split | Enable per-band detection (0=off, 1=on) | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | |
Processing Algorithm
The process() function accumulates samples and manages cooldown. Every hop, analyzeFrame() runs the full detection pipeline:
1. Frame Accumulation
Samples are written into a 1024-sample circular buffer. Every 256 samples (hop size), the analysis frame is extracted:
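A minimal sketch of this accumulation step follows. The names FrameAccumulator, push, and extract are hypothetical and not taken from OnsetDetector.h; only the 1024-sample frame and 256-sample hop come from the description above.

```cpp
#include <array>
#include <cstddef>

// Sizes stated in the text; the identifiers themselves are illustrative.
constexpr std::size_t kFrameSize = 1024;
constexpr std::size_t kHopSize = 256;

// Hypothetical helper: circular buffer that signals once per hop.
struct FrameAccumulator {
    std::array<float, kFrameSize> ring{};  // circular sample storage
    std::size_t writePos = 0;              // index of the next write
    std::size_t hopCounter = 0;            // samples since the last analysis

    // Store one sample; returns true when a full hop has elapsed.
    bool push(float x) {
        ring[writePos] = x;
        writePos = (writePos + 1) % kFrameSize;
        if (++hopCounter >= kHopSize) {
            hopCounter = 0;
            return true;
        }
        return false;
    }

    // Copy the ring into `frame` in chronological order (oldest sample first).
    void extract(std::array<float, kFrameSize>& frame) const {
        for (std::size_t i = 0; i < kFrameSize; ++i) {
            frame[i] = ring[(writePos + i) % kFrameSize];
        }
    }
};
```

Unrolling the ring so the oldest sample comes first keeps the window aligned with the frame regardless of where the write index currently sits.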
2. Hann Windowing
The frame is windowed to reduce spectral leakage before FFT:
The Hann window values are retrieved from a precomputed 1024-entry lookup table with interpolation.
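As a rough illustration, the table and the windowing step might look like the sketch below. It assumes the symmetric Hann form (denominator N - 1) and indexes the table directly, since the frame and the table are both 1024 entries long; the interpolated lookup mentioned above is not reproduced. makeHannTable and applyWindow are hypothetical names.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Precompute an n-entry Hann table (symmetric form, assumed; requires n > 1).
std::vector<float> makeHannTable(std::size_t n) {
    const double twoPi = 6.283185307179586;
    std::vector<float> w(n);
    for (std::size_t i = 0; i < n; ++i) {
        w[i] = static_cast<float>(0.5 * (1.0 - std::cos(twoPi * i / (n - 1))));
    }
    return w;
}

// Multiply the frame by the window in place, sample by sample.
void applyWindow(std::vector<float>& frame, const std::vector<float>& window) {
    for (std::size_t i = 0; i < frame.size() && i < window.size(); ++i) {
        frame[i] *= window[i];
    }
}
```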
3. Forward FFT
A real-to-complex FFT transforms the windowed frame into 512 frequency bins:
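The component's actual FFT routine is not shown in this document. As a stand-in, the sketch below uses a plain iterative radix-2 complex FFT on the real frame and keeps the first N/2 bins, which yields the same 512 bins for a 1024-sample input. fftInPlace and realFft are hypothetical names, and a dedicated real-to-complex routine would be faster.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using Complex = std::complex<double>;

// In-place iterative radix-2 FFT; data.size() must be a power of two.
void fftInPlace(std::vector<Complex>& data) {
    const std::size_t n = data.size();
    // Bit-reversal permutation.
    for (std::size_t i = 1, j = 0; i < n; ++i) {
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) std::swap(data[i], data[j]);
    }
    // Butterfly passes of doubling length.
    const double pi = 3.141592653589793;
    for (std::size_t len = 2; len <= n; len <<= 1) {
        const Complex wLen = std::polar(1.0, -2.0 * pi / static_cast<double>(len));
        for (std::size_t start = 0; start < n; start += len) {
            Complex w(1.0, 0.0);
            for (std::size_t k = 0; k < len / 2; ++k) {
                const Complex u = data[start + k];
                const Complex v = data[start + k + len / 2] * w;
                data[start + k] = u + v;
                data[start + k + len / 2] = u - v;
                w *= wLen;
            }
        }
    }
}

// Transform a real frame and keep bins 0 .. N/2 - 1.
std::vector<Complex> realFft(const std::vector<float>& frame) {
    std::vector<Complex> data(frame.begin(), frame.end());
    fftInPlace(data);
    data.resize(frame.size() / 2);
    return data;
}
```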
4. Magnitude Computation
The magnitude of each frequency bin is computed:
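A short sketch of the magnitude step, assuming the bins are held as std::complex<float>; the function name magnitudes is hypothetical.

```cpp
#include <cmath>
#include <complex>
#include <vector>

// Magnitude of each bin: sqrt(re^2 + im^2).
std::vector<float> magnitudes(const std::vector<std::complex<float>>& bins) {
    std::vector<float> mags;
    mags.reserve(bins.size());
    for (const auto& b : bins) {
        mags.push_back(std::hypot(b.real(), b.imag()));
    }
    return mags;
}
```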
5. Spectral Flux (Half-Wave Rectified)
Spectral flux measures the sum of positive magnitude increases across all bins. Only increases count — decreases are ignored, making the detector sensitive to energy appearing (onsets) rather than disappearing (offsets):
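The half-wave rectified sum can be sketched as follows. spectralFlux is a hypothetical name; the function also updates the stored previous-frame magnitudes so it can be called once per hop.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sum of positive magnitude increases between the previous and current frame.
// prevMags is overwritten with the current magnitudes for the next call.
float spectralFlux(const std::vector<float>& mags, std::vector<float>& prevMags) {
    float flux = 0.0f;
    const std::size_t n = std::min(mags.size(), prevMags.size());
    for (std::size_t k = 0; k < n; ++k) {
        // Half-wave rectification: negative differences contribute nothing.
        flux += std::max(0.0f, mags[k] - prevMags[k]);
    }
    prevMags = mags;
    return flux;
}
```

For example, going from magnitudes {1, 2, 3} to {2, 1, 5} gives a flux of 1 + 0 + 2 = 3: the drop in the second bin is ignored.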
6. Adaptive Threshold
The threshold adapts to the signal's recent spectral activity using the median of the last 20 flux values, scaled by the sensitivity parameter:
The median is computed via insertion sort on a copy of the flux history buffer.
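One way to sketch the threshold computation is shown below. The names are hypothetical, and because the window has an even length (20), the sketch assumes the median is the average of the two middle values; the source does not say which convention the implementation uses.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kMedianWindow = 20;  // "last 20 flux values" per the text

// Median via insertion sort. The array is taken by value, so the caller's
// history buffer is left untouched (the "copy" mentioned above).
float medianOf(std::array<float, kMedianWindow> values) {
    for (std::size_t i = 1; i < values.size(); ++i) {
        const float key = values[i];
        std::size_t j = i;
        while (j > 0 && values[j - 1] > key) {
            values[j] = values[j - 1];
            --j;
        }
        values[j] = key;
    }
    // Assumed convention for an even-length window: mean of the middle pair.
    return 0.5f * (values[kMedianWindow / 2 - 1] + values[kMedianWindow / 2]);
}

// threshold = median(recent flux) * sensitivity
float adaptiveThreshold(const std::array<float, kMedianWindow>& history,
                        float sensitivity) {
    return medianOf(history) * sensitivity;
}
```

With a history of 1 through 20 the median is 10.5; replacing the largest value with 1000 leaves it at 10.5, which is the outlier resistance the median is chosen for.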
7. Onset Detection
An onset is declared when the spectral flux exceeds the threshold and the cooldown has expired:
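Expressed as code, the decision is a two-part condition. isOnset and its parameter names are hypothetical; the cooldown is assumed to be tracked as a count of remaining samples.

```cpp
// An onset requires flux above the threshold AND an expired cooldown.
bool isOnset(float flux, float threshold, int cooldownSamplesRemaining) {
    return flux > threshold && cooldownSamplesRemaining <= 0;
}
```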
8. Onset Strength
The strength of the detected onset is proportional to how far the flux exceeds the threshold:
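The exact strength formula is not given in this document. The sketch below assumes one plausible normalization, the excess over the threshold divided by the threshold and clamped to the 0–1 range listed in the snapshot fields; the real implementation may scale differently.

```cpp
#include <algorithm>

// ASSUMED formula: relative excess of flux over threshold, clamped to [0, 1].
float onsetStrength(float flux, float threshold) {
    if (threshold <= 0.0f) {
        // Degenerate threshold: treat any positive flux as full strength.
        return flux > 0.0f ? 1.0f : 0.0f;
    }
    const float excess = (flux - threshold) / threshold;
    return std::min(1.0f, std::max(0.0f, excess));
}
```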
9. Cooldown
After an onset, a cooldown timer prevents retriggering:
The onset flag remains set during cooldown and clears when the counter reaches zero.
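A minimal sketch of a sample-resolution cooldown, assuming the Min Interval parameter (in ms) is converted to a sample count when an onset fires. The Cooldown struct and its members are hypothetical names.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical cooldown counter, decremented once per input sample.
struct Cooldown {
    std::int64_t remaining = 0;

    // Convert a duration in milliseconds to a whole number of samples.
    static std::int64_t msToSamples(double ms, double sampleRate) {
        return static_cast<std::int64_t>(std::llround(ms * sampleRate / 1000.0));
    }

    // Start the cooldown after an onset.
    void arm(double minIntervalMs, double sampleRate) {
        remaining = msToSamples(minIntervalMs, sampleRate);
    }

    // Call once per sample.
    void tick() {
        if (remaining > 0) --remaining;
    }

    bool expired() const { return remaining == 0; }
};
```

At 44.1 kHz the default 50 ms interval corresponds to 2205 samples, so at most one onset can fire per roughly 8.6 hops.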
10. Band-Split Detection (Optional)
When enabled, separate flux is computed for three frequency bands with bin ranges derived from the sample rate:
Each band uses a reduced threshold:
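The band edges and the reduced threshold could be sketched as below. Several details are assumptions: band edges are mapped to bins by truncation (bin = hz * fftSize / sampleRate), and the band threshold is taken to be the full-band threshold multiplied by 0.3. The source states the 0.3 factor but not exactly what it scales, so treat that line as one possible reading. All names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Bin indices where the low and mid bands end (exclusive upper bounds).
struct BandEdges {
    std::size_t lowEnd;   // low band:  [0, lowEnd)
    std::size_t midEnd;   // mid band:  [lowEnd, midEnd)
    std::size_t numBins;  // high band: [midEnd, numBins)
};

BandEdges bandEdges(double sampleRate, std::size_t fftSize) {
    const std::size_t numBins = fftSize / 2;
    auto toBin = [&](double hz) {
        // Truncating conversion (assumed), clamped to the available bins.
        const std::size_t bin = static_cast<std::size_t>(hz * fftSize / sampleRate);
        return std::min(bin, numBins);
    };
    return {toBin(300.0), toBin(3000.0), numBins};
}

// Half-wave rectified flux restricted to bins [begin, end).
float bandFlux(const std::vector<float>& mags, const std::vector<float>& prevMags,
               std::size_t begin, std::size_t end) {
    end = std::min(end, std::min(mags.size(), prevMags.size()));
    float flux = 0.0f;
    for (std::size_t k = begin; k < end; ++k) {
        flux += std::max(0.0f, mags[k] - prevMags[k]);
    }
    return flux;
}

constexpr float kBandThresholdScale = 0.3f;  // factor stated in the notes

// ASSUMED interpretation: band threshold = full-band threshold * 0.3.
bool bandOnset(float flux, float fullBandThreshold) {
    return flux > fullBandThreshold * kBandThresholdScale;
}
```

At 44.1 kHz with a 1024-point FFT this mapping places the low/mid boundary at bin 6 and the mid/high boundary at bin 69.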
11. Audio Passthrough
The input sample is returned unchanged. OnsetDetector is a pure analysis component with no effect on the audio signal.
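Putting the outer loop together, a per-sample process() might be shaped like the sketch below: the cooldown ticks every sample, analysis runs once per hop, and the input is returned untouched. PassthroughAnalyzer is a hypothetical stand-in, and its analyzeFrame() is only a placeholder counter rather than the real detection pipeline.

```cpp
#include <cstddef>

// Hypothetical skeleton of the per-sample loop; not the real OnsetDetector.
struct PassthroughAnalyzer {
    static constexpr std::size_t kHopSize = 256;

    std::size_t hopCounter = 0;       // samples since the last analysis
    std::size_t cooldownSamples = 0;  // remaining cooldown, in samples
    std::size_t framesAnalyzed = 0;   // placeholder for analysis results

    float process(float input) {
        if (cooldownSamples > 0) --cooldownSamples;  // sample-resolution cooldown
        if (++hopCounter >= kHopSize) {
            hopCounter = 0;
            analyzeFrame();
        }
        return input;  // audio passes through unchanged
    }

    // Placeholder: the real version would window, transform, and detect.
    void analyzeFrame() { ++framesAnalyzed; }
};
```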
Core Equations
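The equations for this section are not present in the text. Based on the algorithm steps above, they can be reconstructed roughly as follows. The symbols are notation chosen here, not taken from the source: w[n] is the Hann window, t is the frame index, s is the Sensitivity parameter, and c_t is the remaining cooldown in samples. Whether the median window includes the current frame is not stated.

```latex
\begin{align}
x_w[n] &= x[n]\, w[n], \qquad n = 0, \dots, 1023 \\
X_t[k] &= \sum_{n=0}^{1023} x_w[n]\, e^{-j 2\pi k n / 1024}, \qquad k = 0, \dots, 511 \\
M_t[k] &= \lvert X_t[k] \rvert
        = \sqrt{\operatorname{Re}(X_t[k])^2 + \operatorname{Im}(X_t[k])^2} \\
F_t &= \sum_{k=0}^{511} \max\bigl(0,\; M_t[k] - M_{t-1}[k]\bigr) \\
\theta_t &= s \cdot \operatorname{median}(\text{last 20 flux values}) \\
\text{onset}_t &= (F_t > \theta_t) \,\wedge\, (c_t = 0)
\end{align}
```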
Snapshot Fields
| Field | Type | Range | Unit | Description |
|---|---|---|---|---|
| Spectral Flux | Float | 0–10 | | Current spectral flux magnitude |
| Threshold | Float | 0–10 | | Current adaptive threshold value |
| Onset | Bool | 0–1 | | Whether an onset was detected this frame |
| Strength | Float | 0–1 | | Onset strength (how far flux exceeds threshold) |
| Low Onset | Bool | 0–1 | | Onset detected in low band (0-300 Hz) |
| Mid Onset | Bool | 0–1 | | Onset detected in mid band (300-3000 Hz) |
| High Onset | Bool | 0–1 | | Onset detected in high band (3000+ Hz) |
| Flux History | Float[32] | 0–10 | | Ring buffer of recent spectral flux values |
Implementation Notes
- FFT size is 1024 with a hop of 256 (75% overlap), giving ~172 Hz analysis rate at 44.1 kHz. The 512-bin resolution provides approximately 43 Hz per bin.
- Half-wave rectification of the spectral difference is critical: it makes the detector respond only to energy increasing in a bin, not to energy decaying. Without it, decaying or released notes would also register as flux.
- Adaptive threshold via median is more robust than a fixed threshold or mean-based threshold, since medians are resistant to outliers from the onsets themselves.
- Band-split threshold is multiplied by 0.3 (not the full sensitivity value) because per-band flux is inherently lower than total flux. This ensures band-level onset detection is proportionally sensitive.
- Cooldown operates at sample resolution, not hop resolution, providing fine-grained control over minimum onset interval.
- The Sensitivity parameter uses ParamSmoother (smoothed = true) to prevent abrupt threshold changes during live performance.
- All parameters use std::atomic<float> for lock-free thread safety.
- Snapshot emission is decimated to ~60 fps (every 735 samples at 44.1 kHz).
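The rates quoted in these notes follow directly from the sample rate, FFT size, and hop. The sketch below recomputes them; AnalysisTiming and timingFor are hypothetical helpers, not part of the library.

```cpp
#include <cstddef>

// Derived timing figures for a given analysis configuration.
struct AnalysisTiming {
    double framesPerSecond;        // analysis rate (one frame per hop)
    double hzPerBin;               // frequency resolution of the FFT
    double overlap;                // fraction of each frame shared with the previous
    std::size_t snapshotInterval;  // samples between snapshot emissions
};

AnalysisTiming timingFor(double sampleRate, std::size_t fftSize,
                         std::size_t hop, double snapshotFps) {
    AnalysisTiming t;
    t.framesPerSecond = sampleRate / hop;
    t.hzPerBin = sampleRate / fftSize;
    t.overlap = 1.0 - static_cast<double>(hop) / fftSize;
    t.snapshotInterval = static_cast<std::size_t>(sampleRate / snapshotFps);
    return t;
}
```

For 44.1 kHz, a 1024-point FFT, a 256-sample hop, and 60 fps snapshots, this gives about 172.3 frames per second, about 43.07 Hz per bin, 75% overlap, and a 735-sample snapshot interval.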
Equation Summary
onset = flux > median*sensitivity