Audio Analysis Toolkit
Music-reactive applications need dependable beat and energy data from arbitrary tracks, but full transcription of mixed polyphonic audio is beyond what FFT peak-picking can honestly deliver. Most hobby tools blur that line and ship unreliable note data.
The toolkit is a pipeline that pulls audio from a YouTube URL, decodes it to PCM, and runs real-time-synced FFT band analysis, beat detection and approximate note/key detection. Each signal is explicitly graded by reliability so downstream consumers know what to trust. A companion in-browser synthesiser covers the generation side: keyboard, theremin, drum sequencer, piano roll, loop recording and FX.
- Band analysis: 6 frequency bands, 4096-sample FFT at 50% overlap (~10.7 Hz resolution)
- Beat detection: rolling-average energy spikes, 1.5× threshold, 200 ms minimum gap
- Key detection: chroma histogram matched against 24 major/minor keys
- Honest limits: FFT note detection documented as approximate, with an ML (BasicPitch) upgrade path specified
- Validation loop: detected notes resynthesised to audio for by-ear quality checks
- WebSynth: 11 JS modules (keyboard, theremin, drum sequencer, piano roll, loops, FX, recording)
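The beat-detection rule in the list above (rolling-average energy spikes, 1.5× threshold, 200 ms minimum gap) can be sketched roughly as follows. This is an illustration in Python, not the toolkit's own .NET code. The function name `detect_beats`, the 43-frame history length and the 44.1 kHz sample rate are assumptions. The 2048-sample hop follows from a 4096-sample FFT at 50% overlap, and 44100 / 4096 ≈ 10.77 Hz is consistent with the stated bin resolution.

```python
from collections import deque

# Assumed analysis settings. The FFT size, overlap, threshold and gap come from
# the feature list; the sample rate and history length are illustrative guesses.
SAMPLE_RATE = 44_100                  # assumption: typical decoded PCM rate
FFT_SIZE = 4096
HOP_SIZE = FFT_SIZE // 2              # 50% overlap
HOP_SECONDS = HOP_SIZE / SAMPLE_RATE  # about 46 ms per analysis frame


def detect_beats(energies, frame_seconds=HOP_SECONDS, window=43,
                 threshold=1.5, min_gap=0.2):
    """Return beat times in seconds.

    A frame counts as a beat when its energy exceeds `threshold` times the
    average of the previous `window` frames, and at least `min_gap` seconds
    have passed since the last accepted beat.
    """
    history = deque(maxlen=window)    # rolling window of recent frame energies
    beats = []
    last_beat = None
    for index, energy in enumerate(energies):
        time = index * frame_seconds
        if len(history) == window:    # wait until the average is meaningful
            average = sum(history) / window
            is_spike = energy > threshold * average
            gap_ok = last_beat is None or time - last_beat >= min_gap
            if is_spike and gap_ok:
                beats.append(time)
                last_beat = time
        history.append(energy)        # the current frame joins the average afterwards
    return beats
```

In this sketch the current frame is compared against the average of earlier frames only, so a loud onset does not raise its own threshold. With the assumed hop, 43 frames cover roughly two seconds of audio.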
Built with .NET Blazor Server (YoutubeExplode and NAudio for the pipeline, requestAnimationFrame-synced canvas visualisers) and a Web Audio API synth. The analysis design and the limits write-up were produced through agent research sessions.
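To make the key-detection step concrete, here is a minimal Python sketch of matching a chroma histogram against all 24 major and minor keys. The source does not say which key templates or similarity measure the toolkit uses. This sketch swaps in the widely used Krumhansl-Kessler key profiles with Pearson correlation, and the names `chroma_histogram` and `detect_key` are hypothetical.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Krumhansl-Kessler key profiles (an assumed choice of template, not taken
# from the toolkit). Index 0 is the tonic, index 7 the fifth, and so on.
MAJOR_PROFILE = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR_PROFILE = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def chroma_histogram(midi_notes):
    """Count detected notes per pitch class (C = 0 ... B = 11)."""
    chroma = [0.0] * 12
    for note in midi_notes:
        chroma[note % 12] += 1.0
    return chroma


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def detect_key(chroma):
    """Return (key_name, score) for the best of the 24 candidate keys."""
    best_score = None
    best_name = None
    for tonic in range(12):
        for mode, profile in (("major", MAJOR_PROFILE), ("minor", MINOR_PROFILE)):
            # Rotate the profile so its tonic weight lands on this pitch class.
            rotated = [profile[(pc - tonic) % 12] for pc in range(12)]
            score = pearson(chroma, rotated)
            if best_score is None or score > best_score:
                best_score, best_name = score, f"{NOTE_NAMES[tonic]} {mode}"
    return best_name, best_score
```

Because the chroma input comes from approximate FFT note detection, the returned score is better read as a confidence hint than as a definitive answer. Relative major and minor keys share the same pitch set and are a common source of confusion for this kind of matcher.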