Noise Cancellation in C++ and Python: Phase 5 — Rust Three-Model DeepFilterNet3 Pipeline
April 29, 2026 · Luciano Muratore
What’s here
| File | Purpose |
|---|---|
| Cargo.toml | Updated dependencies |
| src/main.rs | Full three-model pipeline server |
| inspect_onnx_models.py | One-shot diagnostic: prints exact ONNX tensor names |
Step 0 — Run the inspector first (critical)
The ONNX tensor names baked into the official models are not documented in a stable place and may differ slightly between DeepFilterNet versions. Before building the Rust server, confirm the actual names:
cd NoiseCancellation
ai\python\venv\Scripts\activate
python ai\python\experiments\inspect_onnx_models.py
Sample expected output:
enc inputs: feat_erb, feat_spec, h0, h1
enc outputs: e0, e1, e2, e3, emb, c0, lsnr, h0_out, h1_out
erb_dec inputs: emb, e3, e2, e1, e0, h0
erb_dec outputs: m, h0_out
df_dec inputs: emb, c0, h0
df_dec outputs: coefs, h0_out
If any names differ, update the string literals in src/main.rs accordingly
(search for "feat_erb", "emb", "m", "coefs", etc.).
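For reference, here is a minimal sketch of what such an inspector might look like, assuming the `onnxruntime` Python package is installed in the venv. The `EXPECTED` table, the `diff_names` helper and the `check_models` function are hypothetical names used for illustration; this is not the repository's actual inspect_onnx_models.py.

```python
from pathlib import Path

# Names the Rust server is assumed to use, copied from the sample output above.
EXPECTED = {
    "enc": {
        "inputs": ["feat_erb", "feat_spec", "h0", "h1"],
        "outputs": ["e0", "e1", "e2", "e3", "emb", "c0", "lsnr", "h0_out", "h1_out"],
    },
    "erb_dec": {
        "inputs": ["emb", "e3", "e2", "e1", "e0", "h0"],
        "outputs": ["m", "h0_out"],
    },
    "df_dec": {
        "inputs": ["emb", "c0", "h0"],
        "outputs": ["coefs", "h0_out"],
    },
}


def tensor_names(model_path):
    """Return (input_names, output_names) for one ONNX file."""
    import onnxruntime as ort  # third-party; imported lazily on purpose

    sess = ort.InferenceSession(str(model_path), providers=["CPUExecutionProvider"])
    return [i.name for i in sess.get_inputs()], [o.name for o in sess.get_outputs()]


def diff_names(expected, actual):
    """Return (missing, unexpected) tensor names, preserving order."""
    missing = [n for n in expected if n not in actual]
    unexpected = [n for n in actual if n not in expected]
    return missing, unexpected


def check_models(model_dir):
    """Print the tensor names of each model and flag any mismatch."""
    for stem, names in EXPECTED.items():
        inputs, outputs = tensor_names(Path(model_dir) / f"{stem}.onnx")
        for kind, actual in (("inputs", inputs), ("outputs", outputs)):
            missing, unexpected = diff_names(names[kind], actual)
            ok = not (missing or unexpected)
            status = "ok" if ok else f"missing={missing} unexpected={unexpected}"
            print(f"{stem} {kind}: {', '.join(actual)}  [{status}]")
```

Calling `check_models` with the directory that holds the three .onnx files prints one line per model and direction, so a renamed tensor shows up as a `missing`/`unexpected` pair rather than as a runtime panic in the server.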
Step 1 — Copy files into the repo
git checkout -b feature/phase5-rust-three-model-pipeline
cp phase5/Cargo.toml ai/rust/Cargo.toml
cp phase5/src/main.rs ai/rust/src/main.rs
cp phase5/inspect_onnx_models.py ai/python/experiments/inspect_onnx_models.py
Step 2 — Build
cd ai\rust
cargo build --release 2>&1 | tee build_phase5.log
New dependencies you should see compiling: rubato, ndarray, configparser.
Step 3 — Run
Terminal 1 (Rust server):
ai\rust\target\release\noise_server.exe
Terminal 2 (C++ engine):
engine\build\Debug\SleepFixTest.exe
For detailed frame-by-frame output, set the log level before starting the server:
set RUST_LOG=debug
ai\rust\target\release\noise_server.exe
Architecture recap
ZeroMQ PULL (port 5555)
│ Vec<f32> × 1536 @ 44100 Hz
▼
Resample 44100→48000 (rubato FftFixedIn)
│ ~1672 samples
▼
Loop: split into 480-sample frames (10 ms @ 48 kHz)
┌──────────────────────────────────────────────────┐
│ DFState::analysis() → Complex32[481] │
│ feat_erb() → f32[32] (log-power ERB)│
│ feat_cplx() → f32[2×96] (Re+Im norm) │
│ │
│ enc.onnx → emb, e0-e3, c0, lsnr │
│ erb_dec.onnx → m (ERB mask) │
│ df_dec.onnx → coefs (DF filter) │
│ │
│ apply_erb_mask(spec, m) │
│ apply_df_filter(spec[0..96], coefs, rolling_buf) │
│ DFState::synthesis() → f32[480] │
└──────────────────────────────────────────────────┘
│ enhanced frames × 480 @ 48000 Hz
▼
Resample 48000→44100 (rubato FftFixedIn)
│ Vec<f32> × 1536 @ 44100 Hz
▼
ZeroMQ PUSH (port 5556)
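The framing step in the middle of the diagram has one subtlety: 1536 samples at 44100 Hz correspond to roughly 1672 samples at 48 kHz, which is not a multiple of 480, so leftover samples have to be carried into the next chunk. The sketch below illustrates that bookkeeping in Python for brevity. The `FrameAccumulator` class is hypothetical and not taken from main.rs, and the real per-call output length of the rubato resampler may differ from the idealized 1672 used here.

```python
HOP = 480  # samples per frame at 48 kHz (10 ms)


class FrameAccumulator:
    """Buffers resampled audio and hands out complete HOP-sized frames."""

    def __init__(self, hop=HOP):
        self.hop = hop
        self.pending = []  # samples left over from previous chunks

    def push(self, samples):
        """Append a chunk and return the complete frames now available."""
        self.pending.extend(samples)
        n_frames = len(self.pending) // self.hop
        frames = [
            self.pending[i * self.hop:(i + 1) * self.hop]
            for i in range(n_frames)
        ]
        # Keep the remainder for the next call instead of dropping it.
        self.pending = self.pending[n_frames * self.hop:]
        return frames


acc = FrameAccumulator()
chunk = [0.0] * 1672  # idealized resampler output for one 1536-sample input
frame_counts = [len(acc.push(chunk)) for _ in range(4)]
print(frame_counts, len(acc.pending))  # [3, 3, 4, 3] 448
```

Most chunks yield three frames and an occasional chunk yields four, so the output side may need similar buffering to keep emitting fixed 1536-sample blocks.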
Key parameters (DeepFilterNet3 defaults)
| Parameter | Value | Source |
|---|---|---|
| Sample rate | 48000 Hz | config.ini / paper |
| FFT size | 960 (20 ms) | config.ini |
| Hop size | 480 (10 ms) | config.ini |
| nb_erb | 32 bands | config.ini |
| nb_df | 96 bins (~4.8 kHz) | config.ini |
| df_order | 5 taps | config.ini |
| Lookahead | 2 frames (20 ms) | config.ini |
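The values in the table are tied together by simple arithmetic, which makes a handy sanity check when adapting constants. A quick sketch (the variable names are illustrative, not identifiers from config.ini or main.rs):

```python
SR = 48_000           # sample rate in Hz
FFT_SIZE = 960        # STFT window length in samples
HOP_SIZE = 480        # STFT hop in samples
NB_DF = 96            # number of low-frequency bins handled by deep filtering
LOOKAHEAD_FRAMES = 2  # lookahead in frames

n_bins = FFT_SIZE // 2 + 1                # 481 one-sided spectrum bins
bin_width_hz = SR / FFT_SIZE              # 50.0 Hz per bin
window_ms = 1000 * FFT_SIZE / SR          # 20.0 ms
hop_ms = 1000 * HOP_SIZE / SR             # 10.0 ms
df_cutoff_hz = NB_DF * bin_width_hz       # 4800.0 Hz
lookahead_ms = LOOKAHEAD_FRAMES * hop_ms  # 20.0 ms
```

The 481 bins match the `Complex32[481]` spectrum in the architecture diagram, and 96 bins of 50 Hz each give the ~4.8 kHz quoted for nb_df.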
Troubleshooting
"Cannot load enc.onnx" (file not found)
main.rs resolves model paths relative to the current working directory (the directory you launch the exe from), which it expects to be the workspace root.
Run from the NoiseCancellation/ root, or set the MODEL_DIR env var and adapt
the path constants at the top of main.rs.
Tensor name mismatch / session.run() panic
Run inspect_onnx_models.py and match the string literals exactly.
The RNN state inputs may be named h_enc_0, h_enc_1 instead of h0, h1
depending on the export version.
White noise / silence output
Check that lsnr is within the expected range (-15..35 dB). If it is always
very negative, the ERB/complex features are probably being passed with wrong shapes.
Enable RUST_LOG=debug and check the per-frame lsnr values.
Latency budget
Each 480-sample frame covers 10 ms at 48 kHz. A 34.8 ms C++ buffer (1536 samples @ 44100 Hz) resamples to about 1672 samples, i.e. 3 full frames plus a partial one, so each buffer period has to cover the processing of 3 full frames plus resampling overhead. Rust ONNX inference measured < 1 ms per frame in Phase 4; the three-model orchestration should still be well within budget.
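The budget arithmetic can be checked with a few lines. In this sketch `per_frame_cost_ms` and `resample_cost_ms` are hypothetical placeholders for whatever you measure on your machine; they are not measurements from this project.

```python
IN_SR, MODEL_SR = 44_100, 48_000
BUFFER_SAMPLES, HOP = 1536, 480

buffer_ms = 1000 * BUFFER_SAMPLES / IN_SR      # about 34.83 ms per C++ buffer
resampled = BUFFER_SAMPLES * MODEL_SR / IN_SR  # about 1671.8 samples at 48 kHz
full_frames = int(resampled // HOP)            # 3 complete frames per buffer


def headroom_ms(per_frame_cost_ms, resample_cost_ms=0.0):
    """Time left in one buffer period after processing its full frames."""
    return buffer_ms - (full_frames * per_frame_cost_ms + resample_cost_ms)
```

For example, `headroom_ms(3.0)` (assuming, pessimistically, 3 ms per frame for all three models combined) still leaves more than 25 ms of each buffer period unused.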
Commit message (Conventional Commits)
feat(rust): implement three-model DeepFilterNet3 ONNX pipeline
Replace single broken model with official enc/erb_dec/df_dec ONNX
models orchestrated through DFState STFT, per-frame ERB/complex
feature extraction, ERB mask and deep filter coefficient application.
Adds rubato for 44100↔48000 resampling and stateful GRU carry-over.
Fixes: Phase 5 white-noise output
GitHub link: https://github.com/Dextromethorpan/Noise_Cancellation