Gryt

Audio Processing

Multi-stage processing pipeline with real-time visualization

The client implements a multi-stage audio processing pipeline built entirely on the Web Audio API. Each stage is a native AudioNode (or AudioWorkletNode) connected in series, so processing runs off the main thread with minimal latency.

Key Features

Enhanced Audio Pipeline

  • Multi-stage Processing: Volume → RNNoise → AGC → Compressor → eSports filter → Noise Gate → Mute → Output
  • Real-time Audio Visualization: Frequency spectrum and level meters
  • Loopback Monitoring: Hear yourself to test audio setup
  • Device Management: Hot-swappable microphone and speaker selection

Audio Quality Optimization

  • RNNoise Noise Reduction: AI-powered noise suppression via AudioWorklet (~20 ms latency)
  • True Auto Gain Control: RMS-based level normalization with configurable target dB
  • Dynamic Range Compressor: Separate compressor stage with adjustable amount
  • Noise Gate Filtering: Configurable threshold with smooth curves
  • Echo Cancellation: Browser-level AEC via getUserMedia constraints

Professional Controls

  • Volume Controls: Independent microphone and output volume with 2x boost
  • Real-time Adjustment: Instant response without audio glitches
  • Visual Feedback: Accurate representation of transmitted audio
  • Device Hot-swapping: Change devices without connection loss

Audio Processing Pipeline

The client chains native Web Audio nodes in a single AudioContext:

┌────────────┐   ┌────────────┐   ┌─────────────┐   ┌────────────────┐
│ Microphone │──►│  Volume    │──►│ Raw Analyser│──►│   RNNoise      │
│   Input    │   │  Control   │   │  + Raw Out  │   │  AudioWorklet  │
└────────────┘   └────────────┘   └─────────────┘   └────────────────┘

┌────────────┐   ┌────────────┐   ┌─────────────┐           │
│   Mute     │◄──│Noise Gate  │◄──│eSports LPF  │◄──────────┤
│  Control   │   │  (gain)    │   │ (optional)  │           │
└────────────┘   └────────────┘   └─────────────┘    ┌──────┴───────┐
       │                                              │  Compressor  │
       ▼                                              │  (optional)  │
┌────────────┐   ┌────────────┐                       └──────┬───────┘
│   Final    │──►│   Visual   │                              │
│  Analyser  │   │  Feedback  │                       ┌──────┴───────┐
└────────────┘   └────────────┘                       │   Auto Gain  │
       │                                              │  (RMS-based) │
       ▼                                              └──────────────┘
┌────────────┐
│    SFU     │
│Transmission│
└────────────┘

Stage 1: Volume Control

A GainNode with a 0–200 % range (up to 2x boost). Logarithmic scaling maps the slider position to a natural loudness curve.
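
One way to realize that curve is a linear-in-decibels mapping where 100 % is unity gain and 200 % is +6 dB (2x amplitude). This is an illustrative sketch, not necessarily the client's exact curve:

```typescript
// Map a 0–200 % volume slider to a GainNode.gain value.
// Assumption: the slider is linear in decibels, with 100 % = 0 dB (unity)
// and 200 % = +6.02 dB (2x amplitude); 0 % is treated as a hard mute.
function sliderToGain(sliderPercent: number): number {
  if (sliderPercent <= 0) return 0; // hard mute instead of -Infinity dB
  const clamped = Math.min(sliderPercent, 200);
  const db = ((clamped - 100) / 100) * (20 * Math.log10(2)); // ±6.02 dB span
  return Math.pow(10, db / 20); // dB -> linear gain
}
```

A curve like this keeps small slider movements near 100 % perceptually even, rather than the lopsided loudness steps a purely linear gain mapping would produce.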

Stage 2: RNNoise Noise Reduction (AudioWorklet)

When enabled, audio is routed through an AudioWorkletNode running RNNoise compiled to WebAssembly. The worklet processes 480-sample frames (10 ms at 48 kHz) in a dedicated thread, adding roughly 20 ms of latency. This replaces the previous ScriptProcessorNode approach, eliminating main-thread blocking and audio glitches.

| Property              | Value                          |
|-----------------------|--------------------------------|
| Frame size            | 480 samples (10 ms)            |
| Sample rate           | 48 kHz                         |
| Processing thread     | AudioWorklet (off main thread) |
| Typical added latency | ~20 ms                         |
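
AudioWorklet hands the processor audio in 128-sample render quanta, while RNNoise consumes fixed 480-sample frames, so the worklet must re-block the stream and carry leftover samples between callbacks. A minimal sketch of that buffering (the class and its names are illustrative, not the actual worklet code):

```typescript
// Re-block 128-sample render quanta into 480-sample RNNoise frames.
// A production worklet would use a preallocated Float32Array ring buffer;
// a plain array keeps this sketch short.
class FrameBuffer {
  private pendingSamples: number[] = [];
  constructor(private readonly frameSize = 480) {}

  // Push one render quantum; return every complete frame now available.
  push(quantum: ArrayLike<number>): number[][] {
    for (let i = 0; i < quantum.length; i++) {
      this.pendingSamples.push(quantum[i]);
    }
    const frames: number[][] = [];
    while (this.pendingSamples.length >= this.frameSize) {
      frames.push(this.pendingSamples.splice(0, this.frameSize));
    }
    return frames;
  }

  get pending(): number {
    return this.pendingSamples.length;
  }
}
```

This re-blocking is one source of the ~20 ms added latency: samples sit in the buffer until a full 480-sample frame has accumulated before RNNoise can denoise them.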

Stage 3: Auto Gain Control (True AGC)

A true RMS-based AGC that continuously measures input loudness and adjusts a GainNode to hit a configurable target level. This replaces the previous DynamicsCompressorNode hack.

Settings:

  • Target Level: -30 dB to -5 dB (default -20 dB) — the volume your voice gets normalized to
  • Enabled by default: Yes

The AGC analyser measures RMS in real time, and usePipelineControls adjusts the gain node dynamically to converge on the target dB level.
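
The measure-and-adjust loop can be sketched as below. The smoothing factor and function names are assumptions for illustration, not the actual usePipelineControls code:

```typescript
// Measure a buffer's RMS level in dBFS (floored to avoid -Infinity on silence).
function rmsDb(samples: Float32Array): number {
  let sumSquares = 0;
  for (const s of samples) sumSquares += s * s;
  const rms = Math.sqrt(sumSquares / samples.length);
  return 20 * Math.log10(Math.max(rms, 1e-8));
}

// Nudge the AGC gain toward whatever gain would bring the measured level
// to the target. Exponential smoothing avoids audible pumping.
function updateAgcGain(
  currentGain: number,
  measuredDb: number,
  targetDb: number,
  smoothing = 0.1,
): number {
  const desiredGain = Math.pow(10, (targetDb - measuredDb) / 20);
  return currentGain + (desiredGain - currentGain) * smoothing;
}
```

Called once per analyser frame, the gain converges exponentially on the target: input running 20 dB under a -20 dB target settles at roughly 10x gain.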

Stage 4: Compressor

An optional DynamicsCompressorNode that tames dynamic peaks after AGC. Useful for keeping volume consistent when your speech swings between soft and loud.

Settings:

  • Enabled by default: Yes
  • Amount: 0–100 % slider that interpolates compressor parameters from gentle to aggressive

| Amount | Threshold | Knee  | Ratio | Attack  | Release |
|--------|-----------|-------|-------|---------|---------|
| 0 %    | -10 dB    | 40 dB | 2:1   | 0.01 s  | 0.25 s  |
| 100 %  | -50 dB    | 5 dB  | 20:1  | 0.001 s | 0.05 s  |
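
One plausible implementation derives the in-between settings by linearly interpolating DynamicsCompressorNode parameters between the two presets above; the linear ramp and the names used here are assumptions:

```typescript
interface CompressorParams {
  threshold: number; // dB
  knee: number;      // dB
  ratio: number;
  attack: number;    // seconds
  release: number;   // seconds
}

// Presets matching the 0 % and 100 % rows of the table.
const GENTLE: CompressorParams = { threshold: -10, knee: 40, ratio: 2, attack: 0.01, release: 0.25 };
const AGGRESSIVE: CompressorParams = { threshold: -50, knee: 5, ratio: 20, attack: 0.001, release: 0.05 };

// Interpolate each parameter for a 0–100 % amount slider.
function compressorParams(amountPercent: number): CompressorParams {
  const t = Math.min(Math.max(amountPercent, 0), 100) / 100;
  const lerp = (a: number, b: number) => a + (b - a) * t;
  return {
    threshold: lerp(GENTLE.threshold, AGGRESSIVE.threshold),
    knee: lerp(GENTLE.knee, AGGRESSIVE.knee),
    ratio: lerp(GENTLE.ratio, AGGRESSIVE.ratio),
    attack: lerp(GENTLE.attack, AGGRESSIVE.attack),
    release: lerp(GENTLE.release, AGGRESSIVE.release),
  };
}
```

Under this scheme the default 50 % lands midway (-30 dB threshold, 11:1 ratio), which already compresses firmly.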

Stage 5: eSports Low-pass Filter (Optional)

When eSports mode is enabled, a BiquadFilterNode (lowpass, 3400 Hz cutoff) rolls off high-frequency content to prioritize vocal clarity over fidelity, matching competitive voice chat conventions.

Stage 6: Noise Gate

A GainNode controlled by the raw analyser's RMS level. When input falls below the configurable threshold, the gate closes smoothly. The raw analyser taps the signal before RNNoise so the gate responds to actual mic input, not processed audio.

Configuration:

  • Threshold: -50 dB to -10 dB (configurable via slider)
  • Behavior: Smooth open/close to avoid clicks
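
The open/close behavior can be sketched as a target gain driven by the raw analyser's level, smoothed asymmetrically so the gate opens fast and closes gently. The coefficients here are illustrative, not the client's actual tuning:

```typescript
// Gate target: fully open at or above the threshold, closed below it.
function gateTarget(levelDb: number, thresholdDb: number): number {
  return levelDb >= thresholdDb ? 1 : 0;
}

// Smooth the gate's GainNode toward its target. Opening uses a larger
// coefficient (fast, so speech onsets aren't clipped); closing uses a
// smaller one (slow, to avoid clicks and pumping between words).
function smoothGate(
  currentGain: number,
  targetGain: number,
  openCoeff = 0.3,
  closeCoeff = 0.05,
): number {
  const coeff = targetGain > currentGain ? openCoeff : closeCoeff;
  return currentGain + (targetGain - currentGain) * coeff;
}
```

Run once per analyser frame, this yields the smooth open/close described above instead of a hard on/off switch.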

Stage 7: Mute Control

Server-synchronized mute via a GainNode set to 0. State is synced bidirectionally with the signaling server so other participants see the correct mute indicator.

Stage 8: Final Analysis + SFU Transmission

The final analyser provides real-time visual feedback (level meters, frequency spectrum). The output is connected to a MediaStreamDestination for WebRTC transmission to the SFU.

Settings Reference

All audio settings are persisted to localStorage and take effect immediately:

| Setting              | Default | Range         | Description                          |
|----------------------|---------|---------------|--------------------------------------|
| Microphone Volume    | 100 %   | 0–200 %       | Input gain with 2x boost capability  |
| Output Volume        | 100 %   | 0–200 %       | Speaker / headphone volume           |
| Noise Gate Threshold | -30 dB  | -50 to -10 dB | Below this level, the mic is gated   |
| RNNoise              | Off     | On / Off      | AI noise reduction via AudioWorklet  |
| Auto Gain            | On      | On / Off      | RMS-based level normalization        |
| AGC Target Level     | -20 dB  | -30 to -5 dB  | Target loudness for auto gain        |
| Compressor           | On      | On / Off      | Dynamic range compression after AGC  |
| Compressor Amount    | 50 %    | 0–100 %       | Gentle → aggressive compression      |
| eSports Mode         | Off     | On / Off      | 3.4 kHz low-pass for vocal clarity   |
| Loopback             | Off     | On / Off      | Monitor your processed audio locally |

Loopback Monitoring

Toggle loopback in Audio Settings to route your fully processed audio to your speakers or headphones. Useful for verifying how you sound to others before joining a voice channel.

Device Management

Microphone and speaker devices can be changed at any time through Audio Settings without disconnecting from voice. The client listens for devicechange events and automatically refreshes the device list. If your selected device is unplugged, it falls back to the system default.

Troubleshooting

Common Audio Processing Issues

RNNoise causing artifacts?

  • Disable RNNoise in Audio Settings — it is experimental and may not work well on all systems
  • Ensure your browser supports AudioWorklet (all modern browsers do)
  • Check for high CPU usage; RNNoise adds a constant processing load

Auto gain too loud or too quiet?

  • Adjust the AGC target level slider (lower = quieter, higher = louder)
  • If your mic has very low input, combine with the Volume slider boost

Compressor squashing too much?

  • Lower the Compressor Amount slider or disable it entirely
  • The compressor works best when paired with Auto Gain

Noise gate cutting off speech?

  • Lower the noise gate threshold so quieter speech isn't gated
  • If using RNNoise, you may not need an aggressive noise gate at all

Device switching issues?

  • Check browser permissions for microphone access
  • Some devices require a page reload after plugging in
  • Try selecting "Default" before choosing a specific device

Performance Metrics

| Metric                          | Target  |
|---------------------------------|---------|
| Pipeline latency (no RNNoise)   | < 5 ms  |
| Pipeline latency (with RNNoise) | ~20 ms  |
| CPU usage (audio processing)    | < 5 %   |
| RNNoise WASM memory             | ~2 MB   |
| Buffer underruns                | < 0.1 % |
