Audio Processing
Multi-stage processing pipeline with real-time visualization
The client implements a multi-stage audio processing pipeline built entirely on the Web Audio API. Each stage is a native AudioNode (or AudioWorkletNode) connected in series, so processing runs off the main thread with minimal latency.
Key Features
Enhanced Audio Pipeline
- Multi-stage Processing: Volume → RNNoise → AGC → Compressor → Noise Gate → Mute → Output
- Real-time Audio Visualization: Frequency spectrum and level meters
- Loopback Monitoring: Hear yourself to test audio setup
- Device Management: Hot-swappable microphone and speaker selection
Audio Quality Optimization
- RNNoise Noise Reduction: AI-powered noise suppression via AudioWorklet (~20 ms latency)
- True Auto Gain Control: RMS-based level normalization with configurable target dB
- Dynamic Range Compressor: Separate compressor stage with adjustable amount
- Noise Gate Filtering: Configurable threshold with smooth curves
- Echo Cancellation: Browser-level AEC via `getUserMedia` constraints
Professional Controls
- Volume Controls: Independent microphone and output volume with 2x boost
- Real-time Adjustment: Instant response without audio glitches
- Visual Feedback: Accurate representation of transmitted audio
- Device Hot-swapping: Change devices without connection loss
Audio Processing Pipeline
The client chains native Web Audio nodes in a single AudioContext:
Stage 1: Volume Control
A GainNode with 0–200 % range (2x boost). Logarithmic scaling maps the slider to a natural loudness curve.
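One way to realize such a curve is to map the slider linearly in dB and convert to linear gain, so 100 % is unity and 200 % is +6 dB (≈ 2x amplitude). This is a sketch; the client's exact mapping constants are an assumption here.

```typescript
// Maps a 0–200 % slider value to a linear gain for a GainNode.
// 100 % → 0 dB (gain 1.0), 200 % → +6 dB (gain ≈ 2.0), 0 % → silence.
// Note: the ±6 dB span is an illustrative choice, not a documented value.
function sliderToGain(percent: number): number {
  if (percent <= 0) return 0;
  const db = (percent - 100) * 0.06; // linear in dB across the slider range
  return Math.pow(10, db / 20);
}

// Applying it without clicks:
// volumeGain.gain.setTargetAtTime(sliderToGain(150), audioCtx.currentTime, 0.01);
```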
Stage 2: RNNoise Noise Reduction (AudioWorklet)
When enabled, audio is routed through an AudioWorkletNode running RNNoise compiled to WebAssembly. The worklet processes 480-sample frames (10 ms at 48 kHz) in a dedicated thread, adding roughly 20 ms of latency.
Audio frames flow directly between the AudioWorklet and the RNNoise Web Worker via a MessageChannel, completely bypassing the main thread. This prevents UI freezes that would otherwise occur from relaying ~200 message events per second through the main thread event loop.
| Property | Value |
|---|---|
| Frame size | 480 samples (10 ms) |
| Sample rate | 48 kHz |
| Processing thread | AudioWorklet + Worker (off main thread) |
| Main thread involvement | Control messages only (enable/disable) |
| Typical added latency | ~20 ms |
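The direct worklet-to-worker handoff can be sketched with a MessageChannel: one port is transferred into the AudioWorklet, the other into the RNNoise worker, after which audio frames never touch the main thread. The message shape (`{ type: 'connect', port }`) is an assumption, not the client's actual protocol.

```typescript
// Minimal port type so the helper works with both real worklet/worker
// objects and test doubles.
interface PortLike {
  postMessage(message: unknown, transfer?: unknown[]): void;
}

// Hands one end of a MessageChannel to the AudioWorklet and the other to the
// RNNoise worker. After this, 480-sample frames flow worker↔worklet directly;
// the main thread only ever sends control messages (enable/disable).
function bridgeWorkletToWorker(workletPort: PortLike, workerPort: PortLike): MessageChannel {
  const channel = new MessageChannel();
  workletPort.postMessage({ type: 'connect', port: channel.port1 }, [channel.port1]);
  workerPort.postMessage({ type: 'connect', port: channel.port2 }, [channel.port2]);
  return channel;
}

// Hypothetical usage (identifiers are illustrative):
// bridgeWorkletToWorker(rnnoiseNode.port, rnnoiseWorker);
```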
Stage 3: Auto Gain Control (True AGC)
A true RMS-based AGC that continuously measures input loudness and adjusts a GainNode to hit a configurable target level. This replaces the previous DynamicsCompressorNode hack.
Settings:
- Target Level: -30 dB to -5 dB (default -20 dB) — the volume your voice gets normalized to
- Enabled by default: Yes
The AGC analyser measures RMS in real time, and `usePipelineControls` adjusts the gain node dynamically to converge on the target dB level.
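The convergence logic can be sketched as a pure update step. This is an illustration only; the smoothing rate and clamping used by the client are assumptions.

```typescript
// Converts a linear RMS level (0..1) to dBFS, clamped to avoid -Infinity.
function rmsToDb(rms: number): number {
  return 20 * Math.log10(Math.max(rms, 1e-8));
}

// One AGC update: move the current gain a fraction of the way toward the
// gain that would place the measured RMS exactly at the target level.
// The `rate` smoothing factor (assumed value) avoids pumping artifacts.
function agcStep(
  currentGain: number,
  measuredRms: number,
  targetDb: number,
  rate = 0.1,
): number {
  const errorDb = targetDb - rmsToDb(measuredRms);
  const desiredGain = currentGain * Math.pow(10, errorDb / 20);
  return currentGain + (desiredGain - currentGain) * rate;
}
```

Run once per analyser frame against the AGC GainNode's value, this converges on the target without abrupt jumps.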
Stage 4: Compressor
An optional DynamicsCompressorNode that tames dynamic peaks after AGC. Useful for keeping volume consistent when speaking softly then loudly.
Settings:
- Enabled by default: Yes
- Amount: 0–100 % slider that interpolates compressor parameters from gentle to aggressive
| Amount | Threshold | Knee | Ratio | Attack | Release |
|---|---|---|---|---|---|
| 0 % | -10 dB | 40 dB | 1:1 | 0.003 s | 0.25 s |
| 100 % | -40 dB | 5 dB | 20:1 | 0.003 s | 0.25 s |
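The interpolation between those two endpoint rows can be sketched as a linear blend; the client's actual easing curve is not documented here and may differ.

```typescript
interface CompressorParams {
  threshold: number; // dB
  knee: number;      // dB
  ratio: number;
  attack: number;    // seconds
  release: number;   // seconds
}

// Linearly interpolates DynamicsCompressorNode parameters for a 0–100 %
// "amount" slider, from gentle (effectively pass-through) to aggressive.
function compressorParams(amount: number): CompressorParams {
  const t = Math.min(Math.max(amount, 0), 100) / 100;
  const lerp = (a: number, b: number) => a + (b - a) * t;
  return {
    threshold: lerp(-10, -40),
    knee: lerp(40, 5),
    ratio: lerp(1, 20),
    attack: 0.003,
    release: 0.25,
  };
}

// const p = compressorParams(50);
// compressor.threshold.value = p.threshold; compressor.ratio.value = p.ratio; ...
```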
Stage 5: Noise Gate
A GainNode controlled by the raw analyser's RMS level. When input falls below the configurable threshold, the gate closes smoothly. The raw analyser taps the signal before RNNoise so the gate responds to actual mic input, not processed audio.
Configuration:
- Threshold: -50 dB to -10 dB (configurable via slider)
- Behavior: Smooth open/close to avoid clicks
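The gate behavior can be sketched as a target gain plus a per-frame smoothing step. The input level is assumed to be the raw analyser's RMS already converted to dB; the smoothing constant is illustrative.

```typescript
// Fully open above the threshold, fully closed below it.
function gateTarget(levelDb: number, thresholdDb: number): number {
  return levelDb >= thresholdDb ? 1 : 0;
}

// Moves the gate's GainNode value a fraction toward the target each analyser
// frame, producing smooth open/close ramps instead of audible clicks.
function gateStep(
  currentGain: number,
  levelDb: number,
  thresholdDb: number,
  smoothing = 0.2, // assumed value; larger = snappier gate
): number {
  const target = gateTarget(levelDb, thresholdDb);
  return currentGain + (target - currentGain) * smoothing;
}
```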
Stage 6: Mute Control
Server-synchronized mute via a GainNode set to 0. State is synced bidirectionally with the signaling server so other participants see the correct mute indicator.
Stage 7: Final Analysis + SFU Transmission
The final analyser provides real-time visual feedback (level meters, frequency spectrum). The output is connected to a MediaStreamDestination for WebRTC transmission to the SFU.
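Taken together, the seven stages form a simple series connection. A generic helper that wires such a chain might look like this; the stage names in the usage comment are illustrative, not the client's actual identifiers.

```typescript
// Anything exposing connect() can participate: GainNode, AudioWorkletNode,
// DynamicsCompressorNode, MediaStreamAudioDestinationNode, etc.
interface Connectable {
  connect(destination: any): any;
}

// Connects an ordered list of Web Audio nodes in series, returning the last.
function chain<T extends Connectable>(nodes: T[]): T {
  for (let i = 0; i < nodes.length - 1; i++) {
    nodes[i].connect(nodes[i + 1]);
  }
  return nodes[nodes.length - 1];
}

// Hypothetical usage mirroring the stages above:
// chain([volumeGain, rnnoiseNode, agcGain, compressor, gateGain, muteGain, sfuDestination]);
```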
Screen Share Audio — Native Capture
When screen sharing with system audio, Gryt uses OS-native per-process audio capture on the desktop app to exclude its own audio from the stream. This means other participants hear your game, music, or application audio — but not the voices of people already in the call.
How it works
- Windows (10 build 20348+): Uses the WASAPI `PROCESS_LOOPBACK_MODE_EXCLUDE_TARGET_PROCESS_TREE` API to capture all system audio except Gryt's process tree.
- macOS (13.0+): Uses ScreenCaptureKit with `excludesCurrentProcessAudio` to exclude the current app's audio.
- Linux / Web: No OS-level API is available. Screen share audio is passed through unfiltered.
The native binary is a small standalone executable shipped alongside the Electron app. It writes raw 48 kHz 16-bit stereo PCM to stdout; the main process forwards chunks to the renderer via IPC, where an AudioWorkletNode converts them into a MediaStreamTrack for WebRTC transmission.
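The PCM-to-float conversion done in that worklet can be sketched as follows, assuming the interleaved s16le stereo layout described above (the function name is illustrative):

```typescript
// Converts interleaved 16-bit stereo PCM (as written to stdout by the native
// capture binary) into the planar Float32 channels the Web Audio API expects.
function deinterleaveS16Stereo(pcm: Int16Array): [Float32Array, Float32Array] {
  const frames = pcm.length >> 1; // two samples (L, R) per frame
  const left = new Float32Array(frames);
  const right = new Float32Array(frames);
  for (let i = 0; i < frames; i++) {
    left[i] = pcm[2 * i] / 32768;      // scale to [-1, 1)
    right[i] = pcm[2 * i + 1] / 32768;
  }
  return [left, right];
}
```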
Platform support
| Platform | Method | Gryt audio excluded? |
|---|---|---|
| Windows (desktop app) | WASAPI process loopback | Yes |
| macOS (desktop app) | ScreenCaptureKit | Yes |
| Linux (desktop app) | System loopback | No |
| Web (any OS) | getDisplayMedia | No |
Settings Reference
All audio settings are persisted to localStorage and take effect immediately:
| Setting | Default | Range | Description |
|---|---|---|---|
| Microphone Volume | 100 % | 0–200 % | Input gain with 2x boost capability |
| Output Volume | 100 % | 0–200 % | Speaker / headphone volume |
| Noise Gate Threshold | -30 dB | -50 to -10 dB | Below this level, mic is gated |
| RNNoise | Off | On / Off | AI noise reduction via AudioWorklet |
| Auto Gain | On | On / Off | RMS-based level normalization |
| AGC Target Level | -20 dB | -30 to -5 dB | Target loudness for auto gain |
| Compressor | On | On / Off | Dynamic range compression after AGC |
| Compressor Amount | 50 % | 0–100 % | Gentle → aggressive compression |
| eSports Mode | Off | On / Off | Smaller FFT and faster smoothing for lower latency visualization |
| Loopback | Off | On / Off | Monitor your processed audio locally |
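Persistence can be sketched as a thin wrapper over localStorage seeded with the defaults above. The storage key, field names, and merge strategy are all assumptions for illustration.

```typescript
interface AudioSettings {
  micVolume: number;        // 0–200 (%)
  outputVolume: number;     // 0–200 (%)
  gateThresholdDb: number;  // -50 … -10
  rnnoise: boolean;
  autoGain: boolean;
  agcTargetDb: number;      // -30 … -5
  compressor: boolean;
  compressorAmount: number; // 0–100 (%)
}

const DEFAULTS: AudioSettings = {
  micVolume: 100, outputVolume: 100, gateThresholdDb: -30,
  rnnoise: false, autoGain: true, agcTargetDb: -20,
  compressor: true, compressorAmount: 50,
};

const KEY = 'audio-settings'; // hypothetical storage key

// Loads saved settings, merging over defaults so settings added in later
// versions pick up sane values; malformed JSON falls back to defaults.
function loadSettings(storage: { getItem(key: string): string | null }): AudioSettings {
  try {
    const raw = storage.getItem(KEY);
    return raw ? { ...DEFAULTS, ...JSON.parse(raw) } : { ...DEFAULTS };
  } catch {
    return { ...DEFAULTS };
  }
}
```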
Loopback Monitoring
Toggle loopback in Audio Settings to route your fully processed audio to your speakers/headphones. Useful for verifying how you sound to others before joining a voice channel.
Device Management
Microphone and speaker devices can be changed at any time through Audio Settings without disconnecting from voice. The client listens for devicechange events and automatically refreshes the device list. If your selected device is unplugged, it falls back to the system default.
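The fallback logic can be sketched as a pure selection function run on each devicechange event; this is an illustration, and the client's real handler also refreshes the UI list.

```typescript
interface DeviceInfo {
  deviceId: string;
  kind: 'audioinput' | 'audiooutput' | 'videoinput';
}

// Returns the preferred device id if it is still plugged in, otherwise the
// system default. Intended to run on the result of
// navigator.mediaDevices.enumerateDevices().
function pickDevice(
  devices: DeviceInfo[],
  kind: DeviceInfo['kind'],
  preferredId: string | null,
): string {
  const match = devices.find((d) => d.kind === kind && d.deviceId === preferredId);
  return match ? match.deviceId : 'default';
}

// navigator.mediaDevices.addEventListener('devicechange', async () => {
//   const devices = await navigator.mediaDevices.enumerateDevices();
//   const micId = pickDevice(devices, 'audioinput', savedMicId);
// });
```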
Troubleshooting
Common Audio Processing Issues
RNNoise causing artifacts?
- Disable RNNoise in Audio Settings — it is experimental and may not work well on all systems
- Ensure your browser supports `AudioWorklet` (all modern browsers do)
- Check for high CPU usage; RNNoise adds a constant processing load
Auto gain too loud or too quiet?
- Adjust the AGC target level slider (lower = quieter, higher = louder)
- If your mic has very low input, combine with the Volume slider boost
Compressor squashing too much?
- Lower the Compressor Amount slider or disable it entirely
- The compressor works best when paired with Auto Gain
Noise gate cutting off speech?
- Lower the noise gate threshold so quieter speech isn't gated
- If using RNNoise, you may not need an aggressive noise gate at all
Device switching issues?
- Check browser permissions for microphone access
- Some devices require a page reload after plugging in
- Try selecting "Default" before choosing a specific device
Latency
The pipeline adds negligible latency without RNNoise. With RNNoise enabled, the AudioWorklet buffering adds roughly 20 ms of processing latency. The actual end-to-end latency depends on network conditions and the SFU.