Audio Processing

Multi-stage processing pipeline with real-time visualization

The client implements a multi-stage audio processing pipeline built entirely on the Web Audio API. Each stage is a native AudioNode (or AudioWorkletNode) connected in series, so processing runs off the main thread with minimal latency.

Key Features

Enhanced Audio Pipeline

  • Multi-stage Processing: Volume → RNNoise → AGC → Compressor → Noise Gate → Mute → Output
  • Real-time Audio Visualization: Frequency spectrum and level meters
  • Loopback Monitoring: Hear yourself to test audio setup
  • Device Management: Hot-swappable microphone and speaker selection

Audio Quality Optimization

  • RNNoise Noise Reduction: AI-powered noise suppression via AudioWorklet (~20 ms latency)
  • True Auto Gain Control: RMS-based level normalization with configurable target dB
  • Dynamic Range Compressor: Separate compressor stage with adjustable amount
  • Noise Gate Filtering: Configurable threshold with smooth curves
  • Echo Cancellation: Browser-level AEC via getUserMedia constraints
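The division of labor above can be sketched as a capture-constraints object. Disabling the browser's built-in noise suppression and auto gain (on the grounds that RNNoise and the pipeline's own AGC handle those jobs) is an assumption about the client's configuration, not confirmed by the source:

```typescript
// Assumed getUserMedia audio constraints: keep browser AEC, but let the
// pipeline's RNNoise and true AGC stages handle suppression and leveling.
const audioConstraints = {
  echoCancellation: true,   // browser-level AEC stays on
  noiseSuppression: false,  // RNNoise handles noise reduction instead
  autoGainControl: false,   // the pipeline's RMS-based AGC handles leveling
};

// In the browser, the object would be passed to getUserMedia:
// const stream = await navigator.mediaDevices.getUserMedia({ audio: audioConstraints });
```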

Professional Controls

  • Volume Controls: Independent microphone and output volume with 2x boost
  • Real-time Adjustment: Instant response without audio glitches
  • Visual Feedback: Accurate representation of transmitted audio
  • Device Hot-swapping: Change devices without connection loss

Audio Processing Pipeline

The client chains native Web Audio nodes in a single AudioContext:

Stage 1: Volume Control

A GainNode with 0–200 % range (2x boost). Logarithmic scaling maps the slider to a natural loudness curve.
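The slider-to-gain mapping can be sketched as a pure function. The specific curve below (a quadratic taper under unity gain, linear up to the 2x ceiling) is an assumption, not the client's actual formula:

```typescript
// Map a 0–200 % volume slider to a GainNode gain value.
// Assumed curve: quadratic below unity for finer control at low volumes,
// then linear from 1x up to the 2x boost ceiling.
function sliderToGain(percent: number): number {
  const x = Math.min(Math.max(percent, 0), 200) / 100;
  return x <= 1 ? x * x : x;
}

// Applying it glitch-free in the browser (not run here):
// volumeNode.gain.setTargetAtTime(sliderToGain(value), ctx.currentTime, 0.02);
```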

Stage 2: RNNoise Noise Reduction (AudioWorklet)

When enabled, audio is routed through an AudioWorkletNode running RNNoise compiled to WebAssembly. The worklet processes 480-sample frames (10 ms at 48 kHz) in a dedicated thread, adding roughly 20 ms of latency.

Audio frames flow directly between the AudioWorklet and the RNNoise Web Worker via a MessageChannel, completely bypassing the main thread. This prevents UI freezes that would otherwise occur from relaying ~200 message events per second through the main thread event loop.

| Property | Value |
| --- | --- |
| Frame size | 480 samples (10 ms) |
| Sample rate | 48 kHz |
| Processing thread | AudioWorklet + Worker (off main thread) |
| Main thread involvement | Control messages only (enable/disable) |
| Typical added latency | ~20 ms |
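The wiring can be sketched as follows; the processor and worker names are illustrative, not the client's actual identifiers:

```typescript
// Sketch: hand the two ends of a MessageChannel to the AudioWorkletNode and
// the RNNoise worker so audio frames flow between them directly, bypassing
// the main thread. The main thread keeps node.port for control messages only.
function wireRnnoise(ctx: AudioContext, workerUrl: string): AudioWorkletNode {
  const node = new AudioWorkletNode(ctx, "rnnoise-processor"); // assumed name
  const worker = new Worker(workerUrl);
  const channel = new MessageChannel();
  node.port.postMessage({ type: "connect" }, [channel.port1]);
  worker.postMessage({ type: "connect" }, [channel.port2]);
  return node;
}

// Frame math: at 48 kHz a 480-sample frame lasts 10 ms, so ~100 frames/s
// flow each way (~200 message events/s kept off the main thread).
function frameDurationMs(frameSamples: number, sampleRate: number): number {
  return (frameSamples / sampleRate) * 1000;
}
```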

Stage 3: Auto Gain Control (True AGC)

A true RMS-based AGC that continuously measures input loudness and adjusts a GainNode to hit a configurable target level. This replaces the previous DynamicsCompressorNode hack.

Settings:

  • Target Level: -30 dB to -5 dB (default -20 dB) — the volume your voice gets normalized to
  • Enabled by default: Yes

The AGC analyser measures RMS in real time, and usePipelineControls adjusts the gain node dynamically to converge on the target dB level.
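The measure-and-adjust step can be sketched in isolation; the smoothing rate below is an assumed constant, not a value from the client:

```typescript
// RMS level in dBFS; the floor avoids log10(0) for silent input.
function rmsToDb(rms: number): number {
  return 20 * Math.log10(Math.max(rms, 1e-8));
}

// One AGC update: nudge the current gain toward the gain that would place
// the measured level at the target. `rate` (assumed) controls how quickly
// the loop converges without audible pumping.
function agcStep(
  currentGain: number,
  measuredDb: number,
  targetDb: number,
  rate = 0.1,
): number {
  const desired = currentGain * Math.pow(10, (targetDb - measuredDb) / 20);
  return currentGain + (desired - currentGain) * rate;
}
```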

Stage 4: Compressor

An optional DynamicsCompressorNode that tames dynamic peaks after AGC. Useful for keeping volume consistent when your speaking level shifts between soft and loud.

Settings:

  • Enabled by default: Yes
  • Amount: 0–100 % slider that interpolates compressor parameters from gentle to aggressive

| Amount | Threshold | Knee | Ratio | Attack | Release |
| --- | --- | --- | --- | --- | --- |
| 0 % | -10 dB | 40 | 1:1 | 0.003 s | 0.25 s |
| 100 % | -40 dB | 5 | 20:1 | 0.003 s | 0.25 s |
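The amount-to-parameter mapping can be sketched as a pure function using the endpoint values above; linear interpolation between the endpoints is an assumption:

```typescript
// DynamicsCompressorNode parameters interpolated from the 0–100 % amount.
interface CompressorParams {
  threshold: number; // dB
  knee: number;      // dB
  ratio: number;
  attack: number;    // s
  release: number;   // s
}

function compressorParams(amount: number): CompressorParams {
  const t = Math.min(Math.max(amount, 0), 100) / 100;
  const lerp = (a: number, b: number) => a + (b - a) * t;
  return {
    threshold: lerp(-10, -40), // gentle -> aggressive
    knee: lerp(40, 5),         // soft knee -> hard knee
    ratio: lerp(1, 20),        // no compression -> near-limiting
    attack: 0.003,
    release: 0.25,
  };
}
```

In the browser, the result would be copied onto a DynamicsCompressorNode's AudioParams (e.g. `compressor.threshold.setValueAtTime(p.threshold, ctx.currentTime)`).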

Stage 5: Noise Gate

A GainNode controlled by the raw analyser's RMS level. When input falls below the configurable threshold, the gate closes smoothly. The raw analyser taps the signal before RNNoise so the gate responds to actual mic input, not processed audio.

Configuration:

  • Threshold: -50 dB to -10 dB (configurable via slider)
  • Behavior: Smooth open/close to avoid clicks
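The gate logic can be sketched as a threshold decision plus a one-pole smoother; the smoothing coefficient is an assumed value:

```typescript
// Open/closed decision from the raw analyser's RMS level (pre-RNNoise).
function gateTarget(levelDb: number, thresholdDb: number): number {
  return levelDb >= thresholdDb ? 1 : 0;
}

// One-pole smoother so the gate's gain ramps rather than jumps,
// avoiding audible clicks. `coeff` (assumed) sets the ramp speed.
function smoothGain(current: number, target: number, coeff = 0.2): number {
  return current + (target - current) * coeff;
}

// In the browser, the target would drive the gate's GainNode, e.g.:
// gateNode.gain.setTargetAtTime(gateTarget(db, threshold), ctx.currentTime, 0.05);
```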

Stage 6: Mute Control

Server-synchronized mute via a GainNode set to 0. State is synced bidirectionally with the signaling server so other participants see the correct mute indicator.

Stage 7: Final Analysis + SFU Transmission

The final analyser provides real-time visual feedback (level meters, frequency spectrum). The output is connected to a MediaStreamDestination for WebRTC transmission to the SFU.
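The whole chain can be sketched as a generic series connection; stage construction is omitted and the stage names are illustrative, not the client's identifiers:

```typescript
// Wire any list of Web Audio nodes in series.
function connectChain(nodes: AudioNode[]): void {
  for (let i = 0; i < nodes.length - 1; i++) {
    nodes[i].connect(nodes[i + 1]);
  }
}

// Stage order from the pipeline described above (destination excluded).
const STAGE_ORDER = [
  "volume", "rnnoise", "agc", "compressor", "gate", "mute", "analyser",
] as const;

// In the browser (not run here):
// const dest = ctx.createMediaStreamDestination();
// connectChain([volume, rnnoise, agc, compressor, gate, mute, analyser, dest]);
// peerConnection.addTrack(dest.stream.getAudioTracks()[0]);
```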

Screen Share Audio — Native Capture

When screen sharing with system audio, Gryt uses OS-native per-process audio capture on the desktop app to exclude its own audio from the stream. This means other participants hear your game, music, or application audio — but not the voices of people already in the call.

How it works

  • Windows (10 build 20348+): Uses the WASAPI PROCESS_LOOPBACK_MODE_EXCLUDE_TARGET_PROCESS_TREE API to capture all system audio except Gryt's process tree.
  • macOS (13.0+): Uses ScreenCaptureKit with excludesCurrentProcessAudio to exclude the current app's audio.
  • Linux / Web: No OS-level API is available. Screen share audio is passed through unfiltered.

The native binary is a small standalone executable shipped alongside the Electron app. It writes raw 48 kHz 16-bit stereo PCM to stdout; the main process forwards chunks to the renderer via IPC, where an AudioWorkletNode converts them into a MediaStreamTrack for WebRTC transmission.
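The sample-format conversion step can be sketched as a pure function; it uses the standard layout for interleaved 16-bit stereo PCM:

```typescript
// Convert one interleaved 16-bit stereo PCM chunk (as produced by the native
// capture binary) into per-channel Float32 buffers for an AudioWorklet.
// Dividing by 32768 maps the int16 range onto [-1, 1).
function int16StereoToFloat32(pcm: Int16Array): [Float32Array, Float32Array] {
  const frames = pcm.length / 2;
  const left = new Float32Array(frames);
  const right = new Float32Array(frames);
  for (let i = 0; i < frames; i++) {
    left[i] = pcm[2 * i] / 32768;     // even samples: left channel
    right[i] = pcm[2 * i + 1] / 32768; // odd samples: right channel
  }
  return [left, right];
}
```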

Platform support

| Platform | Method | Gryt audio excluded? |
| --- | --- | --- |
| Windows (desktop app) | WASAPI process loopback | Yes |
| macOS (desktop app) | ScreenCaptureKit | Yes |
| Linux (desktop app) | System loopback | No |
| Web (any OS) | getDisplayMedia | No |

Settings Reference

All audio settings are persisted to localStorage and take effect immediately:

| Setting | Default | Range | Description |
| --- | --- | --- | --- |
| Microphone Volume | 100 % | 0–200 % | Input gain with 2x boost capability |
| Output Volume | 100 % | 0–200 % | Speaker / headphone volume |
| Noise Gate Threshold | -30 dB | -50 to -10 dB | Below this level, mic is gated |
| RNNoise | Off | On / Off | AI noise reduction via AudioWorklet |
| Auto Gain | On | On / Off | RMS-based level normalization |
| AGC Target Level | -20 dB | -30 to -5 dB | Target loudness for auto gain |
| Compressor | On | On / Off | Dynamic range compression after AGC |
| Compressor Amount | 50 % | 0–100 % | Gentle → aggressive compression |
| eSports Mode | Off | On / Off | Smaller FFT and faster smoothing for lower-latency visualization |
| Loopback | Off | On / Off | Monitor your processed audio locally |

Loopback Monitoring

Toggle loopback in Audio Settings to route your fully-processed audio to your speakers/headphones. Useful for verifying how you sound to others before joining a voice channel.

Device Management

Microphone and speaker devices can be changed at any time through Audio Settings without disconnecting from voice. The client listens for devicechange events and automatically refreshes the device list. If your selected device is unplugged, it falls back to the system default.
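The fallback logic can be sketched as a small selection helper; `pickDevice` is a hypothetical name, and the reserved `"default"` device ID is the convention used by Chromium-based browsers:

```typescript
// Keep the preferred device if it is still present after a devicechange
// event; otherwise fall back to the system default.
function pickDevice(availableIds: string[], preferredId: string | null): string {
  if (preferredId && availableIds.includes(preferredId)) return preferredId;
  return "default"; // reserved default-device ID in Chromium-based browsers
}

// In the browser (not run here; applyMic/savedMicId are illustrative):
// navigator.mediaDevices.addEventListener("devicechange", async () => {
//   const devices = await navigator.mediaDevices.enumerateDevices();
//   const mics = devices.filter((d) => d.kind === "audioinput").map((d) => d.deviceId);
//   applyMic(pickDevice(mics, savedMicId));
// });
```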

Troubleshooting

Common Audio Processing Issues

RNNoise causing artifacts?

  • Disable RNNoise in Audio Settings — it is experimental and may not work well on all systems
  • Ensure your browser supports AudioWorklet (all modern browsers do)
  • Check for high CPU usage; RNNoise adds a constant processing load

Auto gain too loud or too quiet?

  • Adjust the AGC target level slider (lower = quieter, higher = louder)
  • If your mic has very low input, combine with the Volume slider boost

Compressor squashing too much?

  • Lower the Compressor Amount slider or disable it entirely
  • The compressor works best when paired with Auto Gain

Noise gate cutting off speech?

  • Lower the noise gate threshold so quieter speech isn't gated
  • If using RNNoise, you may not need an aggressive noise gate at all

Device switching issues?

  • Check browser permissions for microphone access
  • Some devices require a page reload after plugging in
  • Try selecting "Default" before choosing a specific device

Latency

The pipeline adds negligible latency without RNNoise. With RNNoise enabled, the AudioWorklet buffering adds roughly 20 ms of processing latency. The actual end-to-end latency depends on network conditions and the SFU.
