Overview
Go SFU — WebRTC media forwarding with Pion
The Gryt SFU (Selective Forwarding Unit) is a Go-based WebRTC media server built with Pion WebRTC v4. It receives audio, video, and screen-share tracks from each participant and forwards them to every other peer in the room without transcoding.
Features
- Selective forwarding: Routes audio, video, and screen-share streams with no transcoding
- SVC layer-aware forwarding: Parses the Dependency Descriptor (DD) RTP header extension to extract temporal layer IDs, then selectively drops higher layers for bandwidth-constrained receivers. Falls back to blind relay for non-SVC streams
- RTCP relay + SVC adaptation: Relays receiver feedback (PLI, FIR, REMB) back to senders, and uses REMB bitrate to auto-adapt each receiver's temporal layer subscription
- Multi-codec support: Registers H.264, VP9, VP8, and AV1 codecs. The client controls codec preference via setCodecPreferences — H.264 is the default for universal hardware encoding (NVENC, Quick Sync, AMF), with AV1 available for newer GPUs. The SFU forwards whichever codec is negotiated without transcoding
- Pion WebRTC v4: ICE handling, STUN support, multi-IP NAT mapping, connection recovery
- Multi-network support: Comma-separated ICE_ADVERTISE_IP for LAN + WAN setups — clients automatically select the fastest path
- Thread-safe: Concurrent handling of multiple connections
- Lightweight: Minimal CPU and memory footprint
- Metrics: Prometheus endpoint (/metrics) with room, peer, track, and Go runtime metrics
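The layer-aware forwarding decision described above can be sketched in a few lines of Go. This is a minimal illustration, not the SFU's actual code: the `layerFilter` type, the `allLayers` sentinel, and the `hasDD` flag are hypothetical names, and the sketch assumes the temporal layer ID has already been extracted from the Dependency Descriptor.

```go
package main

import "fmt"

// allLayers is a hypothetical sentinel meaning "forward every temporal layer".
const allLayers = -1

// layerFilter holds one receiver's subscription for one track (illustrative type).
type layerFilter struct {
	maxTemporal int // highest temporal layer ID to forward, or allLayers
}

// shouldForward reports whether a packet with the given temporal layer ID
// should be sent to this receiver. hasDD is false for non-SVC streams, which
// carry no Dependency Descriptor and are therefore relayed blindly.
func (f layerFilter) shouldForward(temporalID int, hasDD bool) bool {
	if !hasDD || f.maxTemporal == allLayers {
		return true
	}
	return temporalID <= f.maxTemporal
}

func main() {
	// A bandwidth-constrained receiver limited to the base layer (T0).
	constrained := layerFilter{maxTemporal: 0}
	for tid := 0; tid <= 2; tid++ {
		fmt.Printf("T%d forwarded: %v\n", tid, constrained.shouldForward(tid, true))
	}
}
```

A real forwarder also has to handle details this sketch ignores, such as keeping the outgoing RTP stream consistent after packets are dropped.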
Project structure
sfu/
├── cmd/sfu/ # Entry point
├── internal/
│ ├── config/ # Environment-based configuration
│ ├── metrics/ # Prometheus metrics
│ ├── svc/ # SVC: DD header parser, per-receiver LayerForwarder
│ ├── websocket/ # Thread-safe WebSocket wrapper + handler
│ ├── webrtc/ # Peer connection management
│ ├── track/ # Media track lifecycle
│ └── signaling/ # Offer/answer coordination
└── pkg/types/ # Shared message structures
Getting started
cd packages/sfu
cp env.example .env
go run ./cmd/sfu
Or with the start script:
./start.sh
Environment variables
| Variable | Default | Description |
|---|---|---|
| PORT | 5005 | HTTP server port |
| STUN_SERVERS | stun:stun.l.google.com:19302 | Comma-separated STUN servers |
| ICE_UDP_MUX_PORT | — | Enable ICE UDP mux on a single UDP port (e.g. 443) |
| ICE_UDP_PORT_MIN | — | Min UDP port for WebRTC media |
| ICE_UDP_PORT_MAX | — | Max UDP port for WebRTC media |
| ICE_ADVERTISE_IP | — | Public IP(s) to advertise in ICE candidates (comma-separated for multi-network) |
| DISABLE_STUN | false | Disable server-side STUN. Only safe with host networking or 1:1 NAT (see Troubleshooting) |
| MAX_PEERS | 200 (mux) / port range size | Max concurrent peers |
| DEBUG | true | Enable debug logging |
| VERBOSE_LOG | false | Enable verbose logging (very noisy) |
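As a rough illustration of how these variables might be read, the sketch below loads a few of them with the documented defaults. The helper names (`splitList`, `envOr`, `envInt`, `defaultMaxPeers`) are hypothetical and not taken from the SFU's config package. It also assumes that "port range size" means an inclusive range, and that 200 is used when neither mux nor a port range is configured.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// splitList splits a comma-separated value and drops empty entries.
func splitList(v string) []string {
	var out []string
	for _, part := range strings.Split(v, ",") {
		if p := strings.TrimSpace(part); p != "" {
			out = append(out, p)
		}
	}
	return out
}

// envOr returns the variable's value, or def when it is unset or empty.
func envOr(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

// envInt parses an integer variable, falling back to def when unset or invalid.
func envInt(key string, def int) int {
	n, err := strconv.Atoi(os.Getenv(key))
	if err != nil {
		return def
	}
	return n
}

// defaultMaxPeers mirrors the MAX_PEERS row: 200 in mux mode, otherwise the
// size of the UDP port range (assumed inclusive here).
func defaultMaxPeers(muxPort, portMin, portMax int) int {
	if muxPort > 0 {
		return 200
	}
	if portMin > 0 && portMax >= portMin {
		return portMax - portMin + 1
	}
	return 200 // assumption: same default when no port settings are given
}

func main() {
	port := envInt("PORT", 5005)
	stun := splitList(envOr("STUN_SERVERS", "stun:stun.l.google.com:19302"))
	advertise := splitList(os.Getenv("ICE_ADVERTISE_IP"))
	maxPeers := envInt("MAX_PEERS", defaultMaxPeers(
		envInt("ICE_UDP_MUX_PORT", 0),
		envInt("ICE_UDP_PORT_MIN", 0),
		envInt("ICE_UDP_PORT_MAX", 0),
	))
	fmt.Println(port, stun, advertise, maxPeers)
}
```

For a LAN + WAN setup, ICE_ADVERTISE_IP would carry two entries, for example a private address followed by a public one, and `splitList` returns them in that order.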
WebSocket protocol
The SFU uses raw WebSocket (not Socket.IO). All messages use a JSON envelope: { "event": "<name>", "data": "<json_string>" }.
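Because data is a string that itself contains JSON, each message is encoded twice. The sketch below shows one way to wrap and unwrap such an envelope in Go. The `envelope` and `keepAlive` types and the `encode`/`decode` helpers are illustrative names, the timestamp is assumed to be an integer, and the sketch assumes data always holds JSON text (payloads documented as plain strings, such as SDP, may be carried differently).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope is the outer frame; Data carries the payload as a JSON-encoded string.
type envelope struct {
	Event string `json:"event"`
	Data  string `json:"data"`
}

// keepAlive is an example payload; the integer timestamp type is an assumption.
type keepAlive struct {
	Timestamp int64 `json:"timestamp"`
}

// encode marshals payload to JSON, then places that text in the data field.
func encode(event string, payload any) ([]byte, error) {
	inner, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}
	return json.Marshal(envelope{Event: event, Data: string(inner)})
}

// decode unwraps the envelope and unmarshals data into out.
// Events without a payload leave out untouched.
func decode(raw []byte, out any) (string, error) {
	var env envelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return "", err
	}
	if env.Data == "" {
		return env.Event, nil
	}
	return env.Event, json.Unmarshal([]byte(env.Data), out)
}

func main() {
	raw, err := encode("keep_alive", keepAlive{Timestamp: 1700000000000})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw))

	var ka keepAlive
	event, err := decode(raw, &ka)
	fmt.Println(event, ka.Timestamp, err)
}
```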
Client to SFU
| Event | Data | Description |
|---|---|---|
| client_join | { room_id, server_id, server_password, user_token, user_id } | Join a room |
| answer | SDP answer string | WebRTC answer |
| candidate | RTCIceCandidateInit JSON | ICE candidate |
| renegotiate | — | Request renegotiation (e.g. to add tracks) |
| set_layer | { track_id, max_temporal_layer } | Set max temporal layer for a track (-1 = all, 0 = T0, 1 = T0+T1, 2 = all) |
| keep_alive | { timestamp } | Keep-alive ping (every 15s) |
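To make the set_layer values concrete, here is a small sketch that parses a set_layer payload and checks it against the range listed in the table. The `setLayer` struct, `parseSetLayer`, and `describe` are hypothetical names, track_id is assumed to be a string, and rejecting out-of-range values is an assumption about reasonable behaviour rather than documented SFU behaviour.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// setLayer mirrors the documented { track_id, max_temporal_layer } payload.
type setLayer struct {
	TrackID          string `json:"track_id"`
	MaxTemporalLayer int    `json:"max_temporal_layer"`
}

// parseSetLayer decodes the data string and validates the layer range.
// Note: a missing max_temporal_layer decodes as 0 (T0) in this sketch.
func parseSetLayer(data string) (setLayer, error) {
	var msg setLayer
	if err := json.Unmarshal([]byte(data), &msg); err != nil {
		return setLayer{}, err
	}
	if msg.TrackID == "" {
		return setLayer{}, errors.New("set_layer: track_id is required")
	}
	if msg.MaxTemporalLayer < -1 || msg.MaxTemporalLayer > 2 {
		return setLayer{}, fmt.Errorf("set_layer: max_temporal_layer %d out of range [-1, 2]", msg.MaxTemporalLayer)
	}
	return msg, nil
}

// describe renders the meaning given in the table for a validated value.
func describe(maxTemporal int) string {
	switch maxTemporal {
	case 0:
		return "T0"
	case 1:
		return "T0+T1"
	default: // -1 and 2 both mean every layer
		return "all"
	}
}

func main() {
	msg, err := parseSetLayer(`{"track_id":"camera-1","max_temporal_layer":1}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(msg.TrackID, describe(msg.MaxTemporalLayer))
}
```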
SFU to Client
| Event | Data | Description |
|---|---|---|
| room_joined | — | Join success |
| room_error | Error string | Join or room error |
| offer | SDP offer string | WebRTC offer |
| candidate | RTCIceCandidateInit JSON | ICE candidate |
Track lifecycle
- Client joins room — peer connection created with recvonly transceivers for audio, video, screen-share video, and screen-share audio. Supported video codecs: H.264, VP9, VP8, AV1
- SFU sends an offer and the client replies with an answer (matching the offer and answer events in the WebSocket protocol tables)
- Client adds tracks (audio, camera, screen share) — SFU creates a LayerForwarder per track that parses the Dependency Descriptor header extension and fans out to per-receiver tracks
- For each forwarded video track, receiver RTCP (PLI/REMB) is relayed back to the sender and used to auto-adapt temporal layer subscriptions
- Clients can send set_layer to manually override their temporal layer for a track
- Client leaves — tracks, forwarders, and connections cleaned up
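The fan-out and cleanup steps in this lifecycle can be sketched as a small thread-safe structure. This is an illustration under stated assumptions, not the SFU's LayerForwarder: the `fanout` and `packet` types are hypothetical, Go channels stand in for per-receiver tracks, and dropping packets for a slow receiver instead of blocking is a design choice made for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// packet is a stand-in for an RTP packet (illustrative only).
type packet struct {
	temporalID int
	payload    []byte
}

// fanout delivers packets from one sender track to many receivers.
type fanout struct {
	mu        sync.RWMutex
	receivers map[string]chan<- packet
}

func newFanout() *fanout {
	return &fanout{receivers: make(map[string]chan<- packet)}
}

// add registers a receiver when a peer subscribes to the track.
func (f *fanout) add(peerID string, out chan<- packet) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.receivers[peerID] = out
}

// remove unregisters a receiver when its peer leaves the room.
func (f *fanout) remove(peerID string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	delete(f.receivers, peerID)
}

// forward offers p to every receiver and returns how many accepted it.
// A receiver whose buffer is full is skipped so one slow peer cannot
// stall the others.
func (f *fanout) forward(p packet) int {
	f.mu.RLock()
	defer f.mu.RUnlock()
	sent := 0
	for _, out := range f.receivers {
		select {
		case out <- p:
			sent++
		default:
		}
	}
	return sent
}

func main() {
	f := newFanout()
	alice := make(chan packet, 8)
	bob := make(chan packet, 8)
	f.add("alice", alice)
	f.add("bob", bob)
	fmt.Println("delivered to", f.forward(packet{temporalID: 0}), "receivers")

	f.remove("bob") // bob leaves the room
	fmt.Println("delivered to", f.forward(packet{temporalID: 1}), "receivers")
}
```

The read lock in `forward` lets many packets be forwarded concurrently, while joins and leaves take the write lock only briefly.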
HTTP endpoints
| Endpoint | Description |
|---|---|
| GET /health | Health check: { status, service, version, timestamp } |
| GET /metrics | Prometheus metrics |
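For orientation, a handler that returns the four documented health fields could look like the sketch below. The field values ("ok", "sfu", "dev"), the string types, and the RFC 3339 timestamp format are all assumptions, since the table only lists the field names. The `probe` helper is a hypothetical convenience that calls the handler in-process.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// health carries the documented fields; value types are assumed to be strings.
type health struct {
	Status    string `json:"status"`
	Service   string `json:"service"`
	Version   string `json:"version"`
	Timestamp string `json:"timestamp"`
}

// healthHandler writes a JSON health document with placeholder values.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(health{
		Status:    "ok",  // placeholder
		Service:   "sfu", // placeholder
		Version:   "dev", // placeholder
		Timestamp: time.Now().UTC().Format(time.RFC3339),
	})
}

// probe invokes a handler without opening a socket and decodes the response.
func probe(h http.HandlerFunc) (int, health, error) {
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodGet, "/health", nil))
	var body health
	err := json.NewDecoder(rec.Body).Decode(&body)
	return rec.Code, body, err
}

func main() {
	code, body, err := probe(healthHandler)
	fmt.Println(code, body.Status, body.Service, body.Version, err)
}
```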
Development
go mod download
go test ./...
go run -race ./cmd/sfu # with race detection
Troubleshooting
External users stuck on "connecting" (ICE failure)
If users outside your local network can reach the signaling server but voice never connects, the most likely cause is a UDP source-port mismatch. This happens when there is a NAT layer (Docker bridge, cloud VPC, etc.) between the SFU's UDP socket and the public internet that rewrites the source port.
Diagnosis — look for the SFU's ICE candidate log lines:
ICE candidate for <peer>: type=host protocol=udp address=203.0.113.10:443
ICE candidate for <peer>: type=srflx protocol=udp address=203.0.113.10:57599
If the srflx port (57599) differs from ICE_UDP_MUX_PORT (443), the NAT is remapping the port. The rewritten host candidate advertises :443, but external peers actually need to reach :57599. Without the srflx candidate, ICE checks will never succeed.
Fix — keep DISABLE_STUN=false (the default) so the SFU discovers and advertises the correct external port. DISABLE_STUN=true is only safe when the SFU has a direct, port-preserving path to the internet (host networking, bare metal, or a 1:1 NAT that preserves UDP source ports).
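The diagnosis step can be automated with a short script that scans the SFU log for srflx candidates and compares their ports with the mux port. The sketch below assumes the log lines have exactly the shape shown above and handles IPv4-style host:port addresses. The `remappedPorts` function and its regular expression are illustrative, not part of the SFU.

```go
package main

import (
	"fmt"
	"net"
	"regexp"
	"strconv"
)

// candidateRe matches the type, protocol, and address fields of a candidate log line.
var candidateRe = regexp.MustCompile(`type=(\w+) protocol=(\w+) address=(\S+)`)

// remappedPorts returns the ports of UDP srflx candidates that differ from
// muxPort, which indicates a NAT rewriting the SFU's source port.
func remappedPorts(lines []string, muxPort int) []int {
	var out []int
	for _, line := range lines {
		m := candidateRe.FindStringSubmatch(line)
		if m == nil || m[1] != "srflx" || m[2] != "udp" {
			continue
		}
		_, portStr, err := net.SplitHostPort(m[3])
		if err != nil {
			continue
		}
		port, err := strconv.Atoi(portStr)
		if err != nil {
			continue
		}
		if port != muxPort {
			out = append(out, port)
		}
	}
	return out
}

func main() {
	logLines := []string{
		"ICE candidate for peer-1: type=host protocol=udp address=203.0.113.10:443",
		"ICE candidate for peer-1: type=srflx protocol=udp address=203.0.113.10:57599",
	}
	if ports := remappedPorts(logLines, 443); len(ports) > 0 {
		fmt.Println("NAT is remapping the mux port; keep DISABLE_STUN=false. External ports:", ports)
	} else {
		fmt.Println("no port remapping detected")
	}
}
```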