Vast.ai GPU Streaming Deployment

This guide covers deploying Hyperscape’s streaming duel arena on Vast.ai GPU servers for live RTMP broadcasting to Twitch, Kick, X/Twitter, and other platforms.

Prerequisites

Required

Vast.ai account with GPU instance
NVIDIA GPU with Vulkan support (verified via nvidia-smi)
GitHub repository with secrets configured
Stream keys for target platforms (Twitch, Kick, X/Twitter)

GPU Requirements

NVIDIA GPU: Required for WebGPU via ANGLE/Vulkan backend
Vulkan ICD: Must be available at /usr/share/vulkan/icd.d/nvidia_icd.json
DRI/DRM Access: Optional but recommended for best performance (Xorg mode)

Deployment Architecture

The deployment script (scripts/deploy-vast.sh) attempts GPU rendering modes in this order:

1. Xorg with NVIDIA (Best Performance)

Requirements: DRI/DRM device access (/dev/dri/card*)
Display: Real X server with NVIDIA GLX driver
WebGPU: Direct GPU access via NVIDIA driver
Status: DUEL_CAPTURE_USE_XVFB=false, DISPLAY=:0

2. Xvfb with NVIDIA Vulkan (Fallback)

Requirements: NVIDIA GPU accessible, no DRI/DRM needed
Display: Virtual framebuffer (Xvfb) on :99
WebGPU: Chrome uses ANGLE/Vulkan to access GPU
Status: DUEL_CAPTURE_USE_XVFB=true, DISPLAY=:99

3. Ozone Headless with GPU (Experimental)

Requirements: NVIDIA GPU with Vulkan
Display: No X server, Chrome’s --ozone-platform=headless
WebGPU: Direct Vulkan access via Chrome
Status: STREAM_CAPTURE_OZONE_HEADLESS=true, DISPLAY= (empty)

4. Headless Software Rendering (NOT SUPPORTED)

WebGPU: WILL NOT WORK
Deployment: FAILS with error message
Reason: WebGPU requires hardware GPU acceleration

WebGPU Validation

The deployment script performs comprehensive WebGPU validation:

1. GPU Hardware Check

nvidia-smi  # Verify NVIDIA GPU is accessible

2. Vulkan ICD Verification

ls /usr/share/vulkan/icd.d/nvidia_icd.json  # Check ICD file exists
cat /usr/share/vulkan/icd.d/nvidia_icd.json  # Log ICD content
VK_LOADER_DEBUG=all vulkaninfo  # Verify Vulkan loader works

3. Display Server Check

xdpyinfo -display :0  # Verify X server responds (Xorg mode)
xdpyinfo -display :99  # Verify Xvfb responds (Xvfb mode)

4. WebGPU Preflight Test

Launches Chrome with GPU flags
Navigates to blank page
Tests navigator.gpu.requestAdapter() with 30s timeout
Tests renderer.init() with 60s timeout
Extracts chrome://gpu diagnostics
Deployment fails if WebGPU cannot initialize

5. GPU Diagnostics Capture

# Captured during deployment for debugging:
- WebGPU adapter info
- Vulkan backend status
- GPU vendor/renderer
- Feature support flags

Environment Persistence

Settings are persisted to .env for PM2 restarts:

# GPU/Display configuration
DISPLAY=:99
DUEL_CAPTURE_USE_XVFB=true
STREAM_CAPTURE_OZONE_HEADLESS=false
STREAM_CAPTURE_USE_EGL=false

# Chrome executable path
STREAM_CAPTURE_EXECUTABLE=/usr/bin/google-chrome-unstable

Stream Capture Configuration

Chrome Flags (GPU Sandbox Bypass)

Required for container GPU access:

--disable-gpu-sandbox
--disable-setuid-sandbox

Capture Modes

CDP (default): Chrome DevTools Protocol screencast
WebCodecs: Native VideoEncoder API (experimental)
MediaRecorder: Legacy fallback

Timeouts & Recovery

Probe Timeout: 5s on evaluate calls to prevent hanging
Probe Retry: Proceeds after 5 consecutive timeouts (browser unresponsive)
Viewport Recovery: Automatic restoration on resolution mismatch
Browser Restart: Every 45 minutes to prevent WebGPU OOM crashes

Production Client Build

Enable for faster page loads (fixes 180s timeout issues):

NODE_ENV=production
DUEL_USE_PRODUCTION_CLIENT=true

Serves pre-built client via vite preview instead of JIT dev server.

Audio Capture

PulseAudio Configuration

# User-mode PulseAudio
XDG_RUNTIME_DIR=/tmp/pulse-runtime

# Virtual audio sink
pactl load-module module-null-sink sink_name=chrome_audio

# FFmpeg capture source
PULSE_AUDIO_DEVICE=chrome_audio.monitor
STREAM_AUDIO_ENABLED=true

RTMP Multi-Streaming

Supported Platforms

Twitch: TWITCH_STREAM_KEY
Kick: KICK_STREAM_KEY, KICK_RTMP_URL
X/Twitter: X_STREAM_KEY, X_RTMP_URL
YouTube: Disabled by default (high latency)

FFmpeg Tee Muxer

Single-encode multi-output for efficiency:

ffmpeg -i input \
  -f tee \
  "[f=flv]rtmp://twitch|[f=flv]rtmp://kick|[f=flv]rtmp://x"

Stream Encoding

# Default: film tune with B-frames
# Low-latency mode: zerolatency tune
STREAM_LOW_LATENCY=true

# GOP size (default: 60 frames)
STREAM_GOP_SIZE=60

# Buffer multiplier (default: 2x)
# Reduced from 4x to prevent backpressure buildup

Health Monitoring

# Health check timeout: 5s
# Data timeout: 15s
# Faster failure detection than previous 10s/30s

GitHub Secrets Configuration

Set these in your repository’s Settings → Secrets and variables → Actions:

Required Secrets

# Streaming keys
TWITCH_STREAM_KEY=live_123456789_abcdefghij
KICK_STREAM_KEY=your-kick-key
KICK_RTMP_URL=rtmp://ingest.kick.com/live
X_STREAM_KEY=your-x-key
X_RTMP_URL=rtmp://x-media-studio/your-path

# Database
DATABASE_URL=postgresql://user:pass@host:5432/db

# Solana
SOLANA_DEPLOYER_PRIVATE_KEY=[1,2,3,...]

# Vast.ai SSH
VAST_HOST=ssh5.vast.ai
VAST_PORT=12345
VAST_SSH_KEY=-----BEGIN OPENSSH PRIVATE KEY-----...

Deployment Workflow

Automated Deployment (GitHub Actions)

# Triggered on push to main
.github/workflows/deploy-vast.yml

Manual Deployment

# From local machine
./scripts/deploy-vast.sh

Deployment Steps

GPU Validation: Verify NVIDIA GPU and Vulkan ICD
Display Setup: Try Xorg → Xvfb → Ozone headless
WebGPU Test: Preflight check with Chrome
Environment Persistence: Save settings to .env
PM2 Configuration: Export GPU mode to ecosystem.config.cjs
Service Start: Launch game server + RTMP bridge via PM2

Monitoring & Diagnostics

Check Deployment Status

# SSH into Vast.ai instance
ssh -p $VAST_PORT root@$VAST_HOST

# Check PM2 processes
pm2 status

# View logs
pm2 logs duel-stack
pm2 logs rtmp-bridge

# Check GPU
nvidia-smi

# Check display
echo $DISPLAY
xdpyinfo -display $DISPLAY

WebGPU Diagnostics

# Check chrome://gpu output (captured during deployment)
cat /root/hyperscape/gpu-diagnostics.log

# Test WebGPU manually
google-chrome-unstable --headless=new --enable-unsafe-webgpu \
  --enable-features=WebGPU --use-vulkan \
  --dump-dom about:blank

Common Issues

WebGPU initialization hangs:

Check GPU diagnostics log
Verify Vulkan ICD is present
Ensure display server is running
Review Chrome flags in ecosystem.config.cjs

Stream not starting:

Check RTMP bridge logs: pm2 logs rtmp-bridge
Verify stream keys are set correctly
Test with single destination first
Check FFmpeg is installed: which ffmpeg

Browser crashes immediately:

GPU sandbox bypass flags missing
Check Chrome executable path
Verify GPU is accessible: nvidia-smi

Audio not captured:

Check PulseAudio is running: pactl info
Verify virtual sink exists: pactl list sinks
Check XDG_RUNTIME_DIR is set

Performance Tuning

Stream Quality

# Bitrate (default: 6000k)
STREAM_BITRATE=6000k

# Resolution (default: 1920x1080)
STREAM_WIDTH=1920
STREAM_HEIGHT=1080

# Frame rate (default: 30)
STREAM_FPS=30

Browser Performance

# Restart interval (default: 45 minutes)
# Prevents WebGPU OOM crashes
BROWSER_RESTART_INTERVAL_MS=2700000

# Page navigation timeout (default: 180s)
# Increased for production client build
PAGE_NAVIGATION_TIMEOUT_MS=180000

Resource Limits

# Activity logger queue (default: 1000)
LOGGER_MAX_ENTRIES=1000

# Session timeout (default: 30 minutes)
MAX_SESSION_TICKS=3000

# Damage event cache (default: 1000)
DAMAGE_CACHE_MAX_SIZE=1000

Security Best Practices

Never Commit Secrets

All stream keys must be in .env or GitHub Secrets
Never hardcode credentials in code
Use .gitignore to block .env files

Secret Rotation

# Rotate these regularly:
- TWITCH_STREAM_KEY
- KICK_STREAM_KEY
- X_STREAM_KEY
- STREAMING_VIEWER_ACCESS_TOKEN
- ARENA_EXTERNAL_BET_WRITE_KEY

Access Control

# Restrict SSH access to Vast.ai instance
# Use SSH keys, not passwords
# Disable root login after setup

Troubleshooting

Deployment Fails at WebGPU Test

Symptom: Deployment exits with “WebGPU initialization failed” Solutions:

Check GPU is accessible: nvidia-smi
Verify Vulkan ICD: cat /usr/share/vulkan/icd.d/nvidia_icd.json
Check display server: echo $DISPLAY && xdpyinfo -display $DISPLAY
Review GPU diagnostics: cat gpu-diagnostics.log
Try different GPU mode: Set STREAM_CAPTURE_OZONE_HEADLESS=true

Stream Stops After 45 Minutes

Symptom: Browser restarts, stream briefly interrupts Explanation: Automatic browser restart prevents WebGPU OOM crashes Solutions:

This is expected behavior (prevents crashes)
Increase interval: BROWSER_RESTART_INTERVAL_MS=3600000 (1 hour)
Monitor memory usage: pm2 monit

Page Load Timeout (>180s)

Symptom: Browser times out loading game page Solutions:

Enable production client build:

NODE_ENV=production
DUEL_USE_PRODUCTION_CLIENT=true

Increase timeout: PAGE_NAVIGATION_TIMEOUT_MS=300000
Check network latency to CDN

Audio Not Captured

Symptom: Stream has video but no audio Solutions:

Check PulseAudio: pactl info
Verify sink: pactl list sinks | grep chrome_audio
Check device: PULSE_AUDIO_DEVICE=chrome_audio.monitor
Enable audio: STREAM_AUDIO_ENABLED=true

Advanced Configuration

Custom Chrome Executable

# Use specific Chrome version
STREAM_CAPTURE_EXECUTABLE=/usr/bin/google-chrome-unstable

Low-Latency Streaming

# Enable zerolatency tune (faster playback start)
STREAM_LOW_LATENCY=true

# Reduce GOP size
STREAM_GOP_SIZE=30

Multiple RTMP Destinations

# JSON array format
RTMP_DESTINATIONS_JSON=[
  {"name":"Custom","url":"rtmp://host/live","key":"key","enabled":true}
]

Monitoring

PM2 Dashboard

pm2 monit  # Real-time monitoring
pm2 status  # Process status
pm2 logs --lines 100  # Recent logs

Resource Usage

nvidia-smi  # GPU utilization
htop  # CPU/RAM usage
df -h  # Disk space

Stream Health

# Check RTMP bridge status
curl http://localhost:8765/health

# Check game server
curl http://localhost:5555/status

# Check streaming state
curl http://localhost:5555/api/streaming/state

Cost Optimization

GPU Selection

Minimum: GTX 1060 (6GB VRAM)
Recommended: RTX 3060 (12GB VRAM)
Optimal: RTX 4070 (12GB VRAM)

Instance Configuration

vCPUs: 4-8 cores recommended
RAM: 16GB minimum, 32GB recommended
Storage: 50GB minimum for assets + logs

Spot vs On-Demand

Spot: Cheaper but can be interrupted
On-Demand: More expensive but guaranteed uptime
Recommendation: Use on-demand for production streams

Configuration

Operations

​Vast.ai GPU Streaming Deployment

​Prerequisites

​Required

​GPU Requirements

​Deployment Architecture

​1. Xorg with NVIDIA (Best Performance)

​2. Xvfb with NVIDIA Vulkan (Fallback)

​3. Ozone Headless with GPU (Experimental)

​4. Headless Software Rendering (NOT SUPPORTED)

​WebGPU Validation

​1. GPU Hardware Check

​2. Vulkan ICD Verification

​3. Display Server Check

​4. WebGPU Preflight Test

​5. GPU Diagnostics Capture

​Environment Persistence

​Stream Capture Configuration

​Chrome Flags (GPU Sandbox Bypass)

​Capture Modes

​Timeouts & Recovery

​Production Client Build

​Audio Capture

​PulseAudio Configuration

​RTMP Multi-Streaming

​Supported Platforms

​FFmpeg Tee Muxer

​Stream Encoding

​Health Monitoring

​GitHub Secrets Configuration

​Required Secrets

​Deployment Workflow

​Automated Deployment (GitHub Actions)

​Manual Deployment

​Deployment Steps

​Monitoring & Diagnostics

​Check Deployment Status

​WebGPU Diagnostics

​Common Issues

​Performance Tuning

​Stream Quality

​Browser Performance

​Resource Limits

​Security Best Practices

​Never Commit Secrets

​Secret Rotation

​Access Control

​Troubleshooting

​Deployment Fails at WebGPU Test

​Stream Stops After 45 Minutes

​Page Load Timeout (>180s)

​Audio Not Captured

​Advanced Configuration

​Custom Chrome Executable

​Low-Latency Streaming

​Multiple RTMP Destinations

​Monitoring

​PM2 Dashboard

​Resource Usage

​Stream Health

​Cost Optimization

​GPU Selection

​Instance Configuration

​Spot vs On-Demand

​See Also