Vast.ai GPU Streaming Deployment Guide
Complete guide for deploying Hyperscape GPU streaming to Vast.ai with WebGPU rendering.
Overview
Hyperscape streams live gameplay to Twitch, Kick, and X/Twitter using GPU-accelerated WebGPU rendering. This guide covers the complete deployment process for Vast.ai GPU instances.
Requirements
Hardware
- NVIDIA GPU with Vulkan support (RTX 3060 or better recommended)
- 8GB+ VRAM for WebGPU rendering
- 16GB+ RAM for Chrome, FFmpeg, and game server
- 50GB+ storage for Docker images and game assets
Software
- Ubuntu 22.04 or newer
- NVIDIA drivers 525+ with Vulkan support
- Docker for PostgreSQL
- Xorg or Xvfb display server (headless NOT supported)
- PulseAudio for audio capture
- FFmpeg with H.264 encoding support
- Chrome Dev channel with WebGPU enabled
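As a quick pre-flight check for the software list above, the following minimal sketch reports which commands are missing from `PATH`. The helper name `missing_tools` is hypothetical and not part of the deployment script, and the binary names in the usage line are assumptions about how each package is installed.

```shell
# Hypothetical helper: print every command name in the argument list
# that cannot be found on PATH. Prints nothing when all are present.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example usage (binary names are assumptions; adjust to your image):
#   missing_tools nvidia-smi vulkaninfo docker ffmpeg pulseaudio Xvfb
```

This confirms only that the binaries exist. It does not verify driver versions, Vulkan support, or that Chrome has WebGPU enabled.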
Architecture
Rendering Pipeline
GPU Rendering Modes
The deployment script tries these modes in order:
1. Xorg with NVIDIA (preferred):
   - Direct GPU access via DRI/DRM devices
   - Best performance and compatibility
   - Requires /dev/dri/card0 access
2. Xvfb with NVIDIA Vulkan (fallback):
   - Virtual framebuffer for X11 protocol
   - Chrome uses GPU via ANGLE/Vulkan
   - Works without DRM device access
3. Ozone Headless with GPU (experimental):
   - Wayland-like headless rendering
   - May work when X11/Xvfb fails
   - Requires GPU sandbox bypass
4. Headless mode: NOT SUPPORTED
   - WebGPU requires display server
   - Deployment FAILS if no GPU mode works
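The fallback order above can be sketched as a small selection function. This is an illustration only, not the logic in the deployment script: the function name, its arguments, and the checks (device readability, presence of an Xvfb binary) are simplifying assumptions, and a real probe would likely need to start the display server and test WebGPU.

```shell
# Hypothetical sketch of the mode fallback order.
#   $1 = DRM device path (default /dev/dri/card0)
#   $2 = Xvfb binary name (default Xvfb)
#   $3 = "true" to allow the experimental Ozone headless mode (default false)
select_gpu_mode() {
  drm_device="${1:-/dev/dri/card0}"
  xvfb_bin="${2:-Xvfb}"
  allow_ozone="${3:-false}"

  if [ -r "$drm_device" ] && [ -w "$drm_device" ]; then
    echo "xorg"             # direct GPU access through the DRM device
  elif command -v "$xvfb_bin" >/dev/null 2>&1; then
    echo "xvfb"             # virtual framebuffer, GPU via ANGLE/Vulkan
  elif [ "$allow_ozone" = "true" ]; then
    echo "ozone-headless"   # experimental, last resort
  else
    echo "no usable GPU rendering mode" >&2
    return 1                # mirrors "deployment fails if no GPU mode works"
  fi
}
```

Calling `select_gpu_mode` with no arguments checks the default device path and binary name, and a non-zero exit status signals that the deployment should stop.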
Deployment Steps
1. Create Vast.ai Instance
Recommended specs:
- GPU: RTX 3060 or better
- RAM: 16GB+
- Storage: 50GB+
- Image: Ubuntu 22.04 with NVIDIA drivers
2. Configure GitHub Secrets
Set these secrets in your GitHub repository (Settings → Secrets → Actions):
Required:
- VAST_HOST - Instance IP address
- VAST_PORT - SSH port (usually 22)
- VAST_SSH_KEY - Private SSH key for authentication
- DATABASE_URL - PostgreSQL connection string
- TWITCH_STREAM_KEY - Twitch stream key
- KICK_STREAM_KEY - Kick stream key
- KICK_RTMP_URL - Kick RTMP URL
- X_STREAM_KEY - X/Twitter stream key
- X_RTMP_URL - X/Twitter RTMP URL
- SOLANA_DEPLOYER_PRIVATE_KEY - Solana keypair (base58)
- JWT_SECRET - JWT signing secret
- ARENA_EXTERNAL_BET_WRITE_KEY - Arena betting API key
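Before triggering a deployment, it can help to confirm that none of these values is empty. The sketch below assumes the secrets have been exported as environment variables with the same names, which depends on how the workflow maps them. The helper `missing_vars` is hypothetical.

```shell
# Secret names copied from the list above.
REQUIRED_VARS="VAST_HOST VAST_PORT VAST_SSH_KEY DATABASE_URL TWITCH_STREAM_KEY \
KICK_STREAM_KEY KICK_RTMP_URL X_STREAM_KEY X_RTMP_URL \
SOLANA_DEPLOYER_PRIVATE_KEY JWT_SECRET ARENA_EXTERNAL_BET_WRITE_KEY"

# Hypothetical helper: print the name of every required variable that is
# unset or empty. Values are never printed, only names.
missing_vars() {
  for name in $REQUIRED_VARS; do
    # Indirect lookup: read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
    fi
  done
}

# Example usage:
#   if [ -n "$(missing_vars)" ]; then echo "missing secrets" >&2; exit 1; fi
```

Printing only the names keeps secret values out of CI logs.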
3. Deploy via GitHub Actions
Automatic deployment:
4. Verify Deployment
Check GPU access:
Configuration
Environment Variables
The deployment script auto-configures these variables in .env:
GPU Configuration:
PM2 Configuration
ecosystem.config.cjs is pre-configured with optimal settings:
- Auto-restart on crash
- Memory limit: 4GB
- Crash-loop protection
- Log rotation
- Environment variable passthrough
Monitoring
Health Checks
RTMP Status:
Alerts
Webhook alerts (optional):
- Server crash
- Database connection failure
- RTMP stream failure
- WebGPU initialization failure
Troubleshooting
WebGPU Initialization Failed
Symptoms:
- Black frames in stream
- Browser hangs on page load
- “WebGPU preflight FAILED” in logs
1. Update NVIDIA drivers:
2. Reinstall Vulkan ICD:
3. Restart display server:
4. Re-run deployment:
Browser Timeout on Page Load
Symptoms:
- “Navigation timeout” in logs
- Page takes >180s to load
- Stream never starts
Resolution Mismatch
Symptoms:
- Stream shows wrong resolution
- “Resolution mismatch” warnings in logs
- Viewport size doesn’t match stream dimensions
No Audio in Stream
Symptoms:
- Video works but no audio
- FFmpeg shows audio stream as silent
1. Restart PulseAudio:
2. Recreate audio sink:
3. Verify environment:
4. Restart PM2:
CDP Capture Stalls
Symptoms:
- Stream freezes
- “CDP capture stalled” in logs
- No traffic for 30+ seconds
Recovery sequence:
- Soft recovery: Restart CDP screencast (~2-3s, no gap)
- Hard recovery: Full browser restart (~10-15s, visible gap)
- Fallback: Switch to MediaRecorder after max failures
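The escalation from soft recovery to hard recovery to the MediaRecorder fallback can be pictured as a function of the consecutive stall count. The sketch below is illustrative only: the function name, the action labels, and the thresholds (soft on the first stall, fallback after 5 by default) are assumptions, not values taken from the project.

```shell
# Hypothetical escalation policy for CDP capture stalls.
#   $1 = number of consecutive stalls observed so far
#   $2 = max failures before switching to MediaRecorder (assumed default: 5)
next_recovery_action() {
  failures="$1"
  max_failures="${2:-5}"

  if [ "$failures" -ge "$max_failures" ]; then
    echo "fallback-mediarecorder"    # give up on CDP capture
  elif [ "$failures" -le 1 ]; then
    echo "soft-restart-screencast"   # cheap, no visible gap
  else
    echo "hard-restart-browser"      # slower, visible gap in the stream
  fi
}
```

Trying the cheap action first keeps most stalls invisible to viewers, and the failure cap prevents an endless restart loop.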
RTMP Connection Failed
Symptoms:
- “RTMP connection failed” in logs
- Stream not visible on platform
- FFmpeg errors
1. Verify stream keys:
2. Test RTMP connection:
3. Check firewall:
4. Restart PM2:
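One cheap check before testing the connection itself is whether the configured RTMP URLs are well-formed. The helper below is a hypothetical sketch that only inspects the URL scheme. It does not contact the platform or validate the stream key.

```shell
# Hypothetical helper: succeed when the argument starts with rtmp:// or
# rtmps:// and has at least one character after the scheme.
is_rtmp_url() {
  case "$1" in
    rtmp://?* | rtmps://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage with the variables named in this guide:
#   is_rtmp_url "$KICK_RTMP_URL" || echo "KICK_RTMP_URL looks malformed" >&2
#   is_rtmp_url "$X_RTMP_URL"    || echo "X_RTMP_URL looks malformed" >&2
```

A typo such as an `https://` prefix or an empty value is caught here without waiting for FFmpeg to fail.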
High Memory Usage
Symptoms:
- PM2 restarts due to memory limit
- “max_memory_restart” in logs
- System OOM killer
1. Increase memory limit:
2. Enable memory trimming:
3. Reduce stream quality:
4. Restart browser periodically:
   - Automatic browser rotation every 1 hour (built-in)
   - Clears WebGPU memory leaks
Performance Tuning
Low Latency Streaming
For faster playback start (higher bitrate):
High Quality Streaming
For better compression (slower playback start):
Stability Tuning
For more stable streaming (slower recovery):
CPU Optimization
For lower CPU usage:
Monitoring
Real-Time Monitoring
PM2 Dashboard:
Log Analysis
WebGPU Diagnostics:
Maintenance
Update Deployment
Via GitHub Actions:
Restart Services
Restart PM2:
Clean Up
Clear logs:
Security
SSH Access
Use SSH keys (not passwords):
Stream Keys
NEVER commit stream keys to repository:
- Set via GitHub Secrets
- Store in .env file (gitignored)
- Rotate keys regularly
Firewall
Open required ports:
Cost Optimization
Instance Selection
Recommended specs:
- RTX 3060: ~$0.20/hour
- RTX 3070: ~$0.25/hour
- RTX 3080: ~$0.30/hour
Cost-saving tips:
- Use spot instances (cheaper but can be interrupted)
- Stop instance when not streaming
- Use lower resolution (720p vs 1080p)
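To see how much stopping the instance saves, the approximate hourly rates listed above can be turned into a period estimate. The helper name `estimate_cost` is hypothetical, and the calculation covers GPU rental only, since storage and bandwidth charges are not given in this guide.

```shell
# Hypothetical helper: estimated rental cost in USD.
#   $1 = hourly rate in USD
#   $2 = streaming hours per day
#   $3 = number of days (default 30)
estimate_cost() {
  LC_ALL=C awk -v rate="$1" -v hours="$2" -v days="${3:-30}" \
    'BEGIN { printf "%.2f\n", rate * hours * days }'
}

# Example usage:
#   estimate_cost 0.20 24   # RTX 3060 running around the clock for 30 days
#   estimate_cost 0.20 4    # same GPU, 4 hours of streaming per day
```

At roughly $0.20/hour, continuous operation comes to about $144 over 30 days, while 4 hours a day comes to about $24.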
Resource Usage
Typical usage:
- CPU: 15-25%
- GPU: 30-50%
- RAM: 2-3GB
- Network: 5-10 Mbps upload
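The upload figure translates directly into data volume, which matters on instances with metered bandwidth. The sketch below converts a sustained bitrate into decimal gigabytes per hour. The helper name is hypothetical, and the result ignores protocol overhead.

```shell
# Hypothetical helper: GB uploaded per hour at a sustained bitrate.
#   $1 = upload bitrate in Mbps
# Mbps * 3600 s = megabits per hour; / 8 = megabytes; / 1000 = gigabytes.
upload_gb_per_hour() {
  LC_ALL=C awk -v mbps="$1" \
    'BEGIN { printf "%.2f\n", mbps * 3600 / 8 / 1000 }'
}

# Example usage:
#   upload_gb_per_hour 5    # low end of the typical range
#   upload_gb_per_hour 10   # high end of the typical range
```

The typical 5-10 Mbps range therefore corresponds to about 2.25-4.5 GB uploaded per hour of streaming.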
Advanced Configuration
Custom Chrome Executable
Custom RTMP Destinations
WebCodecs Mode (Experimental)
Benefits:
- Native browser encoding
- FFmpeg stream copy (no re-encoding)
- Lower CPU usage
Limitations:
- Experimental API
- May not work on all GPUs
- Falls back to CDP if initialization fails
Ozone Headless Mode (Experimental)
Benefits:
- GPU rendering without X11
- May work when Xorg/Xvfb fails
Limitations:
- Experimental
- Requires GPU sandbox bypass
- Not all GPUs support it
Support
- Deployment Issues: Check scripts/deploy-vast.sh logs
- Streaming Issues: See AGENTS.md for architecture details
- WebGPU Issues: See WEBGPU-MIGRATION.md for migration guide
- Bug Reports: GitHub Issues
Related Documentation
- AGENTS.md - GPU streaming architecture
- CLAUDE.md - Development guide
- WEBGPU-MIGRATION.md - WebGPU migration guide
- packages/server/README.md - Server configuration
- .env.example - Environment variables reference
Related Commits
- 47782ed - Enforce WebGPU-only mode
- ff45217 - Add WebGPU initialization timeout
- d5c6884 - Add WebGPU diagnostics and preflight test
- 47167b6 - Remove hardcoded secrets, add resolution tracking
- 4be263a - Use production client build for faster page loads
- 432ff84 - Proceed with capture after 5 consecutive probe timeouts
- 37ad042 - Add Playwright WebGPU test and DOM-based result capture