Vast.ai Deployment Guide
Complete guide for deploying Hyperscape to Vast.ai GPU instances for streaming and duel arena.Overview
Vast.ai provides affordable GPU instances for running Hyperscape’s streaming pipeline. The deployment uses:- NVIDIA GPU for hardware-accelerated WebGPU rendering
- Xorg or Xvfb for display server (WebGPU requires a display)
- Chrome Dev with Vulkan backend for WebGPU support
- PulseAudio for game audio capture
- FFmpeg for H.264 encoding and RTMP streaming
- PM2 for process management and auto-restart
Prerequisites
Vast.ai Instance Requirements
GPU:- NVIDIA GPU with Vulkan support
- CUDA capability 3.5+ (most GPUs from 2014+)
- Minimum 2GB VRAM (4GB+ recommended)
- Examples: RTX 3060, RTX 4070, A4000, A5000
- Ubuntu 20.04+ or 22.04
- NVIDIA drivers pre-installed
- CUDA toolkit (optional but recommended)
- 4+ CPU cores
- 16GB+ RAM
- 50GB+ disk space
GitHub Secrets
Configure these secrets in your GitHub repository (Settings → Secrets → Actions):Automated Deployment
GitHub Actions Workflow
The deployment is automated via.github/workflows/deploy-vast.yml:
Triggers:
- Push to
mainbranch (after CI passes) - Manual trigger via
workflow_dispatch
- Enter maintenance mode (pauses duel cycles)
- Wait for pending markets to resolve (300s timeout)
- SSH to Vast.ai instance
- Write secrets to
/tmp/hyperscape-secrets.env - Run
scripts/deploy-vast.sh - Wait for server health check (120s, 30 retries)
- Exit maintenance mode (resumes operations)
Manual Deployment
Trigger manually from GitHub Actions tab:- Go to Actions → Deploy to Vast.ai
- Click “Run workflow”
- Select branch (usually
main) - Click “Run workflow”
Deployment Script
Thescripts/deploy-vast.sh script performs the full deployment:
1. Code Update
2. Secrets Restoration
Secrets are written to/tmp/hyperscape-secrets.env before git reset, then copied back:
3. System Dependencies
4. GPU Rendering Setup
Xorg Attempt (if DRI devices available):5. Chrome Installation
6. PulseAudio Setup
7. Build & Database
8. Process Management
9. Port Proxies
Vast.ai exposes different ports externally, so we use socat to proxy:10. Health Check
11. Streaming Diagnostics
After deployment, the script runs comprehensive diagnostics:Environment Variables
The deployment script exports these environment variables for PM2:PM2 Configuration
Theecosystem.config.cjs file configures the duel stack:
Troubleshooting
Deployment Fails at GPU Setup
Error: “FATAL ERROR: Cannot establish WebGPU-capable rendering mode” Causes:- NVIDIA drivers not installed
- GPU not accessible in container
- DRI/DRM devices not available
- Verify instance has NVIDIA GPU:
nvidia-smi - Check Vast.ai instance template includes NVIDIA drivers
- Try different Vast.ai instance with better GPU support
- Check Xorg logs:
cat /var/log/Xorg.99.log
Stream Not Appearing on Platforms
Symptoms:- Deployment succeeds but no stream on Twitch/Kick/X
- RTMP status shows disconnected
- Verify stream keys are correct in GitHub secrets
- Check RTMP URLs are correct (especially Kick)
- Review PM2 logs:
bunx pm2 logs hyperscape-duel - Check FFmpeg processes:
ps aux | grep ffmpeg - Verify network connectivity to RTMP servers
Audio Not in Stream
Symptoms:- Video works but no audio
- FFmpeg shows audio input errors
- Check PulseAudio is running:
pulseaudio --check - Verify chrome_audio sink:
pactl list short sinks - Check monitor device:
pactl list short sources | grep monitor - Review PulseAudio logs in PM2 output
- Restart PulseAudio:
pulseaudio --kill && pulseaudio --start
PM2 Process Crashes
Symptoms:- PM2 shows “errored” or “stopped” status
- Frequent restarts
- Check error logs:
bunx pm2 logs hyperscape-duel --err - Verify all environment variables are set:
bunx pm2 env 0 - Check memory usage:
bunx pm2 status(should be under 4GB) - Review crash-loop protection: Check restart count
- Manually restart:
bunx pm2 restart hyperscape-duel
Database Connection Fails
Symptoms:- “Database warmup failed” errors
- Server fails to start
- Verify
DATABASE_URLis set:echo $DATABASE_URL - Check database is accessible:
psql $DATABASE_URL -c "SELECT 1" - Verify secrets file exists:
cat /tmp/hyperscape-secrets.env - Check network connectivity to database host
- Review database logs for connection errors
Maintenance Mode API
For graceful deployments without interrupting active duels:Enter Maintenance Mode
Check Status
Exit Maintenance Mode
Monitoring
PM2 Commands
Streaming Diagnostics
Log Files
Performance Optimization
GPU Memory
Monitor GPU memory usage:- Reduce stream resolution:
STREAM_CAPTURE_WIDTH=1280 STREAM_CAPTURE_HEIGHT=720 - Lower CDP quality:
STREAM_CDP_QUALITY=70 - Reduce concurrent agents:
AUTO_START_AGENTS_MAX=5
CPU Usage
Monitor CPU usage:- Reduce stream FPS:
STREAM_FPS=24 - Use faster x264 preset:
STREAM_X264_PRESET=ultrafast - Reduce concurrent agents
Network Bandwidth
Monitor network usage:- Reduce stream bitrate:
STREAM_BITRATE=3000k - Reduce resolution:
STREAM_CAPTURE_WIDTH=1280 STREAM_CAPTURE_HEIGHT=720 - Disable some RTMP destinations
Security Best Practices
- Rotate stream keys regularly - Update GitHub secrets monthly
- Use strong JWT_SECRET - Generate with
openssl rand -base64 32 - Restrict admin access - Set
ADMIN_CODEand keep it secret - Monitor logs - Check for unauthorized access attempts
- Update dependencies - Run
bun updateregularly - Firewall rules - Only expose necessary ports (35143, 35079, 35144)
Cost Optimization
Instance Selection
- RTX 3060 - Good balance of performance and cost (~$0.20/hr)
- RTX 4070 - Better performance, higher cost (~$0.35/hr)
- A4000 - Professional GPU, stable but expensive (~$0.50/hr)
Auto-Shutdown
Configure Vast.ai to auto-shutdown during low usage:- Set max idle time in Vast.ai dashboard
- Use
pm2 stopbefore shutdown to save state
Spot Instances
Use Vast.ai “interruptible” instances for lower cost:- ~50% cheaper than on-demand
- May be interrupted with 30s notice
- Good for development/testing
See Also
- docs/streaming-configuration.md - Streaming configuration reference
- docs/webgpu-requirements.md - WebGPU requirements
- docs/maintenance-mode-api.md - Maintenance mode API
- scripts/deploy-vast.sh - Deployment script
- ecosystem.config.cjs - PM2 configuration