Self-hosting autonomous AI agents on your own VPS keeps sensitive data off third-party clouds, gives you predictable latency, and lets you route tasks to the cheapest or most reliable model at any moment. This guide walks you through a hardened, reproducible setup using OpenClaw, Docker/Compose, and a local model runtime (Ollama) as your offline/cheap fallback.
Pick and Prepare Your VPS
- Size: For CPU-only pilots, start with 4 vCPU / 8–16 GB RAM / 120 GB SSD. For GPU, pick an entry NVIDIA card (T4/L4/A10) with 16–24 GB VRAM. Bandwidth 2–5 TB/month is usually enough.
- OS: Ubuntu 22.04 LTS. Set hostname, timezone (UTC), and swap (2–4 GB) if memory is tight.
- Access: Create a non-root sudo user, add your SSH public key, disable password auth, and keep root login off. Configure DNS (A/AAAA) for the box now.
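The access and timezone steps above can be baked in at provision time with cloud-init user-data (most VPS providers accept it at creation). A minimal sketch — the `deploy` username and the public key are placeholders for your own:

```yaml
#cloud-config
timezone: UTC
ssh_pwauth: false          # disable password auth
disable_root: true         # keep root login off
users:
  - name: deploy
    groups: sudo
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/bash
    ssh_authorized_keys:
      - ssh-ed25519 AAAA... you@workstation   # replace with your key
```

If your provider doesn't support cloud-init, the same steps are a few minutes of `adduser`, `usermod -aG sudo`, and editing `~/.ssh/authorized_keys` by hand.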
Harden the Server
- Apply updates: `sudo apt update && sudo apt upgrade -y`; enable `unattended-upgrades`.
- Firewall: `ufw allow 22/tcp`, `ufw allow 80,443/tcp`, then `ufw enable`. For admin ports, prefer IP allowlists; cross-check with the OpenClaw security hardening checklist.
- Fail2Ban: enable the SSH jail to slow brute-force attempts.
- SSH hardening: keep the default port or move off 22, enforce `PasswordAuthentication no`, and set `MaxAuthTries 3`.
- Backups: snapshot after base hardening and before major upgrades.
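The SSH settings above can live in a drop-in file (Ubuntu 22.04 reads `/etc/ssh/sshd_config.d/*.conf`), which survives package upgrades better than editing the main config:

```
# /etc/ssh/sshd_config.d/99-hardening.conf
PasswordAuthentication no
PermitRootLogin no
MaxAuthTries 3
```

Validate with `sshd -t`, then `systemctl reload ssh` — and keep an existing session open while you confirm you can still log in.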
Install the Core Stack (Docker, OpenClaw, Local Models)
- Install Docker Engine and the Compose plugin (see the Docker docs). Verify with `docker run hello-world`.
- Install the OpenClaw gateway/agent per the OpenClaw VPS setup guide. Confirm `openclaw doctor` passes and the gateway reports healthy.
- Install Ollama for local models: follow DigitalOcean’s Ollama on Ubuntu guide or the official Ollama Linux guide. Start with `llama3` or `mistral` for cheap fallbacks.
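As a sketch, Ollama can also run as a Compose service alongside the rest of the stack (the image and port 11434 are the upstream defaults; the volume name is arbitrary). Binding to 127.0.0.1 keeps the model API off the public interface:

```yaml
services:
  ollama:
    image: ollama/ollama:latest     # pin a specific tag in production
    ports:
      - "127.0.0.1:11434:11434"     # default Ollama API port, loopback only
    volumes:
      - ollama_data:/root/.ollama   # persists pulled models across restarts
    restart: unless-stopped
volumes:
  ollama_data:
```

Then `docker compose exec ollama ollama pull llama3` fetches the fallback model into the volume.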
Configure Model Routing and Agents
- Register cloud providers (Anthropic/OpenAI/Gemini) plus your local Ollama endpoint in `models` (AutoGen users can self-host as described in the AutoGen install guide).
- Add routing rules: heavy reasoning/code → premium; routine summarization/drafts → local/cheap. Set retry and budget caps to prevent runaway spend, following the multi-model routing setup.
- Enable lossless logging + sandboxing; store secrets outside the repo (env vars or secret manager).
- Create a starter SEO agent chain (research → outline → draft → QC) that can fall back to local models if cloud APIs throttle.
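A routing policy along these lines might look like the following sketch. This is purely illustrative — the keys and layout are assumptions, not OpenClaw's actual schema; consult the multi-model routing setup docs for the real format:

```yaml
models:
  local:
    provider: ollama
    endpoint: "http://127.0.0.1:11434"
    model: llama3
  premium:
    provider: anthropic
    model: claude-sonnet
routes:
  - match: [code, reasoning]      # heavy tasks -> premium
    use: premium
    retries: 2
    budget_tokens: 8000           # cap to prevent runaway spend
  - match: [summarize, draft]     # routine tasks -> local
    use: local
    fallback: premium             # promote if the local model fails
```

The point is the shape: explicit task classes, a budget per route, and a declared fallback so throttled cloud APIs degrade gracefully instead of failing the chain.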
Secure Exposure (Reverse Proxy + TLS)
- Put Nginx in front of the gateway as a reverse proxy, with `proxy_pass` pointed at the OpenClaw services on localhost.
- Obtain TLS with Certbot; auto-renew via systemd timer or cron.
- Protect admin routes: basic auth or IP allowlists; rotate app passwords regularly.
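A minimal Nginx server block for the steps above might look like this sketch — the domain, upstream port 8080, and allowlisted IP are placeholders, while the certificate paths are Certbot's defaults:

```nginx
server {
    listen 443 ssl http2;
    server_name agents.example.com;

    # Certbot's default certificate locations for this domain
    ssl_certificate     /etc/letsencrypt/live/agents.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/agents.example.com/privkey.pem;

    # Admin routes: IP allowlist before proxying
    location /admin/ {
        allow 203.0.113.10;   # your workstation/VPN IP
        deny all;
        proxy_pass http://127.0.0.1:8080;
    }

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

Run `nginx -t` before reloading; `certbot --nginx` can also write the TLS stanzas for you.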
Deploy a Starter Agent Workflow
- Pull your SEO workflow container stack with Docker Compose; pin versions.
- Run an end-to-end smoke test: research a keyword, generate outline + draft, and verify logs and outputs.
- Validate fallbacks: temporarily block cloud API to confirm local Ollama handles low-stakes tasks.
Monitoring, Updates, and Recovery
- Metrics: watch CPU/GPU, RAM, disk, and network; rotate logs with `logrotate`.
- Updates: patch the OS monthly, Docker quarterly, and OpenClaw as released; snapshot before upgrades and keep rollback notes.
- Backups: offsite (object storage) for configs and content; practice a restore to a fresh VPS.
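The log rotation mentioned above is a small drop-in under `/etc/logrotate.d/`; the log path here is an assumption — point it at wherever your gateway actually writes:

```
/var/log/openclaw/*.log {
    weekly
    rotate 8
    compress
    delaycompress
    missingok
    notifempty
}
```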
FAQ
What are the minimum VPS specs for AI agents? 4 vCPU / 8–16 GB RAM is fine for CPU-only plus local small models; add GPU for faster drafts or code-heavy tasks.
Do I need a GPU? No for light drafting/QA with small local models; yes for faster, larger models or batch jobs.
How do I keep costs low while self-hosting? Route routine tasks to local/Ollama, cap token budgets per task, and auto-stop idle containers. Use premium models only for hard reasoning steps.
How do I secure agent endpoints? Terminate TLS at Nginx, restrict admin routes with auth/IP allowlists, and keep SSH key-only with fail2ban + ufw enabled.
How do I mix local and cloud models for reliability? Define a multi-model fallback chain (local → cheap cloud → premium) and set retries/timeouts in OpenClaw so tasks drop to the next provider automatically.
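The fallback chain in the last answer can be sketched in a few lines of Python. This is a standalone illustration of the pattern, not OpenClaw's API — the provider names and the `(name, callable)` interface are assumptions:

```python
import time

def call_with_fallback(task, providers, retries=1, backoff=0.5):
    """Try providers in order (local -> cheap cloud -> premium).

    `providers` is a list of (name, callable) pairs; each callable takes
    the task and returns a result, or raises on failure/timeout. Each
    provider gets `retries` extra attempts before we fall through.
    """
    failures = []
    for name, call in providers:
        for attempt in range(retries + 1):
            try:
                return name, call(task)
            except Exception as exc:
                failures.append(f"{name}#{attempt}: {exc}")
                time.sleep(backoff * attempt)  # no delay before the first retry round
    raise RuntimeError("all providers failed: " + "; ".join(failures))
```

With `providers=[("ollama", local_call), ("cheap", cheap_call), ("premium", premium_call)]`, a timeout in the local model automatically promotes the task to the next tier; per-task budget caps would wrap each callable before it goes in the list.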