External Services¶

Sapari depends on ~10 external services for data, payments, email, DNS, storage, and CI access. This page is a per-service runbook for provisioning each one from scratch — use it for disaster recovery or setting up a new environment. For the rationale behind each choice, see architecture decisions.

Each section lists what to create and which values end up in the server's .env.

Neon Postgres¶

Two separate projects — sapari-staging and sapari-production. Not branches. (why)

Sign up at neon.com. Free tier.
Create project sapari-<env>. Region must match the Hetzner datacenter — Hillsboro → us-west-2, Ashburn → us-east-1. (why)
Connection string: use the direct endpoint (ep-*.c-*.<region>.aws.neon.tech), not the -pooler one. (why)

Capture: DATABASE_URL in async form. Two gotchas when copying from Neon's UI:

- Replace sslmode=require with ssl=require. asyncpg uses the latter (psycopg2 uses the former).
- Drop any &channel_binding=require that Neon adds. asyncpg negotiates SCRAM channel binding during the auth handshake; passing it as a query parameter raises a TypeError.

Final form: postgresql+asyncpg://USER:PASS@ep-xxx.c-N.<region>.aws.neon.tech/dbname?ssl=require.
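The two rewrites above are easy to get wrong by hand, so here is a minimal sketch of the conversion using only the standard library. The function name to_asyncpg_url is hypothetical, not part of the Sapari codebase, and the sketch assumes the input is the psycopg2-style string that Neon's UI shows.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def to_asyncpg_url(neon_url: str) -> str:
    """Convert a connection string copied from Neon's UI to the asyncpg form.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(neon_url)
    if "-pooler" in (parts.hostname or ""):
        raise ValueError("use the direct endpoint, not the -pooler one")

    query = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == "channel_binding":
            continue  # dropped: the query param breaks the asyncpg connection
        if key == "sslmode":
            key = "ssl"  # asyncpg's spelling of psycopg2's sslmode
        query.append((key, value))

    # Swap the scheme for the SQLAlchemy asyncpg driver, keep everything else.
    return urlunsplit(
        ("postgresql+asyncpg", parts.netloc, parts.path, urlencode(query), "")
    )
```

Pasting the result into .env as DATABASE_URL should give the final form shown above; credentials and database name pass through unchanged.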

Upgrade trigger: CU-hours > 90/mo on production, or storage > 0.4 GB.

Cloudflare R2¶

Three buckets per environment:

sapari-raw (clip uploads)
sapari-exports (rendered MP4s)
sapari-assets (user-uploaded overlay media)

Staging gets sapari-staging-raw, sapari-staging-exports, sapari-staging-assets.

CF Dashboard → R2 → Create bucket (6 total, 3 per env).
Enable versioning on production buckets (one-time toggle; protects against accidental delete).
Create two API tokens — one per environment — scoped to that env's buckets with Object Read & Write permissions.

Capture: STORAGE_ACCESS_KEY_ID, STORAGE_SECRET_ACCESS_KEY, STORAGE_ENDPOINT_URL (per-account, same for all buckets), bucket names.
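As a quick reference for the naming scheme, the sketch below derives the bucket names for an environment and the S3-compatible endpoint for an account. The helper names are hypothetical; the endpoint pattern https://<account-id>.r2.cloudflarestorage.com is R2's standard S3 API endpoint, and the account ID is a placeholder you supply.

```python
BUCKET_SUFFIXES = ("raw", "exports", "assets")


def bucket_names(env: str) -> list[str]:
    """Return the three bucket names for an environment (hypothetical helper)."""
    if env == "production":
        return [f"sapari-{suffix}" for suffix in BUCKET_SUFFIXES]
    if env == "staging":
        return [f"sapari-staging-{suffix}" for suffix in BUCKET_SUFFIXES]
    raise ValueError(f"unknown environment: {env!r}")


def r2_endpoint(account_id: str) -> str:
    """S3-compatible endpoint; one per Cloudflare account, shared by all buckets."""
    return f"https://{account_id}.r2.cloudflarestorage.com"
```

The value returned by r2_endpoint is what goes into STORAGE_ENDPOINT_URL for both environments.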

Cloudflare DNS¶

One zone for sapari.io. Records:

Type	Name	Target	Proxied
A	`@` (apex)	Cloudflare Pages IP	Yes
CNAME	`www`	`sapari.io`	Yes
CNAME	`app`	`<prod-pages>.pages.dev`	Yes
CNAME	`staging`	`<staging-pages>.pages.dev`	Yes
A	`api`	Production Hetzner IPv4	No (grey cloud)
A	`api-staging`	Staging Hetzner IPv4	No (grey cloud)

Backend A records are DNS-only (grey cloud) because Caddy terminates TLS on the server and the backend firewall restricts inbound traffic to Cloudflare IPs. Proxying would double-terminate TLS.
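If you script the zone instead of clicking through the dashboard, the backend records could be expressed as request bodies for Cloudflare's DNS records endpoint (POST /zones/{zone_id}/dns_records). This is only a sketch: backend_record is a hypothetical helper, the IP is a documentation placeholder, and nothing here sends a request.

```python
def backend_record(name: str, ipv4: str) -> dict:
    """Build the JSON body for a DNS-only A record (hypothetical helper)."""
    return {
        "type": "A",
        "name": name,
        "content": ipv4,
        "ttl": 1,          # 1 means "automatic" in Cloudflare's API
        "proxied": False,  # grey cloud: Caddy terminates TLS on the server
    }


# Placeholder address from the TEST-NET-3 documentation range.
api_record = backend_record("api", "203.0.113.10")
```

The same helper covers api-staging; only the name and the Hetzner IPv4 change.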

Caddy needs an API token for the DNS-01 ACME challenge (port 80 stays closed on the server). Create it under My Profile → API Tokens → Create Token → template "Edit zone DNS". Scope: Zone:DNS:Edit on sapari.io only.

Capture: CLOUDFLARE_API_TOKEN.

Cloudflare Pages¶

Two projects — sapari-frontend and sapari-landing.

Frontend: deploys frontend/ per branch. main → app.sapari.io, staging → staging.sapari.io. Build command: npm run build. Output: dist/.
Landing: deploys landing/ from main only. main → sapari.io. Build: npm run build. Output: dist/.

Cloudflare Workers¶

One Worker per environment (sapari-proxy-staging, sapari-proxy-production) fronting the frontend origin. Handles two path prefixes — /api/* proxies to the Hetzner backend, /media/v1/<jwt> verifies a media JWT and streams bytes from R2. Everything else falls through to the Pages project. (why)

Source lives in worker/. Route patterns (the two prefixes above) are dashboard-managed, not in wrangler.toml — adding a new prefix requires both a code change and a dashboard route addition.
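To make the routing decision concrete, here is a small Python illustration of how the two prefixes partition incoming paths. It is not the Worker's code (that lives in worker/ and is not shown here); the function name and the return labels are hypothetical.

```python
def classify(path: str) -> str:
    """Say which upstream a request path would go to (illustration only)."""
    if path.startswith("/api/"):
        return "backend"  # proxied to the Hetzner backend
    if path.startswith("/media/v1/"):
        return "media"    # media JWT verified, bytes streamed from R2
    return "pages"        # falls through to the Pages project
```

A new prefix would need a branch like these in the Worker plus a matching route pattern in the dashboard; without the route, requests never reach the Worker.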

See cloudflare-workers.md for the full deploy runbook, secret management, route-pattern gotcha, and troubleshooting matrix. Follow that doc when provisioning a new env or cutting over production.

Sessions + SSE (keep-alive: true) flow through the /api/* path unchanged.

Cloudflare Access¶

Gates staging behind GitHub OAuth so the world can't poke at staging.sapari.io.

Zero Trust → Access → Applications → Add application → Self-hosted.
Application domain: staging.sapari.io (all paths).
Policies: allow your GitHub account(s) / team.
Identity provider: GitHub OAuth (one-time config in Zero Trust → Settings → Authentication).

Production is public — no Access rule.

Stripe¶

One Stripe account, two modes — test (staging) and live (production).

Test mode setup:

1. Create products + prices matching scripts/create_first_tier.py expectations.
2. Webhook endpoint → https://staging.sapari.io/api/v1/webhooks/stripe (routed via Cloudflare Worker). Events: checkout.session.completed, customer.subscription.updated, customer.subscription.deleted, invoice.payment_succeeded, invoice.payment_failed, payment_intent.succeeded.
3. Save signing secret.

Live mode setup: same steps, webhook points at https://app.sapari.io/api/v1/webhooks/stripe, separate signing secret.

Capture: STRIPE_API_KEY, STRIPE_WEBHOOK_SECRET (distinct per env).

The backend runs Stripe in test mode when STRIPE_TEST_MODE=true; production requires STRIPE_TEST_MODE=false.
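Mixing a test key with live mode (or the reverse) is an easy mistake when copying values between environments. The sketch below checks the three Stripe values against each other using Stripe's documented key prefixes (sk_test_ / sk_live_, rk_ for restricted keys, whsec_ for signing secrets). The function is hypothetical, and how the backend actually parses STRIPE_TEST_MODE is an assumption.

```python
def stripe_env_problems(env: dict[str, str]) -> list[str]:
    """Return human-readable problems with the Stripe settings (hypothetical)."""
    problems = []

    # Assumption: any value other than "true" means live mode.
    test_mode = env.get("STRIPE_TEST_MODE", "").strip().lower() == "true"
    mode = "test" if test_mode else "live"

    key = env.get("STRIPE_API_KEY", "")
    if not key.startswith((f"sk_{mode}_", f"rk_{mode}_")):
        problems.append(f"STRIPE_API_KEY is not a {mode}-mode key")

    if not env.get("STRIPE_WEBHOOK_SECRET", "").startswith("whsec_"):
        problems.append("STRIPE_WEBHOOK_SECRET does not look like a signing secret")

    return problems
```

An empty list means the values are at least mutually consistent; it does not prove the webhook secret belongs to the right endpoint.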

Postmark¶

Transactional email. One account, two message streams — outbound-staging and outbound-production.

Create account + add Sender Signature or verify the sapari.io domain.
DKIM: add the *._domainkey.sapari.io TXT record Postmark gives you to Cloudflare DNS. Start this early — DNS propagation can take hours and Postmark's DNS check must pass before DKIM signing turns on. Sapari's DMARC is p=reject, so emails without DKIM will be rejected.
Create two Server tokens (staging + production).

Capture: POSTMARK_SERVER_TOKEN (per env), EMAIL_FROM_ADDRESS (e.g., noreply@sapari.io).
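Once a token is captured, a test message through the matching stream is a cheap way to confirm the setup. The sketch below builds (but does not send) a request to Postmark's email endpoint using only the standard library. The function name, subject, and body text are illustrative; the sender default is the example address given above.

```python
import json
from urllib.request import Request


def build_test_email(token: str, stream: str, to: str,
                     sender: str = "noreply@sapari.io") -> Request:
    """Build a Postmark send request for a smoke test (hypothetical helper)."""
    body = {
        "From": sender,
        "To": to,
        "Subject": "Postmark smoke test",
        "TextBody": "If you can read this, the stream and token work.",
        "MessageStream": stream,  # e.g. outbound-staging
    }
    return Request(
        "https://api.postmarkapp.com/email",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "X-Postmark-Server-Token": token,
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would actually send the message, so only do that with a recipient you control.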

OAuth providers (end-user login)¶

Separate from Cloudflare Access: these OAuth apps let Sapari's end users log in with Google / GitHub.

Google OAuth:

1. Google Cloud Console → OAuth consent screen → External, scopes: email, profile.
2. Create OAuth Client ID (Web application). Authorized redirect URIs:
   - https://staging.sapari.io/api/v1/auth/oauth/google/callback
   - https://app.sapari.io/api/v1/auth/oauth/google/callback

GitHub OAuth:

1. GitHub → Settings → Developer settings → OAuth Apps → New.
2. One per environment. Callback URL is the same shape as Google.

Capture: OAUTH_GOOGLE_CLIENT_ID, OAUTH_GOOGLE_CLIENT_SECRET, OAUTH_GITHUB_CLIENT_ID, OAUTH_GITHUB_CLIENT_SECRET. One set per environment.
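For copy-paste convenience, the callback URLs can be generated rather than typed. This sketch assumes the GitHub path mirrors the Google one with the provider segment swapped, which is how "same shape" is read here; the helper name is hypothetical.

```python
FRONTEND_HOSTS = {
    "staging": "staging.sapari.io",
    "production": "app.sapari.io",
}


def callback_url(env: str, provider: str) -> str:
    """Build the OAuth callback URL for an environment and provider."""
    if provider not in ("google", "github"):
        raise ValueError(f"unknown provider: {provider!r}")
    host = FRONTEND_HOSTS[env]  # KeyError on an unknown environment
    return f"https://{host}/api/v1/auth/oauth/{provider}/callback"
```

For example, callback_url("staging", "google") reproduces the first redirect URI listed above.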

YouTube cookies (optional)¶

Only needed when the download worker logs show yt-dlp extractor errors like "Sign in to confirm you're not a bot". YouTube increasingly blocks extraction from datacenter IPs; passing cookies from a logged-in browser session bypasses the check. Most videos extract fine without this.

Generate cookies:

Install the "Get cookies.txt LOCALLY" Chrome extension (or equivalent). Avoid cloud-sync ones — they exfiltrate cookies.
In a Chrome profile logged in to a throwaway YouTube account (not your personal one — these credentials will live on the server), visit https://www.youtube.com.
Open the extension → Export → select Netscape format → save as youtube-cookies.txt.

Install on the server:

# From your laptop:
scp youtube-cookies.txt deploy@<server-tailnet-ip>:~/sapari/secrets/
ssh deploy@<server-tailnet-ip> 'chmod 600 ~/sapari/secrets/youtube-cookies.txt'

# Set the env var in ~/sapari/.env:
YOUTUBE_COOKIES_FILE=/run/secrets/youtube-cookies.txt

# Restart the download worker to pick up the new env + mount:
docker compose -f ~/sapari/docker-compose.prod.yml --env-file ~/sapari/.env up -d taskiq-download-worker

The download worker mounts ./secrets:/run/secrets (bind, read-write so yt-dlp can refresh tokens). Files in secrets/ are gitignored except the README.

Refresh cadence: YouTube session cookies expire; re-export every few months or whenever extraction starts failing again. Replacing the file needs no deploy: overwrite it in ~/sapari/secrets/ and restart the download worker.
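Before copying an export to the server, it can help to confirm the file really is in Netscape format and still holds unexpired YouTube cookies. A minimal sketch with the standard library's MozillaCookieJar follows; the function name is hypothetical, and on Python versions before 3.10 lines prefixed with #HttpOnly_ are treated as comments and skipped.

```python
from http.cookiejar import MozillaCookieJar


def youtube_cookie_names(path: str) -> list[str]:
    """List unexpired YouTube cookie names in a Netscape-format file.

    Raises http.cookiejar.LoadError if the file is not in Netscape format.
    """
    jar = MozillaCookieJar()
    # ignore_discard keeps session cookies; expired cookies are dropped.
    jar.load(path, ignore_discard=True, ignore_expires=False)
    return sorted(c.name for c in jar if c.domain.endswith("youtube.com"))
```

An empty result suggests the export came from a logged-out profile or has already expired, in which case re-export before replacing the file on the server.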

GHCR (GitHub Container Registry)¶

Backend images push to ghcr.io/benavlabs/sapari-backend. No provisioning needed: a workflow in the repo can push to ghcr.io/<org>/* with the built-in GITHUB_TOKEN, provided the job grants it the packages: write permission.

The server pulls public images, so no registry auth required on the server either.

Tailscale (CI access to servers)¶

CI workflows join the tailnet to reach tailnet-only SSH. (why)

One-time tailnet ACL (in Tailscale admin console → Access Controls):

{
  "tagOwners": {
    "tag:server": ["<tailnet-owner>"],
    "tag:ci":     ["<tailnet-owner>"]
  },
  "grants": [
    { "src": ["*"], "dst": ["*"], "ip": ["*"] }
  ]
}

(Allow-all today; tighten to tag:ci → tag:server:22 later if needed.)
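If you do tighten the policy later, one possible shape is sketched below as a Python dict that serializes to the policy JSON. It assumes Tailscale's grants syntax accepts "tcp:22" in the ip field, and it deliberately contains only the CI rule: with this policy alone, your own devices would lose access to the servers, so you would add further grants for them. The function name is hypothetical.

```python
import json


def tightened_policy(owner: str) -> dict:
    """Build a policy that lets tag:ci reach tag:server on SSH only."""
    return {
        "tagOwners": {
            "tag:server": [owner],
            "tag:ci": [owner],
        },
        "grants": [
            # CI runners may open TCP port 22 on tagged servers, nothing else.
            {"src": ["tag:ci"], "dst": ["tag:server"], "ip": ["tcp:22"]},
        ],
    }


if __name__ == "__main__":
    print(json.dumps(tightened_policy("<tailnet-owner>"), indent=2))
```

Review the generated JSON in the admin console's policy editor before saving, since a too-narrow policy can lock you out of tailnet-only SSH.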

Per server: in the admin console → Machines → pick the device → Edit ACL tags → add tag:server.

OAuth client for CI (Settings → OAuth clients → Generate):

- Scopes: Keys → Auth Keys → Write
- Tags: tag:ci

Capture: TS_OAUTH_CLIENT_ID, TS_OAUTH_SECRET. These go in both the staging and production GitHub environments (reused across envs).

GitHub repo settings¶

Environments (Settings → Environments):

- Create staging and production.
- Deployment branch policy for both: allow main (workflow_run always runs in default-branch context regardless of triggering branch).
- Secrets per environment: SSH_HOST (server's tailnet IP), SSH_KEY (dedicated CI SSH private key), TS_OAUTH_CLIENT_ID, TS_OAUTH_SECRET.

Summary — what you capture, where it goes¶

Every value above ends up in one of:

1. /home/deploy/sapari/.env on the server (most secrets; see backend/.env.production.example for the full list)
2. GitHub environment secrets (only SSH_HOST, SSH_KEY, TS_OAUTH_CLIENT_ID, TS_OAUTH_SECRET)
3. 1Password (or equivalent): back up everything so you can rebuild if the server is lost

Nothing else needs persisting outside those three places.
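A rebuilt server is easier to trust if the .env is checked before the first deploy. The sketch below reports which of the values named on this page are missing or empty. The key list is collected from the Capture lines above and is not authoritative (backend/.env.production.example is); the function name is hypothetical.

```python
REQUIRED_KEYS = (
    "DATABASE_URL",
    "STORAGE_ACCESS_KEY_ID",
    "STORAGE_SECRET_ACCESS_KEY",
    "STORAGE_ENDPOINT_URL",
    "CLOUDFLARE_API_TOKEN",
    "STRIPE_API_KEY",
    "STRIPE_WEBHOOK_SECRET",
    "STRIPE_TEST_MODE",
    "POSTMARK_SERVER_TOKEN",
    "EMAIL_FROM_ADDRESS",
    "OAUTH_GOOGLE_CLIENT_ID",
    "OAUTH_GOOGLE_CLIENT_SECRET",
    "OAUTH_GITHUB_CLIENT_ID",
    "OAUTH_GITHUB_CLIENT_SECRET",
)


def missing_keys(env_text: str) -> list[str]:
    """Return required keys that are absent or empty in .env file contents."""
    present = set()
    for raw in env_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, value = line.split("=", 1)
        if value.strip():
            present.add(key.strip())
    return [key for key in REQUIRED_KEYS if key not in present]
```

Reading /home/deploy/sapari/.env and passing its contents to missing_keys should return an empty list on a fully provisioned server.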