Storage¶
Sapari uses Cloudflare R2 for object storage. R2 is S3-compatible, so we use boto3 and standard S3 patterns. This page covers how we organize files and handle uploads/downloads.
Buckets¶
We use three buckets to separate concerns:
| Bucket | Env Var | Contents |
|---|---|---|
| Raw clips | STORAGE_BUCKET_RAW | Original videos, extracted audio, proxies, waveforms |
| Exports | STORAGE_BUCKET_EXPORTS | Rendered final videos |
| Assets | STORAGE_BUCKET_ASSETS | User-uploaded assets (images, videos, audio) |
The separation lets us apply different retention policies and access controls. Raw clips might be cleaned up after 30 days of inactivity, while exports are kept longer. Assets are retained until explicitly deleted.
Key Structure¶
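Clip and asset keys start with a prefix taken from the first 2 characters of the file's UUID. This spreads objects across key prefixes so S3/R2 can distribute them across partitions for better performance at scale. Export keys are grouped by project UUID and export UUID instead.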
# Clips bucket
clips/ab/ab123456-.../{original_filename}
clips/ab/ab123456-.../audio.mp3
clips/ab/ab123456-.../proxy.mp4
clips/ab/ab123456-.../sprite.jpg
clips/ab/ab123456-.../waveform.json
# Exports bucket
exports/{project_uuid}/{export_uuid}/Final Cut v1.mp4
# Assets bucket
assets/ab/ab123456-.../{original_filename}
The ClipFile model stores these paths in storage_key, audio_key, proxy_key, and sprite_key fields; sprite_seconds_per_tile (integer, nullable) persists the density chosen at generation time so the frontend can map clip-time to sprite tile. The AssetFile model stores its path in storage_key.
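As a rough illustration of the clip layout above, a key could be derived from the clip UUID like this. The helper name build_clip_key is hypothetical and not taken from the codebase; only the clips/{first 2 characters}/{uuid}/{filename} shape comes from the examples on this page.

```python
from uuid import UUID


def build_clip_key(clip_uuid: UUID, filename: str) -> str:
    """Hypothetical helper: build a prefixed key for the raw clips bucket."""
    uuid_str = str(clip_uuid)
    prefix = uuid_str[:2]  # first 2 characters of the UUID form the prefix
    return f"clips/{prefix}/{uuid_str}/{filename}"


# Example UUID made up for illustration
example_uuid = UUID("ab123456-0000-4000-8000-000000000000")
print(build_clip_key(example_uuid, "proxy.mp4"))
```

The same shape with an assets/ root would cover the assets bucket.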
Upload Flow¶
We use presigned URLs for uploads so files go directly to R2 without passing through our API servers. This keeps large video payloads off the API, which makes uploads faster and saves server bandwidth.
# 1. Client requests presigned URL
POST /api/v1/projects/{uuid}/clips/presign
{
  "filename": "recording.mp4",
  "content_type": "video/mp4",
  "size_bytes": 104857600
}
# 2. Server creates records and returns presigned PUT URL
{
  "clip_uuid": "ab123456-...",
  "upload_url": "https://r2.cloudflare.../clips/ab/ab123456.../recording.mp4?X-Amz-...",
  "content_type": "video/mp4",
  "expires_in": 3600
}
# 3. Client PUTs file bytes directly to R2
PUT {upload_url}
Content-Type: video/mp4
Body: <file bytes>
# 4. Client confirms upload completed
POST /api/v1/projects/{uuid}/clips/{clip_uuid}/confirm
The presigned PUT is valid for 1 hour. R2 does not implement PostObject, so the size cap cannot be bound into the signature (S3-on-AWS would use presigned POST with content-length-range; that's unreachable here). On confirm, the backend HEADs the object and re-checks the user's storage quota against the actual uploaded size, not the client-declared size — this HEAD-based recheck is what actually enforces the cap. If the upload fails, the client can request a new PUT URL.
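The confirm-time recheck can be sketched as a small pure function. This is a minimal sketch, not the backend's actual code: ObjectMetadata here is a stand-in with only the size field, QuotaExceeded and recheck_quota_on_confirm are hypothetical names, and what the backend does with an oversized object after rejecting it is not covered on this page.

```python
from dataclasses import dataclass


@dataclass
class ObjectMetadata:
    """Stand-in for the metadata returned by a HEAD request on the uploaded object."""
    size_bytes: int


class QuotaExceeded(Exception):
    """Hypothetical error raised when the real upload does not fit the quota."""


def recheck_quota_on_confirm(
    used_bytes: int, quota_bytes: int, metadata: ObjectMetadata
) -> int:
    """Return the new usage total, using the actual size rather than the declared one."""
    new_total = used_bytes + metadata.size_bytes
    if new_total > quota_bytes:
        raise QuotaExceeded(f"{new_total} bytes exceeds quota of {quota_bytes} bytes")
    return new_total


# A client that declared 100 bytes at presign but uploaded 400 is judged on the 400.
print(recheck_quota_on_confirm(used_bytes=100, quota_bytes=500,
                               metadata=ObjectMetadata(size_bytes=400)))
```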
Download Flow¶
Downloads split into two paths depending on the content type.
Exports and asset playback — presigned URL, direct to R2¶
One-shot downloads (exported videos) and asset playback (b-roll, overlays) still use presigned R2 URLs:
# Request download URL for an export
GET /api/v1/exports/{uuid}/download
# Response
{
  "url": "https://r2.cloudflare.../exports/...",
  "expires_in": 3600,
  "filename": "Final Cut v1.mp4"
}
The frontend redirects to this URL or uses it in a download link. Asset playback follows the same pattern via GET /api/v1/assets/{uuid}/video-url — asset migration to the Worker path is tracked as a post-launch follow-up in R2_MEDIA_PROXY_PLAN.md.
Clip playback — Worker-fronted, JWT-authenticated¶
Clip playback routes through a Cloudflare Worker at /media/v1/<jwt> rather than handing the browser a presigned R2 URL directly. The backend mints a short-lived HS256 JWT (MEDIA_TOKEN_TTL_SECONDS, default 300); the Worker verifies it with a kid-based secret registry, fetches bytes from R2 via a native binding, and streams them back through Cloudflare's edge cache. Three reasons for the architecture:
- Per-request authorization. Presigned URLs are valid-until-expiry for anyone who sees them. JWT + Worker verification lets ownership checks run on every request.
- Edge caching. The Cache API stores byte-range responses per-colo. Repeat views share bandwidth rather than re-fetching from R2.
- Shorter leak window. 5-minute JWTs limit blast radius if a URL ever escapes a log or screenshot.
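To make the token mechanics concrete, here is a minimal sketch of minting and verifying an HS256 JWT with a kid-based secret lookup, using only the Python standard library. It is an illustration under assumptions: the real Worker runs on Cloudflare rather than in Python, the function names and the "key" claim are hypothetical, and the ownership checks mentioned above are left out.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Dict, Optional


def _b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: bytes, signing_input: str) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def mint_media_token(claims: dict, kid: str, secret: bytes,
                     ttl_seconds: int = 300, now: Optional[int] = None) -> str:
    """Build header.payload.signature with an expiry ttl_seconds from now."""
    issued_at = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT", "kid": kid}
    payload = {**claims, "iat": issued_at, "exp": issued_at + ttl_seconds}
    signing_input = ".".join(
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url_encode(_sign(secret, signing_input))}"


def verify_media_token(token: str, secrets: Dict[str, bytes],
                       now: Optional[int] = None) -> dict:
    """Look up the secret by kid, check the signature, then check expiry."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(_b64url_decode(header_b64))
    if header.get("alg") != "HS256" or header.get("kid") not in secrets:
        raise ValueError("unsupported algorithm or unknown kid")
    expected = _sign(secrets[header["kid"]], f"{header_b64}.{payload_b64}")
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    current = int(time.time()) if now is None else now
    if current >= payload["exp"]:
        raise ValueError("token expired")
    return payload
```

Keying the secrets by kid is what would let a new secret be introduced while tokens signed with the previous one are still being accepted during rotation.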
# Request playback URL for a clip
GET /api/v1/projects/{project_uuid}/clips/{clip_uuid}/proxy
# Response
{
  "url": "https://staging.sapari.io/media/v1/<jwt>",
  "expires_in": 300,
  "sprite": {
    "url": "https://staging.sapari.io/media/v1/<sprite-jwt>",
    "expires_in": 300,
    "tile_width_px": 160,
    "tile_height_px": 90,
    "tiles_per_row": 10,
    "total_tiles": 200,
    "seconds_per_tile": 1
  }
}
The sprite field is null until proxy generation completes; once the ClipFile has sprite_key + sprite_seconds_per_tile populated, the response includes a minted sprite URL alongside the proxy URL. Sprite responses ship with Cache-Control: public, max-age=31536000, immutable because the URL is content-addressable (clip UUID + filename); proxy/original video responses keep the default 1-hour max-age.
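The sprite metadata is enough to map a clip time to a tile. The sketch below shows one way to do it, written in Python for illustration even though the real consumer is the frontend. It assumes tiles are laid out row by row from the top left and that tile i covers the interval starting at i * seconds_per_tile; neither detail is confirmed on this page, and SpriteSheet and tile_offset are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SpriteSheet:
    """Mirrors the sprite fields of the proxy response."""
    tile_width_px: int
    tile_height_px: int
    tiles_per_row: int
    total_tiles: int
    seconds_per_tile: int


def tile_offset(sprite: SpriteSheet, clip_time_seconds: float) -> Tuple[int, int]:
    """Return the (x, y) pixel offset of the tile covering the given clip time."""
    index = int(clip_time_seconds // sprite.seconds_per_tile)
    # Clamp so times before 0 or past the last tile still resolve to a valid tile
    index = max(0, min(index, sprite.total_tiles - 1))
    row, col = divmod(index, sprite.tiles_per_row)  # assumed row-major layout
    return col * sprite.tile_width_px, row * sprite.tile_height_px


sheet = SpriteSheet(tile_width_px=160, tile_height_px=90, tiles_per_row=10,
                    total_tiles=200, seconds_per_tile=1)
print(tile_offset(sheet, 12.7))  # tile 12: second row, third column
```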
See R2_MEDIA_PROXY_PLAN.md for the full Stage 0-6 migration plan, TIER_3_SPRITE_PLAN.md for the sprite design, and docs/operations/media-token-rotation.md for the annual rotation runbook.
Storage Quotas¶
Each tier has a storage quota. It is checked at presign time against the client-declared size and rechecked on confirm against the actual uploaded size. Only user-uploaded files count toward the quota; YouTube imports and system-generated files (proxy, audio) are excluded.
| Tier | Quota | What counts |
|---|---|---|
| Free | 500 MB | ClipFile.size_bytes + AssetFile.size_bytes (user uploads only) |
| Hobby | 2 GB | Same |
| Creator | 25 GB | Same |
| Viral | 100 GB | Same |
How it works:
- Presign: checks user.storage_used_bytes + request.size_bytes against tier quota. Rejects with 422 if over limit.
- Confirm: atomically increments user.storage_used_bytes by file.size_bytes.
- Delete: atomically decrements (with greatest(x, 0) guard) if the file is the last reference and not a YouTube import.
- Reconciliation: daily cron (reconcile_storage_usage, 3 AM) recalculates from actual SUM queries and corrects drift.
The cached counter (User.storage_used_bytes) avoids expensive JOIN queries on every presign. Exports are excluded from the upload quota (handled separately by tier-based retention).
Tier quotas are defined in MB (entitlement/constants.py: TIER_STORAGE_MB) and converted to bytes using BYTES_PER_MB from common/constants.py at presign time.
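The presign check and the guarded decrement reduce to a little arithmetic, sketched below. The values are assumptions for illustration: BYTES_PER_MB is taken as 1024 * 1024 and the GB tiers as 1024 MB per GB, while the real constants live in the files named above. The function names are hypothetical, and the real increments and decrements are atomic database updates, which plain Python does not model.

```python
# Assumption: binary megabytes; the real value is defined in common/constants.py
BYTES_PER_MB = 1024 * 1024

# Assumption: GB tiers expressed as 1024 MB per GB; real values are in TIER_STORAGE_MB
TIER_STORAGE_MB = {
    "free": 500,
    "hobby": 2 * 1024,
    "creator": 25 * 1024,
    "viral": 100 * 1024,
}


def fits_in_quota(tier: str, storage_used_bytes: int, size_bytes: int) -> bool:
    """Presign-time check: would this upload stay within the tier quota?"""
    quota_bytes = TIER_STORAGE_MB[tier] * BYTES_PER_MB
    return storage_used_bytes + size_bytes <= quota_bytes


def decrement_usage(storage_used_bytes: int, size_bytes: int) -> int:
    """Delete-time update: never let the cached counter drop below zero."""
    return max(storage_used_bytes - size_bytes, 0)


print(fits_in_quota("free", storage_used_bytes=0, size_bytes=104857600))
```

The zero floor matters because the counter is a cache: if it has drifted below the true total, a delete should not push it negative before the daily reconciliation corrects it.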
Local Development¶
For local development, we use MinIO as an S3-compatible object store. Docker Compose sets it up automatically.
There's a quirk with presigned URLs in Docker: the URL generated inside the container points to http://minio:9000 (the Docker network hostname), but browsers need http://localhost:9000.
The StorageClient handles this by remapping URLs:
# Internal (Docker network)
http://minio:9000/bucket/key?signature...
# Remapped for browser
http://localhost:9000/bucket/key?signature...
This happens automatically based on the STORAGE_PUBLIC_ENDPOINT env var.
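The rewrite itself can be sketched with urllib.parse. This is a minimal sketch of the string transformation only, with a hypothetical function name. SigV4 presigned URLs normally include the host in what is signed, so how StorageClient keeps the signature valid after the host changes is not something this sketch covers.

```python
from urllib.parse import urlsplit, urlunsplit


def remap_presigned_url(url: str, public_endpoint: str) -> str:
    """Swap the scheme and host of a presigned URL for those of the public endpoint."""
    original = urlsplit(url)
    public = urlsplit(public_endpoint)
    # Keep path, query string (signature parameters) and fragment untouched
    return urlunsplit(
        (public.scheme, public.netloc, original.path, original.query, original.fragment)
    )


print(remap_presigned_url(
    "http://minio:9000/bucket/key?X-Amz-Signature=abc",
    "http://localhost:9000",
))
```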
Storage Client¶
The StorageClient class wraps boto3 and exposes async helpers for presigned uploads and downloads, object metadata lookups, and direct file uploads:
from pathlib import Path

from src.infrastructure.storage import get_storage_client

# settings is the application settings object (see Key Files below).
# The await calls assume an async context, such as a request handler.
storage = get_storage_client()

# Generate presigned PUT URL (R2 doesn't implement PostObject, so size cannot be signed in)
upload = await storage.generate_upload_url(
    bucket=settings.STORAGE_BUCKET_RAW,
    key="clips/ab/abc123/video.mp4",
    content_type="video/mp4",
    expires_in=3600,
)
# Returns PresignedUpload(url=..., key=..., bucket=..., content_type=..., expires_in=...)

# Read object metadata (for the post-upload quota recheck, the authoritative size enforcement)
metadata = await storage.head_object(
    bucket=settings.STORAGE_BUCKET_RAW,
    key="clips/ab/abc123/video.mp4",
)
# Returns ObjectMetadata(size_bytes=..., content_type=..., etag=...)

# Generate presigned download URL
url = await storage.generate_presigned_download(
    bucket=settings.STORAGE_BUCKET_EXPORTS,
    key="exports/proj123/exp456/output.mp4",
    expires_in=3600,
)

# Upload file directly
await storage.upload_file(
    bucket=settings.STORAGE_BUCKET_RAW,
    key="clips/ab/abc123/audio.mp3",
    file_path=Path("/tmp/audio.mp3"),
    content_type="audio/mpeg",
)
Key Files¶
| Component | Location |
|---|---|
| Storage client | backend/src/infrastructure/storage/client.py |
| Storage settings | backend/src/infrastructure/config/settings.py |
| Clip presign endpoint | backend/src/interfaces/api/v1/clips.py |
| Asset presign endpoint | backend/src/interfaces/api/v1/assets.py |
| Export download endpoint | backend/src/interfaces/api/v1/exports.py |