Models¶
This page documents the core database models in Sapari. All models use SQLAlchemy with async support and follow a consistent pattern with UUIDs, timestamps, and soft deletion.
Core Entities¶
Sapari's main entities form the video editing workflow:
erDiagram
User ||--o{ Project : owns
User ||--o{ UserAsset : uploads
User ||--o{ AssetGroup : creates
Project ||--o{ Clip : contains
Project ||--o{ Edit : has
Project ||--o{ Draft : saves
Project ||--o{ Export : renders
Clip }o--|| ClipFile : references
Draft }o--o{ Edit : overrides
Export }o--o| Draft : "based on"
UserAsset }o--|| AssetFile : references
UserAsset }o--o{ AssetGroup : "belongs to (many-to-many)"
User¶
class User:
    id: int
    name: str # 2-30 chars
    username: str # 2-20 chars, lowercase alphanumeric + underscores (between chars), unique
    email: str # unique
    hashed_password: str
    profile_image_url: str | None
    tier_id: int | None # FK to Tier
    is_superuser: bool
    email_verified: bool # Must be True for password users to log in
    google_id: str | None # Google OAuth user ID (unique)
    github_id: str | None # GitHub OAuth user ID (unique)
    oauth_provider: str | None # "google" or "github"
    stripe_customer_id: str | None # Stripe customer reference
    storage_used_bytes: int # Cached counter of upload storage used (default 0)
    onboarding_seen: dict | None # Tour keys dismissed, e.g. {"desktop_pipeline": true} (JSON, nullable)
    locale: str # UI locale for API errors + emails (default "en"); distinct from AnalysisRun.language
Auth flow: Password-based users start with email_verified=False. A verification email is sent on signup. Login returns 403 until verified. OAuth users (Google/GitHub) get email_verified set from the provider and skip the verification step.
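The login gate described above can be sketched as a small decision function. This is an illustration only: `LoginUser` and `login_status_code` are hypothetical names, and the 401 for bad credentials is an assumption. Only the 403 for unverified users comes from this page.

```python
from dataclasses import dataclass


@dataclass
class LoginUser:
    """Hypothetical subset of the User fields that matter for the login gate."""
    email_verified: bool


def login_status_code(user: LoginUser, password_ok: bool) -> int:
    """Return the HTTP status a password login attempt would produce."""
    if not password_ok:
        return 401  # assumption: bad credentials map to 401
    if not user.email_verified:
        return 403  # documented: login returns 403 until the email is verified
    return 200
```

OAuth users never hit the 403 branch in this sketch because the provider supplies `email_verified` when the account is created.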
Storage quota: storage_used_bytes is a cached counter. It is incremented when a clip or asset upload is confirmed and decremented on delete (non-YouTube files only, and only when the last reference is removed). A daily reconciliation cron (reconcile_storage_usage, 3 AM) corrects drift. The quota is checked at presign time; exceeding the tier quota returns 422.
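A minimal sketch of the counter lifecycle, assuming a per-user `quota_bytes` resolved from the tier. `StorageAccount`, `QuotaExceeded`, and the three functions are hypothetical names, and the exact comparison used at presign time is an assumption.

```python
from dataclasses import dataclass


class QuotaExceeded(Exception):
    """Raised at presign time; an API layer would map this to HTTP 422."""


@dataclass
class StorageAccount:
    storage_used_bytes: int = 0  # cached counter from the User model
    quota_bytes: int = 0         # hypothetical: limit resolved from the user's tier


def check_presign(account: StorageAccount, upload_size_bytes: int) -> None:
    """Reject the upload before a presigned URL is issued (comparison is assumed)."""
    if account.storage_used_bytes + upload_size_bytes > account.quota_bytes:
        raise QuotaExceeded("upload would exceed the tier storage quota")


def on_upload_confirmed(account: StorageAccount, size_bytes: int) -> None:
    """Increment the cached counter once the upload is confirmed."""
    account.storage_used_bytes += size_bytes


def on_file_deleted(account: StorageAccount, size_bytes: int,
                    is_last_reference: bool, is_youtube: bool) -> None:
    """Decrement only for non-YouTube files whose last reference was removed."""
    if is_last_reference and not is_youtube:
        account.storage_used_bytes = max(0, account.storage_used_bytes - size_bytes)
```

Because the counter is only a cache, any drift between these hooks and real storage is what the daily reconciliation job is there to correct.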
Project¶
A project is the top-level container for a video editing session. Users create projects, upload clips to them, and export edited videos.
class Project:
    uuid: UUID
    user_id: int
    name: str
    status: ProjectStatus # created, analyzing, analyzed, rendering, complete, failed
    settings: dict # pacing_level, silence_threshold_ms, language, etc.
    transcript: str | None # Full transcript text (copy from active run)
    active_run_id: UUID | None # Currently active AnalysisRun
    error_message: str | None
Status Flow:
stateDiagram-v2
[*] --> CREATED
CREATED --> ANALYZING: Trigger analysis
ANALYZING --> ANALYZED: Analysis complete
ANALYZING --> FAILED: Analysis error
ANALYZED --> RENDERING: Trigger render
RENDERING --> COMPLETE: Render complete
RENDERING --> FAILED: Render error
COMPLETE --> [*]
FAILED --> ANALYZING: Retry analysis
FAILED --> RENDERING: Retry render
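The status diagram above can be expressed as a transition table. This is a sketch, not the project's actual enforcement code: `ALLOWED_TRANSITIONS` and `can_transition` are hypothetical names, and the enum values are taken from the comment on `Project.status`.

```python
from enum import Enum


class ProjectStatus(str, Enum):
    CREATED = "created"
    ANALYZING = "analyzing"
    ANALYZED = "analyzed"
    RENDERING = "rendering"
    COMPLETE = "complete"
    FAILED = "failed"


# One entry per state in the diagram; values are the states it may move to.
ALLOWED_TRANSITIONS = {
    ProjectStatus.CREATED: {ProjectStatus.ANALYZING},
    ProjectStatus.ANALYZING: {ProjectStatus.ANALYZED, ProjectStatus.FAILED},
    ProjectStatus.ANALYZED: {ProjectStatus.RENDERING},
    ProjectStatus.RENDERING: {ProjectStatus.COMPLETE, ProjectStatus.FAILED},
    ProjectStatus.COMPLETE: set(),
    ProjectStatus.FAILED: {ProjectStatus.ANALYZING, ProjectStatus.RENDERING},
}


def can_transition(current: ProjectStatus, target: ProjectStatus) -> bool:
    """Return True when the diagram has an arrow from current to target."""
    return target in ALLOWED_TRANSITIONS[current]
```

Keeping the arrows in one dictionary makes it easy to compare the code against the diagram when either changes.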
Clip¶
A clip is a video segment within a project. Projects can have multiple clips that get concatenated during render.
class Clip:
    uuid: UUID
    project_id: UUID
    clip_file_id: UUID # Reference to shared ClipFile
    display_order: int # Position in timeline
    status: ClipStatus # pending, uploaded, importing, failed
    no_trim: bool # Skip silence removal for this clip
    no_assets: bool # Skip asset overlay
    no_subtitles: bool # Skip subtitle generation
ClipFile¶
ClipFile represents the actual media file, separate from Clip. This allows the same video file to be used in multiple projects without re-downloading or re-processing.
class ClipFile:
    uuid: UUID
    youtube_video_id: str | None # For metadata display (source tracking)
    sha256_hash: str | None # For integrity verification
    storage_key: str # R2 path to original video
    audio_key: str | None # R2 path to extracted audio
    proxy_key: str | None # R2 path to web-compatible proxy
    sprite_key: str | None # R2 path to timeline scrub sprite (10x20 grid)
    sprite_seconds_per_tile: int | None # Density chosen at sprite generation time
    waveform_json: list[float] # Peak amplitudes for timeline
    duration_ms: int
    status: ClipFileStatus # pending, multipart_initiated, downloading, uploaded, failed
    has_audio: bool # False when the source has no audio stream
has_audio is set by process_clip_artifacts after probing the input with ffprobe. If the source carries no audio track (e.g. iOS ReplayKit screen recordings made without microphone) the worker synthesizes a silent WAV at audio_key so downstream consumers stay shape-stable, and stores has_audio=False. The analysis pipeline reads this flag to route Whisper / silence / false-start around silent clips, and the frontend disables audio-dependent SettingsPanel controls when every clip in the project is silent.
Each YouTube import creates a new ClipFile (per-user storage, no deduplication). The youtube_video_id field is kept for metadata and source tracking only.
flowchart TB
A[Import YouTube URL] --> B[Download video]
B --> C[Create new ClipFile]
C --> D[Create Clip referencing ClipFile]
Edit¶
An edit represents a region of the video to modify: a silence, false start, or profanity detected by analysis; a manual edit created by the user; an asset placement; or a keep region marking content to preserve.
class Edit:
    uuid: UUID
    project_id: UUID
    type: EditType # silence, false_start, profanity, manual, asset, keep
    action: EditAction # cut (remove video+audio), mute (silence audio), insert (asset), keep (preserve region)
    start_ms: int # Edit start time (ms into combined project timeline)
    end_ms: int # Edit end time (enforced end_ms <= SUM(clip_file.duration_ms) at service layer)
    active: bool # Whether to apply this edit
    confidence: float # Detection confidence (0-1)
    reason: str # Full explanation (for debugging/logs)
    reason_tag: str # Short tag: raw English written by the detection pipeline; frontend localizes at render time
    # Asset fields (only when type='asset')
    asset_file_id: UUID | None # FK to UserAsset
    insert_source: InsertSource # fixed, ai_directed, manual
    fixed_position: FixedPosition # intro, outro, watermark, background_audio
    visual_mode: VisualMode # replace, overlay, insert, none
    audio_mode: AudioMode # original_only, asset_only, mix, none
    overlay_position: OverlayPosition # 9-position grid (top_left, center, etc.)
    overlay_size_percent: int # Overlay size as % of video (1-100)
    overlay_opacity_percent: int # Overlay opacity (10-100)
    overlay_x: float # Custom X position (0-1, center-anchored)
    overlay_y: float # Custom Y position (0-1, center-anchored)
    overlay_flip_h: bool # Flip overlay horizontally
    overlay_flip_v: bool # Flip overlay vertically
    overlay_rotation_deg: int # Rotation in degrees (0-360)
    audio_volume_percent: int # Audio volume (0-100)
    audio_duck_main: bool # Duck main audio during asset
    asset_offset_ms: int # Offset into asset source (ms)
    # Inside-insert anchor (issue #234): set together, both NULL when not anchored
    inside_insert_edit_id: UUID | None # FK → edit.uuid (anchored INSERT target)
    insert_offset_ms: int | None # Offset into that INSERT (ms, ≥ 0)
    # Per-INSERT override map: {insert_edit_uuid: 'merge'|'split'|'anchor'}
    insert_overlap_modes: dict[str, str] | None
    # Per-asset crop reframe (issue #235): only consulted for REPLACE/INSERT video/image
    asset_crop_enabled: bool # Default False; True opts into cover-crop (no letterbox)
    asset_crop_zoom: float # ∈ [1.0, 10.0], default 1.0
    asset_crop_pan_x: float # ∈ [-1.0, 1.0], default 0.0 (center)
    asset_crop_pan_y: float # ∈ [-1.0, 1.0], default 0.0 (center)
Inside-insert anchor. When a non-INSERT asset edit is placed inside an existing INSERT region, inside_insert_edit_id + insert_offset_ms are written as a pair (schema-level "both-or-neither" validator). The renderer's apply_inside_insert_anchors reads them and schedules the asset at <anchored insert's output_start_ms> + insert_offset_ms, bypassing the main-time shift. The FK has ON DELETE SET NULL as a safety net, but EditService.delete runs _repoint_anchored_children as the primary path when an INSERT is deleted — anchored children are moved to the splice point (visible duration preserved), the anchor fields are cleared, and the deleted insert's UUID is stripped from each child's insert_overlap_modes.
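The "both-or-neither" rule and the anchored scheduling arithmetic can be sketched as follows. `AnchorFields` and `anchored_output_start_ms` are hypothetical names for illustration; the real validator and apply_inside_insert_anchors may be structured differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnchorFields:
    """The two anchor columns from Edit, validated as a pair."""
    inside_insert_edit_id: Optional[str] = None
    insert_offset_ms: Optional[int] = None

    def __post_init__(self) -> None:
        has_id = self.inside_insert_edit_id is not None
        has_offset = self.insert_offset_ms is not None
        if has_id != has_offset:
            raise ValueError("anchor fields must be set together or both be None")
        if has_offset and self.insert_offset_ms < 0:
            raise ValueError("insert_offset_ms must be >= 0")


def anchored_output_start_ms(insert_output_start_ms: int,
                             anchor: AnchorFields) -> Optional[int]:
    """Return the anchored insert's output start plus the offset, or None if unanchored."""
    if anchor.inside_insert_edit_id is None:
        return None
    return insert_output_start_ms + anchor.insert_offset_ms
```

For example, an asset anchored 500 ms into an INSERT whose output starts at 12000 ms is scheduled at 12500 ms, regardless of how cuts earlier in the timeline shift main-video time.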
Per-asset crop reframe (issue #235). REPLACE / INSERT video and image asset edits default to letterbox when their source aspect differs from the project's target. Setting asset_crop_enabled=true opts that edit into a cover-crop reframe; the four columns persist UI-state (zoom + pan), not a frozen render-state crop rect — same convention as the main-video crop, so changing project aspect after persisting auto-adapts. The renderer's _maybe_crop_chain (workers/render/ffmpeg/concat/segments.py) builds the FFmpeg scale,crop,scale,setsar chain off _compute_crop_region (workers/render/ffmpeg/video_filters.py, Python port of frontend/shared/lib/cropUtils.ts:computeCropRegion). It bails to None when crop is disabled, target dims are unknown, or AssetFile.width/height are missing — the renderer falls back to today's letterbox path, so legacy assets uploaded before migration a4c9e2b6f085 round-trip without error.
Insert overlap modes. Per-INSERT override of how this asset behaves where its bar crosses or overlaps a specific INSERT:
- 'merge' (default, also "missing entry"): asset plays through the insert region.
- 'split': asset is split into pre/post pieces; insert region is skipped.
- 'anchor': asset is clipped to play only inside the host insert (renderer caps end at insert.output_end_ms). Combined with the anchor fields above.
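The three modes above can be illustrated with plain intervals. This is a simplified sketch that places the asset and the insert on one shared timeline; the real renderer works in output time, and `overlap_mode` and `asset_pieces` are hypothetical helpers.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[int, int]  # (start_ms, end_ms)


def overlap_mode(modes: Optional[Dict[str, str]], insert_uuid: str) -> str:
    """A missing map or missing entry means 'merge', as documented."""
    return (modes or {}).get(insert_uuid, "merge")


def asset_pieces(asset: Interval, insert: Interval, mode: str) -> List[Interval]:
    """Return the parts of the asset interval that play under the given mode."""
    a_start, a_end = asset
    i_start, i_end = insert
    if mode == "merge":
        # Asset plays straight through the insert region.
        return [(a_start, a_end)]
    if mode == "split":
        # Keep what lies before and after the insert; skip the insert itself.
        pieces = [(a_start, min(a_end, i_start)), (max(a_start, i_end), a_end)]
        return [(s, e) for s, e in pieces if e > s]
    if mode == "anchor":
        # Clip the asset so it plays only inside the host insert.
        s, e = max(a_start, i_start), min(a_end, i_end)
        return [(s, e)] if e > s else []
    raise ValueError(f"unknown overlap mode: {mode!r}")
```

With an asset spanning 1000-9000 ms and an insert at 4000-6000 ms, 'merge' keeps one piece, 'split' yields the pieces before and after the insert, and 'anchor' keeps only the 4000-6000 ms portion.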
EditAction determines how the edit is applied:
| Action | Video | Audio | Use Case |
|---|---|---|---|
| CUT | Removed | Removed | Silence, false starts - skip entirely |
| MUTE | Keeps playing | Silenced/bleeped | Profanity - video continues, audio censored |
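The action field is set by the backend during analysis, so the frontend can apply it without knowing the business logic. Among detected edits, profanity uses MUTE and all others use CUT; the insert and keep actions listed on the Edit model apply to asset and keep edits respectively.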
The reason field contains the full LLM explanation and is useful for debugging. The reason_tag is a short raw-English identifier written by the detection pipeline (e.g. "Word gap", "refinement · multi_attempt", "adjusted"). The API returns it unchanged; the frontend maps known values to the analysis.edit_reason.* i18n catalog at render time via REASON_TAG_KEY_BY_RAW in features/analysis/hooks/useTransformedEdits.ts. Unknown values fall through verbatim.
Users can toggle active to include/exclude specific edits before rendering. Each edit belongs to the AnalysisRun that created it (including manual cuts, which belong to the active run at creation time).
AnalysisRun¶
Tracks a single analysis pipeline execution. Each re-analysis creates a new run — old runs' edits/captions/transcript are preserved.
class AnalysisRun:
    uuid: UUID
    project_id: UUID
    user_id: int
    status: AnalysisRunStatus # pending, running, completed, failed
    pacing_level: int # Settings used for this run
    false_start_sensitivity: int
    language: str | None
    transcript: str | None # Owned by this run, copied to project when active
    transcript_words: list[dict] | None
    edit_count: int # Snapshot counts from analysis time
    silence_count: int
    false_start_count: int
    credits_charged: int
    duration_ms: int | None
Run switching: project.active_run_id points to the current run. Edit, caption, and draft queries filter by this. POST /analysis-runs/{run_uuid}/activate switches runs — copies transcript to project, invalidates frontend caches.
Draft¶
A draft saves a specific configuration of edits for a project. Think of it as a "save state" that users can restore later.
class Draft:
    uuid: UUID
    project_id: UUID
    name: str
    edit_overrides: dict[str, EditOverride] # Per-edit adjustments
    export_settings: dict # Resolution, aspect ratio, etc.
    is_default: bool # Auto-load this draft
The edit_overrides field stores adjustments to individual edits without modifying the original Edit records:
flowchart LR
subgraph Original["Original Edit Records"]
E1["Edit 1: 1000-2500ms, active"]
E2["Edit 2: 5000-6200ms, active"]
end
subgraph Draft["Draft Overrides"]
O1["Edit 1: start_ms=1200"]
O2["Edit 2: active=false"]
end
subgraph Result["Applied State"]
R1["Edit 1: 1200-2500ms, active"]
R2["Edit 2: 5000-6200ms, inactive"]
end
E1 --> O1 --> R1
E2 --> O2 --> R2
class EditOverride:
    active: bool # Override active state
    start_ms: int | None # Override start time
    end_ms: int | None # Override end time
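Applying overrides on top of the original records, as in the diagram above, might look like the sketch below. It uses simplified stand-ins for Edit and EditOverride, and it assumes every override field (including `active`) may be left unset, which is how the diagram's "Edit 1" override behaves. `apply_overrides` is a hypothetical helper.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Edit:
    """Simplified stand-in for the Edit model."""
    uuid: str
    start_ms: int
    end_ms: int
    active: bool


@dataclass(frozen=True)
class EditOverride:
    """Assumption: None means 'no override for this field'."""
    active: Optional[bool] = None
    start_ms: Optional[int] = None
    end_ms: Optional[int] = None


def apply_overrides(edits: List[Edit],
                    overrides: Dict[str, EditOverride]) -> List[Edit]:
    """Return new Edit values with overrides applied; originals are untouched."""
    result = []
    for edit in edits:
        override = overrides.get(edit.uuid)
        if override is None:
            result.append(edit)
            continue
        fields = (("active", override.active),
                  ("start_ms", override.start_ms),
                  ("end_ms", override.end_ms))
        changes = {name: value for name, value in fields if value is not None}
        result.append(replace(edit, **changes))
    return result
```

Because the dataclasses are frozen and `replace` builds new objects, the original Edit records stay unmodified, which matches the purpose of edit_overrides.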
Export¶
An export is a rendered video. Each export captures a snapshot of the edits at render time.
class Export:
    uuid: UUID
    project_id: UUID
    draft_id: UUID | None # Source draft (optional)
    name: str
    edit_snapshot: dict # Frozen copy of active edits
    settings_snapshot: dict # Frozen export settings
    storage_key: str | None # R2 path to rendered video
    status: ExportStatus # pending, processing, complete, failed
    duration_ms: int | None # Final video duration
    file_size_bytes: int | None
The snapshots mean you can keep editing while a render is in progress - it uses the frozen state.
Supporting Entities¶
Asset & AssetFile¶
User-uploaded media assets (images, videos, audio) for overlays and b-roll:
class AssetFile:
    uuid: UUID
    storage_key: str | None # R2 path to file
    original_filename: str
    content_type: str
    size_bytes: int | None
    status: AssetFileStatus # pending, multipart_initiated, processing, uploaded, failed
    youtube_video_id: str | None # For metadata display (source tracking)
    duration_ms: int | None # Video/audio duration
    waveform_json: list[float] | None # Peak amplitudes for timeline visualization
    metadata_json: dict | None # Additional metadata (title, uploader, etc.)
    width: int | None # Source pixel width (video/image only; issue #235)
    height: int | None # Source pixel height (video/image only; issue #235)

class UserAsset:
    uuid: UUID
    user_id: int
    asset_file_id: UUID # FK with CASCADE delete
    display_name: str
    tags: list[str]
    # Note: Groups via UserAssetGroupMembership (many-to-many)
Each asset import creates a new AssetFile (per-user storage, no deduplication). Deleting a UserAsset cascades to delete the underlying AssetFile and storage object. The youtube_video_id field is kept for metadata display only. width and height are populated by _probe_asset_media for video and image types during process_asset_artifacts (migration a4c9e2b6f085); they remain NULL for audio assets and for legacy uploads that predate the probe extension. The per-asset crop renderer treats missing dimensions as a signal to fall back to letterbox; see Edit.asset_crop_* above.
AssetGroup¶
Organizational folders for assets with AI instructions:
class AssetGroup:
    uuid: UUID
    user_id: int
    name: str
    description: str | None
    default_instructions: str | None # AI hint for when to use assets
    is_default: bool # Auto-include in projects
    is_pinned: bool # Show at top of list
    display_order: int
UserAssetGroupMembership¶
Junction table for many-to-many asset-group relationships:
class UserAssetGroupMembership:
    user_asset_id: UUID # FK to UserAsset (CASCADE delete)
    group_id: UUID # FK to AssetGroup (CASCADE delete)
    added_at: datetime
This enables assets to belong to multiple groups simultaneously. "Copy to group" creates a membership link rather than duplicating the asset.
flowchart TB
A[Asset: Logo.png] --> M1[Membership]
A --> M2[Membership]
M1 --> G1[Brand Assets]
M2 --> G2[Intro Templates]
CaptionLine¶
Editable caption/subtitle lines generated from transcript:
class CaptionLine:
    uuid: UUID
    project_id: UUID # FK to Project
    sequence: int # Order in transcript
    text: str # Editable caption text
    original_text: str # Original from transcript
    start_ms: int # Start time
    end_ms: int # End time
Caption lines are generated from transcript words with configurable max_words per line (3-5 for vertical video, 7-10 for horizontal). Users can edit text while original_text preserves the original for comparison.
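A minimal sketch of grouping transcript words into caption lines by max_words. The word shape (`text`, `start_ms`, `end_ms`) is an assumption about transcript_words, and `group_words_into_lines` is a hypothetical helper; the real generator may also break on punctuation or pauses.

```python
from typing import Dict, List


def group_words_into_lines(words: List[Dict], max_words: int) -> List[Dict]:
    """Chunk words into caption lines of at most max_words words each."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    lines = []
    for sequence, start in enumerate(range(0, len(words), max_words)):
        chunk = words[start:start + max_words]
        text = " ".join(word["text"] for word in chunk)
        lines.append({
            "sequence": sequence,
            "text": text,            # editable copy
            "original_text": text,   # preserved for comparison
            "start_ms": chunk[0]["start_ms"],
            "end_ms": chunk[-1]["end_ms"],
        })
    return lines
```

A vertical video would call this with a max_words in the 3-5 range and a horizontal one with 7-10, per the guidance above.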
AnalysisPreset¶
Saved analysis settings that users can reuse across projects:
class AnalysisPreset:
    uuid: UUID
    user_id: int
    name: str
    is_default: bool # Auto-apply this preset
    pacing_level: int # 0-100, silence removal aggressiveness
    false_start_sensitivity: int # 0-100, false start detection
    language: str | None # ISO code (en, pt-BR) or null for auto
    audio_clean: bool # Enable noise reduction + LUFS normalization
    audio_censorship: str # 'none', 'mute', 'bleep' - profanity handling
    caption_censorship: bool # Replace profanity with asterisks in captions
    director_notes: str | None # Free-text AI instructions
Audio Clean Processing:
When audio_clean is enabled, exports include two audio processing stages:
| Stage | FFmpeg Filter | Purpose |
|---|---|---|
| Noise Reduction | afftdn=nf=-25 | Removes background noise (AC, fans, room tone) |
| LUFS Normalization | loudnorm=I=-14:TP=-1.5:LRA=11 | Adjusts loudness to -14 LUFS (YouTube/Spotify standard) |
Noise reduction runs first to avoid amplifying background noise during normalization.
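The ordering matters because FFmpeg applies a comma-separated filter chain left to right. The sketch below builds such a chain from the two filter strings in the table; `audio_clean_filter` and `ffmpeg_args` are hypothetical helpers, and a real export command would carry many more arguments.

```python
from typing import List, Optional

NOISE_REDUCTION = "afftdn=nf=-25"
LUFS_NORMALIZATION = "loudnorm=I=-14:TP=-1.5:LRA=11"


def audio_clean_filter(audio_clean: bool) -> Optional[str]:
    """Return the -af chain, noise reduction first, or None when disabled."""
    if not audio_clean:
        return None
    return ",".join([NOISE_REDUCTION, LUFS_NORMALIZATION])


def ffmpeg_args(input_path: str, output_path: str, audio_clean: bool) -> List[str]:
    """Build a bare-bones argument list; only the audio filter part is illustrated."""
    args = ["ffmpeg", "-i", input_path]
    chain = audio_clean_filter(audio_clean)
    if chain is not None:
        args += ["-af", chain]
    args.append(output_path)
    return args
```

Passing the arguments as a list rather than a shell string avoids quoting issues with the colons and equals signs in the filter options.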
Audio Censorship Options:
| Mode | Effect |
|---|---|
| none | No audio censorship - profanity plays normally |
| mute | Silence audio during profanity regions |
| bleep | Play 1kHz tone during profanity regions |
Users can have up to 5 presets. One can be marked as default.
PreviewPreset¶
Saved preview/export styling settings:
class PreviewPreset:
    uuid: UUID
    user_id: int
    name: str
    is_default: bool
    # Format settings
    format: str | None # "W:H" aspect ratio (e.g. "16:9", "9:16", "1:1", "4:3", "5:4"). None = original.
    background: str # Letterbox color (#000000)
    # Caption styling
    caption_style: CaptionStyle # default, minimal, bold
    caption_font: str # sans, serif, mono (generic family ids)
    caption_size: int # Font size in px (12-144)
    caption_position: CaptionPosition # top, center, bottom
    caption_color: str # Text color (#FFFFFF)
    caption_length: CaptionLength # short, medium, long (words per line)
    # Main video transforms
    video_flip_h: bool # Flip main video horizontally
    video_flip_v: bool # Flip main video vertically
Allows users to save and quickly switch between different export configurations.
format field semantics. Stored as a "W:H" string (e.g. "16:9", "9:16", "1:1", "4:3", "5:4") or None meaning "Original" (keep the source video's native ratio). Validated at the API boundary against FORMAT_RATIO_PATTERN = r"^[1-9]\d{0,1}:[1-9]\d{0,1}$" in modules/preview_preset/constants.py. The pattern rejects zero sides (prevents a 0:0 reaching any future downstream consumer that would divide by it) and caps each side at 99 so stored values stay ≤ 5 characters (no log pollution) and within the range of real video aspect ratios.
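The pattern's behavior can be checked directly. In this sketch the regex is the one quoted above, while `is_valid_format` is a hypothetical helper. It uses re.fullmatch, an assumption about how the validator is invoked, so that a trailing newline is also rejected (with re.match, `$` would accept one).

```python
import re
from typing import Optional

FORMAT_RATIO_PATTERN = r"^[1-9]\d{0,1}:[1-9]\d{0,1}$"


def is_valid_format(value: Optional[str]) -> bool:
    """None means 'Original'; otherwise the value must be W:H with sides 1-99."""
    if value is None:
        return True
    return re.fullmatch(FORMAT_RATIO_PATTERN, value) is not None
```

Values such as "16:9", "9:16", and "10:10" pass, while "0:0", "100:1", and "16x9" are rejected.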
This field was previously a three-value enum (FormatRatio = {LANDSCAPE, PORTRAIT, SQUARE}). It was widened because the mobile FormatPanel lets users pick custom ratios, and saving a preset with a custom ratio would otherwise fail with a 422. The enum and its PG type formatratio have been removed.
Security boundary. This field is UI metadata only — it round-trips between the DB and React state to restore previewAspectRatio on preset load. It does NOT reach FFmpeg arguments, shell commands, filesystem paths, or any other external consumer. If a future change adds such a consumer, the FORMAT_RATIO_PATTERN validation must be re-evaluated for that context.
Subscriber¶
Newsletter subscriber record. Independent of User — newsletter signups can predate or replace account creation, so the same email may appear in both tables.
class Subscriber:
    uuid: UUID
    email: str # unique
    locale: str # 'en' | 'pt' | 'es' (defaults to DEFAULT_LOCALE; seeded from Accept-Language at signup)
    status: SubscriberStatus # pending | confirmed | unsubscribed
    confirmation_token: str | None # double-opt-in token
    confirmation_token_generated_at: datetime | None
    confirmation_email_send_attempts: int # cron-bumped on each re-enqueue
    confirmed_at: datetime | None
    unsubscribed_at: datetime | None
Locale flow. POST /newsletter/subscribe reads Accept-Language (via get_request_locale) and stores it on the row. The confirmation email task and the confirm/unsubscribe success redirects (/{locale}/newsletter/{confirmed,unsubscribed} on the landing site, English unprefixed at root per Astro's prefixDefaultLocale: false) both consume subscriber.locale — so a signup in Portuguese sees Portuguese email copy and lands on /pt/newsletter/confirmed.
Lifecycle invariants. Pending rows whose confirmation_token_generated_at is older than UNCONFIRMED_SUBSCRIBER_TTL_DAYS = 30 are hard-deleted by _cleanup_unconfirmed_subscribers (GDPR Art. 5 storage limitation — abandoned signup is PII without a current legal basis). Pending rows past STUCK_NEWSLETTER_PENDING_MINUTES = 30 with confirmation_email_send_attempts < MAX_NEWSLETTER_CONFIRMATION_ATTEMPTS = 3 are re-enqueued by _recover_pending_newsletter_subscribers; each successful enqueue bumps the attempts counter and resets confirmation_token_generated_at (gives the user a fresh 24h confirm window and prevents immediate re-match next sweep).
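The two sweeps can be summarized as one decision over a pending row. The constants are the ones named above; `pending_action` is a hypothetical helper, and checking the TTL before the retry condition is an assumption about how the two jobs interact.

```python
from datetime import datetime, timedelta

UNCONFIRMED_SUBSCRIBER_TTL_DAYS = 30
STUCK_NEWSLETTER_PENDING_MINUTES = 30
MAX_NEWSLETTER_CONFIRMATION_ATTEMPTS = 3


def pending_action(token_generated_at: datetime, send_attempts: int,
                   now: datetime) -> str:
    """Classify a pending subscriber row as 'delete', 're-enqueue', or 'wait'."""
    age = now - token_generated_at
    if age > timedelta(days=UNCONFIRMED_SUBSCRIBER_TTL_DAYS):
        return "delete"  # hard delete: abandoned signup
    stuck = age > timedelta(minutes=STUCK_NEWSLETTER_PENDING_MINUTES)
    if stuck and send_attempts < MAX_NEWSLETTER_CONFIRMATION_ATTEMPTS:
        return "re-enqueue"  # caller bumps attempts and resets the timestamp
    return "wait"
```

Since each re-enqueue resets confirmation_token_generated_at, a row only starts ageing toward the 30-day cleanup once its retry attempts are exhausted.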
Billing & Entitlements¶
Payment¶
Tracks Stripe Checkout Sessions and payment records.
| Field | Type | Description |
|---|---|---|
| user_id | FK → User | User who made the payment |
| price_id | FK → Price | Price being purchased |
| payment_type | Enum | ONE_TIME or SUBSCRIPTION |
| status | Enum | PENDING, SUCCEEDED, FAILED |
| amount | int | Amount in cents |
| stripe_checkout_session_id | str | Stripe Checkout Session ID |
| stripe_payment_intent_id | str | Stripe Payment Intent ID (used for chargeback matching) |
| stripe_customer_id | str | Stripe Customer ID |
| stripe_subscription_id | str | For subscription payments |
UserEntitlement¶
Flexible access control — grants credits, tier access, or features to users.
| Field | Type | Description |
|---|---|---|
| user_id | FK → User | User who owns this entitlement |
| entitlement_type | Enum | SUBSCRIPTION, CREDIT_GRANT, TIER_ACCESS, FEATURE_UNLOCK |
| grant_reason | Enum | PURCHASE, SUBSCRIPTION, TRIAL, BONUS, etc. |
| tier_id | FK → Tier | For TIER_ACCESS entitlements |
| credit_type | str | For CREDIT_GRANT (e.g., "ai_minutes") |
| quantity_granted | int | Total credits granted |
| quantity_used | int | Credits consumed |
| consumption_type | Enum | NONE, DECREMENTAL, RENEWABLE |
| expires_at | datetime | When entitlement expires (null = permanent) |
| status | Enum | ACTIVE, INACTIVE, EXPIRED, SUSPENDED |
EntitlementTransaction¶
Authoritative append-only ledger for all credit events. quantity_used on UserEntitlement is a denormalized cache rebuildable from this table via rebuild_user_balance().
| Field | Type | Description |
|---|---|---|
| user_id | FK → User | User |
| entitlement_id | FK → UserEntitlement | Which entitlement was affected |
| credit_type | Enum | AI_MINUTES, API_CALLS, etc. |
| amount | int | Positive for grants, negative for usage, 0 for resets |
| transaction_type | Enum | PURCHASE, GRANT, USAGE, RESET, TRIAL, BETA, REFUND |
| balance_before | int | Per-entitlement balance before transaction |
| balance_after | int | Per-entitlement balance after transaction |
| period | str | Period start date for RESET transactions (e.g., "2026-04-05") |
| pool_identifier | Enum | Credit pool (subscription_allowance, beta_program, etc.) |
| transaction_metadata | JSON | Extra context (e.g., {"usage_before_reset": 75}) |
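To illustrate why a cached quantity_used can be rebuilt from an append-only ledger, the sketch below replays transactions for one entitlement. It is not the logic of rebuild_user_balance(): `rebuild_quantity_used` is a hypothetical helper, and it assumes that a RESET zeroes usage and that only USAGE rows count toward it (REFUND handling is left out).

```python
from typing import Dict, List


def rebuild_quantity_used(transactions: List[Dict]) -> int:
    """Replay chronological ledger rows and return the usage since the last RESET."""
    used = 0
    for tx in transactions:
        if tx["transaction_type"] == "RESET":
            used = 0  # assumption: a reset starts a new usage period
        elif tx["transaction_type"] == "USAGE":
            used += -tx["amount"]  # usage amounts are stored as negatives
    return used
```

Because rows are never updated or deleted, replaying them in order always yields the same result, which is what makes the cached column safe to repair.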
Common Patterns¶
UUIDs Everywhere¶
All entities use UUIDs as public identifiers. Internal integer IDs exist but are never exposed via the API:
# Good - use UUID in API responses
{"uuid": "550e8400-e29b-41d4-a716-446655440000", ...}
# Bad - never expose internal IDs
{"id": 42, ...}
Soft Deletion¶
Entities aren't truly deleted; they're marked with deleted_at and is_deleted.
Queries filter out deleted records by default.
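A minimal sketch of the soft-deletion pattern using plain dataclasses rather than the project's SQLAlchemy models. The deleted_at and is_deleted field names come from this page; `Record`, `soft_delete`, and `visible` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Record:
    """Stand-in for any soft-deletable entity."""
    uuid: str
    is_deleted: bool = False
    deleted_at: Optional[datetime] = None


def soft_delete(record: Record, now: Optional[datetime] = None) -> None:
    """Mark the record as deleted instead of removing it."""
    record.is_deleted = True
    record.deleted_at = now or datetime.now(timezone.utc)


def visible(records: List[Record]) -> List[Record]:
    """Default query behaviour: hide soft-deleted records."""
    return [r for r in records if not r.is_deleted]
```

In a database-backed version the `visible` filter would be a WHERE clause applied by default to every query.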
Timestamps¶
All entities track creation and update times:
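A sketch of the timestamp pattern, again with a plain dataclass. The created_at and updated_at field names are assumptions, since this page does not list them, and `Timestamped` and `touch` are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def utcnow() -> datetime:
    """Timezone-aware current time in UTC."""
    return datetime.now(timezone.utc)


@dataclass
class Timestamped:
    created_at: datetime = field(default_factory=utcnow)  # assumed field name
    updated_at: datetime = field(default_factory=utcnow)  # assumed field name

    def touch(self) -> None:
        """Record a modification; created_at never changes."""
        self.updated_at = utcnow()
```

An ORM would typically set these through column defaults and update hooks instead of an explicit `touch` call.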
Key Files¶
| Component | Location |
|---|---|
| Project model | backend/src/modules/project/models.py |
| Clip model | backend/src/modules/clip/models.py |
| Edit model | backend/src/modules/edit/models.py |
| Export model | backend/src/modules/export/models.py |
| Draft model | backend/src/modules/draft/models.py |
| Asset models | backend/src/modules/asset/models.py |
| CaptionLine model | backend/src/modules/caption_line/models.py |
| AnalysisRun model | backend/src/modules/analysis_run/models.py |
| AnalysisPreset model | backend/src/modules/preset/models.py |
| PreviewPreset model | backend/src/modules/preview_preset/models.py |
| Common schemas | backend/src/modules/common/schemas.py |