
Models

This page documents the core database models in Sapari. All models use SQLAlchemy with async support and follow a consistent pattern with UUIDs, timestamps, and soft deletion.

Core Entities

Sapari has six main entities that form the video editing workflow:

erDiagram
    User ||--o{ Project : owns
    User ||--o{ UserAsset : uploads
    User ||--o{ AssetGroup : creates
    Project ||--o{ Clip : contains
    Project ||--o{ Edit : has
    Project ||--o{ Draft : saves
    Project ||--o{ Export : renders

    Clip }o--|| ClipFile : references

    Draft }o--o{ Edit : overrides
    Export }o--o| Draft : "based on"

    UserAsset }o--|| AssetFile : references
    UserAsset }o--o{ AssetGroup : "belongs to (many-to-many)"

User

class User:
    id: int
    name: str                      # 2-30 chars
    username: str                  # 2-20 chars, lowercase alphanumeric, unique
    email: str                     # unique
    hashed_password: str
    profile_image_url: str | None
    tier_id: int | None            # FK to Tier
    is_superuser: bool
    email_verified: bool           # Must be True for password users to log in
    google_id: str | None          # Google OAuth user ID (unique)
    github_id: str | None          # GitHub OAuth user ID (unique)
    oauth_provider: str | None     # "google" or "github"
    stripe_customer_id: str | None # Stripe customer reference
    storage_used_bytes: int        # Cached counter of upload storage used (default 0)
    onboarding_seen: dict | None   # Tour keys dismissed, e.g. {"desktop_pipeline": true} (JSON, nullable)

Auth flow: Password-based users start with email_verified=False. A verification email is sent on signup. Login returns 403 until verified. OAuth users (Google/GitHub) get email_verified set from the provider and skip the verification step.

Storage quota: storage_used_bytes is a cached counter: incremented on upload confirm (clip/asset), decremented on delete (non-YouTube only, last reference). A daily reconciliation cron (reconcile_storage_usage, 3 AM) corrects drift. Quota is checked at presign time; exceeding the tier quota returns 422.
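The counter's lifecycle can be sketched as follows. The function and exception names here are illustrative, not the actual service code; the real logic lives in the upload confirm/delete services and the reconcile_storage_usage cron.

```python
class QuotaExceeded(Exception):
    """Raised at presign time when an upload would exceed the tier quota (surfaces as HTTP 422)."""

def check_quota(storage_used_bytes: int, upload_size_bytes: int, tier_quota_bytes: int) -> None:
    # Quota is checked before presigning the upload URL, not after the upload lands.
    if storage_used_bytes + upload_size_bytes > tier_quota_bytes:
        raise QuotaExceeded

def on_upload_confirmed(storage_used_bytes: int, size_bytes: int) -> int:
    # Increment the cached counter when a clip/asset upload is confirmed.
    return storage_used_bytes + size_bytes

def reconcile(actual_bytes_in_storage: int) -> int:
    # Daily cron: recompute from the source of truth to correct any drift.
    return actual_bytes_in_storage
```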

Project

A project is the top-level container for a video editing session. Users create projects, upload clips to them, and export edited videos.

class Project:
    uuid: UUID
    user_id: int
    name: str
    status: ProjectStatus  # created, analyzing, analyzed, rendering, complete, failed
    settings: dict         # pacing_level, silence_threshold_ms, language, etc.
    transcript: str | None # Full transcript text (copy from active run)
    active_run_id: UUID | None  # Currently active AnalysisRun
    error_message: str | None

Status Flow:

stateDiagram-v2
    [*] --> CREATED
    CREATED --> ANALYZING: Trigger analysis
    ANALYZING --> ANALYZED: Analysis complete
    ANALYZING --> FAILED: Analysis error
    ANALYZED --> RENDERING: Trigger render
    RENDERING --> COMPLETE: Render complete
    RENDERING --> FAILED: Render error
    COMPLETE --> [*]
    FAILED --> ANALYZING: Retry analysis
    FAILED --> RENDERING: Retry render
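The state diagram above can be expressed as a transition table. This is a sketch of how a service layer might guard transitions; where Sapari actually enforces this is an assumption.

```python
# Allowed project status transitions, transcribed from the state diagram.
ALLOWED_TRANSITIONS: dict[str, set[str]] = {
    "created": {"analyzing"},
    "analyzing": {"analyzed", "failed"},
    "analyzed": {"rendering"},
    "rendering": {"complete", "failed"},
    "failed": {"analyzing", "rendering"},  # retry either stage
    "complete": set(),                     # terminal
}

def can_transition(current: str, target: str) -> bool:
    return target in ALLOWED_TRANSITIONS.get(current, set())
```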

Clip

A clip is a video segment within a project. Projects can have multiple clips that get concatenated during render.

class Clip:
    uuid: UUID
    project_id: UUID
    clip_file_id: UUID     # Reference to shared ClipFile
    display_order: int     # Position in timeline
    status: ClipStatus     # pending, processing, ready, failed
    no_trim: bool          # Skip silence removal for this clip
    no_assets: bool        # Skip asset overlay
    no_subtitles: bool     # Skip subtitle generation

ClipFile

ClipFile represents the actual media file, separate from Clip. This allows the same video file to be used in multiple projects without re-downloading or re-processing.

class ClipFile:
    uuid: UUID
    youtube_video_id: str | None  # For metadata display (source tracking)
    sha256_hash: str | None       # For integrity verification
    storage_key: str              # R2 path to original video
    audio_key: str | None         # R2 path to extracted audio
    proxy_key: str | None         # R2 path to web-compatible proxy
    sprite_key: str | None        # R2 path to timeline scrub sprite (10x20 grid)
    sprite_seconds_per_tile: int | None  # Density chosen at sprite generation time
    waveform_json: list[float]    # Peak amplitudes for timeline
    duration_ms: int
    status: ClipFileStatus        # pending, processing, ready, failed

Each YouTube import creates a new ClipFile (per-user storage, no deduplication). The youtube_video_id field is kept for metadata and source tracking only.

flowchart TB
    A[Import YouTube URL] --> B[Download video]
    B --> C[Create new ClipFile]
    C --> D[Create Clip referencing ClipFile]

Edit

An edit represents a region to modify in the video: a silence, false start, or profanity region detected by analysis; a manual edit created by the user; or a keep region marking content to preserve.

class Edit:
    uuid: UUID
    project_id: UUID
    type: EditType       # silence, false_start, profanity, manual, asset, keep
    action: EditAction   # cut (remove video+audio), mute (silence audio), insert (asset), keep (preserve region)
    start_ms: int        # Edit start time (ms into combined project timeline)
    end_ms: int          # Edit end time (enforced end_ms <= SUM(clip_file.duration_ms) at service layer)
    active: bool         # Whether to apply this edit
    confidence: float    # Detection confidence (0-1)
    reason: str          # Full explanation (for debugging/logs)
    reason_tag: str      # Short tag for UI display (e.g., "Word gap", "Serial repetition")

    # Asset fields (only when type='asset')
    asset_file_id: UUID | None       # FK to UserAsset
    insert_source: InsertSource      # fixed, ai_directed, manual
    fixed_position: FixedPosition    # intro, outro, watermark, background_audio
    visual_mode: VisualMode          # replace, overlay, insert, none
    audio_mode: AudioMode            # original_only, asset_only, mix, none
    overlay_position: OverlayPosition  # 9-position grid (top_left, center, etc.)
    overlay_size_percent: int        # Overlay size as % of video (1-100)
    overlay_opacity_percent: int     # Overlay opacity (10-100)
    overlay_x: float                 # Custom X position (0-1, center-anchored)
    overlay_y: float                 # Custom Y position (0-1, center-anchored)
    overlay_flip_h: bool             # Flip overlay horizontally
    overlay_flip_v: bool             # Flip overlay vertically
    overlay_rotation_deg: int        # Rotation in degrees (0-360)
    audio_volume_percent: int        # Audio volume (0-100)
    audio_duck_main: bool            # Duck main audio during asset
    asset_offset_ms: int             # Offset into asset source (ms)

EditAction determines how the edit is applied:

| Action | Video | Audio | Use Case |
|--------|-------|-------|----------|
| CUT | Removed | Removed | Silence, false starts - skip entirely |
| MUTE | Keeps playing | Silenced/bleeped | Profanity - video continues, audio censored |

The action field is set by the backend during analysis - the frontend just uses it without needing to know the business logic. Profanity edits use MUTE, all others use CUT.

The reason field contains the full LLM explanation and is useful for debugging. The reason_tag is a short, formatted tag shown in the UI (converted from snake_case to "Sentence case" by the API).

Users can toggle active to include/exclude specific edits before rendering. Each edit belongs to the AnalysisRun that created it (including manual cuts, which belong to the active run at creation time).
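A minimal sketch of the type-to-action mapping described above. The function is illustrative; the real assignment happens in the analysis pipeline.

```python
def action_for(edit_type: str) -> str:
    # Profanity keeps the video playing with audio censored; asset edits
    # insert media; keep regions preserve content; everything else
    # (silence, false_start, manual) cuts both video and audio.
    return {"profanity": "mute", "asset": "insert", "keep": "keep"}.get(edit_type, "cut")
```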

AnalysisRun

Tracks a single analysis pipeline execution. Each re-analysis creates a new run — old runs' edits/captions/transcript are preserved.

class AnalysisRun:
    uuid: UUID
    project_id: UUID
    user_id: int
    status: AnalysisRunStatus  # pending, running, completed, failed
    pacing_level: int          # Settings used for this run
    false_start_sensitivity: int
    language: str | None
    transcript: str | None     # Owned by this run, copied to project when active
    transcript_words: list[dict] | None
    edit_count: int            # Snapshot counts from analysis time
    silence_count: int
    false_start_count: int
    credits_charged: int
    duration_ms: int | None

Run switching: project.active_run_id points to the current run. Edit, caption, and draft queries filter by this. POST /analysis-runs/{run_uuid}/activate switches runs — copies transcript to project, invalidates frontend caches.

Draft

A draft saves a specific configuration of edits for a project. Think of it as a "save state" that users can restore later.

class Draft:
    uuid: UUID
    project_id: UUID
    name: str
    edit_overrides: dict[str, EditOverride]  # Per-edit adjustments
    export_settings: dict                     # Resolution, aspect ratio, etc.
    is_default: bool                          # Auto-load this draft

The edit_overrides field stores adjustments to individual edits without modifying the original Edit records:

flowchart LR
    subgraph Original["Original Edit Records"]
        E1["Edit 1: 1000-2500ms, active"]
        E2["Edit 2: 5000-6200ms, active"]
    end

    subgraph Draft["Draft Overrides"]
        O1["Edit 1: start_ms=1200"]
        O2["Edit 2: active=false"]
    end

    subgraph Result["Applied State"]
        R1["Edit 1: 1200-2500ms, active"]
        R2["Edit 2: 5000-6200ms, inactive"]
    end

    E1 --> O1 --> R1
    E2 --> O2 --> R2

class EditOverride:
    active: bool         # Override active state
    start_ms: int | None # Override start time
    end_ms: int | None   # Override end time
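A sketch of how overrides merge with the original edits, mirroring the flowchart above. Plain dicts stand in for the ORM models; the merge logic itself is an assumption.

```python
def apply_overrides(edits: list[dict], overrides: dict[str, dict]) -> list[dict]:
    # Produce the applied state without mutating the original Edit records.
    applied = []
    for edit in edits:
        merged = dict(edit)
        override = overrides.get(edit["uuid"], {})
        for field in ("active", "start_ms", "end_ms"):
            if override.get(field) is not None:
                merged[field] = override[field]
        applied.append(merged)
    return applied
```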

Export

An export is a rendered video. Each export captures a snapshot of the edits at render time.

class Export:
    uuid: UUID
    project_id: UUID
    draft_id: UUID | None         # Source draft (optional)
    name: str
    edit_snapshot: dict           # Frozen copy of active edits
    settings_snapshot: dict       # Frozen export settings
    storage_key: str | None       # R2 path to rendered video
    status: ExportStatus          # pending, processing, complete, failed
    duration_ms: int | None       # Final video duration
    file_size_bytes: int | None

The snapshots mean you can keep editing while a render is in progress - it uses the frozen state.
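The freeze can be sketched as a deep copy of the active edits at render time. This is illustrative; the real snapshot is persisted as JSON in edit_snapshot.

```python
import copy

def snapshot_edits(edits: list[dict]) -> list[dict]:
    # Freeze only the currently active edits; later mutations to the live
    # edit list do not affect an in-flight render.
    return copy.deepcopy([e for e in edits if e.get("active")])
```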

Supporting Entities

UserAsset & AssetFile

User-uploaded media assets (images, videos, audio) for overlays and b-roll:

class AssetFile:
    uuid: UUID
    storage_key: str | None       # R2 path to file
    original_filename: str
    content_type: str
    size_bytes: int | None
    status: AssetFileStatus       # pending, uploaded, failed
    youtube_video_id: str | None  # For metadata display (source tracking)
    duration_ms: int | None       # Video/audio duration
    waveform_json: list[float] | None  # Peak amplitudes for timeline visualization
    metadata_json: dict | None    # Additional metadata (title, uploader, etc.)

class UserAsset:
    uuid: UUID
    user_id: int
    asset_file_id: UUID           # FK with CASCADE delete
    display_name: str
    tags: list[str]
    # Note: Groups via UserAssetGroupMembership (many-to-many)

Each asset import creates a new AssetFile (per-user storage, no deduplication). Deleting a UserAsset cascades to delete the underlying AssetFile and storage object. The youtube_video_id field is kept for metadata display only.

AssetGroup

Organizational folders for assets with AI instructions:

class AssetGroup:
    uuid: UUID
    user_id: int
    name: str
    description: str | None
    default_instructions: str | None  # AI hint for when to use assets
    is_default: bool                   # Auto-include in projects
    is_pinned: bool                    # Show at top of list
    display_order: int

UserAssetGroupMembership

Junction table for many-to-many asset-group relationships:

class UserAssetGroupMembership:
    user_asset_id: UUID   # FK to UserAsset (CASCADE delete)
    group_id: UUID        # FK to AssetGroup (CASCADE delete)
    added_at: datetime

This enables assets to belong to multiple groups simultaneously. "Copy to group" creates a membership link rather than duplicating the asset.

flowchart TB
    A[Asset: Logo.png] --> M1[Membership]
    A --> M2[Membership]
    M1 --> G1[Brand Assets]
    M2 --> G2[Intro Templates]

CaptionLine

Editable caption/subtitle lines generated from transcript:

class CaptionLine:
    uuid: UUID
    project_id: UUID              # FK to Project
    sequence: int                 # Order in transcript
    text: str                     # Editable caption text
    original_text: str            # Original from transcript
    start_ms: int                 # Start time
    end_ms: int                   # End time

Caption lines are generated from transcript words with configurable max_words per line (3-5 for vertical video, 7-10 for horizontal). Users can edit text while original_text preserves the original for comparison.
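A sketch of that grouping, assuming each transcript word dict carries text, start_ms, and end_ms (the field names are assumptions based on transcript_words above).

```python
def build_caption_lines(words: list[dict], max_words: int) -> list[dict]:
    # Chunk transcript words into caption lines of at most max_words each,
    # inheriting timing from the first and last word of the chunk.
    lines = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        text = " ".join(w["text"] for w in chunk)
        lines.append({
            "sequence": len(lines),
            "text": text,
            "original_text": text,  # preserved for later comparison
            "start_ms": chunk[0]["start_ms"],
            "end_ms": chunk[-1]["end_ms"],
        })
    return lines
```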

AnalysisPreset

Saved analysis settings that users can reuse across projects:

class AnalysisPreset:
    uuid: UUID
    user_id: int
    name: str
    is_default: bool              # Auto-apply this preset
    pacing_level: int             # 0-100, silence removal aggressiveness
    false_start_sensitivity: int  # 0-100, false start detection
    language: str | None          # ISO code (en, pt-BR) or null for auto
    audio_clean: bool             # Enable noise reduction + LUFS normalization
    audio_censorship: str         # 'none', 'mute', 'bleep' - profanity handling
    caption_censorship: bool      # Replace profanity with asterisks in captions
    director_notes: str | None    # Free-text AI instructions

Audio Clean Processing:

When audio_clean is enabled, exports include two audio processing stages:

| Stage | FFmpeg Filter | Purpose |
|-------|---------------|---------|
| Noise Reduction | afftdn=nf=-25 | Removes background noise (AC, fans, room tone) |
| LUFS Normalization | loudnorm=I=-14:TP=-1.5:LRA=11 | Adjusts loudness to -14 LUFS (YouTube/Spotify standard) |

Noise reduction runs first to avoid amplifying background noise during normalization.

Audio Censorship Options:

| Mode | Effect |
|------|--------|
| none | No audio censorship - profanity plays normally |
| mute | Silence audio during profanity regions |
| bleep | Play 1kHz tone during profanity regions |

Users can have up to 5 presets. One can be marked as default.

PreviewPreset

Saved preview/export styling settings:

class PreviewPreset:
    uuid: UUID
    user_id: int
    name: str
    is_default: bool

    # Format settings
    format: str | None            # "W:H" aspect ratio (e.g. "16:9", "9:16", "1:1", "4:3", "5:4"). None = original.
    background: str               # Letterbox color (#000000)

    # Caption styling
    caption_style: CaptionStyle   # default, minimal, bold
    caption_font: str             # sans, serif, mono (generic family ids)
    caption_size: int             # Font size in px (12-144)
    caption_position: CaptionPosition  # top, center, bottom
    caption_color: str            # Text color (#FFFFFF)
    caption_length: CaptionLength # short, medium, long (words per line)

    # Main video transforms
    video_flip_h: bool            # Flip main video horizontally
    video_flip_v: bool            # Flip main video vertically

Allows users to save and quickly switch between different export configurations.

format field semantics. Stored as a "W:H" string (e.g. "16:9", "9:16", "1:1", "4:3", "5:4") or None meaning "Original" (keep the source video's native ratio). Validated at the API boundary against FORMAT_RATIO_PATTERN = r"^[1-9]\d{0,1}:[1-9]\d{0,1}$" in modules/preview_preset/constants.py. The pattern rejects zero sides (prevents a 0:0 reaching any future downstream consumer that would divide by it) and caps each side at 99 so stored values stay ≤ 5 characters (no log pollution) and within the range of real video aspect ratios.

Was previously a 3-value enum (FormatRatio = {LANDSCAPE, PORTRAIT, SQUARE}). Widened because the mobile FormatPanel lets users pick custom ratios, and saving a preset on a custom ratio would 422. The enum and its PG type formatratio have been removed.

Security boundary. This field is UI metadata only — it round-trips between the DB and React state to restore previewAspectRatio on preset load. It does NOT reach FFmpeg arguments, shell commands, filesystem paths, or any other external consumer. If a future change adds such a consumer, the FORMAT_RATIO_PATTERN validation must be re-evaluated for that context.
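The documented pattern in action; only the helper function is illustrative.

```python
import re

# As documented for modules/preview_preset/constants.py: each side 1-99,
# zero sides rejected. None (handled by the caller) means "original ratio".
FORMAT_RATIO_PATTERN = r"^[1-9]\d{0,1}:[1-9]\d{0,1}$"

def is_valid_format(value: str) -> bool:
    return re.fullmatch(FORMAT_RATIO_PATTERN, value) is not None
```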

Billing & Entitlements

Payment

Tracks Stripe Checkout Sessions and payment records.

| Field | Type | Description |
|-------|------|-------------|
| user_id | FK → User | User who made the payment |
| price_id | FK → Price | Price being purchased |
| payment_type | Enum | ONE_TIME or SUBSCRIPTION |
| status | Enum | PENDING, SUCCEEDED, FAILED |
| amount | int | Amount in cents |
| stripe_checkout_session_id | str | Stripe Checkout Session ID |
| stripe_payment_intent_id | str | Stripe Payment Intent ID (used for chargeback matching) |
| stripe_customer_id | str | Stripe Customer ID |
| stripe_subscription_id | str | For subscription payments |

UserEntitlement

Flexible access control — grants credits, tier access, or features to users.

| Field | Type | Description |
|-------|------|-------------|
| user_id | FK → User | User who owns this entitlement |
| entitlement_type | Enum | SUBSCRIPTION, CREDIT_GRANT, TIER_ACCESS, FEATURE_UNLOCK |
| grant_reason | Enum | PURCHASE, SUBSCRIPTION, TRIAL, BONUS, etc. |
| tier_id | FK → Tier | For TIER_ACCESS entitlements |
| credit_type | str | For CREDIT_GRANT (e.g., "ai_minutes") |
| quantity_granted | int | Total credits granted |
| quantity_used | int | Credits consumed |
| consumption_type | Enum | NONE, DECREMENTAL, RENEWABLE |
| expires_at | datetime | When entitlement expires (null = permanent) |
| status | Enum | ACTIVE, INACTIVE, EXPIRED, SUSPENDED |

EntitlementTransaction

Authoritative append-only ledger for all credit events. quantity_used on UserEntitlement is a denormalized cache rebuildable from this table via rebuild_user_balance().

| Field | Type | Description |
|-------|------|-------------|
| user_id | FK → User | User |
| entitlement_id | FK → UserEntitlement | Which entitlement was affected |
| credit_type | Enum | AI_MINUTES, API_CALLS, etc. |
| amount | int | Positive for grants, negative for usage, 0 for resets |
| transaction_type | Enum | PURCHASE, GRANT, USAGE, RESET, TRIAL, BETA, REFUND |
| balance_before | int | Per-entitlement balance before transaction |
| balance_after | int | Per-entitlement balance after transaction |
| period | str | Period start date for RESET transactions (e.g., "2026-04-05") |
| pool_identifier | Enum | Credit pool (subscription_allowance, beta_program, etc.) |
| transaction_metadata | JSON | Extra context (e.g., {"usage_before_reset": 75}) |
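A sketch of the rebuild idea behind rebuild_user_balance(): derive usage by summing the negative ledger amounts per entitlement. Plain dicts stand in for ORM rows, and the exact semantics of the real function are an assumption.

```python
def rebuild_quantity_used(transactions: list[dict], entitlement_id: str) -> int:
    # Usage rows carry negative amounts; grants are positive and resets are 0,
    # so summing only the negatives reconstructs the consumed total.
    used = 0
    for tx in transactions:
        if tx["entitlement_id"] == entitlement_id and tx["amount"] < 0:
            used += -tx["amount"]
    return used
```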

Common Patterns

UUIDs Everywhere

All entities use UUIDs as public identifiers. Internal integer IDs exist but are never exposed via the API:

# Good - use UUID in API responses
{"uuid": "550e8400-e29b-41d4-a716-446655440000", ...}

# Bad - never expose internal IDs
{"id": 42, ...}

Soft Deletion

Entities aren't truly deleted - they're marked with deleted_at and is_deleted:

class PersistentDeletion:
    deleted_at: datetime | None
    is_deleted: bool = False

Queries filter out deleted records by default.

Timestamps

All entities track creation and update times:

class TimestampSchema:
    created_at: datetime
    updated_at: datetime | None

Key Files

| Component | Location |
|-----------|----------|
| Project model | backend/src/modules/project/models.py |
| Clip model | backend/src/modules/clip/models.py |
| Edit model | backend/src/modules/edit/models.py |
| Export model | backend/src/modules/export/models.py |
| Draft model | backend/src/modules/draft/models.py |
| Asset models | backend/src/modules/asset/models.py |
| CaptionLine model | backend/src/modules/caption_line/models.py |
| AnalysisRun model | backend/src/modules/analysis_run/models.py |
| AnalysisPreset model | backend/src/modules/preset/models.py |
| PreviewPreset model | backend/src/modules/preview_preset/models.py |
| Common schemas | backend/src/modules/common/schemas.py |
