
Models

This page documents the core database models in Sapari. All models use SQLAlchemy with async support and follow a consistent pattern with UUIDs, timestamps, and soft deletion.

Core Entities

Sapari has six main entities that form the video editing workflow:

erDiagram
    User ||--o{ Project : owns
    User ||--o{ UserAsset : uploads
    User ||--o{ AssetGroup : creates
    Project ||--o{ Clip : contains
    Project ||--o{ Edit : has
    Project ||--o{ Draft : saves
    Project ||--o{ Export : renders

    Clip }o--|| ClipFile : references

    Draft }o--o{ Edit : overrides
    Export }o--o| Draft : "based on"

    UserAsset }o--|| AssetFile : references
    UserAsset }o--o{ AssetGroup : "belongs to (many-to-many)"

User

class User:
    id: int
    name: str                      # 2-30 chars
    username: str                  # 2-20 chars, lowercase alphanumeric, unique
    email: str                     # unique
    hashed_password: str
    profile_image_url: str | None
    tier_id: int | None            # FK to Tier
    is_superuser: bool
    email_verified: bool           # Must be True for password users to log in
    google_id: str | None          # Google OAuth user ID (unique)
    github_id: str | None          # GitHub OAuth user ID (unique)
    oauth_provider: str | None     # "google" or "github"
    stripe_customer_id: str | None # Stripe customer reference
    storage_used_bytes: int        # Cached counter of upload storage used (default 0)
    onboarding_seen: dict | None   # Tour keys dismissed, e.g. {"desktop_pipeline": true} (JSON, nullable)

Auth flow: Password-based users start with email_verified=False. A verification email is sent on signup. Login returns 403 until verified. OAuth users (Google/GitHub) get email_verified set from the provider and skip the verification step.

Storage quota: storage_used_bytes is a cached counter: incremented on upload confirm (clip/asset), decremented on delete (non-YouTube only, last reference). A daily reconciliation cron (reconcile_storage_usage, 3 AM) corrects drift. Quota is checked at presign time; exceeding the tier quota returns 422.
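The counter's lifecycle can be sketched as follows. The function and exception names here are illustrative, not the actual service code; the real logic lives in the upload confirm/delete services and the reconcile_storage_usage cron.

```python
class QuotaExceeded(Exception):
    """Raised at presign time when an upload would exceed the tier quota (surfaces as HTTP 422)."""

def check_quota(storage_used_bytes: int, upload_size_bytes: int, tier_quota_bytes: int) -> None:
    # Quota is checked before presigning the upload URL, not after the upload lands.
    if storage_used_bytes + upload_size_bytes > tier_quota_bytes:
        raise QuotaExceeded

def on_upload_confirmed(storage_used_bytes: int, size_bytes: int) -> int:
    # Increment the cached counter when a clip/asset upload is confirmed.
    return storage_used_bytes + size_bytes

def reconcile(actual_bytes_in_storage: int) -> int:
    # Daily cron: recompute from the source of truth to correct any drift.
    return actual_bytes_in_storage
```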

Project

A project is the top-level container for a video editing session. Users create projects, upload clips to them, and export edited videos.

class Project:
    uuid: UUID
    user_id: int
    name: str
    status: ProjectStatus  # created, analyzing, analyzed, rendering, complete, failed
    settings: dict         # pacing_level, silence_threshold_ms, language, etc.
    transcript: str | None # Full transcript text (copy from active run)
    active_run_id: UUID | None  # Currently active AnalysisRun
    error_message: str | None

Status Flow:

stateDiagram-v2
    [*] --> CREATED
    CREATED --> ANALYZING: Trigger analysis
    ANALYZING --> ANALYZED: Analysis complete
    ANALYZING --> FAILED: Analysis error
    ANALYZED --> RENDERING: Trigger render
    RENDERING --> COMPLETE: Render complete
    RENDERING --> FAILED: Render error
    COMPLETE --> [*]
    FAILED --> ANALYZING: Retry analysis
    FAILED --> RENDERING: Retry render
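The state diagram above can be expressed as a transition table. This is a sketch of how a service layer might guard transitions; where Sapari actually enforces this is an assumption.

```python
# Allowed project status transitions, transcribed from the state diagram.
ALLOWED_TRANSITIONS: dict[str, set[str]] = {
    "created": {"analyzing"},
    "analyzing": {"analyzed", "failed"},
    "analyzed": {"rendering"},
    "rendering": {"complete", "failed"},
    "failed": {"analyzing", "rendering"},  # retry either stage
    "complete": set(),                     # terminal
}

def can_transition(current: str, target: str) -> bool:
    return target in ALLOWED_TRANSITIONS.get(current, set())
```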

Clip

A clip is a video segment within a project. Projects can have multiple clips that get concatenated during render.

class Clip:
    uuid: UUID
    project_id: UUID
    clip_file_id: UUID     # Reference to shared ClipFile
    display_order: int     # Position in timeline
    status: ClipStatus     # pending, processing, ready, failed
    no_trim: bool          # Skip silence removal for this clip
    no_assets: bool        # Skip asset overlay
    no_subtitles: bool     # Skip subtitle generation

ClipFile

ClipFile represents the actual media file, separate from Clip. This allows the same video file to be used in multiple projects without re-downloading or re-processing.

class ClipFile:
    uuid: UUID
    youtube_video_id: str | None  # For metadata display (source tracking)
    sha256_hash: str | None       # For integrity verification
    storage_key: str              # R2 path to original video
    audio_key: str | None         # R2 path to extracted audio
    proxy_key: str | None         # R2 path to web-compatible proxy
    sprite_key: str | None        # R2 path to timeline scrub sprite (10x20 grid)
    sprite_seconds_per_tile: int | None  # Density chosen at sprite generation time
    waveform_json: list[float]    # Peak amplitudes for timeline
    duration_ms: int
    status: ClipFileStatus        # pending, processing, ready, failed

Each YouTube import creates a new ClipFile (per-user storage, no deduplication). The youtube_video_id field is kept for metadata and source tracking only.

flowchart TB
    A[Import YouTube URL] --> B[Download video]
    B --> C[Create new ClipFile]
    C --> D[Create Clip referencing ClipFile]

Edit

An edit represents a region to modify in the video: a silence, false start, or profanity region detected by analysis; a manual edit created by the user; or a keep region marking content to preserve.

class Edit:
    uuid: UUID
    project_id: UUID
    type: EditType       # silence, false_start, profanity, manual, asset, keep
    action: EditAction   # cut (remove video+audio), mute (silence audio), insert (asset), keep (preserve region)
    start_ms: int        # Edit start time (ms into combined project timeline)
    end_ms: int          # Edit end time (enforced end_ms <= SUM(clip_file.duration_ms) at service layer)
    active: bool         # Whether to apply this edit
    confidence: float    # Detection confidence (0-1)
    reason: str          # Full explanation (for debugging/logs)
    reason_tag: str      # Short tag for UI display (e.g., "Word gap", "Serial repetition")

    # Asset fields (only when type='asset')
    asset_file_id: UUID | None       # FK to UserAsset
    insert_source: InsertSource      # fixed, ai_directed, manual
    fixed_position: FixedPosition    # intro, outro, watermark, background_audio
    visual_mode: VisualMode          # replace, overlay, insert, none
    audio_mode: AudioMode            # original_only, asset_only, mix, none
    overlay_position: OverlayPosition  # 9-position grid (top_left, center, etc.)
    overlay_size_percent: int        # Overlay size as % of video (1-100)
    overlay_opacity_percent: int     # Overlay opacity (10-100)
    overlay_x: float                 # Custom X position (0-1, center-anchored)
    overlay_y: float                 # Custom Y position (0-1, center-anchored)
    overlay_flip_h: bool             # Flip overlay horizontally
    overlay_flip_v: bool             # Flip overlay vertically
    overlay_rotation_deg: int        # Rotation in degrees (0-360)
    audio_volume_percent: int        # Audio volume (0-100)
    audio_duck_main: bool            # Duck main audio during asset
    asset_offset_ms: int             # Offset into asset source (ms)

EditAction determines how the edit is applied:

| Action | Video | Audio | Use Case |
|--------|-------|-------|----------|
| CUT | Removed | Removed | Silence, false starts - skip entirely |
| MUTE | Keeps playing | Silenced/bleeped | Profanity - video continues, audio censored |

The action field is set by the backend during analysis - the frontend just uses it without needing to know the business logic. Profanity edits use MUTE, all others use CUT.

The reason field contains the full LLM explanation and is useful for debugging. The reason_tag is a short, formatted tag shown in the UI (converted from snake_case to "Sentence case" by the API).

Users can toggle active to include/exclude specific edits before rendering. Each edit belongs to the AnalysisRun that created it (including manual cuts, which belong to the active run at creation time).
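A minimal sketch of the type-to-action mapping described above. The function is illustrative; the real assignment happens in the analysis pipeline.

```python
def action_for(edit_type: str) -> str:
    # Profanity keeps the video playing with audio censored; asset edits
    # insert media; keep regions preserve content; everything else
    # (silence, false_start, manual) cuts both video and audio.
    return {"profanity": "mute", "asset": "insert", "keep": "keep"}.get(edit_type, "cut")
```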

AnalysisRun

Tracks a single analysis pipeline execution. Each re-analysis creates a new run — old runs' edits/captions/transcript are preserved.

class AnalysisRun:
    uuid: UUID
    project_id: UUID
    user_id: int
    status: AnalysisRunStatus  # pending, running, completed, failed
    pacing_level: int          # Settings used for this run
    false_start_sensitivity: int
    language: str | None
    transcript: str | None     # Owned by this run, copied to project when active
    transcript_words: list[dict] | None
    edit_count: int            # Snapshot counts from analysis time
    silence_count: int
    false_start_count: int
    credits_charged: int
    duration_ms: int | None

Run switching: project.active_run_id points to the current run. Edit, caption, and draft queries filter by this. POST /analysis-runs/{run_uuid}/activate switches runs — copies transcript to project, invalidates frontend caches.

Draft

A draft saves a specific configuration of edits for a project. Think of it as a "save state" that users can restore later.

class Draft:
    uuid: UUID
    project_id: UUID
    name: str
    edit_overrides: dict[str, EditOverride]  # Per-edit adjustments
    export_settings: dict                     # Resolution, aspect ratio, etc.
    is_default: bool                          # Auto-load this draft

The edit_overrides field stores adjustments to individual edits without modifying the original Edit records:

flowchart LR
    subgraph Original["Original Edit Records"]
        E1["Edit 1: 1000-2500ms, active"]
        E2["Edit 2: 5000-6200ms, active"]
    end

    subgraph Draft["Draft Overrides"]
        O1["Edit 1: start_ms=1200"]
        O2["Edit 2: active=false"]
    end

    subgraph Result["Applied State"]
        R1["Edit 1: 1200-2500ms, active"]
        R2["Edit 2: 5000-6200ms, inactive"]
    end

    E1 --> O1 --> R1
    E2 --> O2 --> R2

class EditOverride:
    active: bool         # Override active state
    start_ms: int | None # Override start time
    end_ms: int | None   # Override end time
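A sketch of how overrides merge with the original edits, mirroring the flowchart above. Plain dicts stand in for the ORM models; the merge logic itself is an assumption.

```python
def apply_overrides(edits: list[dict], overrides: dict[str, dict]) -> list[dict]:
    # Produce the applied state without mutating the original Edit records.
    applied = []
    for edit in edits:
        merged = dict(edit)
        override = overrides.get(edit["uuid"], {})
        for field in ("active", "start_ms", "end_ms"):
            if override.get(field) is not None:
                merged[field] = override[field]
        applied.append(merged)
    return applied
```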

Export

An export is a rendered video. Each export captures a snapshot of the edits at render time.

class Export:
    uuid: UUID
    project_id: UUID
    draft_id: UUID | None         # Source draft (optional)
    name: str
    edit_snapshot: dict           # Frozen copy of active edits
    settings_snapshot: dict       # Frozen export settings
    storage_key: str | None       # R2 path to rendered video
    status: ExportStatus          # pending, processing, complete, failed
    duration_ms: int | None       # Final video duration
    file_size_bytes: int | None

The snapshots mean you can keep editing while a render is in progress - it uses the frozen state.
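The freeze can be sketched as a deep copy of the active edits at render time. This is illustrative; the real snapshot is persisted as JSON in edit_snapshot.

```python
import copy

def snapshot_edits(edits: list[dict]) -> list[dict]:
    # Freeze only the currently active edits; later mutations to the live
    # edit list do not affect an in-flight render.
    return copy.deepcopy([e for e in edits if e.get("active")])
```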

Supporting Entities

UserAsset & AssetFile

User-uploaded media assets (images, videos, audio) for overlays and b-roll:

class AssetFile:
    uuid: UUID
    storage_key: str | None       # R2 path to file
    original_filename: str
    content_type: str
    size_bytes: int | None
    status: AssetFileStatus       # pending, uploaded, failed
    youtube_video_id: str | None  # For metadata display (source tracking)
    duration_ms: int | None       # Video/audio duration
    waveform_json: list[float] | None  # Peak amplitudes for timeline visualization
    metadata_json: dict | None    # Additional metadata (title, uploader, etc.)

class UserAsset:
    uuid: UUID
    user_id: int
    asset_file_id: UUID           # FK with CASCADE delete
    display_name: str
    tags: list[str]
    # Note: Groups via UserAssetGroupMembership (many-to-many)

Each asset import creates a new AssetFile (per-user storage, no deduplication). Deleting a UserAsset cascades to delete the underlying AssetFile and storage object. The youtube_video_id field is kept for metadata display only.

AssetGroup

Organizational folders for assets with AI instructions:

class AssetGroup:
    uuid: UUID
    user_id: int
    name: str
    description: str | None
    default_instructions: str | None  # AI hint for when to use assets
    is_default: bool                   # Auto-include in projects
    is_pinned: bool                    # Show at top of list
    display_order: int

UserAssetGroupMembership

Junction table for many-to-many asset-group relationships:

class UserAssetGroupMembership:
    user_asset_id: UUID   # FK to UserAsset (CASCADE delete)
    group_id: UUID        # FK to AssetGroup (CASCADE delete)
    added_at: datetime

This enables assets to belong to multiple groups simultaneously. "Copy to group" creates a membership link rather than duplicating the asset.

flowchart TB
    A[Asset: Logo.png] --> M1[Membership]
    A --> M2[Membership]
    M1 --> G1[Brand Assets]
    M2 --> G2[Intro Templates]

CaptionLine

Editable caption/subtitle lines generated from transcript:

class CaptionLine:
    uuid: UUID
    project_id: UUID              # FK to Project
    sequence: int                 # Order in transcript
    text: str                     # Editable caption text
    original_text: str            # Original from transcript
    start_ms: int                 # Start time
    end_ms: int                   # End time

Caption lines are generated from transcript words with configurable max_words per line (3-5 for vertical video, 7-10 for horizontal). Users can edit text while original_text preserves the original for comparison.
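A sketch of that grouping, assuming each transcript word dict carries text, start_ms, and end_ms (the field names are assumptions based on transcript_words above).

```python
def build_caption_lines(words: list[dict], max_words: int) -> list[dict]:
    # Chunk transcript words into caption lines of at most max_words each,
    # inheriting timing from the first and last word of the chunk.
    lines = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        text = " ".join(w["text"] for w in chunk)
        lines.append({
            "sequence": len(lines),
            "text": text,
            "original_text": text,  # preserved for later comparison
            "start_ms": chunk[0]["start_ms"],
            "end_ms": chunk[-1]["end_ms"],
        })
    return lines
```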

AnalysisPreset

Saved analysis settings that users can reuse across projects:

class AnalysisPreset:
    uuid: UUID
    user_id: int
    name: str
    is_default: bool              # Auto-apply this preset
    pacing_level: int             # 0-100, silence removal aggressiveness
    false_start_sensitivity: int  # 0-100, false start detection
    language: str | None          # ISO code (en, pt-BR) or null for auto
    audio_clean: bool             # Enable noise reduction + LUFS normalization
    audio_censorship: str         # 'none', 'mute', 'bleep' - profanity handling
    caption_censorship: bool      # Replace profanity with asterisks in captions
    director_notes: str | None    # Free-text AI instructions

Audio Clean Processing:

When audio_clean is enabled, exports include two audio processing stages:

| Stage | FFmpeg Filter | Purpose |
|-------|---------------|---------|
| Noise Reduction | afftdn=nf=-25 | Removes background noise (AC, fans, room tone) |
| LUFS Normalization | loudnorm=I=-14:TP=-1.5:LRA=11 | Adjusts loudness to -14 LUFS (YouTube/Spotify standard) |

Noise reduction runs first to avoid amplifying background noise during normalization.

Audio Censorship Options:

| Mode | Effect |
|------|--------|
| none | No audio censorship - profanity plays normally |
| mute | Silence audio during profanity regions |
| bleep | Play 1kHz tone during profanity regions |

Users can have up to 5 presets. One can be marked as default.

PreviewPreset

Saved preview/export styling settings:

class PreviewPreset:
    uuid: UUID
    user_id: int
    name: str
    is_default: bool

    # Format settings
    format: str | None            # "W:H" aspect ratio (e.g. "16:9", "9:16", "1:1", "4:3", "5:4"). None = original.
    background: str               # Letterbox color (#000000)

    # Caption styling
    caption_style: CaptionStyle   # default, minimal, bold
    caption_font: str             # sans, serif, mono (generic family ids)
    caption_size: int             # Font size in px (12-144)
    caption_position: CaptionPosition  # top, center, bottom
    caption_color: str            # Text color (#FFFFFF)
    caption_length: CaptionLength # short, medium, long (words per line)

    # Main video transforms
    video_flip_h: bool            # Flip main video horizontally
    video_flip_v: bool            # Flip main video vertically

Allows users to save and quickly switch between different export configurations.

format field semantics. Stored as a "W:H" string (e.g. "16:9", "9:16", "1:1", "4:3", "5:4") or None meaning "Original" (keep the source video's native ratio). Validated at the API boundary against FORMAT_RATIO_PATTERN = r"^[1-9]\d{0,1}:[1-9]\d{0,1}$" in modules/preview_preset/constants.py. The pattern rejects zero sides (prevents a 0:0 reaching any future downstream consumer that would divide by it) and caps each side at 99 so stored values stay ≤ 5 characters (no log pollution) and within the range of real video aspect ratios.

Was previously a 3-value enum (FormatRatio = {LANDSCAPE, PORTRAIT, SQUARE}). Widened because the mobile FormatPanel lets users pick custom ratios, and saving a preset on a custom ratio would 422. The enum and its PG type formatratio have been removed.

Security boundary. This field is UI metadata only — it round-trips between the DB and React state to restore previewAspectRatio on preset load. It does NOT reach FFmpeg arguments, shell commands, filesystem paths, or any other external consumer. If a future change adds such a consumer, the FORMAT_RATIO_PATTERN validation must be re-evaluated for that context.
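The documented pattern in action; only the helper function is illustrative.

```python
import re

# As documented for modules/preview_preset/constants.py: each side 1-99,
# zero sides rejected. None (handled by the caller) means "original ratio".
FORMAT_RATIO_PATTERN = r"^[1-9]\d{0,1}:[1-9]\d{0,1}$"

def is_valid_format(value: str) -> bool:
    return re.fullmatch(FORMAT_RATIO_PATTERN, value) is not None
```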

Billing & Entitlements

Payment

Tracks Stripe Checkout Sessions and payment records.

| Field | Type | Description |
|-------|------|-------------|
| user_id | FK → User | User who made the payment |
| price_id | FK → Price | Price being purchased |
| payment_type | Enum | ONE_TIME or SUBSCRIPTION |
| status | Enum | PENDING, SUCCEEDED, FAILED |
| amount | int | Amount in cents |
| stripe_checkout_session_id | str | Stripe Checkout Session ID |
| stripe_payment_intent_id | str | Stripe Payment Intent ID (used for chargeback matching) |
| stripe_customer_id | str | Stripe Customer ID |
| stripe_subscription_id | str | For subscription payments |

UserEntitlement

Flexible access control — grants credits, tier access, or features to users.

| Field | Type | Description |
|-------|------|-------------|
| user_id | FK → User | User who owns this entitlement |
| entitlement_type | Enum | SUBSCRIPTION, CREDIT_GRANT, TIER_ACCESS, FEATURE_UNLOCK |
| grant_reason | Enum | PURCHASE, SUBSCRIPTION, TRIAL, BONUS, etc. |
| tier_id | FK → Tier | For TIER_ACCESS entitlements |
| credit_type | str | For CREDIT_GRANT (e.g., "ai_minutes") |
| quantity_granted | int | Total credits granted |
| quantity_used | int | Credits consumed |
| consumption_type | Enum | NONE, DECREMENTAL, RENEWABLE |
| expires_at | datetime | When entitlement expires (null = permanent) |
| status | Enum | ACTIVE, INACTIVE, EXPIRED, SUSPENDED |

EntitlementTransaction

Authoritative append-only ledger for all credit events. quantity_used on UserEntitlement is a denormalized cache rebuildable from this table via rebuild_user_balance().

| Field | Type | Description |
|-------|------|-------------|
| user_id | FK → User | User |
| entitlement_id | FK → UserEntitlement | Which entitlement was affected |
| credit_type | Enum | AI_MINUTES, API_CALLS, etc. |
| amount | int | Positive for grants, negative for usage, 0 for resets |
| transaction_type | Enum | PURCHASE, GRANT, USAGE, RESET, TRIAL, BETA, REFUND |
| balance_before | int | Per-entitlement balance before transaction |
| balance_after | int | Per-entitlement balance after transaction |
| period | str | Period start date for RESET transactions (e.g., "2026-04-05") |
| pool_identifier | Enum | Credit pool (subscription_allowance, beta_program, etc.) |
| transaction_metadata | JSON | Extra context (e.g., {"usage_before_reset": 75}) |
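A sketch of the rebuild idea behind rebuild_user_balance(): derive usage by summing the negative ledger amounts per entitlement. Plain dicts stand in for ORM rows, and the exact semantics of the real function are an assumption.

```python
def rebuild_quantity_used(transactions: list[dict], entitlement_id: str) -> int:
    # Usage rows carry negative amounts; grants are positive and resets are 0,
    # so summing only the negatives reconstructs the consumed total.
    used = 0
    for tx in transactions:
        if tx["entitlement_id"] == entitlement_id and tx["amount"] < 0:
            used += -tx["amount"]
    return used
```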

Common Patterns

UUIDs Everywhere

All entities use UUIDs as public identifiers. Internal integer IDs exist but are never exposed via the API:

# Good - use UUID in API responses
{"uuid": "550e8400-e29b-41d4-a716-446655440000", ...}

# Bad - never expose internal IDs
{"id": 42, ...}

Soft Deletion

Entities aren't truly deleted - they're marked with deleted_at and is_deleted:

class PersistentDeletion:
    deleted_at: datetime | None
    is_deleted: bool = False

Queries filter out deleted records by default.

Timestamps

All entities track creation and update times:

class TimestampSchema:
    created_at: datetime
    updated_at: datetime | None

Key Files

| Component | Location |
|-----------|----------|
| Project model | backend/src/modules/project/models.py |
| Clip model | backend/src/modules/clip/models.py |
| Edit model | backend/src/modules/edit/models.py |
| Export model | backend/src/modules/export/models.py |
| Draft model | backend/src/modules/draft/models.py |
| Asset models | backend/src/modules/asset/models.py |
| CaptionLine model | backend/src/modules/caption_line/models.py |
| AnalysisRun model | backend/src/modules/analysis_run/models.py |
| AnalysisPreset model | backend/src/modules/preset/models.py |
| PreviewPreset model | backend/src/modules/preview_preset/models.py |
| Common schemas | backend/src/modules/common/schemas.py |
