Prompts¶
This page documents the LLM prompts used in Sapari for false start detection and validation. Understanding these prompts is essential for tuning detection quality.
Detection Prompt¶
The detection prompt asks the LLM to identify false starts in a transcript. It is designed to maximize cut size: find the earliest failed attempt and cut everything before the keeper (the last instance, which flows into new content).
System Prompt¶
You are a speech editor identifying FALSE STARTS in video transcripts.
A false start is when the speaker:
1. Starts saying something
2. Abandons it (stops, trails off, or restarts)
3. Says it again (same words or rephrased)
The prompt emphasizes finding patterns, not just word repetitions. The LLM needs to distinguish a false start such as "I think... I think we should" from intentional emphasis such as "Yes, yes, I understand."
Critical Rules¶
The prompt includes explicit rules to ensure consistent behavior:
- Maximize cut size: Find the EARLIEST failed attempt, cut to just before the keeper
- One cut per sequence: Never split multiple repetitions into separate cuts
- Keep the last instance: The version that flows into new content stays
- Precise indices: Use the [N]word indices for exact boundaries
Example in Prompt¶
The prompt includes a worked example with serial repetitions:
Plain text:
Isso é o PIB A riqueza... Isso é o PIB Isso é o PIB Isso é o PIB Isso é o PIB É a riqueza total
Indexed:
[0]Isso [1]é [2]o [3]PIB [4]A [5]riqueza... [6]Isso [7]é [8]o [9]PIB [10]Isso [11]é [12]o [13]PIB [14]Isso [15]é [16]o [17]PIB [18]Isso [19]é [20]o [21]PIB [22]É [23]a [24]riqueza [25]total
→ Cut words 0-17, keep "Isso é o PIB É a riqueza total"
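The cut in this example can be checked mechanically. The following is a minimal sketch, assuming whitespace tokenization and inclusive cut ranges; `apply_cuts` is a hypothetical helper for illustration, not a function from the Sapari codebase:

```python
def apply_cuts(words: list[str], cuts: list[tuple[int, int]]) -> str:
    """Drop every word whose index falls inside an inclusive (start, end) cut."""
    removed: set[int] = set()
    for start, end in cuts:
        removed.update(range(start, end + 1))
    return " ".join(w for i, w in enumerate(words) if i not in removed)


plain = (
    "Isso é o PIB A riqueza... Isso é o PIB Isso é o PIB "
    "Isso é o PIB Isso é o PIB É a riqueza total"
)
words = plain.split()  # assumption: one index per whitespace-separated token

# Cut words 0-17, as in the worked example above.
result = apply_cuts(words, [(0, 17)])
print(result)  # Isso é o PIB É a riqueza total
```

Only the last "Isso é o PIB" survives, which matches the "keep the last instance" rule.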
What's NOT a False Start¶
The prompt explicitly lists exceptions:
- Intentional emphasis ("Yes, yes, I understand")
- Rhetorical repetition for effect
- Same phrase in genuinely different contexts
Judge Prompt¶
The judge prompt performs quality control on proposed cuts. It receives the original transcript, proposed cuts, and the resulting text.
System Prompt¶
You are a speech editor performing a FINAL QUALITY CHECK on proposed cuts.
LOOK FOR:
- Remaining false starts that were MISSED
- Awkward transitions created by the cuts
- Cuts that removed too much or too little
- Fragmented cuts that should be combined
- Wrong keeper (should keep the LAST instance)
Verdict Options¶
The judge returns one of:
- "pass": Final text reads naturally
- "needs_refinement": Found issues to fix
The prompt tells the judge to be thorough but not overly critical: minor imperfections in natural speech are acceptable.
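Because only two verdicts are valid, the caller can reject anything else before acting on it. The sketch below assumes the judge replies with a JSON object carrying a `verdict` string and an optional `issues` list; those key names and the `parse_verdict` helper are assumptions for illustration, not the actual response schema:

```python
import json

VALID_VERDICTS = {"pass", "needs_refinement"}  # the two verdicts listed above


def parse_verdict(raw: str) -> tuple[str, list[str]]:
    """Parse a judge reply and reject verdicts outside the allowed set."""
    data = json.loads(raw)
    verdict = data.get("verdict")  # assumed key name
    if verdict not in VALID_VERDICTS:
        raise ValueError(f"unexpected verdict: {verdict!r}")
    issues = data.get("issues") or []  # assumed key name; may be absent
    return verdict, list(issues)


verdict, issues = parse_verdict(
    '{"verdict": "needs_refinement", "issues": ["missed false start at word 40"]}'
)
```

Failing loudly on an unknown verdict keeps a malformed reply from being silently treated as a pass.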
Refinement Prompt¶
When the judge finds issues, the refinement prompt asks for specific fixes.
System Prompt¶
You are a speech editor fixing issues identified in a transcript.
CRITICAL RULES:
1. CONSOLIDATE RELATED ISSUES INTO ONE CUT
- Never create multiple adjacent cuts with no words between them
2. ALWAYS KEEP THE LAST INSTANCE
- Cut should remove all repetitions EXCEPT the last one
- The last repetition flows best into what comes next
3. FOR NEW CUTS:
- Specify exact start_word_idx and end_word_idx
4. FOR ADJUSTMENTS:
- Reference the original_cut_idx
- Use action "remove" or "adjust"
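Rule 1 can also be enforced after the fact, in case the model still returns adjacent cuts. This is a minimal sketch, assuming cuts are inclusive `(start, end)` word-index pairs; `consolidate` is a hypothetical helper, not part of the Sapari codebase:

```python
def consolidate(cuts: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Merge cuts that overlap or touch (no kept word between them)."""
    merged: list[tuple[int, int]] = []
    for start, end in sorted(cuts):
        if merged and start <= merged[-1][1] + 1:
            # No word survives between the previous cut and this one.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


print(consolidate([(6, 9), (0, 5), (10, 17), (30, 32)]))
# [(0, 17), (30, 32)]
```

Cuts separated by at least one kept word, such as `(30, 32)` here, are left alone.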
Fix Types¶
The refinement can:
- Add new cuts: For missed false starts
- Remove cuts: For cuts that shouldn't have been made
- Adjust cuts: Change start/end boundaries
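The three fix types map naturally onto operations over the existing cut list. The sketch below uses the field names from the prompt (`start_word_idx`, `end_word_idx`, `original_cut_idx`, and the actions "remove" and "adjust"), but the dictionary shape, the convention that a fix without `original_cut_idx` is a new cut, and the `apply_fixes` helper are all assumptions for illustration:

```python
def apply_fixes(
    cuts: list[tuple[int, int]], fixes: list[dict]
) -> list[tuple[int, int]]:
    """Apply add / remove / adjust fixes to a list of inclusive cuts."""
    by_idx = dict(enumerate(cuts))  # original_cut_idx -> (start, end)
    new_cuts: list[tuple[int, int]] = []
    for fix in fixes:
        idx = fix.get("original_cut_idx")
        if idx is None:
            # Assumed convention: no original_cut_idx means a new cut.
            new_cuts.append((fix["start_word_idx"], fix["end_word_idx"]))
        elif idx not in by_idx:
            raise ValueError(f"unknown original_cut_idx: {idx}")
        elif fix["action"] == "remove":
            del by_idx[idx]
        elif fix["action"] == "adjust":
            by_idx[idx] = (fix["start_word_idx"], fix["end_word_idx"])
        else:
            raise ValueError(f"unknown action: {fix['action']!r}")
    return sorted(list(by_idx.values()) + new_cuts)


fixed = apply_fixes(
    [(0, 5), (20, 22)],
    [
        {"original_cut_idx": 0, "action": "adjust",
         "start_word_idx": 0, "end_word_idx": 17},
        {"original_cut_idx": 1, "action": "remove"},
        {"start_word_idx": 30, "end_word_idx": 33},
    ],
)
print(fixed)  # [(0, 17), (30, 33)]
```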
Prompt Design Principles¶
A few things we learned while developing these prompts:
Language Independence¶
All prompts include: "Always write your reasons and explanations in ENGLISH, regardless of the transcript language."
This ensures consistent output parsing even for non-English videos.
Indexed Text Format¶
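We provide both plain text (for understanding) and indexed text (for precision). In the detection example, the plain phrase "Isso é o PIB" appears in indexed form as "[0]Isso [1]é [2]o [3]PIB".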
The LLM reads the plain text to understand context, then uses indices for exact boundaries.
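The indexed form can be derived from the plain text in one pass. A minimal sketch, assuming words are split on whitespace (the real pipeline may instead take words from the transcription output); `to_indexed` is a hypothetical name:

```python
def to_indexed(plain: str) -> str:
    """Prefix each whitespace-separated word with its [N] index."""
    return " ".join(f"[{i}]{word}" for i, word in enumerate(plain.split()))


indexed = to_indexed("Isso é o PIB É a riqueza total")
print(indexed)
# [0]Isso [1]é [2]o [3]PIB [4]É [5]a [6]riqueza [7]total
```

Keeping the index attached to the word, with no space between them, means the model never has to count tokens itself.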
One-Shot Examples¶
Each prompt includes a worked example. In our experience, LLMs follow concrete examples more reliably than abstract instructions alone.
Explicit Anti-Patterns¶
Rather than just saying what to do, we explicitly list what NOT to do:
- Don't split sequences into multiple cuts
- Don't cut intentional repetition
- Don't keep the first instance instead of the last
Tuning Tips¶
If detection quality isn't good:
- Too many false positives? The prompt might be too aggressive. Add more "what's NOT a false start" examples.
- Missing obvious patterns? Add the pattern type to the PATTERN TYPES list with an example.
- Wrong cut boundaries? The indexed text might be confusing. Check that word indices align correctly.
- Inconsistent across languages? Add language-specific examples to the prompt.
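For the cut-boundary tip, index alignment can be verified before the prompt is sent. This is a sketch under stated assumptions: the indexed text uses the `[N]word` format shown earlier, words are whitespace-separated, and `check_alignment` is a hypothetical helper rather than an existing Sapari function:

```python
import re

TOKEN = re.compile(r"\[(\d+)\](\S+)")  # matches one "[N]word" token


def check_alignment(plain: str, indexed: str) -> list[str]:
    """Return a list of human-readable alignment problems (empty if none)."""
    problems: list[str] = []
    plain_words = plain.split()
    pairs = [(int(m.group(1)), m.group(2)) for m in TOKEN.finditer(indexed)]
    if len(pairs) != len(plain_words):
        problems.append(
            f"word count differs: {len(plain_words)} plain vs {len(pairs)} indexed"
        )
    for pos, (idx, word) in enumerate(pairs):
        if idx != pos:
            problems.append(f"position {pos} carries index {idx}")
        elif pos < len(plain_words) and plain_words[pos] != word:
            problems.append(f"index {idx}: {word!r} != {plain_words[pos]!r}")
    return problems


ok = check_alignment("Isso é o PIB", "[0]Isso [1]é [2]o [3]PIB")
bad = check_alignment("Isso é o PIB", "[0]Isso [2]é [3]o [4]PIB")
```

Here `ok` is empty, while `bad` reports the skipped index at every position after the first.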
Key Files¶
| Component | Location |
|---|---|
| Detection prompt | backend/src/workers/analysis/false_starts/detection/prompts.py |
| Judge prompt | backend/src/workers/analysis/false_starts/validation/prompts.py |
| Prompt templates | Referenced in step files |