# DataPage Workflow Documentation

## Overview
All parser pages (ClientCapture, Invoicer, Responder, ThankYou) extend the DataPage base class, which implements a shared 5-stage workflow that loads Client/Booking data from the cheapest available source: cache first, then the database, then a fresh parse.

## Goals
1. **Performance** - Avoid expensive LLM calls when data already exists
2. **User Experience** - Show cached/DB data instantly when available
3. **Data Freshness** - Parse current page when needed
4. **Flexibility** - RELOAD button forces fresh parse as escape hatch

## The 5-Stage Workflow

### STAGE 1: Load STATE from Cache
- Check chrome.storage for cached Client/Booking data
- **Why:** User may have just parsed this page, or switched from another parser page with same client
- **If found:** Proceed to STAGE 2
- **If not found:** Skip to STAGE 3
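
A minimal sketch of the cache lookup, assuming the promise-based `chrome.storage.local.get` available to Manifest V3 extensions. The key name `leedzState` and the `{ client, booking }` shape are hypothetical; this document does not specify them. The storage object is passed in so the function can be exercised outside an extension.

```javascript
// Hypothetical storage key; the real key name is not given in this document.
const STATE_KEY = 'leedzState';

// `storage` is injected; inside the extension, pass chrome.storage.local.
async function loadStateFromCache(storage) {
  // get(key) resolves to an object holding the requested key, if present.
  const items = await storage.get(STATE_KEY);
  const state = items[STATE_KEY];

  // Treat a missing or empty entry as "no STATE" (assumed shape: { client, booking }).
  if (!state || (!state.client && !state.booking)) {
    return null;
  }
  return state;
}
```

Inside an extension page this might be called as `await loadStateFromCache(chrome.storage.local)`; a `null` result corresponds to the "not found" branch.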

### STAGE 2: Search DB with STATE Data
- If STATE has a name or email, search the SQLite database
- **Why:** Client may exist in DB with historical bookings - show them instantly (green background)
- **If found:** Render DB data + STOP (early exit)
- **If not found but STATE exists:** Render STATE data + STOP (early exit)

### STAGE 3: Prelim Parse (Identity Only)
- Quick procedural parse that extracts ONLY name/email from the current page
- **Why:** Cheap way to identify who the current page is about (no LLM needed)
- **If succeeded:** Proceed to STAGE 4
- **If failed:** Proceed to STAGE 5
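
As an illustration of what a procedural (non-LLM) identity parse could look like, the sketch below pulls an email address and, when present, a display name in the `Name <email>` form out of page text. The function name, the return shape, and both regular expressions are assumptions made for this example, not the project's actual parser.

```javascript
// Simple email pattern for illustration; not a full RFC 5322 matcher.
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/;

// Capitalized name words directly followed by an <email> token.
const NAME_RE = /([A-Z][\w'.-]*(?: [A-Z][\w'.-]*)*)\s*<[^<>\s]+@[^<>\s]+>/;

// Extract identity fields only; no Booking data and no LLM call.
function prelimParse(pageText) {
  const emailMatch = pageText.match(EMAIL_RE);
  if (!emailMatch) {
    // Nothing identifiable on the page: the caller falls through to STAGE 5.
    return { success: false, identity: null };
  }
  const nameMatch = pageText.match(NAME_RE);
  return {
    success: true,
    identity: {
      name: nameMatch ? nameMatch[1] : null,
      email: emailMatch[0],
    },
  };
}
```

A real implementation would likely read structured DOM fields for each site rather than scanning raw text, but the contract is the same: succeed with an identity, or fail fast.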

### STAGE 4: Search DB with Prelim Parse Identity
- Search database with name/email from prelim parse
- **Why:** Client may exist in DB - show historical data instantly (green background)
- **If found:** Render DB data + STOP (early exit)
- **If not found:** Proceed to STAGE 5

### STAGE 5: Full LLM Parse
- Load parsers dynamically from leedz_config.json
- Match a parser to the current URL (Gmail, GCal, LinkedIn, etc.)
- Run full LLM extraction for Client + Booking data
- **Why:** No cache, no DB match - must parse current page
- Render fresh parsed data
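
The five stages above can be sketched as a single function. This is an outline under stated assumptions, not the actual DataPage code: the `deps` members (`loadState`, `searchDB`, `prelimParse`) are hypothetical names, STATE is assumed to look like `{ client: { name, email }, booking }`, and a STATE without name/email is assumed to be rendered as-is. The `page` argument supplies the four hooks listed under Subclass Implementation.

```javascript
// Returns which source was rendered: 'db', 'state', or 'parse'.
async function runWorkflow(page, deps) {
  // STAGE 1: load cached STATE.
  const state = await deps.loadState();

  if (state) {
    // STAGE 2: search the DB only when STATE carries a name or email.
    const client = state.client || {};
    const dbData = (client.name || client.email)
      ? await deps.searchDB({ name: client.name, email: client.email })
      : null;
    if (dbData) {
      await page.renderFromDB(dbData);   // early exit on DB match
      return 'db';
    }
    await page.renderFromState(state);   // early exit on cached STATE
    return 'state';
  }

  // STAGE 3: procedural identity parse (no LLM).
  const prelim = await deps.prelimParse();
  if (prelim.success) {
    // STAGE 4: search the DB with the identity found on the page.
    const dbData = await deps.searchDB(prelim.identity);
    if (dbData) {
      await page.renderFromDB(dbData);   // early exit on DB match
      return 'db';
    }
  }

  // STAGE 5: no cache and no DB match, so run the full LLM parse.
  const result = await page.fullParse();
  await page.renderFromParse(result);
  return 'parse';
}
```

Injecting the dependencies keeps the control flow testable without `chrome.storage`, SQLite, or an LLM.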

## RELOAD Button Behavior
- **Bypasses STAGES 1-4 completely**
- **Forces STAGE 5 only** - Full LLM parse with forceFullParse flag
- **Purpose:** Escape hatch to refresh data or add a NEW booking for an existing client
- **Implementation:** DataPage.cycleNextBooking() → fullParse() → reloadParser({ forceFullParse: true })
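
The call chain above can be traced with a small stand-in class. Only the method names and the `forceFullParse` flag come from this document; the class itself, the stubbed `reloadParser` body, and the choice to render inside `cycleNextBooking()` are assumptions for illustration.

```javascript
// Stand-in page that records calls; not the real DataPage.
class ReloadSketchPage {
  constructor() {
    this.calls = [];
  }

  // Hypothetical stub for the real parser entry point.
  async reloadParser(options) {
    this.calls.push(`reloadParser(forceFullParse=${options.forceFullParse})`);
    return { success: true, data: { client: null, booking: null } };
  }

  async fullParse() {
    this.calls.push('fullParse');
    return this.reloadParser({ forceFullParse: true });
  }

  async renderFromParse(parseResult) {
    this.calls.push('renderFromParse');
  }

  // RELOAD handler: goes straight to the full parse, never reading cache or DB.
  async cycleNextBooking() {
    const result = await this.fullParse();
    if (result.success) {
      await this.renderFromParse(result);
    }
    return result;
  }
}
```

After `await new ReloadSketchPage().cycleNextBooking()`, the recorded calls contain no cache or DB step, which is the bypass of STAGES 1-4 described above.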

## Key Design Decisions

### Early Exits on DB Match
- When client found in DB, workflow STOPS
- **Rationale:** Historical bookings are valuable - show them instantly with green visual cue
- **Trade-off:** Current page may have NEW booking that isn't parsed
- **Solution:** User sees green background (DB data), clicks RELOAD if current page has new booking

### DB Search Runs Twice (Max)
- Once with STATE data (STAGE 2)
- Once with prelim parse identity (STAGE 4)
- **Rationale:** Maximize chance of finding existing client before expensive LLM call
- **Performance:** DB queries are fast (<50ms), LLM calls are slow (2-5 seconds)

### Prelim Parse Before Full Parse
- Quick procedural extraction (no LLM) gets name/email
- **Current use:** Feeds into DB search (STAGE 4)
- **Planned optimization:** If the page identity matches STATE, skip the full parse
- **Limitation:** That optimization is not implemented yet (TODO)

### STATE Persistence Across Page Switches
- Switching pages (e.g. ClientCapture → ThankYou) preserves the same Client/Booking
- **Why:** User workflow often involves multiple parser pages for same client
- **Implementation:** chrome.storage.local caches STATE, STAGE 1 loads it
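
A sketch of the write side of this cache, assuming the promise-based `chrome.storage.local.set`. The default key `leedzState` is a hypothetical name, and `createMemoryStorage` is an in-memory stand-in that mimics only the two calls used here, so the round trip can be tried outside an extension.

```javascript
// Persist STATE so the next parser page can pick it up in STAGE 1.
// The default key name is hypothetical.
async function saveState(storage, state, key = 'leedzState') {
  await storage.set({ [key]: state });
}

// In-memory stand-in for chrome.storage.local (get/set with a single string key only).
function createMemoryStorage() {
  const data = {};
  return {
    async set(items) {
      Object.assign(data, items);
    },
    async get(key) {
      return key in data ? { [key]: data[key] } : {};
    },
  };
}
```

One page would call `saveState(chrome.storage.local, state)` after rendering; the page opened next reads the same key and starts from that Client/Booking instead of parsing again.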

## Subclass Implementation

Each parser page implements 4 hooks:

```javascript
// ExamplePage is a placeholder name; real pages are ClientCapture, Invoicer, etc.
class ExamplePage extends DataPage {
  async fullParse() {
    // Call reloadParser({ forceFullParse: true }) to skip prelim parse and DB search
    // Returns { success, data }
  }

  async renderFromState(stateData) {
    // Render cached STATE data (may be null)
  }

  async renderFromDB(dbData) {
    // Render DB client + bookings with green styling
  }

  async renderFromParse(parseResult) {
    // Render fresh LLM parse result
  }
}
```

DataPage orchestrates when each hook is called. Subclasses only implement rendering logic.
