# Go-Live Readiness Checklist (Story 6)

Gating artefact for moving the PoC from "stories merged" to "controlled go-live".
Every item must **pass** before go-live.

## Functional (deterministic mocks)
- [x] Happy path per Product Template (Physical, Coupon, Voucher) reaches `Done`.
- [x] Happy path per Product Type (Simple standalone, Product with Variant, Simple-as-variant).
- [x] Product-level second pass (completeness, cross-attribute, variant-group, aggregate ≥ 0.8).
- [x] `app:simulate` → `app:poll` → `app:worker` runs end-to-end.

## Edge cases
- [x] Grouping Product rejected with "Bundle Product Type not supported in v1", no AI calls.
- [x] Strict identifier mode: hallucinated GTIN dropped (`IDENTIFIER_NOT_VERBATIM_IN_SOURCE`).
- [x] GTIN checksum failure dropped (`GTIN_CHECKSUM_FAILED`).
- [x] Missing required attribute → `Review`, value not pushed.
- [x] Variant child whose parent is not enriched is re-queued.
- [x] Retry on `AQS < 0.8` (one extra attempt); if still below, logged and not pushed.
- [x] Transient AI/source error → exponential backoff; once retries are exhausted, the pass scores AQS 0 (RetryHelper test).
- [x] Per-product cost ceiling → remaining attributes skipped with `COST_CEILING_REACHED`.
- [x] Daily cost ceiling → kill-switch trips; in-flight products finish; manual reset on the dashboard.
- [ ] Terminal Ergonode push failure (3 retries/10 min) → Slack + `Review`; exercise against a fault-injecting Ergonode.
- [ ] CSV upload failure → still `Review`, Slack mentions the upload failure; exercise with fault injection.
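
The two identifier checks above can be illustrated with a minimal sketch. It assumes the strict mode means "the candidate must appear character-for-character in the source text" and that the checksum is the standard GS1 mod-10 check digit. The function names `gtin_checksum_ok` and `vet_identifier` are hypothetical and are not taken from the codebase; only the two reason codes come from this checklist.

```python
from __future__ import annotations


def gtin_checksum_ok(gtin: str) -> bool:
    """Return True if `gtin` is a GTIN-8/12/13/14 with a valid GS1 check digit."""
    if not (gtin.isascii() and gtin.isdigit()) or len(gtin) not in (8, 12, 13, 14):
        return False
    digits = [int(c) for c in gtin]
    body, check = digits[:-1], digits[-1]
    # Weights alternate 3, 1, 3, ... starting from the digit next to the check digit.
    total = sum(d * (3 if i % 2 == 0 else 1) for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check


def vet_identifier(candidate: str, source_text: str) -> tuple[bool, str | None]:
    """Hypothetical gate: returns (accepted, reason_code_if_dropped)."""
    # Strict mode (assumed semantics): the value must occur verbatim in the source.
    if candidate not in source_text:
        return False, "IDENTIFIER_NOT_VERBATIM_IN_SOURCE"
    if not gtin_checksum_ok(candidate):
        return False, "GTIN_CHECKSUM_FAILED"
    return True, None
```

Running the verbatim check first means a hallucinated value is reported as such even when its check digit happens to be valid; the real pipeline may order the checks differently.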

## Image pipeline
- [x] WebP at the configured product canvas (1200×1500) and mood canvas (1600×900), ≤ max file size.
- [x] Always-on creation produces hero/additional/mood within the configured count ranges.
- [x] White-on-white segmentation (real Vertex; mock honours a pre-masked fixture).
- [x] Forbidden-overlay rejection logs the matched item (`_overlay` marker / Vertex detector).
- [x] `MIN_COUNT_NOT_REACHED` logged when a category minimum cannot be met; product proceeds.
- [x] File names match the admin-configurable pattern.

## Provenance audit
- [x] Every pushed attribute/image traceable in the CSV to source_url, source_excerpt, model_id,
      prompt_version, manipulation_rule_version, timestamp.
- [x] Every accepted image row carries four non-empty alt texts (DE/EN/FR/IT).
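
A spot check of the provenance CSV could look roughly like the sketch below. The six provenance column names are the ones listed above; the `kind` discriminator column and the `alt_de`/`alt_en`/`alt_fr`/`alt_it` column names are assumptions made for illustration, since the actual CSV layout is not specified here.

```python
from __future__ import annotations

import csv
import io

PROVENANCE_FIELDS = (
    "source_url", "source_excerpt", "model_id",
    "prompt_version", "manipulation_rule_version", "timestamp",
)
# Hypothetical column names for the four alt texts (DE/EN/FR/IT).
ALT_TEXT_FIELDS = ("alt_de", "alt_en", "alt_fr", "alt_it")


def audit_rows(csv_text: str) -> list[str]:
    """Return one message per empty required field; an empty list means the audit passes."""
    problems = []
    for row_no, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=1):
        required = list(PROVENANCE_FIELDS)
        # Assumed discriminator: image rows additionally need all four alt texts.
        if row.get("kind") == "image":
            required += ALT_TEXT_FIELDS
        for field in required:
            if not (row.get(field) or "").strip():
                problems.append(f"row {row_no}: missing {field}")
    return problems
```

Whitespace-only values are treated as empty, which is stricter than a plain presence check and matches the "non-empty" wording of the alt-text item.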

## Admin Backoffice & config reload
- [x] Edit settings / rules / sources / flags / image specs → applied on the next product (no redeploy).
- [x] Versioning: editing creates a new active version; previous versions read-only.
- [x] Inline + server validation (image specs, settings) rejects invalid input (422).
- [x] Audit log records actor/target/before/after for every change and the kill-switch reset.
- [ ] axe-core in CI on both screens reports zero violations (add to the frontend CI step).
- [ ] Manual keyboard-only pass recorded per release.

## Observability
- [x] Dashboard shows queue depth, products/day, avg AQS per attribute, cost/product, failure rate.
- [x] Data freshness ≤ 5 minutes (real-time from the run tables; visible indicator on the page).
- [ ] Structured logs shipped to the Digt observability stack; no-logs alert fires after a 15-minute halt.

## Auth & rollback
- [x] Unauthenticated requests rejected (middleware redirect + API 401).
- [x] Google OAuth wired (enable by setting GOOGLE_CLIENT_ID/SECRET; dev login otherwise).
- [x] Feature-flag rollback: disabling a phase makes it a no-op; admin-backoffice flag hides the UI.

## Vertex AI (the final switch)
- [ ] Service account + key in `./secrets/vertex-sa.json`, `VERTEX_PROJECT_ID` set, `AI_PROVIDER=vertex`.
- [ ] `app:vertex:smoke` passes against the live project.
- [ ] Model selection per pipeline step confirmed with the architecture team.

## Load
- [ ] 48-hour sustained run at 500 products/day on the PoC env with no manual scaling
      (scale `worker` replicas; kill-switch trips only on the cost-ceiling configuration).

Legend: `[x]` covered by the automated suite / implemented; `[ ]` requires the live
environment, fault injection, or a CI step to tick off during the pilot.
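
Because every item has to be ticked before go-live, the gate itself can be checked mechanically. The sketch below lists the items that are still open, following the `[x]` / `[ ]` convention from the legend. The function name `open_items` is hypothetical, and the sketch only reads the first line of each item (indented continuation lines are ignored).

```python
from __future__ import annotations

import re

# Matches a top-level checklist item: "- [x] text" or "- [ ] text".
ITEM = re.compile(r"^- \[( |x)\] (.*)$")


def open_items(markdown: str) -> list[str]:
    """Return the first line of every unticked checklist item, in document order."""
    remaining = []
    for line in markdown.splitlines():
        match = ITEM.match(line)
        if match and match.group(1) == " ":
            remaining.append(match.group(2))
    return remaining
```

A CI step could read this file, call `open_items`, and fail the job when the returned list is non-empty; whether that is wired into the pipeline is left open here.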
