E-commerce imaging

PixelLift

Make Listings Irresistible Instantly

PixelLift automatically enhances and styles e-commerce product photos for independent online sellers and boutique owners who batch-upload catalogs, delivering studio-quality, brand-consistent images in minutes. Its AI retouches images, removes backgrounds, and applies one-click style presets to batch-process hundreds of photos, cutting editing time by up to 80% and boosting listing conversions 10–20%.



Product Details

Explore this AI-generated product idea in detail. Each aspect has been thoughtfully created to inspire your next venture.

Vision & Mission

Vision
Empower independent sellers to instantly showcase studio-quality, brand-consistent product photos that boost buyer trust and sales.
Long Term Goal
Within 4 years, empower 100,000 independent sellers to increase listing conversions by 15%, processing one billion product images annually to standardize brand visuals.
Impact
Cuts image-editing time by up to 80% and photo costs by 50–70% for independent online sellers and boutique owners, processing 200–500 images per hour to standardize brand visuals, increase listing conversion rates 10–20%, and accelerate catalog updates by 30%.

Problem & Solution

Problem Statement
Independent online sellers and boutique owners who batch-upload catalogs struggle with inconsistent, amateur product photos and costly, time-consuming manual edits; templates fail to enforce brand-consistent visuals and hiring photographers is prohibitively expensive.
Solution Overview
PixelLift uses AI-driven retouching to remove backgrounds, correct lighting, and align composition, combined with one-click brand style-presets to batch-process hundreds of product photos into studio-quality, brand-consistent listings in minutes per upload.

Details & Audience

Description
PixelLift automatically enhances and styles e-commerce product photos to deliver studio-quality imagery. It serves independent online sellers and boutique owners who upload catalogs frequently. PixelLift reduces editing time by up to 80%, standardizes visuals across SKUs, and increases listing conversions. Its signature AI style-presets preserve brand identity and batch-process hundreds of images with one click.
Target Audience
Independent online sellers and boutique owners (ages 20–45) who batch-upload catalogs and need fast, affordable, brand-consistent photos.
Inspiration
At a crowded weekend market I watched a solo jewelry maker perch her phone on a mug, angle two desk lamps, and use blue painter's tape to hide harsh shadows. She uploaded dozens of imperfect photos that afternoon and watched customers pass by. Seeing exhaustion and lost sales in that tiny setup sparked PixelLift: an automated, brand-safe photo tool that creates studio-quality, consistent images in minutes.

User Personas

Detailed profiles of the target users who would benefit most from this product.


Automation Architect Avery

- 29–37, US/EU tech hubs
- Ecommerce ops/automation lead at high-SKU DTC or marketplace brand
- CS/IS degree; 5–8 years scripting and integrations
- Manages tooling budgets up to $1.5k/month

Background

Started as a support analyst scripting scrapers, moved into DTC ops. Maintaining brittle Photoshop actions and RPA for images pushed them to find an API-first alternative.

Needs & Pain Points

Needs

1. Reliable API for large batch image processing
2. Webhooks and logs for end-to-end observability
3. Fine-grained presets manageable via version control

Pain Points

1. Brittle scripts break on edge-case images
2. Slow, manual approvals bottleneck daily listings
3. Vendor rate limits derail launch timelines

Psychographics

- Automates repeatable tasks on principle
- Values reliability, observability, and clear SLAs
- Prefers APIs over GUIs whenever possible
- Experiments, but demands rollback safety

Channels

1. GitHub — sample repos
2. Stack Overflow — API fixes
3. LinkedIn — ops community
4. YouTube — automation tutorials
5. Reddit r/ecommerce — tooling chatter


Recommerce Refiner Riley

- 26–42, urban US/UK; 5–10 staff recommerce shop
- 200–600 listings/week across apparel, electronics, home goods
- Warehouse photo corner; smartphones, light tents, rolling racks
- Lean budget; prefers predictable SaaS under $500/month

Background

Started on Poshmark, grew into a brick-and-click consignment store. Learned inconsistent photos tank sell-through and trigger returns, demanding faster, cleaner intake imagery.

Needs & Pain Points

Needs

1. One-tap cleanup for chaotic backgrounds
2. Presets for common used-goods defects
3. Bulk processing tied to SKU barcodes

Pain Points

1. Inconsistent lighting across stations ruins cohesion
2. Background clutter triggers listing rejections
3. Manual edits choke intake volume

Psychographics

- Pragmatic speed over perfection
- Cares about trustworthy, honest visuals
- Obsessed with throughput and rejection rates
- Values tools staff learn instantly

Channels

1. Facebook Groups — reseller tips
2. YouTube — reseller workflows
3. Instagram — shop updates
4. TikTok — sourcing content
5. Reddit r/Flipping — operations advice


Test-and-Tune Taylor

- 27–35, DTC brand growth lead
- Manages Shopify storefront and Meta/Google ads
- Owns CAC/ROAS and PDP conversion goals
- Tooling budget: $1k–$3k/month

Background

Came from performance media buying; learned that creative impacts results more than bids do. Built a weekly habit of iterating on PDP image assets.

Needs & Pain Points

Needs

1. Rapid variant generation from master shots
2. UTM mapping to A/B test frameworks
3. Preset libraries for seasonal campaigns

Pain Points

1. Designers backlogged, slowing experiments
2. Hard to attribute wins to image changes
3. Manual re-uploads across channels

Psychographics

- Data-first, yet aesthetics-savvy
- Loves rapid experiments and clear winners
- Values brand consistency across variants
- Rejects tools lacking built-in analytics

Channels

1. Google Analytics — performance dashboards
2. Meta Ads Manager — creative testing
3. LinkedIn — growth threads
4. Twitter/X — performance tips
5. YouTube — CRO case studies


Studio-Streamliner Sam

- 32–48, studio owner/manager
- Serves SMB and mid-market ecommerce clients
- 500–2,000 images/week; 3–8 staff photographers
- Uses Capture One, Lightroom, tethered setups

Background

Started as a retoucher burning nights on clipping paths. Scaling exposed post-production and client approval bottlenecks, demanding reliable batch automation.

Needs & Pain Points

Needs

1. Batch presets matching studio lighting profiles
2. Client review links with auto-approvals
3. Non-destructive edits exportable to PSD

Pain Points

1. Manual clipping eats margin and morale
2. Client revisions stall delivery timelines
3. Inconsistent crops across shooters

Psychographics

- Craftsman mindset, production pragmatism
- Protects visual quality and client trust
- Seeks predictable timelines and margins
- Embraces automation that respects artistry

Channels

1. Instagram — portfolio sharing
2. YouTube — studio workflows
3. Fstoppers — technique articles
4. LinkedIn — client acquisition
5. Photo District News — industry news


Line-Sheet Lila

- 30–45, wholesale/merchandising lead at fashion or home brands
- Publishes quarterly catalogs; 300–800 SKUs
- Works in Airtable, InDesign, NuORDER/Faire
- Coordinates factories, studios, and sales reps

Background

Rose from merchandising assistant to owning the wholesale calendar. Orders missed because of sloppy, inconsistent catalog imagery pushed her to systematize the image pipeline.

Needs & Pain Points

Needs

1. Consistent crops/aspect ratios for PDFs
2. Color-preserving background removal
3. Bulk renaming to SKU conventions

Pain Points

1. Crops misalign across pages
2. Colors shift after export
3. Last-minute reshoots wreck schedules

Psychographics

- Deadline-driven, detail-obsessed
- Values color accuracy and sizing consistency
- Prefers templates and reusable systems
- Avoids risky last-minute changes

Channels

1. LinkedIn — wholesale groups
2. Email — vendor updates
3. NuORDER — buyer portal tips
4. YouTube — InDesign tutorials
5. Faire Community — best practices


Private-Label Polisher Priya

- 28–40, manages 3–6 private-label lines
- Sells on Amazon, Walmart, and Shopify
- Coordinates with overseas suppliers; variable asset quality
- Flexible budget when ROI is clear

Background

Began in sourcing, learned design basics to compensate for bad assets. Now prioritizes repeatable styling that elevates commodity products across channels.

Needs & Pain Points

Needs

1. Supplier-specific preset bundles
2. Compliance modes for Amazon/Walmart simultaneously
3. Auto-crop to each channel’s template

Pain Points

1. Mixed lighting, glare, and resolution issues
2. Rejections from conflicting marketplace standards
3. Costly reshoots for minor fixes

Psychographics

- Brand-first pragmatist
- Hates visual inconsistency across listings
- Chooses scalable, repeatable processes
- Tracks conversion and returns closely

Channels

1. Amazon Seller Forums — policy changes
2. LinkedIn — private-label groups
3. YouTube — listing optimization
4. Helium 10 — research community
5. WhatsApp — supplier coordination

Product Features

Key capabilities that make this product valuable to its target users.

Role Matrix

A visual, brand-scoped permission builder that lets admins define who can view, apply, edit, approve, or publish presets per brand or collection. Simulate a user’s access before saving to prevent misconfigurations, speed onboarding, and protect brand integrity.

Requirements

Brand & Collection-Scoped Permission Model
"As a brand admin, I want to assign permissions by brand and collection so that each team member only sees and acts on relevant presets."
Description

Establish a granular, brand- and collection-scoped permission model that defines allowed actions—view, apply, edit, approve, publish—on resources like Style Presets, Collections, and Batch Jobs. Support role-based assignment with scoped overrides, brand isolation (multi-tenant), and inheritance rules (brand → collection). Provide an efficient evaluation engine with caching and indexing to resolve effective permissions at request time across the app and API. Include default system roles and migration of existing users into the new model.

Acceptance Criteria
Assign brand-level role with collection override
Given User U has role "Preset Editor" at Brand B
And Collection C belongs to Brand B
And a collection-level override on C grants "approve" and denies "publish" for U
When U requests effective permissions for Preset P in Collection C via API
Then the response includes allows: ["view","apply","edit","approve"] and denies: ["publish"]
And POST /presets/{id}/publish returns 403 for Preset P in Collection C
And POST /presets/{id}/approve returns 200 for Preset P in Collection C
And for Preset Q in Collection D (same brand, no override), allows: ["view","apply","edit"] and denies: ["approve","publish"]
And app UI buttons reflect the same permissions (approve enabled, publish disabled in C; approve/publish disabled in D)
Default system roles and permissions matrix
Rule: The following default roles exist and are assignable at brand scope with optional collection overrides
- Brand Admin: allow {view, apply, edit, approve, publish} on {Style Presets, Collections, Batch Jobs} within brand; allow manage role assignments within brand
- Preset Editor: allow {view, apply, edit} on {Style Presets, Batch Jobs}; view-only on Collections; deny {approve, publish}
- Approver: allow {view, apply, approve} on {Style Presets, Batch Jobs}; view-only on Collections; deny {edit, publish}
- Publisher: allow {view, apply, publish} on {Style Presets, Batch Jobs}; view-only on Collections; deny {edit, approve}
- Operator: allow {view, apply} on {Batch Jobs using only approved presets}; view-only on {Style Presets, Collections}; deny {edit, approve, publish}
- Viewer: allow {view} only on all resources in scope; deny {apply, edit, approve, publish}
Validation:
- An Operator creating a Batch Job with any unapproved preset returns 403
- A Publisher cannot approve a preset (returns 403) but can publish an approved preset (200)
- A Viewer cannot access POST/PUT/PATCH/DELETE endpoints for any resource (403), and UI shows no action controls
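A matrix like the one above lends itself to table-driven enforcement. The sketch below encodes the default roles as plain data with a lookup helper; the dictionary shape and the `can` helper are illustrative assumptions, not PixelLift's actual data model.

```python
# Hedged sketch: the default-role matrix encoded as data, so permission checks
# become table lookups. Role and resource names follow the rule text above.
DEFAULT_ROLES = {
    "Brand Admin":   {"Style Presets": {"view", "apply", "edit", "approve", "publish"},
                      "Collections":   {"view", "apply", "edit", "approve", "publish"},
                      "Batch Jobs":    {"view", "apply", "edit", "approve", "publish"}},
    "Preset Editor": {"Style Presets": {"view", "apply", "edit"},
                      "Collections":   {"view"},
                      "Batch Jobs":    {"view", "apply", "edit"}},
    "Approver":      {"Style Presets": {"view", "apply", "approve"},
                      "Collections":   {"view"},
                      "Batch Jobs":    {"view", "apply", "approve"}},
    "Publisher":     {"Style Presets": {"view", "apply", "publish"},
                      "Collections":   {"view"},
                      "Batch Jobs":    {"view", "apply", "publish"}},
    # Note: the Operator's "only approved presets" constraint on Batch Jobs
    # needs a runtime check on the preset's status, beyond this static table.
    "Operator":      {"Style Presets": {"view"},
                      "Collections":   {"view"},
                      "Batch Jobs":    {"view", "apply"}},
    "Viewer":        {"Style Presets": {"view"},
                      "Collections":   {"view"},
                      "Batch Jobs":    {"view"}},
}

def can(role: str, action: str, resource: str) -> bool:
    """True if the default role grants the action on the resource type."""
    return action in DEFAULT_ROLES.get(role, {}).get(resource, set())

# Spot-check the validation bullets above:
assert not can("Publisher", "approve", "Style Presets")  # 403
assert can("Publisher", "publish", "Style Presets")      # 200 on an approved preset
assert not can("Viewer", "edit", "Collections")
```

Keeping the matrix as data (rather than branching code) also makes the "default deny" rule automatic: any role/resource/action combination missing from the table evaluates to no access.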
Brand isolation across tenants
Given User U has assignments only within Brand A and none in Brand B
When U calls GET /brands/B/style-presets or GET /style-presets?brand=B
Then the API returns 403 Forbidden
When U calls GET /style-presets without brand filter
Then only Brand A presets are returned and total/count reflect Brand A data only
And direct access to any resource by ID that belongs to Brand B returns 403
And cross-brand aggregation endpoints exclude Brand B data for U
Permission evaluation engine performance and cache invalidation
Given a steady-state load of ≥500 permission checks/second across ≥1,000 users and ≥5 brands
When the system warms up
Then p95 evaluation latency ≤25 ms and p99 ≤50 ms for cached checks
And on cold cache (first check after deploy) p95 ≤150 ms
And steady-state cache hit ratio ≥85%
When an admin changes a user’s role or override
Then all affected effective-permission cache entries are invalidated within 2 seconds across nodes
And the very next check reflects the change (no stale allows beyond 2 seconds)
And error rate from permission evaluation <0.1% over a 30-minute soak test
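One common way to meet a tight invalidation deadline without scanning the cache is version-stamped keys: bump a per-user version on any role change, and stale entries simply stop being reachable. The class below is a single-node sketch under that assumption; the names (`PermissionCache`, `invalidate`, `effective`) are illustrative, and a multi-node deployment would propagate the version bump via a shared store.

```python
# Hedged sketch: version-stamped permission cache. Invalidation is O(1) —
# bumping the user's version makes all previous cache keys unreachable.
import time

class PermissionCache:
    def __init__(self, compute, ttl_seconds=300):
        self._compute = compute   # (user, brand, collection) -> permissions
        self._ttl = ttl_seconds
        self._versions = {}       # user -> monotonically increasing int
        self._store = {}          # (user, version, brand, collection) -> (perms, expiry)

    def _version(self, user):
        return self._versions.get(user, 0)

    def invalidate(self, user):
        # Role or override changed: old keys become stale immediately.
        self._versions[user] = self._version(user) + 1

    def effective(self, user, brand, collection=None):
        key = (user, self._version(user), brand, collection)
        hit = self._store.get(key)
        if hit and hit[1] > time.monotonic():
            return hit[0]
        perms = self._compute(user, brand, collection)
        self._store[key] = (perms, time.monotonic() + self._ttl)
        return perms

# Usage: the backing computation runs only on misses and after invalidation.
calls = []
def compute(user, brand, collection):
    calls.append(user)
    return {"allows": ["view"]}

cache = PermissionCache(compute)
cache.effective("u1", "B")   # miss: computes
cache.effective("u1", "B")   # hit: served from cache
cache.invalidate("u1")       # admin changed u1's role
cache.effective("u1", "B")   # recomputed with the new assignments
assert len(calls) == 2
```

The TTL is a backstop against missed invalidations; correctness relies on the version bump, which is why the criteria above can demand that "the very next check reflects the change."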
Resource-level enforcement across app and API
Given a user lacking a required action on a specific resource
When they attempt that action via UI or API
Then UI control is disabled/hidden and API responds 403 with error_code "ERR_FORBIDDEN" and reason "missing_permission:<action>"
Validation matrix:
- Style Presets: view | apply | edit | approve | publish enforced per effective permissions
- Collections: view enforced for all roles with scope; edit restricted to Brand Admin; approve/publish not exposed for Collections unless explicitly permitted by role (default deny)
- Batch Jobs: view | apply | publish enforced per effective permissions; edit limited to job owner with edit permission or Brand Admin
And list endpoints exclude resources the user cannot view; counts reflect only viewable resources
Permission precedence and inheritance resolution
Rules:
- Default: no assignment implies no access
- Same-scope multiple roles: effective allows = union(allows); effective denies = union(denies); any deny overrides corresponding allow
- Cross-scope precedence: collection overrides take precedence over brand assignments; effective permissions at collection = (brand allows ∪ collection allows) − (brand denies ∪ collection denies); any explicit deny wins
- Cross-brand isolation: roles from one brand never apply to another
- Mixed outcomes are consistent across UI and API for the same user/resource/action
Migration of existing users into new model
Given a pre-migration snapshot of each user’s effective permissions by brand and collection
And default system roles are available
When the migration script executes on staging against production-like data
Then 100% of users retain identical effective permissions post-migration (no loss, no escalation)
And 0 users gain access to new brands/collections they did not previously access
And 0 users are left unmapped; otherwise migration fails with a report
And the migration is idempotent: re-running produces no changes
And a production run completes within the maintenance window and logs success for 100% of assignments
Visual Role Matrix Builder UI
"As an admin, I want a clear matrix UI to manage role permissions so that I can configure access quickly and confidently."
Description

Deliver a visual matrix builder that lets admins assign actions to roles across brands and collections through a grid UI. Include bulk select, drag-to-fill, search/filter by user, role, brand, collection, and resource type, plus keyboard accessibility and WCAG AA compliance. Show immediate visual cues for inherited vs. explicit permissions and pending changes. Support draft mode with reversible changes and clear save/cancel, and integrate with the simulator for instant previews.

Acceptance Criteria
Grid-based assignment of actions to roles per brand/collection
Given an admin opens the Role Matrix for a selected scope (All Brands/Specific Brand/Specific Collection)
When the matrix loads
Then the grid displays roles as rows and actions (View, Apply, Edit, Approve, Publish) across the selected resources with current effective states visible
And empty/error states are not shown when data is available
Given roles, brands, and collections exist
When the admin toggles a permission cell
Then the cell reflects the new state immediately in the UI as a pending explicit change without persisting to the server
And a pending-changes counter increments
Given at least one pending change exists
When the admin clicks Save
Then all pending changes are persisted and the UI updates to show them as explicit (not pending)
And a success toast appears within 2 seconds
Given a save attempt is made
When the server responds with an error
Then an error message is displayed with retry
And no changes are persisted and pending indicators remain
Given up to 50 roles and 200 resources
When the matrix first renders
Then time to interactive is under 2 seconds on a standard laptop (i5+, 8GB RAM, latest Chrome)
Bulk select and drag-to-fill permissions
Given the matrix is visible
When the admin clicks a row header (role) or column header (action)
Then all visible cells in that row/column are selected for bulk edit
Given multiple cells are selected via mouse drag or Shift+Arrow keys
When the admin presses Space/Enter or uses the bulk toggle control
Then the selected cells all change to the chosen state and are marked as pending
And a confirmation chip shows the number of cells affected
Given some selected cells are already explicitly set
When a bulk operation is applied
Then existing explicit states are overwritten to the new explicit state
And inherited states become explicit with the new value
Given a bulk operation was just applied
When the admin presses Ctrl+Z or clicks Undo
Then the last bulk change is reverted without persisting
Given very large selections (≥1000 cells)
When a bulk operation is applied
Then the UI remains responsive and completes the update within 1 second
Search and filter by user, role, brand, collection, and resource type
Given the admin enters text in the search box
When the query matches roles, users, brands, collections, or actions
Then the matrix filters to matching entities and highlights the match
Given filters are applied for Role, Brand, Collection, and Resource Type
When multiple filters are combined
Then the result set reflects the intersection of all filters
And a visible chip list shows active filters with one-click clear
Given a specific user is selected in the User filter
When the matrix updates
Then only roles/resources relevant to that user are shown
And effective permissions for that user are highlighted distinct from global role defaults
Given a dataset of up to 5,000 visible cells post-filter
When typing or changing filters
Then results update within 300 ms on a standard laptop
Given no results match
When filters are applied
Then an empty state appears with a clear-filters action
Keyboard navigation and WCAG AA compliance
Given the matrix is focused
When the admin uses Tab/Shift+Tab
Then focus moves through interactive controls in a logical order with a visible focus indicator
Given a cell has focus
When the admin presses Arrow keys
Then focus moves to the adjacent cell; Space/Enter toggles the cell state; Shift+Arrow extends selection for range edit
Given a screen reader is active
When a cell receives focus
Then it announces role, action, resource, current state (checked/unchecked), and source (explicit/inherited/pending)
Given the UI is rendered
When evaluated for contrast
Then all text and interactive elements meet WCAG 2.2 AA contrast (≥4.5:1; large text ≥3:1)
Given all interactive elements
When tested with keyboard only
Then there are no keyboard traps; skip links are available to jump to the matrix and filters; all controls have accessible names and roles
Given ARIA attributes are required
When cells and controls are rendered
Then proper semantics (aria-checked, aria-pressed, aria-describedby) are present and valid
Visual indicators for inherited, explicit, and pending states
Given the matrix is visible
When permissions are displayed
Then inherited, explicit, and pending states are visually distinct using both color and iconography
And a legend explains the indicators
Given a cell is inherited
When the admin hovers or focuses the cell
Then a tooltip reveals the inheritance source (e.g., Brand default, Role policy) without obscuring the cell
Given a cell is toggled during the session
When the state changes
Then a pending badge appears on the cell and a global pending count updates in the header
Given multiple pending changes exist
When the admin clicks the Pending filter
Then only cells with pending status are shown
And clearing the filter restores the full view
Draft mode with reversible changes and clear save/cancel
Given no changes have been made
When the admin interacts with the matrix
Then Save and Cancel are disabled until the first change occurs
Given at least one change is made
When the draft banner appears
Then Save and Cancel become enabled and show the number of pending changes
Given pending changes exist
When the admin clicks Cancel
Then a confirmation dialog appears to discard or keep changes
And choosing Discard reverts all cells to their prior effective state and clears the draft
Given pending changes exist
When the admin navigates away or attempts to close the page
Then an unsaved-changes prompt appears preventing accidental loss
Given a single cell was changed
When the admin uses Revert on that cell
Then the cell returns to its prior effective (inherited/explicit) state without affecting others
Simulator integration for instant previews before save
Given the simulator panel is open
When the admin selects a user and scope (Brand or Collection)
Then the simulator shows the user’s effective permissions for presets and related actions before any new changes are saved
Given the admin toggles permissions in the matrix
When the simulator is visible
Then the simulator updates within 300 ms to reflect the predicted effective access under the draft changes and is clearly labeled as Preview
Given incompatible or conflicting changes are made
When the simulator evaluates the draft
Then conflicts are highlighted with guidance on which role/resource causes the conflict
Given Save is clicked
When persistence succeeds
Then the simulator switches from Preview to Live and matches the post-save matrix state
Access Simulator (What‑If) Preview
"As an admin, I want to simulate a user’s access before saving so that I can catch and prevent misconfigurations."
Description

Provide a what-if access simulator that previews a specific user’s effective permissions before saving changes. Allow testing by selecting a user and/or a set of roles, brands, and collections, and display allowed/denied actions with explanations (source role, inheritance, conflicting rules). Generate warnings for risky configurations and show the impact delta compared to current production settings.
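The "impact delta" comparison reduces to a set difference between two effective-permission maps. The sketch below assumes each map keys `(scope, action)` to a boolean; the function names and the warning wording are illustrative, not the product's real API.

```python
# Hedged sketch: delta between production ("baseline") and simulated effective
# permissions, plus the risky-configuration warnings described above.
def permission_delta(baseline: dict, simulated: dict):
    """Both inputs map (scope, action) -> bool. Returns added/removed grants."""
    keys = baseline.keys() | simulated.keys()
    delta = {"added": [], "removed": []}
    for key in sorted(keys):
        before = baseline.get(key, False)
        after = simulated.get(key, False)
        if after and not before:
            delta["added"].append(key)
        elif before and not after:
            delta["removed"].append(key)
    return delta

def risk_warnings(delta):
    # Gaining approve/publish anywhere is high risk; losing it is disruptive.
    warnings = []
    if any(action in ("approve", "publish") for _, action in delta["added"]):
        warnings.append("High-risk change: new Approve/Publish granted")
    if any(action in ("approve", "publish") for _, action in delta["removed"]):
        warnings.append("Potentially disruptive: Approve/Publish removed")
    return warnings

baseline = {("Brand A", "view"): True, ("Brand A", "publish"): False}
simulated = {("Brand A", "view"): True, ("Brand A", "publish"): True}
d = permission_delta(baseline, simulated)
print(d["added"])        # [('Brand A', 'publish')]
print(risk_warnings(d))  # ['High-risk change: new Approve/Publish granted']
```

Because the simulator only reads the baseline and computes over a draft copy, the non-destructive guarantee in the criteria below holds by construction: nothing is written back until Save.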

Acceptance Criteria
Simulate Existing User’s Effective Permissions
Given I am an Admin on the Role Matrix
And I open the Access Simulator
When I select an existing user
Then the simulator preloads the user’s current roles and scoped brands/collections
And the current production effective permissions are captured as the baseline for delta
And the Simulate action becomes available
Simulate Ad Hoc Roles/Brands/Collections (No User)
Given I am on the Access Simulator
When I do not select a user but select one or more roles and at least one brand or collection
Then the simulator calculates effective permissions for the hypothetical identity
And if no role is selected, the Simulate action remains disabled with a validation message
Show Allowed/Denied With Explanations Per Scope
Given a simulation has been run
Then for each brand/collection scope the actions View, Apply, Edit, Approve, Publish are displayed with Allowed or Denied
And each action provides a Why explanation including source role(s), rule type (allow/deny), scope, and inheritance details
And if multiple roles contribute, all contributors are listed with the applied precedence noted
Highlight Conflicts and Resolution Rationale
Given at least one conflicting rule exists across roles or scopes for the simulated identity
When the simulation is run
Then the UI flags the conflict on the affected action and scope
And the simulator explicitly states which rule prevails and the resolution rationale (e.g., precedence, specificity)
And the explanation references the originating role and scope for the prevailing and losing rules
Risky Configuration Warnings
Given simulation results differ from production
When the simulated identity would gain any new Approve or Publish permission in any brand or collection
Then a High-risk change warning is displayed listing affected brands/collections and actions
And when any existing Approve or Publish permission would be removed
Then a Potentially disruptive change warning is displayed listing affected brands/collections and actions
And when the scope of any action expands from one brand/collection to more than one
Then a Scope expansion warning is displayed with before/after counts
Delta vs Production Summary and Details
Given a baseline of current production permissions exists
When the simulation results differ from production
Then a Delta panel shows counts of Added, Removed, and Changed permissions by action and by brand/collection
And a detailed list of deltas is displayed with before → after state per action and scope
And each delta item links to its corresponding Why explanation
Simulation Is Non-Destructive and Resettable
Given I have staged selections and run a simulation
Then no role, scope, or permission changes are persisted to production
And when I click Reset, the simulator clears staged selections and results back to the baseline
And when I navigate away or refresh, no permission changes are applied to production
Approval & Publish Gate Enforcement
"As a brand owner, I want publishing gated by approval so that only vetted presets go live and protect brand integrity."
Description

Enforce approval and publishing rules across PixelLift. Only users with Approve can transition presets to Approved; Publish requires Approved status and the Publish permission within the same scope. Apply these gates to batch operations, API endpoints, and integrations, with clear error messages and UI indicators. Support configurable review requirements (e.g., dual-approval) per brand, and block cross-brand publishes.
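The gates above amount to a guarded state machine: approve requires the Approve permission in scope, publish requires both the Publish permission and Approved status. A minimal sketch, assuming dict-shaped presets and a `{brand: actions}` permission map (the exception classes and function names are illustrative, not PixelLift's real code):

```python
# Hedged sketch of the approve → publish gate as guarded state transitions.
class Forbidden(Exception): pass           # maps to HTTP 403
class PreconditionFailed(Exception): pass  # maps to HTTP 409

def approve_preset(preset, user_id, user_permissions, required_approvals=1):
    scope_perms = user_permissions.get(preset["brand"], set())
    if "approve" not in scope_perms:
        raise Forbidden(f"missing_permission:approve in {preset['brand']}")
    if user_id in preset.setdefault("approvers", []):
        raise PreconditionFailed("DUPLICATE_APPROVAL")
    preset["approvers"].append(user_id)
    # Configurable per brand: status flips only once enough distinct approvals exist.
    if len(preset["approvers"]) >= required_approvals:
        preset["status"] = "Approved"
    return preset

def publish_preset(preset, user_permissions):
    # Publish requires the permission AND the Approved status, in the same scope.
    scope_perms = user_permissions.get(preset["brand"], set())
    if "publish" not in scope_perms:
        raise Forbidden(f"missing_permission:publish in {preset['brand']}")
    if preset["status"] != "Approved":
        raise PreconditionFailed("Preset must be Approved before publish")
    preset["status"] = "Published"
    return preset

# Dual-approval flow from the criteria below:
preset = {"brand": "Brand B", "status": "Review"}
perms = {"Brand B": {"approve", "publish"}}
approve_preset(preset, "u1", perms, required_approvals=2)
assert preset["status"] == "Review"        # 1/2 approvals: not yet Approved
approve_preset(preset, "u2", perms, required_approvals=2)
assert preset["status"] == "Approved"
publish_preset(preset, perms)
assert preset["status"] == "Published"
```

Routing every channel (UI, REST, batch, integrations) through the same two functions is what makes the "consistent across channels" criterion achievable: the error type, not the caller, determines the HTTP status.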

Acceptance Criteria
Approval Gate: Only Approvers Can Set Preset to Approved
Given a user without Approve permission in Brand A and a preset within Brand A in Review
When the user attempts to approve via the UI
Then the Approve control is disabled and a tooltip states "Requires Approve permission in Brand A"
Given the same user and preset
When the user calls the approve API endpoint
Then the request is rejected with HTTP 403 and error code PERMISSION_DENIED including fields action=approve, requiredPermission=Approve, scopeType=brand, scopeId=Brand A, entityType=preset, entityId=<presetId>; and the preset status remains unchanged
Given a user with Approve permission in Brand A and the preset in Review
When the user approves
Then the preset status transitions to Approved and an audit log records approver userId, timestamp, scopeId, previousStatus, newStatus=Approved
Publish Gate: Requires Approved Status and Publish Permission in Same Scope
Given a preset in Approved within Brand A and a user with Publish permission in Brand A
When the user publishes via UI or API
Then the operation succeeds (HTTP 200), the preset status becomes Published, and an audit log records publisher userId, timestamp, scopeId, previousStatus=Approved, newStatus=Published
Given a preset not in Approved within Brand A and any user
When the user attempts to publish
Then the request is rejected with HTTP 409 PRECONDITION_FAILED and message "Preset must be Approved before publish"; no status change occurs
Given a preset in Approved within Brand A and a user without Publish in Brand A
When the user attempts to publish
Then the request is rejected with HTTP 403 PERMISSION_DENIED including requiredPermission=Publish and scopeId=Brand A; no status change occurs
Dual-Approval: Configurable Review Requirements Per Brand
Given Brand B has dual-approval requirement set to 2 and a preset in Review within Brand B
When the first eligible approver approves
Then the preset is not yet Approved, approvalCount=1, and the approver is recorded; the same user cannot approve again (HTTP 409 DUPLICATE_APPROVAL)
Given the same preset and a second distinct user with Approve in Brand B
When the second user approves
Then the preset status becomes Approved and both approvals are recorded with userId and timestamps
Given Brand C has dual-approval disabled (or set to 1)
When a user with Approve in Brand C approves a preset in Review
Then the preset immediately becomes Approved with a single approval recorded
Batch Operations: Per-Item Gate Enforcement and Result Summary
Given a batch request to approve or publish multiple presets with mixed states and permissions
When the batch operation is executed via UI or API
Then each item is evaluated independently against approval/publish gates; successful items are processed, and failed items return itemized errors with HTTP status per item (e.g., 403 PERMISSION_DENIED, 409 PRECONDITION_FAILED) including action, requiredPermission (if applicable), and scope
And the batch response includes a per-item results array and aggregate counts (total, succeeded, failed); processing continues despite failures in other items; each item update is atomic
API and Integrations: Consistent Gate Enforcement and Error Schema
Given any API or integration endpoint that changes preset status (approvePreset, publishPreset, partner integrations)
When a caller without required permission in the preset’s scope attempts the action
Then the endpoint responds with HTTP 403 PERMISSION_DENIED and a standardized error body containing fields: code, message, action, requiredPermission, scopeType, scopeId, entityType, entityId, correlationId
Given the preset state does not meet preconditions (e.g., publish on non-Approved)
When the action is requested
Then the endpoint responds with HTTP 409 PRECONDITION_FAILED and the standardized error body; no side effects occur
Given the same action is attempted via different channels (UI, REST API, integration)
When gates are violated
Then the error codes and messages are consistent across channels
Cross-Brand Publish Block
Given a preset scoped to Brand A and a user with Publish permission only in Brand B (or in both A and B)
When the user attempts to publish the Brand A preset into Brand B (e.g., targetBrandId=Brand B)
Then the request is rejected with HTTP 403 SCOPE_MISMATCH and message "Cross-brand publish is blocked"; no change occurs to the preset or Brand B assets; an audit log records the denied attempt with sourceBrandId and targetBrandId
Given a publish attempt without an explicit target brand where the preset scope is Brand A
When processed
Then the publish only targets Brand A; any attempt to route output to another brand is blocked with the same error
UI Indicators and Messaging for Gate States
Given a user views a preset details page
When the user lacks Approve or Publish permissions or the preset is not in the required state
Then the Approve/Publish controls are disabled with tooltips that explicitly state the unmet requirement (e.g., "Requires Publish permission in Brand A" or "Preset must be Approved to publish")
Given dual-approval is required and exactly one approval has been recorded
When the page loads
Then the UI displays an approval progress indicator (e.g., 1/2 approvals) and the Approve button is disabled for the user who already approved
Given an action fails due to gate enforcement
When the UI shows an error banner/toast
Then the message text matches the API error (code and reason) and includes the scope; the user is provided a link or CTA to request access or contact an admin
Conflict Detection & Safe-Guard Validation
"As an admin, I want automated validation of role changes so that I avoid breaking access or exposing sensitive presets."
Description

Add pre-save validation and conflict detection that scans role matrices for contradictory, over-broad, or unsafe policies. Detect cases like publish without approve, approve without view, dangling roles without members, no remaining admin for a brand, or cross-brand exposures. Provide inline warnings, remediation suggestions, and hard blocks for critical violations.

Acceptance Criteria
Hard Block: Publish Without Approve
Given an admin edits the Brand X role matrix And a role grants Publish on any preset scope within Brand X And the same role’s effective permissions (including inherited roles) do not grant Approve on the same scopes When the admin attempts to save Then the save is blocked and no changes persist And an inline error is shown at each offending permission with severity=Critical and code=PUB_NO_APPROVE And the error lists the affected role(s) and scope(s) count And a remediation suggestion is available to add Approve to the matching scopes or remove Publish And the Save action is disabled until all PUB_NO_APPROVE conflicts are resolved And an audit log entry is recorded with action=validation_blocked, code=PUB_NO_APPROVE, brand=Brand X, count >= 1
Warning: Approve Without View
Given an admin edits the Brand X role matrix And a role grants Approve on any preset scope within Brand X And the same role’s effective permissions do not grant View on the same scopes When validation runs (on change, simulate, or pre-save) Then an inline warning appears with severity=Warning and code=APP_NO_VIEW for each affected scope And an “Add View for affected scopes” quick-fix is available And saving is allowed without blocking And an audit log entry is recorded with action=validation_warning, code=APP_NO_VIEW, count >= 1
Warning: Dangling Roles Without Members
Given the role matrix contains one or more roles with zero assigned members or groups When validation runs Then a warning appears with severity=Warning and code=DANGLING_ROLE for each such role And a remediation suggestion is available to assign members or archive the role And saving is allowed And the warning summary displays total dangling roles count
Hard Block: No Remaining Brand Admin
Given changes would remove or demote the last user/group with Admin capability for Brand X When the admin attempts to save Then the save is blocked and no changes persist And an inline error appears with severity=Critical and code=NO_BRAND_ADMIN And the error lists current candidates who meet Admin criteria before the change (=0 after change) And a remediation suggestion is available to assign Admin to at least one user/group for Brand X And Save is disabled until NO_BRAND_ADMIN is resolved And an audit log entry is recorded with action=validation_blocked, code=NO_BRAND_ADMIN, brand=Brand X
Hard Block: Cross-Brand Exposure
Given a permission grants View/Apply/Edit/Approve/Publish on Brand B to a principal whose brand membership excludes Brand B When validation runs or on save Then the change is blocked with severity=Critical and code=CROSS_BRAND_EXPOSURE And the error lists the principal(s), source brand membership, and target brand(s) exposed And a remediation suggestion is available to restrict scope to allowed brands or update membership And Save is disabled until all CROSS_BRAND_EXPOSURE conflicts are resolved And an audit log entry is recorded with action=validation_blocked, code=CROSS_BRAND_EXPOSURE, count >= 1
Consistency: Simulation Mirrors Save Validation
Given the admin runs “Simulate access as User U” for Brand X before saving And conflicts exist per PUB_NO_APPROVE, APP_NO_VIEW, DANGLING_ROLE, NO_BRAND_ADMIN, or CROSS_BRAND_EXPOSURE When simulation is executed Then the simulation pane displays the same validation items with identical codes, severities, counts, and affected entities as pre-save validation And resolving a conflict in the matrix updates the simulation validation list in real time (<300 ms p95) And running pre-save validation after simulation yields identical results
Performance: Validation at Scale
Given a role matrix up to 200 roles, 10 brands, 500 presets/collections, and up to 10,000 permission edges When the user edits any single permission Then inline validation feedback appears within 200 ms p95 and 400 ms p99 And running “Simulate access” completes within 500 ms p95 and 1000 ms p99 And “Save” pre-commit validation completes within 1000 ms p95 and 2000 ms p99 And validation does not freeze the UI thread (>55 FPS) during feedback rendering
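A minimal sketch of the permission-pairing portion of the pre-save scan (Publish-without-Approve as a critical block, Approve-without-View as a warning). This assumes role inheritance has already been resolved into effective permission sets upstream; the data shape and function names are illustrative, and a real scan would also cover DANGLING_ROLE, NO_BRAND_ADMIN, and CROSS_BRAND_EXPOSURE:

```python
def scan_role_matrix(roles):
    """roles: {role_name: {scope: set_of_effective_permissions}}."""
    findings = []
    for role, scopes in roles.items():
        for scope, perms in scopes.items():
            if "Publish" in perms and "Approve" not in perms:
                findings.append({"code": "PUB_NO_APPROVE", "severity": "Critical",
                                 "role": role, "scope": scope})
            if "Approve" in perms and "View" not in perms:
                findings.append({"code": "APP_NO_VIEW", "severity": "Warning",
                                 "role": role, "scope": scope})
    return findings

def save_blocked(findings):
    # Only Critical findings hard-block the save; warnings are advisory.
    return any(f["severity"] == "Critical" for f in findings)
```

Running the same scan on change, in simulation, and pre-save is what keeps the three validation surfaces consistent.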
Audit Trail & Versioned Policy History
"As a compliance officer, I want versioned audit history of permission changes so that I can trace decisions and restore previous states."
Description

Implement end-to-end audit logging and versioned history for all permission-related changes, including who changed what, when, and the before/after diff. Support per-brand filtering, CSV/JSON export, immutable logs for compliance, and one-click rollback to a prior version with dependency checks. Surface recent changes within the Role Matrix UI and via API.

Acceptance Criteria
Audit Event on Role Matrix Permission Change
Given an admin with manage-permissions privileges edits a preset permission within the Role Matrix for brand X When they change a user's ability and click Save Then the system records an audit event with fields: event_id, tenant_id, brand_id, actor_id, actor_email, action='permission.update', resource_type='preset_permission', resource_id, timestamp (UTC ISO 8601), request_id, ip, user_agent, before_state, after_state, diff (JSON Patch) And the event persists successfully before the save response is returned And the policy version for brand X increments by 1
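The event shape above can be sketched as follows. The diff here is a deliberately naive flat JSON Patch over top-level keys (a real implementation would recurse and likely use a JSON Patch library); the helper names are assumptions:

```python
from datetime import datetime, timezone
import uuid

def json_patch_diff(before, after):
    """Naive flat JSON Patch: add/remove/replace at top-level keys only."""
    ops = []
    for key in sorted(set(before) | set(after)):
        if key not in after:
            ops.append({"op": "remove", "path": f"/{key}"})
        elif key not in before:
            ops.append({"op": "add", "path": f"/{key}", "value": after[key]})
        elif before[key] != after[key]:
            ops.append({"op": "replace", "path": f"/{key}", "value": after[key]})
    return ops

def audit_event(tenant_id, brand_id, actor_id, resource_id, before, after):
    # Persist this record before returning the save response.
    return {
        "event_id": str(uuid.uuid4()),
        "tenant_id": tenant_id,
        "brand_id": brand_id,
        "actor_id": actor_id,
        "action": "permission.update",
        "resource_type": "preset_permission",
        "resource_id": resource_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),  # UTC ISO 8601
        "before_state": before,
        "after_state": after,
        "diff": json_patch_diff(before, after),
    }
```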
Versioned Policy History Retrieval and Diff
Given there are at least two saved versions of the brand X permission policy When a reviewer opens the policy history and selects versions N and N-1 Then the UI shows a human-readable before/after summary and a machine-readable diff (JSON Patch) And selecting View raw returns the exact stored before_state and after_state JSON for both versions And the API returns versions with version number, created_at, actor_id, actor_email, and optional changelog message
Per-Brand Audit Filtering and Isolation (UI and API)
Given audit events exist for brands X and Y When a compliance user applies filters brand_id=X and a date range Then only events for brand X within the date range are returned And no events from brand Y are included And results are paginated (default page size 50) and sortable by timestamp desc And the API endpoint for audit retrieval with the same filters enforces RBAC so users without audit:view for brand X receive 403
CSV and JSON Export of Filtered Audit Logs
Given a user has applied filters to the audit log for brand X When they export to CSV Then the downloaded file contains only the filtered result set up to the export limit of 50,000 records and includes headers: event_id, tenant_id, brand_id, timestamp, actor_email, action, resource_type, resource_id And initiating a JSON export yields newline-delimited JSON (NDJSON) with the same records and field names And the API supports equivalent exports and returns 202 with job_id for async exports exceeding 5,000 records, followed by a downloadable URL when complete
Immutable Audit Log Enforcement and Tamper Evidence
Given an audit event exists When any user attempts to modify or delete the event via UI or API Then the operation is rejected with 405 or 403 and the message 'Audit logs are immutable' And there is no supported API to update or delete audit events And a proof endpoint returns a valid tamper-evident hash (e.g., chain or Merkle proof) for the event upon request
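One common way to make an append-only log tamper-evident, as the proof endpoint implies, is a hash chain: each stored hash covers the previous hash plus the event's canonical JSON, so editing any historical event invalidates every later hash. A sketch (the spec also allows a Merkle proof instead; function names are illustrative):

```python
import hashlib
import json

def chain_hash(prev_hash, event):
    # Canonical JSON (sorted keys) so logically-equal events hash identically.
    payload = prev_hash + json.dumps(event, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def verify_chain(events, hashes, genesis="0" * 64):
    prev = genesis
    for event, stored in zip(events, hashes):
        if chain_hash(prev, event) != stored:
            return False  # this event, or an earlier one, was altered
        prev = stored
    return True
```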
One-Click Policy Rollback with Dependency Checks
Given a policy history for brand X with current version N and a prior version N-1 When an admin selects Rollback to version N-1 and confirms Then the system runs dependency checks for referenced presets, roles, and collections and lists any blockers by ID and reason And if no blockers exist, the system creates version N+1 identical to N-1, marks it current, and records an audit event action='policy.rollback' linking versions And if blockers exist, the rollback is aborted with no state changes and a clear error summary
Surface Recent Changes in Role Matrix UI and via API
Given a user opens the Role Matrix for brand X When the page loads Then a Recent Changes panel displays the 20 most recent permission-related events for brand X with actor, relative time, and a short summary And clicking an item opens a detailed diff view for that event And GET /audit/recent?brand_id=X&limit=20 returns the same events in the same order
Permissions API & Webhook Notifications
"As a platform integrator, I want APIs and webhooks for permission management so that I can sync PixelLift with our identity provider."
Description

Expose secured REST/GraphQL endpoints and SDK helpers for managing roles, assignments, scopes, and permission checks. Include a lightweight authorize endpoint to verify an action on a resource, and webhooks for permission changes to allow external cache invalidation and downstream sync. Enforce OAuth scopes and rate limits, and document endpoints with examples and error codes.

Acceptance Criteria
OAuth Scope Enforcement Across REST and GraphQL
- Given a valid OAuth 2.0 access token containing the required scopes for an endpoint/field, when the client calls that REST endpoint or GraphQL field, then the server authorizes the call and returns a 2xx response with only data allowed by the scopes.
- Given a token missing a required scope, when calling a REST endpoint, then the server returns 403 with body { error: { code: "insufficient_scope", requiredScopes: [..], providedScopes: [..], requestId } }.
- Given a token missing a required scope, when calling a GraphQL field, then HTTP status is 200 and the response contains errors[0].extensions.code = "INSUFFICIENT_SCOPE" and extensions.requiredScopes listing missing scopes; no unauthorized data is returned.
- Given an invalid or expired token, when calling any endpoint, then the server returns 401, includes WWW-Authenticate header, and body error.code = "invalid_token" (REST) or a top-level 401 (GraphQL HTTP) with the same error shape in extensions.
- Given a token scoped to a specific brand/collection, when the request targets a different brand/collection, then the server returns 403 with error.code = "scope_mismatch".
- For every REST route and GraphQL field, the required scopes and applicable brand/collection constraints are machine-readable (e.g., OpenAPI x-required-scopes / GraphQL directive) and validated in contract tests.
Lightweight Authorize Endpoint Decision Semantics and Performance
- Given POST /v1/authorize with subjectId, action, resourceType, resourceId, and optional context { brandId, collectionId }, when the subject is permitted, then the server returns 200 with { allow: true, reasons: ["ALLOWED"], decisionId, evaluatedAt } (ISO-8601).
- Given the same inputs when not permitted, then the server returns 200 with { allow: false, reasons: ["insufficient_scope" | "no_matching_role" | "assignment_expired" | "resource_not_found"], decisionId, evaluatedAt }.
- Given an unknown action/resourceType or malformed payload, then the server returns 400 with error.code = "validation_error" and field-level details.
- Given repeated identical authorize requests under consistent state, then allow is deterministic and decisionId is traceable in audit logs.
- Performance: p95 latency for /v1/authorize <= 100ms and p99 <= 250ms under documented reference load; SLIs and test harness validate targets.
Roles, Assignments, and Scopes Management CRUD with Audit
- POST /v1/roles creates a role within a brand scope; unique name per brand is enforced; success returns 201 with role.id; duplicates return 409 with error.code = "duplicate_role".
- PATCH/PUT /v1/roles/{id} requires If-Match ETag; stale ETag returns 412; success returns 200 and a new ETag.
- DELETE /v1/roles/{id} on a role with active assignments returns 409 with error.code = "role_in_use"; deleting an unassigned role returns 204.
- POST /v1/assignments creates a user→role assignment with optional collection scope and optional expiration; duplicates return 409 with error.code = "duplicate_assignment"; expired assignments are not considered in authorize decisions.
- Scope validation: unknown brandId/collectionId returns 404 with error.code = "scope_not_found".
- All write operations (create/update/delete) generate audit records with actorId (from token), action, targetType, targetId, scope, requestId, and timestamp; GET /v1/audit supports filtering by actorId, targetId, and time range.
Webhook Notifications for Permission Changes
- Events emitted: role.created, role.updated, role.deleted, assignment.created, assignment.updated, assignment.deleted, scope.updated; each event payload includes id, type, occurredAt (ISO-8601), brandId, collectionId (if any), actorId, requestId, and data.{...}.
- Delivery timeliness: 99% of events are delivered to subscribed endpoints within 30 seconds of the committing API write.
- Each webhook request includes X-PixelLift-Signature with HMAC-SHA256 over timestamp + body and X-Event-ID; receivers can validate using a shared secret; replays older than 5 minutes (by signature timestamp) are rejected by the platform.
- Retry policy: non-2xx responses trigger exponential backoff retries for up to 24 hours; delivery stops after a 2xx; 410 disables the endpoint; 3xx is treated as failure and retried.
- Idempotency: the same X-Event-ID is never reused; receiving the same event twice must be safe; platform deduplicates per endpoint.
- Endpoint management: a webhook endpoint must be verified before activation via a verification event; a test event can be sent on demand from the dashboard/API.
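A receiver-side verification sketch for the signature scheme above. The spec says the HMAC-SHA256 covers "timestamp + body"; the exact concatenation format (`timestamp.body` here) is an assumption and would be defined in the published docs:

```python
import hashlib
import hmac
import time

def sign(secret, timestamp, body):
    # Assumed canonical form: "<timestamp>.<raw body>".
    msg = f"{timestamp}.{body}".encode()
    return hmac.new(secret.encode(), msg, hashlib.sha256).hexdigest()

def verify_webhook(secret, signature_header, timestamp, body, max_age=300, now=None):
    """Reject replays older than 5 minutes, then compare in constant time."""
    now = now if now is not None else int(time.time())
    if now - int(timestamp) > max_age:
        return False  # stale signature: possible replay
    expected = sign(secret, timestamp, body)
    return hmac.compare_digest(expected, signature_header)
```

Using `hmac.compare_digest` (rather than `==`) avoids leaking the signature through timing differences.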
SDK Helpers Parity and Local Decision Caching
- Official SDK provides helpers: authorize(subjectId, action, resourceType, resourceId, context?), createRole, updateRole, deleteRole, createAssignment, deleteAssignment, listRoles, listAssignments; all helpers map 1:1 to API endpoints and shapes.
- The SDK exposes a verifyWebhookSignature(payload, headers, secret) utility that validates X-PixelLift-Signature and timestamp window.
- The SDK’s authorize helper supports in-memory caching keyed by { subjectId, action, resourceType, resourceId, brandId, collectionId }; default TTL = 60s, negative TTL = 15s; cache can be disabled; tests verify correctness and TTL behavior.
- Error handling parity: API error codes are surfaced as typed exceptions with code, message, details, requestId; TypeScript types are provided for all inputs/outputs.
- Retries and backoff for transient 5xx/429 are implemented with jitter; maximum retry duration and attempts are configurable and documented.
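The decision cache described above can be sketched as follows (class name and injectable clock are illustrative; the 60s/15s defaults mirror the documented TTLs). A shorter negative TTL means a revoked-then-restored permission recovers quickly while still absorbing bursts of denied checks:

```python
import time

class AuthorizeCache:
    """In-memory authorize cache: allows live for `ttl` seconds,
    denials for `negative_ttl` seconds."""
    def __init__(self, ttl=60.0, negative_ttl=15.0, clock=time.monotonic):
        self.ttl, self.negative_ttl, self.clock = ttl, negative_ttl, clock
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry and self.clock() < entry[1]:
            return entry[0]           # cached allow/deny decision
        self._entries.pop(key, None)  # expired or absent
        return None                   # None means "not cached": call the API

    def put(self, key, allow):
        ttl = self.ttl if allow else self.negative_ttl
        self._entries[key] = (allow, self.clock() + ttl)
```

The key would be the full tuple (subjectId, action, resourceType, resourceId, brandId, collectionId) so decisions never bleed across scopes.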
Rate Limiting and Quotas Visibility
- All REST and GraphQL requests are subject to per-client rate limits; the /v1/authorize endpoint uses a separate higher-capacity bucket; limits are configurable per plan and environment.
- When a limit is exceeded, REST responses return 429 with headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, and Retry-After; GraphQL returns HTTP 429 with the same headers.
- Successful responses include X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset to inform consumption.
- Contract tests validate correct header presence and decrement behavior, and that exceeding limits produces 429 within the same window.
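A client-side sketch of how a consumer might choose a wait time from those headers (assuming, as is common but not stated here, that X-RateLimit-Reset is an epoch timestamp; the function name and fallback policy are illustrative):

```python
def backoff_delay(headers, attempt, base=1.0, cap=60.0, now=0.0):
    """Prefer Retry-After, then X-RateLimit-Reset, else exponential backoff."""
    if "Retry-After" in headers:
        return float(headers["Retry-After"])       # server told us exactly
    if "X-RateLimit-Reset" in headers:
        # Assumed epoch-seconds semantics; wait until the window resets.
        return max(0.0, float(headers["X-RateLimit-Reset"]) - now)
    return min(cap, base * (2 ** attempt))         # capped exponential fallback
```

A production client would also add jitter, as the SDK criteria require.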
API Documentation, Examples, and Error Codes
- OpenAPI 3.1 spec for REST is published at /docs/openapi.json and includes endpoint descriptions, parameters, request/response schemas, required OAuth scopes, and error models; GraphQL SDL is published at /graphql/schema and includes field-level authorization directives.
- Developer docs include runnable examples (cURL, JavaScript/TypeScript, Python) for: creating a role, assigning a role, checking authorization, and handling webhooks; examples execute successfully against a sandbox.
- Error format is standardized across REST ({ error: { code, message, details?, requestId } }) and GraphQL (errors[].extensions.code, errors[].extensions.requestId); error code catalog is documented with remediation guidance.
- Versioning and deprecation: new versions are announced with changelog entries; deprecated endpoints emit Sunset and Deprecation headers and are documented with end-of-life dates.

Draft Sandbox

A safe workspace to iterate on preset changes without touching live outputs. Batch-test drafts on sample images, compare side-by-side with current presets, and promote with one click when results meet standards—enabling confident experimentation with zero production risk.

Requirements

Draft Preset Versioning
"As a brand manager, I want to create draft versions of style presets without affecting live outputs so that I can experiment safely and refine styles before rollout."
Description

Enable creation of draft versions of existing style presets that are fully isolated from production. Each draft carries a unique version ID, metadata (author, timestamp, change notes), and a non-production flag. Drafts support granular edits (retouch intensity, background settings, crop rules) and a change-diff view against the current live preset. Draft artifacts are stored separately and never applied to live jobs until promotion. Provide APIs and UI to create, edit, clone, archive, and delete drafts, with audit logging for compliance. This ensures safe experimentation and repeatability while integrating with the existing preset library and project structure.

Acceptance Criteria
Create Draft from Live Preset (UI)
Given an Editor is viewing a live preset When the user clicks "Create Draft" and provides optional change notes (0–500 chars) Then a draft version is created with a unique versionId (UUIDv4), linked to the parent presetId, and nonProduction=true And metadata is recorded: authorId=current user, createdAt=server UTC ISO-8601 timestamp, changeNotes as provided And the draft appears in the Preset Library under the parent preset with label "Draft" And no live jobs reference the draft and it is excluded from production preset selectors
Draft CRUD via API with Auth and Validation
Given a client with scope presets.drafts.write When the client calls POST /presets/{presetId}/drafts with a valid payload Then the API returns 201 with the new draft resource including versionId and nonProduction=true And invalid inputs (unknown presetId, missing required fields, invalid values) return 404/400 with error codes
Given a client with scope presets.drafts.read When the client calls GET /presets/{presetId}/drafts?status=active&limit=50 Then the API returns 200 with paginated results, totalCount, and filters applied
When the client calls PATCH /presets/{presetId}/drafts/{versionId} with If-Match=ETag and valid changes Then the API returns 200 and updates only the draft fields; concurrent edits without matching ETag return 412
When the client calls POST /presets/{presetId}/drafts/{versionId}:clone Then the API returns 201 with a new draft version carrying over fields and a new versionId and authorId=current user
When the client calls POST /presets/{presetId}/drafts/{versionId}:archive Then the draft status becomes archived, editable=false, and archivedAt is set; repeated archive requests are idempotent (200)
When the client calls DELETE /presets/{presetId}/drafts/{versionId} Then the draft is soft-deleted (recoverable for 30 days) and deletes are blocked if the draft has active sandbox runs (409)
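The If-Match/ETag rule above is standard optimistic concurrency: the server rejects a PATCH whose ETag no longer matches the stored draft, so concurrent editors cannot silently overwrite each other. A server-side sketch (helper names, in-memory store, and content-hash ETags are assumptions):

```python
import hashlib
import json

class StaleEtagError(Exception):
    """Maps to HTTP 412 Precondition Failed."""

def etag_of(draft):
    # Content-derived ETag: any change to the draft yields a new tag.
    return hashlib.sha256(json.dumps(draft, sort_keys=True).encode()).hexdigest()[:16]

def patch_draft(store, version_id, if_match, changes):
    draft = store[version_id]
    if etag_of(draft) != if_match:
        raise StaleEtagError(version_id)  # someone else saved first -> 412
    draft.update(changes)                 # apply only the supplied draft fields
    return draft, etag_of(draft)          # new ETag for the next If-Match
```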
Granular Draft Editing (Retouch, Background, Crop)
Given a draft is open in the editor When the user sets retouchIntensity to a value between 0 and 100 inclusive Then validation passes and the value is saved; values outside this range are rejected with inline error
When the user sets background.mode to one of [remove, solid, transparent, scene] and background.color to a valid hex when required Then the draft saves successfully and previews update using draft values
When the user edits cropRules (aspectRatios, focalPoint, padding) Then changes are persisted to the draft only, not the live preset, and appear in the draft JSON And saving the draft returns success within 500 ms under normal load
Draft vs Live Change-Diff View
Given a draft exists for a live preset When the user opens Compare with Live Then the UI displays only fields that differ, with before (live) and after (draft) values side-by-side And nested changes (arrays/objects) are diffed by path and highlighted And unchanged fields are hidden by default with an option to Show All And the diff can be exported as JSON containing only changed paths And the view loads within 1 second for presets up to 50 KB
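A minimal sketch of the "only fields that differ" diff behind the compare view: recurse through nested objects and collect changed paths with their before (live) and after (draft) values. The function name and output shape are illustrative:

```python
def changed_paths(live, draft, prefix=""):
    """Recursively collect only paths whose values differ."""
    diff = {}
    for key in set(live) | set(draft):
        path = f"{prefix}/{key}"
        a, b = live.get(key), draft.get(key)
        if isinstance(a, dict) and isinstance(b, dict):
            diff.update(changed_paths(a, b, path))  # recurse into nested objects
        elif a != b:
            diff[path] = {"live": a, "draft": b}
    return diff
```

Exporting `diff` as JSON gives exactly the changed-paths-only document the criteria call for.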
Draft Isolation and Storage of Artifacts
Given the user runs a sandbox render using a draft When processing completes Then all generated artifacts are stored under /drafts/{presetId}/{versionId}/ and are not listed in any production bucket or CDN path And artifacts are tagged with nonProduction=true metadata And draft outputs are watermarked "DRAFT" in the UI preview And live job pickers and automations do not offer draft versions
Audit Logging for Draft Lifecycle Actions
Given a user performs create, edit, clone, archive, or delete on a draft When the action completes Then an audit log entry is created with actorId, presetId, draftVersionId, action, timestamp (UTC ISO-8601), requestId, and before/after snapshot hashes And audit entries are immutable and retrievable via GET /audit?entity=draft&draftVersionId={id} And delete and archive require a non-empty change note; otherwise the action is rejected with 400
UI Safeguards and Permissions
Given a Viewer role opens a live preset Then the Create Draft and Edit actions are hidden
Given an Editor role opens a live preset When attempting to delete a draft Then the UI requires type-to-confirm of the draft versionId and shows impact summary (active sandbox runs, last edited) before enabling Delete And permission checks are enforced server-side; unauthorized actions return 403 And within the Preset Library, Drafts are visually labeled and sortable by createdAt and author
Sample Set Selector
"As a content lead, I want to build a representative sample set for draft testing so that results reflect real catalog diversity and edge cases."
Description

Provide tools to define and persist representative sample image sets used for testing drafts. Users can pick images manually or by rules (recent uploads, product tags, collections, SKU coverage, image aspect ratios, lighting conditions). Support randomization with a fixed seed for repeatability, caps on sample size, and dataset pinning per workspace or preset. Integrate with the media library for fast filtering and ensure privacy controls (exclude customer images, respect folder permissions). The outcome is a consistent, realistic test bed that surfaces edge cases and reduces bias in evaluations.

Acceptance Criteria
Manual Sample Set Selection and Persistence
Given I have access to the Media Library in Workspace W within the Draft Sandbox And I can view only images I have permission to access When I manually select N images and save the sample set named "Summer Shoes Draft Set" Then the system persists the set with exactly N unique image IDs And the saved set is listed under Workspace W > Sample Sets with the correct name and count And reopening the selector shows the same N images marked as selected And an audit record is created capturing user, timestamp, selection method = manual, and image count
Rule-Based Selection for Tags, Collections, SKU Coverage, Aspect Ratio, and Lighting
Given I open the Rule Builder for Sample Set Selector When I configure rules including:
- Uploaded within last 30 days
- Product tags include any of ["summer","sandals"]
- Collections include any of ["Shoes","Sale"]
- SKU coverage requires at least one image per SKU in [S1,S2,S3]
- Aspect ratio in {1:1, 4:5}
- Lighting condition in {studio, low_light}
And I set a sample size cap K Then the preview shows an eligible pool that matches all rules and is deduplicated And generating the sample returns exactly min(K, pool size) images And each SKU in [S1,S2,S3] is represented by at least one image unless no eligible images exist for that SKU, in which case a warning lists the missing SKUs And saving the rules persists them and regenerates the same results upon reopen if the underlying library has not changed and randomization is disabled
Seeded Randomization for Repeatable Samples
Given I enable Randomize with seed ABC123 and set cap K And the eligible pool size M is at least K When I generate a sample set Then the selected image ID list is deterministic for seed ABC123 And regenerating with the same seed and unchanged pool returns an identical ID list And changing the seed to DEF456 returns a non-identical ID list And the saved sample set metadata includes the seed and pool criteria
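The determinism requirement above is straightforward with a seeded RNG, provided the pool is put into a canonical order first so that library iteration order cannot leak into the result. A sketch (function name is illustrative):

```python
import random

def sample_set(pool_ids, seed, cap):
    """Deterministic sample: same seed + same pool -> identical ID list."""
    rng = random.Random(seed)        # isolated RNG; global state untouched
    ordered = sorted(pool_ids)       # canonical order makes the pool reproducible
    return rng.sample(ordered, min(cap, len(ordered)))
```

Storing the seed and rule criteria alongside the saved set, as the criteria require, is what lets the exact sample be regenerated later.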
Sample Size Cap Enforcement and Validation
Given the eligible pool size is M When I set the sample size cap to K and generate a sample Then the resulting sample contains exactly min(M, K, SystemMax) images And if K exceeds SystemMax, the UI informs me the cap was reduced to SystemMax and displays the final count And if K is not a positive integer, form validation blocks saving and shows an inline error message
Dataset Pinning per Workspace and Preset
Given a saved sample set S exists When I pin S to Workspace W and to Preset P Then opening Draft Sandbox in Workspace W with Preset P preselects S as the default dataset And switching to a different preset P2 preselects the dataset pinned to P2 or none if unpinned And users without access to Workspace W cannot view or use S And unpinning S removes it as the default while keeping it available in the sample set list if permissions allow
Privacy Controls and Folder Permission Enforcement
Given some images are flagged CustomerContent = true or are located in folders I do not have permission to access When I search, filter, or build a sample set manually or by rules Then images flagged CustomerContent = true or in unauthorized folders do not appear in results And attempts to add such images by direct ID return a permission error and do not modify the sample set And previews, comparisons, and exports exclude such images And an audit entry records each blocked attempt with user, timestamp, and reason
Media Library Integration and Performance
Given a media library with up to 50,000 images When I apply filters (recent uploads, tags, collections) and paginate through results Then initial filter results render within 2 seconds at the 95th percentile And subsequent page loads render within 1 second at the 95th percentile And selected images remain selected across pagination and when filters change but still include those images And the UI displays the eligible pool count and current selected count in real time
Batch Draft Processing
"As a studio operator, I want to batch-run drafts on sample images so that I can evaluate results at scale quickly and reliably."
Description

Execute draft presets on selected sample sets via a scalable batch engine with queuing, concurrency controls, and progress tracking. Support resumable jobs, per-workspace quotas, GPU utilization, and cost guardrails. Outputs are stored as ephemeral draft results with TTL and content-addressed caching to avoid reprocessing identical inputs. Provide job telemetry (ETA, throughput, failures), structured logs, and deterministic settings for fair comparisons. This enables rapid, controlled experimentation across hundreds of images without impacting production resources.

Acceptance Criteria
Queueing and Concurrency for Batch Drafts
Given a sample set of 500 images and a draft preset and concurrency_limit=8 When the job is submitted Then no more than 8 images are processed concurrently at any time And remaining images are queued And queue_depth and active_worker_count are observable via the job status API
Given concurrency_limit is updated from 8 to 4 while a job is running When the update is applied Then active_worker_count converges to 4 within 10 seconds without aborting in-flight tasks
Given two jobs from the same workspace with FIFO scheduling enabled When both are submitted Then tasks are scheduled in FIFO order per workspace And cross-workspace fairness is maintained per global scheduler policy
Progress, ETA, Throughput, and Failure Telemetry
Given a running job When requesting status via API Then the response includes total, processed, succeeded, failed, queued counts; throughput_images_per_min; estimated_time_remaining_seconds; started_at; updated_at
Given progress changes during execution When 10 seconds pass Then updated_at and estimated_time_remaining_seconds are refreshed at least every 10 seconds
Given at least one task fails When status is requested Then failures[] contains image_id, error_code, error_message, retry_count, last_attempt_at
Given logging is enabled When inspecting job logs Then each log entry includes job_id, image_id, event_type, timestamp_iso8601, duration_ms (when applicable), severity, and error_code (when applicable)
Given a job completes When status is requested Then completion_state is one of {succeeded, completed_with_failures, failed} And final throughput_images_per_min is reported
Resumable Jobs and Deterministic Re-runs
Given a running job is interrupted (e.g., worker crash or network loss) When it is restarted within 24 hours Then already-succeeded images are not reprocessed And pending images resume processing from the last completed checkpoint
Given identical inputs (image content hash), draft preset version, and deterministic settings (random_seed and operator_versions) When the job is re-run Then produced outputs are bitwise-identical and share the same content hash
Given an image fails with a transient error When automatic retry policy max_retries=3 with exponential backoff is configured Then the system retries up to 3 times and records each attempt in logs and job status And the image is marked failed only after all retries are exhausted
Per-Workspace Quotas and Throttling
Given daily_draft_image_quota=10000 and 9800 images already processed today for a workspace When a job for 500 images is submitted Then only 200 images are scheduled And 300 images are rejected with error_code=quota_exceeded and a remediation message including remaining_quota
Given active_concurrent_jobs_quota=2 for a workspace When a third job is submitted Then the job is accepted in queued state if queueing_allowed=true Otherwise it is rejected with error_code=quota_exceeded
Given quotas are increased for a workspace When the new limits are applied Then queued jobs transition to running automatically if capacity permits And status changes are emitted to the event stream
GPU Utilization and Scheduling
Given 2 provisioned GPUs and per-task gpu_requirement=1 and concurrency_limit=8 When the job runs Then at most 2 GPU tasks run concurrently and remaining tasks wait in queue
Given a GPU becomes unhealthy mid-run When health checks fail for that GPU Then tasks on that GPU are rescheduled to healthy GPUs And job status reflects a rescheduling event with affected image_ids
Given GPU metrics collection is enabled When job status is requested Then the API reports gpu_utilization_percent per GPU and avg_gpu_memory_used_mb during the last sampling window
Cost Guardrails and Budget Enforcement
Given monthly_draft_budget_usd=200 for a workspace and estimated_cost_usd for a new job=250 When submission is attempted Then the job is blocked with error_code=budget_exceeded And the response includes estimated_cost_usd and remaining_budget_usd
Given a job starts with remaining_budget_usd=50 and estimated_cost_usd=40 When actual_cost_usd reaches 50 Then the job is auto-paused And remaining tasks are not started And status=paused_budget_reached
Given a paused_budget_reached job receives an additional budget of 100 When resumed by an authorized user Then processing continues without reprocessing previously completed images
Ephemeral Draft Storage, TTL, and Content-Addressed Caching
- Given draft outputs have ttl_hours=168, When 168 hours elapse after an output's created_at timestamp, Then the output is auto-deleted and no longer retrievable via the draft asset API And a deletion event is recorded in logs.
- Given an image+preset combination previously produced an output with content_hash=H within TTL, When the same combination is re-run, Then processing is skipped And cached output with content_hash=H is returned And job status marks the item as cache_hit=true.
- Given the same image with a modified preset version, When re-run, Then processing occurs And cache_hit=false And a new content_hash is produced.
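One way to derive the content-addressed cache key is to hash every input that can affect the output, which matches the determinism criteria (image content hash, preset version, random_seed, operator_versions). The exact key scheme below is an assumption.

```python
import hashlib

def cache_key(image_bytes, preset_version, random_seed, operator_versions):
    """Derive a deterministic cache key from everything that affects output.

    Identical inputs re-use the cached output (cache_hit=true); any
    preset-version change yields a new key and forces reprocessing.
    """
    h = hashlib.sha256()
    h.update(hashlib.sha256(image_bytes).digest())    # image content hash
    h.update(preset_version.encode())
    h.update(str(random_seed).encode())
    h.update(",".join(operator_versions).encode())
    return h.hexdigest()
```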
Side-by-Side Compare Viewer
"As a visual merchandiser, I want to compare draft outputs side-by-side with current results so that I can quickly spot quality differences and approve changes with confidence."
Description

Deliver an interactive viewer to compare draft outputs against current live preset results and original images. Include split-view and two-up layouts, synchronized zoom/pan, before/after toggles, and keyboard navigation. Display key metadata (preset version, parameters) and visual aids (histogram, clipping warnings, edge masks for background removal). Allow per-image annotations and reviewer comments, and support shareable review links with expiry. Color-manage the display (sRGB) and respect watermark/download restrictions for drafts. This accelerates review cycles and raises confidence in visual quality.

Acceptance Criteria
Split-View and Two-Up with Synchronized Zoom/Pan
Given Draft, Live, and Original variants are available and the viewer is open in split-view or two-up When the user zooms via UI controls, mouse wheel/trackpad, or +/- keys Then both panes update to the same magnification within 1% and remain position-aligned within ≤2px at 100% zoom And switching between split-view and two-up preserves the current zoom level and focal point And initial load renders a fit-to-screen view in ≤500 ms for a 24MP image on target browsers And pan operations respond in ≤100 ms with no more than 2px of desync between panes over a 4K canvas And the split divider is draggable and snaps with no more than 16 ms input latency
Before/After Toggle and Side Swap
Given Draft and a chosen baseline (Live or Original) are loaded When the user presses B or clicks the Before/After toggle Then the viewer toggles between Draft and baseline in both split-view and two-up within ≤100 ms and shows clear side labels (Draft, Live, Original) And holding B shows the alternate state only while pressed (press-and-hold) And pressing S swaps left/right assignment within ≤100 ms without changing zoom or focal point And the selected baseline persists across image changes within the current review session
Metadata and Visual Aids Display Accuracy
Given an image pair is selected When the metadata panel is opened Then it displays preset name, version (e.g., vX.Y.Z), and the parameter set applied to each variant, matching processing job records exactly (0 mismatches) And the histogram matches an offline reference histogram within ±1% per 256-bin channel bucket And enabling clipping warnings overlays red for pixels clipped at 255 in any channel and blue for pixels at 0, with pixel-accurate alignment And enabling edge mask shows the background removal mask aligned within ≤1px of the processed edges and supports opacity control from 0–100% in 10% increments
Keyboard Navigation and Accessibility
Given the viewer has keyboard focus When Arrow keys are pressed Then the image pans by 10% viewport per keypress (Shift+Arrow = 25%) And +/- zooms in/out by 20% increments; 0 sets Fit, 1 sets 100% And Tab/Shift+Tab cycles through actionable controls without focus traps, with visible focus indicators And pressing ? opens a keyboard shortcut overlay that lists all shortcuts and is dismissible with Esc And all functions are operable without a mouse and meet WCAG 2.1 AA keyboard requirements
Per-Image Annotations and Reviewer Comments
Given a user with Editor or Reviewer role opens an image in the viewer When the user adds a pin or box annotation at a location Then the annotation anchors to image coordinates and remains correctly positioned under any zoom/pan And each annotation supports a threaded comment stream with timestamps; authors can edit/delete their own comments within 15 minutes And annotations/comments are saved per image and variant (Draft/Live) and display user, time, and status (open/resolved) And View-only users (including via share links) can see but cannot create, edit, or resolve annotations/comments
Shareable Review Links with Expiry and Revocation
Given a project member generates a review link When an expiry is set between 1 hour and 30 days (default 7 days) Then the link grants view-only access to the specified images and viewer features, with watermarking and download restrictions enforced for drafts And accessing an expired or revoked link returns 410 Gone and blocks further viewing within 60 seconds of revocation And the link token is unguessable (≥128-bit entropy) and audit logs record creation, access times, and revocations And owners can revoke links at any time from the share panel
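The entropy requirement for link tokens can be met with Python's standard `secrets` module; the token format itself is an assumption.

```python
import secrets

def new_review_token(entropy_bits: int = 128) -> str:
    """Generate an unguessable, URL-safe share-link token (>=128-bit entropy)."""
    return secrets.token_urlsafe(entropy_bits // 8)  # 16 random bytes -> ~22 chars
```

`secrets.token_urlsafe` draws from the OS CSPRNG, so 16 bytes satisfies the ≥128-bit entropy criterion; guessing such a token is computationally infeasible.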
Color Management (sRGB) and Draft Watermark/Download Restrictions
Given images with embedded color profiles (e.g., Adobe RGB, ProPhoto) are loaded When rendered in the viewer Then they are converted and displayed in sRGB with ΔE00 ≤ 2 versus a reference conversion for a standard test chart on color-managed devices And draft outputs display a semi-transparent watermark and have downloads disabled: no download button, right-click save blocked, and unauthenticated direct asset requests return 403 And users with only view permissions cannot obtain an unwatermarked draft via any viewer control or URL And admins with explicit override can download, and the action is logged
One-Click Promote & Rollback
"As a product owner, I want to promote an approved draft to live with one click and a rollback option so that I can deploy improvements safely and quickly."
Description

Provide a guarded promotion flow that atomically sets a draft preset as the live version with optional scheduling and canary validation on a small set. Enforce pre-promotion checks (required approvals, passing quality gates, no active edits) and generate an audit trail with version notes. Support immediate rollback to the previous live version and notify stakeholders via in-app alerts and webhooks. Promotion is safe, reversible, and observable, minimizing production risk while streamlining deployment of approved styles.

Acceptance Criteria
Immediate Promotion - Pre-checks and Atomic Swap
Given a draft preset D exists with all required approvals completed, all configured quality gates passing, and no active edits on D And the user has Promote permissions for the workspace And there is an existing live preset version L When the user clicks Promote Now for D and confirms Then the system atomically sets D as the new live version L+1 with no partial/visible intermediate state And the previous live version is recorded as Previous Live And the promotion request returns success with the new live version identifier And no other presets are modified And the Draft Sandbox marks D as Promoted and locks further edits until a new draft is created
Scheduled Promotion - Execution and Failure Handling
Given a draft preset D is scheduled for promotion at a future UTC timestamp T with optional freeze window checks enabled And at scheduling time, D satisfies all pre-promotion checks When the system clock reaches T Then the system re-evaluates all pre-promotion checks at execution time And if all checks pass, D is promoted atomically to live (L+1) and the schedule is cleared And if any check fails (e.g., new edit detected, approval revoked, quality gate fail), the promotion is not executed, the schedule is canceled, and stakeholders are notified with the failure reason And if the user cancels the schedule before T, no promotion occurs and no state changes are applied
Canary Validation - Sample Guard and Gates
Given a draft preset D has a defined canary sample set S (by percentage or explicit list) within allowed bounds (e.g., 1–5% of catalog or 50–500 images) And quality gate thresholds are configured for the canary (e.g., background accuracy, defect rate, color variance) When the user starts a canary run for D Then the system applies D only to S and leaves the current live version for all non-S items And the system generates a canary report with metrics, pass/fail per gate, and side-by-side thumbnails for S And if Auto-abort on fail is enabled and any gate fails, the promotion action is blocked and live remains unchanged And if all gates pass, the user can finalize promotion in one click without reprocessing S
One-Click Rollback - Atomic Restore
Given a promotion from L to L+1 has occurred and the prior live version L is retained When the user clicks Rollback and confirms Then the system atomically restores L as the live version and marks L+1 as Rolled Back And any pending schedules tied to L+1 are canceled And an audit entry is recorded with actor, reason, and correlated promotion id And stakeholders are notified via in-app alert and webhook of the rollback outcome
Audit Trail & Version Notes - Complete and Immutable
Given a promotion or rollback action is initiated When the action completes (success or failure) Then an immutable audit record is created containing: actor, action type (promote/rollback), draft id, from_version, to_version, timestamps (started/completed), pre-check results, canary metrics snapshot (if any), outcome, and webhook delivery summary And promotion requires non-empty version notes (minimum 5 characters) which are stored in the audit record And audit records are queryable by time range, actor, preset id, and outcome And audit records cannot be edited or deleted by end users
Stakeholder Notifications - In-App and Webhooks
Given notification subscriptions exist for promotion and rollback events When a promotion or rollback succeeds or fails Then targeted stakeholders receive in-app alerts containing version ids, action, outcome, and links to audit and diffs And configured webhooks are sent with a signed payload schema including event type, preset id, from_version, to_version, actor, outcome, and timestamps And webhook delivery uses retry with exponential backoff for transient failures and records final delivery status And no notifications are sent for scheduled attempts that are canceled before execution
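The "signed payload" requirement is commonly implemented as an HMAC over the serialized body. The header name, `sha256=` prefix, and payload fields below are illustrative assumptions based on the criteria, not a documented PixelLift scheme.

```python
import hashlib
import hmac
import json

def sign_webhook(payload: dict, secret: bytes) -> dict:
    """Serialize the payload canonically and attach an HMAC-SHA256 signature."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return {"body": body, "headers": {"X-PixelLift-Signature": "sha256=" + signature}}

def verify_webhook(body: bytes, signature_header: str, secret: bytes) -> bool:
    """Receiver side: recompute the HMAC and compare in constant time."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)
```

Receivers that verify the signature can reject forged or tampered deliveries before acting on promotion/rollback events.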
Concurrency & Idempotency - Safe, Single-Action Semantics
Given two users attempt to promote the same draft D concurrently When both requests are submitted within a short window Then only one promotion succeeds and the other receives a conflict response with no state change And if a client retries the same promotion using the same idempotency key, the server processes it at most once and returns the original result And if any active edit session exists on D at the moment of execution, the promotion is blocked with a clear error referencing the active editor/session And promote/rollback endpoints are linearizable: read-after-write returns the new live version immediately after success
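At-most-once semantics per idempotency key can be sketched with a keyed result store; a real service would persist results durably and scope keys per workspace (both are assumptions here).

```python
import threading

class IdempotentPromoter:
    """Process each idempotency key at most once; replays return the original result."""

    def __init__(self):
        self._lock = threading.Lock()
        self._results = {}

    def promote(self, idempotency_key, do_promote):
        with self._lock:  # concurrent duplicates serialize here; one wins
            if idempotency_key in self._results:
                return self._results[idempotency_key]  # replay: original result, no re-execution
            result = do_promote()
            self._results[idempotency_key] = result
            return result
```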
Access Control & Approvals
"As an admin, I want role-based permissions and approvals for drafts so that only authorized changes move to production."
Description

Implement role-based access control for draft creation, editing, viewing, and promotion. Configure approval workflows with required reviewers, minimum approvals, and optional policy checks by workspace. Provide audit logs of all actions, reviewer comments, and decisions, plus secure share links for external reviewers with expiration and watermarking. Integrate with SSO/SCIM roles and existing organization permissions. This ensures proper governance without slowing small teams, balancing control and velocity.

Acceptance Criteria
RBAC: Draft Create/Edit/View/Promote Enforcement
Given a user authenticates via SSO and is assigned one of: Admin, Draft Editor, Reviewer, Viewer And the workspace has default permissions applied When the user attempts each action on drafts via UI and API: create, edit, view, submit for review, approve/decline, promote Then access is allowed only per role policy:
- Admin: all actions
- Draft Editor: create, edit, view, submit for review; cannot approve or promote
- Reviewer: view, comment, approve/decline; cannot create, edit, or promote
- Viewer: view only; cannot create, edit, approve/decline, or promote
And disallowed actions return HTTP 403 on API and disabled controls with tooltip "Insufficient permissions" in UI And each denied attempt is recorded in the audit log with actor, action, timestamp, and reason "permission_denied"
Approval Workflow: Required Reviewers and Minimum Approvals
Given a workspace admin configures an approval workflow with required reviewers [A, B] and minimum approvals = 2 And "allow substitutes" is disabled When a draft is submitted for review Then only listed required reviewers may approve And promotion eligibility requires approvals >= 2 and approvals include both A and B And if a required reviewer declines, the draft enters "Changes Requested" until resubmitted And the system prevents marking the review complete if any required reviewer has not approved
Gatekeeping: Promotion Blocked Until Approvals and Policy Checks Satisfied
Given a draft has at least the configured minimum approvals and all required reviewers have approved And all configured policy checks (e.g., background removal verification, resolution >= 2000px) have passed When a user with Promote permission clicks Promote or calls POST /drafts/{id}/promote Then the draft is promoted and a new preset version is created and timestamped And if any precondition is unmet, the promotion is blocked with a consolidated error message listing unmet items And a promotion event (success or blocked) is written to the audit log including approver list and policy check results
Audit Log: Complete, Immutable, and Exportable Records
Given any action on drafts or workflows occurs (create, edit, submit, approve, decline, comment, share link create/revoke, promote, permission denied) When viewing the audit log for a draft or workspace Then each event includes: actor id and display name, actor role, action type, object id and version, ISO-8601 timestamp (UTC), IP address, result (success/denied), and optional comment text or reason And log entries are append-only (no update/delete APIs) and have a monotonically increasing sequence id And logs can be exported by Admin as CSV and JSON for a specified date range And logs are retained for at least 365 days
External Review: Expiring Watermarked Share Links
Given a Draft Owner generates an external review link scoped to a specific draft with expiry = 7 days and comment-only permission When an external reviewer opens the link Then the reviewer can view watermarked images and leave comments but cannot download originals or promote And every image is overlaid with the workspace watermark text and draft id And the link becomes unusable after expiry or immediate revocation, returning HTTP 410 via API and a "Link expired" page in UI And all views and comments via the link are attributed to "External Reviewer" with a per-link token id in the audit log
SSO/SCIM: Role Mapping and Permission Inheritance
Given organization SSO is configured and SCIM is enabled with group-to-role mappings (e.g., okta_group_editors -> Draft Editor) When a user is added to or removed from a mapped IdP group Then their PixelLift role updates within 5 minutes and access reflects the new role on next request And deprovisioned users lose all access within 60 seconds of SCIM delete and active sessions are revoked And local role changes in PixelLift are overridden by SCIM-sourced mappings on next sync and recorded in the audit log
Small Team Fast-Path: Streamlined Workflow Without Governance Loss
Given a workspace sets Minimum Approvals = 0 and has no Required Reviewers configured When a Draft Editor submits and then promotes their own draft Then promotion is allowed without reviewer approvals, provided all policy checks pass And the promotion is fully audited (including actor, policy results, and before/after preset version) And if Required Reviewers or Minimum Approvals > 0 are later configured, subsequent promotions require the new approvals
Quality & Impact Summary
"As a QA reviewer, I want a quality summary and pass/fail gates for draft runs so that promotion decisions are data-driven and consistent with brand standards."
Description

After each batch run, generate a summary report with visual and quantitative indicators: background removal confidence, color delta, exposure and white balance shifts, sharpness and noise metrics, crop/size conformance, processing time, error rates, and estimated cost. Highlight outliers and failures, show per-image thumbnails, and compute pass/fail against configurable thresholds. Provide export (CSV/JSON) and link the report to promotion gates. This gives data-backed evidence to judge whether a draft meets brand standards and operational constraints.

Acceptance Criteria
Auto-Generated Batch Quality Summary
Given a Draft Sandbox batch run with up to 500 images completes processing When the run finishes Then a Quality & Impact Summary is generated within 60 seconds and attached to the run And the summary includes per-image fields: image_id, background_removal_confidence (0.00–1.00, 2 decimals), color_delta_E00, exposure_shift_EV, white_balance_shift_K, sharpness_variance, noise_SNR_dB, crop_size_conformance (true/false), processing_time_ms, error_code, estimated_cost_cents And the summary includes batch aggregates: total_images, failed_images, error_rate_percent (2 decimals), total_processing_time_ms, total_estimated_cost_cents, mean/median/min/max for each numeric metric And missing per-image metrics are recorded as null without failing summary generation
Outlier Detection and Highlighting
Given a threshold profile is configured for the draft's preset When the summary is generated Then any metric outside its threshold is flagged per image with outlier=true and reason_codes And the batch shows outlier_count equal to the number of images with ≥1 violation And toggling "Show Outliers" lists only outlier images sorted by severity within 1 second And updating thresholds recalculates outlier flags within 5 seconds without reprocessing images
Per-Image Thumbnails in Report
Given the summary is viewed in the Draft Sandbox When per-image rows render Then each row displays a processed-image thumbnail with max dimension 200 px, WebP format, and file size ≤ 50 KB And thumbnails lazy-load and render above-the-fold rows within 500 ms on a 3G fast network And clicking a thumbnail opens the full-size processed image in a new tab with the image_id in the URL
Threshold-Based Pass/Fail Computation
Given a threshold profile T is active When the summary is generated or thresholds change Then image_status is Pass if all thresholded metrics are within bounds and error_code is null; otherwise Fail with reason_codes populated And batch_status is Pass if at least 95% of images pass and error_rate_percent ≤ the configured limit (default 2%); otherwise Fail And the header displays pass_count, fail_count, and pass_percent with two decimals And status recalculates within 5 seconds of a threshold change without reprocessing images
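The batch roll-up reduces to a small pure function; a sketch using the defaults stated above (95% pass floor, 2% error-rate limit). The function and field names are illustrative.

```python
def batch_status(image_statuses, error_rate_percent, error_rate_limit=2.0):
    """Compute batch pass/fail: >=95% of images pass AND error rate within limit."""
    total = len(image_statuses)
    passed = sum(1 for s in image_statuses if s == "Pass")
    pass_percent = 100.0 * passed / total if total else 0.0
    ok = pass_percent >= 95.0 and error_rate_percent <= error_rate_limit
    return {
        "pass_count": passed,
        "fail_count": total - passed,
        "pass_percent": round(pass_percent, 2),  # two decimals, per the header spec
        "batch_status": "Pass" if ok else "Fail",
    }
```

Because both conditions are required, a batch with every image passing still fails overall if its error rate exceeds the configured limit.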
CSV and JSON Export of Summary
Given a user clicks Export CSV or Export JSON on a completed summary with up to 1,000 images When the export is requested Then the file downloads within 10 seconds and contains one row/object per image plus batch-level metadata And CSV is UTF-8, comma-delimited, includes a header row, and row count equals total_images + 1 And JSON validates against schema id "pixellift.qualitySummary.v1" with top-level fields batch and images[] And exported numeric fields preserve at least the precision shown in the UI
Cost and Processing Time Accuracy
Given pricing rules and operation counts used in the batch are known When estimated_cost_cents and processing_time_ms are computed Then per-image estimated_cost_cents matches the pricing table within ±2% and rounds to the nearest cent And batch total_estimated_cost_cents equals the sum of per-image estimates And per-image processing_time_ms equals the sum of stage times within ±5% and batch total equals the sum of per-image times
Promotion Gate Enforcement via Summary
Given a user attempts to Promote a draft to Live When the Quality & Impact Summary batch_status is Pass Then the Promote action is enabled and requires the summary_id to be linked to the promotion record And when batch_status is Fail, the Promote action is disabled unless the user has OverridePromotion permission and enters an override_reason (≥10 characters) And all promotion attempts are audit-logged with user_id, timestamp, batch_status, outlier_count, and total_estimated_cost_cents

Smart Approvals

Configurable approval chains with SLAs, change diffs, and auto-escalation. Approvers get concise visual summaries and can approve from email or Slack, keeping launches on schedule while ensuring significant edits receive proper oversight.

Requirements

Dynamic Approval Chains & Rules Engine
"As a brand operations manager, I want to configure conditional approval chains for batches so that significant edits get proper oversight without slowing routine work."
Description

Provide configurable, multi-step approval workflows for PixelLift projects and batches, supporting sequential and parallel stages, conditional routing based on asset metadata (product category, brand, risk score, change magnitude), and policy templates. Includes per-step SLAs, required approver roles, minimum quorum, and thresholds that define significant edits (e.g., background replaced, retouch intensity above a set level). Integrates with PixelLift job orchestration so batch processing pauses at approval gates and resumes automatically upon approval. Offers UI for creating, cloning, and simulating workflows, with versioning and safe rollout by workspace.

Acceptance Criteria
Sequential and Parallel Approval with Role Quorum
Given a workflow "PL-Workflow-1" with Stage 1 (sequential) requiring role=Brand Manager and quorum=2 of 3 And Stage 2 (parallel) contains "Legal Review" (role=Legal Counsel, quorum=1) and "QA Review" (role=QA Lead, quorum=1) And a batch job "B-1001" enters Stage 1 When two distinct users with role=Brand Manager approve within Stage 1 Then the job advances to Stage 2 and both "Legal Review" and "QA Review" open concurrently And additional approvals for Stage 1 after quorum are ignored and logged When one Legal Counsel and one QA Lead approve their respective parallel steps Then the job exits Stage 2 and continues processing And the audit log records approver IDs, timestamps, and stage outcomes
Conditional Routing by Metadata and Significant Edit Thresholds
Given a policy with rule: if product_category in ["Beauty","Skincare"] OR risk_score >= 70 OR significant_edit=true then route to "Compliance Review" else route to "Standard Review" And significant_edit is defined as (background_replaced=true OR retouch_intensity >= 60) And batch "B-2002" has asset metadata: product_category="Beauty", retouch_intensity=72, background_replaced=true, risk_score=65 When the workflow evaluates routing for the batch Then the batch is routed to "Compliance Review" And the evaluation report lists matched conditions ["product_category","significant_edit"] and non-matched ["risk_score"] When asset metadata is product_category="Shoes", retouch_intensity=40, background_replaced=false, risk_score=20 Then the batch is routed to "Standard Review"
Per-Step SLA, Reminders, and Auto-Escalation
Given Stage "Brand Approval" has SLA=24h, reminder cadence=6h, and escalation target="Brand Director" group after SLA breach And batch "B-3003" enters the stage at T0 When no quorum is achieved by T0+24h Then the system marks the stage "Overdue" and sends an escalation notification to the "Brand Director" group And reminders were sent at T0+6h, T0+12h, and T0+18h to pending approvers And an escalation approver's approval satisfies quorum When quorum is met before SLA breach Then no escalation is sent and stage status is "Completed on time"
Email and Slack One-Click Approvals with Secure Tokens
Given approver Alice is assigned to Stage "Legal Review" And the system sends an email and Slack message with Approve/Reject buttons embedding a single-use token that expires in 24h When Alice clicks Approve in Slack within 24h Then the approval is recorded, the token is invalidated, and the stage updates in under 3 seconds And the Slack message updates to show the decision and an audit link When the same token is used again or after expiry Then the action is rejected with "Token invalid or expired" and no state change occurs And the audit log captures channel, user ID, IP, timestamp, and comment (if provided)
Pause and Auto-Resume at Approval Gates in Job Orchestration
Given batch "B-4004" has an approval gate after the "Background Removal" step When the gate opens Then the pipeline pauses downstream tasks for that batch and releases compute resources for the paused job When approval quorum is reached Then the pipeline resumes automatically within 10 seconds and continues at the next step When the stage is rejected Then remaining downstream steps are canceled and the batch status is set to "Changes Requested" with notifications sent to the submitter
Workflow Builder: Create, Clone, Simulate, and Versioned Rollout
Given a user with role "Workflow Admin" opens Workflow Builder When they create workflow v1.0 with three stages and save Then the workflow is versioned as 1.0 and set to Draft until rolled out to Workspace A When they clone v1.0 to v1.1, edit rules, and run simulation against sample assets A1 and A2 Then the simulator shows per-asset routes, matched rules, and SLAs without executing jobs When v1.1 is rolled out to Workspace A with rollout mode="Gradual 25%" Then new jobs are assigned v1.1 in 25% of cases and in-flight jobs remain on their original version When rollout is promoted to 100% and then v1.1 is reverted Then new jobs return to the previous stable version and a change log entry is created
Concise Visual Summaries and Change Diffs for Approvers
Given an approval stage is opened for batch "B-5005" When approvers open the summary in email or Slack Then they see per-asset before/after thumbnails, detected edits (e.g., "Background replaced", "Retouch intensity: 72"), risk score, and SLA remaining time And the summary loads in under 2 seconds for up to 100 assets with total payload under 5 MB And all images include alt-text and the content is keyboard and screen-reader accessible When approvers click "View full diff" Then a web view shows side-by-side large previews with annotations and a downloadable change report (PDF) linked in the audit trail
Visual Diff Summaries for Batches
"As an approver, I want concise visual diffs and batch rollups so that I can review hundreds of images quickly and catch high-impact changes."
Description

Generate concise, visual summaries per asset and per batch that include before/after sliders, annotated lists of transformations, change heatmaps, and a significance score for quick triage. Provide batch-level rollups showing counts of high-significance edits, outliers, and flagged items. Enable quick filters and sampling tools for fast review (e.g., review a 10% sample or all items over a threshold). Embed bandwidth-friendly thumbnails in notifications and deep link to full-resolution proofing in PixelLift.

Acceptance Criteria
Per-Asset Before/After Slider
- Given an approver opens an asset summary from a processed batch, When the page loads, Then a before/after slider is displayed above the fold showing Original vs Enhanced.
- Given the user drags the slider, When sliding from 0% to 100%, Then the two images remain pixel-aligned (≤1px deviation) and no layout reflow occurs.
- Given standard network conditions (5 Mbps, 100 ms RTT), When the asset summary loads cold, Then both images are visible and draggable within 1.5 seconds.
- Given the user drags the slider, Then interaction latency stays under 200 ms and frame rate ≥ 50 FPS on a mid-tier laptop and a modern mobile device.
- Given the user clicks "Zoom 100%", When zoom is active, Then the slider continues to function on the zoomed image without loss of alignment.
Annotated Transformation List Accuracy
- Given a processed asset, When viewing the Transformations panel, Then a chronological, human-readable list of applied operations is shown with parameters (e.g., Background removed; Exposure +0.7 EV; Crop 4:5).
- Given the pipeline metadata, When compared to the UI list, Then 100% of applied operations are present, ordered correctly, and parameter values match within ±1% for numeric values.
- Given a transformation in the list, When the user hovers or clicks it, Then the affected region(s) highlight on the visual and the panel focuses the selected item.
- Given an asset with no edits, When viewing the panel, Then the UI displays "No changes applied" and hides irrelevant controls.
Pixel Change Heatmap Visualization
- Given a processed asset, When the user toggles Change Heatmap, Then a heatmap overlay appears with a legend and adjustable sensitivity from 0–100%.
- Given a known test pair with synthetic differences, When generating the heatmap, Then ≥95% of modified pixels are detected and ≥95% precision is achieved for unchanged pixels.
- Given the user downloads the heatmap, When clicking Export, Then a PNG of the overlay at asset resolution downloads within 3 seconds for assets up to 24 MP.
- Given the user toggles off the heatmap, Then the base image returns to normal with no residual overlay.
Significance Score and Threshold Filtering
- Given an asset summary, When displayed, Then a significance score between 0 and 100 is shown with a badge (Low/Medium/High) mapped to configurable thresholds (default: Low 0–39, Medium 40–69, High 70–100).
- Given a batch, When applying a filter Score ≥ T, Then 100% of assets with score ≥ T are included and 0% with score < T are included.
- Given thresholds are updated by an admin, When filters and badges refresh, Then counts and badges update within 1 second without a full page reload.
- Given identical inputs and model version, When the same asset is scored twice, Then the score difference is ≤ ±2 points.
Batch-Level Rollups and Outlier Detection
- Given a batch of N assets, When viewing the batch summary, Then rollups display: total assets, counts of High/Medium/Low significance, Outliers, and Flagged.
- Given rollup counts, When cross-checked via corresponding filters, Then counts exactly match the items returned.
- Given default outlier detection (z-score > 2.5 on the batch significance distribution), When enabled, Then qualifying assets are labeled Outlier; if the method is switched to IQR, rollups recompute within 2 seconds for up to 5,000 assets.
- Given a batch of up to 5,000 assets, When opening the batch summary, Then rollups compute and render within 3 seconds on first load and within 1 second on subsequent loads (cached).
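The default z-score rule can be sketched with the standard library. `zscore_outliers` is an illustrative name, and it uses the population standard deviation, which is an assumption since the spec does not say sample vs. population.

```python
import statistics

def zscore_outliers(scores, threshold=2.5):
    """Return indices of assets whose significance score lies more than
    `threshold` population standard deviations from the batch mean."""
    mean = statistics.fmean(scores)
    stdev = statistics.pstdev(scores)
    if stdev == 0:
        return []  # all scores identical: nothing is an outlier
    return [i for i, s in enumerate(scores) if abs(s - mean) / stdev > threshold]
```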
Sampling Tools: Percentage and Seeded Randomization
- Given a batch, When the user selects Sample 10%, Then the system returns a random subset of ceil(0.10 × N) unique assets.
- Given a seed value S, When the user sets Seed = S and applies the same sampling percentage again, Then the returned subset is identical across sessions and devices.
- Given combined filters, When the user applies Score ≥ T and then Sample 10%, Then sampling operates only on the filtered set.
- Given sampling is active, When viewing the grid, Then the UI displays the sample size and provides a control to Review All to exit sampling.
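Seeded, reproducible sampling follows directly from a seeded PRNG; a sketch (sorting the result is an assumed presentation choice, not required by the criteria).

```python
import math
import random

def sample_assets(asset_ids, percent, seed):
    """Return a reproducible random sample of ceil(percent * N) unique assets.

    The same (asset_ids, percent, seed) triple yields the same subset
    on any machine, satisfying the cross-session determinism criterion.
    """
    k = math.ceil(percent * len(asset_ids))
    return sorted(random.Random(seed).sample(asset_ids, k))
```

To sample only a filtered set, pass the filtered ids in: `sample_assets([a for a in ids if score[a] >= T], 0.10, seed)`.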
Notifications: Thumbnails and Deep Links to Proofing
- Given email and Slack notifications are enabled, When a batch completes, Then notifications include per-asset thumbnails and a batch-level summary; each thumbnail is ≤150 KB and ≤800 px on the longest edge.
- Given a user clicks a thumbnail or Open in PixelLift, When the link opens, Then the full-resolution proofing view loads with the corresponding batch/asset and any filter context applied.
- Given standard network conditions (5 Mbps, 100 ms RTT), When opening a deep link, Then the first full-resolution image is visible within 2 seconds and remaining assets stream progressively.
- Given clients that block external images, When viewing the notification, Then alt text is present and links remain functional with readable layout.
SLA Timers with Auto-Reminders & Escalation
"As a project lead, I want SLA timers with auto-escalation so that approvals stay on schedule and delays are handled proactively."
Description

Track SLA per approval step with business-hours calendars and time-zone awareness. Send proactive reminders before due times and follow-ups after breaches, escalating to backup approvers or managers as configured. Support escalation trees, snooze options, OOO auto-reassignment, and pause/resume for holidays. Surface SLA status in dashboards and provide metrics (average approval time, breach rate) for operational reporting. Log all reminders and escalations in the audit trail.

Acceptance Criteria
Business-Hours & Time-Zone SLA Computation
- Given an approval step with an 8-business-hour SLA and an approver in America/Los_Angeles with business hours 09:00–17:00 Mon–Fri, when the request is submitted Friday at 16:30 local time, then the due time is Monday 16:30 local time (0.5 business hours on Friday plus 7.5 on Monday) and only business hours are counted.
- Given the requester is in Europe/Berlin, when viewing the same step, then the due timestamp is displayed localized to the viewer without changing the underlying due moment.
- Given a DST transition (spring forward) occurs during the SLA window, when computing remaining time, then the lost hour is not counted as business time and the due time reflects correct business-hour math.
- Given business hours exclude weekends, when a step is submitted Saturday, then the SLA countdown begins at 09:00 Monday in the approver’s time zone.
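Business-hour arithmetic like the Friday example can be checked with a small walk-the-clock routine. A naive-datetime sketch assuming fixed 09:00–17:00 Mon–Fri hours; a production version would use the approver's IANA time zone so DST transitions and holiday calendars are honored:

```python
from datetime import datetime, timedelta

BUSINESS_START, BUSINESS_END = 9, 17  # 09:00-17:00, Mon-Fri (assumed fixed)

def add_business_hours(start: datetime, hours: float) -> datetime:
    """Advance `start` by `hours`, counting only business hours."""
    remaining = timedelta(hours=hours)
    t = start
    while True:
        if t.weekday() >= 5 or t.hour >= BUSINESS_END:
            # Outside the window: jump to 09:00 the next calendar day.
            t = (t + timedelta(days=1)).replace(
                hour=BUSINESS_START, minute=0, second=0, microsecond=0)
            continue
        if t.hour < BUSINESS_START:
            t = t.replace(hour=BUSINESS_START, minute=0, second=0, microsecond=0)
            continue
        end_of_day = t.replace(hour=BUSINESS_END, minute=0, second=0, microsecond=0)
        if end_of_day - t >= remaining:
            return t + remaining  # due time falls inside this business day
        remaining -= end_of_day - t
        t = end_of_day
```

Running the spec's example (2025-01-03 is a Friday): 16:30 + 8 business hours consumes the last half hour on Friday, skips the weekend, and lands at Monday 16:30.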
Pre-Due Reminders with Snooze
- Given a step with reminders configured for T-24h and T-2h, when those thresholds are reached within the approver’s business hours, then reminders are sent via the configured channels (email and/or Slack) with remaining time shown in the message.
- Given a reminder threshold occurs outside business hours, when the next business hour window opens, then exactly one queued reminder is sent and no duplicates are produced.
- Given the approver clicks Snooze for 2 hours in a reminder, when the snooze window is active, then no additional reminders are sent for that step and reminders resume after snooze expires.
- Given a per-step max reminder count of 3, when more than 3 reminders would be triggered, then additional reminders are suppressed and the suppression is recorded.
Breach Follow-Up and First-Level Auto-Escalation
- Given a step breaches its SLA, when breach is detected, then a follow-up notification is sent immediately to the current approver and the step status is marked Breached in UI and API.
- Given an escalation rule with a 1-business-hour grace period to Backup Approver A, when the breach remains unresolved for 1 business hour, then the step is escalated to A per configuration (reassign or parallel) and A is notified via configured channels.
- Given the original approver approves during the grace period, when the step becomes approved, then the scheduled escalation is canceled and the status updates to Approved without an escalation event.
Multi-Level Escalation Trees with Skip/Stop Conditions
- Given an escalation tree [Primary -> Backup -> Manager Group -> Director], when each escalation interval elapses without approval, then the step escalates to the next node and notifications are sent at each level.
- Given a node in the tree is ineligible (OOO or already approved), when escalation reaches that node, then the node is skipped and escalation proceeds to the next eligible node.
- Given any user in the current node approves, when approval is recorded, then all pending downstream escalations are canceled and the escalation chain stops at that level.
- Given the final node is reached and no approval occurs within its SLA, when the final interval elapses, then no further escalation occurs and the step is flagged for manual intervention in UI and API.
OOO Auto-Reassignment to Delegates
- Given the primary approver has an active OOO window with delegate D, when a step is assigned to the primary, then it is auto-reassigned to D, notifications are sent to D, and the due time is recalculated using D’s business-hours calendar and time zone.
- Given the primary becomes OOO after assignment, when OOO activates, then any currently assigned, unapproved steps are reassigned to the delegate per policy and both users are notified.
- Given no delegate is configured, when the primary is OOO at assignment time, then the step escalates to the configured backup path per the escalation rules and this action is recorded.
Holiday Pause and Resume
- Given the approver’s holiday calendar marks a day as a holiday, when an SLA window spans that day, then the SLA countdown pauses at the start of the holiday and resumes at the next business day start, extending the due time accordingly.
- Given a step is submitted during a holiday, when computing the SLA start, then the countdown begins at the next business day start in the approver’s time zone.
- Given multiple consecutive holidays or weekends, when computing due time, then all non-business days are excluded from the SLA calculation and the remaining time is accurate to the hour.
- Given an SLA is paused for a holiday, when viewing the step, then the UI displays Paused with reason Holiday and shows paused duration not counted toward SLA.
SLA Visibility, Reporting & Audit Trail
- Given an approval request, when viewed in the dashboard, then each step displays status (On Track, At Risk ≤20% SLA remaining, or Breached), remaining/elapsed business hours, due timestamp, and current escalation level.
- Given filters for date range, team, and status are applied, when metrics load, then average approval time (business hours), breach rate, and p50/p90 are computed from the filtered set and match values derived from raw events.
- Given an export is requested, when the CSV is generated, then it includes per-step fields (IDs, assignees, due times, statuses, SLA durations, escalation levels) with timestamps in UTC and the viewer’s local offset.
- Given any reminder, snooze, reassignment, or escalation event occurs, when inspecting the audit trail, then an entry exists with timestamp, actor=system, recipients, channel(s), template ID or subject, outcome (queued/sent/delivered if available), and related step IDs.
Actionable Email & Slack Approvals
"As a busy approver, I want to approve from email or Slack with clear context so that I can keep launches moving without logging into the app."
Description

Deliver actionable approval requests via email and Slack with secure, one-click Approve, Request Changes, and Reject actions. Include batched thumbnails, key metrics, and top diffs in the message for context. Use signed, expiring tokens and SSO handoff to allow secure actions without full login. Support bulk decisions, inline comments, and quick filters within Slack modals, with reliable fallback deep links to the web app if interactivity is blocked.

Acceptance Criteria
One-Click Email Approval with Secure Token
Given an approver receives an approval email with Approve, Request Changes, and Reject CTAs
When the approver clicks a CTA within the token TTL
Then the action is executed without full login via SSO handoff and a confirmation screen shows decision, item count, and reference ID
And the token is single-use; subsequent clicks return an "Expired or Used" page and do not change state
And clicks after TTL return an "Expired" page with a deep link to the web app approval view
And the decision, channel=email, user, timestamp, and IP are audit logged
Slack Modal Quick Filters and Actions
Given an approver opens a Slack approval modal from a PixelLift notification or shortcut
When they apply quick filters (e.g., submitter, brand preset, priority, due soon) and select a subset of items
Then the list updates in-modal and selection persists across filters
And Approve/Reject/Request Changes actions are available for single or multiple selected items
And an ephemeral Slack confirmation summarizes outcomes with links to details
Bulk Decisions with Per-Item Outcomes
Rule: Up to 200 items can be actioned in a single bulk operation from Slack or email deep link
Rule: Bulk operations return per-item outcomes; successes are committed, failures remain pending with reasons
Rule: For 100 items, 95th percentile server processing time is ≤ 8 seconds; UI shows progress/spinner until completion
Rule: If any item requires a comment for Request Changes, the modal enforces a comment before submission
Context-Rich Visual Summaries in Messages
Rule: Email and Slack messages include batched thumbnails (up to 6) with alt text, key metrics (item count, presets applied, background type), and top diffs (e.g., background removed, retouch strength, style preset)
Rule: A "View all" link opens the approval page with the same selection pre-applied
Rule: Diffs link to a side-by-side before/after view in the web app
Inline Comment Capture on Request Changes
Given an approver chooses Request Changes for one or more items
When they submit the action
Then a comment is required (minimum 3 characters) and supports @mentions
And the comment is stored on each affected item’s thread, visible in the web app within 5 seconds
And submitters receive a notification with the comment and item links
Secure Tokens, SSO Handoff, and Authorization
Rule: Action links/buttons use signed, single-use, expiring tokens bound to approval ID and recipient user; configurable TTL (5–60 min), default 15 min
Rule: Only assigned approvers with permission can complete the action; unauthorized attempts return 403 and do not change item state
Rule: SSO handoff allows the action to complete without full login; no persistent session is created unless the user explicitly continues to the app
Rule: All token validations and decisions are audit logged with channel, user, decision, item IDs, and timestamp
Fallback Deep Links When Interactivity Is Blocked
Given Slack interactivity is disabled or times out, or an email client blocks action buttons
When the approver uses the fallback deep link
Then the web app opens to the approval view with the same context (pre-filtered selection) and offers Approve/Request Changes/Reject and bulk actions
And SSO redirect signs the user in if needed without losing context
And the fallback path is tracked as channel=web_fallback in audit logs
Audit Trail & Compliance Exports
"As a compliance officer, I want an immutable audit trail and exports so that we can prove proper oversight during audits."
Description

Maintain immutable, tamper-evident logs of every approval decision, timestamp, approver identity, content viewed (diffs, summaries), and justification notes. Link decisions to exact asset versions and the workflow version used at the time. Provide exports to CSV/JSON and webhook delivery to external DAM, QA, or governance systems. Include e-discovery search, configurable retention policies, and data residency controls aligned to workspace regions.

Acceptance Criteria
Immutable Tamper-Evident Approval Log
Given an approval event is recorded
When any user attempts to modify or delete a prior log entry through UI or API
Then the system rejects the operation with HTTP 403 and records an audit-violation event
Given the audit log integrity check is executed
When hashes are recomputed
Then each record contains a hash and previous_hash that form an unbroken chain without gaps
Given an export with integrity manifest is generated
When the manifest checksum is validated
Then recomputed checksums match the published manifest and the export is marked valid
Given a simulated storage crash and recovery
When the service restarts
Then all committed log records persist and sequence numbers remain strictly increasing without duplicates
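The hash/previous_hash chain described above is a standard tamper-evidence construction. A minimal sketch, assuming canonical JSON serialization of each event and an all-zero genesis hash (both assumptions; the spec fixes only the chained-hash requirement):

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed sentinel for the first record's previous_hash

def record_hash(event: dict, previous_hash: str) -> str:
    """Hash the event together with its predecessor's hash."""
    body = json.dumps(event, sort_keys=True).encode()  # canonical serialization
    return hashlib.sha256(previous_hash.encode() + body).hexdigest()

def append_event(log: list, event: dict) -> None:
    """Append an audit record carrying hash and previous_hash."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "previous_hash": prev,
                "hash": record_hash(event, prev)})

def verify_chain(log: list) -> bool:
    """Recompute every hash; any edit, deletion, or reorder breaks the chain."""
    prev = GENESIS
    for rec in log:
        if rec["previous_hash"] != prev or record_hash(rec["event"], prev) != rec["hash"]:
            return False
        prev = rec["hash"]
    return True
```

Because each hash covers its predecessor, altering any committed record invalidates every record after it, which is exactly the "unbroken chain without gaps" property the integrity check asserts.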
Complete Event Data Capture
Given an approval decision is submitted via any channel (Web, Email, Slack, API)
When the event is persisted
Then the record includes event_id, event_type, decision, timestamp (ISO 8601 UTC), approver_id, approver_email, auth_method, ip_address, user_agent, requester_id, request_id, asset_id, asset_version_id, workflow_id, workflow_version, viewed_artifacts (diff_id, summary_id), justification_text (nullable), sla_at_decision, source_channel
And required fields are non-null and validated against allowed enumerations; optional fields are null when not applicable
And timestamps are monotonic per request_id and include millisecond precision
Decision Linkage to Exact Versions
Given an approval is made on asset version Vn under workflow version Wm
When the audit record is created
Then it stores asset_id, asset_version_id=Vn, asset_content_hash, workflow_id, workflow_version=Wm, and diff_version_id used at decision time
And when the asset or workflow is later updated, the historical record continues to resolve to Vn and Wm without ambiguity
And when replaying the event, the system retrieves the exact versions referenced by the record
CSV/JSON Export with Filtering and Scale
Given a workspace admin requests an export with date range, event types, approvers, and field selection
When the export is generated
Then the system produces a ZIP containing NDJSON (.jsonl) and CSV files matching the selected schema and a data dictionary
And CSV conforms to RFC 4180 with UTF-8 encoding and header row; JSON is one event per line; timestamps are ISO 8601 UTC
And exports up to 1,000,000 events complete within 30 minutes and are chunked into files of <= 100 MB with deterministic filenames
And the export includes an integrity manifest with SHA-256 checksums for each file
Webhook Delivery with Idempotency and Security
Given a subscriber registers a webhook with endpoint URL and HMAC secret
When new audit events occur
Then the system sends POST requests within 30 seconds containing the event payload, a unique delivery_id, and an Idempotency-Key header
And each request includes an X-Signature header with SHA-256 HMAC of the body and a timestamp so receivers can verify authenticity and freshness
And on non-2xx responses or timeouts, deliveries are retried with exponential backoff for up to 24 hours, then marked failed and an alert is emitted
And deliveries for the same request_id preserve order, and duplicate processing is prevented via Idempotency-Key reuse
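The X-Signature scheme can be sketched as an HMAC-SHA-256 over the timestamp plus raw body. The `timestamp.body` layout and the 5-minute freshness window below are assumptions for illustration; the criterion mandates only HMAC plus a timestamp:

```python
import hashlib
import hmac
import time

FRESHNESS_WINDOW_S = 300  # assumed receiver-side tolerance for stale deliveries

def sign_delivery(secret: bytes, body: bytes, timestamp: int) -> str:
    """Sender side: X-Signature = HMAC-SHA-256 over "timestamp.body",
    so the timestamp is covered by the signature and cannot be replayed."""
    return hmac.new(secret, f"{timestamp}.".encode() + body, hashlib.sha256).hexdigest()

def verify_delivery(secret: bytes, body: bytes, timestamp: int, signature: str) -> bool:
    """Receiver side: reject stale timestamps and forged signatures."""
    if abs(time.time() - timestamp) > FRESHNESS_WINDOW_S:
        return False
    return hmac.compare_digest(sign_delivery(secret, body, timestamp), signature)
```

Using `hmac.compare_digest` for the comparison avoids timing side channels; duplicate suppression still relies on the Idempotency-Key the criterion specifies.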
E-Discovery Search Query and Export
Given a compliance user with e-discovery permission accesses audit search
When they query by keyword phrase, approver, decision type, date range, asset_id, workflow_version, and justification_text
Then results return within 2 seconds for datasets up to 50,000 records and within 10 seconds for up to 1,000,000 records
And queries support exact phrase (quoted), boolean AND/OR, and field filters; results include total count and are paginated
And selected results can be exported to CSV/JSON with the same schema as standard audit exports
Retention Policies and Data Residency Enforcement
Given a workspace in region EU sets a 7-year retention policy and a legal hold on request_id=XYZ
When the nightly purge job runs
Then records older than 7 years without legal holds are irreversibly deleted and a signed purge report is added to the audit log
And records under legal hold are retained until the hold is removed
And all audit data, exports, and backups are stored and processed exclusively in the configured region and never leave it
And when retention settings are shortened, the system requires explicit confirmation and enforces a 7-day grace period before deletions execute
Role-Based Access, Delegation, and Overrides
"As a workspace admin, I want role-based permissions and delegation so that only authorized people can approve and we have coverage when someone is out."
Description

Enforce role- and scope-based permissions for creating workflows, approving steps, and overriding decisions. Support temporary delegation for out-of-office coverage, approver alternates, and emergency overrides requiring a reason, multi-factor confirmation, and automatic notifications. Highlight overrides in dashboards and the audit log, and restrict high-risk approvals to designated roles or multi-approver quorum as configured.

Acceptance Criteria
Role- & Scope-Based Permissions Enforcement
Given a user with role "Workflow Creator" scoped to Brand=A, when they create a workflow under Brand=A, then the API returns 201 and the workflow scope is set to Brand=A.
Given the same user attempts to create a workflow under Brand=B, when they submit, then the API returns 403 and no workflow is created.
Given a user without the "Approver" role views a pending approval, when the approval screen loads, then Approve and Override actions are hidden/disabled and any direct API calls are blocked with 403.
Given a user with role "Approver" scoped to Catalog=Shoes, when they approve an item in Catalog=Shoes, then the approval succeeds (200) and is recorded; when they attempt outside their scope, then the action is blocked (403) and logged.
Given any permission-denied attempt occurs, when the system blocks the action, then an audit entry is created with user ID, action, resource ID, scope, and timestamp.
Time-Bound Delegation for Out-of-Office Coverage
Given a delegator configures delegation to a delegatee with start/end timestamps and scope, when the delegatee accepts, then the delegation activates at the start time and deactivates at the end time automatically.
Given the delegation is active, when an approval assigned to the delegator arrives within the delegated scope, then the delegatee receives notifications and can approve or decline it.
Given the delegatee takes action under delegation, when the audit log records the event, then it attributes "delegatee on behalf of delegator" and stores the delegation ID.
Given the delegation has ended or is outside scope, when the delegatee attempts action, then the system returns 403 and no state change occurs.
Given a delegation is created, then it cannot grant access beyond the delegator's own role and scopes; attempts to exceed are rejected (400) at save-time.
Given delegation activation or deactivation occurs, then email/Slack notifications are sent to both delegator and delegatee within 1 minute.
Auto-Routing to Alternates on OOO or SLA Breach
Given a step has a primary approver and alternates [A,B] with SLA=24h, when the primary is marked Out-of-Office or the SLA expires without action, then the approval is reassigned to the first eligible alternate and both primary and alternate are notified.
Given alternates are evaluated, when an alternate lacks required role/scope, then they are skipped and the next is evaluated until an eligible alternate is found or none exist.
Given an alternate approves, then the step completes and further responses from other alternates are ignored; duplicate approvals are prevented at the API (409).
Given any reassignment occurs, then the audit log records original assignee, reason (OOO/SLA), new assignee, and timestamp.
Emergency Override with Reason and MFA
Given a user with the "Override" permission initiates an override, when prompted, then they must supply a reason of at least 10 characters and complete MFA within 60 seconds or the attempt fails (401/422) with no state change.
Given MFA succeeds and the user is within scope, when they confirm the override, then the system advances the workflow, records the action as an override, and tags the record with reason, MFA method, and actor.
Given an override completes, then notifications are sent immediately to workflow owner, security/admin group, and the bypassed approver(s) via email/Slack.
Given an override is attempted by a delegate without explicit "Override" permission, then it is blocked (403) and logged.
Override Visibility in Dashboards and Audit Log
Given an approval was completed via override, when dashboards load, then the item displays a visible "Override" badge/icon and can be filtered using a "Show Overrides Only" control.
Given the audit log is queried for that item, then the entry includes override flag, actor, original approver(s), reason text, MFA status/method, step ID, and timestamps.
Given exports are generated (CSV/JSON), then override-specific fields are included.
Given a normal (non-override) approval occurs, then no override badge appears and no override fields are populated.
High-Risk Approval Quorum and Role Restrictions
Given a High-Risk step is configured with allowed roles {Owner, Senior Approver} and quorum=2, when approvals are collected, then the step completes only after approvals from two distinct eligible users are recorded.
Given an ineligible user attempts to approve a High-Risk step, then the attempt is blocked (403) and logged with reason "role/scope not permitted".
Given fewer than quorum approvals are recorded, when the SLA expires, then escalation rules trigger and the step remains incomplete.
Given overrides are allowed for High-Risk steps with configured override quorum=2, when only one override actor confirms, then the override does not complete; when two distinct override actors confirm within the allowed window, then the override completes and is logged as High-Risk Override.

Release Scheduler

Time-lock when new preset versions go live. Pin versions to specific batches, freeze changes during drops, and roll back instantly if needed—keeping imagery consistent mid-campaign and avoiding surprises during high-velocity launches.

Requirements

Versioned Preset Store
"As a brand manager, I want to create immutable versions of style presets so that I can release changes without affecting in-flight campaigns."
Description

Introduce immutable, versioned style presets with semantic versioning and metadata (creator, createdAt, changelog). Past versions are read-only; new versions are created via clone-and-edit to preserve auditability and reproducibility. Ensure all processing services can resolve a preset by stable version identifier and that rendering behavior is deterministic across versions. Include validation for backward compatibility, diff view between versions, and referential integrity so batches and jobs always resolve to a valid version. Integrate with PixelLift’s preset editor, batch processor, and permissions model.

Acceptance Criteria
Publish Immutable Preset Version (SemVer + Metadata)
Given a user with "Preset Editor" permission has created a new preset draft
When they set the version to "1.0.0", provide a non-empty changelog, and click Publish
Then the system validates semantic versioning (MAJOR.MINOR.PATCH), stamps createdAt (UTC), and records creator metadata
And the version state becomes read-only; any update attempt to preset fields returns 409 "Preset version is immutable"
And using "Clone" from 1.0.0 creates a new draft prepopulated with identical settings and cleared version field
And the system suggests the next version (1.0.1 by default) and requires a new changelog on publish
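The SemVer validation and default next-version suggestion behave roughly as follows. A sketch that accepts only plain MAJOR.MINOR.PATCH (pre-release and build suffixes, which the criteria don't mention, are rejected):

```python
import re

SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")  # plain MAJOR.MINOR.PATCH only

def parse_semver(version: str) -> tuple[int, int, int]:
    """Validate and split a semantic version string, as the Publish flow requires."""
    match = SEMVER.match(version)
    if match is None:
        raise ValueError(f"invalid semantic version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch

def suggest_next(version: str) -> str:
    """Default suggestion after cloning: bump the patch component."""
    major, minor, patch = parse_semver(version)
    return f"{major}.{minor}.{patch + 1}"
```

Comparing the parsed tuples also gives version ordering for free (`(2, 0, 0) > (1, 9, 9)`), which is what the compatibility check needs to distinguish major from patch/minor bumps.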
Stable Version Resolution and Deterministic Rendering
Given a preset reference in the form presetId@1.2.0 is provided to the editor, batch processor, and renderer services
When each service resolves the reference
Then all services retrieve an identical configuration payload (byte hash equality)
And processing the same source image twice with presetId@1.2.0 yields byte-identical outputs (checksum match) across runs and environments
And resolving a preset without a version (presetId) returns the current default version id
And requesting a non-existent version returns 404 with a machine-readable error code PRESET_VERSION_NOT_FOUND
Backward Compatibility Validation on Publish
Given a draft cloned from presetId@1.2.0 is prepared for release as 1.2.1 (patch)
When only non-breaking fields are modified (e.g., numeric parameter adjustments within allowed ranges)
Then Publish succeeds and the changelog is stored
Given a draft includes breaking changes (e.g., removal of a field or algorithm type change)
When the target version is a patch or minor bump
Then Publish is blocked with error code INCOMPATIBLE_VERSION_BUMP and a list of detected breaking changes
And setting the target version to 2.0.0 (major) for the same changes allows Publish to proceed
Version Diff View with Previews and Changelog
Given a user selects two versions of the same preset (e.g., 1.0.0 vs 1.1.0)
When the diff view is opened
Then the UI displays field-by-field differences categorized as added/removed/changed with counts
And a side-by-side preview renders on three sample images using each version within 5 seconds total
And the changelog for the newer version is displayed alongside the diff
And the user can export the diff as JSON via a Download action, producing a file that includes before/after values
Referential Integrity for Batches and Jobs
Given existing batches and jobs reference presetId@1.0.0
When version 1.0.0 is deprecated or a newer default is published
Then all existing references continue to resolve successfully without mutation
And attempts to delete version 1.0.0 are blocked with 409 "Version in use" while references exist
And integrity checks report zero dangling references after migrations or cleanup operations
And if a referenced version is forcibly removed in a test environment, dependent jobs fail fast with 404 and include remediation guidance in the error payload
Permissions and Audit Logging for Versioned Presets
Given roles Admin, Preset Editor, and Viewer exist
Then Admin and Preset Editor can create, clone, and publish versions; only Admin can deprecate or delete versions; Viewer has read-only access
And unauthorized actions return 403 with error code INSUFFICIENT_PERMISSIONS and are recorded
And every create/clone/publish/deprecate/delete action writes an immutable audit record including actor id, action, version id, timestamp (UTC), and a diff hash
And audit records are queryable by version id and actor
Release Scheduling, Pinning, Freeze, and Rollback
Given version 1.1.0 is scheduled to go live at 2025-10-01T10:00:00Z
When the current time is before the go-live
Then the default version remains on the previous version
When the go-live time is reached
Then the default version for new batches flips to 1.1.0 automatically and a VERSION_WENT_LIVE event is emitted
And batches pinned to 1.0.0 continue to use 1.0.0 regardless of default changes
And a freeze window on an active drop blocks changes to pinned versions and schedules; attempts return 423 Locked
And a one-click rollback sets the default back to the prior version within 60 seconds and emits a VERSION_ROLLED_BACK event without altering existing pinned batches
Batch Version Pinning
"As a seller uploading a new catalog batch, I want to pin the exact preset version used so that reprocessing later yields the same look."
Description

Enable explicit pinning of a specific preset version to each batch at creation and during reprocessing. Persist the mapping batchId → presetVersion so all renders, retries, and regenerations use the pinned version for consistent visual output. Provide UI and API options to select a version, prevent accidental drift, and display pinned status in batch details. Handle edge cases such as archived versions, deleted presets, and cross-workspace moves by enforcing safe fallbacks and warnings.

Acceptance Criteria
Pin Preset Version During Batch Creation (UI)
Given I am on Create Batch UI with a selected preset that has multiple versions
And I select a specific preset version from the version picker
When I create the batch
Then the batch is saved with pinned=true and presetVersionId set to the selected version
And all initial renders use that presetVersionId
And changing the preset’s default version after creation does not alter the batch’s pinned version
And Batch Details displays the pinned version label, ID, and a "Pinned" badge
Pin Preset Version via API on Batch Create/Update
Given a POST /batches request includes presetId and a valid presetVersionId belonging to that preset
When the request is submitted with valid authentication
Then the response is 201 and the body includes pinned=true and the same presetVersionId
And a subsequent GET /batches/{id} returns the same pinned fields
Given a POST /batches includes presetId but omits presetVersionId
Then the system pins to the latest active version at creation time and returns its presetVersionId
Given the provided presetVersionId does not belong to the preset
Then the response is 422 with error code PinnedVersionInvalid
Given a PATCH /batches/{id} with presetVersionId to re-pin and the user has Editor role and the batch is not currently processing
Then the response is 200 and future renders use the new presetVersionId
Given a PATCH occurs while the batch is processing
Then the response is 409 Conflict with error code BatchBusy
Use Pinned Version for All Renders, Retries, and Regenerations
Given a batch is pinned to preset version V1
And the preset’s default version is later updated to V2
When I retry failed items, regenerate outputs, or reprocess the batch
Then all processing jobs use V1
And each render job’s metadata includes presetVersionId=V1
And API GET /renders for the batch returns presetVersionId=V1 for the associated renders
Behavior with Archived or Deleted Preset Versions
Given a batch is pinned to a preset version that becomes Archived
When I view Batch Details
Then the pinned version shows an "Archived" badge with a non-blocking warning
And reprocessing is allowed using the archived version
Given a batch is pinned to a preset version that is Deleted
When I attempt to reprocess or render
Then processing is blocked and the UI shows a blocking banner requiring re-pin
And API operations return 410 Gone with error code PinnedVersionDeleted
And the UI offers a one-click re-pin to the latest active version with explicit confirmation
And upon confirmation, the batch is re-pinned to the selected version and an audit entry is created
Display and Communicate Pinned Status in Batch Details and Lists
Given a batch with a pinned version
When I open Batch Details
Then I see pinned=true, preset name, version label, version ID, release date, and a link to release notes
And the batch list shows a "Pinned" indicator and supports filtering by pinned status
And API GET /batches includes pinned, presetVersionId, pinnedAt, and pinnedBy fields
Cross-Workspace Batch Move with Pinned Versions
Given I initiate moving a batch to another workspace
And the target workspace contains the same preset and preset version
Then the move completes and the batch retains the same presetVersionId
Given the target workspace lacks the preset or the pinned version
Then the move is blocked and a mapping dialog requires selecting an available preset version in the target workspace
And the move cannot complete until a valid version is selected or the move is canceled
And upon completion with a selection, the batch is re-pinned to the chosen version and an audit entry is recorded
Controlled Re-Pinning and Audit Trail
Given I have Editor or higher permissions
When I change the batch’s pinned version in the UI
Then a confirmation modal summarizes oldVersionId -> newVersionId and requires explicit confirmation
And upon confirmation, the pinned version updates, future renders use the new version, and an audit log entry is recorded with actor, timestamp, oldVersionId, and newVersionId
Given I lack sufficient permissions
Then the re-pin control is disabled in the UI and API PATCH /batches/{id} returns 403 Forbidden with error code InsufficientPermissions
Timed Release & Timezone Control
"As a marketing lead, I want to schedule when a new preset version becomes active in my timezone so that launches switch over consistently at the planned moment."
Description

Allow scheduling of a new preset version to become active at a specific date/time with explicit timezone selection and DST awareness. The scheduler performs an atomic pointer switch for the preset’s active version and guarantees idempotency and retry on transient failures. Provide conflict detection (e.g., overlapping schedules, frozen windows), preflight validation, and a countdown/status view. Ensure safe handling of in-progress jobs (finish with old version) and new jobs (start with new version) with clear cutover semantics. Integrate with the job orchestrator and system clock service for accuracy and reliability.

Acceptance Criteria
DST-Aware Timezone Scheduling
Given I select timezone "America/Los_Angeles" and choose 2025-03-09 02:30 local time When I attempt to save the schedule Then the system rejects the save with error code TIME_INVALID_DST and suggests the nearest valid local times (e.g., 03:00), and the UTC preview for 03:00 local displays 2025-03-09T10:00:00Z Given I select timezone "America/Los_Angeles" and choose 2025-11-02 01:30 local time When I attempt to save the schedule without disambiguating offset Then the system requires me to choose either "UTC-07 (before fallback)" or "UTC-08 (after fallback)", and the saved schedule stores the resolved absolute UTC instant accordingly Given a schedule is saved with a selected timezone When I view the schedule details Then I see both the local wall time with offset and the canonical UTC time, and the cutover executes at the stored UTC instant per the system clock service
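One way to implement the nonexistent/ambiguous wall-time checks above is with Python's standard `zoneinfo` and PEP 495 fold semantics. This is a minimal sketch; the function name `classify_local_time` is illustrative, not part of any PixelLift API:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def classify_local_time(naive: datetime, tz_name: str) -> str:
    """Classify a naive local wall time as 'valid', 'nonexistent' (inside a
    spring-forward gap), or 'ambiguous' (repeated during fall-back)."""
    tz = ZoneInfo(tz_name)
    earlier = naive.replace(tzinfo=tz, fold=0)
    later = naive.replace(tzinfo=tz, fold=1)
    # Round-trip through UTC: a wall time in the spring-forward gap shifts.
    roundtrip = earlier.astimezone(timezone.utc).astimezone(tz)
    if roundtrip.replace(tzinfo=None) != naive:
        return "nonexistent"
    # If the two folds resolve to different UTC offsets, the time occurs twice.
    if earlier.utcoffset() != later.utcoffset():
        return "ambiguous"
    return "valid"

# 2025-03-09 02:30 does not exist in America/Los_Angeles (02:00 jumps to 03:00)
print(classify_local_time(datetime(2025, 3, 9, 2, 30), "America/Los_Angeles"))   # nonexistent
# 2025-11-02 01:30 occurs twice (02:00 falls back to 01:00)
print(classify_local_time(datetime(2025, 11, 2, 1, 30), "America/Los_Angeles"))  # ambiguous
```

A "nonexistent" result would map to the TIME_INVALID_DST rejection, while "ambiguous" would trigger the offset-disambiguation prompt.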
Atomic Cutover and Job Routing
Given preset P has active version v1 and a scheduled switch to v2 at time T (UTC) And job J1 began processing at time t1 < T And job J2 is submitted at time t2 >= T When time reaches T Then the active pointer switches from v1 to v2 atomically within 1 second And J1 completes using v1 exclusively And J2 starts using v2 exclusively And no job processes a mix of v1 and v2 within a single job And an audit event preset_version_activated for v2 is recorded exactly once with event_time >= T
Idempotent Switch with Retry on Transient Failure
Given a transient network error occurs after issuing the cutover command for preset P to version v2 at time T When the scheduler retries the cutover using the same idempotency key Then the final active version is v2, only one audit/event record exists for the cutover, and duplicate notifications are not sent Given duplicate cutover events are received within 60 seconds for the same preset and target version When the second event is processed Then it results in a no-op with a 200 response indicating idempotency, and no additional side effects occur Given the scheduler process crashes mid-cutover When a standby instance resumes and retries within 60 seconds Then the cutover completes successfully without manual intervention and without leaving stale locks
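The idempotent-retry behavior above can be sketched with an in-memory stand-in for the scheduler's state store; the class and field names are illustrative assumptions, not the actual implementation:

```python
import threading

class PresetPointer:
    """Minimal sketch of an idempotent active-version cutover: a cutover is
    committed at most once per idempotency key, and replays (retries after
    transient failures, duplicate events) are no-ops returning the recorded
    result with no duplicate audit events."""

    def __init__(self, active_version: str):
        self._lock = threading.Lock()
        self.active_version = active_version
        self._applied: dict[str, str] = {}  # idempotency key -> resulting version
        self.audit_events: list[tuple[str, str]] = []

    def cutover(self, target_version: str, idempotency_key: str) -> str:
        with self._lock:
            if idempotency_key in self._applied:
                # Replayed delivery: no pointer change, no extra side effects.
                return self._applied[idempotency_key]
            self.active_version = target_version  # atomic under the lock
            self._applied[idempotency_key] = target_version
            self.audit_events.append((idempotency_key, target_version))
            return target_version

p = PresetPointer("v1")
p.cutover("v2", "cut-001")
p.cutover("v2", "cut-001")  # retry after a transient failure: no-op
print(p.active_version, len(p.audit_events))  # v2 1
```

A production scheduler would persist the key-to-result map durably so a standby instance can resume a crashed cutover, as the criterion requires.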
Conflict Detection: Overlaps and Freeze Windows
Given a preset has a freeze window from Fstart to Fend When a user schedules an activation at time T where Fstart <= T <= Fend Then the system blocks the action with error code SCHEDULE_BLOCKED_FROZEN and displays the freeze window details Given a preset already has a pending scheduled activation at T1 When a user attempts to create a second scheduled activation for the same preset Then the system blocks the action with error code SCHEDULE_CONFLICT and shows the existing schedule details; no new schedule is created Given preset P has version vX pinned to Batch B When a new activation of version vY is scheduled Then the preflight informs "Batch B will continue using its pinned version" and the scheduler excludes Batch B from cutover without blocking the schedule
Preflight Validation and Confirmation Gate
Given I click Schedule for preset P to activate version v2 at T with timezone Z When preflight runs Then it validates: v2 exists and is published (not archived), user has Schedule permission for P, job orchestrator is reachable, and system clock service health indicates skew < 100 ms; any failed check is shown with specific error codes and the schedule is not created Given all preflight checks pass When I proceed to confirmation Then I am shown a summary including preset, current version, new version, activation time in local (with offset) and UTC, cutover semantics, and affected scope; I must confirm explicitly before the schedule is created Given the schedule is created When I inspect the record via API Then it includes fields: preset_id, target_version_id, activation_utc, timezone, local_time_display, created_by, idempotency_key, and status=Scheduled
Countdown and Status Visibility
Given a future activation is scheduled for preset P at time T (UTC) When I view the schedule detail page Then I see a live countdown to T updating at least once per second, showing both local time (with offset) and UTC; if detected client clock skew > 5 seconds relative to server time, a warning is displayed Given the activation is approaching When time reaches T Then status transitions are reflected as: Scheduled -> Cutting Over (for up to 60 seconds) -> Live upon success, or Failed if not completed by T+60s; the same states are returned by the status API Given I cancel the schedule before T When I confirm cancellation Then the countdown stops, status becomes Canceled, and no cutover occurs at T
Change Freeze Windows
"As a campaign owner, I want to freeze preset changes during a drop window so that no accidental edits alter live imagery."
Description

Introduce configurable freeze windows to block preset edits and releases during critical campaign periods. Support per-workspace and per-preset scopes, recurring and one-off windows, and admin override with justification. Provide UI indicators, API enforcement, and pre-schedule validation that prevents creating releases inside frozen periods. Log all override attempts and enforce granular permissions to reduce risk of accidental changes mid-drop.
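A minimal sketch of wall-clock-based enforcement for a recurring window, assuming windows are stored as a weekday plus local start/end times in an IANA timezone (the dataclass shape and function name are illustrative):

```python
from dataclasses import dataclass
from datetime import datetime, time, timezone
from zoneinfo import ZoneInfo

@dataclass
class WeeklyFreeze:
    """Recurring freeze window defined in local wall-clock time."""
    weekday: int   # 0 = Monday ... 6 = Sunday
    start: time
    end: time
    tz: str        # IANA timezone name

def is_frozen(instant_utc: datetime, windows: list[WeeklyFreeze]) -> bool:
    """Check a UTC instant by converting it into each window's local wall
    clock, so enforcement automatically tracks DST shifts."""
    for w in windows:
        local = instant_utc.astimezone(ZoneInfo(w.tz))
        if local.weekday() == w.weekday and w.start <= local.time() < w.end:
            return True
    return False

# Friday 08:00-20:00 freeze in America/Los_Angeles
freeze = [WeeklyFreeze(4, time(8, 0), time(20, 0), "America/Los_Angeles")]
# 2025-06-06 is a Friday; 17:00 UTC = 10:00 PDT -> frozen
print(is_frozen(datetime(2025, 6, 6, 17, 0, tzinfo=timezone.utc), freeze))   # True
# 2025-12-05 is a Friday; 17:00 UTC = 09:00 PST -> still frozen after DST shift
print(is_frozen(datetime(2025, 12, 5, 17, 0, tzinfo=timezone.utc), freeze))  # True
```

Computing the blocked intervals as local wall-clock comparisons, rather than pre-expanding them to UTC, is what keeps the Friday 08:00-20:00 window correct across DST transitions.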

Acceptance Criteria
Block Edits and Releases During Active Freeze
Given an active freeze window applies to the target preset or workspace When a user with Editor role attempts to save preset parameter changes via UI Then the Save action is prevented, controls remain disabled, and the UI displays a reason indicating the freeze end timestamp in the workspace timezone And no changes are persisted to the preset version Given an active freeze window applies to the target preset or workspace When a client calls the API to publish a new preset version or attach a preset version to a batch Then the request fails with HTTP 403 and error code FREEZE_WINDOW_BLOCKED including windowId and endsAt fields And zero publish jobs are enqueued and no side effects occur Given no active freeze window applies When the same edit or release actions are performed Then the actions succeed with standard response codes and persisted changes
Pre-Schedule Validation Blocks Releases in Frozen Periods
Given a freeze window covers 2025-11-25 09:00–12:00 in the workspace timezone When a user selects 2025-11-25 10:00 for a preset release in the Release Scheduler and clicks Save Then the client shows an inline validation error stating the selection is within a freeze window and prevents saving And the UI suggests the nearest available datetime after the freeze ends Given the same request is attempted via API When POST /v1/releases is called with a scheduledAt inside a freeze window Then the API responds HTTP 422 with error code FREEZE_WINDOW_VIOLATION and includes fields nearestAvailableAt and conflictingWindowId And the release is not created
Scope: Workspace vs Preset Freeze Precedence
Given a workspace-level freeze window is active and a preset-level freeze window is not configured for Preset A When a user attempts to edit or release Preset A Then the action is blocked due to the workspace freeze Given a preset-level freeze window is active for Preset B while the workspace has no active freeze When a user attempts to edit or release Preset B Then the action is blocked due to the preset-level freeze only Given both workspace and preset-level freeze windows overlap for Preset C When a user views allowed times in the scheduler for Preset C Then the blocked intervals reflect the union of both freezes And read-only operations (e.g., viewing preset details) remain available
Recurring and One-Off Freeze Windows with Timezone & DST Handling
Given a recurring weekly freeze is configured for Fridays 08:00–20:00 in America/Los_Angeles When the calendar transitions across DST changes Then enforcement occurs at 08:00–20:00 local wall-clock time each Friday regardless of DST shift Given a one-off freeze is configured on 2025-12-01 06:00–14:00 local time When both the recurring Friday freeze and the one-off window overlap a date Then the system enforces the union of blocked intervals for that date And UI calendars shade the full union as unavailable Given overlapping freeze windows are configured via API When GET /v1/freeze-windows is called Then the API returns all configured windows with timezone, recurrence, and effective intervals, and clearly identifies overlaps via overlappingWindowIds
Admin Override Requires Justification and Scoped Duration
Given an active freeze window applies When a Workspace Admin initiates an override Then they must provide a justification of at least 20 characters and select a scope of either Single Operation or Time-bound (max 30 minutes) And the override cannot be activated without meeting both requirements Given a valid override is active and scoped to Single Operation When the admin performs one blocked action (e.g., publishing a preset version) Then the action succeeds and the override automatically expires immediately after Given a non-admin attempts to activate an override When they submit the override form or call the API endpoint Then the request is rejected with HTTP 403 and error code INSUFFICIENT_ROLE Given any override is activated When the action completes Then an audit record is created including actorId, role, targetId, action, windowId, justification, scope, startedAt, endedAt, and outcome
UI Indicators and Disabled Controls During Freeze
Given an active freeze window applies to a preset or workspace When the user opens the Preset List, Preset Editor, or Release Scheduler Then a visible Frozen badge appears for affected presets and a banner displays next available edit time And edit/publish controls are disabled with a tooltip explaining the freeze and end time And the indicators are accessible (ARIA labels provided, tooltip content reachable by keyboard) Given no active freeze window applies When the same pages are opened Then no Frozen indicators are shown and all controls are enabled
Granular Permissions and Change Logging for Freeze Windows
Given role-based access control is configured When a Workspace Owner or Admin attempts to create, update, or delete a freeze window via UI or API Then the action succeeds with HTTP 200/201 and the window is persisted Given an Editor or Viewer attempts the same When they submit the request Then the action is rejected with HTTP 403 and error code INSUFFICIENT_ROLE Given any freeze window is created, updated, or deleted When the operation completes Then an immutable audit log entry is recorded with actorId, role, operation, windowId, scope (workspace/preset), recurrence, timezone, previousValues, newValues, timestamp, and ipAddress And GET /v1/audit-logs?eventType=freeze-window returns the entry within 5 seconds of the operation
One-click Rollback & Restore
"As an operations manager, I want to roll back to a prior preset version with one click so that I can quickly recover from unexpected visual issues."
Description

Provide an atomic rollback action that instantly reassigns the active version pointer to the prior stable version, with optional selection of a specific historical version. Ensure rollbacks are idempotent, logged, permission-gated, and include safety checks (e.g., cannot roll back to deleted or incompatible versions). Offer UI confirmation with impact summary and an API endpoint for automated recovery. Handle job routing so new jobs use the restored version while in-flight jobs complete with the previously active version.

Acceptance Criteria
Atomic One-Click Rollback Switch
Given a preset P with active version V3 and prior stable version V2, and at least one in-flight job using V3 When an authorized user triggers the one-click rollback to the prior stable version Then the active version pointer for P updates atomically to V2 within 2 seconds and is immediately reflected in both UI and API reads And no intermediate state is observable (reads return either V3 before commit or V2 after commit) And in-flight jobs continue and complete on V3 without restart or version reassignment And all new jobs submitted after the commit route to V2 100% of the time And repeating the same rollback within 5 minutes results in a no-op with 200 OK and no additional side effects (idempotent)
Historical Version Selection and Safety Checks
Given a preset P with versions V0 through V4, where V1 and V2 are compatible, V0 is deleted, and V4 is incompatible When a user selects "Rollback to…" and chooses V1 Then the system validates the target exists, is not deleted/archived, is enabled, and is schema-compatible with P And on pass, the active pointer updates to V1; on fail, the rollback is blocked without side effects And attempting rollback to a deleted version returns 400 with code VERSION_NOT_FOUND and a human-readable message And attempting rollback to an incompatible version returns 409 with code VERSION_INCOMPATIBLE and includes the incompatibility reason And if organizational policy blocks rollback, the call returns 423 with code ROLLBACK_BLOCKED and a remediation hint
Permission-Gated Rollback with Audit Logging
Given a user without the "Preset:Rollback" permission attempts a rollback via UI or API When the action is executed Then the system denies with 403 FORBIDDEN and code FORBIDDEN_OPERATION, without changing the active version And for an authorized user, a rollback (attempted or successful) writes an immutable audit log entry containing: presetId, previousVersionId, newVersionId (if any), actorId, actorRole, source (UI/API), timestamp, requestId, idempotencyKey (if provided), rationale (if provided), outcome (SUCCESS/FAILURE), and affectedScopes (e.g., pinned batches count) And audit entries are queryable by admins by presetId/date range and exportable as CSV/JSON
UI Confirmation Modal with Impact Summary
Given an authorized user initiates rollback from the preset detail page When the confirmation modal opens Then it displays: current active version, proposed target version, count of in-flight jobs on current version, count of queued jobs impacted, note that in-flight jobs will not be switched, note that pinned batches remain on their pinned versions, and any active freeze windows affecting scope And the Confirm button remains disabled until pre-checks load (<= 2 seconds) and the user explicitly confirms (e.g., checkbox or typing "ROLLBACK") And if pre-checks fail (e.g., target incompatible or policy freeze without override), the modal shows an actionable error and prevents submission And on confirm, the modal closes only after the rollback commits and shows a success toast summarizing the change
API Rollback Endpoint with Idempotency and Concurrency Control
Given a client calls POST /presets/{presetId}/rollback with an optional targetVersionId and Idempotency-Key header When the request is valid and authorized Then the server performs the rollback and returns 200 with body { presetId, previousVersionId, newVersionId, committedAt, routerEpoch } And duplicate requests with the same Idempotency-Key within 24 hours return the original result without re-executing side effects And concurrent rollback attempts on the same preset result in exactly one success; losers receive 409 with code CONFLICT_ACTIVE_CHANGE And p95 latency is <= 800 ms under nominal load; the version pointer is globally consistent within 2 seconds of commit
Job Routing Consistency with Pins and Freeze Windows
Given some batches are pinned to version V3 and a global drop freeze window may be active When a rollback to V2 is executed Then the job router sends all new unpinned jobs to V2 within 2 seconds of commit And pinned batches continue on their pinned versions unaffected by the rollback And scheduled future releases retain their configured go-live times and target versions and are not altered by the rollback And if a freeze window blocks version changes, rollback is prevented unless the actor has "Preset:OverrideFreeze"; on override, a rationale is required and recorded in audit logs And no in-flight jobs are re-routed; queue health metrics (enqueue/dequeue rates, failure rate) remain within baseline ±5% over 10 minutes post-rollback
Audit Log & Alerts
"As a team admin, I want a clear audit trail and alerts for releases and rollbacks so that I can monitor changes and respond to failures."
Description

Record all release-related events—version creation, scheduling, activation, freeze/override, pin changes, and rollback—with actor, timestamp, and affected entities. Provide searchable UI, export, and retention policies. Emit real-time notifications to email and Slack on successes, failures, and upcoming cutovers, with configurable recipients per workspace. Surface failure reasons and remediation tips inline to speed incident response.

Acceptance Criteria
Event Logging Coverage & Fidelity
Given a workspace with Release Scheduler enabled When any of the following events occur: preset version created; schedule created or updated; activation started or completed; freeze window applied or removed; override applied or removed; batch pin added, changed, or removed; rollback initiated or completed Then an audit log entry is persisted for each event with fields: event_type, actor_id, actor_name, actor_type (user|service), occurred_at (UTC ISO-8601), workspace_id, preset_id, from_version_id (nullable), to_version_id (nullable), batch_ids (array), schedule_id (nullable), reason (nullable), outcome (success|failure), correlation_id, entry_id And the log entry is immutable; any correction results in a new entry that references the prior entry via prior_entry_id And the entry becomes queryable within 5 seconds of the event time
Audit Log Search & Filter
Given an audit log with at least 5,000 events in the last 30 days When a user opens the Audit Log UI and applies any combination of filters: date range, event_type, actor (name or ID), preset_id, version_id, batch_id, outcome, text search (reason) Then results include all and only records matching the filters within the current workspace And the first page of 50 results returns in ≤ 2 seconds for up to 10,000 matching records And results are sortable by occurred_at (asc|desc), actor_name, event_type And the UI displays total count and supports pagination (configurable page size: 25/50/100) And clearing filters resets the view to the default last 7 days
Audit Log Export & Retention Enforcement
Given the workspace retention policy is set to 180 days When a user requests an export (CSV or JSON) for a selected date range Then the system validates the range is within retention and the estimated size ≤ 500,000 rows; otherwise prompts to narrow the range And upon confirmation an export job is created and recorded in the audit log with correlation_id And for exports ≤ 500,000 rows a downloadable file is produced within 10 minutes, with a 24-hour expiring link And the exported file includes header fields and all logged fields in UTC And records older than 180 days are neither searchable nor exportable And when an admin updates retention to a new value (e.g., 365 days), subsequent searches and exports honor the new policy and the change is logged (old→new)
Real-time Email & Slack Notifications
Given notification recipients are configured for the workspace When a scheduled activation is 15 minutes away Then an upcoming cutover notification is delivered to subscribed recipients with: preset name, from_version_id, to_version_id, schedule_time (UTC), affected batches count/list (≤ 20 listed, rest summarized), and a deep link to details When an activation or rollback completes with success or failure Then a notification is delivered within 60 seconds including: event_type, outcome, error_code and message if failure, correlation_id, and deep link And duplicate notifications with the same correlation_id are not sent more than once And if Slack delivery fails (non-2xx), the failure is logged and email is attempted as a fallback And per-event-type subscriptions (upcoming_cutover, activation_success, activation_failure, rollback_success, rollback_failure) are respected
Recipient Configuration & Permissions
Given a user with Workspace Admin role opens Notifications settings When they add or remove email recipients and Slack webhooks Then emails must pass format validation and Slack webhooks must pass a connectivity test (HTTP 2xx) before saving And recipients can be subscribed per event type (upcoming_cutover, activation_success, activation_failure, rollback_success, rollback_failure) And saving changes records a configuration-changed audit entry with before/after (secrets redacted) And users without Admin role can view but cannot modify recipients (controls disabled and server rejects writes)
Inline Failure Reasons & Remediation
Given a scheduled activation or rollback fails When a user opens the incident details via the audit log or notification deep link Then the UI displays error_code, error_message, failed_step, correlation_id, affected entities (workspace, preset_id, version_id, batch_ids), and timestamp And at least one remediation tip is displayed with a link to relevant docs or settings (e.g., re-auth Slack, adjust permissions, resolve conflicting freeze) And a context-appropriate action is available (Retry, Rollback, or Dismiss) based on event type and state And details and tips render within 10 seconds of page load And acknowledging the incident records an audit entry with actor and timestamp
Release Webhooks & API Events
"As a developer integrating PixelLift, I want webhook events and APIs for releases so that our storefront and workflows can react to changes in real time."
Description

Expose secure APIs to manage schedules, pins, and freeze windows, and publish signed webhooks for key lifecycle events (scheduled, activated, skipped, failed, rolled_back). Include HMAC signature verification, retries with backoff, idempotency keys, and per-workspace rate limits. Provide event payloads that include preset identifiers, version, timestamps, and affected batches so external systems (CMS, storefront, CI) can react in real time to visual changes.

Acceptance Criteria
Webhook HMAC Signature & Replay Protection
Given a workspace has a webhook endpoint and secret S When PixelLift sends any release lifecycle event webhook (scheduled, activated, skipped, failed, rolled_back) Then the request includes headers: X-PixelLift-Timestamp (epoch ms), X-PixelLift-Signature (HMAC-SHA256 over "{timestamp}.{raw_body}" using S), X-PixelLift-Event-ID (UUID), X-PixelLift-Delivery-ID (UUID), X-PixelLift-Delivery-Attempt (integer >= 1) Given the consumer recomputes the signature with S over the exact raw body and X-PixelLift-Timestamp within a 5-minute tolerance When compared to X-PixelLift-Signature Then the signatures match Given the same delivery is retried When sent again Then X-PixelLift-Delivery-ID remains the same and X-PixelLift-Delivery-Attempt increments by 1 Given the webhook body or timestamp is altered in transit When the consumer recomputes the HMAC Then the signature does not validate
Event Delivery, Exponential Backoff, and At-Least-Once
Given a subscriber responds with 2xx When a release lifecycle event is generated Then 99% of deliveries complete within 10 seconds of event creation and 100% complete within 60 seconds Given a subscriber responds with 5xx or times out When delivery fails Then PixelLift retries with exponential backoff and full jitter for up to 6 total attempts (initial + 5 retries), each delay capped at 60 seconds, with a total retry window <= 5 minutes Given the final attempt also fails When retries are exhausted Then the delivery is marked failed; no further retries occur; a delivery record is accessible via API including attempt history and last response/status Given multiple events exist for the same preset_id When delivered Then deliveries maintain per-preset causal order (earlier events are attempted before later ones) Given a transient failure When a retry receives a 2xx Then the delivery is marked delivered and no additional retries occur
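The retry schedule above can be sketched with the full-jitter formula; the 1-second base delay is an assumption, since the criterion fixes only the attempt count, the per-delay cap, and the total retry window:

```python
import random

MAX_ATTEMPTS = 6   # initial try + 5 retries, per the criterion
BASE_DELAY = 1.0   # seconds (assumed base; not specified in the criterion)
CAP = 60.0         # each individual delay capped at 60 seconds

def retry_delay(attempt: int, rng: random.Random) -> float:
    """Full-jitter exponential backoff: sleep a uniform amount between 0 and
    min(cap, base * 2**attempt)."""
    ceiling = min(CAP, BASE_DELAY * (2 ** attempt))
    return rng.uniform(0.0, ceiling)

rng = random.Random(42)  # seeded only to make this sketch reproducible
delays = [retry_delay(a, rng) for a in range(MAX_ATTEMPTS - 1)]  # the 5 retries
assert all(0.0 <= d <= CAP for d in delays)
assert sum(delays) <= 300.0  # stays inside the 5-minute total retry window
print([round(d, 2) for d in delays])
```

With full jitter the delay ceilings for the five retries are 1, 2, 4, 8, and 16 seconds, so the schedule always fits the 5-minute window while still spreading concurrent retries apart.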
Idempotency for Management APIs
Given a client sends POST to create or mutate schedules, pins, or freeze windows with an Idempotency-Key header and identical request body When the same request is retried within 24 hours Then the API returns 200/201 with the original response body, echoes Idempotency-Key, and no duplicate side effects or duplicate events are created Given two concurrent POST requests with the same Idempotency-Key When processed Then exactly one operation is committed; the other returns the same result Given a non-matching body is sent with a reused Idempotency-Key When processed Then the API responds 409 Conflict indicating key reuse mismatch and performs no side effects Given GET and DELETE endpoints When supplied an Idempotency-Key Then GET ignores the header; DELETE is safe to retry (idempotent) and does not create duplicate events
Per-Workspace Rate Limiting for Management APIs
Given a workspace W has a configured limit of 600 requests per 10 minutes for management APIs When W sends requests under the limit Then each response includes X-RateLimit-Limit: 600, X-RateLimit-Remaining with the correct remaining count, and X-RateLimit-Reset with the reset epoch seconds Given W exceeds the limit within the window When an additional request is received Then the API responds 429 Too Many Requests with Retry-After set to remaining seconds until reset and performs no side effects Given another workspace W2 with a different limit When W2 sends requests Then W2’s quota is enforced independently of W’s Given outbound webhook deliveries occur When measuring API usage Then webhook delivery traffic does not consume management API rate limit quota
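A fixed-window sketch of the per-workspace quota and response headers described above; the class name and the tiny demo limit are illustrative, while the header names match the criterion:

```python
import math

class FixedWindowLimiter:
    """Per-workspace fixed-window rate limiter emitting X-RateLimit-* headers
    (600 requests per 600-second window by default, per the criterion)."""

    def __init__(self, limit: int = 600, window_seconds: int = 600):
        self.limit = limit
        self.window = window_seconds
        self._counts: dict[tuple[str, int], int] = {}  # (workspace, window idx)

    def check(self, workspace_id: str, now: float) -> tuple[int, dict[str, str]]:
        idx = int(now // self.window)
        key = (workspace_id, idx)
        count = self._counts.get(key, 0) + 1
        self._counts[key] = count
        reset_epoch = (idx + 1) * self.window
        if count > self.limit:
            # Over quota: the request itself must perform no side effects.
            return 429, {"Retry-After": str(math.ceil(reset_epoch - now))}
        return 200, {
            "X-RateLimit-Limit": str(self.limit),
            "X-RateLimit-Remaining": str(self.limit - count),
            "X-RateLimit-Reset": str(reset_epoch),
        }

limiter = FixedWindowLimiter(limit=2, window_seconds=600)  # tiny limit for the demo
print(limiter.check("ws-A", now=1000.0))  # 200, remaining 1
print(limiter.check("ws-A", now=1001.0))  # 200, remaining 0
print(limiter.check("ws-A", now=1002.0))  # 429 with Retry-After
print(limiter.check("ws-B", now=1003.0))  # independent per-workspace quota: 200
```

Keying the counter on (workspace, window) is what gives each workspace an independent quota, and excluding outbound webhook deliveries from this path keeps them outside the management-API budget as required.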
Event Payload Schema Completeness
Given any release lifecycle event (scheduled, activated, skipped, failed, rolled_back) When the webhook is delivered Then the JSON payload validates against the published schema (via field schema_version) and includes: event_id (UUID), event_type, occurred_at (RFC3339), workspace_id, preset_id, preset_version, affected_batch_ids (array), actor (user_id or "system"), reason (for "skipped"/"failed"), from_version and to_version (for "rolled_back"), and metadata.trace_id Given a new optional field is added in a newer schema_version When delivered to an older client Then required fields and semantics remain backward-compatible Given an event of type "rolled_back" When delivered Then payload contains from_version and to_version with correct values and a non-empty reason Given an "activated" event resulting from a scheduled release When delivered Then payload contains schedule_id and activation_time matching server-side records

Secure APIs for Schedules, Pins, and Freeze Windows
Given a workspace-scoped API key with permission "releases:write" When calling endpoints to create/update/delete schedules, pin versions to batches, and create freeze windows Then unauthorized requests return 401, insufficient scope returns 403, and authorized requests return 2xx with stable resource IDs Given a valid schedule creation specifying preset_id, preset_version, and a future activation_time When created Then the API returns 201 with schedule_id and a "scheduled" event is emitted with matching fields Given a valid pin request specifying preset_version and batch_ids When processed Then the API returns 200/201, the batches are locked to the pinned version, and an "activated" event includes affected_batch_ids and preset_version Given an active freeze window overlapping an activation_time or pin attempt When a change is attempted Then the API rejects with 409 Conflict and emits a "skipped" event with reason "freeze_window" Given a rollback request for a preset to a prior version When executed Then the API returns 200 and a "rolled_back" event is emitted with from_version, to_version, and affected_batch_ids

Brand Binding

Automatic enforcement of preset-to-brand and channel pairing. Block off-brand usage, auto-assign the correct preset based on supplier or folder tags, and warn on mismatches—reducing rework and safeguarding visual identity across catalogs.

Requirements

Brand–Preset Mapping Engine
"As a brand manager, I want to define which style presets are approved per brand and channel so that all processed images remain consistent with our visual identity."
Description

A centralized mapping service that binds each brand to its approved style presets and channel variants. Supports one-to-many relationships, channel-level overrides, versioning, and effective date ranges to accommodate seasonal campaigns. Provides default fallbacks when metadata is incomplete and exposes an API for other PixelLift services to resolve the correct preset during batch processing. Ensures every image is processed with the right, brand-compliant preset without manual selection.

Acceptance Criteria
Preset Resolution by Brand and Channel Overrides
Given brand B1 has base preset P1 and channel override Instagram->P1_IG When resolvePreset(brand=B1, channel=Instagram) Then response.presetId = P1_IG and response.mappingVersionId is not null Given brand B1 has base preset P1 and no channel override for Web When resolvePreset(brand=B1, channel=Web) Then response.presetId = P1 Given unknown brand When resolvePreset(brand=Unknown, channel=Web) Then response.presetId = GlobalDefaultPreset and response.warnings includes WARN_FALLBACK_BRAND Given normal load When performing 10,000 single-key lookups over 10 minutes Then p95 latency <= 100 ms and error rate < 0.1%
Auto-Assignment via Supplier and Folder Tags
Given mappings: supplierTag=Acme -> P2; folderTag=Summer24 -> P3 When resolvePreset(brand=B1, channel=Web, metadata={supplierTag:Acme, folderTag:Summer24}) Then response.presetId = P3 Given mapping precedence When conflicting tag mappings exist Then precedence is channelOverride > folderTag > supplierTag > brandDefault > globalDefault and response.decisionPath reflects applied steps Given no metadata tags When resolvePreset(brand=B1, channel=Web) Then response.presetId = brandDefaultPreset Given unit tests covering precedence When running the mapping precedence test suite Then all 12 test cases pass
Effective Date Ranges and Versioning for Seasonal Campaigns
Given brand B1 default P1 (v1) and seasonal override P2 (v2) effective [2025-10-01T00:00Z, 2025-12-31T23:59:59Z] When resolvePreset is called with processingTimestamp=2025-11-15T12:00Z Then response.presetId = P2 and response.mappingVersionId = v2 Given the same configuration When processingTimestamp=2026-01-01T00:00Z Then response.presetId = P1 and response.mappingVersionId = v1 Given overlapping effective ranges for the same brand+channel When both match the timestamp Then the rule with the latest effectiveStartDate wins; if equal, the highest version number wins; if equal, the latest updatedAt wins Given any timestamp ambiguity When resolving Then all timestamps are interpreted in UTC and returned as UTC
Block Off-Brand Preset Usage and Channel Mismatch Warning
Given preset P9 is not approved for brand B1 When processBatch is called with explicitPresetId=P9 for images of brand B1 Then the request is rejected with HTTP 403 and errorCode=ERR_OFF_BRAND and allowedPresetIds are returned Given preset P1 base is approved for brand B1 and channel variant P1_IG exists When processBatch is called with explicitPresetId=P1 and channel=Instagram Then the service auto-corrects to P1_IG, proceeds, and returns warnings=[WARN_CHANNEL_MISMATCH] and appliedPresetId=P1_IG Given any block or auto-correction event When it occurs Then an audit log entry is written with fields {brandId, channel, requestedPresetId, appliedPresetId, userId/requestId, timestamp} within 2 seconds
Batch Resolution API Contract and Performance
Given endpoint POST /v1/preset-resolution When called with a payload of up to 1,000 items Then it returns per-item decisions with HTTP 200 and content-type application/json in under 300 ms p95 and 600 ms p99
Given a requestId header X-Idempotency-Key When the same key is retried Then identical responses are returned and no duplicate side effects are recorded
Given sustained traffic of 10,000 resolutions per minute When load-tested for 30 minutes Then success rate >= 99.9% and mean CPU utilization <= 70%
Given invalid items in the batch When N out of M items are invalid Then the API returns HTTP 207 Multi-Status, processes valid items, and reports per-item errors with codes
Fallback Hierarchy When Metadata Is Incomplete
Given brand B1 has default preset P1 and global default P0 exists When resolvePreset(brand=B1, channel=Unknown) Then response.presetId = P1 and response.decisionPath includes ["brandDefault"]
Given unknown brand and global default P0 exists When resolvePreset(brand=Unknown, channel=Web) Then response.presetId = P0 and response.warnings includes WARN_FALLBACK_BRAND
Given no global default configured When resolvePreset cannot determine a preset Then the API returns HTTP 412 with errorCode=ERR_NO_DEFAULT_CONFIG and does not process the image
Given any fallback decision When resolved Then response includes decisionPath enumerating each applied rule in order
Auditability and Change History for Mappings
Given mapping changes are made When a mapping is created, updated, or deleted Then a new immutable version is recorded with {versionId, brandId, channel, tags, presetId, effectiveRange, changedBy, changedAt, changeReason}
Given the history endpoint GET /v1/mappings/{brandId}/history When called Then it returns the last 100 changes in reverse chronological order within 200 ms p95
Given the RBAC role 'BrandAdmin' When a user without this role attempts to modify mappings Then the API returns HTTP 403 ERR_FORBIDDEN
Given any resolution response When returned Then it includes mappingVersionId and effectiveRangeStart/End fields referencing the applied version
Rule-Based Auto Assignment
"As an operations lead, I want presets to auto-assign based on supplier and folder tags so that batch uploads require zero manual selection."
Description

A rules engine that auto-assigns presets based on supplier, folder tags, filename patterns, SKU prefixes, EXIF/camera data, or custom metadata. Rules are prioritized with deterministic conflict resolution and fallbacks. Integrates at import and pre-process stages to apply bindings at scale for batch uploads. Includes dry-run mode, decision logging, and idempotent reprocessing to avoid duplicate work.

Acceptance Criteria
Auto-Assign by Supplier on Import (Batch Upload)
Given a rule exists: supplier = "Acme" -> preset = "StudioClean v3" And a batch of 500 images is imported with supplier = "Acme" When the import begins Then 100% of images are auto-assigned preset "StudioClean v3" before entering the processing queue And no manual action is required to apply the preset And each item displays assignment status = "Auto-assigned by rule: supplier=Acme"
Auto-Assign via Filename Pattern and SKU Prefix
Given a rule exists: filename matches "*_hero.*" OR SKU prefix = "HR-" -> preset = "HeroPop v2" And files 12345_hero.jpg and HR-7788.png are imported without supplier or folder tags When the files are evaluated by the rules engine at import Then both files are assigned preset "HeroPop v2" And files not matching the filename or SKU patterns are not assigned by this rule
Auto-Assign via EXIF and Custom Metadata
Given a rule exists: EXIF.CameraModel = "Canon EOS R5" AND custom.collection = "Lookbook" -> preset = "Lookbook SoftLight v1" And an image has EXIF.CameraModel = "Canon EOS R5" and custom.collection = "Lookbook" When the image is imported or reaches the pre-process evaluation stage Then the preset "Lookbook SoftLight v1" is assigned And if the EXIF field is missing or mismatched, this rule is skipped without error and other rules continue to evaluate
Deterministic Rule Priority and Conflict Resolution
Given rules exist:
- R1: supplier = "Acme" -> preset = "A" priority = 90
- R2: filename contains "_hero" -> preset = "B" priority = 80
- R3: SKU prefix = "AC-" -> preset = "C" priority = 90
And an image from supplier "Acme" has filename "AC-123_hero.jpg" and SKU "AC-123" When the rules engine evaluates matches Then the applied preset is "A" because higher priority wins and ties are resolved by specificity order: supplier > folder tag > SKU prefix > filename > EXIF > custom metadata; if still tied, lowest rule ID wins And the decision records the evaluated rules, priorities, and the applied tie-breaker
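The priority-then-specificity tie-break above reduces to a single sort key. This is an illustrative sketch; the rule shape is assumed, not PixelLift's real data model:

```python
# Specificity order from the criteria above, most specific first.
SPECIFICITY = ["supplier", "folderTag", "skuPrefix", "filename", "exif", "custom"]

def apply_rules(matches):
    """matches: dicts with keys id, kind, priority, preset.
    Higher priority wins; ties fall back to specificity order; then lowest rule ID."""
    def key(rule):
        return (-rule["priority"], SPECIFICITY.index(rule["kind"]), rule["id"])
    return sorted(matches, key=key)[0]

matched = [
    {"id": 1, "kind": "supplier", "priority": 90, "preset": "A"},   # R1
    {"id": 2, "kind": "filename", "priority": 80, "preset": "B"},   # R2
    {"id": 3, "kind": "skuPrefix", "priority": 90, "preset": "C"},  # R3
]
winner = apply_rules(matched)
# winner["preset"] == "A": R1 and R3 tie at priority 90, and supplier beats SKU prefix
```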
Fallback to Brand/Global Defaults When No Rule Matches
Given no rule matches an image And a brand default preset "BrandDefault v4" exists for the image's brand/channel When the image is evaluated at import or pre-process Then the preset "BrandDefault v4" is assigned And if no brand default exists, the global default preset "Neutral v1" is assigned And the decision record notes "Fallback: Brand Default" or "Fallback: Global Default"
Dry-Run Mode with Decision Logging and Export
Given dry-run mode is enabled for an import job with 200 images And rules R1..Rn are configured When the job is executed in dry-run Then no presets are applied or persisted to any image And a decision report is generated listing for each image: selected preset (if any), matched rule IDs, tie-breaker used, and whether fallback was applied And the report is viewable in the UI and downloadable as CSV And the dry-run summary shows counts for: matched by rule, used brand fallback, used global fallback, no decision
Idempotent Reprocessing and Update Behavior
Given an image was previously auto-assigned preset "P1" and processed When the rules engine is re-run with the same rules and inputs Then no duplicate assignment or reprocessing occurs And when rules change so the image now matches preset "P2" and re-evaluation is invoked with allowUpdates = true Then the assignment updates to "P2" and the image is reprocessed exactly once And when allowUpdates = false, the existing assignment "P1" remains unchanged
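One way to get the idempotent re-run behavior above is to fingerprint each evaluation's inputs and skip work when the fingerprint is unchanged. The names and in-memory store here are assumptions for illustration:

```python
import hashlib, json

def assignment_fingerprint(image_id, ruleset_version, inputs):
    """Stable digest of (image, rules, inputs); identical re-runs produce
    identical fingerprints, so duplicate work can be skipped."""
    payload = json.dumps(
        {"image": image_id, "rules": ruleset_version, "inputs": inputs},
        sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

seen = {}  # stand-in for a persistent assignment store

def evaluate(image_id, ruleset_version, inputs, preset, allow_updates=False):
    fp = assignment_fingerprint(image_id, ruleset_version, inputs)
    prior = seen.get(image_id)
    if prior and prior["fp"] == fp:
        return prior["preset"], "skipped"   # identical rules and inputs: no rework
    if prior and not allow_updates:
        return prior["preset"], "kept"      # rules changed but updates are disabled
    seen[image_id] = {"fp": fp, "preset": preset}
    return preset, "assigned"               # new or updated assignment, processed once
```

A first run assigns P1; re-running with the same ruleset is a no-op; a changed ruleset updates to P2 only when allowUpdates is true.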
Off-Brand Usage Blocker & Override
"As a brand manager, I want the system to block off-brand preset usage and allow controlled overrides so that we reduce rework while retaining governance."
Description

Real-time enforcement that prevents applying non-approved presets to a brand or channel. Provides clear error messaging, suggested compliant alternatives, and a governed override path for authorized roles (with reason codes, approver identity, and time-limited exceptions). Configurable strictness per workspace or brand, with batch-safe handling to stop only offending items while continuing valid ones.

Acceptance Criteria
Real-Time Block on Non-Approved Preset (UI Single Apply)
Given a workspace with brand "Acme" and channel "Amazon" configured to Block non-approved presets And preset "Moody Vibe" is not approved for the Acme-Amazon pairing When a user attempts to apply "Moody Vibe" to an image in the Acme-Amazon context via the UI Then the apply action is prevented (no mutation to the asset) And an error is displayed containing: presetName, brandName, channelName, policyMode=Block And 1–3 compliant preset suggestions are shown, ranked by relevance And the decision is logged with fields: userId, assetId, presetId, brandId, channelId, timestamp, policyMode, overrideAvailable=true And the UI response occurs within 300 ms P95 from click to message
Batch Processing: Partial Continue with Offending Items Isolated
Given a batch of N images where some have a non-approved preset for the active brand-channel When the batch is processed Then compliant items are processed successfully And offending items are blocked without halting the entire batch And the batch result indicates Partial Success with counts: total=N, succeeded=S, blocked=B, failed=F (non-policy failures) And each blocked item has reasonCode=OFF_BRAND_PRESET and 1–3 suggested compliant presets And a downloadable per-item report (CSV and JSON) is available within 60 seconds of batch completion And batch duration overhead is <= 10% compared to an all-compliant baseline of the same size
Governed Override Request and Approval (Four-Eyes)
Given a user with role Editor encounters a Block on applying a non-approved preset When the user submits an override request with required fields: reasonCode (from configured list) and optional comments (<= 500 chars) Then the request is routed to a Brand Admin or Compliance Approver (not the requester) And approval by an eligible approver issues an override token scoped to {brandId, channelId, presetId} with TTL default 24h (configurable 1–168h) And the apply action succeeds only after approval and only within token scope and validity And the audit log records requesterId, approverId, timestamps (requested, approved), tokenId, expiry, reasonCode And if requester==approver, the approval is rejected with code FOUR_EYES_REQUIRED And upon expiry, subsequent applies are blocked again until a new approval is granted
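The token checks implied above (exact scope match, TTL, four-eyes separation) can be sketched as follows; the token fields are illustrative:

```python
from datetime import datetime, timedelta, timezone

def validate_override(token, brand_id, channel_id, preset_id, now):
    """Token must match {brandId, channelId, presetId} exactly and be unexpired;
    requester and approver must differ (four-eyes)."""
    if token["requester_id"] == token["approver_id"]:
        return "FOUR_EYES_REQUIRED"
    scope_ok = (token["brand_id"], token["channel_id"], token["preset_id"]) \
        == (brand_id, channel_id, preset_id)
    if not scope_ok or now >= token["expires_at"]:
        return "INVALID_OVERRIDE"
    return "OK"

now = datetime(2025, 6, 1, tzinfo=timezone.utc)
token = {"requester_id": "u1", "approver_id": "u2",
         "brand_id": "B1", "channel_id": "Amazon", "preset_id": "P9",
         "expires_at": now + timedelta(hours=24)}  # default 24h TTL
# validate_override(token, "B1", "Amazon", "P9", now) -> "OK"
```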
Configurable Strictness per Workspace/Brand
Given policyMode can be set at workspace-level and overridden at brand-level to one of [Block, Warn, Off] When a brand-level mode is set Then it takes precedence over the workspace-level mode And in Warn mode, non-approved preset applies are allowed but display a warning with suggestions and are logged with outcome=WARNED And in Off mode, no policy check runs and no warnings are shown And policy changes propagate to enforcement (UI and API) within 60 seconds and are visible in the policy settings UI And a GET policy endpoint returns the effective mode for a brand-channel And unit/integration tests cover all mode combinations and precedence
API Enforcement and Error Contract
Given the API endpoints POST /apply-preset and POST /batches/{id}/apply-preset When a non-approved preset is requested under policyMode=Block Then the server returns HTTP 422 with code=OFF_BRAND_PRESET and a payload including: presetName, brandName, channelName, policyMode, suggestedPresetIds (0–3) And no changes are persisted to the target asset(s) And under Warn mode, the server returns 200 with warnings[] containing code=OFF_BRAND_PRESET and suggestions, while applying the preset And including a valid, unexpired override token (X-Override-Token) that matches {brandId, channelId, presetId} yields 200 and applies the preset And invalid/expired/mismatched tokens yield 403 with code=INVALID_OVERRIDE And P95 API latency for enforcement checks is <= 400 ms for single apply and <= 800 ms for batch per 100 items
UI Messaging and Discoverability of Compliant Presets
Given a user is blocked or warned for off-brand preset usage in the UI When the message is shown Then the modal/banner headline reads "Preset not approved for Brand-Channel" And the body includes reason, current preset, effective policyMode, and a link "View brand-approved presets" that opens a filtered list And suggested compliant presets (1–3) show name, thumbnail, and Apply buttons; selecting one applies immediately if allowed And the component meets accessibility: focus is trapped, Esc closes, screen-reader labels for controls, and all visible text is localizable And analytics events track impressions, suggestion clicks, and conversion to compliant apply
Audit Log and Reporting for Compliance Events
Given enforcement is enabled When any of the following occurs: Block, Warn, OverrideRequested, OverrideApproved, OverrideDenied, AppliedWithOverride Then an audit event is recorded with fields: eventType, userId, assetId|batchId, presetId, brandId, channelId, policyMode, reasonCode (if any), overrideTokenId (if any), timestamp (UTC), outcome And Admins can filter and export these events via UI and API over a selectable date range And data retention is 365 days with role-based access controls (only Admins view PII) And exported CSV reflects filters and includes a checksum/hash of rows for integrity verification
Real-Time Mismatch Detection
"As a catalog editor, I want real-time warnings when a photo’s preset doesn’t match its destination channel so that I can fix issues before publishing."
Description

Validation layer that detects preset-to-brand or channel mismatches during upload, editing, and export. Checks parameters like aspect ratio, background color, margins, watermarking, and color profile against the bound preset. Surfaces inline warnings and one-click fixes (reapply correct preset or adjust parameters), plus a batch summary view to resolve issues before publishing.

Acceptance Criteria
Inline Warning on Upload for Preset Mismatch
Given a project with a bound preset-to-brand and/or channel mapping When a user uploads images and analysis completes Then any image whose aspect ratio, background color, margin, watermarking, or color profile deviates from the bound preset is flagged with an inline warning badge on the thumbnail and a summary tooltip listing the offending parameters And the warning appears within 1 second per image at the 95th percentile for batches up to 500 images And each flagged image shows two CTAs: "Reapply Correct Preset" and "Adjust Parameters" And unflagged images display no warning and proceed normally
Auto-Assign Correct Preset by Supplier or Folder Tag
Given a folder or upload batch tagged with a supplier or brand identifier that has a configured preset binding When images are added to the batch Then the correct bound preset is auto-applied without user action and the action is indicated via a non-blocking toast and per-image icon And accuracy is 100% for all images with a valid mapping; images without a mapping are left unchanged and listed in a "No Mapping" filter And attempting to manually select an off-brand preset is blocked with an explanatory message
Real-Time Parameter Drift Detection During Editing
Given the editor is open on an image with a bound preset When the user modifies any controlled parameter (aspect ratio, background color, margin, watermarking, or color profile) so that it no longer matches the preset Then a real-time warning banner and inline control highlights appear within 200ms, identifying the specific parameters out of compliance And the "Revert to Preset" CTA becomes enabled; selecting it restores only the drifted parameters while preserving other edits And the warning clears immediately when parameters return to compliance
One-Click Fix Applies Bound Preset
Given an image is flagged for mismatch When the user clicks "Reapply Correct Preset" Then all out-of-compliance parameters are reset to the bound preset values in a single operation within 500ms, the image revalidates, and the warning state is cleared And non-conflicting user edits (e.g., exposure or retouch adjustments) remain intact And an undo step is added to history allowing full reversal in one action
Pre-Export Batch Mismatch Summary and Resolution
Given a user initiates export for a batch When any images in the batch are out of compliance with their bound preset/channel rules Then a pre-export modal shows counts by mismatch type, affected image thumbnails, and per-image checklists of failing parameters And the modal provides "Fix All" and "Fix Selected" actions that apply corrections and revalidate before allowing export And if organizational policy blocks off-brand exports, export is prevented for failing items with a clear reason and link to fix; otherwise user can proceed after acknowledging warnings
Channel-Specific Rules Enforcement on Export
Given a bound channel with explicit constraints (e.g., marketplace requires 1:1 aspect, pure white background #FFFFFF, sRGB, no watermark) When the user selects an export target that conflicts with the bound channel or when the image parameters violate channel constraints Then the system flags the conflict, auto-selects the correct channel preset if available, or blocks export until parameters comply And the validation explicitly checks and reports: aspect ratio tolerance ±0.01, background color deltaE < 2 to #FFFFFF, margins within preset %, watermarking disabled/enabled per rule, and ICC profile equals sRGB
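The channel checks above might be sketched as a per-parameter validator. Note the background check below is a naive per-channel RGB comparison standing in for a true CIE deltaE computation, and the field names are assumptions:

```python
def validate_for_channel(img, rules):
    """Return the list of parameters failing the channel constraints."""
    failures = []
    if abs(img["width"] / img["height"] - rules["aspect"]) > 0.01:  # ±0.01 tolerance
        failures.append("aspect_ratio")
    # Naive stand-in for deltaE < 2 against the required background color.
    if any(abs(a - b) > 2 for a, b in zip(img["bg_rgb"], rules["bg_rgb"])):
        failures.append("background_color")
    if img["watermark"] != rules["watermark"]:
        failures.append("watermark")
    if img["icc_profile"] != rules["icc_profile"]:
        failures.append("icc_profile")
    return failures

marketplace = {"aspect": 1.0, "bg_rgb": (255, 255, 255),
               "watermark": False, "icc_profile": "sRGB"}
img = {"width": 1200, "height": 1200, "bg_rgb": (255, 255, 255),
       "watermark": True, "icc_profile": "sRGB"}
# validate_for_channel(img, marketplace) -> ["watermark"]
```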
Channel-Specific Preset Variants
"As an e-commerce marketer, I want channel-specific preset variants to be applied automatically so that each marketplace receives optimized, compliant images."
Description

Support for per-channel variations of a base brand preset (e.g., Amazon white background, Instagram square crop, Shopify padding). Automatically selects and renders the correct variant at export or publish time, with reprocessing when channel policies or presets change. Maintains asset versioning to track which variant was used and ensures consistent outputs across marketplaces.

Acceptance Criteria
Auto-Select Preset Variant per Export Channel
- Given a base brand preset has channel-specific variants configured, When an export or publish job targets a specific channel (e.g., Amazon, Instagram, Shopify), Then the system auto-selects the matching channel variant without user input.
- And the job record for each asset stores brand_id, preset_id, variant_id, and channel.
- And the output asset filename includes the channel and variant version in the pattern: <sku>_<channel>_v<presetVersion>.<ext>.
- And 100% of assets in the job receive the correct channel variant; any asset without a mapping is halted with status "BLOCKED" and reason "MISSING_VARIANT".
Brand Binding Blocks Off-Brand Variant Usage
- Given workspace brand B is active, When a user attempts to apply a preset variant from brand C, Then the action is blocked with error code BRAND_BINDING_VIOLATION and cannot proceed to export/publish.
- Given a variant is selected that does not match the target channel, When the user queues export, Then a warning is shown and a one-click "Switch to <channel> variant" option is provided; proceeding without switch is blocked if policy=Block, otherwise allowed with recorded warning if policy=Warn.
- And all blocks/warnings are logged with user_id, asset_id, rule_id, timestamp, and decision (Blocked|Warned|Auto-switched).
Auto-Assign Variant via Supplier and Folder Tags
- Given supplier and/or folder tags are mapped to a brand preset and channel, When images are batch-uploaded into a tagged folder, Then the system auto-assigns the corresponding channel-specific variant and queues processing.
- And the assigned variant is displayed in batch review; users with Brand Manager role may override before processing; overrides are captured in audit with old_variant_id and new_variant_id.
- And if multiple rules match, the highest-priority rule is chosen and the decision path is shown; if no rules match, default to the base preset and set job warning NO_RULE_MATCH.
Auto-Reprocess on Policy or Preset Change with Versioning
- Given a channel policy or any channel-specific preset variant is edited and saved, When the change is published, Then previously exported assets tied to that variant are marked Outdated and added to a Reprocess queue.
- And reprocessing creates new asset versions with incremented presetVersion (and variant settings hash), leaving prior versions intact and retrievable.
- And enabled channel connectors receive updated assets after reprocessing; publish actions reference the new version id.
- And a summary report lists counts of impacted, reprocessed, skipped, and failed assets with reasons.
Asset Version Metadata and Traceability
- Given any export or publish generates an output, When the asset is produced, Then the system persists an immutable version record with fields: assetVersionId, brand_id, preset_id, variant_id, presetVersion, channel, settings_hash, source_hash, and timestamp.
- And the version history view can filter by channel and presetVersion, and displays the exact variant used.
- And a Reproduce Output action re-runs processing pinned to the recorded presetVersion and settings_hash; the resulting file matches the stored version by byte-level checksum.
Fallback Behavior When Channel Variant Missing
- Given a base preset has no defined variant for the target channel, When a user exports or publishes to that channel, Then the system applies the base preset with channel-safe defaults and emits warning MISSING_CHANNEL_VARIANT.
- And an org-level policy controls fallback behavior (Block | Allow with Warning); the configured policy is enforced consistently across batch jobs.
- And when a new channel variant is later defined, users can bulk reprocess previously warned assets to replace fallback outputs; the new versions are linked to the original via priorVersionId.
Binding Audit & Compliance Reporting
"As a compliance analyst, I want audit logs and reports of bindings and overrides so that I can track adherence and identify training gaps."
Description

Comprehensive, immutable logs of preset bindings, assignments, blocks, and overrides, including user, timestamp, source metadata, and reason codes. Provides dashboards and exportable reports for compliance rate, off-brand attempts prevented, and rework avoided. Offers filters by brand, supplier, channel, and time window, plus webhooks/CSV exports for BI integration and retention controls.

Acceptance Criteria
Immutable Audit Log for Binding Events
Given a binding action (auto-assign, block, or override) occurs, When the event is processed, Then an audit entry is created within 5 seconds containing: event_id (UUID), event_type, outcome, preset_id/name, brand_id/name, channel_id/name, supplier_id/name, source_folder, tag_ids, image_id(s), rule_id, user_id/email or "system", timestamp (ISO 8601 UTC), reason_code (if block/override), request_id, and hash_chain value.
Given an audit entry exists, When any actor attempts to modify it via API or UI, Then the system prevents mutation and logs a tamper_attempt event with actor and timestamp.
Given audit entries are appended, When a daily integrity check runs, Then each entry’s hash_chain equals SHA-256(prev_hash + canonical_entry_json) and the check reports 0 mismatches.
Given an event_id, When retrieved via API, Then the response returns the exact stored entry and a 200 status within 300 ms for the 95th percentile.
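The hash-chain rule stated above, hash_chain = SHA-256(prev_hash + canonical_entry_json), can be exercised with a short sketch; the entry shape here is illustrative:

```python
import hashlib, json

def chain_hash(prev_hash, entry):
    """hash_chain = SHA-256(prev_hash + canonical_entry_json)."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode()).hexdigest()

def verify_ledger(entries, genesis="0" * 64):
    """True iff every stored hash matches the recomputed chain (0 mismatches)."""
    prev = genesis
    for e in entries:
        if e["hash_chain"] != chain_hash(prev, e["body"]):
            return False
        prev = e["hash_chain"]
    return True

# Append two events, then verify; mutating any body breaks verification.
ledger, prev = [], "0" * 64
for body in [{"event_type": "block", "preset_id": "P9"},
             {"event_type": "override", "preset_id": "P9"}]:
    prev = chain_hash(prev, body)
    ledger.append({"body": body, "hash_chain": prev})
# verify_ledger(ledger) -> True
```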
Dashboard Metrics: Compliance, Blocks, Rework Avoided
Given a selected time window and scope, When the dashboard loads, Then compliance_rate = compliant_events ÷ eligible_events rounded to 1 decimal and displayed with numerator/denominator.
Given block events occurred in scope, When the dashboard loads, Then off_brand_attempts_prevented equals the count of block events in scope.
Given minutes_per_rework is set to 5, When calculating rework_avoided, Then rework_avoided_minutes = off_brand_attempts_prevented × 5 and is displayed as HH:MM and (if cost_per_min configured) as currency.
Given new events stream in, When 60 seconds elapse, Then dashboard aggregates reflect the new totals.
Given filters return no data, When the dashboard loads, Then the UI shows zeros and an empty-state message without errors.
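The dashboard formulas above reduce to simple arithmetic. This sketch uses assumed event fields and expresses compliance_rate as a percentage rounded to 1 decimal:

```python
def dashboard_metrics(events, minutes_per_rework=5):
    """Sketch of the dashboard aggregates; field names are illustrative."""
    eligible = [e for e in events if e.get("eligible", True)]
    compliant = sum(1 for e in eligible if e["outcome"] == "compliant")
    blocked = sum(1 for e in events if e["outcome"] == "blocked")
    rate = round(100 * compliant / len(eligible), 1) if eligible else 0.0
    return {
        "compliance_rate_pct": rate,              # compliant ÷ eligible, 1 decimal
        "compliant_events": compliant,            # numerator, shown alongside rate
        "eligible_events": len(eligible),         # denominator
        "off_brand_attempts_prevented": blocked,  # count of block events in scope
        "rework_avoided_minutes": blocked * minutes_per_rework,
    }

events = [{"outcome": "compliant"}] * 8 + [{"outcome": "blocked"}] * 2
# -> 80.0% compliance, 2 attempts prevented, 10 minutes of rework avoided
```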
Filterable Reporting by Brand, Supplier, Channel, and Time
Given filters for brand, supplier, channel, and time window, When multiple values are selected per facet, Then results include events matching any selected value within each facet and all facets are combined with AND.
Given a time zone is chosen, When the time window is applied, Then boundaries are computed in that time zone and displayed; default is UTC if none provided.
Given a query returning ≤ 100,000 events, When results are requested, Then the first page loads within 3 seconds and the query supports deterministic sorting: timestamp desc, event_id as tiebreaker.
Given paginated results, When navigating pages, Then the total count is accurate and page changes do not alter ordering for a fixed dataset.
Exports: CSV Download and Webhook Delivery for BI
Given a user requests CSV export for the current filters, When the export completes, Then the CSV is UTF-8 with a header row and columns: event_id, event_type, outcome, preset_id, preset_name, brand_id, brand_name, channel_id, channel_name, supplier_id, supplier_name, source_folder, tag_ids, image_ids, rule_id, user_id, user_email, timestamp_utc, reason_code, request_id, hash_chain.
Given an export exceeds 100,000 rows, When requested, Then an asynchronous job is created and a downloadable, signed URL is provided that expires in 24 hours.
Given fields contain commas, quotes, or newlines, When generating CSV, Then values are escaped according to RFC 4180.
Given a webhook destination with shared secret is configured, When new audit events occur, Then a JSON batch is POSTed within 30 seconds signed with HMAC-SHA256 (header X-PixelLift-Signature) and includes an Idempotency-Key.
Given a webhook delivery receives a non-2xx response, When retrying, Then the system retries up to 6 times with exponential backoff capped at 10 minutes and guarantees at-least-once delivery.
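The webhook signing contract described above can be sketched with Python's stdlib hmac; the payload shape is an assumption:

```python
import hashlib, hmac, json, uuid

def sign_webhook_batch(events, secret):
    """Build a signed delivery: hex-encoded HMAC-SHA256 of the JSON body in
    X-PixelLift-Signature, plus an Idempotency-Key for at-least-once retries."""
    body = json.dumps({"events": events}, sort_keys=True).encode()
    sig = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    headers = {
        "X-PixelLift-Signature": sig,
        "Idempotency-Key": str(uuid.uuid4()),  # receivers dedupe retried batches
        "Content-Type": "application/json",
    }
    return body, headers

def verify_webhook(body, signature, secret):
    """Receiver side: recompute the HMAC and compare in constant time."""
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

Because the key is idempotent across retries, a receiver that stores seen keys can safely handle the at-least-once delivery guarantee.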
Override Logging with Reason Codes and Approvals
Given a user initiates a preset override, When submitting, Then a reason_code from the configured list is required and an optional note up to 500 characters is allowed.
Given override requires approval, When the request is made, Then the approver must differ from the requester and the approver’s user_id is recorded in the audit log.
Given an override is executed, When logging the event, Then the entry includes previous_preset_id/name, new_preset_id/name, override=true, and links to the originating rule_id.
Given a user lacks override permission, When they attempt an override, Then the action is blocked and a block event is logged with reason_code=permission_denied and no state change occurs.
Retention Controls and Redaction Policy
Given an admin sets a workspace retention policy (e.g., 365 days), When a log entry exceeds the retention window and is not under legal hold, Then it is deleted or archived per policy and a retention_deletion summary event is recorded.
Given PII fields (e.g., user_email) are configured for redaction after 90 days, When an entry passes the redaction threshold, Then those fields are irreversibly redacted while preserving non-PII fields and metric integrity.
Given a legal hold is applied to a brand, When retention jobs run, Then entries under hold are excluded from deletion until the hold is lifted and the hold is auditable.
Given a retention policy change, When saved, Then the change is logged with admin user_id, timestamp, and old/new values and applies prospectively to new deletions.
Mapping Admin Console
"As a workspace admin, I want an admin console to manage mappings and import rules in bulk so that onboarding new brands and suppliers is fast and error-free."
Description

An admin UI for managing brand-to-preset mappings and auto-assignment rules. Includes bulk import/export, in-line validation, role-based access control, preview of presets on sample images, and test-run capability before applying changes to full catalogs. Tracks change history with rollback, supports multi-brand workspaces, and provides guardrails to avoid breaking existing bindings.

Acceptance Criteria
Bulk Import with Inline Validation
- Given an Admin or Editor uploads a CSV/JSON using the provided template, the system validates required fields (brand_id, channel_id, preset_id, rule_scope, priority, status) before commit.
- Row-level errors and warnings are displayed inline with line numbers; the first 200 are surfaced in the UI and a full downloadable error report is generated.
- Critical errors (missing required fields, invalid IDs, duplicate key brand+channel+scope, non-integer priority) block import; warnings (unused columns, deprecated fields) allow proceeding with explicit confirmation.
- Import is atomic: either all valid rows are applied or none if any critical error exists.
- Idempotency: re-importing an identical file produces zero changes and records a no-op audit entry.
- Performance: validation for 10,000 rows completes ≤ 15s; applying 10,000 valid rows completes ≤ 60s.
- Audit: a single change-set is created with counts of created/updated/deleted mappings and a checksum of the source file.
- Guardrail: attempts to overwrite active mappings used by in-flight jobs require Admin confirmation and are deferred if a safe window is scheduled.
Auto-Assignment Rule Engine Evaluation
- When an asset enters the pipeline with brand_id, channel, supplier_tag, and folder_tag, rules evaluate in descending priority; match order: supplier_tag > folder_tag > brand default; first match wins.
- Deterministic outcome: given identical inputs and ruleset version, the same preset_id is assigned; tiebreaker is most recent update timestamp if priorities are equal.
- Fallback: if no rule matches, processing is blocked with error "No preset mapped"; a notification is sent to brand owners and Admins.
- Per-channel enforcement: a preset not bound to the asset's brand+channel is rejected with a mismatch error; asset is not processed.
- Rule application logs rule_id, ruleset_version, and assigned preset_id with the asset's job record.
- Rule changes affect only jobs queued after the save timestamp; in-flight jobs retain previous assignments.
- Throughput: the engine evaluates ≥ 300 assets/second per workspace under nominal load with p95 decision latency ≤ 50 ms.
Role-Based Access Control for Mapping Admin Console
- Roles supported: Owner, Admin, Editor, Viewer; default new users are Viewer.
- Owner/Admin: full CRUD on mappings, rules, imports, rollbacks, and can apply test-run results.
- Editor: can create/edit mappings and run dry runs; cannot apply to production or perform rollbacks.
- Viewer: read-only; may export reports; no create/update/delete actions.
- All mutating endpoints enforce authorization and return 403 on insufficient permission; corresponding UI controls are disabled and tooltipped.
- Every attempted and successful action is logged with actor_id, role, IP, timestamp, and outcome.
- SSO group-to-role mapping is supported; manual overrides persist and are auditable.
- Permission changes take effect immediately across sessions and are reflected in UI within 5 seconds.
Preset Preview on Sample Images
- Selecting a brand and preset allows previewing up to 5 sample images; previews render ≤ 2s per 1080p image (batch of 5 ≤ 10s, p95).
- Preview shows before/after toggle and 1:1 zoom; no catalog changes are persisted.
- The rendering pipeline version used is identical to production; the version is displayed in the UI.
- Visual fidelity: color difference between preview and production output for the same input is ΔE ≤ 2.
- Input color spaces supported: sRGB and Adobe RGB; others are rejected with an explanatory message.
- Optional preview download as JPEG/WebP is available with max file size 10 MB per image; watermark "Preview" is applied.
- Caching: repeating the same preset+image preview returns cached result with p95 ≤ 500 ms.
Dry Run of Rule Changes Before Catalog Apply
- Users select a ruleset and target scope (workspace/brand/channel/folder) and initiate a dry run; no changes are persisted.
- The system evaluates ≥ 50,000 assets and produces an impact report: counts per rule, conflicts, blocked assets, and proposed preset assignments.
- Report generation completes ≤ 3 minutes for 50,000 assets with progress feedback and cancel option.
- Admins can promote the dry run to apply; a content hash of the evaluated set ensures the applied set matches the report; if the catalog has changed, promotion is blocked and a re-run is required.
- No emails or processing jobs are triggered by dry runs; results are stored for 30 days and are auditable.
- Applying from a dry run executes transactionally; success/failure and counts are recorded in a change-set.
Change History and Point-in-Time Rollback
- Every create/update/delete of mappings or rules generates a versioned change-set with before/after values, actor, timestamp, and rationale (optional text).
- History view supports filtering by brand, channel, actor, and date range; export available as CSV/JSON.
- Admins can rollback an individual change-set or restore to a specific timestamp; the system presents an impact summary before execution.
- Rollback is atomic and requires a confirmation with a reason ≥ 15 characters.
- Guardrails block rollbacks that would introduce duplicate/conflicting mappings; suggested resolutions are presented.
- All rollback operations are logged and reversible (forward-apply the prior change-set).
- Performance: rollback of change-sets ≤ 10,000 rows completes ≤ 60s.
Guardrails in Multi-Brand Workspaces
- Isolation: a preset from Brand A cannot be mapped to Brand B; cross-workspace access is blocked.
- Uniqueness enforced: at most one active default preset per brand+channel; scoped rules (supplier/folder) cannot overlap at equal or higher priority without explicit override and justification.
- Impact check on save shows affected catalogs/assets; if affected_count > 0, Admin confirmation is required; option to schedule apply window is provided.
- Deletion is blocked if a mapping is referenced by active or scheduled jobs; UI lists blockers and links to cancel or reschedule.
- Real-time validation returns field-level errors within 500 ms of input change.
- Concurrency: optimistic locking prevents stale updates; saves with outdated versions are rejected with guidance to refresh.
- Notifications are sent to brand owners and Admins on blocked saves, high-impact changes, and scheduled applies.
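The optimistic-locking behavior in the concurrency bullet can be sketched as a version check on save: a client submitting a stale version is rejected and told to refresh. The in-memory store and names here are hypothetical stand-ins for the real persistence layer.

```python
class StaleVersionError(Exception):
    """Raised when a save carries an outdated row version."""

def save_mapping(store, mapping_id, expected_version, new_values):
    """Apply an update only if the client saw the current version (optimistic lock)."""
    row = store[mapping_id]
    if row["version"] != expected_version:
        raise StaleVersionError(
            f"mapping {mapping_id} is at v{row['version']}, "
            f"client had v{expected_version}; refresh and retry"
        )
    row.update(new_values)
    row["version"] += 1  # bump so any concurrent stale save is rejected
    return row["version"]
```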

Audit Insights

Tamper-proof activity trails with usage analytics by brand, user, and preset version. Receive anomaly alerts (e.g., off-brand attempts), export compliance reports, and map usage to cost centers for clear accountability and simpler audits.

Requirements

Immutable Audit Log Ledger
"As a security and compliance officer, I want an immutable, tamper-evident audit log of all processing activities so that I can prove compliance and investigate incidents with confidence."
Description

Provide an append‑only, tamper‑evident audit ledger that records every significant action in PixelLift, including uploads, AI retouch operations, background removals, style‑preset applications (with preset/version IDs), exports, approvals, and administrative changes. Each entry must capture timestamp (NTP-synchronized), actor (user/service/API key), brand/workspace, cost center tag, asset IDs, operation parameters, model versions, originating IP/agent, and outcome status. Entries are hash‑chained and signed to detect alteration, stored on WORM-capable storage with configurable retention and legal holds. Include a verifiable proof endpoint to validate integrity of individual entries or ranges. Integrate via an event bus so all services emit standardized audit events with correlation IDs, ensuring end‑to‑end traceability across the processing pipeline.
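The hash-chaining described above can be sketched in miniature: each entry embeds the previous entry's hash, so any alteration breaks verification from the earliest affected entry onward. Digital signatures, key IDs, and WORM storage are omitted here, and all names are illustrative.

```python
import hashlib
import json

def append_entry(ledger, event):
    """Append an audit event, chaining it to the predecessor's hash."""
    previous_hash = ledger[-1]["entry_hash"] if ledger else "0" * 64
    body = {"event": event, "previous_hash": previous_hash}
    entry_hash = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode("utf-8")
    ).hexdigest()
    ledger.append({**body, "entry_hash": entry_hash})

def verify_chain(ledger):
    """Return (True, None) if intact, else (False, earliest bad entry index)."""
    prev = "0" * 64
    for i, entry in enumerate(ledger):
        body = {"event": entry["event"], "previous_hash": entry["previous_hash"]}
        recomputed = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode("utf-8")
        ).hexdigest()
        if entry["previous_hash"] != prev or entry["entry_hash"] != recomputed:
            return False, i
        prev = entry["entry_hash"]
    return True, None
```

Flipping any bit in a past entry makes `verify_chain` report the earliest affected index, matching the detection requirement above.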

Acceptance Criteria
Append‑Only Hash‑Chained Ledger Writes
Given a valid audit event, When it is appended, Then the entry includes previous_hash, entry_hash, and digital_signature with key_id, and the chain continuity is verifiable end-to-end. Given any attempt to update or delete an existing ledger entry via API or storage, When executed, Then the operation is rejected with 403 and a tamper_attempt audit event is recorded. Given the ledger data at rest, When any bit in a past entry is modified, Then the next verification detects chain break and reports the earliest affected entry_id. Given signing key rotation occurs, When new entries are appended, Then signatures verify with the new key_id and previously signed entries continue to verify with their original key_id.
Complete Field Capture for Significant Actions
Given actions upload, AI_retouch, background_removal, style_preset_apply, export, approval, and admin_change, When performed, Then an audit entry is recorded for each with fields: timestamp_utc, actor (user/service/api_key), brand_workspace, cost_center_tag, asset_ids, operation_type, operation_parameters, model_version, originating_ip, user_agent, outcome_status, correlation_id. Given a style_preset_apply action, When recorded, Then preset_id and preset_version are present and non-empty. Given an action fails, When recorded, Then outcome_status includes failure_code and error_message. Given timestamps are captured, When validated against NTP, Then ntp_offset_ms is present and within ±500 ms.
WORM Storage, Retention, and Legal Hold
Given WORM-capable storage is configured, When ledger objects are written, Then they are immutable for the active retention period. Given a brand/workspace retention policy between 1 and 10 years is configured by an authorized admin, When saved, Then new entries inherit the policy and the change is itself audited. Given a legal hold is placed on a brand/workspace or entry range, When retention expiry is reached, Then deletion is prevented until the hold is lifted and all actions are audited. Given an attempt to delete or rewrite a WORM-protected object, When executed, Then the operation fails with 403 and a tamper_attempt audit event is recorded.
Integrity Proof Endpoint (Entry and Range)
Given an entry_id, When GET /audit/proof?entry_id={id} is called with valid auth, Then 200 is returned with entry_hash, previous_hash, signature, key_id, and verification_result=true. Given start and end identifiers, When GET /audit/proof?start={s}&end={e} is called, Then a contiguous-chain proof for the range is returned within 2 seconds for ranges ≤ 10,000 entries and verifies client-side. Given a nonexistent entry_id or range, When proof is requested, Then 404 is returned with error_code=ENTRY_NOT_FOUND. Given a chain break within the requested range, When proof is requested, Then verification_result=false and first_corrupted_entry_id is included.
Event Bus Standardization and Correlation
Given any service emits audit events, When publishing to the event bus, Then messages conform to schema_version=X with required fields and pass schema registry validation. Given a user action triggers downstream processing, When events are emitted across services, Then a single correlation_id is propagated and present in all related ledger entries. Given the event bus is unavailable, When an event is to be emitted, Then the service retries with exponential backoff up to 5 minutes, persists to an outbox, and duplicates are handled idempotently. Given a publish fails schema validation, When attempted, Then it is rejected, captured in a DLQ with reason, and the failure is audited.
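The retry-then-outbox behavior in the event-bus criterion can be sketched as below: exponential backoff for up to five minutes, then the event is parked in a durable outbox for later idempotent redelivery. The injectable `sleep` is a test convenience; all names are hypothetical.

```python
import time

def publish_with_retry(publish, event, outbox, max_window=300.0, sleep=time.sleep):
    """Retry the event bus with exponential backoff for up to ~5 minutes,
    then persist the event to an outbox for later redelivery."""
    delay, waited = 1.0, 0.0
    while True:
        try:
            publish(event)
            return "published"
        except ConnectionError:
            if waited >= max_window:
                outbox.append(event)  # drained idempotently once the bus recovers
                return "outboxed"
            sleep_for = min(delay, max_window - waited)
            sleep(sleep_for)
            waited += sleep_for
            delay *= 2
```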
Administrative Key and Role Controls
Given RBAC policies are enforced, When a user without Audit.Admin role attempts to change retention, legal holds, signing keys, or proof settings, Then the request is denied with 403 and is audited. Given a key rotation is initiated by Audit.Admin, When completed, Then a new key pair with incremented key_id is active for signing, prior keys remain for verification-only, and rotation metadata is recorded. Given a brand-scoped user requests audit entries, When authorized, Then results are limited to the user's brand/workspace scope and access is audited.
Time Synchronization and Clock Drift Monitoring
Given system nodes synchronize time via NTP, When drift exceeds ±500 ms on any node, Then new audit writes from that node are blocked, an alert is raised, and the event is audited. Given entries are timestamped, When saved, Then each includes server_time_utc and a monotonic_sequence to guarantee ordering for same-millisecond events. Given NTP services are restored, When drift returns within threshold, Then audit writes resume and recovery is audited.
Granular Audit Filters & Search
"As a compliance analyst, I want to filter and search audit records by brand, user, preset version, and time range so that I can quickly find relevant evidence during reviews."
Description

Deliver a queryable audit interface (UI + API) that supports filtering by brand, user, role, time range, action type, asset ID, preset name/version, outcome status, IP/geolocation, and correlation ID. Provide full‑text search on metadata, sortable columns, pagination, saved searches, and export of result sets. Enforce role‑based access controls so users only see records within their permitted scopes. Optimize for large datasets with indexed queries and time‑partitioned storage. Include quick pivots from aggregated views to raw audit entries for fast investigations.

Acceptance Criteria
Multi-Dimensional Filtering (UI & API)
Given an organization with ≥10M audit records spanning multiple brands, users, roles, action types, asset IDs, preset names/versions, outcome statuses, IPs/geolocations, correlation IDs, and a time range ≥90 days When a user applies any single supported filter (brand, user, role, time range, action type, asset ID, preset name/version, outcome status, IP/geolocation, correlation ID) in the UI or via API Then only matching records are returned and the applied filter is shown as active And p95 response time for single-indexed filter queries is ≤2s for result sets ≤10k rows When multiple different filter types are applied concurrently Then filters combine with AND semantics across types and OR semantics within multi-select values of the same type And the total count equals the filtered set size (±0.1%) And time range filters are inclusive of start and exclusive of end in UTC And IP filter accepts exact IP and CIDR; geolocation filter accepts country/region/city codes And correlation ID filter is exact and case-insensitive And UI and API return identical results for equivalent queries
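The combination semantics above (AND across filter types, OR within the multi-select values of one type) can be sketched as a simple predicate. The filter shape — a dict mapping field name to a set of accepted values — is an assumption for illustration.

```python
def matches(record, filters):
    """AND across filter types; OR within a type's multi-select values.

    filters: dict of field name -> set of accepted values (assumed shape).
    An empty filter dict matches every record.
    """
    return all(
        record.get(field) in accepted
        for field, accepted in filters.items()
    )
```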
Full-Text Search Across Metadata
Given audit records containing metadata fields (e.g., file name, preset name, notes, error messages) When the user submits a full-text query string in the UI or API parameter q Then results include records where any metadata field contains the query terms (case-insensitive, tokenized on alphanumerics) And combining full-text search with filters returns the intersection of both And p95 response time for full-text queries over ≤10k returned rows is ≤3s And results are ranked by relevance then by timestamp desc when sort is not explicitly set And an empty query or only stop-words returns no full-text matches and does not alter applied filters
Sortable Columns & Pagination (UI & API)
Given the audit list view with columns: timestamp, brand, user, role, action type, asset ID, preset name, preset version, outcome status, IP, geolocation, correlation ID When the user clicks a column header (UI) or sets sort parameters (API) Then the list sorts ascending/descending correctly and stably, defaulting to timestamp desc on initial load And changing sort resets pagination to the first page And pagination supports page sizes 25/50/100 (default 50) in UI, and API supports cursor-based pagination with next/prev tokens and a limit up to 500 And the same query + sort + cursor deterministically returns the same page And p95 latency to fetch any page ≤2s for pages up to 500 rows And the total count reflects the filtered set across all pages
Saved Searches: Create, Run, Manage
Given a user with permission to save searches When the user saves the current combination of filters, full-text query, sort, and columns as a named Saved Search Then the Saved Search is persisted with a unique name per user (1–120 chars) and is listed under the user’s Saved Searches When the user runs a Saved Search Then the UI applies the exact saved parameters and returns identical results to the original at the same data snapshot And p95 load time to apply a Saved Search is ≤2s for queries returning ≤10k rows When the user renames, updates, or deletes a Saved Search Then the changes are persisted and reflected in the list; deleted searches no longer appear And Saved Searches are private to the owner user and not visible to other users
Export Filtered Results (CSV/JSON, Sync/Async)
Given a filtered/sorted result set in the audit interface When the user exports results to CSV or JSON via UI or API Then only records within the current query scope are exported and RBAC is enforced And for result sets ≤50k rows, the export downloads synchronously within 60s; for larger sets up to 5M rows, an asynchronous job is created and a downloadable link is provided upon completion And exported data preserves applied sort order and includes the selected columns; timestamps are ISO 8601 UTC And the export file name includes org/brand (where applicable), query date range, and a timestamp And the API exposes endpoints to create export jobs and poll job status And the number of exported rows equals the query’s total count at export time
Role-Based Access Scope Enforcement
Given users with different scopes (e.g., Org Admin with all brands in org; Brand User restricted to Brand A; Standard User restricted to own actions) When a user queries the audit UI or API Then only records within that user’s permitted brands/resources are returned And out-of-scope filter values are disabled in UI and rejected by API with 403 and an explanatory error And tenant isolation is enforced so users cannot access records from other organizations under any circumstances And API tokens inherit the issuing user’s scopes and produce identical visibility And all access denials are themselves auditable with user, reason, and attempted scope
Pivot from Aggregated Views to Raw Audit Entries
Given an aggregated view (e.g., counts by brand, action type, or outcome over a selected time window) When the user clicks an aggregate bucket (bar, slice, or table cell) Then the app opens the raw audit list view with corresponding filters and the same time window pre-applied And the raw list total count matches the aggregate bucket count at query time (±0.1%) And the pivoted state is shareable via a deep link URL capturing filters, time range, and sort And navigating back returns to the aggregate view with prior state preserved And p95 latency to open the pivoted raw list is ≤2s for ≤10k-row result sets
Usage Analytics by Brand/User/Preset Version
"As a brand operations manager, I want usage analytics by brand, user, and preset version so that I can measure adoption, efficiency gains, and enforce brand standards."
Description

Aggregate audit events into analytics dashboards that show volumes processed, success/failure rates, average processing time, estimated time saved, preset adoption and effectiveness, and anomaly counts. Provide drill‑down from charts to underlying audit entries, with breakdowns by brand, cost center, user, preset version, and time window. Ensure multi‑tenant data isolation, near‑real‑time updates via streaming aggregations, and exportable widgets. Surface KPIs relevant to e‑commerce outcomes (e.g., conversion lift correlation where available) to demonstrate value and drive preset governance.

Acceptance Criteria
Real-Time Brand-Level Aggregates
Given audit events for tenant T and brand B are ingested via the streaming pipeline When new events arrive Then dashboard metrics for brand B (processed volume, success rate, failure rate, average processing time, estimated time saved) update within 60 seconds at p95 and 120 seconds at p99 And estimated time saved is computed as sum(max(0, baseline_edit_time_per_asset - processing_time_per_asset)) with a brand-configurable baseline (default 180 seconds), and the baseline value is displayed in the widget info And for any 1-hour window, each metric matches a point-in-time recomputation from the audit store within the greater of 0.5% absolute difference or ±5 events
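The estimated-time-saved formula in the criterion above can be expressed directly (baseline defaults to 180 seconds per asset, as specified):

```python
def estimated_time_saved(processing_times, baseline=180.0):
    """sum(max(0, baseline_edit_time_per_asset - processing_time_per_asset))."""
    return sum(max(0.0, baseline - t) for t in processing_times)
```

For example, assets processed in 30 s, 200 s, and 100 s against the default baseline save 150 + 0 + 80 = 230 seconds; assets slower than the baseline contribute zero rather than negative savings.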
Drill-Down From Charts to Audit Trail
Given a user views an analytics chart with filters for brand(s), cost center(s), user(s), preset version(s), and a time window When the user clicks a chart element (bar, slice, point, or legend bucket) Then a drill-down table opens showing underlying audit entries constrained by the same filters and time window And the table includes columns: timestamp, brand, cost center, user, preset version, status, processing time, assetId, error code (if any) And the drill-down supports server-side pagination to at least 50,000 rows with sortable columns; p95 response time <= 2 seconds for page sizes up to 100 rows And a deep link "View in Audit Trail" preserves the drill-down filters and time window; a back action returns to the original dashboard and scroll position
Multi-Tenant Isolation and Role-Based Access
Given a user authenticated for tenant A requests analytics for tenant B Then the request is denied with HTTP 403 and no cross-tenant identifiers or counts are leaked in error bodies or timing side channels beyond a constant-time response window (±100 ms) And all analytics and drill-down queries require and verify tenantId from the access token and enforce brand/cost-center scopes from the user role (e.g., AnalyticsViewer) so that only authorized brands/cost centers are visible And a support break-glass role requires a justification string and produces an immutable audit event; access is time-bound (default 1 hour) and is visible in the tenant's audit trail And automated integration tests with two synthetic tenants confirm zero cross-join leakage and zero data visibility across tenants
Advanced Breakdown and Filtering
Given filters for brand(s), cost center(s), user(s), preset version(s), and time window (Last 15m, 1h, 24h, 7d, 30d, Custom) When any combination of these filters is applied Then all widgets and tables reflect the filters consistently within 2 seconds at p95 for up to 1,000,000 events in range And breakdown mode renders grouped/stacked charts for the selected dimension and totals equal the sum of buckets within the greater of 0.1% or ±5 events And empty-result states render a clear "No data for selected filters" message without errors; clearing filters restores prior state And the current filter state is encoded in a shareable URL and is restored when the URL is revisited
Anomaly Detection and Counts
Given anomaly rules are enabled (off-brand attempts, failure-rate spikes, processing-time outliers, abnormal preset adoption changes) When an anomaly is detected in the last 1-hour window Then the anomaly count widgets update within 2 minutes at p95 and 5 minutes at p99 of detection And each anomaly entry includes rule id, severity, brand, user, preset version, count, first_seen, last_seen, and a link to filtered drill-down And on a labeled validation dataset, the false-positive rate is <= 5% and true-positive rate is >= 80% And disabling a rule removes it from counts and suppresses new detections within 60 seconds
Exportable Analytics Widgets and Reports
Given a user selects Export on any analytics widget When export is requested Then the system provides PNG (visual), CSV (aggregated rows), and JSON (widget config and query) that respect current filters, breakdowns, and user scopes And downloads begin within 5 seconds at p95; CSV exports up to 1,000,000 rows complete within 2 minutes or stream progressively without errors; column headers are consistent and UTF-8 encoded And an embeddable signed URL (iframe) is generated with tenant/brand scoping and a configurable TTL (default 7 days); access after TTL expiration is denied And exported artifacts include metadata (generated_at UTC, tenantId, brand(s), time range, filters) and a SHA-256 checksum for integrity verification
KPI Correlation: Preset Adoption vs Conversion Lift
Given a brand has connected conversion/order data with at least 30 days and 1,000 sessions of coverage When computing correlation between preset adoption rate and conversion lift over a selectable time granularity (day/week) Then the dashboard displays Pearson r, 95% confidence interval, and p-value; results marked "significant" when p < 0.05 And where data coverage is insufficient, the widget shows an "Insufficient data" state with required minimums and does not show a misleading value And the correlation methodology and data sources are linked; last computed timestamp is shown; computations refresh daily at 02:00 UTC and backfill within 24 hours
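The Pearson r at the core of this widget can be computed as below. This is a pure-Python sketch of the coefficient only; in practice the 95% confidence interval and p-value required by the criterion would come from a statistics library.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric series
    (e.g., preset adoption rate vs. conversion lift per day/week)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```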
Anomaly Detection & Off‑Brand Alerts
"As a brand security admin, I want real-time alerts on off-brand or anomalous activity so that I can intervene before noncompliant images reach customers."
Description

Implement policy‑driven anomaly detection that flags off‑brand or risky activity, such as use of unapproved presets, manual overrides beyond thresholds, unusual processing spikes, access from atypical locations/IPs, or after‑hours usage. Allow per‑brand rules and baselines with severity levels, suppression windows, and alert routing to email, Slack, and webhooks. Each alert must link to supporting audit entries and include enough context for triage. Provide acknowledge/snooze/resolve workflows and capture analyst feedback to refine detection over time. Ensure low latency from event to notification to block noncompliant outputs before publication where configured.

Acceptance Criteria
Off‑Brand Preset Usage Blocks Publication
Given Brand A has an approved preset list and blocking enabled for off-brand usage When a user processes an image with a preset not on Brand A’s approved list Then the output is blocked from publication, a High severity alert is generated, and alerts are routed to Brand A’s configured email, Slack, and webhook channels And the first alert delivery occurs within 10 seconds at p95 (30 seconds p99) from the event time And the alert and block are recorded with a correlation ID and linked audit trail entries
Manual Override Threshold Breach Alert
Given Brand B has policy thresholds configured for manual adjustments (e.g., exposure, saturation, crop) When a user’s manual override exceeds any configured threshold for Brand B during processing Then a Medium or High severity alert (per rule config) is generated and routed to Brand B’s channels And the processed output is blocked if the rule’s action is set to Block, otherwise allowed with alert-only And the alert includes the parameter, attempted value, allowed threshold, user ID, asset ID, timestamp, and rule ID
Processing Spike Anomaly per Brand Baseline
Given Brand C has a baseline defined as the 7-day moving median per weekday-hour for jobs/hour with a spike threshold of 3x When processing volume for Brand C in a given hour exceeds 3x the baseline Then a Medium severity anomaly alert is generated and delivered to Brand C’s routing And duplicate alerts for the same spike are suppressed for the configured suppression window (e.g., 60 minutes) And the alert includes baseline value, observed value, multiplier, timeframe, and initiating users (if identifiable)
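The weekday-hour median baseline and 3× spike test can be sketched as below. The input shapes (timestamped hourly counts) are assumptions, and the criterion's 7-day moving window is reduced here to "whatever samples are supplied" for brevity.

```python
import statistics
from collections import defaultdict

def build_baseline(hourly_counts):
    """Median jobs/hour per (weekday, hour) bucket.

    hourly_counts: iterable of (datetime, count) samples (assumed shape).
    """
    buckets = defaultdict(list)
    for ts, count in hourly_counts:
        buckets[(ts.weekday(), ts.hour)].append(count)
    return {k: statistics.median(v) for k, v in buckets.items()}

def is_spike(baseline, ts, observed, multiplier=3.0):
    """Flag the hour when observed volume exceeds multiplier x baseline."""
    base = baseline.get((ts.weekday(), ts.hour))
    return base is not None and observed > multiplier * base
```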
Atypical Location/IP and After‑Hours Access Alerts
Given Brand D has allowed geographies/IP ranges and business hours configured When a login or processing action occurs from a new country or ASN not on the allowlist, or outside business hours by more than 30 minutes Then a High (geo/IP) or Medium (after-hours) severity alert is generated and routed to email, Slack, and webhook And the first alert delivery occurs within 10 seconds at p95 (30 seconds p99) from the event time And the alert includes user ID, IP, geo lookup, ASN, device fingerprint (if available), local time, rule ID, and recommended actions
Alert Payload Completeness and Audit Linkage
Given an alert is generated for any rule When the alert is viewed in the console or received via any channel Then it contains: brandId, userId, assetId(s), ruleId, rule name, severity, event timestamp (UTC), processing node, presetId/version (if applicable), action taken (Block/Allow), correlationId, and links to supporting audit entries And following any audit link displays the corresponding immutable audit records with matching correlationId and hashes
Acknowledge/Snooze/Resolve Workflow with Suppression
Given an active alert is visible in the console When an analyst acknowledges the alert Then the alert status changes to Acknowledged with user, timestamp, and note captured When the analyst snoozes the alert for 60 minutes Then no duplicate alerts for the same rule-entity (brand+rule+asset or brand+rule+user, as applicable) are emitted during the snooze window unless severity escalates When the analyst resolves the alert with a disposition (True Positive, False Positive, Benign) and optional tags/notes Then the resolution, disposition, tags, and notes are persisted and visible in audit and via API
Per‑Brand Rule Configuration and Routing Isolation
Given Brands E and F have separate rule sets, severities, suppression windows, and channel routings configured When an off-brand event occurs in Brand E Then only Brand E’s rules evaluate and only Brand E’s channels receive the alert per its severity and suppression settings And no alerts are emitted to Brand F’s channels, and Brand F’s rules/baselines remain unaffected And updating Brand E’s rule (e.g., severity change) does not modify Brand F’s rules or baselines
Compliance Report Export (CSV/PDF/API)
"As an auditor, I want to export signed compliance reports with detailed activity and summaries so that I can satisfy audit requests efficiently."
Description

Offer one‑click and scheduled export of compliance reports over a selected scope and period, delivering digitally signed PDFs and CSV/JSON datasets via download, email, or S3. Reports include executive summaries, KPI snapshots, detailed activity line items, preset versions used, approval records, anomalies with dispositions, and cryptographic proofs (hash‑chain anchors and signatures) for evidentiary integrity. Provide time zone normalization, localization, and versioned report templates mapped to common frameworks (e.g., SOC 2, ISO 27001 evidence categories). Expose an API for programmatic retrieval and integration with GRC systems.

Acceptance Criteria
One-Click On-Demand Compliance Report Export
Given a user with "Reports.Export" permission selects a scope (brands/users/presets) and a date range and clicks "Export", When processing completes, Then a digitally signed PDF and CSV and JSON dataset are produced and available via in-app download within 2 minutes for workloads ≤ 10,000 activity records. And Then, if an email recipient is specified, the same artifacts are delivered by email with expiring signed download links valid for 7 days. And Then, if an S3 destination is configured, the artifacts are uploaded to the configured bucket/path with server-side encryption enabled (SSE-S3 or SSE-KMS) and private ACL. And Then each artifact name follows compliance_<scope>_<periodStart>-<periodEnd>_<templateVersion>_<timestampZ>.<ext> and includes a stable reportId in metadata. And Then the PDF includes sections: Executive Summary, KPI Snapshot, Detailed Activity Line Items, Preset Versions Used, Approval Records, Anomalies with Dispositions, and Cryptographic Proofs; CSV/JSON contain machine-readable equivalents with a data dictionary and templateVersion.
Scheduled Recurring Report Delivery
Given a schedule (daily/weekly/monthly) is created with a specific time zone and delivery channels (email and/or S3), When the schedule triggers, Then the report for the prior period aligned to that time zone is generated and delivered within 15 minutes of the scheduled time. And Then runs are idempotent: reruns for the same schedule+period overwrite S3 objects atomically and suppress duplicate emails (max one per period per recipient). And Then failures notify owners and retry up to 3 times with exponential backoff; after final failure, the run is marked failed with an error code and correlationId. And Then a backfill option allows generating historical reports from a chosen start date with no missing periods. And Then schedule creation, updates, and executions are audit logged with who/when and parameters.
Cryptographic Integrity and Tamper-Proofing
Given a report is generated, Then the PDF is digitally signed (PAdES-B) with an organization X.509 certificate and validates as trusted in Adobe Acrobat and via command-line verification. And Then CSV and JSON artifacts are accompanied by a manifest.json containing SHA-256 hashes of each file, a Merkle root, chainHeight, previousRoot, anchorTimestamp, and a detached signature. And Then recomputing file hashes matches manifest values, and the manifest signature validates against the published public certificate. And Then altering any artifact or manifest causes verification to fail, producing a distinct integrity error code.
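Manifest verification can be sketched as recomputing each artifact's SHA-256 and comparing it to the manifest entry; any altered or missing file fails. The Merkle-root and detached-signature checks from the criterion are omitted here, and the manifest shape is an assumption.

```python
import hashlib

def file_sha256(data: bytes) -> str:
    """Hex SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()

def verify_manifest(manifest, files):
    """Recompute per-file hashes against the manifest.

    manifest: {"files": {name: sha256_hex, ...}} (assumed shape)
    files: {name: bytes}
    """
    for name, expected in manifest["files"].items():
        if name not in files or file_sha256(files[name]) != expected:
            return False
    return True
```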
API Retrieval for GRC Integration
Given a service authenticates via OAuth 2.0 client credentials with scope reports:read, When it POSTs to /api/v1/reports with parameters (scope, periodStart, periodEnd, templateVersion, formats, delivery), Then it receives HTTP 202 with jobId and correlationId. And Then GET /api/v1/reports/{jobId} returns status (queued|running|failed|complete), progress percent, and on completion, signed URLs for PDF/CSV/JSON, metadata (reportId, templateVersion, itemCount, timeZone), and expiresAt. And Then GET /api/v1/reports supports listing with filters (scope, createdAt range, templateVersion) and pagination (limit, cursor) sorted by createdAt desc. And Then API enforces rate limits (429 with Retry-After) and returns structured errors with machine-readable codes and correlationId; all timestamps are ISO 8601 with offset.
Framework-Mapped Template Versions
Given a user selects a report template version mapped to SOC 2 and ISO 27001 evidence categories, When the report is generated, Then it includes a control mapping section referencing the chosen frameworks and a templateVersion identifier. And Then changing the default template does not alter previously generated reports; prior reports render with their original template version. And Then deprecated templates are flagged at selection with a warning and suggested replacement, and selection is still allowed for backward compatibility. And Then all template selections and changes are audit logged with actor, timestamp, and reason.
Localization and Time Zone Normalization
Given a user sets locale and time zone (e.g., fr-FR and Europe/Paris), When a report is generated, Then the PDF renders headings and date/number formats per locale and normalizes all timestamps to the selected time zone with explicit offset (e.g., 2025-09-21 14:30:00 GMT+02:00). And Then CSV/JSON timestamps are ISO 8601 with offset and include top-level fields timeZone and locale; numeric fields use dot decimal in machine-readable datasets regardless of locale. And Then KPI rollups and period boundaries are computed using the selected time zone and match PDF narratives and dataset aggregates.
Cost Center and Usage Analytics Breakdown
Given cost centers are configured and entities (brands/users/preset versions) are mapped, When a report is generated, Then it includes breakdowns by brand, user, preset version, and cost center with totals for photos processed, processing minutes, storage egress, anomalies, approvals, and attributable cost. And Then overall totals reconcile with the sum of group totals within 0.1% tolerance and differences are explained (e.g., rounding) in the report notes. And Then the report includes an appendix listing entity-to-cost-center assignments; unmapped entities are flagged and grouped under "Unmapped" with counts and costs. And Then all grouping keys include stable IDs in CSV/JSON and human-readable names in PDF.
Cost Center Mapping & Chargeback
"As a finance ops manager, I want to map usage to cost centers and generate chargeback reports so that I can allocate costs accurately across teams."
Description

Enable administrators to define cost center codes and map them to brands, teams, users, or API keys, with rule‑based overrides by preset type or project tag. Attribute usage from the audit stream to cost centers and generate periodic chargeback reports with unit counts, rates, taxes, and totals. Provide simulations to preview the financial impact of mapping changes and support retroactive reclassification with a controlled workflow. Integrate exports to billing/ERP systems and enforce permissions so only finance roles can modify mappings and rates.

Acceptance Criteria
Cost Center Mapping by Entity with Rule-Based Overrides
Given a FinanceAdmin defines cost center codes with names, status, currency, and effective-dated unit and tax rates And creates mappings for brand, team, user, and API key entities And defines override rules that match on preset type and/or project tags (exact and glob patterns) And sets a global default cost center for unmatched usage When a usage event arrives with attributes {brand, team, user, apiKey, presetType, projectTags} Then the system resolves the event to exactly one cost center using this precedence: override rules (higher priority > higher specificity > most recent) > user > apiKey > team > brand > global default And the resolved mapping includes costCenterCode, mappingVersion, matchedRuleId (if any), and evaluation timestamp And conflicting or overlapping mappings are prevented at save time with validation errors And p95 rule resolution latency is <= 100 ms per event at a throughput of 5,000 events/minute
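The resolution precedence in the criterion above can be sketched as a single pass over the levels. The rule and mapping shapes are assumptions, and override rules are assumed pre-sorted by priority, specificity, and recency before this function runs.

```python
def resolve_cost_center(event, override_rules, entity_maps, default_cc):
    """Resolve exactly one cost center per usage event.

    Precedence: override rules > user > apiKey > team > brand > global default.
    """
    for rule in override_rules:  # pre-sorted: priority, specificity, recency
        if rule["matches"](event):
            return rule["cost_center"], rule["id"]
    for level in ("user", "apiKey", "team", "brand"):
        cc = entity_maps.get(level, {}).get(event.get(level))
        if cc is not None:
            return cc, None
    return default_cc, None
```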
Mapping Change Simulation & Financial Impact Preview
Given a FinanceAdmin edits mappings, rates, or taxes in draft mode When they run a simulation for a selected date range and scope (brand/team/user/preset type/tag) Then the system computes projected deltas per cost center: unit counts, subtotals, taxes, and totals, comparing draft vs current production And presents a side-by-side comparison with net impact and top-affected entities And the simulation excludes events already reclassified in an approved retroactive workflow unless explicitly included And for up to 1,000,000 events the simulation completes within 10 minutes, with progress updates and cancel support And no production data or reports are modified until the draft is approved and published
Audit Stream Usage Attribution
Given audit events contain tenant, brand, team, user, apiKey, presetType, projectTags, unitType, unitQuantity, and timestamp When events are ingested Then attribution to a cost center is applied idempotently based on the active mapping at event timestamp (effective-dated) And the attribution record stores {eventId, costCenterCode, unitType, unitQuantity, rate, taxRate, mappingVersion, matchedRuleId} And p95 end-to-end attribution delay from ingestion to availability for reporting is <= 60 seconds And malformed events are quarantined with alerting, retriable after fix, and excluded from reports until attributed And tenant boundaries are enforced so mappings and attributions are isolated per tenant
Scheduled Chargeback Report Generation & Contents
Given FinanceAdmin schedules reports (weekly or monthly) with time zone and currency When a report is generated for a period Then it includes per cost center and per unit type: unit counts, rate, subtotal, tax rate, tax amount, and total, plus period start/end, tenant, mappingVersionHash, and number of unattributed events And provides breakdowns by brand and preset type, and an optional detail file with line items limited by configured retention And totals reconcile: sum(subtotals) + sum(taxes) = grand total, with rounding to 2 decimals using bankers rounding And rates and taxes used reflect the effective configuration at event time (not report time) And the report is available as CSV and JSON, downloadable in-app and deliverable via configured exports
Finance-Only Permissions and Change Auditability
Given role-based access control is configured When a non-Finance user attempts to create/update/delete cost centers, mappings, rates, or taxes Then the action is blocked with 403 and no changes persist And when a FinanceAdmin performs such changes, MFA (if enabled) is required and changes are versioned with before/after, user, timestamp, and reason And every change produces a tamper-evident audit log entry linked to related reports and exports And read-only viewers can access reports but cannot view rates if rate visibility is disabled by policy
ERP/Billing Export Delivery and Reliability
Given FinanceAdmin configures export destinations: SFTP (SSH key), HTTPS webhook (OAuth2), or email When a report is finalized Then the system exports summary and detail files with deterministic filenames {tenant}_{period}_{version}.{csv|json} And includes an idempotency key and SHA-256 checksum; deliveries use at-least-once retries with exponential backoff for up to 24 hours And success is recorded only upon 2xx webhook response, verified SFTP write, or accepted email status; failures trigger alerts with retry status And exported payloads conform to the published schema with required columns and field types
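The deterministic filename, SHA-256 checksum, and idempotency key described above could be produced as follows. The function and field names are illustrative assumptions; only the `{tenant}_{period}_{version}.{csv|json}` naming convention comes from the requirement itself.

```python
import hashlib

# Sketch: build export metadata with a deterministic filename, a content
# checksum, and an idempotency key derived from the report identity so
# that redelivery of the same report is detectable by the receiver.
def export_artifacts(tenant, period, version, payload, fmt="csv"):
    filename = f"{tenant}_{period}_{version}.{fmt}"
    body = payload.encode("utf-8")
    checksum = hashlib.sha256(body).hexdigest()
    idempotency_key = hashlib.sha256(f"{tenant}|{period}|{version}".encode("utf-8")).hexdigest()
    return {"filename": filename, "sha256": checksum, "idempotency_key": idempotency_key}
```

Because the key depends only on tenant, period, and version, at-least-once retries resend the identical key, letting the destination deduplicate deliveries.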
Retroactive Reclassification Approval Workflow
Given a FinanceAdmin proposes a reclassification for a date range and filters (brand/team/user/apiKey/preset type/tag) from cost center A to B with a justification When the request is submitted Then the system runs a pre-apply simulation showing deltas and impacted reports And requires a second approver with Finance role to approve before changes apply And upon approval, affected attributions and reports are recalculated, prior report versions are retained and marked superseded, and corrected exports are sent with a correction flag And the workflow supports rollback to the prior state, producing a new version and corresponding exports And all steps are audit-logged and immutable
Preset Version Lineage & Change Diff
"As a brand owner, I want visibility into preset version history and differences so that I can control changes and trace outcomes for accountability."
Description

Track full lineage of style presets, including who changed what and when, approval status, and version notes. Provide visual and textual diffs of preset parameters (e.g., background style, crop rules, retouch intensities) and link each processed image in the audit ledger to the exact preset version applied. Support allow/deny lists of preset versions per brand, rollback to prior versions, and optional reprocessing of affected assets. Expose this lineage in analytics and exports to strengthen brand governance and explain outcome differences over time.

Acceptance Criteria
Immutable Preset Version Lineage Visible per Brand
Given a brand preset exists and a user saves changes, When the changes are committed, Then a new preset versionId is created with parentVersionId, editorUserId, ISO-8601 UTC timestamp, changeSummary, approvalStatus, and versionNotes recorded in the audit ledger Given any existing preset version, When a user attempts to edit it directly, Then the system prevents modification and returns an error indicating versions are immutable and a new version must be created Given a preset with 200+ versions, When viewing the lineage timeline, Then versions render in chronological order with parent-child links, and the view loads in under 2 seconds for up to 500 versions Given the audit ledger, When a ledger entry is programmatically altered outside the application, Then tamper detection marks the entry as invalid and the UI surfaces a tamper-evident warning
Visual and Textual Diff Between Any Two Preset Versions
Given two versions of the same preset are selected, When requesting a diff, Then changed parameters (e.g., background.style, crop.rules, retouch.intensity) are listed with exact old and new values Given two versions are selected, When displaying the diff, Then a side-by-side visual preview renders using the product sample image, and a textual JSON-style diff shows parameter paths with old→new values Given parameters that have not changed, When rendering the diff, Then they are excluded from the changed list and can be optionally toggled on via a "show unchanged" control Given numeric parameters changed within tolerance (e.g., float rounding), When computing the diff, Then the result suppresses false positives via stable rounding and order-insensitive comparisons Given the diff view, When the user clicks Export, Then a downloadable artifact (PDF for visuals and JSON for text) is generated containing versionIds, timestamps, editors, and the full change list
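The textual diff with rounding-tolerant comparison described above might look like this sketch: flatten nested preset parameters to dotted paths (e.g., `background.style`), then compare with a numeric tolerance so float rounding does not produce false positives. The tolerance value and structure are assumptions for illustration.

```python
import math

# Flatten nested preset params into {"background.style": ..., ...}.
def flatten(params, prefix=""):
    out = {}
    for k, v in params.items():
        path = f"{prefix}.{k}" if prefix else k
        if isinstance(v, dict):
            out.update(flatten(v, path))
        else:
            out[path] = v
    return out

# Return {path: (old, new)} for changed parameters only; numeric values
# within the tolerance are treated as unchanged to suppress float noise.
def diff_presets(old, new, tol=1e-6):
    a, b = flatten(old), flatten(new)
    changed = {}
    for path in sorted(a.keys() | b.keys()):
        ov, nv = a.get(path), b.get(path)
        if isinstance(ov, float) and isinstance(nv, float) and math.isclose(ov, nv, abs_tol=tol):
            continue
        if ov != nv:
            changed[path] = (ov, nv)
    return changed
```

Iterating the union of both key sets also surfaces parameters that were added or removed between versions, not just those whose values changed.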
Audit Ledger Links Each Processed Image to Exact Preset Version
Given a batch is processed with preset version V, When inspecting any resulting image’s audit entry, Then the entry includes presetId and presetVersionId=V with a link to the version details Given a preset is updated to a new version during an ongoing batch, When later inspecting images, Then images completed before the update reference the previous version and images completed after reference the new version Given an export of processed assets is requested, When the CSV is downloaded, Then each row includes presetId, presetVersionId, jobId, brandId, and processedAt timestamp Given an image was reprocessed, When viewing its history, Then all prior presetVersionIds are shown in chronological order with who initiated each processing run
Enforce Brand Allow/Deny Lists for Preset Versions
Given a brand has an allow list of preset versionIds, When a user attempts to process with a version not on the allow list, Then the job is blocked and the API responds 403 BRAND_VERSION_NOT_ALLOWED with the offending versionId Given a brand has a deny list entry for a preset versionId, When a user attempts to process with that version, Then the job is blocked and an anomaly event is logged and alert is sent to brand admins Given an admin updates allow/deny lists, When the change is saved, Then the change is recorded in the audit ledger with editorUserId, timestamp, and rationale and takes effect within 10 seconds Given the UI lists available versions for a brand, When rendering the selector, Then only allowed and non-denied versions are selectable, and denied versions appear disabled with a tooltip reason
One-Click Rollback to Prior Preset Version with Optional Reprocessing
Given a preset has multiple versions, When an admin selects a prior version Vprev and confirms rollback, Then Vprev becomes the active version for the brand and the action is recorded with actor, timestamp, and notes Given rollback is performed with "Reprocess affected assets" checked, When the job runs, Then assets processed since a specified date or since version Vbad are reprocessed using Vprev and linked to the new processing run Given the reprocessing job completes, When viewing the job summary, Then success/failure counts, duration, and sample error messages are shown and each affected image’s audit entry is updated with the new presetVersionId Given a rollback is executed, When inspecting the lineage, Then no historical versions are deleted and the lineage shows a rollback event node linking from the current to Vprev
Lineage and Version Usage Exposed in Analytics and Exports
Given the analytics dashboard, When filtering by presetId and presetVersionId, Then usage metrics (images processed, success rate, average processing time, conversion lift if available) update accordingly Given a multi-brand account with cost centers, When exporting usage analytics, Then the CSV includes brandId, costCenter, presetId, presetVersionId, userId, jobId, processedAt, and counts per day Given a large export request of up to 100k rows, When the export is initiated, Then the file is delivered within 60 seconds or the system returns an asynchronous download link within 10 seconds and emails when ready Given a time range is selected, When viewing the trend chart, Then version changes are annotated on the timeline with tooltips linking to diffs and approver notes
Approval Workflow and Gating for Preset Versions
Given a new preset version is created, When saved, Then its approvalStatus defaults to Draft and cannot be used for production processing by non-admins Given a user with approver role, When they set approvalStatus to Approved and add notes, Then the change is recorded with approverUserId, timestamp, and notes and the version becomes eligible for production Given a non-approver attempts to approve or use a Draft/Pending version for production, When they submit, Then the system blocks the action with 403 VERSION_NOT_APPROVED and logs an anomaly event Given a version is Rejected or Deprecated, When a user attempts to select it, Then the UI disables selection and the API denies processing with a clear error referencing the approvalStatus

Smart Sampler

Automatically selects five representative images from your recent uploads—covering lighting, backgrounds, product types, and edge cases—so your brand preset is trained on the real variety you ship. Skip guesswork and get a sturdier, more reliable style in minutes.

Requirements

Diversity Sampler Engine
"As a boutique owner, I want PixelLift to automatically pick a diverse set of five photos from my latest uploads so that my brand preset learns from the real variety in my catalog without me handpicking examples."
Description

Implements the core selection algorithm that automatically chooses five representative images from a user’s recent uploads. Uses computer vision embeddings and clustering to capture variation across lighting conditions, background types, product categories, materials, and compositions. Applies quality heuristics (sharpness, exposure, noise) and tie-breakers to avoid near-duplicates and ensure coverage of distinct visual modes. Integrates with PixelLift’s asset store and indexing pipeline, and outputs a ranked set with reason codes (e.g., “low-key lighting,” “busy background,” “reflective surface”) for transparency. Provides confidence scores and fallbacks when insufficient diversity is detected, and exposes a service API consumed by preset training.

Acceptance Criteria
Five-Image Selection from Configured Recent Uploads Window
Given a user has at least 5 eligible recent uploads within the configured recent-uploads window and assets are indexed and visible When the Diversity Sampler Engine is invoked for that user Then it returns exactly 5 unique image_ids ordered by representativeness_score descending And all returned images originate from the configured recent-uploads window And the selection is deterministic for identical input set and configuration (same 5 ids and order)
Diversity Coverage Across Visual Modes
Given an eligible pool of >= 5 images with computed embeddings and attribute tags When the engine samples the representative set Then the 5 selections belong to at least 4 distinct embedding clusters at k>=5 using cosine distance And pairwise cosine similarity between any two selections is < 0.90 And attribute coverage across the 5 selections spans at least 3 dimensions among: lighting variants, background types, product categories, material types, composition types And cluster_ids, diversity_score [0,1], and covered_dimensions are included in the response
Near-Duplicate and Redundancy Avoidance
Given near-duplicate candidates exist in the eligible pool When building the selection set Then items with perceptual-hash Hamming distance <= 8 or embedding cosine similarity >= 0.97 to a higher-ranked candidate are excluded from the 5 And no two returned items share the same source_capture_id and capture_timestamp within 2 seconds And if de-duplication reduces eligible choices below 5, fallback behavior is triggered
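The perceptual-hash criterion above (exclude a candidate within Hamming distance 8 of a higher-ranked one) reduces to a greedy pass over the ranked list. A minimal sketch, assuming 64-bit integer hashes:

```python
# Hamming distance between two perceptual hashes (as integers).
def hamming(h1, h2):
    return bin(h1 ^ h2).count("1")

# Greedy de-dup: walk candidates best-first and keep one only if it is
# farther than max_dist from every already-kept candidate.
def dedupe(ranked, max_dist=8):
    kept = []
    for cand in ranked:  # ranked best-first by representativeness
        if all(hamming(cand["phash"], k["phash"]) > max_dist for k in kept):
            kept.append(cand)
    return kept
```

Walking best-first guarantees that when two near-duplicates collide, the higher-ranked one survives, matching the "excluded ... to a higher-ranked candidate" wording.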
Quality Heuristics Filtering
Given quality metrics are computed for each candidate image When scoring candidates Then every selected image meets all thresholds: sharpness (Laplacian variance) >= 120, exposure EV in [-1.5, 1.5], SNR >= 20 dB, short_side >= 800 px, no major motion blur detected And quality_scores per metric are included per selected image And if fewer than 5 candidates meet thresholds, fallback behavior is triggered
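The sharpness metric named above, variance of the Laplacian, is a standard focus measure; a dependency-free sketch using a 3x3 Laplacian kernel (the threshold of 120 comes from the criterion, everything else is illustrative):

```python
import numpy as np

# 3x3 Laplacian kernel; its response is large at edges, near zero in
# flat regions, so its variance grows with image sharpness.
LAPLACIAN = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)

def laplacian_variance(gray):
    g = gray.astype(float)
    # Valid-mode 3x3 convolution via shifted slices (kernel is symmetric,
    # so convolution and correlation coincide).
    resp = sum(LAPLACIAN[i, j] * g[i:g.shape[0] - 2 + i, j:g.shape[1] - 2 + j]
               for i in range(3) for j in range(3))
    return float(resp.var())

def is_sharp(gray, threshold=120.0):
    return laplacian_variance(gray) >= threshold
```

A uniform image yields variance 0 and fails the check; a high-contrast texture yields a large variance and passes.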
Transparency via Reason Codes and Ranking
Given a ranked selection is produced When the response is returned Then each selected item includes 1–3 reason_codes from the controlled vocabulary and a primary_reason And each item includes: image_id, rank (1–5), cluster_id, representativeness_score [0,1], quality_scores, attribute_flags And ties in representativeness_score are broken by higher quality_score, then more recent upload_time And ranks are contiguous and strictly increasing from 1 to 5
Confidence Scores and Low-Diversity Fallback
Given diversity and quality metrics are computed When diversity_score < 0.60 or unique_clusters < 4 or eligible_count < 5 Then the engine returns the best-available set (minimum 3 items) with fallback=true, fallback_reason, guidance_message, and missing_dimensions And sampler_confidence [0,1] is included and reflects diversity and quality (monotonic with diversity_score) And can_train=false when returned_count < 3, else true
Service API and Indexing Integration
Given a client presents valid credentials and organization_id When POST /sampler/jobs is called with org_id and optional window/config Then the service responds 202 with a job_id and enqueues the task And the engine queries the asset store for assets where index_status=ready and visibility=seller within the specified window And GET /sampler/jobs/{job_id} returns 200 with status in {queued, running, succeeded, failed}, duration_ms, and the selection payload on success And the operation is idempotent for identical inputs within 24h (same job_id and result) And P95 end-to-end time for pools up to 2,000 assets is <= 45,000 ms; P99 <= 90,000 ms
Edge Case Inclusion
"As a seller of varied products, I want tricky product scenarios automatically represented in the sample so that my preset performs reliably on hard cases I actually ship."
Description

Detects and prioritizes inclusion of edge-case photos (e.g., transparent or reflective products, black-on-black, white-on-white, intricate patterns, extreme aspect ratios, low-resolution, motion blur) when present in the candidate set. Maintains a taxonomy of edge-case types and thresholds to ensure at least one edge case is represented without over-weighting anomalies. Includes safeguards to exclude irrecoverable defects (e.g., corrupted files) from selection. Produces labeled tags that can be surfaced in the UI rationale and stored with the training snapshot.

Acceptance Criteria
Edge Case Inclusion When Present
Given a candidate set of 5–1000 images and the edge-case taxonomy thresholds are loaded, When the set contains ≥1 image classified as any edge-case type with classifier confidence ≥ its configured threshold, Then the 5 selected images include ≥1 edge-case image; And when no image meets any edge-case threshold, Then 0 edge-case images are selected; classifications below threshold must not trigger inclusion.
Edge Case Non-Overweighting Rule
Given the proportion P of edge-case images (meeting thresholds) in the candidate set, When selecting 5 images, Then the number of edge-case selections E satisfies:
- If P < 20%, E ≤ 1
- If 20% ≤ P < 50%, E ≤ 2
- If P ≥ 50%, E ≤ 3
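The cap rule above is a direct three-way threshold; a transcription for clarity (the function name is illustrative):

```python
# Cap on the number of edge-case selections E, given the proportion P
# of edge-case images in the candidate set (P as a fraction in [0, 1]).
def edge_case_cap(p):
    if p < 0.20:
        return 1
    if p < 0.50:
        return 2
    return 3
```

Note the boundaries are inclusive on the upper band: P exactly 20% allows two edge-case picks, and P exactly 50% allows three.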
Edge Case Diversity Preference
Given the candidate set contains ≥2 distinct edge-case types meeting thresholds, When selecting edge-case images for inclusion, Then selected edge-case images must prefer distinct types (no duplicate type) until each present type has at least one representation or the edge-case cap is reached; And ties are broken by classifier confidence in descending order.
Irrecoverable Defects Safeguard
Given the candidate set contains corrupted or unreadable files (e.g., zero-byte, unsupported format, decode error), When running selection, Then such files are excluded from consideration and not counted toward edge-case proportion P or selections E; And a reason code per file is logged and included in the selection rationale output; And the sampler returns up to 5 valid selections; if <5 valid candidates exist, it returns all valid ones with a shortfall reason.
Edge Case Tagging and Snapshot Persistence
Given selection completes, When returning the 5 selected images, Then each selected image includes tags[] with zero or more edge-case types, classifier confidence per tag, and taxonomyVersion; And these tags and the taxonomyVersion are persisted with the training snapshot and available to the UI rationale endpoint within 1 second of selection completion.
Taxonomy Versioning and Thresholds
Given the system is initialized, When retrieving the edge-case taxonomy, Then it is versioned and includes at least: transparent, reflective, black-on-black, white-on-white, intricate pattern, extreme aspect ratio, low-resolution, motion blur; And each type has a configurable confidence threshold in [0.0, 1.0]; And the taxonomyVersion used is recorded with every selection output for auditability.
Recency & Eligibility Rules
"As a brand manager, I want Smart Sampler to pull from my most recent, valid uploads so that the sample reflects what I’m currently listing, not outdated or low-quality images."
Description

Defines the candidate pool for sampling based on configurable recency windows (e.g., last 14 days or last 500 uploads), workspace/brand scoping, and eligibility filters. Excludes failed imports, near-duplicates, and images below minimum quality thresholds. Supports manual refresh and rerun to capture newly uploaded images. All rules are configurable per workspace and auditable to ensure the sample reflects true, current catalog conditions.

Acceptance Criteria
Recency Window — Last N Days
Given a workspace with time zone set and recency rule type "Days" = 14 When Smart Sampler builds the candidate pool Then only images with uploaded_at in [now-14 days, now] in the workspace time zone are included And images older than now-14 days are excluded And images with uploaded_at exactly at now-14 days are included
Recency Window — Last N Uploads
Given recency rule type "Uploads" = 500 in Workspace A When Smart Sampler builds the candidate pool Then the 500 most recent successfully imported unique uploads in Workspace A are included And if fewer than 500 eligible uploads exist, all eligible uploads are included And ties on uploaded_at are resolved by descending upload_id
Workspace and Brand Scoping
Given Workspace A with brands B1 and B2 and the brand filter set to B1 When Smart Sampler builds the candidate pool Then only images where workspace_id = A and brand_id in {B1} are eligible And images from other workspaces or brands are excluded And if the brand filter is unset, all brands within Workspace A are eligible
Eligibility Exclusions — Import Failures, Near-Duplicates, Low Quality
Given workspace thresholds min_quality_score = 0.7, min_dimension_px = 1000, and duplicate_similarity_threshold = 0.95 When Smart Sampler evaluates eligibility Then images with import_status != "completed" or is_soft_deleted = true are excluded And images with similarity >= 0.95 to any other image within the recency window are near-duplicates; only the highest quality_score instance is retained and the rest excluded And images with quality_score < 0.7 or min(width, height) < 1000 are excluded And an exclusion_reason code is recorded for every excluded image
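The retention rule above ("only the highest quality_score instance is retained") can be implemented greedily by visiting candidates in descending quality order. The `similarity` function is an assumed embedding-similarity callback, not a confirmed API:

```python
# Among near-duplicates (pairwise similarity >= threshold), keep only the
# highest-quality instance; tag the rest with an exclusion reason.
def retain_best_of_duplicates(images, similarity, threshold=0.95):
    kept = []
    for img in sorted(images, key=lambda i: i["quality_score"], reverse=True):
        if all(similarity(img, k) < threshold for k in kept):
            kept.append(img)
        else:
            img["exclusion_reason"] = "near_duplicate"
    return kept
```

Because the best image of each duplicate group is always seen first, every later member of the group collides with it and is excluded, which also satisfies the criterion that every exclusion records a reason code.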
Manual Refresh & Rerun Captures New Uploads
Given a previous sampler run completed and new eligible images have been uploaded since that run When the user clicks "Refresh & Rerun" for the workspace Then the candidate pool is rebuilt using the currently saved rules and includes the new eligible images And triggering "Refresh & Rerun" twice without additional uploads yields identical candidate pools And the rebuild completes within 10 seconds for up to 10,000 eligible images
Audit Trail for Candidate Pool Generation
Given Smart Sampler generates a candidate pool When the run completes Then an audit record is stored with: run_id, timestamp, initiated_by (user_id/service), workspace_id, brand filter, rule_version, recency rule type/value, quality thresholds, duplicate threshold, include_count, exclude_count by reason, and lists of included and excluded image IDs with reason codes And a user with audit permissions can view and export the audit record in JSON and CSV And rerunning with the same data snapshot and rule_version reproduces the same candidate pool and audit counts
Per-Workspace Rule Configuration & Versioning
Given a workspace admin updates recency and eligibility rules When the admin clicks Save Then the configuration is validated and saved as a new version with version_id, editor_id, and timestamp And subsequent Smart Sampler runs use the latest saved version; in-flight runs use the version snapshot they started with And non-admin users cannot modify the rules; they can view the active version and its effective values
Sample Review & Override
"As a boutique owner, I want to quickly review and swap any of the five picks before training so that I stay confident the sampler reflects my brand reality."
Description

Provides a lightweight review UI showing the five selected images, their diversity rationale, and suggested alternates per slot. Enables one-click swap, approve, or regenerate actions before preset training. Persists an immutable selection snapshot (image IDs, model/version, parameters, reason codes) tied to the preset training job for traceability. Includes keyboard shortcuts and accessible controls to minimize friction and support quick confirmation.

Acceptance Criteria
Display Sampler Selection with Diversity Rationale & Alternates
Given a completed Smart Sampler run for recent uploads When the user opens the Sample Review UI Then exactly 5 slots are displayed with the selected representative images And each slot shows a human-readable diversity rationale mapped to reason codes And each slot lists 1–3 suggested alternates with their reason codes And thumbnails use skeletons while loading and show a fallback with retry on error And the selection shows the model name, version, and key parameters used
One-Click Approve, Swap, and Regenerate
Given the Sample Review UI is open When the user clicks Approve All Then all 5 slots are marked approved and the Continue/Train action becomes enabled Given a slot with alternates When the user clicks Swap on an alternate Then the alternate becomes primary within 300 ms and the previous primary moves to alternates Given a slot When the user clicks Regenerate Then up to 3 new alternates return within 5 seconds with updated reason codes and previously rejected images are excluded Given any in-progress changes When the user navigates away and returns Then the current selection state is preserved until approval and snapshot
Immutable Selection Snapshot & Traceability
Given the user confirms the selection When the system persists the snapshot Then it records: preset ID, training job ID, 5 primary image IDs, per-slot alternates at decision time, reason codes, model name, model version, sampler parameters, timestamps, and user ID And the snapshot is immutable; subsequent changes create a new snapshot with a new version and existing records are read-only And each training job references exactly one snapshot ID; re-running training on the same selection reuses the same snapshot And an audit API allows retrieval by preset ID and training job ID and returns a checksum to verify integrity
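The integrity checksum mentioned above is straightforward if the snapshot is serialized canonically before hashing; a sketch under that assumption:

```python
import hashlib
import json

# Hash a canonical JSON serialization of the snapshot (sorted keys,
# no whitespace) so the checksum is stable across key ordering and any
# mutation of the stored record is detectable on retrieval.
def snapshot_checksum(snapshot):
    canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

The audit API can return this digest alongside the snapshot; a client recomputes it over the payload and compares to verify integrity.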
Keyboard Shortcuts & Efficient Navigation
Given the Sample Review UI When keyboard shortcuts are used Then the user can approve all (A), approve current slot (Enter), swap to highlighted alternate (S), regenerate current slot (R), navigate slots (Left/Right), and cycle alternates (Up/Down) And a visible shortcut hint panel is toggled with ? and is accessible to screen readers And all actions are fully operable without a pointing device; a full review can be completed via keyboard alone And shortcuts avoid conflicts with native browser defaults via scoped handling or remapping
Accessibility Compliance (WCAG 2.1 AA)
Given assistive technology users navigate the Sample Review UI When interacting with controls and status changes Then all actionable elements have programmatic names, roles, and states with logical focus order And focus indicators meet 3:1 contrast; text and interactive elements meet 4.5:1 contrast; images and rationales have text alternatives And dynamic updates (swap, regenerate, approve) are announced via ARIA live regions without stealing focus And the UI is operable at 200% zoom and in high-contrast mode with no keyboard traps and 44x44px minimum hit targets
Performance & Responsiveness
Given a median network and typical session When the user opens the Sample Review UI Then above-the-fold content renders within 2 seconds and all thumbnails within 3 seconds And approve/swap actions update the UI within 300 ms; regenerate returns alternates within 5 seconds at p95 And snapshot persistence completes within 1 second at p95 and does not block the UI And UI animations and transitions sustain at least 55 FPS during loading and interactions
Error Handling & Recovery
Given an image fails to load When a retry is triggered Then the system retries up to 3 times with exponential backoff and provides a visible Retry control; if still failing, a meaningful error is shown and the slot remains actionable Given regenerate fails for a slot When the user retries Then previous alternates are retained, a non-blocking toast explains the failure, and no partial state corruption occurs Given snapshot persistence fails When the user confirms again Then a blocking error with safe retry is shown and no training job starts without a persisted snapshot And all errors are logged with a correlation ID and exposed in the audit trail with failure reason codes
Processing SLA & Scalability
"As a time-pressed seller, I want the sample to be ready in seconds even for large batches so that I can train and publish presets without waiting."
Description

Ensures the sampler processes large catalogs quickly and reliably. Targets selection completion within 30 seconds for up to 2,000 eligible images and scales via background jobs and batching to 10,000+ images. Implements queueing, backoff, and partial results fallback when resources are constrained. Exposes progress indicators and clear error states in the UI. Observability includes latency metrics, timeouts, and autoscaling signals to maintain the SLA under peak loads.

Acceptance Criteria
SLA: Complete selection within 30s for 2,000 eligible images
Given a tenant with 2,000 eligible images and no other sampler job running for that tenant When the user starts Smart Sampler Then five representative images are selected and persisted within 30 seconds end-to-end in at least 95% of runs And the 99th percentile completes within 45 seconds And no job-level timeout or unhandled error occurs And a completion event is emitted within 1 second of persistence
Scalability: Background jobs and batching for 10,000+ images
Given a tenant with 10,000 eligible images When Smart Sampler is started Then the sampler job enqueues within 1 second and begins processing within 5 seconds when a worker is available And images are processed in batches of no more than 500 per worker with parallel workers And a per-tenant concurrency cap is enforced (default 2) to prevent noisy-neighbor impact And the job completes without worker crashes or data corruption
Reliability: Queueing, retries, and backoff under resource constraints
Given transient failures (e.g., HTTP 5xx, timeouts) while processing a batch When a batch fails Then the system retries with exponential backoff (initial 2s, max 30s, full jitter) up to 5 attempts And duplicate concurrent sampler jobs for the same tenant are prevented And after final failure the batch is moved to a dead-letter queue with error code and trace metadata And a DLQ metric increments for alerting
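The retry policy above (exponential backoff, initial 2s, max 30s, full jitter, 5 attempts) corresponds to the following delay schedule; the function shape is an illustrative sketch, with the RNG injectable for testing:

```python
import random

# Delay before each retry attempt: full jitter draws uniformly from
# [0, ceiling], where the ceiling doubles from `base` and is capped.
def backoff_delays(attempts=5, base=2.0, cap=30.0, rng=random.random):
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng() * ceiling)
    return delays
```

Full jitter (rather than jittering around the full delay) spreads concurrent retries across the whole window, which is what prevents synchronized retry storms against a struggling dependency.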
Resilience: Partial results fallback with deadline
Given the job exceeds the 30-second SLA or encounters resource constraints When fewer than five selections are ready by the SLA deadline Then at least three high-confidence representatives are returned if available and the job is marked Partial And remaining selections continue in the background until completion or a 2-minute ceiling, whichever comes first And the UI shows 'Partial (X/5)' with Resume and Retry actions And a completion notification is emitted when the remaining selections finalize
UX: Real-time progress indicator and states
Given an active sampler job When the user views job status Then the UI displays state (Queued, Processing, Partial, Completed, Failed), percent complete, items processed/total, and ETA And progress updates within 1 second of backend state changes without requiring a manual refresh And progress is announced via ARIA live regions for screen readers And Cancel and Retry actions are available when applicable
UX: Clear error states with actionable recovery
Given a sampler job fails When the user views the error Then the UI presents a specific error category (Upload Corruption, Rate Limited, Resource Exhausted, Internal Error) And shows recommended next steps and a Retry button for retryable errors And displays a correlation ID matching backend logs And non-retryable failures do not requeue automatically
Observability & Autoscaling: Maintain SLA under peak load
Given a peak load of 100 concurrent sampler jobs each with 2,000 eligible images When the load persists for 5 minutes Then P95 end-to-end selection latency remains ≤ 30 seconds And metrics are emitted for queue wait, processing time, retries, timeouts, and batch sizes at 1-minute resolution And an alert fires if P95 latency > 30s for 5 consecutive minutes And autoscaling increases worker capacity within 60 seconds of sustained queue depth > 50 and scales back within 10 minutes after queue depth < 10
Selection Telemetry & Feedback Loop
"As a product manager, I want insights into how users accept or modify the sampler’s picks so that we can continuously improve selection quality and training outcomes."
Description

Captures user interactions (approvals, swaps, regenerations) and post-training outcomes (preset acceptance rate, downstream edit rate) to measure sampler effectiveness. Computes coverage metrics for lighting, backgrounds, and product categories in chosen samples versus catalog distribution. Feeds anonymized statistics into model improvement while respecting workspace boundaries and privacy settings. Surfaces basic quality analytics to the product team for iterative tuning.

Acceptance Criteria
User Interaction Telemetry Capture
Given telemetry is enabled for workspace W and a user is interacting with Smart Sampler selections When the user approves a sample, swaps a sample, requests a regeneration, or dismisses a sample Then an event is recorded with fields: event_id (UUIDv4), workspace_id, anonymized_user_id (workspace-scoped hash), session_id, sampler_session_id, selection_id, action_type ∈ {approve, swap, regenerate, dismiss}, action_metadata (optional), timestamp (ISO-8601 UTC), client_version, latency_ms And no raw image pixels, filenames, or free-text notes are included in the payload And server-side idempotency ensures duplicate event_ids do not create multiple records And the end-to-end capture rate for eligible actions over a 24h window is ≥ 99.5%
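A minimal event-construction sketch for the schema above; the field names come from the criteria, while the function itself and its validation behavior are illustrative:

```python
import uuid
from datetime import datetime, timezone

ACTION_TYPES = {"approve", "swap", "regenerate", "dismiss"}


def make_event(workspace_id, anonymized_user_id, session_id,
               sampler_session_id, selection_id, action_type,
               client_version, latency_ms, action_metadata=None):
    """Build one telemetry event matching the schema in the criteria.

    Rejects unknown action types; never carries pixels, filenames, or free text.
    """
    if action_type not in ACTION_TYPES:
        raise ValueError(f"unknown action_type: {action_type}")
    return {
        "event_id": str(uuid.uuid4()),  # UUIDv4; server dedupes on this key
        "workspace_id": workspace_id,
        "anonymized_user_id": anonymized_user_id,  # workspace-scoped hash
        "session_id": session_id,
        "sampler_session_id": sampler_session_id,
        "selection_id": selection_id,
        "action_type": action_type,
        "action_metadata": action_metadata or {},
        "timestamp": datetime.now(timezone.utc).isoformat(),  # ISO-8601 UTC
        "client_version": client_version,
        "latency_ms": latency_ms,
    }
```

Generating the `event_id` client-side is what makes server-side idempotency possible: a retried upload reuses the same id, so duplicates collapse on upsert.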
Telemetry Reliability & Latency
Given the client is offline or experiencing intermittent connectivity When interaction events are generated Then events are queued locally up to 5,000 events or 10 MB (whichever comes first) for up to 24 hours And queued events are retried with exponential backoff and jitter until acknowledged by the server And P95 end-to-end latency from event creation to server availability is ≤ 120 seconds; P99 ≤ 5 minutes And daily event loss (client-sent minus server-acknowledged) is ≤ 0.2% And server guarantees at-least-once delivery with idempotent upserts (dedupe by event_id), resulting in ≤ 0.5% duplicate insert attempts and 0% duplicate stored records
Coverage Metrics Computation
Given a Smart Sampler run is finalized for workspace W When coverage is computed Then the baseline distribution is derived from W’s last 30 days of uploads or the most recent 10,000 images, whichever is smaller And the selected 5 samples are classified for lighting ∈ {studio, natural, low-light, mixed}, background ∈ {white, colored, textured, in-situ}, and product_category (workspace taxonomy) And classifier macro-F1 on a validation set is ≥ 0.90 for lighting and background, and ≥ 0.85 for product_category And the system computes per-dimension representation deltas (selected vs baseline) and an overall Coverage Score ∈ [0,1] And metrics are persisted within 60 seconds of selection and are queryable by sampler_session_id
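The criteria require per-dimension representation deltas and an overall Coverage Score in [0, 1] but do not fix the aggregation. A plausible sketch, assuming total variation distance as the aggregation (one choice among several):

```python
from collections import Counter


def distribution(labels):
    """Normalize label counts into a probability distribution."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


def representation_deltas(selected, baseline):
    """Per-label delta: share among selected samples minus share in the baseline."""
    sel, base = distribution(selected), distribution(baseline)
    return {k: sel.get(k, 0.0) - base.get(k, 0.0) for k in set(sel) | set(base)}


def coverage_score(selected, baseline):
    """1 - total variation distance; 1.0 means selection mirrors the catalog exactly."""
    deltas = representation_deltas(selected, baseline)
    tv = 0.5 * sum(abs(d) for d in deltas.values())
    return 1.0 - tv
```

The same functions apply per dimension (lighting, background, product_category); the labels are the classifier outputs named in the criteria.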
Post-Training Outcome Tracking
Given a brand preset is trained from a Smart Sampler session S When the preset is used in production Then the system attributes downstream outcomes to S for the next 14 days or first 500 processed images, whichever comes first And preset acceptance rate is computed as accepted_presets / eligible_presets and is updated daily And downstream edit rate is computed as images with manual edits beyond crop / generated images and is updated daily And outcome metrics exclude workspaces that opt out of telemetry and sessions with fewer than 50 generated images And metrics are available via internal analytics API with filters by workspace_id and date range
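The two outcome ratios and the small-session exclusion can be stated directly; a sketch (function names are illustrative, the formulas are from the criteria):

```python
def preset_acceptance_rate(accepted_presets, eligible_presets):
    """accepted_presets / eligible_presets; None when there is nothing to measure."""
    return accepted_presets / eligible_presets if eligible_presets else None


def downstream_edit_rate(images_edited_beyond_crop, generated_images,
                         min_session_images=50):
    """Manual edits beyond crop per generated image.

    Sessions with fewer than 50 generated images are excluded, per the criteria.
    """
    if generated_images < min_session_images:
        return None  # session too small to attribute outcomes
    return images_edited_beyond_crop / generated_images
```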
Privacy & Workspace Boundary Controls
Given workspace W has telemetry disabled When a user performs sampler-related actions Then no telemetry payloads are transmitted or stored for W and local buffers are purged within 60 seconds. Given telemetry is enabled When events are processed and aggregated Then user identifiers are hashed with a workspace-scoped salt; no email, names, or image content are stored And cross-workspace joins are blocked; queries are constrained to workspace_id unless using approved aggregated datasets with k-anonymity k ≥ 20 And data retention for raw events is ≤ 180 days; deletion requests are honored within 7 days And all access to telemetry and analytics requires authorized roles and SSO; access is logged
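The workspace-scoped hash can be sketched as a keyed HMAC, with the salt held per workspace; because the key differs across workspaces, the same user produces unjoinable identifiers in each:

```python
import hashlib
import hmac


def anonymize_user_id(user_id: str, workspace_salt: bytes) -> str:
    """Hash a user id under a workspace-scoped salt (HMAC-SHA256).

    Identical inputs within one workspace stay stable (so events aggregate),
    while identifiers from different workspaces cannot be joined.
    """
    return hmac.new(workspace_salt, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
```

HMAC rather than a plain salted hash is the conservative choice: the salt acts as a secret key, so the hash cannot be brute-forced from a known user list without it.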
Product Team Quality Analytics
Given a product analyst with the appropriate role accesses the Quality Analytics dashboard When viewing Smart Sampler analytics Then the dashboard displays, by selectable time range, event volumes, action rates (approve/swap/regenerate/dismiss), coverage scores, preset acceptance rate, and downstream edit rate And no raw images, product names, or free-text are displayed; workspace identifiers are hashed And API/dashboards return within P95 ≤ 800 ms for cached queries and ≤ 3 s for uncached, with 99.9% monthly availability And metric definitions are documented and accessible via in-UI tooltips And data freshness is ≤ 60 minutes
Anonymized Model Feedback Ingestion
Given the nightly aggregation window closes When the model improvement pipeline requests input Then the system exports only cohort-level aggregates (cohort size ≥ 20) with workspace identifiers removed and differential privacy noise calibrated to ε ≤ 2, δ ≤ 1e-5 per 30 days And the exported schema includes coverage_score, per-dimension representation deltas, preset_acceptance_rate, downstream_edit_rate, and sample_size And exports are versioned and lineage-tracked; the consumer job references the dataset version and fails closed if privacy checks or schema validation fail And no data from telemetry-disabled workspaces is included
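A simplified sketch of the fail-closed export path for a single count: suppress cohorts under the k-anonymity floor, otherwise add Laplace noise at scale sensitivity/ε. Real (ε, δ) accounting over a 30-day budget is more involved; this only illustrates the mechanism.

```python
import math
import random


def dp_count(true_count: int, epsilon: float = 2.0, sensitivity: float = 1.0,
             min_cohort: int = 20) -> float:
    """Release a count with Laplace noise of scale sensitivity/epsilon.

    Cohorts below the minimum size are suppressed (fail closed), per the criteria.
    """
    if true_count < min_cohort:
        raise ValueError("cohort below k-anonymity floor; suppressing export")
    scale = sensitivity / epsilon
    # Inverse-CDF draw from Laplace(0, scale) using only the stdlib.
    u = random.random() - 0.5
    noise = -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)
    return true_count + noise
```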
Audit & Reproducibility
"As a support engineer, I want to reproduce a customer’s sample exactly so that I can debug issues and explain selection rationale with confidence."
Description

Provides deterministic sampling via seeded randomness and records all inputs required to reproduce a given selection (candidate set identifiers, embeddings version, classifier versions, thresholds, seed). Enables support and power users to re-run the sampler and verify identical outputs or explain divergences after model upgrades. Stores audit logs with retention aligned to workspace policy and exposes a support-only replay tool.

Acceptance Criteria
Deterministic Sampling With Seed
Given a fixed candidate_set_ids list, embeddings_version, classifier_versions, threshold set, sampler_version, and seed S When Smart Sampler is executed three consecutive times without changing any inputs Then each run returns exactly five unique image_ids identical to one another and in the same order And the run records indicate the same seed S was used for all executions
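The determinism requirement hinges on two details worth making explicit: an RNG instance seeded per run (so global random state cannot perturb it) and a canonical candidate order (so input ordering is irrelevant). A minimal sketch, with illustrative names:

```python
import random


def sample_five(candidate_ids, seed):
    """Deterministic selection: same inputs and seed always yield the same five ids."""
    rng = random.Random(seed)        # isolated RNG; unaffected by global random state
    ordered = sorted(candidate_ids)  # canonical order makes input ordering irrelevant
    return rng.sample(ordered, 5)
```

The real sampler ranks by embeddings and classifier scores rather than sampling uniformly, but the same two rules (per-run seeded RNG, canonical input order) are what make its replays bit-exact.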
Complete Audit Record Per Sampler Run
Given a sampler run completes successfully When the audit record is written Then the record contains: run_id, timestamp (UTC), workspace_id, actor (user_id or service_account), candidate_set_ids, embeddings_version, classifier_versions (by component), thresholds, sampler_version, seed, sampling_strategy, and output image_ids with ranks And the record is immutable and retrievable by run_id via support tooling And no successful run is observable to end users unless its audit record exists
Retention Enforcement Aligned to Workspace Policy
Given a workspace retention policy of N days is configured When an audit record becomes older than N days Then it is purged within 24 hours of crossing the threshold and is no longer retrievable by any role And audit records newer than N days remain retrievable And purge events are themselves logged with run_id and purge_timestamp
Support-Only Replay Reproduces Output
Given a user with support role and permission sampler.replay and a target run_id When they invoke Replay with option use_recorded_versions=true Then the tool re-runs Smart Sampler using the recorded candidate_set_ids, embeddings_version, classifier_versions, thresholds, sampler_version, and seed And it returns the same five image_ids in the same order and marks the replay status as Reproduced And if required inputs are missing (e.g., candidate not present or version unavailable), the tool returns an error code (MISSING_INPUTS or VERSION_UNAVAILABLE) without partial results And users without support role receive 403 Forbidden with no leakage of audit fields
Divergence Report After Model Upgrades
Given a historical run_id and newer component versions are available When Replay is invoked with use_recorded_versions=false (use_latest=true) Then the tool executes the sampler with the latest available embeddings/classifiers while keeping the same candidate_set_ids and seed And it produces a divergence report including: original vs new component versions, Jaccard similarity of selection sets, list of added/dropped image_ids with rank positions, and summary reasons where available And the replay is marked DIVERGED if the five image_ids or their order differ
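The divergence report's set comparison can be sketched directly from the criteria (Jaccard similarity, added/dropped ids with rank positions, order-sensitive status):

```python
def divergence_report(original_ids, replay_ids):
    """Compare two selections: Jaccard similarity plus added/dropped ids with ranks."""
    orig, new = set(original_ids), set(replay_ids)
    union = orig | new
    jaccard = len(orig & new) / len(union) if union else 1.0
    return {
        "jaccard": jaccard,
        "added": [(rank, x) for rank, x in enumerate(replay_ids) if x not in orig],
        "dropped": [(rank, x) for rank, x in enumerate(original_ids) if x not in new],
        # Order matters: the same five ids in a different order still diverge.
        "status": "Reproduced" if list(original_ids) == list(replay_ids) else "DIVERGED",
    }
```

Note the deliberate asymmetry: Jaccard ignores order, so the status flag (not the similarity score) is what enforces the "same five image_ids in the same order" rule.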
Power-User Deterministic Re-run via API
Given a workspace admin or designated power_user and a historical run_id When they call the re-run API with explicit pins matching the audit record (seed, embeddings_version, classifier_versions, thresholds, sampler_version) and scope limited to their workspace Then the API returns the same five image_ids in the same order And the API creates a new run_id whose audit record references the original run_id as replay_of And access is denied (403) for users outside the workspace

Style Coach

Inline guidance with best‑practice tips and guardrails as you set background, crop, lighting, and retouch levels. Clear, plain‑language hints explain trade‑offs and suggest starting points, helping non‑designers dial in an on‑brand look with confidence.

Requirements

Contextual Inline Tips & Explanations
"As a non‑designer seller, I want clear, in‑context explanations for each adjustment so that I can choose settings confidently without learning photo jargon."
Description

Provide inline, plain‑language guidance for background, crop, lighting, and retouch controls that explains what each adjustment does, the trade‑offs (e.g., “stronger retouching may reduce texture realism”), and suggests starting values based on product category and selected brand preset. Hints appear contextually as users hover or adjust sliders, with concise do/don’t examples and quick links to apply recommended settings. Content is non-technical, localized, and accessible, with glossary rollovers for unfamiliar terms. Integrates with the existing editor UI as a collapsible “Coach” panel and lightweight tooltips, instrumented with analytics to measure usage and tip efficacy, and supports A/B testing of copy variants.

Acceptance Criteria
Tooltip guidance for background control during hover and adjustment
Given the user hovers over or focuses the Background control or begins adjusting its slider When the trigger occurs Then a tooltip appears within 300 ms, anchored to the control, never obscuring the main image canvas, and remains visible while the control is focused or active And the tooltip copy is plain-language (Flesch-Kincaid Grade ≤ 8), includes a one-sentence what-it-does explanation, a trade-off statement (e.g., realism vs. cleanliness), and a starting value recommendation derived from the current product category and brand preset And the tooltip contains one Do and one Don’t micro-example with 96×96 px thumbnails and alt text And an Apply Recommendation link is present and focusable And pressing Esc or moving focus away dismisses the tooltip And the tooltip is not rendered when the control is disabled
Coach panel integration, collapse/expand, and state persistence
Given the editor is loaded When the user clicks the Coach panel toggle or presses the assigned keyboard shortcut Then the panel expands/collapses without shifting the image canvas and without overlapping critical editor controls And panel open/closed state persists across projects and sessions for the signed-in user And on viewports < 1024 px width the panel renders as a bottom sheet with the same content and controls And the Coach code is lazy-loaded; initial editor bundle size increase is ≤ 30 KB gzipped And first open completes within 500 ms on a 5 Mbps connection (p95) And the panel is fully operable via keyboard (Tab/Shift+Tab) with a visible focus indicator
Personalized starting values by product category and brand preset
Given a product has a detected or selected category and a brand preset is active When a user opens tips for Background, Crop, Lighting, or Retouch Then each tip displays a Recommended starting value sourced from the category×preset mapping table with a visible version tag And clicking Apply sets the corresponding control(s) to the recommended values and updates the preview within 200 ms (p95) And an "Applied" toast appears for 2–4 seconds with an Undo action that reverts the settings And if category is unknown, preset-level defaults are used; if both unknown, global defaults are used; these fallbacks are explicitly labeled And recommendations never exceed the allowed min/max for each control
Do/Don’t examples and quick actions
Given a tip is displayed for any of the four controls When the content renders Then it includes at least one Do and one Don’t example relevant to the current control and product category And each example includes a 96×96 px thumbnail, concise caption (≤ 80 characters), and alt text describing the example And clicking the Do example’s Apply button applies its linked settings immediately without page navigation And the Don’t example has no Apply action And if thumbnails fail to load within 2 seconds (e.g., on low-bandwidth connections), image examples are hidden and text-only fallbacks are shown
Localization and glossary rollovers for non-technical language
Given the user’s locale is en-US, es-ES, fr-FR, or de-DE When tips and Coach content are displayed Then copy is served in the user’s locale with no truncated or clipped strings, and numbers/date formats follow the locale And if the locale is unsupported, content falls back to en-US with a non-blocking notice in settings And all tip bodies meet Flesch-Kincaid Grade ≤ 8 in each locale And glossary-marked terms display an inline dotted underline; on hover or focus, a glossary tooltip opens within 250 ms with a 20–120 character definition and is dismissible with Esc or blur And glossary tooltips are navigable via keyboard and announced by screen readers
Accessibility of tooltips and Coach panel (WCAG 2.2 AA)
Given a keyboard-only or screen reader user interacts with tips or the Coach panel When navigating through controls and content Then all interactive elements are reachable via Tab order, have visible focus, and support activation via Enter/Space And tooltips use role="tooltip" with proper aria-describedby associations; panel uses appropriate landmark/role And content meets contrast ratio ≥ 4.5:1; interactive targets are ≥ 44×44 px And no tooltip auto-dismisses in under 5 seconds while focused/hovered; Esc always dismisses And screen readers announce tooltip open/close and the Apply action result And there is no keyboard trap; focus returns to the invoking control after tooltip dismissal
Analytics instrumentation and A/B testing of tip copy
Given analytics is enabled When a user views a tip, expands/collapses the Coach, clicks Apply, views a glossary term, or dismisses a tip Then events tip_viewed, coach_opened/closed, tip_applied, glossary_viewed, and tip_dismissed are emitted with properties: user_id_hash, session_id, control_id, product_category, brand_preset_id, locale, ab_variant_id, timestamp And events are queued and delivered with p95 latency ≤ 2 s and daily drop rate ≤ 2% And no PII (names, emails, images) is included in payloads And A/B copy variants are supported with a 50/50 random split (configurable), sticky per user for the experiment duration, and the assigned ab_variant_id is included on all related events And experiments can be enabled/disabled remotely with a safe fallback to control copy within 200 ms
Real‑time Before/After Preview
"As a boutique owner, I want to preview changes instantly and compare before/after so that I can see the impact and avoid over‑editing."
Description

Render immediate visual feedback for all Style Coach adjustments with a smooth, low‑latency preview that supports a before/after toggle (split slider and quick tap), per‑adjustment previews, and instant revert to defaults. The preview pipeline incrementally applies changes on-device with GPU acceleration and smart throttling to maintain target frame rates, falling back to progressive updates for large images or low‑power devices. Ensures pixel parity between preview and final export, with safeguards to re-render at full fidelity after adjustments. Includes keyboard shortcuts and accessible controls for comparison modes.

Acceptance Criteria
Split Slider Before/After Comparison
Given a product photo is loaded in Style Coach with at least one adjustment applied And the before/after split slider is visible When the user drags the slider horizontally via mouse, touch, or keyboard arrows Then the preview updates continuously so that pixels left of the divider show the unprocessed "before" image and pixels right show the processed "after" image with no cross-bleed And the divider tracks input within 1 rendered frame of movement And median frame rate during drag on a reference device is >= 45 FPS, with 95th percentile input-to-frame latency <= 120 ms And the divider cannot be dragged outside image bounds and supports RTL locales And there is no flicker, tearing, or mismatch between displayed halves during drag
Quick Tap and Hold Before/After Toggle
Given at least one adjustment is applied When the user presses and holds the comparison key (e.g., Space) or presses and holds the on-screen Compare button Then the preview switches to "before" within 60 ms and remains so while held And on release, the preview returns to "after" within 60 ms preserving current adjustments And a single tap (keyboard or button) toggles persistent compare mode on/off And the toggle state is reflected in the UI and is limited to the current session And frame rate during toggling does not drop below 30 FPS on reference devices
Per-Adjustment Live Preview Responsiveness
Given background, crop, lighting, or retouch controls are being adjusted via drag or input When the control value changes Then the on-device GPU-accelerated preview updates within 80 ms of the latest input and at least every 100 ms during continuous drag And intermediate renders are cancelable; outdated frames are dropped in favor of the newest value And upon interaction end (>= 200 ms idle), a full-fidelity re-render completes within 500 ms And visual output of the full-fidelity re-render matches the next exported image per Pixel Parity criteria
Instant Revert to Defaults
Given adjustments have been made When the user invokes Revert to Defaults (button or Cmd/Ctrl+Backspace) Then all Style Coach controls reset to system defaults and the preview updates within 150 ms And the action is atomic (single undo step) and undo/redo restores prior state including compare mode And no residual crop, mask, or hidden parameters remain after revert And any in-progress renders are canceled and replaced by a default-state full-fidelity render
Preview-to-Export Pixel Parity Safeguard
Given the preview has stabilized after user interaction (>= 200 ms idle) When the user triggers Export or background save Then the system verifies parity by producing a full-fidelity render with the same pipeline and color management as preview And the exported image and the stabilized preview buffer are bitwise identical in linear RGB 16-bit (or 8-bit where applicable) for supported devices Or, if hardware/driver variance prevents bitwise equality, the per-pixel absolute delta must be <= 1/255 with SSIM >= 0.999; otherwise an automatic re-render is performed before export until criteria are met And color profile, ICC tags, and output dimensions match exactly between preview and export
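The parity safeguard's decision ladder (bitwise first, then the tolerance fallback, else re-render) can be sketched as below. This is a simplified illustration: pixels are modeled as nested lists of linear-RGB channel values, and the SSIM ≥ 0.999 check from the criteria is omitted.

```python
def pixels_within_tolerance(preview, export, max_delta=1.0 / 255):
    """Fallback parity check: every channel delta must be <= 1/255."""
    return all(abs(p - e) <= max_delta
               for row_p, row_e in zip(preview, export)
               for p, e in zip(row_p, row_e))


def verify_parity(preview, export):
    """Bitwise equality first; fall back to tolerance; otherwise force a re-render."""
    if preview == export:
        return "bitwise-identical"
    if pixels_within_tolerance(preview, export):
        return "within-tolerance"
    return "re-render-required"
```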
Smart Throttling and Progressive Updates on Constrained Devices
Given a large image (> 24 MP) or a device flagged as low-power or thermally throttled When the user adjusts any Style Coach control Then the system maintains interactive frame rate >= 30 FPS by throttling expensive passes and rendering lower-resolution tiles first And a progressive preview appears within 150 ms and refines to full resolution within 700 ms after interaction ends And no UI thread stall exceeds 16 ms at 60 Hz; input remains responsive throughout
Keyboard Shortcuts and Accessible Comparison Controls
Given a keyboard-only or screen reader user is operating Style Coach When interacting with compare modes (split slider, quick toggle) and revert Then all functions are reachable via documented shortcuts: Focus split slider, Move divider (Left/Right), Quick compare (Space), Toggle compare (C), Revert (Cmd/Ctrl+Backspace) And every control exposes accessible name, role, and state; focus order is logical; no focus trap occurs And features are operable without a pointer; shortcuts are surfaced in tooltips and Help And meets WCAG 2.2 AA for 2.1.1 Keyboard, 2.4.3 Focus Order, 4.1.2 Name, Role, Value And screen readers announce compare mode state and divider position (percentage) within 500 ms
Style Guardrails & Safe Ranges
"As a marketplace seller, I want guardrails that keep edits within platform guidelines so that my listings aren’t penalized and my brand looks consistent."
Description

Enforce recommended and hard‑limit ranges for background, crop, lighting, and retouch parameters based on brand presets and marketplace policies (e.g., pure white background, minimum subject coverage). Provide proactive warnings before settings violate guidelines, explain why, and offer one‑click corrections ("snap to safe"). A rules engine maps product category and target marketplace to parameter bounds and validation checks, with configurable templates per marketplace and brand. Guardrails never block experimentation in sandbox mode but require confirmation to publish non‑compliant results. Includes audit logging for applied guardrails and a visual indicator when settings are outside recommended ranges.

Acceptance Criteria
Non-Compliant Publish Requires Explicit Confirmation
Given a live publish target is selected for marketplace M and brand preset B is active And at least one parameter (background, crop, lighting, retouch) violates a configured rule When the user clicks Publish Then a confirmation modal appears within 500 ms listing each violated rule with plain-language explanation and source (Marketplace/Brand) And for any hard-limit violation, the Publish Anyway action is not available; primary actions are Snap to Safe & Publish and Cancel And if only recommended-range violations exist, actions include Publish Anyway, Snap to Safe & Publish, and Cancel And choosing Snap to Safe & Publish adjusts all violating parameters to the nearest compliant values per the active ruleset and completes publish successfully And if there are no violations, publish proceeds with no modal
Sandbox Mode Allows Unrestricted Experimentation
Given sandbox mode is enabled for the current project When the user sets parameters outside recommended or hard-limit ranges Then no editing interactions are blocked and changes are applied in the editor And inline warnings and indicators still appear to inform about potential non-compliance And exporting previews or downloading test renders proceeds without confirmation And attempting to publish to a live marketplace from sandbox triggers the non-compliant publish confirmation modal if violations exist
Proactive Warning Explains Violation and Suggests Fix
Given the user is adjusting a parameter slider governed by a ruleset When the value enters within 10% of a hard limit Then an inline caution tooltip appears within 300 ms indicating proximity to the limit and the reason source (e.g., Marketplace policy) When the value crosses a recommended bound Then a yellow warning pill appears within 300 ms with message explaining the recommendation, the why, and a CTA Snap to Safe When the value exceeds a hard limit Then the control is marked red, a persistent banner appears with a plain-language explanation and link to Learn more, and publish will require correction And all warnings clear within 300 ms after the value returns to within recommended range
One-Click Snap to Safe Corrects Settings
Given one or more parameters are outside compliant ranges for the active ruleset When the user clicks Snap to Safe from a warning, banner, or modal Then the system computes nearest compliant values and applies them to the current selection (single image or batch) in under 700 ms per 100 images And a toast confirms "Fixed N issues across K images" with a View details link listing adjustments by parameter And after correction, no hard-limit violations remain and any recommended-range violations are resolved unless explicitly excluded by the user
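For scalar parameters, "nearest compliant value" reduces to clamping into the active bounds; a batch sketch (data shapes are illustrative, not the production model):

```python
def snap_batch(settings, ruleset):
    """Clamp every out-of-range parameter across a batch.

    settings: {image_id: {param_name: value}} mutated in place.
    ruleset:  {param_name: (lo, hi)} compliant range per parameter.
    Returns (image_id, param, old, new) tuples for the confirmation toast.
    """
    fixes = []
    for image_id, params in settings.items():
        for name, value in params.items():
            lo, hi = ruleset[name]
            snapped = min(max(value, lo), hi)  # nearest compliant value
            if snapped != value:
                params[name] = snapped
                fixes.append((image_id, name, value, snapped))
    return fixes
```

The returned fix list maps directly onto the "Fixed N issues across K images" toast and its View details breakdown.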
Rules Engine Loads Bounds by Category and Marketplace
Given product category C and marketplace M are selected and brand preset B is active When the editor session initializes or the user changes C, M, or B Then the rules engine loads the configured templates for M and B, including any category-specific overrides And hard-limit bounds are computed as the most restrictive intersection of M and B hard limits And recommended ranges are taken from B and clipped to the computed hard-limit bounds; if B lacks a recommendation, fall back to M, else to global defaults And the active ruleset exposes ruleId(s), ruleVersion(s), and sources for each parameter And ruleset evaluation completes within 200 ms when cached and within 800 ms on a cold fetch
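The bound-combination rules above (most-restrictive intersection for hard limits; brand → marketplace → global fallback for recommendations, clipped to the hard limits) can be sketched for a single parameter; the global default here is an assumption for illustration:

```python
def effective_bounds(marketplace_hard, brand_hard,
                     brand_rec=None, marketplace_rec=None,
                     global_rec=(0.0, 1.0)):  # illustrative global default
    """Combine bounds for one parameter per the rules-engine criteria.

    Hard limits: most restrictive intersection of marketplace and brand.
    Recommended: brand if set, else marketplace, else global; clipped to hard limits.
    """
    hard_lo = max(marketplace_hard[0], brand_hard[0])
    hard_hi = min(marketplace_hard[1], brand_hard[1])
    rec = brand_rec or marketplace_rec or global_rec
    rec_lo = min(max(rec[0], hard_lo), hard_hi)
    rec_hi = max(min(rec[1], hard_hi), hard_lo)
    return {"hard": (hard_lo, hard_hi), "recommended": (rec_lo, rec_hi)}
```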
Audit Log Captures Guardrail Events
Given a guardrail-related event occurs (warning shown, snap-to-safe applied, publish override, ruleset change) When the event is processed Then an audit log entry is written within 1 s containing: timestamp, userId, projectId, imageId(s), environment (sandbox/live), actionType, parameter(s), previousValue(s), newValue(s), ruleId(s), ruleVersion(s), marketplace, brandPresetId, and outcome And audit entries are retained for at least 180 days and are exportable as CSV and JSON And viewing Compliance History for an image shows the last 20 guardrail events in chronological order
Visual Indicator for Out-of-Range Settings
Given a parameter value is within recommended range Then the control shows a neutral state with no badge and no thumbnail indicator When the value exits the recommended range but remains within hard limits Then a yellow Out of recommended badge appears on the control and a yellow indicator appears on affected image thumbnails within 300 ms When the value exceeds a hard limit Then a red Non-compliant badge appears on the control, affected thumbnails show a red indicator, and the Publish button displays a red dot with the count of violating images And returning values to within recommended range clears indicators within 300 ms
AI Smart Defaults
"As a time‑pressed seller, I want smart starting values based on my product type so that I can get an on‑brand look faster with fewer adjustments."
Description

Auto‑suggest starting values for background, crop, lighting, and retouch based on detected product type, material, and initial image conditions (exposure, shadows, background uniformity). The model leverages existing product metadata, visual features, and the user’s selected brand preset to compute a balanced baseline, displaying confidence and reasoning in plain language (e.g., “jewelry identified—reduce shadows to reveal sparkle”). Users can accept all suggestions, apply per‑control, or dismiss. The system learns from user overrides to refine future defaults per brand. Includes privacy‑safe processing and deterministic fallbacks when detection is uncertain.

Acceptance Criteria
Auto-Suggest Smart Defaults Based on Product and Brand
Given a product image with metadata and a selected brand preset When the Style Coach panel is opened Then the system proposes values for background, crop, lighting, and retouch for that image without applying them. Given a batch of up to 200 images When suggestions are generated Then 95th-percentile latency per image is <= 2 seconds and average latency is <= 1 second. Given different product types (e.g., jewelry, apparel, footwear) When suggestions are generated Then proposed values vary appropriately by type and respect brand preset constraints.
Apply All, Per-Control, or Dismiss Suggestions
Given suggestions are available for an image or batch When the user selects "Apply all" Then all four controls adopt the suggested values and a confirmation toast appears. Given suggestions are available When the user toggles apply per control Then only the selected controls update to suggested values and others remain unchanged. Given suggestions are available When the user selects "Dismiss" Then no control values change and the suggestion banner is hidden for the current session. Given any application of suggestions When the user clicks Undo within 30 seconds Then all affected controls revert to their exact prior values.
Confidence and Plain-Language Reasoning Display
Given suggestions are available When the UI renders them Then each control shows a numeric confidence score (0–100%) and a one-sentence rationale referencing detected product, material, or image conditions. Given a rationale sentence When measured for readability Then it has a Flesch-Kincaid grade level <= 8 and length <= 140 characters. Given any control has confidence < 50% When displayed Then a "Low confidence" badge appears with a link to view fallback criteria. Given the user opens the rationale tooltip When expanded Then detection signals (product type, material, exposure level, background uniformity) and their confidences are listed.
Learning From User Overrides Per Brand
Given brand learning is enabled When a user adjusts any control by >= 15% from the suggested value on >= 5 images of the same detected product type within 30 days Then subsequent suggestions for that brand and product type shift at least 50% toward the median of those overrides. Given multiple brands operate in the system When overrides differ between brands Then learned adjustments remain brand-scoped and never influence other brands. Given the admin selects "Reset learning" for a brand When confirmed Then all learned adjustments for that brand are cleared and new suggestions revert to the model baseline. Given a brand has opted out of learning When users make overrides Then no new override data is stored and suggestions remain at baseline.
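The override-learning rule (at least five overrides of ≥ 15% within the window, then shift at least 50% toward their median) can be sketched per brand and product type; window filtering and brand scoping are assumed to happen upstream of this function:

```python
import statistics


def learned_default(baseline, overrides, shift=0.5,
                    min_samples=5, min_delta_fraction=0.15):
    """Blend the model baseline toward the median significant override.

    Only overrides that moved >= 15% away from the suggested value count,
    and at least five of them are required before the default shifts.
    """
    significant = [o for o in overrides
                   if baseline and abs(o - baseline) / abs(baseline) >= min_delta_fraction]
    if len(significant) < min_samples:
        return baseline  # not enough signal; stay at the model baseline
    target = statistics.median(significant)
    return baseline + shift * (target - baseline)
```

Using the median rather than the mean keeps one extreme override from dragging the learned default; the 50% shift leaves the model baseline with equal weight.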
Deterministic Fallback on Low Detection Confidence
Given overall detection confidence < 0.5 or product type is unknown When suggestions are requested Then the system uses deterministic brand-neutral baseline values for all controls. Given the same input image and brand preset under fallback conditions When suggestions are requested multiple times Then identical fallback values are returned each time. Given fallback is active When suggestions are displayed Then the UI shows a banner stating "Using fallback defaults due to low confidence" and suppresses per-control confidence scores. Given the user selects "Retry detection" once per session When re-run completes Then the system either replaces fallback with new suggestions (if confidence >= 0.5) or remains in fallback with an updated timestamp.
Privacy-Safe Processing and Data Retention
Given images are processed to generate suggestions When processing completes Then raw images and derived features are automatically deleted within 30 minutes and are not used to train global models. Given system event logs are stored When reviewed Then they contain no full-resolution images or PII, only anonymized event IDs, timestamps, and aggregated metrics. Given override-learning data is stored When scoped Then it is limited to brand ID, contains only control deltas and detection labels, and can be purged upon brand request within 72 hours. Given a brand admin requests a data export When initiated Then the system provides an export of learned adjustments and anonymized event history for that brand within 24 hours.
Batch Consistency Advisor
"As a catalog manager, I want the system to flag inconsistent edits across a batch so that all images align with my brand preset."
Description

Analyze parameter variance across a batch and flag inconsistencies that may harm brand coherence (e.g., mixed crops or background hues). Provide a summary view with suggested harmonization actions such as “apply average crop to all,” “normalize background tone,” and “match lighting to reference image,” with previews and per‑image exceptions. Supports cluster‑based grouping for different subcategories within a batch and runs as a background job with progress updates for large uploads. Integrates with existing batch apply/undo, and records changes for easy rollback.

Acceptance Criteria
Background Hue Inconsistency Flagging
Given a batch of images is uploaded and Consistency Advisor analysis is initiated When the system computes each image’s background hue and detects ΔE00 > 3.0 relative to the batch median on ≥10% of images or ≥5 images (whichever is greater) Then the Summary view displays a "Background hue inconsistent" flag with the exact count of affected images and a preview grid of at least 6 thumbnails And the flag shows measured variance (median hue and ΔE00 standard deviation) and includes a "Normalize background tone" action
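The flagging threshold above reads as max(10% of the batch, 5 images). A sketch of that rule, assuming per-image ΔE00 distances from the batch-median background hue have already been computed (the CIEDE2000 formula itself is out of scope here):

```typescript
// Flag the batch when images whose background differs from the batch median
// by deltaE00 > 3.0 number at least max(10% of the batch, 5).
function shouldFlagBackgroundHue(deltaEFromMedian: number[]): {
  flagged: boolean;
  affectedCount: number;
} {
  const affected = deltaEFromMedian.filter((d) => d > 3.0).length;
  const threshold = Math.max(Math.ceil(deltaEFromMedian.length * 0.1), 5);
  return { flagged: affected >= threshold, affectedCount: affected };
}
```

`affectedCount` feeds the exact count shown in the Summary view; the "whichever is greater" wording maps to the `Math.max` of the two thresholds.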
Normalize Background Tone Action with Preview and Exceptions
Given a "Background hue inconsistent" flag is present When the user clicks "Normalize background tone", selects target = batch median hue, and optionally deselects exception images Then a side-by-side preview renders within 2 seconds for the selected subset, showing before/after for at least 3 representative images And on Apply, the normalized tone is applied only to selected images, thumbnails and metadata update within 5 seconds, and a confirmation reads "<updated> images updated, <excluded> excluded" And the operation is recorded as a single batch action with parameters (target hue, method, affected image IDs) for rollback
Average Crop Apply and Undo
Given the advisor detects more than one output aspect ratio across the batch When the user selects "Apply average crop to all" Then the system computes the modal aspect ratio and center alignment, shows preview overlays for at least 3 images, and on Apply updates crops for all selected images within 5 seconds And a single Undo reverts all applied crops and restores previous crop metadata and thumbnails for all affected images
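Note that "average crop" here means the modal (most frequent) aspect ratio, not a numeric mean. That computation is a simple frequency count; comparing ratios as strings such as "1:1" is an illustrative choice to avoid floating-point keys.

```typescript
// Returns the most frequent aspect ratio in the batch.
function modalAspectRatio(ratios: string[]): string {
  const counts = new Map<string, number>();
  for (const r of ratios) counts.set(r, (counts.get(r) ?? 0) + 1);
  let best = ratios[0];
  let bestCount = 0;
  for (const [r, c] of counts) {
    if (c > bestCount) {
      best = r;
      bestCount = c;
    }
  }
  return best;
}
```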
Cluster-Based Grouping of Subcategories
Given a batch contains visually distinct subcategories When the advisor performs clustering Then images are partitioned into k ≥ 2 groups, each with a group identifier and ≥5 images when available, and the Summary view shows per-group inconsistency flags and suggested actions And applying a harmonization action at the group level affects only images in that group and leaves other groups unchanged
Match Lighting to Reference Image
Given the user selects a reference image from the batch When the user chooses "Match lighting to reference" Then the advisor estimates exposure, contrast, and color temperature from the reference and generates previews for at least 3 sample images And on Apply, adjusted images’ mean luminance and white balance deviate by no more than ±2% from the reference metrics (unless excluded), and a confirmation shows the affected count
Background Job and Progress Updates for Large Batches
Given an upload of ≥200 images or total size ≥500 MB When analysis starts Then the Consistency Advisor runs as a background job with visible status stages (Queued, Analyzing, Suggestions ready) and a progress bar updating at least every 2 seconds And users can continue other in-app tasks while the job runs, and the Summary view becomes available with incremental results within 60 seconds
Change Log and One-Click Rollback
Given one or more harmonization actions have been applied When the user opens the change log Then each entry lists action type, parameters, actor, timestamp, and exact image IDs affected And clicking "Rollback" on an entry reverses the changes across all affected images within 10 seconds, restoring previous thumbnails and metadata, and creates a log entry noting the rollback
Guidance Content Management & Localization
"As a content admin, I want to manage and localize the guidance copy so that tips remain accurate, testable, and relevant across markets."
Description

Provide a lightweight CMS for Style Coach copy and examples, allowing product/UX teams to author, version, localize, and target tips by control, product category, marketplace, and user proficiency. Supports feature flags, rollout scheduling, and A/B testing hooks. Content is delivered via a cached, schema‑validated config to avoid app rebuilds, with rollback on failure and analytics to track tip engagement and impact on edit outcomes. Enables consistent tone of voice and rapid iteration of guidance without code changes.

Acceptance Criteria
CMS Authoring, Versioning, and Style Compliance
- Given I am a CMS editor with permission "Guidance:Write" When I create or edit tip copy or examples Then I can save as Draft with a semantic version increment and required change log notes
- Given a Draft exists When I Publish it Then the config version is incremented, the previous version is retained as Last Good, and the audit trail records timestamp, editor, diff, and reason
- Given two editors modify the same tip concurrently When the second editor attempts to Publish Then a merge conflict is surfaced and must be resolved before publish succeeds
- Given content fails tone-of-voice/style lint rules When I attempt to Publish Then the publish is blocked with actionable lint errors and required approver(s) can override only via recorded approval
- Given I view the Version History of a tip When I select any prior version Then I can preview and restore it in one action without code changes
Localization Coverage and Fallback
- Given locales en-US and es-ES are required for a tip When I submit a Draft for review Then the schema validator enforces either localized strings for each key or an explicit fallback marker
- Given a user with locale de-DE When de-DE and de are not provided Then the system serves en-US per the configured fallback chain and records the fallback level in telemetry
- Given a locale ar-SA (RTL) When the tip is rendered Then layout direction, punctuation, and numerals follow locale rules and screenshots/examples swap direction where applicable
- Given a new translation is Published When client cache TTL expires or a purge is issued Then clients fetch the updated locale strings without an app rebuild and without visual flash of untranslated text
Contextual Targeting by Control, Category, Marketplace, and Proficiency
- Given targeting rules {control:"Crop", category:"Apparel", marketplace:"Amazon", proficiency:"Novice"} When a Novice user opens the Crop control on an Apparel item configured for Amazon Then the targeted tip renders in the Style Coach within 200 ms of panel open
- Given multiple tips match the same context When priorities are defined Then the highest priority tip displays; when priorities tie Then the most specific rule (greatest attribute match count) wins deterministically
- Given no tip matches the context When the panel opens Then the control-level default tip displays if enabled, otherwise no tip renders and no empty container is shown
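The deterministic tie-breaking rule (priority first, then attribute-match count) could look like the sketch below. The rule and context types are illustrative, not the actual config schema, and the final id-based tie-break is an added assumption so selection stays deterministic even when both priority and specificity tie.

```typescript
interface TipRule {
  id: string;
  priority: number;
  match: Partial<Record<"control" | "category" | "marketplace" | "proficiency", string>>;
}

type Context = Record<"control" | "category" | "marketplace" | "proficiency", string>;

function selectTip(rules: TipRule[], ctx: Context): TipRule | null {
  // A rule matches when every attribute it specifies equals the context value.
  const matching = rules.filter((r) =>
    Object.entries(r.match).every(([k, v]) => ctx[k as keyof Context] === v)
  );
  if (matching.length === 0) return null;
  matching.sort((a, b) => {
    if (b.priority !== a.priority) return b.priority - a.priority; // highest priority first
    const spec = (r: TipRule) => Object.keys(r.match).length;       // specificity = attrs matched
    if (spec(b) !== spec(a)) return spec(b) - spec(a);
    return a.id.localeCompare(b.id); // stable final tie-break (assumption)
  });
  return matching[0];
}
```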
Feature Flags and Scheduled Rollouts
- Given a tip group is gated by feature flag "stylecoach.tips.v2" When the flag is Off for a cohort Then members of that cohort do not receive v2 content
- Given a rollout schedule 10% at T0, 50% at T0+12h, 100% at T0+24h When time passes each threshold Then exposure adjusts automatically without deploy and the change is visible in audit logs
- Given an internal QA override is enabled When a user is in the QA cohort Then they receive 100% exposure regardless of global rollout settings
- Given a pause is invoked due to incident When the rollout is paused Then further exposure increases halt within 60 seconds and newly ineligible users revert to the previous config
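Percentage rollouts of this kind are commonly implemented by hashing each user into a stable bucket in [0, 100), so raising the percentage only ever adds users: the 10% cohort stays enrolled at 50% and 100%. A sketch under that assumption; FNV-1a is an arbitrary hash choice, not necessarily what the product uses.

```typescript
// 32-bit FNV-1a hash; any well-distributed, stable hash works here.
function fnv1a(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// A user is exposed when their stable bucket falls below the current
// rollout percentage; adjusting `percent` requires no deploy.
function isExposed(userId: string, flagName: string, percent: number): boolean {
  const bucket = fnv1a(`${flagName}:${userId}`) % 100;
  return bucket < percent;
}
```

Salting the hash with the flag name keeps cohorts independent across flags, so the same users are not always the first 10% for every rollout.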
A/B Testing Hooks and Stable Assignment
- Given an experiment with variants A and B and experiment_id is defined When an eligible user first views the Style Coach in the targeted context Then the user is assigned a stable variant by (user_id, experiment_id) hashing for 30 days
- Given a tip impression, click, or dismiss occurs When the analytics event is emitted Then payload includes experiment_id, variant_id, tip_id, control, category, marketplace, proficiency, locale, and config_version
- Given hourly sample-ratio monitoring is active When absolute allocation imbalance exceeds 2 percentage points for >2 consecutive checks Then an alert is sent to the analytics channel with experiment metadata
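Stable assignment by (user_id, experiment_id) hashing might be sketched like this. FNV-1a and the modulo split are illustrative choices; the 30-day window would be enforced elsewhere (e.g., by when the experiment config itself expires), since the hash is stable indefinitely.

```typescript
// 32-bit FNV-1a hash; stable across sessions and devices for the same inputs.
function fnv1a(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Deterministic variant assignment: no storage needed, and allocation is
// approximately uniform across variants for a well-distributed hash.
function assignVariant(userId: string, experimentId: string, variants: string[]): string {
  const h = fnv1a(`${experimentId}:${userId}`);
  return variants[h % variants.length];
}
```

Sample-ratio monitoring then just compares observed variant counts against the expected uniform split.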
Schema-Validated Cached Config with Health-Based Rollback
- Given a config build completes When it is promoted to CDN Then it must pass JSON Schema vX validation and signature verification; otherwise the promotion fails and Last Good remains active
- Given a client launches offline When a Last Good config is cached Then guidance content loads from cache without errors and is marked as offline-source in telemetry
- Given a newly activated config correlates with >3% 5xx or client error rate increase over 5 minutes When the health check triggers Then an automatic rollback to Last Good is executed and a purge forces clients to revert within the cache TTL window
- Given a manual purge is requested When executed Then 95% of active clients fetch the new config within 10 minutes, confirmed by config_version in heartbeat events
Engagement Analytics and Outcome Attribution
- Given a tip is displayed When the user views, clicks Learn More, hovers, or dismisses Then impression, click, hover_time_ms, and dismiss events are captured with session_id and tip_id
- Given the user adjusts the targeted control within 2 minutes of a tip impression When events are processed Then the adjustment is attributed to the tip with control_name and delta value recorded
- Given daily aggregation runs at 02:00 UTC When metrics are computed Then the dashboard surfaces tip-level CTR, average engagement time, and post-tip adoption rate by locale, marketplace, and proficiency with data freshness < 24h

Preview Grid

See instant before/after results across all five samples at once. Toggle elements (shadow, crop ratio, background tone, retouch strength) to compare variants side‑by‑side and lock choices faster—no tab hopping or reprocessing delays.

Requirements

Instant Multi-Sample Preview Grid Rendering
"As an online seller previewing my catalog, I want to see all five samples appear instantly in a grid so that I can evaluate options quickly without waiting or switching tabs."
Description

Render five sample images in a responsive grid with instant before/after visibility, using progressive preview generation and client-side GPU acceleration (WebGL/canvas) to achieve sub-200ms interaction latency. Provide skeleton loaders, lazy-loading for high-resolution frames, and caching to avoid reprocessing delays. Support synchronized zoom/pan, responsive breakpoints, and device memory safeguards to prevent crashes on large batches. Implement error and retry states per tile, plus observability (metrics and logs) for time-to-first-preview, FPS, and failure rates. Ensure parity with final server-rendered output via color profiles and consistent tone mapping, with graceful fallback to static previews when hardware acceleration is unavailable.

Acceptance Criteria
Five-Sample Responsive Grid with Before/After Toggle and Breakpoints
- Given the Preview Grid is opened with at least five uploaded samples, When the viewport width is >= 1200px, Then five tiles render in a single row with equal gutters and consistent aspect-fit, showing all five simultaneously. - Given the viewport width is 992–1199px, When the grid renders, Then five tiles are visible within two rows (columns 3+2) without horizontal scroll; gutters are 8–16px; tiles maintain aspect-fit. - Given the viewport width is 768–991px, When the grid renders, Then three columns are used and all five tiles are accessible via vertical scroll with no content overflow. - Given the viewport width is < 768px, When the grid renders, Then two columns are used and all per-tile controls remain accessible without overlapping content. - Given a user activates the global Before/After toggle, When toggled, Then all five tiles switch state in unison and each tile indicates the active state. - Given a user modifies shadow, crop ratio, background tone, or retouch strength, When applied, Then the same parameters are applied uniformly across all five visible tiles for side-by-side comparison. - Given keyboard navigation, When using Tab/Shift+Tab and Enter/Space, Then each tile’s controls (including Before/After) are focusable and operable with a visible focus indicator.
Progressive Previews with Skeletons, Lazy-Loading, and Caching
- Given a visible tile requests a preview, When loading starts, Then a skeleton loader appears within 100ms and persists until an image frame is displayed. - Given Good 4G network conditions (≈10 Mbps, RTT ≈150ms), When loading visible tiles, Then time-to-first-preview (low-res) per tile is <= 600ms at the 95th percentile. - Given high-resolution frames are available, When swapping from low-res to high-res, Then each visible tile completes the swap within 2.5s at P95 without layout shift (CLS <= 0.01 per tile). - Given tiles are outside the viewport, When the grid loads, Then high-res requests for those tiles are deferred until they are within 300px of the viewport edge. - Given network concurrency limits, When loading multiple tiles, Then no more than 6 image requests are in-flight concurrently. - Given a previously rendered style combination is requested again, When the tile renders, Then the preview is served from cache and displayed within 150ms at P95 without issuing a server request. - Given the device is offline and a cached preview exists, When the tile becomes visible, Then the cached preview is displayed; otherwise an offline placeholder is shown with no spinner.
Performance and Observability Under Load
- Given five tiles are visible, When the user performs a before/after toggle, a single style change, zoom, or pan, Then end-to-end visual response latency is <= 200ms at the 95th percentile on Desktop Baseline (8-core CPU, 8GB RAM, integrated GPU) and Mobile Baseline (A14-class or equivalent). - Given the user pans continuously for 3 seconds, When measuring frame cadence, Then average FPS is >= 45 and FPS does not drop below 30 for more than 200ms. - Given telemetry is enabled, When a session uses the Preview Grid, Then the client emits metrics: time_to_first_preview_ms, interaction_latency_ms, pan_fps_avg, preview_failure_rate, gpu_backend (WebGL/Canvas), each tagged with session_id, tile_id, and timestamps. - Given a preview failure occurs, When logging, Then a failure event is recorded with error_code and retry_count and is included in preview_failure_rate; sampling rate is 100% for this feature. - Given distributed tracing, When any tile loads, Then a trace_id is propagated across client logs and image requests to correlate timing.
GPU Acceleration with Graceful Fallback to Static Previews
- Given the browser supports WebGL2 or WebGL, When the grid initializes, Then GPU acceleration is used for compositing and tone operations and gpu_backend=WebGL is reported in diagnostics. - Given WebGL context creation fails or is lost, When rendering, Then the system automatically falls back to Canvas2D/WASM without a crash and displays a non-blocking notice that hardware acceleration is disabled. - Given a QA flag to force software mode, When enabled, Then rendered results match the GPU path within defined color parity thresholds and the UI remains fully operable. - Given hardware acceleration is unavailable and software cannot meet performance targets, When necessary, Then static server-generated previews are displayed while controls remain functional and the user is informed of the fallback.
Synchronized Zoom/Pan Across Grid
- Given any tile is focused or hovered, When the user zooms via Ctrl/scroll, pinch, or zoom controls, Then all five tiles synchronize to the same zoom level within one frame and maintain the same focal point within 1px. - Given the user pans by dragging within one tile, When the drag ends, Then all other tiles reflect the same relative pan offset simultaneously. - Given the user activates Reset View, When triggered, Then all tiles return to fit-to-frame within 100ms. - Given zoom constraints, When zooming, Then the supported zoom range is 100%–400% with smooth interpolation and no visible aliasing at 200% on 2x DPR displays.
Color and Tone Parity with Server Output
- Given a calibration set of images with server-rendered outputs, When the client renders previews, Then CIEDE2000 color difference per tile vs server output is median <= 1.0 and P95 <= 2.0 in sRGB. - Given tone mapping and gamma, When comparing luminance histograms, Then mean luminance delta is <= 2% and clipped pixel proportion differs by <= 0.5% vs server output. - Given input images contain sRGB or Display P3 profiles, When rendered, Then colors are correctly converted/managed such that the above parity thresholds are maintained. - Given the browser lacks color management, When detected, Then the system warns via a non-blocking toast and falls back to sRGB assumptions while maintaining P95 <= 3.0 DeltaE.
Resilience: Device Memory Safeguards and Per-Tile Error/Retry
- Given the device reports Device Memory <= 4 or a memory pressure signal is received, When rendering five tiles, Then concurrent decoded image buffers are limited to <= 3 tiles and peak JS heap <= 200MB and GPU textures <= 256MB as verified in performance profiling. - Given a memory pressure or WebGL out-of-memory event, When detected, Then the grid reduces preview resolution by one step and disables optional effects (e.g., soft shadows) and continues without a crash. - Given a tile fails due to network timeout or processing error, When the failure occurs, Then the tile shows an error state with message, error_code, and Retry action while other tiles remain interactive. - Given retry policy, When auto-retry triggers or the user taps Retry, Then exponential backoff is applied (e.g., 1s, 3s) with a maximum of 2 automatic retries and 1 manual retry; on success the tile returns to normal. - Given requests are superseded by new style changes, When cancellation occurs, Then the previous request is aborted and no stale frames replace newer ones (no flashback).
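The auto-retry schedule above (1s, then 3s, then manual only) can be captured in a small helper. A sketch, with the delay values taken from the example in the criterion:

```typescript
// Returns the delay in ms before the next automatic retry, or null when the
// automatic budget (2 retries) is exhausted and only manual retry remains.
function nextAutoRetryDelayMs(attemptsSoFar: number): number | null {
  const delays = [1000, 3000]; // exponential backoff per the example schedule
  return attemptsSoFar < delays.length ? delays[attemptsSoFar] : null;
}
```

A per-tile retry loop would call this after each failure, fall back to showing the manual Retry action on `null`, and abort any in-flight request when a newer style change supersedes it.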
Real-time Variant Toggles (Shadow, Crop Ratio, Background Tone, Retouch Strength)
"As a boutique owner refining my product photos, I want to toggle styling controls and see instant updates so that I can compare variants side-by-side and decide faster."
Description

Provide interactive controls that update previews in real time without a server round-trip: shadow on/off and intensity, selectable crop ratios (e.g., 1:1, 4:5, 3:2) with safe-zone guides, background tone palette/slider including brand presets, and retouch strength levels with live approximations. Support per-sample overrides and a global apply mode with clear UI state. Use worker threads for image operations to keep the UI responsive and reconcile client approximations with server-quality renders in the background to guarantee visual consistency. Persist chosen values in session state and prefill from the last-used brand preset.

Acceptance Criteria
Instant Shadow Toggle and Before/After in Preview Grid
- Given the Preview Grid shows five sample images with shadow controls and Before/After mode available When the user toggles Shadow On or Off Then all five previews reflect the new shadow state within 150 ms without a loading indicator And no synchronous network request is made and the UI remains interactive (pointer latency ≤ 50 ms) And the Shadow Intensity slider is enabled only when Shadow is On
- Given Shadow is On and the user drags the Shadow Intensity slider When the slider value changes Then previews update continuously at least every 100 ms and land on the exact final value on release And the chosen intensity value is persisted in session state
- Given the user activates Before/After mode When any toggle is changed Then each of the five tiles displays synchronized Before and After states for the current image without reprocessing delay
Crop Ratios with Safe-Zone Guides Across Samples
- Given crop ratio options 1:1, 4:5, and 3:2 are visible When the user selects a ratio Then safe-zone guides render on all five previews within 100 ms and the crop overlays reflect the selected ratio exactly And panning/zooming within the crop remains ≥ 55 fps And the selected ratio is saved per-sample and (if Global Apply is active) as a global value
- Given Global Apply mode is active When the user changes crop ratio Then all samples without overrides adopt the new ratio, and overridden samples remain unchanged and are visibly marked as locked
Background Tone Palette and Brand Preset Application
- Given the background control shows a tone palette, a numeric slider, and brand presets When the user selects a brand preset Then all five previews update within 150 ms to the preset tone and the preset is highlighted as selected And the applied tone matches the preset target within ΔE00 ≤ 2 And the choice is persisted in session state
- Given the user adjusts the tone slider When the slider is moved Then the preview updates continuously without visible banding And the numeric value is displayed and copyable
- Given Global Apply is Off When a tone change is made Then only the active sample updates
Retouch Strength Live Approximation and Background Reconciliation
Given retouch strength control with levels 0–100 is visible When the user sets a new value Then the local preview updates within 200 ms using a client-side approximation And when the background server-quality render completes, SSIM ≥ 0.98 and ΔE00 ≤ 2 between local preview and final within the product region And the swap to server-quality occurs without flicker and with layout shift ≤ 1 px And the retouch value persists in session and restores on reload
Per-Sample Overrides vs Global Apply with Clear UI State
- Given a Global Apply toggle is visible and Off by default When Global Apply is turned On Then subsequent control changes apply to all samples except those with per-sample overrides And overridden samples display a lock icon and tooltip "Override active" And a "Reset to Global" action is available on overridden samples
- Given a sample has an override When the user clicks "Reset to Global" Then the sample adopts the current global values and the override indicator is removed
- Given a sample has an override When global values change Then the overridden sample remains unchanged
Worker Threads Keep UI Responsive During Image Operations
- Given the app is processing 5× 2048 px previews on a device with ≥ 4 logical cores When the user rapidly toggles controls and drags sliders for 10 seconds Then the main thread frame rate remains ≥ 55 fps and input latency ≤ 50 ms And image processing executes in worker threads with no single main-thread long task ≥ 50 ms aside from painting And no "Unresponsive" browser prompt appears
- Given network latency is 200–400 ms and background reconciliation is active When the user adjusts any control Then preview updates are never blocked by network and no spinner is shown for local updates
Session Persistence and Last-Used Brand Preset Prefill
- Given the user sets values for shadow, crop ratio, background tone, and retouch strength When the user navigates away and returns within the same session Then the same values are restored for each sample and the global state
- Given a new batch is started without selecting a preset When the editor loads Then controls prefill from the last-used brand preset for the account And the preset name is displayed as selected And users can override any control per-sample or globally
Synchronized Before/After Comparison & Grid Controls
"As a power user comparing edits, I want synchronized before/after controls across the grid so that I can spot differences quickly without repeating the same actions on each image."
Description

Enable per-tile and global comparison modes, including a before/after flip, a draggable split slider, and synchronized zoom/pan across selected tiles. Provide reset-to-default and quick-compare (press-and-hold) interactions. Maintain consistent overlays (crop guides, safe zones) in both before and after states. Ensure keyboard and pointer parity for all compare actions and maintain a 60 FPS performance target on supported hardware.

Acceptance Criteria
Per‑Tile and Global Compare Scoping
- Given five preview tiles are visible and at least one tile is selected When the user switches scope to Global compare Then any compare action (flip, split slider, zoom, pan) applies simultaneously to all selected tiles, and non-selected tiles are unaffected And a scope indicator displays "Global" and the count of affected tiles equals the number of selected tiles
- When the user switches to Per‑Tile scope and focuses a tile Then compare actions affect only that focused tile and other tiles remain unchanged And switching scopes does not modify underlying image adjustments or style presets
Synchronized Zoom/Pan Across Selected Tiles
- Given two or more tiles are selected and Global scope is active When the user zooms via mouse wheel, trackpad pinch, or keyboard +/− Then all selected tiles adjust zoom by the same factor within ±1% tolerance and maintain the same focal point
- When the user pans by dragging or using arrow keys Then all selected tiles pan by the same pixel offset within ±1 px, clamped to image bounds; non-selected tiles do not move And continuous zoom/pan interactions render at an average ≥60 FPS (p95 ≥50 FPS) on supported hardware
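Keeping the focal point fixed while zooming is a small piece of math worth pinning down. A sketch, assuming a view model (illustrative, not the actual renderer) where `pan` is the image-space point at the viewport origin, so the screen position of image point p is (p - pan) * zoom:

```typescript
interface View { zoom: number; panX: number; panY: number; }

// Zoom about focal point (fx, fy) in image space: solve for the new pan
// that keeps the focal point at the same screen position, i.e.
//   (f - pan') * newZoom == (f - pan) * zoom
function zoomAboutPoint(v: View, newZoom: number, fx: number, fy: number): View {
  const panX = fx - ((fx - v.panX) * v.zoom) / newZoom;
  const panY = fy - ((fy - v.panY) * v.zoom) / newZoom;
  return { zoom: newZoom, panX, panY };
}
```

Broadcasting the resulting `View` to every selected tile gives the "same factor, same focal point" synchronization directly, rather than replaying input events per tile.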
Before/After Comparison Modes (Flip and Split)
- Given compare mode is Flip When the user toggles Before/After via the toolbar control or its keyboard shortcut Then the image state switches instantly between Before and After with toggle latency <50 ms and no reprocessing delay
- Given compare mode is Split When the user drags the split handle within a tile Then the handle moves smoothly and reveals Before on one side and After on the other, with the handle position persisted per tile in Per‑Tile scope and shared in Global scope And the handle snaps at 0%, 50%, and 100% when within 8 px proximity And split interactions render at ≥60 FPS average on supported hardware
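Since the handle position is a percentage but the snap threshold is 8 px, the snap check depends on tile width. A sketch of that conversion:

```typescript
// Snap the split handle to 0%, 50%, or 100% when within 8 px of the snap
// position, given the tile's rendered width in px; otherwise keep the
// dragged position.
function snapSplit(positionPct: number, tileWidthPx: number): number {
  const snapPoints = [0, 50, 100];
  for (const p of snapPoints) {
    const distPx = (Math.abs(positionPct - p) / 100) * tileWidthPx;
    if (distPx <= 8) return p;
  }
  return positionPct;
}
```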
Quick‑Compare Press‑and‑Hold
Given the After state is visible When the user press‑and‑holds the Quick Compare control (pointer down on the compare button or holding the assigned keyboard key) Then the affected tile(s) temporarily switch to the Before state for the duration of the hold and revert to the previous state within 50 ms of release And in Global scope only selected tiles are affected; in Per‑Tile scope only the focused tile is affected And Quick Compare uses no transitional animations; overlays remain visible and unchanged
Reset Compare Controls to Defaults
Given any non‑default compare settings are active (e.g., zoom not fit‑to‑tile, non‑zero pan, split mode on, flip active) When the user activates Reset Compare Then all tiles return to defaults: After state, Fit‑to‑Tile zoom, centered pan, Split mode off, Flip mode off, split handle hidden, overlays on, scope unchanged And the reset completes within 100 ms and produces no change to image edits or style presets
Keyboard and Pointer Parity for Compare Actions
Given the application window has focus When a user performs any compare action with a pointer (zoom, pan, flip, split slider, quick‑compare, reset) Then an equivalent keyboard interaction exists and is accessible via documented shortcuts And all compare controls are reachable by Tab order, show a visible focus indicator, and can be activated via Enter/Space as applicable And no compare functionality is pointer‑only; automated tests verify keyboard parity for each action
Overlay Consistency Across States and Compare Modes
Given crop guides and safe‑zone overlays are enabled When the user flips Before/After, uses the split slider, or performs zoom/pan in any scope Then overlay position, size, and opacity remain constant relative to the image content (alignment error ≤1 px at 100% zoom or ≤0.5% of tile dimension at other zoom levels) And overlays render above imagery in both states without flicker or duplication and never lag behind image transforms by more than one frame
Variant Pinning and Choice Locking
"As a seller narrowing options, I want to pin the best-looking variant so that I don’t lose it while testing other adjustments."
Description

Allow users to pin a preferred variant in any tile, freezing its settings and visual state while continuing to experiment with others. Indicate pinned status visually and prevent accidental overrides until explicitly unlocked. Support naming or tagging the locked choice and persisting it when navigating between pages or sessions. Expose a concise summary of locked parameters for auditability.

Acceptance Criteria
Pin Variant in Tile and Freeze State
Given the Preview Grid is loaded with at least five sample tiles and each tile has multiple generated variants When the user pins any variant within a tile via the pin control or context menu action Then the selected tile’s current image and parameter values (shadow, crop ratio, background tone, retouch strength, preset) are snapshotted and frozen And subsequent adjustments to any global toggle or preset do not alter the pinned tile’s visual output or stored parameters And the pin action completes within 300 ms from user input And any number of tiles can be pinned simultaneously without error
Pinned Status Indicator and Accessibility
Given a variant is pinned in a tile Then the tile displays a lock icon overlay and a distinct border style indicating pinned status And a tooltip labeled "Pinned" appears on hover or focus of the tile And the pinned indicator has a minimum 4.5:1 contrast ratio against the image/background And assistive technologies announce the state change via an ARIA label/state (e.g., aria-pressed or aria-selected equivalent) as "Pinned" And the indicator remains visible and accurate after grid refreshes, scrolling, or viewport resize
Prevent Overrides and Respect Pins in Batch Operations
- Given one or more tiles are pinned When the user attempts to modify a pinned tile’s parameters (e.g., change retouch strength, crop ratio) via per-tile controls Then the controls are disabled or the change is blocked for the pinned tile, and no parameter value is altered And a non-blocking toast appears within 500 ms stating the action was skipped due to a pin, with a direct "Unlock" action
- When the user applies a batch preset/reset to all tiles Then pinned tiles are excluded from the operation and a summary toast reports "Skipped N pinned tiles" with accurate count And no reprocessing is triggered for pinned tiles during batch actions
Name/Tag Locked Choice and Edit Constraints
Given a tile is pinned When the user adds or edits a tag for the pinned choice Then the input accepts 1–32 characters consisting of letters, numbers, spaces, hyphens, and underscores only And invalid characters are rejected with inline validation messaging within 200 ms And the tag is saved automatically within 300 ms of input blur or Enter And the tag is rendered on the tile and in details view without truncation up to 32 chars (ellipsis beyond) And removing a tag leaves the pin intact
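The tag rule above can be expressed as a single pattern. This sketch assumes "letters" means ASCII letters, which the criterion does not specify; a Unicode-aware product would widen the character class.

```typescript
// 1-32 characters: letters, numbers, spaces, hyphens, underscores only.
const TAG_PATTERN = /^[A-Za-z0-9 _-]{1,32}$/;

function isValidTag(tag: string): boolean {
  return TAG_PATTERN.test(tag);
}
```

Running this validator on each keystroke (or on blur) is what makes the inline validation messaging within 200 ms straightforward, since no server round-trip is involved.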
Persistence Across Navigation and Sessions
- Given one or more tiles are pinned (with optional tags) When the user navigates to another page within the app and returns to the Preview Grid, or refreshes the browser Then all pinned states, frozen images, and tags are restored exactly as saved
- When the user signs out and signs back in on the same account within 30 days Then all pinned states, frozen images, parameter snapshots, and tags persist for the associated catalog/batch And pinned state restoration for grids up to 100 tiles completes within 1 second on a typical broadband connection
Locked Parameters Summary for Auditability
Given a tile is pinned When the user opens the details/summary for the pinned tile Then the UI displays the exact frozen parameter values: shadow (on/off), crop ratio (numeric), background tone (code/name), retouch strength (0–100), and applied preset (name/version) And the summary includes timestamp of pin and user identifier And the values match the underlying stored snapshot with no rounding beyond one decimal place for numeric sliders And the user can copy the summary to clipboard in a single action
Explicit Unlock Flow, Confirmation, and Undo/Redo
Given a tile is pinned When the user selects Unlock via the lock control or context menu Then a confirmation appears once per session (until "Don’t ask again" is checked) and, upon confirm, the tile is unlocked within 300 ms And after unlocking, subsequent global or per-tile changes apply normally to that tile When the user performs Undo after an unlock action Then the tile returns to the pinned state with the original frozen image and parameters restored And Redo reapplies the unlock state
Batch Apply Selected Settings to Full Upload
"As a store owner, I want to apply my chosen look to all product photos at once so that I can finish edits quickly and keep my brand consistent."
Description

Provide a one-click action to apply the currently selected settings (or pinned variant parameters) to the entire batch or a selected subset. Validate conflicts (e.g., incompatible crop ratio for certain SKUs) and show an impact summary before committing. On confirm, update the processing job configuration, enqueue reprocessing, and display progress with the ability to cancel or revert. Ensure idempotency and record the chosen settings as a reusable style preset.

Acceptance Criteria
Apply Selected Settings to Entire Batch with Impact Summary
Given the user has selected settings or pinned variant parameters in the Preview Grid And a batch of N images is available When the user clicks "Apply to All" Then an impact summary modal is displayed containing: total_selected=N, compatible_count, conflict_count grouped by reason, parameter_deltas, estimated_duration_range, and estimated_storage_delta And the Confirm button remains disabled until the summary renders with all counts When the user confirms Then the processing job configuration is updated atomically with a new version number And one reprocessing job with a unique job_id is enqueued And a progress panel appears showing states (Queued, Processing, Finalizing) with item-level counts updating at least every 5 seconds
Apply Settings to Filtered Subset or Manual Selection
Given the user has filtered the grid or manually multi-selected K items (K ≥ 1) When the user clicks "Apply to Selection" Then the impact summary reflects total_selected=K and excludes unselected items And only the selected items are included in the updated job configuration And non-selected items remain unchanged after processing completes
Conflict Detection and Resolution for Incompatible Crop Ratios
Given one or more items are incompatible with the selected crop ratio or min-dimension requirements When conflicts are detected during pre-commit validation Then the impact summary lists conflict_count with reason codes (e.g., AR_LOCKED, MIN_DIM) and example SKUs And the user can choose a resolution policy: (a) Skip conflicted items [default], (b) Auto Adjust per SKU rule, or (c) Use Original crop And changing the policy updates the counts in the summary before confirmation When the user confirms Then the chosen policy is applied consistently, and each conflicted item receives a logged resolution outcome
Idempotency and Concurrency Control
Given the user triggers "Apply" and the client retries or the user clicks again within 60 seconds When duplicate requests with the same settings hash are received Then the backend returns the same job_id and does not enqueue duplicate jobs And the UI deduplicates and shows a single progress panel for that job When two different clients attempt to apply changes to the same batch concurrently Then optimistic concurrency prevents a stale write, and the second request receives a 409 with the latest config version to review
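A minimal sketch of the dedup-and-concurrency behavior, assuming a SHA-256 hash over canonicalized settings JSON; `BatchJobService` and its fields are illustrative, not the actual backend API:

```python
import hashlib
import json
import uuid


class BatchJobService:
    """Sketch of the dedup/concurrency rules above; names are illustrative."""

    def __init__(self, config_version: int = 1):
        self.config_version = config_version
        self._jobs_by_hash = {}   # settings hash -> job_id

    @staticmethod
    def settings_hash(settings: dict) -> str:
        # Canonical JSON so logically equal settings always hash the same
        canonical = json.dumps(settings, sort_keys=True)
        return hashlib.sha256(canonical.encode()).hexdigest()

    def apply(self, settings: dict, expected_version: int) -> dict:
        h = self.settings_hash(settings)
        if h in self._jobs_by_hash:
            # Duplicate request: return the existing job_id, enqueue nothing
            return {"status": 200, "job_id": self._jobs_by_hash[h]}
        if expected_version != self.config_version:
            # Optimistic concurrency: stale writer gets a 409 plus the latest version
            return {"status": 409, "latest_version": self.config_version}
        job_id = str(uuid.uuid4())
        self._jobs_by_hash[h] = job_id
        self.config_version += 1          # new config version committed
        return {"status": 202, "job_id": job_id}
```

Checking the settings hash before the version avoids rejecting a harmless client retry of an already-accepted request.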
Cancel and Revert Reprocessing
Given a reprocessing job is in progress When the user clicks Cancel Then the job is aborted and no additional items are updated And if any items were already updated, a Revert option is presented When the user clicks Revert within 24 hours of the job start Then all affected items and configuration are restored to the previous version within 5 minutes for batches up to 1,000 images And an audit log records cancel/revert actions with timestamps and actor
Save Chosen Settings as Reusable Style Preset
Given the user confirms applying selected settings Then those parameters (background tone, shadow, crop ratio, retouch strength, pinned variant parameters) are saved as a new preset with a generated default name and optional user override And the preset appears in the user's preset library within 5 seconds And presets are immutable and versioned; editing creates a new version When the user applies this preset to a future batch Then the settings populate instantly and can be batch-applied using the same flow
Non-Destructive Versioning & Undo History
"As a user experimenting with styles, I want non-destructive history so that I can safely explore options and roll back if needed."
Description

Track all preview-grid changes as non-destructive versions per asset, enabling undo/redo and revert-to-original without data loss. Store lightweight diffs and parameter sets rather than duplicating full images. Persist session state to recover work after refresh or reconnect, and allow exporting/importing the parameter set for reuse across projects.

Acceptance Criteria
Undo/Redo of Preview Grid Parameter Changes
- Given an asset open in Preview Grid with five samples visible, when the user changes any parameter (shadow, crop ratio, background tone, retouch strength) on any sample, then a new undo entry is recorded with parameter name, old value, new value, sampleId, assetId, timestamp, and userId.
- Given there are N undo entries, when the user invokes Undo stepwise, then the preview updates to each prior state in order, with visual update latency ≤ 300 ms per step if cached or ≤ 1000 ms per step if recomposition is required, and no image data is duplicated.
- Given the user has undone M steps, when Redo is invoked M times, then the resulting render is visually equivalent to the pre-undo state (SSIM ≥ 0.99, ΔE2000 ≤ 2.0) and the parameter sets are identical to their recorded values.
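The undo/redo contract above (parameter-level entries only, redo branch invalidated by a fresh edit) can be modeled roughly as follows; `UndoStack` is a hypothetical name, and the sketch stores only parameter values, never image data:

```python
class UndoStack:
    """Parameter-level undo/redo history; no image blobs are duplicated."""

    def __init__(self):
        self._undo, self._redo = [], []

    def record(self, params: dict, name: str, new_value):
        # Record (name, old, new) so both directions can be replayed
        self._undo.append((name, params.get(name), new_value))
        self._redo.clear()                 # a fresh edit invalidates the redo branch
        params[name] = new_value

    def undo(self, params: dict) -> bool:
        if not self._undo:
            return False
        name, old, new = self._undo.pop()
        params[name] = old
        self._redo.append((name, old, new))
        return True

    def redo(self, params: dict) -> bool:
        if not self._redo:
            return False
        name, old, new = self._redo.pop()
        params[name] = new
        self._undo.append((name, old, new))
        return True
```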
Revert to Original Without Data Loss
- Given an asset with one or more saved versions, when the user selects Revert to Original, then the original image and default parameters are restored while preserving the ability to Undo back to the previous state.
- Given a revert operation, when the change is persisted, then no full-resolution image blob is duplicated and the storage delta for the new version record is ≤ 5 KB (parameters + metadata only).
- Given the asset is reverted, when the user exports the current parameter set, then the export clearly indicates it is the "original" baseline (versionId, schemaVersion, createdAt) and contains no derived adjustments.
Session Persistence After Refresh/Reconnect
- Given an active editing session on an asset in the Preview Grid, when the browser is refreshed or the network disconnects and reconnects within 24 hours, then the last active version, full undo/redo stack, selected sample, and parameter values are restored automatically upon project reload.
- Given any parameter change, when 2 seconds of user inactivity elapse, then the session state is autosaved to durable storage and survives process restarts.
- Given the client was offline when changes were made, when connectivity is restored, then the queued diffs are synced exactly once in causal order with no lost or duplicated version entries.
Lightweight Diff Storage and Reconstruction Fidelity
- Given versioning uses parameter sets and lightweight diffs, when reconstructing any prior version from the original image, then the rendered output matches a fresh application of the same parameters with SSIM ≥ 0.99 and ΔE2000 ≤ 2.0 across the full image.
- Given a sequence of K parameter changes, when K versions are saved, then total additional storage consumed is ≤ 5 KB × K and at most one original image blob exists for the asset.
- Given a version record, when inspected, then it includes parentVersionId, schemaVersion, parameterSetHash, createdAt, and userId fields for auditability.
Export/Import Parameter Set Across Projects
- Given an asset version is active, when the user selects Export Parameters, then a portable JSON (≤ 10 KB) is downloaded containing schemaVersion, parameterSet, seed/randomness controls, and a checksum.
- Given a valid exported file, when Import Parameters is used in a different project and asset, then the parameters apply successfully and the resulting render is visually equivalent to the source (SSIM ≥ 0.99, ΔE2000 ≤ 2.0).
- Given an export with an unsupported schemaVersion, when import is attempted, then the user receives a clear, actionable message and the import is blocked without mutating the current asset.
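One plausible shape for the export/import flow, assuming the checksum is a SHA-256 over the canonical JSON body; the function names and the supported-schema set are assumptions, not the actual file format:

```python
import hashlib
import json

SUPPORTED_SCHEMAS = {1}   # assumption: current schemaVersion is 1


def export_parameters(parameter_set: dict, schema_version: int = 1) -> str:
    """Serialize a parameter set with a schemaVersion and integrity checksum."""
    body = {"schemaVersion": schema_version, "parameterSet": parameter_set}
    payload = json.dumps(body, sort_keys=True)
    body["checksum"] = hashlib.sha256(payload.encode()).hexdigest()
    return json.dumps(body, sort_keys=True)


def import_parameters(blob: str) -> dict:
    """Verify checksum and schema before returning parameters; never mutate on failure."""
    body = json.loads(blob)
    checksum = body.pop("checksum", None)
    payload = json.dumps(body, sort_keys=True)
    if checksum != hashlib.sha256(payload.encode()).hexdigest():
        raise ValueError("Checksum mismatch: file appears corrupted")
    if body["schemaVersion"] not in SUPPORTED_SCHEMAS:
        raise ValueError(f"Unsupported schemaVersion {body['schemaVersion']}")
    return body["parameterSet"]
```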
Version Timeline Visibility and Selection in Preview Grid
- Given an asset is open, when the user opens the Versions panel, then a chronological list of versions is shown with versionId, createdAt, author, and a compact diff summary (changed parameters) for each entry.
- Given a version is selected in the panel, when applied, then all five samples update consistently to reflect that version’s parameter set within 500 ms and the selection is reflected in the undo/redo stack.
- Given multiple versions exist, when hovering a version, then a quick preview renders without committing, and dismissing hover leaves the current state unchanged.
Accessibility & Keyboard-First Navigation
"As a keyboard and screen-reader user, I want complete access to preview and compare controls so that I can efficiently make decisions without a mouse."
Description

Implement full keyboard navigation of the grid and controls with logical tab order, focus indicators, and shortcuts for common actions (toggle before/after, adjust retouch strength, cycle crop ratios, pin/unpin). Provide ARIA roles, labels, and live region announcements for state changes (e.g., variant pinned, batch apply started). Ensure color-contrast and screen-reader compatibility for overlays and sliders, with motion-reduction options for users sensitive to animations.

Acceptance Criteria
Keyboard-Only Grid Navigation and Logical Tab Order
- Given the Preview Grid is loaded, When the user presses Tab from the page start, Then focus moves in this order: top action bar → grid view controls → grid tiles (left-to-right, top-to-bottom) → details panel controls (if present) → footer actions, with no dead-ends or skipped interactive elements.
- Given focus is on a grid tile, When the user presses Arrow Up/Down/Left/Right, Then focus moves to the adjacent tile and a screen reader announces "Tile X of Y" for the newly focused tile.
- Given any modal or overlay opens, When it appears, Then focus moves to the modal's first focusable element, is trapped within until closed, and returns to the triggering control on close.
- Given a control becomes disabled or hidden, When tabbing, Then it is removed from the tab sequence and skipped.
- Given the grid re-renders (e.g., after applying a crop), When the previously focused element no longer exists, Then focus is set to the nearest relevant element (same tile index or parent group) without dropping to the document body.
Visible Focus Indicators on All Interactive Elements
- Given any interactive element is focused via keyboard, Then a 2px focus indicator with at least 3:1 contrast against adjacent colors is clearly visible and not clipped by container overflow.
- Given focus is on an image tile, Then a focus ring or overlay ensures 3:1 contrast regardless of the image content beneath.
- Given the app is in light or dark theme, Then the focus indicator maintains the 3:1 contrast requirement in both themes.
- Given the browser zoom is set to 200%, Then the focus indicator remains at an effective thickness of ≥2px and fully visible around the element.
Shortcut Keys for Toggle, Pin, Retouch, and Crop Ratio
- Given a grid tile is focused and no text input is active, When the user presses B, Then the tile toggles Before/After preview and an aria-live polite message announces "Before view" or "After view".
- Given a grid tile is focused, When the user presses P, Then the variant is pinned/unpinned within 100 ms, the pin icon reflects the state, and aria-live announces "Pinned" or "Unpinned".
- Given retouch strength is adjustable for the focused tile, When the user presses ] or [, Then the value increases/decreases by 5 within bounds 0–100, the numeric value updates, aria-valuenow matches the value, and holding the key repeats at ~5 steps/second.
- Given multiple crop ratios exist (e.g., 1:1, 4:5, 16:9), When the user presses C, Then the next ratio is applied; When Shift+C is pressed, Then the previous ratio is applied; aria-live announces the new ratio label.
- Given focus is inside a text input or while dragging a slider, When shortcut keys are pressed, Then no shortcut action is triggered to avoid conflicts.
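The shortcut rules, including the guard against firing while a text input is active, reduce to a small lookup table; the key names and action identifiers here are illustrative, not a defined API:

```python
# Hypothetical shortcut table modeled on the criteria above.
# Keys are (lowercased key, shift pressed).
SHORTCUTS = {
    ("b", False): "toggle_before_after",
    ("p", False): "toggle_pin",
    ("]", False): "retouch_up_5",
    ("[", False): "retouch_down_5",
    ("c", False): "next_crop_ratio",
    ("c", True):  "previous_crop_ratio",   # Shift+C cycles backwards
}


def dispatch_shortcut(key: str, shift: bool, text_input_active: bool):
    """Return the action for a keypress, or None when focus is in a text input."""
    if text_input_active:
        return None                        # never trigger shortcuts while typing
    return SHORTCUTS.get((key.lower(), shift))
```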
ARIA Roles, Labels, and Live Announcements for State Changes
- Given interactive components are rendered, Then tiles expose role="group" or role="button" as appropriate with accessible names (e.g., "Sample 3"), buttons have aria-labels or visible labels, and toggleable controls use aria-pressed to reflect state.
- Given sliders are present, Then each uses role="slider" with aria-label or aria-labelledby and correct aria-valuemin, aria-valuemax, and aria-valuenow reflecting the current value.
- Given state changes occur (pin/unpin, before/after toggled, crop ratio changed, batch apply started/completed/failed), Then a single aria-live="polite" (or role="status") region announces a concise, non-duplicated message within 500 ms (e.g., "Pinned sample 2", "Batch apply started for 5 items", "Batch apply complete: 5 succeeded, 0 failed").
- Given page structure is defined, Then landmark roles (header, main, complementary, contentinfo) are present for efficient screen reader navigation.
Screen Reader Operability of Sliders and Modals
- Given the retouch strength slider has focus, When Arrow Left/Right (or Up/Down) are pressed, Then the value changes by 1; When PageUp/PageDown are pressed, Then the value changes by 10; When Home/End are pressed, Then the value jumps to min/max; each change is announced as a percentage.
- Given any slider change occurs via keyboard or mouse, Then aria-valuenow updates immediately and the visible numeric value stays in sync.
- Given a modal or side panel opens, Then it has aria-modal="true", is labelled by an element referenced via aria-labelledby, provides an accessible description if needed via aria-describedby, and pressing Escape closes it.
- Given a modal closes, Then focus returns to the element that opened it.
Motion Reduction Preference and In-App Toggle
- Given the user has OS/browser prefers-reduced-motion enabled, Then non-essential animations (transitions, parallax, wipes) are disabled (0 ms) and essential feedback animations are reduced to ≤100 ms fade with no motion.
- Given the user toggles "Reduce animations" in app settings, Then the preference takes effect immediately across the grid and controls, persists across sessions, and overrides default motion settings.
- Given before/after comparisons are shown, Then with motion reduction enabled they switch instantly with no sliding wipe effects.
Overlay and Control Color Contrast on Image Tiles
- Given text and icons appear over images, Then all text meets WCAG AA contrast ≥4.5:1 and UI components (icons, controls, focus rings) meet ≥3:1 against their immediate background, achieved via dynamic scrims or alternative styling as needed.
- Given hover, active, and focus states of controls, Then the contrast ratios remain at or above their respective thresholds.
- Given the interface is zoomed to 200% and in both light and dark themes, Then the stated contrast ratios are maintained.

Consistency Meter

Real‑time score that measures uniformity across your sample set and predicts how well the preset will generalize to the rest of your catalog. Flags outliers and offers quick fixes (e.g., adjust crop margin by 2%) to secure a cohesive, brand‑true finish.

Requirements

Real-time Uniformity Scoring Engine
"As an online seller, I want a real-time consistency score while configuring my preset so that I can quickly see whether my catalog will look cohesive before I process the full batch."
Description

Implements a streaming analyzer that computes a 0–100 Consistency Score and per-metric sub-scores (e.g., crop margin, subject centering, aspect ratio, background uniformity, exposure, white balance, shadow hardness, color cast, edge cleanliness, resolution consistency, compression artifacts, and style similarity to the selected preset). Scores update instantly as users upload samples, tweak preset parameters, or change the sample set. Reuses shared image features from PixelLift’s preprocessing to minimize recomputation, exposes an event bus for UI updates, and supports idempotent rescoring on partial batches. Allows brand-specific thresholds and weighting profiles, and handles missing or corrupted images gracefully.

Acceptance Criteria
Real-time Score Update and Event Publication
Given a project with Consistency Meter visible and an active sample set When (a) an image upload completes and preprocessing finishes, (b) a preset parameter is changed, or (c) an image is added or removed from the sample set Then the overall Consistency Score (0–100, integer) and all per‑metric sub‑scores are recalculated and published on the scoring event bus within 300 ms of the triggering change And Then the published message includes: projectId, sampleSetVersion (monotonic), changeType ∈ {uploadComplete, presetChanged, sampleSetChanged}, overallScore, subScores[{metricId, value 0–100, unit}], timestamp (ISO‑8601), dedupKey And Then events for a given project are delivered in ascending sampleSetVersion order to subscribed clients
Comprehensive Per‑Metric Sub‑Scores and Overall Calculation
Given any scoring run Then sub‑scores are produced for each metric: crop margin, subject centering, aspect ratio, background uniformity, exposure, white balance, shadow hardness, color cast, edge cleanliness, resolution consistency, compression artifacts, style similarity to the selected preset And Then each sub‑score is an integer in [0,100], where 100 = perfect adherence to the preset/brand target And Then the overall Consistency Score equals the weighted arithmetic mean of the sub‑scores using the active weighting profile; difference between published and recomputed overall score ≤ 0.5 points And Then with identical inputs, repeated runs produce identical sub‑scores and overall score
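The weighted arithmetic mean described above can be computed as follows; the metric names and weights in the example are made up for illustration:

```python
def overall_score(sub_scores: dict, weights: dict) -> int:
    """Weighted arithmetic mean of per-metric sub-scores (0-100), rounded to an integer."""
    total_weight = sum(weights.get(m, 0) for m in sub_scores)
    if total_weight == 0:
        raise ValueError("weighting profile assigns no weight to any metric")
    weighted = sum(sub_scores[m] * weights.get(m, 0) for m in sub_scores)
    return round(weighted / total_weight)


# Hypothetical profile: crop margin counts double
scores = {"crop_margin": 90, "exposure": 70, "white_balance": 80}
weights = {"crop_margin": 2, "exposure": 1, "white_balance": 1}
```

With identical inputs the function is deterministic, which satisfies the reproducibility clause above.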
Feature Reuse and Incremental Scoring Efficiency
Given N images with cached preprocessing features When only preset weights or thresholds change Then zero image features are recomputed and only aggregation runs; end‑to‑end score update latency ≤ 150 ms for N ≤ 200 on reference hardware When exactly k images in the sample set change content or features (k ≤ N) Then only those k images are rescored; total recomputed images = k; cache hit rate for unaffected images ≥ 90% And Then average CPU time for rescoring after weight‑only changes is ≤ 20% of a cold scoring run baseline for the same N
Idempotent Rescoring on Partial Batches
Given a partial batch upload where some images are pending, retried, or duplicated by the client When a rescoring request with the same inputs (projectId, sampleSetVersion, changeSet hash) is processed multiple times Then the engine produces identical overall and sub‑scores, emits the same dedupKey, and increments sampleSetVersion at most once And Then a rescoring trigger that results in no score changes produces no new event and no version increment
Brand Thresholds and Weighting Profiles
Given Brand A and Brand B with distinct thresholds and weighting profiles applied to the same sample set When scoring is executed under each brand profile Then sub‑scores and the overall score reflect the respective weights and thresholds, and the published overall score difference matches the recomputed difference within 0.5 points And Then when no brand profile is specified, the system applies the default profile and records profileId in the event payload And Then updating a brand profile’s weights takes effect on the next scoring and is published within 300 ms of the change
Outlier Flagging with Actionable Quick Fixes
Given active brand thresholds for each metric When an image’s metric deviates beyond its threshold by Δ Then the image is flagged as an outlier for that metric and the deviation amount and direction are included in the result payload And Then the engine provides at least one quick‑fix suggestion containing: parameterName, suggestedDelta within allowed range (e.g., cropMargin:+2%), and expected score improvement (Δscore) And Then applying the suggestion via API triggers rescoring and increases the targeted metric sub‑score by ≥ 80% of the predicted Δscore on a controlled test set
Graceful Handling of Missing or Corrupted Images
Given an input sample set containing missing or corrupted images When scoring runs Then affected images are skipped, marked with errorCode ∈ {MissingImage, CorruptImage, UnsupportedFormat, Timeout}, and the remaining images continue processing And Then the overall score is computed using only successfully processed images and includes completenessRatio = processedCount/totalCount in the event payload And Then the API returns a multi‑status result (e.g., 207) or equivalent, and no unhandled exceptions or crashes occur
Preset Generalization Predictor
"As a boutique owner, I want the meter to predict how well my chosen preset will generalize to the rest of my catalog so that I can avoid rework and ensure brand consistency across new uploads."
Description

Forecasts how a chosen style preset will perform on the remainder of the catalog by modeling variance in the sample set, product category metadata, and historical outcomes. Produces confidence bands (e.g., high/medium/low) and expected failure modes (e.g., dark fabrics underexposed, reflective items with harsh shadows). Requires a minimum diverse sample size, highlights underrepresented categories, and suggests additional samples to improve prediction quality. Integrates with the scoring engine to run quick what‑if simulations as the user adjusts preset parameters.

Acceptance Criteria
Real‑time Confidence Band Display
- Given a sample set that meets the minimum diversity threshold, when a user selects a style preset, then the predictor outputs a numeric confidence score (0–100) and a band mapped as: High ≥ 80; Medium 50–79; Low < 50.
- When the user changes any preset parameter, the confidence score and band refresh within 700 ms for sample sets ≤ 500 images and within 1.5 s for ≤ 2,000 images.
- The displayed score and band remain consistent after page refresh and when re-opening the project (persisted in project state).
- The confidence band tooltip lists contributing factors including sample variance, category coverage, and historical performance weights.
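The band thresholds (High ≥ 80, Medium 50–79, Low < 50) translate to a trivial mapping, sketched here:

```python
def confidence_band(score: float) -> str:
    """Map a 0-100 confidence score to a band per the thresholds above."""
    if score >= 80:
        return "High"
    if score >= 50:
        return "Medium"
    return "Low"
```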
Expected Failure Modes Enumeration
- When predicting, the system returns at least the top 3 expected failure modes with human-readable labels and example thumbnails sourced from the sample.
- Each failure mode includes an estimated prevalence percentage (0–100%) and severity levels mapped as: High ≥ 70th percentile severity, Medium 40–69th, Low < 40th.
- For each failure mode, at least one auto-fix suggestion is provided (e.g., exposure +0.2 EV, crop margin +2%) and is previewable on click.
- On a labeled holdout set (n ≥ 200), ≥ 70% of predicted failure modes correspond to observed issues after applying the preset.
- Failure modes and metadata are available via API at GET /v1/predictions/{id}.
Minimum Diverse Sample Size Enforcement
- The predictor requires a minimum of 24 images spanning ≥ 3 product categories with no single category > 60% of the sample; thresholds are configurable per workspace.
- If the minimum is not met, the Predict action is disabled and an inline message lists unmet conditions and the exact additional images required per category.
- The system suggests up to 3 categories to add with required counts and provides a CTA to upload or select from catalog.
- Once requirements are satisfied, the Predict action enables without manual refresh, and the first prediction completes within 1 s for ≤ 500 images.
Underrepresented Category Highlighting and Sample Suggestions
- Category coverage is computed as sample share divided by catalog share; categories with coverage index < 0.5 are flagged as underrepresented.
- Underrepresented categories are shown in the UI with a badge and a recommendation "Add N images" where N is computed to reach coverage index ≥ 0.8.
- Clicking the recommendation opens the uploader pre-filtered to the category; after adding images, coverage recalculates and flags clear within 1 s.
- A downloadable CSV report lists categories, coverage index, required N, and current counts.
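The coverage-index arithmetic can be made concrete. In this sketch, `images_needed` solves for the smallest N that lifts a category's sample share to the target index, under the assumption that all N added images belong to that one category; the function names are illustrative:

```python
import math


def coverage_index(sample_share: float, catalog_share: float) -> float:
    """Coverage = sample share / catalog share (per the criterion above)."""
    return sample_share / catalog_share if catalog_share else float("inf")


def images_needed(sample_count: int, total_sample: int, catalog_share: float,
                  target_index: float = 0.8) -> int:
    """Smallest N such that (sample_count + N) / (total_sample + N) reaches
    target_index * catalog_share. Assumes added images all go to this category."""
    target_share = target_index * catalog_share
    if total_sample == 0:
        return 0
    if coverage_index(sample_count / total_sample, catalog_share) >= target_index:
        return 0
    # Solve (c + N) / (t + N) >= s  =>  N >= (s*t - c) / (1 - s), valid while s < 1
    n = (target_share * total_sample - sample_count) / (1 - target_share)
    return math.ceil(n)
```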
What‑If Simulation Integration
- When a user adjusts a preset parameter (e.g., crop margin +2%, exposure +0.2 EV), a what‑if simulation runs and returns updated confidence score, band, and Consistency Meter delta (± points) within 800 ms for ≤ 500 images.
- A side‑by‑side comparison view shows baseline vs simulated metrics and the top 5 images with greatest predicted change.
- Toggling off the simulation or clicking Reset restores baseline metrics instantly (<150 ms) with no state persisted.
- No database writes occur until the user clicks Apply; event logs include parameter changes and deltas.
Outlier Detection and Quick Fixes
- Outliers are defined as images with predicted post‑process quality > 2.0 standard deviations below the sample mean or in the lowest 10th percentile; the system lists count and thumbnails.
- Each outlier includes at least one quick fix with estimated impact (predicted score change) and a one‑click preview.
- Batch apply is available for outliers sharing feature similarity cosine ≥ 0.8; processing completes within 2 s for up to 100 images.
- After applying a quick fix (preview or batch), the predictor recalculates metrics for affected images and updates the overall confidence score within 700 ms.
Outlier Detection & Smart Suggestions
"As a product photographer, I want outliers flagged with specific, actionable suggestions so that I can fix the few problem images without manually hunting for them."
Description

Identifies per-metric outliers within the sample set using robust statistics and learned thresholds, visually flags them, and generates targeted, parameter-level suggestions (e.g., increase crop margin by 2%, shift white balance +250K, reduce shadow intensity by 10%). Quantifies expected score lift for each suggestion and allows users to apply fixes per-image or across the preset. Ensures suggestions remain within brand guardrails and clearly label any trade-offs.

Acceptance Criteria
Detect Per‑Metric Outliers in Sample Set
- Given a labeled sample set (>=30 images) with known per-metric outliers (crop margin, white balance, shadow intensity), when the Consistency Meter runs, then the system flags outliers per metric on image tiles and lists them in the Outliers panel.
- Then per-metric detection achieves >=90% precision and >=85% recall versus ground truth on the set.
- Then thresholds are computed via robust statistics (median + k*MAD) combined with learned per-metric offsets, and the effective threshold value is shown in a tooltip.
- Then each flag stores metric name, measured value, deviation in MAD units, and timestamp for auditability.
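A minimal median + k·MAD detector matching the robust-statistics criterion; the learned per-metric offsets mentioned above are omitted here, and `k` is a tunable constant:

```python
from statistics import median


def mad_outliers(values, k=3.0):
    """Return indices whose deviation from the median exceeds k * MAD.

    MAD (median absolute deviation) is robust to the very outliers
    being hunted, unlike mean/standard deviation.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []   # degenerate: more than half the values are identical
    return [i for i, v in enumerate(values) if abs(v - med) / mad > k]
```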
Visual Flagging and Drill‑Down
- Given flagged images, when a user hovers a flag, then a tooltip shows: metric, image value, sample median, deviation (MAD units).
- When the user clicks a flag, then a side panel opens with a histogram of the sample distribution and the image’s position highlighted.
- Then flag icons render within 500 ms after analysis completion for a 100-image sample set.
- Then each flag control has an accessible aria-label: "Outlier: {metric}" and is keyboard focusable.
Generate Parameter‑Level Smart Suggestions
- For each flagged metric, the system generates 1–3 parameter-level suggestions with explicit numeric adjustments (e.g., crop margin +2%, white balance +250K, shadow intensity −10%).
- Each suggestion displays an expected Consistency score lift (+X points) with an 80% confidence interval and a model confidence score (0–1).
- Suggestions that would violate brand guardrails are not shown; instead, a note states "Blocked by guardrail" with the specific rule.
- Each suggestion lists any trade-offs (e.g., "May reduce naturalness by 1–2 pts") next to the lift value.
- Numeric adjustments respect per-metric step sizes (crop: 0.5%, WB: 50K, shadows: 1%) and rounding rules.
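Snapping a raw model suggestion to the per-metric step sizes (crop 0.5%, WB 50 K, shadows 1%) might look like this; the metric keys are hypothetical:

```python
# Per-metric step sizes from the criterion above; keys are illustrative names.
STEP_SIZES = {"crop_margin_pct": 0.5, "white_balance_k": 50, "shadow_intensity_pct": 1}


def snap_adjustment(metric: str, raw_delta: float) -> float:
    """Round a raw suggested delta to the nearest allowed increment for its metric."""
    step = STEP_SIZES[metric]
    # Outer round(..., 10) trims float noise from the multiplication
    return round(round(raw_delta / step) * step, 10)
```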
Apply Fixes Per‑Image and Across Preset
- Given a flagged image, when the user clicks "Apply fix" on a suggestion, then only that image is re-rendered with the adjustment and its Consistency score is recalculated.
- When the user selects "Apply to preset", then the preset parameter updates and all images in the sample set re-evaluate; for 100 images, updated flags and scores appear within 3 seconds.
- Batch apply provides an Include/Exclude filter defaulted to images matching the outlier condition; changes affect only the filtered set.
- All apply actions are undoable in one step; selecting Undo restores the prior preset and image states with matching scores.
- Partial failures surface an error list with retry for failed images; successful applies remain intact.
Expected Score Lift Calculation Validity
- Given a held-out validation subset (>=50 images), when applying the top suggestion in dry-run evaluation, then predicted lift is within ±2 Consistency points of realized lift for >=80% of images.
- When two suggestions are stacked, then cumulative lift prediction error is <=30% relative for >=70% of cases.
- If model confidence <0.5, then expected lift displays "N/A" and the suggestion is tagged Low Confidence.
- The lift detail panel shows model version and validation dataset timestamp.
Brand Guardrails Enforcement
- Suggestions never propose values outside brand guardrails defined in the active brand profile; attempts are blocked with the message: "Blocked by guardrail: {rule}".
- Guardrails are enforced for both per-image and preset-level applies, including batch operations.
- Overrides require a user with the "Brand Admin" role; otherwise the override control is disabled.
- All blocked or overridden actions are logged with user, rule, old/new values, and timestamp in the audit trail.
One-click Auto-tune Preset
"As a time-pressed seller, I want one-click fixes that auto-tune the preset based on the meter’s guidance so that I can lock in a cohesive look with minimal effort."
Description

Applies selected meter suggestions to the active preset in a single action, creates a versioned draft, updates previews, and triggers automatic rescoring. Supports scoped application (entire batch, subset, or single image), instant undo, and side-by-side before/after comparison. Enforces safe bounds, respects locked brand settings, and writes an audit trail of parameter changes to support review and rollback.

Acceptance Criteria
One‑Click Apply Selected Meter Suggestions
Given an active preset with one or more Consistency Meter suggestions visible and at least one suggestion selected, When the user clicks "Auto‑tune Preset", Then the system creates a new draft preset version with an incremented version tag (e.g., vN+1) without altering the published preset, And applies only the selected suggestions to the draft parameters exactly as specified (including units), And validates all changes against schema and constraints prior to save; any invalid change is rejected with a field‑level error and no partial save occurs, And updates on‑canvas previews for the current scope within 2 seconds of save, And enqueues an automatic Consistency Meter rescore for the same scope within 1 second of save, And displays a success confirmation including the new draft version identifier.
Scoped Application (Batch, Subset, Single Image)
Given the user has chosen a scope (Entire batch | Selected subset | Current image), When "Auto‑tune Preset" is executed, Then only images within the chosen scope have previews and sidecar edits updated, And images outside the scope remain unchanged, And the Consistency Meter rescore job is created for the chosen scope only, And the UI reflects the active scope via a visible scope badge, And a progress indicator reports counts for total, completed, and failed items within the scope until processing completes.
Respect Locked Brand Settings and Safe Bounds
Given one or more brand settings are locked and safe bounds exist for all tunable parameters, When Auto‑tune attempts to modify any locked parameter, Then the locked parameters remain unchanged, And those suggestions are skipped with a non‑blocking notice listing each skipped field and reason "locked", And all applied parameter values are clamped within defined min/max bounds; any clamping is recorded and surfaced as an informational note, And the operation completes without errors and with remaining applicable suggestions applied.
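The skip-locked and clamping rules above can be sketched as follows; all names are hypothetical:

```python
def auto_tune(params, suggestions, locked, bounds):
    """Apply meter suggestions to preset parameters, skipping locked fields
    and clamping each applied value into its safe (min, max) bounds.
    Returns (new_params, skipped, clamped) so the UI can surface the
    non-blocking notice and informational clamp notes."""
    new_params, skipped, clamped = dict(params), [], []
    for field, value in suggestions.items():
        if field in locked:
            skipped.append((field, "locked"))
            continue
        lo, hi = bounds[field]
        v = min(max(value, lo), hi)  # clamp into safe bounds
        if v != value:
            clamped.append((field, value, v))
        new_params[field] = v
    return new_params, skipped, clamped
```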
Instant Undo of Auto‑tune
Given Auto‑tune has been applied to a draft in the current session, When the user clicks Undo, Then all parameter changes from the last Auto‑tune action are reverted atomically to the prior values, And previews revert to the prior state within 2 seconds, And a Consistency Meter rescore is triggered for the reverted state within 1 second, And the activity log records an undo entry referencing the reversed change set and resulting draft version.
Side‑by‑Side Before/After Comparison
Given the user has enabled side‑by‑side comparison, When Auto‑tune is applied, Then the "Before" pane displays the exact pre‑apply render frozen at the moment before changes, And the "After" pane displays the updated draft render, And zoom and pan remain synchronized between panes, And each pane is labeled with timestamp and version (Before: vN, After: vN+1).
Audit Trail of Parameter Changes with Versioning
Given Auto‑tune modifies the draft preset, When the draft is saved, Then an audit entry is created containing: ISO‑8601 timestamp, actor (user ID), action "auto_tune_apply", preset ID, new draft version tag, scope (type and counts), list of parameter deltas (field, previous value, new value, units), suggestion IDs applied, suggestions skipped with reason (locked/safe_bound_violation), and rescore job ID, And the audit entry is immutable and retrievable via Activity Log UI and API endpoint, And rollback can target this audit entry to restore the prior version in one action.
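A minimal sketch of the audit-entry payload, using the field names the criteria call out; the builder function itself is illustrative:

```python
from datetime import datetime, timezone

def audit_entry(user_id, preset_id, version, scope, deltas, applied, skipped, rescore_job):
    """Build an audit record with the fields required by the criteria.
    In a real system this would be persisted immutably and exposed via
    the Activity Log UI and API."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # ISO-8601
        "actor": user_id,
        "action": "auto_tune_apply",
        "preset_id": preset_id,
        "draft_version": version,
        "scope": scope,                     # e.g. {"type": "subset", "count": 12}
        "deltas": deltas,                   # [(field, previous, new, units), ...]
        "suggestions_applied": applied,     # suggestion IDs
        "suggestions_skipped": skipped,     # [(id, "locked" | "safe_bound_violation"), ...]
        "rescore_job_id": rescore_job,
    }
```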
Outlier Reduction Feedback and Quick Fix Application
Given the Consistency Meter flags outliers and provides quick‑fix suggestions, When the user selects those suggestions and runs Auto‑tune, Then the subsequent rescore presents an updated score and delta versus prior for the scoped items, And previously flagged outliers are re‑evaluated and marked resolved or unresolved with counts displayed, And any unresolved outliers present follow‑up actionable suggestions, And the success notification summarizes score delta and count of outliers resolved.
Meter UI & Drilldown Dashboard
"As a user, I want a clear meter UI with drilldowns and previews so that I can understand why the score is low and confidently apply targeted adjustments."
Description

Provides a responsive UI component showing the primary Consistency Score, sub-score bars, and status badges with hover tooltips and plain-language explanations. Includes a drilldown modal with sortable thumbnails, per-metric filters, and a diff view to compare suggested adjustments. Offers keyboard navigation, accessible labels, and clear empty/error/loading states. Integrates into the batch upload flow and preset editor without blocking other actions and supports localization.

Acceptance Criteria
Primary Score and Sub-score Display (Responsive)
Given a processed sample set and an active preset When the Meter UI is rendered on a desktop viewport (width ≥ 1280px) Then the primary Consistency Score (0–100) and five labeled sub-score bars are visible without horizontal scrolling
Given the same context When the viewport is mobile (width ≤ 480px) Then the primary score and sub-scores reflow into a single-column layout with no visual overlap and essential labels remain readable
Given an adjustment to the active preset settings When the change is applied Then the primary and sub-scores visually update within 1000 ms
Status Badges and Tooltips with Plain-Language Explanations
Given sub-score thresholds configured for Good, Needs Review, and Outlier When a sub-score meets a threshold Then the correct status badge is shown with the mapped label and color and the mapping is consistent across views
Given a status badge is hovered or keyboard-focused When the tooltip is triggered Then a tooltip appears within 200 ms containing a one-sentence explanation, the relevant threshold, and a suggested quick fix
Given a tooltip is open When focus moves away or Escape is pressed Then the tooltip dismisses and focus returns to the triggering element
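The threshold-to-badge mapping might look like this; the 80/60 cut-offs are illustrative defaults, not values specified by the criteria:

```python
def badge_for(score, good=80, review=60):
    """Map a sub-score (0-100) to a status badge. Thresholds are assumed
    defaults; the criteria only require that the mapping be configurable
    and consistent across views."""
    if score >= good:
        return "Good"
    if score >= review:
        return "Needs Review"
    return "Outlier"
```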
Drilldown Modal with Sortable Thumbnails and Per-Metric Filters
Given the user clicks View details from the Meter UI When the drilldown opens Then a modal overlays the page without navigation and initial focus is set to the modal header
Given thumbnails with per-image scores are displayed When the user clicks a column header (Score, File name, Time) Then the grid sorts by that column and toggles between ascending and descending on subsequent clicks
Given metric and status filters are available When the user selects a metric and a status Then only matching thumbnails remain visible and the filter chips reflect the active filters
Given the user closes the modal When the modal is dismissed Then the underlying page scroll position and state are preserved
Diff View to Compare Suggested Adjustments
Given a thumbnail is selected in the drilldown When the user opens Diff view Then a side-by-side before and after is shown with a comparison slider and metric deltas
Given quick fix suggestions are available for the selected image When the user applies a suggested adjustment Then the after preview updates and metric deltas recalculate within 1000 ms and an undo option becomes available
Given the user cancels or undoes the change When the action is triggered Then the view reverts to the original state without residual visual artifacts
Keyboard Navigation and Accessibility Compliance
Given the Meter UI and drilldown are present When navigating by keyboard Then all interactive elements are reachable via Tab and Shift+Tab with a visible focus indicator and Enter or Space activates the focused control
Given the drilldown modal is open When navigating Then focus is trapped within the modal and Escape closes the modal and focus returns to the previously focused trigger
Given assistive technology users interact When screen readers announce elements Then the primary score, sub-scores, badges, and tooltips have accessible names and roles, and color contrast for text and essential indicators is at least 4.5:1
Empty, Error, and Loading States
Given no sample images are available When the Meter UI loads Then an empty state appears with a concise message and a primary action to add images and no errors are shown
Given metrics are being computed When the user opens the Meter UI Then skeleton placeholders are shown until data arrives and no layout shift exceeds 100 px during loading
Given an error occurs while fetching or computing metrics When the UI receives an error Then a non-technical message with an error reference code and a Retry action is displayed and the app remains responsive
Non-blocking Integration and Localization Support
Given the user is in the batch upload flow or preset editor When the Meter UI mounts Then Upload, Save Preset, and Apply Preset actions remain available and responsive and meter computation runs asynchronously
Given the user navigates away mid-computation When the view changes Then computation is canceled or safely paused and no blocking dialogs prevent navigation and returning restores the last known state
Given the user switches locale between en-US, fr-FR, and es-ES (and RTL ar) When the locale change is applied Then all visible strings come from localization files, numbers and dates format per locale conventions, RTL layouts mirror correctly, and no hard-coded text remains
Low-latency Incremental Scoring at Scale
"As a team lead, I want the meter to update quickly as images stream in so that my team doesn’t stall during large batch uploads."
Description

Meets real-time performance targets with initial scoring in ≤2 seconds for up to 50 images and ≤200 ms incremental updates per additional image at the 95th percentile. Uses batched inference, background workers, and cached features to minimize latency and compute cost. Degrades gracefully for very large sets, providing progressive results and queue status indicators. Includes telemetry, rate limiting, and backpressure controls to ensure stability under load.

Acceptance Criteria
P95 Initial Scoring ≤2s for First 50 Images
Given a user uploads up to 50 images and selects a style preset When the Consistency Meter starts scoring Then the first overall score and all per-image scores are displayed within 2,000 ms at the 95th percentile over at least 500 runs in a warm-service environment And fewer than 1% of runs exceed 5,000 ms And a visible progress indicator is shown until scores appear
P95 Incremental Update ≤200ms Per Additional Image
Given an already-scored set of 50 images When 1 to 10 additional images are added Then the overall score updates and the new image(s) receive scores with server-side latency ≤200 ms per image at P95, measured from enqueue to publish, across ≥500 runs And client UI renders the updated score(s) within 300 ms P95 from server publish And updates are streamed without requiring a page refresh
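The P95 figures above can be computed with a nearest-rank percentile over the measured latency samples; a minimal sketch:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile, a common choice for latency SLO reporting.
    This is a sketch; production telemetry would typically use streaming
    quantile estimators instead of sorting full sample sets."""
    ordered = sorted(samples)
    rank = math.ceil(len(ordered) * p / 100)  # 1-indexed nearest rank
    return ordered[max(rank - 1, 0)]

def meets_incremental_slo(latencies_ms, budget_ms=200):
    """True when P95 of enqueue-to-publish latency is within budget."""
    return percentile(latencies_ms, 95) <= budget_ms
```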
Progressive Results and Queue Status for Large Uploads
Given an upload of 5,000 images When scoring begins Then the UI displays progressive results with at least the first 50 scores within 3,000 ms and subsequent updates at intervals ≤2,000 ms And a queue status indicator shows items processed/total, current throughput (images/sec), and ETA, updated at least every 2,000 ms And no update gap exceeds 5,000 ms until processing completes
Telemetry and Alerts for Latency and Stability SLOs
Given production load When telemetry is collected Then metrics are emitted for initial and incremental latency (P50/P95/P99), queue depth, batch size distribution, cache hit rate, worker utilization, and error rates at 10-second resolution And dashboards visualize these metrics with 1-minute and 5-minute windows And alerts trigger when P95 initial latency >2,000 ms or P95 incremental latency >200 ms for 5 consecutive minutes, or 5xx error rate >0.5% for 5 minutes
Rate Limiting and Backpressure Under Multi-tenant Load
Given 100 tenants concurrently uploading totaling ≥20,000 images at ~200 requests/sec When a tenant exceeds its configured limit Then the API returns HTTP 429 with a Retry-After header and the excess work is queued or rejected without data loss And tenants under their limits continue to meet latency SLOs (P95 initial ≤2,000 ms; P95 incremental ≤200 ms) And system 5xx error rate remains ≤0.5% and worker CPU utilization ≤85%
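The 429-with-Retry-After behavior can be sketched as a per-tenant token bucket; the class name, rates, and header handling are illustrative:

```python
import time

class TenantLimiter:
    """Per-tenant token bucket (sketch). Requests over the configured rate
    get HTTP 429 plus a Retry-After header; tenants under their limits are
    unaffected, matching the multi-tenant criteria."""
    def __init__(self, rate_per_sec, burst):
        self.rate, self.burst = rate_per_sec, burst
        self.tokens, self.last = float(burst), time.monotonic()

    def check(self):
        now = time.monotonic()
        # refill proportionally to elapsed time, capped at the burst size
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return 200, {}
        retry_after = (1 - self.tokens) / self.rate  # seconds until a token is free
        return 429, {"Retry-After": str(max(1, round(retry_after)))}
```

Excess work would then be queued (or rejected without data loss) rather than dropped, so per-tenant fairness holds while overall latency SLOs are preserved.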
Cached Features Accelerate Re-scoring
Given a previously scored batch is re-scored with the same preset within 15 minutes When scoring is re-triggered Then time to first 50 scores decreases by ≥60% compared to a cache-cold baseline measured the same day And cache hit rate for feature retrieval is ≥80% during the run And compute time (GPU/CPU seconds) per image is reduced by ≥50% as reported by telemetry
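A minimal sketch of the 15-minute feature cache with hit-rate telemetry; names and the injectable clock are illustrative:

```python
import time

class FeatureCache:
    """In-memory feature cache with a TTL (900 s matches the 15-minute
    re-score window) and hit-rate telemetry. A real deployment would use
    a shared store; this sketch only shows the accounting."""
    def __init__(self, ttl_s=900, clock=time.monotonic):
        self.ttl, self.clock, self.store = ttl_s, clock, {}
        self.hits = self.misses = 0

    def get_or_compute(self, image_id, compute):
        entry = self.store.get(image_id)
        now = self.clock()
        if entry and now - entry[1] <= self.ttl:
            self.hits += 1
            return entry[0]
        self.misses += 1
        features = compute(image_id)
        self.store[image_id] = (features, now)
        return features

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```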

Channel Targets

Declare where images will go (Amazon, Etsy, Shopify, social) and PixelLift auto‑sets compliant bounds for margins, backgrounds, DPI, and aspect ratios. You design once; PixelLift enforces the right constraints so your images pass marketplace checks later.

Requirements

Channel Rules Engine
"As a boutique seller, I want PixelLift to automatically apply channel-specific rules so that my images pass marketplace checks the first time."
Description

A centralized, versioned repository of marketplace and social channel compliance rules (e.g., aspect ratios, min/max dimensions, DPI, file types, file-size limits, background color/opacity, margin/safe-area, watermark/border allowances, color space). Rules support variants by channel, locale, and image role (e.g., Amazon Main vs Additional). Rules are expressed in a machine-readable schema consumed by the rendering pipeline and UI. PixelLift maintains default templates and updates them; admins can extend/override within workspaces. When a user declares targets, the engine resolves applicable constraints and exposes them as enforceable policies to processing, validation, and export services, ensuring consistent, auditably correct outputs.

Acceptance Criteria
Resolve Constraints for Declared Targets (Single Channel, Role, Locale)
- Given channel=Amazon, locale=US, role=Main Image, when targets are saved, then the rules engine returns a resolved policy scoped to those three dimensions.
- The policy includes: aspect_ratio, min_width, max_width, min_height, max_height, min_dpi, allowed_file_types, max_file_size_bytes, background_color, background_opacity, safe_area_pct, watermark_allowed, border_allowed, color_space, and rule_version_id.
- The policy payload includes source identifiers: channel_id, locale_id, role_id, template_version_id.
- Resolution occurs within 200 ms at p95 under nominal load.
- If any dimension is unknown, the engine returns HTTP 400 with error_code="UNKNOWN_TARGET_DIMENSION" and no policy is produced.
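Most-specific resolution across the three dimensions can be sketched as follows, assuming a `"*"` wildcard marks an unconstrained dimension (an assumption for illustration, not a documented convention):

```python
def resolve_policy(rules, channel, locale, role):
    """Pick the most specific rule variant matching (channel, locale, role).
    Specificity = number of exact dimension matches; '*' matches anything
    but adds no specificity. Returns None when nothing matches."""
    best, best_score = None, -1
    for rule in rules:
        score, ok = 0, True
        for dim, want in (("channel", channel), ("locale", locale), ("role", role)):
            have = rule[dim]
            if have == want:
                score += 1
            elif have != "*":
                ok = False
                break
        if ok and score > best_score:
            best, best_score = rule, score
    return best
```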
Workspace Rule Override Precedence and Audit Logging
- Given a workspace-level admin override on a rule (e.g., background_color) for channel=Amazon US Main Image, when resolving a policy, then the override value is returned and the default template value is not.
- Overrides apply only within the originating workspace; other workspaces continue to resolve defaults.
- All override create/update/delete actions are captured with actor_id, timestamp, change_diff, and version_id; the audit log is retrievable via an API and includes pagination.
- Submitting an override with invalid schema is rejected with HTTP 422 and a list of schema violations; no changes are persisted.
- Non-admin users attempting to create overrides receive HTTP 403.
Machine-Readable Rule Schema Validation and Coverage
- Rules are stored and retrieved as JSON conforming to schema version v1.x; JSON Schema validation passes for all default templates.
- Required properties cannot be null or missing; invalid values trigger descriptive errors with JSON Pointer paths to the offending fields.
- The engine exposes a validation operation that returns valid=true/false and violations[]; test suite includes at least 20 positive and 20 negative samples.
- The schema supports variants by channel, locale, and image role; resolving selects the most specific matching rule per dimension.
- Backward-compatible schema changes increment the minor version; incompatible changes increment the major version and are rejected unless a feature flag explicitly enables them.
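A minimal sketch of the validation operation returning `valid` plus `violations[]` with JSON Pointer paths. A real implementation would delegate to a full JSON Schema validator; the `REQUIRED` field set here is an illustrative subset:

```python
# Illustrative subset of required fields and their expected JSON types.
REQUIRED = {"aspect_ratio": str, "min_width": int, "max_file_size_bytes": int}

def validate_rule(rule):
    """Check required-field presence and type, reporting each problem with
    a JSON Pointer ("/field") path as the criteria require."""
    violations = []
    for field, typ in REQUIRED.items():
        if rule.get(field) is None:
            violations.append({"path": f"/{field}", "error": "required property missing or null"})
        elif not isinstance(rule[field], typ):
            violations.append({"path": f"/{field}", "error": f"expected {typ.__name__}"})
    return {"valid": not violations, "violations": violations}
```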
Versioning, Pinning, and Update Propagation
- Each resolved policy includes rule_version_id and template_version_id; IDs are immutable and monotonically increasing.
- When a batch job starts, it pins the rule_version_id; subsequent rule updates do not affect that job's processing, validation, or export.
- New jobs created after a rule update automatically use the latest version by default; users may opt to pin to a prior version via API/UI.
- Publishing a rule update propagates to resolution within 60 seconds; p95 publish-to-availability latency <= 60s.
- A diff operation returns a structured comparison between two versions, including breaking_change=true/false and a list of impacted fields.
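The diff operation can be sketched like this; which fields count as breaking is an illustrative assumption, not something the criteria define:

```python
def diff_versions(old, new, breaking_fields=("aspect_ratio", "allowed_file_types", "color_space")):
    """Structured comparison of two rule versions: lists every field whose
    value changed, and flags the diff as breaking when any impacted field
    is in the (assumed) breaking set."""
    impacted = sorted(f for f in set(old) | set(new) if old.get(f) != new.get(f))
    return {
        "impacted_fields": impacted,
        "breaking_change": any(f in breaking_fields for f in impacted),
    }
```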
Enforcement During Processing, Validation, and Export
- Processing enforces resolved policies; any image violating a constraint is marked non-compliant with reason codes (e.g., ASPECT_RATIO_OUT_OF_RANGE) and remediation hints.
- The validation service returns per-image, per-target pass/fail with a complete list of violations and the rule_version_id used.
- Export blocks non-compliant images by default; an authorized override flag is required to export non-compliant images, and the export includes a compliance report JSON alongside assets.
- For a batch of 500 images, validation p95 latency per image <= 50 ms; validation completes successfully for >99% of images without timeouts.
- Exported assets include a manifest embedding channel, locale, role, and rule_version_id; the manifest validates against a published JSON Schema.
Multi-Channel Targets and Per-Target Policy Output
- When multiple targets are declared (e.g., Amazon US Main, Etsy Listing, Instagram Feed), the engine returns a distinct policy object for each target with unique policy_id and rule_version_id.
- The rendering pipeline receives a per-target policy map; it produces separate outputs when constraints differ and reuses a single output when constraints are identical; the number of outputs equals the count of unique policies.
- Conflicting constraints across targets are not merged; the engine flags conflicts with a warning list and recommends multi-render; no single-output over-constraining occurs.
- The UI and API display compliance status per target; a target can pass while another fails within the same batch.
- For 10 combined targets across 100 images, policy resolution p95 latency per target <= 50 ms, and total memory overhead remains within 200 MB during resolution.
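Reusing a single output when constraints are identical can be done by hashing each resolved policy and grouping targets per hash; a sketch:

```python
import hashlib
import json

def plan_renders(policies):
    """Group targets whose resolved policies are byte-identical so each
    unique policy renders exactly once. `policies` maps target name ->
    policy dict; the hash key uses canonical (sorted-key) JSON."""
    groups = {}
    for target, policy in policies.items():
        key = hashlib.sha256(json.dumps(policy, sort_keys=True).encode()).hexdigest()
        groups.setdefault(key, []).append(target)
    return list(groups.values())  # one render per group
```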
Target Selection & Preset Overrides
"As a store owner, I want to select my sales channels once per batch so that PixelLift enforces the right constraints without manual tweaks."
Description

An intuitive project/batch-level UI and API to declare one or more channel targets (Amazon, Etsy, Shopify, Instagram, etc.) with brand-default presets per workspace. Supports per-channel overrides (e.g., different background policy), per-image exceptions, and preset saving/sharing. Displays real-time rule summaries (badges for required dimensions, background, margins) and conflict warnings. Selections persist in project metadata, are version-aware with respect to rules, and drive downstream enforcement, validation, and export. Streamlines setup to a single step while enabling granular control when necessary.

Acceptance Criteria
Select Multiple Channel Targets at Project Level
Given a new project with no channels selected When the user selects Amazon and Etsy and clicks Save Then the project metadata stores ["amazon","etsy"] and the UI shows 2 selected
Given a saved project with channels selected When the project is reloaded Then the previously selected channels remain selected with no duplicates
Given a project with three channels selected When the user deselects Etsy and saves Then "etsy" is removed from project metadata and the badges update within 500 ms
Given an attempt to add an unsupported channel ID When the selection is saved Then the save is rejected and a validation error message is shown
Apply Workspace Brand-Default Presets on New Project
Given workspace brand-default presets exist for Amazon and Etsy When a new project is created and those channels are selected Then the defaults auto-apply and are visible in the rule summary (background, margins, DPI, aspect ratio)
Given no brand-default preset exists for a selected channel When the channel is added to the project Then PixelLift applies global defaults and marks them with a "Default" badge
Given brand-default presets are changed at the workspace level When creating a new project after the change Then the new defaults apply to the new project, and existing projects remain unchanged
Given an override was made and saved When the user clicks "Reset to default" for that channel Then the preset values revert to the brand-default values
Per-Channel Preset Override at Project Level
Given Amazon and Instagram are selected for a project When the user changes Instagram background to #F7F7F7 and leaves Amazon at #FFFFFF Then only Instagram's preset reflects #F7F7F7 and Amazon remains #FFFFFF
Given a channel has at least one overridden field When viewing the channel list Then an "Overridden" indicator is shown for that channel
Given a channel has overrides When the user clicks "Reset to default" for that channel Then all overridden fields revert to the channel's default preset and the indicator disappears
Given channel overrides are saved When the project is re-opened Then the overrides persist exactly as saved
Per-Image Exception Overrides Within Batch
Given a batch of 100 images with Amazon and Etsy selected When the user sets a 10% margin exception for image IMG_001 on Amazon only Then only IMG_001 on Amazon uses 10% margin, while all other images and channels retain their channel-level settings
Given 10 images are multi-selected When a 3000x3000 export size exception is applied for Etsy Then the exception applies to those 10 images on Etsy only
Given an image has one or more exceptions When opening its details panel Then a badge lists each overridden property per channel
Given exceptions exist on an image When the user clicks "Clear exceptions" Then the image reverts to the project/channel preset and exception badges are removed
Given exceptions are set When exporting the batch Then the rendered outputs honor the exceptions for the affected images and channels
Real-Time Rule Summaries and Conflict Warnings
Given one or more channels are selected When viewing the rule summary panel Then badges show per-channel requirements for dimensions, background policy, margin bounds, DPI, and aspect ratio
Given a setting violates a selected channel's rule When the user changes a value into a non-compliant state Then a conflict warning appears within 200 ms with guidance to restore compliance
Given a conflict warning is visible When the user adjusts the value into the allowed range/policy Then the warning clears within 200 ms and the badge becomes compliant
Given mutually incompatible settings across selected channels are detected When the summary is displayed Then the UI surfaces a non-blocking notice suggesting per-channel overrides and allows the project to be saved
Version-Aware Rule Persistence in Project Metadata
Given marketplace rule definitions are versioned with IDs When a project is saved with selected channels Then the project metadata stores ruleVersionIds per channel at time of save
Given ruleVersionIds have been updated upstream When opening an existing project saved on older versions Then the stored versions remain active and a banner offers to upgrade with a changelog preview
Given the user accepts an upgrade When confirmation is given Then the project switches to the new ruleVersionIds, revalidates, and displays any new conflicts introduced
Given an export is performed When the export manifest is generated Then it records the ruleVersionIds used for validation and enforcement
Preset Saving and Sharing Across Workspace
Given a user with permission edits channel presets When the user saves the configuration as a named preset Then a preset with a unique ID is created and visible to all workspace members
Given a shared preset exists When another member applies it to a project/channel Then all preset fields apply exactly and the action is audit logged with user, time, and preset version
Given a preset is updated When changes are saved Then a new preset version is created; existing projects retain their previous version until explicitly upgraded
Given a preset is deleted When it is referenced by existing projects Then those projects retain a frozen copy; the preset is no longer available for new applications
Auto-Adjust Constraint Enforcement
"As a catalog manager, I want PixelLift to auto-adjust images to meet channel requirements so that I don’t have to re-edit or reshoot."
Description

Non-destructive processing that automatically enforces selected channel constraints: canvas resize, smart crop/pad using subject-aware bounding boxes, background replacement/flattening, DPI resampling, color profile conversion, margin normalization, and format/quality optimization to meet file-size caps. Detects and removes/flags prohibited elements (e.g., watermarks, borders, text overlays) when rules require. Provides configurable strictness (auto-fix vs flag-only) and protects against quality loss via thresholds and skip-with-warning behavior. Ensures outputs meet compliance while preserving visual integrity and brand style.

Acceptance Criteria
Amazon Main Image Constraints Auto-Enforcement
Given a channel profile "Amazon Main Image" with constraints: background=#FFFFFF flattened; subject coverage 85–95% of canvas; aspect ratio 1:1; min longest side=1600 px; DPI=300; color profile=sRGB IEC61966-2.1; max file size=10 MB; prohibited elements: text, logos, watermarks, borders.
When a batch of 50+ varied product photos is processed with Auto-Adjust Constraint Enforcement.
Then each output:
- has a pure white background (CIE ΔE2000 to #FFFFFF ≤ 1.0) and no transparency
- maintains the detected subject fully inside frame; subject coverage is between 85% and 95%; edge padding ≥ 2% on all sides
- is 1:1 aspect ratio; longest side ≥ 1600 px; metadata DPI set to 300
- embeds sRGB IEC61966-2.1; flattened single layer
- is exported as JPEG with file size ≤ 10 MB while SSIM ≥ 0.98 vs the pre-optimization image
- contains no prohibited elements; any removed elements are logged with reason codes
- passes channel compliance validation with zero errors
- leaves the original source files unmodified
Shopify Color Profile Conversion & File Size Optimization
Given a channel profile "Shopify Product" with constraints: target longest side=2048 px (upscale limit 2×), color profile=sRGB IEC61966-2.1, preferred export=WebP then JPEG, max file size cap=20 MB, quality threshold SSIM ≥ 0.98.
When images are processed with Auto-Adjust Constraint Enforcement.
Then each output:
- has longest side set to 2048 px unless that requires >2× upscaling; in that case upscaling is limited to 2× and a "ResolutionLimit" warning is recorded
- is converted to and embeds sRGB IEC61966-2.1
- is exported as WebP if it meets the cap with SSIM ≥ 0.98; otherwise exported as JPEG
- has file size ≤ 20 MB; if meeting the cap would break SSIM ≥ 0.98, the system retains SSIM ≥ 0.98, attaches "SkipWithWarning:FileSize", and marks compliance=false
Social Square Crop with Subject-Aware Padding
Given a channel profile "Instagram Square" with constraints: aspect ratio 1:1; min size 1080×1080; safe margin ≥ 3% around detected subject; background preset="Gradient A"; color profile=sRGB; max file size=8 MB.
When an image with an off-center subject is processed.
Then the output:
- is 1:1 with dimensions ≥ 1080×1080 via subject-aware crop/pad; no subject pixels are clipped (subject IoU ≥ 0.99 with pre-crop mask)
- has ≥ 3% padding between the subject bounding box and each edge
- replaces background with preset "Gradient A" and flattens layers
- is in sRGB and ≤ 8 MB with SSIM ≥ 0.98 vs pre-optimization
Prohibited Elements Detection & Handling
Given a channel where prohibited elements include watermarks, borders, and text overlays, and strictness=Auto-Fix, and a test set containing examples for each violation plus clean images.
When the batch is processed.
Then:
- watermarks are removed or masked without altering subject pixels (no change inside subject mask; Dice coefficient ≥ 0.99)
- borders are trimmed and the canvas is re-padded/cropped to restore required margins and aspect ratio
- text overlays not integral to the subject are removed; if removal would alter the subject, the image is flagged "RemovalRisk" and left unchanged
- clean images are not modified
- each action logs type, bounding box coordinates, and reason code; compliance=true where all violations are fixed; compliance=false for flagged images
Strictness Modes: Auto-Fix vs Flag-Only Behavior
Given the same violating input and channel constraints, with strictness toggled at the channel level.
When strictness=Auto-Fix. Then all fixable violations are corrected automatically; unfixable violations are flagged; final compliance=true only if no violations remain; originals are unmodified; an audit trail lists each fix.
When strictness=Flag-Only. Then no pixel-level changes are applied; all violations are detected and reported with reason codes; final compliance=false; suggested fixes are included in the report.
Quality Preservation Thresholds & Skip-with-Warning
Given a channel profile with a file size cap and quality thresholds SSIM ≥ 0.98 and PSNR ≥ 40 dB.
When resampling or compression would require breaching thresholds to meet the file size cap.
Then the system:
- attempts alternate formats and qualities and selects the smallest file that maintains thresholds
- if no setting meets the cap while maintaining thresholds, outputs the version that maintains thresholds, attaches "SkipWithWarning:QualityGuard", and marks compliance=false unless the channel allows "allow-near-cap ≤ 10%", in which case exceeding the cap by ≤ 10% is permitted and compliance=true
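The selection logic can be sketched as follows; the candidate-tuple shape and function name are illustrative, with thresholds taken from the criteria:

```python
def pick_export(candidates, cap_bytes, allow_near_cap=False):
    """Choose the smallest encoding that keeps SSIM >= 0.98 and PSNR >= 40 dB;
    fall back to skip-with-warning when no such encoding fits the size cap.
    `candidates` are (size_bytes, ssim, psnr_db) tuples for the formats and
    qualities that were attempted. Returns (chosen, warning, compliant)."""
    passing = [c for c in candidates if c[1] >= 0.98 and c[2] >= 40]
    best = min(passing, key=lambda c: c[0])  # smallest file that keeps quality
    if best[0] <= cap_bytes:
        return best, None, True
    if allow_near_cap and best[0] <= cap_bytes * 1.10:  # "allow-near-cap <= 10%"
        return best, None, True
    return best, "SkipWithWarning:QualityGuard", False
```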
Non-Destructive Processing, History & Reversion
Given any channel constraints and an input image.
When Auto-Adjust Constraint Enforcement is applied and the user later changes a rule and reprocesses.
Then:
- the original asset remains unchanged on disk
- all operations are stored as an ordered, parameterized history
- reprocessing starts from the original asset, not from a prior export; results are deterministic for identical inputs/configuration
- the user can revert to any prior step and export again; exports are versioned and traceable to the source and configuration hash
Preflight Compliance Report
"As a marketer, I want a preflight report per channel so that I know what will fail and how to fix it before publishing."
Description

A pass/fail validator that checks each image against each selected channel’s rule set prior to export or publishing. Presents actionable diagnostics (what failed and why) with one-click fix suggestions, bulk apply, and preview. Summarizes blocking errors vs non-blocking warnings at batch and image levels. Exposes results via UI and downloadable CSV/JSON for audit/QA. Integrates with the rules engine for versioned, reproducible validation, reducing rejections and back-and-forth edits.

Acceptance Criteria
Batch Preflight Validation Across Multiple Channels
Given a batch of at least 100 images and channels Amazon and Etsy selected with rules engine version v1.12 pinned When the user runs Preflight Validation Then each image is evaluated independently against each selected channel’s rule set And the report lists for every image–channel pair a status of Pass, Fail, or Warn And the batch summary displays counts of Pass, Fail, and Warn across all images and channels And validation completes without error within 120 seconds for a batch of up to 500 images on the standard plan
Blocking Errors vs Warnings Summary
Given the rules define severities Blocking and Warning When validation completes Then the batch header shows total Blocking Errors and total Warnings separately And each image row shows per-channel chips for Blocking and Warning counts And a filter “Show blocking only” lists only images with one or more blocking errors And non-blocking warnings do not prevent export, but remain visible in summaries
Actionable Diagnostics and One-Click Fix with Preview
Given a failed rule (e.g., background color non-compliant for Amazon) When the user opens the image detail panel Then the panel lists each failed rule with channel, ruleId, severity, human-readable message, and suggested fix And a “Preview Fix” shows a non-destructive preview of the adjustment When the user clicks “Apply Fix” Then the image variant is updated and validation re-runs for affected channels And “Bulk Apply All Suggested Fixes” applies the corresponding fixes across all applicable images and re-runs validation And all applied fixes are logged with timestamp, user, and affected imageIds
Exportable Compliance Report (CSV and JSON)
Given preflight validation has been executed When the user downloads the CSV report Then the file contains one row per image–channel–rule with columns: batchId, imageId, filename, channel, ruleId, ruleVersion, severity, status, message, fixSuggested, fixApplied, timestamp And when the user downloads the JSON report Then it contains the same fields grouped by image and channel and validates against schema version 2.0 And exported filenames include batchId and ISO-8601 timestamp And exports complete within 30 seconds for up to 10,000 rows
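The CSV contract above (one row per image–channel–rule, batchId and ISO-8601 timestamp in the filename) can be sketched with the standard library; the column set below mirrors the criterion, while the filename pattern is an illustrative assumption:

```python
import csv
import io
from datetime import datetime, timezone

CSV_COLUMNS = ["batchId", "imageId", "filename", "channel", "ruleId",
               "ruleVersion", "severity", "status", "message",
               "fixSuggested", "fixApplied", "timestamp"]

def write_report_csv(rows):
    """Serialize one row per image-channel-rule into CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=CSV_COLUMNS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buf.getvalue()

def report_filename(batch_id, now=None):
    """Export filename embedding batchId and an ISO-8601 UTC timestamp."""
    now = now or datetime.now(timezone.utc)
    return f"preflight_{batch_id}_{now.strftime('%Y-%m-%dT%H%M%SZ')}.csv"
```

The same rows can be regrouped by image and channel for the JSON variant before schema validation.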
Reproducible Validation via Versioned Rules
Given channels are selected and the rules engine exposes versioned rule sets When validation runs Then the report header records the exact rules version ID per channel And re-running validation on the same images with the same pinned versions yields identical results And when a newer rules version is available, the UI displays an “Update available” indicator and offers “Revalidate with latest” Then results from the latest run are labeled with the new version and do not overwrite prior results unless confirmed by the user
Preflight Gate Prior to Export or Publish
Given a batch contains images with mixed Pass, Fail, and Warn results When the user opens the Export/Publish dialog Then channels with any blocking errors have export toggles disabled and show the count of blocking issues And channels with zero blocking errors are enabled for export And a “View issues” link opens the report filtered to the selected channel and blocking errors And attempting to export a channel with blocking errors is prevented and shows an error listing up to three blocking issues with a link to view all; warnings do not block export
Multi-Channel Derivative Export
"As an operations lead, I want per-channel exports with correct naming and metadata so that my upload automations and listings just work."
Description

Generation of per-channel, per-variant outputs from a single design, applying naming conventions, folder structures, and embedded metadata mappings appropriate to each destination. Handles color profile normalization (e.g., sRGB), background flattening/alpha handling, DPI setting, and compression strategies tuned to hit file-size limits without visible artifacts. Supports parallelized, resumable batch export with deterministic outputs for caching and duplicate detection. Ensures ready-to-upload assets for every target with minimal manual handling.
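Deterministic naming and folder placement per channel/variant might look like the sketch below; the templates are hypothetical stand-ins for each destination's configured convention:

```python
from pathlib import PurePosixPath

# Hypothetical per-channel naming templates; real templates would come
# from each channel's configuration in the rules engine.
TEMPLATES = {
    "amazon":  "{sku}.{variant}.AMZN.jpg",
    "etsy":    "{sku}_{variant}_etsy.jpg",
    "shopify": "{sku}-{variant}-shopify.jpg",
}

def derivative_path(channel, sku, variant):
    """Deterministic output path: channel/variant folder + templated name."""
    name = TEMPLATES[channel].format(sku=sku, variant=variant)
    return str(PurePosixPath(channel) / variant / name)

# 3 channels x 2 variants -> 6 unique, reproducible paths from one design.
paths = [derivative_path(c, "TSHIRT-01", v)
         for c in TEMPLATES for v in ("red", "blue")]
```

Because the mapping is a pure function of (channel, sku, variant), re-running an export yields identical paths, which is what makes caching and duplicate detection possible downstream.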

Acceptance Criteria
Per-Channel Variant Export From Single Design
Given a project with one base design, 3 channel targets (Amazon, Etsy, Shopify), and 4 product variants When Multi-Channel Derivative Export is triggered Then exactly 12 output images are produced (3 channels × 4 variants) And each file name matches the channel’s naming template (including SKU, VariantID, and Channel suffix) And each file is placed under the correct channel/variant folder structure as configured And the export summary reports counts per channel and variant that match files on disk
Metadata Mapping & Embed Per Destination
Given a channel-specific metadata mapping defining required keys and prohibited fields When the export completes Then each output embeds exactly the mapped metadata for its destination (e.g., SKU, VariantID, Channel, AltText) And prohibited metadata (e.g., GPS, camera serial) is removed And validating metadata with the built-in checker returns 0 errors and 0 warnings for all files
Color Profile Normalization & Background Handling
Given inputs may contain varied color profiles and transparency When exporting to channels that require sRGB and opaque backgrounds Then outputs are converted to sRGB IEC61966-2.1 and transparency is flattened to the configured background color And for channels that allow transparency, alpha is preserved and no background flattening occurs And measured color difference after conversion is ΔE00 ≤ 2 against the reference transform And exported files include exactly one embedded sRGB profile tag
DPI, Aspect Ratio, and Margin Compliance
Given each channel profile defines target pixel bounds, aspect ratio, DPI tag, and content margins When the export runs Then each output’s pixel dimensions and aspect ratio fall within the specified constraints for its channel And DPI metadata matches the channel profile value And auto-crop/pad operations do not clip the detected product bounding box (0 clipped pixels) And channel compliance validation passes 100% for all outputs
Compression & File Size Compliance With Quality Guardrails
Given each channel defines a maximum file size and recommended codec/quality settings When encoder settings are applied during export Then every output is ≤ the channel’s max file size And structural similarity SSIM ≥ 0.98 and PSNR ≥ 40 dB versus the pre-compression rendered image And no visible blocking/ringing is detected by the artifact detector (0 flagged tiles) And the export log records the chosen quality level and number of encode passes per file (≤ 3 attempts)
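The "≤ 3 encode passes" guardrail suggests a descending-quality search. A minimal sketch, with the codec abstracted behind a callable (a real implementation would encode actual image bytes and also gate on SSIM/PSNR):

```python
def encode_within_limit(encode, max_bytes, qualities=(90, 80, 70)):
    """Try at most three quality levels, highest first; return the first
    encoding that fits the channel's size limit, plus an attempt log.

    `encode(quality)` is any callable returning the compressed bytes.
    """
    log = []
    for attempt, q in enumerate(qualities, start=1):
        data = encode(q)
        log.append({"attempt": attempt, "quality": q, "bytes": len(data)})
        if len(data) <= max_bytes:
            return data, log
    raise ValueError(
        f"cannot fit under {max_bytes} bytes in {len(qualities)} attempts")

# Stand-in encoder whose output shrinks with quality (a real one would
# invoke the JPEG/WebP codec).
fake = lambda q: b"x" * (q * 100)
data, log = encode_within_limit(fake, max_bytes=8500)
# q=90 -> 9000 bytes (too big), q=80 -> 8000 bytes (fits)
```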
Parallelized Resumable Batch Export
Given a batch of 500 images with concurrency set to 8 workers When the export is paused at 37% and later resumed Then previously completed files are not reprocessed and the job resumes from the last confirmed checkpoint And no partial/corrupted files are present (all outputs are written atomically) And final produced file count equals the expected total and matches the manifest And average CPU utilization of workers remains within configured limits without worker crashes (0 worker failures)
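The "no partial/corrupted files" guarantee is typically achieved with write-to-temp-then-rename. A sketch of that atomic-write step, which is what lets a resumed job trust any file already on disk:

```python
import os
import tempfile

def write_atomically(path, data: bytes):
    """Write to a temp file in the same directory, then rename into place.

    os.replace is atomic on POSIX and Windows, so readers never observe
    a partially written output.
    """
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # durable before the rename
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)
        raise
```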
Deterministic Outputs, Caching, and Duplicate Detection
Given identical inputs and channel configurations across two runs When Multi-Channel Derivative Export is executed twice Then all outputs are byte-identical with matching checksums and metadata across runs And unchanged items are served from cache on the second run (cache hit rate ≥ 90% when no inputs changed) And duplicate detection prevents emitting multiple identical files to the same destination (0 duplicate files), with dedup events logged
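Byte-identical reruns make a content-addressed cache trivial: hash the input bytes together with the channel configuration, and unchanged items become cache hits. A minimal sketch with hypothetical names:

```python
import hashlib

def cache_key(input_bytes: bytes, channel_config: str) -> str:
    """Deterministic key: same input bytes + same config -> same key."""
    h = hashlib.sha256()
    h.update(input_bytes)
    h.update(channel_config.encode("utf-8"))
    return h.hexdigest()

class ExportCache:
    """Serve unchanged outputs from cache instead of re-rendering."""
    def __init__(self):
        self.store, self.hits = {}, 0

    def get_or_render(self, key, render):
        if key in self.store:
            self.hits += 1
            return self.store[key]
        self.store[key] = render()
        return self.store[key]

cache = ExportCache()
k = cache_key(b"pixels", "amazon-v1")
first = cache.get_or_render(k, lambda: b"rendered")
second = cache.get_or_render(k, lambda: b"rendered")  # cache hit
```

The same key doubles as the duplicate-detection fingerprint: two assets that hash identically for the same destination need only be emitted once.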
Direct Publish Connectors
"As a seller, I want to publish directly to my stores and marketplaces so that I can go live faster without downloading and reuploading files."
Description

Optional connectors to publish validated assets directly to Shopify, Etsy, Amazon (SP-API), and social platforms. Provides OAuth account linking, per-store/channel mapping, SKU/handle association, destination collection/album selection, and dry-run mode. Implements queued, rate-limited uploads with retries, webhook/callback handling, and clear error surfacing. Adheres to each platform’s API constraints and quotas, enabling end-to-end flow from design to live listing without manual downloads/uploads.

Acceptance Criteria
OAuth Account Linking and Token Management
- Given I am an org admin and select "Connect Shopify" from Direct Publish Connectors, when I complete the OAuth flow successfully, then the connector status changes to "Linked" and displays the store domain and shop ID.
- Given an access token has expired, when a publish job is initiated, then the system refreshes the token using the stored refresh token and retries the request once without user action.
- Given I click "Unlink" for a linked connector, when I confirm, then all tokens are revoked at the provider (if supported), deleted from storage, audit-logged, and the connector status becomes "Not linked".
- Given multiple stores/accounts are linked for a provider, when I view publish settings, then each account is selectable independently with its own saved configuration.
- Rule: Access and refresh tokens are stored encrypted at rest and redacted in logs and UI.
Per-Store/Channel Mapping and Destination Selection
- Given multiple channels are linked (Shopify, Etsy, Amazon, social), when I configure per-store/channel mappings for a project, then the mapping is saved and preselected on subsequent publish flows.
- Given Shopify is selected, when I choose one or more destination collections and/or a product handle, then published images are attached to the specified product and the product is assigned to the selected collections.
- Given Etsy is selected, when I specify a listing ID (draft or active), then images are uploaded to that listing's gallery in the configured order.
- Given Amazon (SP-API) is selected, when I specify Seller SKU(s) and marketplace, then images are attached to the correct catalog item(s) under the selected marketplace.
- Given a selected channel is missing required destination details, when I attempt to start a publish job, then validation fails with a blocking message identifying the channel and missing fields.
SKU/Handle Association and Image Attachment
- Given assets carry SKU/handle metadata or a mapping table is provided, when publishing to Shopify, then each image attaches to the product matching the handle or SKU per the mapping.
- Given publishing to Amazon, when an image is marked as "MAIN" vs "ADDITIONAL", then it is submitted to the corresponding image slot (MAIN or PT01–PT08) for the specified SKU.
- Given duplicate images are mapped to multiple variants of the same product, when publishing, then the image is uploaded once and associated to each variant as allowed by the platform without duplicate uploads.
- Given no matching SKU/handle is found for an asset, when validating, then the asset is flagged as "Unmapped" and is excluded from publishing with an actionable error.
Dry-Run Mode Publish Simulation
- Given Dry Run is enabled, when I start a publish job, then no external write calls are executed and no assets are created/modified on any platform.
- Then a per-asset simulation report is generated including target channel, destination identifier (e.g., shop ID, listing ID, SKU), intended slot/position, and predicted payload size.
- Given Dry Run, when constraints or mappings are invalid, then the same validation errors are surfaced as in a real publish and the job result shows 0 successful publishes.
- Rule: Audit logs for Dry Run include only simulated requests and explicitly mark the job as "Dry Run".
Queued, Rate-Limited Uploads with Retries and Idempotency
- Given a publish job with 500+ assets across channels, when the job runs, then uploads are queued and processed with a maximum concurrency of 5 per channel by default (configurable).
- Given a 429 or rate-limit response is received, when retrying, then the system honors Retry-After headers (if present) and uses exponential backoff with jitter up to 5 attempts before marking the asset as failed.
- Rule: Each asset publish attempt uses a stable idempotency key per channel (jobId+assetId) so that retries do not create duplicate images within a 24-hour window.
- Given the worker service restarts mid-job, when it resumes, then already-succeeded assets are not re-sent and remaining queued items continue processing.
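The retry policy above (honor Retry-After, else exponential backoff with jitter, stable idempotency key) can be sketched as follows; the key format is an illustrative assumption:

```python
import random

def idempotency_key(job_id: str, asset_id: str, channel: str) -> str:
    """Stable per-channel key so retries never create duplicate uploads."""
    return f"{channel}:{job_id}:{asset_id}"

def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random):
    """Seconds to wait before retry `attempt` (1-based): honor a provider
    Retry-After header if present, else exponential backoff with full
    jitter, capped."""
    if retry_after is not None:
        return float(retry_after)
    return rng.uniform(0, min(cap, base * 2 ** (attempt - 1)))

key = idempotency_key("job-42", "asset-7", "shopify")
# Up to 5 attempts before the asset is marked failed.
delays = [backoff_delay(a, rng=random.Random(0)) for a in range(1, 6)]
```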
Webhook/Callback Handling and Status Synchronization
- Given a provider emits success/failure callbacks for image uploads, when a webhook is received, then the corresponding asset/job status updates within 30 seconds and stores external identifiers (e.g., Shopify image ID).
- Given no webhook is received within 10 minutes of a request, when the job is still pending, then the system polls the provider every 60 seconds up to 30 minutes or until completion, after which remaining items time out as failed.
- Rule: Duplicate or out-of-order callbacks are handled idempotently and do not regress a terminal status.
- Then the user-facing job summary shows counts by channel: Succeeded, Failed, Skipped, and Pending, with downloadable error details.
Marketplace Constraint Validation and Blocking Pre-Publish
- Given Channel Targets are applied, when an asset violates a selected channel's requirements (e.g., Amazon MAIN not on pure white background, aspect ratio outside bounds, DPI below minimum, margins outside limits, max file size exceeded), then the asset is blocked from publish with a specific, per-violation message and proposed fix.
- Given all selected assets meet their channel constraints, when I start a publish job, then pre-flight validation passes in under 10 seconds per 1,000 assets and the job proceeds to queue.
- Rule: Blocked assets are excluded from API calls; the publish job proceeds for compliant assets and reports the count and reasons for exclusions per channel.
Rule Change Monitoring & Alerts
"As a brand admin, I want to be alerted when marketplace rules change so that my presets stay compliant and exports continue to pass."
Description

Continuous monitoring of marketplace documentation and PixelLift-maintained rule templates to detect changes. On updates, creates a new rules version, shows human-readable diffs, notifies workspace admins (in-app, email, Slack), and proposes migration of presets/targets with effective-date scheduling. Triggers automatic re-validation of affected projects and flags at-risk exports. Maintains an audit log of rule versions applied to each asset for traceability and compliance evidence.

Acceptance Criteria
Rule Change Detection and Version Creation
Given a monitored rules source (marketplace documentation or PixelLift template) publishes a material change When the monitoring job next runs (<= configured polling interval) Then the system detects the delta and creates a new immutable rules version with unique versionId, channelTarget, source, createdAt, and parentVersionId And Then no new version is created if the normalized rules JSON is unchanged versus the latest version (idempotent) And Then the new version is persisted and visible in the Rules Library list
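The idempotency requirement ("no new version if the normalized rules JSON is unchanged") is usually implemented by fingerprinting a canonical serialization. A minimal sketch with hypothetical helper names:

```python
import hashlib
import json

def rules_fingerprint(rules: dict) -> str:
    """Hash of the normalized rules JSON: key order and whitespace do
    not affect the result, so re-polling an unchanged source is a no-op."""
    canonical = json.dumps(rules, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def maybe_create_version(latest_fp, fetched_rules):
    """Return a new version record only when the rules materially changed."""
    fp = rules_fingerprint(fetched_rules)
    if fp == latest_fp:
        return None  # idempotent: no material change detected
    return {"fingerprint": fp, "rules": fetched_rules}

a = {"dpi": 72, "background": "white"}
b = {"background": "white", "dpi": 72}  # same rules, different key order
```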
Human-Readable Diffs for Updated Rules
Given a new rules version exists When an admin opens the diff view Then the UI shows added, removed, and modified constraints with plain-language labels and old vs new values for margins, background, DPI, and aspect ratio And Then the diff view displays the total number of changes and the count of impacted presets/targets And Then a link to the diff is available for sharing with other admins
Admin Notifications (In-App, Email, Slack) on Rule Update
Given a new rules version is created When notifications are dispatched Then all workspace admins receive an in-app alert, an email, and a Slack message (if a webhook is configured) within one polling interval containing channel name, versionId, change summary, link to diff, and impacted preset/target counts And Then if Slack delivery fails, the failure is logged and email/in-app delivery still succeeds
Preset/Target Migration Proposal with Effective-Date Scheduling
Given a new rules version exists When an admin opens the migration wizard Then the system lists all impacted presets and channel targets with per-item change summaries and a selectable checkbox And Then the admin can schedule an effective date/time (with timezone) for migration and preview the resulting constraints And Then upon confirmation, scheduled migration jobs are created and tracked with statuses (pending, running, completed, failed) And Then at the effective time, selected presets/targets are atomically updated to reference the new rules version
Automatic Re-Validation and At-Risk Export Flagging
Given a new rules version is created When automatic re-validation runs Then assets in affected projects and channel targets are re-validated against the new rules and results are recorded And Then any existing exports that would fail under the new rules are flagged At Risk with the failing constraints listed And Then At-Risk flags are visible in the dashboard and on project pages with links to the failing assets
Asset-Level Audit Log and Compliance Evidence
Given any asset has been validated or exported When viewing its audit log Then entries show ruleVersionId, channelTargetId, validation outcome, timestamp, and actor (system/user) for each event And Then audit entries are immutable and can be filtered by asset, project, channel, and ruleVersionId And Then the audit log can be exported to CSV for a selected time range
Source Attribution and Impact Scoping
Given a rules update is detected When viewing the version details Then the source is labeled as Marketplace Documentation or PixelLift Template with a source URL or internal reference And Then the scope of impact is enumerated (channels and constraint categories changed) And Then only impacted presets/targets are included in migration proposals and re-validation

Batch Validator

Run your new preset on a small test batch and get instant feedback: pass/fail reasons, visual diffs, and estimated processing time and cost. Accept or tweak with one click, ensuring you only roll out settings that meet standards and timelines.

Requirements

Sample Set Builder
"As a boutique owner, I want to run my preset on a representative subset of my catalog so that I can validate quality and timing without processing the entire batch."
Description

Enable users to select and generate a small, representative test batch (e.g., 10–50 images) from an uploaded catalog to validate a style-preset before full rollout. Provide selection modes (random, stratified by product/category/SKU, and outlier-focused sampling such as low-resolution or atypical aspect ratios) with configurable caps to control spend. Display sample composition and representativeness indicators (e.g., coverage by category, lighting, background types). Integrates with PixelLift’s catalog metadata and image analysis services to tag images for stratification and to precompute attributes used in sampling.
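Proportional stratified sampling with a reproducible seed might look like the sketch below. It is a simplification (rounding overshoot is trimmed from the tail rather than reallocated, which the real allocator would handle more carefully):

```python
import random
from collections import defaultdict

def stratified_sample(images, n, seed=None):
    """Proportional sample of `n` items from [(image_id, category), ...].

    A fixed seed makes the draw reproducible against the same catalog
    snapshot.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for image_id, category in images:
        strata[category].append(image_id)
    total = len(images)
    picked = []
    for category in sorted(strata):  # stable iteration order
        pool = strata[category]
        quota = min(len(pool), max(1, round(n * len(pool) / total)))
        picked.extend(rng.sample(pool, quota))
    return picked[:n]  # trim rounding overshoot so totals equal n

# 80 "shoes" and 20 "bags" -> roughly 16 and 4 in a sample of 20.
catalog = [(f"img-{i}", "shoes" if i % 5 else "bags") for i in range(100)]
sample = stratified_sample(catalog, 20, seed=42)
```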

Acceptance Criteria
Random Sample Generation (10–50 images)
Given an uploaded catalog with at least 50 images and Random mode selected When the user requests a sample size between 10 and 50 (inclusive) and confirms Then the system returns exactly the requested number of unique images within 3 seconds for catalogs up to 10,000 items And each image has an equal selection probability (±2% over 100 repeated draws with fixed parameters) And if the catalog contains fewer images than requested, the system blocks generation and displays a validation error explaining the minimum and current counts And if a random seed S is provided, repeated generation with the same catalog snapshot and S returns the identical image set
Stratified Sampling by Category/SKU
Given an uploaded catalog with category and SKU metadata available and Stratified mode selected When the user requests N images and chooses proportional by category (default) or custom targets per category/SKU Then the sample contains items per stratum that match the target distribution within ±1 item per stratum or ±5% (whichever is larger), and totals equal N And strata with no available items are omitted and reported; shortfalls are reallocated proportionally to remaining strata And at least one item per present stratum is included when N >= number of strata, unless explicitly excluded by the user And the composition summary displays counts and percentages per stratum matching the actual sample
Outlier-Focused Sampling (low-res, atypical aspect ratios)
Given image analysis tags (resolution, aspect ratio, background type) are available and Outlier-focused mode is selected When the user defines outlier rules (e.g., lowest 10% resolution and aspect ratio outside 0.75–1.5) and requests N images Then at least 60% of the sample satisfies at least one outlier rule or the maximum possible if fewer outliers exist, with any shortfall backfilled by random non-outliers And the sample metadata lists which rules each selected image matched And if fewer than 5 outliers exist, the system warns the user before generation And generation completes within 5 seconds for catalogs up to 10,000 items
Spend Cap Enforcement and Cost/Time Estimation
Given pricing per image is available and a spend cap is set by image count or currency When the user requests a sample size N that would exceed the cap Then the system suggests the maximum allowable sample size K that fits the cap, with estimated cost and processing time, and blocks sizes > K unless the cap is changed And the estimated cost equals unit price × selected image count within ±$0.01 rounding tolerance and updates within 1 second of parameter changes And if pricing is unavailable, the system disables cap-by-currency and informs the user
Representativeness Indicators Display
Given a generated sample and catalog attribute distributions (category, lighting, background) are available When the composition is displayed Then indicators show counts and percentages for each attribute value for both sample and catalog And any attribute value whose sample percentage deviates by more than 20% relative from the catalog is flagged with a warning icon and tooltip And coverage includes at least 90% of attribute values present in the catalog when N >= number of values, otherwise a warning explains limitations And the indicators render within 2 seconds and can be exported as CSV
Integration with Metadata and Image Analysis Services
Given access to catalog metadata and image analysis services When sampling requires tags not present locally Then the builder requests and caches required tags before selection, with each external call retried up to 3 times with exponential backoff And if either service is unavailable after retries, only Random mode remains enabled with a non-blocking alert explaining reduced functionality And all selected images in the sample are saved with their tags and sampling mode in the sample record for audit
Persist and Reuse Sample Definitions
Given a configured sampling setup (mode, parameters, seed, filters) When the user saves the sample definition Then the definition is stored with a unique ID and can be rerun later to produce the same image set when executed against the same catalog snapshot and seed And if the catalog has changed, rerun displays differences and requires confirmation before regenerating And users can rename, duplicate, and delete saved definitions; deletions are soft, with definitions recoverable for 30 days
Validation Rule Engine
"As a brand manager, I want explicit pass/fail criteria with clear reasons so that I can ensure images meet our standards and marketplace requirements."
Description

Provide a configurable rules framework that evaluates processed test images against brand and marketplace standards to yield clear pass/fail outcomes and human-readable reasons. Support rules such as background uniformity and color tolerance, subject centering and margin bounds, minimum resolution and aspect ratio, shadow/halo tolerances, color palette adherence, compression/file size limits, and watermark detection. Include default rulesets (e.g., Amazon white background, Shopify guidelines) and allow custom thresholds per workspace. Compute per-image metrics and aggregate pass rate, highlight failing rules with guidance, and expose an extensible metrics registry for future criteria.

Acceptance Criteria
Core Rule Suite Evaluation on Test Batch
Given a configured ruleset "Amazon Default" with thresholds: background L*a*b* std dev <= 2.0; background mean deltaE to pure white <= 3.0; subject margins between 5% and 15% of image edges; subject centroid offset <= 2% of image width/height; minimum resolution >= 1600x1600 px; aspect ratio = 1:1 ± 1%; shadow/halo opacity <= 5%; color palette adherence >= 95% to brand palette; file size <= 2 MB; JPEG quality >= 85; watermark detection confidence <= 0.10 When a test batch of N images is processed with a selected preset Then each image is marked Pass only if all enabled rules are satisfied within thresholds, otherwise Fail with failing rules listed per image
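A rule suite like this reduces to comparing measured metrics against per-rule thresholds and collecting human-readable reasons. A minimal sketch (the rule IDs and thresholds below are loosely modeled on the Amazon defaults above, not an authoritative set):

```python
def evaluate_image(metrics, ruleset):
    """Check measured metrics against thresholds.

    Each rule is (rule_id, metric_name, op, threshold). Returns a
    Pass/Fail status plus a readable reason per failing rule.
    """
    ops = {"<=": lambda v, t: v <= t, ">=": lambda v, t: v >= t}
    failures = []
    for rule_id, metric, op, threshold in ruleset:
        value = metrics[metric]
        if not ops[op](value, threshold):
            failures.append(
                f"{rule_id}: {metric}={value} violates {op} {threshold}")
    return {"status": "Pass" if not failures else "Fail",
            "failures": failures}

ruleset = [
    ("bg-white", "background_deltaE", "<=", 3.0),
    ("min-res", "min_dimension_px", ">=", 1600),
    ("file-size", "file_size_mb", "<=", 2.0),
]
result = evaluate_image(
    {"background_deltaE": 4.2, "min_dimension_px": 2000, "file_size_mb": 1.1},
    ruleset)
# -> Fail, with only the bg-white rule listed
```

An extensible metrics registry would let new (metric, op, threshold) triples be added without touching this evaluation loop.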
Default Marketplace Rulesets Availability and Selection
Given a workspace with no custom rules defined When the user selects "Amazon Default" or "Shopify Default" Then the engine loads the corresponding predefined rule bundle with documented thresholds and enabled rule list And running validation applies those rules without additional configuration And switching between defaults updates the active bundle and subsequent validation outcomes accordingly
Workspace Threshold Overrides and Precedence
Given a workspace that duplicates "Amazon Default" into "Brand A - Amazon" And overrides: background deltaE <= 2.5, min margin >= 8%, max file size <= 1.5 MB When validation runs under "Brand A - Amazon" Then the engine uses the overridden thresholds for evaluation And unspecified thresholds inherit from the base ruleset And changes to the base ruleset after duplication do not alter overridden values in the workspace version And any rule disabled at workspace level is not evaluated
Aggregate Pass Rate and Metrics Export
Given a processed test batch of N images with per-image rule evaluations and metrics recorded When validation completes Then the engine returns aggregate pass rate = (count of images with Pass)/N to two decimal places And returns batch-level summaries (mean, p95) for background deltaE, centroid offset, margin min/max, resolution, aspect ratio, file size, watermark confidence And exports per-image metrics in a documented schema including ruleId, metric name, value, threshold, and pass boolean
Failing Rule Reasons and Actionable Guidance
Given any image that fails one or more rules When the validation report is generated Then each failing rule includes: rule name, measured value, threshold, delta from threshold, and a concise remediation tip And remediation tips reference relevant preset controls (e.g., "Increase background cleanup to 70–80%") And reasons are human-readable and do not expose internal model IDs
Extensible Metrics Registry and New Rule Onboarding
Given a new metric plugin "specularHighlightRatio" registered with id "metric.specularHighlightRatio" and documentation And a new rule "MaxSpecularHighlightRatio" configured to use that metric with threshold <= 0.12 When the engine loads the registry Then the new metric is discoverable via API and usable in rule expressions without core engine changes And validation using a ruleset that includes the new rule evaluates it and affects Pass/Fail as expected And removing the plugin cleanly invalidates the dependent rule with a descriptive configuration error
Deterministic Evaluation and Repeatability
Given the same input image bytes, preset, and ruleset version When validation is executed multiple times on the same hardware and configuration Then per-image metrics and Pass/Fail outcomes are identical across runs And outputs include the ruleset version hash and preset version to enable reproducibility
Visual Diff Viewer
"As a photo editor, I want to visually compare before-and-after results with rule overlays so that I can quickly spot issues and decide whether the preset is acceptable."
Description

Provide an interactive viewer to compare original vs. processed images with side-by-side and overlay modes, adjustable opacity slider, zoom/pan, grid and safe-margin overlays, clipping and color gamut warnings, and rule annotations directly on the image (e.g., centering boxes, background masks). Allow quick navigation across the sample set, keyboard shortcuts for review speed, and download/export of processed samples. Ensure responsive performance for large images and accessibility compliance for controls and annotations.

Acceptance Criteria
Compare Modes and Opacity Control
Given a test batch image pair is open in the Visual Diff Viewer When the user toggles view mode via UI or hotkey "M" Then the viewer switches between Side-by-Side and Overlay within 150 ms and preserves zoom/pan state Given Overlay mode is active When the user adjusts opacity via slider, arrow keys (1% step), or Shift+Arrow (10% step) Then overlay opacity updates continuously at ≥30 fps and the current percentage is displayed and announced to assistive tech Given the reviewer advances to the next image When no changes are made to display settings Then the last-used mode and opacity persist across images in the session
Synchronized Zoom and Pan Performance
Given Side-by-Side mode with 24–50 MP images When the user zooms via wheel/pinch (5%–800%) or double-click to 100% and pans via drag/space-drag Then both panes stay synchronized within 1 px, interactions render at ≥60 fps (24 MP) or ≥30 fps (50 MP), and reset ("R") completes within 100 ms Given the user holds "Alt" When panning or zooming Then panes temporarily desynchronize and revert to sync on release Given the viewer is resized from 1280x800 to 4K When maintaining current zoom Then content scales without pixelation beyond source resolution and maintains centering
Overlays and Rule Annotations
Given overlays are toggled via "G" (grid) and "S" (safe margin) When enabled Then a rule-of-thirds grid and a safe-margin overlay (default 5%, adjustable 0–20% in 1% steps) render on both panes with opacity 10–60% and persist across images Given rule annotations are toggled via "T" When enabled Then centering crosshair, subject bounding box, and background mask appear with labels; center deviation and fill percentage are shown with numeric readouts accurate to ±1 px/±0.5% Given any overlay or annotation is visible When tabbing through controls Then each has an accessible name, ARIA state, and visible focus indicator meeting WCAG 2.2 AA contrast
Clipping and Color Gamut Warnings
Given the warnings toggle "W" is activated When evaluating the processed image against its embedded or assigned color profile Then clipped pixels (<=0.5% or >=99.5% per channel) and out-of-gamut pixels (relative to sRGB or the selected profile) are highlighted with distinct legends and counts Given synthetic test charts with known clipped and out-of-gamut regions When analyzed Then the viewer reports pixel counts within ±1% tolerance of ground truth and updates the legend in under 150 ms Given warnings overlay is active with other overlays When overlay opacity is adjusted Then all overlays remain distinguishable with a minimum 3:1 contrast against the image content
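Shadow/highlight clipping counts of the kind described above can be sketched as below; the 8-bit cutoffs of 1 and 254 approximate the "bottom/top 0.5% of channel range" rule for illustration:

```python
def clipping_stats(pixels, low=1, high=254):
    """Percent of pixels with any channel at/below `low` (shadow-clipped)
    or at/above `high` (highlight-clipped). `pixels` is a flat list of
    (r, g, b) tuples in 0-255."""
    shadow = sum(1 for p in pixels if min(p) <= low)
    highlight = sum(1 for p in pixels if max(p) >= high)
    total = len(pixels)
    return {"shadow_pct": 100 * shadow / total,
            "highlight_pct": 100 * highlight / total}

# Synthetic image: 5 black, 3 white, 92 mid-tone pixels.
pixels = [(0, 0, 0)] * 5 + [(255, 255, 255)] * 3 + [(120, 130, 140)] * 92
stats = clipping_stats(pixels)
# -> {'shadow_pct': 5.0, 'highlight_pct': 3.0}
```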
Sample Navigation and Keyboard Shortcuts
Given a sample set of 200 images When navigating with Right/Left arrows, Home/End, or clicking thumbnails in the filmstrip Then the next image renders within 300 ms on cache hit and 800 ms on first load, with prefetch of the next two images Given keyboard shortcuts are used When pressing "M", "R", "G", "S", "T", or "W", or holding Space for pan Then the mapped action occurs and a non-intrusive toast shows the action name and shortcut for 2 seconds Given the first or last image is in view When pressing Left at first or Right at last Then navigation wraps around only if "Wrap" is enabled; otherwise an edge hint is shown and focus remains
Download/Export Processed Samples and Reports
Given one or more samples are selected When the user clicks "Download Processed" Then a ZIP containing processed images with original filenames and a JSON summary (image_id, pass/fail reasons, diff metrics, processing time and cost estimates) is generated and downloaded Given a selection of up to 50 images totaling ≤500 MB When exporting Then the ZIP is prepared server-side and the download starts within 10 seconds; progress is displayed and the operation is cancellable Given color profiles are embedded in processed files When files are exported Then processed images retain the intended ICC profile and metadata is preserved or stripped according to export settings
Accessibility and Compliance
Given keyboard-only navigation When reviewing all controls and overlays Then all functions are operable via keyboard with a logical focus order and visible focus states Given a screen reader such as NVDA, JAWS, or VoiceOver When announcing UI controls and overlay states Then controls expose accessible names, roles, states, and live updates; opacity, zoom, and mode changes are announced within 500 ms Given WCAG 2.2 AA criteria When auditing color contrast, target sizes, and motion Then controls meet contrast ≥4.5:1, interactive targets are ≥24x24 px, reduced-motion preference disables animated transitions, and no content flashes more than 3 times per second
Time & Cost Estimator
"As an independent seller, I want to know how long processing will take and what it will cost so that I can plan my listing schedule and stay within budget."
Description

Estimate total processing time and monetary cost for running the selected preset across the entire batch by extrapolating from the test run, factoring in current queue load, hardware tier, image resolution distribution, and pricing rules. Present min/avg/max time ranges, confidence indicators, and currency breakdown. Update estimates dynamically when the preset, rules, or sample composition changes. Persist the estimate alongside the validation report and surface alerts if estimates exceed workspace budgets or SLA targets.

Acceptance Criteria
Extrapolate from Test Batch to Full Batch
Given a completed test batch of at least 10 images processed with preset P and recorded per-image timings and costs by resolution bucket When the estimator is opened for a full batch of M images Then the total estimated cost equals the sum over resolution buckets of (avg per-image cost from the test batch × image count in the full batch) using the current pricing rules, with rounding tolerance ≤ ±0.01 in the workspace currency And the total estimated processing time equals the sum over resolution buckets of (avg per-image processing time from the test batch × image count in the full batch), expressed in minutes And images flagged as unsupported are excluded from both totals and displayed as an excluded count
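The per-bucket extrapolation above can be sketched as a small function (a minimal sketch; the bucket names, dict shapes, and field names are illustrative, not part of the spec):

```python
def estimate_full_batch(test_stats, full_counts):
    """Extrapolate cost/time from test-batch averages to the full batch.

    test_stats: {bucket: {"avg_cost": float, "avg_time_min": float}}
        per-image averages measured on the test batch, by resolution bucket.
    full_counts: {bucket: int} image counts in the full batch; buckets
        absent from test_stats are treated as unsupported and excluded.
    Returns (total_cost, total_time_min, excluded_count).
    """
    total_cost = 0.0
    total_time = 0.0
    excluded = 0
    for bucket, count in full_counts.items():
        stats = test_stats.get(bucket)
        if stats is None:
            excluded += count  # unsupported bucket: excluded from totals
            continue
        total_cost += stats["avg_cost"] * count
        total_time += stats["avg_time_min"] * count
    # round to two decimals per the ≤ ±0.01 rounding tolerance
    return round(total_cost, 2), total_time, excluded
```

Excluded images are returned as a count so the UI can display them separately, matching the criterion.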
Factor Real-Time Queue Load and Hardware Tier
Given a current queue load Q (jobs ahead) and a selected hardware tier T with a configured performance multiplier When the estimator computes time ranges Then the ranges include queue wait time based on Q and the rolling average per-job time, and are scaled by the multiplier for T And the UI displays the current Q and selected T alongside the estimate And changing T updates the time estimates within 2 seconds of selection
Present Time Ranges with Confidence Indicator
Given test batch size N and coefficient of variation cv derived from per-image processing times When computing and displaying the estimate Then the UI shows numeric min, avg, and max durations with units (minutes/hours) and ensures min ≤ avg ≤ max And a confidence badge is displayed as High if N ≥ 50 and cv ≤ 0.20, Medium if N ≥ 20 and cv ≤ 0.35, else Low And the badge tooltip communicates expected error bounds: High ±15%, Medium ±25%, Low ±40%
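The badge thresholds above reduce to a simple mapping; a sketch under the stated N/cv cutoffs (return shape is illustrative):

```python
def confidence_badge(n, cv):
    """Map test-batch size n and coefficient of variation cv to a badge.

    Thresholds follow the acceptance criteria: High if n >= 50 and
    cv <= 0.20, Medium if n >= 20 and cv <= 0.35, else Low.
    Returns (badge, error_bound) where error_bound feeds the tooltip.
    """
    if n >= 50 and cv <= 0.20:
        return "High", "±15%"
    if n >= 20 and cv <= 0.35:
        return "Medium", "±25%"
    return "Low", "±40%"
```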
Show Currency and Cost Breakdown
Given workspace currency settings and active pricing rules (base, hardware surcharges, preset surcharges, discounts) When the estimator displays cost Then the UI shows currency code and symbol and a breakdown with per-image average, batch total, and line items (base, surcharges, discounts) And totals equal the sum of line items, rounded to two decimals, with no hidden adjustments And the pricing rules version/timestamp used is visible in a tooltip or details panel
Dynamic Refresh on Preset/Rules/Sample Changes
Given the estimator panel is open When the user changes preset settings, pricing rules, or the test batch composition (add/remove images) Then the estimate recalculates and the UI refreshes within 2 seconds And an "Updated" timestamp and change highlight are shown And if recalculation fails, an inline error is shown and the prior estimate is labeled "Stale" and visually dimmed
Persist Estimate with Validation Report
Given a completed test run with an active estimate When the user saves or exports the validation report Then the estimate (time ranges, confidence, cost breakdown, and inputs snapshot: preset version, pricing rules version, queue timestamp, hardware tier, resolution distribution) is persisted with the report And reopening the report later shows the same values regardless of subsequent changes to presets or pricing rules And the report appears in the Batch Validator history with the estimate accessible from the list and detail views
Budget and SLA Exceedance Alerts
Given a workspace remaining budget B and SLA target T_sla When the estimated total cost exceeds B Then a blocking red alert states the overage amount and the "Accept" action is disabled unless the user has override permission and confirms
When the estimated average or P95 processing time exceeds T_sla Then a non-blocking amber warning states the delta and provides actions to adjust preset, choose a faster hardware tier, or reduce batch size
One-click Accept/Tweak/Retry
"As a shop owner, I want to accept or adjust my preset and revalidate with minimal friction so that I can confidently roll out settings without wasting time."
Description

Provide primary actions to either accept the current preset and launch full-batch processing, open the preset editor with current settings for tweaks, or re-run the validator with a new sample in one click. Gate acceptance on meeting a configurable pass-rate threshold or require explicit override with confirmation and reason. Ensure idempotent job creation, atomic transition from validation to production run, and real-time status updates/notifications. Preserve context so edits in the preset editor can be revalidated and compared to prior runs.

Acceptance Criteria
Accept gated by pass-rate threshold with explicit override
Given an organization-level or preset-level pass-rate threshold is configured (default 95%) and the latest validator run is completed When the validator results screen loads Then the Accept button is enabled only if pass-rate >= threshold and disabled otherwise
Given the Accept button is disabled due to threshold not met When the user selects Override Then a confirmation modal requires a reason of at least 10 characters, displays the current pass-rate and threshold, and shows an explicit warning about risks
Given the user confirms the override with a valid reason When the override is submitted Then the production run is initiated and an audit log entry is recorded with userId, timestamp, presetVersion, validatorRunId, pass-rate, threshold, and override reason
Given the pass-rate >= threshold When the user clicks Accept Then the production run is initiated without requiring an override modal
Atomic promotion and idempotent production job creation
Given a validator run meets gating (or override confirmed) When the user clicks Accept Then exactly one production job is created with an idempotency key composed of orgId + batchId + presetVersion + validatorRunId and duplicate attempts within 5 minutes do not create additional jobs
Given the job creation succeeds When transitioning from validation to production Then the validator run is marked Promoted atomically only after the job is persisted and a jobId is returned; otherwise the run remains Unpromoted and an actionable error is shown
Given the job is created When viewing the job details Then metadata includes presetVersion, validatorRunId, source batchId, item counts, and a snapshot of estimated time and cost as of acceptance
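The idempotency rule above can be illustrated with an in-memory store (a sketch only; real job creation would persist the key and job atomically in the database, and the class/field names are assumptions):

```python
class PromotionStore:
    """In-memory sketch of idempotent production-job creation.

    The idempotency key mirrors the spec: orgId + batchId + presetVersion
    + validatorRunId. A second Accept with the same key returns the
    existing jobId instead of creating a duplicate job (the 409
    Already Promoted path).
    """
    def __init__(self):
        self._jobs = {}   # idempotency key -> jobId
        self._next = 1

    def accept(self, org_id, batch_id, preset_version, validator_run_id):
        key = (org_id, batch_id, preset_version, validator_run_id)
        if key in self._jobs:
            return self._jobs[key], False  # duplicate: surface existing job
        job_id = f"job-{self._next}"
        self._next += 1
        self._jobs[key] = job_id  # persist job, then mark run Promoted
        return job_id, True
```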
One-click Tweak opens editor with preserved context and compare
Given a completed validator run is in view When the user clicks Tweak Then the preset editor opens preloaded with the exact settings used by the latest run and the sampled images, metrics, and diffs context are preserved
Given the editor is open with preserved context When the user clicks Revalidate Then a new validator run executes against the same sample by default, with an option to pick a different sample, and the results view shows side-by-side comparison to the prior run (scores, pass/fail reasons, and visual diffs)
Given the user saves changes and exits the editor When returning to the validator results view Then the latest run is selected, prior runs remain available in history, and filters/selections persist across page refresh
One-click Retry runs validator with a new sample
Given a completed validator run is in view When the user clicks Retry Then a one-step sample picker offers Last Sample, Random (10, 50, 100), and Custom selection options with Random 50 preselected, and starting the run requires a single confirm click
Given a new validator run is started via Retry When the run begins Then the UI reflects Validating state within 3 seconds and shows an updated time and cost estimate
Given the Retry run completes When results are displayed Then pass/fail reasons, visual diffs, and summary metrics are shown and Accept gating is recalculated from this latest run
Given rate limits are in place When more than 5 validator runs are triggered for the same preset within 60 seconds Then the Retry control is disabled and a cooldown countdown is displayed until limits reset
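The 5-runs-per-60-seconds rule above is a sliding-window limit; a minimal sketch (class name, timestamp injection, and per-preset keying are assumptions for illustration):

```python
from collections import deque

class RetryRateLimiter:
    """Sliding-window limiter: at most `limit` validator runs per
    `window` seconds per preset. cooldown() reports how long the Retry
    control should stay disabled. Purely illustrative."""
    def __init__(self, limit=5, window=60.0):
        self.limit = limit
        self.window = window
        self._starts = {}  # preset_id -> deque of run start timestamps

    def try_start(self, preset_id, now):
        runs = self._starts.setdefault(preset_id, deque())
        while runs and now - runs[0] >= self.window:
            runs.popleft()  # drop runs that fell outside the window
        if len(runs) >= self.limit:
            return False    # limit hit: disable the Retry control
        runs.append(now)
        return True

    def cooldown(self, preset_id, now):
        """Seconds until the oldest in-window run expires (0 if allowed)."""
        runs = self._starts.get(preset_id, deque())
        if len(runs) < self.limit:
            return 0.0
        return max(0.0, self.window - (now - runs[0]))
```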
Real-time status updates and notifications
Given a validator run or production job is active When backend status changes occur Then in-app progress updates are displayed within 2 seconds via WebSocket; if WebSocket is unavailable, polling occurs every 10 seconds
Given key state transitions occur (Run Completed, Job Started, Job Completed, Job Failed) When notifications are enabled Then users receive an in-app toast with the state, primary metrics, and a deep link to details
Given email/webhook notifications are configured at the workspace level When a production job starts or finishes Then an email/webhook is sent containing jobId, presetVersion, validatorRunId, total items, succeeded/failed counts, startedAt, finishedAt, and duration
Concurrency safety for Accept across multi-clicks and clients
Given multiple Accept attempts occur (double-clicks, rapid retries, or from different clients) for the same presetVersion and validatorRunId within 30 seconds When requests reach the backend Then only the first succeeds in creating a production job and subsequent attempts return 409 Already Promoted with the existing jobId surfaced to the UI
Given the first Accept succeeds When the page is refreshed or opened on another device Then the UI shows the Promoted state with a link to the existing job and Accept is disabled
Given suppressed duplicate attempts occur When viewing the activity log Then one Promotion entry exists with additional Suppressed duplicate records capturing userId, timestamp, and source client
Validation Audit Trail & Versioned Reports
"As an operations lead, I want a complete history of validation runs and decisions so that I can audit quality, reproduce results, and roll back when necessary."
Description

Record every validation session with immutable metadata: preset version and diff, ruleset and thresholds, sample selection method and composition, per-image results and annotations, aggregate metrics, time/cost estimates, user decisions, and any overrides. Provide searchable history, sharable links, and export (PDF/CSV) for compliance and collaboration. Support retention policies by workspace and enforce role-based access to reports and sample outputs. Enable one-click rollback to a previously validated preset version.

Acceptance Criteria
Immutable Session Recording
Given a completed batch validation run in workspace W When the system persists the session Then the session record includes non-null fields: session_id, created_at (UTC ISO-8601), created_by_user_id, workspace_id, preset_id, preset_version, preset_diff (JSON Patch), ruleset_id, ruleset_thresholds, sample_selection_method, sample_composition, per_image_results (image_id, pass/fail reasons, annotations, visual_diff_uri), aggregate_metrics (pass_rate, avg_score, stdev), estimated_time_ms, estimated_cost_cents, user_decision, overrides (approver_id, rationale) And subsequent attempts to modify any of these fields return HTTP 409 and are audit-logged with actor_id and timestamp And repeated reads of the session return the same SHA-256 checksum for the persisted payload
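The repeated-reads-same-checksum requirement implies hashing a canonical serialization of the record; a sketch (field names shortened for illustration; the spec's full field list applies in practice):

```python
import hashlib
import json

def session_checksum(session):
    """Stable SHA-256 checksum of a persisted session payload.

    Canonicalizes via sorted keys and compact separators so repeated
    reads of the same record hash identically regardless of key order.
    """
    canonical = json.dumps(session, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Any post-persistence mutation changes the checksum, which is what makes it useful as an immutability check.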
Searchable Validation History & Filters
Given a workspace with >= 1,000 validation sessions over the last 90 days When a user applies filters (date range, preset_version, ruleset_id, user_decision, pass_rate >= threshold) and a full-text query on History Then only matching sessions are returned And the first page (<= 50 rows) renders within 2000 ms at p95 And results are sortable by created_at and pass_rate And the full-text query matches across metadata and per-image annotations And an empty state is shown when no records match
Role-Based Access Enforcement
Given RBAC roles Owner, Admin, Editor, Viewer, External Reviewer with defined permissions When a user attempts to view, export, share a report, or download sample outputs Then access is granted only if the role has the corresponding permission in that workspace And unauthorized attempts return HTTP 403 and are audit-logged with actor_id, action, resource_id And sample output URIs are time-limited and require the same authorization context
Shareable Link Lifecycle
Given a user with Share permission creates a shareable link to a specific session When they set scope (report-only | report+samples) and expiry (1 hour to 30 days) Then the system issues a signed URL that grants exactly the requested scope until expiry And revoking the link invalidates access within 60 seconds And each access is logged (viewer_id or IP, timestamp, user_agent) And the shared view is read-only and hides destructive actions
Export Reports (PDF/CSV) with Data Fidelity
Given a session record with up to 500 images When a user exports PDF and CSV Then the PDF includes metadata, aggregate metrics, per-image summaries (image_id, pass/fail reasons), visual diff thumbnails or URIs, and a document hash And the CSV contains one row per image with stable column headers and UTC timestamps And generation completes within 10 seconds at p95 And the contents of both exports match the persisted session data exactly
Workspace Retention Policies & Purge
Given workspace retention policy R days and optional legal holds on sessions When a session reaches age > R and has no active legal hold Then its report, exports, share links, and sample outputs are permanently deleted And search indexes are updated and access attempts return HTTP 410 Gone And purge runs at least daily and emits an audit log entry per session And sessions on legal hold are excluded from purge until the hold is removed
One-Click Rollback to Validated Preset Version
Given an authorized user views a previously validated preset version in History When they click Rollback and confirm Then that version becomes the active preset for the workspace within 60 seconds And an audit entry links the rollback action to the source session and prior active version And new validations started after propagation use the rolled-back version And rollback is blocked if there are uncommitted preset edits, showing a clear error message

RuleStream

Always-current rule engine that auto-syncs marketplace specs by region and category. Get change alerts with plain‑language summaries, auto-revalidate impacted images, and see exactly what needs updating—eliminating surprise rejections and last‑minute rework.

Requirements

Marketplace Spec Auto-Sync
"As an e-commerce seller, I want PixelLift to automatically keep marketplace rules up to date for my regions and product categories so that my images always meet current requirements without me having to track changes manually."
Description

Continuously ingest and normalize marketplace compliance specifications (e.g., Amazon, Etsy, eBay, Shopify) by region and category via scheduled pulls and webhook triggers. Map external rule fields (dimensions, background, watermark, file size/format, text overlays, margins) into PixelLift’s internal rule schema and category taxonomy. Provide resilient caching, rate limiting, and fallback to last-known-good rules on source outages. Detect deltas between versions to mark impacted categories/regions and set effective dates. Expose a health panel showing last sync time, source URLs, and any parser errors. Enables always-current compliance without manual updates, reducing listing rejections and rework.

Acceptance Criteria
Scheduled Pull Normalizes Multi-Market Rules by Region & Category
Given scheduled sync interval is configured to 60 minutes and sources for Amazon, Etsy, eBay, and Shopify are enabled When the scheduler runs at the configured interval Then the system fetches specs for each enabled marketplace-region-category combination from the configured source URLs And parses and normalizes fields into the internal rule schema: dimensions (unit-normalized), background, watermark, file size (bytes), file format (enum), text overlays, margins (percent or pixels) And maps each rule to the internal category taxonomy ID for the corresponding marketplace-region-category And persists a normalized RuleSet with keys {marketplace, region, categoryId, versionId} including sourceUrl and fetchedAt timestamps And records metrics per marketplace for requests made, RuleSets stored, and failures And completes without error with success rate 100% for reachable sources
Webhook Change Event Triggers Idempotent Delta Sync
Given a valid marketplace webhook notification for a specific marketplace-region-category with a unique changeId is received When the webhook is processed Then the system fetches the latest source spec and computes a field-level delta against the last stored version And creates a new version only if a non-empty delta exists And sets version metadata: versionId, changeId, effectiveDate (from source or now), and delta summary And ensures idempotency by ignoring duplicate webhooks with the same changeId for 24 hours And enforces a single in-flight sync per marketplace-region-category via locking
Version Delta Detection Marks Impacted Categories/Regions and Effective Dates
Given two consecutive RuleSet versions exist for a marketplace-region-category When any of the mapped fields (dimensions, background, watermark, file size/format, text overlays, margins) differ between versions Then the system generates an ImpactRecord listing impacted marketplace, region, and categoryId with change types per field And marks the affected marketplace-region-category as "Impacted" until the effectiveDate passes And assigns effectiveDate from the source if provided; otherwise sets effectiveDate to the ingestion timestamp And emits a RuleSetChanged event containing marketplace, region, categoryId, previousVersionId, newVersionId, effectiveDate
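The field-level delta described above can be sketched as a comparison over the mapped fields (the return structure and change labels are illustrative assumptions; the field list follows the spec):

```python
MAPPED_FIELDS = ("dimensions", "background", "watermark",
                 "file_size", "file_format", "text_overlays", "margins")

def rule_delta(prev, curr):
    """Field-level delta between two RuleSet versions.

    Returns {field: {"change": "added"|"removed"|"changed",
    "before": ..., "after": ...}} for the mapped fields only.
    An empty dict means no new version needs to be created.
    """
    delta = {}
    for field in MAPPED_FIELDS:
        before, after = prev.get(field), curr.get(field)
        if before == after:
            continue
        if before is None:
            change = "added"
        elif after is None:
            change = "removed"
        else:
            change = "changed"
        delta[field] = {"change": change, "before": before, "after": after}
    return delta
```

A non-empty result would drive both the new version creation (webhook path) and the per-field change types on the ImpactRecord.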
Rate Limiting, Caching, and Retry Backoff Protect Source APIs
Given provider rate limits and a cache TTL are configured When the sync client calls provider APIs Then requests are paced so that the configured rate limits are never exceeded And responses are cached per source URL for the configured TTL and served from cache when within TTL And on HTTP 429/5xx/timeouts, the client retries up to 3 times with exponential backoff starting at 1s and jitter And after retries are exhausted, the sync marks the source as DEGRADED and continues with other sources
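The retry schedule above (3 attempts, exponential backoff from 1 s, with jitter) might look like this; full jitter is one common strategy, assumed here rather than mandated by the spec:

```python
import random

def backoff_delays(attempts=3, base=1.0, rng=None):
    """Backoff delays with full jitter for HTTP 429/5xx/timeout retries.

    Attempt i waits up to base * 2**i seconds (1 s, 2 s, 4 s, ...),
    drawn uniformly so concurrent clients don't retry in lockstep.
    """
    rng = rng or random.Random()
    return [rng.uniform(0, base * (2 ** i)) for i in range(attempts)]
```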
Fallback to Last-Known-Good Rules on Source Outage
Given a source API experiences sustained failures (>=3 consecutive attempts within 15 minutes) When a sync is attempted Then the system serves and exposes the last-known-good RuleSet for all affected marketplace-region-category combinations And flags the affected scope with usingFallback=true and records outage details And prevents deletion or overwriting of the last-known-good data until a successful sync occurs And clears the fallback flag automatically on the next successful sync
Health Panel Exposes Sync Status, Source URLs, and Parser Errors
Given the health API endpoint /rulestream/health is queried When the request is made with or without filters (marketplace, region, categoryId, status) Then the response includes for each marketplace-region-category: lastSyncStartedAt, lastSyncSucceededAt, sourceUrl(s), status (OK/DEGRADED/ERROR), currentVersionId, parserErrors[] And the endpoint responds within 800 ms for up to 10,000 records And parser errors include categoryId, field, message, and sample payload snippet And the data reflects the latest completed sync run
Schema Mapping Completeness for Core Fields
Given a source spec payload contains fields for dimensions, background, watermark, file size, file format, text overlays, and margins When the payload is parsed Then 100% of required fields are mapped to the internal schema or a parser error is recorded with details And units are normalized (e.g., inches/cm to mm) and enumerations are validated against allowed values And invalid or unsupported fields do not block processing of supported fields; partial mappings are flagged with severity=WARNING And the stored RuleSet passes schema validation with no critical errors
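Unit normalization to millimetres, as referenced above, is a small lookup; a sketch in which the supported unit set is illustrative and a ValueError stands in for recording a parser error:

```python
# Conversion factors to millimetres; the unit set is an assumption.
_TO_MM = {"mm": 1.0, "cm": 10.0, "in": 25.4, "inch": 25.4, "inches": 25.4}

def normalize_length_mm(value, unit):
    """Normalize a dimension to millimetres, or raise ValueError so the
    parser can record a field-level error instead of silently dropping
    the field (partial mappings are flagged, not blocking)."""
    factor = _TO_MM.get(unit.strip().lower())
    if factor is None:
        raise ValueError(f"unsupported unit: {unit!r}")
    return value * factor
```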
Plain-Language Change Summaries
"As a catalog manager, I want plain-language alerts explaining what changed and which listings are affected so that I can quickly assess impact and prioritize fixes."
Description

Generate human-readable summaries for rule changes with clear highlights of what changed, why it matters, affected categories/regions, effective dates, and severity (blocking vs advisory). Provide concise diffs (before/after) and link to full rule details. Deliver alerts across channels (in‑app notifications, email, Slack/webhook) with actionable CTAs to review impact or start revalidation. Summaries use non-technical language and examples (e.g., “Main image must have pure white background (#FFFFFF)”); localize terms and units per user locale. Improves awareness and reduces time to understand and act on changes.

Acceptance Criteria
In-App Alert: Plain-Language Summary with CTAs
Given a marketplace rule update affects at least one category or region in the user’s workspace When RuleStream syncs changes Then an in-app notification is created within 15 minutes containing: a human-readable title, what changed, why it matters, affected categories/regions, severity (Blocking or Advisory), effective date/time, and a concise before/after diff snippet, plus a link to full rule details And Then the notification shows a Review Impact CTA that opens the impact view filtered to the specific change And Then the notification shows a Start Revalidation CTA only if impacted images > 0; otherwise the CTA is hidden or disabled with an explanation And Then clicking Start Revalidation enqueues revalidation for all impacted images and displays progress status in-app
Email Notification: Localized Summary with Before/After Diff
Given a user has email alerts enabled and a locale/time zone set When a rule change is synced that impacts the user’s selected marketplaces/regions/categories Then an email is sent within 15 minutes with a subject that includes severity, marketplace, region, and a short change summary And Then the email body includes: what changed, why it matters, affected categories/regions, effective date/time in the user’s time zone, a concise before/after diff, at least one simple example, and links to Review Impact and full rule details And Then terms, dates, color codes, and units are localized to the user’s locale (e.g., cm vs inches, date format, decimal separators) without altering numeric accuracy And Then the email is sent once per unique rule change per workspace (no duplicates) and includes a Manage Notifications link respecting user preferences
Slack/Webhook Delivery: Actionable Alert and Structured Payload
Given a workspace has Slack integration configured When a relevant rule change is synced Then a Slack message posts within 15 minutes using clear, plain language and includes severity color/label, what changed, why it matters, affected categories/regions, effective date/time, a concise before/after diff snippet, and links to Review Impact and full rule details And Then all links in the Slack message deep-link to the corresponding in-app views
Given a workspace has a generic webhook configured When a relevant rule change is synced Then a POST is delivered with a JSON payload that includes: change_id, severity, marketplace, regions, categories, effective_at (ISO 8601), summary_text, diff.before, diff.after, examples[], urls.review_impact, urls.rule_details, locale, impacted_counts, created_at And Then the webhook is HMAC-SHA256 signed via shared secret with X-Signature header, uses an Idempotency-Key, and retries up to 3 times with exponential backoff on 5xx responses
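The HMAC-SHA256 signing requirement above can be sketched with the standard library; the "sha256=" signature prefix and canonical JSON encoding are common conventions assumed here, not specified:

```python
import hashlib
import hmac
import json

def sign_webhook(payload, secret):
    """Return the body bytes and X-Signature header value for a webhook.

    Signs the exact bytes that go on the wire with HMAC-SHA256 over the
    shared secret, per the acceptance criteria.
    """
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    digest = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return body, "sha256=" + digest

def verify_webhook(body, signature, secret):
    """Receiver-side check using a constant-time comparison."""
    digest = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest("sha256=" + digest, signature)
```

The receiver must verify against the raw request body, since re-serializing the JSON can change byte order and break the signature.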
Diff Generation: Concise, Accurate Before/After Highlights
Given the previous and current versions of a marketplace rule When generating the change summary Then the before/after diff highlights only modified elements (added/removed/changed) and preserves exact tokens (e.g., “#FFFFFF”) And Then each side of the diff is truncated to a maximum of 240 characters with ellipses if longer, and provides a View full rule link And Then the diff includes one positive and one negative example illustrating compliance vs non-compliance And Then visual diff indicators meet accessibility contrast (WCAG AA) and include text labels for screen readers
Severity and Effective Date Display Logic
Given rule metadata indicates enforcement level When summarizing Then severity is mapped to Blocking or Advisory and displayed consistently across all channels with corresponding label and color
Given the rule has an effective date/time When summarizing Then the effective date/time is shown in the user’s time zone and formatted per locale; if effective immediately, display Effective now And Then summaries for Blocking changes include an Action required badge; Advisory changes include a Recommended badge
Plain-Language Quality and Examples
Given technical rule text is available When creating the human-readable summary Then the text is non-technical, avoids undefined jargon, and achieves a readability score at or below 8th-grade level (Flesch-Kincaid Grade <= 8.0) And Then the summary includes at least one concrete example with localized units/terms (e.g., “Main image must have a pure white background (#FFFFFF)”) and is <= 120 words And Then no term from the prohibited/ambiguous-terms list (e.g., “utilize”, “henceforth”, “hereunder”) appears in the summary
Impact Review and Revalidation Flow
Given a rule change is detected When the user opens Review Impact from any channel Then the impact view is scoped to the change_id and lists impacted assets grouped by severity (Blocking vs Advisory), with exact counts And Then if auto-revalidation is enabled for the workspace, impacted assets are queued within 15 minutes and results update the counts in real time; otherwise the user can trigger Start Revalidation from the view And Then revalidation job status (queued, running, completed) and outcome (Pass/Fail) are visible, and completing revalidation updates the alert to Resolved if all assets pass
Impact Analysis & Auto-Revalidation
"As a merchandising lead, I want PixelLift to automatically revalidate affected images when rules change so that I can see exactly what will fail and fix it before listings are rejected."
Description

Identify all assets, listings, and style-presets impacted by rule deltas using category/region mapping and historical validation results. Automatically queue revalidation jobs to test images against new rules (e.g., background color, aspect ratio, edge whitespace, watermarks, text overlays, file type/size). Produce pass/fail results with reasons, confidence scores, and remediation tags. Update dashboards with affected counts, risk levels, and deadlines based on effective dates; support filtering by marketplace, region, category, and account. Minimizes surprise rejections by proactively catching upcoming noncompliance.

Acceptance Criteria
Impact Analysis on Background Color Rule Delta
Given a published rule delta changing background color for marketplace=MKT-A, region=EU, category=Shoes with effective_date set When impact analysis runs Then 100% of assets, listings, and style-presets historically validated under (MKT-A, EU, Shoes) are evaluated within 10 minutes And the impacted set includes all and only items that would fail under the new rule And each impacted item record includes id (asset/listing/preset), marketplace, region, category, impacted_rule_ids, effective_date, and last_validated_at
Auto-Queue Revalidation for Aspect Ratio Change
Given impacted items exist for a rule delta on aspect_ratio When the delta is published Then revalidation jobs for all impacted items are enqueued within 5 minutes And job priority is ordered by ascending effective_date and higher risk_level first And the queue prevents duplicate jobs per (asset_id, rule_version) pair And a retry policy (max 3 attempts, exponential backoff up to 15 minutes) is applied for transient failures
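The ordering and dedupe rules above (ascending effective_date, then higher risk, no duplicate per (asset_id, rule_version)) can be sketched with a heap; class name and risk labels are illustrative:

```python
import heapq

class RevalidationQueue:
    """Priority queue sketch for revalidation jobs: ascending
    effective_date first, then higher risk_level, with duplicate
    suppression per (asset_id, rule_version)."""
    RISK_RANK = {"High": 0, "Medium": 1, "Low": 2}  # lower pops first

    def __init__(self):
        self._heap = []
        self._seen = set()

    def enqueue(self, asset_id, rule_version, effective_date, risk):
        key = (asset_id, rule_version)
        if key in self._seen:
            return False  # duplicate job suppressed
        self._seen.add(key)
        # ISO-8601 date strings sort correctly as plain strings
        heapq.heappush(self._heap, (effective_date, self.RISK_RANK[risk],
                                    asset_id, rule_version))
        return True

    def pop(self):
        _, _, asset_id, rule_version = heapq.heappop(self._heap)
        return asset_id, rule_version
```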
Pass/Fail Results with Reasons, Confidence, and Remediation Tags
Given a revalidation job completes for an asset When rules are evaluated Then a result is produced with status PASS or FAIL for the asset And for each failed rule the result includes rule_id, reason_code, human_readable_reason, confidence_score between 0 and 1 with two-decimal precision, and at least one remediation_tag And the result is persisted and available via dashboard and API within 1 minute
Dashboard Affected Counts, Risk Levels, and Deadlines
Given at least one rule delta with impacted items exists When the dashboard loads Then it displays total impacted counts and breakdowns by marketplace, region, category, and account And shows risk_level as High when effective_date <= 7 days and impacted_count > 0, Medium when 8–30 days, Low when > 30 days And displays deadline equal to the rule's effective_date for each group And all aggregates refresh within 1 minute after new results are persisted
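The risk mapping above is a pure function of days-to-effective-date and impacted count; a sketch (the None return for zero impact is an assumption, since the criterion only defines levels when something is impacted):

```python
from datetime import date

def risk_level(effective_date, today, impacted_count):
    """Dashboard risk mapping per the acceptance criteria: High when the
    rule takes effect within 7 days and anything is impacted, Medium at
    8-30 days, Low beyond 30 days."""
    if impacted_count == 0:
        return None  # nothing impacted: no risk entry (assumption)
    days = (effective_date - today).days
    if days <= 7:
        return "High"
    if days <= 30:
        return "Medium"
    return "Low"
```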
Filtering by Marketplace, Region, Category, and Account
Given the dashboard contains impacted data across multiple marketplaces, regions, categories, and accounts When the user applies filters marketplace=MKT-A, region=US, category=Apparel, account=ACC-123 Then lists, charts, and counts display only items matching all selected filters And the URL/query state reflects the filters for shareable deep links And clearing filters restores unfiltered totals within 2 seconds
Idempotent Revalidation on Rule Version Updates
Given a rule delta is superseded by a new version before its effective_date When the new version is received Then queued jobs for the old version are cancelled within 2 minutes And assets already revalidated against the old version are requeued only if their expected outcome differs under the new version And the system guarantees no more than one active job per (asset_id, rule_version) at any time
One-Click Remediation Suggestions
"As a seller, I want one-click, compliant fixes for failing images so that I can remediate issues at scale without manual editing."
Description

Provide prescriptive, auto-generated fixes for failed validations and enable one-click batch remediation. Supported actions include adjusting canvas to required aspect ratio, adding padding for edge whitespace, enforcing pure white or compliant background, converting file format/quality to meet size caps, and removing disallowed text/watermarks. Integrate with PixelLift style-presets to suggest safe preset updates and version bumps; allow preview and selective apply with rollback. Track changes back to the specific rule version that prompted the fix. Accelerates recovery from rule changes while preserving brand consistency.

Acceptance Criteria
Auto-Generated Fix Suggestions for Failed Validations
Given a batch upload contains images failing marketplace rules by region and category When the user opens the RuleStream Remediation panel for that batch Then the system displays for each failed rule on each image: a prescriptive fix action, a plain‑language explanation referencing rule name and version, a preview thumbnail, and estimated impact (aspect ratio, background, file size) And suggestions are generated for up to 500 images within 60 seconds And every suggestion is traceable to the specific rule ID and version that triggered it
One-Click Batch Remediation Execution and Auto-Revalidation
Given the user selects any subset of suggested fixes across the batch When the user clicks Remediate All Then the system applies the selected fixes atomically per image, creating a new image version and preserving the original And completes processing of 500 images within 10 minutes under normal load And displays a completion summary with counts: succeeded, failed, retried, skipped And automatically revalidates all remediated images against the active rule set And shows pass/fail per image with links to any remaining issues And partial failures do not block other images; failed items have actionable error messages and a retry option
Aspect Ratio and Edge-Whitespace Compliance via Canvas Adjust/Padding
Given images fail aspect ratio or edge‑whitespace rules When the user accepts the "Adjust canvas/add padding" suggestion Then each output image matches the required aspect ratio within ±0.01 tolerance And minimum edge whitespace meets or exceeds the rule requirement And no subject clipping is introduced; the subject remains fully visible And revalidation for aspect ratio and edge‑whitespace rules passes And output dimensions and file size remain within marketplace caps
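The no-clipping guarantee above follows from only ever growing the canvas: expand the shorter dimension until the canvas reaches the target ratio, never crop. A minimal sketch of that geometry (the function name and tolerance handling are illustrative, not PixelLift's actual pipeline):

```python
def pad_to_aspect(w: int, h: int, target_ratio: float, tol: float = 0.01):
    """Return (new_w, new_h) for a canvas matching target_ratio
    (width/height) within tol while fully containing the original
    w x h image, so no subject clipping is possible."""
    if abs(w / h - target_ratio) <= tol:
        return w, h  # already compliant, leave untouched
    if w / h < target_ratio:
        return round(h * target_ratio), h   # too tall: widen the canvas
    return w, round(w / target_ratio)       # too wide: heighten the canvas
```

For example, an 800x1000 image padded to a 1:1 requirement becomes a 1000x1000 canvas, with the extra 200 px distributed as edge whitespace around the subject.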
Background Compliance and Text/Watermark Removal
Given images fail background or disallowed text/watermark rules When the user accepts the "Enforce background and remove text/watermarks" suggestion Then background is set to the rule‑compliant color/template (e.g., pure white #FFFFFF) per rule And OCR/watermark detection confidence for disallowed elements is below the rule's rejection threshold And no text overlays remain except those explicitly allowed by the rule And revalidation for background and text/watermark rules passes
Format/Quality Conversion to Meet File Size Caps
Given images exceed file size caps or have non‑compliant format/color profile When the user accepts the "Convert format/quality" suggestion Then output format, color profile, and metadata match the rule requirements And file size is ≤ the marketplace cap while maintaining SSIM ≥ 0.98 versus the input And pixel dimensions meet min/max constraints And revalidation for file type, color profile, and size passes
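Because encoded size grows monotonically with quality for a fixed codec, the highest quality that still fits the cap can be found by binary search, with SSIM then verified on the winning candidate. A sketch with an injected encoder callback (the callback stands in for a real JPEG/WebP encoder; this is an assumption, not PixelLift's implementation):

```python
def max_quality_under_cap(encode, cap_bytes: int, q_min: int = 1, q_max: int = 95):
    """Binary-search the highest integer quality whose encoded size fits
    cap_bytes. `encode(q)` returns the encoded bytes at quality q and
    must be non-decreasing in size as q rises. Returns (quality, data),
    or None if even q_min exceeds the cap."""
    if len(encode(q_min)) > cap_bytes:
        return None  # cannot meet the cap by quality reduction alone
    lo, hi = q_min, q_max
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if len(encode(mid)) <= cap_bytes:
            lo = mid  # mid fits; try higher quality
        else:
            hi = mid - 1
    return lo, encode(lo)
```

This takes O(log q_max) encodes per image rather than scanning every quality level, which matters when converting hundreds of images per batch.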
Style-Preset Safe Update, Preview, Selective Apply, and Rollback with Traceability
Given a rule change conflicts with the current PixelLift style‑preset When the user opens preset suggestions Then the system proposes a new preset version with only rule‑safe parameter changes, listing the changed parameters And the user can preview side‑by‑side before/after on sample images within 3 seconds per preview And the user can selectively apply the new preset version to chosen catalogs or images And a one‑click rollback restores prior image versions and reverts the preset to the previous version And all actions are logged with timestamp, user, impacted assets, and the originating rule ID/version And remediated images are revalidated and results recorded
Rule Versioning & Audit Trail
"As a compliance officer, I want a complete versioned history of rules, validations, and fixes so that I can demonstrate due diligence and trace issues when marketplaces question a listing."
Description

Maintain immutable, versioned snapshots of marketplace rules per source/region/category with timestamps, source references, and content hashes. Store validation outcomes and remediation actions per asset, linked to the rule version in effect at the time. Provide exportable audit logs (CSV/JSON/PDF) and APIs for compliance evidence, including who approved changes and when. Support rollback to prior rule mappings if a feed is erroneous. Ensures traceability for enterprise customers and simplifies dispute resolution with marketplaces.

Acceptance Criteria
Create Immutable Rule Snapshot on Sync
Given a new or changed marketplace rule feed is detected for a specific source/region/category When the sync job runs Then the system creates a new rule version snapshot with fields: versionId (UUID), sourceReference, region, category, fetchedAt (ISO 8601 UTC), contentHash (SHA-256), and rawRulePayload And the snapshot content is immutable; any update attempt returns 409 and does not alter contentHash or payload And retrieving by versionId returns the exact stored snapshot bytes And if the computed contentHash matches the latest active snapshot for the same source/region/category, no new version is created and the sync is recorded as idempotent
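The idempotency clause hinges on hashing a canonical serialization, so that two feeds with identical content but different key order hash identically. A minimal sketch (function names are illustrative):

```python
import hashlib
import json

def content_hash(payload: dict) -> str:
    """SHA-256 over canonical JSON: sorted keys and fixed separators
    make the hash independent of key order and whitespace."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def should_create_version(payload, latest_hash):
    """Idempotency gate: snapshot only when the content actually changed."""
    return content_hash(payload) != latest_hash
```

Real feeds may need additional normalization (e.g., of timestamps or field casing) before hashing; what counts as "the same content" is a policy decision for the sync job.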
Link Validation Outcomes to Rule Version
Given an asset is validated against marketplace rules When validation completes Then the system stores a validation record with assetId, ruleVersionId used, validatorEngineVersion, validatedAt (UTC), outcome (Pass/Fail), and violations[] And the record persists unchanged even if newer rule versions are activated later And remediation actions (manual or automated) are appended with actionType, actorId, actionAt (UTC), and reference the originating validation recordId And API/UI retrieval returns a complete chronological history per asset including all validation records and remediation actions
Export Audit Logs for Compliance
Given a user with audit:read scope requests an export via API or UI with filters (date range ≤ 90 days, region, category, assetIds[], outcome) When the export is generated Then CSV and JSON files are produced for up to 100,000 records within 2 minutes and a PDF summary for up to 5,000 records within 2 minutes And each record includes assetId, ruleVersionId, ruleFetchedAt, sourceReference, region, category, contentHash, outcome, violations, remediation actions, actorIds, decision/approval info, and timestamps (UTC) And download URLs are returned with SHA-256 checksums and 24-hour expiry And exports larger than 100,000 records are paginated via a cursor-based API without data loss or duplication across pages And all export requests are themselves logged in the audit trail with requesterId, requestedAt, filters, and file checksums
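The no-loss/no-duplication guarantee for large exports is why the API is cursor-based rather than offset-based: offsets shift when rows are inserted mid-export, cursors do not. A minimal in-memory sketch, assuming records carry a monotonically increasing id (field names and the small page size are illustrative):

```python
def export_page(records, cursor=None, page_size=3):
    """Return (page, next_cursor). `records` are ordered by a
    monotonically increasing 'id'; the cursor is the last id served,
    so concurrent inserts never shift or duplicate earlier pages."""
    start = cursor if cursor is not None else 0
    page = [r for r in records if r["id"] > start][:page_size]
    next_cursor = page[-1]["id"] if len(page) == page_size else None
    return page, next_cursor
```

A production implementation would translate the cursor into an indexed `WHERE id > ?` query; the logic is otherwise the same.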
Record Rule Change Approvals
Given a proposed rule mapping change requires approval before activation When an approver with role Compliance Admin approves or rejects the change Then an approval record is stored with approverId, decision (Approve/Reject), decisionAt (UTC), changeSummary, priorVersionId, newVersionId, and optional rationale And the approval record is tamper-evident via an HMAC signature over its fields using the organization key And activation of a rule version without an approval record returns 403 and the version remains inactive And audit queries by versionId return who approved and when
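The tamper-evidence requirement maps directly onto an HMAC over the canonicalized approval fields. A sketch using the standard library (key handling and field selection are simplified assumptions):

```python
import hashlib
import hmac
import json

def sign_approval(record: dict, org_key: bytes) -> str:
    """HMAC-SHA256 over a canonical serialization of the approval
    fields, keyed with the organization key."""
    msg = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(org_key, msg, hashlib.sha256).hexdigest()

def verify_approval(record: dict, signature: str, org_key: bytes) -> bool:
    """Constant-time comparison guards against timing side channels."""
    return hmac.compare_digest(sign_approval(record, org_key), signature)
```

Any post-hoc edit to a signed field (e.g., flipping a decision from Reject to Approve) invalidates the signature, which is what makes the record tamper-evident rather than merely tamper-proof.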
Rollback to Prior Rule Mapping
Given an active rule version is determined to be erroneous for a source/region/category When a user with rollback permission selects a prior version to activate Then the selected prior version becomes the active mapping within 60 seconds and a rollback event is recorded with requesterId, reason, priorActiveVersionId, newActiveVersionId, and timestamp (UTC) And no snapshots are deleted or modified; only the active pointer changes And all impacted assets are queued for revalidation against the restored version within 15 minutes And notifications are sent to subscribed users indicating rollback details and revalidation status
Auto-Revalidate on Rule Update
Given a new rule version becomes active for a source/region/category When activation occurs Then impacted assets in that scope are identified and queued for revalidation within 5 minutes And 95% of queued assets up to 50,000 complete revalidation within 60 minutes And new validation records reference the new ruleVersionId while preserving prior records intact And assets that transition from Pass to Fail are flagged and included in a change-impact report accessible via API
Rule Testing Sandbox
"As a brand ops manager, I want to test upcoming rules against my catalog and presets so that I can anticipate failures and adjust workflows before changes go live."
Description

Offer a sandbox to simulate current, upcoming, or custom rule sets against selected assets and style-presets without affecting production. Allow users to import proposed marketplace changes or upload custom rule JSON for private channels, run validations, and preview remediation outcomes. Provide what-if comparisons (current vs upcoming) and estimated effort to fix. Enable promotion of a tested rule set to production with safeguards and approvals. Helps teams prepare for changes and de-risk rollouts.

Acceptance Criteria
Upload and Validate Custom Rule JSON
- Given a user with Rules:Manage permission uploads a custom rule JSON file (<=10 MB) conforming to PixelLift RuleStream schema v1.2, When validation runs, Then the system accepts the upload, displays parsed rule count, channel=Private, and assigns a rule-set version ID. - Given a JSON that violates schema or has syntax errors, When validation runs, Then the system rejects the upload and returns a clear error list with file name, line, column, and up to 100 issues; no rule set is created. - Given duplicate rule IDs or key collisions, When validation runs, Then the system blocks save and prompts to auto-namespace or fix conflicts before proceeding. - Given references to unknown marketplace categories or regions, When validation runs, Then the system requires mapping to known categories/regions before enabling Run; unresolved references keep status=Draft.
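The duplicate-ID criterion in particular is a single pass over the parsed rules. A minimal sketch of the collision report (the rule shape and field names are illustrative, not the RuleStream schema):

```python
def find_rule_id_collisions(rules):
    """Return rule IDs that appear more than once, in first-collision
    order, so the UI can prompt auto-namespacing or a manual fix."""
    seen, collisions = set(), []
    for rule in rules:
        rid = rule.get("id")
        if rid in seen and rid not in collisions:
            collisions.append(rid)
        seen.add(rid)
    return collisions
```

A non-empty result would block the save, per the criterion above; schema and reference checks would run alongside this in the same validation pass.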
Import Proposed Marketplace Rule Changes
- Given a supported marketplace, region, and category are selected, When the user clicks Import Upcoming Changes, Then the system fetches the latest proposed rules, tags them with Effective Date, and shows a plain-language summary with counts of Added/Modified/Removed rules and links to sources. - Given the marketplace feed is unavailable, When import is attempted, Then the system shows a retry/backoff status and offers manual file import; successful manual import is tagged Source=Manual. - Given an upcoming rule set is imported, When saved, Then it is read-only (annotatable), versioned, and available for sandbox runs without altering production rules.
Non-Destructive Sandbox Validation Run
- Given a user selects up to 10,000 assets and one or more style-presets, and selects a rule set (current, upcoming, or custom), When Run Validation in Sandbox is started, Then the system validates all selected combinations without writing to production assets, metadata, caches, or publish statuses. - Given the run completes, When results are stored, Then they are saved under a Sandbox Project ID with full audit log and auto-expire in 30 days; production automations are not triggered. - Given asset selection exceeds 10,000, When run is initiated, Then the system blocks start and prompts to split the batch or request a quota increase.
What-If Comparison: Current vs Upcoming Rules
- Given the same asset set is evaluated against Current and Upcoming rule sets, When the comparison completes, Then the system displays side-by-side pass/fail totals, newly failing/passing asset lists, and the specific rules causing status changes. - Given comparison results are shown, When the user exports, Then a CSV export containing asset IDs, rule IDs, current status, upcoming status, and delta is downloadable. - Given per-rule estimated fix time defaults (minutes) are configured, When the comparison identifies failures, Then the system displays total estimated effort (hours), average per asset, and breakdown by rule.
Remediation Preview for Failed Assets
- Given failed assets have auto-remediable rules, When the user requests Preview Fix, Then the system generates non-destructive preview thumbnails (min 1024px) showing proposed changes with visual diffs within 60 seconds per asset, without altering originals. - Given an issue is not auto-remediable, When viewing details, Then the system provides plain-language remediation guidance referencing the violated rule and required change. - Given previews are generated, When the user downloads the remediation plan, Then a JSON/CSV containing asset IDs, proposed actions, and estimated effort is provided.
Governance and Safeguards for Promotion to Production
- Given a sandbox rule set has passed review, When Promote to Production is initiated, Then at least two distinct approvers with Rules:Approve must approve; the requester cannot self-approve. - Given promotion pre-checks are configured (0 critical fails, <=5% non-critical fails), When the system runs a gating validation on a defined sample or full set, Then promotion is blocked unless thresholds are met or documented waivers exist. - Given promotion succeeds, When finalization occurs, Then the system records an immutable audit entry (timestamp, approvers, rule-set hash) and provides one-click rollback to the previous production version.
Performance and Progress for Batch Sandbox Runs
- Given a sandbox run includes up to 5,000 assets, When validation starts, Then 95% of runs complete within 5 minutes and display a live progress indicator with remaining time estimate. - Given a sandbox run is executing, When the user pauses or cancels, Then the system safely pauses/cancels within 30 seconds and preserves partial results for later resume. - Given runs up to 10,000 assets, When completed, Then per-asset and per-rule timing metrics are available for download to support capacity planning.

Category IQ

Automatically identifies the correct product category from image cues and listing metadata, then applies the right, category‑specific checks. Cuts false flags and ensures precise validation (e.g., apparel vs. jewelry nuances) without manual mapping or guesswork.

Requirements

Multimodal Category Classification Engine
"As a boutique seller, I want the system to automatically detect the correct category from my photos and listing details so that I don’t have to map categories manually and can trust downstream checks."
Description

Build a production-grade classifier that fuses image features with listing metadata (titles, tags, attributes) to predict the correct product category from PixelLift’s unified taxonomy. Must output category ID, full path, and confidence score; support top-k predictions and out-of-distribution detection; be resilient to background removal artifacts and partial occlusions. Expose as a horizontally scalable microservice (REST/gRPC) with p95 latency ≤300 ms per item at batch size 32, autoscaling, health checks, and detailed metrics/tracing. Enable safe, zero-downtime model updates via canary and version pinning.

Acceptance Criteria
Batch Classification Output Schema and Top‑K Results
Given a request to classify a batch of items (batch_size ≤ 128) via REST /v1/classify or gRPC Classify with images and listing metadata When the request is processed Then each item response includes: category_id (string in unified taxonomy), full_path (string "Department>Category>Subcategory"), confidence (0.0–1.0), and top_k results each with category_id and confidence sorted by descending confidence And the number of top_k results equals the requested k (default 5, max 20) And responses preserve input order and include the client-supplied item_id And all returned category_id values exist in the taxonomy snapshot for the returned model_version
Multimodal Accuracy on Unified Taxonomy Benchmark
Given the PixelLift validation set v1.0 (stratified by category; n ≥ 50,000) When evaluating the production model with default settings Then top-1 accuracy ≥ 92% and top-3 accuracy ≥ 98% overall And weighted F1 ≥ 0.93 overall And per-major-department top-1 accuracy ≥ 88% And expected calibration error (ECE, 15 bins) ≤ 0.05
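The ECE target can be made concrete: partition predictions into 15 equal-width confidence bins and take the sample-weighted mean of |accuracy − mean confidence| per bin. A small sketch of the standard computation:

```python
def expected_calibration_error(confidences, correct, n_bins=15):
    """ECE over equal-width bins: sum over bins of
    (bin size / n) * |bin accuracy - bin mean confidence|."""
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, correct):
        # clamp c == 1.0 into the top bin
        bins[min(int(c * n_bins), n_bins - 1)].append((c, ok))
    n, ece = len(confidences), 0.0
    for items in bins:
        if not items:
            continue
        conf = sum(c for c, _ in items) / len(items)
        acc = sum(ok for _, ok in items) / len(items)
        ece += len(items) / n * abs(acc - conf)
    return ece
```

A model claiming 80% confidence on items it gets right only half the time contributes heavily to ECE even if its top-1 accuracy is acceptable, which is why the metric is tracked separately from accuracy.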
Out‑of‑Distribution Detection and Unknown Handling
Given an OOD benchmark consisting of non-taxonomy product types and non-product images, and an operational OOD threshold τ When evaluating the service Then AUROC for OOD vs in-distribution ≥ 0.95 and FPR at 95% TPR ≤ 10% And OOD items are returned with ood=true, category_id="unknown", and confidence ≤ 0.20 And in-distribution items are returned with ood=false and in-distribution recall ≥ 95% at τ And τ is configurable per model_version and applied without restart within 60 seconds of change
Robustness to Background Removal and Partial Occlusions
Given a paired dataset of original images and variants with AI background removal artifacts (alpha edges), JPEG compression at Q=60, and 20% area rectangular occlusions When running classification with default settings Then top-1 accuracy on the perturbed set decreases by ≤ 2 percentage points vs the original set And for pairs where original confidence ≥ 0.60, predicted top-1 category matches the original in ≥ 95% of cases And OOD flag rate on transparent PNGs differs by ≤ 2 percentage points from originals
Latency, Throughput, and Horizontal Scalability
Given steady load generated with batch size = 32 and typical payloads When the service runs on the target production instance type per pod Then p95 per-item latency ≤ 300 ms and p99 ≤ 450 ms over a 30-minute window And throughput ≥ 200 items/second per pod at ≤ 85% GPU/CPU utilization And autoscaling increases replicas from min=2 to max=20 within 60 seconds of sustained CPU > 70% or queue backlog > 1000 items, maintaining p95 ≤ 300 ms And request error rate (HTTP 429/503/gRPC UNAVAILABLE) ≤ 0.10% during 10× baseline load for 30 minutes
Service Interfaces, Validation, and Observability
Given clients integrate via REST and gRPC When invoking REST POST /v1/classify or gRPC Classify with a batch payload Then the service validates inputs and returns 400/INVALID_ARGUMENT for schema violations with machine-readable error codes And supports parameters: top_k (1–20), model_version, ood_threshold, and returns model_version in responses And exposes /healthz (liveness), /readyz (readiness), and /metrics (Prometheus) including latency histograms, error rates, model_version labels, and top-k distribution And propagates W3C Trace Context (traceparent) and emits spans per request and per item with attributes: model_version, batch_size, device, latency_ms
Zero‑Downtime Model Updates with Canary and Version Pinning
Given a new model version is deployed behind the service When initiating a canary rollout Then traffic starts at 10% for ≥ 15 minutes and auto-promotes only if canary p95 latency within +5% of baseline, error rate ≤ 0.5%, and shadow top-1 agreement with baseline ≥ 99% And automatic rollback triggers within 2 minutes if any threshold is breached And clients can pin model_version; pinned requests are not routed to canary and remain available during rollout And no downtime occurs (availability ≥ 99.99% and 5xx spike ≤ 0.5% over rollout window)
Adaptive Taxonomy Mapping & Versioning
"As an operations manager, I want Category IQ to stay aligned with marketplace category changes so that validations remain accurate without rework."
Description

Maintain a normalized, versioned category graph aligned to major marketplaces (Shopify, Etsy, Amazon) and custom merchant taxonomies. Provide tools to import external taxonomies, map them to the internal schema, manage synonyms/aliases, and deprecate or merge categories with effective dates. Expose read APIs to resolve current and historical mappings and ensure backward compatibility for existing jobs and presets.

Acceptance Criteria
Shopify/Etsy/Amazon Taxonomy Import and Mapping to Internal Graph
- Given a valid Shopify/Etsy/Amazon taxonomy export (≤50k nodes), When an admin imports it via tooling, Then a new immutable internal taxonomy version is created with a unique version_id and full audit log (actor, checksum, timestamp). - Given repeated import of the same file, When executed, Then the operation is idempotent (0 changes detected). - Given source nodes lacking mappings, When import completes, Then 100% are either mapped to internal categories or reported with actionable errors; import fails and fully rolls back if unmapped nodes > 0. - Given a successful import, When validation runs, Then parent/child integrity is preserved (no cycles, no orphans) and P95 import time ≤ 5 minutes.
Custom Merchant Taxonomy Mapping with Synonyms and Aliases
- Given a merchant uploads a custom taxonomy with synonym/alias columns, When mapping to internal categories, Then each alias resolves to a single canonical internal_category_id and is case- and locale-insensitive. - Given an alias collides across two internal categories, When saving, Then the operation is rejected with a conflict error listing collisions. - Given synonyms are added or removed, When a new taxonomy version is published, Then the change is versioned and historical resolutions continue to honor the prior version. - Given an alias is resolved via the read API, When requested, Then the response returns the canonical internal_category_id and indicates it was resolved via alias.
Scheduled Category Deprecation with Effective Dates and Redirects
- Given a category is scheduled for deprecation with effective_at and optional replaced_by, When current time < effective_at, Then the category remains active and writable. - Given current time ≥ effective_at, When creating new mappings or jobs targeting the deprecated category, Then the operation is blocked and a redirect to replaced_by is suggested if present. - Given existing jobs/presets referencing the deprecated category created before effective_at, When resolving without an explicit timestamp, Then they resolve via their pinned version; When resolving with at ≥ effective_at, Then they return the replacement category (or status=deprecated if none). - Given a deprecation is published, When auditing, Then an immutable audit log entry exists with actor, reason, and effective_at.
Category Merge with Historical Resolution Preservation
- Given categories A and B are merged into category C with effective_at, When resolving with at < effective_at, Then A and B resolve to themselves; When resolving with at ≥ effective_at or without at, Then A and B resolve to C. - Given the merge is committed, When validating graph integrity, Then no cycles or orphaned nodes exist and all synonyms of A and B transfer as aliases of C. - Given presets or jobs referencing A or B, When run after the merge, Then they execute via redirect to C with no configuration changes required and the redirect is logged.
Read API: Resolve Current and Historical Mappings
- Given marketplace_id, marketplace_category_id, and optional at timestamp, When calling the category resolution API, Then it returns 200 with internal_category_id, version_id, and status for known mappings. - Given a deprecated category without at, When calling the API, Then it returns 410 with replaced_by metadata if available. - Given an unknown mapping, When calling the API, Then it returns 404 with error_code=MAPPING_NOT_FOUND. - Given normal system load, When calling the API, Then P95 latency ≤ 200 ms and availability ≥ 99.9% monthly.
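The historical-resolution semantics above (deprecations and merges taking effect only at their effective date) reduce to following a redirect chain filtered by timestamp. A minimal sketch, assuming redirects map a category id to its replacement and effective date (the data shape is illustrative):

```python
def resolve_category(category_id, redirects, at):
    """Follow deprecation/merge redirects effective as of `at`.
    `redirects` maps id -> (replacement_id, effective_at).
    Cycle-guarded so malformed graphs cannot loop forever."""
    seen = set()
    while category_id in redirects and category_id not in seen:
        seen.add(category_id)
        replacement, effective_at = redirects[category_id]
        if at < effective_at:
            break  # change not yet effective at this timestamp
        category_id = replacement
    return category_id
```

Resolving without an explicit `at` would pass the current time, matching the "without at" behavior in the merge criteria; resolving with a historical timestamp reproduces what a pinned job saw.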
Backward Compatibility for Existing Jobs and Presets
- Given a job was created under taxonomy version V, When a new taxonomy version V+1 is published, Then the job completes using V without change and outputs identical internal category IDs to prior runs. - Given a preset references an internal_category_id that becomes deprecated or merged, When the preset is used after the change, Then it auto-resolves to the replacement and logs the redirect in the run metadata. - Given system-wide taxonomy updates, When examining job failure rates, Then the post-update 24-hour failure rate does not increase by more than 0.5 percentage points relative to the 7-day prior baseline. - Given a merchant opts to migrate presets to the latest taxonomy, When using the migration tool, Then a preview shows affected items and the change applies atomically with rollback support.
Category-Specific Validation Rules Engine
"As a seller, I want the correct checks to run for each product type so that issues are caught and fixed according to category nuances."
Description

Implement a declarative rules engine that triggers category-specific validations after classification (e.g., apparel: mannequin/pose compliance and wrinkle detection; jewelry: specular highlight/reflection and macro focus; footwear: pair presence and sole visibility; cosmetics: label legibility and shade swatch; home decor: scale reference). Rules are configurable per merchant and brand, support thresholds and dependencies, and emit pass/fail with structured reason codes and suggested fixes. Integrate with PixelLift’s retouch pipeline to auto-apply corrective actions where possible and re-validate post-fix.
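The core evaluation loop for such a declarative engine is small: merge merchant defaults with brand overrides per rule, compare the measured metric against its threshold, and emit a structured result with a reason code on failure. A minimal sketch under those assumptions (rule and result shapes are illustrative):

```python
def evaluate(rules, metrics, overrides=None):
    """Evaluate declarative threshold rules against measured metrics.
    `overrides` (keyed by rule code) replace merchant defaults per
    rule; a rule with enabled=False is skipped entirely."""
    overrides = overrides or {}
    results = []
    for rule in rules:
        cfg = {**rule, **overrides.get(rule["code"], {})}
        if not cfg.get("enabled", True):
            continue  # disabled by brand/merchant config
        value = metrics[cfg["metric"]]
        ok = value >= cfg["min"] if "min" in cfg else value <= cfg["max"]
        results.append({
            "ruleCode": cfg["code"],
            "status": "Pass" if ok else "Fail",
            "value": value,
            "reasonCode": None if ok else cfg["reasonCode"],
        })
    return results
```

Dependencies (e.g., skipping pose_compliance when no human is present) and auto-fix dispatch would layer on top of this loop, but the pass/fail-with-reason-code contract stays the same.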

Acceptance Criteria
Post-Classification Apparel Rule Execution
Given an image is classified as category=apparel When the rules engine runs category-specific validations Then it executes only apparel rules: pose_compliance, mannequin_presence_policy, wrinkle_intensity And it applies thresholds from brand override if present, otherwise merchant default And each rule returns status in {Pass, Fail} with a reasonCode and suggestion on Fail And no non-apparel category rule is executed for this image
Jewelry Reflection and Macro Focus Validation
Given an image is classified as category=jewelry When the rules engine validates category rules Then specular_highlight_percentage <= configured threshold results in Pass, otherwise Fail with reasonCode=JEWELRY_GLARE_EXCESS and a suggestion And macro_focus_score >= configured threshold results in Pass, otherwise Fail with reasonCode=JEWELRY_MACRO_FOCUS_LOW and a suggestion And the response includes per-rule metric values and thresholds used
Footwear Pair Presence and Sole Visibility
Given an image is classified as category=footwear When category-specific validations run Then pair_presence is detected for two shoes in frame; otherwise Fail with reasonCode=FOOTWEAR_PAIR_MISSING and a suggestion And sole_visibility_percent for at least one shoe >= configured threshold; otherwise Fail with reasonCode=FOOTWEAR_SOLE_NOT_VISIBLE and a suggestion And only footwear rules are evaluated for this image
Cosmetics Label Legibility and Shade Swatch
Given an image is classified as category=cosmetics and listing metadata contains a shade attribute When category validations run Then for primary images, label_ocr_confidence >= configured threshold; otherwise Fail with reasonCode=COS_LABEL_ILLEGIBLE and a suggestion And a shade_swatch is detected with swatch_area_percent >= configured threshold; otherwise Fail with reasonCode=COS_SWATCH_MISSING and a suggestion And thresholds reflect brand overrides when configured
Auto-Correction Integration and Re-Validation
Given a rule evaluation fails and has a mapped auto-corrective action with autoFixEnabled=true When the engine triggers the retouch pipeline for that action Then the engine re-validates the previously failed rule(s) exactly once on the corrected image And if the rule passes after correction, the result records autoFixApplied=true, fixAction code, and before/after metric values And if the rule still fails, the result records autoFixApplied=false and updated metrics with the same reasonCode and a suggestion And the job completes with consolidated results for the original and re-validated evaluations
Merchant/Brand Config Overrides and Rule Dependencies
Given a merchant default config and a brand-level override that raises the wrinkle_intensity threshold and disables mannequin_presence_policy When an apparel image tagged with that brand is evaluated Then wrinkle_intensity uses the brand override threshold And mannequin_presence_policy is not executed and does not appear in evaluated rules And for images where human_presence=false, pose_compliance is not executed and does not appear in evaluated rules And images without the brand tag use merchant default thresholds
Structured Result Envelope with Reason Codes and Fix Suggestions
Given any category validation run completes When the engine emits results Then the top-level payload includes assetId, merchantId, brandId, category, configVersion, engineVersion, evaluatedAt, and overallStatus derived from per-rule statuses And each evaluated rule entry includes ruleCode, category, status in {Pass, Fail}, metric name(s) with numeric values, threshold(s), severity, reasonCode (on Fail), suggestion (on Fail), autoFixAttempted, autoFixApplied, and fixAction (if applied) And the payload validates against the published JSON schema without errors
Confidence Thresholds & Human-in-the-Loop Review
"As a catalog manager, I want low-confidence items routed to a quick review flow so that I can resolve edge cases fast without slowing the batch."
Description

Provide configurable per-merchant thresholds for auto-assign vs. review. For low-confidence or conflicting metadata cases, surface top-3 category suggestions with rationales and allow one-click selection or keyboard-driven bulk actions. Support review queues, SLAs, and notifications. After resolution, the system re-runs validations and records the decision for learning and audit.

Acceptance Criteria
Per‑Merchant Confidence Threshold Configuration
Given I am a merchant admin in PixelLift settings When I configure AutoAssignThreshold and ReviewThreshold values between 0.00 and 1.00 with two‑decimal precision Then the form validates the range, prevents invalid input, and shows default values of 0.85 (auto‑assign) and 0.50 (review) And saving the settings versions the change with timestamp, actor, previous value, and merchant ID And the new thresholds take effect for new ingestions within 1 minute and apply only to my merchant’s items
Auto‑Assign on High Confidence Without Metadata Conflict
Given an item is ingested and Category IQ returns a top‑1 category with probability p And the merchant’s AutoAssignThreshold = Ta And no metadata conflict is detected When p ≥ Ta Then the system auto‑assigns the top‑1 category within 30 seconds And immediately runs the category‑specific validations And logs an audit entry with p, Ta, selected category, and rationale And the item bypasses the review queue
Top‑3 Suggestions for Low Confidence or Conflicting Metadata
Given Category IQ returns probabilities for categories and the merchant thresholds Ta (auto) and Tr (review) When p < Ta or a metadata conflict is detected (metadata‑derived category ≠ top‑1 image category) Then the UI displays the top‑3 category suggestions sorted by probability with confidence scores and one‑sentence rationales And the reviewer can apply any suggestion with a single click And applying a suggestion updates the category in under 1 second and removes the item from the queue And the decision is logged with all top‑3 suggestions and confidences
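The routing decision described in the two criteria above can be sketched in a few lines: auto-assign only when top-1 confidence clears Ta and metadata agrees, otherwise surface the top-3 for review (function and field names are illustrative, not PixelLift's API):

```python
def route_item(probs, ta, metadata_conflict=False):
    """probs: {category_id: probability}. Auto-assign the top-1 when
    its probability clears the merchant's auto-assign threshold Ta
    and no metadata conflict exists; otherwise return the top-3
    suggestions, sorted by descending confidence, for human review."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    top1_id, p = ranked[0]
    if p >= ta and not metadata_conflict:
        return {"action": "auto_assign", "category": top1_id}
    return {"action": "review", "suggestions": ranked[:3]}
```

Note that a metadata conflict forces review regardless of confidence, matching the criterion that a metadata-derived category disagreeing with the top-1 image category always routes to a human.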
Keyboard and Bulk Actions in Review Queue
Given a reviewer is focused on the review queue When they navigate with Arrow/J/K keys and open the suggestion picker with Enter Then pressing 1/2/3 selects the corresponding suggestion and applies it in under 1 second And Shift+Arrow or Space allows multi‑select of items And choosing a bulk action to apply suggestion index (1/2/3) applies to up to 200 selected items within 5 seconds And a results summary shows success and per‑item failures without losing selection context
SLA Tracking and Reviewer Notifications
Given the merchant configures a review SLA duration and notification channels (email/Slack/webhook) When an item enters the review queue Then an SLA deadline is computed in the merchant’s time zone and displayed with a countdown And a reminder is sent at 15 minutes before deadline to assigned reviewers And an escalation notification is sent upon breach to the configured channel(s) And all notifications are logged with timestamp, recipients, and delivery status
Post‑Resolution Re‑Validation
Given a reviewer selects a category (via single or bulk action) When the decision is saved Then category‑specific validations automatically re‑run and results are available within 10 seconds And if all validations pass, the item status updates to Validated and downstream workflows resume And if any validation fails, failures with reasons are shown and the item remains in Needs Fix until re‑run passes
Decision Recording for Learning and Audit Export
Given a classification decision occurs (auto‑assign or human selection) When the decision is finalized Then an immutable audit record stores merchant ID, item ID, actor, timestamps, thresholds in effect, conflict flags, top‑3 suggestions with confidences, chosen category, rationale, and validation outcomes And human corrections are queued for the learning pipeline within 5 minutes And an authorized merchant admin can export audit records via API for a date range (≤100k rows) as CSV or JSON within 2 minutes
Batch Processing & Throughput Guarantees
"As a user uploading a large catalog, I want Category IQ to process quickly at scale so that my editing workflow isn’t blocked."
Description

Enable high-throughput processing for batches of 100–10,000 items with parallel inference, autoscaling across GPU/CPU nodes, and backpressure-aware job orchestration. Targets: sustain 1,500 items/min per node and complete 1,000-item batches with end-to-end p95 under 5 minutes. Provide resumable jobs, idempotent tasking, retries with exponential backoff, and real-time progress via WebSockets and webhooks.

Acceptance Criteria
Sustain 1,500 Items/Minute Per Node
Given a single production node under normal operating conditions and representative item mix When a steady-state load is applied for at least 10 consecutive minutes Then the node sustains >= 1,500 processed items per minute measured over rolling 1-minute windows And the item-level error rate is <= 0.5% And the input queue depth does not increase over the interval (net drain >= 0)
1,000-Item Batch p95 End-to-End Under 5 Minutes
Given a 1,000-item batch submitted via API with autoscaling enabled When processing begins under normal production conditions Then the end-to-end latency (submission to final completion webhook delivery) for >= 95% of items is <= 5 minutes And the batch completes with 100% of items either succeeded or placed in a dead-letter list with error details And WebSocket progress indicates >= 95% completion by 5 minutes
Autoscaling Across GPU/CPU Nodes to Clear Backlog
Given the global queue backlog exceeds 2x the sustainable per-node rate for 60 seconds When autoscaling evaluates capacity Then additional GPU/CPU worker nodes are provisioned within 60 seconds to achieve a processing rate greater than the arrival rate And backlog begins to decrease within 2 minutes of scale-out And no in-flight tasks are dropped during scale-out or scale-in And scale-in occurs only after backlog remains < 0.5x sustainable rate for 10 minutes
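The scale-out/scale-in policy above can be sketched as a pure decision function. This is an illustrative sketch, not the product's implementation; the function name, argument shapes, and the "add one node or enough to outpace arrivals" heuristic are assumptions layered on the stated thresholds (2x backlog for 60 s to scale out, < 0.5x for 10 minutes to scale in):

```python
def autoscale_decision(backlog, arrival_rate, per_node_rate, nodes,
                       over_since, under_since, now):
    """Return the desired worker-node count.

    backlog       -- items waiting in the global queue
    per_node_rate -- sustainable items/min per node (e.g. 1,500)
    over_since    -- time backlog first exceeded 2x per-node rate, or None
    under_since   -- time backlog first fell below 0.5x per-node rate, or None
    now           -- current time, same units as the *_since timestamps
    """
    if (backlog > 2 * per_node_rate and over_since is not None
            and now - over_since >= 60):
        # Provision enough nodes for processing rate to exceed arrival rate.
        needed = int(arrival_rate // per_node_rate) + 1
        return max(nodes + 1, needed)
    if (backlog < 0.5 * per_node_rate and under_since is not None
            and now - under_since >= 600):
        # Scale in only after 10 minutes below the low-water mark.
        return max(1, nodes - 1)
    return nodes
```

In practice this check would run on each autoscaler evaluation tick, with in-flight tasks drained before any node is removed so nothing is dropped during scale-in.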
Backpressure-Aware Orchestration and Ingestion Throttling
Given worker queues exceed the target wait threshold of 30 seconds When additional batch submissions arrive Then the ingestion API responds with HTTP 429 and a Retry-After header reflecting current drain capacity And the orchestrator rate-limits dispatch to saturated nodes so queue wait time p95 stays <= 60 seconds And no tasks fail with resource exhaustion due to overload And no message loss occurs in the job queue
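One way to derive the 429 `Retry-After` value from current drain capacity, as the criterion requires, is to estimate how long a new submission would wait and advise the client accordingly. A minimal sketch under assumed inputs (queue depth and a measured drain rate); the helper name and rounding policy are illustrative:

```python
def retry_after_seconds(queue_depth, drain_rate_per_sec, target_wait=30):
    """Estimate a Retry-After value (seconds) from drain capacity.

    Returns 0 (accept) when the expected wait is within the 30s target,
    otherwise the seconds until enough backlog has drained to meet it.
    """
    if drain_rate_per_sec <= 0:
        return target_wait  # no capacity signal; fall back to the target
    expected_wait = queue_depth / drain_rate_per_sec
    if expected_wait <= target_wait:
        return 0  # within the wait threshold; no throttling needed
    # Seconds until the excess backlog (beyond target_wait's worth) drains.
    excess = queue_depth - target_wait * drain_rate_per_sec
    return int(excess / drain_rate_per_sec) + 1
```

The ingestion API would return HTTP 429 with this value in the `Retry-After` header whenever the function returns a positive number.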
Resumable Batches After Network/Worker Interruptions
Given an active batch with some items completed and others pending When the client disconnects or a worker restarts and later recovers Then the batch resumes without reprocessing completed items And the client can query an accurate list of completed, in-progress, and pending item IDs And the number of duplicate outputs produced per batch is 0
Idempotency and Retry with Exponential Backoff
Given a batch submission includes an Idempotency-Key header When the same submission is repeated within 24 hours Then the API returns the original batch ID and status without creating duplicate work or side effects And transient item failures are retried up to 5 attempts with exponential backoff starting at 2 seconds, factor 2, with full jitter capped at 60 seconds And after max attempts the item moves to a dead-letter queue with machine-readable error code and trace ID
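The retry schedule above (base 2 s, factor 2, full jitter, 60 s cap, 5 attempts, then dead-letter) can be written down directly. A sketch assuming a 0-based attempt counter; only the parameters come from the criteria:

```python
import random

def backoff_delay(attempt, base=2.0, factor=2.0, cap=60.0, rng=random.random):
    """Full-jitter exponential backoff: the sleep is drawn uniformly
    from [0, min(cap, base * factor**attempt)]."""
    capped = min(cap, base * (factor ** attempt))
    return rng() * capped

def should_dead_letter(attempt, max_attempts=5):
    """After max_attempts transient failures the item moves to the DLQ
    with a machine-readable error code and trace ID."""
    return attempt >= max_attempts
```

Injecting `rng` keeps the schedule testable; full jitter (rather than equal or decorrelated jitter) spreads retries across the whole window, which helps most when many items fail at once.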
Real-Time Progress via WebSockets and Webhooks
Given a client subscribes to WebSocket updates and registers a webhook endpoint When a batch is running Then progress updates are emitted at least every 2 seconds or on >= 1% progress change, whichever is sooner And webhook events (started, progress, completed, failed) are HMAC-SHA256 signed and delivered with at-least-once semantics with retries for 24 hours And event delivery latency p95 from state change to client receipt is <= 3 seconds And WebSocket streams can resume using the last sequence number without loss or duplication
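The HMAC-SHA256 signing required for webhook events might look like the following. The canonical-JSON choice and function names are assumptions; the criteria only mandate HMAC-SHA256 signatures:

```python
import hashlib
import hmac
import json

def sign_webhook(secret: bytes, payload: dict) -> str:
    """HMAC-SHA256 signature over a canonical JSON body (sorted keys,
    compact separators) so signer and verifier serialize identically."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify_webhook(secret: bytes, payload: dict, signature: str) -> bool:
    # compare_digest is constant-time, resisting timing attacks.
    return hmac.compare_digest(sign_webhook(secret, payload), signature)
```

The receiver recomputes the signature from the raw body and the shared secret; at-least-once delivery means handlers should also deduplicate on an event ID.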
Explainability & Audit Trail
"As a QA lead, I want to see why a category was chosen and what checks ran so that I can verify correctness and train my team."
Description

Expose interpretable signals behind each categorization, including saliency heatmaps for visual cues and highlighted metadata tokens. Display reason codes and rule results in UI and API. Persist detailed audit logs (predictions, confidences, overrides, rules invoked, outcomes) for at least 180 days with export to CSV/NDJSON and searchable trace IDs for support and compliance.

Acceptance Criteria
UI Explainability: Heatmaps, Token Highlights, Reason Codes
Given a completed Category IQ categorization in the UI for a single product image with listing metadata When the user opens the Explainability panel Then a saliency heatmap overlay toggle is visible and defaults to off And enabling the overlay displays the top 3 visual regions ranked by saliency with numeric scores (0–1) and tooltips on hover And the listing metadata shows the top highlighted tokens (minimum 3, maximum 10) contributing to the prediction with contribution scores And a Reason Codes list shows at least 1 and up to 5 codes with human‑readable labels and machine codes And a Rules Evaluated section lists each invoked rule with result (Pass/Fail), threshold(s), input(s), and rule version And all scores in the panel match the API values for the same trace_id within a tolerance of 0.001
API Explainability Payload with Reason Codes and Rules
Given a GET request to /v1/category/explanations/{trace_id} with a valid trace_id When the response returns 200 Then the payload includes: category, confidence (0–1), reason_codes[], rules_invoked[], saliency_map.url, saliency_regions[], metadata_tokens[] with offsets and contribution, and trace_id And each rules_invoked[] item contains: name, version, inputs, thresholds, result (pass|fail), and outcome_notes And reason_codes[] contains code, label, and weight And the response validates against the published OpenAPI schema and is backward compatible with the last minor version And response time p95 ≤ 500 ms without raster saliency_map; ≤ 1500 ms when including raster And if trace_id is not found, Then 404 is returned with error.code="TRACE_NOT_FOUND" and support_contact
180-Day Audit Log Persistence and Retrieval
Given predictions and any overrides are generated When the audit record is written Then it is persisted within 5 seconds and includes: timestamp (UTC ISO‑8601), user/account_id, job_id, trace_id, input_refs, predicted_category, confidence, reason_codes[], rules_invoked[], outcome, override_flag, override_details (if any), and actor And records are retained for at least 180 days and purged on day 181 And retrieval by exact trace_id returns the record with p95 latency ≤ 300 ms And a tamper‑evident hash is stored per record and can be verified via API
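The tamper-evident per-record hash could be implemented as a hash chain, where each record's hash also covers its predecessor's, so any edit invalidates every later record. This is one possible scheme, not necessarily the product's; field names are illustrative:

```python
import hashlib
import json

def record_hash(record: dict, prev_hash: str = "") -> str:
    """Chain hash: SHA-256 over the previous hash plus the record's
    canonical JSON (sorted keys, compact separators)."""
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode()).hexdigest()

def verify_chain(records: list) -> bool:
    """Each stored record carries its 'hash'; recompute and compare
    along the chain to detect tampering anywhere in the history."""
    prev = ""
    for rec in records:
        payload = {k: v for k, v in rec.items() if k != "hash"}
        if record_hash(payload, prev) != rec["hash"]:
            return False
        prev = rec["hash"]
    return True
```

The verification API in the criterion would expose `verify_chain` (or a per-record variant) to auditors without granting write access.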
Export Audit Logs to CSV and NDJSON with Filters
Given an admin selects a time window and optional filters (account, category, outcome, has_override) When they request an export Then a downloadable file is generated in CSV and NDJSON with identical content And exports include headers/keys matching the audit log schema and include trace_id in every row/object And exports up to 1,000,000 records complete within 10 minutes and are chunked if larger And exports can be requested via API POST /v1/audit/exports and polled at GET /v1/audit/exports/{export_id} And files are UTF‑8 encoded; NDJSON is newline‑delimited; ZIP compression is applied when size > 100 MB
Search and Trace ID-based Case Reconstruction
Given a user has a trace_id from a support ticket When they search in the UI or call GET /v1/audit/records?trace_id={id} Then exactly one matching record is returned (or zero if purged) with full details And trace_id values are unique within the 180‑day retention window And the UI opens a timeline view showing events: prediction, rules evaluation, override (if any), export references And partial search by prefix (first 8 chars) returns the exact match if unique, else prompts to disambiguate
Manual Override Logging and Explainability Update
Given a human reviewer overrides a category in the UI or via API When the override is saved Then the system records override_reason (free‑text up to 280 chars), actor_id, timestamp (UTC), previous_category, new_category, and linkage to original trace_id And the audit record is versioned (v1, v2, …) preserving the original prediction as v1 And explainability views and API reflect override_applied=true and display the override reason and actor And exports include both original and latest outcomes with version metadata
Rule Invocation Trace with Inputs and Outcomes
Given a categorization run uses category‑specific rules When evaluation completes Then the system logs for each rule: rule_id, name, category_scope, version, inputs (with values), thresholds, result, latency_ms, and any error And the UI Rules Evaluated section can be expanded to show inputs and thresholds for each rule And the API returns the same details under rules_invoked[] with consistent ordering And if a rule errors, Then result="error" and the overall prediction includes a reason_code indicating degraded rules evaluation
Continuous Learning from Corrections
"As a product owner, I want the system to improve from user feedback so that accuracy increases over time without manual rule maintenance."
Description

Capture user overrides and validation outcomes into a feedback store and use them to fine-tune ranking or adapters on a scheduled cadence. Run guarded A/B evaluations and monitor precision/recall by category and merchant; alert on degradations >5% and enable one-click rollback to previous model versions. Support per-merchant personalization with caps to prevent overfitting and maintain global performance.

Acceptance Criteria
Feedback Capture from User Overrides
Given a merchant changes the predicted category in Category IQ, When the override is saved, Then a feedback event is written to the feedback store within 5 seconds with fields: event_id (UUID), timestamp (UTC), merchant_id, product_id, original_category, new_category, confidence_score, user_id (hashed), session_id, source=ui, model_version, category_tree_version. Given the same override is retried with the same event_id, When the write is processed, Then the store performs idempotent upsert and retains a single record. Given a batch CSV correction import, When the import completes, Then one feedback event per item is stored with source=batch and a batch_id linking the set. Given transient storage or network errors, When an event write fails, Then the client retries with exponential backoff up to 3 attempts and emits an error metric; overall 24h write success rate >= 99.5% and P95 write latency <= 2s. Given privacy requirements, When the feedback is stored, Then no raw PII is stored (user_id hashed, no emails) and records are encrypted at rest.
Nightly Fine-Tuning with Data Thresholds
Given the nightly scheduler at 02:00 UTC, When the training job triggers, Then it assembles the last 30 days of labeled feedback and validation outcomes partitioned by category and merchant. Given a category-partition has insufficient data, When counts are calculated, Then training for that partition is skipped if total labeled examples < 500 or positives per class < 50, and an 'insufficient_data' event is logged. Given sufficient data for a partition, When training runs, Then it completes within 60 minutes and produces versioned artifacts (semver vX.Y.Z+build), with registry entry including checksum and data snapshot_id for reproducibility. Given training completes, When validation is evaluated, Then precision and recall are computed per category and per merchant and stored in the experiment tracker; if any partition deviates >5% from the last successful validation, the candidate is flagged for guarded A/B only (no auto-promotion).
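The data-sufficiency gate for a training partition (skip if total labeled examples < 500 or any class has < 50 positives) reduces to a small check. A sketch with an assumed per-class-count input shape:

```python
def should_train_partition(examples_per_class: dict,
                           min_total=500, min_per_class=50):
    """Return (train?, skip_reason) for a category/merchant partition,
    applying the 500-total / 50-per-class thresholds from the criteria."""
    total = sum(examples_per_class.values())
    if total < min_total:
        return False, "insufficient_data"
    if any(n < min_per_class for n in examples_per_class.values()):
        return False, "insufficient_data"
    return True, None
```

When the gate returns `False`, the scheduler logs the `insufficient_data` event and moves on to the next partition rather than failing the nightly job.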
Guarded A/B Evaluation and Promotion Gate
Given a candidate model version exists, When an A/B experiment is started, Then 10% of traffic is routed to the candidate and 90% to the control with stratified sampling across merchants and top 50 categories. Given the experiment is running, When stop criteria are evaluated, Then it runs until each of the top 20 categories has ≥ 5,000 predictions or 7 days elapse, whichever comes first. Given sufficient sample sizes, When metrics are computed, Then precision, recall, and false-positive rate are calculated per category and per merchant cohort (cohorts with ≥ 500 predictions) with 95% confidence intervals. Given metrics are available, When eligibility for promotion is checked, Then the candidate is promotable only if: (a) overall weighted precision and recall are ≥ control, (b) no category shows > 5% relative degradation in precision or recall with 95% confidence, and (c) no merchant cohort shows > 5% relative degradation. Given eligibility fails, When the gate evaluates, Then the experiment auto-stops, the candidate is rejected, and a notification is sent to the ML channel with links to the report. Given eligibility passes, When promotion occurs, Then the serving registry updates atomically to the new model version and the experiment is archived with final metrics.
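The promotion gate's core comparison can be sketched as follows. For brevity this omits the confidence intervals and minimum-sample checks the criteria require; the dict shape (partition name to a precision/recall pair, with `"_overall"` as the weighted aggregate) is an illustrative assumption:

```python
def promotable(control: dict, candidate: dict, max_rel_degradation=0.05):
    """Candidate must match or beat control overall, with no partition
    (category or merchant cohort) degrading > 5% relative in precision
    or recall. Dicts map partition -> (precision, recall)."""
    cp, cr = control["_overall"]
    xp, xr = candidate["_overall"]
    if xp < cp or xr < cr:
        return False
    for part, (p0, r0) in control.items():
        if part == "_overall" or part not in candidate:
            continue
        p1, r1 = candidate[part]
        if p0 > 0 and (p0 - p1) / p0 > max_rel_degradation:
            return False
        if r0 > 0 and (r0 - r1) / r0 > max_rel_degradation:
            return False
    return True
```

A real gate would only evaluate partitions that met the sample-size floor and would treat "degradation with 95% confidence" as the lower CI bound crossing the threshold, not the point estimate.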
Real-time Degradation Monitoring and Alerting
Given any serving model version, When rolling 24h precision and recall per category or merchant drop by > 5% relative to baseline with ≥ 1,000 predictions, Then an alert is sent to Slack #ml-alerts and PagerDuty within 2 minutes. Given an alert is sent, When the payload is constructed, Then it includes model_version, baseline_version, impacted categories/merchants, metric deltas, volume, start time, and dashboard and rollback links. Given the degradation condition clears, When metrics recover within thresholds for 24h, Then the alert auto-resolves and the incident is closed with resolution notes captured. Given persistent degradation > 48h, When auto-remediation is evaluated, Then a recommendation to rollback is posted to the incident with a one-click action link.
One-Click Rollback to Previous Model
Given a current serving model and a previous passing model exist, When a user triggers rollback via UI or API, Then 100% of traffic is switched back to the previous model within 5 minutes with zero failed prediction requests attributable to the switch. Given rollback executes, When system state is recorded, Then the serving registry logs actor, timestamp, from_version, to_version, reason, and request_id, and all new predictions include the rolled_back model_version. Given rollback is initiated, When safety checks run, Then a 1% canary is executed for up to 2 minutes; if canary fails health checks, rollback aborts and an alert is raised; otherwise rollout proceeds to 100%. Given rollback completes, When monitoring runs, Then precision/recall and error rates return to baseline levels within 30 minutes or an incident is opened automatically.
Per-Merchant Personalization with Global Caps
Given merchant-specific feedback exists, When eligibility is evaluated, Then a per-merchant adapter is enabled only if the merchant has ≥ 100 labeled corrections and outcomes in the last 60 days across ≥ 3 categories and each included category has ≥ 30 examples. Given a merchant adapter is trained, When its impact is assessed, Then on a 10% merchant holdout it shows ≥ 3% relative improvement in both precision and recall, and simultaneously the global (all-merchants) holdout shows < 2% relative degradation in either metric. Given personalization weights are applied, When serving constraints are enforced, Then adapter weight norms are clamped to configured caps, and at least 10% of the merchant's traffic is routed to the global model for exploration and drift detection. Given a personalized merchant shows drift, When 7-day rolling metrics degrade by > 3% with ≥ 1,000 predictions, Then the merchant is automatically reverted to the global model and a notification is sent to the account owner and ML channel.

FixFlow

Configurable auto‑fix pipeline that resolves common failures (background, margins, DPI, shadow) with safe thresholds and rollback. Choose when to auto‑apply vs. request review, preview diffs in one click, and ship compliant images faster without compromising brand fidelity.

Requirements

Rule-Based Auto-Fix Engine
"As a boutique owner, I want my product photos auto-corrected for background, margins, DPI, and shadow in one pass so that I can publish a consistent catalog faster without hand-editing."
Description

Implements a configurable pipeline that automatically detects and corrects common image issues (background cleanup, margin normalization, DPI standardization, and shadow reconstruction) using ordered, modular rules. Each rule is toggleable, parameterized, and can be scoped per workspace, brand, style-preset, or marketplace profile. The engine supports batch execution for hundreds of photos, honors preset styles from PixelLift, and records per-step outcomes for observability. It ensures consistent, studio-quality outputs while reducing manual editing time by up to 80% and aligning with brand guidelines across large catalog uploads.

Acceptance Criteria
Scoped, Toggleable Rule Configuration
1) Given a workspace W with defaults and overrides at brand B, style-preset S, and marketplace profile M, When an image tagged (B, S, M) is processed, Then rule parameters resolve using scope precedence S > M > B > W and unspecified params inherit from the next-wider scope. 2) Given a rule R is toggled OFF at scope S, When processing images under S, Then R is not executed and is recorded as "skipped (disabled)" in the run log. 3) Given an invalid parameter value is saved for rule R at any scope, When saving, Then the system rejects it with a validation error and no changes are applied.
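The scope-precedence resolution in criterion 1 (S > M > B > W, with per-parameter inheritance from wider scopes) maps naturally onto a layered lookup. A sketch assuming a nested scope-to-rule-to-params config shape:

```python
from collections import ChainMap

def resolve_params(rule, configs):
    """Resolve a rule's parameters with precedence
    style-preset > marketplace profile > brand > workspace;
    any parameter unspecified at a narrow scope inherits from
    the next wider one. `configs`: scope -> {rule: {param: value}}."""
    layers = [configs.get(scope, {}).get(rule, {})
              for scope in ("preset", "marketplace", "brand", "workspace")]
    # ChainMap searches layers in order, so the narrowest scope wins.
    return dict(ChainMap(*layers))
```

Because the merge is per-parameter, a preset can override only `target_pct` while still inheriting `mode` from the workspace defaults, exactly the inheritance behavior the criterion describes.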
Ordered, Modular Batch Execution
1) Given a pipeline order [background_cleanup, margin_normalization, dpi_standardization, shadow_reconstruction], When processing a batch of 500 images, Then each image's run log lists the rules executed in exactly that order with per-rule status applied/skipped/failed. 2) Given a batch job is started, When 500 images are submitted, Then the system processes all images without exceeding the configured concurrency limit and exposes real-time progress (total, processed, succeeded, failed, pending). 3) Given a batch contains corrupt images, When the pipeline runs, Then corrupt images are marked failed with error codes while the rest complete successfully.
Auto-Apply vs Review Gate with Safe Thresholds and Rollback
1) Given rule-level confidence scores in [0,1] and a per-scope threshold T, When R's confidence < T, Then the system routes the image to "Request Review", does not commit R's changes, and notifies the review queue. 2) Given safe delta limits for metrics (e.g., margin variance <= 2%, background residual <= 0.5%, DPI = target), When any post-rule metric violates its limit, Then the engine rolls the image back to the pre-rule state, records a rollback event, and marks the step "failed (rolled back)". 3) Given a pipeline configured for Auto-Apply, When all rules meet thresholds and deltas, Then the image is committed automatically with status "auto-applied" and no review requested.
Preset Harmony and Marketplace Compliance
1) Given a PixelLift style-preset S sets target margins and background, When the pipeline runs, Then margin_normalization and background_cleanup use S's targets and do not override S-defined aesthetics. 2) Given a marketplace profile M defines background = #FFFFFF and min DPI = 300, When processing images for M, Then outputs meet those constraints exactly or are flagged for review with reasons listed. 3) Given conflicting settings between S and M, When processing for M, Then M's compliance rules take precedence for compliance-critical fields while S governs non-compliance-critical styling, and the precedence is logged.
Per-Step Observability and Diff Preview
1) Given a processed image, When inspecting the run log, Then for each rule the system shows: start/end timestamps, parameters used, metrics before/after, outcome, and any artifacts (e.g., masks) with IDs. 2) Given an image with changes, When "Preview Diff" is clicked, Then the system displays side-by-side original vs final and per-step diffs within 1 second P95 for images <= 25 MB. 3) Given export is requested, When downloading the audit bundle, Then a JSON report and per-step artifacts are included with consistent IDs matching the run log.
Idempotency and Deterministic Ordering
1) Given identical inputs and configuration (including rule order and parameters), When the pipeline is re-run on the same image, Then the final output and run log are byte-for-byte identical. 2) Given the rule order is changed, When the pipeline is re-run, Then the run log reflects the new order and any differences in output are recorded with a change summary. 3) Given non-deterministic operations (if any) exist, When the pipeline runs, Then a fixed seed is used per job so repeated runs reproduce identical results.
Partial Failure Handling and Retry
1) Given a batch with N images where k fail at any rule, When the batch completes, Then a retry action is available that targets only the k failed images with the same or an updated config. 2) Given a rule times out on an image, When the pipeline continues, Then the image is marked "failed (timeout)" with duration recorded and the batch proceeds without halting. 3) Given a batch job is cancelled by a user, When cancellation occurs, Then in-flight image processing completes the current rule and stops with status "cancelled" and no further rules are executed.
Confidence Thresholds & Safeguards
"As a brand manager, I want configurable confidence thresholds so that automated fixes only apply when quality is assured and risky edits are flagged for review."
Description

Adds per-rule confidence scoring and safe thresholds to prevent overcorrection and protect brand fidelity. Each auto-fix computes quality metrics (e.g., mask confidence, edge integrity, fill ratio, color variance) and compares them to configurable thresholds. If confidence is below threshold or deviation exceeds tolerance, the system halts that fix, tags the image for review, and preserves the prior version. Thresholds can be set globally, per brand, or per marketplace compliance profile to balance automation with control.

Acceptance Criteria
Threshold Precedence: Marketplace > Brand > Global
Given a product image associated with Brand A and Marketplace Profile M and thresholds set as: global.mask_confidence=0.85, brandA.mask_confidence=0.90, profileM.mask_confidence=0.92 When the background removal rule evaluates a fix with mask_confidence=0.91 Then the system uses the Marketplace Profile threshold (0.92), does not auto-apply the fix, tags the image status="Needs Review", preserves the prior version, and records decision_source="Marketplace Profile" And when mask_confidence=0.93 Then the system auto-applies the fix and records decision_source="Marketplace Profile" And if a threshold is missing at the profile level Then the brand level is used; if missing at the brand level, the global level is used
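The Marketplace > Brand > Global fallback and the resulting auto-apply/review decision can be expressed directly. A sketch mirroring the scenario above; function names and the keyword-argument shape are illustrative:

```python
def effective_threshold(metric, profile=None, brand=None, global_=None):
    """Return (threshold, decision_source) using Marketplace Profile >
    Brand > Global precedence; each scope is a metric->threshold dict."""
    for scope_name, scope in (("Marketplace Profile", profile),
                              ("Brand", brand),
                              ("Global", global_)):
        if scope and metric in scope:
            return scope[metric], scope_name
    raise KeyError(f"no threshold configured for {metric}")

def decide(confidence, metric, **scopes):
    """Auto-apply only when confidence meets the effective threshold."""
    threshold, source = effective_threshold(metric, **scopes)
    decision = "Auto-Apply" if confidence >= threshold else "Request Review"
    return decision, source
```

With the scenario's values (global 0.85, brand 0.90, profile 0.92), a confidence of 0.91 routes to review against the profile threshold, while 0.93 auto-applies, matching the criterion.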
Halt and Review on Low Confidence or Excess Deviation
Given thresholds for background removal are mask_confidence>=0.90 and edge_integrity>=0.95 and color_variance<=DeltaE 2.0 When an auto-fix proposes changes with mask_confidence=0.88 Then the fix is not applied, the image is tagged status="Needs Review", the prior version is preserved without modification, and dependent fixes for this rule are skipped And a decision record is created including metric values, thresholds, and reason="Below Threshold" Given edge_integrity threshold is 0.95 When measured edge_integrity after the proposed fix is 0.92 Then the fix is not applied, the image is tagged status="Needs Review", the prior version is preserved, and reason="Deviation Exceeds Tolerance" is recorded
Metric Computation, Comparison, and Storage
Given an input image and the background removal rule runs When the rule completes evaluation Then it computes and stores metrics: mask_confidence (0.000–1.000), edge_integrity (0.000–1.000), fill_ratio (0–100%), color_variance (DeltaE), each recorded with at least 3-decimal precision And each metric is compared against its configured threshold to derive a decision in {Auto-Apply, Request Review, Skip} And the decision record includes: image_version_id, rule_name, metric values, thresholds used, threshold_source in {Global, Brand, Marketplace Profile}, evaluator="system", and a timestamp And repeated evaluations on the same input with the same model/seed produce metric values within ±0.005 of prior results
Batch Decisions: Auto-Apply vs Review Summary
Given a batch of 200 images with FixFlow configured to auto-apply on pass and request review on fail When the batch pipeline completes Then every image receives exactly one decision in {Auto-Apply, Request Review, Skip}; no image remains undecided And a batch summary reports counts and percentages per decision type and the number of halted fixes due to thresholds And the 95th percentile decision latency per image (from queued to decision) is <=2 seconds
One-Click Diff Preview and Reviewer Actions
Given an image flagged status="Needs Review" due to thresholds When the reviewer clicks Preview Diff Then the system renders a before/after diff with overlay of metric values and thresholds used within 1 second And when the reviewer clicks Approve Then the fix is applied, a new image_version_id is created, and the audit record is updated with actor=reviewer_id and decision="Approved" And when the reviewer clicks Reject Then the prior version remains active, no new version is created, and the audit record is updated with actor=reviewer_id and decision="Rejected" and a required note (<=500 characters)
Comprehensive Audit Trail and Export
Given any auto-fix evaluation completes When querying the audit log Then there exists a record with: image_id, version_before, version_after (nullable), rule_name, metric values, thresholds, threshold_source, decision, actor (system or reviewer_id), and timestamps And when exporting audit data for a date range filtered by brand and marketplace profile Then CSV and JSON files are generated and downloadable and include all matching records And audit records are immutable; any correction appends a new record linked by decision_parent_id
Safe Defaults and Dry-Run Mode
Given no custom thresholds are set When FixFlow runs Then safe default thresholds are used: mask_confidence>=0.92, edge_integrity>=0.96, fill_ratio within ±5% of target, color_variance<=DeltaE 2.0 And when Dry-Run Mode is ON for a rule Then decisions are computed but fixes are not applied, images are labeled status="Simulated", and a summary shows projected auto-apply and review rates with counts And when toggling a rule from Dry-Run to Enforce Then a confirmation is required and the change is logged with user_id and timestamp
Auto-Apply vs Review Policies
"As an operations lead, I want to auto-apply high-confidence fixes and queue low-confidence ones for review so that my team scales output while keeping quality high."
Description

Provides policy controls to define when fixes are auto-applied versus routed to a human review queue. Policies can be defined per rule, per preset, or per marketplace profile and can leverage confidence scores, product category, or SKU tags. Includes routing to an in-app review inbox, assignment, notifications, and bulk approve/override actions. Ensures high-confidence fixes flow through unattended while edge cases receive timely review, accelerating throughput without sacrificing quality.

Acceptance Criteria
Auto-Apply by Confidence Threshold per Rule
Given a policy P1 scoped to Marketplace Profile "Amazon US" and Rule "Background Removal" with auto-apply threshold=0.92 When an image in "Amazon US" is processed and the rule returns confidence=0.95 Then the fix is auto-applied, the image bypasses the Review Inbox, and an audit entry is recorded with image_id, rule, confidence, decision=auto-applied, policy_id Given policy P1 When an image is processed and the rule returns confidence=0.89 Then the fix is not auto-applied, the image is routed to the Review Inbox with reason="below-threshold" and the displayed confidence=0.89
Review Routing, Assignment, and Notifications for Low-Confidence Cases
Given a review routing rule that assigns items round-robin to reviewers [alice, bob] and notifications are enabled When three images are routed to review sequentially Then the items are assigned [alice, bob, alice] respectively Given an item is assigned to a reviewer When the assignment occurs Then the assignee receives an in-app notification within 10 seconds and an email within 60 seconds containing a deep link to the item Given the notification email deep link When the reviewer clicks it while authenticated Then the Review UI opens directly to the item details within 3 seconds
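The round-robin assignment behavior above is a small piece of state. A sketch (class and method names are illustrative; notification side effects are left out):

```python
import itertools

class RoundRobinAssigner:
    """Assign review items to reviewers in rotating order, so items
    routed sequentially to [alice, bob] land on alice, bob, alice, ..."""

    def __init__(self, reviewers):
        self._cycle = itertools.cycle(reviewers)

    def assign(self, item_id):
        reviewer = next(self._cycle)
        # The real system would also emit the in-app notification (<=10s)
        # and email (<=60s) with a deep link to the item here.
        return item_id, reviewer
```

In production the cursor would need to be persisted (or derived from a counter in the queue store) so rotation survives restarts and works across multiple orchestrator instances.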
Policy Scopes and Fallback to Marketplace Default
Given a per-rule policy exists for Rule "DPI Fix" in Marketplace Profile "Amazon US" When an image triggers "DPI Fix" in that profile Then the per-rule policy is evaluated for that fix Given a per-preset policy exists for Preset "Studio White" When an image is processed with Preset "Studio White" Then the per-preset policy is evaluated for all fixes initiated by that preset Given Marketplace Profile "Etsy" has a default policy "Route to Review" When an image in "Etsy" has no matching per-rule or per-preset policy Then it is routed to the Review Inbox
Conditional Policies by Product Category and SKU Tags
Given a policy that applies to Category="Apparel" AND SKU tags include "premium" with auto-apply threshold=0.94 for Rule "Background Removal" When an Apparel item with SKU tags ["premium","sale"] is processed and the rule confidence=0.97 Then the fix auto-applies and bypasses review Given the same policy When a "Home Decor" item or an Apparel item without the "premium" tag is processed Then the policy does not match and the next applicable policy or Marketplace default is applied
Review Inbox Bulk Approve/Override with Diff Preview
Given 50 items are in the Review Inbox When the reviewer selects 20 items and opens "Preview Diff" for any one Then the before/after diff modal loads within 2 seconds and accurately reflects the proposed fix at 100% zoom Given 20 selected items When the reviewer clicks "Approve Selected" Then the fixes are applied to all selected items, successful items are removed from the inbox, and an audit entry is created per item Given a bulk approval where one item fails to apply due to a processing error When the operation completes Then 19 items are applied, the failed item remains in the inbox with error_code and retry action, and a failure notification is shown Given 10 selected items When the reviewer clicks "Override Selected" Then the proposed fixes are rejected, the items are removed from the inbox, and the decision is logged as decision=rejected with reason="manual-override"
Decision Audit Trail and Export
Given auto-applied and manually reviewed items within a date range When an admin exports the Decision Log to CSV Then each row includes at minimum: image_id, SKU, rule, preset, marketplace_profile, policy_id, decision (auto-applied|approved|rejected), confidence, reason_code, actor (system|user_id), timestamp Given an audit log entry When the admin opens it Then a detail view shows the policy evaluation summary (matched conditions, thresholds) and provides a link to the diff preview Given the audit logs When filtered by decision=auto-applied and rule="Background Removal" Then only matching entries are returned within 2 seconds for up to 10,000 records
One-Click Diff Preview
"As a photo reviewer, I want a one-click before/after diff so that I can quickly verify fixes and approve or request changes without leaving the flow."
Description

Enables instant visual comparison between original and fixed images with side-by-side and overlay modes, zoom, pan, and toggleable pixel-diff heatmaps. Accessible from the review queue and batch results, the preview shows per-rule annotations (e.g., margin adjustments, background mask edges) and renders in under 300ms for snappy triage. Keyboard shortcuts support rapid navigation across batches, speeding up approvals and rejections during high-volume processing.

Acceptance Criteria
Launch From Review Queue and Batch Results
Given a signed-in user viewing the Review Queue with at least one item processed by FixFlow, When the user activates “Diff Preview” via click or Enter on a focused item, Then the preview opens within 150ms and focus moves into the viewer. Given the preview opens from the Review Queue, When opened, Then it defaults to Side-by-Side mode and displays Original on the left and Fixed on the right with synchronized dimensions. Given a signed-in user viewing Batch Results, When the user activates “Diff Preview” on any listed item, Then the same viewer opens with identical default state and controls. Given an item has both original and fixed assets available, When the viewer opens, Then both assets are requested in parallel and the first interactive render appears within 300ms after both assets finish decoding. Given the user closes the viewer, When they return to the list, Then keyboard focus is restored to the originating item and no background scroll position is lost.
Toggle Between Side-by-Side and Overlay Views
Given the viewer is open in Side-by-Side mode, When the user switches to Overlay mode via UI toggle or shortcut, Then the mode changes within 100ms without layout shift and both images are pixel-aligned. Given the viewer is in Overlay mode, When the user switches back to Side-by-Side, Then canvases render within 100ms and the previous zoom level and pan position are preserved. Given either mode is active, When the user reopens the viewer within the same session, Then the last-used mode is restored for that user and workspace. Given either mode is active, When the window is resized, Then content responsively fits without overlapping controls and maintains image alignment.
Pixel-Diff Heatmap Toggle and Threshold
Given the viewer is open in any mode, When the user toggles Pixel-Diff Heatmap ON, Then non-identical pixels between Original and Fixed are highlighted and the overlay appears within 50ms. Given Heatmap is ON, When the user toggles it OFF, Then all heatmap overlays are removed within 50ms with no residual artifacts. Given Heatmap is ON, When the user switches between Side-by-Side and Overlay modes, Then heatmap state persists and remains consistent across modes. Given a static pair of images, When Heatmap is ON, Then the number of highlighted pixels equals the pixel-wise difference count computed by the configured diff algorithm within ±0.5% tolerance.
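The highlighted-pixel count check above implies a pixel-wise comparison between the two images. A minimal pure-Python sketch follows; real implementations would operate on vectorized image buffers, and the `tolerance` parameter is an assumption (the criteria only fix the configured diff algorithm's count within ±0.5%):

```python
def pixel_diff_mask(original, fixed, tolerance=0):
    """Return a boolean mask marking pixels that differ beyond `tolerance`.

    `original` and `fixed` are equal-sized 2D lists of (r, g, b) tuples;
    the mask drives the heatmap overlay (True = highlight this pixel).
    """
    if len(original) != len(fixed) or len(original[0]) != len(fixed[0]):
        raise ValueError("images must share dimensions for a pixel diff")
    mask = []
    for row_a, row_b in zip(original, fixed):
        mask.append([
            any(abs(a - b) > tolerance for a, b in zip(pa, pb))
            for pa, pb in zip(row_a, row_b)
        ])
    return mask

def diff_pixel_count(mask):
    """Count highlighted pixels -- the figure checked against the ±0.5% tolerance."""
    return sum(sum(1 for hit in row if hit) for row in mask)
```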
Zoom and Pan Synchronization
Given the viewer is open, When the user zooms via Ctrl/Cmd +/-, trackpad pinch, or UI controls, Then zoom increments in steps of 10% within a range of 25%–800% and renders within 50ms per step. Given any zoom level, When the user pans via drag or trackpad, Then Original and Fixed views remain synchronized with no drift and panning latency under 16ms per frame. Given the user zooms with the cursor over the image, When zoom changes, Then the zoom focal point remains centered on the cursor location in both images. Given the user resets zoom, When the reset action is triggered, Then zoom returns to 100% and pan re-centers within 50ms.
Per-Rule Annotations Visibility and Accuracy
Given an image with FixFlow rules applied (e.g., margin adjustment, background mask), When the viewer opens, Then a toggleable Annotations panel lists only the rules applied to that image. Given Annotations are ON, When the user hovers an annotation (e.g., margin box), Then a tooltip shows the rule name and quantified values (e.g., top/bottom/left/right px) and the overlay aligns to the exact pixel edges at ≥100% zoom. Given Annotations are toggled OFF, When the user views the image, Then no annotation overlays are rendered and image performance is unaffected (<1% CPU overhead compared to annotations ON idle). Given multiple annotations are present, When the user filters to a specific rule, Then only that rule’s overlays are displayed within 50ms.
Keyboard Shortcuts for Rapid Batch Triage
Given the viewer is open and focused, When the user presses Left/Right Arrow, Then the viewer loads the previous/next item in the current batch within 200ms while preserving view mode, heatmap state, and zoom/pan. Given the viewer is open, When the user presses V, Then view mode toggles between Side-by-Side and Overlay within 100ms. Given the viewer is open, When the user presses H, Then the heatmap toggles ON/OFF within 50ms. Given the viewer is open, When the user presses + or -, Then zoom increases/decreases in 10% steps; Esc closes the viewer and returns focus to the originating list item. Given a user needs discoverability, When they press ?, Then a shortcut help overlay appears listing all active bindings and, once dismissed, does not block subsequent shortcuts.
Rendering Performance SLA and Loading Feedback
Given original and fixed assets up to 12MP (≤4000×3000) and ≤25MB each, When the viewer initializes after assets are decoded, Then the first interactive render occurs within 300ms on a mid-range device (e.g., 4-core CPU, integrated GPU). Given network latency delays asset downloads beyond 300ms, When the user opens the viewer, Then a skeleton loader appears within 100ms and progress indicators reflect per-asset loading until decode completes. Given the user rapidly toggles heatmap, view modes, and zoom, When interactions occur, Then input-to-paint latency remains under 100ms for each action and frame drops do not exceed 1 consecutive frame at 60Hz. Given an unrecoverable render error occurs, When the viewer fails to render, Then an inline error state appears with retry and back-to-list actions, and an error event is logged with image IDs and timing metadata.
Non-Destructive Rollback & Versioning
"As a store owner, I want to revert any automated fix to a previous version so that I can recover from mistakes and maintain brand consistency."
Description

Stores originals and all intermediate outputs as immutable versions with full audit trails, enabling per-image or batch-level rollback at any time. Each version captures applied rules, parameter values, confidence scores, timestamps, and approver identity. Rollbacks are atomic, reversible, and exposed via UI and API for integrations. This protects against undesirable changes, supports compliance audits, and allows experimentation with new thresholds or presets without risk.

Acceptance Criteria
Immutable Version Chain on Processed Images
Given a catalog of 100 images is processed by FixFlow (background, margins, DPI, shadow), When processing completes, Then for each image the system persists v0=original and v1..vn=each pipeline step output as immutable versions, And any attempt to modify or delete a stored version via UI or API returns 403, And each version’s metadata includes rulesApplied[], parameterValues, confidenceScores, createdAt, actorId, approverId (nullable), And versions are visible via UI and API within 10 seconds of the step completing for 95% of cases.
Atomic Per-Image Rollback
Given an image with versions v0..v4, When a user with Editor role initiates a rollback to v2 via UI or API, Then a new head version v5 is created with type=rollback and sourceVersionId=v2, content exactly equal to v2, And the operation is atomic such that any failure leaves the current head unchanged, And CDN and caches are invalidated so new requests serve v5 within 60 seconds globally for 95% of requests, And the audit trail records actorId, reason, timestamp, and target/source version IDs.
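The append-only chain with rollback-as-new-head can be sketched as follows; the class, field names, and in-memory storage are illustrative, not PixelLift's actual API:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)  # frozen: stored versions are immutable
class Version:
    version_id: int
    content: bytes
    type: str                      # "original" | "pipeline" | "rollback" | "restore"
    source_version_id: Optional[int] = None

class VersionChain:
    """Append-only version history: rollback never deletes, it appends a new head."""

    def __init__(self, original: bytes):
        self._versions = [Version(0, original, "original")]

    def __len__(self):
        return len(self._versions)

    @property
    def head(self) -> Version:
        return self._versions[-1]

    def append(self, content: bytes, vtype: str, source=None) -> Version:
        v = Version(len(self._versions), content, vtype, source)
        self._versions.append(v)
        return v

    def rollback_to(self, version_id: int) -> Version:
        """Atomic in effect: the head only changes once the new version exists."""
        target = self._versions[version_id]   # raises IndexError if unknown
        return self.append(target.content, "rollback", source=target.version_id)
```

A forward restore (the v8 case above) is the same operation with `vtype="restore"` pointing at the prior head, so lineage links are preserved in both directions.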
Transactional Batch Rollback
Given a batch of 250 images each with versions v0..v4, When a rollback to v2 is requested for the batch in a single operation, Then either all 250 images receive a new rollback version and current pointers update, or none do if any single rollback fails, And the response returns an operationId and per-image status summary, And the operation is idempotent for 24 hours when the same Idempotency-Key is used, And no partial changes persist on failure.
Reversible Rollback (Forward Restore)
Given an image where a rollback created v7 referencing v3, When the user restores forward to prior head v6, Then a new version v8 is created with type=restore and sourceVersionId=v6, And history preserves v0..v8 with no deletions and correct parent/lineage links, And UI/API clearly show restoredFromVersionId and rolledBackFromVersionId, And the change is visible in UI/API within 10 seconds for 95% of cases.
Version Timeline UI with Preview & Diff
Given a user opens the Versions panel for an image with at least 5 versions, When the user selects any version, Then a 1024px preview renders within 500ms for 95% of interactions and matches the stored asset within 1% pixel RMSE, And clicking Diff shows side-by-side and overlay modes with percent pixel change and bounding regions, And the card displays approver identity (user or system), timestamps, and rulesApplied summary, And no destructive edits are possible from this panel.
Versioning API & Webhooks
Given a client with a valid OAuth2 token, When it calls GET /images/{id}/versions, Then it receives 200 with an ordered list including id, parentId/sourceVersionId, type, rulesApplied, parameterValues, confidenceScores, createdAt, actorId, approverId, And when it calls POST /images/{id}/rollback with targetVersionId and an Idempotency-Key, Then it receives 202 with operationId and later a webhook image.rollback.completed summarizing success/failures, And unauthorized or malformed requests return 401/403/422 with error details.
Exportable, Append-Only Audit Trail
Given an auditor role requests an export for a date range, When the system generates the export via UI or GET /audit/exports, Then the export includes all versions with applied rules, parameter values, confidence scores, timestamps, actorId, approverId, and operation type, And entries are append-only: attempts to modify or delete audit records return 403 and are logged, And exports covering 10k versions complete within 60 seconds for 95% of requests.
Marketplace Compliance Profiles
"As a seller listing on multiple marketplaces, I want compliance profiles applied automatically so that my images meet each marketplace’s standards without manual tweaks."
Description

Introduces predefined and customizable profiles for marketplaces (e.g., Amazon, Shopify, eBay) encoding requirements such as background color, product fill percentage, minimum dimensions, DPI, and shadow rules. FixFlow maps rules to these profiles and validates outputs against them, auto-correcting where possible and flagging violations otherwise. Profiles can be attached to style-presets and batches so that images ship compliant by default, reducing listing rejections and rework.

Acceptance Criteria
Apply Predefined Marketplace Profile to Batch
- Given a batch with a selected predefined marketplace profile, when processing starts, then each image is validated against all profile rules (background color, product fill %, minimum dimensions, DPI, shadow rule).
- Given auto-apply fixes is enabled, when a rule has a mapped fix within safe thresholds, then the fix is applied and the rule passes on re-validation.
- Given a violation cannot be safely auto-fixed, when processing completes, then the image is flagged with the specific failed rule(s) and no destructive changes are applied.
- Given processing completes, when the batch summary renders, then it shows counts of Passed, Auto-fixed, and Flagged images and an overall compliance rate.
Create and Customize Marketplace Compliance Profile
- Given a user with edit permissions, when creating a profile from a marketplace template, then they can configure background color (hex), product fill min/max %, minimum width/height (px), DPI, and shadow allowance.
- Given invalid values are entered, when the user attempts to save, then inline validation prevents save and lists exact fields and allowed ranges.
- Given the profile is saved, when viewing profile details, then it displays a unique name, version number, and a readable rules summary.
- Given an existing profile is edited and saved, when batches reference that profile, then existing completed batches remain unchanged and new runs use the new version.
Auto-Fix Mapping and Safe Thresholds
- Given safe thresholds are defined per rule, when an auto-fix would exceed a threshold, then the fix is not applied and the violation is flagged with the attempted delta.
- Given an auto-fix is applied, when the image is revalidated, then all affected rules pass and a non-destructive diff is stored for rollback.
- Given rollback is triggered on an image, when performed, then the image returns to its prior state and the audit log records the action with timestamp, user, and rule deltas.
Profile Attachment to Style-Presets and Batches
- Given a style-preset has Profile A and the batch selects Profile B, when processing starts, then Profile B overrides and is used for validation and fixes, and the job header displays "Using Profile: B (Overrides preset)".
- Given no profile is selected at batch time, when processing starts, then the profile attached to the style-preset is used; if none, the workspace default profile is used.
- Given a batch is re-run, when a newer profile version exists, then the operator is prompted to choose between the batch's original profile version and the newest, and the chosen version is recorded in the job metadata.
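The precedence rules above (explicit batch selection over the preset's profile, preset over workspace default) can be sketched as a small resolver; the function name and banner wording are illustrative:

```python
def resolve_profile(batch_profile=None, preset_profile=None, workspace_default=None):
    """Pick the compliance profile for a job and build the job-header banner.

    Precedence: batch selection > style-preset attachment > workspace default.
    """
    if batch_profile is not None:
        banner = f"Using Profile: {batch_profile}"
        if preset_profile is not None and preset_profile != batch_profile:
            banner += " (Overrides preset)"   # surfaced in the job header
        return batch_profile, banner
    chosen = preset_profile if preset_profile is not None else workspace_default
    return chosen, (f"Using Profile: {chosen}" if chosen else None)
```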
Review Workflow for Flagged Violations
- Given certain rules are configured as "request review", when those rules fail, then the image is routed to a review queue with rule-specific failure reasons.
- Given a reviewer opens an item, when they click "Preview diff", then a side-by-side original vs proposed fix loads within 1 second and shows per-rule overlays.
- Given the reviewer approves, rejects, or requests reprocess, when they submit the decision, then the image status updates accordingly, the action is logged, and bulk actions apply consistently to multi-select.
Compliance Reporting and Export
- Given a batch has completed, when viewing the compliance report, then it lists per-rule pass/fail counts, total auto-fixes applied, and the overall compliance rate.
- Given the user clicks Export, when choosing CSV or JSON, then a file is generated that includes image ID, applied profile name and version, each rule's status, fix applied (yes/no), reviewer (if any), and timestamps.
- Given the report is filtered by rule or status, when applied, then the table and export reflect the same filtered subset.
Resilient Batch Orchestration & Retries
"As a high-volume seller, I want reliable batch processing with automatic retries so that large uploads complete quickly even when individual images fail intermittently."
Description

Adds a fault-tolerant batch processor with idempotent job IDs, prioritized queues, concurrency controls, and exponential backoff retries for transient failures. Provides real-time progress, per-image status, and cost/time estimates, with partial-completion handling and resumable batches. Integrates with PixelLift’s existing upload pipeline and respects per-account rate limits, ensuring reliable high-volume processing during peak catalog updates.

Acceptance Criteria
Idempotent Batch Submission & Deduplication
- Given a batch submission with jobId=X and payloadHash=H, When the same jobId and identical payload are resubmitted within 24h, Then the API returns 200 with the original batchId and no new processing is triggered.
- Given a batch submission with jobId=X but payloadHash≠H, When submitted, Then the API returns 409 Conflict (code=BATCH_PAYLOAD_MISMATCH) and no processing starts.
- Given at-least-once delivery from the internal queue, When duplicate tasks for the same image are received, Then storage writes, billing, and status transitions occur at most once (verified via idempotency keys).
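A minimal sketch of the (jobId, payloadHash) deduplication above, assuming the payload hash is computed over canonical JSON (key order must not change the hash); class and field names are illustrative:

```python
import hashlib
import json

class BatchRegistry:
    """Deduplicates batch submissions by (jobId, payloadHash)."""

    def __init__(self):
        self._batches = {}   # job_id -> (payload_hash, batch_id)

    @staticmethod
    def payload_hash(payload: dict) -> str:
        # Canonical JSON so semantically identical payloads hash identically.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode()).hexdigest()

    def submit(self, job_id: str, payload: dict):
        """Return (http_status, batch_id_or_error_code)."""
        h = self.payload_hash(payload)
        if job_id in self._batches:
            prev_hash, batch_id = self._batches[job_id]
            if prev_hash == h:
                return 200, batch_id              # same job, same payload: no new work
            return 409, "BATCH_PAYLOAD_MISMATCH"  # same job, different payload
        batch_id = f"batch-{len(self._batches) + 1}"
        self._batches[job_id] = (h, batch_id)
        return 202, batch_id
```

A production registry would also expire entries after the 24h window and persist the mapping so deduplication survives restarts.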
Priority Queue Scheduling & Starvation Avoidance
- Given a backlog containing HIGH and NORMAL priority batches, When workers are available, Then HIGH priority consumes ≥60% of active worker slots until HIGH backlog is cleared.
- Given only NORMAL and LOW priority backlog for ≥5 minutes, Then LOW priority receives ≥10% of throughput (no indefinite starvation).
- Given a newly submitted HIGH priority batch, When at least one worker slot is free, Then its first task starts within 30 seconds and is scheduled ahead of new lower-priority tasks.
Concurrency Limits & Per-Account Rate Enforcement
- Given account A has maxConcurrency=10 and globalMaxConcurrency=100, When 1000 images are queued, Then active tasks for A never exceed 10 and total active tasks never exceed 100 (measured every second).
- Given per-account rate limit R=120 requests/min, When driving sustained load, Then requests for account A never exceed R within any rolling 60-second window.
- Given a 429 response is received, When retrying, Then Retry-After is honored (if present) or a scheduler-based delay is applied so that no image experiences >2 consecutive 429s.
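The rolling-window rule can be enforced per account with a timestamp deque; a minimal sketch where the class name and the injected `now` clock are assumptions:

```python
import collections

class SlidingWindowLimiter:
    """Allows at most `limit` requests in any rolling `window_s` seconds
    for one account (e.g., R=120 over 60 seconds in the criteria above)."""

    def __init__(self, limit: int, window_s: float = 60.0):
        self.limit = limit
        self.window_s = window_s
        self._stamps = collections.deque()   # timestamps of accepted requests

    def allow(self, now: float) -> bool:
        # Drop timestamps that have aged out of the rolling window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) < self.limit:
            self._stamps.append(now)
            return True
        return False
```

The scheduler would consult one limiter per account before dispatching a task, and requeue (rather than drop) work that the limiter rejects.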
Exponential Backoff Retries with Jitter
- Given a transient failure (HTTP 5xx, timeout, network reset), When processing an image, Then retry up to 5 attempts with exponential backoff delays of ~1s, 2s, 4s, 8s, 16s with ±20% jitter.
- Given a non-transient failure (HTTP 4xx excluding 429), When encountered, Then do not retry and mark the image status=Failed with an error code and final message.
- Given any retry attempt, When re-executed, Then the same idempotency keys are reused so side effects (writes, billing) occur at most once.
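The schedule above (~1s, 2s, 4s, 8s, 16s with ±20% jitter) and the transient/non-transient split can be sketched as follows; the error-classification set and helper names are illustrative:

```python
import random

def backoff_delays(attempts=5, base=1.0, jitter=0.2, rng=random.random):
    """Exponential backoff schedule with ±`jitter` multiplicative noise."""
    delays = []
    for attempt in range(attempts):
        nominal = base * (2 ** attempt)          # 1s, 2s, 4s, 8s, 16s
        # rng() in [0, 1) maps to a factor in [1 - jitter, 1 + jitter).
        delays.append(nominal * (1 - jitter + 2 * jitter * rng()))
    return delays

# Assumed classification: 5xx, timeouts, and resets are transient; 429 is
# retried (with Retry-After honored elsewhere); other 4xx fail immediately.
TRANSIENT = {500, 502, 503, 504, "timeout", "reset"}

def should_retry(error, attempt, max_attempts=5):
    if attempt >= max_attempts:
        return False
    return error in TRANSIENT or error == 429
```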
Real-Time Progress & Per-Image Status APIs
- Given a batch in progress, When calling GET /batches/{id}/progress, Then the response includes total, succeeded, failed, processing, pending, percentComplete, etaSeconds, costEstimate, and updatedAt, and values update at least every 2 seconds during active processing.
- Given an imageId, When calling GET /images/{id}/status, Then the response includes status ∈ {pending, processing, success, failed, skipped}, retryCount, lastErrorCode (nullable), startedAt, and completedAt.
- Given a client subscribed to /batches/{id}/events, When any image status changes, Then an event is emitted within 2 seconds containing the delta.
Cost & ETA Estimation Accuracy
- Given batch creation, When receiving POST /batches response, Then it includes costEstimate {value, currency} and etaSeconds, both present and non-negative.
- Given a validation dataset of ≥1000 historical batches sized 100–10,000 images, When comparing estimates to actuals, Then MAPE for cost and ETA is ≤15% at P90.
- Given an active batch, When ≥50% of images are completed, Then rolling ETA error is ≤10% at P90 and estimates are updated at least every 30 seconds.
Partial Completion Handling & Resumable Batches
- Given a batch with some images succeeded and others pending/failed-with-retries-left, When a resume is requested with the same jobId, Then only non-succeeded images are (re)scheduled and succeeded images are not reprocessed.
- Given a pause or worker crash, When resuming, Then the resume operation completes within 5 seconds and duplicate processing count remains zero across the batch.
- Given an export request during processing, When downloading results, Then only completed images are included and reflect a state no older than 2 seconds from the last progress update.

Crosscheck Matrix

Validate assets against multiple marketplaces at once and visualize conflicts. Get clear recommendations on whether to use one compromise export or to auto-generate channel-specific variants, so multi-channel sellers pass every rule the first time with no extra editing cycles.

Requirements

Marketplace Rule Engine
"As a multi-channel seller, I want PixelLift to know each marketplace’s image rules so that my photos are validated accurately without manual research."
Description

Centralized service that aggregates and normalizes image policy rules across marketplaces (e.g., Amazon, eBay, Etsy, Shopify, Walmart) including dimensions, aspect ratios, background color requirements, color profile, max file size, compression, text/watermark/border prohibitions, product fill ratio, and category/region-specific variations. Supports rule versioning with effective dates, change logs, and automatic scheduled updates with manual override. Exposes a low-latency internal API and validation DSL to evaluate PixelLift assets and computed measurements, with fallback defaults when rules are missing and workspace-level custom overrides. Ensures consistent, auditable validations used by the Crosscheck Matrix and export pipeline.

Acceptance Criteria
Normalized Multi‑Marketplace Rule Ingestion and Versioning
Given marketplace rule payloads for Amazon, eBay, Etsy, Shopify, and Walmart containing category and region identifiers When the ingestion job runs Then rules are persisted in a canonical schema with required fields: marketplace_id, category_id, region, constraints, version, effective_start, effective_end And for any marketplace/category/region, effective date ranges for successive versions do not overlap And each persisted version is immutable and retrievable by marketplace_id+category_id+region+version And invalid or missing required fields cause the payload to be rejected with a logged error including marketplace_id and source timestamp
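The non-overlap requirement on effective date ranges can be checked with a simple sort-and-scan; a minimal sketch assuming comparable date values (e.g., ISO-8601 strings) and `None` for an open-ended version:

```python
def ranges_overlap(versions):
    """Detect overlapping effective date ranges among rule versions.

    `versions`: list of (effective_start, effective_end) tuples for one
    marketplace/category/region; `effective_end` may be None (open-ended).
    """
    ordered = sorted(versions, key=lambda v: v[0])
    for (start1, end1), (start2, _end2) in zip(ordered, ordered[1:]):
        # The previous version is still effective when the next one starts.
        if end1 is None or start2 < end1:
            return True
    return False
```

An ingestion job would run this check per scope and reject (and log) any payload that would introduce an overlap.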
Scheduled Updates, Change Logs, and Manual Override
Given a daily schedule at 02:00 UTC and network availability When upstream marketplaces publish rule changes Then the service fetches updates, diffs against current versions, and writes a change log entry per affected marketplace/category/region with fields: previous_version, new_version, changed_attributes, effective_start, effective_end, fetched_at And a summary notification is emitted on success; on failure the job retries up to 3 times with exponential backoff and raises an alert within 2 minutes if still failing And an admin can manually set the active version per marketplace/category/region and freeze/unfreeze scheduled updates for that scope And a manual override action is recorded with actor_id, reason, and timestamp and is immediately reflected in subsequent validations
Low‑Latency Rule Evaluation API
Given an internal client calls POST /rule-engine/validate with an asset_id or computed measurements and up to 5 marketplace targets When the request is valid Then the API returns HTTP 200 within 150 ms (P95) and 300 ms (P99) per asset with a result containing per-marketplace pass/fail, violations, and rule_version references And batch requests of up to 500 assets complete within 1.5 s (P95) and 3.0 s (P99) per 100 assets processed And invalid inputs return HTTP 400 with machine-readable error codes; upstream dependency timeouts return HTTP 503 with a Retry-After header
Validation DSL Expressiveness and Deterministic Evaluation
Given a ruleset that expresses constraints on dimensions, aspect_ratio, background_color, color_profile, max_file_size, compression, text/watermark/border presence, and product_fill_ratio When evaluated against provided computed measurements for an asset Then the DSL supports operators: =, !=, <, <=, >, >=, between, in, matches, exists, and conditional clauses by category and region And evaluation returns a deterministic outcome with per-constraint results including: constraint_id, status (pass|fail|not_applicable), observed_value, expected_value, and message And the same inputs and rule version always produce identical outputs
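A toy evaluator for a subset of the operator set above, showing deterministic per-constraint results and the not_applicable outcome for missing attributes; the dict shapes and omitted operators (`matches`, conditional clauses) are assumptions of this sketch:

```python
OPERATORS = {
    "=":  lambda obs, exp: obs == exp,
    "!=": lambda obs, exp: obs != exp,
    "<":  lambda obs, exp: obs < exp,
    "<=": lambda obs, exp: obs <= exp,
    ">":  lambda obs, exp: obs > exp,
    ">=": lambda obs, exp: obs >= exp,
    "between": lambda obs, exp: exp[0] <= obs <= exp[1],
    "in": lambda obs, exp: obs in exp,
    "exists": lambda obs, exp: obs is not None,
}

def evaluate(constraints, measurements):
    """Deterministically evaluate each constraint against measured values.

    `constraints`: dicts with constraint_id, attribute, op, expected.
    A missing attribute yields status "not_applicable" rather than a failure.
    """
    results = []
    for c in constraints:
        observed = measurements.get(c["attribute"])
        if observed is None and c["op"] != "exists":
            status = "not_applicable"
        else:
            ok = OPERATORS[c["op"]](observed, c.get("expected"))
            status = "pass" if ok else "fail"
        results.append({
            "constraint_id": c["constraint_id"],
            "status": status,
            "observed_value": observed,
            "expected_value": c.get("expected"),
        })
    return results
```

Because the evaluator is a pure function of (constraints, measurements), identical inputs and rule versions always produce identical outputs, as the criteria require.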
Fallback Defaults for Missing Rules and Attributes
Given a marketplace/category/region has no matching rules for the request timestamp When validation is executed Then the engine returns outcome status rule_missing and applies the system default policy configured for the workspace And for any missing attribute within an otherwise matching ruleset, the outcome for that attribute is not_applicable and does not cause overall failure And all fallback decisions are included in the response metadata with source = default and are logged at warn level (rate-limited to once per marketplace per hour)
Workspace‑Level Custom Overrides
Given a workspace defines override constraints for a marketplace/category/region with effective dates When a validation request is made within that workspace Then overrides take precedence over marketplace rules for only the specified attributes while non-overridden attributes inherit from the marketplace rules And the response identifies the source for each constraint as override or marketplace and references the override_version And enabling, updating, or deleting an override is auditable (actor_id, timestamp, change_summary) and affects new validations within 60 seconds
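The attribute-level precedence described above can be sketched as a merge that tags each constraint's source; the names and dict shapes are illustrative:

```python
def merge_constraints(marketplace_rules: dict, overrides: dict) -> dict:
    """Attribute-level merge: overrides win only for the attributes they name;
    everything else inherits from the marketplace rules. Each merged entry
    records its source so validation responses can report provenance."""
    merged = {}
    for attr, expected in marketplace_rules.items():
        merged[attr] = {"expected": expected, "source": "marketplace"}
    for attr, expected in overrides.items():
        merged[attr] = {"expected": expected, "source": "override"}
    return merged
```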
Auditability and Single Source of Truth for Validations
Given Crosscheck Matrix and the export pipeline request validations When validations are performed Then 100% of validation results are produced by the Rule Engine API and include: ruleset_hash, rule_version per marketplace, evaluation_timestamp, and trace_id And given identical asset measurements and rule versions, repeated validations across services return identical results And validation records are retained for at least 180 days and retrievable by trace_id within 200 ms P95
Batch Crosscheck Pipeline
"As a seller uploading a large catalog, I want my entire batch crosschecked in minutes so that I can fix issues before publishing across channels."
Description

Asynchronous, scalable validation pipeline that evaluates hundreds to thousands of images per batch against selected marketplaces in parallel. Implements job queueing, concurrency control, retries/timeouts, and idempotent processing with hashing to skip unchanged assets. Performs image analysis (e.g., background uniformity, margins, product fill ratio) and attaches metrics for rule evaluation. Supports incremental re-validation on deltas, progress tracking, partial results streaming, and webhooks for completion. Integrates before export to prevent non-compliant outputs and after style-presets to catch newly introduced conflicts.

Acceptance Criteria
High-Volume Parallel Validation Across Multiple Marketplaces
Given a submitted batch of 1,000 images with 3 marketplaces selected When the pipeline starts Then processing begins within 5 seconds and validations execute in parallel across marketplaces And p95 per image–marketplace validation time is ≤ 90 seconds and the batch completes within 15 minutes on the reference environment And zero jobs are dropped; any failures are recorded with error codes and retry counts
Queueing and Concurrency Controls Under Load
Given max_concurrency is configured to 50 and 2,000 validation jobs are queued When processing begins Then no more than 50 jobs run simultaneously and queue depth decreases as jobs complete And jobs within a batch are scheduled FIFO and no job remains in the same state for > 5 minutes without progress Given a worker terminates mid-job When the worker restarts Then the in-flight job is returned to the queue within 30 seconds and safely re-executed
Retry, Timeout, and Backoff Policy for External Validators
Given a validation step exceeds the per-step timeout of 30 seconds When the timeout occurs Then the job is retried up to 3 times with exponential backoff (e.g., ~2s, ~4s, ~8s with jitter) and attempt metrics are recorded Given a non-retryable error (e.g., 4xx schema/config error) is returned When the error is detected Then the job is not retried and is marked Failed with the specific code and message Given a transient 5xx error occurs and a subsequent retry succeeds When the final attempt completes Then the overall job result is recorded as Pass and includes retry_count > 0
Idempotent Processing Using Content Hashing
Given an asset’s content hash and parameters match a previously completed validation for the same marketplaces When the batch is resubmitted Then processing is skipped and prior results are returned with status Skipped_Unchanged and a reference to the source run Given duplicate submissions include the same idempotency key within 60 seconds When both are received Then only one execution occurs and both submissions return the same job identifier and outcome Given the asset content changes (hash differs) When resubmitted Then a full re-validation runs and a new immutable result set is stored
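The skip logic can be sketched as a cache keyed on (content hash, marketplaces, parameters); class and status names are illustrative, and persistence/immutability of stored results is assumed to live elsewhere:

```python
import hashlib

class ValidationCache:
    """Skips re-validation when content, targets, and parameters are unchanged."""

    def __init__(self):
        self._results = {}

    @staticmethod
    def key(content: bytes, marketplaces, params: dict):
        content_hash = hashlib.sha256(content).hexdigest()
        # Sort so ordering of marketplaces/params does not defeat the cache.
        return (content_hash, tuple(sorted(marketplaces)), tuple(sorted(params.items())))

    def validate(self, content, marketplaces, params, run_validation):
        k = self.key(content, marketplaces, params)
        if k in self._results:
            return {"status": "Skipped_Unchanged", "result": self._results[k]}
        result = run_validation(content)   # full validation only on a cache miss
        self._results[k] = result
        return {"status": "Validated", "result": result}
```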
Pre-Export Gate and Post-Preset Re-Validation (Delta-Only)
Given a user initiates export for selected marketplaces and some assets have stale or missing validations When export is requested Then the pipeline auto-triggers validation and export is blocked until validation completes or a configured timeout elapses, returning a clear blocking reason Given style-presets are applied to a subset of assets When re-validation is triggered Then only assets whose content hash or relevant ruleset changed are re-validated; unchanged assets are skipped using prior results Given at least one asset fails required marketplace rules When export is attempted Then export is prevented and the API returns a list of blocking violations per asset and marketplace
Image Analysis Metrics Extraction and Attachment
Given an image enters analysis When metrics are computed Then background_uniformity_score is reported in [0.00,1.00], margin_top/right/bottom/left are reported as percentages of the shortest side, and product_fill_ratio is reported as a percentage of image area And repeated runs on the same input produce metrics within ±1% variance Given metrics are available When rules are evaluated Then pass/fail decisions per marketplace are computed from configured thresholds and the metrics are included in the result payload Given the image has a transparent background When background analysis runs Then background_type="transparent" is flagged and background_uniformity_score is still reported
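The fill-ratio and margin units above can be illustrated with a toy metric extractor over a binary foreground mask; real pipelines would operate on decoded image buffers, and the mask representation is an assumption:

```python
def image_metrics(mask):
    """Compute fill ratio and per-edge margins from a foreground mask.

    `mask` is a 2D list of 0/1 values (1 = product pixel), as might come from
    the background-removal step. Fill ratio is a percentage of image area;
    margins are percentages of the shortest side, matching the units above.
    """
    h, w = len(mask), len(mask[0])
    rows = [y for y in range(h) if any(mask[y])]
    cols = [x for x in range(w) if any(row[x] for row in mask)]
    if not rows:
        return {"product_fill_ratio": 0.0}   # no product detected
    top, bottom = rows[0], h - 1 - rows[-1]
    left, right = cols[0], w - 1 - cols[-1]
    shortest = min(w, h)
    fill = sum(sum(row) for row in mask) / (w * h)
    return {
        "product_fill_ratio": round(100 * fill, 2),
        "margin_top": round(100 * top / shortest, 2),
        "margin_bottom": round(100 * bottom / shortest, 2),
        "margin_left": round(100 * left / shortest, 2),
        "margin_right": round(100 * right / shortest, 2),
    }
```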
Progress Tracking, Partial Results Streaming, and Completion Webhooks
Given a batch is processing When the progress endpoint is queried Then it returns counts by status (queued, processing, completed, failed, skipped) and overall percent complete updated at least every 2 seconds Given partial results become available When a client is connected via SSE/WebSocket Then per-asset per-marketplace results are streamed within 2 seconds of completion and in-order per asset Given a batch reaches terminal state When finalization occurs Then a completion webhook is POSTed with HMAC-SHA256 signature, retried up to 5 times on 3xx/5xx with exponential backoff, and includes an idempotency key for deduplication
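The HMAC-SHA256 webhook signing can be sketched with Python's hmac module; signing over a canonical-JSON body is an assumption of this sketch (the criteria fix only the algorithm), as is how the signature is transported:

```python
import hashlib
import hmac
import json

def sign_webhook(secret: bytes, payload: dict):
    """Sender side: produce the body and its HMAC-SHA256 hex signature."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return body, signature

def verify_webhook(secret: bytes, body: bytes, signature: str) -> bool:
    """Receiver side: constant-time comparison against the recomputed digest."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

`hmac.compare_digest` avoids timing side channels; the receiver should also use the payload's idempotency key to deduplicate the up-to-5 retried deliveries.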
Conflict Matrix UI
"As a merchandising manager, I want a clear visual of which photos fail which channels so that I can prioritize fixes quickly."
Description

Interactive visualization that displays assets as rows and marketplaces (and/or rule categories) as columns, with cells indicating pass/fail/warn status and severity. Provides filters, sorting, sticky headers, search, and grouping by product or style-preset. Hover reveals rule text and measured values; clicking drills into a detail view with visual overlays (safe crop bounds, padding guides, background uniformity heatmap). Supports keyboard navigation, accessible color contrasts, responsive layouts, and export of the matrix or details as CSV/PDF screenshots to share with teams.

Acceptance Criteria
Matrix Grid Rendering & Status Indicators
- Given a dataset with up to 2,000 assets and up to 10 marketplaces or rule categories, when the matrix loads, then it renders N asset rows and M columns within 2 seconds on a reference desktop and 3 seconds on a tablet.
- Then each cell displays one of: Pass, Warn, or Fail, with severity encoded as Pass=0, Warn=1, Fail=2, derived from the latest rules evaluation payload.
- And each status is represented by a unique icon/shape plus color; unavailable rule results show a dash and an info indicator.
- And row and column headers show asset identifiers and marketplace/rule-category names, remain aligned with cells during scroll, and avoid text truncation beyond 1 line with ellipsis and accessible full-text tooltips.
Filters, Sorting, Search & Sticky Headers
- Given a loaded matrix, when a user applies filters by Status (Pass/Warn/Fail), Marketplace, Severity, or Rule Category, then the grid updates within 300 ms and the visible row count and active filter chips reflect the selection.
- When sorting by Asset Name, Fail Count per asset, or a selected marketplace column, then ascending/descending toggles and the sort is stable (ties preserve original order) until changed.
- When entering a search query of 2+ characters, then matching rows/cells (filename, SKU, product name, rule name, rule ID) are returned and highlighted; no-results state appears when none match.
- Top header row and first column remain sticky during vertical/horizontal scroll without jitter; sticky elements consume no more than 15% of viewport height/width.
Grouping by Product or Style-Preset
- Given assets with Product IDs and Style-Preset IDs, when the user toggles Group by Product or Group by Style-Preset, then rows are clustered under collapsible headers showing group name and counts (assets, fails, warns).
- Group aggregate status equals the maximum severity across its child items and displays as a status chip on the group header.
- Collapsing/expanding groups preserves state during navigation and while opening/closing detail views.
- Grouped scrolling and expand/collapse interactions sustain >=60 FPS on desktop (>=45 FPS on mobile) for up to 2,000 assets using row virtualization.
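The severity encoding and the "maximum severity across child items" aggregation rule above can be sketched in a few lines. This is a minimal illustration, not PixelLift's implementation; the function name `aggregate_status` and the empty-group behavior are assumptions.

```python
# Severity encoding from the criteria: Pass=0, Warn=1, Fail=2.
SEVERITY = {"Pass": 0, "Warn": 1, "Fail": 2}
STATUS_BY_SEVERITY = {v: k for k, v in SEVERITY.items()}

def aggregate_status(child_statuses):
    """Group chip status = the worst (maximum-severity) child status."""
    if not child_statuses:
        return "Pass"  # assumption: an empty group defaults to Pass
    worst = max(SEVERITY[s] for s in child_statuses)
    return STATUS_BY_SEVERITY[worst]
```

A group containing `["Pass", "Warn", "Pass"]` would render a Warn chip, and any Fail child forces the whole group chip to Fail.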
Rule Detail Drilldown with Visual Overlays
- Given a matrix cell, when the user clicks or presses Enter/Space on it, then a detail view opens within 300 ms as a modal or side panel anchored to the originating cell.
- The detail view shows the asset image with zoom (25%–400%), pan, and reset; zoom and pan are smooth without pixelation at native resolution.
- Overlays can be toggled independently: Safe Crop Bounds (target aspect ratio visible), Padding Guides (measured margins in px and % per edge), Background Uniformity Heatmap (legend with numerical scale).
- The active rule section displays: Rule Name, Description, Expected Thresholds, Measured Values with units, Evaluation (Pass/Warn/Fail), and a link to Rule Docs.
- Closing the detail view returns focus to the originating cell and preserves scroll position in the matrix.
Keyboard Navigation & Accessibility Compliance
- The matrix implements ARIA grid roles for grid/row/cell; each cell’s accessible name includes asset ID, marketplace/rule-category, and status text.
- Users can navigate cells with Arrow keys, jump regions with Tab/Shift+Tab, open detail with Enter/Space, and close with Esc; no keyboard traps exist.
- All interactive elements have a visible focus indicator (>=3:1 contrast); text and key UI meet WCAG 2.1 AA contrast (>=4.5:1 for text, >=3:1 for graphical objects).
- Status is not conveyed by color alone; each status includes an icon and text label (e.g., “Fail”).
- Tooltip/hover content is available on focus and announced by screen readers, including measured values and rule text.
Responsive Layout & Touch Interaction
- On >=1200px width, the grid shows full columns with sticky headers; at 768–1199px, columns auto-size with horizontal scroll and a frozen first column; <768px switches to stacked per-asset accordions with per-marketplace status chips.
- Touch targets are >=44x44 px; two-finger pinch/zoom and pan work in the detail view without interfering with page scroll.
- Initial render completes under 3 seconds for 1,000 assets on a mid-tier mobile device; scroll remains responsive (>=45 FPS) via virtualization.
- Orientation changes preserve current scroll position, selection, and any open detail state.
Export Matrix and Detail as CSV/PDF
- Given current filter/search/sort/grouping, when the user selects Export Matrix as CSV, then a CSV downloads within 5 seconds for up to 5,000 visible rows with columns: AssetID, AssetName, ProductID, StylePreset, MarketplaceOrRuleCategory, RuleID, RuleName, Status, Severity, MeasuredValue, Threshold, Timestamp.
- When Export Matrix as PDF is selected, then a paginated PDF reflecting the visible grid (with legend and sticky headers) downloads, including a summary page with Pass/Warn/Fail counts.
- When in a detail view, selecting Export Detail as PDF downloads within 3 seconds a single-page PDF including the image with current overlays, measured values table, rule description, and metadata (asset ID, marketplace, timestamp).
- Exported files follow the naming convention PixelLift_Crosscheck_{type}_{YYYY-MM-DD_HH-mm-ss}_{userInitials}.ext and use the user’s timezone for timestamps.
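The export naming convention above is deterministic enough to sketch. This is an illustrative helper, not PixelLift code; the function name and the `tz` parameter (standing in for the user's timezone) are assumptions.

```python
from datetime import datetime

def export_filename(export_type, user_initials, ext, when=None, tz=None):
    """Build PixelLift_Crosscheck_{type}_{YYYY-MM-DD_HH-mm-ss}_{userInitials}.ext,
    stamping with the user's timezone (passed as tz) per the criteria above."""
    when = when or datetime.now(tz)
    stamp = when.strftime("%Y-%m-%d_%H-%M-%S")
    return f"PixelLift_Crosscheck_{export_type}_{stamp}_{user_initials}.{ext}"
```

For example, a matrix CSV exported by user "JD" at 13:05:09 on 2024-05-01 would be named `PixelLift_Crosscheck_matrix_2024-05-01_13-05-09_JD.csv`.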
Actionable Fix Recommendations
"As a seller, I want clear, one-click fixes for violations so that I can pass all marketplace checks without manual editing."
Description

Engine that converts each violation into precise, parameterized remediation steps (e.g., resize to 2000×2000, pad 50px with #FFFFFF, convert to sRGB, compress <1MB, crop to achieve ≥85% subject fill, remove detected text overlay region), with predicted compliance outcomes per marketplace. Honors brand style-presets and constraints, provides confidence scores, and generates instant previews. Enables one-click application to selected assets or entire groups, and queues resulting transforms through the existing rendering pipeline.

Acceptance Criteria
Parameterized Remediation per Violation
Given an asset with one or more detected marketplace rule violations When the engine generates recommendations Then each violation is mapped to at least one specific remediation step with explicit parameters (e.g., width/height, padding in px and hex color, color profile, compression target, crop box coordinates) And each step lists the marketplace rule ID(s) it addresses And steps are ordered to avoid conflicts (e.g., crop before compress) And no recommendation includes steps unrelated to the detected violations
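The conflict-avoiding ordering requirement can be expressed as a rank-based sort. The only ordering the criteria fix is "crop before compress"; the full `STEP_ORDER` below, and the step/field names, are illustrative assumptions.

```python
# Assumed full ordering; the source only mandates crop before compress.
STEP_ORDER = ["remove_text_overlay", "crop", "resize", "pad",
              "convert_color_profile", "compress"]

def order_steps(steps):
    """Sort remediation steps so geometry edits precede lossy compression."""
    rank = {name: i for i, name in enumerate(STEP_ORDER)}
    return sorted(steps, key=lambda s: rank[s["op"]])
```

Because Python's `sorted` is stable, steps of the same operation type keep their original relative order, which matches the stable-sort expectations elsewhere in this spec.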
Per-Marketplace Predicted Outcomes with Confidence
Given recommendations have been generated for an asset or batch When viewing outcomes per marketplace Then the UI shows a predicted Pass/Fail per marketplace for the full step set And each prediction includes a confidence score from 0.00 to 1.00 with two decimals And each prediction links to the specific rule clauses expected to pass after fixes And, when validated against a test set of ≥100 assets, overall prediction accuracy at confidence ≥0.50 is at least 90%
Brand Style-Preset Compliance and Conflict Handling
Given a brand style-preset with locked constraints (e.g., background color, min margins, watermark rules) When the engine computes fixes that would otherwise violate those constraints Then no recommended step proposes a parameter that breaks locked constraints And conflicting requirements are flagged with a clear reason and impacted marketplaces And the engine proposes an alternative that honors the preset or marks the item as requiring a channel-specific variant
Instant Preview of Recommended Fixes
Given a selected asset and its recommended step set When the user clicks Preview Then a side-by-side before/after preview renders within 2 seconds for assets up to 24MP And the preview reflects the full ordered step set (resize, pad, crop, color profile, compression, text removal) And overlay toggles are available to visualize crop region, padding area, and removed overlay regions And reverting the preview restores the original image instantly (<500 ms)
One-Click Apply and Render Queue Integration
Given the user selects one or more assets or a saved group When the user clicks Apply Fixes and confirms Then a render job is created within 3 seconds and queued through the existing rendering pipeline with a job ID And job statuses progress Queued → Processing → Complete/Failed and are visible in the activity panel And outputs are saved to the designated destination(s) with versioned filenames and metadata noting applied steps and parameters And failures include retryable error codes and the offending step
Compromise vs Channel-Specific Variant Recommendation
Given an asset set with conflicting marketplace requirements When the engine evaluates remediation strategies Then it presents a single compromise export option with predicted outcomes per marketplace and confidence scores And it also presents channel-specific variant options, each predicted to pass its marketplace with confidence scores And each option includes estimated processing time and asset count And the user can select an option and proceed to Preview or Apply in one click
Auto-Generate Channel Variants
"As a multi-channel seller, I want PixelLift to create channel-specific image variants automatically so that each listing complies without extra effort."
Description

Non-destructive export pipeline that produces marketplace-specific compliant image variants from a canonical master using recommended transforms. Preserves retouching and brand style-presets while adapting technical parameters per channel. Supports naming conventions, folder structures, ZIP packaging, and optional direct pushes to connected storefronts. De-duplicates identical outputs across channels, embeds metadata (e.g., alt text templates), and maintains linkage to the master for re-generation when rules change.

Acceptance Criteria
Multi-Channel Variant Generation & Compliance
Given a canonical master image with product SKU(s) and at least two target channels selected (e.g., Amazon, eBay) with Crosscheck rules loaded When the user clicks Auto-Generate with the "Channel-specific variants" option Then the system produces one variant per selected channel using the recommended transforms And each generated file passes that channel’s Crosscheck rules with 0 errors and 0 critical warnings And the export report lists, for each file, the channel, applied transforms, and a Pass status
Non-Destructive Pipeline & Style Preservation
Given the master asset has retouch adjustments and a brand style-preset applied (e.g., preset ID SP-123) When channel variants are generated Then the master asset remains unchanged (file hash identical before and after export) And each variant includes the rendered retouch adjustments and the style-preset look And the export manifest records the style-preset ID and adjustment stack applied for traceability
De-duplication of Identical Outputs Across Channels
Given two or more channels resolve to identical technical outputs (same dimensions, format, color profile, compression, background) When variants are generated Then the system computes a content hash and writes only one physical file for identical outputs And each relevant channel manifest references the single shared file And the export summary displays a deduplication count and total bytes saved
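The de-duplication criterion above — one physical file per content hash, with each channel manifest referencing the shared file — can be sketched as follows. Names (`dedup_outputs`, the in-memory bytes stand-in for rendered files) are illustrative, and SHA-256 is an assumption; the spec only requires "a content hash".

```python
import hashlib

def dedup_outputs(channel_outputs):
    """Write one physical file per distinct content hash; every channel's
    manifest references the shared file by its hash."""
    files = {}      # content hash -> single stored output (bytes here)
    manifest = {}   # channel -> hash reference into `files`
    for channel, data in channel_outputs.items():
        digest = hashlib.sha256(data).hexdigest()
        files.setdefault(digest, data)
        manifest[channel] = digest
    # Bytes saved = total rendered bytes minus bytes actually stored.
    saved = sum(len(d) for d in channel_outputs.values()) \
        - sum(len(d) for d in files.values())
    return files, manifest, saved
```

If Amazon and eBay resolve to byte-identical outputs, only one file is stored, both manifests point at the same hash, and the export summary's "total bytes saved" equals one copy of the shared file.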
Naming, Folder Structure, and ZIP Packaging
Given a naming template "{sku}_{channelCode}_{variantIndex}" and folder strategy "per-channel" with ZIP packaging enabled And a batch of N SKUs is exported to C channels When generation completes Then the file system contains C top-level folders and files named per the template without collisions And each channel ZIP contains exactly N files with MD5 checksums matching their source files And ZIP creation time and total size are reported in the export summary
Direct Push to Connected Storefronts
Given channel connections (e.g., Shopify, Etsy) are authorized and mapped by SKU or listing ID And "Direct Push after export" is enabled When generation completes Then images are uploaded to each connected storefront with HTTP 2xx responses and associated to the correct listings And transient failures trigger up to 3 retries with exponential backoff And partial failures are reported per item without creating duplicate remote assets; local exports remain intact
Metadata Embedding and Alt-Text Templates
Given an alt-text template and XMP/IPTC tag mappings configured When variants are generated Then each exported file embeds metadata fields per mapping (e.g., title, description, alt text) And template placeholders are resolved from product attributes (e.g., brand, productName, color) with no unresolved tokens And for storefronts that support alt text via API, the uploaded image’s alt text matches the embedded value
Re-Generation on Rule Changes with Master Linkage
Given variants previously generated from master ID M-001 at version v1 And Crosscheck rules for one or more channels change When the user triggers Regenerate Affected Variants Then only impacted channel variants are re-rendered using updated transforms while unchanged channels reuse existing files via dedup And regenerated outputs are versioned to v2, manifests update pointers, and prior versions are archived And an audit log records the rule change, user, timestamp, affected channels, and SKUs
Compromise vs Variants Assistant
"As a brand owner, I want guidance on whether to use one image for all channels or tailored variants so that I balance compliance with brand consistency."
Description

Decision module that analyzes rule conflicts, marketplace priority weighting, and brand preferences to recommend using a single compromise export or generating channel-specific variants. Provides side-by-side previews, predicted pass rates, and an explanation of trade-offs (e.g., background purity vs brand backdrop). Allows setting workspace defaults and remembers choices per product line, enabling one-click execution of the chosen path.

Acceptance Criteria
Recommendation: Compromise vs Variants
Given a batch of 100–500 product images with at least 2 conflicting marketplace rules across 3+ marketplaces and workspace priority weights and brand preferences configured When the assistant analyzes the batch Then it returns a single recommendation of "Compromise" or "Variants" with a confidence score (0–100), a ranked list of the top 3–5 conflict drivers referencing rule IDs, and predicted per-marketplace pass rates (0–100%) And the analysis completes in ≤10 seconds for up to 300 images And repeated runs with identical inputs return identical recommendations and scores
Side-by-Side Previews with Pass Predictions
Given the assistant has produced both options for preview using the selected style-preset(s) When the user opens the preview panel Then the UI displays at least 3 representative image pairs per option with identical crops and lighting And per-marketplace predicted pass rates are shown for each option and for the batch aggregate And each predicted failure is annotated with rule ID, description snippet, and affected image count And previews render in ≤3 seconds after panel open
Trade-off Explanation of Rules vs Brand
Given a recommendation is available When the user opens "Why this recommendation?" Then the assistant lists at least 2 quantified trade-offs (e.g., background purity %, backdrop color delta E, margin size px) with links to affected rules and brand settings And it explains expected impact on brand consistency score (0–100) and lists any constraints that cannot be satisfied simultaneously And the explanation length is ≤400 words and includes rule IDs and preset names
One-Click Execution of Chosen Path
Given the user selects "Compromise" or "Variants" and clicks Execute When the job starts Then the system queues processing within ≤2 seconds and displays a job ID and live progress And for "Compromise," exactly 1 export per source image is generated with the chosen compromise settings and passes the built-in validator for targeted marketplaces with ≥95% accuracy versus predictions And for "Variants," 1 export per marketplace per source image is generated with marketplace-specific specs (dimensions, format, background) and passes the built-in validator with ≥95% accuracy versus predictions And outputs are saved to the asset library with deterministic naming: {sku}_{option}_{market-or-global}.{ext}
Workspace Defaults and Product Line Memory
Given a workspace admin sets a default decision policy (Compromise/Variants, tie-break threshold, preferred presets) and a product line owner saves a decision for a product line When a new batch in that workspace or product line is analyzed Then the assistant pre-selects the default policy for workspace batches and the saved decision for that product line And users can override the pre-selection before execution And preferences persist across sessions and are scoped so workspace defaults do not overwrite product-line choices And an audit entry records user, timestamp, and change details for each update
Re-analysis After Priority or Preference Change
Given marketplace priority weights or brand preferences are modified When the user triggers Re-analyze Then the recommendation, confidence score, pass rates, and explanation update to reflect the new inputs and highlight changes versus the prior analysis And any product lines set to "follow workspace defaults" are updated automatically, while explicit overrides remain unchanged And re-analysis meets the same performance and determinism criteria as initial analysis
Audit Trail & Reporting
"As an operations lead, I want auditable records and reports so that I can prove compliance and improve our workflow over time."
Description

Persistent logging of validation results, applied fixes, rule versions, user actions, and export events at asset and batch levels. Generates downloadable compliance reports, marketplace evidence packs, and trend analytics (e.g., top failing rules, time saved, first-pass yield). Supports RBAC, data retention policies, and re-running validations with historical rule versions to reproduce outcomes.

Acceptance Criteria
End-to-End Event Logging at Asset and Batch Levels
- Given a batch of assets is validated across multiple marketplaces, when validation completes, then for each asset a log entry exists with ISO-8601 UTC timestamp, marketplace, rule ID, rule version, outcome (pass|fail|warn), and duration, and a batch summary log exists with per-marketplace counts.
- Given a user applies an auto-fix or accepts a recommendation (compromise export or channel-specific variants), when the action is confirmed, then an action log records user ID, role, action type, parameters before/after, affected asset IDs, and correlation ID linking to validation events.
- Given assets are exported, when export completes, then an export log records marketplace(s), export preset ID and version, output filenames and SHA-256 checksums, destination type and path, and success/failure code.
- Given a transient logging failure occurs, when retry logic executes, then all pending events are persisted within 2 minutes and the retry count is recorded.
Downloadable Compliance Reports
- Given a completed batch exists, when a user with role Manager or above requests a compliance report for a specific marketplace, then the system generates a downloadable PDF and CSV within 30 seconds including asset list, validation outcomes, rule IDs and versions, timestamps, and a signed summary page with report ID and checksum.
- Given an individual asset is selected, when its compliance report is generated, then the report contains before/after thumbnails, applied fixes list with timestamps, and links (or IDs) to corresponding audit log entries.
- Given a report is generated, when the generation finishes, then a report-generation event is written to the audit log including requesting user, time, filters, report format(s), file sizes, and SHA-256 checksum.
Marketplace Evidence Pack Generation
- Given a failed listing requires marketplace evidence, when the user requests an evidence pack for Marketplace Y, then the system creates a ZIP containing per-asset JSON of validation details, original and exported images, applied-fix manifest, and a rule-bundle manifest; filenames follow the documented schema and include the batch ID.
- Given an evidence pack is generated, when validated with the published schema validator, then the pack structure and JSON schemas pass without errors.
- Given an evidence pack is downloaded, when the download completes, then an access log entry records user, role, timestamp, client IP, pack ID, and checksum.
Trend Analytics and KPIs
- Given at least 30 days of audit data exist, when analytics are computed, then the dashboard shows top 10 failing rules per marketplace, first-pass yield %, average time saved per batch, and average conflicts per batch; recomputing the same time window produces identical values.
- Given a date range and marketplace filter are applied, when the user refreshes analytics, then metrics update within 5 seconds for datasets up to 1,000,000 events.
- Given the user exports analytics, when CSV export is requested, then the file contains rows with date bucket, marketplace, rule ID, failure count, first-pass yield, and time-saved estimates, and a corresponding export event is logged.
RBAC and Access Auditing
- Given a user with role Viewer accesses the audit area, when viewing logs, then only read-only views are available and downloads are disabled; Manager and above can download reports/evidence packs; Admin can configure retention and historical re-run settings.
- Given any access to logs or reports occurs, when the action completes, then an access-audit entry is recorded with user, role, action, resource identifier, timestamp, and client IP; entries are immutable and searchable by these fields.
- Given a user without required permission attempts a restricted action, when the request is made, then the system returns HTTP 403 (or equivalent), displays an authorization error message, and logs the denial with reason.
Re-run Validations with Historical Rule Versions
- Given a batch was validated on date D with rule bundle version V, when a re-run is requested using version V, then per-asset, per-rule outcomes exactly match the original logged outcomes for assets whose inputs have not changed since D.
- Given an asset has changed since D, when re-run with version V, then the system flags the asset as non-reproducible, shows a diff of input changes (e.g., image hash, metadata), and excludes it from pass/fail parity calculations.
- Given newer rule versions exist, when the user selects Latest for a re-run, then the resulting report is labeled with the rule bundle version used and the original logs remain unchanged; a new re-run event links both versions via correlation ID.
Data Retention and Legal Hold
- Given retention is configured to 180 days for detailed events and 2 years for summaries, when the nightly retention job runs, then detailed events older than 180 days are purged and summaries retained, and a purge receipt (counts and date range) is written to the audit log.
- Given a batch or asset is under legal hold, when the retention job executes, then events for the held resources are not deleted until the hold is released; hold set/release actions are themselves logged with user and timestamp.
- Given any deletion occurs, when the integrity check runs, then the append-only hash chain remains valid and the verification endpoint returns OK for the remaining log range.
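The "append-only hash chain" integrity check referenced above works by having each log entry's hash cover the previous entry's hash, so any tampering or out-of-policy deletion invalidates everything downstream. A minimal sketch, assuming SHA-256 over canonical JSON (the spec does not name the hash or serialization):

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed sentinel for the first entry's "previous hash"

def append_event(chain, event):
    """Append an audit event; its hash covers the prior entry's hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    payload = json.dumps(event, sort_keys=True)  # canonical serialization
    digest = hashlib.sha256((prev + payload).encode()).hexdigest()
    chain.append({"event": event, "prev": prev, "hash": digest})
    return chain

def verify_chain(chain):
    """Re-derive every hash; any edit or gap breaks verification."""
    prev = GENESIS
    for entry in chain:
        payload = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev + payload).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True
```

Retention purges that trim only the oldest entries can remain verifiable by anchoring the surviving range's first `prev` hash in the purge receipt, which is one way to reconcile purging with the "remaining log range" check.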

CleanSlate Detect

High-accuracy detection for banned overlays like watermarks, text, borders, and stickers. See confidence scores and one‑click, edge‑aware removal that preserves product detail, reducing top rejection causes across Amazon, Etsy, and other platforms.

Requirements

Multi-class Overlay Detection Engine
"As an online seller, I want CleanSlate Detect to automatically find banned overlays in my product photos so that I can avoid marketplace rejections and keep my listings compliant."
Description

Implements high-accuracy detection and localization of banned overlays—including watermarks (opaque and semi-transparent), text (multi-language, rotated/curved), borders/frames, stickers/emojis, QR codes, and logos—across JPEG/PNG/WebP inputs up to 8K resolution. Produces structured outputs per image: detected class, confidence (0–1), and edge-aware polygons/masks, plus an aggregate pass/fail verdict. Targets production performance of ≥0.95 precision and ≥0.90 recall on internal benchmarks, with average latency ≤400 ms per megapixel on GPU (≤1.5 s/MP on CPU). Robust to complex backgrounds, reflective products, and transparent overlays. Integrates as a versioned model within PixelLift’s batch pipeline and public API, with JSON schema responses, telemetry for detection metrics, and graceful degradation: low-confidence cases flagged for review. Ensures secure processing and ephemeral storage aligned with PixelLift privacy standards.

Acceptance Criteria
Class Coverage and Localization on Single Image
Given an input image (JPEG, PNG, or WebP) up to 8K resolution (≤7680x4320) When the Multi-class Overlay Detection Engine processes the image Then it returns a detections array where each present banned overlay instance from {watermark (opaque, semi-transparent), text (multi-language, rotated/curved), border/frame, sticker/emoji, QR code, logo} has: class label, confidence ∈ [0,1], and an edge-aware polygon or mask localizing the overlay And all polygon/mask coordinates are within image bounds, non-self-intersecting, and each localized area ≥ 50 px And multiple instances per class are returned separately (distinct ids), including overlapping instances if centroids differ by ≥ 5 px And for images without banned overlays, the detections array is empty
Benchmark Precision and Recall Targets
Given the internal benchmark dataset cleanSlate.benchmark.v1 with ground-truth polygons/masks When evaluated at the default confidence threshold τ = 0.5 and IoU ≥ 0.5 for matching Then micro-averaged precision ≥ 0.95 and recall ≥ 0.90 And per-class recall ≥ 0.85 for each of {watermark, text, border, sticker, qrcode, logo} And mAP@0.5 IoU ≥ 0.92 And metric variance across three independent runs ≤ 0.5 percentage points
Latency and Throughput Performance per Megapixel
Given warmed inference and images sized 0.5–33 MP When running on the production GPU tier Then mean latency ≤ 400 ms per MP and p95 latency ≤ 500 ms per MP And when running on the production CPU tier Then mean latency ≤ 1500 ms per MP and p95 latency ≤ 2000 ms per MP And latency scales approximately linearly with pixels processed (R^2 of linear fit between MP and latency ≥ 0.95)
Structured JSON Response, Versioning, Verdict, and Review Flags
Given a valid detection request via API or batch pipeline When detection completes Then the JSON response validates against schema cleanSlate.detect.v1 and includes fields: model_version, image_id, detections[], aggregate_verdict, threshold, review_required, telemetry And each detection contains: id, class ∈ {watermark, text, border, sticker, qrcode, logo}, confidence ∈ [0,1], bbox, polygon XOR mask (per schema), and area And aggregate_verdict = "fail" if any detection has confidence ≥ threshold; otherwise "pass" And review_required = true if no detection ≥ threshold and there exists a detection with confidence in [threshold − 0.15, threshold); otherwise false And invalid inputs return a JSON error object with code, message, and correlation_id (no stack traces)
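The verdict and review-flag rules stated above are precise enough to sketch directly: fail on any at-threshold detection; otherwise pass, flagging for review when the best sub-threshold detection falls in the gray zone `[threshold − 0.15, threshold)`. The function name and input shape are illustrative, not the cleanSlate.detect.v1 API.

```python
def verdict_and_review(detections, threshold=0.5, gray_zone=0.15):
    """Return (aggregate_verdict, review_required) per the rules above."""
    confidences = [d["confidence"] for d in detections]
    if any(c >= threshold for c in confidences):
        return "fail", False  # review flag applies only to "pass" verdicts
    review = any(threshold - gray_zone <= c < threshold for c in confidences)
    return "pass", review
```

So a 0.60-confidence watermark yields `("fail", False)`, a 0.40-confidence one yields `("pass", True)` for human review, and a clean image yields `("pass", False)`.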
Robustness on Challenging Overlays and Backgrounds
Given the curated challenge set (reflective products, complex backgrounds, transparent overlays, curved/rotated text, multi-language scripts) When evaluated at τ = 0.5 and IoU ≥ 0.5 Then overall F1 ≥ 0.88 And recall for semi-transparent watermarks (alpha ≤ 0.4) ≥ 0.90 And recall for rotated/curved text across at least 5 languages (including non-Latin) ≥ 0.90 And false positive rate on clean images without overlays ≤ 3% And boundary IoU (mask/polygon vs ground truth) ≥ 0.60 on average
Batch Pipeline and Public API Integration with Telemetry
Given a batch of 500 mixed-format images up to 8K submitted to the PixelLift batch pipeline When the job completes Then 100% of images have responses written to the job manifest with deterministic ordering by image_id and no missing entries And the public endpoint POST /v1/clean-slate/detect accepts multipart uploads and JSON URLs, supports model_version parameter, and returns 200 with schema-compliant responses for valid requests; invalid requests return 4xx with a JSON error body And telemetry emits per-image metrics {inference_ms, megapixels, num_detections, avg_confidence, device_type, status} with 100% sampling in staging and ≥ 10% in production and an end-to-end drop rate ≤ 1% And adding new response fields is backwards-compatible (existing fields unchanged; new fields optional)
Security and Ephemeral Storage Compliance
Given standard operations in production and staging When processing and storing data Then all API and telemetry traffic uses TLS 1.2+ with authenticated access (OAuth2 client credentials or signed service keys) And images are processed in-memory when feasible; any persisted artifacts are encrypted at rest and automatically deleted within 24 hours (TTL ≤ 24h) And logs/telemetry contain no image bytes or PII by default; debug image logging is allowed only in staging when debug=true And pre-release security testing reports no Critical/High findings; SOC 2 control mappings for data retention and access are documented
Edge-Aware One-Click Removal
"As a boutique owner, I want to remove watermarks and borders with one click so that my images look clean and professional without losing product detail."
Description

Delivers single-action, edge-aware removal of detected overlays using product-vs-background segmentation, structure-aware inpainting, and seamless blending to preserve fine product details, edges, and textures. Supports per-item removal (by detection), batch auto-remove rules, and special handling for borders (smart crop vs. reconstruct), semi-transparent watermarks, and stickers casting shadows. Operates non-destructively with reversible layers and history, preview-before-apply, and instant undo. Compatible with PixelLift style-presets and retouch steps, ensuring consistent outputs during batch processing and export. Guarantees artifact thresholds (no halos, color bleeding) and exposes quality safeguards to prevent product damage.

Acceptance Criteria
One-Click Removal: Semi-Transparent Watermark Over Product Edge
Given a validation image with a semi-transparent watermark crossing the product/background boundary and clean ground truth available When the user triggers one-click removal on the selected detection Then the overlay is removed and product edges are preserved with max edge displacement <= 1 px And boundary color difference (ΔE00) <= 2.0 average (95th percentile <= 3.0) in a 5 px band And SSIM >= 0.94 in the 5 px boundary band vs ground truth And residual watermark opacity <= 2% within the former overlay mask And operation completes in <= 800 ms for 2048 px longest side on reference hardware
Border Overlay: Smart Crop vs Reconstruct Selection
Given an image with a detected uniform border and an accurate product mask When one-click remove is invoked for the border Then if a crop can remove the border without intersecting the product mask by >= 1 px, the system crops And product bounding box loss = 0% after crop And final aspect ratio matches the original within +/-1% Else the system reconstructs the border via inpainting And SSIM in a 10 px band adjacent to the former border >= 0.92 vs ground truth And no crop is applied And the user can override the choice And operation completes in <= 700 ms for 2048 px longest side
Sticker With Soft Shadow on Textured Background
Given validation images containing an opaque sticker casting a soft shadow over textured background near the product When one-click removal is applied to the sticker detection Then the sticker and its shadow are removed and background texture is reconstructed with LPIPS <= 0.12 vs ground truth in the masked area And product mask pixels modified <= 0.5% of product area And gradient continuity across the reconstructed region L1 difference <= 8% of local baseline And operation completes in <= 900 ms for 2048 px longest side
Non-Destructive Apply with Preview and Instant Undo
Given an image with multiple detections selected individually When preview is toggled for a detection Then a live preview renders in <= 200 ms and matches the applied result within 1 px When apply is clicked for that detection Then a reversible removal layer and a history entry are created without altering original pixels And other detections remain untouched and editable And pressing Undo restores the pre-apply state in <= 150 ms And Redo reapplies in <= 150 ms And after save and reopen, the removal layer persists with identical parameters
Batch Auto-Remove by Confidence with Safe Fallbacks
Given a batch of 500 images with mixed overlay detections and an auto-remove rule set to confidence >= 0.90 When the batch is processed Then all detections with confidence >= 0.90 are removed automatically And detections with confidence < 0.90 are left unmodified and flagged for review And no auto-removal occurs when product-damage risk score > 0.20; those items are flagged And processing throughput >= 120 images/min at 2048 px longest side on reference hardware And per-item logs record action taken, detection type, confidence, method (crop/reconstruct/inpaint), duration, and safeguards triggered
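The batch triage rule above — auto-remove only when confidence clears the threshold and the product-damage risk score stays at or below 0.20, otherwise flag for review — can be sketched as a simple partition. Function and field names (`triage`, `damage_risk`) are illustrative assumptions.

```python
def triage(detections, conf_threshold=0.90, risk_limit=0.20):
    """Partition detections into auto-remove vs review-queue lists:
    auto-removal requires high confidence AND acceptable damage risk."""
    auto, review = [], []
    for d in detections:
        risky = d.get("damage_risk", 0.0) > risk_limit
        if d["confidence"] >= conf_threshold and not risky:
            auto.append(d)
        else:
            review.append(d)  # left unmodified and flagged per the rule
    return auto, review
```

A high-confidence sticker with a 0.30 damage-risk score is still routed to review, which is the safeguard the criteria describe.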
Pipeline Compatibility with Style-Presets and Retouch
Given a batch with a selected style-preset and retouch steps configured When removal runs within the processing pipeline Then the output is pixel-wise consistent with running removal before style/retouch (per-channel absolute difference <= 1 for >= 99.9% pixels; no pixel difference > 2) And style-presets do not introduce halos or color shifts at former overlay boundaries (boundary band ΔE00 <= 1.5 average) And exports match previews within the same pixel-difference tolerance And no pipeline step fails due to the presence of the non-destructive removal layer
Artifact Thresholds and Safeguards Against Product Damage
Given synthetic and real-world validation sets with overlays touching product edges When one-click removal is applied Then no halo wider than 1 px is detected around product boundaries using a gradient-based halo detector And boundary bleed into the product area <= 0.3% of product mask pixels And boundary band ΔE00 <= 2.0 average (95th percentile <= 3.0) And if predicted product-pixel alteration > 1.0% or edge displacement > 1 px, the system blocks apply by default and shows a warning with override option
Confidence Scores & Threshold Controls
"As a power user, I want to tune detection thresholds and use marketplace presets so that I can balance false positives and negatives based on my risk tolerance and policies."
Description

Surfaces per-detection confidence scores with adjustable thresholds by class (text, watermark, border, sticker) and global defaults. Provides marketplace-specific presets to match Amazon/Etsy policies, plus UI controls (sliders, toggles) and batch rules (e.g., auto-remove if confidence ≥ threshold; send to review if within gray zone). Displays optional heatmaps and outlines for transparency, and warns when detections fall near decision boundaries. Persists settings per workspace, supports import/export of detection JSON, and calibrates thresholds via stored ROC data to maintain target precision/recall over model updates.

Acceptance Criteria
Per-Detection Confidence Display
Given an image contains one or more banned overlays (text, watermark, border, sticker) and the user opens the Review pane, When detections are listed, Then each detection shows its class and a confidence score formatted 0–100% with one decimal place. Given multiple detections exist on the same image, When the user expands details, Then each detection row displays a unique id, class, confidence, and bounding box coordinates (x, y, w, h) in pixels. Given no detections are found, When the Review pane loads, Then the UI displays “No banned overlays detected” and no confidence values are shown. Given the user hovers a confidence value, When the tooltip appears, Then it displays the exact confidence in 0.000–1.000 format and the model version used. Given the user exports detection results, When the detections JSON is generated, Then each detection includes fields: id, class, confidence (0–1 float), bbox {x,y,w,h}, imageId, and modelVersion.
Adjustable Class Thresholds & Global Defaults
Given the Thresholds panel is open, When the user adjusts the per-class sliders for text, watermark, border, and sticker within 0–100% (step = 1%), Then the effective thresholds for those classes update immediately and are used for decisions. Given a Global Default slider is present, When it is changed, Then any class without an explicit override inherits the global value; classes with overrides retain their set values. Given a class threshold has an override, When the user clicks “Reset to Global”, Then that class resumes inheriting from the Global Default. Given thresholds change, When preview badges recompute, Then the recomputation completes and the visible decisions update within 200 ms per image. Rule: A detection is considered above threshold if confidence ≥ the effective class threshold; otherwise it is below threshold.
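The inheritance rule above (every class uses the global default unless it has an explicit override, and "Reset to Global" removes the override) can be sketched in a few lines; all names are illustrative, not PixelLift's API:

```python
# Hypothetical sketch of effective-threshold resolution.
CLASSES = ("text", "watermark", "border", "sticker")

def effective_thresholds(global_default: float, overrides: dict) -> dict:
    """Classes inherit the global default unless explicitly overridden;
    'Reset to Global' simply deletes the class's override entry."""
    return {cls: overrides.get(cls, global_default) for cls in CLASSES}

def is_above_threshold(confidence: float, cls: str,
                       global_default: float, overrides: dict) -> bool:
    # Rule from the spec: above threshold means confidence >= effective threshold.
    return confidence >= effective_thresholds(global_default, overrides)[cls]
```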
Marketplace Presets Alignment
Given the user opens Presets, When “Amazon” is selected, Then per-class thresholds and gray-zone width update to the Amazon preset values and the active preset label shows “Amazon (policy vX.Y)”. Given the user opens Presets, When “Etsy” is selected, Then per-class thresholds and gray-zone width update to the Etsy preset values and the active preset label shows “Etsy (policy vX.Y)”. Given a preset is active, When the user modifies any threshold, Then the preset indicator changes to “Custom (based on <PresetName> vX.Y)”. Given a preset is active, When “Restore Preset” is clicked, Then all related values revert to the original preset values. Given a preset is active, When a batch validation runs, Then any image failing the preset’s prohibited-overlay rules is flagged before export with class-specific reasons.
Batch Auto-Remove and Gray-Zone Review
Rule: Gray-Zone Width (G) is configurable 0–15% (default 5%) and applies symmetrically around each class threshold. Given batch processing is started and rules are enabled, When a detection’s confidence ≥ class threshold, Then edge-aware auto-removal is executed for that detection before export. Given batch processing is started, When a detection’s confidence is within [threshold − G, threshold + G], Then the image is routed to the Review queue and is not auto-exported. Given batch processing is started, When all detections on an image are < (threshold − G), Then no removals are performed and the image passes overlay checks. Rule: If multiple detections exist, apply precedence: (1) Auto-remove any detection ≥ threshold; (2) Else if any detection is in gray zone, route to Review; (3) Else pass. Given batch completes, When the audit log is viewed, Then each image lists detections with class, confidence, decision (removed/review/ignored), and timestamp.
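Reading the precedence list literally, the per-detection and per-image decisions can be sketched as follows (a hypothetical helper, not the product's implementation; `gray_zone` is G expressed as a fraction):

```python
# Illustrative decision function for the batch rules above.
def image_decision(detections, thresholds, gray_zone=0.05):
    """detections: iterable of (cls, confidence). Returns per-detection
    actions plus image-level routing per the spec's precedence order."""
    actions = []
    for cls, conf in detections:
        t = thresholds[cls]
        if conf >= t:
            actions.append((cls, conf, "removed"))       # (1) auto-remove
        elif conf >= t - gray_zone:
            actions.append((cls, conf, "review"))        # (2) gray zone
        else:
            actions.append((cls, conf, "ignored"))       # (3) below zone
    if any(a == "review" for _, _, a in actions):
        route = "review"
    elif any(a == "removed" for _, _, a in actions):
        route = "removed"
    else:
        route = "pass"
    return actions, route
```

Each returned tuple maps directly onto the audit-log fields (class, confidence, decision) required by the final criterion.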
Transparency Overlays & Boundary Warnings
Given Preview is open and Overlays are toggled ON, When detections are rendered, Then heatmaps and outlines appear per detection with an opacity control (0–100%) and do not modify the underlying source or exported image. Given Overlays are toggled OFF, When the user exports images, Then no overlay elements are present in exports or persisted edits. Given a detection’s confidence falls within the gray zone, When displayed in the list, Then a “Near threshold” warning badge shows with the absolute distance to threshold in percentage points (e.g., +1.2 pp). Given a batch run finishes, When the summary is shown, Then it includes the count of gray-zone warnings per class.
Settings Persistence & JSON Import/Export
Given a user updates thresholds, gray-zone width, batch rules, presets, and overlay toggles, When they return to the workspace later (same or different device), Then the exact settings persist for that workspace and do not affect other workspaces. Given the user clicks Export Settings, When the JSON is downloaded, Then it includes: modelVersion, thresholds.byClass, thresholds.globalDefault, grayZoneWidth, preset {name, version}, rules {autoRemove, review}, visualization {heatmapEnabled, outlineEnabled}, and updatedAt (ISO 8601). Given the user imports a Settings JSON matching the schema, When validation succeeds, Then settings apply immediately and are persisted; When validation fails, Then an error shows the invalid field(s) and no changes are applied. Given the user exports Detections JSON for a selected batch, When the file is generated, Then it includes per-image detections with class, confidence (0–1), bbox, decision, and thresholds used for that run. Given the user imports a Detections JSON produced by the system, When loaded into Review, Then the UI reproduces the detections and decisions for audit without altering current thresholds.
ROC-Based Threshold Calibration
Given target precision and recall are configured per class, When the modelVersion changes, Then the system recalibrates class thresholds from stored ROC data to meet targets within ±2 percentage points tolerance. Given recalibration completes, When the Calibration Report opens, Then it shows pre/post thresholds, estimated precision/recall per class, ROC date, and whether targets were met. Given a class is marked “Lock thresholds”, When the modelVersion changes, Then that class is excluded from recalibration and retains its current threshold. Given recalibration cannot meet targets, When results are computed, Then the system flags the class with “Attention required”, shows the closest achievable metrics, and does not auto-apply without explicit user confirmation. Given a calibration run is triggered for up to 10k validation samples per class, When processing occurs, Then results are available within 60 seconds or the UI shows progress with remaining ETA.
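One way to recalibrate from stored ROC data is to scan per-threshold operating points and pick one that meets both targets within the ±2 pp tolerance, surfacing the closest achievable point when none does. The spec does not prescribe a selection strategy, so the following is a hedged sketch with invented names:

```python
# Hypothetical threshold selection from stored operating points.
def calibrate_threshold(points, target_precision, target_recall, tol=0.02):
    """points: list of (threshold, precision, recall) for one class.
    Returns (threshold, met) where `met` says targets were achievable."""
    feasible = [p for p in points
                if p[1] >= target_precision - tol and p[2] >= target_recall - tol]
    if feasible:
        # Prefer the point with the best combined margin over both targets.
        best = max(feasible, key=lambda p: min(p[1] - target_precision,
                                               p[2] - target_recall))
        return best[0], True
    # Targets unmet: report closest achievable metrics; per the spec this
    # is flagged "Attention required" and never auto-applied.
    closest = max(points, key=lambda p: min(p[1] - target_precision,
                                            p[2] - target_recall))
    return closest[0], False
```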
Marketplace Compliance Rules Engine
"As a seller listing to multiple marketplaces, I want clear compliance verdicts and reasons per platform so that I can fix issues before I publish and avoid costly rejections."
Description

Maintains an up-to-date, versioned rules library mapping detection outputs to marketplace-specific compliance verdicts (Amazon, Etsy, eBay, Shopify), including regional variations. Produces pass/fail with human-readable reason codes (e.g., “Text on primary image”) and recommended actions (remove, crop, review). Supports scheduled and hotfix rule updates with audit history, offline-safe defaults, and compatibility checks with PixelLift publish/export flows. Exposes verdicts and reasons in UI, API, and downloadable reports, enabling preflight checks that reduce top rejection causes before listing.

Acceptance Criteria
Per‑Marketplace Verdict Mapping with Reasons and Actions
Given CleanSlate Detect outputs for a batch of images including detected_overlay_types and confidence scores And a target marketplace and region are specified (e.g., Amazon US, Etsy EU) When the rules engine evaluates each image Then it returns a verdict of Pass or Fail per image per marketplace-region And includes a human-readable reason_code from the controlled vocabulary (e.g., TEXT_ON_PRIMARY_IMAGE, WATERMARK_PRESENT, BORDER_PRESENT, STICKER_PRESENT, NONE) And includes a recommended_action from {remove, crop, review, none} And applies marketplace- and region-specific thresholds and exemptions as defined in the active ruleset And records the rules_version used in the evaluation results
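The mapping from detections to a verdict/reason/action tuple might look like the following sketch. The reason-code vocabulary comes from the spec, but the ruleset shape, threshold values, and function names are assumptions:

```python
# Illustrative rules-engine evaluation; ruleset structure is invented.
REASONS = {"text": "TEXT_ON_PRIMARY_IMAGE", "watermark": "WATERMARK_PRESENT",
           "border": "BORDER_PRESENT", "sticker": "STICKER_PRESENT"}

def evaluate(detections, ruleset):
    """detections: list of (cls, confidence). ruleset: per-class thresholds,
    recommended actions, and a rules_version string."""
    for cls, conf in detections:
        if conf >= ruleset["thresholds"].get(cls, 1.0):
            return {"verdict": "Fail",
                    "reason_code": REASONS[cls],
                    "recommended_action": ruleset["actions"].get(cls, "review"),
                    "rules_version": ruleset["rules_version"]}
    return {"verdict": "Pass", "reason_code": "NONE",
            "recommended_action": "none",
            "rules_version": ruleset["rules_version"]}
```

A real engine would evaluate every detection and apply region-specific exemptions; this sketch only shows the verdict/reason/action/version contract.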
Versioned Rules Library with Audit and Rollback
Given a new ruleset v1.3.0 is submitted for activation When validation runs Then the ruleset is assigned a semantic version, changelog summary, author, and timestamp And a full diff against the current active version is stored in audit history And compatibility tests run against the baseline corpus and must achieve ≥99.5% agreement on prior Pass decisions and no more than +0.5% new false negatives When activation is approved Then v1.3.0 becomes active without downtime and v1.2.x remains retrievable And an authorized user can rollback to v1.2.x in one action, which is recorded in audit history
Offline‑Safe Defaults and Degradation Behavior
Given the rules service is unreachable during evaluation When CleanSlate Detect outputs are available Then the engine uses the last-known-good rules_version from local cache and marks source=cached and degraded=true in results And if no cached rules exist Then any image with detected banned overlays above default safety thresholds is marked Fail with reason_code=RULES_UNAVAILABLE and recommended_action=review And any image with no detected banned overlays is marked Pass with reason_code=RULES_UNAVAILABLE and recommended_action=review And all evaluations performed under degradation are logged for later re-evaluation when service is restored
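The degradation ladder above (live rules, else last-known-good cache marked degraded, else conservative safety defaults) can be sketched as follows; all names and the 0.5 default safety threshold are assumptions for illustration:

```python
# Hypothetical offline-safe evaluation fallback.
def verdict_for(detections, rules):
    return "Fail" if any(conf >= rules["thresholds"].get(cls, 1.0)
                         for cls, conf in detections) else "Pass"

def evaluate_with_fallback(detections, live_rules, cached_rules,
                           default_safety_threshold=0.5):
    if live_rules is not None:
        return {"source": "live", "degraded": False,
                "verdict": verdict_for(detections, live_rules)}
    if cached_rules is not None:
        # Last-known-good rules: usable, but flagged for re-evaluation.
        return {"source": "cached", "degraded": True,
                "verdict": verdict_for(detections, cached_rules)}
    # No rules at all: fail anything above the safety threshold and route
    # everything to human review, per the criterion.
    flagged = any(conf >= default_safety_threshold for _, conf in detections)
    return {"source": "none", "degraded": True,
            "verdict": "Fail" if flagged else "Pass",
            "reason_code": "RULES_UNAVAILABLE",
            "recommended_action": "review"}
```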
UI, API, and Reports Expose Verdicts and Reasons
Given a user reviews a processed batch in the PixelLift UI When they open the Compliance panel for a marketplace-region Then each image row displays verdict, reason_code, recommended_action, rules_version, marketplace, and region And a one-click CSV download includes columns: image_id, marketplace, region, verdict, reason_code, recommended_action, rules_version, evaluated_at And the public API endpoint /compliance/verdicts returns HTTP 200 with a schema containing the same fields per image And invalid marketplace or region parameters return HTTP 400 with an error code and message
Preflight Checks in Publish/Export Flows
Given a user initiates publish/export to a selected marketplace-region When preflight compliance evaluation runs Then images with Fail verdicts are blocked from export by default and summarized by reason_code And the UI presents one-click Fix actions that deep-link to CleanSlate removal or crop where applicable And after applying fixes, re-evaluation runs automatically and the export proceeds only for images with Pass verdicts And a downloadable preflight report is attached to the export job
Scheduled and Hotfix Rule Updates with Compatibility Checks
Given a new ruleset is approved for rollout When scheduled for activation at a specified UTC time window Then activation occurs at the scheduled time with zero downtime and all evaluation services consume the new rules within 5 minutes And a hotfix can be activated immediately by authorized users with a required reason note And activation is blocked if compatibility checks fail (thresholds configurable per marketplace), with audit entries created for the failure And clients processing jobs during activation continue using a consistent rules_version until job completion
Batch Processing & Queue Management
"As a high-volume seller, I want reliable batch processing with progress tracking so that I can process hundreds of photos quickly without babysitting the job."
Description

Adds scalable, fault-tolerant batch execution for detection and removal with prioritized queues, parallel workers, and autoscaling. Supports pause/resume, retries with exponential backoff, idempotent job IDs, and per-image status tracking. Provides real-time progress, ETA, and throughput targets (e.g., ≥300 images/minute with GPU acceleration for 2MP images) while preserving image order and metadata. Integrates tightly with PixelLift’s upload, preset application, and export pipelines, with detailed error reporting and downloadable logs for failed items.

Acceptance Criteria
Submit Large Batch With Prioritized Queues
Given a batch of 5,000 images is uploaded with CleanSlate Detect+Remove enabled and priority=High When the batch is enqueued Then High-priority jobs start before Normal/Low within the same account And FIFO order is preserved within each priority level And each image receives a unique, idempotent job ID stable across retries and re-submissions for 24 hours And queue metrics (queue_depth, active_workers, backlog) are retrievable via API with p95 latency ≤ 500 ms And ≤ 0.1% High-priority jobs are delayed behind lower-priority work over any 10-minute window
Pause and Resume Active Batch Without Data Loss
Given a running batch with ≥ 1,000 in-flight or queued jobs When the user pauses the batch via API/UI Then no new jobs start within 3 seconds and in-flight jobs complete or checkpoint within 60 seconds And the batch status becomes Paused and progress is preserved When the user resumes the batch Then processing restarts without duplicating completed items and continues from the last successful image And unnecessary reprocessing is ≤ 1 image per 1,000 processed
Automatic Retry, Backoff, and Worker Fault Tolerance
Given a job fails due to a transient error (HTTP 5xx, GPU OOM, network timeout) When the failure occurs Then the system retries up to 3 times with exponential backoff of 1m, 2m, 4m with ±20% jitter And the same idempotent job ID is reused so only one final output artifact exists And p95 time to first retry is ≤ 90 seconds And after max retries the job status is Failed with a terminal reason code and no partial export is produced And if a worker crashes mid-processing, the job reappears after a 2-minute visibility timeout and is safely re-run once by another worker
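The retry schedule (1m, 2m, 4m with ±20% jitter) is a standard exponential backoff. A minimal sketch of computing the delay sequence; the function is illustrative, not the actual scheduler:

```python
# Illustrative backoff computation for the retry policy above.
import random

def backoff_delays(base_s=60, retries=3, jitter=0.20, rng=None):
    """Returns retry delays in seconds: 60s, 120s, 240s nominal,
    each perturbed by +/-20% jitter to avoid thundering herds."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(retries):
        nominal = base_s * (2 ** attempt)           # 1m, 2m, 4m
        factor = 1 + rng.uniform(-jitter, jitter)   # +/-20% jitter
        delays.append(nominal * factor)
    return delays
```

Reusing the same idempotent job ID across these attempts is what guarantees a single final output artifact.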
Real-Time Progress, ETA, and Per-Image Status Tracking
Given an active batch is processing When the client polls GET /batches/{id}/progress or subscribes to SSE/WebSocket Then updates are delivered at least every 2 seconds including: total, processed, succeeded, failed, in_progress, queued, throughput_images_per_min, p50/p95 latency, ETA (ISO8601), percent_complete And per-image endpoints reflect state transitions queued -> in_progress -> succeeded|failed with timestamps And overall percent_complete accuracy is within ±5% and ETA error is within ±15% after the first 60 seconds
Throughput Target and Autoscaling on GPU Workers
Given 2MP images with CleanSlate Detect+Remove enabled and GPU acceleration available When processing a batch of ≥ 1,000 images under average content complexity Then sustained throughput is ≥ 300 images/minute over any continuous 10-minute interval And p95 end-to-end latency per image is ≤ 20 seconds And autoscaling adds workers within 2 minutes when backlog > 2x current capacity and scales down after 10 minutes when backlog < 0.5x capacity And per-tenant concurrency limits are enforced (default 50) without throttling errors > 0.5%
Order and Metadata Preservation Across Pipeline
Given an ordered manifest with per-image metadata (filename, EXIF, product_id, preset_id) When the batch completes CleanSlate Detect+Remove, preset application, and export Then the export preserves original item order and all metadata; any stripped EXIF fields are enumerated in the report And mapping from original to processed filenames is deterministic and reversible And input and output SHA-256 manifests are produced and match expected counts And pipeline steps execute in order: upload -> detect/remove -> preset apply -> export, with idempotent step replays producing identical outputs
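A deterministic, reversible original-to-processed filename mapping and SHA-256 manifest entries might be sketched as below; the `__processed` suffix and all helper names are invented for illustration:

```python
# Hypothetical reversible filename mapping + manifest hashing.
import hashlib
import os

SUFFIX = "__processed"  # illustrative marker, not PixelLift's real scheme

def processed_name(original: str) -> str:
    stem, ext = os.path.splitext(original)
    return f"{stem}{SUFFIX}{ext}"

def original_name(processed: str) -> str:
    # Reversibility: strip the suffix to recover the source filename.
    stem, ext = os.path.splitext(processed)
    assert stem.endswith(SUFFIX), "not a processed filename"
    return stem[: -len(SUFFIX)] + ext

def manifest_entry(name: str, data: bytes) -> dict:
    """SHA-256 entry for the input/output manifests in the criterion."""
    return {"file": name, "sha256": hashlib.sha256(data).hexdigest()}
```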
Error Reporting and Downloadable Logs for Failed Items
Given one or more items fail in a batch When the client requests the error report via API/UI Then a downloadable bundle is produced within 60 seconds containing: failure_reason_code, message, stack_trace_hash, retry_count, last_attempt_timestamp, worker_id, detection confidence scores, and overlay mask thumbnails And reports are available in CSV and JSON with SHA-256 checksum And logs are retained for ≥ 7 days with PII redacted And API provides remediation hints aligned to common Amazon/Etsy rejection categories
Reviewer Feedback & Model Improvement Loop
"As a photo editor on my team, I want to quickly review and correct detections so that the system learns from our edits and improves over time."
Description

Introduces a review workspace to confirm, correct, or override detections/removals, including polygon/mask refinement and brush tools. Captures user feedback as labeled data (true positive/false positive/false negative) with consented storage, feeding an MLOps pipeline for periodic re-training and calibration. Supports model version pinning, A/B comparisons, and rollout gating based on measured precision/recall and user-reported issues. Provides audit trails of overrides and reprocessing actions, and ensures permissioned access and data retention controls.

Acceptance Criteria
Confirm/Override Detection Labels in Review Workspace
Given a batch with CleanSlate Detect outputs, When the user opens the Review workspace, Then each image displays all detected overlay regions with polygons/masks, labels, and confidence scores. Given a detected region, When the user clicks Confirm, Then the region is labeled True Positive and no geometry changes are made. Given a detected region judged incorrect, When the user clicks Mark as Incorrect, Then the region is labeled False Positive and is excluded from auto-removal. Given an overlay was missed, When the user draws a new region and clicks Save, Then a new region is created and labeled False Negative. Given the user has finalized all images, When they click Apply Changes, Then the system performs removals/restores according to final labels and updates batch status to Reviewed. Given a change is applied, When the action completes, Then a feedback record is created with userId, timestamp, imageId, regionId, modelVersion, label (TP/FP/FN), and action (confirm/override/create).
Polygon/Mask Refinement with Edge-Aware Brush
Given a detected or user-drawn region, When the user drags polygon vertices or edges, Then the mask updates in under 200 ms and maintains topology without self-intersections. Given the user selects the edge-aware brush (3–100 px), When they paint Add or Subtract, Then the mask expands/contracts snapping to edges with average boundary error ≤ 2 px on preview. Given the user makes edits, When they press Undo or Redo, Then the last 20 operations are reversed or re-applied respectively. Given the user adjusts zoom (10%–800%) and pans, When editing, Then cursor-relative brush size remains consistent and the mask renders at ≥ 30 FPS on 24 MP images on recommended hardware. Given the user clicks Save Refinement, When the save completes, Then the updated mask geometry is persisted to the audit log and used for subsequent removal.
Feedback Capture with TP/FP/FN Labels and Consent
Given a first-time feedback submission, When the user clicks Submit, Then a consent modal describing data usage appears requiring Explicit Agree or Decline before storage. Given the user Agrees, When feedback is submitted, Then the system stores labeled data including userId (or pseudonym), timestamp, imageId, region geometry, label (TP/FP/FN), action, and modelVersion. Given the user Declines, When feedback is submitted, Then only operational changes are applied and no image pixels or geometries are stored; an opt-out flag is recorded. Given a feedback record is created, When queued for MLOps, Then the record passes schema validation and is acknowledged; on failure, it retries up to 3 times and is moved to a dead-letter queue with error details. Given a user revokes consent in Settings, When they confirm, Then subsequent feedback is not stored for training and previously stored items are flagged for purge according to retention policy.
Model Version Pinning and A/B Comparison
Given a project, When an Admin pins a model version (e.g., v1.3.2) for detections, Then all new runs in that project use the pinned version until changed. Given two selectable model versions, When the user runs A/B on a batch, Then both versions process the same images and a side-by-side diff view is available per image. Given reviewer labels exist, When A/B results are generated, Then per-version precision, recall, and F1 are computed against those labels and shown with confidence intervals and sample sizes. Given an A/B run completes, When the user exports results, Then a CSV including confusion matrix counts (TP, FP, FN) per version and aggregate metrics is downloadable. Given a pinned version becomes deprecated, When the user starts a new run, Then a warning appears with a link to migrate and the run still proceeds using the pinned version.
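The per-version precision, recall, and F1 in the A/B report derive directly from the reviewer-labeled confusion counts (TP, FP, FN); confidence intervals are omitted from this minimal sketch:

```python
# Metrics from reviewer-labeled confusion counts.
def prf1(tp: int, fp: int, fn: int):
    """Returns (precision, recall, f1), guarding against empty denominators."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```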
Rollout Gating by Precision/Recall and User-Reported Issues
Given a candidate model completes evaluation, When gating is applied, Then promotion to Production requires overlay-detection precision ≥ 0.95 and recall ≥ 0.90 on the configured validation set. Given a candidate has passed offline metrics, When released to a beta cohort, Then rollout to all users is blocked if user-reported issue rate exceeds 5 per 1,000 images or a Sev-1 incident is open. Given a gating decision is made, When stored, Then the decision record includes modelVersion, dataset hash, metrics with sample size, issue rate, approver, timestamp, and decision (Approved/Blocked) in the audit log. Given a model is Approved, When scheduled, Then progressive rollout percentages (e.g., 10% → 50% → 100%) are enforced, with automatic rollback if metrics regress beyond thresholds.
Audit Trail, Permissions, and Data Retention Controls
Given roles are assigned, When a user opens the Review workspace, Then permissions are enforced: Viewer (read-only), Reviewer (label/confirm/override), Editor (refine masks), Admin (configure, export, purge). Given any review action occurs, When saved, Then an immutable audit entry is created with actor, timestamp, action type, before/after geometry hashes, modelVersion, and optional comment. Given a retention policy is configured (e.g., 180 days), When the daily purge job runs, Then expired source images and masks are deleted, with a purge report listing counts and IDs, while aggregate metrics remain. Given audit entries exist, When an Admin exports logs for a date range, Then a CSV/JSON export downloads within 60 seconds for up to 100k records with filters by user, action, and model version. Given permission changes are made, When effective, Then changes are logged and enforced on the next request; unauthorized actions return 403 with reason.

Proof Pack

Export a rule‑by‑rule compliance dossier per image or batch, including before/after thumbnails, specs, and pass/fail reasons. Share with clients or attach to tickets to speed approvals, defend decisions, and keep teams aligned on what’s shipping and why.

Requirements

Rule-by-Rule Compliance Dossier
"As a QA lead, I want a rule-by-rule dossier per image so that I can defend compliance decisions and resolve disputes quickly."
Description

Compile a comprehensive, traceable compliance dossier per image and per batch that enumerates each validation rule evaluated (rule name, category, and version), the measured values versus thresholds, pass/fail outcome, and explicit pass/fail reasons. Include processing context (evaluation timestamp, job ID, preset name/version, model build hash, marketplace/brand rule set version), detected technical specs (dimensions, DPI, background uniformity score, margins, color profile), and links to visual evidence. Persist dossiers with immutable IDs for auditability, enable deterministic re-generation by pinning versions, and structure data to be both human-readable and machine-parseable for downstream systems.

Acceptance Criteria
Single Image Dossier Content Completeness
Given an image is processed with preset "Brand A – Clean" v1.2 against rule set "Amazon-Apparel" v3.4 at time T using model build hash H and evaluator version E When the compliance dossier is generated Then the dossier JSON includes for every evaluated rule: ruleId, ruleName, category, ruleVersion, measuredValue, threshold, operator, outcome (Pass|Fail), reason, evidenceLinks[] And includes processing context: evaluationTimestamp (ISO 8601 UTC), jobId, imageId, presetName, presetVersion, modelBuildHash, ruleSetName, ruleSetVersion, evaluatorVersion And includes technical specs: widthPx, heightPx, dpi, backgroundUniformityScore (0.00–1.00), marginsPercent {top,right,bottom,left}, colorProfile (ICC name) And includes visualEvidence: beforeThumbUrl and afterThumbUrl using HTTPS And the dossier validates against JSON Schema "compliance-dossier" v1.0 with zero errors
Immutable Dossier ID and Tamper Evidence
Given a dossier is first persisted to storage When the dossier is saved Then it is assigned dossierId (UUIDv4 or ULID) and contentHash (SHA-256 of canonical JSON) And the record is write-once; any PUT/PATCH/DELETE attempts return 403 or 409 and no change in contentHash And subsequent revisions create a new dossier with a new dossierId and parentId referencing the prior dossier And retrieving the original dossier at any time returns the original content and identical contentHash
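The contentHash requirement depends on canonical JSON serialization: stable key order and fixed separators make the hash deterministic, so semantically equal dossiers hash identically. A minimal sketch (field names illustrative):

```python
# SHA-256 over canonical JSON, as the tamper-evidence criterion requires.
import hashlib
import json

def content_hash(dossier: dict) -> str:
    canonical = json.dumps(dossier, sort_keys=True,
                           separators=(",", ":"), ensure_ascii=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because the serialization is canonical, any change to any field yields a different hash, which is what makes re-generation verifiable and tampering evident.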
Deterministic Re-generation with Pinned Versions
Given sourceImageChecksum, presetName/version, ruleSetName/version, modelBuildHash, and evaluatorVersion are identical to a prior run When the dossier is re-generated Then the produced dossier JSON bytes and contentHash are identical to the original And provenance.regeneration equals true with matchedParameters listing the pinned versions And if any single pinned parameter changes, the new dossier contentHash differs and provenance.changedParameters lists the differing fields
Machine-Parseable JSON and Human-Readable Proof Pack
Given a completed processing job for one image When exporting the Proof Pack Then the export contains dossier.json (validating against schema v1.0) and dossier.html or dossier.pdf presenting the same rule outcomes, reasons, specs, and thumbnails And beforeThumbUrl/afterThumbUrl resolve over HTTPS with HTTP 200 within 2 seconds And text values in the human-readable view match the JSON values for each rule and spec field
Batch Aggregation, Index, and Share Link
Given a batch of N images (N ≤ 500) When the batch Proof Pack is exported Then a single ZIP is produced containing per-image dossiers and an index.json and index.html And index.json includes totalImages, passImages, failImages, and ruleSummary[{ruleId, passCount, failCount}] And the ZIP is available under a read-only HTTPS URL with a signed token, TTL configurable 1h–30d, and revocable; after revocation the link returns 403 And generation completes within 60 seconds for N ≤ 500 with no missing files
Visual Evidence Links and Bounding Boxes
Given a rule that inspects localized regions (e.g., text overlay or logo placement) When the dossier is generated Then evidenceLinks include region annotations: an overlay image URL and a JSON array of boundingBoxes [{x,y,width,height,confidence,ruleId}] And opening the overlay image visually aligns with the before image dimensions (same aspect and pixel size) And the sum of annotated area percentages is ≤ 100% and all coordinates are within image bounds
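The coordinate and area checks above can be validated with a small helper; a sketch under simplifying assumptions (overlap handling is approximated — the spec caps the sum of annotated area percentages at 100%):

```python
# Illustrative bounding-box sanity checks; names are assumptions.
def validate_boxes(boxes, img_w, img_h):
    """boxes: list of dicts with x, y, width, height in pixels.
    True iff every box lies within image bounds and the summed box area
    does not exceed the image area."""
    total_area = 0
    for b in boxes:
        in_bounds = (b["x"] >= 0 and b["y"] >= 0
                     and b["x"] + b["width"] <= img_w
                     and b["y"] + b["height"] <= img_h)
        if not in_bounds:
            return False
        total_area += b["width"] * b["height"]
    return total_area <= img_w * img_h
```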
Before/After Visual Evidence
"As a retoucher, I want clear before/after visuals and annotated evidence so that reviewers can instantly see what changed and why it passed or failed."
Description

Generate optimized before/after thumbnails and annotated visual evidence for each image, including side-by-side comparisons, adjustable split/slider previews, and auto-generated crops highlighting rule violations with overlays and bounding boxes. Produce web-friendly assets (e.g., WebP, 512–1024 px longest side) with consistent file naming, optional watermarking, and alt text for accessibility. Embed visuals in the PDF export and package them in the ZIP alongside JSON, ensuring quick loading and clear visual justification for pass/fail outcomes.

Acceptance Criteria
Side‑by‑Side and Slider Previews Exported
Given a processed image in a batch, When viewed in-app, Then an adjustable before/after slider is available with a 0–100% range, step ≤ 1%, and drag latency ≤ 50 ms at 1080p. Given the same image is exported, When assets are generated, Then include two static split preview WebPs at 25% and 75% slider positions plus one side‑by‑side before/after thumbnail. Given any before/after comparison, Then both halves share identical crop and scale with edge misalignment ≤ 1 px. Given generated thumbnails, Then the longest side is within 512–1024 px and aspect ratio is preserved. Given processing fails for the “after” image, When exporting, Then omit slider/split assets and include a placeholder thumbnail flagged as processing_failed in the manifest.
Annotated Violation Crops
Given a rule evaluation produces failed or warned results, When exporting, Then generate at least one annotated crop per distinct violation region with overlays including bounding box, rule ID, severity, and brief reason. Given an annotated crop, Then padding of 8–12% of the bbox perimeter is added without exceeding image bounds, and overlay contrast meets WCAG AA (≥ 4.5:1). Given overlapping violations in the same region, Then either a combined crop lists all rule IDs or separate crops are created—at least one asset must depict each rule. Given overlays are rendered, Then bbox coordinates align with the rule engine’s regions with IoU ≥ 0.90. Given no violations are found, Then export a “Pass” annotated visual (no boxes) clearly indicating no violations. Given annotated crops are saved, Then each is WebP with longest side 512–1024 px and sRGB color profile.
Web‑Friendly Asset Constraints
Given any exported visual asset (thumbnail, split, or annotated crop), Then the file format is WebP, color space is sRGB, and metadata/EXIF are stripped. Given resizing occurs, Then the longest side is between 512 and 1024 px inclusive, and target dimensions deviate by ≤ 1 px from the requested size. Given lossy compression is applied, Then each asset’s size is ≤ 300 KB (95th percentile ≤ 500 KB across the batch) while maintaining MS‑SSIM ≥ 0.95 versus the source scaled to the same dimensions. Given a batch export completes, Then no asset violates the above constraints; non‑conforming assets are re‑encoded automatically until they pass or the export fails with a clear error in the manifest.
Deterministic Naming and Packaging
Given assets are exported, Then filenames follow {imageId}__{variant}__{kind}__{longest}px.webp where variant ∈ {before, after, split25, split75} and kind ∈ {thumb, ann, ann‑{ruleId}}, all lowercase ASCII with hyphens/underscores only, ≤ 80 chars, unique within the ZIP. Given the ZIP is created, Then a manifest file proofpack.json exists at the root and lists for every asset: path, imageId, variant, kind, ruleId (nullable), width, height, bytes, sha256, and altText. Given the manifest, Then every listed asset exists at the specified path and every file in the ZIP (excluding the PDF and manifest) is listed in the manifest; sha256 hashes match file bytes. Given packaging completes, Then the PDF and manifest are at the ZIP root, and all assets reside under assets/ with subfolders per imageId.
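The naming and manifest rules above can be sketched in a few lines. This is a minimal illustration, not the product's implementation; the helper names (`asset_filename`, `manifest_entry`) are hypothetical, but the filename scheme and manifest fields follow the criteria exactly:

```python
import hashlib
import re

def asset_filename(image_id, variant, kind, longest_px):
    """Build a deterministic asset name per the spec's
    {imageId}__{variant}__{kind}__{longest}px.webp scheme."""
    assert variant in {"before", "after", "split25", "split75"}
    name = f"{image_id}__{variant}__{kind}__{longest_px}px.webp".lower()
    # Enforce lowercase ASCII with hyphens/underscores only, <= 80 chars.
    if not re.fullmatch(r"[a-z0-9_\-.]{1,80}", name):
        raise ValueError(f"non-conforming asset name: {name}")
    return name

def manifest_entry(path, data, image_id, variant, kind,
                   width, height, alt_text, rule_id=None):
    """One proofpack.json record: path, ids, dimensions, size, sha256, altText."""
    return {
        "path": path,
        "imageId": image_id,
        "variant": variant,
        "kind": kind,
        "ruleId": rule_id,
        "width": width,
        "height": height,
        "bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "altText": alt_text,
    }
```

Because the name is a pure function of the inputs, re-running an export yields byte-identical paths, which is what makes the ZIP layout reproducible.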
Configurable Watermarking
Given watermarking is enabled in export settings, Then all visual assets (thumbnails, splits, annotated crops) include a semi‑transparent watermark (opacity 35–45%) inset by 5% from the nearest corner and scaled so its larger dimension ≤ 10% of the image’s shorter side (min 32 px). Given a watermark would overlap an annotation bbox by > 10% of its area, Then the watermark relocates to the next available corner to avoid obstruction. Given watermarking is disabled, Then no watermark pixels are present; visual difference versus the non‑watermarked baseline remains within expected encoding variance (dHash distance ≤ 5). Given a custom watermark asset is configured, Then it is used; otherwise the default PixelLift watermark is applied.
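The corner-relocation rule can be captured with plain rectangle math. A minimal sketch, assuming axis-aligned `(x, y, w, h)` boxes and interpreting the 10% threshold against the watermark's own area (the spec's "its area" is ambiguous on this point); the function names are hypothetical:

```python
def overlap_fraction(a, b):
    """Fraction of rectangle a's area covered by rectangle b; boxes are (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    return (ix * iy) / (aw * ah) if aw and ah else 0.0

def place_watermark(img_w, img_h, wm_w, wm_h, bboxes,
                    inset_frac=0.05, max_overlap=0.10):
    """Try corners in order; return the first placement whose overlap with
    every annotation bbox stays at or below max_overlap."""
    dx, dy = int(img_w * inset_frac), int(img_h * inset_frac)
    corners = [
        (img_w - wm_w - dx, img_h - wm_h - dy),  # bottom-right (default)
        (dx, img_h - wm_h - dy),                 # bottom-left
        (img_w - wm_w - dx, dy),                 # top-right
        (dx, dy),                                # top-left
    ]
    for x, y in corners:
        wm = (x, y, wm_w, wm_h)
        if all(overlap_fraction(wm, b) <= max_overlap for b in bboxes):
            return wm
    return (corners[0][0], corners[0][1], wm_w, wm_h)  # all corners blocked: default
```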
Accessible Alt Text and PDF Embedding
Given any exported visual asset, Then alt text is generated ≤ 140 characters including: image identifier, visual type (before/after/annotated), and Pass/Fail with top 1–3 rule names; language follows the export locale (default en). Given the PDF export is generated, Then it is a tagged PDF with reading order including each visual and its alt text/caption; each image section contains the before/after thumbnail, split preview frames, and all annotated crops in manifest order. Given the ZIP is created, Then it contains the PDF, the manifest, and all referenced assets; each image has at least one before/after thumbnail and, when violations exist, at least one annotated crop visualizing them.
Multi-format Export & Branding
"As an account manager, I want branded PDF/JSON exports so that I can share professional, machine- and human-readable proof packs with clients."
Description

Provide export options for the proof pack as a branded PDF (paginated, table of contents, batch summary), a machine-readable JSON (schema v1) capturing all rule results and metadata, and a ZIP bundle that includes the PDF, JSON, and visual assets. Support workspace-level branding (logo, colors, header/footer), localized labels (EN at launch, i18n-ready), configurable templates (cover page, sections included), image compression controls, and checksums for file integrity. Enable downloads via UI and API with resumable transfers for large batches.

Acceptance Criteria
Branded PDF Export (Paginated, TOC, Batch Summary)
Given a completed rules evaluation for a batch with at least one image and a workspace with branding (logo, primary/secondary colors, header/footer text) configured When the user exports a Proof Pack as a PDF via the UI or API Then the PDF is paginated with page numbers on every page And the PDF includes a table of contents listing batch summary and each image section with accurate page numbers And the PDF contains a batch summary (batch ID, creation timestamp, total images, rules evaluated, counts of passes/fails) And each image section includes before/after thumbnails, image specs (format, dimensions in px, file size), and rule-by-rule outcomes with pass/fail and pass reasons And workspace branding is applied: logo in header, colors applied to headings and status badges, and configured footer text on every page And the PDF is generated without errors and can be opened by Acrobat-compatible readers
Machine-Readable JSON Export (Schema v1 Compliance)
Given a request to export the Proof Pack as JSON (schema v1) When the export completes Then the JSON document validates against JSON Schema v1 for PixelLift Proof Pack And it includes batch metadata (batchId, createdAt UTC, workspaceId, stylePresetId, stylePresetVersion) And for each image it includes imageId, sourceFileName, specs (format, widthPx, heightPx, fileSizeBytes), and per-rule results (ruleId, ruleNameKey, outcome boolean, reasons[], severity) And all field names are locale-neutral; localized labels are provided only under an optional labels.en object And the top-level schemaVersion equals "1.0"
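A skeleton of the schema v1 document can make the field layout concrete. This is a sketch assembled directly from the field names listed above, not the canonical schema; the builder function is hypothetical:

```python
import json

def build_proof_pack_json(batch, images):
    """Assemble a minimal schema-v1 document: locale-neutral field names,
    batch metadata, and per-image specs plus per-rule results."""
    return {
        "schemaVersion": "1.0",
        "batch": {
            "batchId": batch["batchId"],
            "createdAt": batch["createdAt"],  # UTC, ISO 8601
            "workspaceId": batch["workspaceId"],
            "stylePresetId": batch["stylePresetId"],
            "stylePresetVersion": batch["stylePresetVersion"],
        },
        "images": [
            {
                "imageId": im["imageId"],
                "sourceFileName": im["sourceFileName"],
                "specs": im["specs"],  # format, widthPx, heightPx, fileSizeBytes
                "rules": [
                    {"ruleId": r["ruleId"], "ruleNameKey": r["ruleNameKey"],
                     "outcome": bool(r["outcome"]), "reasons": r.get("reasons", []),
                     "severity": r["severity"]}
                    for r in im["rules"]
                ],
            }
            for im in images
        ],
    }
```

Localized strings, per the criteria, would live only under an optional `labels.en` object and are omitted here.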
ZIP Bundle Packaging (PDF, JSON, Visual Assets, Checksums)
Given a request to export a Proof Pack as a ZIP bundle When the export completes Then the ZIP contains top-level folders: pdf/, json/, assets/ And pdf/ contains the branded PDF; json/ contains the schema v1 JSON; assets/ contains all visual assets referenced (before/after thumbnails and any rule visualizations) And a manifest.json is present at the root with file paths, sizes, and SHA-256 checksums for every file in the archive And computed checksums on the server match the manifest before the download link is issued And unzipping the archive on macOS and Windows preserves the folder structure and filenames
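The "checksums match before the link is issued" gate boils down to a manifest cross-check. A minimal sketch with in-memory file contents standing in for extracted ZIP entries; the function name is hypothetical:

```python
import hashlib

def verify_manifest(manifest, files):
    """Cross-check a manifest ({path: {"bytes": n, "sha256": hex}}) against
    file contents ({path: bytes}); return the list of non-conforming paths."""
    bad = []
    for path, meta in manifest.items():
        data = files.get(path)
        if (data is None or len(data) != meta["bytes"]
                or hashlib.sha256(data).hexdigest() != meta["sha256"]):
            bad.append(path)
    # Files present in the archive but unlisted also violate completeness.
    bad.extend(p for p in files if p not in manifest)
    return bad
```

An empty return list is the condition under which the download link may be issued.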
Localized Labels and i18n Readiness (EN at Launch)
Given the export language is not specified When the Proof Pack is generated Then all human-readable labels in the PDF are in English (en) And when exportLanguage=en is explicitly provided via UI or API, English labels are used And when an unsupported locale is requested, the system falls back to en And all strings used in the PDF are sourced from locale resource files (not hardcoded) And JSON export remains locale-neutral except for the optional labels.en section
Configurable Export Templates (Cover & Sections)
Given a workspace admin has configured an export template specifying inclusion of cover page, batch summary, per-image details, and rule appendix, and their order When a user exports a Proof Pack using that workspace template Then only the selected sections are included in the PDF in the configured order And the cover page uses workspace branding and displays batch title, date, and workspace name/logo And template settings can be overridden per export via UI or API without altering the saved workspace template And template validation prevents exports with zero sections selected
Image Compression Controls (PDF and Visual Assets)
Given image compression quality is set to a value between 1 and 100 for the export When the Proof Pack is generated Then visual assets in the ZIP and embedded images in the PDF are encoded at the requested quality (±2 tolerance for encoder variance) And pixel dimensions of source images are preserved in exported assets (no upscaling) And the selected compression setting is recorded in JSON metadata under export.imageCompression.quality
UI and API Downloads with Resumable Transfers
Given an export artifact (PDF or ZIP) of size >= 500 MB When the user downloads via the UI Then the download supports pause/resume and automatically resumes after a transient network loss without restarting from byte 0 And when the artifact is downloaded via the API, HTTP Range requests are accepted and responded to with 206 Partial Content And the download link is a time-limited signed URL valid for at least 24 hours And the API provides a SHA-256 checksum for the artifact enabling client-side verification
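On the client side, resuming after a network loss means asking for the remaining bytes via a Range request and checking the 206 response. A small sketch of the two pure pieces (header construction and `Content-Range` parsing); the transport itself is omitted:

```python
def resume_range_header(bytes_on_disk):
    """Range header asking the server to continue from the next missing byte;
    an empty dict means start from byte 0."""
    return {"Range": f"bytes={bytes_on_disk}-"} if bytes_on_disk else {}

def parse_content_range(value):
    """Parse 'bytes <start>-<end>/<total>' from a 206 Partial Content response."""
    _unit, _, rng = value.partition(" ")
    span, _, total = rng.partition("/")
    start, _, end = span.partition("-")
    return int(start), int(end), int(total)
```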
Secure Sharing & Ticket Attachments
"As a project manager, I want secure share links and one-click ticket attachments so that approvals and escalations fit our existing workflows."
Description

Enable shareable, time-bound, signed URLs for proof packs with optional password protection, RBAC-based in-app access, access logs, and one-click revocation. Provide native attachments/integrations for Jira and Zendesk (project/issue mapping, authentication via stored OAuth/tokens, retry on failure) and a generic email share that sends a secure link rather than files. Ensure shared artifacts exclude PII, include a client-facing summary, and maintain consistent file naming for easy reference in external workflows.

Acceptance Criteria
Signed, Time‑Bound Share Link Generation
Given a completed Proof Pack exists for batch {batchId} When the owner clicks Share and selects an expiry (1h, 24h, 7d, or custom up to 30d) and optionally sets a password Then the system generates an HTTPS signed URL with a unique token (≥128 bits entropy), stores expiry and a hashed password server-side, and displays the link and expiry Given the signed URL is accessed before expiry with the correct password (if set) When the request is validated Then the Proof Pack view loads within p95 ≤ 2s and returns HTTP 200 Given the signed URL is accessed after expiry or with a revoked token When the request is made Then the request is denied with HTTP 410 Gone and no Proof Pack content is returned Given the signed URL is accessed with an incorrect or missing password When five consecutive failures occur within 10 minutes Then subsequent attempts from the same IP are rate-limited for 15 minutes and return HTTP 429 Given the Proof Pack renders or downloads artifacts via the share link When content is served Then PII (names, emails, phone numbers, physical addresses, internal notes) is excluded, a client-facing summary is present, and all downloadable filenames match PixelLift-ProofPack-{batchId}-{imageOrBatch}-v{version}-{yyyyMMddTHHmmssZ}.{pdf|csv|json}
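The token, expiry, and password rules above can be sketched with the standard library. This is an illustration under the spec's constraints (≥128 bits of entropy, server-side hashed password), not the product's code; PBKDF2 and the 100,000-iteration count are assumptions:

```python
import hashlib
import hmac
import os
import secrets
import time

def create_share(expires_in_s, password=None):
    """Token with 256 bits of entropy; password stored only as a salted hash."""
    share = {
        "token": secrets.token_urlsafe(32),
        "expires_at": time.time() + expires_in_s,
        "revoked": False,
    }
    if password:
        salt = os.urandom(16)
        share["pw_salt"] = salt
        share["pw_hash"] = hashlib.pbkdf2_hmac(
            "sha256", password.encode(), salt, 100_000)
    return share

def check_access(share, token, password=None, now=None):
    """Return an HTTP-style status: 200 ok, 410 gone/revoked, 401 bad password."""
    now = time.time() if now is None else now
    if share["revoked"] or now >= share["expires_at"]:
        return 410
    if not hmac.compare_digest(share["token"], token):
        return 410
    if "pw_hash" in share:
        if password is None:
            return 401
        digest = hashlib.pbkdf2_hmac(
            "sha256", password.encode(), share["pw_salt"], 100_000)
        if not hmac.compare_digest(digest, share["pw_hash"]):
            return 401
    return 200
```

`hmac.compare_digest` keeps both token and password checks constant-time; rate limiting of repeated failures would sit in front of this function.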
Email Share Sends Secure Link Only
Given the owner selects Email Share and enters 1–20 recipient emails and an optional message When Send is clicked Then one email per recipient is sent within 60 seconds containing only the secure link (no file attachments), and the link inherits the same expiry and password requirements Given a recipient opens the email When the secure link is clicked Then the Proof Pack opens according to link rules, and the email body contains no PII beyond recipient address and sender display name Given the sender reviews the share When viewing the Share details in-app Then delivery status (queued/sent) for each recipient is displayed without exposing recipient PII beyond email address
In‑App RBAC Access Enforcement
Given authenticated users with roles Admin, Editor, Viewer, and Contractor When navigating to a Proof Pack within the app Then Admin/Editor/Viewer can view Proof Packs; Contractor receives HTTP 403; only Admin/Editor can create/revoke share links and configure Jira/Zendesk mappings; Viewer can copy existing share links but cannot create/revoke them Given a user without permission attempts to access a Proof Pack via direct URL When the request is made Then return HTTP 403 without revealing Proof Pack metadata Given a user’s role is changed When the role is updated Then new permissions take effect within 60 seconds for all subsequent requests
Access Logging and Audit Trail
Given a share link is created or accessed When any view attempt (success/denied/expired/revoked) occurs Then an access log entry is recorded with timestamp (UTC), actor type (internalUserId or external), outcome, truncated IP (/24 IPv4 or /48 IPv6), hashed user agent, resource identifiers (batchId/imageId), and method (view/download), and is viewable in-app within 30 seconds Given an auditor reviews access When filtering by date range, outcome, or actor type Then matching entries are returned within 2 seconds p95 and can be exported as CSV with fixed headers and no PII
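The IP-truncation rule (/24 for IPv4, /48 for IPv6) maps directly onto Python's `ipaddress` module. A minimal sketch:

```python
import ipaddress

def truncate_ip(ip):
    """Log-safe address: keep the /24 prefix for IPv4 or /48 for IPv6
    and zero the remaining bits."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48
    net = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(net.network_address)
```

`strict=False` lets the host address be masked down to its network without raising.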
One‑Click Share Revocation
Given there is an active share link for a Proof Pack When the owner clicks Revoke Link Then the token is invalidated within 10 seconds; subsequent requests return HTTP 410; any active sessions lose access on next request; the Share list shows status Revoked Given a link was revoked When the owner generates a new share Then a new unique token is issued and previous tokens remain invalid; audit log records share_revoked and share_created events with actor and timestamps
Jira Attachment Integration
Given Jira is connected via stored OAuth and project/issue mapping is configured When the user clicks Attach to Jira for a Proof Pack and selects a Project and Issue Then PixelLift posts a Jira comment containing the client-facing summary and the secure link (no files attached), includes the consistent filename reference, and confirms success with the Issue key Given transient failures (network/5xx) occur When posting the comment Then the system retries up to 3 times with exponential backoff (1s, 3s, 9s) using an idempotency key to prevent duplicate comments Given the Jira access token is expired or invalid When attempting to post Then the system refreshes the token; if refresh fails, it surfaces an actionable error without exposing secrets and does not create a partial attachment
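The retry discipline (three retries at 1 s, 3 s, 9 s, one idempotency key across all attempts) is the same for Jira and Zendesk and can be sketched generically. `post_fn` and `TransientError` are hypothetical stand-ins for the integration call and its retryable failure:

```python
import time

class TransientError(Exception):
    """Raised by post_fn for retryable failures (network loss, HTTP 5xx)."""

def post_with_retry(post_fn, payload, idempotency_key,
                    delays=(1, 3, 9), sleep=time.sleep):
    """Call post_fn up to 1 + len(delays) times; the same idempotency key is
    reused on every attempt so the remote side can deduplicate comments."""
    last_err = None
    for delay in (0,) + tuple(delays):
        if delay:
            sleep(delay)
        try:
            return post_fn(payload, idempotency_key)
        except TransientError as err:
            last_err = err
    raise last_err
```

Injecting `sleep` keeps the backoff schedule testable; auth errors (expired token) would be raised as a different exception type and not retried here, matching the "refresh then surface an actionable error" path.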
Zendesk Attachment Integration
Given Zendesk is connected via stored OAuth/API token and ticket mapping is provided When the user clicks Attach to Zendesk for a Proof Pack and selects a Ticket Then PixelLift posts a ticket comment containing the client-facing summary and the secure link (no files attached), marks it as an internal note by default, and confirms success with the Ticket ID Given transient failures (network/5xx) occur When posting the comment Then the system retries up to 3 times with exponential backoff (1s, 3s, 9s) using an idempotency key to prevent duplicate comments Given the Zendesk credential is expired or invalid When attempting to post Then the system refreshes/reauthenticates; if it fails, it surfaces an actionable error without exposing secrets and does not create a partial attachment
Batch Index & Versioning
"As a production supervisor, I want batch-level summaries and versioning so that I can track changes, compare results, and maintain an audit trail."
Description

Create a batch-level index that summarizes overall pass rate, per-rule breakdowns, and quick filters, with links to each image’s dossier. Record and display version information for rule sets, presets, and models; maintain a change history; and support re-generation of proof packs when rules change while preserving prior versions for audit. Provide a diff view that highlights what changed between two dossier versions at the rule and metric level to streamline approvals across iterations.

Acceptance Criteria
Batch Index Overview and Quick Filters
Given a completed proof pack for batch B containing N images When a user opens the batch index view for B Then the view displays the overall pass rate as a percentage with numerator/denominator that equals N And a per-rule breakdown table lists each rule with pass, fail, and not-applicable counts whose sum equals N for each rule And quick filters for All, Passed, and Failed are visible and filter the image list to the correct subset And applying any quick filter updates the list within 300ms for batches up to 5,000 images And each image row includes a working link to its dossier that opens in a new tab (HTTP 200) And initial content renders within 2,000ms for batches up to 5,000 images on a standard broadband connection And totals shown in the index match the aggregate of the linked dossiers
Per-Image Dossier Deep Link Integrity
Given the batch index for batch B is displayed When the user clicks an image’s dossier link Then the dossier opens in a new tab with HTTP 200 within 1,500ms And the URL contains batchId and imageId parameters And the dossier shows the same pass/fail status as the index row And the default view resolves to the latest proof pack version for batch B and displays its version label (e.g., v3) And opening the copied URL in a new session loads the same dossier view subject to permissions
Version Metadata Recording and Display
Given a proof pack is generated for batch B When generation completes Then the system stores immutable version metadata for that proof pack including ruleSetId and semanticVersion, presetId and version, and model identifiers with model hashes And the stored metadata is visible in the batch index header and in each image dossier header And the metadata is retrievable via a read API for batch B and proof pack version V And any subsequent changes to rule sets, presets, or models do not alter the stored metadata for version V
Change History and Audit Trail
Given rule sets, presets, models, and proof packs may change over time When any of the following events occur: rule set created/updated, preset updated, model updated, proof pack generated or regenerated Then an immutable audit entry is recorded with timestamp (UTC), actor, action, scope (batch or global), affected identifiers, and previous/new versions where applicable And the change history for batch B is viewable in chronological order in the UI and via an API endpoint And prior proof pack versions referenced in history remain accessible read-only And attempting to modify or delete a history entry is rejected with an error
Re-Generate Proof Pack With Preservation of Prior Versions
Given batch B has an existing proof pack version V And the active rule set or preset has changed When a user with permission selects Regenerate Proof Pack for batch B Then the system creates a new proof pack version V+1 using the current rule set/preset/model versions And version V remains accessible read-only and is not overwritten or deleted And the batch index defaults to showing version V+1 with a control to switch between V and V+1 And each dossier clearly indicates its version and the generation timestamp And concurrent regenerations for the same batch are queued or prevented to avoid conflicting versions
Rule- and Metric-Level Diff Between Two Dossier Versions
Given batch B has at least two proof pack versions V and W When a user selects V and W to compare Then the diff view highlights per rule whether it is added, removed, or modified And for modified rules, the view shows metric deltas with sign (e.g., +0.12) and any pass/fail state changes And a batch-level summary shows net images that changed state: Pass→Fail, Fail→Pass, and unchanged counts And the user can drill down to per-image diffs showing only rules with changes for that image And the diff view renders within 2,500ms for batches up to 5,000 images And the diff can be exported to PDF or CSV including before/after thumbnails and changed metrics
Automation, API & Webhooks
"As a developer, I want APIs and webhooks for proof packs so that I can automate generation and integrate them into our pipelines."
Description

Add workspace-level policies to auto-generate proof packs on job completion or on manual trigger, with queueing, retries, and idempotency. Expose REST endpoints to request generation, poll status, and download artifacts; emit webhooks on completion/failure including artifact URLs and checksums. Provide Slack/email notifications, rate limiting, and concurrency controls to protect system stability and enable seamless integration into external pipelines.

Acceptance Criteria
Auto-generate Proof Packs on Job Completion with Idempotency and Retries
Given a workspace policy "Auto-generate proof pack on job completion" is enabled When a batch job transitions to status "completed successfully" Then a proof pack generation task is enqueued within 5 seconds with a unique request_id And if duplicate completion events are received within 24 hours for the same job_id, only one proof pack is generated (idempotent) and the same request_id is returned And transient failures (e.g., storage 5xx) are retried with exponential backoff up to 5 attempts And the job timeline reflects "proof pack: queued | processing | complete/failed" with UTC timestamps
Manual Proof Pack Generation via REST API with Idempotency Keys
Given a valid OAuth2 token with scope proof_packs.write and an Idempotency-Key header When POST /v1/proof-packs with a payload containing job_id or batch_id is submitted Then the API returns 202 Accepted with request_id and status "queued" And repeating the same request (same Idempotency-Key and identical payload) within 24 hours returns 202 with the same request_id and does not create a duplicate task And invalid payloads return 422 with machine-readable error codes; unauthorized requests return 401; insufficient scope returns 403 And exceeding rate limits returns 429 with Retry-After plus X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers
Poll Proof Pack Status and Download Artifacts with Checksums
Given a valid request_id for a proof pack generation When GET /v1/proof-packs/{request_id} is called Then the response includes status in [queued, processing, complete, failed], progress percentage, and ISO 8601 timestamps for created_at/updated_at And when status is "complete", the response includes signed URLs for artifacts (index.json, before_after_thumbs.zip, dossier.pdf), each with SHA-256 checksum and byte size And signed URLs expire after 60 minutes and can be refreshed via POST /v1/proof-packs/{request_id}/refresh-urls And downloaded artifact checksums match the provided SHA-256 values
Deliver Secure Webhooks on Completion/Failure with Retries
Given a workspace webhook endpoint with a shared secret is configured and enabled for events [proof_pack.completed, proof_pack.failed] When a proof pack reaches status "complete" or "failed" Then a POST is delivered within 10 seconds containing event_id, event_type, request_id, job_id, timestamps, and artifacts (url, checksum, size), plus signature headers X-PixelLift-Signature and X-PixelLift-Timestamp And the signature is HMAC-SHA256 over the raw payload using the shared secret and is valid within a 5-minute timestamp tolerance And non-2xx responses trigger exponential retries with jitter for up to 24 hours; deliveries are idempotent per event_id
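A receiver-side sketch of the signature check. The HMAC is specified as "over the raw payload"; signing the concatenation `<timestamp>.<body>` (so the timestamp is tamper-evident) is a common construction and an assumption here:

```python
import hashlib
import hmac
import time

def sign_webhook(secret, raw_body, timestamp=None):
    """Produce X-PixelLift-Signature / X-PixelLift-Timestamp headers."""
    ts = str(int(time.time() if timestamp is None else timestamp))
    mac = hmac.new(secret, f"{ts}.".encode() + raw_body, hashlib.sha256)
    return {"X-PixelLift-Signature": mac.hexdigest(), "X-PixelLift-Timestamp": ts}

def verify_webhook(secret, raw_body, headers, now=None, tolerance_s=300):
    """Reject stale timestamps (replay) and any signature mismatch."""
    ts = headers.get("X-PixelLift-Timestamp", "0")
    now = time.time() if now is None else now
    if abs(now - int(ts)) > tolerance_s:
        return False
    expected = hmac.new(secret, f"{ts}.".encode() + raw_body,
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, headers.get("X-PixelLift-Signature", ""))
```

Consumers should verify against the raw request bytes before any JSON parsing, since re-serialization can change the payload.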
Send Slack and Email Notifications on Proof Pack Completion/Failure
Given Slack and/or email notifications are enabled at the workspace level for proof pack events When a proof pack completes or fails Then a Slack message is posted to the configured channel including workspace name, job/batch id, status, pass/fail summary, and a link to the proof pack And an email with the same summary and links is sent to configured recipients And notifications are sent at most once per proof pack event; failures are retried up to 3 times; users can opt out per channel
Enforce Rate Limiting, Queuing, and Concurrency Controls per Workspace
Given default per-workspace limits of 60 API requests per minute and 5 concurrent proof pack generations When requests exceed the rate limit Then the API returns 429 with X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers And when the concurrency limit is reached, new generation requests enter a FIFO queue per workspace and start automatically as capacity frees And if a per-workspace queue exceeds 100 pending tasks, new requests return 503 with error "queue_full" And the system enforces fair-share scheduling across workspaces to prevent starvation
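The concurrency and queue rules can be modeled as a small per-workspace scheduler. A minimal in-memory sketch (a real system would persist the queue and enforce fair-share across workspaces); the class name is hypothetical:

```python
from collections import deque

class WorkspaceScheduler:
    """Per-workspace admission: up to max_running concurrent generations,
    a FIFO queue capped at max_queued, and 'queue_full' beyond that."""

    def __init__(self, max_running=5, max_queued=100):
        self.max_running, self.max_queued = max_running, max_queued
        self.running, self.queue = set(), deque()

    def submit(self, task_id):
        if len(self.running) < self.max_running:
            self.running.add(task_id)
            return "processing"
        if len(self.queue) >= self.max_queued:
            return "queue_full"          # surfaced to the API as HTTP 503
        self.queue.append(task_id)
        return "queued"

    def finish(self, task_id):
        self.running.discard(task_id)
        if self.queue:                   # promote the oldest queued task
            self.running.add(self.queue.popleft())
```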
Performance, Scalability & Reliability
"As an operations lead, I want performance and reliability guarantees so that large batches deliver proof packs on time without failures."
Description

Meet SLOs for large batches (e.g., 95th percentile generation of PDF+JSON for 500 images within 5 minutes) with autoscaling workers, backpressure, and progress indicators. Support resumable and chunked downloads, storage retention policies (e.g., 30 days with configurable overrides), and encrypted storage/transport. Implement health checks, monitoring, and alerting; graceful degradation and clear user-facing error messages; and disaster recovery objectives (defined RPO/RTO) to ensure dependable delivery of proof packs at scale.

Acceptance Criteria
500-Image Batch SLA: P95 PDF+JSON Generation ≤ 5 Minutes
Given a 500-image batch is submitted to generate a Proof Pack (PDF + JSON) and the system is in normal operating conditions When the job is accepted and processing begins Then the end-to-end generation time at the 95th percentile is ≤ 5 minutes measured from job acceptance to artifacts being marked "Ready" And the 99th percentile is ≤ 7 minutes And success rate within the SLA window is ≥ 99.5% over at least 30 consecutive batch runs And generation start/end timestamps and latency metrics are captured to verify percentiles
Autoscaling and Backpressure Maintain Throughput Under Peak Load
Given incoming workload rises to 50 concurrent 500-image batches within 2 minutes When autoscaling policies are active Then new workers are provisioned within 60 seconds to keep average queue wait time ≤ 60 seconds And worker CPU utilization stabilizes between 50% and 75% And no jobs are dropped; all queued jobs are durable and retried on failure up to 3 times with exponential backoff And while capacity is constrained, API responds with 429 and a valid Retry-After header for new submissions beyond the rate limit And under this peak, the batch-level P95 generation SLA (≤ 5 minutes) is met for ≥ 95% of batches started during the window
Real-Time Batch Progress and Status Visibility
Given a batch Proof Pack job is running When the user views the batch detail page or polls the job status API Then status transitions are exposed as Queued, Processing, Generating, Packaging, Ready, Failed And percent complete updates at least every 5 seconds during active processing And percent complete accuracy is within ±10% of actual remaining work And each image’s pass/fail count and latest stage are visible And a stable job identifier allows resuming the view without losing progress context
Resumable, Chunked, and Verifiable Downloads
Given a generated Proof Pack PDF and JSON are available When a client downloads using HTTP Range requests Then the server returns 206 Partial Content with ETag and Accept-Ranges headers And downloads can pause and resume without restarting after a simulated network interruption And chunk sizes are between 4 MB and 16 MB and reassembled payload SHA-256 matches the stored checksum And parallel chunked downloads are supported up to 8 concurrent connections per file without corruption or throttling errors And signed URLs remain valid until expiry and are invalid after expiry
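Chunk planning and checksum verification are the two client-side pieces of this criterion. A minimal sketch, using an 8 MB chunk size (inside the 4–16 MB window); the function names are hypothetical:

```python
import hashlib

def plan_chunks(total_bytes, chunk_size=8 * 1024 * 1024):
    """Split a file of total_bytes into inclusive byte ranges (start, end)
    suitable for HTTP Range requests."""
    return [(s, min(s + chunk_size, total_bytes) - 1)
            for s in range(0, total_bytes, chunk_size)]

def reassemble_and_verify(chunks, expected_sha256):
    """Join downloaded chunks in order; return the payload only if its
    SHA-256 matches the stored checksum, else None."""
    blob = b"".join(chunks)
    return blob if hashlib.sha256(blob).hexdigest() == expected_sha256 else None
```

Each planned range can be fetched independently (up to the 8-connection cap), and a `None` result signals the client to re-download rather than trust a corrupted artifact.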
Data Security and Retention Compliance
Given Proof Pack artifacts and logs are stored after generation When data is at rest or in transit Then artifacts are encrypted at rest with AES-256 and keys managed by KMS And all transport uses TLS 1.2+ with modern ciphers; weak protocols are disabled And access to artifacts is controlled via time-limited signed URLs with default expiry ≤ 24 hours (configurable) And the default retention is 30 days; overrides can be set per batch between 7 and 90 days And artifacts past retention are hard-deleted within 24 hours and become irretrievable via API or UI And all retention overrides and deletions are auditable with user, timestamp, and reason
Operational Health, Observability, and Alerts
Given the service is deployed When health endpoints are queried Then /healthz (liveness) returns 200 if the process is running; /readyz (readiness) returns 200 only if critical dependencies (queue, storage, DB) are reachable and within thresholds; otherwise 503 And metrics are emitted for queue depth, job throughput, error rate, and generation latency (p50/p95/p99) And an alert is triggered when p95 generation time exceeds 5 minutes for 5 consecutive minutes or error rate > 1% over 5 minutes And on-call receives a page within 2 minutes of alert trigger And logs include correlation IDs per batch/job to trace across services
Disaster Recovery: RPO/RTO and Failover Drill
Given a region-wide outage is simulated When failover to the secondary region is initiated using the documented runbook Then Recovery Time Objective (RTO) is ≤ 30 minutes to restore Proof Pack generation and download services And Recovery Point Objective (RPO) is ≤ 15 minutes for job metadata and generated artifacts And in-flight batches are either resumed or automatically re-queued with idempotent processing without duplicate charges or artifacts And a semi-annual DR drill demonstrates meeting RPO/RTO with a post-mortem and evidence retained

NeckForge

AI reconstructs the interior neckline for an elegant invisible‑mannequin look. Control depth, curve, and collar spread, toggle label visibility, and reuse brand-specific neck templates for consistent results across collections—no reshoots or manual cloning required.

Requirements

Neckline Reconstruction Engine
"As an online seller, I want my apparel images to have a clean invisible‑mannequin neckline so that my listings look premium and consistent without needing costly reshoots."
Description

Develop a core AI pipeline that reconstructs the interior neckline for an invisible‑mannequin effect from a single product photo. The engine must preserve fabric texture, stitching, and prints; handle common neckline types (crew, V, scoop, mock, turtleneck, polo) and garments (tees, shirts, dresses, hoodies); and output a clean alpha mask plus a composite image. It must maintain color fidelity (sRGB/AdobeRGB workflows), consistent shading, and edge realism with anti‑aliasing and micro‑shadow synthesis. The component exposes tunable parameters (depth, curve, collar spread) and returns quality/confidence scores. It integrates into the PixelLift pipeline after background removal and before style‑preset application. Performance target: ≤3s per 2048px image on a T4‑class GPU, with deterministic results given identical inputs and seeds. Provides safe fallback to the original image when confidence is below threshold.

Acceptance Criteria
Multi‑Neckline Reconstruction Coverage and Edge Realism
Given a dataset containing crew, V, scoop, mock, turtleneck, polo, and hoodie necklines When the engine processes each image post background removal Then an invisible‑mannequin neckline is reconstructed for ≥ 99% of images without holes or floating geometry And the reconstructed edge is anti‑aliased with no jagged artifacts on > 99% of edge pixels (edge roughness index ≤ 0.05) And on a synthetic ground‑truth set, per neckline type SSIM ≥ 0.92 and LPIPS ≤ 0.18 within the neckline ROI
Texture, Stitching, and Print Preservation
Rule: Pixels outside the modified neckline ROI must match the input within ΔE00 ≤ 1.0 for ≥ 98% of pixels Rule: Within the neckline ROI, high‑frequency detail is preserved with Laplacian variance ≥ 85% of the adjacent garment baseline Rule: Printed pattern continuity across the reconstructed interior shows misalignment < 2 px at seam/print intersections Rule: No cloning/repeating artifacts larger than 6 px appear within the neckline ROI (rate ≤ 0.5% of images)
Alpha Mask and Composite Output Correctness in Pipeline
Given an input RGBA garment cutout When processed by the engine Then the outputs include (a) a composite image with reconstructed neckline and (b) a standalone alpha mask; both match input width/height exactly And the alpha mask is unpremultiplied, ≥ 8‑bit, with feather width 0.5–2.0 px and no haloing (opaque ring width < 0.5 px) And micro‑shadow synthesis is present along interior edges with peak darkness 1–6% of local luminance without color shift (ΔE00 ≤ 1.0) And the step executes after background removal and before style‑preset application; background pixels remain alpha=0 outside the garment
Color Fidelity and ICC Profile Preservation (sRGB/AdobeRGB)
Given inputs tagged sRGB or AdobeRGB with embedded ICC profiles When processed by the engine Then output images embed the same profile tag and unchanged regions show ΔE00 ≤ 2.0 for ≥ 99% of pixels And no gamut clipping occurs for in‑gamut colors (ΔE00 > 3.0 rate < 0.5%) And a color checker overlay round‑trip test yields median ΔE00 ≤ 0.5 across patches
Parameter Controls and Deterministic Reproducibility
Rule: depth, curve, and collarSpread are accepted as floats in [0.0, 1.0]; out‑of‑range values are clamped and logged
Rule: Increasing depth by +0.1 increases interior reveal area by 8% ± 2% on average; increasing collarSpread by +0.1 widens collar opening angle by 5° ± 2°; increasing curve by +0.1 increases neckline curvature radius by 6% ± 2%
Rule: Given identical input image, parameters, and random seed, repeated runs produce byte‑identical composite, alpha mask, and scores
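Under these rules, clamp-and-log plus deterministic seeding could be sketched as follows (the function names are illustrative):

```python
import logging
import random

logger = logging.getLogger("neckforge")

def clamp_param(name: str, value: float) -> float:
    """Clamp a control value into [0.0, 1.0]; out-of-range inputs are logged, not rejected."""
    clamped = min(1.0, max(0.0, value))
    if clamped != value:
        logger.warning("parameter %s=%.3f out of range, clamped to %.3f",
                       name, value, clamped)
    return clamped

def seeded_rng(image_id: str, seed: int) -> random.Random:
    """Derive a per-image RNG so identical inputs + seed reproduce identical output.

    Seeding with a string (not hash()) keeps the stream stable across processes.
    """
    return random.Random(f"{image_id}:{seed}")
```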
Performance Target on T4‑Class GPU (2048px)
Given a batch of 100 images with max dimension 2048 px When processed on a single NVIDIA T4 with warmed model Then per‑image latency median ≤ 1.8 s and P95 ≤ 3.0 s with zero hard timeouts And peak GPU memory usage ≤ 6 GB; CPU utilization ≤ 200% on a 4‑vCPU host; failure rate < 0.5% And sustained throughput ≥ 30 images/min over the batch
Confidence Scoring and Safe Fallback Behavior
Rule: The engine returns qualityScore and confidence ∈ [0.0, 1.0] per image
Rule: If confidence < threshold (default 0.85; configurable), the engine returns the original unmodified image/mask, sets result.mode="fallback", and emits a low_confidence event
Rule: For confidence ≥ threshold, confidence positively correlates with modified area quality on a validation set (Spearman ρ ≥ 0.6)
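The fallback rule reduces to a small gate; in this sketch, `apply_with_fallback`, the `emit` hook, and the result shape are assumptions rather than the actual engine API:

```python
def apply_with_fallback(original, reconstructed, confidence,
                        threshold=0.85, emit=lambda event: None):
    """Gate the reconstruction on confidence; below threshold, return the input untouched."""
    if confidence < threshold:
        emit({"event": "low_confidence", "confidence": confidence})
        return {"image": original, "mode": "fallback"}
    return {"image": reconstructed, "mode": "reconstructed"}
```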
Interactive Neck Controls with Live Preview
"As a boutique owner, I want precise, real‑time controls over the neckline shape so that I can match my brand’s look across different garments without trial‑and‑error."
Description

Provide intuitive UI controls for depth, curve, and collar spread with slider + numeric entry, symmetry toggle, and anchor point handles. Changes render in a real‑time preview at 1:1 with GPU acceleration and <100ms interaction latency. Include undo/redo (20 steps), reset to defaults, tooltips, and accessibility (keyboard navigation, screen‑reader labels). Validate parameter ranges per garment type and auto‑suggest starting values based on detected neckline class. Persist settings per image and session, and expose the same controls via API for automation.

Acceptance Criteria
Real-time 1:1 Preview Performance and GPU Acceleration
Given the 1:1 preview is visible and the user continuously drags any neck control for 50 adjustments, When measuring event-to-render latency, Then the median latency is ≤ 60 ms and the 95th percentile latency is ≤ 100 ms; And during a continuous 3-second drag, frame rate remains ≥ 30 FPS; And the rendering backend reports GPU acceleration enabled for the preview.
Given the preview is zoomed to 100%, When any control value changes (slider, numeric, handle, toggle), Then the preview updates within ≤ 100 ms and reflects the exact control value applied.
Control Inputs: Slider–Numeric Synchronization and Tooltips
Given depth, curve, and collar spread controls are visible, When a slider is moved, Then the corresponding numeric input updates to the same value within the defined step precision; And When the numeric input is edited and committed (Enter or blur), Then the slider position matches the numeric value.
Given keyboard focus is on a numeric field, When ArrowUp/ArrowDown is pressed, Then the value increments/decrements by one step; And with Shift+Arrow, by 10× the step; And values clamp at min/max.
Given a user hovers or focuses any control, When the tooltip appears, Then it displays label, current value with unit, and the allowed min–max range.
Symmetry Toggle and Anchor Handle Editing
Given symmetry is ON, When the user adjusts a left or right anchor/slider for curve or spread, Then the opposite side mirrors the change and resulting values remain equal; And the preview reflects a symmetric neckline.
Given symmetry is OFF, When the user adjusts a left or right anchor/slider, Then only that side changes and the other side remains unchanged.
Given asymmetric values exist, When symmetry is toggled ON, Then both sides are set to the average of their current values and no single-step jump exceeds one control step.
Given anchor point handles are visible, When a handle is dragged within bounds, Then the corresponding control value updates continuously, clamps to valid range, is undoable, and the preview updates within ≤ 100 ms; And the handle is operable via keyboard nudges for the same step increments.
Undo/Redo History (20 Steps) and Reset to Defaults
Given a sequence of 21 distinct control changes, When the user presses Undo repeatedly, Then up to 20 prior states are restored in reverse order and the 21st (oldest) state is dropped; And When Redo is used, the changes are reapplied in order.
Given any supported operation (slider move, numeric edit, handle drag, symmetry toggle, garment-type change, API-applied change), When performed, Then it is added as a single undoable step.
Given customized values are present, When Reset is invoked, Then all controls revert to the garment-type default starting values; And this Reset action is added to the undo stack and can be undone/redone.
Parameter Validation and Auto‑Suggest by Detected Neckline Class
Given an image is loaded and a neckline class is detected, When controls are initialized, Then starting values are auto-suggested for depth, curve, and spread, and each lies within the allowed range for that class.
Given garment-type specific ranges exist, When the user attempts to set a value out of range (via typing, paste, drag), Then the value clamps to the nearest bound and an inline validation message is shown; And the preview reflects the clamped value only.
Given no confident neckline class is detected, When controls initialize, Then a safe default preset is applied and the UI indicates the default was used.
Accessibility: Keyboard Navigation and Screen‑Reader Labels
Given only a keyboard is used, When tabbing through the UI, Then focus reaches all interactive elements (sliders, numeric inputs, symmetry toggle, reset, undo/redo, handles) in a logical order with a visible focus indicator; And all adjustments can be performed via keyboard controls.
Given a screen reader is active, When any control receives focus, Then its accessible name, role, and current value are announced; And when a value changes or a validation error occurs, an ARIA live region announces the update without losing focus.
Given tooltips exist, When a control is focused (not just hovered), Then its help text is programmatically associated and available to assistive technologies.
Persistence per Image and Session; API Parity for Automation
Given the user edits Image A, When navigating to Image B and back within the same session, Then Image A’s last control values are restored exactly as left.
Given a new image is opened in the same session with no prior edits, When controls initialize, Then session last-used values are applied unless overridden by auto-suggest for a different detected neckline class (in which case the auto-suggest takes precedence and is indicated).
Given the public API is used, When calling GET for an image’s neck controls, Then the response returns depth, curve, spread, symmetry, and current defaults; And When calling PUT with valid values, Then the same validation rules as the UI apply and the values are persisted; And When calling PUT with out-of-range values, Then a 400 error is returned with a machine-readable code and message; And API-applied changes are reflected in the UI on next load or refresh.
Given batch automation needs, When calling an API endpoint to apply a preset to multiple image IDs, Then each item returns an individual success/failure status and failed items do not prevent successful ones from applying.
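A minimal sketch of the shared UI/API validation, assuming a uniform [0.0, 1.0] range and borrowing the `PARAM_OUT_OF_RANGE` code used by the batch API; the handler name and response shape are illustrative:

```python
ALLOWED_RANGE = (0.0, 1.0)  # assumed shared range; real ranges vary per garment type

def put_neck_controls(payload: dict):
    """Validate a PUT body the same way the UI does; return (status, response body)."""
    errors = []
    for field in ("depth", "curve", "spread"):
        value = payload.get(field)
        if (not isinstance(value, (int, float)) or isinstance(value, bool)
                or not ALLOWED_RANGE[0] <= value <= ALLOWED_RANGE[1]):
            errors.append({
                "code": "PARAM_OUT_OF_RANGE",
                "field": field,
                "message": f"{field} must be a number in "
                           f"[{ALLOWED_RANGE[0]}, {ALLOWED_RANGE[1]}]",
            })
    if errors:
        return 400, {"errors": errors}  # machine-readable code + message per field
    return 200, {"saved": {k: payload[k] for k in ("depth", "curve", "spread")},
                 "symmetry": bool(payload.get("symmetry", True))}
```

Sharing one validator between the UI layer and the HTTP handler is what keeps the "same validation rules as the UI" criterion testable.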
Label Visibility & Placement Control
"As a brand manager, I want to control whether and how the neck label appears so that my images comply with brand guidelines and marketplace policies."
Description

Enable a toggle to show/hide interior labels and configure label placement, size, rotation, and curvature to match the reconstructed neckline. Support uploading a brand label asset, apply perspective and lighting adaptation, and ensure legibility without occluding seam details. Default to a neutral blank label to avoid unintended branding. Export the label as a separate layer group when using layered formats (e.g., PSD) and embed label metadata for downstream systems. Enforce safeguards to prevent hallucinated or duplicated branding, and provide quick presets (centered, offset left/right) with snap‑to seam guides.

Acceptance Criteria
Toggle Label Visibility on Reconstructed Neckline
Given a processed neckline, When the label visibility toggle is Off, Then no label pixels are rendered and any detected original label in the neckline region is hidden without altering seam geometry.
Given visibility is On with no brand asset uploaded, When rendering preview or export, Then a neutral blank label with no text/logo is shown sized to 60–80% of collar back width by default.
Given the user switches the visibility toggle, When preview updates, Then the change is reflected within 300 ms and the exported output matches the last toggle state.
Given exports to JPEG/PNG or PSD, When exporting, Then the rendered/visible label state matches the preview for the chosen format.
Upload and Adapt Brand Label Asset
Given a PNG or SVG label asset ≥512 px on the shortest side with transparency, When uploaded, Then it is placed on the neckline with perspective warp applied and mean corner alignment error < 2 px to the collar path.
Given the label is placed, When lighting adaptation runs, Then label luminance and shadowing match adjacent fabric within ±10% mean luminance and directionally consistent shading.
Given the adapted label, When checking legibility, Then contrast ratio between label foreground and substrate is ≥4.5:1 or the user is prompted to auto-adjust/reject.
Given an asset <256 px or unsupported type, When uploaded, Then the upload is rejected with a clear validation message and the current label remains unchanged.
Placement, Size, Rotation, Curvature Controls with Seam Snap
Given an active label, When the user drags it, Then it snaps within 4 px to seam guides and the neckline midline.
Given the user adjusts size/rotation, When constraints are applied, Then aspect ratio is preserved unless explicitly unlocked, size range is 20–120% of suggested collar width, and rotation range is -30° to +30°.
Given the user inputs numeric values, When applied, Then position (x,y), rotation (°), width (%), and curvature (%) match inputs within ±1 unit tolerance.
Given the label overlaps seams, When overlap exceeds 2 px along the seam path, Then the system auto-offsets to resolve or prompts the user to confirm.
Preset Positions and Snap-to Seam Guides
Given presets Centered, Offset Left (10%), and Offset Right (10%), When a preset is selected, Then the label center aligns to the collar midline or shifts by 10% of neck opening width left/right within ±2 px.
Given a preset is applied, When rotation is computed, Then the label aligns tangent to the neckline at its anchor within ±2°.
Given a preset action, When executed, Then the change is undoable in a single step and the preview updates within 200 ms.
Export Layered PSD with Separate Label Group and Metadata
Given PSD export is selected, When exporting, Then a top-level group named "Label" exists with sublayers "Label Art", "Label Shadow", "Label Highlights", and "Label Warp Mask" and the group can be toggled independently.
Given the exported PSD, When inspecting bounds, Then the label group bounding box matches the on-canvas label within ±2 px and vector content remains vector when source is SVG.
Given any export (PSD, PNG, JPEG), When embedding metadata, Then XMP contains fields: labelVisibility, labelBrand, labelVariant, placementX, placementY, rotationDeg, widthPct, curvaturePct, perspectiveMatrix, processingConfidence with non-null values.
Safeguards Against Hallucinated or Duplicated Branding
Given no label asset is uploaded and visibility is On, When rendering, Then the system produces a blank neutral label with zero text/logo elements and does not generate brand content.
Given an uploaded branded label and an existing visible brand mark in the neckline region (detection confidence ≥0.6), When applying the label, Then the user is prompted to Hide original, Replace, or Keep both; default is Hide original and no export contains duplicated brand marks within 50 px proximity.
Given any enhancement pass, When scanning the final image, Then no new text/logo elements are present outside the label bounds at detection confidence ≥0.6.
Batch Apply Brand Label Template Across Catalog
Given a saved label template (asset + placement settings), When applied to a batch of N images, Then ≥95% of images apply without manual adjustment and exceptions are flagged with cause codes (e.g., low contrast, seam occlusion).
Given batch application, When measuring consistency, Then centerline deviation ≤3% of neck width, rotation deviation ≤3°, and size deviation ≤5% relative to the template per image.
Given a batch size of up to 500 images, When processing on the standard environment, Then average label application time is ≤0.5 s per image and per-file metadata/layer group naming is consistent across outputs.
Brand Neck Template Library
"As a studio lead, I want to save and reuse brand‑specific neck settings so that my team can produce consistent results across collections without manual re‑tuning."
Description

Create reusable, versioned templates that store NeckForge parameters (depth, curve, collar spread), label configuration, edge feathering, shadow strength, and color profile preferences. Templates can be named, previewed with thumbnails, assigned to collections/SKUs, and shared across team workspaces with role‑based permissions. Support import/export (JSON) for portability, audit trails for changes, and template pinning as defaults per catalog or uploader. Ensure backward compatibility when the model is updated by keeping template‑to‑model compatibility metadata.
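The JSON import/export round trip could be sketched as below; the field names follow the criteria in this section, but the exact schema is an assumption:

```python
import json

def export_template(template: dict) -> str:
    """Serialize a neck template with model-compatibility metadata (illustrative schema)."""
    return json.dumps({
        "templateId": template["templateId"],
        "name": template["name"],
        "version": template["version"],
        "parameters": {
            "depth": template["parameters"]["depth"],
            "curve": template["parameters"]["curve"],
            "collarSpread": template["parameters"]["collarSpread"],
        },
        "modelCompatibility": {"modelVersion": template["modelVersion"]},
    }, indent=2, sort_keys=True)

def import_template(raw: str) -> dict:
    """Parse and schema-check an exported template; raises ValueError on missing fields."""
    data = json.loads(raw)
    for field in ("templateId", "name", "version", "parameters", "modelCompatibility"):
        if field not in data:
            raise ValueError(f"missing field: {field}")
    return data
```

Carrying `modelCompatibility` in the export is what lets a receiving workspace flag Needs Migration instead of failing at processing time.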

Acceptance Criteria
Template Creation, Naming, Parameter Validation, and Thumbnail Preview
Given I am an Editor in Workspace W When I create a new neck template with valid values for depth, curve, collar spread, label visibility, edge feathering, shadow strength, and color profile, and enter a unique name Then the Save action is enabled and the template is persisted with a templateId and version 1 And a preview thumbnail is generated within 3 seconds using the selected sample image And all entered parameters persist after page refresh and on re-open And attempting to save a duplicate name (case-insensitive) prompts me to rename or accept an auto-suffixed unique name And validation prevents saving when any parameter is out of the allowed range and displays field-level errors
Assign Template to Collections and SKUs with Auto-Apply
Given Template T(v1) exists and is assigned to Collection C and SKUs S1 and S2 When I batch-upload images for SKUs S1 and S2 Then Template T(v1) is auto-selected and applied to 100% of those images by default And the processing log for each image records templateId, templateVersion, and modelVersion And I can override the template per image before processing, and the override is respected And images without an assignment follow the default selection rules for the workspace
Versioning, Compare, Rollback, and Audit Trail
Given Template T(v1) exists When I edit Template T and save changes as a new version Then Template T(v2) is created and v1 remains immutable and selectable And the audit trail records userId, timestamp (ISO 8601), changed fields with before/after values, and an optional reason And I can view a side-by-side comparison of v1 and v2 parameters and thumbnails And I can set any version as the active default without deleting other versions And reprocessing applies only to new or re-run jobs; previously processed images remain unchanged unless explicitly reprocessed
Import and Export Templates (JSON) with Validation and Conflict Handling
Given I export selected templates T1..Tn When I download the JSON Then the file includes for each template: name, templateId, versions with parameters (depth, curve, collar spread), label configuration, edge feathering, shadow strength, color profile, assignments by id, and model compatibility metadata And when I import the same JSON into another workspace Then the system validates schema and compatibility; valid templates import, conflicts are resolved via rename or skip, and invalid entries report field-level errors with line references And imported templates preserve version numbers and provenance fields (creator, createdAt) if provided And importing up to 100 templates produces a success/fail summary and completes without timeout under normal network conditions
Role-Based Permissions and Cross-Workspace Sharing
Given roles Admin, Editor, and Viewer exist When a Viewer attempts to create, edit, share, delete, import, or export a template Then the action is blocked with a 403 error and the UI control is disabled And Editors can create and edit templates within their workspace but cannot share across workspaces And Admins can create, edit, delete, share, import, and export templates And when an Admin shares Template T to Workspace X, members of X can see and apply T according to their roles, and an audit event is recorded with who, when, and target workspace
Template Pinning and Default Selection Priority (Catalog and Uploader)
Given a workspace default template D, a catalog-level default C, and a user-level default U are configured When I open the upload screen for Catalog K as user U Then the preselected template follows priority order: SKU assignment > Catalog default > User default > Workspace default And the currently applied default is indicated in the UI and can be overridden before processing And changes to defaults take effect for new sessions within 1 minute and do not affect jobs already queued or processed
Model Update Backward Compatibility and Migration
Given Template T references modelVersion m1 and NeckForge updates to modelVersion m2 When I process new images using T after the update Then the system applies compatibility mapping so processing completes without error And if T is incompatible, it is flagged as Needs Migration with guidance to run a one-click migration that creates T' targeting m2 while preserving settings where possible And if migration is skipped, the system automatically uses the last compatible model for T And all outcomes (mapped, migrated, skipped) are logged with reasons and associated image/template identifiers
Batch Processing & Pipeline Orchestration
"As an operations manager, I want to run NeckForge on large catalogs with minimal babysitting so that we can meet launch deadlines reliably."
Description

Support batch application of NeckForge to hundreds/thousands of images with template assignment rules (by folder, SKU, or tag), concurrency controls, and autoscaling workers. Provide idempotent job submission via API/CLI, progress tracking, and webhooks for completion/failure events. Integrate with the PixelLift job graph to run after background removal and before style‑presets, with per‑image overrides and automatic retries on transient errors. Provide resumable jobs, per‑item status, and throughput targets of 500+ images/hour/GPU at 2048px.

Acceptance Criteria
Batch Template Assignment by Folder, SKU, and Tag
Given a batch of images with folder paths, SKUs, and tags and a rule set with precedence SKU > Tag > Folder When the NeckForge batch job is submitted with those template assignment rules Then each image is assigned the highest-priority matching neck template And images with no matching rule receive the configured default template And a per-image assignment log (image_id, matched_dimension, rule_id, template_id) is persisted and available via API And rule evaluation is deterministic and stable across re-runs given the same inputs
Idempotent Job Submission via API and CLI
Given an idempotency key K and an identical job payload When the client submits POST /necks/jobs with key K multiple times (serially or concurrently) or runs the CLI with --idempotency-key K Then only one job resource is created And all responses return the same job_id and idempotency_key And no duplicate work items are enqueued And if the payload differs for the same idempotency key within the idempotency window, the API returns 409 Conflict with error code IDEMPOTENCY_PAYLOAD_MISMATCH And submissions with the same key after the idempotency window create a new job and return a new job_id
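One way to satisfy these idempotency rules is a key → (payload hash, job) map; this in-memory sketch omits the expiry window and durable storage that a real service would need:

```python
import hashlib
import json

class JobStore:
    """In-memory sketch of idempotent job submission (no expiry window, no persistence)."""

    def __init__(self):
        self._by_key = {}  # idempotency key -> (payload digest, job_id)
        self._next = 1

    def submit(self, key: str, payload: dict):
        # Canonical JSON so logically identical payloads hash identically.
        digest = hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode()).hexdigest()
        if key in self._by_key:
            stored_digest, job_id = self._by_key[key]
            if stored_digest != digest:
                return 409, {"error": "IDEMPOTENCY_PAYLOAD_MISMATCH"}
            return 200, {"job_id": job_id}  # replay: same job, nothing re-enqueued
        job_id = f"job-{self._next}"
        self._next += 1
        self._by_key[key] = (digest, job_id)
        return 201, {"job_id": job_id}
```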
Pipeline Ordering in PixelLift Job Graph
Given the PixelLift pipeline with steps: BackgroundRemoval -> NeckForge -> StylePresets When a batch job runs Then NeckForge begins only after BackgroundRemoval has succeeded for the given item And StylePresets begins only after NeckForge has succeeded for that item And if BackgroundRemoval fails, NeckForge is skipped and the item status is blocked_by_dependency And the job graph audit log records step start/end timestamps and predecessor IDs per item
Progress Tracking, Per-Item Status, and Webhooks
Given a submitted batch job with N items and a registered webhook endpoint When processing starts Then the job status transitions: queued -> running -> completed/failed/partial And a progress endpoint returns counts by state (queued, processing, completed, failed, skipped) and overall percent complete And per-item status is retrievable via API with fields: item_id, sku, template_id, state, attempt_count, started_at, completed_at, error_code, error_message And webhooks are emitted: job.started, item.completed, item.failed, job.completed And webhook deliveries include an HMAC-SHA256 signature header and are retried up to 6 times with exponential backoff when a non-2xx is returned
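The HMAC-SHA256 signature and exponential backoff above are standard mechanisms; a minimal sketch (the base delay and helper names are assumptions):

```python
import hashlib
import hmac

def sign_webhook(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw request body, sent in a signature header."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify_webhook(secret: bytes, body: bytes, signature: str) -> bool:
    """Constant-time comparison so receivers aren't open to timing attacks."""
    return hmac.compare_digest(sign_webhook(secret, body), signature)

def backoff_schedule(attempts: int = 6, base: float = 1.0) -> list[float]:
    """Delays (seconds) for up to 6 redelivery attempts: base * 2^i."""
    return [base * (2 ** i) for i in range(attempts)]
```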
Automatic Retries and Resumable Jobs
Given transient errors (e.g., network timeouts, 5xx from storage, throttling) during item processing When they occur Then the system retries the affected item up to 3 times with exponential backoff without duplicating output artifacts And if all retries fail, the item is marked failed with error_code and next_action "resume_available" And when POST /necks/jobs/{job_id}/resume is called Then only items in failed or not_started states are re-queued; completed items are not reprocessed And resumed processing preserves original idempotency and aggregates into the final job result
Throughput and Concurrency Autoscaling
Given a worker pool with M GPUs and max concurrency per GPU configured When processing 2048px images at steady-state load Then measured throughput is >= 500 images/hour/GPU over a rolling 60-minute window And autoscaling increases worker replicas to maintain average queue wait time <= 2 minutes while respecting max concurrency per GPU And autoscaling scales down when queue depth is 0 for 10 minutes And real-time metrics (throughput, latency, GPU utilization, queue depth) are exposed for verification
Per-Image Parameter Overrides
Given a batch job with template T and a per-image override manifest specifying depth, curve, collar_spread, and label_visibility for specific SKUs or image_ids When the job runs Then overrides are applied only to the specified items and take precedence over template defaults And unspecified parameters fall back to template T values And the output metadata for each item includes the effective parameters used And API validation rejects overrides outside allowed ranges with error_code PARAM_OUT_OF_RANGE and the item remains unprocessed
Automated QA & Fallback Handling
"As a content QA specialist, I want the system to flag and gracefully handle poor reconstructions so that only studio‑quality images are published without manual spot‑checking every file."
Description

Implement confidence scoring and anomaly checks for artifacts such as jagged edges, seam misalignment, asymmetry beyond tolerance, floating labels, and texture discontinuities. On detection, route images to a review queue with side‑by‑side before/after, overlays, and adjustable thresholds per brand. Provide automated fallbacks (shallower depth, different mask strategy, or bypass NeckForge) and emit alerts/metrics (Datadog/Stackdriver) for sustained failure patterns. Log decisions for traceability and continuously feed outcomes back to improve the model.
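The ordered-fallback policy can be sketched as a loop that stops at the first strategy passing QA; the strategy callables and result shape here are illustrative, not the production interface:

```python
def run_with_fallbacks(image, strategies, passes_qa):
    """Try (name, fn) strategies in order; stop at the first result that passes QA.

    Every attempt is recorded so the decision trail stays auditable.
    """
    attempts = []
    for name, fn in strategies:
        result = fn(image)
        ok = passes_qa(result)
        attempts.append({"strategy": name, "passed": ok})
        if ok:
            return {"status": "auto_approved", "strategy": name,
                    "attempts": attempts}
    return {"status": "review_queue", "reason": "all_fallbacks_failed",
            "attempts": attempts}
```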

Acceptance Criteria
Confidence Scoring Gates Auto-Approval
Given a brand has NeckForge QA thresholds configured with min_confidence = 0.92 And an image is processed through NeckForge When the computed confidence score for the final output is >= 0.92 Then the image is auto-approved and advanced to the next pipeline stage without entering the review queue And the decision is logged with image_id, model_version, confidence_score, threshold_used, and timestamp
Anomaly Detection Routes to Review Queue
Given brand-configured tolerances exist for jagged_edges, seam_misalignment_px, asymmetry_pct, floating_labels, and texture_discontinuity When any detected anomaly exceeds its configured tolerance or is flagged as high severity Then the image is routed to the QA review queue (not auto-published) And the review entry shows side-by-side before/after, anomaly overlays per metric, and per-metric scores And detection metrics are persisted and searchable by image_id and brand_id
Automated Fallback Strategies Execute
Given fallback policies are enabled with ordered strategies [shallower_depth, alternate_mask_strategy, bypass_neckforge] And the initial NeckForge output fails confidence or anomaly thresholds When processing continues with fallbacks in order Then the system stops at the first strategy that meets all thresholds and marks the outcome as auto-approved And the chosen fallback strategy, attempts, scores, and anomalies are logged for traceability And if no strategy passes, the item remains in the review queue with reason = "all_fallbacks_failed"
Adjustable Thresholds Per Brand Take Effect
Given Brand A has asymmetry_pct tolerance = 3% and Brand B has asymmetry_pct tolerance = 1% When the same image with measured asymmetry_pct = 2% is processed for both brands Then Brand A auto-approves while Brand B routes the image to the review queue And audit logs record the brand_id, thresholds used, measured values, and final decision for each run
Alerts and Metrics for Sustained Failure Patterns
Given alerting rules are configured: failure_rate_10m > 15% OR consecutive_failures >= 30 per brand When NeckForge QA failures for a brand meet either condition Then alerts are emitted to Datadog and Stackdriver including brand_id, current failure rate, consecutive failures, and top anomaly types And dashboards expose metrics: failure_rate, anomaly_breakdown, fallback_usage, review_queue_backlog And the alert auto-resolves when the failure rate drops below threshold for at least one full evaluation window
Traceability and Feedback to Model Improvement
Given a reviewer takes an action (approve, reject, override thresholds) on a queued item When the action is submitted Then the system logs decision details including reviewer_id, image_id, anomalies, thresholds at decision time, notes, and final assets And a feedback event with label (pass/fail), features, and artifacts is queued to the model improvement pipeline in a PII-safe manner per brand opt-in And an API endpoint returns the end-to-end decision history by image_id including fallback attempts and reviewer actions

SleeveFill

Automatically rebuilds interior sleeves and armholes with natural drape and symmetry. Dial opening width and fabric tension to match garment type (tank, tee, blazer), preserving cuff geometry and stitching so tops and outerwear present cleanly in every listing.

Requirements

Sleeve Interior Reconstruction Engine
"As an online seller, I want the tool to automatically rebuild empty sleeve interiors so that my product photos look professionally filled and balanced without manual retouching."
Description

Develop the core ML-driven inpainting and geometry-rebuild engine that reconstructs interior sleeves and armholes with natural drape and symmetry. The engine should infer missing interior fabric, estimate garment thickness, and synthesize plausible folds while avoiding distortions. It must ingest the product cutout mask from PixelLift’s segmentation stage, operate before style-presets are applied, and output an alpha-matted layer that preserves original garment boundaries. Include symmetry constraints across left/right sleeves, pose-awareness to handle angled garments, and fail-safes that revert to original if confidence is low. Provide deterministic results for identical inputs and support GPU acceleration for batch throughput targets.

Acceptance Criteria
Deterministic Rendering for Identical Inputs
- Given identical image, cutout mask, parameter set, and model/build version, when the engine runs 10 times in separate processes, then the RGBA output files are byte-identical (matching checksums) across all runs.
- Given identical inputs and environment on two different machines with the same GPU model/driver and CUDA/cuDNN versions, when the engine runs once per machine, then the outputs are byte-identical.
- Given identical inputs processed with batch size 1 versus batch size 16, when outputs are compared per item, then they are byte-identical.
Boundary Preservation and Alpha-Matted Output
- Given a valid product cutout mask aligned to the garment, when the engine runs, then it consumes the provided mask (no re-segmentation) and returns an RGBA image with the same spatial dimensions as the input.
- Then all pixels where the input mask==0 have alpha==0 in the output (no nonzero alpha leakage outside the silhouette).
- Then the outer garment boundary in the output deviates by ≤ 1 px Hausdorff distance from the input silhouette at 2048 px long edge.
- Then the output is tagged as a pre-style stage artifact so that style-presets execute after this step in the pipeline.
Symmetry and Pose-Aware Reconstruction
- Given garments with both sleeves visible, when reconstruction completes, then after pose normalization to the garment axis the mirrored left/right sleeve interior ROIs achieve SSIM ≥ 0.92 and area ratio difference ≤ 5%.
- Given garments photographed at annotated angles, when reconstruction completes, then estimated garment axis deviates ≤ 5° from ground truth and sleeve opening orientation deviates ≤ 5°.
- Then seamline continuity across sleeve interiors achieves an edge continuity score ≥ 0.95 (Canny edge overlap metric) with no discontinuities > 3 px.
Confidence-Based Fail-Safe Reversion
- Given samples where the model confidence drops below 0.70 for sleeve interior reconstruction, when the engine evaluates the region, then the affected interior reverts to the original pixels and alpha exactly (byte-identical to input in that region).
- On a labeled validation set, the false proceed rate (confidence ≥ 0.70 on low-quality reconstructions) is ≤ 1% and the false revert rate (confidence < 0.70 on high-quality reconstructions) is ≤ 5%.
- When any revert occurs, a revert=true flag is emitted in metadata for auditability.
Parameterized Control of Opening Width and Fabric Tension
- Given opening_width ∈ {0.25, 0.50, 0.75}, when measuring the sleeve aperture chord length in pixels, then the relative error to target is ≤ ±5% on ≥ 90% of validation images and none exceed ±10%.
- Given fabric_tension ∈ {0.10, 0.50, 0.90}, when measuring high-frequency energy in the sleeve interior ROI, then the metric increases monotonically with tension (Spearman ρ ≥ 0.7) and the 0.90 vs 0.10 difference is ≥ 25% in the expected direction.
- Given garment_type preset ∈ {tank, tee, blazer} with no manual overrides, then default parameters are set to: tank (opening_width 0.70±0.05, tension 0.60±0.05), tee (0.55±0.05, 0.50±0.05), blazer (0.40±0.05, 0.40±0.05).
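The monotonicity criterion relies on a Spearman rank correlation; for tie-free samples, ρ can be computed directly, without SciPy, from the classic formula ρ = 1 − 6Σd²/(n(n²−1)):

```python
def spearman_rho(xs: list[float], ys: list[float]) -> float:
    """Spearman rank correlation via the tie-free formula (a validation-harness sketch)."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0] * len(values)
        for rank, i in enumerate(order):
            r[i] = rank
        return r

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

In the harness, `xs` would be the tension settings and `ys` the measured high-frequency energy; ρ ≥ 0.7 passes the criterion. For tied measurements a rank-averaging variant (as in SciPy's `spearmanr`) would be needed instead.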
Preservation of Cuff Geometry and Stitching
- Given a cuff ROI mask, when reconstruction completes, then the cuff outer edge Hausdorff distance between output and input is ≤ 2 px at 2048 px long edge. - Then keypoint match rate (ORB/SIFT) within the cuff ROI is ≥ 90% within a 3 px reprojection threshold (preserves stitching detail). - Then cuff width change is within ±3 px on 95% of images and none exceed ±5 px. - Then edge continuity ratio along detected stitch lines is ≥ 0.95 (no breaks/gaps).
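The cuff-edge Hausdorff bound is straightforward to evaluate from sampled contour points. This sketch uses a brute-force pairwise implementation (fine for contour-sized point sets) rather than any particular library; the toy contours are synthetic:

```python
import numpy as np

def hausdorff_px(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Hausdorff distance between two 2-D point sets (N,2)/(M,2)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))

# Toy contours: the output cuff edge shifted by exactly 1 px vertically.
inp = np.array([[x, 0.0] for x in range(10)])
out = inp + np.array([0.0, 1.0])

dist = hausdorff_px(inp, out)
cuff_ok = dist <= 2.0   # the <= 2 px criterion at 2048 px long edge
```

The keypoint-match and edge-continuity checks would sit alongside this in the same per-image quality report.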
GPU-Accelerated Batch Throughput and Stability
- Given an NVIDIA T4 (16 GB) and 2048 px long-edge inputs, when processing a batch of 200 images, then median latency is ≤ 1.2 s/image and p95 latency is ≤ 2.5 s/image with zero out-of-memory errors. - Then average GPU utilization is ≥ 70% over the run and peak VRAM usage ≤ 12 GB. - Given 4 parallel workers with dynamic batching, when processing 400 images, then throughput scales to ≥ 3.2× versus a single worker and outputs remain deterministic (byte-identical to single-worker results).
Fabric Tension & Opening Width Controls
"As a boutique owner, I want quick sliders to adjust sleeve openness and fabric tension so that I can match the look to different tops and brand styling in seconds."
Description

Implement user-facing controls to dial sleeve opening width and perceived fabric tension, with real-time preview. Controls must map to physically plausible bounds per garment type and adjust drape intensity, fold frequency, and aperture shape without breaking cuff geometry. Provide presets (tight/regular/relaxed) and numeric sliders, allow per-image overrides during review, and expose settings via API and batch presets. Ensure latency under 300 ms per adjustment on a mid-range GPU and persist chosen values in project metadata for reproducibility.

Acceptance Criteria
Real-time preview responsiveness (≤300 ms per adjustment)
- Given a mid-range GPU environment and an image ≤24MP, When the user adjusts opening width or fabric tension via slider, numeric input, or preset, Then the preview updates with p95 end-to-end latency ≤300 ms measured over 30 consecutive adjustments. - Then UI input remains responsive (no frozen frames >500 ms) during adjustments. - Then the displayed numeric value and visual state reflect the applied setting within 50 ms of the render completing.
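"p95 over 30 consecutive adjustments" presumably means the nearest-rank 95th percentile of the measured end-to-end latencies; a minimal stdlib sketch with hypothetical measurements:

```python
import math

def p95(samples):
    """p95 via nearest-rank: smallest value with >= 95% of samples at or below it."""
    s = sorted(samples)
    rank = math.ceil(0.95 * len(s))
    return s[rank - 1]

# Hypothetical end-to-end latencies (ms) over 30 consecutive slider adjustments.
latencies = [120, 135, 140, 150, 155] * 5 + [180, 190, 210, 240, 280]
ok = p95(latencies) <= 300
```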
Garment-type constrained bounds and physically plausible mapping
- Given garment type = tank, tee, or blazer, When the user moves the opening width or tension slider, Then values are clamped within the configured bounds for that garment type and the UI prevents entry beyond bounds. - Then increasing fabric tension monotonically decreases fold amplitude and increases fold frequency in the sleeve interior region. - Then increasing opening width monotonically increases sleeve aperture area and maintains aperture shape consistent with the garment type (tank/tee: rounded; blazer: tailored oval). - Then no self-intersections, floating fragments, or holes are present in the synthesized sleeve region (artifact count = 0 under automated checks on ≥95% of the test set).
Cuff geometry and stitching preservation
- Given any adjustment within allowed bounds, Then the cuff boundary mask IoU between source and result ≥0.98. - Then average cuff-edge deviation ≤1 px and max deviation ≤3 px along ≥95% of the cuff contour length. - Then seam/stitch texture MSE within a ±12 px cuff band increases by ≤5% relative to source.
Preset behavior: Tight, Regular, Relaxed
- Given any garment type, When the user selects Tight, Regular, or Relaxed, Then opening width and fabric tension snap to the preset's mapped values for that garment type and the preview updates per performance criteria. - Then selecting a preset records the preset name and resolved numeric values in the project state. - Then the default preset for new images is Regular; selecting Reset reverts to Regular for the current garment type. - Then applying a preset to a selected set of images in batch sets those images to the same resolved numeric values and preserves independent per-image overrides thereafter.
Numeric sliders and direct input controls
- Given Fabric Tension and Opening Width controls, Then each displays its current numeric value and exposes a slider with step resolution ≥1% of its range and supports ± keyboard increment and direct typing. - When a user types a numeric value, Then invalid inputs are rejected with inline feedback and valid inputs are clamped to garment-type bounds. - Then slider, keyboard, and typed inputs produce equivalent settings and update the preview per performance criteria. - Then control changes are captured in undo/redo history and can be reset to prior values.
Per-image overrides during batch review
- Given a batch with project-level defaults, When the user adjusts settings on an image in review, Then the changes are stored as an override for that image and persist across navigation and session reload. - Then overridden images are visually flagged and can be filtered in the review grid. - Then a Bulk Apply action applies current settings to a selected subset of images without altering non-selected images. - Then clicking Reset on an overridden image removes the override and re-applies project-level settings.
API, batch presets, and reproducible metadata persistence
- Given the public API, When a client GETs/SETs fabric tension and opening width at project, preset, or image scope, Then the values are validated against garment-type bounds and persisted, returning 2xx on success and 4xx with error details on invalid input. - Then clients can create, list, update, and delete batch presets identified by stable IDs; presets include garment-type-specific mappings for both controls. - Then all applied settings (and preset IDs) are stored in project metadata and exported with the project. - Then re-rendering the same image with identical source and stored settings produces output identical within pixel-wise tolerance (≥99.9% pixels delta ≤1 or PSNR ≥60 dB).
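The re-render tolerance ("≥99.9% pixels delta ≤1 or PSNR ≥60 dB") can be verified as follows; the thresholds come from the criterion above, while the image data is synthetic:

```python
import numpy as np

def psnr_db(a: np.ndarray, b: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else float(10 * np.log10(peak**2 / mse))

def reproducible(a, b):
    """The spec's tolerance: >= 99.9% of values within delta <= 1, or PSNR >= 60 dB."""
    close = np.mean(np.abs(a.astype(np.int32) - b.astype(np.int32)) <= 1)
    return close >= 0.999 or psnr_db(a, b) >= 60.0

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
rerender = img.copy()
rerender[0, 0, 0] ^= 1          # a single 1-level difference is still in tolerance
```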
Garment-Type Presets & Auto-Detection
"As a catalog manager, I want the system to auto-select sleeve settings based on garment type so that I don’t have to tune each product manually during batch uploads."
Description

Create garment-type presets (tank, tee, long-sleeve, hoodie, blazer/coat) that set default sleeve opening and tension parameters, symmetry rules, and drape models. Add a lightweight classifier to auto-detect garment type from the product image and apply the corresponding preset, with confidence scoring and fallback to a default. Allow brand-specific custom presets that can be saved and shared across teams and included in one-click style-presets for batch runs. Log applied preset and detection confidence for auditability.

Acceptance Criteria
Core Presets Define Sleeve Parameters and Drape Models
Given a fresh workspace with SleeveFill enabled
When I open Garment-Type Presets
Then I see presets for tank, tee, long-sleeve, hoodie, and blazer/coat
And each preset exposes default values for sleeve opening width (cm), fabric tension (0–100), symmetry rules (on/off), and a drape model ID
And each value is within its allowed range: opening 0.5–40 cm, tension 0–100, symmetry rules boolean, drape model ID valid
When I apply the 'tee' preset to a sample image
Then the processing metadata includes the applied values for opening width, tension, symmetry, and drape model
And a Reset to Default action restores the shipped values for each preset
Auto-Detection Applies Preset With Confidence and Fallback
Given an input image without a manual preset override
When the garment-type classifier runs
Then it outputs exactly one of {tank, tee, long-sleeve, hoodie, blazer/coat} and a confidence score in [0,1]
And if confidence ≥ the configured threshold (default 0.70) the matching preset is applied
And if confidence < threshold or the result is Unknown, the configured Default preset is applied
And the applied preset name/ID and confidence are stored per image
And in a mixed batch of 50 images, 50/50 images complete with one applied preset each and zero images left unassigned
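The threshold-and-fallback rule reduces to a small pure function. A sketch under stated assumptions (the function name and the 'tee' default are illustrative, not the shipped API):

```python
# Illustrative sketch of the detection/fallback rule: apply the matching preset
# only at or above the confidence threshold, otherwise fall back to the default.
GARMENT_TYPES = {"tank", "tee", "long-sleeve", "hoodie", "blazer/coat"}

def resolve_preset(detected: str, confidence: float,
                   threshold: float = 0.70, default: str = "tee"):
    """Return (applied_preset, per-image record) for one image."""
    if detected in GARMENT_TYPES and confidence >= threshold:
        applied = detected
    else:
        applied = default          # Unknown or low-confidence -> Default preset
    return applied, {"detected": detected, "confidence": confidence,
                     "applied": applied, "threshold": threshold}

hi, hi_rec = resolve_preset("hoodie", 0.88)    # above threshold: applied as-is
lo, lo_rec = resolve_preset("hoodie", 0.42)    # below threshold: default
unk, _ = resolve_preset("unknown", 0.95)       # unrecognized label: default
```

Every image thus resolves to exactly one preset, which is what makes the "zero images left unassigned" clause enforceable.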
Manual Override of Detected Garment Type
Given an auto-detected garment type and confidence are displayed for an image
When a user selects a different garment type before processing or requests a reprocess with a chosen preset
Then the selected preset overrides the classifier for that image
And the override source is recorded as Manual with user ID and timestamp
And subsequent runs keep the override until the user clears it
And the output processing metadata reflects the overridden preset parameters
Brand-Specific Custom Presets: Create, Save, Share
Given I am in Brand A's workspace with Editor permissions
When I create a new garment preset named "Brand A Tee" and set sleeve opening, tension, symmetry, and drape model
Then the preset is saved with a unique ID and appears in the preset picker and API list for Brand A
And teammates in Brand A with View or higher can use the preset; Editors can modify it
And copying the preset to Brand B requires explicit Share/Copy action and results in a new preset ID under Brand B
And editing a preset creates a new version, and past jobs continue to reference their original version
Include Brand Presets in One-Click Style-Presets
Given a style-preset editor is open
When I attach a specific brand garment preset to the style-preset and save
Then the style-preset references that garment preset by ID
When I run a batch with that style-preset
Then the referenced garment preset is applied to each image unless a per-image manual override exists
And the system records source=StylePreset for the applied preset where applicable
Batch Processing Rules, Order of Precedence, and Performance
Given a batch of 200 mixed garment images
When processing runs
Then each image resolves its preset using this precedence: Manual Override > Style-Preset Binding > Auto-Detection above threshold > Default preset
And a run summary reports counts by applied preset source and average detection confidence
And ≤1% of images may fail preset assignment; each failure is retried once and surfaced with error codes
And garment-type detection adds no more than 10% to total batch processing time compared to SleeveFill without detection (measured on the same batch)
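The stated precedence chain can be captured in a resolver like the following sketch (field names and the 'tee' default are hypothetical):

```python
def resolve(image, threshold=0.70, default="tee"):
    """Resolve the applied preset per the stated precedence:
    Manual Override > Style-Preset Binding > Auto-Detection >= threshold > Default.
    `image` is a dict with optional keys; field names are illustrative.
    """
    if image.get("manual_override"):
        return image["manual_override"], "Manual"
    if image.get("style_preset"):
        return image["style_preset"], "StylePreset"
    det = image.get("detected")
    if det and image.get("confidence", 0.0) >= threshold:
        return det, "Auto"
    return default, "Default"

a = resolve({"manual_override": "blazer/coat", "style_preset": "tank"})
b = resolve({"style_preset": "tank", "detected": "hoodie", "confidence": 0.9})
c = resolve({"detected": "hoodie", "confidence": 0.9})
d = resolve({"detected": "hoodie", "confidence": 0.5})
```

Returning the source alongside the preset is what feeds the run summary's per-source counts.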
Audit Logging and Export of Preset Application
Given processing completes for any image
When I view the audit log via UI or API
Then I can see for each image: imageId, timestamp, detectedType, confidence, threshold, appliedPresetId/name, appliedPresetSource (Auto/Manual/Style/Default), userId (if Manual), workspaceId, and jobId
And logs are retained for at least 90 days and filterable by date range, garment type, preset, source, and confidence range
And I can export CSV or JSON for up to 10,000 records within 60 seconds, matching the applied filters
And subsequent edits to presets do not alter historical log entries
Cuff Geometry & Stitch Preservation
"As a product photographer, I want cuff edges and stitching to remain crisp and true to the original so that the final images look authentic and high-quality."
Description

Develop edge-aware segmentation and feature-preservation routines that lock cuff contours, seam lines, and visible stitching during sleeve reconstruction. Use high-frequency detail masks and contour constraints to prevent blurring, stretching, or misalignment of cuffs and hems. Include a quality gate that compares pre/post edge metrics (SSIM/edge density) and auto-corrects artifacts. Ensure compatibility with various cuff types (ribbed knit, rolled, buttoned, tailored) and support zoomed inspection in the review UI.

Acceptance Criteria
Cuff Contour Lock Under SleeveFill
Given a product photo with visible sleeve cuffs and armholes
When SleeveFill reconstructs sleeves using any supported garment preset (tank, tee, blazer)
Then the mean boundary deviation between pre- and post-process cuff contours is ≤ 1.5 px and the 95th percentile deviation is ≤ 3 px
And no cuff edge self-intersections or discontinuities are present
And the median seam line displacement within 10 px of the cuff edge is ≤ 2 px and the maximum is ≤ 4 px
And cuff hem length change is within ±2% of the original
Stitching & High-Frequency Detail Preservation
Given a cuff-region mask auto-detected before processing
When SleeveFill completes reconstruction
Then the cuff-region gradient-magnitude SSIM is ≥ 0.94 versus pre-process
And the cuff-region edge density decreases by < 5%
And the high-frequency (1–4 px) band power decreases by ≤ 10%
And detected stitch/rib line count changes by ≤ 10% and average line width change is ≤ 1 px
Contour Constraints Under Style Presets
Given opening width and fabric tension sliders set to any value from 0 to 100
When SleeveFill reconstructs sleeves
Then the resulting cuff opening width matches the configured value within ±2 px
And cuff curvature monotonicity is preserved (no sawtooth artifacts; curvature sign changes only at original inflection points)
And cuff circularity (for circular cuffs) or straightness (for straight hems) deviates by ≤ 5%
And cuff orientation yaw/pitch relative to the garment body deviates by ≤ 3° from the original
Quality Gate and Auto-Correct
Given pre- and post-process cuff-region metrics are computed (SSIM, edge density, band power, seam displacement)
When any metric fails thresholds (SSIM < 0.94 OR edge density drop ≥ 5% OR band power drop > 10% OR max seam displacement > 4 px)
Then an auto-correct pass is triggered up to 2 iterations with adjusted contour constraints and detail enhancement
And if thresholds are met after retries, the image is marked SleeveFill-OK
And if thresholds still fail, the item is flagged for manual review and is not auto-published
And all metric values and decision outcomes are logged per image for auditability
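The gate/retry/flag flow might look like this in outline; `quality_gate` encodes the thresholds above, while `correct_fn` stands in for the unspecified contour-constraint and detail-enhancement pass:

```python
def quality_gate(metrics):
    """True when all cuff-region thresholds from the criteria pass."""
    return (metrics["ssim"] >= 0.94 and metrics["edge_density_drop"] < 0.05
            and metrics["band_power_drop"] <= 0.10
            and metrics["max_seam_displacement_px"] <= 4)

def run_with_autocorrect(initial_metrics, correct_fn, max_retries=2):
    """Re-run an auto-correct pass up to 2 times; flag for review on failure.

    Every attempt's metrics and decision are logged for auditability.
    """
    metrics, log = initial_metrics, []
    for attempt in range(max_retries + 1):
        passed = quality_gate(metrics)
        log.append({"attempt": attempt, "passed": passed, **metrics})
        if passed:
            return "SleeveFill-OK", log
        if attempt < max_retries:
            metrics = correct_fn(metrics)
    return "manual-review", log

# Hypothetical run: the first pass fails on SSIM, one correction fixes it.
corrected = iter([{"ssim": 0.96, "edge_density_drop": 0.02,
                   "band_power_drop": 0.05, "max_seam_displacement_px": 2}])
status, log = run_with_autocorrect(
    {"ssim": 0.90, "edge_density_drop": 0.02,
     "band_power_drop": 0.05, "max_seam_displacement_px": 2},
    correct_fn=lambda m: next(corrected))
```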
Cuff Type Compatibility Coverage
Given a labeled test set of ≥ 100 images per cuff type (ribbed knit, rolled, buttoned, tailored)
When SleeveFill is run with default presets
Then ≥ 95% of images per cuff type meet all cuff-preservation thresholds defined for contours, stitching, and seams
And no cuff type falls below a 90% pass rate
And for buttoned cuffs, button-to-buttonhole center distance changes by ≤ 1 px and button edge aspect ratio changes by ≤ 5%
And for rolled cuffs, roll edge layering count is preserved within ±1 layer and edge continuity has no gaps > 2 px
And for ribbed cuffs, estimated rib frequency changes by ≤ 8%
Zoomed Inspection in Review UI
Given the review UI with an edited image
When the user zooms to the cuff region at 200%, 300%, and 400% and toggles before/after
Then zoom change latency is ≤ 120 ms at p95 and before/after toggle latency is ≤ 80 ms at p95
And image tiles render at device pixel ratio with 1:1 pixel mapping at 100% zoom (no interpolation blur)
And a cuff-edge overlay can be toggled and aligns with visible edges within ≤ 1 px
And pinch and mouse-wheel zoom maintain focus under the cursor/focal point consistently
Batch SleeveFill Processing Pipeline
"As a seller who uploads large catalogs, I want SleeveFill to run reliably in batches with clear progress and fast turnaround so that my listings are ready quickly."
Description

Integrate SleeveFill into the batch pipeline with parallel processing, idempotent job orchestration, and resumable tasks. Support processing hundreds of images concurrently with configurable concurrency, backoff/retry on transient failures, and timeouts. Provide progress tracking, per-item logs, and artifact tagging so SleeveFill outputs can be traced and rolled back. Ensure end-to-end throughput aligns with PixelLift’s promise (hundreds in minutes) and expose pipeline controls via API/CLI and the web dashboard.

Acceptance Criteria
Parallel Processing Throughput under Target Load
Given a batch of 400 2048x2048 JPEG product images stored in S3 and a reference worker pool (8 GPU workers with NVIDIA T4 16GB, coordinator with 32 vCPU/64GB RAM)
When the SleeveFill batch pipeline executes with maxConcurrency=64
Then at least 95% of items finish within 10 minutes and total batch wall-clock time is under 12 minutes
And the item error rate (excluding invalid inputs) is below 1%
And no worker experiences sustained queue starvation (p95 queue wait < 5s)
Configurable Concurrency & Runtime Adjustments
Given the user sets concurrency via API, CLI, or Dashboard to N where 1 ≤ N ≤ 128
When a SleeveFill batch is running
Then active SleeveFill tasks in the system never exceed N concurrently
And changing N at runtime takes effect within 10 seconds and is reflected in metrics and /batches/{id} status
And invalid N values return 400 with a machine-readable error code
Idempotent Orchestration with Exactly-Once Effects
Given each item is submitted with an idempotencyKey and the client resubmits the same item one or more times within 24 hours
When the SleeveFill pipeline processes the item
Then only one execution occurs and a single output artifactId is produced
And duplicate submissions return 200 with the existing result and an Idempotency-Replay header
And no duplicate logs, charges, or artifacts are created
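The exactly-once contract keyed by idempotencyKey can be illustrated with an in-memory sketch; a real pipeline would back the result map with durable storage and surface the Idempotency-Replay flag as an HTTP header:

```python
class IdempotentSubmitter:
    """Sketch of exactly-once item execution keyed by an idempotency key.

    The in-memory dict only illustrates the replay contract: duplicates
    return the stored artifact without re-executing the work.
    """
    def __init__(self, process):
        self.process = process          # the actual SleeveFill execution
        self.results = {}               # idempotency key -> artifact id
        self.executions = 0

    def submit(self, key, item):
        if key in self.results:         # duplicate: replay stored result
            return {"artifact_id": self.results[key], "replay": True}
        self.executions += 1
        artifact_id = self.process(item)
        self.results[key] = artifact_id
        return {"artifact_id": artifact_id, "replay": False}

sub = IdempotentSubmitter(process=lambda item: f"artifact-{item}")
first = sub.submit("k1", "img-001")
dupe = sub.submit("k1", "img-001")      # one execution, same artifact id
```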
Resumable Batch via Durable Checkpointing
Given a SleeveFill batch of 250 items has 40% completed when the coordinator crashes or is restarted
When the batch is resumed
Then previously completed items are not reprocessed (0 duplicate executions)
And only incomplete items are re-queued and picked up within 30 seconds of resume
And progress metrics (total, completed, failed, in-progress, percent, ETA) persist accurately across the restart
Backoff, Retry, and Timeout Policy for Transient Failures
Given a transient error (e.g., 5xx from storage or model endpoint timeout) occurs for an item
When the item fails
Then it is retried with exponential backoff (initial=2s, factor=2, jitter=±20%) up to 3 attempts
And any single SleeveFill task exceeding 120s processing time is canceled and retried once
And after max attempts the item is marked Failed with errorCode, message, and correlationId, visible via API/CLI/Dashboard
And successful retries yield a single final artifact bound to the original itemId (no duplicates)
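The retry schedule implied by the policy (initial=2s, factor=2, jitter=±20%, 3 attempts) can be generated directly. This sketch only computes the delays; it does not perform the retries themselves:

```python
import random

def backoff_delays(initial=2.0, factor=2.0, jitter=0.2, attempts=3, seed=None):
    """Exponential backoff per the stated policy: base delays 2s, 4s, 8s,
    each perturbed by uniform +/-20% jitter to avoid thundering herds."""
    rng = random.Random(seed)
    delays = []
    for n in range(attempts):
        base = initial * factor**n
        delays.append(base * (1 + rng.uniform(-jitter, jitter)))
    return delays

delays = backoff_delays(seed=42)
within_bounds = all(
    2.0 * 2**n * 0.8 <= d <= 2.0 * 2**n * 1.2 for n, d in enumerate(delays)
)
```

The jitter keeps simultaneous failures from retrying in lockstep, which matters at the batch sizes this pipeline targets.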
Per-Item Logs, Progress, and Artifact Tagging & Rollback
Given a SleeveFill batch is running or completed
When the user requests details for item i via API, CLI, or Dashboard
Then structured per-item logs with ISO-8601 timestamps are retrievable within 2 seconds and include stage, duration, and warnings/errors
And the output artifact is tagged with batchId, itemId, sleeveFillVersion, presetId, timestamp, and idempotencyKey
And a rollback operation restores the previous artifact version within 30 seconds and records the rollback event in the item log
Unified Pipeline Controls via API, CLI, and Dashboard
Given the user has permission to manage batches
When they submit, pause, resume, cancel, or delete a SleeveFill batch via API, CLI, or Dashboard
Then the action is authorized and takes effect within 5 seconds
And the response includes a standardized status payload with batchId, state, totals, and percentComplete
And Dashboard live updates reflect the new state within 3 seconds
And invalid state transitions return 409 with a machine-readable error code
Manual Sleeve Mask Fine-Tune Tools
"As a retoucher, I want simple manual controls to fix rare sleeve artifacts so that I can deliver perfect images without leaving PixelLift."
Description

Offer optional fine-tune tools for edge cases: a smart brush to nudge sleeve apertures, an anchor-point gizmo to adjust symmetry axes, and a toggle to freeze specific regions. Edits must be non-destructive, recorded as layered adjustments, and re-playable in batch via saved presets. Provide undo/redo, before/after diff, and artifact flagging that can feed back into model improvement. Keep the interaction lightweight and consistent with existing PixelLift retouch UI patterns.

Acceptance Criteria
Smart Brush Sleeve Aperture Nudge
- Given a product photo with detected sleeves and SleeveFill enabled - When the user selects the Smart Brush and paints within 5 px of the sleeve aperture edge - Then the sleeve aperture contour adjusts locally following brush strokes with edge-aware constraints, without altering cuff geometry beyond 1 px - Given any brush stroke - When the stroke completes - Then a non-destructive "Smart Brush Adjustment" layer is created with editable mask and opacity, and the base image remains unchanged - Given continuous painting for up to 3 seconds - When rendering occurs - Then interactive preview latency is ≤ 80 ms per frame and final refinement applies within 300 ms after stroke end - Given strokes that intersect frozen regions - When applied - Then adjustments do not modify pixels within the freeze mask
Anchor-Point Symmetry Axis Adjustment
- Given a garment type (tank/tee/blazer) is selected - When the user drags the left/right anchor points or rotates the symmetry gizmo - Then the symmetry axis updates in real time and SleeveFill recomputes sleeve shapes to maintain bilateral symmetry within ±2 px - Given the gizmo is near standard angles - When within 5° - Then it snaps to 0°, 45°, or 90° increments toggleable via Shift key - Given any adjustment - When committed - Then a named "Symmetry Adjustment" layer is added and can be undone/redone with Ctrl/Cmd+Z/Y - Given numeric input is used - When axis angle or offset is typed - Then values accept precision to 0.1° and 0.5 px respectively
Region Freeze Toggle for Protected Areas
- Given the user toggles Freeze Regions ON - When painting a freeze mask - Then a visible blue overlay appears at 40–60% opacity and the mask is stored as a separate "Freeze Mask" layer - Given SleeveFill recomputation runs - When a freeze mask exists - Then pixels under the mask change by no more than ΔE2000 ≤ 1 or displacement ≤ 1 px - Given the project is saved as a preset - When reloaded - Then the freeze mask and toggle state persist and apply during batch replay - Given brush conflicts with freeze - When overlap occurs - Then freeze priority supersedes Smart Brush and Symmetry adjustments in the overlapped region
Undo/Redo, History, and Before/After Diff
- Given any fine-tune action (Brush, Gizmo, Freeze) - When performed - Then it is recorded as a discrete history step with tool name, timestamp, and parameters - Given undo or redo is invoked - When Ctrl/Cmd+Z or Ctrl/Cmd+Shift+Z is pressed - Then the state rolls back/forward correctly up to the last 50 steps without loss of layers - Given the user toggles Before/After - When pressing the '\' key or clicking the Diff icon - Then the UI shows a split or swipe diff with synchronized zoom/pan and no more than 50 ms toggle latency - Given memory pressure occurs - When history exceeds 50 steps - Then the oldest steps are pruned with a warning toast, and current edits remain non-destructive
Preset Save and Batch Replay of Fine-Tunes
- Given edits are completed on a reference image - When the user saves a preset named and scoped to garment type - Then the preset stores ordered layers, parameters, masks, and gizmo settings with version metadata - Given a batch of N images (N up to 500) - When the preset is applied - Then the system replays adjustments with auto-alignment within ±3 px on matched garment landmarks, and completes at ≥ 40 images/min on GPU-enabled hardware - Given any image fails to auto-align - When confidence < 0.8 - Then the job logs the image as "Needs Review" without applying destructive changes and continues processing the rest - Given the batch completes - When results are opened - Then the user can toggle per-image Before/After and see a summary of successes, failures, and average processing time
Artifact Flagging and Model Feedback
- Given the user observes artifacts (warping, seams, halo) - When clicking Flag Artifact and brushing areas - Then the app captures a redaction-safe crop, masks, tool history, and environment metadata and queues it for model feedback - Given privacy controls - When "Include original" consent is unchecked - Then only the edited output and masks are uploaded; otherwise the original is included - Given a submission is queued - When offline - Then it retries in the background with exponential backoff and displays status (Queued, Sent) in the notifications panel - Given a flagged item is used for training - When anonymized IDs are generated - Then no PII or seller store identifiers are included in the payload
UI Consistency with PixelLift Retouch Patterns
- Given the fine-tune tools panel - When rendered - Then it uses existing design tokens for spacing, colors, and typography and reuses brush iconography and shortcuts (B for brush, O for freeze, V for gizmo) - Given hover and tooltips - When hovering controls - Then tooltips follow PixelLift style and present shortcut hints; tool hit target sizes are ≥ 44 px on touch devices - Given light/dark mode - When toggled - Then contrast ratios meet WCAG AA with no clipping or icon misalignment - Given localization - When set to supported locales - Then all labels and tooltips are translatable using existing i18n keys without truncation in German and Spanish layouts

SeamFlow

Maintains pattern and seam continuity through reconstructed areas. Detects and extends stripes, plaids, and darts with smart warping and anchor points, preventing visual breaks that make apparel look cheap—delivering premium, studio-grade realism at scale.

Requirements

Pattern & Seam Auto-Detection
"As an independent seller batch-uploading apparel photos, I want automatic detection of patterns and seams so that SeamFlow can align textures without manual markup."
Description

Automatically identifies repeating textile patterns (e.g., stripes, plaids, herringbone) and structural lines (seams, darts, hems) in apparel images. Produces pixel-accurate masks and vector fields indicating pattern direction and phase continuity across panels. Integrates with PixelLift’s existing garment/region segmentation to avoid backgrounds and accessories. Outputs confidence scores per region to drive downstream warp and inpainting decisions. Reduces manual markup, accelerates batch throughput, and establishes the canonical geometry inputs required for SeamFlow’s continuity operations.

Acceptance Criteria
Detect repeating textile patterns on single garment front view
Given an apparel image from the approved validation set containing a single garment region with a visible repeating textile pattern (e.g., stripes, plaids, herringbone)
When Pattern & Seam Auto-Detection is executed
Then the garment region is assigned a pattern_type that achieves F1 ≥ 0.92 on the validation set
And a per-pixel direction field is produced with mean angular error ≤ 5° versus ground truth
And phase continuity across adjacent garment panels exhibits drift ≤ 3 px over any 200 px span
And the direction/phase maps contain no NaNs and have dimensions equal to the input image
Auto-detect structural lines (seams, darts, hems) with pixel-accurate masks
Given annotated test images with ground-truth seams, darts, and hems within garment regions
When Pattern & Seam Auto-Detection is executed
Then per-class binary masks are produced with IoU ≥ 0.85 against ground truth
And mask boundary mean absolute error ≤ 2 px
And along detected structural lines, the local tangent direction field has median angular error ≤ 10°
And each output mask lies fully within the garment segmentation (overlap IoU with garment region ≥ 0.98)
Exclude backgrounds and accessories via segmentation integration
Given images containing backgrounds and accessories (e.g., belts, jewelry, bags) alongside garment regions
When Pattern & Seam Auto-Detection is executed with existing garment/region segmentation
Then no pattern or seam detections are emitted outside garment regions, with false-positive rate ≤ 0.5% over background pixels
And accessory regions produce no pattern/seam outputs (zero masks/vector fields) and have per-region confidence ≤ 0.2
And all produced masks/vector fields are fully contained within garment regions (IoU overlap ≥ 0.98)
Output canonical geometry maps and schema compliance
Given an API request to run Pattern & Seam Auto-Detection on a garment image
When processing completes
Then the JSON response conforms to schema v1 and includes per garment region: region_id, pattern_type, confidence ∈ [0,1], mask (RLE or PNG), direction_map (H×W×2 or angle map in radians), and phase_map (H×W)
And all maps align with the input image dimensions and coordinate system (origin top-left; 0 radians points right; angles increase counter-clockwise)
And any referenced URIs are reachable and checksums match declared values
And the response passes strict JSON Schema validation with no warnings or errors
Confidence scoring calibration for downstream decisions
Given the approved validation set and a decision threshold τ = 0.7 on per-region confidence
When detections are thresholded at τ
Then precision ≥ 0.95 and recall ≥ 0.85 for both pattern detection and structural line detection
And the expected calibration error (ECE) of confidence scores ≤ 0.05
And ROC-AUC for positive vs negative regions ≥ 0.97
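ECE as used here is conventionally the bin-weighted gap between mean confidence and empirical accuracy. A minimal sketch with perfectly calibrated toy data (the binning scheme is the common equal-width variant; the spec does not pin one down):

```python
import numpy as np

def expected_calibration_error(conf, correct, n_bins=10):
    """ECE: weighted average, over equal-width confidence bins, of the absolute
    gap between mean confidence and empirical accuracy in each bin."""
    conf, correct = np.asarray(conf, float), np.asarray(correct, float)
    bins = np.clip((conf * n_bins).astype(int), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        in_bin = bins == b
        if in_bin.any():
            weight = in_bin.mean()
            ece += weight * abs(conf[in_bin].mean() - correct[in_bin].mean())
    return float(ece)

# Perfectly calibrated toy data: 0.8-confidence regions are right 80% of the time.
conf = [0.8] * 10
correct = [1] * 8 + [0] * 2
ece = expected_calibration_error(conf, correct)
```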
Batch processing throughput and determinism at scale
Given a batch of 500 images at 2048×2048 resolution processed on the reference inference node with concurrency = 8
When Pattern & Seam Auto-Detection is executed end-to-end
Then average runtime per image ≤ 6 s and p95 runtime ≤ 10 s
And peak memory usage ≤ 4 GB with zero crashes or timeouts
And repeated runs with a fixed seed yield deterministic outputs: mask self-IoU ≥ 0.995 and direction field mean angular difference ≤ 1° between runs
Anchor Point Snapping & Guides
"As a retoucher using PixelLift, I want to place anchors that snap to true seam lines so that I can precisely control continuity where the algorithm is uncertain."
Description

Provides an interactive tool for placing, editing, and removing anchor points and guide paths along detected seams and darts. Anchors snap to high-confidence seam edges and pattern phase lines, with adjustable tolerance and magnet strength. Supports symmetry mirroring, multi-select, and constraint types (fixed, elastic, rotational) to steer continuity corrections where detection is ambiguous. Non-destructive: anchors are stored in project metadata and can be reused across variants. Integrates into PixelLift’s editor and is callable via API for scripted workflows.

Acceptance Criteria
Snapping Anchors to Detected Seams and Phase Lines
- Given an image with detected seam/dart/phase lines confidence >= 0.8, When the user places an anchor within 12 px of the nearest detected line and magnet strength is 100%, Then the anchor snaps to the nearest point on that line within 1 px. - Given magnet strength set to 0%, When the user places an anchor within 12 px of a detected line, Then the anchor does not snap and remains at the pointer location. - Given two candidate lines within tolerance, When a snap tie-breaker occurs, Then the anchor snaps to the line with the higher confidence; if confidence is equal, Then the anchor snaps to the geometrically closer line. - Given an existing anchor is dragged across a detected line, When the user releases the drag, Then the anchor snaps according to current tolerance and magnet strength settings.
Adjustable Snap Tolerance and Magnet Strength Controls
- Given the snap tolerance control supports 0-20 px, When the user sets tolerance to T, Then anchors only snap to detected lines if the pointer is within T px. - Given magnet strength supports 0-100%, When set to M, Then the snap offset applied during placement or drag equals M% of the distance to the target line (rounded to nearest pixel). - Given tolerance set to 0 px, When placing or dragging anchors, Then no snapping occurs. - Given tolerance increased from 5 px to 10 px, When placing an anchor 8 px away from a line, Then snapping occurs.
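The magnet-strength rule ("snap offset equals M% of the distance to the target line, rounded to the nearest pixel, only within tolerance") reduces to a short interpolation. Coordinates and names below are illustrative:

```python
import math

def snap(pointer, target, tolerance_px, magnet_pct):
    """Move the anchor magnet_pct% of the way toward the nearest target-line
    point, but only when the pointer is within tolerance_px of it."""
    px, py = pointer
    tx, ty = target
    dist = math.hypot(tx - px, ty - py)
    if tolerance_px == 0 or dist > tolerance_px:
        return pointer                       # no snapping outside tolerance
    t = magnet_pct / 100.0
    return (round(px + (tx - px) * t), round(py + (ty - py) * t))

full = snap((10, 0), (18, 0), tolerance_px=10, magnet_pct=100)   # snaps onto line
none = snap((10, 0), (18, 0), tolerance_px=10, magnet_pct=0)     # stays put
out_of_range = snap((0, 0), (0, 15), tolerance_px=10, magnet_pct=100)
```

Tie-breaking between two candidate lines (higher confidence, then geometric proximity) happens before this step, when the target point is chosen.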
Symmetry Mirroring of Anchors and Guides
- Given a symmetry plane is defined, When the user places or edits an anchor on one side, Then a mirrored anchor is created or updated on the opposite side with the same constraint type and parameters reflected across the plane.
- Given mirroring is toggled off, When the user places or edits anchors, Then mirrored counterparts are not created and existing mirrored anchors become independent.
- Given mirroring is enabled on an asymmetric garment, When the user attempts to mirror, Then the system prompts to select a custom plane or disable mirroring; and if a custom plane is set, Then mirroring uses that plane.
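Reflecting an anchor across the symmetry plane is plain 2-D geometry. A sketch, assuming the plane is given as a point on it plus a unit normal (the function name is hypothetical):

```python
def mirror_anchor(anchor, plane_point, plane_normal):
    """Reflect an anchor (x, y) across a symmetry plane described by a
    point on the plane and its unit normal (nx, ny)."""
    ax, ay = anchor
    px, py = plane_point
    nx, ny = plane_normal
    # Signed distance from the anchor to the plane along the normal.
    d = (ax - px) * nx + (ay - py) * ny
    # Reflection: step back twice the signed distance along the normal.
    return (ax - 2 * d * nx, ay - 2 * d * ny)
```

For a vertical plane at x = 5 (normal (1, 0)), the anchor (8, 3) mirrors to (2, 3); constraint type and parameters would be copied to the mirrored anchor unchanged.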
Multi-Select and Batch Edit of Anchors/Guides
- Given at least two anchors are selected via marquee or shift-click, When the user changes the constraint type to Elastic with stiffness 0.6, Then all selected anchors adopt that type and parameter.
- Given a mixed selection of anchors and guide paths, When the user presses Delete, Then all selected elements are removed in a single operation and a single undo step is created.
- Given N elements are selected, When the user drags the selection, Then relative spacing between selected elements is preserved and snapping rules apply to each element individually.
Constraint Types Influence SeamFlow Continuity
- Given an anchor with Fixed constraint, When continuity correction runs, Then the local pose at that anchor changes by no more than 0.5 px translation and 0.2 degrees rotation.
- Given an anchor with Elastic constraint stiffness s in [0,1], When correction runs, Then displacement magnitude at that anchor is less than or equal to (1 - s) times the unconstrained displacement.
- Given an anchor with Rotational constraint, When correction runs, Then the anchor may translate up to 2 px but its local orientation changes by no more than 0.2 degrees.
- Given conflicting constraints on connected anchors, When correction runs, Then the solver prioritizes Fixed over Rotational over Elastic and displays a warning badge on affected anchors.
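The displacement caps implied by these criteria can be expressed directly. This is an illustrative sketch of the per-anchor limits, not the actual solver:

```python
# Priority order from the criteria: Fixed wins over Rotational over Elastic.
PRIORITY = {"fixed": 0, "rotational": 1, "elastic": 2}

def constrained_displacement(kind, unconstrained_px, stiffness=0.0):
    """Cap the solver's translation at an anchor by constraint type:
    Fixed <= 0.5 px, Rotational <= 2 px,
    Elastic <= (1 - stiffness) * unconstrained displacement."""
    if kind == "fixed":
        return min(unconstrained_px, 0.5)
    if kind == "rotational":
        return min(unconstrained_px, 2.0)
    if kind == "elastic":
        return (1.0 - stiffness) * unconstrained_px
    raise ValueError("unknown constraint type")
```

So an elastic anchor with stiffness 0.6 and a 10 px unconstrained displacement moves at most 4 px, while a fixed anchor in the same field moves at most 0.5 px.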
Non-Destructive Metadata and Reuse Across Variants
- Given a project with anchors and guides, When the user saves the project, Then all anchors, guides, constraint types, parameters, and mirroring settings are persisted in project metadata.
- Given a variant image of the same SKU is loaded, When the user applies the saved anchor set, Then anchors are transformed to the variant using detected alignment and appear with a median error <= 3 px relative to corresponding seams.
- Given the user performs edits, When undo or redo is used up to 50 steps, Then no raster pixels are permanently altered and only metadata changes are applied and reversible.
- Given the project is exported and reimported, When using a .plift file, Then all metadata including element IDs is preserved.
API Operations for Scripted Workflows
- Given a valid auth token, When POST /projects/{id}/anchors is called with anchors, guides, constraints, tolerance, and magnet strength, Then the server creates the items and returns 201 with IDs and positions.
- Given PATCH /projects/{id}/anchors/{anchorId} is called to update a constraint type or position, Then the response is 200 and the change appears in the editor within 1 second.
- Given GET /projects/{id}/anchors is called, Then it returns a paginated list including total count and an ETag header for caching.
- Given an invalid constraint type is submitted, When POST is called, Then the API returns 400 with error code INVALID_CONSTRAINT and details.
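A scripted client might pre-validate the body it sends to POST /projects/{id}/anchors, mirroring the 400 INVALID_CONSTRAINT contract. A sketch only: the JSON field names (`anchors`, `tolerance`, `magnetStrength`) are assumptions, since the spec names the parameters but not the wire format.

```python
VALID_CONSTRAINTS = {"fixed", "elastic", "rotational"}

def build_anchor_payload(anchors, tolerance_px, magnet_strength):
    """Assemble a request body for POST /projects/{id}/anchors and
    pre-check constraint types, mirroring the API's 400 response with
    error code INVALID_CONSTRAINT. Field names are illustrative."""
    for a in anchors:
        if a.get("constraint") not in VALID_CONSTRAINTS:
            return {"error": "INVALID_CONSTRAINT",
                    "details": f"unsupported type: {a.get('constraint')}"}
    return {"anchors": anchors,
            "tolerance": tolerance_px,
            "magnetStrength": magnet_strength}
```

Validating client-side avoids a round trip, but the server remains the source of truth for the 400/INVALID_CONSTRAINT behaviour.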
Continuity Smart Warp Engine
"As a boutique owner, I want patterns to align seamlessly across reconstructed areas so that my listings look premium and trustworthy."
Description

Computes localized, non-linear warp fields that align pattern phase and direction across seam boundaries and reconstructed areas without distorting garment silhouette. Uses detected pattern vectors and user anchors as constraints to minimize phase error while preserving fabric drape. Includes guardrails for skin/hardware exclusion and per-region warp strength. GPU-accelerated for near real-time previews and scalable batch processing. Outputs reversible warp parameters saved to sidecar metadata for auditability and rollbacks.

Acceptance Criteria
Stripe Continuity Across Side Seam
Given an apparel photo with vertical stripes crossing a side seam and a reconstructed area within 30 px of the seam
When the Continuity Smart Warp Engine is applied with default pattern detection
Then the normalized stripe phase difference across the seam boundary shall be ≤ 0.10 cycles (p95)
And the stripe orientation difference across the seam shall be ≤ 2° (p95)
And silhouette preservation shall meet IoU ≥ 0.98 between pre- and post-warp garment masks
And the seam continuity detector shall report 0 discontinuities > 1 px along the seam path
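"Normalized stripe phase difference" is a cyclic quantity: phases one period apart are identical, so the difference must be wrapped. A minimal sketch of how the metric could be evaluated (the function is hypothetical):

```python
def phase_difference_cycles(phase_a, phase_b):
    """Normalized phase difference across a seam, in cycles.

    Phases are fractions of the stripe period in [0, 1); the wrapped
    difference lies in [0, 0.5], so phases 0.95 and 0.05 are 0.10 cycles
    apart, not 0.90."""
    d = abs(phase_a - phase_b) % 1.0
    return min(d, 1.0 - d)
```

The p95 of this value over sampled points along the seam is what the ≤ 0.10-cycle threshold above would be checked against.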
Plaid Alignment Across Reconstructed Hem
Given a plaid-pattern garment with an inpainted hem region occupying ≤ 10% of garment mask area
When the engine reconstructs and warps to extend the plaid through the reconstructed region
Then the RMS plaid phase error across the hem boundary shall be ≤ 0.12 cycles
And the orientation mismatch across the boundary shall be ≤ 3° (p95)
And the boundary cross-correlation peak across the transition shall be ≥ 0.90
And local checker cell aspect-ratio change in the reconstructed region shall be ≤ 5% (p95)
Anchor-Based Drape Preservation
Given 2–6 user anchor points placed on folds and edges with drape paths defined
When the warp field is computed with anchors as hard constraints
Then per-anchor positional deviation shall be ≤ 1 px (p95)
And mean curvature change along each drape path shall be ≤ 5%
And path length change shall be ≤ 1%
When a single anchor is moved by the user
Then the preview shall update in ≤ 150 ms (p95) at 12 MP on reference hardware (RTX 3060 12 GB or Apple M1 Pro)
Skin and Hardware Exclusion Guardrails
Given skin and hardware exclusion masks are supplied or auto-detected
When the warp is applied
Then per-pixel displacement within excluded masks shall be ≤ 0.25 px (p99)
And no warp shall cross into excluded masks (boundary overshoot = 0 px)
And CIELAB ΔE within excluded masks shall be ≤ 1.0 relative to pre-warp
And garment-mask contamination from excluded regions shall be ≤ 0.1% of garment area
Per-Region Warp Strength Controls
Given three region masks with strengths {0.0, 0.5, 1.0}
When the warp is computed
Then mean displacement magnitude in strength 0.0 regions shall be ≤ 0.25 px (p95)
And the ratio of mean displacement (0.5 strength):(1.0 strength) shall be 0.5 ± 0.1
And strength values shall persist in brand preset and be serialized to sidecar metadata
When a region's strength is toggled from 1.0 to 0.0
Then that region's appearance shall match pre-warp with PSNR ≥ 40 dB within the region mask
GPU-Accelerated Preview and Batch Throughput
Given a 12–24 MP image with a garment mask covering 30–60% of pixels on reference hardware (RTX 3060 12 GB or Apple M1 Pro)
When a single anchor or strength slider is adjusted
Then preview latency shall be ≤ 120 ms (p95) at 12 MP and ≤ 250 ms (p95) at 24 MP
And per-session warm-up/compile time shall be ≤ 1.5 s
Given an AWS g5.xlarge instance (NVIDIA A10G)
When batch-processing 500 images at 12 MP
Then sustained warp throughput shall be ≥ 4 images/sec (p50) and ≥ 3 images/sec (p95)
Reversible Sidecar Warp Parameters and Auditability
Given any processed image
When saving results
Then a sidecar shall be written containing: schema version, source content hash, warp coefficients/field, anchors, pattern vectors, per-region strengths, preset ID, device info, timestamps, and user ID
And sidecar size shall be ≤ min(3 MB, 2% of source image size)
When the inverse warp is applied using the sidecar to the warped image
Then PSNR within the garment mask vs. original shall be ≥ 40 dB and SSIM ≥ 0.99
And re-applying the forward warp after inversion shall yield warp-field RMSE ≤ 0.25 px vs. the original forward warp
When the source file is renamed or moved within the library
Then sidecar association via content hash shall succeed in ≥ 99.9% of cases over 1000 operations
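The ≥ 40 dB round-trip requirement uses the standard PSNR definition. A minimal sketch of the check, assuming 8-bit imagery (peak 255); the function name is illustrative:

```python
import numpy as np

def psnr_db(original, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images.

    The sidecar round-trip criterion requires >= 40 dB inside the
    garment mask after applying the inverse warp."""
    diff = original.astype(np.float64) - restored.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # bit-identical round trip
    return 10.0 * np.log10(peak ** 2 / mse)
```

A mean squared error of 1.0 on 8-bit data already gives about 48 dB, so the 40 dB bar tolerates only very small residuals from the inverse warp.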
Pattern-Aware Inpainting & Extension
"As a photographer, I want inpainting that extends stripes and plaids realistically into missing regions so that background removal or cropping doesn’t break garment realism."
Description

Synthesizes missing or occluded textile content by extending detected patterns with phase-consistent texture generation. Maintains stripe/plaid alignment through hems, folds, and cropped edges, and harmonizes color/lighting with the source fabric. Edge-aware blending avoids halos from background removal. Falls back to neutral fill when confidence is low, with automatic flagging for review. Integrates with the Smart Warp Engine to jointly optimize inpaint and warp for continuity.

Acceptance Criteria
Stripe Continuity Across Hem/Fold
Given a garment photo with striped fabric where a hem or fold occludes the pattern by up to 15% of the stripe period
When SeamFlow performs pattern-aware inpainting and extension
Then the dominant stripe frequency in the inpainted area differs by ≤ 10% from the adjacent source region (FFT-based)
And stripe orientation delta ≤ 5°
And stripe phase misalignment measured at the seam ≤ 2 px RMS
And inpainted-to-source boundary SSIM ≥ 0.92 within a 16 px band
And no halo wider than 1 px is detected along fabric–background edges.
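The "FFT-based" dominant-frequency comparison can be sketched on a 1-D intensity profile taken perpendicular to the stripes. A simplified illustration of the measurement, not SeamFlow's implementation:

```python
import numpy as np

def dominant_stripe_frequency(profile):
    """Dominant spatial frequency (cycles per sample) of a 1-D intensity
    profile sampled perpendicular to the stripes, via the FFT magnitude
    peak. Comparing this value between the inpainted area and the
    adjacent source region gives the <= 10% difference check."""
    spectrum = np.abs(np.fft.rfft(profile - np.mean(profile)))
    k = int(np.argmax(spectrum[1:])) + 1  # skip the DC bin
    return k / len(profile)
```

In practice the profile would be averaged over many scanlines to suppress noise before taking the FFT; the single-line version shown here keeps the idea visible.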
Plaid Extension at Cropped Edge
Given a plaid fabric cropped at the image boundary with ≥ 32 px of valid pattern context
When the canvas is extended or missing pixels are filled
Then 2D cross-correlation between the inpainted area and the extrapolated plaid lattice ≥ 0.80
And intersection points of plaid lines deviate ≤ 3 px from their projected locations
And color ΔE00 ≤ 2.0 vs adjacent source patches
And misalignment artifacts > 3 px are absent in ≥ 95% of boundary pixels.
Color & Lighting Harmonization of Inpainted Regions
Given inpainted regions ≥ 500 px²
When compared to a 10 px dilated source boundary band
Then mean CIEDE2000 ΔE ≤ 2.0 and 95th percentile ≤ 4.0
And luminance (L*) mean difference ≤ 5% and variance ratio between 0.8 and 1.2
And gradient magnitude across the seam 95th percentile ≤ 1.1× that of the source band.
Edge-Aware Blending Without Halos
Given an item with background removed and fabric silhouetted against transparent or solid background
When edge-aware blending is applied in tandem with inpainting
Then the proportion of edge pixels with halo/fringe artifacts (local contrast spike > 10%) ≤ 0.5%
And halo band width ≤ 1 px at the 95th percentile
And chroma shift across the silhouette edge ΔC* ≤ 3 for 95% of edge pixels
And the alpha matte is monotonic across a 3 px edge band (no sign flips in gradient).
Smart Warp + Inpaint Anchor Consistency
Given auto-detected or user anchor points on pattern features spanning an occluded area
When Smart Warp and inpainting are jointly optimized
Then anchor reprojection error ≤ 2 px RMS
And local texture scale drift across inpainted region ≤ 5% vs adjacent source
And lattice lines remain straight with max deviation ≤ 2 px over 128 px segments
And the warp field Jacobian determinant > 0 across the garment area (no fold-overs).
Low-Confidence Fallback and Review Flagging
Given pattern detection confidence < 0.70 or predicted seam misalignment > 3 px
When processing the region
Then the system applies a neutral fill computed as median color ± low-amplitude noise (σ ≤ 2 in L*)
And writes needs_review=true and fallback_mask to the output metadata
And displays a Review Required badge in batch results for the affected image
And the API response includes fallback=true with pixel area count of the fallback region.
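The neutral fill itself is simple: the median of the surrounding fabric plus low-amplitude noise so the patch does not look synthetically flat. A sketch under stated assumptions: the spec bounds σ ≤ 2 in L*, while this illustration applies the noise per channel for simplicity, and the function name is hypothetical.

```python
import numpy as np

def neutral_fill(context_pixels, out_shape, sigma=1.5, seed=0):
    """Low-confidence fallback fill: median color of the surrounding
    fabric plus low-amplitude Gaussian noise (sigma <= 2; nominally in
    L*, applied per channel here for simplicity)."""
    assert sigma <= 2.0, "spec bounds the noise amplitude"
    channels = context_pixels.shape[-1]
    median = np.median(context_pixels.reshape(-1, channels), axis=0)
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=out_shape + (channels,))
    return np.clip(median + noise, 0, 255)
```

The filled region would then be recorded in `fallback_mask` and flagged with `needs_review=true` as the criteria require.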
Batch SeamFlow Presets & Pipeline Integration
"As a catalog manager, I want SeamFlow to run automatically with presets during batch processing so that hundreds of images are processed consistently without hand-tuning."
Description

Adds configurable presets for common pattern types (stripes, plaids, micro-patterns) and fabric behaviors, enabling one-click application of SeamFlow during batch uploads. Presets define detection sensitivity, warp strength, inpaint bounds, and confidence thresholds. Hooks into PixelLift’s existing batch queue, parallelization, and style-presets so SeamFlow runs alongside retouching and background removal. Includes retry policy, failure isolation, and per-image logs/metrics for observability.

Acceptance Criteria
Preset Management for Common Pattern Types
- Given I open the SeamFlow Presets interface, When I create a preset with patternType ∈ {stripes, plaids, micro-patterns}, detectionSensitivity ∈ [0.0,1.0], warpStrength ∈ [0.0,1.0], inpaintBounds ∈ [1,40]% of bbox, and confidenceThreshold ∈ [0.0,1.0], Then the preset is saved with a unique ID and is retrievable via API with identical values.
- Given I submit any parameter outside its allowed range, When I attempt to save, Then the save is blocked and field-level validation errors are shown and returned via API (HTTP 422) naming the invalid fields.
- Given I attempt to save a preset with a duplicate name, When I submit, Then the system rejects with a uniqueness error (HTTP 409) and no new preset is created.
- Given I duplicate an existing preset, When I save, Then a new preset is created with a new ID and all parameters copied; the createdAt timestamp differs and the name is suffixed with “copy”.
- Given a preset is referenced by an in-progress batch, When an admin archives the preset, Then the running batch uses the preserved parameter snapshot; the preset becomes unselectable for new batches and remains available via API with status=archived.
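The field-level validation behind the HTTP 422 response can be expressed as a table of allowed ranges. A minimal sketch; the return shape (a list of field/message pairs) is an assumption about how the errors would be surfaced:

```python
# Allowed parameter ranges from the preset criteria.
PRESET_RANGES = {
    "detectionSensitivity": (0.0, 1.0),
    "warpStrength": (0.0, 1.0),
    "inpaintBounds": (1.0, 40.0),     # percent of bounding box
    "confidenceThreshold": (0.0, 1.0),
}
PATTERN_TYPES = {"stripes", "plaids", "micro-patterns"}

def validate_preset(preset):
    """Field-level validation mirroring the HTTP 422 contract: returns a
    list of (field, message) tuples naming invalid fields; empty list
    means the preset may be saved."""
    errors = []
    if preset.get("patternType") not in PATTERN_TYPES:
        errors.append(("patternType",
                       "must be one of stripes, plaids, micro-patterns"))
    for field, (lo, hi) in PRESET_RANGES.items():
        value = preset.get(field)
        if not isinstance(value, (int, float)) or not lo <= value <= hi:
            errors.append((field, f"must be a number in [{lo}, {hi}]"))
    return errors
```

An API layer would map a non-empty result to 422 with the named fields in the body, as the criterion describes.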
One-Click Batch Application of SeamFlow Preset
- Given I upload a batch of N images and select a SeamFlow preset, When the batch completes, Then each image has a SeamFlow outcome ∈ {applied, skipped_low_confidence, failed} and an attached immutable parameter snapshot (presetId, version, values).
- Given no preset is explicitly selected, When I start a batch, Then the system auto-selects the default SeamFlow preset for apparel patterns and records autoSelectedPreset=true in batch metadata.
- Given transient network interruption during upload, When the upload resumes, Then images are not double-processed and the final count of processed images equals N.
- Given SeamFlow is enabled, When processing runs, Then P95 end-to-end processing time per image increases by ≤ 20% compared to the same pipeline without SeamFlow in the same environment and concurrency settings.
- Given the batch finishes, When I query the batch summary, Then it reports counts for applied/skipped_low_confidence/failed that sum to N.
Pipeline Integration with Style-Presets, Retouching, and Background Removal
- Given a batch with a style-preset, background removal, retouching, and SeamFlow enabled, When processing runs, Then SeamFlow executes within the existing batch queue alongside the other steps without deadlocks and all steps report start/end timestamps.
- Given SeamFlow modifies only reconstructed areas, When comparing output to the input plus known inpaint mask, Then outside the inpaintBounds the pixel delta rate ≤ 1.0% at 8-bit tolerance=2.
- Given background removal is enabled, When the job completes, Then the alpha mask outside inpaintBounds is unchanged bit-for-bit compared to running the pipeline without SeamFlow.
- Given style-preset color/tonal adjustments are applied, When the job completes, Then their parameter values in output metadata are identical to the values used when running without SeamFlow.
- Given concurrency is set to K workers, When running a batch of ≥ 100 images, Then effective parallelization (images-in-flight) for SeamFlow is within [K−1, K+1] for at least 80% of runtime.
Retry Policy and Failure Isolation in Batch Processing
- Given a transient processing error (e.g., timeout) occurs for an image, When SeamFlow detects the error, Then it retries the image up to 2 times with exponential backoff (e.g., 2s, 4s) before marking it failed.
- Given one image fails after max retries, When the batch completes, Then the remaining images continue processing and the batch status is completed_with_errors with failedCount ≥ 1.
- Given a non-transient error (e.g., invalid image), When detected, Then the system does not retry and marks the image failed with errorCode and errorMessage.
- Given retries occur, When I view per-image metadata, Then retryCount, retryDelays, and finalStatus are present.
- Given failures occur, When observing the queue, Then no single-image failure causes the batch or queue to stall for more than the configured backoff window.
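The retry policy (2 retries, exponential backoff of 2 s then 4 s, no retry for non-transient errors) can be sketched as a wrapper. Illustrative only: here `TimeoutError` stands in for "transient error", and the recorded delays feed the `retryDelays` metadata field.

```python
def run_with_retries(task, max_retries=2, base_delay_s=2.0, sleep=None):
    """Run a per-image task, retrying transient failures (modelled here
    as TimeoutError) up to max_retries times with exponential backoff
    (2 s, 4 s by default). Non-transient errors propagate immediately.
    Returns (result, delays) so delays can be logged as retryDelays."""
    delays = []
    attempt = 0
    while True:
        try:
            return task(), delays
        except TimeoutError:
            if attempt >= max_retries:
                raise  # exhausted retries: caller marks the image failed
            delay = base_delay_s * (2 ** attempt)
            delays.append(delay)
            if sleep:
                sleep(delay)  # injectable so tests need not wait
            attempt += 1
```

Because only the transient error type is caught, an invalid-image error fails the item on the first attempt, matching the non-transient criterion.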
Per-Image Logs and Metrics for Observability
- Given an image completes processing, When I query its logs via API/UI, Then I can see patternTypeDetected, detectionConfidence, inpaintAreaPercent, warpStrengthApplied, anchorPointsUsed, durationMs, outcome, retryCount, and errorCode (if any).
- Given a batch runs, When I open the batch metrics, Then I can view aggregated metrics: successRate, skipRateLowConfidence, failureRate, P50/P95/P99 duration, and detectionConfidence histogram.
- Given processing completes for an image, When metrics export runs, Then logs and metrics are available to the observability backend within 60 seconds with a stable correlationId linking upload, pipeline steps, and output asset.
- Given I download the per-image audit package, When I inspect contents, Then it contains the input hash, parameter snapshot, inpaint mask, warp field visualization, and output hash.
Automated Seam Continuity Quality Benchmarks
- Given the Stripe/Plaid/Micro-pattern test suite, When SeamFlow is applied with the recommended presets, Then line/edge continuity across reconstructed boundaries achieves ≥ 0.90 continuity score and ≤ 2.0 px RMS misalignment measured over the inpaint boundary.
- Given micro-pattern textures, When comparing power spectral density between input context and reconstructed region, Then similarity ≥ 0.85 (normalized cross-spectrum) for frequencies within the dominant band of the detected pattern.
- Given the output image, When running artifact detection on inpainted areas, Then no edge discontinuity segment > 5 px is present and no warp-induced shear exceeds 3° across a 50 px window.
- Given low-confidence detections (confidence < confidenceThreshold), When processing, Then SeamFlow does not apply warp/inpaint and outcome=skipped_low_confidence with zero pixel modification recorded in diff metadata.
Continuity Preview, Confidence Heatmap, and Overrides
"As a QA reviewer, I want a real-time preview and confidence heatmap so that I can quickly spot and correct any continuity defects before publishing."
Description

Displays live before/after comparison with seam overlays and pattern phase lines, plus a confidence heatmap highlighting areas at risk of visual breaks. Provides one-click accept, quick adjustments (slider for warp strength), and jump-to-anchor navigation. Surfaces auto-flags from low-confidence regions for human review in a QA queue. Exports review outcomes to inform future auto-thresholds. Available in the editor UI and via lightweight web preview for stakeholders.

Acceptance Criteria
Live Before/After Comparison and Seam/Pattern Overlays
- Given a processed apparel image is open in the editor, When the user toggles Before/After and Overlay controls, Then both views render within 500 ms and pan/zoom remain synchronized.
- Given overlays are enabled, When the user zooms to 100% or higher, Then seam lines and pattern phase lines align to visible edges within 2 px deviation.
- Given overlays are enabled, When the user toggles the legend, Then the legend displays line types and color keys for seams and pattern phase lines.
Dynamic Confidence Heatmap with Thresholding
- Given an image with a generated confidence map, When the user enables Heatmap, Then regions with confidence below threshold T are colorized according to the legend and the legend shows min, max, and T.
- Given Heatmap is visible, When warp strength is adjusted via the slider, Then the heatmap recomputes and re-renders within 300 ms.
- Given a heatmap region is clicked, When the details panel opens, Then it shows region confidence (0–1), area in pixels, and associated anchor IDs.
One-Click Accept, Warp Strength Override, and Undo/Reset
- Given an edited image, When the user clicks Accept, Then the system finalizes the result, disables editing controls during processing, and completes within 1 second.
- Given the warp strength slider ranges 0–100 with default 50, When the user adjusts the slider, Then the preview updates within 250 ms and the change is non-destructive.
- Given overrides have been applied, When the user invokes Undo or Reset to Default, Then the image and parameters revert to the prior/default state within 200 ms.
Jump-to-Anchor Navigation and Shortcuts
- Given anchor points are detected, When the user selects Next/Previous Flag or chooses an anchor from the list, Then the viewport centers on that anchor with at least 20 px padding and highlights it.
- Given navigation is active, When the user presses N or P, Then the selection moves to the next/previous anchor and updates the list selection.
- Given an anchor is selected, When the user toggles overlays or heatmap, Then the selection persists and focus remains on the same region.
QA Queue Population from Auto-Flags and Outcome Capture/Export
- Given batch processing completes, When regions with confidence below threshold T are detected, Then QA items are created with image ID, SKU, thumbnail, max/avg confidence, and flagged region count, and duplicate regions (IoU > 0.8) are merged.
- Given a reviewer opens a QA item, When they mark Pass or Needs Fix (optionally after edits), Then the outcome, reviewer ID, timestamp, and parameter deltas are saved.
- Given an outcome is saved, When exports are enabled, Then a record is delivered via API or webhook within 60 seconds using the agreed schema and acknowledged by the destination.
Lightweight Web Preview for Stakeholders (View-Only)
- Given a user generates a share link, When a stakeholder opens it, Then a view-only page renders Before/After with overlays/heatmap toggles and no editing controls.
- Given a share link is created, When no custom settings are applied, Then the link is tokenized, expires after 7 days by default, and can be revoked immediately.
- Given the preview opens on a mobile device over 4G (viewport ≥ 360 px), When the first view loads, Then time-to-first-interactive is ≤ 2 seconds and pan/zoom interactions sustain at least 30 FPS.

EdgeGuard

Thread‑aware matting that preserves delicate fabric edges (lace, mesh, frayed hems) while eliminating halos and fringing on white or colored backgrounds. Produces crisp, marketplace‑safe cutouts that pass scrutiny and elevate perceived quality.

Requirements

Thread-Aware Edge Detection
"As a boutique product photographer, I want delicate fabric edges to be accurately preserved so that my cutouts look natural and premium without manual masking."
Description

Implements a subpixel, fabric-sensitive edge detection and matting module that recognizes fine threads, frayed hems, lace borders, and mesh patterns to produce a high-fidelity alpha matte. The algorithm classifies edge regions (solid fiber, semi-transparent weave, background gap) and preserves micro-structure without stair-stepping or over-smoothing. It supports variable fiber thickness, motion blur from handheld shots, and complex contours intersecting with shadows. Outputs include an 8–16 bit alpha matte and a refined foreground with edge-aware antialiasing. Integrates as a drop-in matting stage within PixelLift’s processing graph, with tunable sensitivity presets and deterministic results for consistent batch outcomes.

Acceptance Criteria
Lace on White Background: Halo-Free Thread Preservation
Given the EdgeGuard QA set "LaceWhite-50" with ground-truth alpha and edge-region labels
When processed with preset "Medium"
Then thread-pixel F1-score >= 0.93 within a 3px edge band
And fringing pixels (alpha > 0.1 outside ground-truth object within a 3px band) <= 0.5% of band pixels
And per-class accuracy for edge-region classification (solid fiber, semi-transparent weave, background gap) >= 0.90 macro-average in edge bands
And alpha MAE within the 3px edge band <= 0.02
And outputs include an alpha matte at the requested bit depth (8 or 16-bit) and a refined foreground asset
Colored Background Mesh: Spill-Free Semi-Transparency
Given the EdgeGuard QA set "MeshColor-40" with ground-truth alpha, labels, and color targets
When processed with preset "High"
Then semi-transparent weave pixel alpha MAE <= 0.03 within a 3px edge band
And mean CIEDE2000 ΔE in a 2px edge band between original foreground colors and the refined foreground composited on neutral background <= 2.0
And outside-edge fringing pixels (alpha > 0.1 within 3px beyond ground truth) <= 0.7% of band pixels
And precision for the "background gap" class >= 0.92 in edge regions
Frayed Hem with Motion Blur: Micro-Structure Retention
Given the EdgeGuard QA set "FrayedBlur-30" with annotated thread widths and blur tails
When processed with preset "Medium"
Then thread-pixel recall >= 0.90 and precision >= 0.90 within a 4px edge band
And median absolute thread width error |w_out - w_gt| <= 1.0 px on annotated fibers
And blur-tail pixels are represented with soft alpha (95th percentile alpha in annotated blur tails between 0.2 and 0.8)
Complex Contours Intersecting Shadows: Correct Separation
Given the EdgeGuard QA set "ShadowIntersect-30" with annotated shadow regions and foreground boundaries
When processed with preset "Medium"
Then misclassification rate of shadow pixels as foreground within a 5px vicinity of object boundaries <= 2.0%
And foreground boundary recall >= 0.92 within the same vicinity
And soft shadow regions outside ground-truth foreground have alpha <= 0.2 at the 95th percentile
Batch Processing: Deterministic, Order-Independent Results
Given a batch of 200 mixed test images and fixed settings
When processed twice on the same worker with different input orders
Then per-image alpha and refined-foreground outputs are bit-identical across runs
And per-image SHA-256 hashes of outputs match across runs
And median per-image runtime variance between runs <= 2%
Pipeline Integration: Drop-In Stage with Tunable Sensitivity Presets
Given PixelLift’s processing graph with an existing matting stage
When EdgeGuard is inserted in place of the prior matting node
Then no changes are required to upstream/downstream node APIs (inputs/outputs and data types unchanged)
And the module exposes at least three sensitivity presets selectable via API and UI
And switching from Medium to High increases thread-pixel recall by >= 3% with precision drop <= 2% on "LaceWhite-50"
And requesting 16-bit alpha returns a 16-bit channel (metadata and value range verify 0–65535), and requesting 8-bit returns 8-bit (0–255)
And both alpha matte and refined foreground assets are emitted to downstream nodes
Refined Foreground Output: Edge-Aware Antialiasing and Bit-Depth Utilization
Given the EdgeGuard QA set "EdgeAA-30" with ground-truth composites
When refined foreground images are generated
Then edge-band SSIM (3px) between refined foreground and ground-truth composite >= 0.95
And edge jaggedness metric (RMS tangent angle deviation vs ground truth within 3px) <= 0.15
And 16-bit alpha within edge bands utilizes >= 512 distinct levels
And 8-bit alpha banding rate (quantization error > 1/255) within edge bands <= 1%
Color Decontamination & Halo Removal
"As an online seller, I want halos and color fringing eliminated around fabrics so that my listings meet marketplace standards and look professionally retouched."
Description

Provides robust suppression of background color spill and edge halos on both white and colored backdrops by estimating local background color, removing contamination from the foreground edge pixels, and reconstructing true fiber color. Includes adaptive decontamination strength, chroma-only and luminance-aware modes, and guardrails to avoid overdesaturation of genuine fabric dyes. Handles glossy trims and light bleed conditions while maintaining crisp transitions. Exposes a simple on/off with ‘Marketplace Safe’ default enabled, plus an advanced panel for power users. Integrates after alpha estimation and before style-presets, ensuring downstream color grading does not reintroduce fringing.

Acceptance Criteria
Marketplace Safe Default: Halo-Free Cutouts on White and Colored Backgrounds
Given a standardized test set of product images on white and colored backdrops and Marketplace Safe = On with default settings
When Color Decontamination runs after alpha estimation
Then background chroma spill in the 1–3 px foreground edge band is reduced by ≥ 90% versus input on ≥ 95% of images
And mean visible halo width is ≤ 1 px across ≥ 98% of evaluated edge pixels
And interior fabric color fidelity is preserved with median ΔE00 ≤ 3 and 95th percentile ΔE00 ≤ 6 versus reference patches
And the alpha channel is unchanged bit-for-bit
Adaptive Strength With Overdesaturation Guardrails
Given two benchmark sets exhibiting high spill (S ≥ 0.20) and low spill (S ≤ 0.05) as measured by the system’s edge-spill metric
When Adaptive Decontamination is enabled
Then the applied strength parameter is ≥ 0.7 for high-spill images and ≤ 0.3 for low-spill images
And edge-band chroma spill is reduced by ≥ 92% (high-spill) and ≥ 85% (low-spill)
And interior fabric saturation (C*ab) mean change is within ±5% with the 95th percentile absolute change ≤ 15%
And per-channel clipping affects ≤ 0.5% of pixels
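One way to satisfy the adaptive-strength criterion is a clamped linear ramp between the two benchmark operating points. A sketch only: the spec fixes the endpoints (S ≤ 0.05 → strength ≤ 0.3, S ≥ 0.20 → strength ≥ 0.7), while the linear curve between them is an assumption.

```python
def adaptive_strength(spill, lo=(0.05, 0.3), hi=(0.20, 0.7)):
    """Map the edge-spill metric S to a decontamination strength via a
    clamped linear ramp: S <= 0.05 -> 0.3, S >= 0.20 -> 0.7. The exact
    in-between curve is an assumption; only the endpoints are specified."""
    s_lo, k_lo = lo
    s_hi, k_hi = hi
    if spill <= s_lo:
        return k_lo
    if spill >= s_hi:
        return k_hi
    t = (spill - s_lo) / (s_hi - s_lo)  # position within the ramp, 0..1
    return k_lo + t * (k_hi - k_lo)
```

A smoother curve (e.g. a sigmoid through the same endpoints) would also pass the benchmark, since the criterion only constrains the two spill regimes.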
Mode Behavior — Chroma-Only vs Luminance-Aware
Given identical inputs processed in each mode from the Advanced panel
When Chroma-only mode is selected
Then luminance change ΔL* has a 95th percentile ≤ 2 units in both edge and interior regions while achieving ≥ 90% spill reduction in the edge band
When Luminance-aware mode is selected
Then |ΔL*| has a 95th percentile ≤ 5 units within the edge band and interior local contrast is preserved with SSIM ≥ 0.98 vs the input
And both modes are selectable via the Advanced panel with clearly labeled options
Glossy Trims and Specular Highlights Preserved
Given products with glossy trims, metallic logos, or reflective piping and Marketplace Safe = On
When Color Decontamination is applied
Then highlight peak luminance at specular pixels changes by between −5% and +5%
And edge halo width is ≤ 1 px and edge-adjacent color cast has ΔE00 ≤ 2 within the 2 px exterior band
And no new ringing is introduced (Laplacian overshoot ratio ≤ 1.05 of baseline)
Semi-Transparent Lace Under Colored Backdrops
Given semi-transparent lace or mesh fabrics photographed over colored backgrounds
When Color Decontamination is applied
Then edge-band background color contribution is reduced by ≥ 85% while preserving texture detail with high-frequency energy retention ≥ 95% in 8–16 px windows
And the alpha channel remains unchanged
And edge gradients remain monotonic (no halo reversals) for ≥ 99% of evaluated edge pixels
Pipeline Order and Post-Grade Fringing Immunity
Given the processing pipeline Alpha Estimation -> Color Decontamination -> Style Presets
When style presets are applied after decontamination
Then the fringing metric (mean ΔE00 in the 1–3 px edge band) does not increase by more than 0.5 ΔE00, and halo width does not increase by more than 0.5 px compared to pre-style output
And post-style outputs retain ≥ 90% spill reduction relative to the original input
And execution traces confirm Color Decontamination runs strictly after alpha estimation and before style presets
And decontamination does not modify the alpha channel
Simple Toggle and Advanced Panel Controls
Given a fresh session or project
When the user opens EdgeGuard settings
Then a single top-level toggle labeled "Marketplace Safe" is visible and enabled by default
And an Advanced panel can be expanded to reveal controls for decontamination strength (range 0.0–1.0, step ≤ 0.05) and a mode selector with "Chroma-only" and "Luminance-aware"
And changes made via the Advanced panel apply uniformly to batch processing without altering the alpha channel
Semi-Transparent Fabric Preservation
"As a fashion merchant, I want lace and mesh transparency preserved so that shoppers can see authentic fabric detail and texture in my product photos."
Description

Accurately models partial transparency in lace, mesh, chiffon, and tulle by producing a smooth, physically plausible alpha that retains holes and weave patterns without filling them in. Distinguishes between thread fibers and background gaps, even under backlighting, and avoids haloing in high-contrast scenarios. Supports threshold-free operation with automatic detection of semi-transparent regions and optional controls for minimum hole size and alpha smoothing radius. Ensures exported PNG/WebP retains premultiplied-correct edges for consistent rendering in marketplaces and storefronts.

Acceptance Criteria
Backlit White Lace on White Background
Given a high-resolution photo of white lace on a white sweep with strong backlighting and EdgeGuard default settings When the image is processed Then >=95% of ground-truth lace holes with equivalent diameter >= 2 px remain with inside-alpha <= 0.10 And mean absolute alpha error (MAE) in annotated semi-transparent regions <= 0.06 And no haloing: edge overshoot <= 3% luminance and halo width <= 1 px at fabric-background boundaries And background regions have alpha <= 0.02 and thread fibers' median alpha >= 0.85
High-Contrast Mesh on Colored Background
Given a product photo of dark mesh fabric over a saturated colored backdrop with EdgeGuard default settings When the image is processed Then hole recall >= 97% for holes >= 1.5 px and false fill-in rate <= 2% of hole pixels And no color fringing: DeltaE00 <= 1.5 for pixels 0-2 px outside the boundary when compositing the result over the original backdrop vs compositing the ground-truth over the same backdrop And edge IoU with ground-truth matte >= 0.92 within a 2-px tolerance band
Threshold-Free Detection of Semi-Transparent Regions
Given a mixed batch of 100 product photos (semi-transparent and opaque fabrics) processed with EdgeGuard using default (threshold-free) detection When processing completes Then semi-transparent region detection achieves precision >= 0.90 and recall >= 0.90 against ground-truth labels And opaque regions yield binary alpha on >= 98% of pixels (alpha <= 0.02 or >= 0.98) And average processing time <= 3.0 s per 3000x3000 image on the reference workstation
Minimum Hole Size Control Preserves Weave Gaps
Given a lace photo with annotated hole sizes and EdgeGuard minimum hole size set to 3 px When the image is processed Then >=95% of holes with equivalent diameter >= 3 px remain with inside-alpha <= 0.10 And <=10% of holes with equivalent diameter < 3 px remain unfilled (inside-alpha <= 0.10) And increasing the control from 1 px to 5 px reduces preserved sub-threshold holes by >= 60% without affecting >=5 px holes (retention >= 95%)
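The "equivalent diameter" threshold in this criterion is the diameter of a circle with the same pixel area as the hole; a minimal sketch of the filtering rule, assuming holes have already been labeled and reduced to pixel-area counts:

```python
import math

def equivalent_diameter(area_px: float) -> float:
    """Diameter of the circle whose area equals the hole's pixel area."""
    return 2.0 * math.sqrt(area_px / math.pi)

def holes_to_preserve(hole_areas_px, min_hole_size_px: float):
    """Keep only holes whose equivalent diameter meets the minimum-hole-size control."""
    return [a for a in hole_areas_px if equivalent_diameter(a) >= min_hole_size_px]

# A 3 px minimum keeps a 9-px-area hole (d ≈ 3.39 px) but drops a 4-px-area one (d ≈ 2.26 px).
kept = holes_to_preserve([9, 4, 30], min_hole_size_px=3.0)
```

Raising the control from 1 px to 5 px simply tightens this same threshold, which is why sub-threshold holes disappear while large weave gaps are unaffected.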
Alpha Smoothing Radius Produces Natural Gradients
Given a chiffon edge case image and EdgeGuard alpha smoothing radius set to 0 px and 2 px When the image is processed for both settings Then at 0 px, edge detail SSIM >= 0.90 against ground-truth unsmoothed alpha within a 5-px band And at 2 px, the alpha gradient is monotonic with no banding (no flat steps >= 0.05 alpha over >= 2 px), and MAE <= 0.05 vs ground-truth smoothed alpha And in both cases, hole preservation for holes >= 2 px remains >= 95%
PNG/WebP Export with Premultiplied-Correct Edges
Given a processed image with semi-transparent regions and export formats set to PNG and WebP When exporting with default EdgeGuard export settings Then exported files contain straight (unpremultiplied) alpha with decontaminated RGB at edges, verified by recompositing over black and white producing DeltaE00 <= 1.5 within a 3-px boundary band vs the internal composite And no double-premultiplication occurs on re-import; round-trip composite over gray matches internal composite with PSNR >= 40 dB And exported assets render without halos on white or black backgrounds: DeltaE00 edge band <= 1.5 vs internal preview
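The recompositing and double-premultiplication checks above operate per pixel with the standard over operator; an illustration in 0–1 floats, assuming straight (unpremultiplied) RGB plus a separate alpha channel:

```python
def composite_over(fg_rgb, alpha, bg_rgb):
    """Over operator with straight alpha: C = F*a + B*(1 - a)."""
    return tuple(f * alpha + b * (1.0 - alpha) for f, b in zip(fg_rgb, bg_rgb))

def premultiply(fg_rgb, alpha):
    return tuple(f * alpha for f in fg_rgb)

# A half-transparent red edge pixel composited over mid-gray.
straight = composite_over((1.0, 0.0, 0.0), 0.5, (0.5, 0.5, 0.5))  # (0.75, 0.25, 0.25)

# The failure mode the round-trip PSNR check guards against:
# premultiplying already-premultiplied RGB darkens the edge.
pm = premultiply((1.0, 0.0, 0.0), 0.5)   # (0.5, 0.0, 0.0)
double = premultiply(pm, 0.5)            # (0.25, 0.0, 0.0) — visibly too dark
```

Compositing the exported asset over black and white and comparing against the internal composite catches exactly this kind of darkening or haloing at the boundary band.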
Robust Background Modeling (White & Colored)
"As a catalog manager, I want consistent cutouts from both white and colored backgrounds so that I’m free to shoot on whatever backdrop is available without quality loss."
Description

Builds a local background model that handles pure white sweeps, colored paper, and gradient backdrops with shadows. Estimates per-pixel background chroma and luminance to guide matte refinement and color decontamination, including cases with uneven lighting or light ramps. Detects and compensates for soft shadows without erasing fabric edges. Includes safeguards for props or foreground objects that touch backdrop seams. Exposes a ‘Background Type: Auto/White/Colored/Gradient’ selector for deterministic batch behavior and logs chosen model for auditability.

Acceptance Criteria
Auto Background Type Resolution in Mixed Batch
Given a labeled validation batch of 300 images (100 white sweep, 100 colored paper, 100 gradient backdrops) When processed with Background Type = Auto Then the inferred background type logged per image matches ground truth with accuracy >= 97% And the audit log for each image includes: image_id, background_type_inferred, confidence (0–1), timestamp, app_version And exactly one background type is logged per image (no per-tile model-type changes)
White Sweep Modeling with Soft Shadow Preservation
Given white-sweep product photos containing soft contact shadows When Background Type = White Then background pixels outside a 15px expanded foreground mask have median luminance >= 245/255 and std deviation <= 8/255 And soft shadow regions are attenuated by 60–90% relative to input while foreground alpha at fabric edges changes by <= 0.02 And the 95th-percentile halo/fringe width along the object perimeter is <= 1 px And lace/mesh edge F1-score >= 0.90 on the white-sweep test set
Colored Backdrop Decontamination at Delicate Edges
Given colored-paper backdrop images with foreground fabrics exhibiting edge spill When Background Type = Colored (selected or inferred) Then mean chroma contamination in a 3px band just outside the matte is reduced by >= 80% versus the uncorrected baseline And interior foreground color shift (measured 10px inside the matte) has mean DeltaE00 <= 2.0 And visible fringe pixels (detected by edge-contrast heuristic) constitute <= 0.5% of total edge length
Gradient Backdrop Modeling under Uneven Lighting
Given gradient backdrops with light ramps and vignetting When Background Type = Gradient (selected or inferred) Then the fitted per-pixel background model achieves MAE <= 7/255 on background-only sample points with R^2 >= 0.90 And after refinement, residual background gradient in empty regions (outside a 15px edge band) has slope magnitude <= 5/255 across the frame And edge halos/fringing <= 1 px at the 95th percentile; IoU of the foreground matte with ground truth >= 0.94
Backdrop Seam and Prop Contact Safeguards
Given scenes where props or product touch a vertical or horizontal backdrop seam When processed with any Background Type Then seam detection prevents foreground erosion with matte IoU >= 0.96 on prop regions And seam line residual visibility in final cutout has DeltaL* <= 2.0 across the seam line And seam pixels misclassified as foreground are <= 1% of total seam pixels; an audit log event seam_detected = true is recorded
Deterministic Selector Behavior and Audit Logging
Given the user sets Background Type = White/Colored/Gradient in UI or API When processing a batch Then the chosen type for each image equals the selection (no auto override) and is recorded in the audit log as background_type_selected And when set to Auto, the audit log records background_type_selected = Auto and background_type_inferred in {White, Colored, Gradient} with confidence And reprocessing the same image with identical inputs and app_version produces a matte whose mean absolute per-pixel difference <= 1/255; a stable model_parameters_checksum is logged And batch logs are exportable as JSON and CSV with one row per image and are available within 5 minutes of batch completion
Batch Integration & Preset Compatibility
"As a high-volume seller, I want EdgeGuard to run automatically with my existing style presets so that I can process hundreds of photos quickly without manual tuning."
Description

Integrates EdgeGuard seamlessly into PixelLift’s batch pipeline and style-preset system. Supports per-preset EdgeGuard settings, override flags, and deterministic seed control for reproducible runs across hundreds of images. Provides concurrency-safe processing, resumable batches, and fallbacks to legacy matting if inputs are out-of-distribution. Ensures outputs (alpha, cutout, spill map) are accessible to downstream steps such as background replacement, drop shadows, and color grading. Emits structured logs and metrics for each file to aid QA and troubleshooting.

Acceptance Criteria
Per-Preset EdgeGuard Settings Applied in Batch
Given a batch containing items mapped to Preset A and Preset B When the batch process executes EdgeGuard Then items assigned to Preset A use Preset A's EdgeGuard parameters And items assigned to Preset B use Preset B's EdgeGuard parameters And items with no preset EdgeGuard parameters use system defaults And the applied parameters are recorded in each file's metadata and structured log
File-Level Override Flags Supersede Preset
Given a file with EdgeGuard override flags (e.g., disable=true, strength=0.7) When it is processed under a preset with different EdgeGuard settings Then the file's override values supersede the preset values for those keys And non-overridden keys inherit from the preset And a log entry records each override key and its effective value And disabling EdgeGuard via override routes the file to legacy matting
Deterministic Seed Reproducibility Across Runs
Given a batch seed S and a per-file seed F When the same inputs are processed twice with identical model/version and settings Then alpha, cutout, and spill map outputs are byte-identical for each file And changing only a file's per-file seed F changes only that file's outputs And changing the batch seed S changes outputs only for files without a per-file seed And if model or code version differs, the run is flagged non-reproducible in logs
Batch Execution Reliability: Concurrency-Safe and Idempotent Resume
Given a batch of N files and concurrency C When processing runs with C workers Then no file is processed more than once and none are skipped And outputs and manifests are written atomically without partial files And if interrupted, resuming processes only incomplete files And retries are idempotent and do not create duplicate outputs And the final manifest enumerates succeeded, failed, and skipped files with reasons
Out-of-Distribution Detection with Automatic Legacy Fallback
Given an input flagged as out-of-distribution or below confidence threshold When EdgeGuard evaluation runs Then the file is routed to legacy matting automatically And the reason code, confidence score, and fallback=true are recorded in structured logs And downstream receives alpha, cutout, and spill map with the same schema as EdgeGuard outputs And batch-level metrics report fallback count and rate
Downstream Output Contract: Alpha/Cutout/Spill Map Availability
Given a successfully processed file When EdgeGuard completes Then alpha, cutout (RGBA), and spill map artifacts exist at documented paths/keys And artifact dimensions match the source image And color space and premultiplication flags comply with the contract And background replacement, drop shadows, and color grading steps consume these artifacts without conversion And missing or malformed artifacts raise a clear error and mark the file as failed
Structured Per-File Logs and Batch Metrics
Given processing of any file within a batch When logs and metrics are emitted Then each file has one JSON log entry with: file_id, preset_id, effective_edgeguard_params, overrides_used, seeds, model/version, per-stage timings, device, fallback_flag, quality metrics, status, error (if any), schema_version And logs are valid JSON and pass schema validation And batch-level metrics include counts, throughput, duration percentiles (p50/p95), error rate, and fallback rate And correlation IDs link file logs to the batch and retry attempts
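The per-file log contract above might look like the following as a single JSON entry; field names track the criterion, all values are illustrative:

```python
import json

log_entry = {
    "schema_version": "1.0",
    "file_id": "img_0042",
    "preset_id": "preset_A",
    "effective_edgeguard_params": {"strength": 0.7, "mode": "luminance-aware"},
    "overrides_used": ["strength"],
    "seeds": {"batch": 42, "file": 7},
    "model_version": "edgeguard-2.3.1",
    "timings_ms": {"alpha": 810, "decontam": 220, "export": 95},
    "device": "cuda:0",
    "fallback_flag": False,
    "status": "succeeded",
    "error": None,
}

# The criterion requires valid JSON; a lossless round-trip is a quick smoke test.
line = json.dumps(log_entry)
assert json.loads(line) == log_entry
```

Correlation IDs (batch ID, retry attempt) would be added alongside `file_id` so file logs can be joined to batch-level metrics.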
Compliance Preview & Validator
"As a QA editor, I want an instant preview and automated compliance checks so that I can catch and fix edge issues before publishing to marketplaces."
Description

Adds a zoomable edge preview with overlay modes (alpha, matte boundaries, decontamination mask) and automated checks against marketplace guidelines (e.g., no visible halos on white, clean subject contour, no residual background tint). Flags issues with visual annotations and actionable suggestions (increase decontamination strength, adjust background model), and supports one-click apply/fix. Generates a per-image compliance score and batch summary report export (CSV/JSON) for operational review.

Acceptance Criteria
Zoomable Edge Preview with Overlay Modes
Given a processed image is opened in Compliance Preview When the user zooms between 50% and 800% and pans the image Then the preview renders at the requested zoom within 150 ms for cached tiles and within 400 ms for first paint, and panning input-to-render latency stays below 50 ms while sustaining 60 fps And overlay mode toggles include None, Alpha, Matte Boundary, and Decontamination Mask and switch within 200 ms And an overlay legend and opacity slider (0–100%) are visible and state persists within the session
Marketplace Rule Set Selection and Application
Given the user selects a marketplace profile (Amazon, eBay, Etsy, Shopify) for the batch or image When the validator runs Then only the rules for the selected profile are applied and listed And changing the profile re-runs validation and updates issues and score within 2 seconds for a 24 MP image And the selected profile is remembered per project/session
Automated Halo and Residual Background Detection Accuracy
Given a labeled test set of 400 images (200 with halos/tint, 200 clean) When the automated checks run Then halo/tint issues are detected with recall ≥ 0.95 and precision ≥ 0.90 on the positive set And the false positive rate is ≤ 0.05 on the clean set And each flagged region includes a pixel-accurate mask or bounding ring within ±2 px of ground truth
Actionable Suggestions and One-Click Fix Workflow
Given one or more issues are flagged on an image When the user clicks One-Click Fix Then suggested parameter changes (e.g., Decontamination Strength, Background Model) are applied non-destructively to a duplicate revision, the validator re-runs, and results update within 2 seconds for a 24 MP image And the fix either resolves the issue (issue removed and score increases) or returns a failure message with rationale and alternative suggestions And the action supports Undo/Redo and can be applied to a selection of images in a batch
Per-Image Compliance Scoring and Determinism
Given an image has been validated When results are displayed Then a compliance score in the range 0–100 is shown with pass/fail threshold per marketplace (e.g., Pass ≥ 90), plus a weighted breakdown by rule And the same input image and rule set produce a score with run-to-run variance ≤ 0.5 points And after any fix, the score recomputes automatically and an audit log records changes (timestamp, user, parameters)
Batch Summary Report Export (CSV/JSON)
Given a batch of up to 1000 validated images When the user exports the report as CSV or JSON Then the file downloads in UTF-8 within 10 seconds for 1000 images and includes: image_id, marketplace_profile, score, pass_fail, issues[id,type,severity,region], fixes_applied[parameter,old,new], validated_at, schema_version And the CSV conforms to RFC 4180 (headers, quoted fields, comma separators) and the JSON validates against the published schema And a SHA-256 checksum is provided and matches the file contents
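The export contract can be sketched with the standard library: Python's `csv` module produces RFC 4180-conformant output (headers, quoting, CRLF line endings), and the SHA-256 checksum is computed over the exact bytes written. Column set follows the criterion; the row values are illustrative:

```python
import csv
import hashlib
import io

rows = [
    {"image_id": "img_001", "marketplace_profile": "Amazon", "score": 94, "pass_fail": "pass"},
    {"image_id": "img_002", "marketplace_profile": "Etsy", "score": 81, "pass_fail": "fail"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["image_id", "marketplace_profile", "score", "pass_fail"])
writer.writeheader()
writer.writerows(rows)

# Encode once, hash the same bytes that ship: the published checksum must
# match the downloaded file exactly.
payload = buf.getvalue().encode("utf-8")
checksum = hashlib.sha256(payload).hexdigest()
```

Nested fields such as `issues[...]` and `fixes_applied[...]` fit the JSON export naturally; in CSV they would be flattened or serialized into quoted cells per the schema.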
Visual Issue Annotations and Region Linking
Given issues are present for an image When the user hovers or clicks an issue in the side panel Then the corresponding region is highlighted in the preview with a non-occlusive overlay; annotations are cluster-collapsed when more than 10 are within a 200 px area; and total overlay coverage at 100% zoom does not exceed 20% of image area And a tooltip shows issue type, severity, and recommended fix, and a toggle allows show/hide annotations
GPU-Accelerated Performance & Scalability
"As an operations lead, I want predictable GPU-accelerated throughput so that daily image queues finish within our publishing SLAs."
Description

Optimizes EdgeGuard for GPU inference and post-processing to meet batch throughput targets with predictable latency. Implements tiled processing with seam-free blending for high-resolution images, asynchronous job scheduling, and mixed-precision math where safe. Provides graceful CPU fallback with performance warning. Target SLA: process at least 200 images at 2048px long edge in under 10 minutes on a single mid-tier GPU, with peak memory under 3 GB per worker. Includes telemetry for throughput, GPU utilization, and per-stage timing to guide capacity planning.

Acceptance Criteria
Batch SLA on Single Mid‑Tier GPU (200 images @ 2048px in ≤10 min)
Given a machine with a single mid‑tier GPU (e.g., NVIDIA RTX 3060 12GB or equivalent) and EdgeGuard GPU mode enabled And a batch of 200 images with the long edge = 2048 px and default EdgeGuard settings (tiled processing on) When the batch is processed end‑to‑end (ingest → preprocess → inference → postprocess → write) Then total wall‑clock time from job start to last output written is ≤ 600 seconds And the 95th percentile per‑image processing latency is ≤ 4.0 seconds And the job completes with 0 failed items and no item retried more than once
GPU Peak Memory Bound per Worker (≤3 GiB)
Given EdgeGuard running one worker process bound to one GPU And a continuous run of 200 images at 2048 px long edge using default settings When GPU memory usage is sampled at ≥10 Hz via NVML/Telemetry during processing Then the worker’s peak allocated GPU memory never exceeds 3.0 GiB And no out‑of‑memory events or CUDA allocation retries are observed And post‑batch GPU memory returns to within 10% of the idle baseline within 30 seconds
Mixed‑Precision Quality Parity vs FP32
Given a validation set of 500 images emphasizing delicate edges (lace, mesh, frayed hems) And a FP32 baseline output produced by the same EdgeGuard model/pipeline When the same set is processed with mixed‑precision (autocast FP16/BF16 where enabled) Then mean absolute alpha error (MAE) relative to FP32 is ≤ 0.010 (0–1 scale) And edge F‑measure (Fβ) drop at fabric boundaries is < 0.5 percentage points And no more than 0.2% of pixels exhibit absolute alpha difference > 0.05 And no increase in visible halos measured by a 16‑px boundary band halo score (>2/255) for more than 0.1% of boundary pixels
Seam‑Free Tiled Processing at High Resolution
Given 50 high‑resolution test images (long edge 4000–6000 px) And a reference output produced with tiling disabled on a high‑memory GPU When the same images are processed with tiling enabled (tile=512 px, overlap=64 px) Then global SSIM between tiled and reference outputs is ≥ 0.98 And in 16‑px bands along all tile seam lines, the 99th percentile absolute per‑pixel difference is ≤ 2/255 And no tile seam produces a Sobel gradient spike more than 10% above adjacent non‑seam regions
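Seam-free tiling typically feathers the overlap region so that the two tiles' weights sum to 1 at every overlapped pixel; a 1-D sketch of linear blend weights over the 64 px overlap from the criterion (the linear ramp is an assumed blending scheme, not a stated one):

```python
def overlap_weights(overlap: int):
    """Linear ramp for the incoming tile's edge; the previous tile uses the complement."""
    return [(i + 0.5) / overlap for i in range(overlap)]

w_right = overlap_weights(64)            # weight of the incoming tile
w_left = [1.0 - w for w in w_right]      # weight of the previous tile

# Weights sum to exactly 1 at every overlapped pixel, so a constant region
# stays constant across the seam — no step for a Sobel filter to detect.
assert all(abs(l + r - 1.0) < 1e-12 for l, r in zip(w_left, w_right))
```

Cosine or Gaussian ramps are common alternatives; what matters for the seam metrics is the partition-of-unity property, not the ramp shape.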
Asynchronous Job Scheduling & GPU Utilization Under Load
Given an API client submits 1000 image jobs within 10 seconds to the processing queue (worker concurrency=2) When processing begins with GPU acceleration enabled Then each submission receives HTTP 202 within ≤ 200 ms with a job ID And job status endpoints update at least once per second while active And average GPU utilization during active processing is ≥ 80% (telemetry) And effective throughput is ≥ 1.0 images/second/worker over any 5‑minute window And no deadlocks, starvation, or queue drops occur (0 messages lost)
Graceful CPU Fallback with Performance Warning
Given a host with no compatible GPU or a simulated GPU initialization failure When a batch of 20 images at 2048 px long edge is submitted Then the system automatically switches to CPU mode without crashing And a user‑visible warning is shown in UI and returned in API metadata (code="GPU_FALLBACK") And outputs meet the same quality thresholds as GPU mode (per the mixed‑precision parity test vs FP32) And telemetry marks gpu_enabled=false and records CPU timings And the batch completes successfully, regardless of total time
Operational Telemetry for Capacity Planning
Given telemetry is enabled with default settings When any batch is processed Then per‑image stage timings (preprocess, inference, postprocess, I/O) are emitted in JSON with schema version and job/worker IDs And GPU metrics (utilization %, memory used MiB) are sampled at ≥1 Hz while active And batch‑level totals (throughput, wall time, success/fail counts) are emitted on completion And telemetry export supports OpenTelemetry OTLP HTTP/gRPC with retries and backoff And telemetry overhead adds ≤ 3% to total processing time on the SLA batch

SwatchMatch

Color‑true finishing that matches garments to a provided swatch photo or hex value. Auto‑corrects white balance and hue with per‑batch profiling, shows ΔE accuracy scores, and exports channel‑optimized variants—reducing returns and buyer complaints about color.

Requirements

Swatch Input & Target Color Extraction
"As a boutique owner, I want to set a target color from a swatch photo or hex so that my product images match the true garment color."
Description

Accept a swatch photo upload or direct color entry (HEX/RGB) and extract a precise target color in CIELAB under D65. For photos, provide an eyedropper and auto-detection of uniform color patches with configurable sampling radius and outlier rejection to reduce glare/noise. Validate color inputs, display a live target chip, and persist the target per batch. Support common formats (JPEG/PNG), ICC-aware conversion (assume sRGB if none), and guidance tooltips for best results. Store the resolved target color and metadata in the batch profile for downstream processing modules.

Acceptance Criteria
Direct HEX/RGB Entry to CIELAB D65 Conversion and Validation
Given a user enters a color in HEX (#RRGGBB or #RGB) or RGB (0–255,0–255,0–255) When the field loses focus or the user presses Enter Then the input is validated for format and channel ranges, and an inline error appears without updating the target if invalid Given a valid color entry When processed Then the color is interpreted in sRGB and converted to CIELAB under D65 using ICC-aware conversion, and LAB values are displayed to 2 decimal places Given a valid color entry When processed Then the live target chip updates within 200 ms and the batch target color is set to the computed LAB Given the user modifies the color entry to another valid value When processed Then the chip and LAB values update accordingly and the change is recorded in batch metadata
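The sRGB → CIELAB (D65) conversion this criterion references follows the standard pipeline: undo the sRGB transfer curve, apply the linear-RGB→XYZ matrix for D65, then the XYZ→Lab nonlinearity. A self-contained sketch (6-digit hex form only; the matrix and white-point constants are the standard sRGB/D65 ones):

```python
def hex_to_lab_d65(hex_color: str):
    """#RRGGBB (sRGB) -> CIELAB under D65."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) / 255.0 for i in (0, 2, 4))

    def linearize(c):  # undo the sRGB transfer curve
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = linearize(r), linearize(g), linearize(b)
    # Linear sRGB -> XYZ (D65 reference white)
    x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = 0.0193 * r + 0.1192 * g + 0.9505 * b
    xn, yn, zn = 0.95047, 1.0, 1.08883  # D65 white point

    def f(t):  # CIELAB nonlinearity with linear toe below (6/29)^3
        return t ** (1 / 3) if t > (6 / 29) ** 3 else t / (3 * (6 / 29) ** 2) + 4 / 29

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

L, a, b_ = hex_to_lab_d65("#FFFFFF")  # white -> L*≈100, a*≈0, b*≈0
```

For swatch photos with an embedded ICC profile, the decode step would apply that profile first; this sketch covers the "sRGB assumed" path.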
Swatch Photo Upload with ICC-Aware Conversion and sRGB Assumption
Given the user uploads a JPEG or PNG swatch photo When the file contains an embedded ICC profile Then the profile is applied for decoding and conversion to a CIELAB D65 working space for sampling Given the user uploads a JPEG or PNG swatch photo without an embedded ICC profile When processed Then sRGB is assumed for decoding and conversion to a CIELAB D65 working space for sampling Given the user uploads a non-JPEG/PNG file When validated Then an error message "Unsupported file type" is shown and the file is rejected Given a valid swatch photo is uploaded When processing completes Then a color-accurate preview is displayed within 2 seconds and metadata captures filename, type, pixel dimensions, and ICC profile name or "sRGB assumed"
Eyedropper Sampling with Configurable Radius and Outlier Rejection
Given a valid swatch photo is open When the user activates the eyedropper and selects a sampling radius between 1 and 25 px (default 5 px) Then sampling uses the selected radius and displays the radius setting in the UI Given the user clicks a point with the eyedropper When computing the target color Then pixels within the radius are converted to CIELAB D65, the median color is computed after excluding the top and bottom 10% of pixels by ΔE from the initial median (outlier rejection), and the resulting LAB is shown to 2 decimals Given the sampled area is non-uniform When more than 20% of pixels are excluded and the remaining pixels have standard deviation > ΔE 3 Then a non-uniform area warning is shown without blocking selection Given the user samples a point When the result is computed Then the live target chip updates within 200 ms and the batch target color is set to the resulting LAB
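The outlier-rejection rule above (compute an initial median, trim the top and bottom 10% of pixels ranked by ΔE to it, then re-take the median) can be sketched per channel; for brevity this uses Euclidean Lab distance (ΔE76) rather than ΔE00:

```python
import math
import statistics

def channel_median(samples):
    """Per-channel median of a list of (L, a, b) tuples."""
    return tuple(statistics.median(c[i] for c in samples) for i in range(3))

def robust_target(lab_pixels, trim_frac=0.10):
    """Median after dropping the top/bottom trim_frac of pixels by distance to the initial median."""
    med = channel_median(lab_pixels)
    ranked = sorted(lab_pixels, key=lambda p: math.dist(p, med))
    k = int(len(ranked) * trim_frac)
    kept = ranked[k:len(ranked) - k] if k else ranked
    return channel_median(kept)

# Ten near-identical gray pixels plus one glare outlier: the outlier is trimmed
# away and the target stays on the true fabric color.
pixels = [(62.0, 0.1, 0.1)] * 9 + [(95.0, 0.0, 5.0)]
target = robust_target(pixels)
```

The non-uniformity warning in the criterion falls out of the same machinery: track how many pixels the trim excluded and the spread of the kept set.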
Auto-Detection of Uniform Color Patches in Swatch Photo
Given a valid swatch photo is uploaded When the user selects "Auto-detect patches" Then the system identifies up to three candidate regions of ≥ 400 px area whose per-pixel ΔE to region median ≤ 2, outlines them on the image, and lists them with a uniformity score (higher is more uniform) Given at least one candidate region is found When the user selects a candidate Then the target color is computed using the same outlier-rejection rule as the eyedropper and set as the batch target, updating the chip within 200 ms Given no candidate region meets the thresholds When auto-detect completes Then a message "No uniform patch found—try manual eyedropper" is displayed
Live Target Color Chip Display and Update
Given a target color is set or changed via HEX/RGB entry, eyedropper, or auto-detect When the color value is computed Then the target chip updates within 200 ms and displays sRGB HEX and CIELAB D65 values, along with the source method label Given the target color changes by ΔE ≥ 0.5 from the prior saved value When the change is applied Then a change event is recorded in batch metadata with timestamp and source method Given the target color is cleared by the user When confirmed Then the chip resets to a neutral placeholder and batch metadata marks the target as null
Batch-Level Persistence of Resolved Target Color and Metadata
Given a batch profile exists When a target color is set or updated Then the system persists the LAB (D65) value, sRGB HEX, source method (hex/rgb/eyedropper/auto-detect), for-photo details (image ID, coordinates, sampling radius, uniformity score), ICC profile info or "sRGB assumed", user ID, and timestamps in the batch profile Given a batch with a saved target color When the batch is reopened Then the previously saved target and metadata load within 1 second and the chip reflects the saved values Given the batch data is requested via the batch API When the response is returned Then a targetColor object is present with the persisted fields described above or null if no target is set
Guidance Tooltips and Error Messaging for Swatch Input
Given the user views the SwatchMatch inputs (upload, eyedropper, sampling radius, color entry) When the user hovers or taps the info icon Then a tooltip appears within 200 ms with best-practice guidance (≤ 120 words) tailored to that control Given the user enters an invalid HEX/RGB value When validation runs Then an inline error appears within 100 ms with examples of valid formats and the target is not updated Given a swatch photo without an embedded ICC profile is uploaded When processed Then an informational note "No ICC profile detected—assuming sRGB" is displayed Given the user selects a highly non-uniform area When the warning is shown Then the tooltip link provides guidance to increase sampling radius or use auto-detect
Batch Color Profiling & Auto White Balance
"As a seller managing large uploads, I want automatic white balance and tint correction per batch so that colors are consistent across all photos."
Description

Create a per-batch color profile by estimating illuminant, white balance, and tint from representative images, then normalize exposure and white balance before hue adjustments. Support mixed lighting with per-image refinement anchored to the batch baseline and provide optional user overrides. Persist profile parameters for reproducibility and feed them into subsequent correction stages. Optimize for GPU execution to keep batch throughput high and ensure consistent color normalization across hundreds of photos.

Acceptance Criteria
Batch Baseline Color Profiling
Given a new batch (50–1000 images) and 3–10 user-selected representative images, When profiling runs, Then the system estimates illuminant (CCT in K), white-balance RGB gains, and green–magenta tint and generates a batch baseline profile ID. Given 10 representative images at 24MP on an NVIDIA T4, When profiling runs, Then baseline profiling completes in ≤ 8 seconds. Given the computed baseline, When stored, Then profile parameters (CCT, RGB gains, tint, exposure target, model/version, representative image hashes) are persisted with the batch metadata and are accessible via API. Given the representative set, When WB gains are computed, Then standard deviation per-channel gain ≤ 5% across representatives; otherwise the system flags the profile as low confidence. Given a batch without explicit representatives, When auto-selection is enabled, Then the system selects 3–10 diverse images with confidence ≥ 0.8 based on variance in lighting and content.
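The per-channel white-balance gains estimated here are commonly derived from the mean RGB of detected neutral regions, normalized so the green channel stays at 1.0; a minimal gray-world-style sketch (an assumed estimator, not PixelLift's profiling algorithm):

```python
def wb_gains(neutral_rgb_mean):
    """Gains that map a neutral patch's mean RGB onto gray, with green fixed at 1.0."""
    r, g, b = neutral_rgb_mean
    return (g / r, 1.0, g / b)

def apply_gains(pixel, gains):
    return tuple(c * k for c, k in zip(pixel, gains))

# A warm-cast neutral (tungsten-ish) is pulled back to gray.
gains = wb_gains((0.60, 0.50, 0.40))
balanced = apply_gains((0.60, 0.50, 0.40), gains)  # ≈ (0.5, 0.5, 0.5)
```

The ≤ 5% per-channel gain standard deviation across representatives then amounts to computing these gains per representative image and checking their spread before accepting the baseline.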
Exposure & White Balance Normalization Prior to Hue
Given images are processed with the batch baseline, When normalization runs, Then per-image exposure is normalized to target within ±0.3 EV. Given neutral regions are detected, When white balance is applied, Then average ΔE00 on neutral patches ≤ 2.0 and max ≤ 3.5. Given 24MP JPEGs, When normalization writes outputs, Then clipped pixels at 0 or 255 per channel are ≤ 0.5% of pixels per image. Given a batch of ≥ 200 images, When normalization completes, Then inter-image exposure variance (std dev in midtone luminance) is reduced by ≥ 60% versus input.
Mixed Lighting Per-Image Refinement
Given a batch with mixed lighting (estimated CCT range across images ≥ 800K), When per-image refinement anchored to the baseline runs, Then median ΔE00 to reference swatches per image improves by ≥ 20% versus baseline-only WB. Given saturated colors (C* ≥ 40), When refinement is applied, Then hue shift ≤ 5° unless it yields a ΔE00 improvement ≥ 1.0 for that color. Given refinement confidence < 0.6 for an image, When deciding application, Then the system falls back to the batch baseline for that image and logs the reason; fallback rate ≤ 5% on validation sets. Given per-image refinement runs, When complete, Then runtime overhead ≤ 25 ms/image (T4) over baseline normalization.
User Overrides for WB/Tint & Gray Point
Given a user inputs a custom Kelvin (2000–9000K) and tint (−150 to +150) or clicks a gray point, When overrides are applied, Then a new derived profile version (v = previous + 1) is created and linked to the batch. Given a batch of 500 images on a T4, When reprocessing with overrides, Then the batch re-renders in ≤ 2 minutes for the normalization stage. Given overrides are active, When exporting metadata, Then override values and author/time are persisted and reproducible; disabling overrides restores the prior baseline with no residual effect. Given per-image overrides are set, When batch processing runs, Then per-image settings take precedence over baseline while maintaining exposure normalization within ±0.3 EV.
Profile Persistence & Reproducibility
Given a batch is processed and its profile persisted, When the same inputs are reprocessed with the same profile version on identical hardware, Then outputs are byte-identical. Given reprocessing on different but supported hardware (GPU/CPU), When comparing outputs, Then per-pixel channel difference ≤ 1/255 for ≥ 99.5% of pixels and never exceeds 2/255. Given a stored profile, When inspected, Then metadata includes: profile ID (deterministic hash of representative image IDs and parameters), algorithm version, kernel hash, creation timestamp, and random seed if applicable. Given an exported profile file, When imported into a new project, Then the baseline parameters are recognized and applied without modification, and the profile ID remains unchanged.
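A deterministic profile ID as described can be produced by hashing a canonical JSON serialization of the representative image IDs and parameters; a sketch with an illustrative field set:

```python
import hashlib
import json

def profile_id(representative_ids, params, algorithm_version: str) -> str:
    """Stable hash: sorted keys and sorted IDs make the ID order-independent."""
    canonical = json.dumps(
        {
            "representatives": sorted(representative_ids),
            "params": params,
            "algorithm_version": algorithm_version,
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

# Reordering representatives does not change the ID; changing a parameter does.
p1 = profile_id(["a.jpg", "b.jpg"], {"cct_k": 5600, "tint": -3}, "1.4.0")
p2 = profile_id(["b.jpg", "a.jpg"], {"cct_k": 5600, "tint": -3}, "1.4.0")
p3 = profile_id(["a.jpg", "b.jpg"], {"cct_k": 5200, "tint": -3}, "1.4.0")
```

Because the ID is a pure function of inputs and algorithm version, an imported profile keeps its ID unchanged, satisfying the round-trip requirement above.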
GPU-Optimized Throughput & Fallback
Given an NVIDIA T4 GPU and 24MP JPEGs, When processing a batch of 500 images, Then average throughput during normalization ≥ 3 images/second and GPU utilization ≥ 70% during the stage. Given GPU is unavailable or errors occur, When fallback is triggered, Then CPU path completes normalization at ≥ 1.5 images/second without data loss and logs the fallback event. Given processing 500 images, When monitoring memory, Then peak GPU memory used by normalization ≤ 2 GB and there are zero out-of-memory errors. Given the normalization stage, When complete, Then per-image stage timings and any kernel retries are logged with success status.
Downstream Parameter Handoff to Correction Stages
Given a validated baseline profile exists, When entering hue/swatch correction, Then WB gains, tint, and exposure curve are passed via a typed contract and acknowledged by the downstream stage. Given the handoff is successful, When running an A/B test (with vs. without baseline normalization) on 100 images with swatch references, Then average ΔE00 increases by ≥ 1.0 when normalization is disabled, demonstrating dependency. Given parameter version mismatch or missing fields, When the downstream stage initializes, Then the pipeline aborts before processing any image and returns a descriptive error code with remediation steps. Given normal operation, When inspecting run logs, Then the downstream stage records the received profile ID and parameter checksums for traceability.
Garment Segmentation & Protected Adjustments
"As a shop owner, I want color corrections applied only to the garment so that models and backgrounds remain natural."
Description

Isolate the garment using semantic segmentation and apply color transforms only within the garment mask while protecting skin tones, backgrounds, and props. Use edge-aware blending and texture-preserving adjustments to modify hue/chroma while maintaining luminance and fabric detail. Provide fallback handling for complex patterns and optional manual mask refinement on selected images. Store masks with image revisions and share them with other PixelLift tools to prevent conflicting edits.

Acceptance Criteria
Accurate Garment Mask Generation in Batch Uploads
- Given a batch of 200 product photos (8–24 MP) with a single primary garment and varied models/backgrounds, when segmentation runs, then mean IoU of the garment mask against labeled ground truth ≥ 0.85 and boundary F1 score (±2 px band) ≥ 0.90 across the batch.
- Given each image, when segmentation runs, then processing time per 24 MP image ≤ 3 seconds on reference hardware and peak memory per worker ≤ 1.5 GB.
- Given images with up to 20% garment occlusion, when segmentation runs, then IoU ≥ 0.78 and false-positive pixels outside the ground-truth garment ≤ 5% per image.
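The mean-IoU and boundary-F1 metrics referenced above can be computed as follows — a brute-force sketch over pixel coordinate sets, using a Chebyshev (square) matching band as a simplification of the ±2 px boundary tolerance:

```python
def mask_iou(pred, truth):
    """Intersection-over-union between two pixel sets of (x, y) tuples."""
    union = len(pred | truth)
    return len(pred & truth) / union if union else 1.0

def boundary_f1(pred_boundary, truth_boundary, tol=2):
    """Boundary F1 with a +/- tol px matching band (brute-force sketch)."""
    def matched_fraction(points, targets):
        hits = 0
        for (x, y) in points:
            # A point matches if any target lies within the tol-px band.
            if any(abs(x - tx) <= tol and abs(y - ty) <= tol
                   for tx, ty in targets):
                hits += 1
        return hits / len(points) if points else 1.0

    precision = matched_fraction(pred_boundary, truth_boundary)
    recall = matched_fraction(truth_boundary, pred_boundary)
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0
```

A production implementation would operate on rasterized mask arrays with distance transforms rather than Python sets, but the definitions are the same.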
Protected Color Adjustments Affect Only Garment Region
- Given a hue shift of +20° and chroma +10% is applied to the garment, when comparing non-garment pixels before vs after, then mean ΔE00 ≤ 0.5 and 95th percentile ΔE00 ≤ 1.0 across skin, background, and props.
- Given images containing detected skin regions, when color transforms are applied, then skin-tone cluster center shift in CIELAB ≤ 0.5 ΔE00 and luminance (L*) change ≤ 1.0.
- Given batch processing, when adjustments complete, then spillover area outside the garment mask ≤ 1.5% of total image pixels per image.
Edge-Aware Blending Without Halos on Garment Boundaries
- Given high-contrast edges between garment and background, when mask compositing occurs, then within a 3 px boundary band SSIM (luminance) ≥ 0.95 relative to original and color bleed across the boundary ≤ 0.5 ΔE00.
- Given 200% zoom inspection, when sampled at 10 random boundary points per image, then no visible halos > 1 px as measured by automated edge-contrast delta ≤ 5%.
- Given semi-transparent edges (e.g., chiffon), when blended, then edge transparency is preserved with alpha error MAE ≤ 0.05 within the boundary band.
Texture-Preserving Hue/Chroma Changes Maintain Fabric Detail
- Given hue rotations up to ±30° and chroma scaling 0.7–1.3 applied inside the garment mask, when analyzing high-frequency detail, then Laplacian variance decreases by ≤ 10% and local contrast (RMS) change ≤ 5% vs original.
- Given the luminance channel, when transforms are applied, then mean ΔL* inside the garment mask is within ±2.0 unless explicitly set by a luminance-adjusting preset.
- Given moiré-prone textures (knits, tweeds), when transforms are applied, then texture-preservation index (MS-SSIM) ≥ 0.90 within the garment region.
Fallback Handling and Confidence Signaling for Complex Patterns
- Given garments with complex patterns (plaid, lace, sheer, heavy reflections) or occlusions, when model confidence < 0.80, then the system flags Low Confidence, shrinks the mask by 2 px with feather ≤ 1 px, and proceeds using conservative blending.
- Given Low Confidence cases in a batch, when processing completes, then affected images are listed in the results with mask_confidence, reasons, and a one-click "Refine Mask" action.
- Given fallback is applied, when compared to ground truth on the subset, then spillover outside the garment ≤ 1.0% of pixels and garment coverage ≥ 90% of the ground-truth area.
Manual Mask Refinement Workflow and Persistence
- Given a user opens Refine Mask on a selected image, when they brush add/remove strokes, then the preview updates within 100 ms per stroke and edge snapping adheres within ±2 px of garment edges.
- Given refinement, when the user saves, then a new mask revision is created with incremented revision_id, author, timestamp, and the image reprocesses within 2 seconds with the new mask.
- Given refinement actions, when tested, then undo/redo supports at least 20 steps, zoom range is 50%–400%, and exiting without saving discards changes.
Mask Versioning and Cross-Tool Consistency
- Given a processed image, when the mask is stored, then it is saved non-destructively with metadata (mask_id, revision_id, dimensions, color space, confidence, tool_origin) and attached to the image revision.
- Given another PixelLift tool opens the same image, when it requests the current mask, then it receives the latest committed mask revision within 200 ms and uses it read-only unless explicitly checked out for edit.
- Given concurrent edit attempts on the same mask by two tools, when a second write is attempted, then a conflict error is returned and the second tool must create a new revision; no silent overwrites occur.
- Given an export, when the image is output, then the active mask revision_id is embedded in sidecar metadata and audit logs.
ΔE Accuracy Scoring & Tolerance Controls
"As a brand manager, I want to see ΔE accuracy scores and set a tolerance so that I can ensure color fidelity before exporting."
Description

Calculate ΔE00 between the corrected garment’s average Lab color and the target swatch, display per-image scores and batch statistics, and allow users to set tolerance thresholds that flag outliers. Present clear indicators on thumbnails, provide a detailed view with sampled regions, and enable CSV export of results. Record scores in image metadata for auditing and downstream quality checks.

Acceptance Criteria
Compute and Display Per-Image ΔE00 Scores in Batch Grid
Given a processed batch with a defined target swatch (photo or hex) and valid garment masks When scoring is executed Then for each image the system computes ΔE00 using CIEDE2000 between the target Lab and the average Lab of the masked garment region And stores the score with precision 2 decimals (rounded half up) And displays the numeric ΔE00 on the image thumbnail overlay and in the image detail panel And persists the score so it reappears after page refresh or re-login
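One subtlety in the "rounded half up" requirement: Python's built-in round() uses banker's rounding (round-half-to-even), so an explicit rounding mode is needed. A sketch using the standard decimal module (converting through str avoids binary-float surprises on tie values):

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up_2(value):
    """Round a ΔE00 score to 2 decimals, rounding ties upward."""
    return float(Decimal(str(value)).quantize(Decimal("0.01"),
                                              rounding=ROUND_HALF_UP))
```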
User-Defined Tolerance Thresholds and Outlier Flagging
Given a batch and a tolerance settable between 0.0 and 10.0 in 0.1 increments (default 2.0) When the tolerance is applied Then images with ΔE00 <= tolerance are marked Within Tolerance and others flagged Outlier And badges appear on thumbnails: green for within, red for outlier, with a tooltip showing ΔE value and tolerance And changing the tolerance updates flags and counts across the batch within 500 ms And users can filter the grid to show only outliers
Batch ΔE Statistics Summary
Given a batch with at least 1 scored image When viewing the batch header Then the app shows count, min, max, mean, median, and standard deviation of ΔE00 (2 decimals), plus counts within/out of tolerance And statistics exclude images with missing scores And values update within 500 ms after rescoring or tolerance change And exported CSV summary matches on-screen statistics to within 0.01
Detailed View with Sampled Regions and ΔE Breakdown
Given an image detail view is opened When the Scoring tab is selected Then the UI shows target Lab (L*, a*, b*), measured average Lab, and ΔE00 (2 decimals) And displays a 3x3 sampling grid over the garment mask with per-cell local ΔE00 values and highlights the worst cell And hovering a cell shows its Lab and ΔE00 And switching between swatch photo and hex target updates values accordingly within 300 ms
CSV Export of ΔE Results
Given a batch with scored images When the user clicks Export CSV Then the system downloads an RFC-4180 compliant UTF-8 CSV with a header row and one row per image And columns include: batch_id,image_id,filename,target_spec,target_L,target_a,target_b,measured_L,measured_a,measured_b,deltaE00,tolerance,within_tolerance,profile_id,processed_at And numeric fields use '.' decimal separator and 2 decimals for ΔE00 and Lab And processed_at is ISO 8601 with timezone And rows are sorted by filename ascending
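A sketch of the export above using Python's csv module, whose default quoting and CRLF line endings match RFC 4180; the column list mirrors the spec, while the row dictionaries are an assumed internal representation:

```python
import csv
import io

COLUMNS = ["batch_id", "image_id", "filename", "target_spec",
           "target_L", "target_a", "target_b",
           "measured_L", "measured_a", "measured_b",
           "deltaE00", "tolerance", "within_tolerance",
           "profile_id", "processed_at"]

def export_csv(rows):
    """Serialize scored rows (dicts keyed by COLUMNS) to CSV text,
    sorted by filename ascending per the spec."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\r\n")
    writer.writeheader()
    for row in sorted(rows, key=lambda r: r["filename"]):
        writer.writerow(row)  # missing keys are written as empty fields
    return buf.getvalue()
```

Formatting the numeric fields to two decimals with a '.' separator would happen when building each row dict, before serialization.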
Embed ΔE Scores in Image Metadata for Auditing
Given an image is saved or exported When embedding metadata Then XMP fields are written under the PixelLift namespace: DeltaE00, TargetLab, MeasuredLab, Tolerance, WithinTolerance, BatchId, ProfileId, ProcessedAt, SoftwareVersion And for JPEG/PNG the XMP is embedded in-file; for formats without XMP support a sidecar .xmp is created with the same basename And reading the file/sidecar immediately after save returns identical values to the UI And the operation does not alter pixel data or ICC profile
Color Management and ΔE00 Calculation Fidelity
Given images with embedded ICC profiles and targets specified as swatch photo or hex When converting to Lab for scoring Then image colors are converted to CIE Lab (D50) using ICC color management (respect embedded profile; assume sRGB if absent) And hex targets are interpreted as sRGB, converted to Lab (D50) via ICC, then compared And ΔE00 uses the CIEDE2000 formula; results for a provided reference set match expected values within ±0.01 And repeated runs with the same inputs produce identical scores
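The ΔE00 values referenced throughout are defined by the CIEDE2000 formula. For reference, a self-contained implementation; it reproduces the first of the standard Sharma et al. test pairs to well within the spec's ±0.01 tolerance:

```python
import math

def delta_e_2000(lab1, lab2):
    """CIEDE2000 color difference between two CIELAB triples (L*, a*, b*)."""
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2
    C1, C2 = math.hypot(a1, b1), math.hypot(a2, b2)
    Cbar = (C1 + C2) / 2
    G = 0.5 * (1 - math.sqrt(Cbar**7 / (Cbar**7 + 25**7)))
    a1p, a2p = (1 + G) * a1, (1 + G) * a2          # chroma-compensated a*
    C1p, C2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p = math.degrees(math.atan2(b1, a1p)) % 360   # hue angles in degrees
    h2p = math.degrees(math.atan2(b2, a2p)) % 360
    dLp, dCp = L2 - L1, C2p - C1p
    if C1p * C2p == 0:
        dhp = 0.0
    else:
        dhp = h2p - h1p
        if dhp > 180:
            dhp -= 360
        elif dhp < -180:
            dhp += 360
    dHp = 2 * math.sqrt(C1p * C2p) * math.sin(math.radians(dhp) / 2)
    Lbp, Cbp = (L1 + L2) / 2, (C1p + C2p) / 2
    if C1p * C2p == 0:
        hbp = h1p + h2p
    elif abs(h1p - h2p) <= 180:
        hbp = (h1p + h2p) / 2
    elif h1p + h2p < 360:
        hbp = (h1p + h2p + 360) / 2
    else:
        hbp = (h1p + h2p - 360) / 2
    T = (1 - 0.17 * math.cos(math.radians(hbp - 30))
         + 0.24 * math.cos(math.radians(2 * hbp))
         + 0.32 * math.cos(math.radians(3 * hbp + 6))
         - 0.20 * math.cos(math.radians(4 * hbp - 63)))
    d_theta = 30 * math.exp(-(((hbp - 275) / 25) ** 2))
    RC = 2 * math.sqrt(Cbp**7 / (Cbp**7 + 25**7))
    SL = 1 + 0.015 * (Lbp - 50) ** 2 / math.sqrt(20 + (Lbp - 50) ** 2)
    SC = 1 + 0.045 * Cbp
    SH = 1 + 0.015 * Cbp * T
    RT = -math.sin(math.radians(2 * d_theta)) * RC
    return math.sqrt((dLp / SL) ** 2 + (dCp / SC) ** 2 + (dHp / SH) ** 2
                     + RT * (dCp / SC) * (dHp / SH))
```

Because the formula is deterministic, repeated runs over the same Lab inputs necessarily produce identical scores, satisfying the reproducibility clause above.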
Batch Preview & Approval Workflow
"As a busy seller, I want an efficient review screen with before/after and bulk approvals so that I can quickly finalize large batches."
Description

Offer a fast before/after preview grid with zoom, a swatch chip overlay, and ΔE badges. Enable bulk approve/reject/needs-review actions, keyboard shortcuts, and per-image notes. Maintain version history to compare alternate corrections. Gate exports on approval status to prevent accidental release of out-of-tolerance images and surface review status in batch summaries.

Acceptance Criteria
Preview Grid: Before/After with Zoom, Swatch Overlay, and ΔE Badges
Given a processed batch with a configured swatch (hex value or swatch photo) and an active SwatchMatch profile When the user opens the Batch Preview Then a grid renders tiles with a Before/After toggle, a swatch chip overlay, and a ΔE badge for the After image. Given the user toggles Zoom When zooming using keyboard shortcut Z or mouse wheel Then the image zooms between 25% and 400% with pan support and the zoom level stays in sync between Before and After views for the selected tile. Given the swatch chip overlay toggle is ON When the user drags the chip Then it can be repositioned within the image bounds and displays the swatch hex value; the ON/OFF state persists for the session. Given a batch tolerance value is set (in ΔE units) When ΔE is calculated (CIEDE2000) for each After image against the swatch Then the ΔE badge displays the value to two decimals and is color-coded: green ≤ tolerance, amber > tolerance and ≤ tolerance + 2, red > tolerance + 2.
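The badge color-coding rule above reduces to a small threshold function — a sketch:

```python
def badge_color(delta_e, tolerance):
    """Map a ΔE00 value to the badge color: green within tolerance,
    amber up to tolerance + 2, red beyond that."""
    if delta_e <= tolerance:
        return "green"
    if delta_e <= tolerance + 2:
        return "amber"
    return "red"
```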
Bulk Approve/Reject/Needs-Review Actions
Given the user selects 1–1000 images in the grid When they invoke Approve, Reject, or Needs Review via toolbar or shortcuts Then the selected images’ statuses update consistently and the operation completes within 3 seconds for 500 images. Given a bulk action completes When the user filters by any status Then the grid reflects the updated counts and selection is cleared.
Keyboard Shortcuts for Review Workflow
Given the review grid is focused When the user presses A, R, N, Right Arrow, Left Arrow, Z, O, or V Then the app performs Approve, Reject, Needs Review, Next, Previous, Zoom toggle, Swatch overlay toggle, or Version compare toggle respectively and shows a 1-second on-screen confirmation. Given the app runs on macOS or Windows in Chrome When shortcuts are used Then they do not conflict with default browser shortcuts and are listed in the in-UI shortcut help modal (opened with ?).
Per-Image Notes with Audit Trail
Given an image tile is selected When the user adds a note up to 500 characters and clicks Save Then the note is stored with timestamp and author and appears in the image’s notes panel immediately. Given an existing note authored by the current user When they edit or delete it Then changes are saved, an edit history is retained with timestamps, and the latest content is displayed; deleted notes are no longer shown in the default view. Given notes exist across images in a batch When the user applies the "Has Notes" filter Then only images with at least one note are shown.
Version History and Side-by-Side Comparison
Given corrections are re-applied (e.g., profile tweak or swatch update) When the user saves the changes Then a new immutable version is created with auto-incremented ID, timestamp, and per-image ΔE summary; previous versions remain accessible. Given two versions are selected in the Version sidebar When the user clicks Compare Then a side-by-side comparison opens for the same image with synchronized zoom/pan and per-version ΔE badges; a Revert action creates a new version identical to the selected prior version.
Export Gated by Approval and Color Tolerance
Given the user initiates Export for a batch When validation runs Then only images with status Approved and ΔE ≤ batch tolerance are eligible; any Not Reviewed, Needs Review, or Rejected images, or ΔE > tolerance, block export. Given blocking items exist When the export dialog appears Then it lists counts and thumbnails of blockers and provides a "Filter to blockers" action that takes the user back to the grid with the appropriate filter applied. Given all images are eligible When the user confirms export Then the export completes and a summary shows counts exported, file destinations, and zero blockers.
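The export gate can be sketched as a simple partition over approval status and ΔE; the dict keys are assumptions, not PixelLift's actual data model:

```python
def partition_for_export(images, tolerance):
    """Split images into export-eligible items and blockers.
    Eligible = status Approved AND ΔE within the batch tolerance."""
    eligible, blockers = [], []
    for img in images:
        if img["status"] == "Approved" and img["delta_e"] <= tolerance:
            eligible.append(img)
        else:
            blockers.append(img)
    return eligible, blockers
```

The blockers list is what the export dialog would render as counts and thumbnails, and what the "Filter to blockers" action would select in the grid.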
Batch Summary with Review Status Surface
Given a processed batch exists When the user opens the Batch Summary panel Then the panel displays total image count and counts/percentages for Approved, Rejected, Needs Review, and Unreviewed, along with the ΔE tolerance and a mini histogram of ΔE values. Given the user clicks any status metric in the summary When they do so Then the grid filters to that status and the URL updates to reflect the filter for deep-linking. Given statuses change via user actions When changes occur Then the summary metrics update within 1 second without a full page reload.
Channel-Optimized Export & Metadata
"As an e-commerce seller, I want channel-ready exports with embedded color profiles so that colors render consistently across marketplaces."
Description

Generate channel-specific export variants (e.g., Shopify, Amazon, Instagram) with correct color space (sRGB IEC 61966-2.1), compression, and size presets. Embed ICC profiles and write metadata tags for target color and ΔE score. Support JPG/WEBP/PNG and deterministic file naming. Allow filtering to exclude out-of-tolerance images and preserve transparency when applicable. Expose the same options via API for automation.

Acceptance Criteria
Channel Preset Export Compliance (Shopify, Amazon, Instagram)
Given a processed batch with a selected channel profile (Shopify, Amazon, or Instagram) When the user exports with Channel-Optimized enabled Then each output image embeds the sRGB IEC 61966-2.1 ICC profile And pixel dimensions and resizing rules exactly match the selected profile's preset values And compression settings (format and quality) exactly match the profile's preset values And generated file size per image is within the profile's configured maximum (if specified) And the export summary lists the profile applied and per-file validation status with zero errors
ICC Profile Embedding and Color Metadata
Given export format is JPG, PNG, or WebP When export completes Then the file contains an embedded sRGB IEC 61966-2.1 ICC profile And XMP metadata includes target_color_hex and delta_e_2000 with the measured values for that image And reading the file with standard metadata tools returns those same values without loss of precision And metadata writing succeeds for all selected formats; failures are logged and cause the file to be marked failed
Deterministic File Naming Pattern
Given the default naming template "{sku}_{channel}_{variant}_{width}x{height}.{ext}" When exporting the same batch twice with identical inputs and settings Then every output file name is identical across runs and contains only ASCII letters, digits, hyphens, and underscores And names are unique within a batch; if a collision would occur, a numeric suffix _2, _3, ... is appended deterministically based on stable sort order And the total count of files equals expected variants times input images minus any excluded items
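The naming and collision rules above can be sketched as follows: fill the template, sanitize to the allowed character set, and append _2, _3, ... to later occurrences of a colliding name in stable input order. (This sketch assumes suffixed names do not themselves collide again; a full implementation would re-check.)

```python
import re

def build_names(items,
                template="{sku}_{channel}_{variant}_{width}x{height}.{ext}"):
    """Deterministically name export files; item field names are assumptions."""
    seen = {}
    names = []
    for item in items:
        name = template.format(**item)
        name = re.sub(r"[^A-Za-z0-9._-]", "_", name)  # ASCII-safe characters
        if name in seen:
            seen[name] += 1
            stem, dot, ext = name.rpartition(".")
            name = f"{stem}_{seen[name]}{dot}{ext}" if dot \
                else f"{name}_{seen[name]}"
        else:
            seen[name] = 1
        names.append(name)
    return names
```

Because both the template expansion and the suffix counter depend only on the input order and values, two runs with identical inputs yield identical file names, as the criterion requires.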
Out-of-Tolerance ΔE Filtering
Given a per-batch ΔE tolerance T is set When exporting Then any image with measured ΔE > T is excluded from export And the export summary/report lists excluded filenames with their ΔE values and the threshold T And the API response includes excluded_ids and exported_ids arrays with correct counts
Transparency Preservation for Alpha Assets
Given source images contain transparency and Preserve Transparency is enabled When exporting to PNG or WebP Then the alpha channel is preserved without premultiplication or unintended matte edges And color values in opaque regions remain unchanged aside from the channel preset's intended adjustments When exporting to JPEG while transparency is present Then the user must choose an explicit matte/background option or switch to an alpha-supporting format before export proceeds
Multi-Format Export Validation
Given the user selects multiple output formats (JPG, PNG, WebP) for a channel profile When exporting Then each requested format is produced per variant with the correct format-specific settings And each file passes validation for correct format signature, embedded ICC, and required metadata fields And a missing or unsupported format selection fails fast with a clear error before export starts
API Parity and Idempotent Automation
Given an API request specifying channel profile, formats, naming template, ΔE tolerance, and transparency options When the request is submitted with a valid idempotency key Then the job is accepted and returns a job_id, status endpoint, and predicted output count And the produced artifacts have identical file names, formats, ICC profiles, and metadata values to those created via the UI with the same inputs and settings And invalid parameters produce HTTP 400 with machine-readable error codes and fields; unauthorized requests produce 401
Preset & API Integration
"As a developer integrating PixelLift, I want SwatchMatch configurable via presets and API so that I can automate color-matched exports in my pipeline."
Description

Integrate SwatchMatch as a configurable step in PixelLift style presets and expose full functionality via public API parameters (swatch input, tolerance, export profile). Support saving, sharing, and versioning of presets; provide idempotent job submission, webhooks for status, and RBAC-aligned access. Ensure preset execution is deterministic so teams can reuse color-matching workflows across catalogs.

Acceptance Criteria
Add SwatchMatch Step to Style Preset (Configurable Parameters)
Given a user with Editor role opens the Preset Builder When they add the SwatchMatch step Then fields are available to configure: swatch_input (hex|image_upload|image_url), tolerance_deltaE (0–10), export_profile (sRGB|AdobeRGB|Channel:Shopify|Channel:Amazon), per_batch_profiling (on|off), and step_position And invalid values are rejected with inline errors and save is blocked And on save a presetId and version are returned and the persisted preset JSON includes the configured parameters and step order
Preset Versioning, Sharing, and RBAC Enforcement
Given a saved preset at version v1 When a user edits and saves changes Then a new immutable version v2 is created and v1 remains executable, and the response includes presetId and version=v2 Given an Owner shares the preset with a team as Viewer When a Viewer attempts to edit the preset Then the action is forbidden (HTTP 403) When a Viewer runs the preset Then the run succeeds and is attributed to the Viewer in the audit log Given an API key scoped to Project A When it calls GET/POST on presets in Project B Then the request is denied with HTTP 403 and no data is leaked
Deterministic Execution with Pinned Preset Version
Given identical input images, the same presetId:version, and the same API parameters When two jobs execute at different times Then the produced output image SHA-256 digests are identical and ΔE scores match to two decimal places Given the underlying model build hash has changed since the preset version was created When a job is started without updating the preset version Then the system blocks execution and returns HTTP 409 with an error indicating non-deterministic build change and instructions to bump the preset version or pin the previous build
Public API: Swatch Parameters and Validation
Given a client POSTs /v1/jobs with body containing: presetId:version, images[], swatch_input (hex "#RRGGBB" or swatch_url or swatch_file), tolerance_deltaE (0–10), and export_profile (enum) When all parameters are valid Then the API responds 202 Accepted with jobId and echoes normalized parameters in the response Given an invalid parameter (e.g., malformed hex, unsupported export_profile, tolerance outside range) When the request is submitted Then the API responds 400 with field-specific error codes and messages Given swatch_input is a swatch_url that is unreachable When the request is submitted Then the API responds 422 with code=swatch_unfetchable and the job is not created Given tolerance_deltaE is omitted and defined in the preset When the request is submitted Then the job uses the preset value; if provided in the request, the request value overrides the preset
Idempotent Job Submission
Given a client submits POST /v1/jobs with Idempotency-Key=K and request body B When the same request with Idempotency-Key=K is retried within the idempotency window Then the API returns the original jobId and does not create a new job or bill twice Given a retry with Idempotency-Key=K but a different request body B' When the request is received Then the API responds 409 idempotency_mismatch and no new job is created Given a successful job creation When querying the idempotency record Then the key and response are retained for at least 24 hours
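The idempotency behavior above — replay the original response for a matching key and body, reject a mismatched body with 409, retain records for at least the window — can be sketched with an in-memory store; a real service would use a durable store such as Redis or a database:

```python
import time

class IdempotencyStore:
    """In-memory sketch of idempotent job creation (window in seconds)."""

    def __init__(self, window=24 * 3600):
        self.window = window
        self._records = {}  # key -> (request body, response, stored_at)

    def submit(self, key, body, create_job):
        now = time.time()
        record = self._records.get(key)
        if record and now - record[2] < self.window:
            stored_body, response, _ = record
            if stored_body != body:
                # Same key, different payload: refuse, create nothing.
                return {"status": 409, "error": "idempotency_mismatch"}
            return response  # replay original response; no new job, no bill
        response = {"status": 202, "job_id": create_job(body)}
        self._records[key] = (body, response, now)
        return response
```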
Webhooks: Job Status and Security
Given a project webhook endpoint is configured with a signing secret When a job status changes to queued, processing, completed, or failed Then a POST is sent for each transition containing event_type, jobId, presetId:version, ΔE stats, and export variants, and includes headers X-PixelLift-Signature (HMAC-SHA256) and X-PixelLift-Timestamp Given the endpoint responds with a non-2xx status When delivery is attempted Then the system retries at least 3 times with exponential backoff and marks the attempt failed after the final retry; delivery results are visible in logs Given the endpoint responds 2xx When delivery is processed Then the event is marked succeeded and no further retries occur
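The scheme implied by the X-PixelLift-Signature / X-PixelLift-Timestamp headers can be sketched as timestamped HMAC-SHA256. The `timestamp.body` message format below is an assumption — the spec does not pin down the exact signed payload — but the constant-time comparison and freshness check are standard practice for any variant:

```python
import hashlib
import hmac
import json
import time

def sign_event(secret, payload, timestamp=None):
    """Sender side: serialize the event and sign 'timestamp.body'."""
    ts = str(int(timestamp if timestamp is not None else time.time()))
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    mac = hmac.new(secret.encode(), f"{ts}.{body}".encode(), hashlib.sha256)
    return body, ts, mac.hexdigest()

def verify_event(secret, body, ts, signature, tolerance_s=300):
    """Receiver side: constant-time compare plus replay-window check."""
    expected = hmac.new(secret.encode(), f"{ts}.{body}".encode(),
                        hashlib.sha256).hexdigest()
    fresh = abs(time.time() - int(ts)) <= tolerance_s
    return fresh and hmac.compare_digest(expected, signature)
```

Including the timestamp in the signed message (not just as a header) is what prevents an attacker from replaying a captured delivery with a fresher timestamp.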
Batch Exports and Channel-Optimized Variants
Given a preset with export_profile specifying Channel:Amazon and Channel:Shopify When a batch job completes Then each source image produces per-channel outputs named per template, color-managed to the target profile, and includes metadata listing target channel and ΔE per item Given any item exceeds tolerance_deltaE When results are generated Then the item is flagged as out_of_tolerance=true in the job results payload Given no export_profile is specified When the job runs Then outputs default to sRGB profile

ContourShadow

Adds physically‑plausible interior and ground shadows to restore depth after mannequin removal. One slider controls intensity with marketplace‑safe presets; auto‑generates shadow/no‑shadow variants for channels that restrict effects while keeping images conversion‑ready.

Requirements

Physically-Based Shadow Synthesis Engine
"As a boutique owner batch-editing product photos, I want realistic shadows that match my products’ shapes and contact points so that my listings look professional and drive more conversions."
Description

Generates physically plausible interior and ground shadows from mannequin-removed product cutouts by inferring product contours, contact points, and approximate scene lighting, producing soft, directionally consistent shadows that restore perceived depth without violating marketplace background rules. Integrates with PixelLift’s background removal output, supports high-resolution exports, honors transparent PNG and JPEG white backgrounds, and exposes a parameter API for opacity, softness, and falloff while defaulting to safe values. The engine must be deterministic for identical inputs and support GPU acceleration to meet batch SLAs, delivering realistic depth restoration that boosts conversion while maintaining channel compliance.

Acceptance Criteria
Deterministic Rendering for Identical Inputs
Given the same input image, cutout mask, engine version, and parameter values (opacity, softness, falloff, intensity), When the engine runs twice on the same machine and GPU with determinism enabled, Then the SHA-256 hash of the output image bytes is identical across runs and embedded metadata matches exactly. Given a fixed random seed, When the engine runs across separate processes on identical hardware/driver versions, Then the output bytes are bitwise identical. Given no seed is provided, When the engine runs in deterministic mode, Then a fixed default seed is applied so identical inputs yield identical outputs.
Marketplace-Compliant PNG and JPEG Outputs
Given output format PNG is selected, When the engine renders ground shadows, Then all fully transparent pixels (alpha=0) have RGB=#FFFFFF and no color fringing wider than 1 px is present at matte edges (CIEDE2000 ΔE <= 2 for pixels with alpha ≤ 5%). Given output format JPEG (white background) is selected for a no-shadow variant, When the engine exports, Then all background pixels outside the product mask are pure white (#FFFFFF) with max per-channel deviation ≤ 1/255. Given output format JPEG (white background) is selected for a shadow variant, When the engine exports, Then shadows are neutral gray (|a*| and |b*| ≤ 1.5 in CIELAB for shadow pixels) and non-shadow background remains #FFFFFF. Given any format, When saving, Then color space is sRGB with embedded profile and output dimensions equal the input (±0 px).
Parameter API with Safe Defaults and Validation
Given the parameter API, When clients set opacity, softness, and falloff, Then accepted ranges are: opacity ∈ [0.0, 0.6], softness ∈ [0.0, 1.0] (normalized to longest-side pixels), falloff ∈ [0.0, 1.0]. Given parameters are omitted, When the engine runs, Then defaults are applied: opacity=0.25, softness=0.5, falloff=0.5, producing marketplace-safe shadows (opacity ≤ 0.35). Given out-of-range values, When a request is processed, Then values are clamped to nearest bound and a validation warning is included in the response metadata. Given an intensity slider value v ∈ [0,100], When mapped to parameters, Then opacity increases monotonically with v and the mapping is invertible within ±1 slider unit. Given a named preset ("None","Light","Standard","Bold"), When applied, Then parameters match the preset definitions and never exceed opacity 0.45; the default preset is "Standard". Given invalid types for parameters, When the API is called, Then the service returns HTTP 400 with a machine-readable error and no image is produced.
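The clamping rule and the monotonic, invertible slider mapping above can be sketched with a linear curve — the simplest mapping that satisfies both properties; the shipped product might use a perceptual curve instead. Note this sketch maps the full slider range onto the API's [0.0, 0.6] opacity bound, not the lower preset-specific caps:

```python
def clamp(value, lo, hi):
    """Clamp out-of-range values to the nearest bound, per the spec."""
    return max(lo, min(hi, value))

def slider_to_opacity(v):
    """Map intensity slider v in [0, 100] to opacity in [0.0, 0.6].
    Linear, hence strictly monotonic and exactly invertible."""
    return 0.6 * clamp(v, 0, 100) / 100

def opacity_to_slider(opacity):
    """Inverse mapping, rounded to the nearest integer slider position."""
    return round(clamp(opacity, 0.0, 0.6) / 0.6 * 100)
```

Round-tripping any integer slider value through both functions returns the original value, which meets the "invertible within ±1 slider unit" criterion with margin.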
GPU-Accelerated Batch Processing SLA
Given an NVIDIA T4 (16 GB) with 8 vCPU host, When processing a batch of 500 images with max dimension 2048 px, Then average per-image processing time ≤ 1.0 s, 95th percentile ≤ 1.5 s, and total wall time ≤ 8 min 30 s. Given GPU acceleration is enabled, When compared to CPU-only mode on the same host, Then end-to-end throughput is ≥ 3× faster. Given a batch run, When monitoring resources, Then GPU utilization averages ≥ 60% and the job completes without OOM errors or crashes (0 failed items).
Directionally Consistent Soft Shadow Synthesis
Given the engine’s estimated dominant lighting direction, When generating ground shadows, Then the principal axis of the shadow aligns within ±10° of the estimated light direction. Given detected contact points at the product-ground intersection, When rendering contact shadows, Then the blur radius at contact is ≤ 2 px at 2048 px resolution and increases with distance (correlation of penumbra width with distance ≥ 0.8 over sampled points). Given interior cavities in the product silhouette, When rendering interior shadows, Then average interior shadow opacity is between 0.05 and 0.25 and no shadow paint bleeds outside the product mask (for PNG: alpha outside mask remains 0; for JPEG: non-shadow background remains #FFFFFF). Given any generated shadow, When evaluating hue neutrality, Then average chroma of shadow pixels in CIELAB is ≤ 2.0.
Integration with Background Removal and High-Resolution Support
Given a PixelLift background-removal output (alpha matte + product cutout), When passed to the engine, Then the engine uses the matte without altering mask topology (Jaccard index between input and used mask ≥ 0.995). Given an input up to 8000 px on the long side, When processing, Then the engine completes without error and preserves exact dimensions and alignment (no cropping/padding drift; output width/height equal input). Given edge pixels at the matte boundary, When compositing shadows, Then halo width ≤ 1 px and no hard seams are introduced (edge gradient continuity within 2σ of interior mean). Given export settings, When saving PNG or JPEG, Then files are PNG-24 with alpha or JPEG (Q ≥ 90) in sRGB with embedded ICC profile.
Auto-Generation of Shadow and No-Shadow Variants
Given variant generation is enabled, When processing a single asset, Then two outputs are produced: one "_shadow" and one "_no-shadow" (or API-specified pattern) with variant metadata included. Given both variants, When pixel-wise comparing outside shadow regions, Then pixels are bitwise identical; differences are confined to shadow pixels only. Given JPEG white background is requested, When generating the no-shadow variant, Then all background pixels outside the product mask are pure white (#FFFFFF) with max per-channel deviation ≤ 1/255. Given transparent PNG is requested, When generating the no-shadow variant, Then all background pixels have alpha=0 and no interior shading is applied within the product area.
Intensity Slider with Live Preview
"As a seller preparing a catalog, I want a simple slider to quickly dial in how strong the shadows appear so that I can match my brand style without complex settings."
Description

Implements a single, discoverable slider control (0–100) that adjusts composite shadow intensity in real time on the canvas with instant visual feedback and keyboard step controls, mapping slider positions to predefined opacity/softness curves that remain consistent across batches. Supports per-image tweaks and batch-apply, persists in presets, and includes accessible labels and tooltip guidance. Rendering is latency-optimized (<150 ms) via progressive preview with a high-quality refine on mouse-up to keep editing fluid.

Acceptance Criteria
Slider Discoverability and Range
Given the ContourShadow tool is opened, When the panel renders at a 1366x768 viewport without scrolling, Then a single control labeled "Shadow Intensity" is visible and identifiable as a slider. Given the slider is present, When inspected, Then its range is 0–100 inclusive with integer step = 1 and the current value is displayed as an integer. Given the slider value is set to 0 or 100, When the value is applied, Then the rendered intensity corresponds to the minimum or maximum effect without clipping, artifacts, or numeric overflow.
Real-time Progressive Preview and Mouse-up Refine
Given the user drags the slider, When the value changes, Then the on-canvas preview updates within 150 ms of each change (95th percentile across 50 changes) and remains responsive. Given the user releases the pointer, When the refine process starts, Then a high-quality render replaces the preview within 600 ms (95th percentile) and matches the final export within SSIM ≥ 0.99. Given rapid value changes, When throttling occurs, Then intermediate changes may be coalesced but the latest change is never dropped, and the final visible preview reflects the latest value prior to refine.
Keyboard Step Controls
Given the slider has focus, When Left/Down Arrow is pressed, Then the value decreases by 1 (min 0) and the canvas updates accordingly. Given the slider has focus, When Right/Up Arrow is pressed, Then the value increases by 1 (max 100) and the canvas updates accordingly. Given the slider has focus, When Shift + Arrow is pressed, Then the value changes by 5 per keypress (clamped 0–100) with live preview. Given the slider has focus, When PageDown/PageUp is pressed, Then the value changes by 10 per keypress (clamped 0–100). Given the slider has focus, When Home/End is pressed, Then the value jumps to 0/100 respectively.
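The keyboard behavior above reduces to a small pure function: step by 1, 5 with Shift, 10 for Page keys, jump for Home/End, always clamped to 0–100. A sketch (DOM-style key names are an assumption):

```python
def apply_key(value, key, shift=False):
    """Slider keyboard handling: arrows step by 1 (5 with Shift),
    PageUp/PageDown by 10, Home/End jump to the bounds; clamp to 0-100."""
    steps = {
        "ArrowLeft": -1, "ArrowDown": -1,
        "ArrowRight": +1, "ArrowUp": +1,
        "PageDown": -10, "PageUp": +10,
    }
    if key == "Home":
        return 0
    if key == "End":
        return 100
    delta = steps.get(key, 0)
    if shift and key in ("ArrowLeft", "ArrowDown", "ArrowRight", "ArrowUp"):
        delta *= 5
    return max(0, min(100, value + delta))
```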
Curve Mapping Consistency Across Batches
Given a predefined opacity/softness curve mapping version V is active, When two different images in the same batch are set to the same slider value X, Then the derived internal parameters are identical for both images (exact numeric equality) and produce visually consistent output (SSIM ≥ 0.99). Given a project is reopened, When the same slider value X is applied under mapping version V, Then the resulting parameters are identical to prior runs. Given mapping version V is updated to V+1, When a preset saved under V is loaded, Then a migration preserves visual output within tolerance (SSIM ≥ 0.99) or warns the user if not possible.
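Exact numeric equality across images and sessions follows if the slider-to-parameter mapping is a pure function of (value, version). A sketch with hypothetical example curves (the real opacity/softness curves are not specified in this document):

```python
def map_intensity(slider, mapping_version=1):
    """Hypothetical mapping from slider position (0-100) to internal shadow
    parameters. Pure and deterministic, so the same (value, version) pair
    always yields identical parameters across images, batches, and sessions."""
    if not 0 <= slider <= 100:
        raise ValueError("slider out of range")
    t = slider / 100.0
    # Illustrative curves: opacity eases in quadratically, softness grows linearly.
    opacity = round(0.35 * t * t + 0.05 * t, 6)
    softness_px = round(2.0 + 18.0 * t, 6)
    return {"version": mapping_version, "opacity": opacity, "softness_px": softness_px}
```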
Batch Apply with Per-Image Overrides
Given multiple images are selected, When the user invokes Batch Apply at slider value X, Then all selected images are set to X and preview/refine are triggered per image. Given an image was batch-set to X, When the user manually changes its slider to Y, Then the image is flagged as overridden and other images remain at X. Given overrides exist, When Batch Apply is executed again at value Z, Then all selected images adopt Z regardless of prior overrides, and no unintended changes occur outside the selection.
Preset Save/Load Persistence
Given the current slider value is X and mapping version is V, When the user saves a style preset, Then the preset stores X and V. Given a saved preset, When it is loaded in a new session, Then the slider restores to X, uses mapping version V (or a migrated equivalent with SSIM ≥ 0.99), and triggers preview then refine. Given presets are saved, When the application restarts, Then the presets remain available and loadable without corruption.
Accessibility and Tooltip Guidance
Given the slider is rendered, When inspected by assistive tech, Then it exposes role=slider with aria-valuemin=0, aria-valuemax=100, aria-valuenow reflecting the current value, and an accessible name "Shadow Intensity". Given keyboard or focus navigation, When the slider receives focus, Then a visible focus indicator appears with contrast ratio ≥ 3:1 against adjacent colors. Given hover or focus for ≥ 600 ms, When the tooltip is shown, Then it contains guidance including "0 = no shadow" and "Use arrow keys for fine control". Given a screen reader is active, When the value changes via keyboard, Then it announces "Shadow Intensity X percent" without repeating stale values.
Marketplace-Safe Presets
"As a merchant publishing to multiple marketplaces, I want ready-made shadow settings that are compliant for each channel so that my images are accepted everywhere without manual tweaking."
Description

Provides a curated library of shadow presets tuned for major marketplaces (e.g., Amazon, Etsy, eBay, Shopify) that enforce channel-specific constraints such as minimal cast shadow, white background thresholds, and maximum opacity, with clear labels and guardrails to prevent non-compliant outputs. Presets are editable, saveable per brand, and selectable at batch start; underlying parameters map to the shadow engine and intensity slider. The system auto-suggests a preset based on channel metadata and allows quick switching without re-upload.

Acceptance Criteria
Auto-Suggest Preset by Channel Metadata
- Given a batch has channel metadata "Amazon", when the user opens ContourShadow, then the preset "Amazon (Marketplace-Safe)" is preselected.
- Given channel metadata is absent, when the user opens ContourShadow, then the preset "Neutral Marketplace-Safe" is selected and a prompt requests channel selection.
- When the user changes the channel field, then the auto-suggested preset updates without altering other batch settings.
- Then an audit log entry records the channel source and the selected preset ID for the batch.
Curated Library Availability & Clear Labels
- Given the preset picker is opened, then presets for Amazon, Etsy, eBay, and Shopify are available out of the box.
- Then each preset label shows the channel name, a "Marketplace-Safe" badge, and a concise rule summary (e.g., BG: white-only, Shadow: minimal, Opacity: max).
- When the user opens preset details, then the full ruleset (background threshold, max shadow opacity, max extent, crop margin) and rule version are displayed with a link to the marketplace policy.
- When the user searches by channel name, then matching presets are filtered and selectable.
Guardrails Enforced in Editor
- When the user adjusts intensity or advanced parameters beyond preset limits, then values are clamped and an inline message explains the constraint.
- Given the current configuration violates the preset rules, then Save and Export actions are disabled until parameters return to compliance.
- When the user clicks "Restore preset", then all parameters revert to the preset defaults in one action.
- Given a selected preset, when the intensity slider is moved, then it maps to engine parameters via the preset’s mapping function, monotonically and without producing out-of-bounds values.
Compliance at Export and Variant Generation
- Given a marketplace preset is selected, when the user exports, then 100% of exported images pass the preset’s automated compliance checks.
- When any image fails compliance, then export is blocked for failing images, a report lists image IDs and violated rules, and compliant images remain exportable.
- Given a channel that prohibits shadows, when exporting, then a no-shadow variant is auto-generated alongside any shadowed version using the channel’s required filename suffix.
- Then the export manifest includes preset ID, ruleset version, and pass/fail counts per rule.
Editable and Brand-Saveable Presets
- When the user selects "Save as Brand Preset" from a marketplace preset, then editable fields can be changed and compliance-locked fields are read-only.
- When saving a brand preset, then the name must be unique per brand; duplicates are rejected with a clear message.
- When a brand preset is edited, then a new version is created with timestamp and owner; prior versions remain accessible.
- Then brand presets are scoped to the brand and visible to team members with brand access in the preset picker.
Preset Selection at Batch Start and Propagation
- Given batch creation is in progress, when the user selects a preset, then it applies to all items in the batch by default even if uploads are still in progress.
- When the user applies a per-image override, then the override is stored and displayed, while the batch-level preset remains the default for other items.
- When the batch preset is changed before processing begins, then queued jobs update to the new preset without re-uploading source images.
- Then the selected preset is shown in the batch summary UI and returned via API for the batch.
Quick Switch Without Re-upload and Performance
- When switching between presets during review, then images re-render using existing source files without requiring re-upload.
- Then the first 20 previews update within 15 seconds for batches up to 500 images at 1024px long edge on the standard compute tier.
- Then full-batch re-render throughput is at least 50 images per minute on the standard compute tier.
- When a preset is changed after export, then previously exported assets are flagged as stale in the batch history and export panel.
Auto Variant Generation & Channel Routing
"As a seller syndicating listings across channels, I want PixelLift to export both shadowed and clean variants to the right places so that I don’t have to create and manage duplicates manually."
Description

Automatically generates both “shadowed” and “no-shadow” variants per image during export, tags them with channel-specific metadata, and routes them to the appropriate destinations, filenames, and folders (or via API/webhooks) according to a user-defined mapping. Ensures deterministic visual parity except for shadows, keeps file sizes within marketplace limits, and records variant lineage for auditability. Users can enable or disable variants per channel and see export counts along with success or failure states.

Acceptance Criteria
Dual Variant Generation on Export
Given a batch export of N source images with ContourShadow enabled And auto-variant generation is enabled in the export profile When the export job completes Then exactly two variants per source image exist in the job manifest: shadowed and no_shadow And both variants are generated using the same preset and non-shadow parameters And both variants have unique variant_ids linked to their source_asset_id
Channel-Specific Metadata Tagging
Given a channel mapping defines required metadata fields per destination When variants are generated for each channel Then each variant is tagged with channel_id, variant_type, export_job_id, mapping_id, preset_id, and checksum And tag values conform to the mapping's required keys and formats And for API/webhook destinations, the delivery payload includes these metadata fields And for file destinations, these metadata fields are present in the export manifest
Destination Routing and Naming Compliance
Given a user-defined mapping specifies destinations, folder paths, and filename patterns per channel and variant_type When an export job runs Then each variant is delivered only to the destinations defined for its enabled channels And filenames match the configured pattern exactly (including resolved placeholders such as {sku} and {variant_type}) And folder paths are created if missing and files are written to the correct locations And API/webhook deliveries return HTTP 2xx responses And any non-2xx delivery is recorded as Failed with remote status code and error details
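Exact filename-pattern compliance implies placeholder resolution must fail loudly rather than pass unresolved tokens through. A minimal sketch of that resolution step (the function name and error policy are assumptions):

```python
def resolve_filename(pattern, context):
    """Resolve a user-defined filename pattern such as
    "{sku}_{variant_type}.jpg" against per-asset context values.
    Unknown placeholders raise so routing failures surface before delivery."""
    try:
        return pattern.format_map(context)
    except (KeyError, IndexError) as exc:
        raise ValueError(f"unresolved placeholder in {pattern!r}: {exc}") from exc
```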
Deterministic Visual Parity Except Shadows
Given a source image is exported to shadowed and no_shadow variants using the same non-shadow parameters When the two outputs are compared Then image dimensions, crop, alignment, and color profile are identical between variants And on the subject mask, SSIM between variants is >= 0.99 And pixel differences outside the subject mask are confined to the shadow region only And background color/value remains unchanged between variants
Marketplace Size and Format Compliance
Given per-channel limits for format, color space, dimensions, and max file size are defined in the mapping When variants are exported and routed Then each delivered file matches the channel's required format and color space And image dimensions are within the channel's min/max bounds And file size is less than or equal to the channel's maximum size And if an asset exceeds limits, it is recompressed or resized to comply before delivery without altering subject crop And assets that cannot be brought into compliance are not delivered and are marked Failed with reason "LimitExceeded"
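The recompress-to-comply step can be sketched as a quality-stepping loop. Here `encode` stands in for any real encoder (e.g., a JPEG save at a given quality) and the quality ladder is an assumption; `None` signals the "LimitExceeded" failure path:

```python
def fit_under_limit(encode, max_bytes, qualities=range(95, 49, -5)):
    """Try progressively lower encoder qualities until the payload fits the
    channel's size limit. `encode` is any callable quality -> bytes.
    Returns (quality, payload), or None when the asset cannot be brought
    into compliance (caller marks it Failed with reason "LimitExceeded")."""
    for q in qualities:
        payload = encode(q)
        if len(payload) <= max_bytes:
            return q, payload
    return None
```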
Variant Lineage and Auditability
Given an export job runs When variants are generated and delivered Then a lineage record is created per variant including source_asset_id, variant_id, variant_type, channel_id, mapping_id, preset_id, checksum, created_at, delivered_at, and destination_uri And lineage can be queried by source_asset_id and variant_id via UI and API And lineage includes per-destination attempt status with timestamp and error details
Channel-Level Variant Enable/Disable and Export Summary
Given a user enables or disables variant types per channel in the export profile When an export is executed Then only enabled variant types are delivered for that channel and disabled types are omitted from delivery And the export summary shows, per channel and variant_type, total_expected, delivered_success, and delivered_failed counts And total_expected equals images_exported multiplied by the number of enabled variant types for that channel And delivered_success plus delivered_failed equals total_expected And each failed item displays an error code and message
Batch Processing with Preset Application
"As an online shop owner preparing a seasonal drop, I want to apply the same shadow look to my whole catalog in one run so that I can publish quickly and consistently."
Description

Enables applying a selected shadow preset and intensity uniformly across hundreds of images with resumable batch jobs, concurrency control, progress indicators, and error handling that retries failed items without reprocessing successful ones. Batch processing respects per-image overrides, preserves metadata, and ensures consistent output naming. Performance target is 300 images in under 10 minutes on a standard GPU tier, with resource scaling and throttling to maintain SLA.

Acceptance Criteria
Uniform Preset Application With Variant Generation
Given a batch of N images with batch preset P and intensity I and channel policy flags When the batch completes Then 100% of items without per-image overrides have metadata.appliedPreset = P and metadata.appliedIntensity within [I−0.02, I+0.02] And for channels requiring no effects, both variants are produced per item: variantType=shadow and variantType=plain; otherwise only variantType=shadow is produced And each produced image records appliedPreset and appliedIntensity in metadata
Resumable Batch Job With Idempotent Processing
Given batch B has K items completed successfully and the job is interrupted When batch B is resumed Then previously successful items are not reprocessed (no new render events) and their outputs remain byte-identical (hash-equal) And only pending and previously failed items are processed And the final output set contains exactly one output per required variant per item with no duplicates
Concurrency Control, Throttling, and SLA Performance
Given a configured concurrency limit C on the standard GPU tier and a batch of 300 images of typical e-commerce resolution (≤3000 px longest side) When processing starts Then observed active workers never exceed C And the job completes within 10 minutes wall-clock time And if GPU utilization exceeds the throttle threshold, concurrency is reduced within 1 second and increased back within 5 seconds after utilization drops below the threshold
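The utilization-driven throttling above can be modeled as a small feedback step applied on each poll: back off when utilization is over the threshold, recover toward the configured limit once it drops. A sketch (step sizes and the polling cadence are assumptions; the SLA's 1 s/5 s reaction windows would be enforced by the caller's poll loop):

```python
def adjust_concurrency(current, limit, utilization, threshold=0.9, floor=1):
    """One throttle step: reduce concurrency when GPU utilization exceeds
    the threshold; otherwise restore toward the configured limit C."""
    if utilization > threshold:
        return max(floor, current - 1)
    return min(limit, current + 1)
```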
Real-Time Progress Indicators and ETA
Given a running or resumed batch job When the user opens the batch detail view Then the UI displays totalCount, completedCount, inProgressCount, failedCount, queuedCount, percentComplete, and ETA And these values update at least every 2 seconds while the job is active And the ETA mean absolute percentage error is ≤15% after ≥10% of items have completed
Error Handling With Targeted Retries and Final Report
Given transient errors (e.g., timeouts, 5xx) occur for some items When the batch runs Then each affected item is retried up to 3 times with exponential backoff starting at 5s and capped at 60s And non-retryable errors (e.g., corrupt image) are marked failed with a reasonCode and are not retried And the final summary report lists each failed itemId, reasonCode, and retryCount And successfully processed items are never re-rendered during retries
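The retry policy above (up to 3 retries, exponential backoff starting at 5 s, capped at 60 s, non-retryable errors failing immediately) can be sketched as:

```python
def run_with_retries(task, is_retryable, sleep=lambda s: None,
                     base=5.0, cap=60.0, max_retries=3):
    """Run `task` (a callable that raises on failure). Retryable errors are
    retried up to `max_retries` times with exponential backoff (base * 2^n,
    capped); non-retryable errors re-raise immediately.
    Returns (result, retry_count). `sleep` is injectable for testing."""
    retries = 0
    while True:
        try:
            return task(), retries
        except Exception as exc:
            if not is_retryable(exc) or retries >= max_retries:
                raise
            sleep(min(cap, base * (2 ** retries)))
            retries += 1
```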
Per-Image Overrides Precedence
Given some items specify per-image overrides for preset and/or intensity When the batch processes those items Then override values are applied instead of batch defaults And override intensities outside [0,1] are clamped to the nearest bound and recorded as such And invalid preset identifiers cause the item to be marked failed with reasonCode=INVALID_PRESET
Metadata Preservation and Output Naming Consistency
Given original items contain filename, SKU, and EXIF orientation/ICC profile When outputs are written Then original basename, SKU, EXIF orientation, and ICC profile are preserved in outputs And new metadata fields jobId, appliedPreset, appliedIntensity, and variantType are added And each output filename follows the pattern <basename>__cs-p{presetId}__i{intensity2dp}__v{shadow|plain}.{ext} and is unique within the job And naming remains stable across retries and resumes
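The naming pattern above is deterministic, so a pure builder keeps names stable across retries and resumes. One way to implement it (validation policy is an assumption):

```python
def output_name(basename, preset_id, intensity, variant, ext):
    """Build <basename>__cs-p{presetId}__i{intensity2dp}__v{shadow|plain}.{ext}.
    Intensity is rendered with exactly two decimal places so the name is
    stable across retries and resumes."""
    if variant not in ("shadow", "plain"):
        raise ValueError("variant must be 'shadow' or 'plain'")
    return f"{basename}__cs-p{preset_id}__i{intensity:.2f}__v{variant}.{ext}"
```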
Shadow Quality & Compliance Validator
"As a seller concerned about rejections, I want PixelLift to catch and fix shadow issues before export so that my listings aren’t delayed or penalized."
Description

Implements automated checks to detect common shadow artifacts (hard edges, halos, misaligned contact points, floating shadows) and validate against marketplace constraints (white background tolerance, opacity limits), with auto-corrections where possible and clear flags for manual review when not. Runs inline during preview and in batch mode, provides confidence scores, and blocks export for non-compliant selections unless overridden with a warning.

Acceptance Criteria
Auto-Correct Hard Edge & Halo Artifacts (Inline Preview)
Given a processed image with ContourShadow and the validator enabled When the hard-edge artifact score at the shadow boundary is >= 0.60 on a 0–1 scale Then the system applies feathering auto-correction and re-evaluates within the same preview session And the post-correction hard-edge artifact score is <= 0.30 And the object-edge SSIM vs. pre-correction is >= 0.98 Else the image is flagged "Hard Edge – Manual Review" with confidence >= 80% When halo contrast in a 3 px ring around the subject exceeds 1.25 Then halo reduction is auto-applied and the resulting contrast is <= 1.10 Else the image is flagged "Halo – Manual Review" with confidence >= 80%
Detect Misaligned Contact Points & Floating Shadows
Given the subject-ground contact region is detected When the minimum distance between the detected contact line and the shadow contact centroid exceeds max(3 px, 1% of subject height) Then flag the image as "Misaligned Contact" with confidence score >= 80% When the overlap between the shadow mask and the ground region is < 85% or the minimum luminance drop at contact is < 5% Then flag the image as "Floating Shadow – Low Contact" with confidence score >= 80%
Marketplace Profile Compliance: Background & Shadow Opacity
Given a target marketplace profile with parameters {minWhiteCoverage, minLstar, maxChroma, minShadowAlpha, maxShadowAlpha, maxShadowColorDeltaE} When validating the rendered image against the profile Then background white coverage (excluding subject mask) is >= minWhiteCoverage (e.g., 98%) And background mean L* is >= minLstar (e.g., 98) and C* (chroma) <= maxChroma (e.g., 3) And mean shadow alpha is within [minShadowAlpha, maxShadowAlpha] (e.g., 5%–25%) And mean shadow color deviation ΔE00 is <= maxShadowColorDeltaE (e.g., 2) Else mark the image Non-Compliant and list all failed checks
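Given precomputed image statistics, the profile validation reduces to a set of threshold comparisons that reports every failed check (not just the first). A sketch, assuming the stat extraction (white coverage, L*, C*, shadow alpha, ΔE00) happens upstream and the field names are ours:

```python
def check_compliance(stats, profile):
    """Evaluate precomputed image statistics against a marketplace profile;
    return the sorted list of failed check names (empty list = compliant)."""
    checks = {
        "white_coverage": stats["white_coverage"] >= profile["minWhiteCoverage"],
        "background_lstar": stats["bg_mean_lstar"] >= profile["minLstar"],
        "background_chroma": stats["bg_chroma"] <= profile["maxChroma"],
        "shadow_alpha": (profile["minShadowAlpha"]
                         <= stats["shadow_alpha"]
                         <= profile["maxShadowAlpha"]),
        "shadow_color_delta_e": stats["shadow_delta_e00"] <= profile["maxShadowColorDeltaE"],
    }
    return sorted(name for name, ok in checks.items() if not ok)
```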
Channel Restriction Handling & Variant Selection
Given the selected export channel disallows shadows When the user queues images for export Then a no-shadow variant is auto-generated or selected for each image And shadowed variants are automatically deselected for that channel And any image lacking a no-shadow variant is flagged with an actionable "Remove Shadow" one-click fix And export includes only compliant variants unless the user explicitly overrides
Export Blocking, Override Flow, and Audit Logging
Given the current selection contains one or more Non-Compliant images When the user clicks Export Then a blocking dialog lists each violation per image with confidence scores And the Export action remains disabled until the user selects "Override compliance" and enters a reason of at least 5 characters And upon override, export proceeds and an audit log entry is created capturing user ID, timestamp, image count, violation types, and override reason And if no override is given, export is blocked
Batch Mode Validation, Auto-Correction, and Reporting SLAs
Given a batch of up to 1000 images at <= 2000 px longest edge on reference hardware When batch validation is started Then throughput is >= 300 images per minute with auto-corrections attempted where applicable And per-image results are emitted in CSV and JSON including {imageId, status (Pass/Fail/Needs Review/Error), violations[], confidenceScores[], correctionsApplied[], finalDecision} And a batch summary reports totals by status and violation type And any image that errors is retried once; if still failing, it is marked Error without halting the batch
Performance Guardrails & Fallback Modes
"As a high-volume seller on a deadline, I want predictable processing times with graceful quality fallbacks so that I can meet publishing windows."
Description

Defines per-image time and compute budgets for the shadow pipeline with instrumentation to track render latency, memory use, and GPU utilization; when budgets are exceeded, switches to a faster fallback rendering mode that preserves compliance and visual consistency. Exposes configuration for speed versus quality trade-offs in batch runs and surfaces performance metrics in the job summary to help users choose appropriate settings.

Acceptance Criteria
Auto Fallback on Time Budget Breach
Given a batch run with a configured per-image render_time_budget_ms And a selected style preset and shadow intensity When processing an image whose standard pipeline would exceed render_time_budget_ms Then the system switches that image to Fast Fallback mode before the image job completes And completes the image within render_time_budget_ms + 10% overhead And marks the result with fallback_used=true and fallback_reason="time_budget_exceeded" And records total_latency_ms for the image in metrics And the output image meets marketplace compliance rules for the target channel (dimensions, format, background policy)
Auto Fallback on GPU Budget Breach
Given configured GPU budgets max_memory_mb and max_utilization_pct When peak GPU memory or rolling average GPU utilization exceeds the configured budget during processing of an image Then the system switches that image to Fast Fallback mode within the same worker without terminating the batch And peak GPU memory does not exceed max_memory_mb by more than 5% And the image is not retried in standard mode during this run And metrics record peak_gpu_memory_mb, avg_gpu_utilization_pct, fallback_used=true, and fallback_reason="gpu_budget_exceeded"
Metrics Instrumentation and Job Summary
Given a completed batch run When the user retrieves the job summary via UI or API Then per-image metrics include: total_latency_ms, peak_gpu_memory_mb, avg_gpu_utilization_pct, fallback_used (bool), fallback_reason (nullable), and processing_mode ("standard"|"fast_fallback") And job-level aggregates include: image_count, avg_latency_ms, p95_latency_ms, max_latency_ms, fallback_image_count, fallback_rate_pct, avg_peak_gpu_memory_mb And all numeric metrics use documented units (ms, MB, %) and are non-null And the UI displays these metrics and the API returns them in JSON
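The job-level roll-up of per-image metrics is straightforward aggregation. A sketch (the nearest-rank p95 method and field subset are assumptions; the criteria do not pin down a percentile method):

```python
import math

def p95(values):
    """95th percentile via the nearest-rank method on sorted values."""
    ordered = sorted(values)
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return ordered[rank - 1]

def job_aggregates(per_image):
    """Roll per-image metric dicts (total_latency_ms, fallback_used) into
    the job-level summary fields required by the criteria."""
    latencies = [m["total_latency_ms"] for m in per_image]
    fallbacks = sum(1 for m in per_image if m["fallback_used"])
    n = len(per_image)
    return {
        "image_count": n,
        "avg_latency_ms": sum(latencies) / n,
        "p95_latency_ms": p95(latencies),
        "max_latency_ms": max(latencies),
        "fallback_image_count": fallbacks,
        "fallback_rate_pct": 100.0 * fallbacks / n,
    }
```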
Speed–Quality Configuration Controls in Batch Runs
Given speed–quality options Quality, Balanced, Speed, and Custom And each option maps to concrete budgets (render_time_budget_ms, max_memory_mb, max_utilization_pct) and algorithm parameters When a user starts a batch selecting any option or supplies Custom numeric budgets within allowed ranges Then the inputs are validated and persisted in the job record And the pipeline enforces those budgets per image And the job summary displays the chosen option and resolved budgets And rerunning the batch with the same option yields identical resolved budgets
Marketplace-Safe Variant Generation Under Fallback
Given a channel configuration that requires both shadow and no_shadow variants When an image triggers Fast Fallback due to any budget breach Then both variants are generated and saved according to channel rules (dimensions, background, file type) And asset metadata or filenames include variant labels ("shadow","no_shadow") and fallback_used flag And the foreground alpha matte is identical between the two variants (pixel-wise equality)
Visual Consistency Tolerances in Fallback Mode
Given a validation set processed in both Standard and Fast Fallback modes under the same preset When comparing outputs on non-background regions Then SSIM >= 0.92 for fallback vs standard And mean shadow intensity difference <= 5% And shadow direction deviation <= 10 degrees And edge softness (MTF50) within ±10% And any image failing tolerances is reported with fallback_consistency_pass=false in the validation report

SizeSync

Locks ghosting parameters (neck depth, sleeve opening, crop margin) across size runs and color variants. Uses size-chart cues to normalize presentation so grids look cohesive, buyers can compare at a glance, and teams spend less time nudging individual images.

Requirements

Size-Chart Parsing & Mapping
"As a studio manager, I want SizeSync to read our size charts and map them to image parameters so that product grids stay consistent without manual tweaking."
Description

Ingest size charts (CSV, JSON, or manual form) and map alpha/numeric sizes to garment measurement targets to drive normalized ghosting parameters (neck depth, sleeve opening, crop margin) per product category. Handle unit conversion, tolerances, missing values, and fallback defaults. Persist mappings per brand and collection, version them for auditability, and link to catalog SKUs via PixelLift metadata. Provide a validation step to flag inconsistencies and ensure downstream pipelines can consume normalized targets.

Acceptance Criteria
Multi-Format Size Chart Ingestion and Mapping
Given a brand and collection with a defined product category When a user uploads a size chart in CSV or JSON and maps headers to Neck Depth, Sleeve Opening, and Crop Margin (or enters values via manual form) Then the file is parsed without error, supported sizes (alpha and numeric) are recognized, and a size-to-measurement target is created for each row And duplicate size labels within the file are flagged with row numbers And the resulting mapping is saved as a draft linked to the brand, collection, and category
Unit Detection, Conversion, and Normalization
Given a size chart where measurement units are specified per column or inferred (in/cm) When the user confirms or adjusts detected units and submits for processing Then all measurements are converted to a canonical unit (centimeters) using 1 in = 2.54 cm and stored to two decimal places using half-up rounding And any negative or zero measurements are flagged as errors with row references And a unit summary is recorded in the mapping metadata
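The canonical-unit rule (1 in = 2.54 cm, two decimal places, half-up rounding) is exactly what Python's `decimal` module provides; floating-point rounding would not reliably round half-up. A sketch of the conversion step (function name and error policy are assumptions):

```python
from decimal import Decimal, ROUND_HALF_UP

IN_TO_CM = Decimal("2.54")

def to_canonical_cm(value, unit):
    """Convert a measurement to centimeters (1 in = 2.54 cm) and store to
    two decimal places using half-up rounding, per the mapping rules.
    Non-positive measurements raise, matching the validator's error flag."""
    qty = Decimal(str(value))
    if qty <= 0:
        raise ValueError("measurement must be positive")
    if unit == "in":
        qty *= IN_TO_CM
    elif unit != "cm":
        raise ValueError(f"unsupported unit: {unit}")
    return qty.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```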
Handling Missing Values and Tolerances with Fallback Defaults
Given a draft mapping that contains missing cells or out-of-tolerance values When validation runs Then missing values are filled using brand/category defaults where available and marked as imputed in the audit trail And if no default exists, the affected rows are listed as errors with size labels and fields And values outside configured tolerance thresholds (brand/category-level) are flagged as warnings And activation is blocked if any errors remain
Versioning and Auditability of Mappings
Given an existing Active mapping for a brand and collection When a user edits measurements or header mappings and saves changes Then a new immutable version is created with a monotonically increasing version number, timestamp, actor, and change diff And previous versions remain retrievable and cannot be modified And exactly one version can be marked Active per brand+collection+category at a time
Linking Mappings to Catalog SKUs via PixelLift Metadata
Given SKUs in a catalog upload containing brand, collection, category, and size metadata When processing begins Then each SKU is matched to the Active mapping using brand+collection+category and normalized size labels (e.g., "S" ≡ "Small") And unmatched SKUs are reported with reasons (missing metadata, no Active mapping, unknown size label) And a summary report lists matched count, unmatched count, and mapping version used
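The size-label normalization ("S" ≡ "Small") can be sketched as a lookup against a canonical alias table, with numeric sizes passing through and unknown labels signalling an unmatched SKU. The alias table here is illustrative only:

```python
SIZE_ALIASES = {
    "small": "S", "s": "S",
    "medium": "M", "m": "M",
    "large": "L", "l": "L",
    "extra large": "XL", "x-large": "XL", "xl": "XL",
}

def normalize_size_label(label):
    """Normalize size labels so "S" and "Small" resolve to the same mapping
    row. Numeric sizes pass through unchanged; unknown labels return None so
    the SKU can be reported as unmatched with a reason."""
    key = label.strip().lower()
    if key.replace(".", "", 1).isdigit():
        return key
    return SIZE_ALIASES.get(key)
```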
Pre-Activation Validation and Consistency Checks
Given a draft mapping awaiting activation When the user runs Validate Then the system verifies required columns are present, units are consistent per column, sizes are unique, and measurements are non-negative And logical checks run (e.g., monotonic progression by size where applicable, within configured tolerances) And a validation report is generated with Error and Warning sections and line references And the Activate action remains disabled until all errors are resolved
Downstream Consumption of Normalized Targets by SizeSync
Given an Active mapping for a brand+collection+category When SizeSync requests targets for a SKU Then an API returns neckDepth, sleeveOpening, and cropMargin in centimeters with the mapping version identifier And identical size labels across color variants return identical targets And the request is logged with correlation ID and latency And requests without a matching Active mapping return a 404 with a descriptive error code
Parameter Lock Profiles
"As a brand designer, I want to define lock profiles for tees, hoodies, and dresses so that every batch renders with uniform framing across sizes and colors."
Description

Define reusable lock profiles that specify which ghosting and framing parameters to lock (e.g., neck depth, sleeve opening, hem/crop margin), alignment rules, anchor points, and permitted variance thresholds. Allow profile scoping by brand, category, and channel. Auto-apply the correct profile on batch ingest, integrate with PixelLift style-presets, and expose a simple UI for create/edit/clone to promote standardization across teams.

Acceptance Criteria
Auto-apply Correct Profile on Batch Ingest
Given at least one active lock profile exists and a batch import contains items with brand, category, and channel metadata When the batch is ingested Then the system resolves and applies a single lock profile per item using specificity ranking (brand+category+channel > any two > any one) And the applied profile ID and version are written to each asset’s metadata and the job manifest And a job summary lists item counts by applied profile and the count of items with no matching profile And profile resolution completes with median latency <= 50 ms/item and p95 <= 200 ms/item for batches up to 5,000 items
Profile Scoping, Specificity, and Fallback
Given multiple matching profiles exist When resolving the profile for an item Then the most specific profile wins; ties are broken by explicit profile priority; if equal, the most recently updated wins And if no profile matches, the system applies the configured global default profile; if no default is configured, the item is flagged "No Profile" and requires manual selection And the resolution decision (match factors, priority, fallback) is recorded in trace logs accessible from the job details
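The resolution rule above (specificity, then explicit priority, then recency, then default fallback) can be sketched like this; the profile dictionary shape and field names are assumptions, not the product's actual schema.

```python
def resolve_profile(item, profiles, default=None):
    """Pick the most specific matching lock profile for an item.

    Specificity = number of scope dimensions (brand, category, channel)
    the profile constrains and matches. Ties break by explicit priority,
    then most recent update. Returns `default` (possibly None, meaning
    the item is flagged "No Profile") when nothing matches.
    """
    def matches(p):
        for dim in ("brand", "category", "channel"):
            scope = p.get(dim)  # None means "any"
            if scope is not None and item[dim] not in scope:
                return False
        return True

    def rank(p):
        specificity = sum(p.get(d) is not None
                          for d in ("brand", "category", "channel"))
        return (specificity, p.get("priority", 0), p["updated_at"])

    candidates = [p for p in profiles if matches(p)]
    if not candidates:
        return default
    return max(candidates, key=rank)
```

The `rank` tuple mirrors the stated tie-break order directly, which keeps the decision easy to record in the trace logs the criterion requires.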
Create/Edit/Clone Lock Profile UI
Given a user with Profile Editor permission opens the Profiles screen When they create a profile Then they can set name, description, scope (brand(s), category(ies), channel(s)), and activation state And they can configure which parameters are locked (neck depth, sleeve opening, hem/crop margin), alignment rules, anchor points, and per-parameter variance thresholds with units (px or %) And inputs validate with inline errors (required fields, numeric ranges 0–200 px or 0–30%, unit consistency) and prevent save until resolved And saving persists the profile and returns a success toast within 2 seconds And "Clone" duplicates all settings (except name which is appended with "Copy") and opens in edit mode And editing an existing profile creates a new version and preserves prior versions read-only
Enforce Locked Parameters and Variance Thresholds
Given an applied lock profile defines target values and variance thresholds for selected parameters When the processing engine generates ghosted/framed outputs Then for each output image, the measured parameter values match the profile targets within the specified thresholds And if any parameter exceeds its threshold, the image is flagged "Out of Tolerance", appears in the QC queue, and the job summary reports the count And a per-image QC report displays measured vs target values and deltas for each locked parameter And unlocked parameters remain unaffected by the profile
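A minimal sketch of the per-image QC comparison described above, assuming the parameters arrive as plain name-to-value mappings (the report field names are illustrative):

```python
def qc_report(measured, targets, thresholds):
    """Compare measured locked parameters against profile targets.

    Returns per-parameter measured/target/delta rows plus an overall
    out_of_tolerance flag, mirroring the "Out of Tolerance" QC flow.
    Unlocked parameters are simply absent from `targets`.
    """
    rows = {}
    out_of_tolerance = False
    for param, target in targets.items():
        delta = measured[param] - target
        exceeded = abs(delta) > thresholds[param]
        out_of_tolerance = out_of_tolerance or exceeded
        rows[param] = {"measured": measured[param], "target": target,
                       "delta": delta, "exceeded": exceeded}
    return {"out_of_tolerance": out_of_tolerance, "parameters": rows}
```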
Integration with PixelLift Style-Presets
Given a style-preset is selected alongside an applied lock profile When rendering outputs Then locked parameters from the profile take precedence over overlapping preset settings And non-overlapping or unlocked parameters are applied from the preset as defined And repeated application of the same preset + profile combination yields identical outputs (idempotent) And switching to a different preset does not alter locked parameters
Manual Override and Reprocess Behavior
Given auto-applied profiles are displayed for a batch When a user with appropriate permission manually selects a different profile for the batch or specific items and triggers reprocess Then the selected profile is applied to the chosen scope, prior derivatives are superseded, and the override is logged with actor, timestamp, scope, and previous profile/version And users can revert to the previously applied profile in a single action And reprocessed items update their metadata to reference the new profile and version
Audit Logging and Versioning
Given profiles can be edited over time When a profile is saved with changes Then a new immutable version is created with version number, change summary, author, and timestamp And assets record the profile ID and version used at processing time And jobs can be pinned to a specific profile version; if not pinned, reprocess uses the latest active version And an audit log is queryable by profile, version, user, date, and action and exports to CSV
Variant Cohesion Preview Grid
"As a content coordinator, I want to preview how each size and color will look together so that I can catch inconsistencies before publishing."
Description

Provide an interactive preview grid that displays normalized images across sizes and color variants before export. Visualize locked parameters and highlight outliers exceeding thresholds. Offer quick, bounded per-image nudges and a toggle between original and SizeSync results, with real-time batch recalculation. Integrate with existing PixelLift preview and export flows to streamline decision-making and reduce rework.

Acceptance Criteria
Grid Renders Normalized Variant Matrix
Given a batch containing at least one product with size and color variants and SizeSync locks applied When the user opens the Variant Cohesion Preview Grid for that product Then the grid displays one tile per variant with visible size and color labels And all non-outlier tiles share identical neck depth, sleeve opening, and crop margin values as the current lock And the grid renders up to 100 variant tiles within 3 seconds And tiles maintain aspect ratio and are aligned to a common baseline
Outlier Detection and Highlighting
Given system thresholds for parameter variance are active When a variant’s computed parameter deviates beyond the threshold from the lock Then its tile is visually highlighted with an Outlier badge and parameter deltas And a summary count of outliers is displayed above the grid And clicking the badge reveals which parameter(s) exceed the threshold and by how much And the Next Outlier control navigates sequentially to each outlier tile
Bounded Per-Image Nudges on Variants
Given a variant tile is selected in the grid When the user nudges neck depth, sleeve opening, or crop margin Then each nudge adjusts in 1 px increments up to a maximum of ±10 px from the lock And the UI prevents adjustments beyond the maximum with a disabled control state And Undo and Reset actions restore the previous value and the saved lock respectively And tooltips display the current value and delta from the lock in real time
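The bounded-nudge behavior above reduces to a clamp around the saved lock; a sketch (function names are ours, not the product's):

```python
def nudge(value_px, lock_px, direction, step_px=1, max_delta_px=10):
    """One nudge step in 1 px increments, clamped to ±max_delta_px of
    the saved lock value."""
    proposed = value_px + direction * step_px
    lo, hi = lock_px - max_delta_px, lock_px + max_delta_px
    return max(lo, min(hi, proposed))

def at_bound(value_px, lock_px, direction, max_delta_px=10):
    """True when the next nudge in this direction would be rejected,
    i.e. when the UI should render the control disabled."""
    return nudge(value_px, lock_px, direction,
                 max_delta_px=max_delta_px) == value_px
```

Reset simply restores `lock_px`; Undo restores whatever the previous value was, so neither needs special-case logic here.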
Real-Time Batch Recalculation After Edit
Given the grid is open with SizeSync applied When the user applies a per-image nudge or changes the active style preset Then the entire grid recalculates and re-renders affected tiles And non-outlier/outlier status updates within 1 second for up to 100 tiles And a progress indicator displays if recalculation exceeds 1 second and the UI remains responsive And updated counts and filters reflect the new state immediately after completion
Original vs SizeSync Toggle
Given the grid is open with SizeSync applied When the user toggles between Original and SizeSync views Then all tiles switch to the selected view within 500 ms And per-image nudges are applied only in SizeSync view and do not alter Original imagery And the current toggle state persists when returning to the grid during the same session
Export Flow Integration from Grid
Given at least one product is reviewed in the Variant Cohesion Preview Grid When the user clicks Export Then the existing PixelLift export dialog opens pre-populated with the SizeSync-processed images and settings And any per-image nudges are included in the exported output And if outliers remain, a non-blocking warning is shown with a link to review, and export may proceed
Locked Parameter Visualization Overlay
Given the grid is open When the user enables the Parameter Overlay Then each tile displays guides for neck depth, sleeve opening, and crop margin with numeric values And non-outlier tiles match the lock values exactly; outlier tiles display the delta And overlays remain accurate at 100% and zoomed views and can be toggled off with a single control
Auto Landmark Detection & Anchor Stabilization
"As a photo editor, I want the system to auto-detect garment landmarks so that framing and ghosting locks are accurate without manual marking."
Description

Detect garment landmarks (neckline, shoulder seam, sleeve opening, hem) using category-specific computer vision models to anchor ghosting and cropping. Produce confidence scores, log detection metrics, and gracefully fall back to size-chart targets when confidence is low. Support mannequin, flat-lay, and ghost imagery; interoperate with background removal; and ensure deterministic anchoring so repeated runs yield stable results.

Acceptance Criteria
Category-Specific Landmark Detection Accuracy
Given a labeled validation set per garment category (e.g., tops, dresses, outerwear) across mannequin, flat-lay, and ghost imagery, When Auto Landmark Detection runs with default configuration, Then at least 95% of images return all required landmarks {neckline_center, left_shoulder_seam, right_shoulder_seam, left_sleeve_opening, right_sleeve_opening, hem_line} each with a confidence score in [0,1]. And Then, measured against ground truth, the 95th-percentile per-landmark Euclidean localization error is <= 2% of the garment bounding-box max dimension. And Then the output payload includes per-landmark pixel coordinates in image space and their confidence scores.
Confidence-Driven Fallback to Size-Chart Anchors
Given detection results for an image, When any required landmark is missing or has confidence < 0.75, Then the system replaces detection-based anchors with size-chart-derived anchors for ghosting and cropping. And Then the processing continues without error and produces a valid output image. And Then a structured log entry records image_id, missing_or_low_confidence_landmarks, per-landmark confidences, fallback_reason="low_confidence_or_missing", size_chart_version, and anchor_source="fallback:size-chart".
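The fallback decision above can be sketched directly from the criterion; the landmark payload shape is an assumption, while the landmark names, the 0.75 floor, and the `anchor_source` values come from the text.

```python
REQUIRED_LANDMARKS = {
    "neckline_center", "left_shoulder_seam", "right_shoulder_seam",
    "left_sleeve_opening", "right_sleeve_opening", "hem_line",
}
CONFIDENCE_FLOOR = 0.75

def choose_anchor_source(landmarks):
    """landmarks: {name: {"xy": (x, y), "confidence": float}}.

    Returns (anchor_source, offending_landmarks) so the caller can log
    missing_or_low_confidence_landmarks alongside the fallback_reason.
    """
    missing = REQUIRED_LANDMARKS - landmarks.keys()
    low = {name for name, lm in landmarks.items()
           if name in REQUIRED_LANDMARKS
           and lm["confidence"] < CONFIDENCE_FLOOR}
    if missing or low:
        return "fallback:size-chart", sorted(missing | low)
    return "detection", []
```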
Deterministic Anchoring Across Repeated Runs
Given identical inputs, configuration, model version, and runtime environment, When the pipeline processes the same image three times, Then the resulting anchor coordinates, crop boxes, and ghosting parameters are bitwise identical and produce identical output checksums. And Then, across supported hardware/software variants, coordinate deltas are <= 1 pixel and rotation deltas are <= 0.1°.
Interoperation with Background Removal
Given background removal is enabled, When detection and anchoring run in the configured pipeline order, Then anchor coordinates are correctly transformed into the exported image space and align with the garment within <= 2 pixels of the anchors produced with background removal disabled. And Then landmark detection recall with background removal enabled is within 2 percentage points of the recall with background removal disabled on a held-out test set. And Then no required landmark is systematically lost due to segmentation (per-category loss rate difference <= 1 percentage point).
Metrics Logging and Job-Level Observability
Given any processed image, When the pipeline completes, Then a structured record is persisted containing image_id, job_id, model_version, processing_timestamp, per-landmark coordinates and confidences, inference_duration_ms, anchor_source (detection|fallback:size-chart), and any errors. And Then job-level aggregates are emitted containing detection_success_rate, average_confidence_by_landmark, fallback_rate, per-category breakdowns, and P95 localization error. And Then metrics are queryable via API and visible in the monitoring dashboard within 60 seconds of job completion.
Anchor Stabilization Across Size Runs and Color Variants (SizeSync)
Given a size run (e.g., XS–XL) and color variants for a single style with SizeSync enabled, When images are processed, Then the final neck depth anchors across sizes have a standard deviation <= 2 pixels, sleeve opening vertical alignment standard deviation <= 2 pixels, and crop margin variance <= 1% of image height. And Then within a given size, color variants share identical ghosting parameters (neck depth, sleeve opening, crop margin) within max(1 pixel, 1% of garment bounding-box dimension). And Then if a subset of sizes uses detection and others fall back, the final anchors for all sizes match the size-chart targets within <= 2 pixels.
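The stabilization targets above can be checked with ordinary population statistics. One interpretive assumption: "crop margin variance <= 1% of image height" is read here as the max-min spread of the margin expressed as a fraction of image height.

```python
import statistics

def size_run_stable(neck_depth_px, sleeve_align_px, crop_margin_frac,
                    max_std_px=2.0, max_margin_spread=0.01):
    """Check SizeSync stability across a size run (e.g., XS-XL).

    neck_depth_px / sleeve_align_px: final anchor positions per size.
    crop_margin_frac: crop margin per size as a fraction of image height.
    """
    return (statistics.pstdev(neck_depth_px) <= max_std_px
            and statistics.pstdev(sleeve_align_px) <= max_std_px
            and (max(crop_margin_frac) - min(crop_margin_frac))
                <= max_margin_spread)
```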
Batch Overrides & Tolerance Controls
"As a lead retoucher, I want to set tolerances and override specific items so that exceptions don’t force us to break the whole batch."
Description

Enable global tolerance settings (e.g., ±2% neck depth) and role-based per-image overrides for atypical items without breaking batch consistency. Provide bulk actions by SKU/size/color, undo/reset to defaults, and an audit trail of changes. Persist overrides to the project so subsequent re-renders and pipelines respect the same decisions, and allow import/export of tolerance presets for reuse.

Acceptance Criteria
Apply Global Tolerances Across Batch
Given a project has SizeSync enabled and a batch of images spanning multiple sizes and colors And global tolerances are set to neckDepth ±2%, sleeveOpening ±1.5%, cropMargin ±3% When the batch is processed Then the measured neckDepth, sleeveOpening, and cropMargin for each image fall within the configured tolerance bands relative to the size-chart target or base size And cross-variant variance for each parameter does not exceed the specified tolerance And the UI surfaces the active tolerance values and their sources (global vs. override) at batch level
Role-Based Per-Image Override
Given user roles include Admin and Editor (authorized) and Viewer (unauthorized) When an authorized user applies a per-image override to a parameter (e.g., set neckDepth to 24.5 mm or +1.0%) Then only the targeted image renders using the override while the rest of the batch remains governed by global tolerances And the system records the override as active for that image and displays an override badge in the UI And an unauthorized user cannot create, edit, or remove per-image overrides and receives a permission error
Bulk Actions by SKU/Size/Color
Given a user filters the batch by SKU=“A123”, sizes in [S,M,L], and color=“Navy” When the user applies a tolerance update or per-image override to the selection Then the change is applied only to images matching the filter criteria and other images are unaffected And the system reports the count of images updated, skipped (non-matching/locked), and failed with reasons And updates are atomic per image (a failure on one image does not roll back successful updates on others)
Undo and Reset to Defaults
Given a user has performed one or more tolerance or override changes When the user clicks Undo Then the most recent change is reverted for all affected images and a re-render is triggered for those images And when the user selects Reset to Defaults at batch level Then all tolerances and overrides revert to the project’s default preset and affected images re-render to reflect defaults And the audit trail records both the undo and reset actions with before/after values
Audit Trail of Changes
Given any tolerance or override change is saved (via UI or API) When the change is committed Then an audit entry is created containing timestamp (ISO 8601), user ID, role, project ID, scope (batch/selection/image), parameter(s), old value(s), new value(s), action type (create/update/delete/reset), and method (UI/API) And audit entries are immutable and viewable via a paginated, filterable log And access to the audit view is restricted to authorized roles
Persistence Across Re-Renders and Pipelines
Given per-image overrides and global tolerances have been saved in the project When the user re-renders the batch or runs a downstream pipeline Then the previously saved overrides and tolerances are applied without additional input And per-image overrides take precedence over global tolerances for their respective images And the resulting parameter values match the saved override values within ±0.1% numerical tolerance (accounting for rounding)
Import/Export Tolerance Presets
Given a named tolerance preset exists in Project A When the user exports the preset Then a versioned file (e.g., JSON) is generated containing parameters, units, and tolerance values When the user imports that file into Project B Then the system validates schema/version and either accepts the preset or returns a clear error with line-item validation messages And on name conflict the user can choose to rename or overwrite; on success the preset is available to set as project default
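A sketch of the versioned preset export/import described above. The JSON field names, the schema-version value, and the accepted unit set are assumptions; the text only requires a versioned file, schema/version validation, line-item errors, and name-conflict detection.

```python
import json

PRESET_SCHEMA_VERSION = 1  # assumed versioning scheme

def export_preset(name, tolerances):
    """tolerances: e.g. {"neckDepth": {"value": 2.0, "unit": "%"}}."""
    return json.dumps({"schemaVersion": PRESET_SCHEMA_VERSION,
                       "name": name, "tolerances": tolerances})

def import_preset(blob, existing_names):
    """Validate schema/version and collect line-item error messages;
    `conflict` tells the caller to prompt rename-or-overwrite."""
    data, errors = json.loads(blob), []
    if data.get("schemaVersion") != PRESET_SCHEMA_VERSION:
        errors.append("unsupported schemaVersion")
    for param, tol in data.get("tolerances", {}).items():
        if tol.get("unit") not in ("%", "px", "mm"):
            errors.append(f"{param}: unknown unit {tol.get('unit')!r}")
        if not isinstance(tol.get("value"), (int, float)) or tol["value"] < 0:
            errors.append(f"{param}: value must be a non-negative number")
    conflict = data.get("name") in existing_names
    return data, errors, conflict
```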
Consistency Scoring & Alerting
"As an operations manager, I want alerts and scores on batch consistency so that we can ensure catalog cohesion and reduce rework."
Description

Calculate a consistency score per batch and per grid based on variance across locked parameters, surfacing alerts for assets outside thresholds. Generate downloadable QA reports, expose metrics in the PixelLift dashboard, and send notifications via email/Slack/webhooks to prompt timely corrections. Track trends over time to demonstrate process improvements and impact on conversion.

Acceptance Criteria
Batch Consistency Score Calculation & Threshold Application
Given a processed batch with SizeSync-locked parameters and workspace thresholds defined When scoring runs on batch completion or on-demand via a Recalculate action Then the system computes a batch consistency score from 0.0 to 100.0 with 1-decimal precision And the score is deterministic for identical inputs And assets missing required measurements are excluded from the score and counted as missing in metadata And the batch is marked Pass if score >= the configured minimum, otherwise Fail And the score, pass/fail, and counts (total, scored, missing, flagged) are persisted and visible in the batch header
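The criterion fixes the score's range (0.0-100.0), precision (one decimal), determinism, and the exclusion of unmeasured assets, but not the formula itself. The sketch below is one deterministic choice, not the product's actual scoring function: full marks at zero deviation, zero at an average of 2x the threshold.

```python
def batch_consistency_score(deviation_ratios):
    """deviation_ratios: per-asset |deviation| / threshold for scored
    assets only (assets missing measurements are excluded upstream and
    counted separately as "missing")."""
    if not deviation_ratios:
        return None  # nothing scored
    penalty = sum(min(r, 2.0) for r in deviation_ratios) / len(deviation_ratios)
    return round(max(0.0, 100.0 * (1.0 - penalty / 2.0)), 1)
```

Capping each ratio at 2.0 keeps one extreme outlier from zeroing a large batch, while `round(..., 1)` satisfies the 1-decimal requirement.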
Grid-Level Cohesion Scoring
Given a product grid that aggregates all sizes and color variants for a style When grid scoring executes after all assets in the grid are processed Then a grid score from 0 to 100 is computed using variance of neck depth, sleeve opening, and crop margin relative to the grid anchor (median size) And per-parameter standard deviation and maximum deviation are stored And the grid is marked Pass only if each parameter is within its threshold and the overall score meets the minimum And the grid score is displayed on the grid card with a tooltip showing per-parameter metrics
Threshold-Based Alerting & UI Indicators
Given per-parameter thresholds in pixels and/or percent of longest edge When any asset’s neck depth, sleeve opening, or crop margin deviation exceeds its threshold Then the asset thumbnail shows a Consistency alert badge and the out-of-threshold parameters are listed with delta values And batch and grid views display flagged_count and flagged_percentage And only assets exceeding thresholds are flagged; assets within tolerance are not And clicking the alert opens a details panel with per-parameter measurements and deviation values
Downloadable QA Reports (PDF/CSV)
Given a batch or grid with computed scores When a user requests Download QA Then CSV and PDF reports are generated within 10 seconds for up to 2,000 assets And each report includes batch_id/grid_id, asset_id, filename, processed_at, per-parameter deltas (px and %), per-asset pass/fail, batch_score, grid_score, thresholds_version, flagged_count, and report_generated_at And the report is accessible via a time-limited URL and the download is recorded in audit logs
Notifications: Email, Slack, and Webhooks
Given notification channels are configured for the workspace When a scoring job completes and batch_score is below the minimum threshold or flagged_count > 0 Then an email and Slack message are sent within 2 minutes and a webhook POST is delivered with an HMAC-SHA256 signature header And all messages include workspace_id, batch_id, grid_id (if applicable), batch_score, flagged_count, thresholds_version, report_url, and timestamp And duplicate notifications for the same scoring event are prevented via an idempotency_key And failed webhooks are retried up to 3 times with exponential backoff and final failures are logged
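The HMAC-SHA256 signing and idempotency-key requirements above can be sketched with the standard library. The header name and the key derivation (batch_id + timestamp) are assumptions; the criterion only requires a signature header and a per-event idempotency_key.

```python
import hashlib
import hmac
import json

def signed_webhook(secret: bytes, payload: dict) -> dict:
    """Build the webhook POST body and its HMAC-SHA256 signature header.

    The idempotency_key is derived from the scoring event so a retried
    delivery of the same event produces an identical key and body.
    """
    payload = {**payload,
               "idempotency_key": f"{payload['batch_id']}:{payload['timestamp']}"}
    body = json.dumps(payload, sort_keys=True).encode()
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return {"headers": {"X-PixelLift-Signature": signature}, "body": body}

def verify(secret: bytes, headers: dict, body: bytes) -> bool:
    """Receiver-side check; compare_digest avoids timing leaks."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(headers["X-PixelLift-Signature"], expected)
```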
Trend Tracking & Conversion Impact Metrics
Given historical batch and grid scores exist When a user views the Metrics section of the PixelLift dashboard Then time series charts show weekly averages of batch_score, outside_threshold_rate, and flagged_count for the last 12 months with filters for workspace and product category And percentage change versus the prior period is displayed And if conversion data is connected, conversion_rate and delta versus prior period are shown alongside average score; otherwise a conversion data unavailable state is shown And each data point links to the list of underlying batches/grids
Re-scoring on Corrections & Audit Trail
Given flagged assets have been corrected and reprocessed When the user triggers Recalculate or an automatic rescore runs on job completion Then per-asset flags are cleared if now within thresholds and batch/grid scores update within 60 seconds for up to 500 assets And a versioned audit trail records previous and new scores, thresholds_version, actor, and timestamps And trend charts and newly generated QA reports reflect the latest scores while preserving historical snapshots
Preset Compatibility & API Export
"As a developer, I want API endpoints and preset compatibility so that SizeSync fits into our automated media pipeline."
Description

Ensure lock profiles interoperate with existing PixelLift style-presets and batch automation. Provide API endpoints to apply profiles, retrieve variance reports, and export processed images with embedded metadata (locked parameters, scores). Support delivery to connected DAM/e-commerce platforms (e.g., Shopify, BigCommerce) via current connectors, preserving variant associations and ensuring downstream grids render cohesively.

Acceptance Criteria
Batch Apply Lock Profile with Style-Preset via API
Given a valid OAuth2 token, a style_preset_id, and a lock_profile_id, when the client POSTs to /v1/size-sync/apply with a batch of N images where N<=500, then the API responds 202 with a job_id, accepted_count=N, rejected_count=0, and idempotency maintained via Idempotency-Key header. Given a duplicate request with the same Idempotency-Key, when POSTed within 24h, then no new job is created and the original job_id is returned with 202. Given N>500, when POSTed, then the API responds 413 with error.code="batch_limit_exceeded" and error.limit=500. Given overlapping parameters between the preset and lock profile, when applied, then locked parameters take precedence, conflicts_resolved>=0 is returned, and no preset parameter overwrites a locked value. Given a batch of 200 JPEG images (>=3000x3000), when processed on the standard tier, then average enqueue latency<5s and effective throughput>=30 images/min measured over the job duration. Given any invalid style_preset_id or lock_profile_id, when POSTed, then the API responds 404 with error.code in {preset_not_found, lock_profile_not_found}.
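A toy model of the batch cap and Idempotency-Key replay semantics above (in-memory storage, job-id format, and the 24h window's enforcement are assumptions; the status codes, error code, and limit come from the criteria):

```python
class SizeSyncApply:
    """Sketch of POST /v1/size-sync/apply request handling."""
    BATCH_LIMIT = 500

    def __init__(self):
        self._jobs_by_key = {}  # Idempotency-Key -> job_id

    def post(self, idempotency_key, image_ids):
        if len(image_ids) > self.BATCH_LIMIT:
            return 413, {"error": {"code": "batch_limit_exceeded",
                                   "limit": self.BATCH_LIMIT}}
        if idempotency_key in self._jobs_by_key:
            # Replay: no new job, original job_id returned with 202.
            return 202, {"job_id": self._jobs_by_key[idempotency_key]}
        job_id = f"job_{len(self._jobs_by_key) + 1}"
        self._jobs_by_key[idempotency_key] = job_id
        return 202, {"job_id": job_id,
                     "accepted_count": len(image_ids),
                     "rejected_count": 0}
```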
Retrieve Job Status and Variance Report
Given a valid job_id, when GET /v1/size-sync/jobs/{job_id} is called, then the response is 200 with status in {queued,processing,completed,failed_partially,failed}, counts {total,processed,failed}, and links.report set when any item is processed. Given a completed job, when GET /v1/size-sync/jobs/{job_id}/variance-report?format=json is called, then response 200 includes per-image fields {image_id, product_sku, variant_id, locked:{neck_depth_mm,sleeve_opening_mm,crop_margin_px}, variance_score_0_100, normalization_status in {ok,deviation}, warnings[]}. Given the same request with format=csv, when called, then a CSV is returned with identical columns and row count equals total processed images. Given any image with variance_score_0_100>20, when included in the report, then normalization_status=deviation and deviation_fields[] lists the parameter names exceeding tolerance. Given an unknown job_id, when GET is called, then response 404 with error.code="job_not_found".
Export Processed Images with Embedded Metadata
Given processed images, when GET /v1/size-sync/jobs/{job_id}/exports?format=zip is called, then a ZIP is returned with all images and sidecar.json for formats that do not support metadata embedding. Given JPEG output, when exported, then XMP metadata includes fields: sizesync.profile_id, sizesync.profile_version, sizesync.locked.neck_depth_mm, sizesync.locked.sleeve_opening_mm, sizesync.locked.crop_margin_px, sizesync.variance.score, preset.id, preset.version, source.image_id, variant.sku, processing.timestamp (ISO 8601), processing.version, and these fields validate non-null. Given PNG output, when exported, then iTXt/tEXt chunks contain the same key/value metadata or a sidecar JSON is included with identical content, and checksum.sha256 is provided per file. Given color-managed inputs, when exported, then the embedded ICC profile is preserved and the output color space matches the input or the selected preset target, with deltaE2000<=2 against reference on a 24-patch test. Given locked crop parameters, when visually validated, then resultant pixel crops deviate by <=1px from intended margins across all outputs.

Deliver to Shopify/BigCommerce with Variant Preservation
Given a connected Shopify store, when POST /v1/size-sync/jobs/{job_id}/deliver with target={"platform":"shopify","connection_id":"conn_123"} is called after completion, then all images upload successfully and are attached to the correct product variants by SKU or variant_id mapping with success_count=total and error_count=0. Given BigCommerce as target, when delivered, then images attach to variant IDs matching the provided mapping and preserve media positions such that size-run order is ascending (e.g., XS,S,M,L,XL). Given platform API rate limits (429), when encountered during delivery, then the connector retries with exponential backoff (min 1s, max 32s, retries<=3) and surfaces partial failures with error_count>0 and per-item error codes if limits persist. Given storefront rendering, when a product grid is requested with all size/color variants, then computed alignment metrics across the set (neck depth, sleeve opening, crop margin) show variance<=1px and the platform media order matches the input sequence. Given a successful delivery, when completed, then a webhook callback is sent to the subscriber with delivery_id, job_id, per-item platform_asset_id, and variant associations.
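The connector's stated retry policy (exponential backoff, min 1 s, max 32 s, at most 3 retries on 429) corresponds to this delay schedule; the optional jitter parameter is a common refinement, not part of the spec.

```python
import random

def backoff_schedule(retries=3, base_s=1.0, cap_s=32.0, jitter=False):
    """Delays (in seconds) before each retry of a rate-limited delivery.

    With the spec's defaults this yields [1.0, 2.0, 4.0]; the 32 s cap
    only engages if a deployment configures more retries.
    """
    delays = [min(cap_s, base_s * (2 ** attempt)) for attempt in range(retries)]
    if jitter:
        delays = [random.uniform(0, d) for d in delays]
    return delays
```

After the final delay, the connector surfaces the partial failure (error_count > 0 with per-item error codes) rather than retrying indefinitely.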
Backward Compatibility with Existing Presets and Automations
Given an automation that runs a style-preset without SizeSync, when executed after the SizeSync release, then outputs are byte-for-byte identical to pre-release baselines for a 50-image regression set. Given a preset to which SizeSync is newly enabled, when applied, then only locked presentation parameters change and all other preset transformations (color, exposure, sharpening) remain within existing tolerances defined by the preset tests. Given API clients using /v1/presets/apply, when called without SizeSync parameters, then responses and behavior are unchanged and no deprecation warnings are emitted. Given versioned lock profiles, when a profile is updated to a new version, then existing jobs pinned to the prior version remain reproducible and new jobs reference the latest version in metadata.
Downstream DAM Ingestion and Metadata Integrity
Given an export to a connected DAM (e.g., Bynder) via existing connector, when the assets are ingested, then embedded XMP fields for SizeSync and preset are preserved and queryable via the DAM’s metadata API for 100% of assets. Given sidecar JSON where embedding is not supported, when ingested by the DAM, then the connector attaches the JSON as a related asset or maps fields into DAM custom metadata keys without loss. Given a roundtrip download from the DAM and re-upload to PixelLift, when inspected, then required identifiers (source.image_id, variant.sku, profile_id, preset.id) are intact and match originals. Given any metadata write failure, when detected, then the job marks failed_partially with a non-zero error_count and includes a remediation hint and retriable flag for the affected items.

Fingerprint Builder

Onboard each supplier in minutes by generating a robust visual and metadata signature from a handful of sample images. PixelLift auto-extracts EXIF patterns, logo placements, background hues, lighting histograms, and crop ratios to create a reliable fingerprint that boosts routing accuracy without manual mapping.

Requirements

Supplier Sample Onboarding Intake
"As an operations manager onboarding a new supplier, I want to upload a handful of sample images and start fingerprint creation in minutes so that incoming catalogs route correctly without manual mapping."
Description

Enable rapid onboarding of each supplier by accepting a small set of sample images via upload or URL, validating formats and minimum sample count, deduplicating files, and assigning a unique supplier ID. Kick off asynchronous extraction jobs with visible progress, provide estimated completion time, and support batch onboarding. Capture optional supplier metadata (name, contact, tags) and link to existing PixelLift accounts. Ensure secure, temporary storage for samples and enforce size limits and virus scanning. Emit events for downstream processing and audit logs for traceability.

Acceptance Criteria
Single-Supplier Intake via Upload and URL
Given an authorized user is on the Supplier Sample Onboarding Intake form When they add sample images via file upload and/or paste publicly accessible URLs Then only JPEG, PNG, and WebP files are accepted and any other formats are rejected with a clear validation message And any file larger than 20 MB is rejected with a clear validation message And at least 5 valid, unique samples are required to enable submission Given one or more provided URLs are unreachable or return non-image content When the user attempts to submit Then the system reports which URLs failed resolution with specific reasons and blocks submission until corrected Given all client-side validations pass When the user clicks Submit Then the system creates an intake record and returns 202 Accepted with a supplierId and jobId And the intake detail view displays initial status "Queued"
Automatic Deduplication of Sample Images
Given a set of uploaded files and/or fetched URLs for a single supplier intake When the system processes the intake Then exact duplicate images (based on cryptographic hash) are identified and removed from the effective sample set And a duplicate report lists the original image and all duplicates with their sources (upload or URL) And filename-only duplication without identical content is not treated as a duplicate Given deduplication reduces the effective unique sample count below 5 When validation completes Then the intake is marked "Failed Validation" with a message "Minimum 5 unique samples required" And no extraction job is started
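The dedup-then-validate flow above can be sketched with a content hash (SHA-256 here, as one reasonable choice of cryptographic hash); note that filename collisions without identical bytes are deliberately kept.

```python
import hashlib

MIN_UNIQUE_SAMPLES = 5

def dedupe_samples(samples):
    """samples: iterable of (source_name, content_bytes).

    Returns (unique_samples, duplicate_report); the report pairs each
    removed duplicate with the first-seen original, as required above.
    """
    first_seen, unique, duplicates = {}, [], []
    for name, content in samples:
        digest = hashlib.sha256(content).hexdigest()
        if digest in first_seen:
            duplicates.append({"duplicate": name,
                               "original": first_seen[digest]})
        else:
            first_seen[digest] = name
            unique.append((name, content))
    return unique, duplicates

def validate_sample_count(unique):
    """Fail validation (and skip extraction) below the minimum."""
    if len(unique) < MIN_UNIQUE_SAMPLES:
        return "Failed Validation", "Minimum 5 unique samples required"
    return "Queued", None
```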
Supplier ID Assignment and Metadata Capture/Linking
Given an intake passes validation When the system creates the supplier record Then a globally unique, immutable supplierId is generated and stored And the supplierId is returned in the 202 response and displayed in the UI Given optional metadata (name, contact email, tags) is provided When the intake is submitted Then metadata is validated (email format, tags as strings) and stored without blocking submission if omitted Given a request includes an existing PixelLift accountId to link When the accountId exists and is active Then the supplier is linked to that account and the linkage is shown in the intake summary Given a request includes an invalid or non-existent accountId When validation runs Then the linkage is rejected with a specific error without blocking other valid metadata
Asynchronous Extraction Job with Progress and ETA
Given an intake is accepted and queued When the extraction job starts Then the job status transitions from "Queued" to "Running" and a progress percentage is available via API/UI And progress updates are available at least every 5 seconds while running And an ETA is displayed within 10 seconds of job start and updates as progress changes Given the job completes successfully When processing finishes Then status becomes "Succeeded" and outputs include extracted EXIF patterns, logo placements, background hues, lighting histograms, and crop ratios linked to supplierId Given the job fails (e.g., processing error) When the error is detected Then status becomes "Failed" with a user-visible error code and message And a retry action is available without re-uploading samples
Batch Onboarding of Multiple Suppliers
Given a user initiates batch onboarding containing multiple suppliers, each with its own samples and metadata When the batch is submitted Then each supplier intake is independently validated and processed And one supplier's failure does not block other suppliers in the same batch Given batch processing is underway When the user views the batch summary Then counts of Succeeded, Failed, Running, and Queued intakes are displayed And each supplier row shows supplierId (when assigned), jobId, and current status Given multiple suppliers in a batch have overlapping images When deduplication runs Then deduplication is applied within each supplier intake scope only and does not cross-eliminate samples between different suppliers
Security, Temporary Storage, Size Limits, and Malware Scanning
Given samples are received (upload or fetched from URL) When they are stored Then samples are kept in temporary storage encrypted at rest and in transit and accessible only to processing services And samples are automatically deleted upon job completion or when the configured TTL elapses (default 14 days) Given malware scanning runs before any processing When a file is flagged as infected Then the file is quarantined, excluded from processing, and the intake displays which files were quarantined And if quarantines reduce unique samples below the minimum, the intake fails validation with an explicit message Given a user attempts to upload a file exceeding the per-file size limit (20 MB) When validation runs Then the file is rejected with a clear error and is not stored
Event Emission and Audit Logging for Traceability
Given significant milestones occur (intake created, validated, job started, job succeeded, job failed) When each milestone is reached Then an event is emitted to the event bus with type, supplierId, jobId, timestamp, and correlationId And events are recorded exactly once per milestone attempt with retries handled by the messaging infrastructure Given any user or system action related to an intake When the action occurs Then an audit log entry is written with actor (or system), action, supplierId, jobId, outcome, timestamp, and originating IP/request id And audit logs are queryable by supplierId and correlationId
EXIF & Metadata Pattern Extraction
"As a technical user, I want PixelLift to extract and normalize metadata patterns from sample images so that the fingerprint captures reliable non-visual cues for routing."
Description

Automatically parse EXIF/IPTC/XMP from sample images, normalize fields (timezones, camera/software strings), and compute statistical patterns across samples (e.g., common camera model, software tag, missing/locked fields). Handle corrupted or absent metadata gracefully and support common formats (JPEG, PNG, TIFF, RAW where available). Identify potential PII and apply configurable scrubbing policies before storage. Persist a canonical metadata signature with tolerances and weights for use in matching, and expose structured outputs to the fingerprint store.
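Two of the normalizations described above, UTC conversion with the original offset preserved, and canonicalizing vendor camera strings, might look like this minimal sketch (the canonicalization rules shown are illustrative only):

```python
from datetime import datetime, timedelta, timezone

def normalize_exif_datetime(dt_original: str, tz_offset: str) -> dict:
    """Normalize an EXIF DateTimeOriginal (e.g. '2024:05:01 14:30:00')
    plus OffsetTimeOriginal (e.g. '+02:00') to UTC, preserving the
    original tzOffset as the spec requires."""
    naive = datetime.strptime(dt_original, "%Y:%m:%d %H:%M:%S")
    sign = 1 if tz_offset.startswith("+") else -1
    hours, minutes = map(int, tz_offset[1:].split(":"))
    offset = sign * timedelta(hours=hours, minutes=minutes)
    utc = (naive - offset).replace(tzinfo=timezone.utc)
    return {"utc": utc.isoformat(), "tzOffset": tz_offset}

def canonical_camera_model(raw: str) -> str:
    """Collapse vendor quirks like 'NIKON CORPORATION NIKON D850' into a
    canonical 'Nikon D850'-style string (hypothetical rules for illustration)."""
    tokens = [t for t in raw.split() if not t.upper().endswith("CORPORATION")]
    seen, out = set(), []
    for t in tokens:  # drop repeated maker names, preserve order
        key = t.upper()
        if key not in seen:
            seen.add(key)
            out.append(t.capitalize() if t.isupper() else t)
    return " ".join(out)
```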

Acceptance Criteria
Mixed-Format Batch Metadata Parsing & Normalization
Given a batch of 200 images (50 JPEG, 50 PNG, 50 TIFF, 50 RAW evenly split across CR2/NEF/ARW/DNG) When the batch is processed through EXIF/IPTC/XMP extraction Then parseable fields are extracted and normalized with overall success rate >= 95% (absent fields excluded from denominator) And all time fields are normalized to UTC with original tzOffset preserved And cameraModel and software fields are normalized to canonical strings with >= 98% coverage across parseable values And PNGs without metadata are marked metadataAbsent without raising errors And throughput is >= 100 images/min; p95 latency per image <= 300 ms (non-RAW) and <= 1.5 s (RAW)
Robust Handling of Absent or Corrupted Metadata
Given a batch where 20% of images contain corrupted metadata segments and 10% contain no metadata When the batch is processed Then the job completes without failure; hard-failure rate = 0 And each problematic asset is tagged with metadataStatus in {corrupted, absent} and a machine-readable errorCode And normalized outputs use empty values for absent fields and continue processing remaining assets And all exceptions are logged with traceId; p95 added latency per corrupted asset <= 500 ms
Canonical Metadata Signature Generation with Tolerances & Weights
Given at least 10 sample images from a single supplier When statistical aggregation completes Then a canonical metadata signature is persisted containing keysObserved, frequency, confidence per key, tolerances, and weights where sum(weights) = 1.0 and each weight ∈ [0,1] And at least 3 stable fields have confidence >= 0.70 or signatureStatus = insufficient_samples is recorded And lockedFields are those constant across >= 95% of samples and are listed explicitly; missingFields are enumerated And the signature validates against schema version v1.x with JSON Schema validation passing
PII Detection and Scrubbing per Tenant Policy
Given a tenant policy: strip GPS.*, strip Creator/By-line/Contact/Email, hash DeviceSerialNumber, keep Copyright And a batch containing these fields When processing and persistence occur Then PII findings are detected with recall >= 99% for targeted keys And scrub actions exactly match policy; stripped values are not present in storage or emitted events; hashed values use salted SHA-256 with tenant-specific salt And an audit log entry records each scrubbed field with action, timestamp, tenantId, and request traceId
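The strip/hash/keep policy with salted SHA-256 and an audit trail could be sketched as below; the policy representation and audit fields are simplified assumptions:

```python
import hashlib

def apply_scrub_policy(metadata: dict, policy: dict, tenant_salt: bytes) -> tuple[dict, list]:
    """Apply a tenant scrub policy to extracted metadata.

    policy maps field name -> action in {'strip', 'hash', 'keep'};
    hashed values use salted SHA-256 with a tenant-specific salt, so
    raw values never reach storage or emitted events.
    """
    scrubbed, audit = {}, []
    for field, value in metadata.items():
        action = policy.get(field, "keep")
        if action == "strip":
            audit.append({"field": field, "action": "strip"})
        elif action == "hash":
            scrubbed[field] = hashlib.sha256(tenant_salt + str(value).encode()).hexdigest()
            audit.append({"field": field, "action": "hash"})
        else:
            scrubbed[field] = value
    return scrubbed, audit
```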
RAW Format Coverage and Fallback Behavior
Given RAW files in CR2, NEF, ARW, DNG and an unsupported RAW (e.g., RAF) When processed Then EXIF extraction succeeds on supported RAW types with success rate >= 90% for parseable fields And unsupported RAWs are labeled metadataUnsupported and skipped without failing the job And embedded JPEG previews are not used for metadata unless previewFallback = true is configured
Structured Output Publication to Fingerprint Store
Given successful extraction and aggregation When outputs are produced Then a record is written to the fingerprint store with fields: signatureId, supplierId, sampleCount, exifStats, iptcStats, xmpStats, missingFields, lockedFields, piiFindings, scrubActionsApplied, tolerances, weights, signatureVersion, createdAt And the store write returns 2xx and the record is queryable by signatureId within 2 seconds And a message is published to topic fingerprint.signature.created with the same payload schema and headers tenantId and schemaVersion And schema compatibility is enforced via registry; incompatible writes/messages are rejected with a clear error code and do not persist
Visual Signature Feature Extraction
"As a product photo lead, I want PixelLift to compute reliable visual signatures from a few samples so that the system can recognize supplier images even when backgrounds or crops vary slightly."
Description

Derive robust visual features from samples including logo presence and placement heatmaps, dominant/background hue distributions, lighting and tonal histograms (RGB/HSV), crop and margin ratios, aspect ratio frequencies, shadow/reflection profiles, noise/grain characteristics, and palette clusters. Generate both interpretable summaries and learned embedding vectors. Support variable resolutions and orientations, apply color-space normalization, and ensure deterministic outputs across runs. Optimize for batch throughput and provide feature-quality metrics per sample.
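One of the simpler features above, the 36-bin H-channel histogram used for background-hue distributions, can be sketched in pure Python (a real pipeline would vectorize this; the function names are illustrative):

```python
import colorsys

def hue_histogram(pixels: list[tuple[int, int, int]], bins: int = 36) -> list[float]:
    """Build a normalized H-channel histogram (36 bins of 10 degrees each)
    from RGB pixels, as used for background-hue distributions."""
    counts = [0] * bins
    for r, g, b in pixels:
        h, _s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        counts[min(int(h * bins), bins - 1)] += 1
    total = len(pixels) or 1
    return [c / total for c in counts]

def dominant_bins(hist: list[float], top_k: int = 5) -> list[int]:
    """Indices of the top_k most populated hue bins (dominant hues)."""
    return sorted(range(len(hist)), key=lambda i: hist[i], reverse=True)[:top_k]
```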

Acceptance Criteria
Batch Visual Feature Extraction Completeness & Schema
Given a batch of 200 product images in mixed formats (JPG, PNG, WebP) and sizes When Visual Signature Feature Extraction is executed in batch mode Then each image output shall include the following fields with non-null values and documented units:
- logo_presence (boolean), logo_placement_heatmap (64x64 float32 grid in normalized [0..1] coordinates)
- dominant_hues (top_k=5 HSV centroids with percentage coverage) and background_hue_distribution (H-channel histogram, 36 bins)
- lighting_histograms.RGB (256 bins/channel) and tonal_histograms.HSV (H:36, S:32, V:32 bins)
- crop_margin_ratios (top, left, bottom, right as fractions of height/width), aspect_ratio (float), and orientation (portrait/landscape/square)
- shadow_profile (area_pct, mean_intensity) and reflection_profile (area_pct, mean_intensity)
- noise_grain (sigma, grain_size_px) and palette_clusters (k=5 with hex, pct)
- embedding_vector (float32, length=512) and interpretable_summary (JSON with human-readable aggregates)
- schema_version (semantic version) and processing_metadata (runtime_ms, image_px)
And a batch-level summary shall be emitted including aspect_ratio_frequencies (top modes with pct) and crop_margin_distributions (quartiles) And the output JSON shall validate against the published JSON Schema (v>=1.0.0).
Deterministic Outputs Across Runs
Given the same input batch, fixed model/version, identical parameters, and seed When the extractor is run twice on the same hardware target Then the emitted JSON (excluding processing_metadata.runtime_ms) shall be byte-identical And when runs occur on different hardware targets, all float arrays (histograms, heatmaps, embeddings) shall match within an absolute tolerance ≤ 1e-6 And image ordering and per-image IDs shall be identical And schema_version and processing metadata (except runtime_ms) shall match exactly.
Color-Space Normalization Consistency
Given duplicates of the same source image encoded in sRGB, Display P3, Adobe RGB, CMYK (with embedded ICC), and grayscale When processed by the extractor Then all images shall be converted to the internal working space before feature computation And dominant/background hue distributions and RGB/HSV histograms of the duplicates shall have histogram intersection ≥ 0.99 And palette cluster centroids across duplicates shall have ΔE00 ≤ 1.0 for corresponding clusters And interpretable summaries shall report the detected source color space and normalization applied.
Support for Variable Resolutions and Orientations
Given a mixed batch with resolutions from 256x256 up to 8000x8000, DPI metadata varied, and EXIF orientations (0/90/180/270, mirrored) When processed Then the system shall respect EXIF to canonicalize orientation prior to feature extraction And all spatial features (heatmaps, crop_margin_ratios, shadow/reflection areas) shall be expressed in normalized coordinates/ratios and be resolution-invariant (tolerance ±0.5%) And for rotated copies of the same photo (differing only by EXIF), features shall be identical after normalization (heatmap cell-wise diff ≤ 1e-6) And peak RAM shall not exceed 2.0 GB while processing a batch of 500 images with streaming enabled.
Batch Throughput and Resource Efficiency
Given a reference environment (GPU: NVIDIA T4 16GB, CPU: 4 vCPU, RAM: 16GB) and a dataset of 2MP images When running Visual Signature Feature Extraction with batch size auto-tuned Then end-to-end throughput shall be ≥ 300 images per minute (p95 per-image latency ≤ 250 ms) And GPU utilization shall average ≥ 60% without exceeding 90% VRAM usage And CPU utilization shall average ≤ 80% and peak RAM ≤ 8 GB during a 2,000-image run And no more than 0.5% of images may be retried due to transient processing errors, with automatic retry ≤ 1 attempt and no data loss.
Per-Sample Feature-Quality Metrics
Given any input image When features are produced Then the output shall include quality metrics with defined ranges: logo_confidence (0..1), background_uniformity (0..1), exposure_score (-1..1), shadow_strength (0..1), reflection_strength (0..1), noise_sigma (≥0), embedding_stability (0..1) And a composite quality_score (0..1) shall be reported along with per-metric flags indicating below-threshold values (defaults: 0.4 for warnings) And all metrics shall be deterministic across runs and never null/NaN/Inf And the batch summary shall include distributions (median, p10, p90) for each metric.
Embedding Vectors Format and Discriminative Power
Given a validation set containing near-duplicate pairs and dissimilar pairs When embeddings are computed Then each embedding shall be float32 length=512 with L2 norm within 1.00±0.01 after normalization And cosine similarity for near-duplicate pairs shall be ≥ 0.95 (p90), while for dissimilar pairs ≤ 0.80 (p10) And embeddings shall be bitwise stable across runs with the same model/version and seed (tolerance ≤ 1e-6) And the output shall include embedding_version and distance_metric metadata.
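The L2-normalization and cosine-similarity checks in this criterion reduce to a few lines; a minimal sketch (for unit vectors, cosine similarity is just the dot product):

```python
import math

def l2_normalize(vec: list[float]) -> list[float]:
    """Scale an embedding to unit L2 norm (spec: norm within 1.00 ± 0.01)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embeddings of equal length."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))
```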
Fingerprint Aggregation, Weighting & Confidence Scoring
"As a routing engineer, I want a unified fingerprint with confidence scoring and explainability so that new images are matched accurately and ambiguous cases are flagged for review."
Description

Aggregate visual and metadata features across all samples to synthesize a single supplier fingerprint with component weights, tolerances, and confidence thresholds. Support minimum sample requirements, outlier rejection, and incremental updates when new samples arrive. Produce an explainable score breakdown for matches and expose a stable fingerprint ID and version. Persist to a scalable store with atomic writes and rollback support to guarantee consistency and reproducibility.

Acceptance Criteria
Enforce Minimum Sample Requirement
Given a supplier onboarding session with fewer than the configured minimum sample count M When the user attempts to generate a fingerprint Then the system blocks generation and returns error code FP_MIN_SAMPLES, including required=M and provided=<count> And the UI displays the short message "At least M samples required" with the current count Given a session with at least M valid samples When the user starts fingerprint generation Then the job is accepted and queued with a unique requestId And the job status transitions to Completed within the configured timeout or returns a specific failure code
Aggregate Features With Weights And Tolerances
Given an eligible sample set after validation When aggregation runs Then the fingerprint contains a component entry for each configured feature: {name, weight, tolerance, confidenceThreshold} And the sum of all weights equals 1.0 ± 0.001 And all tolerances are ≥ 0 and within configured max bounds per feature And confidenceThresholds are present for all applicable features And aggregation stats (mean, median, variance or histogram bins as applicable) are persisted alongside the fingerprint And the API response includes fingerprintId and version for the created fingerprint
Outlier Detection And Rejection
Given a sample set S ≥ M When outlier detection executes using the configured method (e.g., MAD with k, or IQR with factor) Then any sample flagged as an outlier is excluded from aggregation And each excluded sampleId is recorded with reason and metric values in the audit log And if excluded/total > MaxOutlierPct, the process fails with error FP_TOO_MANY_OUTLIERS and no fingerprint is written And if excluded/total ≤ MaxOutlierPct, the fingerprint is generated from the inlier set only
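The MAD-based variant of the configured outlier method, plus the MaxOutlierPct guard, can be sketched as follows (the k default and error signaling are illustrative):

```python
import statistics

def mad_outliers(values: list[float], k: float = 3.5) -> list[int]:
    """Flag outliers via median absolute deviation: sample i is an
    outlier if |x_i - median| > k * MAD (k configurable per deployment)."""
    med = statistics.median(values)
    mad = statistics.median(abs(x - med) for x in values)
    if mad == 0:
        return []  # degenerate: all samples (near-)identical
    return [i for i, x in enumerate(values) if abs(x - med) > k * mad]

def check_outlier_budget(excluded: int, total: int, max_outlier_pct: float) -> None:
    """Fail with FP_TOO_MANY_OUTLIERS when too many samples are rejected."""
    if total and excluded / total > max_outlier_pct:
        raise ValueError("FP_TOO_MANY_OUTLIERS")
```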
Incremental Updates With Stable ID And Versioning
Given an existing fingerprint with fingerprintId=F and version=v And new validated samples ΔS arrive for the same supplier When the system performs an incremental update Then the resulting fingerprint keeps the same fingerprintId=F and writes a new version=v+1 And version v remains queryable and immutable And the changeLog records per-feature deltas (Δweight, Δtolerance) and included/removed sampleIds And the update returns currentVersion and previousVersion checksums And if computed drift exceeds MaxDrift per any feature, the update is flagged as major and the major version is incremented
Explainable Match Score Breakdown
Given a candidate image and a stored fingerprint When a match is computed Then the API returns: overallScore in [0,1], decision ∈ {match, no_match}, appliedThreshold, and a per-component breakdown [{feature, contribution, observedValue, expectedRange}] And the sum of contributions equals overallScore ± 0.001 And the top 5 contributors are ordered by absolute contribution And if decision=no_match, at least one reasonCode is included indicating the strongest failing components
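A weighted-sum breakdown where contributions sum to the overall score by construction, as this criterion requires, might be sketched like this (field names follow the criterion loosely; similarity scores are assumed to be in [0,1]):

```python
def score_breakdown(observations: dict[str, float], weights: dict[str, float]) -> dict:
    """Compute overallScore as a weighted sum of per-feature similarity
    scores; per-component contributions therefore sum to overallScore,
    and the top contributors are ordered by absolute contribution."""
    breakdown = [
        {"feature": f, "contribution": weights[f] * observations[f]}
        for f in weights
    ]
    overall = sum(c["contribution"] for c in breakdown)
    breakdown.sort(key=lambda c: abs(c["contribution"]), reverse=True)
    return {"overallScore": overall, "breakdown": breakdown[:5]}
```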
Atomic Persistence And Rollback
Given a fingerprint generation or update transaction When persisting to the backing store Then either all artifacts for the same version (core document, stats, explainability, audit) are committed, or none are (atomicity) And a simulated mid-write failure results in rollback with no partial records visible to readers And repeated retries with the same requestId are idempotent (no duplicate versions created) And p95 write latency for a successful commit is ≤ 2s under nominal load
Determinism And Reproducibility
Given the same input sample set, configuration, extractor versions, and random seed When fingerprint generation is executed twice Then the produced weights, tolerances, thresholds, and stats are byte-for-byte identical And the stored metadata includes pipelineVersion, modelHashes, and seed And a regenerate operation using the recorded metadata yields a matching checksum for the fingerprint content
Routing Integration & Fallback Handling
"As a workflow manager, I want matched images to auto-route with clear fallbacks so that processing remains fast and accurate even when confidence is low."
Description

Integrate fingerprint matching into PixelLift’s ingestion pipeline: compute features for incoming images, match against fingerprints within latency targets, and route to the correct style-presets/workflows. Implement configurable thresholds per supplier, A/B testable matching policies, and deterministic tie-breaking. Provide fallbacks for low-confidence matches (manual review queue, default preset, or supplier selection suggestions), along with decision logs, metrics, and an external API for programmatic routing.

Acceptance Criteria
Real-time Match and Routing Within Latency Budget
Given an incoming image with a known supplier fingerprint, When the ingestion pipeline computes features and performs matching, Then the correct supplier workflow/preset is selected with route_accuracy >= 98% on the gold test set. Given nominal load of 100 RPS, When routing is executed, Then end-to-end p95 latency <= 250 ms and p99 <= 500 ms from image receipt to routing decision emission. Given a feature-extraction failure, When the error is detected, Then a fallback decision path is executed within 100 ms and the request is not retried more than once. Given successful routing, When the decision is emitted, Then the downstream workflow starts within 50 ms and the decision is recorded with a trace_id.
Supplier Threshold Configuration and Overrides
Given supplier S has match_threshold=0.83 and min_vote=3 configured, When images for S are processed, Then matches are accepted only when score >= 0.83 and votes >= 3, otherwise the fallback policy for S is applied. Given a routing request with an explicit threshold override (threshold_override=0.9) authorized for supplier S, When processed, Then the override is applied only to that request and is recorded in the decision log. Given an invalid override outside [0,1], When processed, Then the API responds 422 with a validation error and no routing decision is made. Given config change for S, When updated, Then the new values take effect within 60 seconds and are versioned with effective_from timestamp.
Deterministic Tie-Breaking of Candidate Matches
Given two or more candidate suppliers with equal top score within epsilon=0.001, When a tie occurs, Then the winner is selected deterministically using the rule order: supplier_priority > highest_historical_precision > lexicographic supplier_id. Given repeated processing of the same image, When performed across different nodes, Then the selected supplier is identical across runs and the tie_breaker_reason is included in the decision log. Given no supplier_priority is defined, When tie-breaking occurs, Then the next rule in order is applied without randomness. Given tie-breaking, When the decision is emitted, Then all losing candidates and their scores are included in the evidence metadata.
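The rule order above (supplier_priority > historical precision > lexicographic supplier_id) maps naturally onto a deterministic sort key; a sketch with illustrative candidate fields:

```python
def break_tie(candidates: list[dict], epsilon: float = 0.001) -> dict:
    """Deterministic tie-breaking among top-scoring candidates:
    higher supplier priority wins, then higher historical precision,
    then lexicographically smaller supplier_id. No randomness, so the
    same winner is selected on every node."""
    top = max(c["score"] for c in candidates)
    tied = [c for c in candidates if top - c["score"] <= epsilon]
    tied.sort(key=lambda c: (-c.get("priority", 0),
                             -c.get("precision", 0.0),
                             c["supplier_id"]))
    return tied[0]
```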
Low-Confidence Fallbacks and Manual Review
Given a match score below supplier S’s match_threshold, When routing, Then the system executes S’s configured fallback: manual_review, default_preset, or suggestions. Given fallback=manual_review for S, When a low-confidence item is routed, Then the item is placed in the review queue within 2 seconds with evidence (top-5 candidates, scores, features summary) and a reviewer_task_id is returned. Given fallback=default_preset for S, When a low-confidence item is routed, Then the default preset is applied within 200 ms and the decision response includes fallback_applied=true and default_preset_id. Given fallback=suggestions for S, When a low-confidence item is routed, Then the API returns top-3 supplier suggestions with scores >= suggestions_min_threshold and no workflow is started. Given a low-confidence item, When fallback is executed, Then the event is metered under metric low_confidence_rate with supplier_id and policy_version labels.
Decision Logs and Operational Metrics
Given any routing decision, When completed, Then a decision log entry is written containing trace_id, request_id, supplier_id (or null), chosen_policy_version, thresholds_used, top_k candidates with scores, match_reason/tie_breaker_reason, fallback_type, timestamps (enqueue, start, decide), and node_id. Given logs retention is configured to 30 days, When queried via the internal log index, Then records are available for at least 30 days and contain no image bytes or raw PII. Given the service is running, When scraped by Prometheus, Then it exposes metrics: routing_latency_ms (p50,p90,p95,p99), route_accuracy (golden set job), low_confidence_rate, fallback_count by type, errors_by_kind, and requests_per_supplier, all labeled by supplier_id, policy_version, environment, and region. Given the golden test set job runs hourly, When it completes, Then route_accuracy >= 98% and any drop >1% over 24h triggers an alert.
External Routing API Contract and Resilience
Given POST /v1/route with a valid payload (image_url or image_bytes, optional threshold_override, supplier_hint), When called with a valid API key, Then respond 200 within p95 300 ms with supplier_id (or null), score, policy_version, fallback_type, trace_id, and decision_evidence. Given invalid payload, When called, Then respond 422 with field-specific errors and do not enqueue work. Given missing/invalid API key, When called, Then respond 401/403 respectively. Given idempotency-key header is provided, When the same key is used within 24h, Then the same response is returned and no duplicate work is performed. Given client exceeds 100 RPS per token, When requests continue, Then respond 429 with Retry-After and rate-limit headers. Given partial dependency outage, When feature extraction is unavailable, Then respond 503 with Retry-After and emit a degraded_mode=true metric while auto-failing over to fallback if configured.
A/B Testable Matching Policies and Safe Rollout
Given supplier S has policies A and B, When traffic allocation is set to A:50%, B:50%, Then requests are randomly assigned per request with sticky assignment by request_id or image_hash and assignment_id is logged. Given the experiment runs, When monitored over a rolling 15-minute window with N >= 1000 requests per branch, Then automatic rollback to policy A occurs if route_accuracy_B - route_accuracy_A < -1.0% or p95_latency_B - p95_latency_A > 50 ms. Given an experiment is ended, When rollout is set to 100% policy B, Then the change is enacted within 5 minutes, recorded as policy_version increment, and no residual traffic hits policy A. Given experiment data, When exported to the analytics sink, Then it includes assignment, decision, scores, supplier_id, and outcome labels needed to compute conversion impact.
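Sticky per-request assignment by image hash, as described above, is commonly done by hashing into [0,1) and walking cumulative traffic shares; a minimal sketch (allocation shape is illustrative):

```python
import hashlib

def assign_policy(image_hash: str, allocation: dict[str, float]) -> str:
    """Sticky A/B assignment: hash the image hash (or request_id) into
    [0,1) and walk cumulative shares, so the same image always lands in
    the same branch across nodes and retries."""
    bucket = int(hashlib.sha256(image_hash.encode()).hexdigest()[:8], 16) / 0x100000000
    cumulative = 0.0
    for policy, share in sorted(allocation.items()):
        cumulative += share
        if bucket < cumulative:
            return policy
    return sorted(allocation)[-1]  # guard against float rounding at the boundary
```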
Fingerprint Review & Admin Controls
"As an admin, I want to review and adjust fingerprint parameters before activation so that quality and accountability are maintained."
Description

Offer an admin console to visualize and govern fingerprints: preview sample coverage, histogram overlays, palette swatches, logo heatmaps, and feature distributions. Allow edits to weights and thresholds, approval before activation, and supplier associations. Include role-based access control, audit trails, change previews, non-destructive drafts, export/import, and rollback. Provide health indicators and warnings for weak or overfitted fingerprints.

Acceptance Criteria
Fingerprint Visualization Dashboard
Given I am an Admin on the Fingerprint Review console with >=20 sample images loaded for a supplier's fingerprint When I open the Visualization tab Then I see panels for Sample Coverage, Histogram Overlay (RGB/luminance), Palette Swatches (top 10 with proportions), Logo Heatmap, and Feature Distributions (crop ratios, EXIF, aspect ratio) And each panel renders within <=2 seconds for up to 500 samples And hovering any data point shows numeric values and sample counts And selecting a sample thumbnail highlights its contributions across all panels
Draft Editing With Change Preview
Given I have Editor role and open a fingerprint When I create a draft and adjust weights or thresholds Then the live fingerprint remains unchanged and all edits apply only to the draft And a side-by-side preview shows before/after metrics: validation precision, recall, coverage, and change risk score And Save Draft persists a versioned draft with a human-readable diff summary And Undo/Redo are available within the draft session And navigating away with unsaved changes prompts a confirmation
Approval Gate Before Activation
Given a draft has no validation errors and required metadata is complete When an Approver selects Approve and Activate Then a confirmation modal displays the diff, impact metrics, and affected supplier associations, requiring explicit confirmation And upon activation the draft becomes the new live version and is associated to the selected supplier IDs And the previous live version is retained as a rollback target And activation is blocked with specific error messages if health checks fail or required approvals are missing
Role-Based Access Control Enforcement
Given the following roles: Viewer, Editor, Approver, Admin When permissions are enforced Then Viewer: read-only access to visualizations and audit history; no create/edit/export/activate And Editor: create drafts, edit weights/thresholds, run previews; no activate And Approver: approve/activate drafts; no edit draft content And Admin: manage roles, supplier associations, export/import, and all permissions And unauthorized actions return HTTP 403 and in-UI 'Permission denied' messaging
Audit Trail and Rollback
Given any draft save, approval, activation, export/import, or rollback When the action completes Then an immutable audit entry is recorded with UTC timestamp, userId, role, action, fingerprintId, versionFrom, versionTo, diff checksum, and optional reason And audit entries are filterable by date range, user, action, and fingerprintId And selecting a prior version and choosing Rollback creates a new live version identical to the selected one within <=5 seconds and logs the rollback event And rollback is disabled if the selected version fails current schema validation, with a clear error
Export/Import With Validation
Given I have Admin role on a fingerprint When I export the fingerprint Then a signed JSON file is downloaded including schemaVersion, fingerprintId, version, parameters (weights/thresholds), visual descriptors, supplier associations, and audit checksum And when I import a file Then the file is schema-validated, checksum-verified, and scanned for conflicts And on conflict the system creates a new draft in 'Imported' state and presents a merge summary; no live changes occur until approval And invalid files are rejected with line-level validation errors
Health Indicators and Warnings
Given a fingerprint has computed metrics on a 20% holdout set (n>=30) When I open the Health tab Then the system displays a Health Score (0–100), sample count, coverage %, precision/recall, diversity index, and overfit risk indicator And Weak is flagged at Health Score <60 or sample count <30; Overfitted is flagged when precision−recall >25pp and diversity index <0.3 And warnings include specific drivers (e.g., low logo detection recall, high background hue variance) and recommended actions And activation is blocked when Weak or Overfitted flags are present unless an Admin overrides with a mandatory reason captured in audit
Versioning, Drift Detection & Notifications
"As a supplier manager, I want to be notified when a supplier’s fingerprint drifts so that I can refresh samples and keep routing accurate."
Description

Maintain fingerprint versions with timestamps and provenance, monitor live match scores and feature distributions for drift, and trigger alerts when confidence or feature alignment drops below thresholds. Suggest retraining or sample refresh, support scheduled re-evaluations, and enable auto-incremental updates with human approval gates. Provide dashboards, webhooks, and email notifications, plus one-click rollback to the last stable version.
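One common screen for the "feature distribution shift" mentioned above is the population stability index between a baseline histogram and a live window; this sketch is an assumption about the metric, not the product's mandated method:

```python
import math

def population_stability_index(expected: list[float], observed: list[float]) -> float:
    """PSI between a baseline feature histogram and a live window; a common
    drift screen (rule of thumb: PSI > 0.2 suggests significant shift).
    Bin proportions are floored to avoid log(0)."""
    eps = 1e-6
    e_total, o_total = sum(expected), sum(observed)
    psi = 0.0
    for e, o in zip(expected, observed):
        e_p = max(e / e_total, eps)
        o_p = max(o / o_total, eps)
        psi += (o_p - e_p) * math.log(o_p / e_p)
    return psi
```

Per the criteria below, a drift event would only be raised when such a metric breaches its threshold for three consecutive windows.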

Acceptance Criteria
Fingerprint Versioning & Provenance Capture
Given a supplier exists and a new fingerprint is generated from sample images When the fingerprint is saved Then a new immutable version ID is created (monotonic, unique per supplier) with an ISO-8601 UTC timestamp and provenance fields (creator, source image hashes, pipeline/model versions, changelog) And the prior active version remains retrievable And the new version is not active until explicitly activated or approved And retrieving the version history (≤100 versions) returns within 500 ms at P95 via UI/API And each version can be diffed against any other version, listing changed features and thresholds
Real-Time Drift Detection on Matches & Features
Given live ingestion of match scores and feature vectors per supplier When at least 200 new items have been processed in a 15-minute rolling window Then the system computes drift metrics (confidence mean/variance, feature distribution shift) and compares them to configured thresholds And drift is raised only if thresholds are breached for 3 consecutive windows And thresholds are configurable per supplier with sane defaults And drift computations complete within 2 minutes of window close
Alerting via Email and Webhook on Drift
Given an Open drift event is created for a supplier When alerting is enabled with recipients and a webhook URL configured Then an email is sent and a webhook POST is delivered within 60 seconds of event creation And the webhook payload contains supplier_id, active_version, metrics snapshot, severity, event_id, and HMAC-SHA256 signature And delivery is retried up to 6 times with exponential backoff (max 30 minutes) on non-2xx responses And duplicate alerts are suppressed within a 1-hour dedup window per event_id And users can acknowledge an alert to stop further notifications for that event
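The HMAC-SHA256 signing and verification required above can be sketched as follows; the canonical-JSON serialization choice is an assumption:

```python
import hashlib
import hmac
import json

def sign_webhook(payload: dict, secret: bytes) -> tuple[bytes, str]:
    """Serialize the drift-event payload canonically and compute the
    HMAC-SHA256 signature the receiver uses to verify authenticity."""
    body = json.dumps(payload, separators=(",", ":"), sort_keys=True).encode()
    return body, hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify_webhook(body: bytes, signature: str, secret: bytes) -> bool:
    """Receiving side: recompute and compare in constant time."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```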
Scheduled Re-evaluations & Remediation Suggestions
Given scheduled re-evaluations are configured (daily or weekly) for a supplier When a schedule runs Then the system re-scores a representative sample (min 500 items or last 7 days, whichever is greater) and updates drift status And if drift persists >24 hours, the system generates a remediation suggestion (retrain, refresh samples, adjust thresholds) with rationale and required sample count And job status, duration, and outcomes are logged and visible in the dashboard And failures are surfaced with retry options and error details
Auto-Incremental Updates with Human Approval Gates
Given auto-incremental updates are enabled and a candidate fingerprint version is proposed from new data When an authorized approver reviews the proposal Then approving activates the candidate version within 2 minutes and archives the prior active version as last_stable And rejecting preserves the current active version and records rationale And all actions are audit-logged with user, timestamp, and diff summary And proposals expire automatically after 14 days if not acted on
Monitoring Dashboard for Versions, Drift, and Alerts
Given a user with appropriate access opens the Fingerprint Monitoring dashboard When viewing a supplier Then the dashboard shows active version, last_stable version, drift status, last computation time, and thresholds And charts render match score trends and top feature shifts for the last 7, 30, 90 days And data freshness is ≤5 minutes at P95 And users can filter by supplier and export a CSV of version history, drift metrics, and alert history And all PII is redacted; access respects role-based permissions
One-Click Rollback to Last Stable Version
Given a prior last_stable version exists for a supplier When a user with rollback permission clicks Rollback Then the system sets last_stable as the active version within 2 minutes without interrupting ingestion And the pre-rollback active version is preserved as a candidate with reason=rollback And routing immediately uses the rolled-back version (P95 switch latency ≤120 seconds) And a confirmation email and webhook are sent with the new active_version and rollback reason And rollback is blocked with a clear message if no last_stable version exists

Confidence Gate

Set per-supplier confidence thresholds with clear, human‑readable evidence (e.g., logo match 92%, EXIF time-zone match, lighting profile similarity). Images that pass are auto-routed; borderline cases are queued for quick review, preventing misroutes while keeping throughput high.


Requirements

Per-Supplier Confidence Thresholds
"As a catalog operations manager, I want to set and adjust confidence thresholds per supplier so that we auto-accept good images while catching outliers without slowing the whole pipeline."
Description

Provide admin UI and API to define per-supplier composite confidence thresholds and per-signal minimums (e.g., logo match, EXIF time zone, lighting profile similarity). Includes global defaults, category-level overrides, supplier-level rules, versioning with rollback, validation of ranges, and effective-priority resolution. Changes apply without downtime and are logged with actor/timestamp for audit. Thresholds are evaluated synchronously in the ingestion pipeline and cached for performance with a short TTL and cache bust on update.

Acceptance Criteria
Admin UI: Create/Edit Per-Supplier Thresholds
Given an authenticated ThresholdAdmin in the admin UI When they create or edit a threshold rule for a supplier (and optionally a category) Then they can set: composite_threshold (0–100, up to 2 decimal places) and per-signal minimums for logo_match, exif_time_zone, lighting_profile_similarity (each 0–100) And invalid inputs (missing, non-numeric, out of range) block save with inline errors identifying each field And successful save returns confirmation and persists a new version with unique version_id and timestamp And an audit record is created with actor_id and timestamp
Effective Rule Resolution: Global, Category, Supplier
Given global defaults, category-level overrides, and supplier-level rules may all exist When computing the effective threshold for supplier S and category C Then precedence is: supplier-level > category-level > global default And only the single active version at the winning scope is used And if multiple active rules exist at the same scope for the same target, the most recently activated one is selected and the conflict is logged as a warning And the effective rule_id and source scope are returned by the API to callers requesting effective thresholds
API: CRUD, Validation, and RBAC
Given an authenticated caller When calling API endpoints to create, update, retrieve, delete thresholds and to fetch effective thresholds Then only users with ThresholdAdmin role can create/update/delete; reads are permitted to authorized users And invalid payloads return 400 with field-level error messages; unauthorized returns 401; forbidden returns 403; not found returns 404; success returns 2xx with version_id in the response And POST supports idempotency via an Idempotency-Key header
Version Activation and Rollback with Audit Trail
Given at least two versions exist for a supplier/category rule When a user activates a new version Then the active pointer switches atomically to the new version without downtime And an audit log entry is recorded with actor_id, timestamp, action "activate", from_version, to_version When a user performs rollback to a previous version Then the active pointer switches atomically to the selected previous version and an audit log entry is recorded with action "rollback"
Cache TTL and Bust-on-Update
Given threshold caches are enabled in ingestion workers When a threshold version is activated or updated Then a cache-bust is propagated so affected cache entries are invalidated within 2 seconds And the cache TTL for thresholds is configurable and set to 60 seconds or less in production And ingestion requests served after invalidation use the new effective thresholds without requiring process restarts
Synchronous Evaluation in Ingestion Pipeline
Given an image from supplier S in category C with computed signals (logo_match, exif_time_zone, lighting_profile_similarity) and composite score When the ingestion pipeline evaluates thresholds synchronously Then if composite >= effective composite_threshold and all per-signal minimums are met, the image is auto-routed as "pass" And if any per-signal minimum is not met, the image is routed to "review" with reason codes listing failing signals And if all per-signal minimums are met but composite < threshold, the image is routed to "review" with reason "below_composite_threshold" And the evaluation result includes human-readable evidence for each signal (e.g., "logo match 92%", "EXIF TZ match true/false", "lighting profile similarity 87%") and the effective rule_id used
Performance Overhead and Availability
Given normal production load When thresholds are evaluated during ingestion and APIs are called Then added evaluation latency is <= 20 ms at p95 and <= 50 ms at p99 per image And the effective-threshold read API responds in <= 200 ms at p95 under nominal load And the system maintains zero downtime for activation and rollback operations (no 5xx spikes; error rate < 0.1% during change)
Evidence Generation & Explainability
"As a reviewer, I want clear, human-readable evidence for each decision so that I can quickly trust or override the routing outcome."
Description

For each image, compute and persist a standardized evidence bundle that includes signal scores and human-readable rationales (e.g., "Logo matched at 92%", "EXIF time zone = supplier’s region", "Lighting profile similarity: 0.87"). Support schema versioning, JSON serialization, and redaction of sensitive EXIF fields. Evidence must be viewable in UI, included in webhooks/emails, and attached to audit logs. Provide a deterministic decision summary showing which thresholds were met/failed and the final route. Compute within a strict latency budget for evidence assembly and fall back gracefully if a signal is unavailable.

Acceptance Criteria
Evidence Bundle Schema & Versioning
Given an image has completed processing When the evidence bundle is generated Then the JSON validates against evidence.schema.json with zero errors And the bundle contains required top-level fields: schema_version, image_id, generated_at, signals, decision_summary And schema_version is a semantic version (MAJOR.MINOR.PATCH) matching the configured current schema And the same schema_version value is present in UI, webhook payloads, and email summaries for the image Given a client that consumes fields marked stable in schema v1.x When the service upgrades to schema v1.y (minor version) Then all stable fields remain present with unchanged types and semantics Given the bundle fails schema validation or a required field is missing When assembly completes Then the job is marked failed for evidence generation with a structured error And the image is routed to review_queue with reason "schema_validation_failed"
Human-Readable Evidence Content
Given signals are computed for an image When the evidence bundle is generated Then each signal item contains: key, score (0..1 numeric), rationale (non-empty human-readable string), source (component@version), and status (ok|unavailable|not_applicable) And rationales include concrete values (e.g., "Logo matched at 92%", "EXIF time zone matches supplier region") And numeric scores render to two decimal places in UI and emails while retaining full precision in JSON Given a signal is not applicable for an image (e.g., no EXIF present) When the evidence bundle is generated Then the signal status is not_applicable, score is omitted, and the rationale explains why Given a signal is unavailable due to error When the evidence bundle is generated Then the signal status is unavailable with a machine-readable reason code and human-readable rationale
Redaction of Sensitive EXIF
Given an image contains sensitive EXIF fields (GPSLatitude, GPSLongitude, CameraOwnerName, SerialNumber) When the evidence bundle is serialized Then those fields are not present in raw form anywhere in the bundle, UI, webhook payloads, or emails And the bundle includes a redactions array listing the redacted fields with reasons And any rationale referencing redacted fields uses placeholders (e.g., "GPS: [REDACTED]") Given non-sensitive EXIF fields (e.g., TimeZone) When evidence is generated Then they may be used in rationales and appear in the bundle as allowed by the schema
Deterministic Decision Summary & Routing
Given per-supplier thresholds are configured When evidence generation evaluates signals for an image Then decision_summary includes: supplier_id, evaluated_thresholds[{key, threshold, score, result: pass|fail}], final_route in {auto_route, review_queue, reject}, and reasons[] Given two runs on the same image with identical configuration and model versions When evidence is generated twice Then decision_summary (excluding generated_at) is byte-identical and hashes to the same value Given all required thresholds pass When decision is computed Then final_route = auto_route with reason "all_thresholds_met" Given any required threshold fails When decision is computed Then if the failing score lies within the supplier-configured review_band, final_route = review_queue with reason "borderline_threshold" Else final_route = reject with reason "threshold_failed" Given scores require rounding for display When decision is computed Then deterministic rounding rules are applied (round half away from zero to two decimals) and recorded in reasons[]
Latency Budget & Graceful Fallbacks
Given a batch of at least 1000 images under normal operating load When evidence assembly executes Then p95 per-image evidence assembly time <= 200 ms and p99 <= 400 ms as measured by server metrics Given a signal dependency exceeds its configured timeout (default 100 ms) When assembling evidence Then the signal is marked status=unavailable with reason "timeout" And decision_summary includes fallback_reason = "signal_unavailable:<key>" And if the signal is required for routing for that supplier, final_route = review_queue Given optional signals are unavailable When assembling evidence Then processing completes using available signals without exceeding the latency budget
Delivery to UI, Webhooks, and Emails
Given evidence is generated for an image When viewing the image details in the UI Then an Evidence panel displays all signals with scores and rationales, redaction badges for redacted items, and the decision summary with final_route Given evidence is generated for an image When a webhook event is sent Then the payload includes an "evidence" JSON object matching the stored bundle and the decision_summary, with Content-Type application/json Given evidence is generated for an image When an email notification is sent Then the email includes a human-readable summary of key rationales (top 3 signals by decision impact) and the final_route, plus a link to view the full evidence in the UI Given a webhook delivery fails with a 5xx response When retries occur Then the system retries according to the configured backoff policy and logs success or permanent failure without duplicating the evidence bundle
Persistence & Audit Log Attachment
Given evidence is generated for an image When persistence occurs Then the bundle is stored atomically keyed by image_id and includes a SHA-256 checksum And a retrieval API returns the exact JSON originally stored (byte-identical) Given a routing decision is finalized When an audit log event is created Then the audit log includes the decision_summary and a link to (or embedded) evidence bundle And exporting audit logs for a time range includes the evidence for each event Given a reviewer opens an audit log entry When viewing the evidence Then the displayed bundle hash matches the stored checksum and the decision can be reconstructed from the bundle without external calls
Auto-Routing & Review Queue Orchestration
"As a pipeline engineer, I want deterministic, scalable routing of images based on thresholds so that throughput stays high while misroutes are minimized."
Description

Implement a routing engine that compares evidence to thresholds and assigns one of three states per image: Pass (auto-continue), Borderline (enqueue for review), or Fail (return to supplier). Support batch-aware routing (mixed outcomes within a batch) with idempotent operations and at-least-once delivery to the review queue. Provide configurable borderline bands (e.g., within a percentage of threshold) and SLA-prioritized queue ordering. Ensure horizontal scalability to high throughput with low decision latency. Persist routing state transitions and provide retry/backoff for transient errors.

Acceptance Criteria
Routing Decision by Supplier Policy and Evidence
Given supplier "Acme" has a Confidence Gate policy v3 with checks: logo_match >= 90, exif_timezone_match = true, lighting_similarity >= 0.85 When an image from "Acme" yields evidence: logo_match = 92, exif_timezone_match = true, lighting_similarity = 0.87 Then the routing decision is Pass And the decision payload includes policy_version = v3, evidence fields and values, and decision_reason = "All thresholds met" And the routing state transition "Received -> Pass" is persisted with timestamp and correlation_id
Configurable Borderline Band Application
Given supplier "Acme" has thresholds: logo_match >= 90, lighting_similarity >= 0.85, and a borderline_band of 5 percentage points for logo_match and 0.02 for lighting_similarity When an image yields evidence: logo_match = 88, lighting_similarity = 0.84, exif_timezone_match = true, and no metric falls below (threshold - band) Then the routing decision is Borderline And a review queue message is emitted with idempotency_key derived from (image_id, policy_version) And the message includes a human-readable reason listing which checks are within the borderline band And no Fail decision is produced
Batch-Aware Mixed Outcomes and Partial Progression
Given a batch B123 containing 5 images produces outcomes by evidence evaluation: [Pass, Borderline, Fail, Pass, Borderline] When the batch is routed Then Pass images advance to the next pipeline stage immediately without waiting for Borderline or Fail images And Borderline images are enqueued to the review queue with batch_id = B123 And Fail images are marked Failed and a supplier-facing notification payload is prepared including evidence summary And a batch summary record is written with counts {pass:2, borderline:2, fail:1} And reprocessing any image in B123 yields the same decision and does not duplicate downstream actions (idempotent)
Idempotent Delivery with Retry/Backoff and At-Least-Once Semantics
Given image I456 evaluates to Borderline and the initial publish to the review queue times out (transient error) When the router retries with exponential backoff (initial_delay=100ms, multiplier=2, max_delay=5s) up to 5 attempts Then at least one review queue message is delivered for I456 And all delivered messages contain the same idempotency_key And only one active review task can be created for I456 within a 24-hour dedupe window And only one state transition to Borderline is recorded for I456 And if all attempts fail, the message is written to a dead-letter queue with failure_reason and an operator can requeue it manually
SLA-Prioritized Review Queue Ordering
Given the review queue contains items with due_at timestamps [in 5m, 20m, 2m, 2m] When the queue is ordered for dispatch Then items are dequeued in ascending time-to-breach order: [2m, 2m, 5m, 20m] And ties at 2m are broken by earlier created_at first And when an item's due_at is updated, its priority position reflects the change within 5 seconds And SLA-priority ordering is preserved across restarts and scale-out events
Persistence and Auditability of Routing State Transitions
Given an image transitions through states: Received -> Borderline -> Reviewed -> Pass When the audit log is queried by image_id Then each transition record includes {image_id, batch_id, previous_state, new_state, actor (system/user), timestamp, correlation_id, policy_version, evidence_snapshot, decision_reason} And records are immutable, write-once, and retained for at least 90 days And the audit query returns in <= 200ms p95 for the most recent 10k records And duplicate processing attempts do not create duplicate transition records
Horizontal Scalability and Decision Latency SLAs
Given the routing service is deployed with 4 stateless instances When 12,000 images are submitted over 1 minute with evidence payload sizes (p50=5KB, p95=20KB) Then the system sustains >= 12,000 routed decisions per minute with error_rate <= 0.1% And end-to-end routing decision latency is <= 200ms p95 and <= 500ms p99 And throughput scales approximately linearly (±15%) when instances are doubled during a 5-minute steady-state test And CPU utilization per instance stays <= 70% p95; autoscaling adds instances within 2 minutes if sustained > 70%
Reviewer Quick-Triage UI
"As a human reviewer, I want a fast triage UI with clear evidence so that I can clear borderline queues quickly and accurately."
Description

Provide a responsive triage interface with keyboard shortcuts and bulk actions to approve, reject, or request re-upload for borderline cases. Display the evidence bundle with visual cues (per-signal pass/fail), zoom and histogram tools, and side-by-side comparison against supplier brand references. Capture reviewer notes and tags, support undo within session, and emit structured events for analytics and model feedback. Target a median decision time under three seconds and support concurrent reviewers without conflicts.

Acceptance Criteria
Keyboard-First Triage Actions
Given a borderline case is open in the triage UI and no text input is focused, When the reviewer presses A, Then the case is approved, the next case opens, a success confirmation appears within 150 ms, and the server acknowledgment completes within 1 s. Given a borderline case is open and no text input is focused, When the reviewer presses R, Then the case is rejected with the same latency guarantees and next case opens. Given a borderline case is open and no text input is focused, When the reviewer presses U, Then a request re-upload action is applied with the same latency guarantees and next case opens. Given the triage UI is open, When the reviewer presses J or Right Arrow, Then the next case opens; When the reviewer presses K or Left Arrow, Then the previous case re-opens. Given the triage UI is open, When the reviewer presses ?, Then a shortcut help modal opens within 200 ms and lists all active shortcuts. Given a text input (notes/tags) is focused, When shortcut keys A/R/U/J/K are pressed, Then no action is triggered and focus remains; When Esc is pressed, Then input blurs so shortcuts can operate. Given an action fails on the server, When the failure is detected, Then an error banner appears within 1 s, the current case remains open, and a Retry action is available.
Evidence Bundle with Per-Signal Cues
Given evidence signals are provided by Confidence Gate, When the case loads, Then the evidence bundle renders within 400 ms and displays each signal’s name, human‑readable explanation, confidence percentage to one decimal (0.0–100.0%), and a color cue (green=pass, red=fail, amber=borderline). Given supported signals are present, When the case loads, Then at minimum Logo Match, EXIF Time‑Zone Match, and Lighting Profile Similarity are shown; if a signal is unavailable, Then it displays as Not available without blocking triage. Given a signal row is hovered or tapped, When the reviewer interacts, Then a tooltip reveals raw values and thresholds. Given the backend returns N signals, When rendered, Then exactly N signals are shown with identical values to the payload (no mismatches).
Zoom, Histogram, and Side-by-Side Comparison
Given the triage image is displayed, When the reviewer uses zoom controls (+/−, Cmd/Ctrl + mouse‑wheel, or 1/2/3/4 presets), Then zoom levels from 25% to 400% are supported with render latency ≤ 50 ms per zoom step and pan latency ≤ 30 ms per frame. Given the reviewer toggles Histogram (H), When enabled, Then RGB and luminance histograms appear and update within 100 ms for the current image; When disabled, Then the overlay hides immediately. Given side‑by‑side mode is activated, When a supplier brand reference exists, Then the working image and reference display adjacent with synchronized pan/zoom; the first reference loads within 800 ms; When multiple references exist, Then the reviewer can switch references and the selected reference loads within 800 ms.
Bulk Actions on Borderline Queue
Given the reviewer has a visible list of borderline cases, When they multi‑select via checkboxes, Shift+Click ranges, or Cmd/Ctrl+A (select all in view up to 200 items), Then the selection count is displayed. Given one or more cases are selected, When Approve, Reject, or Request Re‑upload is invoked, Then a confirmation dialog shows the action and count; upon confirm, local UI reflects the action within 300 ms and server processing completes within 5 s for up to 200 items. Given a batch is processing, When individual item failures occur, Then the progress panel lists failed items with reasons and a Retry Failed option; successful items are removed from the queue. Given a batch action completes, When the reviewer opens Undo, Then the entire batch can be undone as a single step within the session.
Reviewer Notes and Tags Persistence
Given a case is open, When the reviewer enters a note of 0–1000 characters (Unicode supported) and up to 10 tags (each 1–30 characters; letters, numbers, spaces, hyphen, underscore), Then inputs validate inline and disallow invalid characters. Given the reviewer pauses typing, When 500 ms of idle elapse or an action is taken, Then notes and tags autosave and a Saved indicator appears. Given notes/tags are saved, When the page is reloaded or the reviewer returns to the case, Then the previously saved content is present. Given unsaved edits exist, When the reviewer attempts to navigate away, Then a prompt offers Discard or Stay and Save. Given a decision is made (approve/reject/request re‑upload), When the action is recorded, Then the current notes and tags are attached to the decision payload and analytics event.
Session Undo for Triage Decisions
Given the reviewer has performed one or more triage actions in the current session, When they press Cmd/Ctrl+Z or click Undo, Then the most recent action (single or bulk) is reversed within 500 ms and the affected case(s) return to the top of the queue. Given multiple actions have occurred, When Undo is invoked repeatedly, Then up to the last 20 actions are reversible within the same session. Given an item included in an undo has been modified by another reviewer since the original action, When undo is attempted, Then that item is excluded from undo with a clear message; remaining items are undone. Given an undo occurs, When analytics events are emitted, Then the event references the original action’s event_id as undo_of_event_id.
Concurrent Review Safety and Analytics Events
Given a case is opened, When the first reviewer opens it, Then a server‑side lock is acquired; any subsequent reviewer sees the case as In Review (read‑only) and cannot take actions; lock state propagates to all clients within 2 s. Given a reviewer becomes inactive, When 2 minutes of inactivity elapse, Then the lock expires and the case returns to the queue; active tabs auto‑renew the lock. Given a reviewer completes an action, When the server confirms, Then the case is removed from other reviewers’ queues within 2 s to prevent double handling. Given any triage action (approve, reject, request re‑upload, undo), When it is performed, Then a structured analytics event is emitted with at minimum: event_id, timestamp_ms, item_id, supplier_id, reviewer_id (pseudonymous), action, decision_time_ms, input_method (shortcut|mouse|bulk), evidence_snapshot (signal_name, confidence, pass/fail), notes_length, tags, batch_id (if applicable), undo_of_event_id (if applicable). Given analytics events are emitted, When measured over a rolling 24 h window, Then ≥ 99.5% are delivered to the analytics bus within 2 s and 100% are persisted locally for retry up to 24 h with zero data loss in daily reconciliation. Given decision_time_ms is tracked, When computed as a rolling median per reviewer over the last 200 triaged cases, Then the median is < 3000 ms and is visible on the analytics dashboard.
Notifications & Supplier Feedback
"As a supplier, I want clear, actionable feedback when my images are held or rejected so that I can fix issues and resubmit quickly."
Description

Send supplier-facing notifications (email, webhook, dashboard alerts) for Fail and Borderline outcomes with concise reasons and remediation tips (e.g., lighting issues, logo occlusion). Allow suppliers to subscribe, set frequency, and choose channel. Include links to affected items and evidence excerpts. Enforce rate limits and localization, record delivery status and retries, and expose an API endpoint for suppliers to securely fetch decision details.

Acceptance Criteria
Email: Fail Outcome Notification with Evidence and Tips
Given a supplier has email notifications enabled with immediate frequency and a verified notification address When an item receives a finalized Fail decision from Confidence Gate Then an email is sent within 60 seconds containing: (a) subject including "Fail", supplier name, and batch reference; (b) up to 3 concise reason bullets; (c) remediation tips; (d) deep links to affected items; (e) evidence excerpts (e.g., logo match %, EXIF time-zone, lighting profile similarity) And the email is localized to the supplier’s preferred locale and rendered in UTF-8 And a delivery attempt record is persisted with timestamp, provider message ID, status, and retry count And transient failures are retried up to 3 times with exponential backoff (min 30s, max 5m) And duplicate emails for the same item decision and recipient are prevented via idempotency key for 24 hours
Webhook: Borderline Decision Payload and Security
Given a supplier has an active webhook subscription for Borderline outcomes with a configured endpoint URL and shared secret When an item is marked Borderline by Confidence Gate Then a POST is issued within 30 seconds with a JSON payload containing: event_id (UUID), idempotency_key, supplier_id, decision_type, item_ids, reasons, remediation_tips, evidence excerpts, occurred_at (ISO8601), and deep links And the request includes X-PixelLift-Signature (HMAC-SHA256 of body + timestamp) and X-PixelLift-Timestamp headers And the system retries up to 5 times with exponential backoff on network errors or 5xx; does not retry on 2xx or non-429 4xx; retries on 429 honoring Retry-After And replays are deduplicated by idempotency_key for 24 hours And all delivery attempts and outcomes are logged with latency and response code
Dashboard Alerts: Real-time Inbox and Acknowledgement
Given a supplier user is logged into the PixelLift dashboard with proper permissions When Fail or Borderline decisions occur for their items Then a new alert appears in the Notifications panel within 60 seconds showing decision type, count of affected items, top 1–3 reasons, and evidence badges And the user can filter alerts by outcome type, time range, and collection/batch And clicking an alert opens a list of affected items with links to details and evidence excerpts And marking an alert as read prevents repeat alerts for the same decision set and syncs read state across sessions within 10 seconds And unread counts remain accurate (±0) after page refresh
Subscription Preferences: Channels and Frequency Controls
Given a supplier admin opens Notification Settings When they configure channels (email, webhook, dashboard) and frequencies (immediate, hourly digest, daily digest) separately for Fail and Borderline Then the UI validates inputs and saves preferences atomically with audit trail And subsequent notifications honor these preferences without regression And digest jobs aggregate items without duplicates, include counts, representative reasons, and links to full lists And a "Send test" action delivers a channel-appropriate test within 60 seconds and records outcome And changes take effect for new decisions within 2 minutes of save
Rate Limiting and Throttling with Digest Fallback
Given per-supplier limits are configured as: email ≤ 60/min, webhook ≤ 120/min, dashboard alerts ≤ 10/sec When incoming decision volume exceeds a channel’s limit Then excess notifications are queued and delivered within SLA (email ≤ 5 min, webhook ≤ 2 min, dashboard ≤ 30 sec) And if volume exceeds 2× limit for > 5 minutes, the system auto-switches that channel to digest mode and notifies the supplier admin via dashboard And no notifications are dropped; queue depth is observable; 95th percentile end-to-end latency remains within SLA And API endpoints return 429 with Retry-After when applicable
Localization and Timezone Rendering
Given a supplier has a preferred language and time zone configured When notifications are generated across email, webhook payload text fields, and dashboard alerts Then all static and dynamic strings (reason labels, remediation tips, evidence labels) are localized to the supplier’s locale with correct pluralization And timestamps render in the supplier’s time zone; webhook includes both UTC and localized representations And missing translations fall back to English with a [EN] marker and log a missing-key event And links include the locale path segment when applicable
Decision Details API: Secure Fetch of Outcomes
Given a supplier has valid OAuth2 client credentials with scope notifications.read When they call GET /api/v1/suppliers/{supplier_id}/decisions with filters (outcome in [Fail, Borderline], since, until, page, page_size) Then the API authenticates and authorizes access to supplier_id, responds 200 with a paginated list including item_id, outcome, reasons, remediation_tips, evidence excerpts, links, and occurred_at (ISO8601) And supports ETag/If-None-Match and returns 304 when unchanged And returns 401 for invalid/expired token, 403 for cross-supplier access, 400 for invalid params, 429 when rate-limited (with Retry-After) And p95 latency ≤ 300 ms at 100 RPS with correct caching headers And responses respect Accept-Language or supplier default for localized fields
Analytics & Threshold Tuning
"As a product manager, I want analytics and simulation tools so that I can tune thresholds to balance quality and throughput."
Description

Provide dashboards and APIs that report pass rates, borderline rates, reviewer overrides, estimated false positives/negatives, average time in queue, and conversion impact by supplier and category. Include a sandbox to simulate threshold changes against historical evidence and recommend optimal thresholds to hit target auto-pass rates. Support CSV export and scheduled reports for stakeholders.

Acceptance Criteria
Dashboard Metrics by Supplier and Category
- UI displays metrics: pass_rate, borderline_rate, reviewer_override_rate, est_false_positive_rate, est_false_negative_rate, avg_queue_time_seconds, conversion_impact_percent for the selected time range.
- Filters available: supplier_id (multi-select), category (multi-select), date_range (last_7_days, last_30_days, custom UTC start/end) and are applied to all metrics and counts.
- For identical filters, UI metric values match Metrics API values within 0.1 percentage points for rates and 0.5 minutes for time-based metrics.
- Data freshness indicator shows last_updated_utc; max data latency <= 5 minutes behind real time.
- Empty state: when no data matches filters, UI shows "No data for selected filters" and disables exports.
- Number formatting: rates to 1 decimal place, times to nearest second, counts with thousands separators.
Metrics API for Reporting Integrations
- GET /v1/metrics supports filters: supplier_id, category, start_utc, end_utc, interval (day|week) and returns 200 with JSON including: pass_rate, borderline_rate, reviewer_override_rate, est_fp_rate, est_fn_rate, avg_queue_time_seconds, conversion_impact_percent, count_total, count_passed, count_borderline, count_overridden.
- Pagination via cursor with page_size up to 10000; response includes next_cursor and total_count.
- OAuth2 client-credentials required; unauthenticated returns 401; insufficient scope returns 403.
- Invalid parameters return 422 with machine-readable field errors; unknown supplier_id/category returns 404.
- Performance: p95 latency <= 800 ms and p99 <= 1500 ms under 100 concurrent users against 1M-row date ranges in staging perf tests.
- Reliability SLO: 5xx error rate < 0.1% over trailing 7 days (observed in production telemetry).
- Versioning: X-API-Version=1 header supported; requests without header default to v1 and include Deprecation header when newer versions exist.
Threshold Sandbox Simulation and Recommendations
- Given >= 30 days of historical evidence with >= 10,000 images, when a user adjusts per-supplier and per-category confidence thresholds and sets a target_auto_pass_rate, the simulator outputs predicted auto_pass_rate, borderline_rate, manual_review_volume, est_fp_rate, est_fn_rate for the same window.
- Backtest accuracy: predicted auto_pass_rate error <= ±2.0 percentage points and manual_review_volume error <= ±5% versus measured historical outcomes.
- Recommendation engine proposes thresholds that achieve target_auto_pass_rate within ±1.0 percentage point while minimizing est_fp_rate subject to user_constraint est_fp_rate_max.
- Output includes per-supplier/category threshold table with expected metrics and 95% confidence intervals; users can export the proposal as CSV.
- Apply action requires authorized role; applying creates a new versioned threshold set, logs actor, timestamp, and diff; no changes take effect without explicit confirmation.
- Safety check blocks applying any proposal that increases est_fp_rate by > 3.0 percentage points relative to current settings unless a second approver confirms.
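Under the assumption that each historical item carries a model confidence score and a reviewer ground-truth label, the backtest at the heart of the sandbox reduces to replaying candidate thresholds over that history. The two-threshold (auto-pass / borderline) scheme and the field names below are illustrative, not specified by the criteria.

```python
def simulate(items, pass_threshold, borderline_threshold):
    """Replay historical (confidence, is_actually_good) pairs under candidate
    thresholds and report the metrics the sandbox promises."""
    total = len(items)
    auto_pass = [i for i in items if i["confidence"] >= pass_threshold]
    borderline = [i for i in items
                  if borderline_threshold <= i["confidence"] < pass_threshold]
    # FP: a bad item that would have auto-passed; FN: a good item auto-failed.
    fp = sum(1 for i in auto_pass if not i["is_actually_good"])
    fn = sum(1 for i in items
             if i["confidence"] < borderline_threshold and i["is_actually_good"])
    return {
        "auto_pass_rate": len(auto_pass) / total,
        "borderline_rate": len(borderline) / total,
        "manual_review_volume": len(borderline),
        "est_fp_rate": fp / max(len(auto_pass), 1),
        "est_fn_rate": fn / total,
    }

history = [
    {"confidence": 0.95, "is_actually_good": True},
    {"confidence": 0.90, "is_actually_good": False},
    {"confidence": 0.70, "is_actually_good": True},
    {"confidence": 0.40, "is_actually_good": True},
]
report = simulate(history, pass_threshold=0.85, borderline_threshold=0.60)
```

A recommendation engine would then search threshold pairs for the one whose simulated auto_pass_rate lands within the target band at minimal est_fp_rate.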
Estimated False Positive/Negative Calculation
- FP/FN estimates are computed from labeled outcomes (reviewer ground truth and post-publish audits); unlabeled items are excluded from denominators.
- Dashboard shows est_fp_rate and est_fn_rate per supplier and category for day/week intervals with 95% Wilson score confidence intervals.
- Suppress display and show "Insufficient data" when labeled sample size n < 200 or CI half-width > 10 percentage points.
- Tooltip discloses data window, sample size, and sources; values deep-link to a drill-down of up to 100 audited records for spot checks.
- Recent-data lag handling: metrics for the most recent 72 hours are marked "preliminary" and auto-refresh as new labels arrive (<= 5-minute latency).
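The 95% Wilson score interval named above has a closed form, sketched here together with the n >= 200 / 10-point half-width suppression rule from the criteria.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """95% Wilson score interval for a proportion -- the CI the dashboard
    attaches to est_fp_rate and est_fn_rate."""
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (max(0.0, center - half), min(1.0, center + half))

n, labeled_fps = 200, 12                    # 6% estimated FP rate on 200 labels
lo, hi = wilson_interval(labeled_fps, n)
half_width = (hi - lo) / 2
# Suppression rule from the criteria: hide the metric when n < 200 or the
# CI half-width exceeds 10 percentage points.
show = n >= 200 and half_width <= 0.10
```

Unlike the normal approximation, the Wilson interval stays inside [0, 1] and behaves sensibly at small n, which matters for the sparse audit samples described above.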
CSV Export and Scheduled Reporting
- From the dashboard, users can export the current filtered view to CSV containing ISO-8601 UTC timestamps, column headers, and one row per supplier-category-interval with metrics and raw counts.
- CSV aggregates and row counts match on-screen totals for the same filters within the same last_updated_utc.
- Users can schedule reports with frequency (daily|weekly), delivery time (UTC hh:mm), filters (supplier, category, date window), and recipient emails (verified domain only).
- Delivery methods: email with secure download link (expires in 7 days) and optional S3 delivery via assume-role ARN; failures send alert emails and are recorded in an audit log.
- Delivery SLA: generated within ±10 minutes of scheduled time; retries up to 3 times with exponential backoff on failure.
- CSV schema is versioned; changes increment schema_version and include compatibility notes in the email body.
Average Time in Queue Measurement Integrity
- avg_queue_time_seconds is defined as elapsed time from status "ingested" to first of "routed_to_review" or "auto_passed", excluding durations in "on_hold" or "awaiting_customer_action".
- For a random sample of 1000 items, dashboard avg_queue_time_seconds matches reconstruction from event logs with mean absolute error <= 30 seconds.
- Percentiles p50, p90, p95 are displayed and filterable by supplier, category, and date window.
- All event timestamps are stored and computed in UTC; DST changes do not affect calculations.
- Items reassigned between suppliers/categories are attributed to the supplier/category present at final routing decision time.
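Reconstructing queue time from an item's event log, per the definition above, might look like this sketch. The resume-event names (on_hold_released, customer_action_received) are assumptions, since the criteria only name the excluded states.

```python
from datetime import datetime, timezone

def queue_seconds(events):
    """Elapsed time from 'ingested' to the first of 'routed_to_review' or
    'auto_passed', minus intervals spent in excluded states. All timestamps
    are UTC, so DST transitions cannot shift the arithmetic."""
    ts = {e["status"]: e["at"] for e in events}
    start = ts["ingested"]
    end = min(t for s, t in ts.items() if s in ("routed_to_review", "auto_passed"))
    excluded = 0.0
    # Assumed pause/resume event pairs for the two excluded states.
    for pause, resume in (("on_hold", "on_hold_released"),
                          ("awaiting_customer_action", "customer_action_received")):
        if pause in ts and resume in ts:
            excluded += (ts[resume] - ts[pause]).total_seconds()
    return (end - start).total_seconds() - excluded

def utc(h, m):
    # 2024-03-10 is a US DST-change day; UTC timestamps are unaffected.
    return datetime(2024, 3, 10, h, m, tzinfo=timezone.utc)

events = [
    {"status": "ingested", "at": utc(1, 0)},
    {"status": "on_hold", "at": utc(1, 10)},
    {"status": "on_hold_released", "at": utc(1, 40)},
    {"status": "routed_to_review", "at": utc(2, 0)},
]
secs = queue_seconds(events)   # 60 minutes wall-clock minus 30 minutes on hold
```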
Conversion Impact Attribution by Supplier and Category
- conversion_impact_percent is computed as (treatment_cr - control_cr) / control_cr * 100, where cohorts are PixelLift-processed vs matched baseline listings over the same period.
- Supported methods: (a) online A/B flag between auto-passed and manually reviewed items; (b) propensity-score matched pre/post cohorts when A/B is unavailable.
- Guardrails: display metric only when each cohort has >= 1000 sessions and 95% CI half-width <= 5 percentage points; otherwise show "Insufficient data" state.
- UI shows point estimate with 95% CI and significance; tooltip discloses sample sizes, method, and look-back window.
- Validation: on a synthetic dataset with known 10% lift and >= 10,000 sessions per cohort, pipeline estimates within ±2 percentage points.
- Filter changes (supplier, category, date) recompute cohorts; 95th-percentile query completes within 10 seconds.
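The attribution formula and the display guardrails above translate directly into code; everything here restates the criteria, with only the function names assumed.

```python
def conversion_impact_percent(treatment_conv, treatment_sessions,
                              control_conv, control_sessions):
    """Point estimate from the stated formula:
    (treatment_cr - control_cr) / control_cr * 100."""
    treatment_cr = treatment_conv / treatment_sessions
    control_cr = control_conv / control_sessions
    return (treatment_cr - control_cr) / control_cr * 100

def guardrail_ok(treatment_sessions, control_sessions, ci_half_width_pts):
    """Display rule from the criteria: >= 1000 sessions per cohort and a
    95% CI half-width of at most 5 percentage points."""
    return (treatment_sessions >= 1000 and control_sessions >= 1000
            and ci_half_width_pts <= 5.0)

# 3.3% treatment CR vs 3.0% control CR -> a 10% relative lift.
impact = conversion_impact_percent(330, 10_000, 300, 10_000)
```

This matches the synthetic-data validation case: a known 10% lift with 10,000 sessions per cohort should be recovered within the stated tolerance.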
Signal Registry & Versioning
"As an ML engineer, I want a versioned signal registry so that I can evolve models without breaking routing or evidence consumers."
Description

Create a registry for all confidence signals (name, description, version, owner, output schema, performance metrics). Support deprecations, feature flags, canary rollouts, and per-supplier signal enablement. Ensure backward-compatible evidence schemas and automatic migration when signal versions change. Provide monitoring with alerts on drift, missing signals, and anomalies to protect routing quality.

Acceptance Criteria
Register New Signal With Version And Schema
Given an authenticated admin provides name, description, semantic version (MAJOR.MINOR.PATCH), owner, output JSON Schema URI, and baseline performance metrics When they POST the new signal to the registry API Then the registry persists an immutable record and returns 201 with id and version And GET /signals/{name} returns the latest version and GET /signals/{name}/{version} returns the specific version And duplicate name+version creation returns 409 Conflict And an audit log entry is created with actor, timestamp, diff, and reason And p95 write latency for creation <= 500 ms under 50 RPS
Deprecate Signal With Graceful Sunset
Given an existing signal has active versions v1 and v2 When an admin marks v1 as deprecated with a sunset date at least 14 days in the future and an optional replacement Then the API and UI show v1 as Deprecated with the sunset date and replacement reference And clients receive a deprecation warning header/field when requesting v1 And new enablements of v1 are blocked after the sunset date, returning 400 with guidance to migrate And existing suppliers pinned to v1 keep functioning until migrated; routing continues without 5xx And a notification is sent to signal owners and subscribers within 5 minutes of deprecation And an audit trail records the deprecation and scheduled sunset job
Backward-Compatible Evidence Schema Validation
Given a new signal version v2 is declared backward-compatible with v1 When the validation job runs against historical evidence payloads for v1 (min 1,000 samples or 100% if fewer) Then 100% of sampled v1 payloads validate against the v2 JSON Schema And the version bump is MINOR or PATCH (not MAJOR) And the validation report is stored and linked to v2; promotion to Active is blocked on failure And evidence fields marked as required in v1 remain present in v2 with same types or safe widening
Automatic Migration On Signal Version Upgrade
Given supplier S has signal X v1 enabled and auto-migrate = true When v2 of X is promoted to Active with a migration mapping Then S is migrated to v2 within 10 minutes, and routing jobs use v2 without interruption And evidence payloads are transformed per mapping rules with >= 99.9% success over the first 10k events And if error rate > 0.5% for 15 consecutive minutes, an automatic rollback to v1 occurs And all changes (promote, migrate, rollback) are audit-logged with diff and actor And no degradation > 1% on primary routing KPI is detected during the first hour post-migration
Per-Supplier Signal Enablement Overrides
Given a supplier S requires custom signal enablement and version pinning When an admin sets enable/disable and version selection for each signal for S Then the effective configuration is visible via API/UI with source-of-truth (Override vs Global vs Flag) And changes propagate to the routing pipeline within 2 minutes (p95) And routing for S only consumes enabled signals and fallbacks are applied for disabled/missing ones without 5xx And precedence is Supplier Override > Feature Flag > Global Default and is enforced deterministically
Feature Flags and Canary Rollouts
Given a new signal version v2 is behind a feature flag When an admin configures a canary rollout by supplier percentage or explicit supplier list Then traffic splits respect the target within ±2% tolerance over 10k events And comparative metrics (precision, recall, false-route rate) are computed for canary vs control And promotion to 100% is blocked unless primary KPI degrades by <= 1% with p<0.05 and min 1,000 paired samples And one-click Promote or Rollback takes effect within 5 minutes and is audit-logged
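A deterministic hash-based splitter is one common way to hold a canary share stable per supplier; the criteria only require the split to respect the target within ±2%, so the salting scheme below is an assumption.

```python
import hashlib

def in_canary(supplier_id, percent, salt="signal-x-v2"):
    """Deterministic hash bucketing: the same supplier always lands in the
    same bucket, so assignment is stable across requests, and the salt keeps
    different rollouts statistically independent of each other."""
    digest = hashlib.sha256(f"{salt}:{supplier_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) % 100
    return bucket < percent

suppliers = [f"sup-{i}" for i in range(10_000)]
canary = [s for s in suppliers if in_canary(s, percent=10)]
share = 100 * len(canary) / len(suppliers)   # should sit near 10%, within tolerance
```

Because assignment is a pure function of supplier ID and salt, promoting to 100% or rolling back only changes the `percent` input, never scrambles who was in the canary.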
Monitoring, Drift Detection, and Alerts
Given baseline distributions and SLOs are defined for each signal and version When daily drift analysis runs (e.g., KS test or PSI) and detects drift beyond the configured threshold for 24 hours Then an alert is sent to on-call with a runbook link and a tracking issue is created And missing-signal alerts fire within 2 minutes when >1% of requests lack a required signal for 5 minutes And anomaly detection on evidence fields raises warnings with top contributing fields And dashboards show current health, last 24h trends, and error budgets; availability SLO >= 99.9% monthly
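PSI, one of the drift statistics named above, compares a baseline score distribution to the current one bin by bin. The 0.1/0.25 rule-of-thumb thresholds in the comments are conventional practice, not taken from the criteria.

```python
import math

def psi(expected, actual, eps=1e-6):
    """Population Stability Index between two binned distributions (fractions
    summing to 1). Rule of thumb: < 0.1 stable, 0.1-0.25 moderate drift,
    > 0.25 significant drift worth an alert."""
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)   # guard against empty bins
        total += (a - e) * math.log(a / e)
    return total

baseline = [0.25, 0.25, 0.25, 0.25]    # signal-score distribution at launch
today_ok = [0.24, 0.26, 0.25, 0.25]    # tiny day-to-day wobble
today_bad = [0.10, 0.15, 0.25, 0.50]   # probability mass shifted to the top bin
```

A daily job would compute `psi(baseline, today)` per signal version and open an alert with a runbook link once the value stays above threshold for the 24-hour window.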

Auto Bind

Map detected suppliers to the right preset bundle, destination folders, and channel variants with one click. Once bound, every incoming image is processed with the correct style and metadata automatically—eliminating sorting work and preserving brand consistency.

Requirements

Smart Supplier Detection
"As an operations manager, I want suppliers to be detected and normalized automatically so that incoming images are tagged correctly without manual sorting."
Description

Automatically identify and normalize supplier sources for each incoming image using file metadata, upload origin, filename patterns, folder paths, and optional watermark/logo recognition. The system consolidates aliases (e.g., “Acme Co.”, “ACME”, supplier code) into a single canonical supplier profile so bindings are reliable. Detection runs at ingest time, tagging assets with the resolved supplier ID to drive downstream Auto Bind routing without human intervention. This ensures consistent, accurate mapping at scale and eliminates manual sorting.

Acceptance Criteria
Detect Supplier from Filename Patterns at Ingest
Given an incoming image whose filename matches a configured supplier pattern (e.g., "ACME_*", "ACM-####-*.jpg"), When the asset is ingested via any supported channel, Then the system extracts the supplier hint from the filename and resolves it to the canonical supplier ID with confidence >= 0.90, And the asset is tagged with supplierId before any downstream processing begins, And resolution accuracy on the filename-pattern validation set is >= 98%.
Map Supplier from Folder Path and Upload Origin
Given a configured mapping from upload origins and folder paths to suppliers (e.g., SFTP path "/suppliers/acme/drop" or Drive folder "Brand/Inbound/ACME"), When images are ingested from those locations, Then the system resolves and tags the canonical supplierId for each asset, And if multiple suppliers share a path prefix, the most specific path rule is applied, And end-to-end tests show >= 99% correct mappings for origin/path cases.
Watermark/Logo Recognition for Supplier Resolution
Given an image containing a known supplier watermark or logo from the trained catalog, When the asset is ingested, Then the system detects the watermark/logo and resolves to the correct canonical supplierId with confidence >= 0.95, And the false positive rate on a negative control set is <= 1.0%, And assets with successful logo-based resolution are tagged prior to background removal or other transforms.
Alias Consolidation to Canonical Supplier Profile
Given a supplier with multiple known aliases (e.g., "Acme Co.", "ACME", codes like "ACM-001"), When any alias is detected from filename, metadata, origin, or logo, Then the system normalizes to the same canonical supplierId, And alias collisions that merge distinct suppliers do not occur on the regression set (collision rate = 0), And a trace record of detected alias and chosen canonical ID is stored with the asset metadata.
Tagging and Auto Bind Routing at Ingest
Given a resolved canonical supplierId for an ingested asset, When the asset enters the processing pipeline, Then supplierId is persisted in asset metadata and is queryable via API and UI, And Auto Bind applies the correct preset bundle, destination folder, and channel variants without manual input, And end-to-end validation shows 100% of assets with resolved supplierId are routed using the bound preset for that supplier.
Signal Fusion, Precedence, and Ambiguity Handling
Given multiple signals (filename, metadata, origin/path, logo) that may agree or conflict, When the system computes composite confidence, Then a deterministic precedence and fusion rule is applied to select the top supplier if composite confidence >= 0.85, And if composite confidence < 0.85, the system applies a configured default supplier per channel if available, else sets supplierId="unresolved" and flags the asset for review without auto-bind, And all decisions include stored confidence scores and contributing signals for audit.
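One deterministic fusion rule satisfying the criteria is a fixed weighted vote over the per-signal candidates, with the 0.85 composite threshold and the default-supplier / unresolved fallbacks applied afterward. The weights here are illustrative; the criteria only require the rule to be deterministic.

```python
def fuse(signals, weights, threshold=0.85, default=None):
    """Combine per-signal (supplier, confidence) votes into a composite score
    via a weighted sum, then apply the >= 0.85 rule from the criteria:
    resolve, else fall back to the channel default, else flag for review."""
    scores = {}
    for source, (supplier, confidence) in signals.items():
        scores[supplier] = scores.get(supplier, 0.0) + weights[source] * confidence
    # Ties break on supplier name so the outcome is fully deterministic.
    supplier, composite = max(scores.items(), key=lambda kv: (kv[1], kv[0]))
    if composite >= threshold:
        return supplier, composite, "resolved"
    if default:
        return default, composite, "default"
    return "unresolved", composite, "review"

weights = {"filename": 0.3, "path": 0.3, "logo": 0.4}
agreeing = {"filename": ("acme", 0.95), "path": ("acme", 0.9), "logo": ("acme", 0.97)}
conflicting = {"filename": ("acme", 0.6), "path": ("globex", 0.5), "logo": ("acme", 0.4)}
```

Storing the per-source contributions alongside the chosen supplier gives the audit trail of confidence scores and contributing signals the criterion requires.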
Throughput, Latency, and Scalability at Batch Ingest
Given a batch ingest of 10,000 images with mixed suppliers, When detection runs, Then 95th percentile supplier resolution latency per image is <= 200 ms and error/timeout rate <= 0.1%, And overall accuracy on the benchmark set (known suppliers) is >= 97%, And tagging completes before the next pipeline stage starts for each asset, And the system supports horizontal scaling to maintain these SLAs under 2x concurrent load.
One-Click Binding Assignment
"As a studio lead, I want to assign the right presets and destinations to a supplier in one click so that future uploads are processed correctly by default."
Description

Provide a streamlined UI to bind a detected supplier to a preset bundle (style preset + metadata template), destination folders, and channel variants with a single action. The interface surfaces recommended presets based on historical usage and allows quick confirmation or override. Once saved, the binding is active for all future ingests from that supplier, minimizing setup time and ensuring repeatable outcomes.

Acceptance Criteria
One-Click Bind for Single Supplier (Recommended Preset)
Given a detected supplier without an existing binding and recommendations are displayed When the user clicks the one-click Bind action for that supplier Then the supplier is bound to the recommended preset bundle (style preset + metadata template) And the recommended destination folder and channel variants are saved with the binding And a success toast with binding summary (preset bundle, folder, variants) appears within 2 seconds And the supplier row shows binding state Active within 1 second And any unprocessed images from the current ingest for that supplier are queued with the bound settings within 5 seconds And an audit record (user, timestamp, configuration) is stored
Override Recommendations Before Binding
Given a detected supplier and a recommendation preview is shown And the user selects a different style preset, metadata template, destination folder, and/or channel variants When the user clicks Bind Then the system saves the overridden selections as the binding And the confirmation toast reflects the overridden selections And subsequent ingests for that supplier use the overridden binding And an audit record notes that recommendations were overridden
Auto-Apply Binding to Future Ingests
Given an active binding exists for Supplier X When new images from Supplier X are ingested via web upload, API, or watched folder Then the bound style preset, metadata template, destination folder, and channel variants are applied to 100% of images And processing starts within 10 seconds of ingest detection And no user interaction is required And a run log entry shows the binding ID and configuration version used
Edit Existing Binding with Versioning
Given a binding exists for a supplier When a permitted user updates any component (preset bundle, destination folder, channel variants) and saves Then the new configuration applies only to images ingested after the save time And previously processed images remain unchanged And the binding displays an Updated timestamp and increments version And version history records prior configuration and editor identity And the user can roll back to a prior version in one click
Disambiguate Duplicate/Ambiguous Supplier Matches
Given supplier detection returns multiple potential matches or conflicting identifiers When the user attempts to bind Then the UI requires selection of a single supplier (name + unique ID) before enabling Bind And no binding is created until a single selection is confirmed And the selected supplier is shown clearly in the confirmation toast and audit log And future detections of that supplier auto-resolve to the confirmed identity
Validation for Missing Preset/Folder/Variant Components
Given the recommended or selected preset bundle, destination folder, or channel variants are missing, deleted, or inaccessible When the user opens the binding UI Then the missing elements are flagged inline with specific error messages And the Bind action is disabled until all required elements are valid
When a bind attempt is made via shortcut or API Then the request fails with HTTP 422 including a machine-readable list of missing elements And no partial binding is created
Bulk One-Click Binding for Multiple Suppliers
Given 2–50 detected suppliers without existing bindings and each has a recommendation When the user clicks Bind All Recommended at the list level Then the system creates a binding per supplier using its recommendation And a progress indicator displays success/failure per supplier in real time And the operation completes within 5 seconds for up to 50 suppliers and within 30 seconds for up to 200 suppliers And failures do not block successful bindings And the action is idempotent: re-running it binds only unbound suppliers without duplicating existing bindings And an exportable CSV report of per-supplier results is available
Auto-Routing & Batch Processing
"As a catalog manager, I want images to be routed and processed automatically according to bindings so that batches finish quickly with consistent styling and metadata."
Description

Upon ingest, automatically route images to the correct processing pipeline based on the supplier binding. Apply the bound style preset, metadata template, and channel-specific variants in batch, then deliver outputs to the configured destination folders and channels. Processing should be resilient, support retries, and expose job status so large catalogs complete reliably with minimal oversight.

Acceptance Criteria
Auto-Routing on Ingest for Bound Supplier
Given a supplier binding exists with id "SUP-123" mapping to preset bundle, destinations, and channel variants When 500 images are ingested with supplierId="SUP-123" via API or watched folder Then a processing job is created within 10 seconds of ingest completion with job.bindingId="SUP-123" and 500 items queued And each item is routed to the pipeline defined by the binding without manual mapping And images ingested with an unknown supplierId are flagged as "Unbound" and do not start processing
Batch Application of Style Preset and Metadata Template
Given the binding references style preset "Apparel-White-2048" and metadata template with fields brand="Acme", supplier="SUP-123", license="Standard", altText="{sku} {title} product photo" When 500 images are processed in the job Then 100% of produced outputs have the preset transformations applied (background removed, white background #FFFFFF, JPG quality=90, sRGB profile) And 100% of outputs contain the metadata fields brand, supplier, license, and altText with values matching the template and dynamic tokens resolved from item data And any item missing required tokens is marked Failed with a clear validation error and is excluded from delivery
Channel Variant Generation per Binding
Given the binding defines variants: Shopify 2048x2048 JPG sRGB named {sku}_shopify_2048.jpg; Amazon 2000x2000 JPG sRGB named {sku}_amazon_2000.jpg; Instagram 1080x1080 JPG sRGB named {sku}_instagram_1080.jpg When an item is processed Then exactly three variant files are produced per source image with pixel dimensions, format, color profile, and filenames matching the specification And each variant is tagged with channel metadata (channelKey, width, height, format) and linked back to the source item
Delivery to Configured Destinations and Channels
Given destinations are configured: S3 s3://brand-assets/shopify/, s3://brand-assets/amazon/, s3://brand-assets/instagram/ When the job completes processing for 500 items with 3 variants each Then 1,500 files are delivered to the correct destination prefixes with ETags matching local checksums and zero missing files And delivery completes within 5 minutes of last item processed, and delivery events are logged per file with destination, timestamp, and status=Success
Resilient Processing with Retries and Partial Failure Handling
Given a transient processing or delivery error occurs (e.g., 502 from style service or 503 on S3 PUT) When the error is classified as retryable Then the system retries the operation up to 3 times with exponential backoff (approximately 1s, 5s, 15s) without creating duplicate outputs And upon final failure, the item status=Failed with errorCode and errorMessage recorded; remaining items continue processing And the overall job status becomes CompletedWithErrors if at least one item failed
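The retry policy above (up to 3 retries on an approximate 1s/5s/15s backoff, retryable errors only, batch continues past failures) can be sketched as follows. The injectable `sleep` is for testability, and classifying by HTTP status is an assumption beyond the 502/503 examples given.

```python
RETRYABLE = {502, 503}     # transient upstream errors per the criteria's examples
BACKOFF_S = [1, 5, 15]     # the approximate exponential schedule from the criteria

def process_with_retries(operation, sleep=lambda s: None):
    """Run an operation returning an HTTP-style status code. Retryable
    failures are retried up to 3 times with backoff; non-retryable errors and
    exhausted retries mark the item Failed without stopping the batch."""
    attempts = []
    status = operation()
    attempts.append(status)
    for delay in BACKOFF_S:
        if status == 200:
            return "Processed", attempts
        if status not in RETRYABLE:
            break                      # permanent error: fail fast, no retry
        sleep(delay)
        status = operation()
        attempts.append(status)
    return ("Processed" if status == 200 else "Failed"), attempts

# Flaky dependency: two 503s, then success on the third attempt.
responses = iter([503, 503, 200])
state, attempts = process_with_retries(lambda: next(responses))
```

Items that exhaust all retries end as Failed with their attempt history, feeding the errorCode/errorMessage record and the CompletedWithErrors job status described above.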
Job Status Visibility via UI and API
Given a job is in progress When the user opens Jobs in the UI or calls GET /api/jobs/{jobId} Then the job shows status in {Created, Queued, Processing, Delivering, Completed, CompletedWithErrors, Failed}, progressPercent, counts {total, processed, failed, pending}, and timestamps {startedAt, completedAt} And progressPercent and counts update at least every 5 seconds while active; a downloadable CSV of failures is available when any item fails
Idempotent Re-ingest and Duplicate Prevention
Given the same files are re-ingested within 24 hours with the same batchId or matching content hashes and no forceReprocess flag When auto-routing triggers for the new ingest Then items with existing outputs produced by the same binding and preset versions are marked Skipped with reason="Duplicate" and are not reprocessed or re-delivered And if forceReprocess=true, all items are reprocessed and re-delivered, producing a new job run without creating duplicate files at the destination (previous versions are versioned or overwritten per destination policy)
Binding Rules & Conflict Resolution
"As a power user, I want clear rules and overrides when bindings conflict so that I can ensure the correct presets are applied in edge cases."
Description

Implement precedence rules and conflict resolution when multiple bindings could match (e.g., overlapping folder rules or ambiguous supplier detection). Provide deterministic priority ordering, rule scoping (global, workspace, channel), and clear fallback behavior (e.g., default preset). Offer manual override per batch and per asset with audit logging to maintain control without sacrificing automation.

Acceptance Criteria
Scoped Rule Evaluation Order (Channel > Workspace > Global)
Given an asset matches rules at channel, workspace, and global scopes When bindings are evaluated Then the channel-scoped rule is selected
Given an asset matches rules at workspace and global scopes but not channel When bindings are evaluated Then the workspace-scoped rule is selected
Given an asset matches only a global-scoped rule When bindings are evaluated Then the global-scoped rule is selected
Given an asset matches rules at multiple scopes with equal priority values When bindings are evaluated Then scope precedence (Channel > Workspace > Global) determines the winner and is recorded in the audit log
Priority and Specificity Tie-Breaking Within Same Scope
Given two matching rules within the same scope with priorities 1 and 5 When evaluation occurs Then the rule with priority 1 (lower number = higher priority) is selected
Given two matching rules within the same scope with the same priority where one matches an exact folder path and the other a wildcard When evaluation occurs Then the exact folder path rule is selected
Given two matching rules within the same scope with the same priority and matcher type When evaluation occurs Then the most recently updated rule is selected
Given two matching rules remain tied after applying all tie-breakers When evaluation occurs Then selection is made by stable deterministic ordering of rule IDs and the tie-break sequence is logged
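Because every tie-breaker is a totally ordered key, the whole precedence chain collapses into a single composite sort key, which makes selection deterministic by construction. The rank tables below mirror the documented order; the rule field names are assumed.

```python
SCOPE_RANK = {"channel": 0, "workspace": 1, "global": 2}   # Channel > Workspace > Global
MATCHER_RANK = {"exact": 0, "wildcard": 1}                 # exact folder path beats wildcard

def winner(rules):
    """Pick the binding rule deterministically using the documented chain:
    scope precedence, then priority (lower number wins), then matcher
    specificity, then most recent update, then stable rule-id order."""
    return min(rules, key=lambda r: (
        SCOPE_RANK[r["scope"]],
        r["priority"],
        MATCHER_RANK[r["matcher"]],
        -r["updated_at"],     # newer update wins, so negate for min()
        r["id"],
    ))

rules = [
    {"id": "r1", "scope": "global",    "priority": 1, "matcher": "exact",    "updated_at": 100},
    {"id": "r2", "scope": "workspace", "priority": 5, "matcher": "wildcard", "updated_at": 200},
    {"id": "r3", "scope": "workspace", "priority": 5, "matcher": "exact",    "updated_at": 150},
]
chosen = winner(rules)   # r3: workspace scope beats global, exact beats wildcard

tied = [
    {"id": "b", "scope": "channel", "priority": 1, "matcher": "exact", "updated_at": 100},
    {"id": "a", "scope": "channel", "priority": 1, "matcher": "exact", "updated_at": 100},
]
```

Logging the losing rules with their keys alongside the winner yields exactly the tie-break sequence the audit criterion asks for.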
Default Preset Fallback When No Rule Matches
Given an asset has no matching rule across channel, workspace, or global scopes When evaluation completes Then the configured default preset bundle is applied
Given fallback is applied to an asset When batch results are generated Then the asset is labeled as "Fallback used" with reason "No matching rule" in the batch report
Given fallback is applied When audit records are written Then an entry is created with actor "system", chosen mapping "default preset", and the absence of matched rules
Manual Override at Batch and Asset Levels with Audit Trail
Given a user with binding override permissions sets a batch-level override (preset bundle, destination folder, channel variants) before processing When the batch is processed Then the override is applied to all assets in the batch and rule evaluation is skipped
Given a user sets an asset-level override and a batch-level override exists When the asset is processed Then the asset-level override takes precedence over the batch-level override
Given an override (batch or asset) is applied When audit records are written Then the log captures actor, timestamp, override scope (batch/asset), selected mappings, and optional reason
Given a user removes an override before processing When the asset or batch is processed Then normal rule evaluation resumes and the removal is recorded in the audit log
Given overrides are used When viewing results Then underlying binding rules remain unchanged
Ambiguous Supplier Detection Resolution
Given supplier detection returns multiple candidates with equal confidence for an asset When bindings are evaluated Then rules for each candidate supplier are considered and the winner is selected using scope precedence and priority tie-breakers
Given supplier detection returns no confident match for an asset When bindings are evaluated Then supplier-based rules are skipped and other matchers (e.g., folder) are used; if none match, fallback preset is applied
Given ambiguity in supplier detection influences the winner When results are presented Then the decision rationale includes the supplier detection outcome and applied tie-breakers
Atomic Binding Application Across Mapping Targets
Given a selected binding rule specifies preset bundle, destination folder, and channel variants When applied to an asset Then all mappings are applied atomically so that either all succeed or none persist
Given any mapping step fails during application When processing completes Then changes are rolled back, the asset is marked as binding failed, and an actionable error is recorded
Given atomic application is successful When the asset is finalized Then the asset reflects the preset, destination folder, and channel variants from the chosen rule consistently
Conflict Explanation and Audit Logging Accessibility
Given multiple rules matched and a conflict was resolved for an asset When the audit log is queried Then an entry exists listing all matched rules (id, scope, priority, matcher summary), the tie-breakers applied in order, and the chosen rule id
Given conflict resolution occurred When a user inspects the batch results Then a human-readable explanation of the winning decision is available for that asset
Given audit records are written for binding, fallback, or overrides When subsequent reviews occur Then entries are immutable and any corrections are appended as new entries referencing the original
Bulk Binding Import/Export
"As an integrations engineer, I want to manage bindings in bulk via files and API so that I can keep mappings in sync across systems efficiently."
Description

Enable CSV/JSON import and export of bindings to create, update, and share mappings at scale. Support validation, dry-run previews, and error reporting to prevent misconfiguration. Provide API endpoints for programmatic management so larger sellers and integrators can synchronize bindings from their supplier management systems.

Acceptance Criteria
CSV Import Dry-Run Preview
Given a CSV containing headers supplier_id, channel_variant, preset_bundle_id, destination_folder_id (metadata columns optional) When the user uploads the file with Dry Run enabled Then the system validates all rows without persisting changes and returns a summary with total_rows, valid_rows, invalid_rows within 5 seconds for up to 10,000 rows And per-row issues are returned with row_number, field, error_code (e.g., MISSING_REQUIRED_FIELD, INVALID_REFERENCE, DUPLICATE_BINDING), and message And a downloadable error report (CSV) is provided for invalid rows And no bindings are created or updated in the database
CSV Import Upsert with All-or-Nothing
Given a valid CSV and All-or-Nothing mode enabled When the user confirms Import Then the system upserts by unique key (supplier_id + channel_variant), creating new bindings or updating existing ones atomically And if any row fails validation, zero changes are committed and a 409 response includes aggregated error details and row_numbers And re-importing the same unchanged file is idempotent and yields created=0, updated=0, unchanged=all, errors=0 And for files up to 10,000 rows, processing completes within 2 minutes with a visible progress indicator
JSON API Import for Programmatic Sync
Given an authenticated client with scope bindings:write When it POSTs /api/v1/bindings/import with body { dryRun: boolean, allOrNothing: boolean, bindings: [...] } Then the API validates and returns 200 with summary and per-record results: created|updated|unchanged|error, honoring dryRun and allOrNothing And requests are rate-limited to 60/min per API key; exceeding returns 429 with retry-after And payloads are limited to 5 MB or 10,000 bindings; exceeding returns 413 with guidance And Idempotency-Key header is supported; duplicate requests within 24h return the original result without reprocessing
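The Idempotency-Key semantics above (duplicate requests within 24h replay the original result without reprocessing) can be modeled with a TTL-keyed cache in front of the import handler. A minimal sketch, assuming an in-memory store; a production system would persist keys and results durably.

```python
import time

class IdempotencyCache:
    """Replay the stored result for a repeated Idempotency-Key within a TTL."""

    def __init__(self, ttl_seconds=24 * 3600):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, result)

    def get_or_run(self, key, handler, now=None):
        """Return (result, replayed). Runs handler only on a cache miss."""
        now = time.time() if now is None else now
        hit = self._store.get(key)
        if hit and hit[0] > now:
            return hit[1], True  # duplicate request: original result, no reprocessing
        result = handler()
        self._store[key] = (now + self.ttl, result)
        return result, False
```

The `replayed` flag lets the API distinguish a fresh import from a deduplicated retry in its response metadata.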
Export Bindings with Filters
Given a user with bindings:read permission When they export via UI or GET /api/v1/bindings/export?format={csv|json}&supplier_id=...&channel_variant=... Then the response includes matching bindings with fields supplier_id, channel_variant, preset_bundle_id, destination_folder_id, last_modified, modified_by, metadata in the chosen format And the export is sorted by supplier_id asc, then channel_variant asc, and completes within 10 seconds for up to 50,000 bindings And the response includes a SHA-256 checksum header and a timestamped filename
Referential Integrity and Uniqueness Validation
Given any binding record from CSV or JSON When validating references Then supplier_id, preset_bundle_id, and destination_folder_id must exist and be accessible to the organization or the row is rejected with error_code=INVALID_REFERENCE And channel_variant must be one of the workspace's configured variants or error_code=INVALID_ENUM is returned And uniqueness is enforced on (supplier_id, channel_variant); duplicates in the file or conflicts with existing data are rejected with error_code=DUPLICATE_BINDING including conflicting row_numbers or binding_id
Permissions and Audit Logging
Given a user or API key When attempting to import or export bindings Then only Admin or Integrator roles (or API keys with bindings:write for import, bindings:read for export) are authorized; others receive 403 FORBIDDEN And every import/export action creates an audit log with actor, action, counts (created, updated, unchanged, errors), source (UI|API), request_id, checksum, and timestamp And audit logs link to affected binding IDs and are retained and searchable for at least 90 days
Binding Audit & Version Control
"As a compliance lead, I want a full history of binding changes with rollback so that I can audit decisions and recover from misconfigurations."
Description

Track all binding changes with timestamp, actor, old/new values, and reason. Support version history with rollback to a prior configuration to quickly recover from mistakes. Surface change diffs and exportable logs for compliance and QA, ensuring traceability of automated processing decisions.

Acceptance Criteria
Audit Log Entry Completeness
Given a user edits an Auto Bind configuration (supplier mapping, preset bundle, destination folders, channel variants, or processing metadata rules) When the user submits the change with a non-empty reason Then an audit record is written within 2 seconds containing: unique entry ID; ISO 8601 UTC timestamp; actor (user ID and display name); action type (create/update/delete/rollback); affected entity IDs (binding ID, supplier ID, preset ID); field-level old values and new values; free-text reason; request origin (UI/API) and client IP And the new audit record is immediately visible in the audit timeline And if the reason is empty or missing, the change is blocked with a validation error and no audit record is created
Version History & Diff View
Given an Auto Bind configuration has at least two saved versions When a user opens the version history for that binding Then versions are listed in reverse chronological order with version number, timestamp, actor, and reason And when two versions are selected for comparison, only changed fields are shown with side-by-side old vs new values And arrays (e.g., destination folders, channel variants) display added (+), removed (-), and modified items distinctly And the diff view is read-only and provides a Copy JSON action for each compared version
Rollback to Prior Binding Configuration
Given a user with rollback permission selects a prior version from version history When the user provides a rollback reason and confirms Then the system creates a new active version identical to the selected prior version and marks it as current And an audit entry of type "rollback" is recorded referencing the target version and provided reason And all subsequent incoming images use the rolled-back configuration within 1 minute of confirmation And historical audit records and previously processed images remain unchanged And if the configuration changed after the user opened the dialog, the rollback is aborted with a conflict message and no changes are applied
Exportable Audit Logs with Filters
Given an auditor applies filters (date range, actor, supplier, binding ID, action type) to the audit timeline When the auditor requests an export Then the system generates CSV or JSON containing: entry ID, timestamp (UTC), actor ID, actor name, action type, binding ID, supplier ID, preset ID, changed fields (old/new), reason, origin, and IP And the export file embeds the applied filters in a metadata section (JSON) or header row (CSV) And exports up to 100,000 records complete within 60 seconds and stream for larger datasets without timing out And all text fields are properly escaped; timestamps use ISO 8601 UTC And an audit entry records that an export was generated, by whom, when, and with which filters
Access Control & Visibility
Given organization roles are enforced When a Viewer accesses audit history Then they can view and filter records but cannot export or perform rollbacks; related controls are disabled When an Admin or Compliance user accesses audit history Then they can export and perform rollbacks And unauthorized attempts to export or rollback return HTTP 403 (API) or display disabled UI with explanatory tooltip (UI) And all queries and actions are scoped to the user's organization; cross-tenant access is not possible
Immutability & Integrity Verification
Given audit records are append-only When an API client attempts to modify or delete an existing audit record Then the operation is rejected with HTTP 405 and message "Audit log is immutable" And each audit record stores a SHA-256 checksum of its canonical content And a verification endpoint validates record integrity and returns pass/fail per entry And the UI displays a "Verified" badge when checksum validation passes and an "Integrity Warning" banner when it fails, notifying Admins
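The per-record SHA-256 over "canonical content" depends on a deterministic serialization, since the same record must always hash to the same value. One common choice (an assumption here, not confirmed by the spec) is JSON with sorted keys and compact separators:

```python
import hashlib
import json

def canonical_checksum(record: dict) -> str:
    """SHA-256 over a canonical JSON form: sorted keys, compact separators."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def verify(record: dict, stored_checksum: str) -> bool:
    """Recompute and compare; a mismatch signals tampering or corruption."""
    return canonical_checksum(record) == stored_checksum
```

The verification endpoint described above would simply map `verify` over the requested entries and report pass/fail per record.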
Performance & Pagination for Audit Timeline
Given an organization with up to 500,000 audit records When a user opens the audit timeline with default filters Then the first page (50 records) loads within 2 seconds at p95 latency And subsequent pages load within 1.5 seconds at p95 using cursor-based pagination And text search by actor, supplier, or reason returns results within 2 seconds at p95 for up to 1,000,000 records And the timeline displays total results (approximate) and current cursor position
Monitoring & Alerts for Unbound Items
"As a production coordinator, I want alerts when items are unbound or fall back to defaults so that I can fix mappings quickly and keep processing on track."
Description

Provide real-time dashboards and notifications for assets that could not be bound or were processed with default fallbacks. Allow users to triage, bind, and reprocess directly from the alert. Configurable thresholds and channels (email, Slack, webhook) ensure issues are surfaced promptly and do not block catalog throughput.

Acceptance Criteria
Real-time Dashboard for Unbound and Fallback Items
Given assets are ingested and Auto Bind runs When an asset is unbound or processed with a default fallback Then the Monitoring dashboard lists the asset within 5 seconds of processing completion And the list shows: assetId, supplierDetected, reasonCode, timeDetected, presetApplied (if fallback), channel, batchId And filters exist for time range, reasonCode, supplierDetected, channel, status And totals and trend charts update within 5 seconds to reflect the new counts
Configurable Alert Thresholds and Channels
Given an admin configures alert thresholds (absolute count and percentage) and a time window per supplier or globally And selects notification channels (email, Slack, webhook) When unbound or fallback events meet or exceed any configured threshold within the window Then an alert is generated and delivered to the selected channels within 60 seconds And email subject includes environment, accountName, incidentType, count, window And Slack message includes a triage deep-link and top 5 sample assetIds And webhook payload conforms to schema v1.0 with HMAC signature and includes incidentId, counts, windowStart, windowEnd, reasonCodes breakdown And duplicate alerts for the same incidentId are suppressed during the configured cooldown period
Triage and Bind from Alert
Given a user opens the triage view from an alert When the user selects a supplier-to-preset, destination folders, and channel variants mapping and clicks Bind Then the binding is validated (entities exist, no conflicts) and saved And future assets from that supplier auto-bind to the selected preset, destinations, and channels And the incident status changes to Resolved with a reference to the bindingId And the dashboard removes the incident from Open within 5 seconds
One-Click Reprocess After Binding
Given an alert has affected assets and a new binding was created in this session When the user clicks Reprocess Affected Assets Then all eligible affected assets are queued for reprocessing within 30 seconds And reprocessing applies the new binding’s style-preset and metadata And assets retain original assetIds and create a new processingVersion And completion status (successCount, failCount) is posted back to the alert and dashboard And failures include errorCodes and can be retried individually
Throughput Not Blocked by Unbound Items
Given a batch containing both bindable and unbound assets When processing runs Then bound assets complete within ±5% of the baseline processing time for the same configuration And unbound assets are processed using the configured default fallback preset and routed to the fallback destination without blocking other assets And ingestion is not paused and alerting occurs per configured thresholds
Webhook Reliability and Security
Given a webhook channel is configured with an endpoint URL and secret When an alert is generated Then the system sends a POST with an HMAC-SHA256 signature header (X-PixelLift-Signature) and an idempotency key And retries up to 5 times with exponential backoff over 10 minutes for non-2xx responses or timeouts And marks delivery as Failed after retries and records attemptCount and lastResponse in the dashboard And deduplicates deliveries using the idempotency key And only TLS 1.2+ with valid certificates is accepted
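Signing and verifying the webhook body per the criteria above is a few lines with HMAC-SHA256. A sketch of both sides of the exchange; the header name `X-PixelLift-Signature` comes from the spec, while the hex encoding of the digest is an assumption.

```python
import hashlib
import hmac

def sign_payload(secret: bytes, body: bytes) -> str:
    """Compute the X-PixelLift-Signature value: hex HMAC-SHA256 of the raw body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    """Receiver-side check; compare_digest resists timing attacks."""
    return hmac.compare_digest(sign_payload(secret, body), header_value)
```

Receivers should verify against the raw request bytes before any JSON parsing, since re-serialization can change whitespace and break the signature.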

Drift Watch

Continuously monitors incoming images for shifts from the saved fingerprint (new studio lighting, background color, watermark changes). When drift is detected, users receive alerts with suggested updates, or can branch a new version (Supplier v2) to keep routing precision tight as vendors evolve.

Requirements

Style Fingerprint Baseline
"As a catalog manager, I want PixelLift to generate and save a reliable style fingerprint per supplier so that future photos can be compared consistently to detect meaningful drift."
Description

Create a versioned baseline “fingerprint” per supplier/brand using a curated set of approved images, capturing measurable style attributes (background color profile, illumination/exposure/white balance, shadow softness, composition framing, watermark presence/location, resolution/aspect). Persist fingerprints with metadata (creator, date, lineage), attach them to routing rules and style-presets, and expose them to the processing pipeline. Provide an onboarding flow to select reference images, auto-summarize metrics, and allow manual tweaks. Ensure fingerprints are immutable once locked, with explicit versioning and rollback to support change control and reproducibility across batches.

Acceptance Criteria
Create Fingerprint from Approved Images (Onboarding Flow)
Given I am an authorized Supplier Admin for supplier S And I have at least 10 approved reference images When I start Fingerprint Onboarding for supplier S And I select between 10 and 50 reference images Then the system computes per-image and aggregated metrics: background color (sRGB mean/stddev), exposure (histogram mean), white balance (CCT Kelvin), shadow softness (edge gradient width px), composition framing (product bbox coverage %), watermark presence (boolean) and location (bbox), resolution (width x height) and aspect ratio And aggregated metrics are displayed with counts and ranges And computation completes within 10 seconds for up to 50 images And any rejected images are listed with specific error reasons (e.g., corrupt, resolution below 800x800)
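Aggregating per-image measurements into the fingerprint's metric ranges can be sketched with stdlib statistics. The function name and summary fields are illustrative, assuming one scalar measurement per image (e.g., background luminance):

```python
import statistics

def aggregate_metric(values):
    """Summarize one per-image measurement into fingerprint statistics."""
    return {
        "mean": statistics.fmean(values),
        "stddev": statistics.pstdev(values),  # population stddev over the reference set
        "min": min(values),
        "max": max(values),
        "count": len(values),
    }
```

Running this per metric over the 10–50 selected reference images yields the "aggregated metrics ... with counts and ranges" the onboarding flow displays.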
Adjust Fingerprint Metrics and Preview Results
Given aggregated metrics are displayed for supplier S When I edit tolerance/range for any metric field Then input is validated with inline errors for out-of-range values and invalid formats And I can reset any metric to the auto-derived default And a preview renders using at least 3 sample images and updates within 2 seconds of a change And all pending changes are captured as a draft until saved
Persist Fingerprint with Metadata and Supplier Association
Given I have reviewed auto-derived metrics and tweaks for supplier S When I click Save & Lock Then a fingerprint record is created with fields: id, supplierId, version=1, creatorUserId, createdAt (UTC), lineageId, parentVersion=null, metrics payload, notes And the fingerprint is associated with supplier S and appears in the supplier’s Fingerprints list And it is retrievable via API GET /fingerprints/{id} and GET /suppliers/{supplierId}/fingerprints And an audit log entry is recorded with actor, timestamp, and diff of metrics
Lock Immutability and Create New Version
Given fingerprint v1 for supplier S is locked When I attempt to edit any metric on v1 via UI or API Then the system blocks the action and returns 409 Conflict with message "Fingerprint is immutable" When I choose Create New Version from v1 Then v2 is created as an editable draft copying all metrics from v1 And version increments by 1 and sets parentVersion=v1 and lineageId identical to v1 And audit log records version creation with parent linkage
Set Active Version and Roll Back to Previous
Given supplier S has fingerprints v1 and v2 (locked) When I set v2 as Active in the supplier settings Then new processing batches for supplier S reference fingerprint v2 by default When I trigger Roll Back to v1 Then v1 becomes Active and v2 remains locked and selectable And the change is reflected in UI/API within 5 seconds and recorded in audit logs
Expose Fingerprint to Pipeline, Routing Rules, and Style-Presets
Given supplier S has an Active fingerprint vN And routing rules and style-presets reference fingerprint attributes When a new image for supplier S enters processing Then the job payload includes fingerprintId and version And the routing evaluator can query all metric fields without error And the style-preset applies parameters derived from fingerprint metrics (e.g., background color target, crop coverage) And if no Active fingerprint exists for supplier S, ingestion fails with error code FP-ABSENT and is visible in logs/UI
Real-time Drift Detection
"As an operations lead, I want new uploads automatically checked for style drift so that issues are caught immediately without slowing our publishing pipeline."
Description

Continuously evaluate newly uploaded images against the supplier’s baseline fingerprint in near real-time (<2 minutes), computing diffs per attribute and returning a drift classification (none/minor/major) with confidence scores. Support both streaming and batch modes, handle spikes (e.g., 10k images/hour), and degrade gracefully with backpressure and retries. Emit structured drift events to the event bus for downstream alerting and workflow, and tag images with drift status to influence subsequent retouching and preset application paths.

Acceptance Criteria
Streaming Near Real-Time Drift Classification
Given a supplier fingerprint is active and an image is uploaded via the streaming API When the image enters the drift detector under normal load (<=1k images/hour per supplier) Then a drift result with fields {classification in [none, minor, major], confidence in [0.0,1.0]} is produced within 120 seconds end-to-end p95 And the API response includes HTTP 200 and a result payload with image_id, supplier_id, fingerprint_version, classification, confidence, processing_timestamps
Batch Mode Drift Processing at Scale
Given a manifest of 10,000 image URLs is submitted to the batch endpoint When batch processing runs to completion Then effective throughput is >=10,000 images/hour with p95 per-image classification latency <=120 seconds from ingestion And permanent failures are <=0.1% and every failed item is recorded with error_code and moved to a dead-letter queue And the batch results export contains one row per input with drift_status, confidence, and attribute_diffs persisted
Attribute-Level Diff Computation and Confidence
Given a baseline fingerprint with attributes [lighting_profile, background_color, watermark_presence] When a new image is evaluated Then the output includes attribute_diffs[] with entries {attribute_name, baseline_value, observed_value, metric, delta, unit} And classification mapping rules apply: none if all deltas < minor_threshold; minor if any delta >= minor_threshold and all < major_threshold; major if any delta >= major_threshold And confidence is in [0,1] and >= class_min_confidence for the assigned classification; thresholds and versions are included in the result
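The classification mapping above is a direct translation into code. A minimal sketch assuming a single pair of thresholds applied to all attributes; per-attribute thresholds would follow the same shape.

```python
def classify_drift(attribute_diffs, minor_threshold, major_threshold):
    """Map attribute deltas to none/minor/major per the stated rules."""
    deltas = [abs(d["delta"]) for d in attribute_diffs]
    if any(d >= major_threshold for d in deltas):
        return "major"  # any delta at or past the major threshold wins
    if any(d >= minor_threshold for d in deltas):
        return "minor"
    return "none"  # all deltas below the minor threshold
```

Checking "major" before "minor" encodes the precedence implied by the rules: the worst single attribute determines the overall classification.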
Graceful Degradation Under Spike With Backpressure and Retries
Given an ingress spike of 10,000 images/hour sustained for 60 minutes When the system receives uploads via streaming and batch endpoints Then requests are accepted with 202 or rate-limited with 429 once queue depth exceeds threshold, with no data loss And retries use exponential backoff with jitter (initial 1s, max 30s, max 6 attempts); items exhausting retries are routed to DLQ with reason within 5 minutes of last attempt And during the spike, 99% of images are classified within 10 minutes and 100% are eventually classified or present in DLQ
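The retry policy above (exponential backoff with jitter, initial 1s, max 30s, max 6 attempts) maps to a short helper. This sketch uses "full jitter" (each delay drawn uniformly from zero up to the exponential ceiling), which is one common interpretation; the spec does not pin down the jitter scheme.

```python
import random

def backoff_delays(initial=1.0, cap=30.0, attempts=6, rng=None):
    """Full-jitter backoff: delay_i drawn from [0, min(cap, initial * 2**i)]."""
    rng = rng or random.Random()
    return [rng.uniform(0.0, min(cap, initial * (2 ** i))) for i in range(attempts)]
```

After the final attempt the caller routes the item to the dead-letter queue with its failure reason, per the criteria above.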
Structured Drift Event Emission to Event Bus
Given a drift result is produced When emitting an event Then a JSON message conforming to schema drift.v1 is published within 10 seconds with fields: event_id, occurred_at (ISO-8601 UTC), supplier_id, image_id, fingerprint_version, classification, confidence, attribute_diffs[], trace_id; size <=128KB And at-least-once delivery is provided with idempotency_key = hash(supplier_id,image_id,fingerprint_version) to allow de-duplication; duplicates <=1 per 10,000 events in test And the message validates against the registered schema and is consumable by downstream services without deserialization errors
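The deduplication key above, `idempotency_key = hash(supplier_id, image_id, fingerprint_version)`, needs a stable hash across processes, so a cryptographic digest over a delimited encoding is the natural choice. A sketch assuming SHA-256 and a `|` separator (the spec does not specify either):

```python
import hashlib

def drift_idempotency_key(supplier_id: str, image_id: str, fingerprint_version: int) -> str:
    """Deterministic key so at-least-once deliveries can be de-duplicated downstream."""
    material = f"{supplier_id}|{image_id}|{fingerprint_version}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()
```

Consumers keep a window of recently seen keys and drop any event whose key repeats, which is what keeps effective duplicates at or below the stated 1-per-10,000 budget.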
Drift Status Tagging Influences Downstream Retouching
Given a drift classification is produced for an image When the image metadata is stored Then drift_status in {none, minor, major}, drift_confidence, and fingerprint_version are persisted and queryable via API/UI And when drift_status=none, the standard retouching/preset path executes without additional review And when drift_status=minor, the current style preset is applied and an informational alert is queued without blocking And when drift_status=major, the image is routed to the supplier drift workflow (alert emitted) and waits for branch decision or approval before preset application
Adaptive Sensitivity & Thresholds
"As a brand owner, I want to adjust how sensitive drift detection is for each vendor so that we minimize false alarms while still catching changes that affect brand consistency."
Description

Provide configurable thresholds for each fingerprint metric at global and per-supplier levels, with optional auto-tuning that learns from recently approved images to reduce false positives. Include presets (strict/standard/lenient), preview tools to simulate sensitivity against historical data, and guardrails (min/max bounds) to avoid drift creep. Store changes with audit metadata and effective dates to ensure consistent evaluation across in-flight batches.

Acceptance Criteria
Global and Per-Supplier Threshold Configuration with Guardrails
Given fingerprint metrics with defined min/max guardrails and an admin with edit permissions When the admin saves global thresholds within bounds and optional per-supplier overrides for selected suppliers Then the system validates each metric against guardrails, rejects invalid values with field-level errors, and persists valid changes atomically And the effective threshold set for any supplier resolves as supplier override where present, otherwise global And retrieving thresholds via UI/API returns the effective values and their source (global or supplier)
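The resolution rule above (supplier override where present, otherwise global, with the source reported) can be sketched directly. Names and dict shapes are illustrative:

```python
def effective_thresholds(global_thresholds, supplier_overrides, supplier_id):
    """Resolve per-metric thresholds: supplier override wins, else global; tag the source."""
    overrides = supplier_overrides.get(supplier_id, {})
    resolved = {}
    for metric, value in global_thresholds.items():
        if metric in overrides:
            resolved[metric] = {"value": overrides[metric], "source": "supplier"}
        else:
            resolved[metric] = {"value": value, "source": "global"}
    return resolved
```

Returning the source alongside each value is what lets the UI/API answer "where did this threshold come from?" without a second lookup.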
Preset Selection with Safe Preview and Apply
Given strict, standard, and lenient presets mapped to concrete threshold bundles When a user selects a preset for a supplier and runs Preview against a chosen historical window Then the system simulates using the preset without persisting changes and displays projected alert rate, precision, recall, and delta vs current And only upon Apply does the system persist the preset thresholds for the selected scope (global or supplier) with confirmation feedback
Auto-Tuning Opt-In Reduces False Positives within Bounds
Given auto-tuning is disabled by default and bounds/step limits are configured When a user enables auto-tuning for a supplier with a learning window of recently approved images Then proposed threshold updates stay within guardrails and per-interval step limits and target a reduction in false positives relative to the baseline And updates are applied automatically only if projected metrics meet or exceed configured improvement thresholds; otherwise they require manual approval And each auto-tune change includes a rollback point
Effective-Dated Changes Preserve In-Flight Batch Consistency
Given a threshold change is scheduled with an effective date/time When a batch is submitted before the effective date/time Then the batch is evaluated using the previously effective thresholds for its entire lifecycle And batches submitted at or after the effective date/time are evaluated with the new thresholds And reprocessing uses thresholds effective at reprocess time unless the user selects Use original thresholds
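Pinning a batch to the thresholds in force at submission time reduces to "pick the version with the latest effective date at or before the batch timestamp." A sketch, assuming each version carries an `effective_at` datetime:

```python
from datetime import datetime, timezone

def thresholds_for_batch(versions, submitted_at):
    """Select the threshold set whose effective_at is the latest one <= submitted_at."""
    applicable = [v for v in versions if v["effective_at"] <= submitted_at]
    if not applicable:
        raise ValueError("no threshold set effective at submission time")
    return max(applicable, key=lambda v: v["effective_at"])
```

Storing the selected version id on the batch at submission is what keeps the evaluation stable for the batch's entire lifecycle, even if thresholds change mid-run.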
Immutable Audit Log with Complete Metadata
Given any threshold change (manual edit, preset apply, or auto-tune) When the change is saved Then an immutable audit record is created capturing actor (user/system), timestamp, scope (global/supplier), metric diffs (old→new), change source, rationale/notes, and effective date/time And audit entries are retrievable via UI/API, filterable by supplier, actor, date range, and change source, and exportable as CSV
Historical Simulation Reports Impact Metrics Pre-Apply
Given labeled historical outcomes are available for a supplier or globally When the user compares current thresholds to candidate thresholds in the simulator Then the system returns counts and rates for TP, FP, TN, FN, overall precision/recall, and projected alert volume for the selected window and sample size And the simulator enforces a minimum sample size and communicates confidence where applicable And no configuration changes are persisted from simulation runs
Validation Blocks Out-of-Bounds or Partial Updates
Given a user attempts to save thresholds with any metric outside guardrails or missing required fields When the save is submitted Then the system rejects the request with specific error messages per invalid metric and shows allowed ranges And no partial updates are persisted; either all metrics are saved successfully or none
Drift Alerts & Notifications
"As a photo ops coordinator, I want timely, contextual drift alerts in the tools we already use so that we can act quickly without being overwhelmed."
Description

Deliver actionable alerts when drift is detected via in-app notifications, email, Slack, and webhooks. Group related events to avoid alert storms, apply rate limits, and provide severity levels based on impact on conversion-critical attributes (e.g., background color deviation). Include rich context: supplier, affected metrics, sample thumbnails, confidence, and links to preview/suggested fixes. Allow users to acknowledge, snooze, or resolve alerts and manage subscriptions per team and supplier.

Acceptance Criteria
Severity classification and triggering of drift alerts
Given a saved supplier fingerprint with thresholds for conversion-critical metrics (e.g., background color deltaE, watermark presence) When incoming images from that supplier exceed one or more thresholds Then the system assigns severity as follows:
- Critical if >= 10 images within 15 minutes exceed background color deltaE > 10 or any watermark is newly detected with confidence >= 0.90
- High if >= 5 images within 15 minutes exceed background color deltaE > 5 or lighting variance > 20% with confidence >= 0.80
- Medium if a single image exceeds a non-critical metric threshold with confidence >= 0.80
And an alert is created for the supplier and drift_type with the computed severity within 10 seconds of the qualifying event
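One reading of the severity rules above, sketched over a single 15-minute window of drift events. The event field names (`delta_e`, `watermark_confidence`, `lighting_variance`, `confidence`) are assumptions, and the windowing itself is handled by the caller:

```python
def classify_severity(events):
    """Map one 15-minute window of a supplier's drift events to a severity, or None."""
    bg_critical = sum(1 for e in events if e.get("delta_e", 0) > 10)
    bg_high = sum(1 for e in events if e.get("delta_e", 0) > 5)
    new_watermark = any(e.get("watermark_confidence", 0) >= 0.90 for e in events)
    lighting_high = sum(1 for e in events
                        if e.get("lighting_variance", 0) > 0.20
                        and e.get("confidence", 0) >= 0.80)
    if bg_critical >= 10 or new_watermark:
        return "Critical"
    if bg_high >= 5 or lighting_high >= 5:
        return "High"
    if any(e.get("confidence", 0) >= 0.80 for e in events):
        return "Medium"
    return None  # nothing in the window meets a qualifying threshold
```

Evaluating Critical before High before Medium encodes the implied precedence: the worst qualifying rule sets the alert's severity.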
Alert payload includes rich context and actionable links
Given a drift alert is generated When the alert payload is constructed Then it includes: supplier_id, supplier_name, severity, confidence (0.00–1.00), affected_metrics [{name, before_value, after_value}], grouped_count, first_seen_at, last_seen_at, 3 sample_thumbnail URLs, sample asset_ids/SKUs, correlation_id (UUIDv4), preview_url, suggested_fix_url And all URLs are HTTPS and return HTTP 200 at time of send And webhook payload conforms to schema version v1.0; email and Slack messages render the same fields And no PII fields are present in any payload
Multi-channel delivery to in-app, email, Slack, and webhook
Given an alert is created for a subscribed user/team When delivery is attempted Then in-app notification appears within 10 seconds of alert creation And webhook is POSTed within 10 seconds with HMAC-SHA256 signature and idempotency key; retries on 5xx/timeout at 1m, 5m, 15m; stops on first 2xx; marks failed after final retry And Slack message posts within 60 seconds; on API error, retry up to 2 times; include thumbnails or fallback text if images are blocked And email is sent within 2 minutes; includes manage-subscriptions link; passes SPF/DKIM alignment And per channel, delivery outcome and timestamp are logged as sent/delivered/failed
Alert grouping, deduplication, rate limiting, and reminders
Given multiple drift events of the same drift_type occur from the same supplier When they occur within a 5-minute grouping window Then a single alert is emitted with grouped_count equal to the number of unique events And duplicate events with the same asset_id + drift_type + content_hash are ignored And no more than 1 alert per supplier per severity per 15 minutes is sent per channel (rate limiting), excluding reminders And if an alert remains open and unresolved, send a reminder after 4 hours, up to 2 reminders in 24 hours; do not send reminders if the alert is snoozed or resolved
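The duplicate rule above (same asset_id + drift_type + content_hash is ignored) can be sketched as a set-backed filter. The `seen` parameter lets the caller carry state across grouping windows:

```python
def filter_duplicates(events, seen=None):
    """Drop events whose (asset_id, drift_type, content_hash) was already seen."""
    seen = set() if seen is None else seen
    unique = []
    for event in events:
        key = (event["asset_id"], event["drift_type"], event["content_hash"])
        if key not in seen:
            seen.add(key)
            unique.append(event)
    return unique
```

`grouped_count` for the emitted alert is then just the length of the filtered list for that supplier and drift_type.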
User alert actions and lifecycle: acknowledge, snooze, resolve
Given a user with access views an open alert When they click Acknowledge Then the alert state changes to Acknowledged and future matching events continue to increment grouped_count When they click Snooze and choose 15m/1h/24h Then notifications for that alert are suppressed across all channels for the selected duration and the alert displays Snoozed with remaining time When they click Resolve Then the alert state changes to Resolved, all pending reminders are canceled, and new qualifying events after resolution open a new alert And all actions write an audit record with user_id, action, timestamp, previous_state, new_state, optional comment
Per-team and per-supplier subscription management
Given a team manages alert subscriptions for suppliers When they set channel preferences and minimum severities per supplier Then the settings are validated, saved, versioned, and take effect within 60 seconds And at least one account owner remains subscribed to Critical severity on at least one external channel (email, Slack, or webhook); attempts to disable this are blocked with an error And users can send a test alert per channel to confirm delivery; results show per-channel status within 2 minutes And Slack destinations can only be selected from verified connections; webhook endpoints must pass URL validation and signature verification test
Suggest-and-Branch Workflow
"As a studio lead, I want one-click suggestions and the option to branch a new supplier version when drift appears so that routing stays accurate as vendor setups change."
Description

Offer a guided flow that, upon drift, computes suggested updates to the fingerprint or proposes creating a new branch (e.g., Supplier v2). Provide side-by-side previews showing how current vs. proposed settings affect retouching, background removal, and style-preset application. Enable one-click actions: update fingerprint, create new version with routing updates, or ignore/accept as new normal with time-bounded trials. Enforce permissions, track decisions, and support rollback/merge between branches to keep routing precise as vendors evolve.

Acceptance Criteria
Drift Alert Triggers Suggestion Generation
Given a drift alert exists for Supplier X with ≥20 affected images in the last 24 hours When a user with role "Editor" opens the Suggest-and-Branch panel from the alert Then the system computes proposed parameter deltas (e.g., lighting profile, background color, watermark detection, style-preset) And returns per-parameter confidence scores (0.0–1.0) and an overall drift magnitude And lists impacted SKU count, sample image set, and projected routing changes And completes suggestion computation within 10 seconds for up to 500 recent images And persists a draft suggestion record with a unique suggestionId for auditability
Side-by-Side Previews Reflect Current vs Proposed Settings
Given a suggestion record exists for Supplier X When the user opens Side-by-Side Preview Then the left pane renders output using the current fingerprint, and the right pane renders using the proposed deltas And the user can toggle sub-pipelines (retouch, background removal, style-preset) to compare specific effects And an A/B slider allows pixel-level comparison with 1:1 zoom and pan sync And the preview auto-samples at least 30 images or 10% of the impacted set (whichever is greater), capped at 200 And 95% of sampled images render in ≤4s, with median render time ≤2s And each previewed image displays a delta summary (e.g., background color code, style preset name)
One-Click Update Fingerprint with Audit and Rollback
Given the user has permission "Fingerprint:Update" for Supplier X And a valid suggestionId is selected When the user clicks "Update Fingerprint" Then the system applies the parameter deltas as a new fingerprint revision (not a branch) effective for new uploads after the commit timestamp And previously processed images remain unchanged And an audit log entry captures who, when, before/after values, related alertId, and rationale And the operation reports success within 5 seconds or returns a descriptive error without side effects on failure And a one-click "Rollback to prior revision" action is available for 30 days and restores routing within 1 minute
Create New Version Branch With Routing Updates
Given the user has permission "Fingerprint:Branch" for Supplier X And a valid suggestionId is selected When the user clicks "Create Version (Supplier v2)" Then the system creates a new branch inheriting the base fingerprint and applying proposed deltas And assigns a unique versionId and human-readable name (e.g., "Supplier v2") And updates routing to send uploads matching configured vendor criteria (e.g., vendorId, SKU prefix, source folder) to the new branch And leaves other vendors and branches unaffected And confirms creation with effective timestamp and a link to manage routing rules And completes within 8 seconds or fails atomically with no partial routing changes
Time-Bounded Trial as New Normal
Given a suggestion is available for Supplier X When the user selects "Start Trial" and configures a window between 7 and 30 days and a sampling percentage between 10% and 100% Then the system routes the configured traffic share to the proposed settings while the remainder continues on current settings And the trial dashboard tracks metrics: background removal error rate, style-preset mismatch rate, manual touch-up rate, and average processing time And the system evaluates the trial at end-of-window with default pass criteria (error rate ≤2%, mismatch rate ≤3%, touch-up rate decrease ≥10% vs control) And on pass, the user is prompted to promote to Update or Branch in one click; on fail, traffic auto-reverts to current settings And the user may cancel the trial at any time, which immediately reverts routing
Permission Enforcement and Approval Workflow
Given platform roles Viewer, Editor, Approver, and Admin are configured When a user without the required permission attempts Update/Branch/Trial actions Then the action controls are disabled with explanatory tooltips and server-side enforcement returns 403 without side effects And when approvals are required, an Editor’s action creates a pending request and no changes take effect until an Approver approves And all approvals/rejections include comments and are recorded in an immutable audit trail with timestamps and userIds And notification events are emitted to the Alerts inbox for each state change (requested, approved, rejected, executed)
Rollback and Merge Between Branches
Given Supplier X has an active branch and a stable base fingerprint When the user selects "Rollback" on a branch or revision Then routing reverts to the previously active version within 1 minute and no additional images are processed by the rolled-back version thereafter And processed outputs prior to rollback remain accessible and tagged with their producing version When the user selects "Merge" from a branch to base Then the user can choose specific parameters to merge (e.g., background color, watermark model, style-preset mapping) And conflicts are resolved via explicit confirmation (last-writer-wins) after a preview diff And the merge creates a new base revision with full audit logging and no downtime for routing
Drift Dashboard & Audit Log
"As a product manager, I want a centralized view and audit trail of drift events and actions so that we can measure effectiveness, prove compliance, and prioritize improvements."
Description

Provide a dashboard showing drift trends by supplier, severity distribution, mean time to detect/resolve, and the impact on processing outcomes and conversion KPIs. Include searchable audit logs of fingerprint changes, threshold updates, alerts, acknowledgements, and branch operations with who/when/why. Support CSV export and API access for BI tools. Use this telemetry to highlight risky suppliers and recommend proactive reviews before major catalog pushes.

Acceptance Criteria
Drift Trends by Supplier Dashboard View
Given I am on the Drift Dashboard with drift data available When I select a date range (up to 180 days) and one or more suppliers Then I see a time-series chart of drift rate per supplier grouped by Day or Week And the drift rate equals (count of images flagged for drift ÷ count of processed images) for each interval And hovering a point shows exact percentage, numerator, and denominator And applying supplier and date filters updates the chart and totals within 2 seconds (p95) And metrics refresh to include new ingestions within 5 minutes And values match backend reference calculations within ±0.1 percentage points And filters persist across navigation within the dashboard session
Severity Distribution and Impact Metrics Display
Given the Severity panel is visible for a selected date range and supplier set When I view the severity distribution Then I see counts and percentages for Low, Medium, High, Critical severities using the configured thresholds (defaults: Low 0.1–0.3, Medium >0.3–0.6, High >0.6–0.8, Critical >0.8) And a legend explains the thresholds with a link to settings And I see Processing Outcomes deltas vs 30-day pre-drift baseline: Auto-accept rate, Manual review rate, Reroute rate with absolute and percentage change And I see Conversion KPI deltas (CTR, Add-to-Cart, Conversion Rate) vs baseline when KPI integration is enabled; otherwise a No Data message is shown And each metric shows the comparison window used and confidence or sample size And calculations match backend within tolerance (±0.1 pp for rates, ±1% for counts) And panel loads within 2 seconds (p95) for up to 50 suppliers selected
MTTD and MTTR Computation and Display
Given I have drift alerts and resolution actions recorded When I open the MTTD/MTTR cards for the selected date range and suppliers Then MTTD is computed as mean(alert_created_at − first_drifted_image_at) in hours across included alerts And MTTR is computed as mean(resolution_timestamp − alert_created_at) in hours where resolution_timestamp is the earliest of: fingerprint updated, threshold updated, branch created/activated, or alert explicitly marked Resolved And timestamps are stored and displayed in UTC ISO-8601; values rounded to 0.1 hours And clicking a card opens a drill-down table of contributing alerts with their individual TTD/TTR values And exclusions (open/unresolved alerts excluded from MTTR; suppressed test alerts excluded from both) are indicated And computed values match a backend validation job within ±0.1 hours
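As an illustrative aid, the MTTD/MTTR definitions above can be sketched in Python. The dict field names mirror the criterion but are not a committed schema, and open alerts are excluded from MTTR as specified:

```python
def mttd_hours(alerts):
    # MTTD = mean(alert_created_at - first_drifted_image_at), in hours
    deltas = [(a["alert_created_at"] - a["first_drifted_image_at"]).total_seconds() / 3600
              for a in alerts]
    return round(sum(deltas) / len(deltas), 1)  # rounded to 0.1 hours per the spec

def mttr_hours(alerts):
    # MTTR over resolved alerts only; open/unresolved alerts are excluded
    resolved = [a for a in alerts if a.get("resolution_timestamp")]
    deltas = [(a["resolution_timestamp"] - a["alert_created_at"]).total_seconds() / 3600
              for a in resolved]
    return round(sum(deltas) / len(deltas), 1)
```

The backend validation job referenced in the criterion would recompute the same means independently and compare within ±0.1 hours.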
Searchable Audit Log with Who/When/Why
Given I open the Audit Log When I search and filter by free text, supplier, actor, action type, date range, and related IDs (alert_id, image_id, branch_id) Then results return within 2 seconds (p95) for queries over 1M rows And each entry shows: timestamp (UTC ISO-8601), supplier, actor (name and ID), action type, reason/notes (why), before/after values, related IDs, and request origin (UI/API) And results are sortable by timestamp desc/asc and paginated (50 per page by default) with accurate counts And exact and prefix matching are supported for IDs; free text matches reason and actor fields And actions (acknowledge alert, dismiss, threshold change, fingerprint update, branch create/activate, export, recommendation ack/snooze) create audit entries within 10 seconds of completion And clicking a related ID opens the corresponding detail view in a new tab
CSV Export and API Access for BI Tools
Given I have applied filters to a dashboard view or the Audit Log When I click Export CSV Then a CSV is produced with a header row and UTF-8 encoding, comma delimiter, quoted strings, matching the filtered columns and rows And exports up to 100,000 rows synchronously (<10 seconds) and provides an async download link with email/webhook for larger sets And exported numeric values and timestamps match on-screen values and backend data within defined tolerances And the BI API provides authenticated OAuth2 endpoints for metrics and audit logs with filtering, sorting, and cursor pagination And API responses include schema-stable JSON fields equivalent to CSV headers, with p95 latency ≤2 seconds for pages of 10,000 rows and rate limit of 120 requests/min per token And data returned by API matches the UI for identical filters within ±0.1 percentage points for rates and exact match for counts
Risky Supplier Highlighting and Proactive Review Recommendations
Given the Risk panel is visible for a selected horizon (last 14/30 days) When risk scores are computed Then each supplier receives a 0–100 risk score = 0.4*(recent drift rate) + 0.3*(weighted severity index) + 0.2*(positive slope of drift rate over last 14 days, normalized) + 0.1*(MTTR penalty if MTTR>24h) And suppliers with score ≥70 are flagged as High Risk and appear in a sorted list (top 10 by default) And each flagged supplier shows primary drivers, trend sparkline, and recommended actions (review fingerprint, adjust thresholds, branch a new version) with one-click actions And if a major catalog push is scheduled in the next 7 days or a predicted volume spike >3× baseline is detected and risk score ≥60, a proactive Review Before Push recommendation is displayed And clicking Acknowledge or Snooze records an audit log entry and removes the item from the list for the selected duration And creating a branch via the recommendation updates routing rules and marks the associated alert as Resolved within 60 seconds
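A minimal sketch of the risk-score blend above. It assumes the drift rate, severity index, and normalized slope are each pre-scaled to 0–100, and models the MTTR penalty as a full-scale term that activates only when MTTR exceeds 24 hours; both are interpretation choices, not stated in the spec:

```python
def risk_score(drift_rate, severity_index, drift_slope_norm, mttr_hours):
    # Inputs other than mttr_hours are assumed pre-normalized to 0-100
    mttr_penalty = 100.0 if mttr_hours > 24 else 0.0
    score = (0.4 * drift_rate
             + 0.3 * severity_index
             + 0.2 * drift_slope_norm
             + 0.1 * mttr_penalty)
    return min(100.0, max(0.0, score))  # clamp to the 0-100 scale

def is_high_risk(score, threshold=70.0):
    # Suppliers at or above 70 are flagged High Risk per the criterion
    return score >= threshold
```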

Correction Memory

Every manual reassignment becomes a learning signal. The router adapts its weights and rules from your fixes, reducing repeat errors and showing measurable gains in precision over time—so teams spend less time correcting and more time launching.

Requirements

Correction Event Capture
"As a photo editor, I want my corrections to be captured reliably so that the system can learn and stop repeating the same routing mistakes."
Description

Log every manual reassignment and edit as a structured learning signal, including the original router decision, the user’s correction (e.g., preset change, background selection, mask fix), image and catalog metadata, workspace/brand, confidence score, and timing. Ensure low-latency, lossless ingestion with retry semantics, idempotency, and linkage between before/after outputs for auditability. Store events in a versioned feature store suitable for training and evaluation, with schema evolution and PII-safe handling. Surface capture status in developer tools and keep runtime overhead under 20 ms per action.

Acceptance Criteria
Idempotent Low-Latency Capture on Manual Correction
Given a user submits a manual correction (preset change, background selection, or mask fix) to a routed image And the router's original decision and confidence are known When the action is committed Then the capture adds no more than 20 ms p95 runtime overhead to the action And an event is created with a deterministic idempotency key derived from {workspace_id,image_id,action_id,action_timestamp} And duplicate submissions with the same key do not create additional stored events And the API returns 2xx with capture status "persisted" or "queued" and a trace_id
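A minimal sketch of the deterministic idempotency key described above, hashing the four identifying fields so that duplicate submissions collapse to one stored event (the delimiter and hash choice are assumptions):

```python
import hashlib

def idempotency_key(workspace_id, image_id, action_id, action_timestamp):
    # Same four fields -> same key, so duplicate submissions can be de-duplicated
    raw = "|".join([workspace_id, image_id, action_id, action_timestamp])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

The ingestion store can then treat the key as a unique constraint: an insert that conflicts on the key is acknowledged without creating a second event.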
Complete Signal Payload Stored with Validation
Given a manual correction is captured When the event is persisted Then the payload includes: original_router_decision_id, router_version, confidence_score, user_correction_type, user_correction_value, before_asset_id, after_asset_id, image_id, catalog_id (if any), batch_id (if any), workspace_id, brand_id, actor_id, action_timestamp, ingest_timestamp, client_version And required fields are validated; missing required fields cause the event to be queued for retry and recorded in an error stream with error_code and field_name And optional fields may be null without blocking persistence And persisted events are queryable by trace_id within 1 minute
Before/After Linkage for Audit Retrieval
Given an event representing a manual correction has been persisted When an auditor retrieves the event by trace_id or image_id Then the response includes pointers to both before and after outputs with content_hashes and storage_urls And the pair is verifiably linked to the same workspace_id and action_id And the audit view shows the original router decision, user correction, and timestamps in a single record
Reliable Retry Semantics Under Transient Failures
Given the ingestion service experiences a transient failure (timeouts or 5xx) When a client attempts to capture an event Then the client retries with exponential backoff for up to 5 attempts over 2 minutes And no more than one stored event results due to idempotency And the final status is "persisted" within 5 minutes under recovery And if retries exhaust, the event is placed on a dead-letter queue with cause and is visible in developer tools
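A minimal client-side sketch of the backoff policy above. The `TransientError` type, delay schedule, and jitter are illustrative; the defaults give waits of roughly 7.5s, 15s, 30s, and 60s, keeping five attempts within the two-minute budget:

```python
import random
import time

class TransientError(Exception):
    """Raised by the transport on timeouts or 5xx responses (illustrative)."""

def capture_with_retry(send, event, max_attempts=5, base_delay=7.5, dead_letter=None):
    for attempt in range(1, max_attempts + 1):
        try:
            return send(event)
        except TransientError as exc:
            if attempt == max_attempts:
                if dead_letter:
                    dead_letter(event, cause=str(exc))  # surfaced in developer tools
                raise
            # exponential backoff with jitter proportional to the base delay
            time.sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay))
```

Paired with the idempotency key from the previous requirement, retries can never produce more than one stored event.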
Versioned Feature Store with Schema Evolution and PII Safety
Given events are written to the feature store When a new schema version is introduced adding fields Then both old and new versions remain readable and joinable for training/evaluation via a stable view And readers can request a snapshot by schema_version or by effective_date And PII fields (e.g., actor_email) are not persisted; allowed sensitive identifiers are tokenized or encrypted at rest with audited access controls And an automated check rejects payloads containing disallowed PII keys and routes them to a quarantine stream
Developer Tools Surface Capture Status and Telemetry
Given a developer opens the PixelLift developer console or uses the status API with a trace_id When viewing event capture status Then they can see state transitions ("queued","retrying","persisted","failed"), timestamps, attempt_count, and last_error if any And they can filter by workspace_id, time range, status, and action type And the status endpoint returns within 300 ms p95 for a single trace_id lookup
Batch Upload Resilience and Throughput
Given a user batch-uploads a catalog of 5,000 images and performs manual corrections on 10% of them within 30 minutes When events are captured during this period Then 100% of correction events are recorded without loss, including during bursts of 200 events/sec And median end-to-end time from action_commit to persisted <= 1 second, with p95 <= 3 seconds And system backpressure does not increase action runtime overhead beyond the 20 ms p95 budget
Incremental Router Learning
"As a boutique owner, I want the router to adapt from my team’s fixes quickly so that new batches are routed correctly without repetitive manual work."
Description

Implement a nearline training pipeline that updates routing model weights on a frequent cadence (e.g., hourly/daily) using captured corrections. Support global and per-workspace adapters to balance shared learning with brand-specific preferences. Handle class imbalance and cold-start via weighted sampling and prior distributions. Register each trained model in a versioned model registry with metadata, reproducible training configs, and automatic rollback. Maintain training SLAs, resource autoscaling, and guard against catastrophic forgetting.

Acceptance Criteria
Hourly Nearline Retraining from Captured Corrections
- Given corrections have been captured since the last successful training run, when the hourly scheduler triggers at HH:00, then a new training job starts within 2 minutes.
- Given the job starts, when training executes, then it completes within 45 minutes for ≤1,000,000 corrections and produces a candidate model.
- Given fewer than 100 new corrections are available, when the scheduler triggers, then the run is skipped and the reason is logged as "insufficient-delta" with counts.
- Given multiple corrections reference the same asset and label, when ingesting, then duplicates are de-duplicated and only the latest correction is used.
- Given a correction is included in a completed run, when the next run executes, then that correction is not re-counted (exactly-once ingestion).
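The de-duplication rule (latest correction per asset and label wins) can be sketched as follows; the field names are illustrative, not the production schema:

```python
def dedupe_corrections(corrections):
    # Keep only the most recent correction for each (asset_id, label) pair
    latest = {}
    for c in corrections:
        key = (c["asset_id"], c["label"])
        prev = latest.get(key)
        if prev is None or c["timestamp"] > prev["timestamp"]:
            latest[key] = c
    return list(latest.values())
```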
Workspace Adapters Override Global Router Behavior
- Given a workspace has ≥500 corrections in the last 30 days across ≥3 classes, when training runs, then a per-workspace adapter is trained and attached to the global base.
- Given a workspace has <500 corrections or <3 classes, when training runs, then the workspace continues to use the global model with priors (no adapter), and this decision is logged.
- Given an adapter is trained, when evaluated on that workspace’s stratified holdout, then macro-F1 ≥ (global baseline macro-F1 − 0.5%) and precision ≥ (baseline precision + 1.0%), OR precision is non-inferior within 0.25% with recall ≥ (baseline recall + 1.0%).
- Given an admin toggles "Disable adapter" for a workspace, when routing requests arrive, then the system serves the global model within 1 minute of the toggle and records the change in the audit log.
Class Imbalance Handling and Cold-Start Priors
- Given class frequencies are imbalanced, when sampling batches, then the majority:minority sampling probability ratio is capped at 10:1 and every class with ≥50 examples appears at least once per epoch.
- Given classes with <50 examples exist, when training, then class-weighting is applied proportional to inverse sqrt frequency and the weighting parameters are recorded in the run config.
- Given a new workspace with zero corrections, when routing requests occur, then the model uses system-level priors and achieves precision ≥70% on the shared validation set for the top-3 frequent classes.
- Given a workspace with 50–499 corrections, when training, then interpolation between global priors and the workspace adapter is applied; evaluation confirms macro-F1 within 1.0% of the global-only baseline for that workspace.
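The inverse-sqrt weighting and the 10:1 sampling cap can be sketched as below. The floor-then-normalize approach is one way to implement the cap (raising rare classes to one tenth of the majority's effective count), not the only one:

```python
import math

def class_weights(counts):
    # Loss weight per class, proportional to inverse sqrt of frequency
    return {cls: 1.0 / math.sqrt(n) for cls, n in counts.items() if n > 0}

def capped_sampling_probs(counts, max_ratio=10.0):
    # Sampling probability proportional to class size, with the
    # majority:minority probability ratio capped at max_ratio
    floor = max(counts.values()) / max_ratio
    eff = {cls: max(n, floor) for cls, n in counts.items()}
    total = sum(eff.values())
    return {cls: v / total for cls, v in eff.items()}
```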
Versioned Model Registry with Reproducible Artifacts and Rollback
- Given any training job completes, when the candidate is registered, then the registry entry includes: model_version, code_commit_sha, training_config_hash, data_snapshot_id, corrections_window_start/end, per-class metrics, and artifact URIs.
- Given a registry entry exists, when a reproduction run is launched with the recorded artifacts and seed, then resulting metrics match within ±0.2% absolute and feature manifest checksums match exactly.
- Given a newly promoted model shows a ≥1.5% absolute drop in online precision over 30 minutes or 10,000 decisions (whichever is later), when the rollback policy evaluates, then automatic rollback to the previous stable version is executed within 5 minutes and recorded as a rollback event.
- Given a model is rolled back, when subsequent training runs occur, then lineage links to the rolled-back version and annotations include the rollback reason code.
Training SLA Compliance and Autoscaling Under Load
- Given steady-state load of up to 1,000,000 new corrections per hour across ≤5,000 workspaces, when the hourly training cycle runs, then p95 training completion time ≤45 minutes and p99 queue wait time ≤5 minutes.
- Given a 3× spike in corrections within an hour, when autoscaling triggers, then additional workers provision within 3 minutes and backlog drains to steady-state within the next cycle.
- Given a worker failure occurs mid-epoch, when retry logic engages, then the job resumes from the last checkpoint without data loss and completes within SLA.
- Given the scheduler falls behind by more than one cycle, when backpressure control engages, then new runs are throttled and an alert is sent to on-call within 2 minutes.
Catastrophic Forgetting Safeguards on Historical Benchmarks
- Given a historical benchmark dataset representing the last 90 days, when evaluating a candidate model, then global macro-F1 drop vs. the previous production model is ≤1.0% absolute and per-class drop ≤2.0%.
- Given any class violates the forgetting threshold, when promotion is attempted, then the candidate is rejected and the replay buffer of exemplars is increased by 20% for the next training run.
- Given adapters are trained for multiple workspaces, when evaluated, then ≥98% of workspaces meet the forgetting thresholds; non-compliant adapters are not promoted for those workspaces.
Precision Improvement Tracking and Promotion Gates
- Given a candidate model is trained, when evaluated offline on the latest holdout from the correction window, then precision@1 improves by ≥2.0% absolute or is within 0.5% while recall improves by ≥2.0%.
- Given offline acceptance is met, when shadow traffic of ≥50,000 routing events is collected, then online precision is non-inferior (95% CI lower bound of difference ≥ −0.5%) and latency p95 does not regress by >5 ms.
- Given online gates pass, when promotion occurs, then the candidate becomes production for eligible scopes (global and adapters) and a 7-day trailing precision chart shows a non-decreasing trend (slope ≥ 0).
- Given gates fail, when promotion is attempted, then deployment is blocked and a diagnostic report with drift, class distribution shift, and error cohorts is attached to the registry record.
Confidence-Based Suggestions & Fallbacks
"As a seller managing large uploads, I want the system to suggest the most likely presets when it’s unsure so that I can correct items in one click instead of hunting through options."
Description

Compute calibrated confidence for each routing decision and expose top-N alternative presets when confidence is low. In low-confidence cases, default to a safe brand preset or request a one-click confirmation from the user. Persist confidence, alternatives shown, and the selected option as learning signals. Provide API/UI hooks to render suggestions inline in batch review with no more than one extra click for acceptance. Ensure latency budgets are met for bulk operations.

Acceptance Criteria
Batch Review: Low-Confidence Suggestion Shows Top-N Inline
Given a routing decision with confidence < threshold_low (default 0.70) And default_topN = 3 (configurable) When the item is opened in batch review UI Then the primary suggested preset and top-N alternatives are displayed inline without additional user action And alternatives are sorted by descending confidence and show preset name, preview thumbnail, and confidence score rounded to two decimals And the user can accept any displayed option with at most one additional click
Batch Review: One-Click Confirmation Flow
Given a routing decision with confidence < threshold_low And require_confirmation = true When the item appears in batch review Then a single "Accept" control is available for the primary suggestion And activating the control applies the preset, records confirmation=true, and advances focus to the next item And the Enter key performs the same action when the item is focused
Auto Fallback to Safe Brand Preset
Given a routing decision with confidence < threshold_low And require_confirmation = false And a safe_brand_preset is configured When routing is executed Then the safe_brand_preset is applied automatically And event.fallback_reason = "low_confidence" is recorded And if no safe_brand_preset is configured, apply default_preset = "Neutral" and record fallback_reason = "no_safe_preset"
Calibrated Confidence Quality
Given a held-out validation set of ≥ 5,000 routed items with ground truth over the last 30 days When Expected Calibration Error (ECE, 10 bins) is computed Then ECE ≤ 0.05 And for any two adjacent confidence bins, the higher bin has ≥ accuracy of the lower bin And a calibration metrics artifact is produced and accessible via GET /v1/metrics/router/calibration with timestamp ≤ 24h old
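The ECE check above uses the standard binned formulation: partition predictions into 10 equal-width confidence bins and take the sample-weighted mean of |accuracy − average confidence| per bin. A minimal sketch:

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    # Equal-width bins over [0, 1]; confidence 1.0 falls into the top bin
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(1 for _, ok in b if ok) / len(b)
        ece += (len(b) / n) * abs(accuracy - avg_conf)
    return ece
```

The adjacent-bin monotonicity requirement is a separate check over the per-bin accuracies, not part of the ECE value itself.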
Telemetry Persistence for Suggestions and Selections
Given any routing decision (suggested, confirmed, or fallback) When the decision is finalized Then a record is persisted with fields: item_id, batch_id, timestamp, model_version, confidence (0–1), threshold_at_decision, primary_preset_id, alternatives[id,confidence] (topN), require_confirmation, user_action {accepted|changed|auto-applied}, selected_preset_id And the record is retrievable via GET /v1/routing/events?batch_id={id} And write success rate over rolling 24h ≥ 99.9% with at-least-once semantics
Bulk Latency Budget for Suggestion Generation
Given a batch of 500 images When suggestions are requested via POST /v1/routing/suggestions Then p95 time to first item suggestions ≤ 2s and p95 time to full batch suggestions ≤ 10s And router service p95 per-item compute time ≤ 100ms And batch review UI renders placeholders immediately and replaces them with suggestions without blocking scrolling
Suggestions API Contract and Inline Rendering Hook
Given a request to /v1/routing/suggestions for a batch When the response is returned Then each item includes: item_id, primary_preset_id, confidence, alternatives (array of up to N entries with preset_id and confidence), require_confirmation (bool), safe_preset_id (if applicable), schema_version And the UI renders the suggestion chip and alternatives inline from this payload without page navigation And selecting any alternative applies it and persists the selection with at most one additional click
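The per-item response shape can be sketched as a small builder. Field names follow the criterion; the `schema_version` value, top-N default, and two-decimal rounding are assumptions carried over from the batch-review criteria above:

```python
def suggestion_item(item_id, primary_preset_id, confidence, alternatives,
                    require_confirmation, safe_preset_id=None, top_n=3):
    # Alternatives are sorted by descending confidence and truncated to top_n,
    # mirroring the inline-rendering rule in the batch review UI
    ranked = sorted(alternatives, key=lambda a: a["confidence"], reverse=True)[:top_n]
    item = {
        "item_id": item_id,
        "primary_preset_id": primary_preset_id,
        "confidence": round(confidence, 2),
        "alternatives": ranked,
        "require_confirmation": require_confirmation,
        "schema_version": "1.0",  # illustrative value
    }
    if safe_preset_id is not None:
        item["safe_preset_id"] = safe_preset_id
    return item
```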
Workspace-Level Personalization
"As a brand manager, I want routing to reflect my brand’s style preferences so that my product images stay consistent with minimal supervision."
Description

Maintain per-workspace preference profiles that bias routing toward each brand’s historical choices (e.g., background color, crop style, shadow treatment). Activate personalization after a minimum signal threshold and fall back hierarchically to category- or global-level models when data is sparse. Support explicit admin overrides and rule pinning for critical SKUs or collections. Ensure isolation across tenants while enabling safe meta-learning at the global layer.

Acceptance Criteria
Activate Personalization After Threshold
- Given a workspace has fewer than 25 valid correction signals in the last 14 days from at least 2 distinct users across at least 3 preference dimensions (e.g., background, crop, shadow), when routing decisions are executed, then workspace-level personalization remains OFF and hierarchical fallback is used.
- Given a workspace reaches ≥25 valid correction signals in the last 14 days from ≥2 distinct users across ≥3 preference dimensions, when the next routing job runs, then workspace-level personalization becomes ACTIVE within 15 minutes and is shown as the decision source on subsequent routes.
- Given personalization becomes ACTIVE, when the activation event occurs, then a timestamped audit record is created capturing signal counts, threshold values, activator (system), and workspace ID.
- Given personalization is ACTIVE, when new correction signals are received, then the workspace profile updates incrementally within 30 minutes without causing job failures or increasing p95 routing latency beyond 200 ms.
Hierarchical Fallback Selection
Rule: Decision source must follow specificity order when applicable artifacts exist—Workspace profile > Category model > Global model—and select the highest-confidence option; ties within 1% confidence margin prefer higher specificity; ties at the same specificity prefer the most recently updated model.
- Given a workspace is below the activation threshold, and a category model exists with ≥500 aggregated signals from ≥10 distinct workspaces in the last 30 days, when routing an image tagged with that category, then the Category model is used and the source is recorded in the audit log with support counts and confidence.
- Given a workspace is below the activation threshold, and the category model support is insufficient (<500 signals or <10 workspaces), when routing occurs, then the Global model is used and recorded accordingly.
- Given a batch with mixed categories, when routing runs, then each image independently selects the appropriate level per the rule without batch-level leakage, and the batch completes with ≥99.9% success rate.
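A sketch of the specificity rule above. It treats the 1% margin as an absolute 0.01 on confidence (an assumption; the spec does not say absolute vs. relative) and assumes `updated_at` is a sortable timestamp; only applicable levels are passed in:

```python
SPECIFICITY = {"workspace": 3, "category": 2, "global": 1}

def select_model(candidates, margin=0.01):
    # Highest confidence wins; within the margin, higher specificity wins;
    # at equal specificity, the most recently updated model wins
    best = max(c["confidence"] for c in candidates)
    in_margin = [c for c in candidates if best - c["confidence"] <= margin]
    return max(in_margin, key=lambda c: (SPECIFICITY[c["level"]], c["updated_at"]))
```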
Admin Override and Rule Pinning Precedence
Rule: Precedence order is Pinned rule > Admin override > Workspace profile > Category model > Global model.
- Given a workspace admin creates a pinned rule targeting SKU(s) or collection(s), when routing matching images, then the pinned rule is applied 100% of the time regardless of learned preferences and recorded with rule ID and scope.
- Given a workspace admin creates an override (non-pinned) that biases a preference (e.g., force white background), when routing matching images, then the override is applied ahead of the workspace profile but can be superseded by a pinned rule.
- Given a new pin or override is saved, when up to 5 minutes elapse, then the rule becomes effective across API and batch jobs.
- Given conflicting pins or overrides exist at different scopes, when routing occurs, then the more specific scope (SKU > collection > pattern) wins; if the scope is the same, the most recent update time wins; the conflict resolution is logged.
- Given a pin or override is removed or expires, when routing occurs after removal, then routing reverts to the next-precedence source within 5 minutes and the change is audit-logged.
Tenant Isolation with Safe Global Meta-Learning
- Given two tenants (A and B) with distinct workspaces, when personalization is active for A and inactive for B, then B’s routing must not read A’s workspace-profile parameters, pins, or overrides; the only allowed influence is via anonymized global aggregates.
Rule: Global meta-learning updates are applied only when k-anonymity conditions are met (k ≥ 50 contributing workspaces, max 20% contribution from any single workspace); no tenant identifiers are stored in model weights or explanations.
- Given access attempts from Tenant A to Tenant B’s workspace profile via API, when requests are made, then they are denied with 403 and logged; no cross-tenant data is returned.
- Given global updates are produced, when explanations are generated, then they contain source level (Global), confidence, and feature importances without revealing tenant IDs or pinned rule details.
- Given synthetic tenants with conflicting style preferences are tested, when both route similar items, then outputs differ according to their own pins/overrides/workspace profiles and not due to direct leakage; test reports show zero cross-tenant contamination.
Correction Memory Improves Workspace Precision
- Given a workspace has a measurable baseline precision over the last 300 routed images prior to activation, when personalization has been active for at least 14 days and 300 new routed images, then precision on targeted preferences improves by ≥10 percentage points versus baseline or the repeat-correction rate decreases by ≥20%.
- Given the improvement threshold is not met, when the evaluation window closes, then an automatic retraining job is queued within 1 hour and an alert is sent to workspace admins.
- Given improvement metrics are computed, when an admin views Analytics, then baseline, current precision, confidence intervals, and sample sizes are displayed with timestamps.
- Given evaluation is in progress, when new manual corrections are made, then they are incorporated into the next daily model update without interrupting ongoing batch jobs.
Decision Auditability and Versioning
- Given any routing decision is made, when the result is stored, then the record includes: decision source level, confidence score, rule IDs hit (if any), profile/model version ID, support counts, and timestamp.
- Given an admin requests decision logs for the last 90 days (≤100k records), when the export API is called, then the file is produced within 30 seconds and contains all fields with a consistent schema.
- Given a workspace profile has evolved, when the admin views version history, then at least the last 10 versions are listed with diffs and authors; selecting a version enables rollback.
- Given a rollback is executed, when up to 10 minutes elapse, then the selected version becomes active for new routing decisions, and the rollback event is audit-logged with actor and reason.
- Given admin pins/overrides are created/updated/deleted, when auditing, then each change shows actor, time, scope, and reason code.
Evaluation & Gated Promotion
"As a QA lead, I want models promoted only when they demonstrably reduce corrections so that we avoid regressions and can quantify time saved."
Description

Define objective metrics—routing precision, correction rate, time-to-approve, and error recurrence—and evaluate each candidate model on holdout data and shadow traffic. Gate promotion with automated checks (e.g., precision +5% or correction rate −20% overall and within key segments). Support canary rollout by workspace, automatic rollback on regression, and version pinning for reproducibility. Expose a changelog and comparison reports for stakeholders.

Acceptance Criteria
Candidate Model Evaluation on Holdout and Shadow Traffic
Given a baseline router/model and a candidate version When the evaluator runs on a fixed, versioned holdout dataset (stratified by key segments) and on 5–10% shadow traffic for at least 72 hours Then the system computes and stores routing precision = correct_routes/total_routed, correction rate = manual_reassignments/total_routed, time-to-approve p50/p95 = time_from_route_to_approval, and error recurrence rate = repeat_corrections_within_30_days/total_corrected, each broken down overall and by key segments (workspace tier, product category, background type, image complexity) And results include 95% confidence intervals, sample sizes (n), and minimum sample thresholds of n>=5000 overall and n>=200 per key segment And raw predictions, assignments, and corrections are persisted with model_version, data_snapshot_id, and commit_hash for 30 days And evaluation outputs are immutable and retrievable via API with a unique run_id
Automated Gating Rules for Promotion
Given completed evaluation results comparing candidate to baseline When the promotion checker runs Then promotion is approved only if (overall routing precision improves by >=5% relative OR overall correction rate decreases by >=20% relative) AND the same condition holds within each key segment (top 10 workspaces by volume, product category, background type, image complexity band) And no segment shows a degradation >1% absolute in time-to-approve p95 And error recurrence rate does not increase in any segment And improvements are statistically significant (95% CI of delta excludes 0) with minimum sample thresholds satisfied Else the promotion is blocked and a machine-readable rationale (failing segments, metrics, and deltas) is attached to the run
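A minimal sketch of the promotion gate, under assumed data shapes; the statistical-significance check (95% CI of the delta excludes 0) and minimum-sample thresholds are omitted for brevity, and the 1% time-to-approve bound is interpreted as relative to baseline:

```python
from dataclasses import dataclass

@dataclass
class Metrics:
    precision: float          # correct_routes / total_routed
    correction_rate: float    # manual_reassignments / total_routed
    tta_p95: float            # time-to-approve p95, minutes
    recurrence: float         # repeat corrections within 30 days / total corrected

def segment_passes(base: Metrics, cand: Metrics) -> tuple[bool, list[str]]:
    """Apply the gating rules to one segment; return (pass, failure reasons)."""
    reasons = []
    prec_lift = ((cand.precision - base.precision) / base.precision
                 if base.precision else 0.0)                     # relative
    corr_drop = ((base.correction_rate - cand.correction_rate) / base.correction_rate
                 if base.correction_rate else 0.0)               # relative
    if not (prec_lift >= 0.05 or corr_drop >= 0.20):
        reasons.append(f"no improvement (precision lift {prec_lift:.1%}, "
                       f"correction drop {corr_drop:.1%})")
    if cand.tta_p95 - base.tta_p95 > 0.01 * base.tta_p95:
        reasons.append("time-to-approve p95 degraded > 1%")
    if cand.recurrence > base.recurrence:
        reasons.append("error recurrence increased")
    return (not reasons, reasons)

def gate(baseline: dict, candidate: dict) -> dict:
    """baseline/candidate map segment name -> Metrics; 'overall' must be present.
    Promotion requires every segment (including overall) to pass."""
    rationale = {}
    for seg in baseline:
        ok, why = segment_passes(baseline[seg], candidate[seg])
        if not ok:
            rationale[seg] = why
    return {"approved": not rationale, "failing_segments": rationale}
```

The `failing_segments` dict plays the role of the machine-readable rationale attached to a blocked run.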
Canary Rollout by Workspace
Given a promotion is approved When rollout begins Then direct 5% of eligible workspaces (minimum 10 workspaces) to the candidate model And auto-increase canary to 25%, 50%, and 100% after 6h, 12h, and 24h respectively if no health checks fail And eligibility excludes version-pinned workspaces and those below the minimum volume threshold of 200 routed items/day And real-time dashboards report precision, correction rate, time-to-approve p95, and error recurrence per canary cohort and segment And operators can manually pause or exclude specific workspaces via API/UI without affecting others And all transitions are audit-logged with actor, timestamp, and reason
Automatic Rollback on Regression
Given a canary or full rollout is active When any monitored metric breaches rollback thresholds over a 30-minute sliding window (overall precision delta < -1% absolute OR correction rate delta > +10% relative OR time-to-approve p95 delta > +5 minutes absolute in any key segment) Then the system reverts affected traffic to the last stable model within 5 minutes And caps additional exposure to <=500 images after breach detection And emits alerts to PagerDuty and Slack with incident_id, breached metrics, segments, and current exposure And records rollback details in the changelog and prevents auto-retry until a new evaluation run completes
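The breach conditions reduce to a pure predicate over windowed metrics; in this sketch the field names are assumptions, and real monitoring would evaluate it per key segment over the 30-minute sliding window:

```python
def breach(window: dict, stable: dict) -> list[str]:
    """window/stable: {'precision', 'correction_rate', 'tta_p95_min'} computed
    over the sliding window for the canary vs the last stable model.
    A non-empty return triggers the revert-and-alert path described above."""
    hits = []
    if window["precision"] - stable["precision"] < -0.01:                 # absolute
        hits.append("precision")
    if stable["correction_rate"] > 0 and \
       (window["correction_rate"] - stable["correction_rate"]) / stable["correction_rate"] > 0.10:
        hits.append("correction_rate")                                    # relative
    if window["tta_p95_min"] - stable["tta_p95_min"] > 5:                 # minutes, absolute
        hits.append("tta_p95")
    return hits
```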
Version Pinning and Reproducibility
Given a workspace or batch job is pinned to a specific router/model/version When processing current or reprocessed items for that entity Then the system uses the pinned version regardless of global promotions And request/response metadata include router_version, model_version, data_snapshot_id, and commit_hash And re-running the same job inputs with the same pinned version yields identical routing decisions, with metrics reproducible within +/-0.1% under nondeterminism controls And pin/unpin actions require role-based authorization and are audit-logged with actor, reason, and timestamp
Changelog and Stakeholder Comparison Reports
Given an evaluation completes or a rollout state changes When the reporting service runs Then a human-readable report and a JSON report are generated within 10 minutes containing model/version identifiers, date ranges, sample sizes, metric deltas overall and by segment, confidence intervals, gating outcome, and promotion/rollback decisions with rationale And the UI provides side-by-side comparison and diff views between any two runs And reports are immutable, time-stamped, signed, and retained for at least 1 year And a public permalink is available to stakeholders with view permissions
Precision & Drift Monitoring Dashboard
"As an operations analyst, I want visibility into precision and correction trends so that I can verify that Correction Memory is improving outcomes across catalogs."
Description

Provide dashboards that track routing precision, correction volume, and error hotspots over time by preset, category, and workspace. Include drift detection on input distributions and confidence calibration checks. Enable alerting when precision drops or correction rates spike beyond thresholds. Integrate with product analytics to attribute gains to Correction Memory and export reports for stakeholders.

Acceptance Criteria
Time-Series Precision by Preset
Given historical routing outcomes and manual corrections exist for the selected workspace, category, and presets When a user opens the Precision dashboard and selects a date range and granularity (day/week/month) Then the chart renders precision per preset as TP/(TP+FP) for each interval with numerator/denominator counts and 95% CI in tooltip And Then a data-quality banner appears if any interval has fewer than 50 decisions And When no matching data exists Then the chart displays a "No data for selection" state and emits no API errors And When the same query is repeated Then values match backend analytics within ±0.1 percentage points
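The spec asks for a 95% CI on each interval's precision but does not name the interval method; a Wilson score interval is one reasonable choice, sketched here with an illustrative function name:

```python
import math

def precision_ci(tp: int, fp: int, z: float = 1.96):
    """Precision TP/(TP+FP) with a Wilson 95% CI; returns (p, lo, hi),
    or None when there are no decisions in the interval."""
    n = tp + fp
    if n == 0:
        return None  # would surface as the "No data for selection" state
    p = tp / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return p, max(0.0, center - half), min(1.0, center + half)
```

Intervals with n below 50 would additionally raise the data-quality banner described above.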
Error Hotspots by Category and Workspace
Given corrections are logged with reason codes and locations When the user opens Error Hotspots and applies filters (date range, preset, workspace) Then a ranked table lists categories/workspaces by correction rate = corrections/decisions with columns for volume, rate, and top 3 reason codes And Then rows are sortable by rate and volume and limited to entities with ≥100 decisions in the period And When a row is clicked Then a drill-down opens showing affected presets and at least 5 example items with links to detail views And Then all counts aggregate to totals shown in the table header
Input Drift Detection and Alerting
Given baseline input feature distributions are established over the prior 30 days per preset and workspace When the latest 24-hour window is evaluated Then the system computes PSI for categorical features and two-sample KS for continuous features And Then drift status = "Alert" if any feature has PSI ≥ 0.25 or KS p-value < 0.01 with absolute mean shift ≥ 0.5 SD and at least 500 samples; otherwise status = "OK" And When status = "Alert" Then an alert is sent to configured channels (in-app and email) within 10 minutes including breached features, effect sizes, and links to comparison plots And When the condition clears for 24 hours Then a "Resolved" notification is sent and the incident is closed
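The drift thresholds above can be checked with a small sketch; the PSI helper assumes both inputs are already normalized proportion vectors over the same category set, the function and parameter names are illustrative, and scipy's `ks_2samp` supplies the two-sample KS test:

```python
import numpy as np
from scipy import stats

def psi(expected: np.ndarray, actual: np.ndarray, eps: float = 1e-6) -> float:
    """Population Stability Index between baseline and window category proportions."""
    e = np.clip(expected, eps, None)
    a = np.clip(actual, eps, None)
    return float(np.sum((a - e) * np.log(a / e)))

def drift_status(cat_baseline, cat_window, cont_baseline, cont_window, n: int) -> str:
    """'Alert' per the thresholds above: PSI >= 0.25, or KS p < 0.01 with an
    absolute mean shift >= 0.5 SD, given at least 500 samples."""
    if n < 500:
        return "OK"
    if psi(cat_baseline, cat_window) >= 0.25:
        return "Alert"
    ks = stats.ks_2samp(cont_baseline, cont_window)
    shift = abs(np.mean(cont_window) - np.mean(cont_baseline))
    if ks.pvalue < 0.01 and shift >= 0.5 * np.std(cont_baseline):
        return "Alert"
    return "OK"
```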
Confidence Calibration Report
Given predictions include confidence scores for routing decisions When the user opens the Calibration view for a preset and date range Then the system displays a 10-bin reliability diagram, Expected Calibration Error (ECE), and Brier score with numeric values And Then the per-bin table shows average confidence, empirical precision, sample counts, and 95% CI And When the "Recalibrate (preview)" toggle is enabled Then metrics are recomputed using temperature scaling in-view without altering production model weights and the new ECE is shown And Then API responses complete within 3 seconds for datasets up to 100k predictions
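The spec fixes 10 bins but not a binning convention; equal-width bins are a common choice, as in this illustrative sketch of the ECE metric shown in the Calibration view:

```python
import numpy as np

def ece(confidences: np.ndarray, correct: np.ndarray, bins: int = 10) -> float:
    """Expected Calibration Error over equal-width confidence bins:
    weighted mean of |avg confidence - empirical precision| per bin."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    n = len(confidences)
    total = 0.0
    for i in range(bins):
        lo, hi = edges[i], edges[i + 1]
        # first bin is closed on the left so a confidence of exactly 0.0 is kept
        mask = (confidences >= lo if i == 0 else confidences > lo) & (confidences <= hi)
        if mask.any():
            total += (mask.sum() / n) * abs(confidences[mask].mean() - correct[mask].mean())
    return float(total)
```

The "Recalibrate (preview)" toggle would recompute this after temperature-scaling the confidences, without touching production weights.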
Precision Drop and Correction Spike Threshold Alerts
Given an admin configures thresholds per preset/workspace (precision < T%, correction rate > R%, or > N corrections/hour) When a 15-minute rolling window breaches any configured threshold with ≥200 decisions Then an alert is created with severity mapped to deviation magnitude (minor/major/critical) And Then alerts are deduplicated via a 30-minute cool-down per preset/workspace/threshold And Then the alert payload includes current metric, threshold, 7-day baseline, filters, and a deep link to the dashboard And When the metric returns within threshold for two consecutive windows Then the alert is auto-resolved
Attribution to Correction Memory
Given Correction Memory is enabled and events are tagged for pre/post and A/B variants When the user opens the Attribution view and selects a cohort definition (pre/post or experiment variants) Then the dashboard displays precision lift, correction rate change, and time-to-correction reduction with 95% CI and sample sizes And Then lift is computed via difference-in-differences controlling for category and seasonality using the selected baseline And When no valid control cohort exists Then the UI displays "Insufficient control" and disables lift claims for that view And Then an Evidence panel lists the number of learned rules/weight updates and their estimated contribution to lift
Stakeholder Report Export
Given a user with export permission selects date range, dimensions (preset, category, workspace), and views (Precision, Hotspots, Drift, Calibration, Attribution) When Export is requested Then a branded PDF and CSV bundle are generated capturing all applied filters, UTC timestamps, and metric definitions And Then an email with a secure link (expiry 7 days) is sent within 5 minutes and the export appears in Recent Exports And When the dataset exceeds 1,000,000 rows Then CSV files are chunked per view with gzip compression; the PDF remains a single file And Then exported metrics match on-screen values within ±0.1 percentage points
Data Privacy & Consent Controls
"As an account admin, I want control over how my workspace’s data is used in training so that we stay compliant and protect brand confidentiality."
Description

Offer tenant-level controls to opt in/out of contributing corrections to global training while always applying learning locally. Enforce strict data isolation by workspace, uphold regional data residency, and propagate data deletion requests to training datasets and derived models. Anonymize or pseudonymize user identifiers in telemetry and log access. Document governance in admin settings and emit audit logs for compliance.

Acceptance Criteria
Tenant Consent Opt-In/Out for Global Training
Given I am a Tenant Admin on Admin > Privacy & Consent When I view the Global Training Contribution setting Then I see a toggle reflecting the current tenant consent state Given the toggle is OFF When I turn it ON and confirm Then the consent state is persisted, returned by the Consent API within 2 seconds, and a consent.changed audit event (actor, old_value, new_value, timestamp, tenant_id) is emitted Given the consent state changed When a new correction event occurs within 5 minutes Then the ingestion pipeline includes/excludes the event according to the new consent state
Local Learning Always Applied Regardless of Global Consent
Given the tenant has global contribution set to OFF When a user performs a manual reassignment in Workspace W Then the local correction memory updates and is applied to subsequent jobs in Workspace W within 10 minutes And no records for this event_id are present in global training topics/storage And an audit event learning.local_applied with event_id and workspace_id is emitted
Workspace-Level Data Isolation
Given a user authenticated in Workspace A When they request correction memory data or assets from Workspace B via UI or API Then the request is denied with 403 and returns no data And an access.denied audit event is recorded with actor, target_workspace_id, timestamp Given a training job for Workspace A When it executes Then it reads only data labeled with workspace_id = A and fails any cross-workspace join by policy with a logged denial
Regional Data Residency Enforcement
Given the tenant residency is set to EU When correction events, images, and logs are stored or processed Then all primary storage, backups, and processing run in EU-designated resources And cross-region data transfers are blocked by policy controls And Admin > Residency report shows Region = EU with passing checks Given residency is changed to US When new data is ingested Then it is stored and processed only in US-designated resources while existing EU data remains in-region until migrated via an explicit admin action
Data Deletion Propagation to Training Sets and Derived Models
Given an admin submits a deletion request for specific images/corrections via Admin or API When the request is accepted Then a job_id is returned and visible in Admin > Privacy Jobs with status tracking Given the job is running When 24 hours elapse Then targeted items are purged from primary storage, telemetry, and training datasets, and references are tombstoned; an audit event deletion.propagated is emitted Given models previously trained on the deleted data When the next retraining cycle completes (within 30 days) Then models are retrained/unlearned to exclude deleted data and model lineage shows the exclusion with evidence attached to the audit log
Telemetry Pseudonymization & Role-Based Log Access
Given any telemetry or log record, When it is emitted, Then user identifiers are pseudonymized (no email or full name) using rotating pseudonymous IDs plus tenant_id/workspace_id, and zero raw PII fields are present Given a non-privileged user attempts to access logs When they query or download Then access is denied and an access.denied audit event is recorded Given an Admin or Compliance Officer accesses logs When they query or export Then only their tenant/workspace logs are accessible; exports are encrypted at rest/in transit and watermarked with requester and timestamp
Governance Documentation & Compliance Audit Logs in Admin
Given I am a Tenant Admin When I open Admin > Data Governance Then I can view current policies for consent, residency, retention, deletion, and auditing with version number and last-updated date Given system events occur (consent.changed, deletion.requested, deletion.propagated, access.denied, export.completed) When I open Admin > Audit Logs Then each entry includes event_type, actor, tenant_id, workspace_id, resource_id, timestamp (UTC), outcome, correlation_id, and is immutable with a 1-year retention; logs are filterable and exportable as CSV/JSON

Batch Splitter

Drop a mixed upload and let PixelLift auto-separate it into supplier-specific sub-batches. See counts, ETA, and cost per supplier, then process each with the correct presets and optionally re-merge for export—turning chaotic dumps into organized, predictable workflows.

Requirements

Automatic Supplier Detection & Clustering
"As a catalog manager, I want PixelLift to auto-separate a mixed upload into supplier-based groups so that I don’t waste time manually sorting before applying the right presets."
Description

On mixed uploads, automatically infer the originating supplier for each image and group items into supplier-specific sub-batches. Detection combines deterministic rules (SKU prefixes, filename patterns, folder names, barcodes, prior mappings) with ML signals (watermarks, backdrop color, lighting, model/set cues) to maximize accuracy. Each item receives a confidence score, explanation snippet, and supplier tag, with results stored for reuse on future uploads. The system must scale to hundreds of images per drop with sub-minute classification latency, respect existing tenant data boundaries, and expose the clustering outcome via UI and API.

Acceptance Criteria
Deterministic rule precedence yields high-accuracy clustering
Given a mixed upload of 300 images from 4 known suppliers where at least 70% of items contain deterministic identifiers (SKU prefixes, folder names, or barcodes) mapped to a supplier When Automatic Supplier Detection runs Then 100% of items with unambiguous deterministic identifiers are assigned to the mapped supplier And deterministic matches take precedence over ML signals on conflicts And overall assignment accuracy across items with deterministic identifiers is >= 98%
ML-only inference meets accuracy and sub-minute latency at scale
Given a mixed upload of 500 images averaging <= 8MB each where no filenames, SKUs, barcodes, or folder names map to known suppliers When Automatic Supplier Detection runs using ML signals (e.g., watermarks, backdrop color, lighting, model cues) Then p95 end-to-end classification latency for the batch is <= 60 seconds and p99 <= 90 seconds And macro F1 score across suppliers is >= 0.85 at the default confidence threshold And each item has a confidence score in the range 0.00-1.00 with two-decimal precision
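The macro F1 target averages per-supplier F1 so low-volume suppliers weigh equally with high-volume ones; a plain-Python sketch (sklearn's `f1_score` with `average='macro'` computes the same quantity):

```python
def macro_f1(y_true: list, y_pred: list, labels: list) -> float:
    """Unweighted mean of per-class F1 over the given supplier labels."""
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(scores) / len(scores)
```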
Confidence score and explanation snippet visible in UI and API
Given any processed item in a clustered batch When viewing the item in the UI grid or via API GET /batches/{id}/items Then the supplier_tag, confidence (0.00-1.00), and an explanation snippet are present And the explanation includes at least the top 2 contributing signals with their types (e.g., SKU prefix, watermark) and relative weights And the API returns fields: supplier_id, supplier_name, confidence, rationale[{signal_type, signal_value, weight}]
Learning from corrections and reusing mappings on future uploads
Given a user manually reassigns an item's supplier in the review UI and saves When a subsequent upload contains an item with the same deterministic identifier (e.g., SKU prefix or barcode) Then the system automatically assigns the corrected supplier based on the stored mapping And the reuse hit rate for items with known mappings is 100% And the mapping is persisted within 2 minutes of correction and is tenant-scoped with an audit record of user, timestamp, old_value, new_value
Tenant data isolation for mappings and ML signal usage
Given two tenants A and B with distinct supplier catalogs and correction histories When tenant A creates new mappings or the system learns supplier signatures from A's data Then those mappings and adaptations are not accessible to or used by tenant B And cross-tenant API calls must be tenant-scoped and return 403 when attempting to access another tenant's clustering results And identical test uploads in A and B only differ due to each tenant's own mappings, never due to cross-tenant leakage
UI exposes supplier sub-batches with complete assignment coverage
Given a mixed upload has completed clustering When the user opens the Batch Splitter results screen Then the UI displays supplier-specific sub-batches with supplier name/ID and item counts And every item is assigned to exactly one supplier or explicitly marked Unassigned when confidence < threshold And the sum of sub-batch item counts (including Unassigned) equals the total uploaded items
API endpoint returns complete clustering outcome for a batch
Given a clustered batch with an ID When a client calls GET /batches/{id}/clustering Then the response is 200 OK with JSON containing: batch_id, created_at, summary{total_items, unassigned_count, suppliers[{supplier_id, supplier_name, item_count, avg_confidence}]}, and items[{item_id, supplier_id, confidence, rationale[]}] And the endpoint responds within 2 seconds for batches up to 500 items And counts and assignments in the API match those displayed in the UI for the same batch
Supplier Preset Mapping & Defaults
"As a brand owner, I want each supplier’s images to automatically pick up the correct presets so that results stay consistent without repetitive manual selection."
Description

Maintain configurable supplier profiles that map each supplier to its default style presets, retouch levels, background settings, output dimensions, naming templates, and color profiles. When a sub-batch is created, automatically attach the mapped presets and allow per-batch overrides without altering the saved profile. Support versioned presets, a global fallback when no profile exists, permissioned editing, and audit of changes to ensure brand consistency across runs.

Acceptance Criteria
Auto-Attach Supplier Defaults on Sub-Batch Creation
Given a supplier profile exists with default style preset (ID or track-latest), retouch level, background settings, output dimensions, naming template, and color profile And a mixed upload is processed by Batch Splitter When a sub-batch is created for that supplier Then the sub-batch configuration is auto-populated with the supplier profile values And the resolved preset version is determined (latest published if track-latest, otherwise the specified version) And no manual selection is required to proceed And the sub-batch summary displays the attached preset names and version
Per-Batch Override Without Profile Mutation
Given a sub-batch with auto-attached supplier defaults When the user overrides any configuration in that sub-batch (e.g., retouch level, background, preset version, naming template) Then the override applies only to that sub-batch And the saved supplier profile remains unchanged And a subsequent sub-batch for the same supplier still auto-attaches the original profile values And the UI marks overridden fields and provides a reset-to-profile control
Versioned Presets Selection and Locking
Given a supplier profile referencing a style preset When the sub-batch moves to Processing state Then the exact preset version used is locked and recorded with the job And processing continues with that version even if a newer version is published mid-run And users may explicitly select a different version for the current sub-batch before processing And historical jobs display the locked version identifier
Global Fallback Applied When Supplier Profile Missing
Given a sub-batch is created for a supplier without a saved profile When the sub-batch is initialized Then the global default configuration is attached to the sub-batch And the UI displays a warning that global defaults are used And the user can link an existing supplier profile or create one before processing And if no action is taken, the batch processes with the global defaults And the audit log records that the global fallback was applied
Permissioned Editing of Supplier Profiles
Given a user without Manage Supplier Profiles permission When they attempt to create, edit, or delete a supplier profile or its mappings via UI or API Then the action is blocked with a 403/disabled controls and no changes are persisted Given a user with Manage Supplier Profiles permission When they create or update a supplier profile Then the change is saved and immediately available for new sub-batches And existing in-flight sub-batches are not retroactively changed
Auditable Change Log for Profiles and Overrides
Given auditing is enabled When a supplier profile is created, updated, or deleted, or a sub-batch override is made Then an audit entry is recorded within 5 seconds including actor, timestamp, affected supplier, action type, field-level before/after values, and related batch ID (for overrides) And audit entries are immutable and viewable/filterable by supplier and date range And audit entries can be exported as CSV/JSON
Naming, Dimensions, and Color Profile Enforcement
Given a supplier profile defines a naming template, output dimensions (size and fit/fill), and color profile When the sub-batch is processed Then output filenames match the template using allowed tokens, with collisions resolved by an auto-incremented suffix And invalid tokens or unresolved variables cause a validation error before processing starts And output images match the specified dimensions within ±1px and embed the specified ICC profile And a sample output preview reflects these settings before processing
Per-Supplier Counts, ETA, and Cost Estimation
"As a small business owner, I want to see counts, ETA, and cost per supplier before I run the jobs so that I can plan spend and delivery timelines confidently."
Description

Calculate and display, before processing, the number of items, estimated processing time, and projected cost for each supplier sub-batch and for the overall upload. Estimation accounts for preset complexity, current queue load, concurrency limits, and tiered pricing with plan-specific discounts and currency settings. Values update in real time as users edit sub-batches or presets and are available in the UI header, a downloadable summary, and via API, with warnings when budget or time thresholds are likely to be exceeded.

Acceptance Criteria
Real-time UI Header Estimates for Per-Supplier and Overall
Given a mixed upload is auto-split into supplier sub-batches with assigned presets When the Batch Splitter screen loads Then the header displays for each supplier: supplierName, itemCount, etaMinutes, projectedCost, currencyCode, lastUpdatedTimestamp And the overall totals equal the sum of per-supplier itemCount, etaMinutes, and projectedCost And values refresh within 2 seconds after an item is moved between suppliers or a preset is changed And suppliers with zero items display itemCount=0, etaMinutes=0, projectedCost=0
Preset Complexity Influences ETA and Cost Per Supplier
Given supplier A has complexityWeight=1.0 and supplier B has complexityWeight=1.5 and both have 20 items And the base per-item processing time is 3 seconds and base per-item price is 0.12 in the user's currency When the preset for supplier B is changed from weight 1.0 to 1.5 Then supplier B's etaMinutes and projectedCost increase by 50% ± 1% relative to weight 1.0, and supplier A's values remain unchanged And overall totals update accordingly within 2 seconds
Queue Load and Concurrency Adjust ETA
Given the system concurrency limit is 4 workers and average per-item processing time after preset weighting is 5 seconds And there is a controlled queue of 400 items ahead of the current upload When estimating a new upload with 120 items total Then the displayed overall ETA in minutes is greater than or equal to ceil(((400 + 120) * 5) / (4 * 60)) And decreasing the queue ahead by 200 items reduces the ETA by at least 25%
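The queue-aware lower bound above reduces to a one-line calculation; this sketch (names illustrative) reproduces the worked numbers: (400 + 120) items x 5 s across 4 workers gives ceil(2600 / 240) = 11 minutes, and dropping the queue ahead to 200 gives 7 minutes, a ~36% reduction.

```python
import math

def eta_minutes(queue_ahead: int, batch_items: int, per_item_sec: float, workers: int) -> int:
    """Lower-bound ETA: total weighted processing seconds for the queue ahead
    plus the new batch, spread across concurrent workers, rounded up to minutes."""
    return math.ceil((queue_ahead + batch_items) * per_item_sec / (workers * 60))
```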
Tiered Pricing, Plan Discounts, and Currency Application
Given a user on the Pro plan with a 10% discount and tiered pricing of 0.10 per item for the first 100 items and 0.08 thereafter And the user's currency is EUR with exchangeRateUSDToEUR=0.90 and currencyCode=EUR When a supplier sub-batch has 150 items and complexityWeight=1.0 Then the projectedCost equals roundHalfUp(((100 * 0.10) + (50 * 0.08)) * (1 - 0.10) * 0.90, 2) and displays currencyCode=EUR And the overall projectedCost equals the sum of per-supplier projectedCost across all suppliers in EUR
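The tiered-pricing example works out to ((100 x 0.10) + (50 x 0.08)) x 0.90 x 0.90 = 11.34 EUR; a Decimal-based sketch with the half-up rounding the criterion specifies (function and parameter names are illustrative):

```python
from decimal import Decimal, ROUND_HALF_UP

def projected_cost(items: int, discount: str, fx: str) -> Decimal:
    """Tiered price (0.10 for the first 100 items, 0.08 thereafter), then the
    plan discount and USD->target currency conversion, rounded half-up to 2dp."""
    tier1 = Decimal("0.10") * min(items, 100)
    tier2 = Decimal("0.08") * max(items - 100, 0)
    total = (tier1 + tier2) * (1 - Decimal(discount)) * Decimal(fx)
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Decimal arithmetic avoids the binary-float drift that would otherwise make the displayed cost disagree with the CSV/JSON exports.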
Estimates Available via API and Match UI
Given an upload with multiple supplier sub-batches is on the Batch Splitter screen When GET /api/v1/uploads/{uploadId}/estimates is called with a valid auth token Then the response is HTTP 200 with application/json containing overall and suppliers arrays where each supplier has supplierId, supplierName, itemCount, etaMinutes, projectedCost, currencyCode, warnings And the API values match the UI header values for the same upload within the same lastUpdatedTimestamp second
Downloadable Summary Provides Accurate Totals and Schema
Given an upload is split into supplier sub-batches with current estimates When the user clicks Download Summary Then files are generated for CSV and JSON formats within 3 seconds And the CSV includes headers: supplierId,supplierName,itemCount,etaMinutes,projectedCost,currencyCode,warnings and one row per supplier plus a final row labeled Overall with totals And the JSON includes overall and suppliers arrays with the same fields and values as the API response And numeric values in both files match the UI header within rounding rules and currency formatting
Budget and Time Threshold Warning Behavior
Given the user has configured a budgetCap=EUR 250 and timeThresholdMinutes=60 for the upload When projected overall cost exceeds budgetCap or overall etaMinutes exceeds timeThresholdMinutes Then a warning banner appears within 1 second showing which threshold is exceeded, the projected value, and the threshold And per-supplier rows exceeding thresholds display a warning icon with a tooltip explaining the exceedance And reducing items or changing presets to bring projections below thresholds removes the warnings within 2 seconds
Review & Edit Sub-batches UI
"As a photo ops lead, I want an easy way to review and fix the auto-split results so that I can correct edge cases before processing begins."
Description

Provide an interactive workspace to review proposed supplier groupings with thumbnails, stats, and confidence indicators. Enable users to split or merge groups, reassign items by drag-and-drop or bulk actions, search by filename/SKU, and view classification reasons. Include undo, keyboard shortcuts, mobile-responsive layout, and accessibility support. Persist edits across sessions and require confirmation before processing to prevent accidental runs with incorrect groupings.

Acceptance Criteria
Supplier Groupings Overview & Reasons Visibility
Given a mixed upload of up to 500 images producing up to 10 proposed supplier groups, When the Review & Edit Sub-batches UI loads, Then the page renders in under 2000 ms on a standard desktop (Core i5, 8GB RAM, 10 Mbps) and shows one card per group with supplier name or “Unassigned”, And each card displays an item count exactly matching the number of items in that group, And each card shows an ETA (mm:ss) computed from the currently selected preset for that group, And each card shows an estimated cost in the account currency based on current pricing, And each group card displays at least 12 thumbnails with lazy-load for additional items as the user scrolls, And each group card shows a confidence indicator (0–100%) along with a “Low confidence” badge when confidence is under 60%, And clicking “Why grouped?” for any item opens a panel in under 300 ms listing the top 1–3 classification reasons with per-signal confidence and timestamps, And closing the panel returns focus to the invoking control.
Split a Proposed Group
Given a supplier group with at least 10 items selected, When the user chooses Split and selects items via multi-select or defines a rule (e.g., SKU prefix or filename pattern), Then a new group is created containing exactly the selected/matched items and the original group now excludes them, And both groups’ counts, ETAs, costs, and confidence indicators update immediately (<200 ms after server ack), And the new group inherits presets from the source group by default with the option to change before saving, And the split is auto-saved to draft storage within 1 second and an undo step is created, And if no items are selected, the Split action is disabled.
Merge Groups
Given two or more supplier groups are selected, When the user clicks Merge and confirms the target supplier label/preset, Then a single merged group is created containing the union of items (no duplicates) and the source groups are removed, And the merged group displays updated count, ETA, cost, and recalculated confidence, And the merge operation is undoable and auto-saved within 1 second, And if selected groups use different presets, the user is prompted to select one preset or keep per-item preset before merge; default is target group’s preset, And merge is blocked if total items would exceed 10,000 (with explanatory message).
Reassign Items by Drag-and-Drop
Given a user can see item thumbnails within a source group and a visible destination group, When the user drags one or more selected thumbnails over a destination group, Then a valid drop zone is visually indicated and dropping moves exactly those items to the destination group, And source and destination counts, ETAs, costs, and confidence indicators update within 200 ms after drop, And the reassignment is persisted to draft within 1 second and an undo step is created, And dropping on an invalid area shows a non-blocking tooltip and leaves items in place, And multi-select drag (Shift/Ctrl/Cmd) supports moving 2–500 items in a single gesture.
Search and Bulk Reassign by Filename/SKU
Given the user enters a query into the search box using filename and/or SKU (supports case-insensitive, wildcard * and ?, and quoted exact phrases), When the user submits the query, Then matching items across all groups are returned in under 300 ms for up to 2,000 items and highlighted, And the user can Select All results and choose Bulk Move to a destination group, And exactly the selected results are moved with counts/ETAs/costs/confidence updated accordingly, And the search query and selection state persist while navigating between groups until cleared, And invalid queries surface a helpful error without clearing the prior results.
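The query semantics above (case-insensitive, `*` and `?` wildcards, quoted exact phrases) can be sketched with Python's standard `fnmatch` and `re` modules. The function names and item shapes here are illustrative assumptions, not PixelLift's actual implementation:

```python
import fnmatch
import re

def compile_query(query: str) -> "re.Pattern[str]":
    """Compile a search query into a case-insensitive matcher.

    A query wrapped in double quotes is treated as an exact phrase;
    otherwise * and ? wildcards apply (fnmatch semantics).
    """
    if len(query) >= 2 and query[0] == query[-1] == '"':
        # Quoted exact phrase: escape everything, anchor both ends.
        pattern = "^" + re.escape(query[1:-1]) + "$"
    else:
        # fnmatch.translate turns * and ? into an anchored regex.
        pattern = fnmatch.translate(query)
    return re.compile(pattern, re.IGNORECASE)

def search_items(items: list[dict], query: str) -> list[dict]:
    """Return items whose filename or SKU matches the query."""
    rx = compile_query(query)
    return [it for it in items
            if rx.match(it["filename"]) or rx.match(it["sku"])]
```

A production search would also report invalid patterns without clearing prior results, as the criterion requires; that error path is omitted here.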
Accessibility, Keyboard Shortcuts, and Mobile Responsiveness
Given the feature is used by keyboard-only and screen reader users on desktop and by touch users on mobile, When interacting with all core operations (review, split, merge, reassign, search, confirm), Then the UI meets WCAG 2.2 AA: all actionable elements are keyboard reachable, have visible focus, logical tab order, no keyboard traps, text contrast ≥ 4.5:1, and changes are announced via ARIA live regions, And drag-and-drop has a keyboard alternative (Move dialog via Enter/Space) for selected items, And controls have accessible names describing supplier, counts, confidence, and actions, And the following shortcuts work: Undo Ctrl/Cmd+Z, Redo Ctrl/Cmd+Shift+Z, Search Ctrl/Cmd+F, Select All Ctrl/Cmd+A, Merge M, Split S, Move V; shortcuts are discoverable in a Help overlay, And on mobile widths 320–480 px the layout stacks: group cards in a vertical list, thumbnails in a two-column grid, toolbar collapses to an overflow menu, and all operations remain available without horizontal scroll.
Session Persistence and Processing Confirmation
Given the user has made edits (splits, merges, reassignments), When any change is made, Then an auto-save occurs to cloud draft within 1 second (with offline local cache sync on reconnect) and a “Last saved” timestamp updates, And when the user returns within 7 days, reopening the batch restores the exact prior state, And when the user clicks Process, a confirmation dialog summarizes supplier counts, ETAs, and total cost and requires explicit confirmation, And the Confirm button is disabled if any items remain in “Unassigned” or if there are validation errors, And if any group has confidence <60%, the dialog displays a warning and requires checking “I reviewed low-confidence items” before enabling Confirm, And upon confirming, the batch locks to prevent concurrent edits until processing begins or the user cancels.
Parallel Processing Orchestration
"As an operations manager, I want each supplier batch to process in parallel with accurate progress so that I get faster turnaround without losing track of status."
Description

Execute supplier sub-batches concurrently while applying their respective presets, honoring tenant-level concurrency limits and prioritization rules. Provide real-time progress per sub-batch, with pause, resume, and cancel controls. Ensure idempotent job handling, automatic retries with backoff for transient failures, and autoscaling of workers to meet demand. The orchestrator must survive worker restarts and partial failures while maintaining accurate status and cost tracking.

Acceptance Criteria
Per-Tenant Concurrency Limits Enforced
Given tenant T has a concurrency limit of 3 and 7 ready sub-batches, when orchestration starts, then no more than 3 sub-batches for T are in Running state concurrently and the remainder stay Queued. Given a Running sub-batch for T completes, when capacity frees, then exactly one Queued sub-batch for T transitions to Running within 2 seconds. Given multiple tenants T1(limit=2) and T2(limit=1) have ready sub-batches, when orchestration runs, then each tenant’s concurrent Running count never exceeds its respective limit.
Priority Scheduling Across Sub-batches
Given queued sub-batches with priorities High and Normal for the same tenant and available capacity, when selecting the next sub-batch, then a High-priority sub-batch starts before any Normal-priority sub-batch. Given two sub-batches with the same priority for the same tenant, when both are queued, then the one queued earlier starts first (FIFO). Given at least one High-priority sub-batch is queued for tenant T and capacity is available for T, when scheduling, then no Normal-priority sub-batch for T is started ahead of a High-priority one.
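The two criteria above — per-tenant concurrency caps, High before Normal, and FIFO within a priority — can be sketched with an in-memory scheduler built on a heap. The class, method names, and integer priority encoding are illustrative assumptions, not the real orchestrator:

```python
import heapq
from collections import defaultdict

class Scheduler:
    """Minimal sketch: per-tenant concurrency limits with
    priority-then-FIFO ordering of queued sub-batches."""

    HIGH, NORMAL = 0, 1  # lower value = higher priority

    def __init__(self, limits):
        self.limits = limits             # tenant -> max concurrent Running
        self.running = defaultdict(int)  # tenant -> current Running count
        self.queues = defaultdict(list)  # tenant -> heap of (prio, seq, job_id)
        self._seq = 0                    # global counter for FIFO tie-break

    def enqueue(self, tenant, job_id, priority=NORMAL):
        heapq.heappush(self.queues[tenant], (priority, self._seq, job_id))
        self._seq += 1

    def start_next(self, tenant):
        """Return the next job_id to run, or None if at capacity or empty."""
        if self.running[tenant] >= self.limits[tenant] or not self.queues[tenant]:
            return None
        _, _, job_id = heapq.heappop(self.queues[tenant])
        self.running[tenant] += 1
        return job_id

    def complete(self, tenant):
        self.running[tenant] -= 1
```

Because the heap key is `(priority, enqueue_sequence)`, a High-priority sub-batch always starts before any Normal one, and equal priorities drain in arrival order.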
Correct Preset Application per Supplier Sub-batch
Given supplier A’s sub-batch is assigned Preset A and supplier B’s sub-batch is assigned Preset B, when both run concurrently, then all items in A are processed using Preset A and all items in B using Preset B with zero cross-application. Given any processed item output, when inspecting output metadata, then it includes presetId matching the sub-batch’s assigned preset. Given a sub-batch is retried or resumed, when processing restarts, then the same preset is applied consistently to all remaining items.
Real-time Progress and ETA per Sub-batch
Given a sub-batch with 200 items is Running, when processing, then the progress API/WebSocket publishes updates at least every 2 seconds or on any state change, whichever occurs first. Given progress is requested, when measured at time t, then processedCount + failedCount + remainingCount equals totalItems and percentComplete equals processedCount/totalItems within 0.5% rounding tolerance. Given ETA is displayed, when 50% of items are complete, then ETA mean absolute percentage error over the next 60 seconds is ≤ 20%.
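The count invariant above (processed + failed + remaining = total, with percentComplete derived from processedCount/totalItems) can be expressed as a small helper; the field names mirror the criterion, but the function itself is an illustrative sketch:

```python
def progress_snapshot(total: int, processed: int, failed: int) -> dict:
    """Build a progress payload that satisfies the count invariant."""
    remaining = total - processed - failed
    # Invariant required by the criterion: counts must always sum to total.
    assert processed + failed + remaining == total
    return {
        "processedCount": processed,
        "failedCount": failed,
        "remainingCount": remaining,
        "percentComplete": round(100.0 * processed / total, 1),
    }
```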
Pause, Resume, and Cancel Controls
Given a sub-batch is Running, when Pause is requested, then new items stop being leased within 3 seconds and status transitions to Paused after in-flight items finish. Given a sub-batch is Paused, when Resume is requested, then processing restarts within 3 seconds and status transitions to Running. Given a sub-batch is Running or Paused, when Cancel is requested, then no new items are processed, in-flight items are allowed to complete, remaining items are marked Canceled, and status transitions to Canceled. Given duplicate Pause/Resume/Cancel requests are sent, when processed, then operations are idempotent and do not produce conflicting states, and an audit entry is recorded with actor, timestamp, and reason.
Idempotent Job Handling and Deduplication
Given a Create-Job request is retried with the same idempotencyKey within 24 hours, when received, then no duplicate sub-batch is created and the existing jobId and status are returned. Given a worker crashes after completing an item but before acknowledgment, when it restarts, then the same item may be re-attempted but deduplication ensures it is effectively applied exactly once to output and billing. Given network retries cause the same item to be delivered multiple times, when processing, then only one successful result is committed and duplicates are discarded.
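The idempotency-key behavior above can be sketched as follows. This uses an in-memory map purely for illustration — a real system would need a shared, durable store — and the 24-hour TTL matches the criterion:

```python
import time

class JobRegistry:
    """Sketch of idempotency-key deduplication for Create-Job requests."""

    TTL_SECONDS = 24 * 3600  # retry window from the criterion

    def __init__(self):
        self._by_key = {}   # idempotency_key -> (job_id, created_at)
        self._next_id = 1

    def create_job(self, idempotency_key, now=None):
        """Return (job_id, created). A retry inside the TTL returns the
        existing job_id with created=False instead of a duplicate."""
        now = time.time() if now is None else now
        entry = self._by_key.get(idempotency_key)
        if entry and now - entry[1] < self.TTL_SECONDS:
            return entry[0], False
        job_id = f"job-{self._next_id}"
        self._next_id += 1
        self._by_key[idempotency_key] = (job_id, now)
        return job_id, True
```

Per-item exactly-once application (the crash-between-complete-and-ack case) would additionally need a committed-result check keyed by item ID, which this sketch does not show.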
Retries, Autoscaling, and Orchestrator Resilience with Accurate Costing
Given a transient error (HTTP 5xx, timeout, or rate-limit) occurs while processing an item, when classified as retryable, then the system retries up to 3 times with exponential backoff (1 s, 2 s, 4 s) and ±25% jitter; after the final failure, the item is marked Failed with errorCode and errorMessage; non-retryable errors are not retried. Given worker pool backlog > 500 items sustained for 60 seconds, when autoscaling evaluates, then additional workers are launched to increase throughput by at least 2x within 90 seconds, not exceeding configured maxWorkers. Given backlog < 50 items sustained for 5 minutes, when autoscaling evaluates, then workers scale down by at least 1 worker per minute to minWorkers without interrupting in-flight items. Given a worker terminates while holding item leases, when its heartbeat expires (≤ 30 seconds), then its leases are revoked and items are re-queued with no loss. Given partial failures and restarts occur, when a sub-batch completes, then final status counts (processed, failed, canceled) and totalCost equal the sum of per-item outcomes; totalCost equals the sum over suppliers of processedItems × that supplier’s per-item rate, is consistent across restarts, and matches billing records.
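The retry policy above (up to 3 retries, 1s/2s/4s exponential backoff, ±25% jitter, no retries for non-retryable errors) can be sketched like this. The `process_item` callable and the `kind` attribute used for error classification are illustrative assumptions:

```python
import random
import time

RETRYABLE = ("timeout", "http_5xx", "rate_limit")

def process_with_retries(process_item, item, max_retries=3, base_delay=1.0,
                         jitter=0.25, sleep=time.sleep):
    """Run process_item(item), retrying transient failures with
    exponential backoff and jitter; re-raise otherwise."""
    attempt = 0
    while True:
        try:
            return process_item(item)
        except Exception as exc:
            kind = getattr(exc, "kind", None)
            if kind not in RETRYABLE or attempt >= max_retries:
                raise  # non-retryable or exhausted -> mark Failed upstream
            delay = base_delay * (2 ** attempt)           # 1s, 2s, 4s
            delay *= 1 + random.uniform(-jitter, jitter)  # ±25% jitter
            sleep(delay)
            attempt += 1
```

Injecting `sleep` keeps the sketch testable; a worker would pass the real `time.sleep` (or an async equivalent).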
Re-merge & Export Manager
"As a marketplace seller, I want to export all processed images together with clear structure and manifests so that I can upload to my storefronts without extra rework."
Description

After processing, allow users to re-merge results into a unified export while preserving supplier-based organization as needed. Support export targets such as ZIP download, cloud storage (S3, Google Drive), and commerce integrations, along with configurable folder structures, naming templates, and color profiles. Generate a manifest (CSV/JSON) capturing supplier, original filenames, applied presets, processing timestamps, and per-item cost, and provide webhooks and shareable links with retention controls.

Acceptance Criteria
Misclassification Handling & Recovery
"As a content coordinator, I want clear handling for uncertain or incorrect splits so that I can fix them quickly without losing time or paying twice."
Description

Provide safeguards and recovery paths for items that are unclassified, low-confidence, or misclassified. Flag low-confidence items for review, fall back to default presets when no supplier mapping exists, and route failures to an exceptions queue with clear reasons and suggested fixes. Support post-run corrections that trigger reprocessing with correct presets and adjust billing deltas accordingly, while notifying users via in-app alerts and email when attention is required.

Acceptance Criteria
Low-Confidence Items Routed to Review Queue
Given a mixed upload is classified by supplier When an item’s supplier classification confidence is below 0.75 Then the item is labeled "Low Confidence" and added to the Review queue And the UI displays the predicted supplier and confidence score for the item And the item is not auto-assigned to a supplier preset until the user confirms or edits the supplier And the batch summary displays a Low Confidence count and link to the Review queue
Default Preset Fallback When Supplier Mapping Missing
Given an item’s predicted supplier has no mapping to a preset When processing is initiated for the batch Then the item is processed using the workspace Default Preset And the UI labels the item "Default Preset applied (no supplier mapping)" And the audit log records the fallback decision with timestamp and item ID And cost is calculated using the Default Preset rate and reflected in batch totals And the batch summary shows a count of Defaulted items
Exceptions Queue with Reasons and Suggested Fixes
Given an item fails supplier classification or processing (e.g., corrupt image, timeout) When the failure is detected Then the item is moved to the Exceptions queue within 30 seconds And the queue entry includes a reason code, human-readable message, and suggested next action And available actions include Retry, Assign Supplier, and Remove from Batch And taking an action updates the item status and writes an audit log entry
Post-Run Correction Triggers Reprocessing and Billing Delta
Given an item was processed with the wrong supplier or the Default Preset When a user assigns the correct supplier and requests reprocessing Then the item is reprocessed using the correct supplier’s preset And the corrected rendition replaces the prior output while preserving version history And billing is adjusted by the cost delta between original and corrected processing And batch totals, supplier counts, and cost per supplier update accordingly And 95% of reprocess requests complete within 5 minutes And a completion notification is sent to the user
Attention Notifications via In-App and Email
Given a batch contains Low Confidence items or Exceptions When batch analysis completes Then an in-app alert appears within 60 seconds summarizing counts with deep links And an email notification is sent to project admins with the same summary And notifications are deduplicated per batch and not resent within 1 hour unless counts change And users can configure email notifications in Notification Settings
Counts, ETAs, and Export Update After Corrections
Given post-run corrections change supplier assignments for one or more items When the correction is saved or reprocessing completes Then supplier sub-batch counts, ETAs, and cost per supplier recompute within 15 seconds And the re-merge and export use the corrected supplier presets And the export manifest lists the final supplier and preset applied for each item

Fallback Rules

Define a smart hierarchy for low-confidence cases—SKU prefixes, folder names, CSV maps, or API tags. The router follows your priority order to place images safely, ensuring no asset stalls while still respecting brand and channel constraints.

Requirements

Hierarchical Rule Engine
"As a catalog operations manager, I want to define a prioritized sequence of routing rules so that low-confidence images are consistently placed in the correct destination without manual intervention."
Description

Implements a deterministic priority stack that evaluates routing rules in a defined order for low-confidence classification cases. Supports conditions using SKU prefixes, folder path patterns, CSV column mappings, and API-provided tags, with scoping at global, brand, and channel levels. Evaluates rules until the first match, applies the mapped destination (collection/preset/channel), and falls back to a final catch-all rule to ensure no asset stalls. Includes conflict resolution, rule scoping precedence, and guardrails to respect brand and channel constraints within PixelLift’s existing routing pipeline.

Acceptance Criteria
Deterministic Priority Evaluation with First-Match Execution
Given a priority-ordered rule stack where lower numeric value indicates higher priority When an asset matches multiple rules across the stack Then the engine evaluates rules strictly in ascending numeric priority, stops at the first match, applies that rule’s destination, and does not evaluate subsequent rules And the routing audit records applied_rule_id, applied_priority, and evaluation_order for the asset Given a batch of 500 assets and 1,000 rules When routing runs in the standard staging environment Then at least 99% of asset evaluations complete within 50 ms per asset and no timeouts occur Given identical input assets and rules across repeat runs When routing executes multiple times Then the same rule_id is consistently applied per asset (deterministic outcome)
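The first-match behavior above can be sketched as a short loop; because Python's `sorted` is stable, equal priorities keep a deterministic order. The `matches` predicate and rule/audit shapes are illustrative assumptions:

```python
def evaluate(asset, rules, matches):
    """Evaluate rules in ascending numeric priority (lower value =
    higher priority), stop at the first match, and return the applied
    rule plus a minimal audit entry."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if matches(rule, asset):
            return rule, {"applied_rule_id": rule["rule_id"],
                          "applied_priority": rule["priority"]}
    return None, {"applied_rule_id": None, "reason": "no_match"}
```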
Rule Scoping Precedence: Global vs Brand vs Channel
Given rules exist at channel, brand, and global scopes When an asset is associated to brand B and channel C Then evaluation order is: channel-scoped(C) → brand-scoped(B) → global Given an asset associated to brand B only (no channel) When evaluated Then evaluation order is: brand-scoped(B) → global Given an asset with no brand association When evaluated Then only global rules are considered Given two matching rules at different scopes When evaluated Then the higher-precedence scope wins regardless of cross-scope numeric priorities; numeric priority only orders rules within the same scope And the routing audit records winning_scope and skipped_scopes
Condition Types Supported and Match Semantics
Given a rule with SKU prefix condition "ABC-" When an asset has SKU "ABC-12345" Then the rule matches Given a rule with folder path pattern "uploads/2025/*/Spring/**" When an asset’s source_path is "uploads/2025/03/Spring/look1/img001.jpg" Then the rule matches Given an uploaded CSV with a column "route_to" mapping filename "img_001.jpg" to destination D When the engine processes assets including "img_001.jpg" Then the CSV mapping condition matches and destination D is applied (subject to precedence and priority) Given an API-provided tag "campaign=summer-25" When a rule requires tag equals "campaign=summer-25" Then the rule matches assets carrying that tag Given a rule contains multiple conditions When evaluated Then conditions are combined with AND semantics and the rule matches only if all specified conditions are true
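The condition types and AND semantics above can be sketched as follows. The path-glob translation assumes `*` stays within one path segment while `**` crosses separators; the rule/asset key names are illustrative:

```python
import re

def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a folder-path pattern: ** matches across directory
    separators, * matches within a single segment (assumed semantics)."""
    out, i = [], 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            out.append(".*")
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("^" + "".join(out) + "$")

def rule_matches(rule: dict, asset: dict) -> bool:
    """AND-combine whichever conditions the rule specifies; the rule
    matches only if every present condition holds."""
    if "sku_prefix" in rule and not asset.get("sku", "").startswith(rule["sku_prefix"]):
        return False
    if "path_glob" in rule and not glob_to_regex(rule["path_glob"]).match(asset.get("source_path", "")):
        return False
    if "tag_equals" in rule and rule["tag_equals"] not in asset.get("tags", []):
        return False
    if "csv_route" in rule and rule["csv_route"].get(asset.get("filename")) is None:
        return False
    return True
```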
Catch-All Fallback Ensures No Asset Stalls
Given no configured rules match an asset When routing completes Then the catch-all rule applies and the asset is routed to the configured default destination (collection/preset/channel) And the routing audit records applied_rule_id = catch_all and reason = no_match Given a routing job of N assets When the job completes Then 100% of assets have a destination assigned and zero assets remain in Unrouted state Given an attempt to save a rule configuration without an enabled catch-all When validating the configuration Then the system rejects the save with a validation error indicating a required catch-all rule
Conflict Resolution and Deterministic Tie-Breaking
Given two or more rules in the same scope and same numeric priority match an asset When evaluated Then the engine selects the rule with the highest specificity using this order: CSV explicit mapping > API tag exact equals > SKU prefix match > folder path wildcard pattern And if specificity is tied, the most recently updated rule wins; if still tied, the lowest rule_id wins And the routing audit records tie_breaker = specificity|updated_at|rule_id accordingly Given tie-break scenarios are replayed with identical inputs When evaluated multiple times Then the same winning rule is selected consistently
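The tie-break chain above (specificity, then most recent update, then lowest rule_id) maps naturally to a single sort key. The rule dict shape and the `kind` labels are illustrative assumptions:

```python
# Specificity order from the criterion: CSV mapping > tag equals >
# SKU prefix > folder path wildcard. Lower number = more specific.
SPECIFICITY = {"csv": 0, "tag": 1, "sku_prefix": 2, "path_glob": 3}

def winner(matching_rules):
    """Pick one rule from same-scope, same-priority matches using the
    deterministic tie-break chain."""
    return min(matching_rules,
               key=lambda r: (SPECIFICITY[r["kind"]],  # most specific first
                              -r["updated_at"],        # then newest update
                              r["rule_id"]))           # then lowest rule_id
```

Because the key is a plain tuple, replaying the same inputs always selects the same rule, which is exactly the determinism the criterion demands.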
Guardrails Respect Brand and Channel Constraints
Given a matching rule’s destination violates brand B’s allowed presets or channel C’s permissions When the engine evaluates that rule Then the rule is skipped with reason = constraint_violation and evaluation continues to the next eligible rule Given all matching rules violate constraints When evaluation completes Then the asset is routed by the catch-all to a safe destination (e.g., Needs Review) and flagged requires_admin_action = true Given a rule attempts to route an asset to a different brand’s collection or a disabled channel When evaluated Then the engine prevents cross-brand leakage and disallowed channel routing; the audit records prevented_destination and constraint type
Multi-Source Attribute Parsing
"As a technical admin, I want attributes reliably derived from SKUs, folders, CSVs, and API tags so that routing rules have consistent inputs across all my uploads."
Description

Builds parsers to extract normalized attributes from multiple sources: regex/substring SKU prefix parsing, folder name tokenization, sidecar CSV ingestion with configurable column-to-attribute mapping, and API tag retrieval from integrations. Normalizes values into a canonical schema consumed by the rule engine, with validation, error handling, and caching. Operates at batch scale, supports asynchronous enrichment, and preserves tenant isolation for PixelLift workspaces.

Acceptance Criteria
SKU Prefix Parsing Produces Canonical Attributes
Given a SKU parsing config with regex rule_id "sku_regex_v1" mapping ^([A-Z]{3})-(\d{4})-([A-Z]{2})-(XS|S|M|L|XL)$ to {brand_code, season, color_code, size} When an asset with sku "ABC-2025-RD-M" is parsed Then attributes {brand_code:"ABC", season:"2025", color:"Red", size:"M"} are emitted in canonical schema with confidence 1.0 and provenance {source:"sku", rule_id:"sku_regex_v1"} Given substring fallback rules map the prefix before "-" to brand_code When an asset with sku "ZZZ1234-FOO" fails all regex rules Then brand_code:"ZZZ1234" is emitted with confidence 0.6 and reason "SKU_REGEX_MISS" and no other sku-derived attributes are emitted Given an asset with sku "BADSKU" When parsing completes Then parse_confidence is 0.0, reason "SKU_PARSE_FAIL" is recorded, sku_raw is preserved, and downstream routing is not blocked Given a batch of 5000 assets When sku parsing runs on a single worker Then p95 per-asset parsing latency is <= 50ms and throughput is >= 5000 assets/minute
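The regex-first, prefix-fallback parsing above can be sketched directly from the criterion. The color-code lookup table and function shape are illustrative assumptions; confidence values and reason codes mirror the criterion:

```python
import re

SKU_RE = re.compile(r"^([A-Z]{3})-(\d{4})-([A-Z]{2})-(XS|S|M|L|XL)$")
COLOR_CODES = {"RD": "Red", "BL": "Blue"}  # illustrative lookup table

def parse_sku(sku: str) -> dict:
    """Parse a SKU into canonical attributes with confidence and
    provenance; fall back to prefix-as-brand, then record a hard miss."""
    m = SKU_RE.match(sku)
    if m:
        brand, season, color_code, size = m.groups()
        return {"brand_code": brand, "season": season,
                "color": COLOR_CODES.get(color_code, color_code),
                "size": size, "confidence": 1.0,
                "provenance": {"source": "sku", "rule_id": "sku_regex_v1"}}
    if "-" in sku:
        # Substring fallback: prefix before the first "-" as brand_code.
        return {"brand_code": sku.split("-", 1)[0], "confidence": 0.6,
                "reason": "SKU_REGEX_MISS"}
    return {"confidence": 0.0, "reason": "SKU_PARSE_FAIL", "sku_raw": sku}
```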
Folder Tokenization Maps Path Segments to Attributes
Given a folder tokenization config with anchor "uploads/", token order ["season","category","color"], and normalizers {underscores_to_spaces:true, case_insensitive:true} When an asset at path "/tenants/t1/uploads/Summer_24/Dresses/Red/product1.jpg" is parsed Then attributes {season:"Summer 2024", category:"Dresses", color:"Red"} are emitted with confidence 0.8 and provenance {source:"folder", path:"/tenants/t1/uploads/Summer_24/Dresses/Red"} Given a path with missing tokens "/tenants/t1/uploads/Winter_25/product2.jpg" When parsing completes Then only {season:"Winter 2025"} is emitted and reasons include "FOLDER_TOKEN_MISSING" for absent attributes Given extra path segments beyond configured tokens When parsing runs Then extra segments are ignored and no errors are raised
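The anchor-based tokenization above can be sketched as follows. Only the underscores-to-spaces normalizer is shown; richer normalization (e.g., expanding "Summer_24" to "Summer 2024") is omitted, and the function shape is an illustrative assumption:

```python
def tokenize_folder(path, anchor="uploads/",
                    tokens=("season", "category", "color")):
    """Map path segments after the anchor onto configured attribute
    names; report missing tokens, ignore extra segments."""
    idx = path.find(anchor)
    if idx < 0:
        return {}, ["FOLDER_ANCHOR_MISSING"]
    segments = path[idx + len(anchor):].split("/")[:-1]  # drop the filename
    attrs, reasons = {}, []
    for name, segment in zip(tokens, segments):
        attrs[name] = segment.replace("_", " ")  # underscores_to_spaces
    for name in tokens[len(segments):]:
        reasons.append(f"FOLDER_TOKEN_MISSING:{name}")
    return attrs, reasons
```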
Sidecar CSV Mapping With Validation and Idempotency
Given a sidecar CSV "catalog_attrs.csv" with columns ["sku","Color","Size","Material"] and a mapping {"Color":"color","Size":"size","Material":"material"} for workspace "w1" When the CSV is ingested for a batch of 1000 assets Then rows are upserted into canonical attributes with normalization (e.g., "blue","BLU" -> "Blue") and provenance {source:"csv", file:"catalog_attrs.csv"} Given the CSV is re-uploaded unchanged When ingestion runs Then processing is idempotent (no duplicate writes) and result indicates "NO_OP" based on file hash Given the CSV is missing a mapped column "Size" When ingestion runs Then ingestion returns "CSV_SCHEMA_MISMATCH", skips only invalid rows, processes valid rows, and produces a downloadable error report with row numbers and reasons Given a row contains an invalid enum value "XXXL" for size When normalization runs Then the value is rejected, reason "VALUE_NOT_IN_ENUM:size" is recorded, and the row is not blocked for other attributes
API Tag Retrieval With Caching and Rate-Limit Handling
Given a connected Shopify integration and tag parsing pattern "Key:Value" When fetching tags for product_id 12345 returns ["Color:Blue","Season:Summer 2025"] Then attributes {color:"Blue", season:"Summer 2025"} are emitted with provenance {source:"api", integration:"shopify"} and cached with TTL 15m Given the API responds with HTTP 429 When retrieval is attempted Then the client backs off exponentially (initial 500ms, 3 retries), after which the asset is marked enrichment_pending with retry_at set and reason "API_RATE_LIMIT" Given cached tags exist and are unexpired When enrichment runs again within TTL Then no external call is made and cached values are used
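The cache behavior above (15-minute TTL, no external call while unexpired) can be sketched with a workspace-namespaced map, which also illustrates the tenant segregation required later in this feature. In-memory storage and the class shape are illustrative; a real cache would be shared and durable:

```python
import time

class TagCache:
    """Minimal TTL cache for integration tag lookups, keyed per
    workspace so tenants never see each other's entries."""

    def __init__(self, ttl_seconds=15 * 60, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # (workspace_id, product_id) -> (tags, stored_at)

    def get(self, workspace_id, product_id):
        entry = self._entries.get((workspace_id, product_id))
        if entry and self.clock() - entry[1] < self.ttl:
            return entry[0]
        return None  # missing or expired -> caller fetches from the API

    def put(self, workspace_id, product_id, tags):
        self._entries[(workspace_id, product_id)] = (tags, self.clock())
```

Injecting `clock` makes expiry testable without waiting out the TTL.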
Canonical Normalization and Merge Precedence
Given the canonical schema defines attributes ["color","size","season","category","material"] with enumerations for color and size When inputs are collected from sources {csv, api, sku, folder} Then the merged attribute set respects precedence csv > api > sku > folder and records per-attribute source_of_truth Given a value has a synonym ("blu","blue","BLU") When normalization runs Then the value is standardized to "Blue" and passes validation Given an attribute not in schema ("fabrication") When inputs include it Then it is discarded with reason "ATTRIBUTE_NOT_IN_SCHEMA" and does not appear in the canonical payload
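The merge precedence above (csv > api > sku > folder, with per-attribute source_of_truth and out-of-schema attributes discarded) can be sketched by writing sources from lowest to highest precedence so later writes win. The dict shapes are illustrative assumptions:

```python
PRECEDENCE = ["csv", "api", "sku", "folder"]  # highest precedence first
SCHEMA = {"color", "size", "season", "category", "material"}

def merge_attributes(per_source: dict) -> dict:
    """Merge per-source attribute dicts into a canonical payload,
    recording which source won each attribute."""
    merged = {}
    for source in reversed(PRECEDENCE):  # lowest precedence applied first
        for name, value in per_source.get(source, {}).items():
            if name not in SCHEMA:
                continue  # ATTRIBUTE_NOT_IN_SCHEMA: discard silently here
            merged[name] = {"value": value, "source_of_truth": source}
    return merged
```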
Batch-Scale Async Enrichment and Eventual Consistency
Given a batch of 10,000 assets submitted to workspace "w1" When multi-source parsing and enrichment execute asynchronously Then 95% of assets have final attributes available within 15 minutes, and none are blocked from routing; assets lacking confident attributes are flagged for fallback with reasons Given clients poll GET /workspaces/w1/enrichment/status?batch_id=... When enrichment is in progress Then the API returns counts {total, completed, pending, failed} and latest_updated_at, with p95 status endpoint latency <= 200ms Given late-arriving API tags update an asset after initial routing When enrichment finalizes Then an "attributes_updated" event is emitted with diff payload and downstream rule re-evaluation is triggered idempotently
Tenant Isolation and Cache Segregation
Given workspaces "A" and "B" use identical SKUs and paths When parsing and caching occur Then attributes and caches are namespaced by workspace_id, and reads in "A" never return data from "B" Given an API key from workspace "A" requests CSV mappings for "B" When authorization is checked Then the request is denied with 403 and reason "TENANT_ISOLATION" Given workspace "A" is deleted When cleanup runs Then all cached enrichment entries for "A" are purged within 5 minutes and no orphaned data remains
Safe Routing & Quarantine
"As a merchandising lead, I want ambiguous assets to go to a safe default or a review queue with clear reasons so that listings are never blocked and compliance is maintained."
Description

Ensures every asset is routed safely even when ambiguity remains after rule evaluation. Applies a configurable safe default destination per brand/channel or moves the asset into a reviewable quarantine queue with SLA timers. Prevents pipeline stalls by enforcing timeouts and retry policies, applies only allowed minimal transformations under constraints, and surfaces a clear reason code for the chosen fallback in PixelLift’s asset details.

Acceptance Criteria
Route to Safe Default on Low Confidence
Given an asset’s routing confidence is below the configured threshold and a brand/channel safe default destination exists When routing is evaluated after fallback rules Then the asset is routed to the configured safe default destination within 5 seconds And a fallback_applied flag is set to true on the asset record And the asset’s processing continues without manual intervention
Quarantine with SLA and Escalation
Given an asset cannot be confidently routed and no safe default applies or policy_quarantine is enabled When routing evaluation completes Then the asset is placed in the Quarantine queue with status "Pending Review" And an SLA timer of 4 hours is set and visible (created_at and sla_due_at) And a reviewer notification is sent via email and webhook within 60 seconds And if SLA expires, the asset auto-routes to the global safe default and the quarantine entry is closed with reason_code "sla_expired"
Timeout and Retry Without Pipeline Stall
Given any downstream call during routing or minimal processing exceeds the configured timeout of 10 seconds When the timeout occurs Then the operation is retried up to 3 times with exponential backoff starting at 1 second And on final failure the asset is moved to Quarantine with status "System Error" and reason_code "timeout" And other assets in the same batch continue processing without waiting on the failed asset
Enforce Minimal Transformations Under Constraints
Given an asset is routed via safe default or placed in Quarantine When transformations are applied Then only minimal transformations are allowed: auto-retouch, background removal to #FFFFFF, resize to max edge 2048px, metadata sanitization And disallowed transformations are skipped: style-presets, compositing, channel watermarks And the transformation log lists each applied step with timestamp and outcome
Reason Code and Signals Visible in Asset Details
Given a fallback decision is applied (safe default, quarantine, or escalation) When the asset details are fetched via UI or API Then a reason_code is present with one of the enumerated values: low_confidence, rule_miss, timeout, downstream_error, sla_expired, policy_quarantine And reason_details include evaluated signals (sku_prefix, folder_name, csv_map, api_tag, confidence_score) And a link or ID to the fallback hierarchy evaluation log is present
Execute Priority-Based Fallback Hierarchy With Audit
Given a configured fallback hierarchy [sku_prefix, folder_name, csv_map, api_tag, safe_default, quarantine] When an asset is evaluated Then rules are executed strictly in the configured order and the first match determines the destination And the evaluation audit records each step with order index, outcome, and timestamp And the hierarchy configuration version used is recorded on the asset decision record
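The strict-order evaluation with a per-step audit described above can be sketched as a loop over ordered resolvers. The `(name, fn)` resolver shape and audit fields are illustrative assumptions:

```python
def route_asset(asset, resolvers, audit):
    """Try each fallback step in configured order; the first step that
    returns a destination wins. Every evaluated step is audited."""
    for index, (name, resolve) in enumerate(resolvers):
        destination = resolve(asset)
        audit.append({"order": index, "step": name,
                      "outcome": "match" if destination else "miss"})
        if destination:
            return destination
    return None  # unreachable if the hierarchy ends in safe_default/quarantine
```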
Rule Builder with Live Preview
"As a brand admin, I want a visual rule builder with live previews so that I can configure fallback behavior confidently without writing code."
Description

Provides an admin UI to create, edit, and reorder rules via drag-and-drop, with a condition builder for SKU/folder/CSV/tag criteria. Includes a live preview that tests rules against sample images or prior batches, showing the matched rule, destination, and applied constraints before publishing. Offers validation, conflict detection, draft/publish workflows, and role-based access aligned with PixelLift’s admin model.

Acceptance Criteria
Drag-and-Drop Rule Creation and Reordering
Given I am an Editor or Admin on the Fallback Rules page When I create a new rule with a name, condition, destination, and constraints Then the rule is saved as Draft and inserted at the top of the Draft rules list Given at least three draft rules exist When I drag a rule to a new position and release Then the new order is persisted to the server and remains after page refresh and re-login And the live preview evaluates rules in the persisted order Given a ruleset is currently Published When I change the order of any rule Then a new Draft version of the ruleset is created and the Published version remains unchanged until publish
Condition Builder for SKU/Folder/CSV/Tag
Given the condition builder is open for a rule When I define conditions using SKU prefix, Folder name, CSV column equals, and API tag equals Then the builder validates each operand and operator and displays a Valid state with no errors Given I combine multiple conditions When I group them with AND/OR and parentheses Then the serialized condition logic is saved exactly as configured and used by preview evaluation Given I reference an unknown CSV column, unsupported operator, or empty value When I attempt to save the rule Then saving is blocked and inline errors identify the invalid fields
Live Preview of Rule Matches and Destinations
Given I select up to 200 sample assets from a prior batch or upload When I run Live Preview without publishing Then each asset row shows the matched rule name/ID, destination, and applied constraints, or "No match" with the first unmet condition And 95% of preview responses for 200 assets return in under 2 seconds Given I edit a rule condition or reorder rules while the preview panel is open When I re-run the preview Then results reflect the latest Draft configuration without requiring publish
Validation and Conflict Detection Before Publish
Given at least two draft rules have overlapping conditions When I validate or attempt to publish the ruleset Then the system flags the overlap, shows the winning rule by evaluation order, estimates affected asset count based on the last batch, and allows publish with a warning Given a rule references an invalid destination, missing constraint, or a CSV column not present in the selected map When I validate or attempt to publish Then publish is blocked until errors are resolved, with error messages pointing to the exact rule and field
Draft/Publish Versioning and Rollback
Given a ruleset with validation errors When a Publisher or Admin attempts to publish Then the publish action is disabled and errors are listed until all blocking issues are resolved Given a ruleset with no blocking validation errors When a Publisher or Admin publishes Then a new immutable version (N+1) is created with timestamp and author, and the router uses version N+1 for new processing within 10 seconds Given multiple historical versions exist When an Admin selects a prior version and clicks Rollback Then that version becomes the new Published version and an audit entry records the rollback
Role-Based Access and Audit Logging
Given PixelLift RBAC is applied When a Viewer accesses the Rule Builder Then they can open Live Preview but cannot create, edit, reorder, publish, or delete rules (controls disabled; API returns 403 if invoked) Given PixelLift RBAC is applied When an Editor accesses the Rule Builder Then they can create/edit/reorder Draft rules and run preview but cannot publish or delete Published versions Given any user performs create, edit, reorder, publish, or rollback When the action completes Then an audit log entry is recorded with user ID, timestamp, action type, rule/ruleset version, and diff of changes
Decision Audit & Explainability
"As a compliance reviewer, I want a clear audit of how each asset was routed so that I can verify decisions and adjust rules when errors occur."
Description

Captures a per-asset decision trail including input confidence scores, extracted attributes, evaluated rules with outcomes, and the final routing action. Exposes searchable logs in the UI and exportable reports (CSV/JSON) with retention controls. Enables rapid debugging, compliance reviews, and continuous tuning of fallback strategies within PixelLift.

Acceptance Criteria
Per-Asset Decision Trail Capture
Given an image asset is processed by PixelLift (via batch upload, folder watch, CSV, or API) When processing completes (success or failure) Then a decision record is written with: asset_id, batch_id, timestamp_utc, uploader_id, workspace_id; source context (sku, sku_prefix, folder_path, csv_mapping_id, api_tags, channel); versions (model_version, ruleset_version, preset_version); input_confidence_score (0.000–1.000) and confidence_threshold_applied; extracted_attributes [{name, value, confidence}]; evaluated_rules ordered [{rule_id, name, priority, condition_summary, inputs_snapshot, outcome (pass|fail), score, reason}]; fallback_chain_followed [{rank, rule_id, outcome, reason}] and fallback_triggered (true|false); final_routing_action {destination, style_preset, channel_constraints_respected}; and error {code, stage, message} when applicable And the decision record is immutable (no updates; only additive annotations with user_id and timestamp) And the record becomes queryable within 10 seconds of job completion
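The field list above is dense; a minimal illustrative record, shown as a Python dict, may make it easier to picture. All IDs and values here are hypothetical, not real PixelLift output.

```python
# Hypothetical example of one decision record matching the fields above.
decision_record = {
    "asset_id": "ast_001",
    "batch_id": "bat_42",
    "timestamp_utc": "2024-05-01T12:00:00Z",
    "uploader_id": "usr_7",
    "workspace_id": "ws_3",
    "source": {"sku": "TEE-RED-M", "sku_prefix": "TEE", "folder_path": "/inbox/tees",
               "csv_mapping_id": None, "api_tags": ["summer"], "channel": "shopify"},
    "versions": {"model_version": "m-1.4.0", "ruleset_version": 12, "preset_version": 3},
    "input_confidence_score": 0.912,
    "confidence_threshold_applied": 0.800,
    "extracted_attributes": [{"name": "category", "value": "apparel", "confidence": 0.97}],
    "evaluated_rules": [
        {"rule_id": "r1", "name": "Apparel route", "priority": 1,
         "condition_summary": "category == apparel",
         "inputs_snapshot": {"category": "apparel"},
         "outcome": "pass", "score": 0.97, "reason": "condition matched"},
    ],
    "fallback_chain_followed": [],
    "fallback_triggered": False,
    "final_routing_action": {"destination": "listings/apparel",
                             "style_preset": "clean-white-v2",
                             "channel_constraints_respected": True},
    "error": None,
}

# Confidence cleared the threshold, so no fallback was triggered.
assert decision_record["input_confidence_score"] >= decision_record["confidence_threshold_applied"]
```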
Search and Filter Decision Logs in UI
Given a user with View Audit Logs permission opens Decision Logs When they apply any combination of filters (date range, asset_id, sku, sku_prefix, folder_path, tag, rule_id, routing_action, fallback_triggered, confidence range, batch_id, uploader_id) Then only matching records are returned And the table shows at least: asset_id, timestamp, sku, input_confidence_score, fallback_triggered, final_routing_action, rules_evaluated_count, error_code (if any) And for result sets up to 10,000 records, the first page loads within 3 seconds and supports pagination and sorting by any visible column And clicking a row opens the detailed decision view for that asset
Detailed Explainability View per Asset
Given a user opens an asset’s decision details When the view loads Then it displays chronological evaluation steps with timestamps and rule priority order, each rule’s condition, inputs, and pass/fail reason, the confidence threshold comparison and whether it triggered fallback, the fallback hierarchy followed and the stop condition, and the final routing action with destination link And all values exactly match the stored decision record (record_id/checksum displayed) And the user can copy the explanation text and generate a shareable link that expires within 7 days
Export Decision Logs to CSV and JSON
Given a user has applied filters in Decision Logs When they choose Export and select CSV or JSON Then the system produces a file with schema_version and headers/fields matching the decision record structure And the export row count equals the number of records matching the filters at the time of export And timestamps are ISO 8601 UTC, numeric confidences have 3 decimal places, and booleans are true/false And for exports up to 100,000 records, the file is available within 5 minutes; larger jobs are delivered as compressed downloads with in-app/email notification and 7-day link expiry And downloads are access-controlled and audited with user_id, timestamp, filter summary, and checksum
Retention Controls and Purge Enforcement
Given a workspace retention policy is configured (7, 30, 90, 365 days, or custom) When a decision record exceeds the retention period and is not under legal_hold Then it is purged within 24 hours by an automated job And purged records no longer appear in searches or exports And admins can place/remove legal_hold on assets, batches, or date ranges to prevent purge And a purge audit log records metadata only (counts, time window, job_id) without decision content And changes to retention settings and holds are audited with who, what, when
Rule Tuning Insights and Safe Simulation
Given a user opens Rule Insights and selects a rule_id and date range When the report loads Then it shows total evaluations, pass/fail counts, average input_confidence_score, fallback_trigger_rate, top contributing attributes, channels affected, and routing action distribution And the user can filter insights by sku_prefix, folder_path, tags, or channel And from any metric the user can open sample decision records and their detail views And the user can run a re-run simulation on up to 500 sampled assets against a draft ruleset; results are shown side-by-side and do not impact production And all insight views and simulations are exportable and audited
Versioning, Rollback & A/B Testing
"As a product owner, I want versioned rules with rollback and A/B tests so that I can iterate safely and improve routing outcomes based on data."
Description

Maintains versioned rule sets per workspace with draft, scheduled, and active states, plus single-click rollback. Supports traffic-split A/B testing between rule set versions to measure routing accuracy, manual review rate, and time-to-listing, with guardrails to cap exposure when error thresholds are exceeded. Integrates metrics into PixelLift analytics for data-driven optimization.

Acceptance Criteria
Rule Set Version Lifecycle (Draft, Scheduled, Active)
Given a workspace with an Active version v1, When a new version v2 is created as Draft, Then v2 is editable and v1 remains Active. Given a Draft v2, When activation is scheduled for future timestamp T in the workspace time zone, Then v2 state becomes Scheduled and no other version can be Scheduled simultaneously. Given v2 Scheduled for T, When wall-clock reaches T, Then v2 becomes Active within 60 seconds, v1 transitions to Archived, and all new assets route under v2 with zero job failures attributable to the switch. Given an Active version, When a user attempts to edit its rule definitions, Then the action is blocked with guidance to create a Draft. Given the versions list is queried, When validating states, Then exactly one version is Active, at most one is Scheduled, and any number can be Draft or Archived.
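The state invariant in the last criterion (exactly one Active, at most one Scheduled) is easy to check mechanically; a minimal sketch, with the function name and list representation assumed rather than taken from the spec:

```python
from collections import Counter

def validate_version_states(states):
    """Check the lifecycle invariant: exactly one Active version,
    at most one Scheduled, any number of Draft or Archived."""
    counts = Counter(states)
    unknown = set(counts) - {"Draft", "Scheduled", "Active", "Archived"}
    if unknown:
        raise ValueError(f"unknown states: {unknown}")
    return counts["Active"] == 1 and counts["Scheduled"] <= 1

print(validate_version_states(["Archived", "Active", "Draft"]))      # True
print(validate_version_states(["Active", "Active"]))                 # False
print(validate_version_states(["Active", "Scheduled", "Scheduled"])) # False
```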
Single-Click Rollback to Previous Version
Given an Active v2 and an Archived v1, When the user clicks Rollback to v1, Then within 60 seconds v1 becomes Active, v2 becomes Archived, and all new assets route under v1. Given a rollback occurs while jobs are in-flight, When those jobs complete, Then they finalize under the rules they started with, and no asset is processed with mixed rules. Given a rollback is executed, When viewing audit logs, Then an entry records actor, from-version, to-version, timestamp, and outcome. Given a scheduled activation exists, When rollback completes, Then any scheduled activation is canceled and the schedule status is updated accordingly.
A/B Test Traffic Split and Consistency
Given an Active version vA and a Draft vB, When an A/B test is started with a 70/30 split, Then after a warm-up of 100 assets the allocation over the next 1000 assets is within ±2% of target for each variant. Given an A/B test is running, When assets arrive within the same upload batch, Then all assets in that batch are consistently assigned to the same variant. Given an A/B test is paused or ended, When new assets arrive, Then 100% of assets route to the currently Active non-test version. Given an A/B test is running, When a user attempts to edit either variant’s rules, Then the edit is blocked with guidance to create a new draft and restart the test.
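Batch-consistent assignment is typically implemented by hashing a stable key rather than randomizing per asset, so every asset in a batch lands on the same variant. A sketch assuming SHA-256 bucketing; the function and IDs are hypothetical:

```python
import hashlib

def assign_variant(batch_id: str, test_id: str, split_b: float = 0.30) -> str:
    """Deterministically assign a whole upload batch to variant A or B.
    Hashing (test_id, batch_id) keeps all assets in one batch on the
    same variant, as the consistency criterion requires."""
    digest = hashlib.sha256(f"{test_id}:{batch_id}".encode()).digest()
    # Map the first 8 bytes to a uniform number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "vB" if bucket < split_b else "vA"

# Every asset in the same batch sees the same variant.
assert assign_variant("batch-123", "test-1") == assign_variant("batch-123", "test-1")
```

Across many batches the realized split converges on the configured 70/30 target, which is what the ±2% allocation check above measures.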
Metrics Collection and Analytics Integration
Given an A/B test is running, When assets are processed, Then routing accuracy per variant is computed as 1 - (manual reassignment count / total routed assets) and appears in Analytics within 5 minutes of asset completion. Given an A/B test is running, When assets are processed, Then manual review rate per variant (manual reviews / total assets) appears in Analytics within 5 minutes. Given an A/B test is running, When assets are processed, Then median (P50) and P95 time-to-listing per variant from upload to ready-for-listing appear in Analytics within 5 minutes. Given an A/B test ends, When the user opens Analytics, Then a summary compares variants with absolute and relative differences and is exportable as CSV. Given Analytics filters are applied (time range, channel, SKU prefix), When viewing metrics, Then all metrics are correctly segmented per variant and filter.
Guardrails and Exposure Caps
Given an A/B test with variants A and B and guardrails configured, When routing accuracy for B drops below 95% over a sliding window of 500 assets, Then within 2 minutes B’s traffic allocation is automatically reduced to 0% and 100% of new assets route to A. Given guardrails with a soft threshold manual review rate > 10% over 200 assets, When triggered, Then B’s traffic is capped at 5% until recovery conditions are met. Given any guardrail triggers, When it occurs, Then workspace admins receive in-app and email notifications and an audit log entry is recorded. Given a guardrail cap is active, When viewing the A/B test in UI, Then the status displays Limited by guardrails and shows the active cap and triggering metric. Given recovery conditions are met (two consecutive windows meeting thresholds), When evaluated, Then the system restores the configured traffic split automatically.
Workspace Isolation and Permissions
Given multiple workspaces exist, When a version is activated, rolled back, or tested in Workspace X, Then no versions or traffic allocations change in other workspaces. Given a user without Manage Rules permission, When they attempt to create, schedule, activate, rollback, or start an A/B test, Then the operation is blocked with an authorization error. Given version history exists, When exporting or auditing, Then all actions list workspace ID, actor, timestamp, and action type.

Smart Allocator

Continuously rebalances traffic between style variants (e.g., shadow/no‑shadow, crops, backgrounds) using a multi‑armed bandit strategy. You learn faster with less revenue risk, because high performers get more exposure while weak variants are automatically deprioritized. Set min/max traffic per variant and safe‑start limits for new launches.

Requirements

Bandit Allocation Engine with Traffic Constraints
"As a store owner, I want traffic to shift automatically toward the best‑performing image styles while respecting my min/max limits so that I improve conversions without risking revenue or brand consistency."
Description

Implements a configurable multi‑armed bandit engine (e.g., Thompson Sampling) that continuously reallocates traffic among style variants (shadow/no‑shadow, crops, backgrounds) to maximize a chosen objective while honoring user‑defined guardrails. The engine supports per‑variant minimum/maximum traffic, per‑experiment exploration rate, and safe caps for newly launched variants. It integrates with PixelLift’s style‑preset registry to ensure stable variant IDs across batch uploads and product groups, persists experiment state, and recalculates allocations on a rolling cadence. Expected outcome is faster learning with reduced revenue risk, automatically prioritizing high performers and deprioritizing weak variants without manual intervention.
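One way to combine Thompson Sampling with per-variant traffic bounds is to estimate each arm's probability of being best by Monte Carlo, then project those weights onto the min/max constraint box. A stdlib-only sketch under a Beta-Bernoulli assumption; the projection scheme shown is one plausible approach, not necessarily PixelLift's:

```python
import random

def thompson_weights(arms, n_samples=2000, rng=random):
    """arms: list of (successes, failures). Estimate P(arm is best)
    by Monte Carlo over Beta posteriors (Thompson Sampling)."""
    wins = [0] * len(arms)
    for _ in range(n_samples):
        draws = [rng.betavariate(s + 1, f + 1) for s, f in arms]
        wins[draws.index(max(draws))] += 1
    return [w / n_samples for w in wins]

def project_to_bounds(raw, mins, maxs):
    """Project raw weights onto {a : min_i <= a_i <= max_i, sum(a) = 1}.
    Assumes a feasible configuration (sum(mins) <= 1 <= sum(maxs))."""
    alloc = list(mins)                      # start everyone at their floor
    remaining = 1.0 - sum(mins)
    free = set(range(len(raw)))
    while remaining > 1e-12 and free:
        total = sum(raw[i] + 1e-9 for i in free)
        overflow = 0.0
        for i in sorted(free):
            share = remaining * (raw[i] + 1e-9) / total
            room = maxs[i] - alloc[i]
            if share >= room:               # cap hit: park at max, carry surplus
                alloc[i] = maxs[i]
                overflow += share - room
                free.discard(i)
            else:
                alloc[i] += share
        remaining = overflow
    return alloc

random.seed(0)
raw = thompson_weights([(90, 910), (120, 880), (60, 940)])
alloc = project_to_bounds(raw, mins=[0.10, 0.20, 0.0], maxs=[0.60, 0.50, 0.40])
assert abs(sum(alloc) - 1.0) < 1e-6
```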

Acceptance Criteria
Min/Max Traffic Constraints Enforcement
Given an active experiment with variants A, B, C and constraints {A: min 10%, max 60%}, {B: min 20%, max 50%}, {C: min 0%, max 40%} where sum(min) ≤ 100 And a rolling allocation cadence of 5 minutes When the allocator computes allocations at the next cadence Then for each variant i ∈ {A,B,C}, allocation_i ∈ [min_i, max_i] And the sum of allocations equals 100% ± 0.1% And allocations are published no later than 1 minute after the cadence tick And an immutable log entry records allocations, inputs, and constraint set used
Per-Experiment Exploration Rate Floor
Given an experiment with exploration_rate = 10% and feasible min/max constraints And a measurement window of 10,000 impressions When traffic is served across the window Then ≥ 9% of impressions are allocated to non-currently-best variants subject to constraints And if exploration_rate = 0%, no exploration impressions are forced And per-variant min/max constraints are respected throughout
Safe-Start Caps for Newly Launched Variants
Given a new variant D with safe_start_cap = 5% and safe_start_impressions = 1,000 When the allocator runs before D reaches 1,000 impressions Then allocation_D ≤ min(5%, max_D) and allocation_D ≥ 0 And if min_D is configured > 5%, the effective min during safe-start is clamped to 5% When D reaches 1,000 impressions Then the safe-start cap is lifted on the next cadence and standard constraints apply And audit logs record the start and end timestamps of the safe-start period
State Persistence and Crash Recovery
Given an active experiment with persisted posterior parameters, impression/reward counts, constraints, and objective When the bandit service restarts unexpectedly Then on restart the engine restores state within 2 seconds And the first post-restart allocations differ from pre-restart by ≤ 0.5 percentage points per variant (absent new data) And impression/reward counters are continuous (no resets) And if persistence is unavailable, the allocator freezes the last valid allocation for up to 15 minutes and emits an alert
Objective Function Selection and Reward Mapping
Given objective = conversion_rate (Bernoulli) When impressions and conversion events are ingested Then per-variant posteriors update using conversions/impressions and allocation maximizes expected conversion rate Given objective = revenue_per_impression (continuous) When revenue events are ingested Then per-variant posteriors update using revenue per impression and allocation maximizes expected revenue per impression When the objective is changed mid-experiment Then the change takes effect on the next cadence, a snapshot of prior state is stored, and the change is audit-logged
Stable Variant IDs via Style-Preset Registry
Given style preset "No Shadow v2" is registered with variant_id V123 When 3 separate batch uploads and 2 product groups use this preset Then all traffic, metrics, and allocations reference variant_id V123 When a non-breaking preset edit (e.g., label) is applied Then variant_id remains V123 When a breaking change is published Then a new variant_id is created and treated as a new arm; the prior variant_id remains intact for historical continuity
Invalid/Infeasible Constraint Handling
Given a configuration where sum(min) > 100% or any min > max When the configuration is saved Then the API responds 400 with validation codes {MIN_SUM_EXCEEDS_100, MIN_GREATER_THAN_MAX} and no changes are applied Given constraints become infeasible at runtime due to a variant being paused/removed When the allocator detects infeasibility at a cadence Then it preserves the last valid allocation for up to 15 minutes, emits an alert, and marks the experiment state as Invalid until constraints are corrected
Conversion Signal & Attribution Pipeline
"As a marketer, I want the allocator to learn from accurate conversion and revenue data so that traffic shifts reflect true performance rather than noisy signals."
Description

Build a low‑latency, privacy‑aware event pipeline that ingests and aggregates performance signals per style variant from multiple sources (on‑site pixel, server‑side events, Shopify/WooCommerce APIs). Supports configurable objective metrics (e.g., conversion rate, add‑to‑cart rate, revenue per session), sessionization, de‑duplication, and attribution windows with delayed conversion handling. Normalizes metrics across catalogs and preserves multi‑tenant isolation. Feeds the allocator with accurate, timely rewards to drive rebalancing grounded in business impact.

Acceptance Criteria
Multi-Source Event Ingestion & Schema Validation
Given the on-site pixel, server-side endpoint, and Shopify/WooCommerce apps are configured When valid events are sent from each source at 10k events/min aggregate Then 99.9% of events are accepted and mapped to the unified schema within 2 seconds of receipt Given events with missing required fields (tenant_id, variant_id, event_type, timestamp_ms) When they are received Then they are rejected with a machine-readable error code and logged with source, reason, and count Given events with client timestamps skewed > 5 minutes When they are processed Then server receive time is used for ordering and attribution, and the skew is recorded Given a malformed payload or unknown event_type When it is received Then it is dropped and does not affect aggregates
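The required-field and clock-skew rules above can be sketched as a small validator; the function shape and error-code strings are assumptions for illustration:

```python
REQUIRED = ("tenant_id", "variant_id", "event_type", "timestamp_ms")
MAX_SKEW_MS = 5 * 60 * 1000  # 5-minute skew tolerance from the criteria

def validate_event(event, server_now_ms):
    """Return (accepted, ordering_ts_ms, error_code). Rejects events
    missing required fields; falls back to server receive time when
    the client clock is skewed by more than 5 minutes."""
    missing = [f for f in REQUIRED if f not in event]
    if missing:
        return False, None, "MISSING_FIELDS:" + ",".join(missing)
    ts = event["timestamp_ms"]
    if abs(server_now_ms - ts) > MAX_SKEW_MS:
        return True, server_now_ms, "CLOCK_SKEW"  # accepted, reordered
    return True, ts, None

now = 1_700_000_000_000
ok, ts, err = validate_event(
    {"tenant_id": "t1", "variant_id": "v1", "event_type": "view",
     "timestamp_ms": now - 1000}, now)
assert ok and ts == now - 1000 and err is None
```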
Privacy & Consent Enforcement
Given a user has not consented to tracking When a client attempts to send events Then events are discarded client-side where possible and server-side events with consent=false are not persisted nor used in aggregates Given an event contains IP address or precise location When it is ingested Then IP is truncated/anonymized and precise location is discarded before storage Given tenant data residency is set to region R When events are processed Then raw and aggregated data remain in region R storage and compute Given a data deletion request for user_id U When the request is executed Then all identifiable event records are hard-deleted within 7 days and excluded from aggregates thereafter Given GDPR/CCPA flags are present When building the reward feed Then no PII fields are included; only aggregated, non-identifying metrics are emitted
Sessionization, De-duplication, and Cross-Source Reconciliation
Given a session inactivity timeout of 30 minutes When a user produces events across pixel and server within 30 minutes Then events are assigned the same session_id Given duplicate conversions with the same order_id arrive from pixel and server When aggregates are computed Then only one conversion is counted per order_id per session per variant exposure Given an event with an idempotency_key already processed arrives again within 24 hours When it is ingested Then no additional aggregate change occurs Given labeled test data with known sessions and duplicates When sessionization runs Then session assignment accuracy is >= 99% and duplicate suppression rate is >= 99.5%
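De-duplicating conversions by order per session and variant exposure reduces to keyed set membership; a minimal sketch, with the event shape assumed:

```python
def count_conversions(events):
    """Count at most one conversion per (order_id, session_id, variant_id),
    regardless of how many sources (pixel, server) reported it."""
    seen = set()
    total = 0
    for e in events:
        key = (e["order_id"], e["session_id"], e["variant_id"])
        if key not in seen:
            seen.add(key)
            total += 1
    return total

events = [
    {"order_id": "o1", "session_id": "s1", "variant_id": "v1", "source": "pixel"},
    {"order_id": "o1", "session_id": "s1", "variant_id": "v1", "source": "server"},
    {"order_id": "o2", "session_id": "s1", "variant_id": "v1", "source": "server"},
]
print(count_conversions(events))  # 2
```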
Configurable Attribution Windows with Delayed Conversion and Adjustments
Given a tenant sets click attribution window Wc=24h and view window Wv=7d with last-touch model When a conversion occurs within Wc of the last click on variant V Then the conversion is attributed to variant V; otherwise if within Wv of the last view, it is attributed as a view-through Given a conversion event arrives after the configured window When aggregates are recomputed Then it is excluded from attributed conversions and marked out_of_window=true in logs Given a refund or cancellation for order_id O occurs within 30 days When adjustments are processed Then prior attributed conversions and revenue for O are reversed in the next aggregation cycle Given delayed conversions arrive for prior days When the pipeline processes them Then aggregates and reward feed deltas are updated within 5 minutes and versioned
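The last-touch window logic above can be expressed as a small pure function (timestamps in seconds; names are assumptions):

```python
def attribute(conversion_ts, last_click_ts, last_view_ts,
              click_window_s=24 * 3600, view_window_s=7 * 24 * 3600):
    """Last-touch attribution with separate click/view windows.
    Returns 'click', 'view_through', or 'out_of_window'."""
    if last_click_ts is not None and 0 <= conversion_ts - last_click_ts <= click_window_s:
        return "click"
    if last_view_ts is not None and 0 <= conversion_ts - last_view_ts <= view_window_s:
        return "view_through"
    return "out_of_window"

t = 1_000_000
print(attribute(t, last_click_ts=t - 3600, last_view_ts=t - 7200))       # click
print(attribute(t, last_click_ts=t - 48 * 3600, last_view_ts=t - 7200))  # view_through
print(attribute(t, last_click_ts=None, last_view_ts=t - 8 * 24 * 3600))  # out_of_window
```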
Objective Metrics Configurability & Correctness
Given the tenant selects conversion_rate as the objective When aggregates are computed per variant and catalog per rolling 24h window Then conversion_rate equals conversions/sessions_exposed with counts included Given the tenant selects revenue_per_session When revenue is computed Then revenue is net of refunds and currency-normalized to the tenant currency using FX rates at event timestamp, and revenue_per_session equals net_revenue/sessions_exposed Given the tenant selects add_to_cart_rate When aggregates are computed Then add_to_cart_rate equals add_to_cart_events/sessions_exposed Given an offline recomputation on the same raw data When results are compared Then online metrics match within 0.5% relative error for all variants with sample_size >= 100 sessions
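The three objective formulas reduce to simple ratios over a window's aggregates; a sketch following the definitions above, with net revenue taken as gross minus refunds:

```python
def objective_metrics(sessions_exposed, conversions, add_to_carts,
                      gross_revenue, refunds):
    """Compute the three configurable objectives exactly as defined
    in the criteria above (revenue net of refunds)."""
    net_revenue = gross_revenue - refunds
    return {
        "conversion_rate": conversions / sessions_exposed,
        "add_to_cart_rate": add_to_carts / sessions_exposed,
        "revenue_per_session": net_revenue / sessions_exposed,
    }

m = objective_metrics(sessions_exposed=400, conversions=20,
                      add_to_carts=60, gross_revenue=1100.0, refunds=100.0)
print(m["conversion_rate"])      # 0.05
print(m["revenue_per_session"])  # 2.5
```

Currency normalization (FX at event timestamp) would apply to `gross_revenue` and `refunds` before this step; it is omitted here for brevity.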
Normalization Across Catalogs with Multi-Tenant Isolation
Given the normalization policy equal_catalog_weighting is enabled When rewards are computed across catalogs A and B with traffic split 90/10 Then each catalog contributes 50% weight to the aggregate objective for that tenant Given catalogs with different currencies and price ranges When normalization runs Then revenue-based metrics are currency-normalized and scaled per policy without cross-catalog leakage Given queries or feeds are scoped by tenant_id When attempting to access data from another tenant Then zero records are returned and access is denied Given per-tenant encryption keys and isolated storage namespaces When data at rest is inspected Then each tenant’s data is encrypted with its key and stored separately
Reward Feed Contract, Latency & Safeguards
Given per-variant aggregates are computed When publishing to the Smart Allocator at a 60-second cadence Then each payload contains tenant_id, catalog_id, variant_id, window, metric_name, metric_value, sessions_exposed, conversions, add_to_carts, net_revenue, sample_size, confidence_interval, normalization_policy, is_cold_start, data_sufficiency, timestamp, version and validates against the JSON schema (100% pass rate) Given sustained load of 10k events/min When end-to-end processing runs Then p95 time from event receipt to inclusion in the next published payload is <= 60s and p99 <= 5m Given sample_size < min_n or data freshness > 5 minutes When the payload is built Then data_sufficiency=false is set and the allocator_hold flag is true Given network or broker outages of up to 15 minutes When service recovers Then no duplicate payload side-effects occur (idempotent upserts), and backlog is drained within 30 minutes
Safe‑Start Auto‑Ramp for New Variants
"As a seller launching a new image style, I want it to start with limited exposure and only ramp up when it proves itself so that I minimize revenue risk during testing."
Description

Provides cautious rollout for newly introduced style variants by enforcing initial traffic caps, minimum sample sizes, and monotonic ramp rules tied to credible performance intervals. Automatically increases exposure as evidence accumulates and halts or rolls back ramps when expected loss exceeds a defined threshold. Supports per‑store policies and per‑catalog overrides to protect launches while still enabling rapid validation of new PixelLift style‑presets.

Acceptance Criteria
Initial Traffic Cap Enforcement for New Variants
Given store S has safe-start policy: initial_cap=5% (pp), min_floor=0.5%, global_variant_max=40% And a new style variant V is added to catalog C at T0 When the allocator runs during the first allocation cycle after T0 Then V receives an allocation a_V such that 0.5% ≤ a_V ≤ 5% and a_V ≤ 40% And the sum of allocations across all variants remains 100% ± 0.1pp And a decision record is logged with reason="initial_cap"
Minimum Sample Size Gate Before Ramp
Given policy min_sample_impressions=300 and min_window=24h And variant V has fewer than 300 impressions or observed window < 24h since T0 When the allocator evaluates ramp eligibility Then V's allocation does not increase above initial_cap And the UI/API shows ramp_status="GATED_MIN_SAMPLE" with remaining counts/time
Monotonic Ramp Based on Credible Interval
Given policy: ci_level=95%, ramp_step_pp=5, safe_start_max=20%, global_variant_max=40%, prob_best_threshold=0.6 And the Bayesian posterior for V's primary metric shows 95% CI uplift lower bound > 0 and P(V is best) ≥ 0.6 When the allocator decides to ramp Then V's allocation increases by up to 5 percentage points but not above min(safe_start_max, global_variant_max) And V's allocation never decreases during safe-start except on rollback triggers
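The credible-interval and P(best) gates can be approximated with Monte Carlo draws from Beta posteriors; a stdlib-only sketch under a Beta-Bernoulli assumption, with thresholds taken from the policy above:

```python
import random

def ramp_eligible(variant, control, prob_best_threshold=0.6,
                  ci_level=0.95, n=20000, rng=random):
    """Check both ramp gates: the 95% credible interval of the uplift
    (variant - control conversion rate) must have a lower bound > 0,
    and P(variant is best) must meet the threshold.
    variant/control are (conversions, sessions) tuples."""
    uplifts = []
    wins = 0
    for _ in range(n):
        pv = rng.betavariate(variant[0] + 1, variant[1] - variant[0] + 1)
        pc = rng.betavariate(control[0] + 1, control[1] - control[0] + 1)
        uplifts.append(pv - pc)
        wins += pv > pc
    uplifts.sort()
    lower = uplifts[int(n * (1 - ci_level) / 2)]  # ~2.5th percentile
    return lower > 0 and wins / n >= prob_best_threshold

random.seed(1)
print(ramp_eligible(variant=(150, 1000), control=(100, 1000)))  # True
print(ramp_eligible(variant=(80, 1000), control=(100, 1000)))   # False
```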
Auto-Halt and Rollback on Expected Loss Breach
Given policy expected_loss_threshold=0.5% revenue per 1000 sessions and negative_uplift_guard=-1.0pp And current estimates show expected_loss > 0.5% or 95% CI uplift lower bound < -1.0pp When the allocator evaluates V Then V's allocation is immediately reduced to the last stable allocation or to initial_cap, whichever is lower And further ramps for V are disabled for cooldown_window=48h And an alert is emitted and a decision record logs the trigger metrics and reason
Per-Store Policy and Per-Catalog Override Resolution
Given store S default policy: initial_cap=5%, safe_start_max=20%, min_sample=300 And catalog C override: initial_cap=8%, min_sample=200, safe_start_max=25% And store-level ceiling: safe_start_max_ceiling=22% When a new variant V is launched in catalog C Then the effective policy is: initial_cap=8%, min_sample=200, safe_start_max=22% (clamped to ceiling) And the decision record includes resolved policy values and their sources (override vs store default)
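The override-resolution example above can be reproduced with a simple merge-then-clamp; a sketch in which only safe_start_max is given a ceiling, matching the example:

```python
def resolve_policy(store_default, catalog_override=None, ceilings=None):
    """Merge a per-catalog override over the store default, then clamp
    against store-level ceilings, tracking where each value came from."""
    policy = dict(store_default)
    policy.update(catalog_override or {})
    sources = {k: ("override" if catalog_override and k in catalog_override
                   else "store_default") for k in policy}
    ceilings = ceilings or {}
    cap = ceilings.get("safe_start_max")
    if cap is not None and policy["safe_start_max"] > cap:
        policy["safe_start_max"] = cap
        sources["safe_start_max"] = "ceiling"
    return policy, sources

policy, sources = resolve_policy(
    store_default={"initial_cap": 5, "safe_start_max": 20, "min_sample": 300},
    catalog_override={"initial_cap": 8, "min_sample": 200, "safe_start_max": 25},
    ceilings={"safe_start_max": 22},
)
print(policy)  # {'initial_cap': 8, 'safe_start_max': 22, 'min_sample': 200}
```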
Respect Global Min/Max Traffic Constraints
Given Smart Allocator global constraints for V: min_traffic=1%, max_traffic=40% And safe-start policy: initial_cap=0.5%, safe_start_max=20% When allocation is computed Then V's initial allocation is max(min_traffic, initial_cap)=1% And during safe-start, V's allocation never exceeds min(safe_start_max, max_traffic) And other variants maintain their min_traffic and total allocation remains 100% ± 0.1pp
Decision Logging and Explainability
Given an allocation or ramp decision affecting V occurs When the decision is committed Then within 60 seconds an audit record is available via API/UI containing: timestamp, variant_id, catalog_id, allocation_before, allocation_after, posterior_mean, 95% CI bounds, expected_loss, policy parameters, gate/trigger reason, actor="auto" And audit records are retained for ≥ 90 days and exportable as CSV and JSON
Allocator Control Panel & Reporting
"As a user, I want a clear dashboard to configure the allocator and see which styles are winning so that I can understand results and make quick adjustments."
Description

Delivers a self‑serve UI inside PixelLift to configure experiments and monitor outcomes. Users can select the objective metric, set per‑variant min/max traffic and exploration rates, assign products or catalogs, and pause/disable variants. The dashboard visualizes current allocations, performance trends, lift estimates with uncertainty, and expected loss. Provides CSV export and alerting (email/Slack) for significant changes, reaching guardrails, or automatic deactivations.

Acceptance Criteria
Objective Metric Selection & Validation
- The Objective dropdown lists at least: Click-through rate (CTR), Conversion rate (CVR), Revenue per view (RPV), and Average order value (AOV).
- Only one objective can be active per experiment at a time; attempting to select multiple is blocked with a visible error.
- Changing the objective prompts confirmation and, upon confirm, refreshes all charts, tables, CSV export, and alerts to the new objective within 5 minutes.
- When the selected objective has < 100 impressions for any variant, the UI flags "Insufficient data" for that variant and suppresses lift and expected-loss calculations until the threshold is met.
- CSV export and alerts use the currently selected objective and match on-screen values within 0.1%.
Per-Variant Min/Max Traffic & Exploration Rate Constraints
- User can set per-variant Min% and Max% in [0, 100] with up to one decimal place; Min% ≤ Max% enforced at save.
- Sum of all Min% across variants must be ≤ 100%; save is blocked with an inline error if violated.
- Exploration rate is configurable at the experiment level in [0, 50]% and is applied proportionally to eligible variants.
- New variants can be given a Safe Start Max% cap (default 10%) for the first 1,000 impressions or 24 hours, whichever comes first; allocator does not allocate above the cap until the threshold is reached.
- Realized allocation per variant remains within ±2 percentage points of configured Min/Max over any rolling window of 1,000 impressions, unless the variant is paused or under safe-start constraints.
Product/Catalog Assignment & Conflict Detection
- Users can assign individual products and/or whole catalogs to an experiment via searchable, filterable multi-select.
- Saving assignments creates an inclusion list; duplicates are de-duplicated; an optional exclusion list takes precedence over inclusions.
- If any selected product is already in another active allocator experiment, the UI surfaces a conflict warning listing the conflicting experiment; user must choose Cancel, Replace, or Proceed with only non-conflicting items.
- Assignment changes are recorded in an audit log with timestamp, user, and before/after counts and take effect in the allocator within 5 minutes.
Pause/Disable Variants with Immediate Traffic Stop
- Clicking Pause on a variant sets its target allocation to 0% immediately; no new traffic is sent to the variant within 2 minutes.
- A visible "Paused" badge appears; state persists across sessions and page reloads.
- Unpausing restores the variant's previous Min/Max settings and eligibility in the allocator within 2 minutes.
- Automatic deactivation sets the variant to Disabled, records an audit entry, and triggers an alert.
- CSV export and dashboard annotate paused/disabled states and exclude paused time from future allocation calculations.
Performance Visualization: Allocation, Lift, Uncertainty, Expected Loss
- Dashboard displays current allocation per variant, time-series of the selected objective, and lift vs control with 95% intervals for each variant.
- Expected loss per variant is displayed (per 1,000 views) and updates with the latest data.
- A data freshness indicator shows the last updated timestamp and is never older than 10 minutes during active experiments.
- Users can change the date range (Last 24h, 7d, 30d, Custom); charts and tables update within 3 seconds for datasets up to 200k rows.
- Hover tooltips reveal metric definitions and interval interpretation; values match CSV export within 0.1%.
CSV Export Completeness & Data Consistency
- Export includes columns: experiment_id, experiment_name, date, variant_id, variant_name, allocated_share, impressions, clicks, conversions, revenue, objective_value, lower_ci, upper_ci, lift_vs_control, expected_loss, paused, disabled, guardrail_status.
- Export respects current filters (objective, date range, product/catalog scope) and matches on-screen aggregates within 0.1%.
- For result sets ≤ 200k rows, download completes within 30 seconds; larger requests switch to async export with an emailed link within 15 minutes.
- CSV uses ISO-8601 UTC timestamps; numeric fields use dot decimal separator; headers are present and align with documented schema.
Alerting (Email/Slack) for Significant Changes, Guardrails, Auto-Deactivations
- Users can enable alerts per experiment for Email and/or Slack; Slack webhook URL is validated via a test message.
- Alerts trigger when: (a) a variant reaches ≥95% probability of being best on the objective, (b) expected loss exceeds a configured guardrail, or (c) a variant is automatically deactivated.
- Alert payload includes experiment/variant names and IDs, objective, current estimate with interval, trigger condition, timestamp, and a deep link to the dashboard.
- Alerts are deduplicated to at most one per condition per experiment per 6 hours and are delivered within 2 minutes of trigger.
- Users can unsubscribe from alerts without impacting experiment execution.
Edge Decision API & CDN Integration
"As a developer, I want a fast, reliable API that tells me which image style to serve so that pages stay performant and users see a consistent variant across their session."
Description

Exposes a low‑latency allocation API that selects the image variant for a given request using current bandit weights and constraints. Supports sticky assignments by user/session, deterministic bucketing for cacheability, and fallbacks to fixed A/B splits when the allocator is unavailable. Integrates with CDN edge logic via SDKs to keep p95 decision latency under 50 ms and encodes allocation versioning in cache keys to prevent stale mixes after rebalances. Ensures high availability with circuit breakers and idempotent decision endpoints.
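For the cacheable no-sticky-key path, deterministic bucketing plus versioned cache keys can be sketched as below. Note the contrast with sticky assignment, which deliberately would not hash the allocation_version, so weight changes do not move already-assigned users. All names here are hypothetical:

```python
import hashlib

def decide(bucketing_seed, allocation_version, allocation):
    """Pick a variant deterministically from the current weights.
    allocation: list of (variant_id, weight) with weights summing to 1.
    Hashing the allocation_version re-buckets requests only when weights
    actually change; the version also keys CDN cache entries, which is
    what prevents stale variant mixes after a rebalance."""
    digest = hashlib.sha256(f"{allocation_version}:{bucketing_seed}".encode()).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    cumulative = 0.0
    for variant_id, weight in allocation:
        cumulative += weight
        if point < cumulative:
            break
    cache_key = f"img:{variant_id}:v{allocation_version}"
    return variant_id, cache_key

alloc = [("no-shadow", 0.7), ("shadow", 0.3)]
v1, k1 = decide("sess-42", 7, alloc)
v2, k2 = decide("sess-42", 7, alloc)
assert (v1, k1) == (v2, k2)  # identical inputs, identical decision and cache key
```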

Acceptance Criteria
Edge SDK + CDN p95 Decision Latency ≤ 50 ms
Given CDN edge runtime has the PixelLift SDK integrated and allocator is healthy And 10,000 consecutive decision requests per POP are executed under representative load When the SDK requests a decision for each incoming image variant request Then the measured p95 end-to-end decision latency per POP is ≤ 50 ms And the measured p50 latency per POP is ≤ 15 ms And each response includes variant_id, allocation_version, and mode="normal"
Sticky Assignment by user/session key
Given a sticky_key (user_id or session_id) is provided with the request And the allocation_version remains unchanged When 100 decision requests are made over 24 hours with the same sticky_key and identical context Then all decisions return the same variant_id And changes to bandit weights do not change the assigned variant_id And if the assigned variant is disabled, the next decision reassigns deterministically and remains sticky thereafter
Deterministic Bucketing for Cacheability (no sticky key)
Given no sticky_key is provided And the SDK derives a stable bucketing_seed from defined request attributes When 1,000 repeated decisions are requested with identical bucketing inputs Then the same variant_id is returned for all requests while allocation_version is unchanged And the CDN cache key constructed includes allocation_version and variant_id components And cache hit ratio for repeated identical requests is ≥ 95% during the test window
Allocation Versioning Prevents Stale Mix After Rebalance
Given CDN decision responses are cached using keys that contain allocation_version And the allocator updates bandit weights and increments allocation_version When new decisions are requested after the rebalance Then responses include the new allocation_version And the CDN stores entries under the new versioned keys And no responses with the old allocation_version are served for new decision requests And assets cached under the old allocation_version remain retrievable by their key but are not selected for new decisions
Allocator Unavailable: Circuit Breaker + Fixed Split Fallback
Given the allocator returns ≥ 20% 5xx or timeouts ≥ 300 ms for 30 seconds within a POP When the SDK detects the threshold breach Then the circuit breaker opens for that POP for 5 minutes And decisions are served using the configured fixed split (e.g., 50/50) with deterministic bucketing And responses include mode="fallback" and a failure reason And p95 decision latency remains ≤ 20 ms during fallback And when health recovers below thresholds, the breaker half-opens then closes, resuming bandit mode
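The per-POP breaker above could be modeled as a sliding failure window that opens once the failure rate crosses the threshold; this is a simplified sketch (no half-open probe state), and the class name and defaults are assumptions for illustration:

```python
class DecisionCircuitBreaker:
    """Open after sustained allocator failures; serve a fixed split meanwhile."""

    def __init__(self, failure_threshold: float = 0.2,
                 window_s: float = 30.0, cooldown_s: float = 300.0):
        self.failure_threshold = failure_threshold
        self.window_s = window_s
        self.cooldown_s = cooldown_s
        self.samples: list[tuple[float, bool]] = []  # (timestamp, ok) pairs
        self.opened_at: float | None = None

    def record(self, ok: bool, now: float) -> None:
        """Record one allocator call outcome and re-evaluate the window."""
        self.samples.append((now, ok))
        self.samples = [(t, o) for t, o in self.samples
                        if now - t <= self.window_s]
        failures = sum(1 for _, o in self.samples if not o)
        if self.samples and failures / len(self.samples) >= self.failure_threshold:
            self.opened_at = now

    def mode(self, now: float) -> str:
        """'fallback' while the breaker is open; 'normal' after cooldown."""
        if self.opened_at is not None and now - self.opened_at < self.cooldown_s:
            return "fallback"  # serve fixed split with deterministic bucketing
        return "normal"
```

In fallback mode, decisions would come from the same deterministic bucketing used for cacheability, just with static 50/50 weights instead of learned ones.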
Idempotent Decision Endpoint with Retries
Given a client sends a decision request with Idempotency-Key="abc123" and identical payload When the same request is retried 5 times due to network errors within 10 minutes Then all successful responses return the same decision_id and variant_id And the server returns HTTP 200 for subsequent identical requests with the same key and payload And no duplicate side effects are recorded for the idempotency key And a different payload with the same Idempotency-Key is rejected with HTTP 409 Conflict
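The idempotency contract above — identical key and payload replays the stored decision, a reused key with a different payload gets 409 — can be sketched as a small key store. Names and the in-memory dict are illustrative:

```python
import hashlib
import json

class IdempotencyStore:
    """Replay-safe decision endpoint: same key+payload -> same decision;
    same key with a different payload -> 409 Conflict."""

    def __init__(self):
        self._records: dict[str, tuple[str, dict]] = {}  # key -> (payload_hash, decision)

    @staticmethod
    def _hash(payload: dict) -> str:
        # Canonical JSON so key ordering in the payload does not matter.
        return hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode()).hexdigest()

    def decide(self, key: str, payload: dict, make_decision):
        payload_hash = self._hash(payload)
        if key in self._records:
            stored_hash, decision = self._records[key]
            if stored_hash != payload_hash:
                return 409, {"error": "idempotency key reused with different payload"}
            return 200, decision  # replay: no new side effects
        decision = make_decision(payload)
        self._records[key] = (payload_hash, decision)
        return 200, decision
```

Retries therefore invoke the decision logic exactly once per key, which is what keeps exposure counts and logs free of duplicates.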
Min/Max Traffic Constraints and Safe-Start Enforcement
Given variant A has min=10% max=60%, variant B has min=10% max=60%, and variant C has safe_start=5% for the first 10,000 decisions And bandit weights initially favor variant C When 100,000 decisions are executed in bandit mode Then realized traffic share for A stays within [10%, 60%] And realized traffic share for B stays within [10%, 60%] And variant C does not exceed 5% share until 10,000 decisions have been served And after the safe-start threshold, variant C's share may increase per learned weights but never exceeds any configured max And attempts to apply incompatible constraints are rejected with a validation error
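One plausible way to enforce the min/max traffic constraints is to project the learned bandit weights onto the configured boxes: pin any violated bound, then renormalize the remaining free mass. This sketch assumes feasible constraints and strictly positive weights:

```python
def project_weights(weights: dict[str, float],
                    bounds: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Project bandit weights onto per-variant [min, max] boxes so the
    result sums to 1 and every share respects its bounds."""
    fixed: dict[str, float] = {}
    free = dict(weights)
    while True:
        total_fixed = sum(fixed.values())
        total_free = sum(free.values())
        changed = False
        for vid in list(free):
            lo, hi = bounds[vid]
            # Share this variant would get if the free mass were renormalized.
            scaled = free[vid] / total_free * (1 - total_fixed)
            if scaled < lo:
                fixed[vid] = lo; del free[vid]; changed = True
            elif scaled > hi:
                fixed[vid] = hi; del free[vid]; changed = True
        if not changed:
            break
    total_fixed = sum(fixed.values())
    total_free = sum(free.values())
    out = dict(fixed)
    for vid, w in free.items():
        out[vid] = w / total_free * (1 - total_fixed)
    return out
```

A safe-start cap like variant C's 5% would simply tighten that variant's max bound until its decision count crosses the threshold.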
Auditability, Guardrails, and Rollback
"As a business owner, I want guardrails and transparent logs with instant rollback so that I can quickly mitigate risk and verify that decisions are helping my KPIs."
Description

Captures immutable logs of allocation decisions, model parameters, variant exposures, and observed outcomes to enable audits and what‑if analyses. Enforces configurable guardrails such as maximum expected loss, floor performance vs. baseline, and per‑day exposure limits, triggering automatic rollbacks or pauses when violated. Provides one‑click rollback to a safe baseline, incident notifications, and data export to the analytics warehouse for independent verification.

Acceptance Criteria
Immutable Logging of Allocation Decisions and Outcomes
- Given the Smart Allocator makes an allocation decision, When a request is processed, Then an append-only log entry is written with fields: timestamp_utc, request_id, experiment_id, variant_id, allocation_probability, chosen_variant_id, model_version, model_parameters_hash, prev_entry_hash, user_session_hash, and reason_codes.
- Given the log store is configured as write-once, When an update or delete is attempted on any log entry, Then the operation is rejected and the attempt is itself logged with actor_id and reason.
- Given 1,000 synthetic allocation events are generated, When logs are validated, Then 100% of events have exactly one decision entry and one exposure entry, and at least 95% have an outcome entry when an outcome event was emitted.
- Given a 24-hour window of logs, When hash chaining is verified, Then no broken links are found and the head hash matches the published daily digest.
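The hash-chaining requirement can be sketched as follows — each entry commits to its predecessor's hash, so any tampering breaks the chain from that point forward. Function names and the "genesis" sentinel are illustrative assumptions:

```python
import hashlib
import json

def append_entry(log: list, entry: dict) -> dict:
    """Append-only log with hash chaining: each entry embeds the previous
    entry's hash before being hashed itself."""
    prev_hash = log[-1]["entry_hash"] if log else "genesis"
    record = dict(entry, prev_entry_hash=prev_hash)
    record["entry_hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()).hexdigest()
    log.append(record)
    return record

def verify_chain(log: list) -> bool:
    """Recompute every hash and link; any edit or reordering fails."""
    prev = "genesis"
    for record in log:
        if record["prev_entry_hash"] != prev:
            return False
        body = {k: v for k, v in record.items() if k != "entry_hash"}
        expected = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if expected != record["entry_hash"]:
            return False
        prev = record["entry_hash"]
    return True
```

Publishing the final `entry_hash` as a daily digest lets an external auditor verify the whole day's log without trusting the log store.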
Guardrail: Maximum Expected Loss vs Baseline Auto-Pause
- Given a baseline variant is configured, When any non-baseline variant’s posterior expected revenue-per-impression loss exceeds the configured threshold (default 2.0%) with posterior probability >= 95% and sample size >= 500 exposures, Then that variant is automatically paused within 2 minutes and receives 0 additional traffic until manually resumed.
- Given a variant is auto-paused by this guardrail, When the allocator serves traffic thereafter, Then the paused variant’s observed share of traffic is 0.0% over any 15-minute measurement window.
- Given the guardrail is triggered, When logs are queried, Then an incident record exists with fields: incident_id, rule_id, variant_id, threshold, observed_value, confidence, action_taken, and timestamps.
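The auto-pause test above could be computed by Monte Carlo over the posteriors. As a simplification, this sketch uses Beta posteriors over conversion rate as a stand-in for the revenue-per-impression model the spec implies; thresholds and function names are illustrative:

```python
import random

def expected_loss_vs_baseline(base_conv: int, base_n: int,
                              var_conv: int, var_n: int,
                              draws: int = 20000, seed: int = 7):
    """Monte Carlo estimate of (a) the variant's expected relative loss vs
    baseline and (b) the posterior probability the variant is worse, using
    Beta(1+successes, 1+failures) posteriors."""
    rng = random.Random(seed)
    total_loss, worse = 0.0, 0
    for _ in range(draws):
        b = rng.betavariate(1 + base_conv, 1 + base_n - base_conv)
        v = rng.betavariate(1 + var_conv, 1 + var_n - var_conv)
        total_loss += max(0.0, (b - v) / b)  # relative loss, floored at 0
        if v < b:
            worse += 1
    return total_loss / draws, worse / draws

def should_auto_pause(exp_loss: float, p_worse: float, var_n: int,
                      threshold: float = 0.02, confidence: float = 0.95,
                      min_n: int = 500) -> bool:
    """Pause only when loss exceeds the threshold with high posterior
    probability and enough exposures, matching the guardrail defaults."""
    return var_n >= min_n and exp_loss > threshold and p_worse >= confidence
```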
Guardrail: Daily Exposure Limits Per Variant
- Given a per-day exposure cap is set (e.g., 10% of daily traffic or 5,000 exposures, whichever is lower), When a variant reaches its cap, Then the allocator routes 0 additional traffic to that variant until the cap resets at 00:00:00 UTC.
- Given a cap reset time is reached, When the first request after reset arrives, Then the allocator is permitted to allocate traffic to the previously capped variant subject to other guardrails.
- Given a variant hits its exposure cap, When logs are inspected, Then a cap_reached event exists with fields: experiment_id, variant_id, cap_type, cap_value, observed_value, reset_at_utc.
One-Click Rollback to Safe Baseline
- Given an experiment is active, When a user triggers “Rollback to Baseline” via UI or API, Then the allocator routes >= 95% of traffic to the designated baseline within 60 seconds and 100% within 5 minutes.
- Given rollback is triggered, When the next allocation decision is logged, Then a rollback event is present with fields: experiment_id, trigger_actor (user_id/api_key), reason, pre_rollback_distribution, post_rollback_distribution, and timestamps.
- Given rollback is in effect, When the system receives allocation requests, Then no traffic is sent to non-baseline variants until the experiment is explicitly re-enabled.
Incident Notifications and Deduplication
- Given any guardrail triggers an auto-pause or rollback, When the incident is created, Then notifications are delivered to all configured channels (email, Slack, webhook) within 60 seconds and include: incident_id, experiment_id, affected_variants, rule_id, threshold, observed_value, action_taken, and a deep link to the audit log.
- Given multiple identical guardrail triggers occur within a 10-minute window for the same experiment and rule, When notifications are generated, Then only one notification is sent and subsequent events are appended to the existing incident timeline accessible via the link.
- Given a notification webhook endpoint returns a non-2xx response, When retries are attempted, Then exponential backoff is applied for up to 6 hours with a maximum of 10 attempts and outcomes are logged.
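The webhook retry policy (exponential backoff, at most 10 attempts within 6 hours) can be sketched as a precomputed delay schedule; the 30-second base and doubling factor are assumptions, since the spec fixes only the attempt and horizon caps:

```python
def retry_schedule(base_s: float = 30.0, factor: float = 2.0,
                   max_attempts: int = 10,
                   horizon_s: float = 6 * 3600) -> list[float]:
    """Exponential backoff delays for webhook redelivery, capped at
    max_attempts and a fixed total time horizon."""
    delays: list[float] = []
    elapsed = 0.0
    for attempt in range(max_attempts):
        delay = base_s * factor ** attempt
        if elapsed + delay > horizon_s:
            break  # next retry would exceed the 6-hour window
        delays.append(delay)
        elapsed += delay
    return delays
```

A production sender would also add jitter to each delay so many failing endpoints do not retry in lockstep.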
Deterministic Replay for Audit and What-If
- Given a time window and seed are provided, When a replay job is executed using recorded logs and model version V, Then the allocator reproduces historical allocation probabilities and chosen variants exactly as recorded for 99.9% of decisions; any mismatches are enumerated with request_ids and reasons.
- Given alternate model parameters are supplied, When a what-if replay is run, Then a new read-only audit run with run_id is created, results are stored without altering production logs, and diffs versus baseline replay are available by request_id.
- Given a replay of 1,000,000 decisions is requested, When the job runs on the standard replay worker, Then it completes within 10 minutes using <= 4 GB peak memory and produces outputs partitioned by day/hour.
Analytics Warehouse Export for Independent Verification
- Given daily and hourly exports are enabled, When the 01:00 UTC export job runs, Then decision, exposure, and outcome tables are delivered to the warehouse with partition keys (date_hour_utc) and schemas matching the contract (versioned), with row counts matching on-platform logs within ±0.1%.
- Given an export partition is missing or corrupted, When a re-export is requested for a time range, Then the job backfills only the requested partitions idempotently, producing identical file checksums to previous successful runs.
- Given PII policies, When data is exported, Then no raw user identifiers are present; only salted, rotating hashes are exported and the salt rotation schedule is logged.
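The salted, rotating hash requirement might look like the sketch below: a per-rotation-period salt keeps exported identifiers stable within a period but unjoinable across periods. The per-day salt map and function name are assumptions for illustration:

```python
import datetime
import hashlib

def pseudonymize_user(user_id: str, salts: dict[str, str],
                      day: datetime.date) -> str:
    """Export-safe identifier: salted hash with a rotating (here: per-day)
    salt, so raw IDs never leave the platform and hashes cannot be joined
    across rotation periods."""
    salt = salts[day.isoformat()]  # salts provisioned per rotation period
    return hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
```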

Auto Promote

When a style variant reaches statistical confidence, PixelLift can automatically set it as the default for the product, collection, or supplier fingerprint. It updates Shopify metafields, archives losing variants, and can backfill future batches with the winning preset—no manual follow‑up. Rollback and version notes keep changes safe and auditable.

Requirements

Confidence Threshold Engine
"As a merchandising manager, I want winning variants detected only when they reach statistical confidence so that auto-promotion happens on trustworthy results."
Description

Compute statistical confidence for competing style variants and determine promotability at product, collection, and supplier-fingerprint scopes. Supports configurable significance level, minimum sample size, time windows, and hysteresis to prevent flip-flopping. Aggregates outcome metrics from connected commerce data (e.g., impressions, CTR, add-to-cart, conversion, revenue per view) and weights them by freshness. Provides real-time incremental updates via background jobs and emits a promotable event when criteria are met. Exposes per-scope rules and guardrails and logs inputs and decisions for transparency.
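The freshness weighting described above can be sketched as an exponentially decayed weighted mean, where an observation's weight halves every `half_life_days`; the function name and observation shape are illustrative assumptions:

```python
def weighted_metric(observations: list[tuple[float, float]],
                    now_ts: float, half_life_days: float = 3.0) -> float:
    """Freshness-weighted metric (e.g., revenue per view): each observation's
    weight decays exponentially with age, so recent outcomes dominate."""
    num = den = 0.0
    for value, ts in observations:  # (metric_value, unix_timestamp)
        age_days = (now_ts - ts) / 86400
        w = 0.5 ** (age_days / half_life_days)  # halves every half_life_days
        num += w * value
        den += w
    return num / den if den else 0.0
```

With `half_life=3d`, a week-old outcome carries roughly a fifth of the weight of a fresh one, which is what lets the engine track shifting buyer behavior without flip-flopping on stale data.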

Acceptance Criteria
Product-level promotion at 95% significance with 2% hysteresis
Given a product with at least two style variants and rules: significance_level=0.95, min_impressions=200 per variant, time_window=7d, hysteresis_margin=0.02, objective=RevenuePerView, and half_life=3d And variant metrics are aggregated within the window using freshness weights When the weighted RevenuePerView of a single variant shows uplift over the next-best variant with p_value <= 0.05 and uplift >= 0.02 for a continuous duration >= 24h Then the engine marks that variant as promotable at scope=product and emits a promotable event with payload including: scope, product_id, winning_variant_id, losing_variant_ids, rule_version, significance_level, sample_sizes, metric_summary, time_window, hysteresis_margin, timestamp, and dedupe_key
Collection-level aggregation with 14d window and recency weighting
Given a collection containing >= 10 SKUs each with >= 2 style variants and rules: significance_level=0.90, min_impressions=5000 total across SKUs, time_window=14d, half_life=3d And per-variant metrics are aggregated as a weighted mean of SKU-level outcomes using recency weights and SKU weights proportional to impressions When one style variant outperforms all others on weighted RevenuePerView with p_value <= 0.10 and uplift >= 0.03 and at least 5 distinct SKUs contribute >= 80% of total weight Then the engine marks the variant as promotable at scope=collection and emits a promotable event with payload: scope=collection, collection_id, winning_variant_id, contributing_skus_count, totals, rule_version, and dedupe_key
Supplier-fingerprint guardrails and dominance cap
Given a supplier_fingerprint scope with rules: min_distinct_products=20, dominance_cap=0.25 per SKU, min_collections=3, min_impressions=20000, time_window=30d When computing weighted metrics, no single SKU contributes more than 25% of the total weight and at least 20 distinct products across >= 3 collections contribute And all guardrails are satisfied Then the engine may evaluate promotability; otherwise, it emits no promotable event and logs a decision with status="guardrail_blocked" including the violated guardrail and contributors
Real-time incremental updates and idempotent promotable events
Given background ingestion up to 1000 outcome events per minute When new data arrives for any evaluated scope Then the engine updates aggregates within 60s and, if thresholds are crossed, emits a promotable event within 5s of evaluation And duplicate emissions for the same candidate (scope+winning_variant_id+rule_version) are prevented via a stable dedupe_key such that no more than one event is emitted per 24h unless the candidate first drops below threshold by > hysteresis_margin and later requalifies And reprocessing the same input messages yields identical aggregates and decisions (idempotent) And transient processing failures are retried up to 5 times with exponential backoff without producing duplicate promotable events
Decision logging, audit trail, and 90-day retention
Given any promotability evaluation run When the engine decides "promotable", "not_promotable", or "guardrail_blocked" Then it persists an audit record including: trace_id, scope, entity_ids, rule_version, rule_params, input_metric_snapshots with timestamps, weights, sample_sizes, statistical results (p_value, effect_size, confidence_interval), decision, reason, evaluator_version, and source_event_ids And audit records are queryable via Admin API by trace_id and time range, returned as JSON, redacting any customer PII, and retained for >= 90 days And each decision log correlates to any emitted event via dedupe_key and includes a reproducible decision hash
Rules API with per-scope overrides, precedence, and validations
Given Admin role credentials When GET /v1/promote/rules/effective?scope={product|collection|supplier}&id={entity_id} is called Then the API returns 200 with the effective rule including precedence resolution (product > collection > supplier > global), rule_version, and parameters: significance_level, min_sample, time_window, hysteresis, freshness_weighting And when PUT /v1/promote/rules/overrides is called with a valid payload by Admin, the API validates: significance_level in [0.80, 0.999], min_sample >= 50, time_window in [1d, 90d], hysteresis in [0.00, 0.10], half_life in [1d, 30d]; on success returns 200 with incremented rule_version that takes effect within 60s And non-admin callers receive 403 on mutation endpoints, 200 on GET if authorized, and unknown IDs return 404
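The precedence resolution the Rules API describes (product > collection > supplier > global) can be sketched as a layered merge, applying overrides from least to most specific; names are illustrative:

```python
PRECEDENCE = ["product", "collection", "supplier", "global"]

def effective_rule(overrides: dict[str, dict], defaults: dict) -> dict:
    """Resolve the effective promotion rule: for each parameter, the most
    specific scope that defines it wins."""
    rule = dict(defaults)
    for scope in reversed(PRECEDENCE):  # apply least -> most specific
        rule.update(overrides.get(scope, {}))
    return rule
```

Per-parameter merging means a product-level override of `significance_level` does not discard a collection-level `time_window` override; each parameter resolves independently.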
Scoped Auto-Promotion
"As a store owner, I want the system to auto-promote the best-performing style at the right scope so that my catalog stays consistent without manual upkeep."
Description

Automatically set the winning variant as the default at the configured scope (product, collection, or supplier fingerprint) once promotable. Applies precedence rules and overrides, then updates PixelLift’s internal default style and associated Shopify references. Ensures idempotent, concurrency-safe promotions with conflict resolution when multiple scopes apply. Provides configurable cooldowns, manual locks to prevent auto-changes on sensitive items, and feature flags for staged rollout.

Acceptance Criteria
Auto-Promote on Statistical Confidence at Configured Scope
Given a scope S in {product, collection, supplier_fingerprint} with auto-promotion feature flag enabled And no manual lock exists on S And the leading style variant V at scope S has confidence >= configured_threshold and impressions >= configured_min_sample And the cooldown window for S is not active When the promotion job executes Then V is set as the default style for S in PixelLift And an audit entry is created with version note capturing S, V, confidence, impressions, and actor=system And the operation returns status=promoted And if any precondition is not met, no changes are made and status reflects the blocking reason
Precedence and Explicit Override Resolution Across Scopes
Given a precedence configuration P that orders scopes by priority (e.g., product > collection > supplier_fingerprint) And promotable winners exist at multiple scopes that could apply to the same product imagery And explicit override defaults may be set at any scope When the promotion job evaluates winners Then the winner applied to each target follows P deterministically, with the highest-priority applicable scope taking effect And any lower-priority scope is prevented from overwriting a higher-priority default And an explicit override at any scope blocks automatic changes from lower-priority scopes And the audit log records the chosen scope, suppressed scopes, and the reason per P
Idempotent Promotion Operation
Given the same promotable winner V at scope S is processed multiple times due to retries or duplicate events When the promotion job runs repeatedly with identical inputs Then exactly one persisted version change is recorded for S And duplicate executions are no-ops (status=idempotent) without creating extra audit entries, notifications, or external API calls And the default style for S remains V with the same version identifier
Concurrency-Safe Promotions with Deterministic Conflict Resolution
Given two or more promotion workers attempt to promote competing variants for the same scope S concurrently When the promotions execute under contention Then only one transaction commits the promotion for S, and others fail gracefully with status=conflict_resolved or status=stale And the final default for S is consistent across cache, database, and outbound references within eventual consistency SLA And no partial state exists (either fully promoted or unchanged), verified by atomic write or compare-and-swap semantics
Shopify Sync and Variant Archival on Promotion
Given a successful promotion of variant V at scope S When external synchronization runs Then configured Shopify metafields referencing the default style/preset are updated atomically (all-or-nothing) And losing style variants for S are marked archived in PixelLift and hidden from API/UI selection And PixelLift internal default_style_id for S reflects V And if any Shopify update fails, the promotion is rolled back and prior default restored with an audit note including failure codes
Backfill Future Batches with Winning Preset
Given variant V is the default for scope S after promotion And a new batch of images is uploaded whose targets resolve to S When batch processing starts Then V’s preset is auto-applied to 100% of eligible images in the batch unless an explicit per-item override is set And the job report lists count_applied, count_skipped_with_override, and any errors And disabling backfill at S prevents auto-application on subsequent batches
Cooldown Enforcement and Manual Locks
Given scope S has a recorded promotion at time T0 with cooldown_duration configured When an auto-promotion condition is met for S before T0 + cooldown_duration Then no promotion occurs and status=cooldown_active with an audit entry And when a manual lock is set on S, any auto-promotion attempts are skipped with status=locked and an audit entry And removing the lock and waiting until after cooldown allows the next eligible promotion to proceed
Shopify Metafield Sync
"As a technical operations admin, I want reliable, rate-limit-safe metafield updates so that catalog defaults stay accurate across Shopify without sync errors."
Description

Synchronize default style and variant state to Shopify by writing to designated metafields and related product attributes. Implements OAuth scopes, rate-limit aware batching, retries with exponential backoff, and transactional behavior with partial failure recovery. Subscribes to relevant webhooks to detect external changes and reconcile state. Provides a dry-run mode and validation to ensure metafield schemas and namespace keys remain consistent across stores and environments.

Acceptance Criteria
Default Style & Variant State Metafields Write
Given a Shopify product mapped to a PixelLift product with a winning style variant and configured namespace/keys And a valid Shopify access token When Auto Promote triggers a sync for that product Then the designated metafields (namespace/key mapping) are created or updated to reflect the default style and variant state And any configured related product attributes for default presentation are updated And the write completes within 2 minutes of trigger And a read-after-write GET returns the updated values And no unrelated metafields or attributes are modified And the operation is idempotent: re-running the same sync results in 0 additional changes
OAuth Scopes & Token Management
Given PixelLift is installed on a Shopify store When requesting OAuth permissions Then only the minimal required scopes to read/write product metafields and related product attributes are requested And installed scopes are verified before any sync; if missing, the sync is blocked with a clear error message And access tokens are stored encrypted at rest and never logged And token expiry or revocation is detected and results in a retriable auth flow without data loss And app uninstall events trigger token and store data purge within 5 minutes
Rate-Limit Aware Batching with Exponential Backoff & Idempotency
Given a batch of 1,000 products to synchronize When the sync runs Then requests are batched and throttled using Shopify API call-limit headers so that 0 requests permanently fail due to 429 And transient 5xx/429/network timeouts are retried up to N attempts with exponential backoff and jitter And each write uses an idempotency key so retries do not create duplicate metafields or conflicting values And overall throughput meets or exceeds 300 products per minute under normal conditions (configurable per store) And progress is checkpointed so a restart resumes without reprocessing completed items
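Rate-limit-aware throttling could be driven directly off Shopify's REST call-limit header (`X-Shopify-Shop-Api-Call-Limit`, e.g. `32/40`). The 2-calls-per-second leak rate and the headroom value below are assumptions for this sketch; real code would read the header from each response:

```python
def throttle_delay(call_limit_header: str,
                   leak_rate_per_s: float = 2.0,
                   headroom: int = 4) -> float:
    """Seconds to wait before the next request, based on the 'used/capacity'
    call-limit header of a leaky-bucket rate limit."""
    used, capacity = (int(x) for x in call_limit_header.split("/"))
    available = capacity - used
    if available > headroom:
        return 0.0  # plenty of bucket left; send immediately
    # Wait long enough for the bucket to leak back above the headroom.
    return (headroom - available + 1) / leak_rate_per_s
```

Keeping a few calls of headroom (rather than draining the bucket to zero) is what lets the sync claim "0 requests permanently fail due to 429" even when webhooks and other jobs share the same bucket.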
Per-Product Transactional Consistency and Partial Failure Recovery
Given multiple writes are required per product (metafields and related attributes) When an error occurs after one or more writes for a product Then all changes for that product are rolled back to their pre-sync values via compensating updates And the product is marked for retry with captured error details And other unaffected products in the batch commit successfully And no product is left with only a subset of intended fields updated
Webhook Subscriptions & Drift Reconciliation
Given subscriptions exist for products/update, metafields/update, and metafields/delete webhooks When an external change alters a synced metafield or related product attribute Then the webhook is HMAC-verified, de-duplicated, and processed within 2 minutes And a reconciliation job compares Shopify state to PixelLift’s source of truth And, if Auto Promote is active, the expected values are re-applied; otherwise internal state is updated per configuration And an audit entry records before/after values and actor=external
Dry-Run Mode: No-Write Guarantee and Diff Reporting
Given dry-run mode is enabled for a specified scope (product, collection, or supplier fingerprint) When a sync is executed Then no write requests are sent to Shopify (0 mutations) And a diff report is produced listing intended changes by product and metafield/attribute with old and new values and counts And the report is downloadable and includes a checksum and timestamp And running a real sync afterwards yields changes that match the diff report
Cross-Store Metafield Schema & Namespace Validation
Given multiple stores/environments with configured metafield namespace/key mappings When validation runs before any write Then each target namespace/key exists or is provisioned according to the expected type And any mismatch in type or missing key fails the sync for affected items with actionable errors And a cross-store summary report shows 100% schema parity or enumerates discrepancies And namespace/key mappings are environment-aware to prevent cross-store contamination
Variant Archival and Cleanup
"As a content editor, I want losing variants archived automatically so that my asset library stays clean without risking accidental deletions."
Description

Archive losing style variants after promotion while preserving referential integrity and the ability to restore. Hides deprecated variants from default views, tags them with outcome metadata, and prevents them from re-entering tests unless explicitly re-enabled. Implements retention policies, background cleanup of orphaned assets, and safeguards to avoid deleting assets referenced by live listings, drafts, or ongoing experiments.

Acceptance Criteria
Auto-Archive Losing Variants After Promotion
Given an experiment with at least two style variants on a product/collection/supplier scope and the confidence threshold is met When Auto Promote promotes the winning variant Then all non-winning variants for that scope are set to archived=true within 60 seconds and archived_at is recorded in ISO-8601 UTC And each archived variant records promotion_run_id and winning_variant_id And the active variant count for the scope equals exactly one winning variant And the API/UI returns the number of variants archived in the promotion response
Archive Behavior: Hiding and Metadata Tagging
Given a variant is archived When default views and active-variants API endpoints are requested Then the archived variant is excluded from results When filtering with status=archived or fetching by exact ID Then the archived variant is retrievable Then the archived variant is tagged with outcome=lost_experiment, experiment_id, confidence, archived_by (system|user), archived_reason, and scope (product|collection|supplier) And the UI label for the variant displays Archived
Prevent Re-Entry Into Experiments for Archived Variants
Given a variant is archived When new experiments are scheduled or future batches are backfilled Then the archived variant is not enrolled and not selected for backfill And test_eligible=false is set and enforced by the experiment enrollment service When a user explicitly re-enables the variant Then test_eligible=true is restored and the variant becomes eligible again And an audit event is recorded with actor, timestamp, and reason
Referential Integrity Safeguards During Cleanup
Given an archived variant has associated assets (renders, masks, source images) When the cleanup job evaluates assets for deletion Then any asset referenced by live listings, draft listings, or ongoing experiments is not deleted And a reference check is performed against listings, drafts, experiment records, and CDN usage indexes When deletion is blocked due to references Then the reason and referencing IDs are logged and exposed via an admin report
Retention Policy and Background Cleanup of Orphaned Assets
Given an archived variant and its assets are unreferenced across listings, drafts, and experiments When the retention period (configurable N days, default 30) has elapsed Then orphaned assets are soft-deleted and purged after an additional M days (default 7) And the cleanup runs as an idempotent scheduled job with per-run metrics (assets scanned, deleted, bytes reclaimed, failures) And each deletion emits an audit event with asset ID, variant ID, timestamp, and actor=system
Shopify Metafields Consistency After Promotion and Archival
Given Shopify integration is enabled When a winning style is promoted and losing variants are archived Then Shopify metafields are updated so defaultStyle points to the winning preset, archived variants are removed from activeStyles, and archived=true is set for losers where applicable And updates are applied in a single job; on partial failure the job rolls back changes and retries up to 3 times with exponential backoff And a read-after-write verification confirms metafield values; on mismatch the system alerts and leaves previous defaults unchanged
Restore Archived Variant with Rollback and Version Notes
Given an archived variant exists When a user initiates Restore and provides a version note Then the variant status changes to active within 60 seconds, archived flags are cleared, and version_note is stored And the restored variant remains test_eligible=false and excluded from experiments until explicitly re-enabled And a one-click rollback can revert to the pre-restore state, with both actions recorded in an immutable audit log (who, when, reason, run_id)
Backfill and Future Batch Inheritance
"As a catalog manager, I want future and eligible existing images to inherit the winning preset so that my listings stay visually consistent without extra work."
Description

When a winner is promoted, apply the winning preset to future uploads within the same scope and optionally reprocess existing items in bulk. Provides per-scope opt-in, scheduling windows to avoid peak hours, and progress tracking. Supports dependency checks (e.g., preset availability, model compatibility) and idempotent job enqueueing so backfills can be safely retried. Exposes controls to limit reprocessing by age, SKU, or supplier fingerprint.

Acceptance Criteria
Future Upload Inherits Winning Preset by Scope
Given Auto Promote is enabled and per-scope "Future uploads inherit winner" is ON for the Product/Collection/Supplier fingerprint And a style preset variant is promoted as the winner at time T with version V When a new batch of images is uploaded into that same scope after time T Then the system automatically applies preset version V to all images in that scope in the batch before processing begins And if multiple scopes match, the most specific scope takes precedence (Product > Collection > Supplier) And scopes with the opt-in OFF receive no automatic preset application And each processed item records the applied preset ID and version in its metadata
Scheduled Backfill of Existing Items Within Quiet Hours
Given a user selects "Backfill existing items" for a scope and configures a schedule window of 22:00–06:00 in the account timezone When the backfill job is created at any time Then the job start time is set to the next occurrence within the configured window And the job does not start outside the configured window And the UI shows a planned start timestamp and queue position for the job And if the window is modified before start, the planned start time recalculates accordingly
Backfill Progress Tracking and Controls
Given a backfill job is running Then the progress view displays counts for total, queued, processing, succeeded, skipped, and failed, plus percent complete And an ETA is calculated from recent throughput and refreshed at least every 30 seconds And the user can Pause, Resume, and Cancel the job And Pause stops new task dispatch within 60 seconds while allowing in-flight tasks to finish; Resume restarts dispatch within 60 seconds; Cancel prevents new tasks and leaves completed items unchanged And upon completion or cancel, a downloadable CSV report of item outcomes (ID, status, duration, error code/message if any) is available
Dependency Checks and Safe Execution
Given a backfill or inheritance job is prepared Then the system validates preset availability, model version compatibility, Shopify metafield write access, and asset accessibility before starting And if a hard dependency fails at job level, the job is marked Blocked and zero items are processed And if a dependency fails per-item, that item is marked Failed with a specific error code and message, and processing continues for other items And items with incompatible models are auto-excluded and listed as Skipped with reason
Idempotent Enqueue and Safe Retries
Given a backfill job is submitted for a scope with the same filters (age/SKU/supplier fingerprint) and the same winning preset version V within 24 hours When the job is enqueued or a Retry is requested Then the system deduplicates by scope+filters+preset version and does not enqueue duplicate tasks for items already Succeeded with V And only items in Failed or Not Attempted states are enqueued And rerunning the same request does not increase the count of processed items beyond the distinct eligible set And each item is processed at most once per preset version V
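One way to realize the dedup and retry-safety rules above is to derive a dedup key from scope + filters + preset version, and to enqueue only items not already succeeded with that version. A sketch with hypothetical names:

```python
# Sketch: idempotent enqueue via a content-derived dedup key.
import hashlib
import json

def dedup_key(scope: dict, filters: dict, preset_version: int) -> str:
    payload = json.dumps(
        {"scope": scope, "filters": filters, "preset_version": preset_version},
        sort_keys=True,  # key order must not change the dedup key
    )
    return hashlib.sha256(payload.encode()).hexdigest()

def eligible_items(items: list, preset_version: int) -> list:
    """Only Failed / Not Attempted items are enqueued; items already
    Succeeded with version V are skipped, so retries never double-process."""
    return [
        i for i in items
        if not (i["status"] == "succeeded"
                and i["preset_version"] == preset_version)
    ]
```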
Filter Controls for Reprocessing (Age, SKU, Supplier)
Given the user configures filters: maximum item age in days, SKU include/exclude lists or patterns, and supplier fingerprint(s) When the selection is previewed Then the UI shows the count of items matching the filters within ±1% of the final processed count And only items that match all active filters are included in the job And changing any filter updates the preview count within 2 seconds And the job uses the snapshot of eligible items at submission time
Audit Trail, Rollback, and Version Notes
"As a team lead, I want a full audit trail with quick rollback so that we can safely revert promotions and understand why decisions were made."
Description

Record every promotion decision with who/what/when, input metrics, thresholds used, and scope. Require version notes on changes and associate links to experiments. Offer one-click rollback to a prior default with automated re-sync to Shopify and restoration of archived variants as needed. Provide a diff view of before/after defaults, notify stakeholders on changes, and enforce permissions for promote/rollback actions. Retain immutable logs for compliance and troubleshooting.

Acceptance Criteria
Audit Log Entry on Auto-Promotion
Given AutoPromote promotes a style variant to default at product, collection, or supplier scope; When the decision executes; Then the system writes a single immutable audit entry within 2 seconds containing: action type (promote), actor (user ID or service ID), timestamp (UTC ISO-8601), scope (type and identifier), old default (variant ID and preset ID), new default (variant ID and preset ID), decision engine version, input metrics (sample sizes, conversion rates, uplift, confidence), thresholds applied, experiment ID/URL (if present), Shopify sync job ID and result, archived variant IDs, backfill plan IDs, and version notes. Given the audit entry is written; When queried via API or UI; Then the entry is retrievable and exactly matches the executed change, including all enumerated fields. Given a promotion fails after partial Shopify sync; When the entry is recorded; Then the audit entry status reflects partial failure with error codes and the change is marked non-final until reconciled.
Required Version Notes on Promote and Rollback
Given a user initiates promote or rollback; When the confirmation dialog opens; Then the Notes field is required (min 5, max 500 characters), plain text only, and the Confirm button remains disabled until valid. Given notes are submitted; When the action completes; Then the notes are stored in the audit entry and cannot be edited; any correction requires a new linked annotation entry. Given an automated promotion by the service account; When executed; Then a non-empty programmatic note is generated including a decision reason summary and experiment ID.
One-Click Rollback with Shopify Re-Sync
Given at least one prior default exists for the scope; When a user with permission clicks One-Click Rollback and selects a target audit entry; Then the system restores that entry’s default variant/preset, unarchives any variants archived by the reverted promotion, and re-archives variants that were made default after that entry, matching the selected snapshot. Given rollback runs; When Shopify metafields are re-synced; Then the targeted store reflects the prior default(s) within 5 minutes, and the sync result (success/partial/fail with error codes) is recorded in a rollback audit entry. Given a rollback is retried; When the same target snapshot is chosen; Then the operation is idempotent, producing no additional changes if the store already matches the snapshot, and records an audit entry with status "no-op". Given dependent future backfill jobs exist that reference the reverted promotion; When rollback completes; Then queued backfills are updated to use the restored preset and this linkage is recorded.
Diff View of Before/After Defaults
Given an audit entry for promotion or rollback; When the user opens the Diff view; Then the UI displays before vs after for: default variant ID(s), preset ID(s), scope, thresholds, metrics summary, archived/restored variant IDs, and Shopify sync status, with changed fields highlighted. Given the Diff view opens; When rendering; Then it loads within 2 seconds for the 95th percentile of entries with up to 500 variants. Given the Diff view is open; When the user clicks "View experiment"; Then the linked experiment opens in a new tab.
Stakeholder Notifications on Change Events
Given a promotion or rollback completes; When event processing finishes; Then email and in-app notifications are sent within 60 seconds to users subscribed to the affected scope (product owner, collection owner, supplier owner, watchers), containing action type, actor, scope, before/after summary, link to Diff, and notes. Given a promotion or rollback fails; When failure is detected; Then failure notifications are sent with error details and a retry link. Given a user opts out of change notifications for a scope; When subsequent changes occur; Then that user receives no notifications while the audit log still records the recipient list for the event.
Permission Enforcement for Promote/Rollback
Given a user lacks the PromoteManage permission for the scope; When they attempt promote or rollback; Then the API returns 403, no change occurs, and the attempt is recorded in the audit log with actor and reason "permission_denied". Given a user has the PromoteManage permission; When they perform promote or rollback; Then the action is allowed subject to validation and logged with their identity. Given the AutoPromote service account is configured; When it performs a promotion; Then the action is allowed only with a valid signed service token scoped to the target, else it is rejected and logged.
Immutable Logs and Export for Compliance
Given any audit entry exists; When an admin attempts to edit or delete it via UI or API; Then the operation is blocked and a "not_allowed" event is recorded; audit storage is append-only with a hash chain for tamper-evidence. Given audit entries exist; When an authorized user exports logs by scope and date range; Then the system produces JSON and CSV files within 60 seconds containing all fields plus a signature and checksum, downloadable for 24 hours. Given audit storage is operating; When queried for retention; Then entries are retained indefinitely by default; if an org-level retention policy is configured, entries are auto-archived (not deleted) after the limit while remaining discoverable via export.
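The append-only, hash-chained storage can be illustrated with a minimal chain in which each entry commits to the previous entry's hash; the `AuditChain` class below is a sketch, not the production design:

```python
# Sketch: tamper-evident append-only log via a SHA-256 hash chain.
import hashlib
import json

class AuditChain:
    def __init__(self):
        self.entries = []

    def append(self, payload: dict) -> dict:
        prev = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = json.dumps(payload, sort_keys=True)
        entry = {
            "payload": payload,
            "prev_hash": prev,
            "hash": hashlib.sha256((prev + body).encode()).hexdigest(),
        }
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = "0" * 64
        for e in self.entries:
            body = json.dumps(e["payload"], sort_keys=True)
            if e["prev_hash"] != prev:
                return False
            if e["hash"] != hashlib.sha256((prev + body).encode()).hexdigest():
                return False
            prev = e["hash"]
        return True
```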

Audience Splits

Run targeted style tests by device, geo, campaign, customer tag, or price band. PixelLift writes the correct metafield flags so your theme serves the right variant to the right audience. Compare lift by segment to learn what works for mobile vs. desktop, new vs. returning, or US vs. EU—then auto‑clone winners to matching segments.

Requirements

Audience Rule Builder
"As a store owner, I want to define audience splits by device, geo, campaign, customer tag, and price band so that each shopper sees the most effective styled images."
Description

Provide a visual rule builder to define audience splits by device (mobile/desktop/tablet), geo (country/region), campaign (UTM/referrer), customer tag (e.g., new/returning/VIP), and price band. Support AND/OR logic, exclusions, rule priority, and reusable saved segments. Include real-time validation to detect conflicting or overlapping rules, a preview using sample traffic or historical sessions, and versioning/audit trail of changes. Output a normalized rule object per product/catalog to be consumed downstream and referenced by themes. Ensures precise targeting while remaining simple for non-technical users.

Acceptance Criteria
Device-Based Split With AND/OR Logic
Given the rule builder is open for product P When the user creates Rule R with conditions: device IN ["mobile","tablet"] AND (utm_source = "newsletter" OR utm_medium CONTAINS "social") And the user saves Rule R Then validation completes in <= 500 ms with no errors And evaluating R against payload A {device:"mobile", utm_source:"newsletter"} returns true And evaluating R against payload B {device:"tablet", utm_medium:"social-paid"} returns true And evaluating R against payload C {device:"desktop", utm_source:"email"} returns false And the saved rule persists and is retrievable after reload
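The nested AND/OR evaluation implied by Rule R might look like the following sketch; the node shape (`all`/`any` lists plus leaf predicates) is an assumed normalization, not the actual rule schema:

```python
# Sketch: recursive evaluation of an AND/OR condition tree.
def evaluate(node: dict, payload: dict) -> bool:
    if "all" in node:  # AND
        return all(evaluate(n, payload) for n in node["all"])
    if "any" in node:  # OR
        return any(evaluate(n, payload) for n in node["any"])
    value = payload.get(node["field"], "")
    op, target = node["op"], node["value"]
    if op == "in":
        return value in target
    if op == "eq":  # case-insensitive, per the criterion above
        return str(value).lower() == str(target).lower()
    if op == "contains":
        return str(target).lower() in str(value).lower()
    raise ValueError(f"unknown operator: {op}")

# device IN [mobile, tablet] AND (utm_source = newsletter
#                                 OR utm_medium CONTAINS social)
rule_r = {"all": [
    {"field": "device", "op": "in", "value": ["mobile", "tablet"]},
    {"any": [
        {"field": "utm_source", "op": "eq", "value": "newsletter"},
        {"field": "utm_medium", "op": "contains", "value": "social"},
    ]},
]}
```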
Geo Targeting With Country/Region and Exclusions
Given the rule builder is open for catalog C When the user defines Rule G: geo IN [Region:"EU", Country:"US", Country:"CA"] EXCEPT [Country:"DE"] And the user saves Rule G Then the system resolves Region "EU" to an explicit country list and excludes "DE" And the normalized rule stores only explicit ISO-3166-1 alpha-2 country codes, lowercased, de-duplicated And evaluating G against payload {country:"DE"} returns false And evaluating G against payload {country:"FR"} returns true And evaluating G against payload {country:"US"} returns true
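Region expansion, exclusion, lowercasing, and de-duplication can be sketched as below; the `REGIONS` table is a deliberately truncated illustrative subset, not a complete EU member list:

```python
# Sketch: normalize geo rules to explicit ISO-3166-1 alpha-2 codes.
REGIONS = {"EU": ["FR", "DE", "IT", "ES", "NL"]}  # truncated for the example

def normalize_geo(include: list, exclude: list) -> list:
    expanded = []
    for token in include:
        expanded.extend(REGIONS.get(token, [token]))  # region alias or country
    excluded = {c.lower() for c in exclude}
    seen, out = set(), []
    for code in expanded:
        c = code.lower()
        if c not in excluded and c not in seen:  # apply exclusions, de-dupe
            seen.add(c)
            out.append(c)
    return out
```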
Campaign Split via UTM and Referrer
Given the rule builder is open When the user defines Rule M: (utm_campaign = "spring_sale" AND utm_medium = "email") OR referrer_domain CONTAINS "instagram.com" And the user saves Rule M Then validation passes and matching is case-insensitive and URL-decoded And evaluating M against payload {utm_campaign:"Spring_Sale", utm_medium:"Email"} returns true And evaluating M against payload {referrer:"https://l.instagram.com/"} returns true And evaluating M against payload {utm_campaign:"fall_sale", referrer:"https://example.com"} returns false
Price Band Rule With Priority Resolution
Given the rule builder is open with Rule R1: price_band = [0, 49.99], priority = 10 and Rule R2: device = "mobile", priority = 20 When a request matches both rules with price = 39.00 and device = "mobile" Then the rule with the lower numeric priority value (higher priority) is selected (R1) And after changing R1 priority to 30, the selected rule becomes R2 And when two matching rules share the same priority, the rule with more atomic predicates (higher specificity) is selected And when priority and specificity are equal, the most recently published rule is selected deterministically
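The three-level tie-break above (priority number first, then specificity, then publish recency) reduces to a single deterministic sort key; field names here are illustrative:

```python
# Sketch: deterministic winner among matching rules.
def select_rule(matching: list) -> dict:
    return min(
        matching,
        key=lambda r: (
            r["priority"],           # lower number = higher priority
            -r["predicate_count"],   # tie-break: more atomic predicates
            -r["published_at"],      # final tie-break: most recently published
        ),
    )
```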
Real-Time Conflict Detection and Publish Blocking
Given two enabled rules R1 and R2 overlap on the same product scope and audience conditions When both rules are active Then a conflict banner appears within 500 ms identifying R1 and R2 and summarizing the overlap And the Publish action is disabled until the conflict is resolved by adjusting priority or exclusions And after resolution, the conflict state clears without requiring a page reload
Preview Using Historical Sessions or Sample Traffic
Given a rule set S is configured for a selected scope When the user opens Preview Then the system loads the last 7 days of session data; if < 100 sessions exist, it generates a 500-session sample using the current traffic mix And it displays for each rule the count and percentage of sessions matched and an Unmatched bucket And changing any rule condition updates the preview within 1 second And the preview can be exported to CSV including timestamp and scope
Versioning, Audit Trail, and Normalized Output
Given the user saves changes to rules for product P When the changes are saved Then a new version is created with incremented version number, ISO-8601 timestamp, actor user ID, and change summary And the last 20 versions are listed with diffs of changed fields And rolling back to a prior version restores the exact rule set and records a new version entry And the saved payload conforms to schema v1 with fields: id, scope, version, rules[], priorities, exclusions, savedSegments[], createdAt, updatedAt, checksum And the checksum is identical for semantically equivalent configurations regardless of rule order
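An order-insensitive checksum follows from canonicalizing the configuration before hashing — for example, sorting rules by id and serializing with sorted keys. A sketch under those assumptions:

```python
# Sketch: checksum stable across semantically equivalent rule orderings.
import hashlib
import json

def config_checksum(config: dict) -> str:
    canonical = dict(config)
    # Canonical rule order: sort by rule id so ordering never changes the hash.
    canonical["rules"] = sorted(config.get("rules", []), key=lambda r: r["id"])
    return hashlib.sha256(
        json.dumps(canonical, sort_keys=True).encode()
    ).hexdigest()
```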
Metafield Flag Orchestration
"As a theme developer, I want PixelLift to write structured metafields for audience-targeted variants so that the theme can reliably select the correct image per shopper."
Description

Generate and write the correct metafield schema for each supported e‑commerce platform so themes can select the right image variant per audience. Map audience rule IDs to style variant IDs, handle bulk writes for large catalogs, respect platform rate limits with retries/backoff, and provide dry‑run and diff views before applying changes. Include platform adapters (starting with Shopify) to abstract authentication, field naming, and data types. Guarantee idempotent operations and emit webhooks/logs for observability.

Acceptance Criteria
Shopify Metafield Schema Generation & Write
Given a Shopify store with 1,000 products, 3 style variants per product, and 4 audience rules When Metafield Flag Orchestration runs for the catalog Then metafields are created under namespace "pixellift.audience" with keys "rule_map" (JSON) and "default_variant" (single_line_text_field) at correct scope And the "rule_map" JSON conforms to schema v1.0 mapping ruleId:string -> variantId:string And each product’s rule_map contains exactly the expected entries for existing variants; missing variants have no entry And a read-after-write verification confirms 100% of written values match the intended payloads And no unrelated metafields are modified or deleted
Audience Rule-to-Variant Mapping and Precedence
Given overlapping audience rules exist (e.g., mobile and mobile_US) and a defined precedence: more specific > campaign > price band > global default When mappings are generated for products with multiple applicable rules Then the most specific applicable rule is selected for each segment with no duplicate keys And unmapped rules are omitted from rule_map And default_variant is set for products not covered by any rule And mappings referencing non-existent variant IDs are rejected with validation errors and no writes for those products
Bulk Writes with Rate Limit Compliance and Retry/Resume
Given a catalog of 50,000 products (avg 2 variants) and platform rate limits exposed via response headers When bulk write is executed Then request pacing respects the platform-provided limits (no sustained 429s after retries) And transient 429/5xx responses are retried with exponential backoff and jitter up to 5 attempts And the total successful writes equal the expected creates/updates with 0 duplicates And the job can be paused and resumed using a checkpoint so only pending items are processed on resume And a final summary reports total attempted, succeeded, failed, and retried counts
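The retry policy above — exponential backoff with jitter, up to 5 attempts, retrying only 429/5xx — can be sketched as follows; `send` stands in for the platform call and the `sleep` hook is injected so the logic is testable:

```python
# Sketch: capped exponential backoff with full jitter for transient errors.
import random

def with_retries(send, max_attempts=5, base=0.5, cap=30.0, sleep=None):
    sleep = sleep or (lambda s: None)  # injected for testability
    for attempt in range(1, max_attempts + 1):
        status = send()
        if status < 400 or (400 <= status < 500 and status != 429):
            return status  # success, or a non-retryable client error
        if attempt == max_attempts:
            return status  # exhausted: surface the failure
        # Full jitter: uniform delay in [0, min(cap, base * 2^attempt)].
        sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
    return status
```

Full jitter spreads concurrent retries apart, which helps avoid the sustained 429s the criterion rules out.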
Dry-Run and Diff Preview Before Apply
Given dry-run mode is enabled for a selected set of 100 products When orchestration is executed Then no write (POST/PUT/DELETE) requests are sent to the platform And a deterministic diff report is produced listing creates/updates/deletes per product and metafield key with counts and sample records And re-running the same dry-run with the same inputs produces identical diff output And upon user approval to apply, the subsequent write operations match the diff exactly
Idempotency and Concurrency Safety
Given a job is executed with idempotency key K When the same job is run three times sequentially Then only the first run performs writes and subsequent runs return "no changes" with no additional writes And resource versions remain unchanged when no data changes are needed When two runs with the same idempotency key K start concurrently Then at most one run performs writes and the other completes as a no-op And retried batches after network failures do not create duplicate metafields or values
Observability via Webhooks and Structured Logs
Given webhooks are configured for job lifecycle events and per-segment results When a job starts, progresses, completes, or fails Then webhooks are emitted containing jobId, shopId, phase, progress counts, and correlationId And deliveries retry on timeout/5xx with exponential backoff up to 6 attempts and dead-letter on exhaustion And structured JSON logs capture per-product write attempts, outcomes, payload hash, latency, and correlationId And an audit endpoint can return the exact payload written for any product for 30 days
Platform Adapter Abstraction (Shopify v1)
Given the Shopify adapter is selected When orchestration authenticates and validates scopes Then unauthorized operations are prevented with explicit errors prior to write attempts And internal field definitions are translated to correct Shopify metafield namespace/key/type and validated (e.g., single_line_text_field, json) And all core orchestration tests pass using a stub adapter in integration tests, proving adapter interchangeability And a feature flag can disable the adapter so dry-run/diff executes without external API calls
Variant-to-Segment Mapping & Fallbacks
"As a store owner, I want to assign style variants to each segment with sensible fallbacks so that no shopper sees broken or mismatched imagery."
Description

Allow users to assign style presets/image variants to each defined segment and configure precedence and default fallbacks. Enforce that every product has at least one valid variant and provide preflight checks for missing assets or invalid mappings. Support catalog-wide defaults, per-collection overrides, and per-product exceptions. At runtime, ensure that if no segment matches, a deterministic fallback (e.g., brand default or original image) is served to avoid broken experiences.

Acceptance Criteria
Catalog Default Applied When No Overrides
Given a catalog-wide default style preset variant A is configured and published And product P has variant A generated and available And no per-collection or per-product overrides exist for product P When any shopper requests product P from any segment Then variant A is selected for rendering And the theme metafield flags for product P indicate variant A as the active mapping And the same variant A is returned for repeated requests with identical inputs (deterministic)
Collection Override Precedence Over Catalog Default
Given catalog default variant A is configured And collection C has an override mapping variant B for device=mobile And product P belongs to collection C and has variants A and B available When a mobile shopper requests product P Then variant B is selected instead of variant A And when a desktop shopper requests product P Then variant A is selected And a resolution log records match=collection C and rule=device=mobile
Per-Product Exception Precedence Over Collection and Catalog
Given catalog default variant A is configured And collection C has an override mapping variant B for device=mobile And product P belongs to collection C And product P has a per-product exception mapping variant X for geo=EU When an EU mobile shopper requests product P Then variant X is selected (per-product exception takes precedence) When a US mobile shopper requests product P Then variant B is selected (collection override takes precedence over catalog) When a US desktop shopper requests product P Then variant A is selected (catalog default) And a resolution trace shows precedence order: product > collection > catalog
Preflight Blocks Publish on Missing Assets and Invalid Mappings
Given products P1 and P2 have segment mappings saved And P1 references variant Vx that is not generated or missing assets And P2 references a segment rule that is undefined or malformed When the user runs Preflight before publishing Then the Preflight report lists affected products, issues, and counts by issue type And publish is blocked until P1 and P2 issues are resolved or mappings removed And any product lacking at least one valid variant is flagged as Blocker severity And clearing the issues and re-running Preflight returns Pass status and enables publish
Deterministic Fallback When No Segment Match or Variant Unavailable
Given fallback order is configured as: per-product fallback -> collection fallback -> brand default -> original image And product P has brand default variant A available and original image present When a request does not match any segment for product P Then the next available option in the fallback order is selected And if no configured fallbacks are available, the original image is served And identical requests result in the same fallback selection every time And a telemetry event records reason=no_match or asset_missing and the chosen fallback
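Walking the configured fallback order until an available variant is found is straightforward; a sketch with an illustrative product shape:

```python
# Sketch: deterministic fallback chain ending at the original image.
def resolve_fallback(product: dict):
    """Return (source, variant_id). `product` maps each fallback slot to a
    variant id, or None when that slot is unavailable/unconfigured."""
    order = ["per_product", "collection", "brand_default"]
    for slot in order:
        variant = product.get(slot)
        if variant is not None:
            return slot, variant
    # Last resort: the original image is always present, so nothing breaks.
    return "original", product["original_image"]
```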
Segment Priority Configuration Across Dimensions
Given the admin sets segment priority order: customer_tag > campaign > device > geo > price_band And a shopper context matches multiple segments across these dimensions When the mapper resolves the variant for product P Then the highest-priority matching segment determines the selected variant And changing the priority order changes the resolved variant accordingly And a test utility displays the evaluated segments and the final winning rule
Traffic Allocation & Experiment Controls
"As a marketer, I want to control traffic splits and run experiments within segments so that I can measure which style performs best safely."
Description

Enable A/B and multi-variant tests within each segment with configurable traffic splits, sticky assignment by user/session, holdout controls, and start/pause/stop scheduling. Provide guardrails such as minimum sample sizes, max runtime, and alerting when variants underperform beyond a threshold. Persist experiment and assignment IDs in metafields or via a lightweight SDK so the theme can honor assignments server- or client-side. Facilitate safe, incremental rollouts to reduce risk.

Acceptance Criteria
Configure A/B/n Traffic Split per Segment
Given a segment with variants A, B, C configured with weights 50%, 30%, 20% and holdout 0% When the configuration is saved Then validation enforces that all variant weights are >= 1%, use up to two-decimal precision, and variant weights + holdout = 100%; otherwise the save is blocked with an error And a new configuration version is recorded with timestamp and actor Given the experiment is Active and 100,000 eligible sessions occur When assignments are made Then observed assignment rates per variant are within ±2 percentage points of configured weights for that period And changing weights creates a new version used only for new entrants and does not reassign already-assigned users
Sticky Assignment by User ID and Session Fallback
Given a logged-in user with stable userId U first qualifies for an Active experiment When an assignment is made Then the same variant is served across all future visits and devices where userId U is present until the experiment ends or the user exits the segment Given an anonymous session without a userId first qualifies When an assignment is made Then the same variant is served for the duration of the browser session via a first-party cookie; a new session may result in a new assignment Given the experiment is Stopped or the user is excluded from the segment When the next request occurs Then the theme serves the configured fallback (control or winner), and the previous assignment is not re-applied
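Sticky assignment is commonly implemented by hashing a stable identity (userId, else a session id) together with the experiment id into [0, 1) and mapping that onto cumulative weights; this sketch assumes that scheme, which the source does not mandate:

```python
# Sketch: deterministic, sticky weighted assignment via hashing.
import hashlib

def assign(experiment_id: str, identity: str, variants: list) -> str:
    """variants: [(variant_id, weight)] with weights summing to 1.0."""
    digest = hashlib.sha256(f"{experiment_id}:{identity}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # stable value in [0, 1]
    cumulative = 0.0
    for variant_id, weight in variants:
        cumulative += weight
        if bucket < cumulative:
            return variant_id
    return variants[-1][0]  # guard against float rounding at the top end
```

Because the bucket depends only on identity and experiment id, the same user lands on the same variant across visits and devices, while changing weights only moves users near the new boundaries (new entrants follow the updated allocation).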
Experiment Scheduling and Controls
Given start time T1 is configured in the future When wall-clock reaches T1 Then the experiment transitions to Active and begins assigning eligible traffic Given an Active experiment When it is Paused Then no new assignments are made and previously assigned users continue to see their assigned variant Given a Paused experiment When it is Resumed Then assignments resume without reassigning previously assigned users Given an Active or Paused experiment When it is Stopped Then no new assignments are made and previously assigned users receive the configured fallback on their next request And all state changes (start, pause, resume, stop) are captured in the audit log with timestamp, actor, and reason
Configurable Holdout Group
Given a holdout percentage H is configured for a segment When the configuration is saved Then validation enforces H >= 0 and variant weights + H = 100%; otherwise the save is blocked with an error Given the experiment is Active and 50,000 eligible sessions occur When assignments are made Then H% ±1% of eligible sessions are routed to holdout and never receive any experimental variant flags And holdout membership is sticky using the same identity rules as variant assignment And reporting exposes holdout as a distinct cohort for comparison
Guardrails and Alerting for Underperformance
Given minimum sample size per variant N, minimum runtime R hours, underperformance threshold Δ%, confidence level CL, and max runtime M hours are configured When a variant has at least N analyzed samples AND runtime >= R AND its primary metric is worse than control by ≥ Δ% at confidence ≥ CL Then send alerts via configured channels (email/webhook) within 5 minutes And if auto-pause on underperformance is enabled, automatically Pause the experiment When max runtime M elapses Then automatically Pause or Stop per configuration and notify owners And the UI blocks Declare Winner and Ramp actions until both N and R are satisfied
Persist Assignments via Metafields and SDK
Given an assignment occurs for user/session in an Active experiment When persistence is performed Then storefront-accessible metafields under namespace "pixellift" are written with experiment_id, segment_id, variant_id, and assignment_id within 200 ms And the SDK method plx.getAssignment(experimentId, segmentId) returns the same values client-side and server-side Given a theme request with existing metafields When the SDK is called Then it returns an assignment consistent with the metafield values Given a transient write failure to metafields occurs When the assignment is created Then the SDK still returns the assignment using a durable in-memory/session store and enqueues a retry without creating duplicate assignments Given the experiment is Stopped When cleanup runs Then metafields are removed or marked inactive within 5 minutes
Safe Incremental Rollouts
Given a ramp plan is configured for a winning variant with steps (e.g., 10% -> 25% -> 50% -> 100%), maximum step S%, and minimum soak time t hours When the experiment meets guardrail requirements at each step Then the system executes the next ramp step at or after the soak time And manual Increase Allocation actions cannot exceed S% per 30 minutes and require confirmation When an underperformance alert or guardrail breach occurs during a ramp Then the ramp is automatically rolled back to the previous safe allocation and the experiment is Paused And all ramp events and rollbacks are recorded in the audit log, and previously assigned users are not reassigned; only new entrants follow the updated allocation
Segment Analytics & Lift Reporting
"As an analyst, I want performance dashboards that show lift by segment and variant so that I can identify the winning styles."
Description

Track impressions, clicks, add‑to‑carts, conversions, and revenue per product/variant/segment to compute lift versus control with confidence intervals. Provide dashboards to compare performance across device, geo, campaign, tag, and price band, with filters, cohorting (new vs. returning), and time windows. Support CSV export and webhooks to BI tools. Require a lightweight theme snippet or tag manager integration to emit events enriched with segment and variant IDs while deduplicating and respecting attribution windows.

Acceptance Criteria
Snippet Event Emission & Enrichment by Segment and Variant
- Given the PixelLift snippet is installed via theme or tag manager on product, collection, and cart pages
- When a user views a product with an Audience Splits-served image variant
- Then an impression event is emitted once per product per pageview with fields: event_id (UUIDv4), event_type ("impression"), product_id, variant_id, segment_id, device_type, geo_country, campaign_id, customer_tags, price_band, session_id, page_url, referrer, currency, event_timestamp (UTC ISO 8601)
- And when the user clicks the product image/thumbnail, a click event is emitted with the same identifiers
- And when the user adds the product to cart, an add_to_cart event is emitted including quantity
- And when the user completes an order containing the product, a conversion event is emitted including order_id, line_item_id, quantity, unit_price, revenue (amount, currency), tax, discount
- And 99%+ of events are successfully delivered (HTTP 2xx) within 2 seconds of occurrence under normal network conditions
Event Deduplication and Attribution Windows
- Rule: Any two events with identical event_id received within 24 hours are treated as one; duplicates are logged and excluded from metrics
- Rule: Conversions are attributed per line item to the last click from the same session/customer within the click attribution window; if no eligible click, attribute to the last impression within the view attribution window
- Rule: Default windows are click_window_days=7 and view_window_hours=24; admins can configure click_window_days (1–30) and view_window_hours (6–168)
- Given a conversion occurs with both an eligible click and impression, When both fall within their respective windows, Then the click receives attribution and the impression does not
- Given no eligible click exists, When the last impression is older than the view attribution window, Then the conversion is unattributed in lift metrics
- Given duplicate events are fired, Then only one is counted in CTR, ATC rate, conversion rate, and revenue
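The 24-hour event_id dedup rule can be sketched as a small keyed window; the class and field names are illustrative:

```python
# Sketch: drop events whose event_id was already accepted within 24 hours.
DEDUP_WINDOW_S = 24 * 3600

class Deduper:
    def __init__(self):
        self.seen = {}  # event_id -> last accepted timestamp (epoch seconds)

    def accept(self, event_id: str, ts: float) -> bool:
        last = self.seen.get(event_id)
        if last is not None and ts - last < DEDUP_WINDOW_S:
            return False  # duplicate: logged and excluded from metrics
        self.seen[event_id] = ts
        return True
```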
Dashboard: Segment Performance Comparison with Filters, Cohorts, and Time Windows
- Given events are ingested, When a user opens the Segment Analytics dashboard and selects filters (device_type, geo_country, campaign_id, customer_tags, price_band), a cohort (New vs Returning), and a time window (Today, 7D, 30D, custom range)
- Then the dashboard displays, per product_id/variant_id/segment_id, impressions, clicks, CTR, add_to_carts, ATC rate, conversions, conversion rate, revenue, revenue per impression, and sample sizes
- And the user can toggle comparison between variant and control; the table shows absolute metrics, percent lift, and 95% confidence intervals for each metric
- And the cohorting applies as: New = first-time customers with no prior order before the time window; Returning = customers with any order prior to the window; anonymous visitors are bucketed by first_seen cookie as New if first_seen within window else Returning
- And applied filters and cohort/time selections persist in the URL and are restored on reload; dashboard refresh completes in under 4 seconds for 30 days of data and 1,000 products
- And data latency from event occurrence to dashboard availability is less than 10 minutes at p95
Lift Computation and Confidence Intervals
- Rule: CTR = clicks/impressions; ATC rate = add_to_carts/impressions; Conversion rate = conversions/impressions; Revenue per impression = revenue/impressions; computed per product_id/variant_id/segment_id and attributed events only
- Rule: Lift = (metric_variant - metric_control)/metric_control, expressed as a percentage
- Rule: 95% CIs for CTR, ATC rate, and Conversion rate use the two-proportion z-interval with continuity correction; 95% CI for Revenue per impression uses bootstrap BCa with 1,000 resamples and fixed random seed
- Rule: Significance flag is "Significant" when the 95% CI for lift excludes 0
- Rule: Results are gated as "Insufficient data" until both variant and control each have ≥ 1,000 impressions; additionally, conversion-rate lift requires ≥ 30 conversions per arm
- Given a known synthetic dataset, When metrics are computed, Then results match reference values within tolerance: ±0.1 pp for rates and ±0.5% for lift
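The rate-lift arithmetic and the two-proportion z-interval might be computed as below; the specific continuity-correction term shown (half the sum of reciprocal sample sizes) is one common formulation and is an assumption, as is every function name:

```python
# Sketch: percent lift and a 95% CI for the difference of two proportions
# with a Yates-style continuity correction.
import math

def lift(variant_rate: float, control_rate: float) -> float:
    return (variant_rate - control_rate) / control_rate * 100.0

def diff_ci_95(x1: int, n1: int, x2: int, n2: int):
    """CI for p1 - p2; (x, n) = successes, trials for variant and control."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    cc = 0.5 * (1 / n1 + 1 / n2)       # continuity correction term
    margin = 1.96 * se + cc            # z = 1.96 for 95% confidence
    diff = p1 - p2
    return diff - margin, diff + margin
```

The "Significant" flag then falls out of whether the returned interval excludes 0.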
CSV Export of Segment Metrics
- Given the user clicks Export CSV, When filters, cohort, metrics, dimensions, and time window are selected
- Then the generated CSV includes only the filtered dataset with one row per date (daily by default) × product_id × variant_id × segment_id, and columns: date, product_id, variant_id, segment_id, device_type, geo_country, campaign_id, customer_tags, price_band, impressions, clicks, ctr, add_to_carts, atc_rate, conversions, conversion_rate, revenue, revenue_per_impression, lift_vs_control (per metric), ci_lower, ci_upper
- And the file is UTF-8 with header, comma-separated, ISO 8601 dates (UTC), and decimal point as dot
- And exports up to 1,000,000 rows; larger exports stream in multiple files with sequential suffixes
- And the export completes within 60 seconds for ≤ 200,000 rows; the filename includes the date range and applied cohort
- And exported values equal the on-screen dashboard values for the same filters within rounding precision
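The serialization rules (header row, comma separator, ISO 8601 dates) could be sketched with the standard library as follows; the column list and function name are illustrative:

```python
import csv
import io
from datetime import date

def export_csv(rows: list[dict], columns: list[str]) -> str:
    """Comma-separated CSV with a header row; date values serialized as ISO 8601."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        out = {}
        for col in columns:
            val = row.get(col, "")
            out[col] = val.isoformat() if isinstance(val, date) else val
        writer.writerow(out)
    return buf.getvalue()  # encode as UTF-8 when writing to disk
```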
Webhooks to BI Tools (Aggregates and Optional Raw Events)
- Given a BI webhook is configured with endpoint URL and shared secret
- When a scheduled daily export runs at 02:00 UTC or a near-real-time stream is enabled
- Then PixelLift sends HTTPS POST requests with Content-Type: application/json, including payloads for: (a) daily aggregates per product_id/variant_id/segment_id/date and (b) an optional raw events stream
- And each request includes an HMAC-SHA256 signature over the body in header X-PixelLift-Signature; receivers returning 2xx are considered successful
- And failures are retried with exponential backoff for up to 24 hours (max 10 attempts); after exhausting retries, an alert is logged and an email is sent to the account owner
- And median delivery latency is under 5 minutes for near-real-time and under 15 minutes for daily aggregates
- And payloads validate against the published JSON Schema; the schema version is included as schema_version, with backward-compatible changes only within a major version
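The signing scheme above can be sketched with the standard library; the hex encoding is an assumption, since the criteria fix only the algorithm (HMAC-SHA256 over the body) and header name:

```python
import hashlib
import hmac

def sign_body(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 over the raw request body (value of X-PixelLift-Signature)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify_signature(secret: bytes, body: bytes, signature: str) -> bool:
    """Receiver side: recompute over the exact bytes and compare in constant time."""
    return hmac.compare_digest(sign_body(secret, body), signature)
```

Receivers should verify against the raw bytes before JSON parsing, since re-serialization can change whitespace and break the signature.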
Lightweight Snippet Performance and Resilience
- Rule: The browser bundle is ≤ 25 KB gzipped, loads async/defer, and does not block LCP; additional network calls are kept ≤ 2 per pageview
- Rule: p95 added CPU time ≤ 50 ms and p95 network transfer time ≤ 300 ms on a 4G connection; zero layout shifts attributable to the snippet
- Rule: If sendBeacon is unavailable or the user is offline, events queue in local storage (up to 72 hours or 10,000 events) and flush when online, preserving order and deduplication
- Rule: The snippet supports installation via Shopify theme snippet and via Google Tag Manager; both paths document data-layer variables for product_id, variant_id, segment_id
- Given CSP blocks inline scripts, When the snippet loads from a CDN with Subresource Integrity, Then events still emit successfully without console errors
Auto-Clone Winners to Matching Segments
"As a marketer, I want winning variants auto-applied to similar segments once proven so that I can scale improvements without manual work."
Description

Automatically promote the best-performing variant to similar segments once statistical thresholds are met (e.g., significance, minimum samples). Define "matching" segments by shared attributes (e.g., same device class or price band across geos) and support manual approval workflows. When promoting, update mappings and metafields, notify stakeholders, and log change history. Provide quick rollback to prior state if performance regresses.

Acceptance Criteria
Auto-promotion on statistical significance met
Given an active Audience Split test with at least two variants in a seed segment And the threshold config is set to significance >= 95%, min_samples_per_variant >= 500, and min_test_duration_hours >= 24 When a single variant’s lift vs. control meets or exceeds the configured significance and all variants in the seed segment meet min samples and the test has run for at least the minimum duration Then the system marks that variant as the winner for the seed segment And within 10 minutes the system clones/promotes the winner to all matching segments per the current matching rule And no promotion occurs if any prerequisite (significance, min samples, duration) is not met And the operation is idempotent such that re-evaluations within the same hour do not create duplicate promotions
Matching-segment resolution across shared attributes
Given match_on = ["device_class","price_band"] and exclude_on = ["geo"] in the matching rule And a winner is determined in seed segment {device_class: "mobile", price_band: "mid", geo: "US"} When auto-clone runs Then the target clone set includes all existing segments with {device_class: "mobile", price_band: "mid"} across all geos And the target clone set excludes any segment where device_class ≠ "mobile" or price_band ≠ "mid" And the system skips segments that are archived or have no eligible products And the resolved target list is persisted with a matching-rule version/hash for auditability
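A minimal sketch of the matching rule above: attributes in match_on must equal the seed's values, attributes in exclude_on (e.g. geo) are simply not compared, and archived segments are skipped. Names and the segment dict shape are illustrative:

```python
def resolve_targets(seed: dict, segments: list[dict], match_on: list[str]) -> list[dict]:
    """All non-archived segments sharing the seed's match_on attribute values.
    Attributes listed in exclude_on are handled implicitly: they are not matched on."""
    wanted = {attr: seed[attr] for attr in match_on}
    return [
        seg for seg in segments
        if not seg.get("archived", False)
        and all(seg.get(attr) == val for attr, val in wanted.items())
    ]
```

Persisting a hash of `match_on`/`wanted` alongside the resolved list gives the matching-rule version required for auditability.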
Manual approval workflow gating promotions
Given Require Approval = true and approvers ["merch@brand.com","pm@brand.com"] configured And a winner meets promotion thresholds When the system prepares the promotion Then the promotion status is set to "Pending Approval" and no metafields/mappings are changed And approvers receive notifications with Approve/Reject actions And on first approval the promotion executes; on any rejection the promotion is canceled And all actions (request, approve, reject) are logged with actor, timestamp, and rationale
Atomic metafield and mapping updates for promoted variants
Given a promotion (auto or approved) targets N segments When the promotion executes Then for each target segment the system writes the new variant ID to the configured metafields in a single atomic transaction And the theme/API reflects updated metafields within 60 seconds And previous mappings are captured as a rollback snapshot with segment and variant IDs And if any segment update fails, the system rolls back all changes for that promotion and reports the error
Stakeholder notifications and audit trail on promotion
Given a promotion completes successfully or fails When the operation finishes Then stakeholders receive a notification containing winner variant, thresholds met, target segments count, start/end timestamps, and promotion outcome And the audit log records winner metrics (lift, p-value, sample sizes), matching-rule version, actor (system or approver), changed metafields, and affected segments And audit entries are immutable and queryable by date range, campaign, and segment for at least 13 months
Automatic rollback on post-promotion regression
Given a promoted variant is live in target segments and monitoring is enabled with regression_threshold = -10% relative to baseline and min_post_samples_per_segment = 300 When the live variant underperforms the pre-promotion baseline by more than the configured threshold with sufficient post samples Then the system rolls back the segment’s mappings/metafields to the prior variant within 10 minutes And stakeholders are notified of the rollback with before/after metrics And the audit log records rollback reason, metrics, and restored state And auto-promotion is paused for the affected segments for a 7-day cooldown
Concurrency and duplicate-prevention under parallel evaluations
Given multiple evaluators or jobs may assess tests concurrently When thresholds are met in overlapping evaluation windows Then only one promotion per winner per target segment is executed And promotions use optimistic locking or equivalent to prevent race conditions And duplicate or stale jobs are safely no-ops and recorded as such in the audit log
Safety, Rollback, and Compliance Controls
"As a store owner, I want safe rollbacks and privacy-safe audience detection so that I can test confidently without breaking the storefront or violating regulations."
Description

Provide one-click rollback at product, collection, or catalog scope; a kill switch to disable audience splits; and a preview mode to QA changes before publishing. Validate for conflicting metafields, missing images, and rate-limit breaches. Ensure geo/campaign detection and event tracking respect consent and privacy requirements (e.g., opt-out, data minimization, pseudonymous identifiers). Emit audit logs and alerts for critical changes or failures to maintain reliability and compliance.

Acceptance Criteria
Scoped One‑Click Rollback
Given audience splits are active at product, collection, or catalog scope When the user triggers a one‑click rollback for a selected scope Then all split-related metafields and flags within that scope are reverted to the last stable baseline And unaffected scopes remain unchanged And the storefront serves the default image variant for the rolled‑back scope within these SLAs: product ≤30s, collection ≤2m, catalog ≤10m And a progress indicator shows percent complete and remaining items And a success/failure summary is displayed with counts of items reverted and items needing manual attention And an immutable audit log entry is created with user, scope, counts, timestamps, and diff And any in-flight split publish jobs for the same scope are canceled
Global Kill Switch for Audience Splits
Given audience splits are enabled When the kill switch is toggled ON by an authorized user Then audience-split delivery is disabled across all storefronts and environments within 60s (95th percentile) And the theme serves default (control) variants for all requests And no new split-related metafields are written while the kill switch is ON And split-specific client events are not emitted And an alert is sent to configured channels with actor, time, and store ID When the kill switch is toggled OFF Then split delivery remains paused until explicitly re-enabled per experiment And an audit log entry records both ON and OFF actions
Safe Preview Mode (Pre‑Publish QA)
Given a draft audience split configuration exists When a user generates a preview link Then the preview is accessible only to authenticated admins or holders of a time‑boxed signed URL (expires ≤24h) And the theme reads from a staging metafield namespace separate from production And no writes to live metafields, assets, or analytics occur during preview And a visible preview banner and experiment/segment info are displayed And toggling variants/segments in preview reflects changes within 2s When the user publishes from preview Then the signed URL is invalidated and production namespaces are used
Metafield Conflict, Missing Asset, and Safe Publish Validation
Given a user attempts to publish audience splits When validation runs prior to any write Then publishing is blocked if any of the following are detected: conflicting metafields on the same product/variant, duplicate segment rules overlapping the same audience, missing output images for any targeted variant, or invalid style preset references And the error report lists item IDs (product, variant), fields, and reasons, with remediation guidance And a dry‑run result is available within 30s for up to 5k items And only if all checks pass is the write permitted And the validation outcome is recorded to audit logs
Rate‑Limit Protection and Reliable Writes
Given batch operations may approach platform API rate limits When API throttle headers indicate low capacity or a 429/5xx is returned Then the system applies exponential backoff (1s, 2s, 4s, 8s, max 32s), jitter, and queues remaining work And maintains ≤80% of available rate‑limit budget And retries up to 5 times per item before marking it failed And the UI displays live throughput, backlog size, and ETA And partial failures are isolated and retried without duplicating successful writes And a failure summary export is available (CSV/JSON)
Consent‑Aware Segmentation and Tracking
Given a visitor’s consent state is unknown or opt‑out When geo/campaign detection and audience assignment occur Then the system uses only pseudonymous, session‑scoped identifiers and performs no persistent storage beyond strictly necessary cookies And split‑specific event tracking is disabled until opt‑in is received And EU/EEA visitors default to opt‑out until an explicit opt‑in from the site’s CMP is received via the data layer And upon opt‑out, any existing split identifiers are cleared within the current session and no further events are sent And a configuration option allows mapping CMP signals to PixelLift consent states without code changes
Audit Logging and Critical Alerts
Given any critical action occurs (publish, rollback, kill switch toggle, validation failure) When the action completes or fails Then an immutable audit record is stored with actor, timestamp (UTC), scope, before/after diff, item counts, and result (success/failure), retained ≥365 days And audit entries are searchable by actor, time range, product ID, and action type And alerts are emitted to configured channels (email, webhook, Slack) within 2 minutes for failures, kill switch ON, or rollback events And alert payloads include correlation IDs to link to audit records And if a primary channel fails, a secondary channel is attempted and the failure is logged

Variant Matrix

Define the knobs you want to test (background, crop, shadow, retouch) and let Style Splitter auto‑generate a clean matrix of valid combinations. It avoids off‑brand or noncompliant pairs, suggests a minimal set to isolate effects, and batch‑produces the images in one click—turning ad‑hoc guesses into structured experiments.

Requirements

Knob Definition & Level Selection
"As a boutique owner, I want to choose which photo attributes to test and set their options so that I can run a controlled, brand-aligned experiment."
Description

Enable users to define experiment variables (e.g., background, crop, shadow, retouch) and configure their levels using existing style-presets or custom options. Support categorical, boolean, and numeric levels with validation (e.g., allowable crop ratios, supported background presets) and per-catalog applicability checks. Provide defaults aligned to common e-commerce needs and brand settings. Seamlessly integrates with PixelLift’s preset system and batch upload pipeline to ensure each SKU can be consistently processed across selected levels. Outcome: standard, structured variable definitions that make experiments repeatable and comparable.

Acceptance Criteria
Knob Creation and Type Validation
Given I am on the Knob Definition step for an active catalog When I add a knob named "Crop" and set type to Numeric Then the knob is created and listed with type Numeric And duplicate knob names (case-insensitive) are rejected with an inline error And knob name length must be 2–40 characters and contain letters, numbers, spaces, hyphens, or underscores And validation occurs on field blur and on Save, preventing Save until all errors are resolved
Categorical Levels from Presets and Custom Options
Given a categorical knob "Background" And brand presets include available background options When I add levels by selecting presets and/or entering a custom color Then only supported background presets for this catalog are selectable; unsupported are disabled with a reason tooltip And a custom background level requires a valid 3- or 6-digit hex color and optional label (1–20 chars) And each selected level stores the preset ID or custom parameters on Save And the number of levels must be between 1 and 12 inclusive
Boolean Knob Definition
Given a boolean knob "Shadow" When I enable the knob Then the system auto-creates exactly two levels: true and false And optional display labels are validated to 1–20 characters And one level must be marked as default before Save And both levels are included and cannot be deleted; only the default can be changed
Numeric Levels Configuration and Constraints
Given a numeric knob "Crop Ratio" When I define discrete values 1:1, 4:5, and 3:4 or specify a min/max range with a step Then only allowable ratios for the catalog can be saved; invalid entries show "Unsupported ratio" and are not saved And for ranges, the generated level count must be between 2 and 10 inclusive And numeric values are stored normalized (e.g., 0.8 for 4:5) while preserving display labels
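The normalization rule above (store the numeric value, keep the display label) can be sketched as a small parser; the function name is illustrative:

```python
from fractions import Fraction

def normalize_ratio(label: str) -> float:
    """Parse a 'W:H' crop label into its normalized numeric value (e.g. '4:5' -> 0.8)."""
    w, h = label.split(":")
    return float(Fraction(int(w), int(h)))
```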
Per-Catalog Applicability and SKU Coverage
Given a catalog with SKUs and a set of defined knobs/levels When I run applicability validation Then the system computes SKU coverage per level and displays exclusions with reason codes (e.g., "Preset not supported for category") And levels with 0% applicability cannot be saved And the Save action shows total SKUs covered and requires explicit confirmation for any exclusions
Preset Integration and Reproducibility
Given I define knobs and levels referencing style-presets When I save the experiment Then the definition persists with canonical preset IDs, versions, and level order And reopening the experiment loads the same definitions with no drift unless I opt to update to newer preset versions And the definition is exportable and importable as JSON with fields: knobName, type, levels[id|params], defaults, applicabilityRules, createdAt, version
Defaults Aligned to Brand and E-commerce Needs
Given brand settings specify default background, crop, shadow, and retouch When I create a new experiment Then the system pre-populates knob candidates and default levels that reflect brand settings and common e-commerce standards And I can modify or remove any default before Save And defaults are not applied to any SKU until I click Save
Rule-Based Constraint & Compliance Engine
"As a seller, I want the system to automatically block noncompliant style combinations so that my experiments stay on-brand and marketplace-safe."
Description

Introduce a rule engine that prevents generation of off-brand or noncompliant combinations before they are added to the matrix. Support brand guidelines (e.g., jewelry main images must use white backgrounds), marketplace policies (e.g., Amazon main image rules), and product-level constraints (e.g., no heavy skin retouching on fabric textures). Provide a visual rule builder, real-time validation with clear reasons for exclusion, and import of workspace brand presets. Integrate with channel-specific settings to ensure compliance across destinations. Outcome: fewer wasted renders and policy violations, maintaining brand integrity by design.

Acceptance Criteria
Enforce jewelry main image white background (brand guideline)
Given product category "Jewelry" and image type "Main", When a background style other than White (#FFFFFF) is selected, Then the option is disabled and an inline message "Brand rule BR-001: Jewelry main images require white background" is shown. Given the same context, When attempting to add a non-white background variant to the matrix, Then the Add action is blocked and the combination is not added. Given the user switches background to White (#FFFFFF), When validation re-runs, Then the warning clears within 500 ms and the combination becomes addable. Given a matrix preview containing jewelry SKUs, When generating the matrix, Then zero rows with non-white backgrounds are created. Given activity logging is enabled, When a rule prevents an addition, Then an audit entry with rule ID, user ID, timestamp, and attributes is recorded.
Apply Amazon main image policy template during matrix generation
Given channel "Amazon" is selected and the policy template "Amazon Main Image" is active, When a combination has a background not equal to #FFFFFF (tolerance ±2 RGB), Then the combination is excluded with reason code "AMZ-BG-001" and rule name displayed in the UI. Given the Amazon template is active, When a combination includes any text/graphic overlay flag, Then the combination is excluded with reason code "AMZ-OV-002". Given excluded combinations exist, When the user clicks "View reasons", Then a panel lists each excluded combination with its reason codes and rule names and supports CSV export. Given an export is initiated with exclusions present, When the export completes, Then excluded combinations are not rendered and the export summary displays counts: total requested, allowed, excluded (by reason code).
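The ±2 RGB tolerance check against #FFFFFF can be sketched as a per-channel comparison (function name illustrative):

```python
def is_white_within_tolerance(rgb: tuple[int, int, int], tol: int = 2) -> bool:
    """True when every channel is within ±tol of #FFFFFF, i.e. (255, 255, 255)."""
    return all(abs(channel - 255) <= tol for channel in rgb)
```

A combination whose sampled background fails this check would be excluded with reason code "AMZ-BG-001" per the criteria above.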
Prevent heavy retouching on fabric textures (product-level constraint)
Given product material contains "Fabric" or auto-detected texture class == Fabric, When retouch intensity is set above "Light", Then the slider caps at "Light" and a message "Rule PRD-RET-002: Heavy retouching disabled for fabric textures" appears. Given the same constraint, When a bulk preset applies retouch > Light, Then the engine downgrades the setting to Light, marks the cell as auto-adjusted, and logs the adjustment with rule ID. Given a non-fabric product, When retouch intensity > Light, Then no fabric-specific constraint is applied and no downgrade occurs. Given test mode is enabled in Rule Builder, When the rule runs against a sample of 50 fabric images, Then 100% of cases with retouch > Light are reported as blocked and 0% with retouch <= Light are blocked.
Author and validate composite rules in the Visual Rule Builder
Given the user creates a rule named "Jewelry main white AND no drop shadow" with conditions (Category=Jewelry AND ImageType=Main) and outcomes (Background=White AND Shadow=None), When clicking Save, Then Save is enabled only if all required fields are valid and the rule compiles. Given an invalid token or unknown attribute is entered, When clicking Save, Then Save is prevented and field-level errors identify the exact invalid token(s). Given the user clicks "Test rule" with a sample set of 100 candidate combinations, When execution completes, Then counts for Allowed and Blocked and the top 20 blocked with reason details display within 3 seconds. Given multiple rules exist, When the user changes rule priority order, Then evaluation order updates immediately and persists across sessions.
Surface real-time exclusion reasons in the Variant Matrix preview
Given the matrix preview is open, When a user toggles an option that creates a violation, Then affected cells display a red badge with reason code tooltip within 500 ms and cannot be selected. Given multiple rules exclude the same cell, When hovering the badge, Then all applicable reason codes and rule names are listed in priority order. Given the user toggles "Show excluded", When enabled, Then only excluded combinations are listed with columns: SKU, attributes, rule IDs, reason codes, and the list supports CSV export. Given accessibility requirements, When a badge is focused via keyboard, Then the reason tooltip is accessible via screen reader and focusable without a mouse.
Import brand presets and enforce channel-specific intersections
Given a workspace brand preset file is imported, When mapping completes, Then corresponding rules are created with names, IDs, default priorities, and scopes (brand/channel/product) preserved. Given channels Amazon and Shopify are both selected, When generating the matrix, Then the engine enforces the intersection of active rules and blocks any combination violating any channel's rule, with both reason codes shown. Given a cross-channel conflict (e.g., Amazon requires white background, Shopify allows brand pastel), When detected, Then the UI prompts to split outputs by channel or choose a primary, and single-output export is blocked until resolved. Given a preset import modifies existing rules, When saved, Then a new rule version is created with changelog (who, when, what) and prior versions remain accessible for rollback.
Minimal Matrix Design (DOE) Suggestion
"As a marketer, I want a suggested minimal set of combinations so that I can learn what works without rendering every possible variant."
Description

Automatically propose a reduced, statistically sound set of combinations that isolates main effects using orthogonal arrays or fractional factorial designs. Respect user constraints such as maximum number of variants, must-include levels, and excluded pairs from rules. Provide toggles between full factorial and suggested minimal sets with coverage and effect estimability indicators, plus brief explanations of trade-offs. Outcome: lower cost and faster turnaround while preserving the ability to attribute performance changes to specific knobs.

Acceptance Criteria
Suggest Minimal DOE Within Variant Cap
Given the user selects one or more knobs with defined levels and sets a maximum variants cap M When the user clicks Suggest minimal set Then the system returns a design using an orthogonal array or fractional factorial with total combinations <= M And the design guarantees all main effects are estimable (no main-effect aliasing) And the design method, size, and resolution/strength are displayed And the suggestion is deterministic for identical inputs And the suggestion is generated in <= 3 seconds when the unconstrained full factorial size <= 10,000 combinations
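To illustrate how a fractional design halves the run count while keeping main effects estimable, here is a textbook 2^(k-1) half-fraction for two-level knobs: the last factor is set by a defining relation (XOR of the others), so every factor stays balanced across levels. This is a generic sketch, not PixelLift's solver; at this resolution main effects remain unaliased with each other, though they are aliased with two-factor interactions.

```python
from itertools import product

def half_fraction(k: int) -> list[tuple[int, ...]]:
    """2^(k-1) half-fraction of a 2^k factorial for k two-level factors."""
    runs = []
    for combo in product((0, 1), repeat=k - 1):
        runs.append(combo + (sum(combo) % 2,))  # defining relation: last = XOR of rest
    return runs
```

For k = 3 knobs this yields 4 runs instead of 8, with each knob appearing at each level exactly twice.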
Honor Must-Include Levels
Given the user marks specific levels as Must Include and sets a variants cap M When the system generates the suggested minimal set Then each must-include level appears in at least one suggested combination And the total suggested combinations <= M And if constraints make this impossible, no set is produced and a message lists the conflicting knobs/levels and the computed minimum M required to satisfy them And the UI offers one-click options to increase M to the minimum or remove specific must-include levels
Enforce Excluded Pairs and Rules
Given the user defines excluded pairs or rule-based invalid combinations When the system generates the suggested minimal set Then no suggested combination contains any excluded pair or invalid rule And coverage by level for each knob is reported as a percentage of valid levels used And if exclusions prevent main-effect estimability, the UI flags which knob(s) lose estimability and offers alternatives (relax rules or increase M)
Toggle Full Factorial vs Suggested Indicators
Given the UI provides a toggle between Full factorial and Suggested minimal set When the user switches views Then counts of combinations, estimated processing time, and cost update immediately And coverage per level (%) and a Main effects estimable indicator are shown for the current view And stateful selections (knobs, levels, constraints) persist across toggles
Trade-off Explanation Panel
Given a suggested minimal set is available When the user opens the Why this set? panel Then a 30–80 word explanation is shown naming the design type used, size, covered levels, what effects are not estimable, and the trade-offs versus full factorial And the explanation updates within 500 ms after any change to knobs, levels, or constraints And a Learn more link opens documentation in a new tab
Reactive Recalculation on Design Changes
Given knobs, levels, must-includes, excluded pairs, or M are modified When the modification is applied Then the suggested set and indicators recompute and render within 3 seconds And a version tag increments and the timestamp updates And user-pinned combinations persist if still valid or are flagged with a reason if invalid
Feasibility and Fallback Behavior
Given the variants cap M is greater than or equal to the total number of valid combinations after applying constraints When the user requests a suggestion Then the system returns the full factorial set and shows a Fits within cap message And if M is less than the minimum needed to estimate all main effects under current constraints, the system shows the computed minimum M_min and an option to apply it
Matrix UI & Preview Grid
"As a content manager, I want to view and adjust the variant matrix before rendering so that I’m confident the plan is correct."
Description

Present an interactive grid that maps knobs and levels to a clean matrix with inline previews. Visually indicate excluded cells and reasons, allow manual include/exclude overrides with validation, and support sorting, filtering, pinning, and labeling variants. Show per-SKU applicability and counts, with responsive layout for large matrices. Integrate with the media viewer for zoom and side-by-side comparison. Outcome: clear planning, auditability, and confidence before committing to batch generation.

Acceptance Criteria
Matrix Grid Rendering & Inline Previews
Given a configured experiment with defined knobs, levels, and valid combinations When the user opens the Matrix UI Then a grid renders with sticky row and column headers mapping knobs to levels And each valid cell displays a 160px preview thumbnail with a skeleton placeholder until the image loads And initial render completes in ≤ 2 seconds for up to 2,500 valid cells on a standard laptop (Chrome, 8GB RAM) And thumbnails lazy-load within 200 ms of entering the viewport and show a retry affordance on failure And hovering or focusing a cell reveals its knob-level label and variant identifier
Excluded Cells Visualization & Reason Codes
Given rule evaluation marks certain combinations as excluded When the Matrix UI renders Then excluded cells are non-interactive and visually distinct via hatch pattern and 40% opacity And a tooltip or long-press popover shows the machine-readable reason code and human-friendly explanation And an inline legend lists all reason codes present in the grid And screen readers announce excluded cells as "Excluded: <reason>"
Manual Include/Exclude Overrides with Validation & Audit
Given a user with Override permission viewing the Matrix UI When the user toggles a cell from excluded to included Then the system validates against hard-blocked rules and prevents the change with an error that cites the blocking rule ID if violated And if allowed, the cell state changes to Included with an "Overridden" badge and optional reason input And saving persists the override to the experiment config and returns 200 OK within 500 ms And an audit record is created capturing user, timestamp, cell identifiers, prior state, new state, and reason And removing an override restores the rule-derived state and updates the audit trail
Sorting, Filtering, Pinning, and Labeling Variants
Given a populated matrix When the user applies sort by knob, level, label, or inclusion state Then the grid reorders within 300 ms and the active sort is indicated in the header When the user filters by inclusion state, reason code, label text, or pinned status Then only matching cells remain visible and counts update accordingly When the user pins one or more variants Then pinned cells stay in the first rows/columns and persist across filter changes and sessions When the user edits a variant label inline Then validation enforces 1–40 characters, uniqueness within the matrix, and allowed characters (letters, numbers, space, dash, underscore), with inline error on violation And label edits are saved on Enter, canceled on Esc, and are keyboard-accessible
Per-SKU Applicability and Counts
Given a catalog with multiple SKUs having compatibility constraints per knob level When the user selects one or more SKUs in the Matrix UI Then each cell reflects applicability (enabled/disabled) for the selected SKUs And per-SKU and aggregate counts display total applicable variants, excluded variants, and overridden variants And counts and applicability indicators update within 500 ms when the SKU selection changes And hovering or focusing the counts reveals a breakdown by reason code
Responsive Layout & Large Matrix Performance
Given a large matrix (e.g., ≥ 50 x 50) When the user scrolls the grid on desktop Then virtualization limits DOM to ≤ 500 visible cells and maintains ≥ 55 FPS scrolling And sticky headers remain aligned without visible jitter When the viewport width changes Then the layout adapts: mobile (≤ 600px) shows a scrollable list with row-by-row previews; tablet (601–1024px) shows up to 8 columns; desktop (≥ 1025px) shows full grid with horizontal scroll And no content layout shift exceeds CLS 0.1 after initial render
Media Viewer Integration: Zoom & Side-by-Side Compare
Given a visible cell with a loaded preview When the user activates the preview (click or Enter) Then the media viewer opens within 300 ms showing the high-resolution image and variant metadata And the user can zoom from 50% to 400%, pan, and reset via controls and double-click When the user selects a second cell to compare Then the viewer enters side-by-side mode with synchronized zoom and pan and distinct labels for each variant And closing the viewer returns focus to the originating cell and restores the prior scroll position
One-Click Batch Generation & Queueing
"As a seller, I want to generate all selected variants in one click so that I can publish faster without manual steps."
Description

Batch-render all selected combinations across chosen SKUs in a single action using a scalable job queue with concurrency control and GPU autoscaling. Ensure deterministic outputs per seed and preset, idempotent job IDs, deduplication of identical variants, and robust retry/backoff on transient failures. Provide real-time progress, ETA, pause/cancel, and completion notifications. Store outputs in organized folders by experiment and combination. Outcome: reliable, fast production of studio-quality variants at scale with minimal operator effort.
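The idempotency and deduplication behavior above hinges on a deterministic job identity. A minimal Python sketch of one way to derive it, assuming the identity is a hash over the canonicalized job inputs (function and field names are illustrative, not part of the spec):

```python
import hashlib
import json

def variant_job_id(experiment_id: str, sku_id: str, combination_id: str,
                   preset_id: str, preset_version: str, seed: int) -> str:
    """Derive a deterministic job ID so resubmitting identical work maps to
    the same sub-job (idempotency) and identical variants deduplicate."""
    payload = {
        "experiment_id": experiment_id,
        "sku_id": sku_id,
        "combination_id": combination_id,
        "preset": f"{preset_id}@{preset_version}",
        "seed": seed,
    }
    # Canonical JSON (sorted keys, no whitespace) keeps the hash stable
    # regardless of field ordering at the call site.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:32]
```

At enqueue time, the queue would check for an existing job with this ID before creating a new sub-job.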

Acceptance Criteria
One-Click Batch Generation Across Selected SKUs and Variant Combinations
Given a user selects SKUs A and B and 12 valid variant combinations from the Variant Matrix When the user clicks Generate once Then the system creates one batch with 24 sub-jobs (2 SKUs × 12 combinations) and begins processing without additional input And the batch ID is returned within 2 seconds and is visible in the Jobs panel with an initial ETA And no duplicate sub-jobs are created for identical SKU+combination pairs within the batch
Scalable Queue with Concurrency Limits and GPU Autoscaling Under Load
Given there are 10,000 pending sub-jobs in the queue and a configured global concurrency cap of 200 When batch processing starts Then no more than 200 image tasks run concurrently across all workers at any time And autoscaling provisions additional GPU workers up to the configured maximum within 120 seconds of detecting backlog > concurrency cap And after autoscaling stabilizes, the 95th percentile queue wait time to first start is ≤ 60 seconds And throughput scales within ±15% of linear until the concurrency cap is reached
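The autoscaling target implied above can be sketched as a small sizing function (parameter names and the slots-per-worker model are illustrative assumptions; the spec fixes only the cap and the maximum):

```python
import math

def desired_gpu_workers(backlog: int, concurrency_cap: int,
                        slots_per_worker: int, max_workers: int) -> int:
    """Target worker count: enough slots to run min(backlog, cap) tasks
    concurrently, never exceeding the configured worker maximum."""
    wanted_concurrency = min(backlog, concurrency_cap)
    return min(max_workers, math.ceil(wanted_concurrency / slots_per_worker))
```

For example, a backlog of 10,000 with a cap of 200 and 4 slots per worker targets 50 workers; a small backlog scales down proportionally.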
Idempotent Submission and Deduplication of Identical Variant Jobs
Given a submission contains the same experimentId, skuIds, variant combinations, presetId+version, and seed as a previously submitted batch within 24 hours When the user submits again Then the same batchId is returned and zero new sub-jobs are enqueued And the UI indicates the existing batch is in progress or completed Given multiple concurrent submissions for the same SKU+combination+presetVersion+seed occur When they reach the API Then only one sub-job executes and all others reference the running job; only a single set of artifacts is produced
Deterministic Outputs for Given Seed and Preset Across Environments
Given an input image with hash H, presetId P with version V, and seed S When the variant is rendered on different GPU instances or at different times Then the output image bytes are identical (same SHA-256) and metadata includes sourceImageHash=H, presetId=P, version=V, seed=S, and jobId Given the same inputs except a different seed S' When the variant is rendered Then the output SHA-256 differs from the output produced with seed S
Robust Retry with Exponential Backoff and Failure Classification
Given a transient error (e.g., HTTP 5xx, GPU preemption, storage timeout) occurs during processing of a sub-job When the error is encountered Then the system retries automatically with exponential backoff and jitter up to 3 attempts And on a successful retry, only one artifact set is stored and the sub-job is marked Success Given a permanent error (e.g., invalid preset configuration) occurs When the error is encountered Then the sub-job is marked Failed without retry and the error code/message is recorded and visible in the batch details And the batch completes only after all sub-jobs reach a terminal state (Success, Failed, or Canceled)
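A sketch of this retry policy, assuming a full-jitter backoff strategy (error codes and helper names are illustrative; the spec fixes only "up to 3 attempts" with "exponential backoff and jitter"):

```python
import random
import time

TRANSIENT = {"HTTP_5XX", "GPU_PREEMPTED", "STORAGE_TIMEOUT"}  # illustrative codes

def run_with_retry(task, classify, max_attempts=3, base_delay=1.0, cap=30.0):
    """Retry transient failures with exponential backoff plus full jitter;
    permanent errors propagate immediately without retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception as exc:
            if classify(exc) not in TRANSIENT or attempt == max_attempts:
                raise  # permanent error, or retries exhausted -> mark Failed
            # Full jitter: sleep a random amount up to the exponential cap.
            time.sleep(random.uniform(0, min(cap, base_delay * 2 ** (attempt - 1))))
```

The `classify` callback stands in for whatever failure-classification layer maps raw exceptions to transient vs. permanent error codes.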
Real-Time Progress, Accurate ETA, Pause/Cancel, and Notifications
Given a batch is processing When the user opens the batch detail view Then per-sub-job progress updates at least every 2 seconds and the batch ETA is displayed with ±15% accuracy after 10% progress has been achieved When the user clicks Pause on the batch Then no new sub-jobs start within 3 seconds and in-flight sub-jobs run to completion; the batch status changes to Paused When the user clicks Cancel on the batch Then pending sub-jobs are not started and in-flight sub-jobs are stopped within 10 seconds where safe; the batch status changes to Canceled and no charges accrue for canceled pending sub-jobs Upon batch completion (all sub-jobs terminal) Then the user receives an in-app notification immediately and an email notification within 60 seconds containing links to outputs and a summary
Organized Storage and Discoverability by Experiment and Combination
Given experiment E and batch B complete processing When outputs are saved Then files are stored at s3://<bucket>/experiments/E/batches/B/{skuId}/{combinationId}/{seed}/ containing full-resolution images, web thumbnails, and manifest.json with sourceImageHash, presetId, version, seed, timestamps, and jobId And deduplicated sub-jobs reference the original artifact path rather than duplicating files And the UI Gallery and API provide direct links to each combination folder and manifest; database records include durable pointers with permissions inherited from the experiment
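The folder layout and manifest contents above can be expressed as small helpers (a sketch; the bucket value and `created_at` format are whatever the deployment configures):

```python
def artifact_prefix(bucket, experiment_id, batch_id, sku_id, combination_id, seed):
    """Folder prefix from the criterion above; deduplicated sub-jobs store a
    pointer to the original prefix instead of copying artifacts."""
    return (f"s3://{bucket}/experiments/{experiment_id}/batches/{batch_id}/"
            f"{sku_id}/{combination_id}/{seed}/")

def manifest(source_image_hash, preset_id, version, seed, job_id, created_at):
    """Contents of the manifest.json stored alongside the images."""
    return {"sourceImageHash": source_image_hash, "presetId": preset_id,
            "version": version, "seed": seed, "createdAt": created_at,
            "jobId": job_id}
```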
Variant Metadata, Tagging, and Export
"As an analyst, I want each image tied to its combination metadata so that I can measure performance and report insights."
Description

Tag every generated asset with experiment ID, knob/level values, render settings, source SKU, and seed, and persist this metadata in the asset store and via API. Provide CSV/JSON exports and integration mappings for A/B testing targets (e.g., Shopify, marketplaces, ad platforms). Support invisible watermark or metadata embedding where possible to maintain traceability across uploads. Outcome: structured experiments with end-to-end attribution, enabling performance analysis and rollback.

Acceptance Criteria
Auto-Tagging on Variant Generation (Batch)
Given a user generates variant images via Variant Matrix for one or more SKUs under experiment EXP-123 When generation completes Then each output asset is tagged with metadata keys: experiment_id, source_sku, knob_background, knob_crop, knob_shadow, knob_retouch, render_format, render_size_px, color_profile, seed And 100% of assets have non-null values for the keys above And the metadata is visible in the asset details panel within 5 seconds of asset creation
Metadata Persistence in Asset Store
Given a generated asset with the full metadata set is saved to the asset store When the asset is viewed, duplicated, or moved within the library Then the metadata values remain intact and unchanged And reopening the asset after a service restart shows the same metadata values And downloading and re-uploading the file back into PixelLift re-associates the same metadata within 5 seconds (via embedded data or sidecar)
Metadata Retrieval via API
Given an authenticated client When it calls GET /v1/assets?experiment_id=EXP-123 Then the response includes only assets where experiment_id = "EXP-123" and is paginated (default page_size = 100) And each asset object includes keys: asset_id, experiment_id, source_sku, background, crop, shadow, retouch, render_format, render_size_px, color_profile, seed, created_at, file_url And the endpoint responds within 1 second for up to 1000 assets And GET /v1/assets/{asset_id}/metadata returns the same keys with correct data types
CSV Export for Experiment
Given the user selects an experiment and chooses CSV export When the export finishes Then the CSV includes a header with exactly: asset_id, experiment_id, source_sku, background, crop, shadow, retouch, render_format, render_size_px, color_profile, seed, created_at, file_url And the number of data rows equals the number of assets exported And all fields are RFC 4180 compliant (proper quoting/escaping) And the file is available for download within 60 seconds for exports up to 10,000 assets
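A minimal sketch of the export writer: Python's `csv` module already produces RFC 4180-compliant quoting/escaping, and the header order matches the column list above (the `assets` input shape is an illustrative assumption):

```python
import csv
import io

COLUMNS = ["asset_id", "experiment_id", "source_sku", "background", "crop",
           "shadow", "retouch", "render_format", "render_size_px",
           "color_profile", "seed", "created_at", "file_url"]

def export_csv(assets):
    """Emit the header exactly as specified; csv.writer handles RFC 4180
    quoting/escaping (embedded commas, quotes, newlines) and CRLF line ends."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL, lineterminator="\r\n")
    writer.writerow(COLUMNS)
    for asset in assets:
        writer.writerow([asset.get(col, "") for col in COLUMNS])
    return buf.getvalue()
```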
JSON Export with Schema Validation
Given the user selects JSON export for an experiment When the export is generated Then the output is UTF-8 JSON containing an array of objects with keys: asset_id, experiment_id, source_sku, knobs {background, crop, shadow, retouch}, render_settings {format, size_px, color_profile}, seed, created_at, file_url And the export validates against schema version "1.0.0" without errors And null/empty fields are omitted, and data types match the schema And the HTTP response has Content-Type: application/json
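The nesting and null-omission rules can be sketched as a record builder (a real export would additionally validate against the versioned JSON Schema; the flat `asset` input keys are an illustrative assumption):

```python
import json

def to_export_record(asset):
    """Build one export object: knobs and render settings are nested per the
    schema, and null/empty fields are omitted."""
    knobs = {k: asset.get(k) for k in ("background", "crop", "shadow", "retouch")}
    render = {"format": asset.get("render_format"),
              "size_px": asset.get("render_size_px"),
              "color_profile": asset.get("color_profile")}
    rec = {
        "asset_id": asset.get("asset_id"),
        "experiment_id": asset.get("experiment_id"),
        "source_sku": asset.get("source_sku"),
        "knobs": {k: v for k, v in knobs.items() if v not in (None, "")},
        "render_settings": {k: v for k, v in render.items() if v not in (None, "")},
        "seed": asset.get("seed"),
        "created_at": asset.get("created_at"),
        "file_url": asset.get("file_url"),
    }
    # Drop top-level nulls, empty strings, and empty nested objects.
    return {k: v for k, v in rec.items() if v not in (None, "", {})}

def export_json(assets):
    return json.dumps([to_export_record(a) for a in assets], ensure_ascii=False)
```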
Integration Mapping Templates (Shopify, Marketplaces, Ad Platforms)
Given the user selects a target platform (Shopify, Google Merchant Center, Meta Ads) When they generate integration mappings Then a mapping file is produced that assigns metadata keys to platform-specific fields (e.g., Shopify metafields namespace "pixellift" keys: experiment_id, seed; GMC custom_label_0..4; Meta creative labels) And a platform-ready CSV/JSON is generated that conforms to the target’s field names, character limits, and allowed character sets And the mapping passes built-in validation for required fields and length constraints And the export includes a README describing how to apply/import on the selected platform
Embedded Traceability (XMP/IPTC or Invisible Watermark)
Given an image is generated in a format supporting embedded metadata (e.g., JPEG, PNG, WebP) When embedding is enabled Then the file contains XMP/IPTC fields for pixellift:experiment_id, pixellift:source_sku, and pixellift:seed that can be read back with the same values And if embedding is unsupported or stripped, an invisible watermark encoding experiment_id and asset_id is applied and decodes successfully on the saved file And if both embedding and watermarking fail, a sidecar .json is created and linked to the asset in the store And embedding/watermarking does not change pixel dimensions, format, or color profile
Matrix Templates & Team Sharing
"As a team lead, I want to save and share our standard variant matrix so that the team runs consistent experiments across catalogs."
Description

Allow users to save, version, and share matrix configurations—including selected knobs, level sets, rules, and DOE settings—as reusable templates within a workspace. Support permissions, cloning, change logs, and default templates per channel or product category. Outcome: consistent, repeatable experimentation practices across teams and catalogs, reducing setup time and variability.

Acceptance Criteria
Save and Reuse Template in Workspace
Given I am an Editor or Owner in a workspace and have configured knobs, level sets, rules, and DOE settings When I click "Save as Template" and provide a unique name Then the system saves the template with a unique Template ID, created_at, owner, workspace scope, and an initial version 1.0 And Then the template appears in the workspace Template Library within 3 seconds and is searchable by name, tags, channel, and category And When I apply this template to a new matrix Then the knobs, level sets, rules, and DOE settings exactly match the saved snapshot (field equality; checksum match)
Template Versioning and Change Log
Given a template version v1.0 exists When I modify any template field and choose "Save New Version" Then a new version v1.1 is created; v1.0 remains immutable and selectable And Then the change log records author, timestamp, summary, and a structured diff of changed fields And When I request any prior version Then the system returns the exact snapshot for that version (checksum match) And When I attempt to overwrite an immutable version Then the system blocks with 409 and guidance to create a new version
Team Sharing and Permissions Enforcement
Given a template with ACL roles Owner, Editor, Viewer When a Viewer opens the template Then they may view and export but cannot edit, version, clone, delete, or change permissions (UI disabled; write attempts return 403) And When an Editor saves a new version or clones the template Then the action succeeds; attempts to delete or change ACL return 403 And When an Owner updates ACL, transfers ownership, or deletes the template Then the action succeeds and is audit-logged; delete is soft for 7 days with restore option And When a user outside the workspace requests the template by ID Then the system returns 404
Clone Template within Workspace
Given a template version vX exists When I select "Clone" Then a new template is created in the same workspace with a new Template ID, a name prefilled with a "Copy" suffix, and initial version 1.0 whose snapshot equals the source version snapshot And Then the clone inherits channel/category tags and description, records provenance (source Template ID and version), and starts a fresh change log And Then the ACL is copied except the cloner is set as Owner And When I edit the clone Then no changes affect the source template
Default Templates by Channel and Category
Given I have Owner permissions When I set a template as default for Channel=X and Category=Y Then the mapping is saved, audit-logged, and visible in Defaults settings And When a user creates a new matrix under Channel=X and Category=Y without choosing a template Then the default template auto-applies And If multiple defaults could apply Then precedence resolves as Channel+Category > Channel-only > Category-only > Global, yielding exactly one template And When a new default is set for the same scope Then the previous default is replaced and the change is logged
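The precedence chain can be resolved with an ordered scope lookup. A sketch, assuming defaults are keyed by `(channel, category)` tuples with `None` as the wildcard:

```python
def resolve_default(defaults, channel, category):
    """Return the single applicable default template ID using the precedence
    Channel+Category > Channel-only > Category-only > Global."""
    for scope in ((channel, category), (channel, None),
                  (None, category), (None, None)):
        if scope in defaults:
            return defaults[scope]
    return None  # no default configured for any matching scope
```

Because the scopes are probed in strict order, exactly one template is ever selected even when several defaults could apply.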
Template Integrity Validation and Import/Export Parity
Given I am saving a template When validation runs Then each knob has ≥1 level, rules introduce 0 unresolved conflicts, and DOE settings compute a finite non-zero minimal set; otherwise save is blocked with specific error messages And Then the expected variant count is calculated and stored with the template And When I export the template to JSON and re-import it with proper permissions Then schema validation passes and the imported template snapshot matches the exported one (checksum match) with a new Template ID and version 1.0
Apply Template During Catalog Upload
Given I upload N product images and select Template ID=T at version=V When I start processing Then the job enqueues within 10 seconds and each generated image is tagged with template_id=T and template_version=V And Then the number of output variants equals DOE_expected_count_per_product × number_of_products, and this is reported in the job summary And If the referenced template is soft-deleted or inaccessible Then applying it is blocked with guidance to restore or choose another; if permissions are revoked mid-job, remaining tasks fail with 403 and partial results are retained and reported

Significance Guard

Built‑in sample‑size planning and significance checks prevent false wins. Get plain‑language guidance (e.g., “Need ~480 more views for 95% confidence”) and automatic pauses for underpowered or lopsided tests. Real‑time dashboards plus Slack/Email alerts keep Test‑and‑Tune Taylor moving without stats wrangling.

Requirements

Sample Size Planner & Power Calculator
"As a growth manager, I want an automatic sample size estimate for my test so that I can plan duration and traffic needs without doing manual statistics."
Description

Provide an on-creation planner that computes required sample size per variant from selected primary metric, baseline rate (pulled from recent PixelLift analytics), minimum detectable effect, desired power, and confidence. Display plain-language outputs (e.g., “Need ~480 more views for 95% confidence”) and dynamically update as data accrues. Support binary and continuous metrics, traffic forecast, and seasonality weighting. Persist assumptions, expose a lightweight API for programmatic planning, validate unrealistic inputs, and integrate with the Experiment Setup UI.

Acceptance Criteria
Compute per‑variant sample size on test creation
Given a user selects a primary metric, baseline (prefilled or overridden), minimum detectable effect (absolute or relative), desired power, confidence level, and number of variants with equal traffic allocation When inputs are provided or changed Then the planner computes and displays required sample size per variant and total, rounding up to whole numbers, within ±1% of a validated reference implementation for standard fixtures And the computation completes within 250 ms for up to 10 variants at p95 And the UI labels explicitly state "per variant" and "total required"
Prefill baseline from recent analytics
Given the user chooses a primary metric during test setup When the planner loads Then the baseline is auto-filled from the last 30 days of PixelLift analytics for the selected store/context and displays the source date range and timestamp And the user can override the value, which is persisted on save And if no analytics are available, the field is empty, marked Required, and a helper link explains how the baseline is computed
Plain-language guidance and dynamic remaining sample/ETA
Given an experiment with required sample size per variant computed and live accrual stats available When current counts are below required n Then the planner displays a plain-language message of the form "Need ~X more [views/events] for Y% confidence" where X equals remaining total across all variants rounded to the nearest 10 and Y equals the configured confidence level And the message updates within 5 seconds of new data arriving or immediately on manual refresh And if a traffic forecast is set, the planner shows an ETA of days remaining using the forecast; if a weekly seasonality profile is active, the ETA uses the weighted daily traffic (weights normalized) to project remaining time
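A sketch of the message template and the seasonality-weighted ETA projection (the interpretation of the forecast as an average daily rate reshaped by the 7-day weights is an assumption; function names are illustrative):

```python
import math

def guidance(required_total, current_total, confidence, unit="views"):
    """Plain-language message; the remaining total is rounded to the
    nearest 10, per the criterion above."""
    remaining = max(required_total - current_total, 0)
    approx = round(remaining / 10) * 10
    return f"Need ~{approx} more {unit} for {confidence:.0%} confidence"

def eta_days(remaining, daily_forecast, weekly_weights=None):
    """Days to finish at the forecast rate; a 7-day seasonality profile
    (weights normalized) reshapes the daily traffic if provided."""
    if not weekly_weights:
        return math.ceil(remaining / daily_forecast)
    total = sum(weekly_weights)
    weights = [w / total for w in weekly_weights]  # normalize
    days, left = 0, float(remaining)
    while left > 0:
        left -= daily_forecast * 7 * weights[days % 7]
        days += 1
    return days
```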
Metric types and correct formula application
Given the user selects Binary metric type When inputs include baseline rate, MDE (absolute or relative), power, and confidence Then the required n per variant is calculated using a two-proportion z-test planning formula with continuity correction disabled And Given the user selects Continuous metric type When inputs include baseline mean and standard deviation (or variance), MDE (absolute), power, and confidence Then the required n per variant is calculated using a two-sample t-test normal approximation And automated tests verify outputs against a reference library for at least 5 binary and 5 continuous fixtures within ±1%
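One consistent reading of these planning formulas, sketched with the stdlib normal distribution (absolute MDE only; parameter defaults are illustrative):

```python
import math
from statistics import NormalDist

def n_per_variant_binary(p1, mde_abs, power=0.80, confidence=0.95):
    """Two-proportion z-test planning formula, continuity correction off."""
    p2 = p1 + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / mde_abs ** 2)

def n_per_variant_continuous(sd, mde_abs, power=0.80, confidence=0.95):
    """Two-sample t-test sized via the normal approximation."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / mde_abs) ** 2)
```

For example, detecting an absolute lift from a 10% baseline to 12% at 95% confidence and 80% power requires roughly 3,800 samples per variant, which is the kind of fixture the ±1% reference comparison would pin down exactly.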
Validate and guard unrealistic inputs
Given the planner form is displayed Then validations enforce: power in [0.5, 0.99], confidence in [0.80, 0.999], binary baseline in (0, 1), continuous standard deviation > 0, MDE > 0 (relative MDE between 1% and 100% when used), number of variants between 2 and 20, traffic forecast ≥ 0 And seasonality profile weights (7-day) are non-negative and sum to 1±0.001 (auto-normalized if within tolerance); otherwise, show error and block save And invalid fields show inline error text and disable Calculate and Save until resolved And extreme inputs that imply > 10,000,000 total samples trigger a warning suggesting to increase MDE or relax power/confidence
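The seasonality-profile guard above can be sketched as a validator that auto-normalizes within tolerance and rejects everything else:

```python
def normalize_weights(weights, tol=1e-3):
    """7-day seasonality profile: weights must be non-negative and sum to
    1 +/- tol (auto-normalized); anything else is rejected and blocks save."""
    if len(weights) != 7 or any(w < 0 for w in weights):
        raise ValueError("profile needs 7 non-negative weights")
    total = sum(weights)
    if abs(total - 1.0) > tol:
        raise ValueError(f"weights sum to {total:.4f}, expected 1 +/- {tol}")
    return [w / total for w in weights]
```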
Persist planner assumptions in Experiment Setup
Given a user saves or proceeds from the planner step in Experiment Setup UI When they return to the experiment or open it in another session Then all planner assumptions (metric type, baseline, MDE, power, confidence, variants, allocation, forecast, seasonality weights) are persisted and pre-filled And a change history records the last 5 updates (user, timestamp, old→new), viewable in the UI And persist/read operations complete within 200 ms at p95
Lightweight REST API for programmatic planning
Given a client with a valid API token When it POSTs to /api/planner with inputs {metricType, baseline or mean+sd, mde, power, confidence, variants, allocation, forecast, seasonalityWeights} Then the API returns 200 with JSON {perVariantRequiredN[], totalRequiredN, method, inputsEcho, notes} matching UI calculations within ±1% And invalid inputs return 422 with field-level errors; unauthorized requests return 401 And p95 latency ≤ 300 ms for payloads up to 20 variants; rate limit 60 requests/min per token
Auto-Pause Underpowered or Lopsided Tests
"As a product owner, I want tests to pause automatically when they cannot reach significance so that we avoid wasting traffic and making bad decisions."
Description

Continuously monitor running experiments for power shortfall and traffic imbalance. If projected power at planned end < required minimum or allocation skews beyond thresholds (e.g., >70/30 for 2+ hours), automatically pause the test, annotate the reason, and notify owners. Preserve randomization, allow admin override with justification, and resume automatically when conditions are corrected. Include backoff to prevent pause/resume churn and integrate with the scheduler and experiment lifecycle states.

Acceptance Criteria
Auto-pause on projected power shortfall
Given an experiment with required power P_req and a planned end time T_end And MDE and variance estimator are configured for the primary metric(s) When the projected power P_proj at T_end is computed every 1 minute And P_proj < P_req continuously for at least 10 minutes Then the system transitions the experiment to state = Paused (System) And records pause_reason = UNDERPOWERED with fields: P_proj, P_req, T_eval, MDE, variance_method And prevents any new user exposures while preserving existing randomization keys and assignments And posts a banner on the experiment dashboard with timestamp and reason And sends notifications to all owners via Slack and Email
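The spec does not fix the power-projection estimator; a sketch under the assumption of a two-proportion normal approximation, paired with the 10-minute dwell rule:

```python
from statistics import NormalDist

def projected_power(p_baseline, mde_abs, n_per_arm_at_end, confidence=0.95):
    """Projected power at T_end for a two-proportion comparison, given the
    per-arm sample size expected by the planned end (normal approximation)."""
    p2 = p_baseline + mde_abs
    se = ((p_baseline * (1 - p_baseline) + p2 * (1 - p2)) / n_per_arm_at_end) ** 0.5
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return NormalDist().cdf(abs(mde_abs) / se - z_alpha)

def should_auto_pause(p_proj, p_req, shortfall_minutes, dwell_minutes=10):
    """Trigger only after the shortfall has persisted for the dwell period."""
    return p_proj < p_req and shortfall_minutes >= dwell_minutes
```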
Auto-pause on sustained traffic imbalance
Given an experiment with intended allocation ratios per arm And a minimum exposure threshold per arm of 500 exposures is met When the observed allocation deviates beyond threshold (any arm >= 70% or <= 30%) for 2 continuous hours Then the system transitions the experiment to state = Paused (System) And records pause_reason = IMBALANCE with per-arm counts, shares, and duration And freezes assignment mapping (no reassignment) and stops routing new exposures And sends Slack and Email notifications to owners with the imbalance details and link to dashboard
Admin override with justification and audit trail
Given an experiment is Paused (System) When a user with role = ExperimentAdmin clicks Resume and enters a non-empty justification (min 15 characters) Then the system transitions the experiment to state = Running (Overridden) And records override_reason, actor_id, timestamp, and optional expiry in an immutable audit log And notifies owners of the override including justification and expiry (if set) And re-enables automatic pausing after the override expires or is revoked
Automatic resume with backoff and churn protection
Given an experiment was auto-paused with reason UNDERPOWERED or IMBALANCE And last_state_change_at = t0 When all triggering conditions have cleared for a continuous 30 minutes And at least 60 minutes have elapsed since t0 (backoff) Then the system transitions the experiment to state = Running (Auto-Resumed) And sends resume notifications to owners with the cleared metrics And logs the transition with backoff_elapsed and prior_reason And suppresses any further pause/resume transitions for 15 minutes to prevent churn
Scheduler and lifecycle state integration
Given an experiment is receiving traffic via the assignment service and scheduler When the system issues an auto-pause Then the scheduler stops routing new traffic to all arms within 60 seconds And the lifecycle state change is atomic and versioned to prevent concurrent update races And duplicate pause commands within 5 minutes are idempotent (no duplicate notifications or state rewinds) And existing users keep their assigned variant; no randomization keys are altered And other experiments’ traffic allocations remain unaffected
Alerts and dashboard updates on pause/resume
Given an auto-pause or auto-resume event occurs When the event is committed Then Slack and Email alerts are sent within 60 seconds containing: experiment_id, event_type, reason, observed_values, thresholds, links And the experiment dashboard reflects the new state, reason, and timestamp within 30 seconds And a timeline entry is appended with actor (System or Admin), prior_state, new_state, reason And identical alerts for the same event are deduplicated per channel for 10 minutes
Power projection computation and safeguards
Given required power (P_req), configured MDE per primary metric, and a selected variance estimator When computing projected power P_proj at T_end from current traffic and variance using a 5-minute rolling window and min 200 samples per arm Then the method, parameters, and inputs are logged for reproducibility And P_proj results are deterministic given identical inputs and random seeds (tolerance ±0.01) And if estimator health checks fail (e.g., unstable variance, missing data), the system uses conservative defaults and does not auto-resume And unit tests validate projections against baseline cases across at least 3 metrics
Plain-Language Significance Guidance
"As a busy seller, I want simple explanations of test progress so that I can decide quickly without understanding statistics."
Description

Surface contextual guidance that translates statistical status into actionable, human-readable messages within the experiment detail view and setup wizard. Use templated copy to explain confidence, power, MDE, and remaining sample in simple terms, with optional drill-down for advanced detail. Localize messages, ensure accessibility, and version phrasing for clarity. Highlight next steps (e.g., extend runtime, increase traffic, reduce MDE) without exposing raw formulas by default.

Acceptance Criteria
Experiment Detail View: Plain-Language Guidance
- On Experiment Detail view load, show a single-sentence status that includes: confidence (rounded to whole %), power (rounded to whole %), MDE (rounded to 0.1 pp), and estimated remaining sample size (prefixed with ~ and rounded to nearest 10).
- Default status contains no statistical notation (e.g., α, β, z, p=) or formulas; allowed symbols limited to digits, % sign, ~, commas, periods.
- Exactly one primary next-step recommendation is shown from: Wait, Increase traffic allocation, Extend duration, Reduce MDE, Proceed to implement.
- Status text updates within 2 seconds of any underlying metric change without a full page reload.
- Numbers include thousands separators and correct units (e.g., views, visitors) per experiment type.
- Copy is rendered from a templated string with parameter placeholders and passes a linter rule: no sentence > 25 words.
Setup Wizard: Dynamic Sample-Size Guidance
- As the user sets baseline conversion, desired confidence (90–99%), power (70–95%), MDE (0.5–20%), and traffic split, display live text: “Need ~X total visitors (Y per variant) for Z% confidence and W% power,” with X and Y rounded to nearest 10.
- The message updates within 300 ms of any input change and never lags more than 1 s under normal network conditions.
- If computed total sample size > 2,000,000, display an inline warning suggesting to adjust MDE, traffic allocation, or duration.
- If any variant allocation < 10% in a multi-variant setup, display a warning: “Lopsided traffic may delay significance; consider a more balanced split.”
- Default wizard guidance contains no raw formulas or statistical notation; a link to “Learn more” is provided for details.
- Out-of-range inputs trigger inline validation and temporarily suppress the guidance until corrected.
Contextual Next-Step Recommendations Logic
- The system reads target confidence and power from experiment settings (defaults: 95% confidence, 80% power) and computes remaining sample and ETA at current traffic rate.
- If remaining sample > 0 AND ETA > 48 hours AND current experiment traffic allocation < 50%, recommend “Increase traffic allocation”; else recommend “Extend duration.”
- If confidence ≥ target AND power ≥ target AND remaining sample ≤ 0 AND |observed lift| ≥ MDE, recommend “Proceed to implement.”
- If confidence ≥ target AND power ≥ target AND |observed lift| < MDE, recommend “Extend duration.”
- If any variant’s actual allocation deviates > 10% absolute from planned, recommend “Wait” and append “Lopsided traffic detected.”
- Only one recommendation is displayed at a time using the priority order: Lopsided > Increase allocation > Extend duration > Proceed to implement.
- Unit tests cover boundary cases at ±0.5% around targets and at 10% imbalance with expected recommendations.
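These rules compose into a single-recommendation function. A sketch (argument names are illustrative; thresholds come from the rules above):

```python
def recommend(conf, power, target_conf, target_power, remaining, eta_hours,
              traffic_alloc, observed_lift, mde, max_alloc_deviation):
    """Return exactly one recommendation, applying the priority order
    Lopsided > Increase allocation > Extend duration > Proceed to implement."""
    if max_alloc_deviation > 0.10:
        return "Wait"  # UI appends "Lopsided traffic detected."
    if remaining > 0 and eta_hours > 48 and traffic_alloc < 0.5:
        return "Increase traffic allocation"
    if conf >= target_conf and power >= target_power:
        if remaining <= 0 and abs(observed_lift) >= mde:
            return "Proceed to implement"
        return "Extend duration"  # significant but lift below MDE
    return "Extend duration"
```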
Advanced Drill-Down Toggle for Details and Formulas
- A control labeled “Show formulas and assumptions” is present and off by default on Detail and Wizard views.
- When toggled on, it reveals definitions (confidence, power, MDE), assumptions, and the formulas/metrics (e.g., p-values, SE) used; when off, these are not rendered in the DOM.
- The toggle state persists per user and experiment across sessions.
- Expanding/collapsing does not cause layout shift > 200 px for primary content; viewport remains scrolled to the toggle area.
- A contextual help link to documentation is included within the expanded section.
- Toggle is fully accessible (role=button, aria-expanded, focusable, operable via keyboard).
Accessibility and Readability Compliance
- All guidance text achieves WCAG 2.2 AA contrast (≥ 4.5:1) and visible focus indicators.
- Guidance summary uses an ARIA live region (polite) so screen readers announce updates when metrics change.
- All interactive elements (toggle, links) are keyboard-accessible with logical tab order and descriptive labels.
- Templated copy meets Flesch-Kincaid Grade Level ≤ 8; automated checks run in CI and block merges on failure.
- No critical information is conveyed by color alone; icon+text is used for warnings.
- The feature passes manual screen reader smoke tests on NVDA and VoiceOver for the two primary views.
Localization and Number/Plural Formatting
- All guidance strings are externalized to i18n keys with translations for en-US, es-ES, fr-FR, and de-DE.
- Numbers, dates, and percents are formatted via ICU for the active locale (e.g., decimal comma in fr-FR/de-DE; non-breaking space before % where applicable).
- Pluralization rules produce grammatically correct units (e.g., 1 view vs. 2 views) in each supported locale.
- Pseudo-localization with 30% length expansion does not truncate or overflow at 320 px width; no clipped text observed.
- Missing translation keys fall back to en-US and log a single non-blocking warning per key.
- Language selection persists across sessions and correctly updates all guidance texts without page reload.
Copy Versioning and Telemetry
- Each guidance template has a version identifier (e.g., PLG_v1.2) surfaced in a debug panel and attached to analytics events.
- Product ops can update guidance templates via remote config without redeploy; changes propagate to clients within 5 minutes; rollback restores the prior version within 5 minutes.
- ≥ 99% of guidance impressions emit analytics with copy version, locale, scenario, and recommendation shown; gaps trigger an alert.
- A/B copy variants are supported via version flags, with target exposure accuracy within ±5% of the configured split.
- All changes to templates are auditable with timestamp, author, version, and diff accessible to admins.
Real-time Significance Dashboard
"As a data-savvy marketer, I want a live view of my experiment’s significance so that I can track progress and communicate status to stakeholders."
Description

Provide a live dashboard showing per-variant performance (conversion, delta, confidence intervals, p-values or Bayesian probabilities), traffic counts, runtime, imbalance indicators, and projected time to significance. Auto-refresh at configured intervals, support segment filters (e.g., marketplace, device), badge guard status (Healthy, Underpowered, Imbalanced, Paused), and allow CSV export. Ensure mobile-friendly layouts and respect PixelLift roles and permissions.

Acceptance Criteria
Variant Metrics Accuracy & Completeness
Given an active experiment with at least two variants and ongoing events
When a user with "View Experiments" permission opens the Real-time Significance Dashboard for that experiment
Then each variant row displays: conversion_rate, delta_abs_vs_control, delta_rel_vs_control, 95% confidence_interval [low, high], p_value_or_bayes_prob_best, visitors, conversions, runtime_hours, and traffic_share
And the displayed conversion_rate and counts match backend analytics within ±0.1 percentage points (conversion rate) and ±1 event (counts) at the time of refresh
And the dashboard shows a "Last updated" timestamp in ISO-8601 using the user's timezone
Auto-Refresh Configuration & Behavior
Given the user configures the auto-refresh interval to 30 seconds in PixelLift experiment settings
When the Real-time Significance Dashboard is open
Then data refresh requests occur every 30 ± 3 seconds and the "Last updated" timestamp advances without a full page reload
And setting the interval to Off disables auto-refresh and the dashboard does not issue background refresh requests
And if network connectivity is lost, the dashboard pauses refresh, shows a non-blocking warning, and auto-resumes within 5 seconds after connectivity is restored
Segment Filters by Marketplace and Device
Given the experiment has traffic across at least two marketplaces and two device types
When the user applies the filters Marketplace = "Etsy" and Device = "Mobile"
Then all displayed metrics, counts, status badges, and projections are recalculated from only the filtered segment
And the active filters are shown as removable pills and persist on page reload within the same session
And clearing all filters restores the unfiltered view
Guard Status Badges & Imbalance Indicators
Given an experiment with a configured planned split and active data collection
When any variant's traffic share deviates by more than 5 percentage points from the planned split over the last 60 minutes
Then the experiment shows an "Imbalanced" badge and the affected variants show an imbalance tooltip explaining the deviation
And when the experiment is manually paused, the badge reads "Paused"
And when the required sample size per variant (for 95% confidence using Significance Guard defaults) has not been met and runtime is under 14 days, the badge reads "Underpowered"
And when none of the above conditions hold, the badge reads "Healthy"
Projected Time to Significance
Given the dashboard can compute required_additional_views to reach 95% confidence and a rolling 24-hour average traffic_per_day for the current filters
When the dashboard renders projections
Then it displays Projected time to significance as a range in hours or days, computed as required_additional_views / traffic_per_day, with rounding to the nearest hour for < 48 hours and to the nearest day otherwise
And when traffic_per_day = 0 or required_additional_views is undefined, it displays "Not enough data" instead of a numeric projection
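The projection and rounding rules above can be sketched as a small pure function. This is a non-authoritative illustration of the stated behavior; the function name and display strings are assumptions, not part of the spec.

```python
def project_time_to_significance(required_additional_views, traffic_per_day):
    """Projected time to significance per the acceptance criteria:
    rounded to the nearest hour below 48 hours, nearest day otherwise;
    'Not enough data' when the projection is undefined."""
    if not traffic_per_day or required_additional_views is None:
        return "Not enough data"
    hours = required_additional_views / traffic_per_day * 24
    if hours < 48:
        n = max(1, round(hours))
        return f"~{n} hour{'s' if n != 1 else ''}"
    n = round(hours / 24)
    return f"~{n} day{'s' if n != 1 else ''}"
```

For example, 4,000 remaining views at 1,000 views/day projects to roughly 4 days, while undefined inputs fall back to the "Not enough data" label rather than a misleading number.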
CSV Export with Applied Context
Given a user with the "Export Experiments Data" permission is viewing the dashboard with any active filters
When they click Export CSV
Then a CSV downloads within 5 seconds containing one row per variant with the columns: timestamp_iso, experiment_id, variant_key, visitors, conversions, conversion_rate, delta_abs, delta_rel, ci_low, ci_high, p_value_or_bayes_prob, runtime_hours, traffic_share, guard_status, marketplace, device
And the exported data reflects the same filters and time range as the current view
And numeric values use '.' as the decimal separator and percentages are represented as decimals (e.g., 0.1234)
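A minimal sketch of the export serializer, assuming the column list above; Python's `csv` module already emits '.' decimal separators for numeric values formatted via `str`, which satisfies the last criterion. The helper name and row shape are illustrative assumptions.

```python
import csv
import io

# Column order taken verbatim from the acceptance criteria.
EXPORT_COLUMNS = [
    "timestamp_iso", "experiment_id", "variant_key", "visitors", "conversions",
    "conversion_rate", "delta_abs", "delta_rel", "ci_low", "ci_high",
    "p_value_or_bayes_prob", "runtime_hours", "traffic_share", "guard_status",
    "marketplace", "device",
]

def export_rows(rows):
    """Serialize per-variant dicts to CSV text; missing fields become blanks."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=EXPORT_COLUMNS)
    writer.writeheader()
    for row in rows:
        writer.writerow({col: row.get(col, "") for col in EXPORT_COLUMNS})
    return buf.getvalue()
```

Percentages would be stored as decimals (0.1234, not "12.34%") before reaching this layer, so no locale-specific formatting is applied on export.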
Access Control and Mobile-Friendly Layout
Given a user without the "View Experiments" permission navigates to a dashboard URL
When the page loads
Then they are blocked with a 403 response or redirected to an access-denied screen, and no experiment data is returned by the API
And given a user with "View Experiments" but without "Export Experiments Data"
When they view the dashboard
Then the Export CSV control is not visible or is disabled with an explanatory tooltip
And given a device viewport width of 360 px
When the dashboard loads
Then all content fits without horizontal scrolling, metrics are presented in stacked cards, interactive controls have a minimum tap target of 44x44 px, and status badges and key metrics remain visible above the fold
Slack and Email Alerting
"As a test owner, I want alerts when my experiment needs attention so that I can act promptly without polling dashboards."
Description

Send actionable notifications for key events: plan created, threshold reached, auto-pause triggered, significance achieved, max runtime hit, and data quality issues. Support per-workspace configuration of channels, quiet hours, and severity. Implement secure Slack webhooks with deep links to the dashboard and fall back to email. Batch low-priority updates into daily digests to reduce noise.

Acceptance Criteria
Secure Slack Webhook Setup Per Workspace
- Given I am a workspace admin, when I add a Slack webhook URL for the workspace, then the system verifies it with a test request and receives a 2xx response within 5 seconds; otherwise the URL is rejected with a clear error.
- The webhook URL is stored encrypted at rest and redacted in all UI/API responses except the last 4 characters.
- I can rotate the webhook by adding a new one and disabling the old in a single flow, with no alert loss during rotation.
- An audit log entry records actor, timestamp, workspace ID, action (add/rotate/remove), and the last 4 characters of the webhook.
Channel, Severity, and Event Routing
- Given a workspace mapping of event types [plan_created, threshold_reached, auto_pause, significance_achieved, max_runtime_hit, data_quality_issue] to severity [low, medium, high] and destinations (Slack channels and/or email recipients), when any listed event fires, then notifications are delivered to all configured destinations within 60 seconds.
- If an event lacks an explicit mapping, it defaults to severity "medium" and the workspace default Slack channel and email list.
- Each notification displays the severity badge and the event type prefix (e.g., "Significance Achieved").
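The routing-with-default behavior above reduces to a small lookup. This sketch assumes a mapping shape of `{event_type: {"severity": ..., "destinations": [...]}}`; the actual config schema is not specified here.

```python
DEFAULT_SEVERITY = "medium"

def resolve_routing(event_type, mapping, default_destinations):
    """Resolve (severity, destinations) for an event.

    Unmapped event types fall back to "medium" severity and the
    workspace default destination list, per the acceptance criteria.
    """
    rule = mapping.get(event_type)
    if rule is None:
        return DEFAULT_SEVERITY, list(default_destinations)
    return rule["severity"], list(rule["destinations"])
```

Copying the destination lists keeps later mutations by the dispatcher from leaking back into the shared config.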
Quiet Hours With Critical Override
- Given quiet hours are configured with a workspace time zone, when a low- or medium-severity event occurs during quiet hours, then Slack notifications are suppressed and email is queued for the daily digest.
- When a high-severity event occurs during quiet hours, then notifications are sent immediately to all destinations, ignoring quiet hours.
- Quiet hours support at least one daily window per workspace and can be toggled off; configuration changes take effect within 5 minutes.
Alert Content and Deep Links
- Each notification includes: event title, event type, severity, test name and ID, workspace name, timestamp (UTC), primary metric with current values, and a deep-link button to the specific test dashboard view.
- For threshold-related events, the message includes plain-language guidance such as "Need ~{n} more views for 95% confidence" based on current estimates.
- Deep links open the correct dashboard route with the test preselected and an anchor to the relevant section; the link resolves with HTTP 200.
- Slack alerts render using Block Kit and email alerts render with responsive HTML; preview rendering completes within 2 seconds in test clients.
Email Fallback and Deduplication
- If Slack delivery fails (non-2xx or 3 consecutive timeouts > 5 seconds), then an email with equivalent content is sent to the configured recipients within 2 minutes.
- An event is not notified more than once per destination within a 10-minute deduplication window.
- Delivery outcomes (success/failure with timestamp and destination) are recorded per notification.
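The 10-minute deduplication window can be sketched as an in-memory guard keyed by event and destination. A real deployment would back this with a shared store (e.g., Redis) so multiple workers agree; the class and method names below are illustrative.

```python
DEDUP_WINDOW_SECONDS = 600  # 10-minute window from the acceptance criteria

class NotificationDeduplicator:
    """Suppresses repeat sends per (event, destination) within the window."""

    def __init__(self, window=DEDUP_WINDOW_SECONDS):
        self.window = window
        self._last_sent = {}  # (event_key, destination) -> send timestamp

    def should_send(self, event_key, destination, now):
        """Return True and record the send if outside the window, else False."""
        key = (event_key, destination)
        last = self._last_sent.get(key)
        if last is not None and now - last < self.window:
            return False
        self._last_sent[key] = now
        return True
```

Note the window is per destination, so a Slack channel and an email recipient are deduplicated independently.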
Auto-Pause and Max Runtime Alerts
- When an experiment is auto-paused due to underpowered/lopsided sampling or reaches the max runtime, a high-severity alert is sent within 60 seconds containing the reason, pause timestamp, and current sample sizes.
- The experiment status in the dashboard shows "Paused" within 60 seconds of alert dispatch, and the deep link in the alert lands on the paused-state view.
- Resuming an experiment does not emit a duplicate pause alert.
Daily Digest for Low-Priority Events
- Low-severity events accumulated over the last 24 hours are batched into a single digest per workspace and sent at the configured local time.
- The digest includes counts by event type, the 10 most recent items with summaries, and a "View all events" deep link to the dashboard's events view.
- Individual low-severity alerts included in the digest are not sent separately.
- Users can opt out of digests per channel (Slack/email); workspace defaults apply when user settings are absent.
Multiple Testing and Peeking Controls
"As a product analyst, I want guardrails against peeking and multiple comparisons so that our decisions remain statistically valid."
Description

Introduce controls to limit inflated false positives from repeated looks and concurrent experiments. Support alpha-spending/group-sequential methods for interim analyses and optional false discovery rate control across parallel tests. Enforce minimum observation windows, display adjusted thresholds and decisions, and allow workspace-level configuration with clear explanations of trade-offs.

Acceptance Criteria
Alpha-Spending for Planned Interim Looks
Given a test has L planned interim looks with an overall alpha set and an alpha-spending method selected (e.g., O’Brien–Fleming or Pocock)
And the look schedule (information fractions or sample targets) is preregistered before traffic starts
When interim looks occur at the preregistered information fractions
Then the displayed adjusted alpha and critical boundaries for each look equal the values from the selected spending function within an absolute tolerance of 1e-4
And significance decisions at each look use the adjusted thresholds (not the nominal alpha)
And any attempt to add an unplanned extra look is blocked with an error and audit-logged
And the cumulative alpha spent plus remaining never exceeds the configured overall alpha
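As a reference point for the tolerance check above, here is the Pocock-type spending function (one of the two methods named); O’Brien–Fleming spending needs the normal quantile and is omitted for brevity. The cumulative spend at information fraction t is α·ln(1 + (e−1)·t), which equals the full α at t = 1.

```python
import math

def pocock_alpha_spent(t, alpha=0.05):
    """Cumulative alpha spent at information fraction t (0 < t <= 1)
    under the Pocock-type spending function: alpha * ln(1 + (e - 1) * t)."""
    return alpha * math.log(1.0 + (math.e - 1.0) * t)

def per_look_spend(fractions, alpha=0.05):
    """Incremental alpha available at each preregistered look, in order."""
    spent, increments = 0.0, []
    for t in fractions:
        cum = pocock_alpha_spent(t, alpha)
        increments.append(cum - spent)
        spent = cum
    return increments
```

The invariant in the last criterion falls out directly: the increments are positive and sum to the cumulative spend, which never exceeds the configured overall alpha.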
Enforce Minimum Observation Window
Given the workspace minimum observation window is configured (e.g., 7 days) and test-level overrides are disabled
When a user attempts to stop a test or declare a winner before the window has elapsed from first exposure
Then the Stop and Declare Winner actions are disabled and an inline message shows the remaining-time countdown
And interim statistics remain view-only until the window elapses
And Slack and email notifications are sent on the first denied attempt with a link to the policy
And an audit-log entry records the attempt with user, timestamp, and action
False Discovery Rate Control Across Parallel Tests
Given FDR control is enabled at level q (e.g., q=0.10) for primary metrics across concurrently completed tests in the workspace
And a batch of M completed tests with raw p-values p1..pM exists
When FDR evaluation runs (nightly automatically and on demand)
Then the set of tests flagged Significant (FDR) exactly matches the Benjamini–Hochberg procedure at level q on p1..pM
And each test displays its raw p, rank, BH critical value, and q-value
And tests failing FDR are labeled Not significant (FDR) even if raw p < 0.05
And dashboard views and CSV exports reflect the same FDR flags consistently
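For reference, the Benjamini–Hochberg procedure the flags must match: sort the p-values, find the largest rank k with p_(k) ≤ (k/m)·q, and flag every test at rank ≤ k. A short sketch:

```python
def benjamini_hochberg(p_values, q=0.10):
    """Return a boolean list: True where the test passes BH FDR control at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    flags = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            flags[idx] = True
    return flags
```

Note the step-up nature: a test can pass even if its own p exceeds its critical value, as long as a larger-ranked test passes; this is why the flag set must be computed over the whole batch, not per test.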
Display Adjusted Thresholds and Decisions in Real Time
Given a running test uses a group-sequential design with planned looks
When sample counts update or a planned look is reached
Then the dashboard shows the current adjusted p-value threshold, z-boundary, information fraction, alpha spent, and next-look criteria
And the decision chip reads "Adjusted significant" or "Not significant" based solely on the adjusted threshold
And a tooltip/link provides a plain-language explanation of the adjustment and trade-offs
And displayed values update within 30 seconds of data changes
Workspace-Level Configuration and Explanations
Given a workspace admin opens Significance Guard settings
When they configure the default overall alpha (0.01–0.10), alpha-spending method (O’Brien–Fleming, Pocock), max planned looks (1–6), minimum observation window (1–30 days), and FDR control (Off/BH/BY with q in 0.01–0.20)
Then validation enforces the allowed ranges and prevents saving invalid combinations
And each option includes a plain-language explanation with examples accessible via tooltip or help link
And saved changes apply to new tests only and do not retroactively alter existing tests
And all changes are audit-logged with timestamp, actor, and old→new values
Auto-Pause for Underpowered or Lopsided Tests
Given a running test has not met the planned information fraction for the current look or exhibits allocation imbalance worse than 65/35 for more than 2 hours
When a user attempts to declare a winner or the system evaluates significance
Then significance evaluation is paused and declaration actions are disabled
And a banner explains the reason ("underpowered" or "allocation imbalance") and the conditions required to resume
And Slack/email alerts notify test owners with current metrics and recommended next steps
And the system automatically resumes evaluation once the conditions are satisfied
Audit Log and Decision Rationale
"As a team lead, I want an audit trail of experiment decisions so that we can review, learn, and ensure accountability."
Description

Maintain an immutable, exportable audit trail capturing sample size assumptions, threshold settings, auto-pause events, overrides with actor and reason, alert deliveries, and final significance calls. Timestamp and attribute all entries, expose them within experiment details, and provide filters and exports for reviews and compliance. Support optional rollback for reversible operations with linked rationale.

Acceptance Criteria
Immutability and Attribution of Audit Entries
- Given an audit entry exists, when any user or service attempts to edit or delete it via UI or API, then the system denies the request (HTTP 403) and records an "edit-denied" event referencing the original entry.
- Each audit entry includes: event_id (UUIDv4), experiment_id, event_type, actor_id, actor_type (user/service), actor_display, an ISO 8601 UTC timestamp (ms precision), and an integrity_hash.
- Exported logs include a chain_hash that allows verification that no entries were removed or altered; a verification endpoint returns Pass for an untampered export.
Capture of Sample Size and Threshold Configurations
- On experiment creation or update, the system appends a "config-set" entry capturing: confidence_target, power_target, MDE, min_duration, traffic_split, variant_count, and any significance thresholds.
- On any change to these configurations, the system appends a "config-changed" entry with before_values and after_values, requires an actor and reason (min 10 characters), and increments config_version.
- No prior config entries are modified; the latest config_version is displayed as current in the experiment details UI.
Auto-Pause Detection and Override with Rationale
- When an experiment is auto-paused due to underpowered or lopsided traffic, the system appends an "auto-pause" entry including trigger_condition, reason_code, metrics_snapshot_id, and estimated_additional_samples.
- The UI displays the pause status and links to the corresponding "auto-pause" log entry within the experiment details.
- When a permitted user overrides the pause, the system requires a rationale (min 10 characters), appends an "override" entry with actor, reason, and timestamp, resumes the experiment, and cross-references the "auto-pause" entry.
Alert Delivery Logging (Slack/Email)
- For each alert attempt, the system appends an "alert" entry with: channel (Slack/Email), recipient_or_channel_id, event_type, template_id, attempt_number, provider_message_id (if available), status (queued/sent/failed), and timestamp.
- Failed alerts are retried up to 3 times with exponential backoff; each attempt generates its own "alert" entry; final failures include error_code and error_message.
- Alerts triggered by auto-pause or overrides are logged within 5 seconds of the triggering event.
Final Significance Decision and Rationale Logging
- When a final significance decision is made (automatic or manual), the system appends a "final-call" entry containing: decision (win/lose/inconclusive), rule_applied, p_value or posterior, confidence or CI, effect_size, sample_sizes per variant, test_method, and metrics_snapshot_id.
- The entry records the actor (auto or user_id) and rationale text (required for manual decisions), and locks the experiment decision.
- Any subsequent revision creates a new "final-call-revised" entry referencing the original; the original remains unchanged.
Experiment Details Audit Log: View, Filter, and Export
- The experiment details page exposes an Audit tab showing entries in reverse chronological order with pagination (default 50 per page).
- Filters include date_range, event_type, actor, and decision_outcome; applying filters updates results within 1 second for logs ≤ 10,000 entries.
- Export respects active filters and supports CSV and JSON; exports of ≤ 50,000 entries complete within 30 seconds, include integrity hashes, and create an "export-created" audit entry.
Rollback of Reversible Operations with Linked Rationale
- For reversible operations (e.g., configuration changes), the UI provides a "Rollback" action that requires a rationale (min 10 characters).
- Performing a rollback appends a "rollback" entry that references the original change, re-applies the prior values, and records actor and timestamp; original entries remain intact.
- The current effective state reflects the rollback immediately, and the UI displays the linkage between the original and rollback entries.

Inventory Sync

Tie testing to stock levels so you don’t burn inventory on a losing look. Style Splitter throttles or stops tests when items near low stock, shifts traffic to stable variants, and delays new tests until replenishment—ideal for fast drops and recommerce where availability fluctuates hourly.

Requirements

Real-time Stock Intake & SKU Mapping
"As a boutique owner, I want PixelLift to sync my live stock by variant across my store so that test decisions reflect actual availability within a minute."
Description

Establish connectors to Shopify, WooCommerce, BigCommerce, and custom sources to ingest near real-time stock updates via webhooks with a 60s polling fallback. Normalize inputs to a per-variant SKU model (on-hand, available-to-promise, backorderable, multi-location) and map each SKU to its corresponding Style Splitter experiment variant. Handle ID resolution across systems, ensure idempotent processing, and implement rate-limit-aware batching, retries with exponential backoff, and circuit breakers. Provide a lightweight mapping UI and SDK endpoints for custom integrations. Guarantee <60s data freshness, multi-warehouse support, and secure handling (scoped OAuth, least-privilege access, encryption in transit/at rest).

Acceptance Criteria
Sub-60s Inventory Freshness via Webhooks with 60s Poll Fallback
Given a connected store updates stock for SKU S at time T0
When the platform delivers a stock-change webhook for S or, if unavailable, the 60s poll cycle runs
Then PixelLift reflects the new inventory state for S across all internal services within 60 seconds of T0
And if both webhook and poll detect the same change, the system applies it once and preserves last-write-wins ordering
And the ingestion service exposes a freshness metric where the p95 end-to-end update latency is ≤ 60s during steady state
Normalized Per-Variant SKU Model with Multi-Warehouse ATP
Given inventory payloads from Shopify, WooCommerce, BigCommerce, and a custom JSON source for the same product variant
When PixelLift ingests these payloads
Then the record for the variant is normalized to a schema containing at minimum: sku (string), external_variant_id (string), on_hand (integer ≥ 0), available_to_promise (integer ≥ 0), backorderable (boolean), locations[] (array of {location_id, on_hand, available_to_promise})
And if the vendor payload omits available_to_promise, PixelLift computes it as max(0, on_hand - allocations), with allocations defaulting to 0
And backorderable defaults to false if not provided by the vendor
And the sum of locations[].available_to_promise equals the variant-level available_to_promise within a ±1 unit tolerance for rounding policies
Idempotent Processing and Duplicate Delivery Handling
Given the same stock-change event for SKU S is delivered via webhook up to 3 times with an identical event_id or payload within 5 minutes
When the ingestion workers process these deliveries concurrently
Then inventory for S is updated exactly once with a single applied-change record
And subsequent duplicate deliveries are acknowledged without modifying inventory values or last_updated_at
And the processing logs/audit trail show one Applied and the rest Deduplicated with the same idempotency key
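The exactly-once behavior above hinges on an idempotency key (event_id, or a payload hash when the vendor omits one). A minimal single-process sketch, assuming these class and field names; a real system would persist the seen-keys set transactionally alongside the stock write:

```python
class IdempotentInventoryStore:
    """Applies each stock-change event exactly once, keyed by event_id."""

    def __init__(self):
        self.stock = {}    # sku -> on_hand
        self.seen = set()  # idempotency keys already applied
        self.audit = []    # ("Applied" | "Deduplicated", event_id) records

    def apply(self, event_id, sku, on_hand):
        """Return True if the event mutated inventory, False if deduplicated."""
        if event_id in self.seen:
            self.audit.append(("Deduplicated", event_id))
            return False
        self.seen.add(event_id)
        self.stock[sku] = on_hand
        self.audit.append(("Applied", event_id))
        return True
```

The audit trail then shows exactly one Applied record per key, with duplicates acknowledged but inert, matching the criterion above.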
Cross-System ID Resolution and Style Splitter Variant Mapping
Given SKU S is mapped to Style Splitter experiment E and variant V
When a stock update for S arrives using any supported identifier (e.g., Shopify variant_id, SKU code, custom external_id)
Then PixelLift resolves the identifier to the internal SKU record and updates the inventory state linked to E/V
And if S has no mapping, the update is stored but flagged UNMAPPED and no experiment routing changes are performed
And when a user changes the mapping of S to a different experiment/variant, the new mapping takes effect within 60 seconds and is audit-logged
Rate-Limit-Aware Batching, Retries, and Circuit Breakers
Given a vendor API returns 429 or 5xx responses and provides Retry-After and/or rate-limit headers
When the ingestion client encounters these responses
Then requests are batched to vendor-recommended page sizes and concurrency limits
And retries use exponential backoff with jitter, starting at 1s and doubling up to a max of 32s, honoring Retry-After when present
And after 5 consecutive retry-eligible failures per vendor, a circuit breaker opens for 60s, during which calls are not attempted and events are queued
And upon successful half-open probes, the circuit closes and the backlog drains without data loss, restoring freshness to ≤ 60s within 15 minutes for backlogs of ≤ 10,000 events
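The retry-delay rule above (1s doubling to a 32s cap, full jitter, Retry-After wins) can be sketched as one function; the name and 0-based attempt convention are assumptions:

```python
import random

def backoff_delay(attempt, retry_after=None, base=1.0, cap=32.0):
    """Seconds to wait before retry `attempt` (0-based).

    Honors an explicit Retry-After header when present; otherwise uses
    exponential backoff with full jitter: uniform(0, min(cap, base * 2**attempt)).
    """
    if retry_after is not None:
        return float(retry_after)
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```

Full jitter spreads retries across the whole interval, which avoids synchronized retry storms against an already rate-limited vendor API; the circuit breaker in the criterion sits above this, stopping calls entirely after 5 consecutive failures.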
Mapping UI and SDK Endpoints for Custom Integrations
Given a user with the Inventory Manager role opens the SKU Mapping UI
When they search for a SKU, create/update/delete a mapping, or bulk-upload a CSV of up to 5,000 rows
Then the UI validates inputs (existing SKU, existing experiment/variant, one-to-one mapping) and prevents conflicts with clear error messages
And successful changes persist and are reflected in GET /sku-mappings and in experiment routing within 60 seconds
And SDK endpoints (GET/POST/PUT/DELETE /sku-mappings) are documented, versioned, support idempotent PUT, return standard errors (400, 401, 403, 404, 409, 429), and meet a p95 latency of ≤ 300ms under nominal load
Security: Scoped OAuth, Least Privilege, Encryption, and Audit
Given a store owner connects PixelLift to their commerce platform via OAuth
When the connection is established
Then the requested scopes are limited to inventory- and product-level access; no order/PII scopes are requested
And all data in transit uses TLS 1.2+ and secrets/tokens are stored encrypted at rest using a managed KMS
And revoking the OAuth grant disables webhooks, polling, and SDK operations for that store within 5 minutes
And all access to inventory and mappings is audit-logged with actor, action, target, timestamp, and outcome, and logs contain no secrets
Low-Stock Threshold Rules & Throttling
"As a growth marketer, I want configurable low-stock rules that automatically throttle or pause tests so that we don’t burn inventory on variants that can’t be fulfilled."
Description

Provide configurable low-stock policies at global, collection, product, and variant levels using absolute units, days-of-cover, or percentage thresholds with hysteresis to prevent flapping. When thresholds are reached, automatically pause experiments, cap variant traffic (e.g., max N%), or route 100% to control. Support rule precedence, time windows for drops/flash sales, and a simulation mode to preview impact. Evaluate rules on every stock change event and at least every 60s, logging deterministic outcomes for auditability.

Acceptance Criteria
Scoped thresholds and hysteresis prevent flapping
Given thresholds configured at global, collection, product, and variant scopes using units, days-of-cover, and percentage with trigger/clear hysteresis
And a variant with stock transitions 15 -> 9 -> 11 -> 12 units and a units threshold of trigger=10, clear=12
When stock drops to 9
Then the governing rule transitions the variant into the low-stock state exactly once and persists "throttled=true"
When stock rises to 11
Then the low-stock state remains active because the clear threshold (12) is not reached
When stock rises to 12 or above
Then the low-stock state clears and "throttled=false" is persisted
Given a DOC threshold of trigger=1.0 days, clear=1.2 days
When the calculated DOC falls below 1.0
Then the low-stock state activates and only clears once DOC >= 1.2
Given a percentage threshold of trigger=10%, clear=12%
When current stock / initial stock <= 10%
Then the low-stock state activates and only clears once the percentage >= 12%
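The anti-flapping behavior above is a classic two-level hysteresis state machine. A minimal sketch, assuming activation at or below the trigger level (the exact boundary semantics at the trigger value itself are a design choice the spec leaves open):

```python
class HysteresisThreshold:
    """Low-stock state with separate trigger and clear levels to prevent flapping."""

    def __init__(self, trigger, clear):
        assert clear > trigger, "clear level must sit above the trigger level"
        self.trigger, self.clear = trigger, clear
        self.throttled = False

    def update(self, value):
        """Feed the current stock metric; return the resulting throttled state."""
        if not self.throttled and value <= self.trigger:
            self.throttled = True            # enter low-stock state once
        elif self.throttled and value >= self.clear:
            self.throttled = False           # only clear at the higher level
        return self.throttled
```

Replaying the example sequence 15 -> 9 -> 11 -> 12 with trigger=10, clear=12 yields exactly one activation and one clear; the bounce to 11 does not flap the state. The same class serves DOC and percentage thresholds by feeding the respective metric.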
Low-stock-triggered traffic throttling and routing
Given an active A/B test with a control and one or more test variants, and a governing low-stock rule with the action "pause experiments"
When the rule triggers
Then all affected experiments move to the paused state within the next allocation cycle and 100% of new sessions are routed to control
And existing in-flight sessions retain their assigned variant until the session ends
Given a governing rule with the action "cap variant traffic" and cap=20%
When the rule triggers
Then each affected test variant receives at most 20% of new traffic and the remainder shifts to control, validated over any rolling 1,000 assignments
Given a governing rule with the action "route 100% to control"
When the rule triggers
Then all new traffic is routed to control and test variants receive 0% of new assignments
When the governing rule clears
Then traffic allocations restore to their pre-trigger configured targets within one allocation cycle
Deterministic rule precedence across scopes
Given active rules at global, collection, product, and variant scopes that conflict
When precedence is resolved
Then the most specific scope wins, in order: variant > product > collection > global
Given two rules at the same scope with different actions
When precedence is resolved
Then the more restrictive action wins, in order: route 100% to control > pause experiment > cap (with the smallest cap winning) > no action
Given two rules at the same scope and the same action severity
When precedence is resolved
Then the rule with the latest updated_at timestamp wins
Then the engine logs the winning rule ID and precedence path in the audit log for the decision
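Because the precedence order must be deterministic and auditable, it maps cleanly onto a composite sort key. A sketch, assuming rules are dicts with `scope`, `action`, optional `cap`, and a numeric `updated_at`; the action names are illustrative labels for the four actions above:

```python
# Smaller rank = higher precedence, per the acceptance criteria.
SCOPE_RANK = {"variant": 0, "product": 1, "collection": 2, "global": 3}
ACTION_RANK = {"route_all_to_control": 0, "pause_experiment": 1, "cap": 2, "none": 3}

def resolve_precedence(rules):
    """Pick the winning rule: most specific scope, then most restrictive action
    (smallest cap among caps), then latest updated_at as the tiebreaker."""
    def sort_key(rule):
        return (
            SCOPE_RANK[rule["scope"]],
            ACTION_RANK[rule["action"]],
            rule.get("cap", 0.0),   # among caps, the smallest cap is most restrictive
            -rule["updated_at"],    # later update wins remaining ties
        )
    return min(rules, key=sort_key)
```

Logging `sort_key(winner)` alongside the winning rule ID gives the audit trail the full precedence path for the decision.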
Time-windowed overrides for drops and flash sales
Given a time-windowed rule configured with a start and end in the seller’s business timezone
When the current time is within the window
Then the time-windowed rule becomes eligible for precedence evaluation and can override baseline rules per standard precedence
When the window starts or ends
Then an evaluation is triggered and the resulting state is applied within the next allocation cycle
Given overlapping time-windowed rules at the same scope
When precedence is resolved
Then the more restrictive action wins; ties are broken by the latest updated_at timestamp
When the window ends
Then the system reverts to the baseline governing rule without manual intervention
Simulation mode parity and safety
Given the current catalog, stock snapshots, active experiments, and configured rules
When a user runs simulation mode for "now" or an "as of" timestamp
Then the simulation returns, for each affected entity, the governing rule ID, predicted action, predicted allocation, and an explanation path
Then the simulation produces a decision hash that matches the live engine’s hash for the same inputs
Then the simulation performs no writes to live allocation or experiment state
Event-driven and periodic evaluation cadence
Given stock change events for an entity under test
When a stock change event is received
Then the engine evaluates rules for that entity and applies any resulting action without waiting for the 60s heartbeat
Given a period of no stock changes
When 60 seconds elapse
Then the engine evaluates rules for active entities via the heartbeat
Given a burst of multiple stock changes within 5 seconds
When evaluations are performed
Then the resulting traffic/control state is idempotent and reflects the latest stock, with no more than one state transition per evaluation batch
Then each active entity has a "last evaluated at" timestamp that never exceeds 60 seconds in steady state
Deterministic audit logging and reproducibility
Given any evaluation (event-driven, heartbeat, or simulation)
Then a log record is written containing the timestamp (UTC), evaluation type, entity IDs and scope, input stock metrics, applicable rule IDs with parameters, precedence path, prior and new state, and a deterministic decision hash
When the same inputs are replayed
Then the recomputed decision hash equals the logged hash
Given a query by entity ID, rule ID, and time range
When the audit log is fetched
Then the results are complete and ordered by time, and records are immutable
Auto Traffic Shift to Stable Variants
"As a product manager, I want traffic to shift automatically to stable variants when a tested look is low on stock so that we maintain conversions without overselling constrained items."
Description

Integrate the rules engine with Style Splitter’s allocator to dynamically reassign traffic away from constrained SKUs and toward stable variants or control while preserving experimental integrity (consistent unit assignment, holdout preservation). Enforce guardrails such as max reallocation per interval and minimum sample per variant to avoid bias. Provide real-time visibility into allocation, conversion impact, and inventory burn avoided.

Acceptance Criteria
Dynamic Reallocation on Low-Stock Trigger
Given a running Style Splitter experiment with multiple variants (including control) and a variant’s SKU falls below the configured low_stock_threshold
And guardrails are configured: max_reallocation_pct_per_interval=25%, interval=10m, min_sample_per_variant=200 units
When an inventory event marks the variant as low stock
Then within 60 seconds the allocator reduces new incoming traffic to the low-stock variant by no more than 25% of total traffic for that interval
And reallocates the removed traffic proportionally to stable variants and/or control only
And emits an audit event including timestamp, rule_id, reason, from_allocation, to_allocation
Guardrails Enforcement: Max Reallocation and Minimum Sample
Given guardrails are configured for the experiment
When a reallocation is requested by the rules engine
Then the absolute allocation change executed for any variant within the active interval does not exceed max_reallocation_pct_per_interval
And additional reallocation requests within the same interval are queued to the next interval
And if any variant’s recent sample size is below min_sample_per_variant, no reallocation away from or toward that variant is executed
And a guardrail_blocked event is logged with the blocked amount and reason
Experimental Integrity: Sticky Assignment and Holdout Preservation
Given units are bucketed by unit_id When reallocation occurs during the experiment Then at least 99.5% of previously bucketed unit_ids retain their assigned variant for the duration of the experiment And the control/holdout share remains within ±0.5 percentage points of its configured value over any 1-hour window And allocation changes in one experiment do not alter unit assignment in other concurrent experiments
Pause/Resume and Test Deferral on Insufficient Inventory
Given all non-control variants become low stock or out of stock, or projected sell-through for a variant exceeds remaining inventory for the next interval When this condition is detected Then within 60 seconds the system pauses traffic to the affected variants and routes all new traffic to control or other stable variants And the experiment status updates to "Paused - Inventory" with the blocking SKUs listed And upon receiving a replenishment event that clears thresholds, the system automatically resumes normal allocation within 5 minutes and records a resume event And new experiments targeting constrained SKUs are prevented from starting and are queued with reason "Awaiting Replenishment"
Real-Time Visibility and Audit Logging
Given the Allocation dashboard is open When the allocator executes a reallocation Then the UI updates within 5 seconds to show per-variant current allocation %, delta from previous, and timestamp And the Inventory panel displays current stock, low-stock threshold, and inventory burn avoided in units and currency And an audit log entry is stored with fields: experiment_id, sku_id, event_type, reason, from_allocation, to_allocation, guardrail_applied, actor, timestamp And the API GET /experiments/{id}/allocation_history returns the last 24 hours of changes with p95 latency ≤ 300 ms
Fallback Routing When No Stable Alternatives
Given a reallocation is required but no non-constrained variant exists other than control (or control itself is also constrained) When the allocator selects targets Then traffic is routed only to variants marked stable, preferring control when available And if no stable variants exist, the experiment halts new traffic and displays "No stable variants available" with recommended actions And product page requests continue to serve a default experience without 4xx/5xx errors

Analytics: Conversion Impact and Inventory Burn Avoided
Given the experiment has accumulated at least 500 visits after the latest reallocation When viewing the Analytics panel or exporting data Then the system displays per-variant conversion rate, delta vs 7-day pre-shift baseline, and uplift with 95% confidence intervals And it displays inventory burn avoided as projected_units_without_shift minus actual_units_after_shift, and the equivalent currency value And metrics refresh at least every 60 seconds and include a data freshness timestamp And CSV export and API values match the dashboard within ±0.1% for the same time window
Test Launch Inventory Gatekeeper
"As a merchant, I want PixelLift to block new tests when inventory is too low so that I don’t start experiments that can’t reach significance before selling out."
Description

Block or defer the launch of new Style Splitter tests when projected inventory cannot support the required sample size or run duration. Compute safe test capacity from recent sales velocity, current stock, lead time, and desired statistical power. Offer a preflight checklist that explains why a launch is blocked, with options to auto-queue until replenishment, reduce the variant count, or switch to sequential tests. Expose API and UI hooks to schedule starts for drops and limited runs.

Acceptance Criteria
Preflight Block on Insufficient Test Capacity
Given a proposed Style Splitter test with k variants, desired power p, minimum per-variant sample size Nmin(p), minimum run duration Dmin, current on-hand stock S, confirmed inbound quantity Q_in with ETA t_in, recent sales velocity v (units/hour), safety buffer B, and lead time L When the gatekeeper computes projected available testable units U over window W = max(Dmin, time_to_collect_samples) as U = S + inbound_before(W) - v*W - B Then if U < k*Nmin(p), the system blocks launch and displays a preflight checklist with computed S, v, W, B, Q_in, U, k, Nmin(p), and deficit = k*Nmin(p) - U And the Launch action is disabled with reason code INV_BLOCK and an API 409 response on launch attempts including the same fields And the preflight lists limiting factors ranked by contribution (≥10% of deficit) and a timestamped snapshot source for each input
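The capacity formula above (U = S + inbound_before(W) − v·W − B, blocked when U < k·Nmin(p)) can be expressed directly. A hedged sketch — the function names and return shape are illustrative, not the gatekeeper's real interface:

```python
# Illustrative capacity gate following the formula in the criteria:
# U = stock + inbound_before(W) - velocity * W - buffer,
# blocked with reason INV_BLOCK when U < k * Nmin(p).

def projected_capacity(stock, inbound_before_window, velocity_per_hour,
                       window_hours, safety_buffer):
    """Projected available testable units U over window W (hours)."""
    return (stock + inbound_before_window
            - velocity_per_hour * window_hours - safety_buffer)

def gate_launch(stock, inbound, velocity, window_hours, buffer,
                n_variants, min_sample_per_variant):
    u = projected_capacity(stock, inbound, velocity, window_hours, buffer)
    required = n_variants * min_sample_per_variant
    if u < required:
        return {"blocked": True, "reason": "INV_BLOCK",
                "capacity": u, "deficit": required - u}
    return {"blocked": False, "capacity": u, "deficit": 0}
```

For example, with S=1000, 200 inbound units, v=10 units/hour over a 48-hour window and a buffer of 100, U is 620: enough for 3 variants at Nmin=200, but a 4th variant would be blocked with a deficit of 180.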
Auto-Queue Launch Until Replenishment
Given a blocked test with a projected capacity recovery timestamp T_cap, defined as the earliest time at which U ≥ k*Nmin(p) When the user selects "Auto-queue until replenishment" Then the system schedules start_at = T_cap (rounded to the next 15-minute window) and marks status Queued And capacity is re-evaluated at least every 15 minutes; if T_cap shifts by ≥5 minutes, the schedule is updated and the change logged And notifications are sent via email and webhook upon queueing and upon launch And a "Cancel queue" action immediately clears the schedule and returns the test to Blocked state
Variant Count Reduction Suggestion
Given a blocked test due to U < k*Nmin(p) When the user selects "Reduce variants" Then the system proposes the minimal k' (1 < k' < k) such that k'*Nmin(p) ≤ U, with estimated power and duration impacts And presents a ranked list of candidate variants based on planned traffic weights and required Nmin(p), defaulting to the top k' by weight And upon confirmation, the system recomputes gating; if pass, Launch is enabled and the change is recorded in the audit log; if still blocked, the checklist updates with the remaining deficit
Switch to Sequential Testing Mode
Given a blocked k-variant test When the user selects "Switch to sequential tests" Then the system proposes a sequence of 2-variant phases covering all variants, each phase requiring Nmin(p) per variant and fitting within available capacity U for phase 1 And the UI displays estimated total duration, per-phase start windows, and inventory consumption per phase And if phase 1 meets capacity (U_phase1 ≥ 2*Nmin(p)), Launch is enabled for phase 1 and subsequent phases are auto-queued with gating checks before each start; otherwise the test remains Blocked And the API exposes the sequence as a parent test with child phases including schedule_start_at and gating status per phase
Capacity Computation Accuracy and Audit Trail
Given the gatekeeper computes v as the median of hourly sales over the last 72 hours (excluding the most recent 15 minutes) and allows override X in [24,168] hours When capacity is computed Then the UI shows v, S, B, inbound, L, p, Nmin(p), W, and U with units and time windows; API GET /tests/{id}/gatekeeper returns the same values within ±0.1 units agreement And any change to inputs (override, stock update, inbound, power level) creates an audit log entry with actor, old/new values, reason, and timestamp, retrievable via UI and API And if API/UI values diverge beyond tolerance, the system raises INV_CALC_MISMATCH and blocks launch until resolved
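The velocity definition above (median of hourly sales over the trailing window, excluding the most recent 15 minutes) might look like this. The event shape `(timestamp, units)` and bucketing approach are assumptions for illustration:

```python
# Assumed computation of sales velocity v: median of hourly sales buckets
# over a trailing window, excluding the most recent 15 minutes (which may
# still be settling).

from datetime import datetime, timedelta
from statistics import median

def sales_velocity(sales_events, now, window_hours=72, settle_minutes=15):
    """sales_events: iterable of (timestamp, units). Returns units/hour."""
    end = now - timedelta(minutes=settle_minutes)
    start = end - timedelta(hours=window_hours)
    buckets = [0] * window_hours
    for ts, units in sales_events:
        if start <= ts < end:
            buckets[int((ts - start).total_seconds() // 3600)] += units
    return median(buckets)
```

The median (rather than mean) keeps a single flash-sale hour from inflating capacity estimates.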
API and UI Scheduling Hooks for Drops and Limited Runs
Given a seller schedules a drop at timestamp T_drop When creating a test via API with schedule_start_at = T_drop and gatekeeper=true Then the system reserves the start, runs gating at T_drop−15m and again at T_drop, and launches only if capacity criteria pass And if blocked, the API returns 409 INV_BLOCK with capacity fields and an auto-queue option; the UI shows "Scheduled (Blocked)" with next evaluation time And the API supports idempotency via Idempotency-Key and emits webhooks test.queued, test.blocked, and test.launched including capacity details
Restock Forecast & Auto-Resume
"As an operations lead, I want tests to auto-resume when restock arrives so that experimentation continues without manual babysitting."
Description

Ingest restock ETAs from connected platforms or merchant input and optionally estimate replenishment using sales velocity and lead times. When items recover above thresholds or ETA is reached, automatically resume paused tests and restore prior allocations. Handle partial replenishments, per-location stock, and backorder toggles with cooldowns and confidence checks to prevent oscillation.

Acceptance Criteria
Auto-Resume on ETA Reached
Given a Style Splitter test for SKU X is paused due to low stock and an ETA exists from a connected platform or merchant input When current_time >= ETA and (on_hand_stock + inbound_at_eta) >= resume_threshold Then the test resumes within 5 minutes And the pre-pause traffic allocation weights are restored exactly And an audit log entry is created with timestamp, source=ETA, and actor=system And the test state in the UI updates to Active within 1 minute of resume
Threshold Resume with Partial Replenishment
Given a paused test for SKU X and partial replenishment events occur When aggregated on_hand_stock across eligible locations >= resume_threshold Then the test resumes within 5 minutes And if aggregated on_hand_stock < resume_threshold, the test remains paused And the restored allocation matches the snapshot taken at pause And an audit log entry records the stock values and decision (resume/hold)
Per-Location Resume and Allocation Restore
Given SKU X has inventory tracked per location and the test is paused globally When location A on_hand_stock >= location_resume_threshold and location B on_hand_stock < location_resume_threshold Then traffic routed to location A resumes with pre-pause allocations for that location And traffic routed to location B remains paused And the system records per-location resume/hold decisions in the audit log And the UI reflects a Partial Active state with location-level indicators within 1 minute
Backorder Toggle with Cooldown
Given a test for SKU X is paused and the merchant enables backorders for SKU X When backorder_enabled = true Then the system starts a cooldown timer of cooldown_minutes (default 15) and does not resume during the cooldown And after cooldown, if projected_lead_time_days <= max_backorder_lead_time and confidence_score >= min_confidence (0.80) Then the test resumes within 5 minutes and allocations are restored Else the test remains paused and a reason code=BackorderNotConfident is logged
Forecasted ETA from Sales Velocity and Lead Time
Given no external ETA is available for SKU X and forecasting is enabled When the system computes forecast_eta using trailing_7d_sales_velocity and supplier_lead_time_days Then forecast_eta is stored with source=forecast and confidence_score between 0 and 1 And if confidence_score >= min_confidence (0.70) and current_time >= forecast_eta and on_hand_stock + inbound_at_eta >= resume_threshold Then the test resumes within 5 minutes and allocations are restored Else the test remains paused and a reason code=ForecastLowConfidence or ForecastNotDue is logged
Oscillation Prevention via Hysteresis and Dwell Times
Given a test is paused due to low stock When on_hand_stock rises above resume_threshold Then resume only if stock remains >= resume_threshold for stable_window_minutes (default 10) and forecast_variance <= variance_limit And after auto-resume, do not auto-pause again for at least min_dwell_minutes (default 30) unless on_hand_stock drops below hard_stop_threshold And all pause/resume actions include timestamps and thresholds used in the audit log
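The hysteresis rules above reduce to two small decision functions. A minimal sketch under the stated defaults; the per-minute sampling of stock history is an assumption:

```python
# Hysteresis sketch: resume only after stock holds above the threshold for
# the full stable window; after resume, suppress auto-pause during the dwell
# period unless the hard-stop threshold is crossed.

def should_resume(stock_history, resume_threshold, stable_window_points):
    """stock_history: newest-last samples, assumed one per minute."""
    recent = stock_history[-stable_window_points:]
    return (len(recent) >= stable_window_points
            and all(s >= resume_threshold for s in recent))

def should_pause(stock, minutes_since_resume, hard_stop_threshold,
                 low_threshold, min_dwell_minutes=30):
    if stock < hard_stop_threshold:
        return True                       # hard stop overrides the dwell
    if minutes_since_resume < min_dwell_minutes:
        return False                      # still inside the dwell window
    return stock < low_threshold
```

The asymmetry (slow to resume, slow to re-pause) is what prevents a SKU hovering near the threshold from flapping the test on and off.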
Queued New Tests Respect Replenishment Gates
Given new tests for SKU X are queued because of low stock and an existing test was paused When auto-resume is triggered for the paused test Then the paused test resumes with prior allocations within 5 minutes And queued new tests remain queued until on_hand_stock >= new_test_threshold and the resumed test has run for stabilization_minutes (default 60) And the schedule reflects updated planned start times with reason codes for any delay
Alerts, Logs, and Manual Overrides
"As a store owner, I want clear alerts and the ability to override automated decisions so that I stay informed and in control during fast-moving drops."
Description

Deliver proactive notifications (email, Slack, in-app) on test pauses, throttles, resumes, and gating decisions. Provide an admin panel to review events with reason codes and apply scoped manual overrides (e.g., force-continue a test) using RBAC. Maintain an immutable audit log with timestamps, rule versions, inventory snapshots, and before/after allocations, with export via CSV and webhooks for BI pipelines.

Acceptance Criteria
Notify Pause/Throttle/Resume Decisions Across Channels
Given a Style Splitter test changes state to paused, throttled, resumed, or gated due to inventory thresholds When the decision engine emits the event Then notifications are sent to all enabled channels (email, Slack, in‑app) within 60 seconds containing: event_id, timestamp (UTC), event_type, test_id, affected SKU(s), reason_code, rule_version, inventory level at decision time, before/after allocation percentages, and a deep link to the event detail And channel delivery status (success/failure) is recorded per recipient
RBAC-Secured Admin Panel for Event Review
Given a user with role having permission inventory-sync.events.view When they open the Events panel Then they can filter by date range (up to 90 days), event_type, decision outcome, SKU, test_id, reason_code, actor (system/user), and rule_version and see paginated results (50 per page) with response time ≤ 2s for last 30 days And clicking a row opens a detail view showing raw payload, inventory snapshot, and before/after allocations And users without permission are denied (HTTP 403) and no event metadata is leaked
Scoped Manual Override of Inventory-Gated Tests
Given a user with permission inventory-sync.override.manage When they create an override for a specific test_id and variant scope Then they must specify action (force-continue | force-pause | throttle cap %), TTL (up to 7 days), and rationale, and the override is applied within 60 seconds to traffic allocations And the override is recorded with actor, scope, TTL, and rationale in the audit log And overrides cannot exceed tenant scope or TTL limits and can be revoked, after which allocations revert within 60 seconds
Immutable Audit Log with Inventory Snapshot and Allocations
Given any decision or override occurs When the system persists the event Then an append-only audit entry is stored with: event_id (UUIDv4), timestamp (ISO 8601 UTC), actor (system service or user_id), action, reason_code, rule_version, inventory_snapshot {SKU, on_hand, reserved, available}, before_allocation {variant:%}, after_allocation {variant:%}, test_id, source_ip (for user actions), and a hash-chain checksum linking to the previous entry And existing entries cannot be modified or deleted via UI/API; corrections are recorded as compensating entries referencing prior event_id And any single entry can be retrieved by event_id
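The hash-chain checksum above can be sketched with SHA-256 over the previous entry's checksum plus the new payload. Field names follow the criteria; the JSON serialization and the `AuditLog` class itself are assumptions:

```python
# Append-only, hash-chained audit log sketch: each entry's checksum covers
# the entry body plus the previous checksum, so any in-place edit breaks
# verification from that point forward.

import hashlib
import json
import uuid

class AuditLog:
    def __init__(self):
        self.entries = []

    def append(self, payload: dict) -> dict:
        prev = self.entries[-1]["checksum"] if self.entries else "GENESIS"
        entry = {"event_id": str(uuid.uuid4()), **payload, "prev": prev}
        body = json.dumps(entry, sort_keys=True).encode()
        entry["checksum"] = hashlib.sha256(body).hexdigest()
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        prev = "GENESIS"
        for e in self.entries:
            if e["prev"] != prev:
                return False
            body = json.dumps({k: v for k, v in e.items() if k != "checksum"},
                              sort_keys=True).encode()
            if hashlib.sha256(body).hexdigest() != e["checksum"]:
                return False
            prev = e["checksum"]
        return True
```

Corrections are appended as compensating entries referencing the prior `event_id`, never edits, so `verify()` stays green across the whole chain.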
CSV Export and Webhook Streaming for BI
Given a user with permission inventory-sync.events.export When they request a CSV export for a date range up to 90 days Then a CSV with the audited fields and header row is generated and available to download within 2 minutes for up to 1,000,000 rows (chunked if larger), with UTC timestamps And tenants can configure one or more webhook endpoints with shared secrets And new events are delivered to configured webhooks within 60 seconds of occurrence with the same payload fields
Webhook Reliability: Signatures, Retries, and Idempotency
Given a webhook delivery When the event is sent Then an HMAC-SHA256 signature using the tenant secret is included in the x-pixellift-signature header, and signatures support key rotation And deliveries use exponential backoff retries for non-2xx responses up to 24 hours (max 15 attempts) and stop on HTTP 410 And each payload includes idempotency_key = event_id; deliveries are at-least-once and consumers can safely dedupe And per test_id, event order is preserved; if reordering occurs, a sequence number allows consumers to restore order
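Signature generation and rotation-friendly verification for the `x-pixellift-signature` header can be sketched as below. The header format and how active secrets are stored are assumptions; only the HMAC-SHA256 mechanism comes from the criteria:

```python
# HMAC-SHA256 webhook signing sketch. Verification accepts any currently
# active secret, which is what lets keys rotate without dropping deliveries.
# Constant-time comparison avoids timing side channels.

import hashlib
import hmac

def sign_payload(secret: bytes, body: bytes) -> str:
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify_signature(active_secrets: list[bytes], body: bytes,
                     signature: str) -> bool:
    return any(hmac.compare_digest(sign_payload(s, body), signature)
               for s in active_secrets)
```

Consumers dedupe on `idempotency_key = event_id` independently of signature checks, since deliveries are at-least-once.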
Notification Preferences, Dedupe, and Rate Limits
Given tenant-level notification settings When events are generated rapidly for the same test_id and reason_code Then identical notifications are deduplicated per channel within a 5-minute window and include an aggregated count And per-recipient rate limits are enforced (e.g., ≤ 2 emails per 10 minutes) while ensuring at least one in-app notification is always shown And users can subscribe/unsubscribe to channels for Inventory Sync alerts subject to RBAC, and preferences are honored in subsequent sends

Metafield Mapper

Map variant flags to your theme, page‑builder blocks, and 3rd‑party apps with zero code. Use presets for Dawn, Refresh, and popular Shopify themes, validate assignments before publish, and preview which images will render live per variant. Cuts setup time from hours to minutes and prevents theme regressions.

Requirements

Theme Preset Auto‑Mapping
"As a boutique owner, I want to apply a theme‑specific mapping preset automatically so that I can configure variant image behavior in minutes without learning theme internals."
Description

Provide a built‑in library of mapping presets for Shopify themes (e.g., Dawn, Refresh, Sense) that auto‑detects the store’s active theme and version, then preconfigures metafield-to-block assignments for common variant flags (color, finish, size, image style). Presets are editable and versioned, with safe defaults and transparent diffs when themes update. The system supports override and fallback rules, merges custom mappings with preset updates, and synchronizes changes without code edits. Outcome: merchants can set up mappings in minutes while maintaining brand consistency and reducing misconfiguration risk.

Acceptance Criteria
Active Theme and Version Auto-Detection
Given a Shopify store with an active supported theme and a stable Shopify API connection When the merchant opens PixelLift > Metafield Mapper > Theme Preset Auto‑Mapping Then the app identifies the active theme name and exact semantic version within 3 seconds And the detected theme and version are displayed to the user And the matching preset is auto‑selected; if multiple minor versions share a preset, the nearest compatible preset is selected And if the theme is unsupported, a Safe Defaults preset is selected and the user is notified
Preset Preconfiguration of Variant Flag Mappings
Given a detected theme with a corresponding preset When the preset is applied Then metafield‑to‑block assignments for color, finish, size, and image_style are created without any code edits And each assignment maps to a valid theme block/section ID per the theme schema And a validation summary shows the count of assignments created and zero invalid references And the preset application completes within 5 seconds
Editable, Versioned Presets with Revert
Given an applied preset When the merchant edits any mapping and saves Then the system creates a new preset version with timestamp, editor identity, and change summary And the last 10 versions are retained with the ability to preview and revert any prior version And unpublished edits affect only preview until the merchant publishes
Transparent Diffs on Theme Update
Given the store updates its theme version or a new preset version becomes available When the merchant opens the Auto‑Mapping screen Then a diff view highlights added, removed, and modified mappings with counts per category And custom overrides are preserved and marked as Persistent Override in the diff And the merchant can Accept, Defer, or Schedule the update before publishing
Override Precedence and Fallback Rules
Given a product variant with both preset mappings and at least one custom override When resolving mappings for that variant Then the evaluation order is override > preset > fallback, applied deterministically And if a required metafield is missing, the Safe Defaults mapping is used And the resolution log records the rule path taken for auditability
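The precedence rule above (override > preset > fallback) is deterministic by construction when expressed as an ordered lookup. A sketch — the dictionary-based tables are an illustrative simplification of the real mapping store:

```python
# Deterministic mapping resolution: check override, then preset, then Safe
# Defaults, and record the rule path taken for the resolution log.

def resolve_mapping(flag, overrides, preset, safe_defaults):
    for source, table in (("override", overrides),
                          ("preset", preset),
                          ("fallback", safe_defaults)):
        if flag in table:
            return {"value": table[flag], "rule_path": source}
    return {"value": None, "rule_path": "unresolved"}
```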
Merge Custom Mappings with Preset Updates
Given custom overrides exist and the merchant accepts an updated preset When the merge executes Then all custom overrides are retained unchanged And new preset mappings are added; deprecated mappings are flagged and require explicit confirmation to remove And any conflicts are listed with default resolution Keep Custom and optional Change to Preset And post‑merge validation passes with zero broken references
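One way to implement the merge semantics above: customs always win (the Keep Custom default), new preset keys are added, and keys that disappeared from the preset are flagged as deprecated rather than silently removed. The flat-dict representation is an assumption:

```python
# Merge sketch per the rules above. Returns the merged mapping plus the
# conflict list (custom differs from updated preset) and the deprecated
# list (preset key removed; removal requires explicit confirmation).

def merge_preset(current, preset_update, custom_keys):
    merged, conflicts, deprecated = {}, [], []
    for key, value in preset_update.items():
        if key in custom_keys:
            merged[key] = current[key]          # Keep Custom (default)
            if current[key] != value:
                conflicts.append(key)
        else:
            merged[key] = value                 # take updated preset value
    for key in current:
        if key not in preset_update:
            if key in custom_keys:
                merged[key] = current[key]      # custom survives removal
            else:
                deprecated.append(key)          # flagged, not auto-removed
    return merged, conflicts, deprecated
```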
No-Code Synchronization to Theme
Given a validated mapping configuration When the merchant clicks Publish Then the app updates theme configuration via Shopify APIs without directly editing Liquid templates And changes are applied to a preview theme immediately and to the live theme only after confirmation And the operation completes within 60 seconds with a success status; on API rate limits, retries use exponential backoff with progress reporting
Zero‑Code Mapping Builder
"As a non‑technical seller, I want to map my variant flags to theme blocks via a visual builder so that I can control which images display per variant without writing code."
Description

Deliver an interactive drag‑and‑drop UI to map data sources (variant metafields, product metafields, tags, options) to targets (theme sections/blocks, page‑builder components, and supported app endpoints) with conditional rules (e.g., if variant.color = "Red" then use preset "Crimson Studio"). Supports priority ordering, test data selection, inline validation, and instant preview handoff. Includes a target catalog with searchable connectors and schema hints, enabling non‑technical users to create robust mappings without editing theme code.

Acceptance Criteria
Create Mapping via Drag-and-Drop from Source to Target
Given the Mapping Builder is open with the Target Catalog visible When the user drags a selectable data source onto a compatible target Then a new mapping row is created showing Source, Target, Connector, Data Type, and Status = "Unsaved" Given the target provides schema hints When the source type matches the target type Then the row displays a green "Type Match" indicator; when it does not match, a red "Type Mismatch" error appears with the expected vs. actual type Given the user clicks Save When the save completes Then the mapping persists on reload and Status = "Saved" Given a keyboard-only user When using Enter/Space to pick items and Arrow keys to move between lists Then the same mapping can be created without a mouse
Define Conditional Rule for Variant Color Equals 'Red'
Given a mapping row is selected When the user adds the rule IF variant.option("Color") = "Red" THEN apply preset "Crimson Studio" Then the rule is displayed with operands, operator, and action chips Given a test variant with Color = Red is selected When the preview evaluates the mapping Then the preset "Crimson Studio" is applied and shown in the Preview panel Given a test variant with Color != Red is selected When the preview evaluates the mapping Then the mapping follows the ELSE branch or defined fallback preset Given the user saves When the rules engine serializes the configuration Then the JSON includes key path "variant.option.Color", operator "equals", value "Red", and action "applyPreset:Crimson Studio"
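Evaluating the serialized rule shown above might look like this. The JSON shape mirrors the criteria ("variant.option.Color", "equals", "applyPreset:Crimson Studio"); the evaluator itself and the variant structure are hypothetical:

```python
# Hypothetical evaluation of a serialized conditional rule: resolve the key
# path against the variant's options, apply the operator, and return either
# the rule's action or the ELSE/fallback action.

OPERATORS = {"equals": lambda a, b: a == b}

def evaluate_rule(rule, variant, fallback_action=None):
    """rule example: {"key": "variant.option.Color", "operator": "equals",
                      "value": "Red", "action": "applyPreset:Crimson Studio"}"""
    option = rule["key"].rsplit(".", 1)[-1]          # "variant.option.Color" -> "Color"
    actual = variant.get("options", {}).get(option)
    if OPERATORS[rule["operator"]](actual, rule["value"]):
        return rule["action"]
    return fallback_action
```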
Set and Apply Mapping Priority Order
Given two or more mappings target the same field When the user reorders them via the drag handle Then their priority numbers update to reflect the visual order Given test evaluation is run When multiple mappings are eligible Then the highest priority (lowest number) mapping wins and lower ones are skipped, with a log entry indicating the winning rule Given equal priorities are attempted When the user tries to save Then the system prevents save and shows a "Duplicate priority" validation until resolved Given the builder is reloaded When the configuration loads Then the previously saved order is preserved
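Winner selection under the priority rules above is a simple minimum with a duplicate guard. A sketch; the mapping record shape is an assumption:

```python
# Priority resolution sketch: lowest priority number wins, the rest are
# skipped (and would be logged); duplicate priorities are a validation
# error that blocks save in the builder.

def select_winner(mappings):
    """mappings: list of dicts with 'id' and 'priority' (lower wins)."""
    priorities = [m["priority"] for m in mappings]
    if len(priorities) != len(set(priorities)):
        raise ValueError("Duplicate priority")
    winner = min(mappings, key=lambda m: m["priority"])
    skipped = [m["id"] for m in mappings if m is not winner]
    return winner["id"], skipped
```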
Select Test Product/Variant and Preview Live Renders
Given the Test Data selector is visible When the user searches and selects product "SKU-123" and variant "Red / Small" Then the Preview updates within 2 seconds to show the exact images and styles that will render for that variant Given mappings reference metafields or tags on the test entity When required source data is missing Then the Preview shows a placeholder and a non-blocking warning "Missing source data: {field}" Given the user changes the selection When a new variant is chosen Then the preview and evaluation log update accordingly without requiring a page refresh
Inline Validation of Schema and Required Fields
Given a target with required fields exists When any required field lacks a valid mapping Then a validation badge appears on the mapping group with a count of blocking errors Given a source–target type mismatch is present When the user hovers the error icon Then the tooltip shows "Expected: {type} • Actual: {type}" Given blocking errors exist When the user clicks Publish Then the Publish action is disabled and a modal lists the errors with deep-links to each mapping Given all blocking errors are resolved When Validate is run Then the result shows "0 errors" and "Ready to Publish"
Use Theme Preset (Dawn/Refresh) with Schema Hints
Given the user opens Presets When "Dawn – Product Gallery" is applied Then suggested mappings are auto-created for required targets with schema hints displayed on each target Given the shop theme version is detected When the preset version does not match the installed theme version Then a non-blocking warning "Preset built for Dawn v{X}; current v{Y}" is shown Given the user customizes any auto-created mapping When saved Then customized mappings remain, and non-customized suggestions can be reverted or removed individually
Map to 3rd‑Party Page‑Builder Component Endpoint
Given the Target Catalog is searchable When the user searches "Shogun Image" Then "Page‑Builder: Shogun – Image Block" appears with connector details Given the connector supports test mode When the user maps a source to it and clicks Test Then a mock payload is sent and a 2xx response is received within 3 seconds, showing "Connection OK" Given the mapping is saved When preview handoff occurs Then the mapped payload is included in the preview bundle and the component renders the expected image for the selected test variant
Live Variant Image Preview
"As a merchandiser, I want to preview which images will render for each variant before publishing so that I can catch gaps and ensure a consistent customer experience."
Description

Provide a safe, sandboxed storefront preview that renders the active theme with proposed mappings to show exactly which images will display for each variant and state (selected, hover, gallery position) across desktop and mobile breakpoints. Supports variant toggling, before/after comparison, highlight of unmapped variants, and deep links for team review. No live changes occur until publish, reducing guesswork and preventing regressions.

Acceptance Criteria
Desktop Preview: Variant Image Rendering
Given a product with ≥3 variants and a saved mapping draft When the user opens Live Variant Image Preview in desktop mode (viewport ≥1280px) Then the sandbox renders the active theme and for each selected variant the primary image src matches the mapped image for that variant And if a hover image is mapped it appears on pointer hover within 300ms; if not mapped the primary image persists And gallery thumbnails show only mapped images in the configured order; clicking a thumbnail updates the main image accordingly And if a variant has no mapped images the product default image is shown as fallback
Mobile Preview: Breakpoint Accuracy
Given a product with a mapping draft When the user switches the preview to mobile mode (viewport ≤414px) Then for each variant the same mapped images are shown as in desktop mode for the same mapping And the gallery uses the mobile template (swipeable carousel) and swiping updates the main image to the correctly mapped image And images maintain aspect ratio with no horizontal scroll introduced by the preview UI And on a throttled Fast 3G profile the main image displays (above-the-fold) within 2.5s
Variant Toggle and State Cycling
Given the Live Variant Image Preview is open When the user changes variants via swatch or dropdown and cycles image states (selected, hover, gallery thumbnail click) Then the main image updates to the correct mapped image within 300ms without full page reload And the selected variant is visibly indicated and exposes aria-selected=true on the active control And hover state only activates when a hover image is mapped; otherwise no change occurs And keyboard navigation (Tab/Arrow) can change variants and thumbnails producing the same mapped results
Before/After Comparison View
Given a product with an existing live mapping and a new draft mapping When the user enables comparison mode Then the "Before" view renders images exactly as the current live storefront does for each variant and state And the "After" view renders images from the draft mapping for each variant and state And a toggle/slider switches between views without altering the underlying mapping And a diff indicator shows the count of variants whose primary image will change (>=0)
Unmapped Variant Highlighting
Given at least one variant lacks an image mapping in the draft When the user opens the preview Then unmapped variants are visually flagged in the variant selector with a warning icon and tooltip "No image mapping" And selecting an unmapped variant shows the defined fallback image plus an inline notice with a "Map now" link to Metafield Mapper And an aggregate badge displays the total count of unmapped variants for this product
Deep Link for Team Review
Given a draft mapping is saved but not published When the user generates a preview share link Then a URL is produced that preserves product, initial variant, device breakpoint, and comparison mode And opening the link in another browser session shows the identical preview state without requiring edit permissions (view-only) And the link expires after 7 days or upon publish (whichever occurs first) returning an informative message after expiry And opening the link does not alter live storefront data or theme assets
Preview Safety: No Live Changes
Given any Live Variant Image Preview session When the preview is rendered and interacted with Then no writes occur to live theme files, assets, or metafields and no publish events are triggered And outbound webhooks/integrations that would normally fire on image/metafield changes are suppressed And the preview URL is not discoverable to storefront customers without the explicit share link And audit logs show zero live changes during the preview session
Pre‑Publish Validation & Conflict Detection
"As a store admin, I want automated validation of my mappings before publishing so that I can prevent broken images and theme conflicts on the live site."
Description

Implement a rules engine that validates all mappings prior to publish: existence and type checks for metafields, detection of missing assets, conflicting rules, unsupported theme/app versions, circular conditions, and API permission gaps. Classify issues by severity, provide auto‑fix suggestions, and block publishing on critical errors. Generate a downloadable validation report for audit and collaboration.

Acceptance Criteria
Metafield Existence and Type Validation
Given a mapping configuration with one or more metafield references When pre-publish validation is initiated Then the system verifies existence of each referenced metafield (namespace/key) in the shop And verifies the metafield type matches the expected type for each mapping And records one error per missing metafield with code MF_MISSING and severity Critical And records one error per type mismatch with code MF_TYPE_MISMATCH and severity Critical And associates each error with mappingId, productId/variantId (if applicable), and metafield path
Missing Asset Detection for Mapped Media
Given mappings that reference image/file assets via metafields or URLs When pre-publish validation is initiated Then the system resolves each reference to a concrete, accessible asset And flags any unresolved references or 404/403 responses as Critical with code ASSET_MISSING And associates each issue with mappingId and the affected productId/variantId And lists the exact missing asset reference (metafield path or URL) in the message
Rule Conflict and Circular Condition Detection
Given multiple rules that may target the same output slot (e.g., image per variant) When pre-publish validation is initiated Then the system detects conflicting rules that produce different outputs for the same target and context And records each as Critical with code RULE_CONFLICT including the involved ruleIds and target slot And the system detects circular dependencies between rules/conditions And records each cycle as Critical with code RULE_CYCLE including the dependency path And associates each issue with the affected mappingId(s) and variant scope
Platform Compatibility and Permissions Validation
Given selected theme/app targets with declared versions and the app's granted API scopes When pre-publish validation is initiated Then the system checks target theme/app versions against a supported matrix And flags unsupported versions as Critical with code VERSION_UNSUPPORTED including detected and minimum supported versions And verifies required API scopes for planned write/read operations And flags missing scopes as Critical with code PERMISSION_MISSING including the list of required scopes And associates each issue with the relevant target (theme/app) and mappingId(s)
Severity Classification, Auto‑Fix Suggestions, and Publish Blocking
Given validation issues have been identified When results are classified Then each issue is assigned a severity of Critical, Warning, or Info based on predefined rules And issues of types MF_MISSING, MF_TYPE_MISMATCH, ASSET_MISSING, RULE_CONFLICT, RULE_CYCLE, VERSION_UNSUPPORTED, and PERMISSION_MISSING are classified as Critical And the system generates auto-fix actions where possible (e.g., create metafield definition, reorder/remove conflicting rule) with unique actionIds And the Publish action is disabled while any Critical issues remain, showing a badge with the Critical count And Publish becomes enabled immediately after all Critical issues are resolved without requiring a page reload
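The severity classification and publish gate above can be sketched as follows. This is a minimal illustration, not the shipped rules engine; names like `ValidationIssue` and `canPublish` are assumptions for the example.

```typescript
// Hypothetical issue model for the pre-publish rules engine (illustrative).
type Severity = "Critical" | "Warning" | "Info";

interface ValidationIssue {
  code: string; // e.g. MF_MISSING, RULE_CONFLICT
  severity: Severity;
  message: string;
  mappingId?: string;
  productId?: string;
  variantId?: string;
}

// Codes the criteria require to always classify as Critical.
const CRITICAL_CODES = new Set([
  "MF_MISSING", "MF_TYPE_MISMATCH", "ASSET_MISSING",
  "RULE_CONFLICT", "RULE_CYCLE", "VERSION_UNSUPPORTED", "PERMISSION_MISSING",
]);

function classify(code: string, fallback: Severity = "Warning"): Severity {
  return CRITICAL_CODES.has(code) ? "Critical" : fallback;
}

// Publish stays disabled while any Critical issue remains unresolved.
function canPublish(issues: ValidationIssue[]): boolean {
  return !issues.some((i) => i.severity === "Critical");
}
```

Re-running `canPublish` after each resolved issue is what lets the Publish button enable without a page reload.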
Validation Report Generation and Download
Given a completed validation run with zero or more issues When the user selects Download Validation Report Then the system generates a report file in JSON format containing runId, timestamp, environment, summary counts by severity, and an array of issues (code, severity, message, mappingId, ruleId, productId, variantId, target, path) And the file name follows the pattern metafield-mapper-validation-{runId}-{YYYYMMDD-HHmm}.json And the download starts within 2 seconds of the request And the report data matches the issues currently displayed in the UI
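The report file-name pattern above is mechanical enough to pin down in a short helper; this sketch assumes UTC timestamps, which the criteria do not specify:

```typescript
// Builds metafield-mapper-validation-{runId}-{YYYYMMDD-HHmm}.json
// (timezone choice — UTC here — is an assumption, not in the spec).
function reportFileName(runId: string, at: Date): string {
  const p = (n: number, w = 2) => String(n).padStart(w, "0");
  const stamp =
    `${at.getUTCFullYear()}${p(at.getUTCMonth() + 1)}${p(at.getUTCDate())}` +
    `-${p(at.getUTCHours())}${p(at.getUTCMinutes())}`;
  return `metafield-mapper-validation-${runId}-${stamp}.json`;
}
```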
Pre‑Publish Validation Trigger and UI Feedback
Given a user attempts to publish mappings When the user clicks Publish Then pre-publish validation executes before any write operations occur And the validation drawer/modal displays grouped issues by severity with counts and filter controls And if any Critical issues exist, publish is aborted and a toast/banner references the number of blockers And if no Critical issues exist, publishing proceeds within the same session And the per-variant preview reflects the post-validation effective image selections
Safe Publish, Versioning & Rollback
"As an operations manager, I want versioned publishes with instant rollback so that I can deploy mapping changes confidently without risking storefront downtime."
Description

Offer a controlled deployment workflow with environments (Draft, Preview, Live), atomic publishes, automatic backups of prior mappings, and one‑click rollback. Include change logs with who/what/when, diff views between versions, and the ability to schedule publishes during low‑traffic windows. This safeguards the storefront and accelerates recovery if unexpected behavior occurs.

Acceptance Criteria
Atomic Publish from Preview to Live
Given a validated Preview version V containing mapping assignments and theme/page‑builder targets And an existing Live version L When the user clicks Publish to Live for version V Then the system applies all changes atomically so either 100% of changes are visible on Live or 0% And the previous Live version L is snapshotted as a backup before any change And Live reflects version V’s ID and timestamp immediately upon success And no partial or mixed state is observable via API or storefront during the operation And the publish action is idempotent if retried within 5 minutes using the same request ID
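The idempotent-retry requirement can be illustrated with a request-ID guard. This is a sketch under stated assumptions (`PublishGate` is a hypothetical name; a real implementation would also snapshot the prior Live version and apply changes transactionally, as the criteria require):

```typescript
// Illustrative idempotency guard: a retry within 5 minutes that reuses the
// same request ID returns the original result instead of republishing.
interface PublishResult { liveVersionId: string; publishedAt: number }

class PublishGate {
  private seen = new Map<string, { at: number; result: PublishResult }>();
  constructor(private windowMs = 5 * 60 * 1000) {}

  publish(requestId: string, versionId: string, now: number): PublishResult {
    const prior = this.seen.get(requestId);
    if (prior && now - prior.at < this.windowMs) return prior.result; // replay
    // Real code would back up Live, then apply all changes atomically here.
    const result = { liveVersionId: versionId, publishedAt: now };
    this.seen.set(requestId, { at: now, result });
    return result;
  }
}
```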
Automatic Backup on Publish
Given a Live environment with current version L When any publish to Live is initiated Then the system creates a read‑only backup of L including mappings, presets, schedules, and metadata And the backup is tagged with user, timestamp, environment, and reason And backups are retained for at least 30 days or the last 50 versions, whichever is greater And the backup is selectable as a rollback target and passes integrity verification
One‑Click Rollback to Prior Version
Given a list of previous versions with metadata and integrity status When the user selects target version T and clicks Rollback Then the system reverts Live to version T atomically And records a new version R that documents the rollback action and references T And writes a change‑log entry with who/what/when and affected entities count And completes within 60 seconds for catalogs up to 5,000 mappings And notifies connected integrations of the version change
Audit Change Log (Who/What/When)
Given the system supports publishes, rollbacks, and schedule create/update/cancel events When any such event occurs Then a change‑log entry is written capturing user identity, timestamp (UTC), environment, version IDs, action type, and a diff summary (added/removed/modified counts) And entries are immutable, searchable by user/date/action, and exportable as CSV/JSON And logs are retained for at least 180 days And viewing a log entry displays the exact payload and request ID used
Version Diff View Between Mappings
Given two versions A and B are selected When the user opens the Diff view Then the system displays added, removed, and modified mappings with counts And highlights changes per product/variant and target (theme block, page‑builder block, 3rd‑party app) And shows thumbnail previews for any image or style‑preset changes And provides a raw JSON diff download And the diff renders within 3 seconds for up to 2,000 changes
Scheduled Publish in Low‑Traffic Window
Given a validated Preview version V and the store’s timezone set When the user schedules V to publish at a future datetime Then the system validates V at schedule time and re‑validates within 2 minutes before execution And if validation fails, the publish is skipped and the user is notified with reasons And the job executes within ±1 minute of the scheduled time and is atomic And only one Live publish executes at a time; conflicting schedules are queued or rejected with feedback And the schedule may be edited or canceled until 5 minutes before execution
Publish Failure Handling and State Integrity
Given a publish is in progress and an error, timeout, or external dependency failure occurs When the operation cannot complete successfully Then the system aborts and ensures Live remains on the prior version with no partial changes And writes a failure entry to the change log including error details and request ID And emits a user notification with retry options And all side effects on external systems are compensated or rolled back And subsequent retries are safe and idempotent
Connector SDK for Themes & Apps
"As a developer partner, I want a stable connector SDK with examples so that I can integrate my app with Metafield Mapper and guarantee compatibility over time."
Description

Provide an extensible SDK to build and maintain connectors for themes, page‑builders, and third‑party apps. Includes schema introspection, capability declaration (supported targets, field types), versioned contracts, test harness, and automated compatibility checks. Ship first‑party connectors for Dawn, Refresh, Shogun, PageFly, and GemPages, with a review process for community contributions. Enables rapid integration growth while keeping mappings reliable across updates.

Acceptance Criteria
Schema Introspection for Theme/Page‑Builder Targets
Given a valid theme package (e.g., Shopify Dawn v12) or page‑builder config and required shop permissions When the SDK introspection API is invoked with default options Then a JSON schema is returned enumerating all mappable targets (sections, blocks, settings, metafields) with fields: id, label, path, dataType, allowedValues, multiplicity, isRequired, defaultValue, version, targetCategory And each field includes capability flags {supportsImages, supportsVariants, supportsDynamicSource} set to true/false And the response validates against /schemas/introspection.schema.json (JSON Schema 2020‑12) And the operation completes in ≤ 3 seconds for inputs ≤ 5 MB on a standard build agent And unsupported inputs return HTTP 422 with code INTROSPECT_UNSUPPORTED and a remediation URL
Connector Capability Manifest Validation
Given a connector manifest (manifest.json) declaring supportedTargets, fieldTypes, operations, and contract range When pl-connector validate is executed Then validation succeeds with exit code 0 if the manifest conforms to /schemas/connector-manifest.schema.json And referencing undeclared capabilities or unsupported field types yields exit code 1 with error codes CAPABILITY_UNDECLARED or FIELD_TYPE_UNSUPPORTED and file/line references And at runtime, attempts to map to unsupported field types are rejected with error MAP_UNSUPPORTED and an actionable message
Contract Version Negotiation and Compatibility Gates
Given the platform contract version is 1.3.0 and a connector declares engines {min: 1.2.0, max: <2.0.0} When the connector is loaded Then negotiation succeeds and the connector activates And if the connector requires >= 2.0.0, loading is blocked with code CONTRACT_INCOMPATIBLE and a migration guide link And use of deprecated API symbols emits warnings DEPRECATION_USED with locations, and CI fails with --treat-warnings-as-errors when warnings > 0
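The version negotiation described above reduces to a semver range check. A minimal sketch (real SDKs would likely use a full semver library rather than this hand-rolled comparison):

```typescript
// Simplified contract negotiation: a connector declares an engine range and
// loading succeeds only when the platform contract falls inside it.
type Version = [number, number, number];

const parse = (v: string): Version => v.split(".").map(Number) as Version;

const cmp = (a: Version, b: Version): number =>
  a[0] - b[0] || a[1] - b[1] || a[2] - b[2];

function negotiate(
  platform: string,
  min: string,
  maxExclusive: string,
): { ok: boolean; code?: "CONTRACT_INCOMPATIBLE" } {
  const p = parse(platform);
  if (cmp(p, parse(min)) < 0 || cmp(p, parse(maxExclusive)) >= 0) {
    return { ok: false, code: "CONTRACT_INCOMPATIBLE" };
  }
  return { ok: true };
}
```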
Local Test Harness and CI Execution
Given a connector project with tests under /tests using the SDK test harness When pl-connector test --ci is run Then tests execute in an isolated sandbox with mocked fixtures for Dawn, Refresh, Shogun, PageFly, and GemPages And golden snapshot comparisons for mapping outputs pass byte-for-byte And the suite enforces ≥ 80% statement coverage and completes in ≤ 5 minutes on a standard CI runner And results are emitted as JUnit XML and JSON to /artifacts, with exit code 0 on success and non-zero on failure
Automated Compatibility Checks on Publish
Given a new PixelLift platform release, theme update, or connector update is queued for publish When the compatibility matrix job runs Then all first-party and top 20 community connectors are evaluated against current and previous two platform contracts and latest two theme versions And a report with pass/fail, severity (Non-breaking, Breaking), and impacted targets is generated to /compatibility/report.json and surfaced in the admin UI And publish is blocked if any Breaking failures exist, with rollback available and a diff of impacted variant image renderings for Metafield Mapper And the matrix job finishes in ≤ 10 minutes at the 95th percentile
First‑Party Connectors Shipping and Function Coverage
Given first-party connectors for Dawn, Refresh, Shogun, PageFly, and GemPages are installed When Metafield Mapper maps variant flags to image targets via each connector Then preview resolves which images render per variant and matches live rendering for a 50-SKU seeded test catalog And each connector passes the common conformance suite with ≥ 95% tests passing and declares support for image targets and variant overrides And official docs include installable presets and a quick-start verified by pl-connector doctor with all checks green
Community Connector Submission and Review
Given a developer submits a connector via pl-connector submit or the partner portal When automated checks execute Then the submission passes manifest validation, contract compatibility, security scan (no critical CVEs), license compliance, size ≤ 10 MB, and code signature verification And failures produce a consolidated report with codes [MANIFEST_INVALID, CONTRACT_INCOMPATIBLE, SECURITY_FAIL, LICENSE_NONCOMPLIANT, SIZE_LIMIT_EXCEEDED, SIGNATURE_MISSING] And passing submissions enter manual review with SLA ≤ 5 business days, after which approved packages are PixelLift-signed and listed with a verified badge And installation verifies signature and checksum before activation

Live Cost Meter

See real‑time, per‑batch and month‑to‑date costs as you queue uploads. View per‑image rates by preset, applied discounts, taxes, and remaining cap in one place. Color‑coded warnings and “process within budget” checks prevent surprise bills and help you pick the most cost‑effective settings before you hit run.

Requirements

Real-Time Cost Aggregation Engine
"As a seller preparing a large upload, I want costs to update instantly as I change settings so that I can see the financial impact before I run the batch."
Description

Implement an event-driven service that calculates and streams up-to-the-moment per-batch and month-to-date (MTD) costs as users add, remove, or modify items in the upload queue. The engine merges inputs from the pricing catalog, selected presets, image counts, discounts, and taxes to produce a single authoritative cost model. It must support batching hundreds of photos with sub-second recalculation (<200 ms per queue mutation), be idempotent across retries, and handle partial failures gracefully. Expose a typed API for the web client to subscribe to updates and render granular line items (per-image rate, discounts, tax, totals), ensuring consistency with downstream billing and invoices.

Acceptance Criteria
Sub-second cost recalculation on queue mutations
Given an authenticated user subscribed to the cost stream for batchId B with a queue containing 1–1000 images And pricing catalog, selected presets, discounts, and tax rules are available When the user adds, removes, or modifies any queue item or preset for B Then the engine recomputes and publishes an authoritative cost update with an incremented version within 200 ms p95 and 400 ms p99 from mutation receipt And the update contains per-image rate, discounts[], taxes[], lineItems[], batchTotals{subtotalCents, discountCents, taxCents, totalCents}, and mtdTotals{postedCents, projectedCents, totalCents} And totals are deterministic for identical input state across repeated recomputations
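Working entirely in integer cents is what makes the totals deterministic across recomputations. A sketch of the batch-totals shape (field names follow the criteria; the single flat discount and tax rate are simplifying assumptions):

```typescript
// Deterministic batch totals in integer cents (illustrative).
interface QueueItem { images: number; rateCents: number }

function batchTotals(
  items: QueueItem[],
  discountCents: number,
  taxRate: number, // e.g. 0.08875 for NY in the examples below
) {
  const subtotalCents = items.reduce((s, i) => s + i.images * i.rateCents, 0);
  const taxedBase = subtotalCents - discountCents;
  const taxCents = Math.round(taxedBase * taxRate);
  return { subtotalCents, discountCents, taxCents, totalCents: taxedBase + taxCents };
}
```

With 100 images at 50¢, a 10% ($5.00) discount, and 8.875% tax, this reproduces the $45.00 / $3.99 / $48.99 figures used in the tax criteria later in this document.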
Idempotent processing across retries and duplicates
Given a mutation M with idempotencyKey K may be delivered zero, one, or multiple times within a 5-minute window When the engine processes deliveries of M in any order Then exactly one state transition is applied, producing at most one new version for batchId B And batchTotals and mtdTotals remain unchanged by duplicate deliveries And the engine emits one applied=true audit event and applied=false for each duplicate, with no double counting
Typed subscription API and schema compatibility
Given a client subscribes to /v1/costs/stream?batchId=B via WebSocket or SSE with a valid auth token When the subscription is established Then the server sends an initial snapshot within 300 ms containing fields: version, serverTimestamp, batchTotals{subtotalCents, discountCents, taxCents, totalCents}, mtdTotals{postedCents, projectedCents, totalCents}, lineItems[], discounts[], taxes[], budgetCheck{remainingCapCents, willExceed}, flags{isEstimate} And subsequent updates preserve per-batch message ordering and monotonically increase version And payloads conform to OpenAPI schema costs.v1 (content-type application/json); backward compatibility tests pass for additive changes; contract tests validate required fields
Consistency with downstream billing and invoices
Given a batch is processed to completion and the billing service posts final charges and generates an invoice When comparing the engine’s final authoritative totals for that batch to billed totals Then subtotal, discount, tax, and total match exactly to the smallest currency unit for 100% of at least 500 sampled combinations of presets/discounts/taxes And MTD posted totals equal the sum of billed totals for the account within the same timezone window And any mismatch triggers a failed check and alert; deployment gate blocks release until resolved
Graceful partial failure and degraded upstreams
Given the pricing catalog and/or tax service is degraded or returns errors for a subset of items When the user mutates the queue during the outage Then the engine continues streaming with flags.isEstimate=true and warnings[] containing codes (e.g., PRICING_STALE, TAX_UNAVAILABLE) And partial totals include only resolved items; unresolved items are excluded from totals and counted in unresolvedCount And upon recovery the engine recomputes and publishes corrected totals within 500 ms p95 and clears related warnings And the stream remains active with error rate <0.1% and no crashes or stalls
Correct month-to-date (MTD) computation and cap signaling
Given an account timezone T and monthly spending cap C When computing MTD totals Then MTD includes posted charges from the first day of the current month at 00:00 T through now, plus projected cost for the current batch B marked flags.projected=true And discount stacking order is per-image -> bulk -> coupon -> tax; rounding matches the billing canonical rounding function to the smallest currency unit And budgetCheck{remainingCapCents, willExceed} equals C - (postedCents + projectedCents), with willExceed true if remainingCapCents < 0 And tests cover ≥3 tax jurisdictions and ≥4 discount combinations with zero mismatches against billing
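The cap signal above is a one-line computation worth pinning down; this sketch mirrors the `budgetCheck{remainingCapCents, willExceed}` fields named in the criteria:

```typescript
// budgetCheck per the criteria: remainingCapCents = C - (posted + projected),
// with willExceed flagged when the remainder goes negative.
function budgetCheck(capCents: number, postedCents: number, projectedCents: number) {
  const remainingCapCents = capCents - (postedCents + projectedCents);
  return { remainingCapCents, willExceed: remainingCapCents < 0 };
}
```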
Concurrent edits and multi-session ordering
Given two or more client sessions concurrently mutate the same batch B within a 1-second window When the engine receives those mutations in any order Then it serializes them using server timestamps and publishes updates with strictly increasing version and last-write-wins semantics And clients reconcile out-of-order messages using version; a snapshot replay is available on request to resync And all sessions converge to identical totals within 500 ms p95 from the last mutation, with no torn reads observed
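On the client side, the version-based reconciliation above amounts to keeping the highest version seen and dropping stale messages. A minimal sketch (state shape is illustrative):

```typescript
// Clients keep the highest-version update and ignore out-of-order messages.
function reconcile(
  current: { version: number } | null,
  incoming: { version: number },
): { version: number } {
  return !current || incoming.version > current.version ? incoming : current;
}
```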
Preset Rate, Discount, and Promotion Resolver
"As a cost-conscious user, I want to see exactly how preset choices and discounts affect my per-image rate so that I can pick the most affordable configuration."
Description

Create a resolver that determines the effective per-image rate based on the selected style preset, plan tier, and current promotions, and then applies stackable discount rules (e.g., volume breaks, coupon codes, partner discounts) in a deterministic order. The resolver should return transparent line items showing base rate, each discount applied, and the final effective rate per preset. It must reference a versioned pricing catalog, support future-dated pricing, and cache results for fast UI updates while remaining consistent with server-side verification.

Acceptance Criteria
Resolve Base Rate by Preset, Plan Tier, and Catalog Version
Given a selected style preset, a user’s plan tier, and a pricing catalog version identifier And the catalog contains a base per-image rate for that preset and plan tier effective on the pricing date When the resolver is executed for a specified pricing date (default: current UTC date) Then it returns a baseRate that matches the preset, plan tier, and catalogVersion And the baseRate currency matches the catalog currency And the result includes catalogVersion and the effectiveDate range used
Deterministic Discount and Promotion Stacking Order
Given eligible discounts exist of types: volume, promotion(s), partner, and coupon And each discount has defined eligibility rules, caps, and a minPriceFloor from the catalog When the resolver computes the effective per-image rate Then it applies discounts in this exact order: volume -> promotions (ascending priority, then startAt) -> partner -> coupon And the line items include sequence numbers reflecting this order And the finalRate equals sequential application of each discount to the running rate And the finalRate never drops below the catalog minPriceFloor for the preset/plan And repeated runs with identical inputs produce identical order and finalRate
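The deterministic stacking order can be sketched as a stable sort followed by sequential application. This example assumes percentage discounts only (the catalog may also define fixed-amount rules) and keeps the floor clamp from the criteria:

```typescript
// Illustrative stacking: volume -> promotion -> partner -> coupon,
// with the result clamped to the catalog minPriceFloor.
type DiscountType = "volume" | "promotion" | "partner" | "coupon";
interface Discount { type: DiscountType; percent: number; priority?: number }

const ORDER: DiscountType[] = ["volume", "promotion", "partner", "coupon"];

function effectiveRate(base: number, discounts: Discount[], floor: number) {
  const ordered = [...discounts].sort(
    (a, b) =>
      ORDER.indexOf(a.type) - ORDER.indexOf(b.type) ||
      (a.priority ?? 0) - (b.priority ?? 0),
  );
  let rate = base;
  const lineItems = ordered.map((d, i) => {
    const before = rate;
    rate = rate * (1 - d.percent / 100);
    return { sequence: i + 1, type: d.type, rateBefore: before, rateAfter: rate };
  });
  return { finalRate: Math.max(rate, floor), lineItems };
}
```

Because the sort key is fixed, repeated runs with identical inputs always produce the same line-item order and final rate, which is the determinism property the criteria test for.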
Transparent Line Items Output for Effective Per-Image Rate
Given the resolver successfully computes pricing When the resolver returns results Then the output includes lineItems with fields: id, label, type, basis (percent|fixed), value, amountApplied, rateBefore, rateAfter, sequence, ruleRef, catalogVersion And includes a base line item (type=base) and zero or more discount items, plus a summary with finalRate And for each line item, rateAfter equals rateBefore minus amountApplied (or the correctly computed percentage reduction) And the sum of all discount amountApplied values equals baseRate minus finalRate within 0.001 currency units
Respect Future-Dated Pricing Based on Processing Date
Given a processingDate parameter is provided And the catalog and promotions include effective start and end timestamps When the resolver runs with a future processingDate Then it selects base rates and promotions whose effective windows include the processingDate And it does not apply promotions not yet active or already expired on the processingDate And when processingDate is omitted, current UTC date/time is used
Low-Latency Caching and Invalidation for UI Updates
Given a prior computation exists for the key (preset, planTier, processingDate, catalogVersion, discountInputs) When the resolver is invoked again with the same key within TTL Then it serves the result from cache with median latency <= 30ms and p95 latency <= 100ms And when any of preset, planTier, processingDate, couponCodes, partnerStatus, promotion roster, or catalogVersion changes Then the corresponding cache entry is invalidated and a fresh computation is performed And the cache TTL is 5 minutes and refreshes on access (sliding TTL) And the result includes a cacheHit flag
Consistency with Server-Side Verification and Reconciliation
Given a server verification endpoint returns canonical line items and finalRate for the same inputs When the client resolver compares its result to the server response Then if catalogVersion differs or |finalRate_client - finalRate_server| > 0.005 Then the client replaces the local result with the server result, sets verificationStatus="reconciled", and updates the cache And if the difference <= 0.005 and catalogVersion matches, it sets verificationStatus="verified" And an audit event with mismatch details is emitted upon reconciliation
Rounding and Precision Rules for Effective Rate
Given discounts may yield fractional currency values When the resolver calculates intermediate rateAfter values Then it maintains at least 4 decimal places of precision internally And rounds finalRate to 2 decimal places using half-up rounding for display And ensures finalRate >= the catalog minPriceFloor after rounding And returns both finalRate (2dp) and finalRatePrecise (>=4dp)
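The precision rules above can be made concrete with a small helper. A sketch assuming positive rates (half-up and half-away-from-zero coincide there); `presentRate` is an illustrative name:

```typescript
// Half-up rounding plus the finalRate / finalRatePrecise pair from the criteria.
function roundHalfUp(value: number, dp: number): number {
  const f = 10 ** dp; // assumes value >= 0, as rates always are here
  return Math.floor(value * f + 0.5) / f;
}

function presentRate(precise: number, minPriceFloor: number) {
  const finalRate = Math.max(roundHalfUp(precise, 2), minPriceFloor);
  return { finalRate, finalRatePrecise: roundHalfUp(precise, 4) };
}
```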
Tax Calculation & Localization
"As an international customer, I want taxes and totals shown accurately in my region and currency so that I avoid surprises on my bill."
Description

Integrate jurisdiction-aware tax calculation that determines applicable VAT/GST/sales tax based on the customer’s billing profile and ship-to region, supporting inclusive/exclusive tax displays as required. Provide localized currency formatting and rounding rules, with real-time tax estimates shown alongside subtotals and totals. Implement a provider abstraction (e.g., Stripe Tax or equivalent) with fallback logic and caching to ensure low-latency updates without diverging from final invoicing.

Acceptance Criteria
EU VAT Inclusive Display in Live Cost Meter (B2C)
Given user billing country = DE and ship-to country = DE and no valid EU VAT ID on file And preset per-image base price = €0.80 When the Live Cost Meter loads for a 100-image batch Then the meter displays per-image price inclusive of VAT at 19% as €0.95 (rounded to 2 decimals) And shows a label "Incl. VAT (19%)" on per-image, subtotal, and total lines And amounts are formatted in de-DE locale (e.g., 1.234,56 € for thousands) And changing the billing country to FR updates the VAT rate and all amounts within 300 ms
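As a worked check of the DE example above, the inclusive price is the net rate grossed up by the VAT rate and rounded to the minor unit:

```typescript
// Gross (VAT-inclusive) price in cents from a net rate; rounding to the
// minor unit matches the €0.80 -> €0.95 example at 19% VAT.
function grossPriceCents(netCents: number, vatRate: number): number {
  return Math.round(netCents * (1 + vatRate));
}
```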
US Sales Tax Exclusive Display With Ship-To Override
Given billing state = OR and ship-to state = NY with ZIP = 10001 and tax rate = 8.875% And preset per-image base price = $0.50 and a 10% discount is applied When a 100-image batch is queued Then the meter displays per-image price exclusive of tax as $0.45 and subtotal as $45.00 And displays a separate line "Tax (NY 8.875%) $3.99" and total $48.99 And changing ship-to to OR recalculates tax to $0.00 within 300 ms and updates totals accordingly And month-to-date totals and taxes reflect the updated pending estimate immediately
B2B VAT Reverse Charge Handling
Given billing country = NL with a valid EU VAT ID on file and ship-to country = BE When the Live Cost Meter loads for any batch Then VAT is calculated as 0 and marked "Reverse charge - VAT to be accounted by customer" And prices are displayed exclusive of VAT And the invoice preview shows the buyer VAT ID and reverse charge note And if the VAT ID validation fails, VAT is reapplied within 300 ms using the correct destination rate
Real-Time Tax Estimate Performance, Caching, and Fallback
Given tax provider A is available and the cache is empty When the user edits billing or ship-to details or changes presets/discounts Then a fresh tax estimate is requested and the 95th percentile latency is <= 300 ms for batches up to 1,000 images And identical inputs reuse a cached estimate with TTL = 15 minutes and display an "Estimate (cached <15m)" tag And if provider A times out after 800 ms or returns 5xx, fallback to provider B occurs within 1,000 ms total And the UI shows a non-blocking "Estimate via fallback" indicator And reconciliation against final invoices shows estimate vs final tax delta <= 0.5% of tax or <= 0.05 in currency units, whichever is greater
Currency Localization and Rounding Consistency
Given display locale = fr-FR and currency = EUR When the meter shows per-image, subtotal, tax, and total Then amounts use locale-specific symbols and separators (e.g., 1 234,56 €) and two decimal places And rounding uses round-half-up to the currency minor unit consistently across line items and totals And the sum of per-image amounts equals the displayed subtotal within ±0.01 And for currency = JPY, amounts display zero decimal places with rounding to whole yen And copying values to clipboard preserves numeric precision with no hidden extra decimals
Discounts and Budget Cap Integration With Taxes
Given monthly budget cap remaining = $500.00 and an automatic 20% volume discount is applied And preset per-image base price = $0.50 and ship-to ZIP = 94105 with tax rate = 8.625% When a 600-image batch is queued Then tax is computed on the post-discount subtotal and the meter shows "Projected total (incl. tax): $260.70" And a yellow warning appears when the projected total reaches 80% of the remaining cap And a red block prevents Run when the projected total would exceed the remaining cap, offering Reduce or Confirm Override actions And "Remaining cap after batch" reflects totals including tax
Provider Abstraction Parity and Traceability
Given provider A = Stripe Tax and provider B = InternalTax are enabled via feature flag When the same inputs (billing, ship-to, product tax code, post-discount price) are evaluated by both providers in nightly parity tests Then jurisdiction selection and tax amounts differ by <= 0.2% or <= 0.01 of the currency minor unit, whichever is greater And the abstraction returns a normalized schema (rate, jurisdiction, inclusivity, taxability reason) consumed by the meter And switching providers at runtime does not change the UI field names or labels And an audit log records provider used, version, and request/response IDs for every estimate
Monthly Spend Tracker & Cap Management
"As a subscriber on a capped plan, I want to see my remaining budget and predicted post-batch spend so that I can plan uploads without exceeding my limits."
Description

Track and surface month-to-date spend and remaining plan caps or budgets directly in the cost meter. Reconcile in near-real time with the billing system to reflect processed jobs, pending charges, and credits. Support soft and hard caps, display remaining capacity (e.g., images or currency), and model predicted post-batch totals to show whether a planned run would exceed limits.

Acceptance Criteria
MTD Spend & Remaining Cap Display in Cost Meter
Given a signed-in user with an active plan When they view the Cost Meter on the upload queue Then it displays month-to-date spend as a currency amount with the billing account’s currency code And it displays remaining cap in applicable units (images and/or currency) matching the user’s plan And it shows a "Last updated" timestamp not older than 2 minutes And values match the billing snapshot within ±0.01 in currency and ±1 image
Near-Real-Time Reconciliation of Processed, Pending, and Credits
Given a batch job transitions to Processed and billing posts the charge When reconciliation runs Then MTD processed spend increases by the charged amount within 60 seconds (p95) and 5 minutes (p99) And the corresponding pending amount is reduced to zero for that job Given a credit or refund is applied in billing When reconciliation runs Then MTD processed spend decreases by the credit amount within 60 seconds (p95) and 5 minutes (p99) And the "Credits" line reflects the total credits applied this month If billing is unavailable, a visible "Sync delayed" indicator appears showing the last successful sync time; predictions use the last known data until sync resumes
Soft Cap Warning and Process-Within-Budget Check
Given a soft cap is configured When predicted post-batch totals exceed the soft cap Then the cost meter shows an amber warning with the exceed amount and cap name And the "Process within budget" check displays "Fail" with reason And the user can proceed only after confirming an overage acknowledgement dialog When predicted totals are within the soft cap Then the "Process within budget" check displays "Pass" and no confirmation is required
Hard Cap Enforcement Blocking
Given a hard cap is configured When predicted post-batch totals exceed the hard cap Then the Run action is disabled with a red error explaining which cap is exceeded and by how much And a CTA to Manage Plan is shown When the batch is reduced so predicted totals fall within the hard cap Then the Run action becomes enabled and the error clears At click time, the system re-validates against the latest caps; if still exceeding, the run is blocked
Pricing Breakdown Accuracy by Preset, Discounts, Taxes
Given a batch with multiple presets, each with distinct per-image rates, discounts (coupon, tiered), and applicable taxes When the cost meter calculates the per-batch total Then for each preset the UI shows quantity, unit rate, discount applied, tax, and line total And rounding matches billing rules (banker’s rounding, 2 decimals) And the per-batch total equals the eventual billed amount within ±0.01
Post-Batch Prediction and Limit Evaluation
Given current MTD processed spend, pending charges, credits, and remaining caps When a new batch is queued with selected presets and quantities Then the cost meter shows predicted post-batch MTD processed + pending values And computes remaining capacity after the batch for each applicable cap (images and/or currency) And identifies and labels the binding cap used for the budget check And displays either "Will exceed by <amount>" or "Will remain <amount>" consistent with the computation
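The binding-cap computation above can be sketched as follows. This assumes all caps are expressed in the same currency unit (the criteria also allow image-count caps, which would need a parallel meter); names are illustrative:

```python
def evaluate_budget(mtd_processed: float, pending: float, credits: float,
                    batch_cost: float, caps: dict[str, float]) -> tuple[str, str]:
    """Predict the post-batch MTD total and identify the binding cap:
    the cap with the least remaining headroom after the batch."""
    predicted = mtd_processed + pending - credits + batch_cost
    remaining = {name: limit - predicted for name, limit in caps.items()}
    binding = min(remaining, key=remaining.get)
    if remaining[binding] < 0:
        return binding, f"Will exceed by {-remaining[binding]:.2f}"
    return binding, f"Will remain {remaining[binding]:.2f}"
```

For example, with $200 processed, $50 pending, $10 in credits, and a $100 batch against a $300 monthly cap, the monthly cap binds and is exceeded by $40.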
Auto-Refresh, Concurrency, and Run Validation
Given the user stays on the queue screen When jobs complete, credits arrive, or plan changes occur Then the cost meter auto-refreshes via events or polling at least every 60 seconds without losing user selections And visual updates are non-disruptive (no layout jump > 16px and no input focus loss) And the Run button state reflects the latest validation at the moment of click
Budget Guardrails & Color-Coded Alerts
"As a user managing tight budgets, I want clear visual warnings when I’m about to overspend so that I can adjust settings before processing."
Description

Provide visual guardrails with configurable thresholds (green/amber/red) that respond to real-time predictions for per-batch and MTD costs. Alerts should cover approaching thresholds, cap breaches, missing billing info, and invalid discounts. Ensure WCAG-compliant color contrast, redundant iconography and text labels, and contextual tooltips that explain why a warning appears and how to resolve it.


Acceptance Criteria
Real-Time Threshold Guardrails Update
Given user-defined green, amber, and red thresholds exist for batch budget and MTD cap When the user changes any cost-impacting setting (quantity, preset, discount, tax) or adds/removes images Then predicted batch cost and MTD total recalculate and the guardrail color for each metric updates within 500 ms according to thresholds And each metric chip displays currency value and percentage of the relevant threshold (e.g., "$142 • 71% of batch budget") And threshold settings validate that green < amber < red and accept integers 1–100; invalid input shows inline error and disables Save And saved thresholds persist across sessions for the same user/account
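The threshold validation rule above (integers 1–100, green < amber < red) is small enough to sketch directly; error strings are illustrative:

```python
def validate_thresholds(green: int, amber: int, red: int) -> list[str]:
    """Validate guardrail thresholds: integer percentages in 1-100 with
    green < amber < red. Returns inline error messages; an empty list
    means Save may be enabled."""
    errors = []
    for name, value in (("green", green), ("amber", amber), ("red", red)):
        if not isinstance(value, int) or not 1 <= value <= 100:
            errors.append(f"{name} must be an integer between 1 and 100")
    if not errors and not (green < amber < red):
        errors.append("thresholds must satisfy green < amber < red")
    return errors
```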
Amber Warning at Approaching Budget
Given predicted batch cost or MTD spend crosses the amber threshold but remains below red When the state changes to amber Then an amber alert appears with a warning icon and text label "Approaching budget" And the alert shows current estimate, remaining budget/cap in currency and percent, and provides an "Adjust settings" link And a contextual tooltip explains the reason (inputs used) and how the percentage was calculated And screen readers announce the alert via role="status" as "Warning: Approaching budget"
Red Breach Alert and Pre-Run Budget Check
Given predicted batch cost would exceed the remaining MTD cap or the user-defined batch budget (red threshold) When the user clicks Process/Run Then a pre-run budget check modal opens showing estimated batch cost, projected MTD total, overage amount, and remaining cap And the primary Process action is disabled until either the estimate is reduced below red (updates reflected in real time) or the user explicitly checks "Acknowledge overage" (if overages are permitted for the account) And if the account forbids over-cap processing, the modal shows a blocking error and Process remains disabled with links to Adjust settings or Update plan And a persistent red alert banner remains on the main screen until the condition clears
Missing Billing Information Blocking Alert
Given the account has no valid billing method on file When the user queues a batch or opens the processing screen Then a red alert banner labeled "Billing required" with an error icon is displayed And the Process action is disabled and a primary CTA "Add billing" opens the billing form And a tooltip explains processing cannot proceed without valid billing and lists accepted payment methods And once billing is added and validated, the alert auto-dismisses without page reload and Process becomes enabled
Invalid Discount Detection and Guidance
Given the user enters a discount code When the code is invalid, expired, or inapplicable to selected presets Then an amber inline alert appears near the discount field stating the specific reason (invalid/expired/ineligible) And the discount is not applied to calculations; per-image, batch, and MTD estimates update accordingly within 500 ms And a tooltip lists eligibility rules and (if available) the expiration date, plus a link to View discount terms And when a valid code is entered, the alert clears and the recalculated estimate reflects the discount within 500 ms
WCAG-Compliant, Redundant Alert Cues
Rule: All alert text meets contrast ratio ≥ 4.5:1 against background; non-text UI components meet ≥ 3:1 (WCAG AA) Rule: Alert states do not rely on color alone; each state (green/amber/red) includes a distinct icon and text label Rule: Alerts expose appropriate ARIA roles (status, alert, or alertdialog) and are announced exactly once by screen readers Rule: Keyboard: all alert actions and tooltips are reachable via Tab/Shift+Tab, operable with Enter/Space; Esc dismisses non-blocking surfaces Rule: Tooltips open on hover and focus (and via tap on touch), remain on hover, are dismissible via Esc, and are screen-reader accessible Rule: States remain distinguishable under deuteranopia, protanopia, and tritanopia simulations
Contextual Tooltips for Alerts
Given any alert (amber or red) is visible When the user hovers, focuses, or taps the info icon on the alert Then a tooltip opens within 150 ms containing: the cause of the alert, key inputs used in the calculation (image count, per-image rate, discounts, taxes), and 1–3 recommended actions to resolve And the tooltip includes a Learn more link to documentation and is positioned to avoid obscuring critical controls with at least 8 px viewport margin And the tooltip supports multiline content up to 320 px width and closes on Esc, blur, or tap outside And when the alert condition resolves, the tooltip auto-closes
Process-Within-Budget Preflight Check
"As a shop owner, I want a final budget check with suggestions before I start processing so that I don’t accidentally trigger an overage."
Description

Add a preflight validator that, on run, verifies the batch can be processed within the user’s budget, caps, and policy constraints. Provide actionable recommendations (e.g., switch to a lower-cost preset, reduce resolution, split batch) and support one-click optimization to meet a target spend. Respect hard caps by blocking processing with clear guidance; allow overrides only for authorized roles when policies permit.

Acceptance Criteria
Within-Budget Green Check
Given the user’s month-to-date (MTD) spend is $200 against a monthly cap of $1000 and the current batch estimate is $120 including taxes and discounts When the user clicks Run Then the preflight validator confirms the projected MTD total of $320 is within the cap and displays a green "Within budget" check And the Run action proceeds without warnings And the preflight panel shows per-image rate by preset, applied discounts, taxes, batch total, MTD total, and remaining cap
Block Over Hard Cap
Given the account has a hard monthly cap of $500, the MTD spend is $480, and the batch estimate is $30 including taxes and discounts When the user clicks Run Then the preflight blocks processing and displays a red warning stating "Exceeds cap by $10" And the warning includes guidance to reduce image count, switch to a lower-cost preset, lower resolution, or split the batch And the Run button remains disabled until the projected total is within the cap or an authorized override is applied And no override control is shown if policy hard_cap_enforced=true
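The worked example above ($480 MTD + $30 estimate against a $500 hard cap blocks with "Exceeds cap by $10") reduces to a simple projection check. A minimal sketch, using floats for brevity where production billing math would use exact decimals:

```python
def preflight_check(mtd_spend: float, batch_estimate: float, hard_cap: float) -> dict:
    """Compare the projected MTD total against the hard cap and produce
    the block/allow decision plus the user-facing overage message."""
    projected = mtd_spend + batch_estimate
    overage = projected - hard_cap
    if overage > 0:
        return {"allowed": False, "message": f"Exceeds cap by ${overage:.2f}"}
    return {"allowed": True, "message": "Within budget"}
```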
Recommend Lower-Cost Preset to Meet Target Spend
Given the user sets a target batch spend of $100 and the current estimate is $140 using preset "Studio Pro" at $0.70/image When the preflight runs Then the recommendations list includes at least one lower-cost preset option with predicted total ≤ $100 when available (e.g., "Basic Clean" at $0.45/image predicted total $90) And recommendations are sorted by predicted total ascending and show per-image rate, predicted total, and savings vs current And selecting a recommendation re-estimates immediately and updates the batch total and remaining cap
One-Click Optimize To Budget
Given the user clicks "Optimize to budget" with a target of $250 When the preflight optimization runs Then the system adjusts settings in priority order: (1) downgrade preset, (2) reduce resolution no lower than the brand minimum, (3) disable optional effects, while respecting brand-locked constraints And the re-estimated total is ≤ $250; if not achievable, the UI shows the best-achievable total and names the constraints preventing target attainment And all proposed changes are listed as a diff and can be reverted with one click And a confirmation toast shows expected savings and new per-image rate range
Suggest Split Batch Within Remaining Cap
Given the remaining monthly cap is $75 and the active preset cost is $0.50/image with 300 images queued When the preflight runs Then the validator calculates that 150 images can be processed now and recommends splitting into 150-now and 150-later And the suggestion shows predicted cost for the first sub-batch, updated MTD total, and defers the second sub-batch with status "Pending budget" And clicking "Split and Run Now" creates two batches accordingly, starts the first, and queues the second without exceeding the cap
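The split calculation in this scenario (300 images at $0.50/image with $75 remaining yields 150 now, 150 later) can be sketched as:

```python
def split_for_remaining_cap(image_count: int, per_image_cost: float,
                            remaining_cap: float) -> dict:
    """Split a queued batch so the first sub-batch fits within the
    remaining monthly cap; the rest is deferred as 'Pending budget'."""
    affordable = min(image_count, int(remaining_cap // per_image_cost))
    return {
        "now": affordable,
        "later": image_count - affordable,
        "first_batch_cost": round(affordable * per_image_cost, 2),
    }
```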
Authorized Override Within Policy
Given the policy allow_overage=true with overage_limit_percent=10% and hard_cap_enforced=false, the cap is $1000, MTD is $950, and the batch estimate is $80 When the preflight runs Then it shows an amber warning "Over budget by $30 (3%)" and displays an "Override and Run" control to users with role Billing Manager or Admin only And clicking override requires MFA and a required justification note And the audit log records user, timestamp, projected totals, overage percent, policy flags, and batch ID And the override is blocked if the projected MTD total would exceed the cap by more than 10% or if the user lacks authorization
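The override gate above combines four conditions (policy flags, role, and overage percentage). A sketch, assuming the role names and policy keys shown in the criteria; the MFA and audit-log steps are omitted:

```python
def can_override(policy: dict, role: str, overage_percent: float) -> bool:
    """Show the 'Override and Run' control only when the policy allows
    overages, hard caps are not enforced, the user holds an authorized
    role, and the projected overage is within overage_limit_percent."""
    return (
        policy.get("allow_overage", False)
        and not policy.get("hard_cap_enforced", True)
        and role in {"Billing Manager", "Admin"}
        and overage_percent <= policy.get("overage_limit_percent", 0)
    )
```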
Cost Meter UI Components & Accessibility
"As a user comparing options, I want a clear, accessible cost panel that breaks down every component of price so that I can make informed choices quickly."
Description

Build reusable UI components that render the live cost meter: per-batch summary, per-image rate breakdown by preset, applied discounts, tax line, totals, and MTD panel with remaining cap. Components must be responsive, performant for large queues, keyboard-navigable, and screen-reader friendly with concise ARIA labels. Include a currency selector (where allowed), hover details for line items, and stable layout to avoid jitter as values update.

Acceptance Criteria
Real-time Per-Batch Cost Summary Updates
Given a batch with N images and selected presets When I add/remove images or change a preset or discount Then the subtotal, discounts line, tax line, and total update within 300ms of the last change without a page reload And Then the displayed total equals (subtotal − discounts + tax) within the currency’s minor unit rounding rules (±0.01 for USD) When no items are in the batch Then the meter displays $0.00 (or currency equivalent) with disabled state for breakdown sections
Per-Image Rate Breakdown by Preset with Hover/Focus Details
Given multiple presets are selected with different per-image rates When I expand the rate breakdown Then each preset row shows preset name, per-image rate, image count, and extended cost When I hover or keyboard-focus a preset row or info icon Then a tooltip/hovercard appears within 150ms showing how the effective rate is computed (list price, bulk discount, promo) and dismisses on blur/Escape And Then the tooltip stays within the viewport and does not occlude the totals area
Discounts, Taxes, and Totals Accuracy
Given discounts and a tax rate are configured When the cost meter computes totals Then the discount is applied to the subtotal first and tax is calculated on the post-discount amount And Then all monetary values are rounded to the currency’s minor unit and the displayed total matches the computed total within ±0.01 (USD example) When no discount or tax applies Then the corresponding line is hidden or shown as 0.00 in a muted style per design
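The ordering rule above (discount off the subtotal first, then tax on the post-discount amount) can be sketched as below; floats keep the example short, though real billing math should use exact decimal arithmetic:

```python
def compute_totals(subtotal: float, discount: float, tax_rate: float) -> dict:
    """Apply the discount to the subtotal first, then calculate tax on
    the post-discount amount, so total = subtotal - discount + tax."""
    discounted = subtotal - discount
    tax = round(discounted * tax_rate, 2)
    total = round(discounted + tax, 2)
    return {"subtotal": subtotal, "discount": discount, "tax": tax, "total": total}
```

With a $100 subtotal, $10 discount, and 8% tax, tax is charged on $90 ($7.20), not $100, giving a $97.20 total.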
Month-to-Date Remaining Cap and Budget Warnings
Given the account has a monthly spend cap and current month-to-date (MTD) spend When the MTD panel renders Then it displays MTD spend, cap, and remaining balance And Then color states apply: green (remaining ≥ 30% of cap), amber (10% ≤ remaining < 30%), red (remaining < 10% or projected batch total > remaining) When the projected batch total exceeds remaining Then a “process within budget” warning displays the projected overage and, if a Process action is present, it is disabled until cost settings are adjusted or policy allows an explicit override control
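The color-state rules above map directly to a small classifier:

```python
def mtd_color(remaining: float, cap: float, projected_batch_total: float) -> str:
    """Color state per the MTD panel rules: red when remaining < 10% of
    the cap or the projected batch exceeds remaining; amber when
    10% <= remaining < 30%; green otherwise."""
    if projected_batch_total > remaining or remaining < 0.10 * cap:
        return "red"
    if remaining < 0.30 * cap:
        return "amber"
    return "green"
```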
Currency Selector Availability and Formatting
Given multi-currency is allowed for this account/region When I open the currency selector Then only permitted currencies are listed and the current selection is indicated When I switch currency Then all monetary values re-render immediately with correct symbol, grouping/decimal separators, and minor unit precision for that currency and an accessible ISO code label (e.g., USD) for screen readers When multi-currency is not allowed Then the currency selector is not rendered
Performance and Layout Stability Under Large Queues
Given a queue of 2,000 images with mixed presets When cost updates occur at up to 2 updates per second Then main-thread work per update is ≤ 50ms and frame rate remains ≥ 50 FPS on target test devices, without memory growth > 10% over 60 seconds attributable to the component And Then dynamic value changes do not introduce horizontal scrolling at widths ≥ 320px and cumulative layout shift within the component remains ≤ 0.02 during updates
Accessibility: Keyboard Navigation and Screen Reader Behavior
Given keyboard-only navigation When I Tab/Shift+Tab through the meter Then all interactive elements (currency selector, expand/collapse controls, info icons) are reachable in a logical order, operable via Enter/Space, and have visible focus indicators with ≥ 3:1 contrast When values change Then totals and MTD remaining are exposed via an aria-live polite region throttled to ≤ 1 announcement/second, with concise labels (≤ 80 characters) and unchanged values are not re-announced And Then automated WCAG 2.1 AA checks via axe-core report 0 critical violations and text contrast is ≥ 4.5:1

Smart Caps

Set soft and hard monthly (or daily) caps by workspace, brand, project, or client. Choose what happens at each threshold—auto‑queue to next cycle, pause high‑cost steps (e.g., ghosting), or request approval. Time‑zone aware resets, cap exceptions for launches, and clear logs keep spend predictable without slowing the team.

Requirements

Hierarchical Caps by Scope
"As an operations admin, I want to set and manage caps at workspace, brand, project, and client levels with clear precedence so that spend stays predictable across teams without manual tracking."
Description

Enable admins to define daily or monthly processing caps at workspace, brand, project, and client levels with clear precedence and inheritance. Lower scopes inherit defaults from higher scopes but can be overridden within allowed bounds. Provide a unified UI and API to create, edit, and visualize caps per scope, including current usage, remaining capacity, and next reset time. Ensure caps apply across PixelLift batch pipelines without disrupting in-flight jobs, and reconcile usage from all steps (retouch, background removal, ghosting, presets) into a single meter per scope.

Acceptance Criteria
Inheritance and Precedence Across Scopes
Given a workspace monthly cap of 10,000 credits with allowed child override range [50%, 100%] When a brand admin sets a monthly cap for Brand A Then the system accepts values between 5,000 and 10,000 inclusive and rejects any value outside that range with a 400 error and validation message "Cap must be within parent override range" Given a brand has a monthly cap and a project under the brand has no explicit cap When effective caps are calculated Then the project inherits the brand's cap value and allowed override range Given caps exist at multiple scopes (workspace > brand > project > client) When enforcing caps during job admission Then the effective cap is the most specific defined cap for the job's scope, constrained by its parent scopes, and remaining capacity equals effective_cap minus reconciled_usage Given a parent cap is decreased below a child cap When the change is saved Then the child cap is clamped to the new maximum allowed, an audit log entry is created, and the admin is notified in the UI within 5 seconds Given no caps are defined at any scope When a job is admitted Then no cap enforcement occurs and usage still accumulates to meters
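The parent override-range rule above (a 10,000-credit workspace cap with a [50%, 100%] range accepts child caps of 5,000–10,000, and a parent decrease clamps children) can be sketched as; the return shape is illustrative:

```python
def validate_child_cap(parent_cap: int, override_range: tuple[float, float],
                       child_cap: int) -> tuple[bool, int]:
    """Check a child-scope cap against the parent's allowed override
    range. Returns (ok, value): on save, a False result maps to a 400
    rejection; on a parent decrease, the clamped value is applied."""
    low = int(parent_cap * override_range[0])
    high = int(parent_cap * override_range[1])
    if low <= child_cap <= high:
        return True, child_cap
    return False, max(low, min(child_cap, high))
```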
UI Creation, Editing, and Visualization Per Scope
Given an admin opens the Caps UI for a scope When they create or edit Daily and Monthly caps, select a timezone, and define an allowed override range for children Then the form validates required fields, numeric ranges (>0 integers), and prevents saving if outside the parent range with an inline error Given the admin saves valid changes When the operation completes Then the changes persist and are retrievable via API within 2 seconds, next reset timestamps are recalculated and displayed, and an audit trail entry (actor, scope, before/after, timestamp) is created Given the scope details view When it is displayed Then it shows current usage this period, remaining capacity, and next reset time; values auto-refresh at least every 15 seconds Given a scope currently inherits its cap When the admin toggles "Revert to inherit" after having an override Then the explicit cap is removed, inherited values are shown, and the UI marks the scope as "Inheriting"
Time-Zone Aware Resets and Launch Exceptions
Given a scope with a Daily cap and timezone America/Los_Angeles When local midnight occurs (including DST transitions) Then the usage meter resets to 0 at 00:00 local time, an audit log "reset" entry is created, and next_reset_at shows the following local midnight adjusted for DST Given a Launch Exception with start_at, end_at, and +20% cap increase When the current time is within the exception window Then the effective cap increases by 20% and UI/API label the state as "Exception Active"; when the window ends, enforcement reverts automatically Given a missed reset due to downtime When services recover Then catch-up resets are applied once per missed period in order, preserving history, and reset events are emitted as webhooks within 5 seconds
Caps API CRUD, Validation, and Idempotency
Given an authenticated admin with caps:write permission When they POST/PUT/PATCH /v1/caps with scope_type, scope_id, period (daily|monthly), cap_value, timezone, and override_range Then the API validates parent constraints and returns 201/200 with resource fields: effective_cap, inherited (true|false), next_reset_at, current_usage Given a request that violates parent bounds When it is processed Then the API returns 409 with error_code "CAP_OUT_OF_BOUNDS" and does not persist the change Given a client calls GET /v1/caps?scope_id=... (or by scope_type) When the request completes Then the API returns effective cap records including inheritance chain identifiers within 500 ms p95 Given write requests include Idempotency-Key When the same key is reused within 24 hours Then the original response is returned without duplicating side effects Given DELETE /v1/caps/:id When it is called Then the explicit cap is removed, inheritance applies, 204 is returned, and an audit log entry is created Given any API call without required OAuth2 scopes When it is attempted Then 401/403 is returned as appropriate
Enforcement Across Pipelines Without Disrupting In‑Flight Jobs
Given a scope with a hard monthly cap and action "auto‑queue to next cycle" When current_usage reaches cap_value during processing Then in‑flight jobs continue to completion, no new step starts are admitted for that scope, and new jobs are queued with status "Queued until reset" Given a scope with a soft cap at 80% and action "pause ghosting, request approval" When usage crosses 80% Then ghosting steps for new jobs are paused while other steps continue, an approval request is sent to admins via email and web within 30 seconds, and the UI shows a "Soft cap reached" banner Given an admin approves a temporary overage of N credits with an expiry When approval is granted Then admissions are allowed until usage reaches cap_value + N or until expiry, whichever comes first; all approval events are audit‑logged and visible via API Given a reset or approval expiry occurs When capacity becomes available Then queued jobs resume admission in FIFO order and resume events are logged Given multiple pipelines (retouch, background removal, ghosting, presets) When enforcement checks capacity Then a single reconciled meter is used for decisions across all pipelines
Unified Meter and Usage Reconciliation Across Steps
Given a job with steps retouch, background removal, ghosting, and presets with configured credit costs When each step completes Then the single meter for each applicable scope increments by the step cost exactly once, de‑duplicated by step‑run ID Given a step retry or duplicate message When reconciliation runs Then no additional usage is recorded for the same step‑run ID Given a step is rolled back or cancelled after recording usage When compensation is triggered Then previously recorded usage for that step is reversed within 60 seconds and reflected in current_usage Given ongoing processing across the system When usage updates occur Then reconciliation achieves eventual consistency within 60 seconds p95 and UI/API reflect changes within 15 seconds p95 Given a job belongs to workspace, brand, project, and client scopes When step usage is recorded Then the same usage is attributed to each scope's meter once (no double‑counting within a scope)
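The exactly-once accounting above (de-duplication by step-run ID, with compensation reversing recorded usage) is captured by this in-memory sketch; a real implementation would persist the ledger and handle the 60-second reconciliation window:

```python
class UsageMeter:
    """Single reconciled meter for a scope: each step-run ID counts
    exactly once, retries and duplicate messages are no-ops, and
    compensation reverses a previously recorded step."""
    def __init__(self) -> None:
        self.current_usage = 0
        self._recorded: dict[str, int] = {}

    def record(self, step_run_id: str, cost: int) -> None:
        if step_run_id in self._recorded:  # duplicate or retry: ignore
            return
        self._recorded[step_run_id] = cost
        self.current_usage += cost

    def compensate(self, step_run_id: str) -> None:
        # Reversing an unknown or already-reversed step is a no-op.
        self.current_usage -= self._recorded.pop(step_run_id, 0)
```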
Soft & Hard Caps with Threshold Actions
"As a studio manager, I want to configure soft and hard thresholds with automatic actions so that production continues safely while preventing budget overruns."
Description

Allow configuration of one or more soft thresholds (e.g., 70%, 85%, 95%) and a hard cap per scope. For each threshold, enable action rules such as auto-queue new jobs to the next cycle, pause high-cost steps (e.g., ghosting or 8K upscaling), require approval before continuing, or notify stakeholders. Ensure actions are atomic, idempotent, and recoverable, with sensible defaults and fallbacks. Provide per-scope policies and templates that can be reused across brands and projects, and guarantee that hard caps block additional spend while preserving job integrity.

Acceptance Criteria
Per-Scope Policy Configuration and Templates
Given an admin selects a scope (workspace, brand, project, or client), When they create or edit a cap policy, Then they can define one or more soft thresholds (percentages) and a hard cap amount for the scope. Given no soft thresholds are specified, When the policy is saved, Then the system applies default soft thresholds of 80%, 90%, and 95% and displays them in the policy summary. Given a saved policy is exported as a template, When the template is applied to another scope, Then thresholds and associated action rules are copied and linked to the new scope without affecting the original. Given multiple policies could apply, When evaluating which policy to enforce for a project, Then policy precedence is project > brand > client > workspace, and the effective policy is shown in the UI and API. Given a policy is updated, When new jobs are submitted after the update, Then they use the new policy version, while jobs already in progress continue under their original policy version.
Auto-Queue to Next Cycle at Soft Threshold
Given a soft threshold of 85% with the action "auto-queue new jobs to next cycle" is active for a scope, When a new job is submitted after the scope's spend crosses 85% in the current cycle, Then the job is accepted but placed in a "Queued for next cycle" state and assigned an estimated start time at the next cycle boundary in the scope's time zone. Given the job is queued for next cycle, When the cycle boundary occurs, Then the job transitions to "Ready" and begins processing automatically unless a hard cap is already reached in the new cycle. Then no spend is recorded for the queued job in the original cycle, and a notification is sent to the submitter and watchers with the expected start time.
Pause High-Cost Steps at Threshold
Given a job includes steps labeled as high-cost (e.g., ghosting, 8K upscaling) and a threshold rule "pause high-cost steps" is active, When the scope's spend crosses the configured threshold, Then the job continues non-high-cost steps but pauses high-cost steps at the next safe checkpoint and marks them as "Paused - Cap Threshold." Given a paused high-cost step, When the cycle resets or an authorized override is granted, Then the step can be resumed exactly once without duplicating work or charges. Then the system records zero cost for paused steps and ensures no partial charges are posted.
Approval Required Before Continuing Near Cap
Given a threshold rule is configured to "require approval before continuing" for a scope, When a job reaches the threshold condition, Then processing is suspended at the next safe checkpoint and an approval request with job details, remaining budget, and options is sent to the configured approvers via in-app and email channels. Given multiple approval requests or retries, When an approver submits a decision, Then the decision is applied idempotently; subsequent duplicate approvals have no effect. Given no decision is received within 24 hours, When the SLA expires, Then the fallback action configured for the rule (auto-queue, pause, or cancel) is applied and logged. Given an approval is granted, When processing resumes, Then the job continues under the same policy version and all actions are audited with approver, timestamp, and reason.
Hard Cap Enforcement Preserving Job Integrity
Given a scope has a hard cap configured, When cumulative reserved spend reaches the hard cap within a cycle, Then no new chargeable step may start, and all new jobs are queued for the next cycle unless an approved exception applies. Given a chargeable step is about to start, When its estimated maximum cost would exceed the remaining cap, Then the system does not start the step and instead places the job in a "Waiting for next cycle or approval" state. Given a chargeable step starts, When it completes, Then the actual cost is deducted from the pre-reserved budget, ensuring the total spend never exceeds the hard cap for the cycle.
Action Atomicity, Idempotency, and Recoverability
Given a threshold event occurs (scope+cycle+threshold), When the associated action rule is executed, Then the action is applied exactly once per unique event using a deterministic event key and idempotency guard. Given a partial failure or process crash during an action, When the worker restarts, Then the action resumes or rolls back to a safe state without duplicating side effects and without starting chargeable steps twice. Given high concurrency with multiple workers, When evaluating thresholds, Then race conditions are prevented using optimistic locking or transactions, and audit logs show a single authoritative action result. Then each executed action writes an immutable audit record with event key, prior state, new state, actor/system, and timestamp, and emits a metric for monitoring.
Time-Zone-Aware Cycle Resets and Launch Exceptions
Given a scope has a defined time zone and monthly or daily cycle, When the cycle boundary occurs in that time zone, Then soft threshold counters and hard cap balances reset, and "Queued for next cycle" jobs become eligible to start in FIFO order. Given an approved launch exception window is configured with start/end time and an exception cap, When the current time falls within the window, Then the exception cap and rules temporarily override the normal cap for the scope and all actions and spend are attributed to the exception in logs and reports. Given the exception window ends or the exception cap is reached, When additional jobs are submitted, Then normal caps and actions resume automatically and any pending jobs follow the non-exception policy.
Time‑Zone Aware Reset Schedules
"As a finance lead, I want caps to reset based on each brand’s local time zone so that reporting aligns with our accounting periods and regional operating hours."
Description

Support cap resets on daily, weekly, or monthly schedules tied to a specified time zone per scope. Handle calendar edge cases (month length, leap years) and daylight saving time changes deterministically, with explicit reset timestamps. Allow administrators to set custom reset times (e.g., 6 AM local) and display countdowns to reset in UI and API. Include proration logic for mid-cycle changes and show historical cycles for context.

Acceptance Criteria
Daily Reset at 6 AM Local per Brand
Given a brand scope "Acme" with time zone "America/Los_Angeles" and a cap reset schedule configured to Daily at 06:00 local And the current cycle started at the most recent 06:00 America/Los_Angeles When the local time reaches the next 06:00 America/Los_Angeles Then a new cycle is created with start_at equal to that 06:00 timestamp in ISO 8601 with UTC offset And the previous cycle end_at equals the new cycle start_at And the cap usage counters are reset to 0 at that moment And the API GET /caps/{id} returns next_reset_at matching that timestamp and time_to_reset_seconds <= 5 at that instant And an audit log entry is recorded with reset_reason="scheduled", scope_type="brand", and the correct scope_id
Weekly Reset on Monday 00:00 in Europe/Berlin
Given a workspace scope "Studio" with time zone "Europe/Berlin" and a weekly reset schedule set to Monday at 00:00 local When local time crosses Monday 00:00 for three consecutive weeks, including a week containing a DST change Then exactly one reset occurs at each Monday 00:00 local according to the IANA rules for Europe/Berlin And each new cycle has start_at at Monday 00:00 with the correct local UTC offset for that date And no additional resets occur within those weeks And the API for that cap shows next_reset_at for the upcoming Monday 00:00 local and a countdown whose seconds-to-reset matches the difference between now and next_reset_at
Monthly Reset on Last Day 23:59:59 in Asia/Tokyo with Leap-Year Handling
Given a client scope "Boutique" with time zone "Asia/Tokyo" and a monthly reset schedule set to the last calendar day at 23:59:59 local When the months end for January (31 days), April (30 days), and February in both a leap year and a non-leap year Then the reset occurs at 23:59:59 local on the actual last day of each month And the new cycle's start_at equals that reset timestamp in ISO 8601 with offset, and the previous cycle's end_at equals start_at And the API/UI show next_reset_at for the correct last-day timestamp for the current month And no duplicate or missed resets occur across these month lengths
DST Transition Determinism for America/New_York
Given a project scope "Lookbook" with time zone "America/New_York" and a daily reset at 02:30 local When the spring-forward transition occurs and 02:30 local does not exist Then the reset for that day occurs at the nearest future valid local time (03:00), recorded as start_at with the correct offset, and exactly one reset occurs that day And Given the same scope configured to a daily reset at 01:30 local When the fall-back transition occurs and 01:30 local occurs twice Then exactly one reset occurs at the first 01:30 occurrence, start_at records the corresponding timestamp with the applicable offset, and that day's cycle duration is 25 hours And no second reset is triggered at the repeated 01:30
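The gap/fold rules above (a nonexistent spring-forward time rolls to the first valid instant; a repeated fall-back time uses the first occurrence) can be sketched with `zoneinfo`. The minute-granularity scan is a simplification for illustration, not a production scheduler:

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

def resolve_wall_time(wall: datetime, tz: ZoneInfo) -> datetime:
    """Resolve a naive local reset time against a time zone. A time in a
    DST gap (e.g., 02:30 on spring-forward day in America/New_York) rolls
    forward to the first valid instant (03:00); an ambiguous fall-back
    time resolves to its first occurrence (fold=0)."""
    candidate = wall.replace(tzinfo=tz, fold=0)
    # A nonexistent wall time fails the UTC round-trip; scan forward
    # minute by minute until the wall clock exists.
    while (candidate.astimezone(timezone.utc).astimezone(tz).replace(tzinfo=None)
           != candidate.replace(tzinfo=None)):
        wall = wall + timedelta(minutes=1)
        candidate = wall.replace(tzinfo=tz, fold=0)
    return candidate
```

For the spring 2024 transition, `resolve_wall_time(datetime(2024, 3, 10, 2, 30), ZoneInfo("America/New_York"))` yields 03:00 EDT, matching the criterion above.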
Mid-Cycle Change Proration for Cap and Schedule Updates
Given a brand scope with time zone "America/Chicago" and a Monthly cap of 1000 credits resetting on the 1st at 06:00 local And on the 10th at 12:00 local an admin changes the cap to a Monthly cap of 1500 credits, effective immediately within the same cycle When the change is saved Then remaining_allowed = floor(1500 * remaining_time_in_cycle_seconds / total_cycle_seconds) is computed and applied immediately And the API returns proration_factor with precision >= 0.000001 and remaining_allowed reflecting the above formula And an audit log captures prior_cap, new_cap, proration_factor, and changed_at And Given a project scope with a Weekly schedule (Monday 00:00) When the admin changes the schedule mid-cycle to Daily at 06:00 local, effective immediately Then the current cycle ends at the next Daily 06:00 local, and a prorated cap for the remainder until that time is applied using the same time-ratio formula, with next_reset_at updated accordingly
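The proration formula above is straightforward to compute; integer arithmetic keeps the floor exact, and the factor is rounded to the required six decimal places. The example figures below (a 30-day cycle with the change 9 days 6 hours in) are illustrative:

```python
def prorate_cap(new_cap: int, remaining_s: int, total_s: int) -> tuple[int, float]:
    """remaining_allowed = floor(new_cap * remaining_time / total_time).
    Integer division gives an exact floor; the factor carries 1e-6 precision."""
    remaining_allowed = (new_cap * remaining_s) // total_s
    proration_factor = round(remaining_s / total_s, 6)
    return remaining_allowed, proration_factor

# e.g. a 30-day cycle (2,592,000 s) with 1,792,800 s remaining:
# prorate_cap(1500, 1_792_800, 2_592_000) -> (1037, 0.691667)
```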
UI and API Countdown to Next Reset with Time-Zone Labels
Given any scope with a configured time zone and reset schedule When a user views the cap in the UI and calls GET /caps/{id} Then the API returns next_reset_at (ISO 8601 with offset), next_reset_tz (IANA ID), and time_to_reset_seconds as a non-negative integer And the UI displays a countdown timer that ticks down at 1-second intervals and never becomes negative And across DST transitions the countdown remains monotonic and reaches 0 exactly at next_reset_at And UI labels/tooltips show "Resets at {local time} ({IANA TZ})" matching next_reset_at
Historical Cycle Boundaries and Explicit Reset Timestamps
Given any scope with at least six completed cycles including periods spanning Feb 29 and a DST change When the user requests GET /caps/{id}/cycles?limit=6 Then the response contains six cycle records ordered by start_at desc, each with start_at, end_at, tz (IANA), and utc_offset fields And each cycle end_at equals the next cycle start_at with no gaps or overlaps And cycle boundaries align to the configured schedules and local offsets for those dates And historical records are immutable: attempts to modify a past cycle return HTTP 409 and do not change stored data
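The no-gaps/no-overlaps invariant can be checked by comparing each cycle's `end_at` to the next cycle's `start_at` as instants, so records with different UTC offsets (e.g. across a DST change) still match. A minimal validator, with an assumed record shape:

```python
from datetime import datetime

def cycles_contiguous(cycles: list[dict]) -> bool:
    """cycles: newest-first records with ISO 8601 'start_at'/'end_at' fields
    (offset included). Each older cycle's end_at must equal the next (newer)
    cycle's start_at; comparison is by instant, not by string."""
    for newer, older in zip(cycles, cycles[1:]):
        if datetime.fromisoformat(older["end_at"]) != datetime.fromisoformat(newer["start_at"]):
            return False
    return True
```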
Exceptions & Temporary Overrides
"As a brand lead, I want to request temporary cap increases for launches with a clear approval trail so that critical campaigns are not blocked while spend remains controlled."
Description

Provide time-bound exceptions for launches or campaigns that temporarily increase or bypass caps. Exceptions include start/end time, affected scopes, new limit or multiplier, and justification. Require approver selection and optional attachments. Ensure exceptions auto-expire, are conflict-checked against existing policies, and clearly annotate affected jobs and dashboards. Maintain a full audit trail and summary of incremental spend attributed to each exception.

Acceptance Criteria
Create Time‑Bound Exception With Required Metadata
Given I am a workspace admin with Manage Caps permission And a workspace/brand/project/client scope exists When I create an exception with start and end timestamps (end > start), selected scope(s), either a new absolute limit or a multiplier (mutually exclusive), a justification, a selected approver, and optional attachments And I select a time zone (or accept the workspace default) Then the exception is saved in Pending Approval state with a unique ID And required fields are validated with inline errors for any omissions or invalid values And the scheduling honors the selected time zone for activation/deactivation And the exception window cannot be entirely in the past (no retroactive effect)
Approval Workflow and Notifications
Given an exception has been submitted and is in Pending Approval with approver X When the submission occurs Then approver X receives an in‑app notification and email with the exception details and justification And the approver can Approve or Decline with an optional comment And on Approve, the status updates to Scheduled (future start) or Active (start ≤ now); on Decline, the status updates to Declined And the creator receives a notification with the decision and comment And the audit trail records submit/approve/decline events with timestamp, actor, and comment
Automatic Expiry and Safe Reversion
Given an Active exception with an end time T in its configured time zone When the clock reaches T Then the exception automatically transitions to Expired without manual action And all new jobs after T are evaluated against the original caps/policies And queued jobs not yet started are re‑evaluated against original caps before execution And an audit entry is recorded: Auto‑expired with timestamp and affected scopes
Conflict Detection and Resolution With Existing Policies
Given I am creating or editing an exception whose time window and scope overlap with existing exceptions or cap policies When I attempt to save Then the system surfaces a list of conflicts (referencing IDs, scopes, and time windows) And saving is blocked until I resolve conflicts by adjusting time/scope or selecting an allowed precedence per policy And attempts to exceed global hard caps are blocked unless Elevated Approval is provided; otherwise validation fails with a clear error And all conflicts and chosen resolutions are captured in the audit trail
Annotation of Affected Jobs and Dashboards
Given jobs run while an exception is Active and within its scope When those jobs are processed Then each affected job is annotated with an Exception badge and the exception ID And job detail views display the exception link and justification And dashboards show an Active Exception banner, support filtering by exception ID, and separate spend metrics under exception vs normal And CSV/API exports include fields isException=true and exceptionId for affected records
Incremental Spend Attribution and Reporting
Given an exception modifies limits via a new cap or multiplier When the exception expires or a report is requested Then the system calculates incremental spend attributable to the exception as (actual spend under exception) − (projected spend under original caps for the same period and scope) And displays total incremental spend, items processed under exception, and percentage uplift And attributes these metrics to the exception in dashboards, reports, and audit summaries And the metrics are available via CSV/API export with the exception ID
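The attribution arithmetic reduces to a subtraction plus a percentage; a tiny sketch (function name and one-decimal rounding are assumptions):

```python
def exception_uplift(actual_spend: float, projected_spend: float):
    """Incremental spend = actual spend under the exception minus projected
    spend under the original caps for the same period/scope; uplift is that
    delta as a percentage of the projection (None when projection is zero)."""
    incremental = actual_spend - projected_spend
    uplift_pct = round(100 * incremental / projected_spend, 1) if projected_spend else None
    return incremental, uplift_pct
```

For instance, 1,500 credits spent under an exception against a 1,200-credit projection attributes 300 incremental credits and a 25.0% uplift to that exception.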
Approval Workflow & Notifications
"As an approver, I want actionable notifications and a simple approval queue so that I can unblock high-priority work without compromising budget policies."
Description

Implement a lightweight approval workflow triggered by thresholds, hard caps, or exception requests. Route to designated approvers based on scope with fallback delegates and SLAs. Provide in-app, email, and Slack notifications with deep links for one-click approve/deny and required rationale. Surface pending approvals in a consolidated queue, and unblock or queue jobs automatically based on decisions. Log all actions and communicate outcomes to requestors and job owners.

Acceptance Criteria
Threshold-Triggered Approval Routing with SLA and Fallback
Given a soft cap threshold is configured for a scope (workspace/brand/project/client) When a job’s estimated or actual spend crosses the threshold Then an approval request is created and routed to the designated approver group for that scope And the primary approver is notified and an SLA timer starts using the configured duration And the job enters an Approval Pending state and does not proceed When the SLA expires without decision Then the request auto-escalates to the configured fallback delegate and the SLA status shows Overdue And only one active approver is assigned at a time to prevent parallel decisions
Hard Cap Breach Blocks and Auto-Queue Behavior
Given a hard cap is configured and the setting "Auto-queue to next cycle" is enabled for the scope When a job would cause the cap to be exceeded Then the job is blocked and an approval request is created And the job is placed into the Next Cycle queue with the next reset date/time displayed When the request is approved before the reset Then the job is unblocked within 5 seconds and removed from the Next Cycle queue When the request is denied Then the job remains queued or is canceled per policy setting, and the decision is recorded
Exception Request for Launch Cap Override
Given a launch exception window is configured for a scope When a requester submits an exception Then the form requires reason, duration window, scope, and expected spend variance And the system validates the request aligns to the configured launch scope and timeframe When the exception is approved Then the cap limit increases only for the approved scope and duration And after the window ends, caps revert and pending jobs re-evaluate against original caps When the exception is denied Then no cap changes occur and the requester is notified with the rationale
Multi-Channel Notifications with Deep Links and Rationale Capture
Given an approval request is created or updated When notifications are sent Then in-app, email, and Slack (if connected) notifications are delivered to the active approver And each notification includes scope, requester, spend impact, SLA due time in the approver’s timezone, and a deep link to the action When the approver clicks Approve or Deny via any channel Then a one-click action screen opens with decision buttons And rationale is required on Deny and is captured; rationale on Approve is required only if configured And the decision is reflected system-wide within 2 seconds and duplicate notifications are suppressed thereafter
Consolidated Approvals Queue Visibility and Filtering
Given a user has approver permissions for one or more scopes When they open the Approvals Queue Then they see a consolidated list of pending requests for their scopes only And they can filter by status (Pending, Escalated, Overdue), scope, requester, cap type (Threshold/Hard/Exception), and date range And they can sort by SLA remaining (default ascending), created time, or spend impact When a row is opened Then details show scope, jobs impacted, spend deltas, history, and Approve/Deny actions
Decision Outcomes Unblock/Queue Jobs and Notify Stakeholders
Given an approval request is linked to one or more jobs When the approver selects Approve Then linked jobs move from Approval Pending to Processing within 5 seconds unless Next Cycle queueing is enforced And the requester and job owners receive outcome notifications with decision, rationale (if provided), and next steps When the approver selects Deny Then linked jobs remain queued or are canceled per policy and all stakeholders are notified And repeated clicks or duplicate submissions do not create duplicate state changes (idempotent)
Audit Logging and Traceability of Approval Lifecycle
Given audit logging is enabled by default When an approval is created, routed, escalated, approved, denied, auto-queued, unblocked, or breaches SLA Then an immutable log entry is stored with UTC timestamp, actor (user/service), channel (in-app/email/Slack/API), rationale (if any), prior state, new state, and scope identifiers And logs are viewable in the approval detail and exportable via CSV and API And every decision links to the jobs affected with before/after state snapshots
Cap Activity Logs & Reporting
"As an operations analyst, I want detailed logs and reports of cap events so that I can audit decisions, forecast spend, and optimize policies over time."
Description

Expose clear, immutable logs of usage accrual, threshold crossings, triggered actions, pauses, approvals, and exceptions. Provide filters by date range, scope, user, action type, and job ID, with CSV export and API access. Include dashboards for current burn rate, forecast to cap, and historical trendlines to help teams tune thresholds. Ensure logs are time-zone aware and reference the policy version in effect at event time.

Acceptance Criteria
Immutable Append-Only Log Ledger
Given a user (any role) or API client selects an existing log entry When they attempt to edit or delete the entry via UI or API Then the system rejects the request, records no change, and returns an explicit error (HTTP 403 for edits/deletes)
Given the logging subsystem writes events When a new event is appended Then it is assigned an immutable ID, write-once storage, created_at timestamp with timezone, and cryptographic checksum linking to the previous event in the same scope/day
Given the integrity verification job runs daily When it validates the last 24h of events Then the verification status is Pass and any tampering would surface a Fail with the first mismatched event ID exposed in an admin-only report
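The "checksum linking to the previous event" requirement describes a hash chain. A minimal in-memory sketch (SHA-256 and the JSON canonicalization are assumptions; a real ledger would persist to write-once storage):

```python
import hashlib
import json

def append_event(ledger: list, event: dict) -> dict:
    """Append an entry whose checksum chains to the previous entry's checksum,
    so any later mutation breaks verification from that point on."""
    prev = ledger[-1]["checksum"] if ledger else "0" * 64
    payload = json.dumps(event, sort_keys=True)          # deterministic encoding
    checksum = hashlib.sha256((prev + payload).encode()).hexdigest()
    entry = {**event, "checksum": checksum, "prev_checksum": prev}
    ledger.append(entry)
    return entry

def verify(ledger: list) -> bool:
    """Recompute the chain; returns False at the first mismatched entry."""
    prev = "0" * 64
    for entry in ledger:
        body = {k: v for k, v in entry.items() if k not in ("checksum", "prev_checksum")}
        expected = hashlib.sha256((prev + json.dumps(body, sort_keys=True)).encode()).hexdigest()
        if entry["checksum"] != expected:
            return False
        prev = expected
    return True
```

Tampering with any stored field changes its recomputed checksum, so the daily integrity job's Pass/Fail check falls out of `verify` directly.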
Threshold Crossings & Actions Traceability
Given a cap policy with soft and hard thresholds is active When usage accrues and crosses a threshold Then a log entry is created within 2 seconds containing: scope (workspace/brand/project/client), job_id (if applicable), cap_id, cap_type (soft|hard), threshold_percent, action_taken (auto-queue|pause|request-approval|none), actor (system|user_id), policy_version_id, and resulting state
Given a triggered action requires approval When an approver approves or rejects Then an approval log entry is created referencing the triggering event_id, approver user_id, decision, timestamp with timezone, and any comment
Given an exception is granted for a launch window When the exception starts and ends Then start and end events are logged with exception_id, scope, policy_version_id, actor, and the affected thresholds
Multifaceted Filtering & Pagination
Given the user applies filters: date range, scope, user, action_type, and job_id (any combination) When they submit the query Then only matching events are returned; date range is interpreted in the selected timezone (DST-aware); start is inclusive, end is exclusive; and results are sorted by timestamp desc by default
Given a large result set (>50,000 events) When the user pages through results Then cursor-based pagination returns stable, non-duplicated items across pages; page size defaults to 100 and supports up to 1,000
Given typical load and an indexed dataset of up to 5 million events When a filtered query returns <=50,000 matches Then the first page responds in ≤2 seconds and subsequent pages in ≤1.5 seconds
CSV Export of Filtered Logs
Given the user has an applied filter set When they export to CSV Then the CSV contains only the filtered events, includes a header row, and columns: event_id, timestamp (ISO 8601 with timezone and tzid), scope_path, user_id/actor, action_type, job_id, cap_id, cap_type, threshold_percent, action_taken, policy_version_id, exception_id (nullable), and checksum
Given up to 100,000 rows match the filter When export is requested Then the file is streamed to the browser and completes in ≤60 seconds; numeric fields are unformatted; timestamps are ISO 8601; line endings are LF; and column order is consistent with documentation
Given a user without export permission attempts export When they click Export Then the system denies the action with an explanatory message and logs an access_denied event
Logs API Access & Parity
Given an API client with valid credentials When they call GET /v1/logs with filters (date_range, scope, user_id, action_type, job_id), sort, and cursor params Then the API returns 200 with a JSON array of events matching the UI semantics, plus pagination cursors; each event includes policy_version_id, timestamp (UTC and tzid), checksum, and all fields available in CSV
Given invalid parameters When the client requests with malformed date range or unknown filter Then the API returns 422 with a machine-readable error code and message; no partial data is returned
Given rate limiting is configured When the client exceeds 60 requests per minute per token Then the API returns 429 with Retry-After and no degradation to other clients
Spend Dashboards: Burn Rate, Forecast to Cap, Trendlines
Given the user selects a scope and timezone When the dashboard loads Then it displays: current burn rate (units/min and currency/min) computed over the last 60 minutes, forecast-to-cap timestamp using a trailing 7-day weighted average, and 90-day historical trendlines of daily usage and cap thresholds—all aligned to the selected timezone
Given the user clicks a data point or time window on a chart When they drill down Then a pre-filtered log view opens for the corresponding time bucket and scope, with totals matching the chart within ±1% rounding tolerance
Given normal operation When new events occur Then dashboard metrics refresh at least every 5 minutes or on demand via Refresh, and values reconcile with the sum of underlying logs over the same window
Time-Zone Resets & Policy Version Attribution
Given a workspace timezone is set to America/Los_Angeles When a daily cap resets at local midnight Then a cap_reset event is logged at 00:00:00 local time with UTC offset and tzid, and subsequent accruals reference the new cycle_id
Given a daylight saving transition occurs When logs span the fall-back or spring-forward window Then no hour is lost or duplicated in aggregated views; events carry correct offsets; and filtering by local time returns all intended events
Given a policy change is published midday When events occur before and after the change Then each event’s policy_version_id matches the policy active at its timestamp, and historical logs continue to report under their original policy version

Top‑Up Rules

Automate credit top‑ups with guardrails. Define amounts, max frequency, funding source, and required approvers. Enable just‑in‑time micro top‑ups to keep batches flowing, add spend locks during off‑hours, and get instant alerts if payment fails—so work never stalls and budgets stay protected.

Requirements

Rule Builder (UI & API)
"As a finance admin, I want to define automated top‑up rules with caps, schedules, approvers, and funding sources so that credits replenish safely without exceeding budget policies."
Description

Provide a configurable rule builder to define automated credit top‑ups, including triggers (current balance threshold, projected batch usage, failed payment fallback), top‑up amounts (fixed, percentage of deficit, or tiered), min/max per top‑up, frequency caps (per hour/day/week), per-time-window rate limits, funding source selection (primary/backup with priority order), required approvers (users, roles, or groups), and activation schedules. Support draft/publish states, versioning, validation with in‑product previews/dry-runs, and scoping by workspace/brand. Expose full CRUD via secure APIs with server-side evaluation, idempotency keys, and RBAC permissions. Persist rule definitions with schema that supports currency, locale, and timezone. Integrate with PixelLift usage metrics to evaluate triggers, and ensure backward compatibility for accounts without rules.

Acceptance Criteria
Publish Rule: Balance Threshold Trigger with Fixed Amount
Given a published top-up rule with trigger "balance < 500 credits", fixed top-up amount 1000 credits, per-top-up min 100 and max 2000, active schedule 08:00–20:00 in the account timezone, and a valid primary funding source And server-side rule evaluation is enabled When the account balance drops to 480 credits at 10:00 Then exactly one top-up of 1000 credits is executed within the active window And the primary funding source is charged once and the balance increases to 1480 credits And an audit log entry is recorded with ruleId, version, trigger snapshot, evaluation timestamp, funding source, and a deduplication key And if the evaluator retries within 5 minutes, no duplicate charge occurs And no further top-up for this rule occurs until the balance again crosses below 500 after a prior top-up
Enforce Frequency Caps, Time-Window Limits, and Projected Usage Trigger
Given a published rule with a projected batch usage trigger that fires when predicted deficit ≥ 200 credits 15 minutes before a scheduled batch start And frequency caps set to max 2 executions per hour and 5 per day And a time-window rate limit of 1 execution per 10 minutes When three distinct trigger conditions occur within 30 minutes due to overlapping scheduled batches Then at most 2 top-ups execute within the hour and no more than 1 within any 10-minute window And the third trigger is rate-limited with reason "rate_limited" and a nextEligibleAt timestamp And daily/hourly counters reset at local midnight and on-the-hour based on the rule timezone And evaluation occurs server-side even if no users are logged in
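The cap logic above combines calendar-boundary counters (hourly count resets on the hour, daily count at local midnight) with a minimum spacing between executions. A minimal sketch; the function shape and in-memory history list are assumptions:

```python
from datetime import datetime, timedelta

def allowed(history: list[datetime], now: datetime,
            per_hour: int = 2, per_day: int = 5,
            min_gap: timedelta = timedelta(minutes=10)) -> bool:
    """history holds prior execution times in the rule's local timezone.
    Enforces: spacing >= min_gap, at most per_hour in the current clock hour,
    and at most per_day in the current local day."""
    hour_start = now.replace(minute=0, second=0, microsecond=0)
    day_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if history and now - history[-1] < min_gap:
        return False                      # 1-per-10-minutes window
    if sum(t >= hour_start for t in history) >= per_hour:
        return False                      # hourly cap
    if sum(t >= day_start for t in history) >= per_day:
        return False                      # daily cap
    return True
```

Replaying the scenario above (triggers at 12:00, 12:05, 12:12, 12:25): the first executes, the second is spaced out by the 10-minute rule, the third executes, and the fourth hits the 2-per-hour cap, matching "at most 2 within the hour and no more than 1 within any 10-minute window".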
Approvals Workflow for Top-Ups (Users/Roles/Groups)
Given a published rule requiring approvals from 1 user (alice@customer.com) and 1 role (Finance) When the rule trigger condition is met Then the top-up request enters Pending Approval and no charge is attempted And approvers receive in-app and email notifications with approve/reject actions And the top-up executes only after both required approvals are recorded within the approval window And a rejection by any required approver cancels the attempt with status Rejected and records reason and actor And the audit trail shows requestedAt, approvedAt/deniedAt, approvers, and final outcome And API GET /topups?status=pending returns the pending request with approver requirements
Funding Source Priority with Backup on Payment Failure
Given a published rule configured with funding priority [Primary Card A, Backup Card B] and payment failure fallback enabled When a top-up is attempted and the charge on Card A is declined with a retriable error Then the system attempts Card B once within the same evaluation cycle And if Card B succeeds, the top-up is marked Succeeded with fundingSource=Card B and no additional attempts are made And if both sources fail, the top-up is marked Failed, no balance change occurs, and an immediate failure alert event is emitted And all attempts are deduplicated so that evaluator retries do not produce extra charges And failures record processor error codes and lastAttemptAt in the audit log
Draft, Dry-Run Validation, and Versioned Publish
Given a user creates a draft rule with tiered top-up amounts (deficit ≤ 500 -> 500; 501–1500 -> 1000; >1500 -> 1500), primary funding source set, and schedule set When the user runs a dry-run preview over the last 7 days of usage metrics Then the UI shows predicted trigger times, number of top-ups, and total projected spend, and no real charges are made And if required fields are missing or limits conflict (e.g., min > max), validation errors are displayed inline and the rule cannot be published When the draft passes validation and the user publishes Then a new immutable version is created (v2 if v1 existed), v2 becomes Active, and the previous version is retained read-only for audit And the audit log records author, version, publishAt, and diff summary
Secure API CRUD with RBAC, Idempotency, and Concurrency Control
Given API endpoints exist at POST/GET/PATCH/DELETE /v1/topup-rules scoped by workspace and brand And RBAC permits only Workspace Admins to create/update/delete rules; Viewers can read When an Admin POSTs a new rule with Idempotency-Key: abc123 and payload P Then the API responds 201 Created with ruleId R, version 1, ETag E1, and scope metadata And a subsequent identical POST with the same Idempotency-Key within 24h returns 200 OK with the original resource (ruleId R) and no duplicate rule is created When the Admin PATCHes rule R with If-Match: E1 Then the API applies the update, returns 200 OK with a new ETag E2, and increments the draft version And a PATCH with a stale ETag returns 412 Precondition Failed And unauthorized users receive 403 for write operations and only scoped 200 responses for reads And DELETE by an Admin returns 204 No Content and the rule is marked inactive while retaining audit history
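The Idempotency-Key and If-Match semantics above can be sketched with an in-memory store; the class name, UUID ETags, and numeric status-code returns are illustrative stand-ins for real HTTP handlers:

```python
import uuid

class RuleStore:
    """Sketch of POST-with-Idempotency-Key and PATCH-with-If-Match semantics."""
    def __init__(self):
        self.rules: dict = {}   # rule_id -> rule
        self.idem: dict = {}    # idempotency key -> rule_id

    def create(self, idempotency_key: str, payload: dict):
        if idempotency_key in self.idem:          # replay: 200 + original resource
            return 200, self.rules[self.idem[idempotency_key]]
        rule_id, etag = str(uuid.uuid4()), str(uuid.uuid4())
        rule = {"id": rule_id, "etag": etag, "version": 1, **payload}
        self.rules[rule_id] = rule
        self.idem[idempotency_key] = rule_id
        return 201, rule                          # first write: 201 Created

    def patch(self, rule_id: str, if_match: str, changes: dict):
        rule = self.rules[rule_id]
        if rule["etag"] != if_match:
            return 412, None                      # stale ETag: Precondition Failed
        rule.update(changes)
        rule["etag"] = str(uuid.uuid4())          # new ETag on every successful write
        return 200, rule
```

The key invariants are visible in the sketch: a replayed key returns the original rule without creating a duplicate, and a stale ETag is rejected without modifying stored state.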
Workspace/Brand Scope with Currency and Timezone Localization
Given an account with Workspaces A and B and Brands X, Y, Z And a rule scoped to Workspace A and Brand Y with currency EUR and timezone Europe/Berlin When usage metrics in Workspace A / Brand Y satisfy the trigger at 17:55 Europe/Berlin and the schedule is 08:00–18:00 Then the rule evaluates in Europe/Berlin time, executes within the window, and applies EUR currency formatting in UI/API responses And triggers from other workspaces/brands do not evaluate this rule And frequency resets occur at local midnight in Europe/Berlin And GET /v1/topup-rules returns an empty list (200) for accounts with no rules and no auto top-ups occur, preserving backward compatibility
Just‑in‑Time Micro Top‑Ups
"As an operations manager, I want credits to top up just in time for each batch so that processing never pauses and cash isn’t over‑committed."
Description

Enable micro top‑ups that execute at job start or mid-batch when projected credits are insufficient, calculating the minimum required amount plus a configurable buffer to avoid stalls while minimizing tied-up funds. Incorporate a projection model that estimates credits needed per batch from queue size and historical cost per image/preset. Support hold/release flows (authorize then capture), combine multiple micro top‑ups within a frequency cap, and reconcile unused buffer at batch completion. Ensure concurrency safety for simultaneous batches, idempotent execution per batch, and graceful degradation to smaller increments on partial payment success.

Acceptance Criteria
JIT Top-Up Trigger and Buffer Calculation
- Given a batch is starting and projected_required_credits > current_credits, When the job starts, Then a micro top-up is initiated before the first image is processed and the amount equals (projected_required_credits - current_credits) + configured_buffer, rounded to the smallest billable unit.
- Given a batch is mid-processing and projected_remaining_required > remaining_credits, When the shortfall is detected, Then a micro top-up is initiated without pausing the batch and the amount equals (projected_remaining_required - remaining_credits) + configured_buffer.
- Given current_credits >= projected_required_credits at job start, When evaluation occurs, Then no top-up is created.
Projection Model Uses Queue Size and Historical Cost
- Given queue size N and selected presets, When the projection runs, Then it computes projected_required_credits = N × historical_avg_cost_per_image_for_selected_presets.
- Given insufficient historical data for a preset, When the projection runs, Then it falls back to a system default cost per image for that preset category.
- Given the projection completes, When inspected, Then the logged inputs (queue size, presets, average cost used) and the projected amount are available for audit and used by the top-up calculator.
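The projection and top-up formulas above can be sketched together. The preset names, default-cost table, and the round-up choice are illustrative assumptions (the spec says only "rounded to the smallest billable unit"):

```python
import math

# Fallback per-image costs per preset category (illustrative values only)
DEFAULT_COST = {"background_removal": 2.0, "retouch": 3.0}

def projected_credits(queue_size: int, preset: str, history: dict) -> float:
    """Queue size × historical average cost per image for the preset, falling
    back to a category default when no history exists."""
    avg = history.get(preset) or DEFAULT_COST.get(preset, 1.0)
    return queue_size * avg

def topup_amount(projected: float, current: float, buffer: float,
                 unit: float = 1.0) -> float:
    """Shortfall plus configured buffer, rounded up to the smallest billable
    unit; zero when current credits already cover the projection."""
    shortfall = projected - current
    if shortfall <= 0:
        return 0.0
    return math.ceil((shortfall + buffer) / unit) * unit
```

For a 300-image queue at a historical 1.5 credits/image, the projection is 450 credits; with 300 on hand and a 25-credit buffer, the micro top-up is 175 credits, and no top-up fires once the balance covers the projection.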
Authorize-Then-Capture Hold/Release Flow
- Given a micro top-up is initiated and the funding source supports authorizations, When processed, Then an authorization is created for the calculated amount and capture is deferred until credits are consumed.
- Given credits are consumed during the batch, When capture occurs, Then only the consumed amount up to the authorization limit is captured.
- Given the batch completes with unused authorized amount, When completion is recorded, Then the unused authorization is released within 5 minutes and no funds are captured for that portion.
- Given the funding source does not support authorizations, When a top-up is needed, Then the system captures in the smallest supported increment and flags any extra as buffer to be reconciled at batch completion.
Frequency Cap and Combining Micro Top-Ups
- Given a frequency cap is configured, When multiple shortfalls occur within the cap window, Then the system combines them into a single authorization/capture up to the required amount while respecting the cap.
- Given the frequency cap has been reached, When another shortfall is detected within the window, Then the system does not create a new top-up and instead increases an existing open authorization if supported; otherwise, it defers the request and emits a frequency_cap_reached event.
- Given combined micro top-ups are captured, When evaluated, Then the total captured amount equals the sum of shortages plus applicable buffer and only one transaction record is persisted for the window.
Concurrency Safety and Idempotency
- Given multiple workers detect the same shortfall for a batch, When they request a top-up using the same idempotency key, Then at most one top-up transaction is created and others receive the existing transaction result.
- Given a retry after a transient error, When the same idempotency key is used, Then the API returns the original transaction without creating an additional charge.
- Given multiple batches run concurrently, When each requires a top-up, Then transactions are isolated by batch_id and funding source and no cross-batch credit contamination occurs.
Graceful Degradation on Partial Payment Success with Instant Alerts
- Given a top-up for amount A is partially approved for A' < A, When capture is attempted, Then the system captures A' and immediately retries smaller increments down to the configured minimum until the shortfall is resolved or the frequency cap is reached.
- Given payment failure or inability to secure sufficient funds after retries, When detected, Then an instant alert is emitted via configured channels and the batch continues using available credits, pausing only if credits reach zero.
- Given partial approvals occurred, When reviewing processing, Then image processing throughput continues without unnecessary stalls attributable to top-up retries.
Reconciliation of Unused Buffer at Batch Completion
- Given buffer credits were reserved via authorization, When the batch completes, Then any unused portion is released (authorization voided) and no charge is captured for it.
- Given buffer funds were captured due to funding source limitations, When the batch completes, Then the system calculates the unused buffer, issues a refund or credit reversal for that portion, and adjusts the account credit balance accordingly.
- Given reconciliation is performed, When audited, Then logs show the buffer computed, amounts released/refunded, and timestamps linked to the batch.
Approval Workflow & Escalation
"As a controller, I want high‑value or off‑policy top‑ups to require approval and auto‑escalate if delayed so that spend stays compliant without blocking operations."
Description

Implement configurable approval gates for top‑ups triggered by amount thresholds, off‑hours windows, or policy exceptions. Support single‑ and multi‑step approvals, approver groups with quorum or sequential rules, SLAs with auto‑escalation, and fallback actions (e.g., reduce amount or split into micro top‑ups) if approvals time out. Deliver approval actions via email, in‑app, and mobile push with one‑tap approve/deny and reason codes. Enforce RBAC, track decision timelines, and block or queue the top‑up until approval is resolved. Record all actions for auditability.

Acceptance Criteria
Threshold-Based Single Approval Gate
Given a top-up request amount exceeds the configured approval threshold and a single approver with role Finance_Approver is assigned When the request is submitted Then the system sets the top-up state to "Pending Approval", blocks fund capture, and prevents credit balance changes And approval action links are sent to the approver via email, in-app, and mobile push within 5 seconds And only users with Finance_Approver role can approve/deny; other users see read-only status And on Approve, the system captures funds and sets state to "Approved" within 5 seconds; on Deny, state is "Denied" and no funds are captured
Multi-Step Sequential Approval with SLA and Auto-Escalation
Given a two-step sequential approval chain (Step 1: Group A, Step 2: Group B) with per-step SLA of 15 minutes and an escalation policy to Level-2 approvers When Step 1 does not reach a decision before its SLA Then the system escalates to Level-2 approvers, notifies original approvers, and restarts the SLA timer for the escalated step And if any step is Denied, the overall request is "Denied" and subsequent steps are canceled And only after Step 1 is Approved does Step 2 activate with its SLA timer And if all required steps are Approved within their SLAs, the overall request is "Approved"
Approver Group Quorum Rule
Given an approver group of 5 members with a quorum rule of 3 approvals required When 3 distinct group members approve before any hard denial Then the step outcome is "Approved", remaining pending approvals are auto-canceled, and all members are notified And if any group member issues a hard Deny before quorum is reached, the step outcome is "Denied" And actions are idempotent; duplicate approvals from the same user do not increment the count
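The quorum mechanics above can be sketched in a few lines — `evaluate_quorum` is a hypothetical helper, not the product's API, and it assumes votes arrive as an ordered list of (user_id, decision) pairs:

```python
def evaluate_quorum(votes, quorum=3):
    """votes: ordered (user_id, decision) pairs, decision in {'approve', 'deny'}.
    A hard deny fails the step before quorum is reached; duplicate approvals
    from the same user are idempotent."""
    approvers = set()
    for user_id, decision in votes:
        if decision == "deny":
            return "Denied"          # hard deny short-circuits
        if decision == "approve":
            approvers.add(user_id)   # set membership makes duplicates a no-op
            if len(approvers) >= quorum:
                return "Approved"
    return "Pending"
```

A `set` of approver IDs makes duplicate approvals naturally idempotent, and a deny is only decisive if it lands before the quorum count is met, matching the rule above.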
Off-Hours Approval with Spend Lock and Queueing
Given off-hours window is configured (e.g., 19:00–07:00 local) and a top-up is initiated within this window for a batch job When the request is submitted Then the system enforces spend lock (no auto-capture), sets state to "Pending Approval", and queues the batch job with a queue TTL of 60 minutes And approvers are notified via all channels immediately; on approval within TTL, the batch auto-resumes; on denial, the batch is canceled with a clear error And if TTL expires without a decision, the configured fallback policy is invoked
Timeout Fallback: Reduce Amount or Split into Micro Top-Ups
Given a top-up request with fallback strategy set to "Reduce Amount to Threshold" or "Split into Micro Top-Ups of $50, max 5" When the final escalation SLA expires without approval Then for "Reduce Amount", the system creates a new top-up at the maximum allowed amount that requires no approval, links it to the original, and marks the original "Expired—Fallback Executed" And for "Split", the system creates successive micro top-ups at $50 each (up to 5), spaced 10 seconds apart, until the target amount is met or a failure occurs And the requester is notified of all fallback actions, each logged with correlation IDs; failures raise an alert and the request remains blocked
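The "Split into Micro Top-Ups" fallback reduces to simple arithmetic. A sketch under the $50-unit, max-5 parameters from the criterion above — amounts in cents, and `plan_micro_topups` is an illustrative name:

```python
def plan_micro_topups(target_cents, unit_cents=5000, max_units=5):
    """Split a target amount into fixed-size micro top-ups, capped at
    max_units; the last unit is clamped so the plan never exceeds the
    target. Returns the planned amounts in cents."""
    plan = []
    remaining = target_cents
    while remaining > 0 and len(plan) < max_units:
        amount = min(unit_cents, remaining)
        plan.append(amount)
        remaining -= amount
    return plan
```

Note the cap means a large target may be only partially covered (e.g. $300 yields five $50 top-ups totaling $250), which is why the criterion has the request remain blocked on failure rather than assuming the target was met.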
Multi-Channel One-Tap Approval with Reason Codes
Given an approver receives the approval request on email, in-app, and mobile push When the approver taps Approve or Deny on any channel Then the decision is applied once (idempotent), the other channels reflect the final state within 5 seconds, and the approver sees a confirmation And a reason code is mandatory on Deny and optional on Approve; free-text notes up to 500 chars may be added And decisions taken from unauthenticated links are rejected; authenticated SSO or magic-link with 10-minute expiry is required
Audit Trail and Decision Timeline Tracking
Given an approval request lifecycle When any event occurs (submit, notification, approve/deny, escalation, SLA start/stop, fallback, queue/pause/resume) Then the system records an immutable audit entry with timestamp (UTC), actor ID, actor role, channel, IP/device fingerprint, event type, reason code/notes, and correlation IDs And the audit log is filterable by request ID, date range, actor, outcome, and exportable as CSV and JSON And SLA metrics (time to first action, step durations, total pending time) are computed and visible on the request detail view
Spend Locks & Schedules
"As a budget owner, I want to restrict when and how much we can top up during off‑hours so that we avoid unintended spend while maintaining controlled exceptions."
Description

Allow admins to define lock windows that prevent or restrict top‑ups during specified times (e.g., weekends, holidays, or after hours) and to set active spend schedules with per‑window caps. Support organization and workspace timezones, calendar-based exceptions, and temporary overrides with documented approval. When a lock is active, queue non‑urgent top‑ups and notify stakeholders; allow emergency overrides with elevated approval and smaller capped amounts. Provide clear UI indicators and API fields reflecting current lock state and next eligible window.
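Evaluating whether a lock is active is mostly careful timezone handling: convert the instant to the org's local zone, then test against the window. A minimal sketch of one hard-coded window shape (Friday 18:00 → Monday 08:00, the weekend example used in the criteria below); production code would drive the window from configuration rather than hard-coding it:

```python
from datetime import datetime, time
from zoneinfo import ZoneInfo

def weekend_lock_active(now_utc: datetime, tz: str = "America/Los_Angeles") -> bool:
    """True if now_utc falls inside a Friday 18:00 -> Monday 08:00 lock,
    evaluated in the org's local timezone."""
    local = now_utc.astimezone(ZoneInfo(tz))
    wd = local.weekday()                 # Monday == 0 ... Sunday == 6
    if wd == 4:                          # Friday: locked from 18:00
        return local.time() >= time(18, 0)
    if wd in (5, 6):                     # Saturday/Sunday: locked all day
        return True
    if wd == 0:                          # Monday: locked until 08:00
        return local.time() < time(8, 0)
    return False
```

Doing the comparison in the evaluated timezone (rather than UTC) is what makes "Saturday 10:15 local" block correctly regardless of DST, and it is why the audit entry records `evaluated_timezone` alongside `next_eligible_at_utc`.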

Acceptance Criteria
Weekend Lock Window (Org Timezone) Blocks Auto Top-Ups
Given the organization timezone is America/Los_Angeles And a lock window named "Weekend" is configured from Friday 18:00 to Monday 08:00 local time And an automatic top-up of $120 would trigger on Saturday at 10:15 local time When the trigger occurs Then no funds are drawn and no payment attempt is made And the top-up is placed in "Queued due to lock" state within 5 seconds And stakeholders subscribed to top-up events are notified within 60 seconds with reason "Lock active: Weekend" And the queued item displays next eligible window start as Monday 08:00 America/Los_Angeles and its UTC equivalent And an audit log entry is created with lock_id, lock_name, evaluated_timezone, and next_eligible_at_utc
In-Cap Top-Up During Active Spend Schedule
Given an active spend schedule window from 09:00 to 17:00 America/New_York with a per-window cap of $1,000 And the window remaining cap is $300 at 14:00 local When a top-up of $200 is requested at 14:05 local Then the top-up is approved and processed immediately And the remaining cap decreases to $100 And the window usage is reflected in UI and API within 5 seconds
Over-Cap Top-Up Queued Until Next Window
Given an active spend schedule window from 09:00 to 17:00 America/New_York with a per-window cap of $1,000 And the window remaining cap is $150 at 16:50 local When a top-up of $250 is requested at 16:51 local Then no funds are drawn And the request is queued with status "Queued: Cap exceeded" And no partial credit is issued And the queue item shows required_amount $250, remaining_in_window $150, deficit $100 And the next eligible window start timestamp is displayed in local time and UTC And stakeholders receive a notification within 60 seconds stating "Cap exceeded; queued until next window"
Calendar Exception Allows Top-Ups on Working Holiday
Given a lock window "Weekend" from Friday 18:00 to Monday 08:00 America/Los_Angeles is active And a calendar exception "Working Saturday" for 2025-10-04 from 09:00 to 17:00 local is configured to disable the Weekend lock When a top-up of $100 triggers on 2025-10-04 at 11:00 local Then the lock is not applied And the top-up processes successfully And audit logs record exception_id, applied_exception_name, and evaluated_timezone
Emergency Override Processes Micro Top-Up with Elevated Approval
Given a lock is active and emergency_override_max_amount is $50 And the organization's approval policy for emergency overrides requires 2 approvers When a user with permission to request emergency overrides submits a $50 top-up with a justification And two distinct approvers approve within 10 minutes Then the $50 top-up is processed immediately upon second approval And audit records capture requester_id, approver_ids, justification, request_time, approval_times, processed_time, and override_expiration And any amount above $50 in the same request is rejected with message "Exceeds emergency override cap"
Temporary Override Records Approval and Auto-Reverts After Window
Given a temporary override is granted to disable a lock from 19:00 to 21:00 America/New_York on 2025-10-10 with ticket_id "INC-1234" When the clock reaches 21:00 local Then the system automatically re-enables the original lock configuration without manual action And all override metadata (approver(s), reason, start_at, end_at, ticket_id) is immutable in the audit trail And any top-up attempted at 21:00:01 local is evaluated against the restored lock
UI Badge and API Expose Current Lock State and Next Eligible Window
Rule: When any applicable lock is active for the current entity, the UI displays a "Top-ups paused" badge in Billing and Batch pages with a tooltip showing lock_name, lock_reason, and next_eligible_at_local.
Rule: The public API exposes a resource to fetch current top-up lock state; the resource returns fields: lock_state ("active"|"inactive"), lock_name, lock_reason, evaluated_timezone, next_eligible_at_utc (ISO 8601), next_eligible_at_local, active_window_id (nullable), window_cap_cents, window_remaining_cents, queue_count.
Rule: UI and API values remain consistent within 5 seconds of any state change.
Rule: On lock_state change, a "lock_state.changed" webhook event is emitted within 10 seconds with lock_id, previous_state, new_state, and next_eligible_at_utc.
Payment Resilience & Failover
"As a billing admin, I want top‑ups to succeed reliably with automatic retries and backup funding so that batches don’t fail due to payment hiccups."
Description

Integrate with payment providers to execute top‑ups with robust retry and failover: tokenize funding sources, pre‑validate availability, handle 3DS/SCA when required, classify errors (transient vs. hard), and retry with exponential backoff. Automatically fail over to backup funding sources based on rule priority and merchant preferences. Ensure idempotent charges, duplicate protection, and safe replays on webhook delays. Surface real‑time status to the rule engine to decide whether to reattempt, downshift amount, or trigger approvals. Adhere to PCI boundaries, log masked artifacts, and support multi‑currency settlement and FX rounding rules.

Acceptance Criteria
Tokenization & PCI Boundary Compliance
Given a merchant adds a new funding card via a hosted PCI-compliant form When tokenization completes Then only a provider token is stored and no PAN, CVV, or full expiry is persisted in our systems Given application and audit logs are collected during tokenization When logs are reviewed Then all card numbers are masked to last 4 and CVV is never logged Given a non-PCI service attempts to receive PAN data When a request is made Then it is blocked and a security event is recorded Given a stored payment token is used for a top-up When the charge is executed Then the provider accepts the token and the charge proceeds without our servers handling PAN data
Pre-Validation of Funding Source Availability
Given a top-up rule is triggered for amount A When pre-validation runs Then a $0 or minimal-amount authorization (per provider capability) is performed and a pass/fail result is returned within 2 seconds Given pre-validation returns insufficient funds When the engine evaluates the result Then the top-up attempt is not submitted and the status is set to hard_fail_insufficient_funds Given pre-validation detects an expired card or revoked mandate When evaluation occurs Then the status is set to hard_fail_source_invalid and the merchant is notified Given multiple funding sources exist and some are disabled When pre-validation runs Then only eligible (enabled, not locked) sources are considered
Retry with Exponential Backoff then Failover by Priority
Given a top-up attempt fails with a transient error (e.g., HTTP 5xx, timeout, rate_limit) When processing the attempt Then the system retries up to 4 times with exponential backoff at approximately 1s, 2s, 4s, 8s plus ±20% jitter Given the maximum retry count is reached or a hard error is received (e.g., do_not_honor, insufficient_funds, invalid_card) When evaluating next steps Then the system initiates failover to the next eligible funding source by configured priority within 1 second Given a funding source is disabled, locked by schedule, or exceeds max frequency When failover selection runs Then that source is skipped and the next eligible source is chosen Given failover succeeds on an alternate source When recording results Then no further retries are attempted on prior sources and the audit log lists the ordered sequence of sources tried
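The retry-then-failover policy in this criterion can be sketched as two small helpers — illustrative names, with the jitter source injectable so the schedule is testable:

```python
import random

def backoff_delays(base=1.0, retries=4, jitter=0.2, rng=random.random):
    """Transient-error retry schedule from the criterion: ~1s, 2s, 4s, 8s,
    each spread across +/-20% jitter. rng() must return a float in [0, 1)."""
    delays = []
    for attempt in range(retries):
        nominal = base * 2 ** attempt
        # map rng() in [0,1) onto [nominal*(1-jitter), nominal*(1+jitter))
        delays.append(nominal * (1 - jitter) + nominal * 2 * jitter * rng())
    return delays

def next_funding_source(sources):
    """After retries are exhausted or a hard error lands, pick the next
    eligible backup source by configured priority (lower = tried first).
    Disabled/locked sources carry eligible=False and are skipped."""
    eligible = [s for s in sources if s.get("eligible")]
    return min(eligible, key=lambda s: s["priority"]) if eligible else None
```

Jitter matters here because many batch jobs can fail against the same provider at once; spreading retries avoids a synchronized thundering herd on recovery.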
3DS/SCA Challenge Handling and Continuation
Given a provider response indicates SCA is required When initiating the top-up Then a challenge link or SDK flow is generated and the attempt status becomes awaiting_sca Given the user completes the SCA challenge within 10 minutes When the provider posts completion Then the original authorization is continued using the same idempotency key and the top-up completes Given the SCA challenge fails or times out When handling the outcome Then the attempt is marked sca_failed and no automatic retry occurs unless merchant setting retry_after_sca_fail=true Given frictionless SCA is indicated by the provider When executing the charge Then the top-up completes without user challenge and is marked sca_frictionless
Idempotent Charging with Safe Replay on Webhook Delays
Given multiple identical requests use the same idempotency key within a 24-hour window When sent to the provider Then only one authorization/charge is created and subsequent requests return the original result Given provider webhooks are delayed, duplicated, or arrive out of order When reconciling events Then processing is idempotent and the ledger reflects a single final state (success or failure) with no double capture Given a network timeout occurs after the provider captured the charge When the system retries using the same idempotency key Then it detects the existing capture and does not create a duplicate charge or ledger entry
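Idempotent charging reduces to "execute once per key, replay the stored result". A minimal in-memory sketch (a real system persists results with a TTL matching the 24-hour window above, and uses a lock or unique-key insert so concurrent requests can't both reach the provider):

```python
class IdempotencyCache:
    """In-memory sketch of idempotent charging: the first request under a
    key executes the charge; replays return the stored result instead of
    charging again."""

    def __init__(self):
        self._results = {}

    def charge(self, key, execute):
        if key in self._results:
            return self._results[key]   # safe replay: no duplicate capture
        result = execute()              # provider call happens exactly once
        self._results[key] = result
        return result
```

This is also what makes the timeout-after-capture case safe: the retry presents the same key, the provider (or cache) recognizes the existing capture, and no second charge or ledger entry is created.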
Real-Time Status Feedback to Rule Engine (Reattempt/Downshift/Approval)
Given any payment attempt changes state (e.g., pending, retrying, awaiting_sca, failover_in_progress, success, hard_fail) When the change occurs Then an event is published to the rule engine within 300 ms including attempt_id, status, error_class, amount, currency, and next_action Given the requested amount cannot be fulfilled and downshift_enabled=true with tiers defined When the initial attempt fails transiently Then the engine downshifts to the next lower allowed tier and reattempts once per rule constraints Given the rule requires approval on failover When a transition to failover_in_progress occurs Then execution pauses and approval requests are sent to the configured approvers, and processing resumes only after approval is granted
Multi-Currency Settlement and FX Rounding Accuracy
Given the wallet currency differs from the funding source currency When a top-up is executed Then the FX rate from the configured provider at authorization time is applied and rounded per the merchant rule (half-up to the minor unit) with variance <= 1 minor unit Given a zero-decimal currency such as JPY is used When recording amounts Then no fractional minor units are stored and display values show zero decimals Given the settled amount differs from the authorized amount within provider tolerance When settlement is posted Then the ledger is updated to the settled amount and the audit record includes FX rate, source, and timestamps for auth and capture
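The half-up minor-unit rounding rule can be pinned down with Python's `decimal` module — a sketch assuming a small hard-coded ISO 4217 minor-unit table; `settle` is an illustrative name, not a real API:

```python
from decimal import Decimal, ROUND_HALF_UP

# Minor-unit exponents per ISO 4217 (illustrative subset; JPY is zero-decimal).
MINOR_UNITS = {"USD": 2, "EUR": 2, "JPY": 0}

def settle(amount, fx_rate, to_ccy):
    """Apply the FX rate and round half-up to the target currency's minor
    unit, per the merchant rule above. Amount and rate are passed as
    strings so no binary-float error leaks into the ledger."""
    quantum = Decimal(1).scaleb(-MINOR_UNITS[to_ccy])  # 0.01 for USD, 1 for JPY
    return (Decimal(amount) * Decimal(fx_rate)).quantize(quantum, rounding=ROUND_HALF_UP)
```

Quantizing to a per-currency exponent is what keeps JPY amounts free of fractional minor units while still rounding USD/EUR to cents, and `ROUND_HALF_UP` matches the stated merchant rule rather than Python's default banker's rounding.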
Notifications & Audit Trail
"As a team lead, I want real‑time alerts and a searchable audit history so that I can act quickly and prove compliance during reviews."
Description

Deliver instant notifications for key events—payment failure, approval required, cap reached, rule conflict, lock active, and successful top‑ups—via email, Slack, webhooks, and in‑app banners. Allow per‑user and per‑workspace preferences with quiet hours and rate limiting. Produce structured webhook events for external systems. Maintain an immutable, exportable audit trail capturing who configured rules, what changed, when approvals occurred, payment attempts, provider responses (redacted), and outcomes; support search and filters by time, rule, batch, and funding source. Provide a dashboard summarizing top‑up history, savings from micro top‑ups, failure rates, and pending approvals.

Acceptance Criteria
Instant Key Event Notifications Across Channels
Given the workspace has at least one notification channel enabled and a top-up event of type {payment_failed|approval_required|cap_reached|rule_conflict|lock_active|topup_succeeded} occurs When the event is recorded Then create an in-app banner within 5 seconds with event_type, severity, rule_id, batch_id, funding_source_masked, correlation_id, occurred_at (ISO 8601), and a link to the audit entry And send email and Slack notifications within 60 seconds to subscribed recipients per user/workspace preferences unless within their quiet hours And deliver a structured webhook event {event_type}.v1 within 10 seconds to all active endpoints (quiet hours do not apply) And enforce rate limiting: no more than 1 email and 1 Slack per unique (workspace_id, event_type, rule_id) within any rolling 5-minute window; in-app banners show the latest event; webhooks are not rate-limited And all notifications share the same correlation_id and include retry_count when applicable And email/Slack notifications suppressed during quiet hours are queued and included in a digest sent within 15 minutes after quiet hours end
Approval Required Notifications and Actions
Given a top-up is blocked pending approval and approver rules (quorum) are configured When an approval request is created Then notify all required approvers via in-app, email, and Slack within 60 seconds, including approve/decline deep links and a 6-character verification code And only the configured approver quorum (e.g., any 1 of N or all) can approve; when quorum is reached, the top-up proceeds within 5 seconds and pending notifications are resolved And if an approver responds, then the decision (approve/decline), actor_id, optional comment, and timestamp are appended to the audit trail; the approver cannot change their decision afterward And if the request times out (default 30 minutes) without a decision, mark outcome=expired, notify subscribers, and append an audit entry And quiet hours and rate limiting preferences are honored for email/Slack; webhooks approval.requested and approval.resolved always fire immediately
Structured Webhook Events and Delivery Guarantees
Given a workspace has at least one webhook endpoint configured with a signing secret When an event occurs in {payment_failed, topup_succeeded, approval.requested, approval.resolved, cap_reached, rule_conflict, lock_active} Then send a JSON payload within 10 seconds containing: event_id (UUID v4), event_type, version=v1, occurred_at (ISO 8601), workspace_id, rule_id, batch_id (nullable), topup_id (nullable), amount_minor, currency (ISO 4217), status, reason_code (nullable), correlation_id, retry_count And include headers: X-PixelLift-Signature (HMAC-SHA256 of raw body using signing secret, hex), X-PixelLift-Event-Id, X-PixelLift-Timestamp And implement at-least-once delivery: retry on network errors and non-2xx/429 with exponential backoff (start 5s, max 10m) for up to 24h; stop on 2xx or 410 Gone And guarantee idempotency: duplicate deliveries reuse the same event_id and payload And provide a replay API to redeliver by event_id for 7 days; replays include event.delivery=replay in the payload
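The signature header described above is plain HMAC-SHA256 over the raw request body. A sketch of both sides (header semantics are from the criterion; function names are illustrative):

```python
import hashlib
import hmac

def sign_payload(secret: bytes, raw_body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw request body under the endpoint's signing
    secret -- the X-PixelLift-Signature value described above."""
    return hmac.new(secret, raw_body, hashlib.sha256).hexdigest()

def verify_signature(secret: bytes, raw_body: bytes, header_value: str) -> bool:
    """Receiver-side check; hmac.compare_digest avoids timing side channels."""
    return hmac.compare_digest(sign_payload(secret, raw_body), header_value)
```

Receivers must verify against the raw bytes before any JSON parsing or re-serialization; even a single added whitespace byte changes the digest, which is exactly the tamper-evidence the signature is for.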
Immutable Audit Trail with Search, Filters, and Export
Given any configuration change, approval action, payment attempt, provider response, or notification dispatch When the event is committed Then append an audit entry with: audit_id, actor_id (system or user), action_type, entity_type, entity_id, before_after_diff (sensitive fields redacted/masked), provider_response_redacted (if applicable), correlation_id, created_at (ISO 8601) And audit entries are append-only: updates/deletes are disallowed; attempted mutations are rejected and logged as separate audit events And search supports filters by time range, rule_id, batch_id, funding_source_id, action_type, actor_id; first result returns in <2 seconds for up to 50k records And results are sortable by created_at asc/desc and paginated with stable cursors And users can export filtered entries to CSV or JSON; exports complete within 60 seconds for up to 100k rows and match on-screen filters exactly
Dashboard Summary of Top-Ups, Savings, Failures, and Approvals
Given the workspace has top-up activity When a user opens the Top‑Up Rules dashboard Then show metrics for the selected date range (default last 30 days): total top-ups count, total amount topped up, failure rate (% failures over attempts), pending approvals count, estimated savings from micro top-ups = (baseline_single_topup_total − actual_micro_topup_total) And charts display daily time series for top-ups, failures, and savings; hover values match table aggregates within ±0.1% And filters by funding source, rule, and batch update all widgets consistently within 2 seconds And all counts and amounts reconcile with the audit trail when aggregating the same filters and date range And users can export the dashboard summary to CSV; exported totals match on-screen values
Notification Preferences, Quiet Hours, and Rate Limiting
Given workspace-level defaults and per-user overrides for event types and channels When a user updates their notification preferences or quiet hours Then the new settings take effect for subsequent events within 60 seconds and are written to the audit trail And per-user preferences override workspace defaults for email/Slack; in-app banners are always enabled; webhooks are managed only at the workspace level And quiet hours are configurable per user and per workspace (local time); during quiet hours, email/Slack for non-critical events are suppressed and included in a digest sent within 15 minutes after quiet hours end; payment_failed is only suppressed if the user disables critical overrides And enforce rate limiting per workspace, per channel, per event type: maximum 5 notifications per recipient per 5-minute window; excess notifications are collapsed with a count indicator And a test notification function sends a per-channel test and shows delivery status within 30 seconds

Usage Pools

Create shared credit pools with sub‑allocations per brand, client, or campaign. Reserve credits for scheduled drops, allow carryover or expirations, and transfer balances between pools with audit trails. Agencies and multi‑brand teams get clean cost attribution and fewer end‑of‑month scrambles.

Requirements

Pool Creation & Sub-Allocation Management
"As an agency admin, I want to create and manage credit pools with sub-allocations per brand or campaign so that my teams can consume credits from the right budget without manual tracking."
Description

Enable admins to create named credit pools with metadata (brand, client, campaign), define total pool budgets, and configure nested sub-allocations with hard/soft limits. Support pool ownership, visibility scopes, and role-based permissions for who can consume or manage each pool. Include overage rules (block, warn, allow with charge), burn order (pool vs. sub-allocations), and mapping tags to align with PixelLift’s batch upload workflows. Provide CRUD APIs and UI, validation for naming uniqueness, and safeguards to prevent double counting. This establishes the foundation for accurate credit governance and clean attribution across agencies and multi-brand teams.

Acceptance Criteria
Create and Update Credit Pool via UI/API
Given an org admin with Billing:Manage permission, When they create a pool with a unique name within the org and valid metadata (brand, client, campaign) and total_budget is an integer >= 1, Then the API returns 201 with pool_id, name, metadata, total_budget, remaining = total_budget, owner, visibility_scope, and the pool appears in the UI within 5 seconds. Given a pool exists, When the admin updates name (to a unique value), metadata, owner, or visibility_scope via PATCH, Then the API returns 200 and the UI reflects the changes within 5 seconds. Given a pool exists, When a user attempts to create another pool with the same name (case-insensitive) in the same org, Then the API returns 409 POOL_NAME_CONFLICT and no pool is created. Given invalid inputs (missing name, non-integer or negative budget, metadata too long), When create is called, Then the API returns 422 with field-level errors and no pool is created. Given a pool exists, When the admin deletes it and remaining = 0 and there are no active sub-allocations, Then the API returns 204 and the pool is removed; When remaining > 0 or active sub-allocations exist, Then the API returns 409 POOL_IN_USE and the pool is not deleted.
Configure Sub-Allocations with Hard/Soft Limits
Given a pool with remaining >= 100, When the admin creates a sub-allocation with limit_type = hard and limit = 60, Then the sub-allocation is created and consumption routed to it is blocked once usage >= 60 with 403 INSUFFICIENT_SUB_ALLOCATION and no further deduction from that sub-allocation. Given a sub-allocation with limit_type = soft and limit = 40, When a consumer job attempts to deduct beyond 40 and the parent pool has sufficient remaining (or policy allows overage), Then the deduction succeeds, usage may exceed 40, and a SOFT_LIMIT_BREACH warning event is recorded for the sub-allocation. Given multiple sub-allocations exist, When the admin attempts to set or increase a hard limit such that sum(hard_limits) > pool_total_budget, Then the API returns 422 HARD_LIMIT_EXCEEDS_POOL and the change is rejected. Given a sub-allocation exists, When the admin updates limit_value or limit_type, Then the change applies to subsequent consumption and does not retroactively alter past usage. Given the API lists sub-allocations for a pool, Then each item includes name, limit_type, limit_value, usage, remaining, and last_updated, and values are consistent with recorded consumption.
Role-Based Ownership, Visibility, and Consumption Permissions
Given a pool with owner = Team X and visibility_scope = owner_and_assignees, When a user not in Team X and not assigned queries the pool list, Then the pool is not returned. Given a user with role Admin, When they call any pool or sub-allocation CRUD endpoint, Then access is granted (2xx). Given a user who is the pool Owner, When they create/edit/delete sub-allocations under that pool, Then access is granted; When they attempt to delete the pool with active usage or sub-allocations, Then it is denied with 403 unless remaining = 0 and there are no active sub-allocations. Given a user with role Consumer assigned to a pool or sub-allocation, When they initiate a batch job that consumes credits from that resource, Then deduction succeeds; When they attempt to consume from an unassigned pool, Then the request is rejected with 403 NOT_AUTHORIZED and no credits are deducted. Given an API token with scope consumption:write only, When it is used against management endpoints, Then 403 is returned; When used for consumption within assigned resources, Then 200 is returned and the deduction is recorded.
Pool Overage Rules Enforcement
Given a pool with overage_rule = block and remaining < required_credits, When a consumption request is made, Then the request is rejected with 403 INSUFFICIENT_POOL_CREDITS and remaining is unchanged. Given a pool with overage_rule = warn and remaining < required_credits, When a consumption request is made, Then the request is accepted, remaining becomes negative by the overage amount, and an OVERAGE_WARNING event is recorded and sent to pool owners. Given a pool with overage_rule = allow_with_charge and remaining < required_credits and a valid payment method on file, When a consumption request is made, Then the request is accepted, an overage_charge record is created for the deficit, remaining is set to 0, and the job proceeds; When no payment method is on file, Then the request is rejected with 402 PAYMENT_METHOD_REQUIRED. Given overage_rule changes from warn to block, When subsequent consumption requests exceed remaining, Then they are blocked; any existing negative balance remains for audit and is not auto-adjusted.
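The three overage policies dispatch cleanly on the rule name. A sketch mirroring the error codes above — credits as integers, and `apply_overage_rule` is an illustrative name:

```python
def apply_overage_rule(rule, remaining, required, has_payment_method=False):
    """Return (decision, new_remaining, extras) for a consumption request
    against a pool. Error/event names mirror the acceptance criteria."""
    if remaining >= required:
        return ("accepted", remaining - required, {})
    deficit = required - remaining
    if rule == "block":
        return ("rejected", remaining, {"error": "INSUFFICIENT_POOL_CREDITS"})
    if rule == "warn":
        # remaining is allowed to go negative; owners get a warning event
        return ("accepted", remaining - required, {"event": "OVERAGE_WARNING"})
    if rule == "allow_with_charge":
        if not has_payment_method:
            return ("rejected", remaining, {"error": "PAYMENT_METHOD_REQUIRED"})
        # deficit is billed separately; pool floor is 0, not negative
        return ("accepted", 0, {"overage_charge": deficit})
    raise ValueError(f"unknown overage rule: {rule}")
```

Note the asymmetry the criteria call out: `warn` leaves a negative balance for audit, while `allow_with_charge` floors the pool at zero and moves the deficit into a charge record.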
Burn Order Between Pool and Sub-Allocations
Given burn_order = sub_allocation_first and a job is mapped to a specific sub-allocation with available credits, When the job consumes credits, Then deductions are applied to that sub-allocation until its limit behavior is reached, and any remainder is charged to the parent pool only if permitted by pool availability and overage_rule. Given burn_order = pool_first and both parent pool and sub-allocation have availability, When a job consumes credits, Then deductions are applied to the parent pool first until remaining is 0 (or policy blocks), then to the sub-allocation if permitted. Given multiple sub-allocations would be eligible for the same job, When routing is evaluated, Then the system requires an explicit mapping to a single sub-allocation; if multiple match, Then the API returns 409 MULTIPLE_MATCHING_SUB_ALLOCATIONS and no credits are deducted. Given a sub-allocation with a hard limit has been reached, When burn_order = sub_allocation_first, Then the remainder is routed to the parent pool only if allowed; otherwise the request is rejected with 403 and no partial overdraw occurs.
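Burn order is a two-bucket split. A sketch covering the hard-limit, no-overage case — it returns None rather than a partial overdraw, matching the last rule above; `plan_burn` is an illustrative name:

```python
def plan_burn(amount, sub_remaining, pool_remaining, order="sub_allocation_first"):
    """Split a deduction between a sub-allocation and its parent pool per
    burn_order. Returns (from_sub, from_pool), or None if the request
    cannot be satisfied without overdraw (no partial burns)."""
    first, second = ((sub_remaining, pool_remaining)
                     if order == "sub_allocation_first"
                     else (pool_remaining, sub_remaining))
    take_first = min(amount, first)
    take_second = amount - take_first
    if take_second > second:
        return None                      # would overdraw: reject outright
    if order == "sub_allocation_first":
        return (take_first, take_second)
    return (take_second, take_first)     # pool_first: swap back to (sub, pool)
```

Returning the whole plan before touching balances is the key design point: the caller either commits both legs atomically or rejects, so a hard-limited sub-allocation can never be left partially drained by a request that ultimately fails.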
Tag Mapping for Batch Upload Routing
Given an admin maps tag key=value pairs to a target pool or sub-allocation, When a batch upload is initiated with matching tags, Then credit deductions are routed to the mapped target following the pool's burn_order and a routing record is stored with the job id. Given multiple mappings match a batch upload, When evaluating, Then the most specific mapping (pool+sub_allocation) takes precedence over pool-only; if two mappings have equal specificity, Then the request is rejected with 409 CONFLICTING_TAG_MAPPINGS and no credits are deducted. Given no mapping matches a batch upload, When the upload is initiated, Then the request is rejected with 422 MISSING_POOL_MAPPING and no credits are deducted. Given tag inputs include mixed case or surrounding whitespace, When matching occurs, Then keys are matched case-insensitively and values are matched exactly after trimming, and the normalized key=value pair is persisted.
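Tag normalization and specificity resolution can be sketched as follows. Specificity is modeled here as an integer (2 for pool+sub-allocation, 1 for pool-only) — an assumption for illustration, since the criterion only defines the ordering, not the encoding:

```python
def normalize_tag(key, value):
    """Per the criterion: keys match case-insensitively, values exactly
    after trimming; the normalized pair is what gets persisted."""
    return (key.strip().lower(), value.strip())

def resolve_mapping(job_tags, mappings):
    """Pick the most specific matching mapping; equal-specificity conflicts
    and misses surface as the error codes above. Each mapping is a dict:
    {'tags': {...normalized...}, 'specificity': int, 'target': str}."""
    tags = dict(normalize_tag(k, v) for k, v in job_tags.items())
    matches = [m for m in mappings
               if all(tags.get(k) == v for k, v in m["tags"].items())]
    if not matches:
        return {"error": "MISSING_POOL_MAPPING"}
    best = max(m["specificity"] for m in matches)
    top = [m for m in matches if m["specificity"] == best]
    if len(top) > 1:
        return {"error": "CONFLICTING_TAG_MAPPINGS"}
    return top[0]["target"]
```

Resolving to an error (rather than picking arbitrarily) on equal-specificity ties is what guarantees no credits are deducted on ambiguous routing.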
Concurrency and Double-Counting Safeguards
Given a consumption API request includes an Idempotency-Key header, When the same request is retried concurrently up to 10 times within 10 minutes, Then only one deduction is applied and all duplicates return the same response id and status. Given 50 concurrent consumption requests target the same pool with remaining = 100 and total requested = 120 under overage_rule = block, When executed, Then total deducted does not exceed 100, at least 20 are rejected, and remaining never goes negative. Given concurrent updates attempt to modify the same sub-allocation limit, When processed, Then the system uses optimistic concurrency (version or ETag) and a conflicting request returns 409 CONFLICT without applying partial changes. Given any successful consumption event, When auditing usage records, Then exactly one immutable usage record exists per deduction with a monotonic sequence number per pool and a reference to the Idempotency-Key if provided.
Scheduled Drop Reservations
"As a brand manager, I want to reserve credits for my scheduled product drop so that I’m guaranteed processing capacity when my catalog goes live."
Description

Allow reserving credits from a pool (or sub-allocation) for a future time window to protect capacity for product drops. Reservations support quantity, timeframe, linked project/batch IDs, and priority. During the window, consumption preferentially draws from the reservation; unused amounts auto-release at window end. Handle conflicts with clear rules (first-come, priority override with approval), prevent over-reservation beyond pool limits, and expose availability calendars. Integrate with PixelLift batch scheduling so reservations are created/linked at upload time. Include APIs and UI, timezone handling, and audit entries for all reservations.

Acceptance Criteria
Create Reservation via UI with Quantity, Window, Priority, and Links
Given a pool has 1000 unreserved credits and the user has permission to manage reservations When the user creates a reservation for 250 credits from 2025-10-05T09:00 to 2025-10-07T18:00 in America/Los_Angeles with priority High and links projectId=PRJ-123 and batchId=BATCH-456 Then the system persists the reservation with an id, quantity=250, priority=High, start/end stored in UTC with sourceTimezone=America/Los_Angeles, projectId=PRJ-123, batchId=BATCH-456, status=Reserved And the pool's reserved balance increases by 250 and the available-to-reserve balance decreases by 250 And the reservation is visible in the pool's reservation list and availability calendar within 2 seconds And an audit entry is created containing actorId, action=CREATE_RESERVATION, timestamp, changed fields, and request metadata And if start/end are in the past or quantity<=0, the system rejects the request with inline validation and no reservation is created
Reservation API CRUD with Validation and Idempotency
Given a valid poolId and an Idempotency-Key header When POST /v1/reservations is called with quantity, start, end, timezone, priority, projectId, and optional batchId Then the API responds 201 with reservation id and echoes persisted fields And a subsequent identical POST with the same Idempotency-Key returns 200 with the original reservation id and no duplicate is created And invalid inputs (end<=start, invalid timezone, missing poolId, quantity<=0) return 422 with field-level error details And GET /v1/reservations/{id} returns 200 with the created reservation And PATCH /v1/reservations/{id} updates quantity, start/end, or priority before the window starts, enforces capacity rules, and returns 200 with updated values And PATCH /v1/reservations/{id} with status=Cancelled before the window start cancels the reservation and returns unused capacity to the pool And each create/update/cancel action writes an audit entry capturing before/after values
Prevent Over-Reservation Against Pool and Sub-Allocation Limits
Given a pool with total available-to-reserve credits 1000 and a sub-allocation "Brand A" capped at 600 And existing reservations for "Brand A" total 500 overlapping 2025-10-05 to 2025-10-06 When a user attempts to create a reservation in "Brand A" for 200 credits from 2025-10-05T08:00 to 2025-10-07T18:00 Then the system rejects the request with 409 Conflict and reason=InsufficientReservableCapacity for the overlapping period And when the same request is reduced to 100 credits or moved to a non-overlapping window, the system accepts it and persists the reservation And at any time slice, the sum of overlapping reservations within a sub-allocation never exceeds its cap and the sum across the pool never exceeds pool capacity
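The "at any time slice" rule above amounts to a sweep over reservation boundary events: the peak overlapping sum plus the new request must stay within the cap. This sketch assumes reservations are represented as (amount, start, end) tuples with comparable, half-open time values; the function names are illustrative:

```python
def max_overlapping_reserved(reservations, start, end):
    """Peak sum of reserved credits over [start, end), via a sweep
    over reservation boundary events clipped to the query window."""
    events = []
    for amount, r_start, r_end in reservations:
        s, e = max(r_start, start), min(r_end, end)
        if s < e:
            events.append((s, amount))   # reservation begins: add
            events.append((e, -amount))  # reservation ends: release
    peak = current = 0
    # Sorting (time, delta) processes releases before starts at equal
    # timestamps, matching half-open [start, end) semantics.
    for _, delta in sorted(events):
        current += delta
        peak = max(peak, current)
    return peak

def can_reserve(existing, cap, amount, start, end):
    """Reject when adding `amount` over [start, end) would exceed `cap`
    at any time slice (reason=InsufficientReservableCapacity)."""
    return max_overlapping_reserved(existing, start, end) + amount <= cap
```

With the example numbers (cap 600, 500 already reserved in the overlap), a 200-credit request is rejected while 100 credits, or a non-overlapping window, are accepted.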
Active Window Consumption and End-of-Window Auto-Release
Given a reservation for 250 credits is active for a pool and linked to projectId=PRJ-123 and batchId=BATCH-456 When a job under PRJ-123/BATCH-456 consumes 100 credits during the window Then 100 credits are deducted from the reservation balance first and not from unreserved pool capacity And when an additional 200 credits are consumed during the window, the next 150 are deducted from the reservation and the remaining 50 from the pool's unreserved capacity And if multiple reservations could apply, consumption order is: linked batch reservation > higher priority > earlier creation time And at the exact window end timestamp, any unused reservation balance auto-releases to the pool, the reservation status becomes Released, and an audit entry is recorded
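The precedence rule "linked batch reservation > higher priority > earlier creation time" is just a composite sort key. This sketch assumes reservations are dicts with a numeric `priority` where higher wins; field names are illustrative:

```python
def reservation_order(reservations, batch_id):
    """Order candidate reservations for consumption: the reservation
    linked to this batch first, then higher priority, then earlier
    creation time."""
    return sorted(
        reservations,
        key=lambda r: (
            r["batch_id"] != batch_id,  # False (0) sorts first: linked batch wins
            -r["priority"],             # higher priority next
            r["created_at"],            # then earlier creation time
        ),
    )
```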
Conflict Resolution: First-Come vs Priority Override with Approval
Given a confirmed reservation holds capacity in a future window When a later reservation request overlaps and would exceed capacity Then the later request is rejected with 409 Conflict and reason=CapacityExceeded When a user with override permission submits the same request with priority=High and approvalRequested=true Then the system records a PendingApproval reservation that holds no capacity And upon approval by an authorized approver, the reservation becomes Confirmed and capacity is reassigned by reducing or cancelling lower-priority, latest-created conflicting reservations until sufficient capacity is freed, with audit entries and notifications for all affected reservations And if approval is denied or not granted before the start time, the PendingApproval reservation auto-expires and holds no capacity
Availability Calendar with Timezone-Aware Visualization
Given reservations exist across multiple timezones for a pool When a user in Europe/Berlin opens the availability calendar in Week view Then each reservation is rendered at the correct converted local times for the user's timezone And overlapping reservations display aggregated reserved credits per time slot and a remaining-capacity indicator And the user can toggle the calendar timezone between User Local and Pool Timezone; visible times update accordingly And daylight saving transitions are handled without altering actual reservation durations And selecting a reservation opens details showing poolId, projectId/batchId, quantity, window (local and UTC), priority, and a link to the audit log
Batch Scheduling Auto-Creates and Links Reservation
Given a user uploads a batch and schedules processing in a future window while selecting a pool or sub-allocation When the system estimates the batch will consume 180 credits Then a reservation for 180 credits is auto-created (or updated if already present) for the scheduled window in the selected pool, linked to the batchId and projectId And if creating/updating the reservation would exceed available reservable capacity, the scheduling action is blocked with a clear error and no batch schedule is saved And when the user updates the scheduled window, the linked reservation is updated to the new window with capacity rules enforced And when the user cancels the batch schedule, the linked reservation is cancelled and any unused capacity is returned to the pool And all auto-create/update/cancel actions produce audit entries referencing the initiating batch and user
Balance Transfers with Approval & Audit Ledger
"As a finance lead, I want to transfer credits between client pools with approvals so that I can re-balance budgets mid-month while maintaining compliance and traceability."
Description

Support transferring credits between pools and sub-allocations with guardrails: configurable approval thresholds, required reason codes/notes, and optional multi-step approvals. All transfers generate immutable ledger entries (who, when, from, to, amount, before/after balances) with export capability. Enforce role-based permissions, prevent negative balances, and provide rollback only via compensating entries. Surface transfer history in pool detail views and via API webhooks for finance systems. This ensures flexibility to re-balance budgets while maintaining a complete audit trail.

Acceptance Criteria
Transfer initiation with role-based permissions
Given a user without Transfer:Initiate permission on the source pool or sub-allocation, when they submit a transfer request, then the API responds 403 and no transfer or ledger entry is created. Given a user with Transfer:Initiate on the source and View on the destination, when they submit a valid transfer with an Idempotency-Key, then a transfer record is created and an initiation ledger event is written. Given a user with Transfer:Approve only, when they attempt to initiate a transfer, then the API responds 403. Given a user lacking visibility to either the source or destination, when they probe pool or sub-allocation ids, then the API responds 404 to avoid information leakage. Given the same Idempotency-Key is reused with identical payload, when the request is retried, then the same transfer id is returned and no duplicate ledger entries are created; if the payload differs, then the API responds 409 Idempotency conflict.
Approval thresholds and multi-step approvals
Given approvals configuration where amounts <1000 auto-approve, 1000–5000 require 1 approver, and >5000 require 2 distinct approvers, when a transfer is initiated, then the transfer enters the correct approval state per amount. Given separation_of_duties=true, when the initiator attempts to approve a transfer that requires approval, then the API responds 403. Given the final required approval is recorded, when execution is triggered, then source and destination balances are updated atomically and an execution ledger entry is written within 1 second. Given an approver rejects the request, then the transfer status becomes Rejected, no balances change, and a rejection ledger entry is recorded. Given a transfer remains awaiting approval for 72 hours, then it expires automatically with an expiration ledger entry and cannot be approved afterward.
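The threshold routing and separation-of-duties checks above can be sketched directly; the hard-coded thresholds mirror the example configuration, and the transfer dict shape is an assumption for illustration:

```python
def required_approvals(amount):
    """Approver count per the example thresholds:
    <1000 auto-approve, 1000 to 5000 one approver, >5000 two approvers."""
    if amount < 1000:
        return 0
    if amount <= 5000:
        return 1
    return 2

def can_approve(transfer, approver_id):
    """separation_of_duties: the initiator may never approve (403),
    and each approver must be distinct."""
    return (approver_id != transfer["initiator_id"]
            and approver_id not in transfer["approvals"])
```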
Required reason codes and notes validation
Given reason_codes are configured and notes_min_length=10, when a user initiates a transfer without a valid reason_code or with notes shorter than 10 characters, then the API responds 422 with field-level errors and no transfer is created. Given a transfer is created, then its reason_code and notes are required and are included in all subsequent ledger entries and exports. Given a user attempts to modify reason_code or notes after execution, then the API responds 403 and the immutable ledger remains unchanged.
Negative balance prevention and available balance checks
Given source available balance equals current_balance minus reserved_credits, when a transfer amount exceeds the available balance of the source pool or sub-allocation, then the API responds 422 Insufficient Funds and no transfer is created. Given two concurrent approved transfers would overdraw the same source, when executed, then at most one succeeds; the others fail with 409/422, preventing negative balances. Given available balance drops below the pending transfer amount before execution, then execution fails with status Failed due to Insufficient Funds and a failure ledger entry is recorded; balances remain unchanged.
Immutable ledger, export, and rollback via compensating entries
Given any transfer lifecycle event (initiated, approval_requested, approved, rejected, executed, failed, rollback_created), then an immutable ledger entry is created containing: transfer_id, event_type, actor_id, actor_role, timestamp (ISO-8601 UTC), source_pool_id/sub_allocation_id, destination_pool_id/sub_allocation_id, amount, source_balance_before, source_balance_after, destination_balance_before, destination_balance_after, reason_code, notes_snippet (<=200 chars). Given any attempt (API/UI) to update or delete a ledger entry, then the API responds 405/403 and no changes occur. Given an export request with a date range and format=CSV or JSON for up to 100,000 rows, then the export completes within 10 seconds, includes all matching ledger entries with headers/fields as specified, and the row count matches the API count for the same filter. Given an admin requests rollback of an executed transfer and the destination has sufficient available balance, then the system creates a compensating transfer of identical amount from destination to source, links it to the original via original_transfer_id, records a rollback_created ledger entry, and updates balances atomically; if insufficient, then the API responds 422 and no rollback is created.
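Because the ledger is append-only, rollback is modeled as a new compensating entry rather than a mutation. A minimal sketch, assuming a transfer dict with the field names the ledger criteria list:

```python
def compensating_transfer(original, available_dest_balance):
    """Rollback only via a compensating entry: identical amount, reversed
    direction, linked by original_transfer_id; 422 if the destination
    lacks sufficient available balance."""
    if available_dest_balance < original["amount"]:
        raise ValueError("422: insufficient destination balance for rollback")
    return {
        "source_pool_id": original["destination_pool_id"],
        "destination_pool_id": original["source_pool_id"],
        "amount": original["amount"],
        "original_transfer_id": original["transfer_id"],
        "event_type": "rollback_created",
    }
```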
Transfer history in pool and sub-allocation detail views
Given a user with View permission opens a pool or sub-allocation detail page, then a Transfers tab lists inbound and outbound transfers with columns: timestamp, direction, amount, source, destination, status, reason_code, initiated_by, approved_by. Given the user applies filters (date range, direction, status, reason_code, actor) and pagination (page size 50), then results update within 1 second and counts match the ledger API for the same filter. Given the user clicks a transfer row, then a detail view shows the full event timeline with before/after balances and links to any compensating (rollback) transfer.
Finance system webhooks for transfer events
Given a transfer changes state (initiated, approval_requested, approved, rejected, executed, failed, rollback_created), then a webhook POST is delivered to subscribed endpoints within 5 seconds containing event_id, transfer_id, event_type, timestamp (UTC), source, destination, amount, reason_code, status, before/after balances. Given the receiving endpoint returns non-2xx, then delivery is retried with exponential backoff up to 8 times over 24 hours; duplicates can be deduplicated via event_id and X-PixelLift-Idempotency-Key headers. Given webhooks are signed using HMAC-SHA256 with a shared secret and include a timestamped signature header, then the signature validates for requests within a 5-minute skew; otherwise the consumer can reject. Given an admin requests replay for a date range, then the system re-sends matching events preserving original event_ids and ordering by timestamp.
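The timestamped HMAC-SHA256 signature and 5-minute skew check can be sketched with the standard library. The `t=...,v1=...` header format and the canonical-JSON body here are illustrative assumptions (a Stripe-style scheme), not the documented PixelLift header:

```python
import hashlib
import hmac
import json
import time

def sign_webhook(secret, payload, timestamp=None):
    """Sender side: canonicalize the body and sign '<ts>.<body>' so
    receivers can reject deliveries outside the skew window."""
    ts = timestamp if timestamp is not None else int(time.time())
    body = json.dumps(payload, separators=(",", ":"), sort_keys=True)
    mac = hmac.new(secret, f"{ts}.{body}".encode(), hashlib.sha256).hexdigest()
    return body, f"t={ts},v1={mac}"

def verify_webhook(secret, body, header, now, skew=300):
    """Receiver side: recompute and compare in constant time;
    reject signatures older than the allowed skew (default 5 minutes)."""
    parts = dict(p.split("=", 1) for p in header.split(","))
    ts = int(parts["t"])
    if abs(now - ts) > skew:
        return False
    expected = hmac.new(secret, f"{ts}.{body}".encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, parts["v1"])
```

Deduplication by `event_id` happens after signature verification, so a replayed-but-valid delivery is ignored rather than reprocessed.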
Carryover & Expiration Policy Engine
"As an operations manager, I want to set carryover and expiration rules per pool so that unused credits are handled predictably without manual cleanups."
Description

Introduce configurable policies per pool for monthly carryover caps, expiration schedules, grace periods, and FIFO burn order across current vs. carryover balances. Policies can inherit from org defaults or be overridden at the pool level. System processes expirations automatically, records ledger entries, and notifies stakeholders ahead of deadlines. Include simulation tools to preview upcoming expirations and their impact. Ensure policies are enforced consistently across UI, API, and background jobs, and displayed transparently in pool settings and user-facing consumption dialogs.

Acceptance Criteria
Pool inherits org defaults and supports overrides
Given an org with default policy values (carryover_cap, expiration_schedule, grace_period, burn_order=FIFO) When a pool is created without overrides Then the pool policy equals org defaults and is labeled "Inherited" in UI and API (source=org-default) When a pool-level override is saved for any field Then the pool policy uses overridden values and API marks source=pool-override per field And a policy-change audit entry is recorded with actor, timestamp, before/after
Monthly rollover applies carryover caps
Given a pool with unused credits at month end and a carryover_cap C When the scheduled rollover runs at 00:00 on the 1st in the pool timezone Then up to C credits are moved to a new carryover lot for the new month And any excess above C expires immediately And ledger entries are created for carryover lot creation and for expired amounts And UI/API next-month opening balances reflect current vs carryover with lot IDs and amounts
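The rollover split above reduces to one min/subtract pair; this sketch omits the ledger writes and lot IDs for brevity:

```python
def month_end_rollover(unused_credits, carryover_cap):
    """Split unused credits at rollover: up to the cap becomes a new
    carryover lot; any excess above the cap expires immediately.
    Both outcomes would be ledgered in the real system."""
    carried = min(unused_credits, carryover_cap)
    expired = unused_credits - carried
    return {"carryover_lot": carried, "expired": expired}
```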
Expiration schedule and grace period enforcement
Given a carryover lot with expiration_date D and grace_period G days When current time < D Then the lot remains spendable and is not flagged "in-grace" When D <= current time < D + G Then the lot remains spendable and is flagged "in-grace" in UI/API When current time >= D + G at 00:00 pool timezone Then any remaining amount in the lot expires, with a ledger entry and event emitted And no consumption after expiration can draw from the expired lot, so its balance never goes negative
FIFO burn order across carryover and current lots
Given a pool with multiple spendable lots (carryover and current) When credits are consumed via UI or API Then debits are applied to the lot with the earliest expiration first; if tie, to oldest createdAt first And only after all carryover lots are exhausted is the current-month lot debited And the consumption response/details show lot-by-lot debits and remaining balances
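The burn order above (earliest expiration first, ties broken by oldest `createdAt`) naturally drains carryover lots before the current-month lot whenever carryover expires sooner. A sketch, with lots as dicts whose field names are illustrative:

```python
def burn_fifo(lots, amount):
    """Debit lots earliest-expiration first, ties by oldest created_at;
    returns the lot-by-lot debits the consumption response would show."""
    debits = []
    for lot in sorted(lots, key=lambda l: (l["expires_at"], l["created_at"])):
        if amount == 0:
            break
        take = min(lot["balance"], amount)
        if take:
            lot["balance"] -= take
            amount -= take
            debits.append((lot["id"], take))
    if amount:
        raise ValueError("insufficient spendable credits across lots")
    return debits
```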
Consistent enforcement across UI, API, and background jobs
Given identical consumption operations performed via UI and via API Then resulting balances, lot selections, and ledger entries are identical When the background expiration job runs Then its effects match a dry-run simulation for the same period and inputs And all operations are idempotent under retry (no double-debits or double-expirations)
Upcoming expiration notifications to stakeholders
Given a pool with lots expiring within the policy's notification_window N days When the notification scheduler runs Then emails and in-app alerts are sent to pool owners and configured stakeholders with lot IDs, amounts, and expiration dates And each lot triggers at most one notification per window; retries are deduplicated by lot-id+window And the notifications API lists pending and sent notifications for auditability
Expiration impact simulation tool
Given a user runs a simulation for a pool for a date range R When simulation executes Then the output includes daily projections of expirations, carryover application versus caps, and resulting available balances And results are available in UI and via API, exportable as CSV And simulation is read-only and uses the same policy engine as production processing, producing identical results for the same inputs
Pool-Aware Processing & Default Mapping
"As a content producer, I want my batch uploads to automatically bill the correct pool so that I don’t have to choose budgets manually every time."
Description

Integrate pool selection seamlessly into PixelLift’s batch upload and API flows. Provide rule-based default mapping of uploads to pools based on brand/client/campaign tags, with user override (subject to permissions). Ensure atomic debit of credits at job submission with idempotency keys to avoid double-charging, and handle insufficient funds with clear error states and fallback options (request transfer, purchase add-on). Display current pool balance and reservation usage in upload UI. Log consumption with references to job IDs for end-to-end traceability.

Acceptance Criteria
Rule-Based Default Pool Mapping on Batch Upload
- Given a user uploads a batch with brand, client, and campaign tags, When the upload screen loads, Then the system auto-selects a pool based on the mapping rules table. - Rule precedence: campaign > client > brand; only one pool is selected following highest-precedence match. - If multiple pools match at the same precedence, Then the user is prompted to choose a pool and no default is applied. - If no mapping exists for any tag, Then the workspace default pool is selected. - The selected pool is displayed with an "Auto-mapped" label and info tooltip indicating the matching rule (e.g., rule_id). - The mapping decision (selected_pool_id, matched_tag_type, matched_tag_value, rule_id) is recorded in audit logs.
User Override of Auto-Mapped Pool with Permission Enforcement
- Given an auto-mapped pool is selected, When a user with role Editor or higher opens the pool selector, Then the user can select a different pool from the dropdown and the selection updates immediately. - When a user without override permission (e.g., Viewer) attempts to change the pool, Then the control is disabled and a tooltip explains "You don’t have permission to change pools." - Overriding the pool affects only the current batch/job and does not alter mapping rules. - The override action is captured in audit logs (actor_id, from_pool_id, to_pool_id, timestamp, reason="manual_override").
Atomic Credit Debit at Job Submission (UI Flow)
- Given a selected pool and a computed required_credits for the batch, When the user clicks Submit, Then the system attempts an atomic debit equal to required_credits from the selected pool before queuing the job. - If the debit succeeds and the job is queued, Then a ledger entry is created with fields (ledger_entry_id, pool_id, job_id, debit_amount, actor_id, timestamp) and the UI shows updated balances. - If any part of submission fails after debit but before job is queued, Then the debit is rolled back (no net debit) and the user sees an error state indicating the job was not queued. - The submission uses a unique idempotency_key per job submission to ensure single debit semantics.
API Job Submission Idempotency & Double-Charge Prevention
- Given a client submits a job via API with header Idempotency-Key=X, When the same request with the same key is retried within 24 hours, Then exactly one job is created and exactly one debit occurs. - Subsequent identical-key retries return 200 with the original job_id and ledger_entry_id without additional debits. - If a request reuses the same Idempotency-Key but with a different payload hash, Then the API responds 409 Conflict with code IDP_KEY_PAYLOAD_MISMATCH and no debit occurs. - Idempotent responses include idempotency_key, job_id, ledger_entry_id for traceability.
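The same-key/same-payload replay versus same-key/different-payload conflict above hinges on storing a payload hash alongside the original result. An in-memory sketch, with `IdempotencyStore` and `submit` as illustrative names:

```python
import hashlib
import json

class IdempotencyStore:
    """Illustrative store: same key + same payload replays the original
    result with no new debit; same key + different payload is a
    409 IDP_KEY_PAYLOAD_MISMATCH."""

    def __init__(self):
        self._seen = {}  # key -> (payload_hash, result)

    def submit(self, key, payload, create_job):
        digest = hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode()
        ).hexdigest()
        if key in self._seen:
            stored_hash, result = self._seen[key]
            if stored_hash != digest:
                raise ValueError("409 IDP_KEY_PAYLOAD_MISMATCH")
            return result  # replay: exactly one job, one debit
        result = create_job(payload)
        self._seen[key] = (digest, result)
        return result
```

A real service would also expire keys after the 24-hour retention window and persist the store so retries survive restarts.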
Insufficient Funds Handling with Fallback Options
- Given the selected pool’s available_credits (total_credits - reserved_credits) is less than required_credits, When the user attempts submission, Then the submission is blocked and an inline error displays: pool_name, required_credits, available_credits, reserved_credits. - The error offers two actions: "Request transfer" (opens request modal, logs request event) and "Purchase add-on" (navigates to billing if user has billing permission; otherwise disabled with tooltip). - For API submissions in this state, Then the service returns HTTP 402 with code POOL_INSUFFICIENT_FUNDS and includes balances {required, available, reserved} and links/actions {transfer_request_endpoint, purchase_url}. - No debit or partial job creation occurs when insufficient funds are detected.
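The pre-submission funds check and the 402 body shape it produces can be sketched as one function; the returned dict mirrors the fields named above, with the function name as an illustrative choice:

```python
def check_funds(total_credits, reserved_credits, required_credits):
    """Block submission when available = total - reserved < required,
    returning a POOL_INSUFFICIENT_FUNDS body; None means proceed to
    the atomic debit."""
    available = total_credits - reserved_credits
    if available >= required_credits:
        return None
    return {
        "code": "POOL_INSUFFICIENT_FUNDS",
        "balances": {"required": required_credits,
                     "available": available,
                     "reserved": reserved_credits},
    }
```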
Upload UI Balance & Reservation Visibility
- Given a pool is selected (auto-mapped or manual), Then the UI displays: total_credits, reserved_credits, and available_credits, with available_credits = total_credits - reserved_credits. - When the user changes the pool, Then these values update within 500ms and the required_credits vs available_credits indicator recalculates. - If available_credits < required_credits, Then the UI shows a prominent warning state before submission. - Values refresh after any successful debit or transfer event, ensuring the displayed balance matches the latest ledger within one polling cycle (<= 5s).
Consumption Logging & End-to-End Traceability
- For every successful debit, Then a ledger record is written with fields: ledger_entry_id, job_id, pool_id, amount, idempotency_key, actor_id or api_key_id, timestamp, source={ui|api}, mapping_context {rule_id, tags}. - For any reversal (rollback/refund), Then a compensating ledger entry is created referencing original ledger_entry_id. - Audit queries by job_id or pool_id return the corresponding ledger entries within 2 seconds and results are immutable (append-only). - All ledger entries include a human-readable message and machine fields sufficient to reconstruct the consumption path end-to-end.
Cost Attribution Reporting & Exports
"As an agency owner, I want detailed usage reports by client and campaign so that I can attribute costs accurately and reconcile billing."
Description

Provide reporting that breaks down credit usage and remaining balances by pool, sub-allocation, brand, client, campaign, user, and time period. Include filters, pivot views, and downloadable CSV exports, plus scheduled delivery to email/S3 and integration endpoints for BI tools. Reports tie each consumption entry back to job IDs and style-presets to support ROI analysis. Support multi-entity roll-ups for agencies and per-client sharing links with scoped visibility. This delivers the clean cost attribution promised to multi-brand teams.

Acceptance Criteria
Interactive Breakdown & Pivot Views Across Dimensions
Given a user with access to multiple Usage Pools and sub-allocations When they open the Cost Attribution report and select dimensions Pool, Sub-allocation, Brand, Client, Campaign, User, and Time (Day/Week/Month) Then the table shows columns: usage_credits, remaining_credits, reserved_credits, expired_credits, carryover_credits, transfers_in, transfers_out, and total_credits And pivoting any dimension between rows and columns recalculates aggregates without data loss And grand totals equal the sum of all grouped totals within ±0 rounding error And balances reflect scheduled-drop reservations and carryovers/expirations as of the selected period And p95 render time is ≤ 3s for queries over up to 100k usage records in the selected period
Advanced Filters & Date Range Controls
Given a report view When the user applies filters for Pool, Sub-allocation, Brand, Client, Campaign, User, Style Preset, Job ID, and Date Range (absolute or relative presets: Last 7/30/90 days, MTD, QTD, YTD) with a chosen time zone Then only matching rows are returned And a filter summary chip list reflects all active filters And clearing filters restores the unfiltered totals And applying identical filters via URL query or share link yields identical results And p95 filter application latency is ≤ 2s for datasets up to 100k records
CSV Export Completeness & Accuracy
Given a filtered or pivoted Cost Attribution report view When the user clicks Export CSV Then a CSV (UTF-8, comma-delimited, quoted where needed) downloads with a header row and deterministic column order And columns include at minimum: pool_id, pool_name, sub_allocation_id, brand, client, campaign, user_id, user_email, time_period_start, time_period_end, usage_credits, remaining_credits, reserved_credits, expired_credits, carryover_credits, transfers_in, transfers_out, job_id, style_preset_id, style_preset_name And all rows in the CSV match the on-screen result set (count parity) and numeric totals match within ±0.01 credits And exports up to 100k rows complete within 30 seconds; larger exports stream to a downloadable link within 5 minutes And each row representing consumption has non-null job_id and style_preset_id; balance-only rows omit these fields
Scheduled Delivery to Email & S3
Given a user with Report Scheduler permissions When they create a schedule specifying report definition, frequency (daily/weekly/monthly or cron), time zone, email recipients, and/or S3 destination (bucket/path, IAM role) Then test connections validate successfully before saving And at the scheduled time, delivery occurs within 5 minutes: emails include CSV attachment for ≤25MB else a secure expiring link (valid ≥24h); S3 objects are written with server-side encryption and a predictable key pattern And failures trigger retries with exponential backoff (3 attempts) and send a failure notification to owners And each run is logged with status, runtime, row count, file size, and destination URIs in the audit log
BI Integration API Parity & Performance
Given an API client with a valid OAuth2 token and scope report.read When it calls GET /v1/reports/cost-attribution with filters, dimensions, pivots, pagination (limit, cursor), and time zone Then the JSON response returns rows and aggregates equivalent to the CSV export for the same parameters And includes fields for job_id and style_preset_id/style_preset_name for usage-level rows And supports pagination up to 10k rows per page with stable cursors, returning next_cursor when more data is available And p95 latency is ≤ 2s for queries returning ≤10k rows; responses include Retry-After on 429s with a limit of 60 req/min And responses include ETag for caching; If-None-Match returns 304 when unchanged within a 5-minute freshness window
Agency Multi-Entity Roll‑Ups
Given an agency admin with access to multiple brands/clients and pools When they select a roll-up view across chosen entities and time period Then totals aggregate across entities without double-counting cross-pool transfers (transfers_in/out net to zero at roll-up level) And drill-down from roll-up to entity to pool to job is possible within two clicks per level with consistent totals And permissions prevent inclusion of entities the admin is not authorized to view And the roll-up grand total equals the sum of entity totals within ±0 rounding error
Per‑Client Sharing Links with Scoped Visibility
Given an account manager creating a share link for a client When they define scope (client(s), pools/sub-allocations, date range, dimensions) and an expiration time Then the generated link exposes only the scoped data and hides other entities and users And the link can be revoked at any time; subsequent access returns 410 Gone within 60 seconds of revocation And downloads via the link are limited to CSV of the scoped report; API access requires authentication and is not granted by the link And all link accesses are recorded with timestamp, IP, user-agent, and row counts in the audit log
Notifications & Threshold Alerts
"As a project coordinator, I want proactive alerts about low balances and upcoming expirations so that I can take action before my scheduled edits are at risk."
Description

Implement configurable alerts for low balances, impending expirations, reservation conflicts, and failed debits. Support channel preferences (email, Slack, in-app, webhooks) and per-pool thresholds. Provide daily digests to reduce noise and immediate alerts for critical events. Expose alert settings via UI and API, include actionable links (top up, request transfer, edit reservation), and log notification history for auditing. This reduces end-of-month scrambles by proactively surfacing issues before they block processing.

Acceptance Criteria
Per-Pool Low Balance Threshold Alert
- Given a pool balance of 1,000 credits with a low-balance threshold set to 20%, when a debit reduces the balance to 200 or below, then send an alert within 60 seconds via the pool’s configured channels and include: pool name, current balance, % remaining, configured threshold, projected depletion date, and action links (Top Up, Request Transfer, Edit Threshold, View Usage). - Given an alert has been sent for crossing threshold X, when the balance remains at or below X without recovering above X, then suppress duplicate alerts for 24 hours and include the condition once in the next daily digest. - Given multiple thresholds are configured (e.g., 50%, 20%, 5%), when the balance crosses multiple thresholds in a single debit, then emit one alert per threshold crossed in descending severity order. - Given channel preferences are [Slack, Email, In-App], when the alert triggers, then deliver to Slack channel Y and send email to recipient list Z and create an in-app notification; if Slack delivery fails after 3 retries, then fall back to email and in-app only and log the failure with error code. - Given the alert is delivered, when the recipient opens the in-app notification, then mark it as read and deep-link to the pool’s balance page with remediation actions visible.
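Detecting every threshold a single debit crosses is a range check per configured threshold. This sketch assumes thresholds are fractions of the pool total and treats the lowest crossed threshold as the most severe, which is one reasonable reading of "descending severity order":

```python
def thresholds_crossed(total, before, after, thresholds):
    """Thresholds (fractions of `total`) crossed by one debit that took
    the balance from `before` to `after`, most severe (lowest) first.
    One alert is emitted per crossed threshold."""
    crossed = [
        t for t in thresholds
        if before > total * t >= after  # was above, now at or below
    ]
    return sorted(crossed)
```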
Impending Credit Expiration Alerts
- Given credits in pool A expire on 2025-10-31 and carryover is disabled, when at least 100 credits are due to expire in 7 days, then send an impending expiration alert at 09:00 in the pool’s timezone including totals by expiration date and action links (Request Transfer, Adjust Carryover Policy if admin, View Expiration Schedule). - Given carryover is enabled with a max of 20%, when computing expiring amounts, then exclude the portion expected to carry over and display the computed carryover in the alert payload and UI. - Given a per-pool lead time override of 14 days is configured, when scheduling expiration alerts, then send them at 14 days before expiry instead of the default. - Given credits were previously alerted for expiration, when the expiring amount has not changed by at least 10% and no policy changes occurred, then do not send a new immediate alert; include the item once in the daily digest only.
Reservation Conflict Immediate Alert
- Given a reservation for 800 credits exists for 2025-09-25 10:00–12:00 and available credits for that window drop below 800, when the conflict is detected, then send an immediate Critical alert within 60 seconds to configured channels. - Then include reservation id, pool id, shortfall amount, start/end window, and action links (Edit Reservation, Prioritize Job, Request Transfer prefilled with shortfall, Top Up) in the alert. - Given the conflict is resolved by top-up, transfer, or reservation edit, when resolution is detected, then close the incident and post a resolution notification to the same channels and suppress further alerts for that incident id. - Given multiple overlapping reservations are in conflict, when alerting, then include separate sections per reservation with unique incident ids and shortfalls.
Failed Debit Alert with Retry and Audit Trail
- Given a debit attempt occurs via API or job processing, when it fails due to insufficient credits or a system error, then send an immediate Critical alert within 60 seconds to configured channels. - Given a transient system error (e.g., 5xx), when retrying, then perform exponential backoff with 3 attempts over 10 minutes and post an update on final status; given a business rule failure (insufficient credits), then do not retry and include remediation links (Top Up, Request Transfer) in the alert. - Then log each failure and notification in the audit log with timestamp, pool id, request id, user/job id, error code, channel delivery statuses, and a link to the exact audit entry. - Given linked pools permit transfers, when the failure reason is insufficient credits, then include a one-click Request Transfer link prefilled with the shortfall amount and target pool suggestions based on available balances.
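The transient-error retry policy above (3 attempts over roughly 10 minutes) fits a standard capped exponential schedule; the base and cap below are illustrative values chosen to fit that window, not configured PixelLift defaults:

```python
def backoff_delays(attempts, base_seconds, cap_seconds):
    """Exponential backoff schedule: base * 2**i per attempt, capped."""
    return [min(base_seconds * 2 ** i, cap_seconds) for i in range(attempts)]
```

For example, `backoff_delays(3, 75, 600)` yields waits of 75, 150, and 300 seconds, keeping all retries inside a ten-minute window.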
Daily Digest Aggregation and Noise Reduction
- Given non-critical alerts occurred in the last 24 hours, when the daily digest time (default 09:00 account timezone) is reached, then send one digest summarizing counts and top items per pool, excluding Critical events already sent immediately. - Then include sections for Low Balance, Upcoming Expirations, Reservation Warnings, and Delivery Failures; each item shows severity, count, and deep links to remediation pages. - Given a recipient opted out of immediate non-critical alerts, when the digest is sent, then ensure they still receive the digest unless they have fully muted that pool. - Given a low-balance threshold remains breached without recovery, when compiling the digest, then list it once per digest cycle until remediation and do not send additional immediate alerts for the same threshold.
Alert Settings via UI and API with Permissions
- Given a user with Pool Admin role, when opening the pool settings UI, then they can configure thresholds, channel preferences (email addresses, Slack channel/webhook, in-app), expiration lead times, digest time, and webhook endpoints, with input validation and a test-send capability that returns visible success/failure. - Given the API endpoint POST /v1/usage-pools/{id}/alerts, when called with valid auth and a well-formed payload, then persist settings and return 200 with the effective configuration; invalid fields return 400 with specific error codes and messages; unauthorized calls return 401/403 as appropriate. - Given a user without edit permission, when viewing alert settings, then show the current configuration in read-only mode and return 403 on modification attempts. - Then version every change with actor, before/after diff, timestamp, and surface it in the audit history with filters and export capability.
Notification History and Delivery Receipts
- Given any alert is emitted, when viewing the Notification History for a pool, then display entries with event type, severity, created_at, status (Queued, Delivered, Bounced, Read if available), channels, recipients, message id, and links to related objects. - Given email delivery, when the SMTP provider returns a bounce or complaint, then update status within 5 minutes and auto-suppress email for that recipient for 24 hours; given Slack/webhook 4xx/5xx responses, then retry up to 3 times with exponential backoff and log outcomes per attempt. - Given webhook delivery is enabled, when sending an alert, then POST a JSON payload containing event_id, event_type, pool_id, severity, occurred_at, payload, and an HMAC-SHA256 signature header and Idempotency-Key; the receiver rejects unsigned or invalidly signed requests, and such deliveries are marked as failed. - Given history filters (date range, pool, event type, status, channel) are applied, when querying up to 10,000 records, then return the first page in under 2 seconds with pagination metadata.
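The signing scheme above can be illustrated with Python's standard `hmac` module (a sketch; the secret value and the exact header placement are assumptions — only the HMAC-SHA256 requirement and the payload field names come from the spec):

```python
import hashlib
import hmac
import json

def sign_webhook(secret: bytes, body: bytes) -> str:
    """Sender side: hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify_webhook(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Receiver side: constant-time comparison; missing or wrong signatures fail."""
    expected = sign_webhook(secret, body)
    return hmac.compare_digest(expected, signature_header or "")

payload = json.dumps({
    "event_id": "evt_123", "event_type": "low_balance", "pool_id": "pool_a",
    "severity": "critical", "occurred_at": "2025-09-25T10:00:00Z", "payload": {},
}).encode()

sig = sign_webhook(b"shared-secret", payload)
assert verify_webhook(b"shared-secret", payload, sig)            # valid delivery
assert not verify_webhook(b"shared-secret", payload + b"x", sig)  # tampered body rejected
```

Note that the signature must be computed over the raw bytes actually sent; re-serializing the JSON on the receiver side can change key order or whitespace and break verification.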

Forecast Planner

Predict upcoming spend using scheduled releases, historical volumes, and feature choices. Run “what‑if” scenarios (volume, channels, effects) to see cost impact, get alerts when plans will exceed caps, and accept auto-suggestions to split batches or switch to lighter presets to fit the budget.

Requirements

Unified Forecast Data Ingestion
"As a store owner, I want PixelLift to automatically pull my release calendar and past image volumes so that my forecasts reflect real activity and I don’t have to manually assemble data."
Description

Implements automated ingestion and normalization of all inputs required for forecasting, including scheduled batch releases from the Upload Scheduler, historical processed image volumes by channel, and preset/style usage history. Constructs a forecasting-ready time series with breakdowns by channel, preset, and image type, covering at least the past 12 months with configurable backfill. Includes data quality checks (deduplication, missing data interpolation, timezone alignment), seasonality markers (holidays, promotions), and catalog metadata linkage to ensure forecasts reflect PixelLift’s real activity patterns for each workspace. Exposes the curated dataset to the Forecast Planner via internal APIs for consistent, accurate modeling.

Acceptance Criteria
Scheduled Batch Releases Ingestion from Upload Scheduler
- Given a workspace with future scheduled releases in the Upload Scheduler and ingest_frequency=15m (default), when the ingestion job runs, then all new or updated schedules since the last watermark are ingested and available in the curated dataset within ≤15 minutes. - Given a scheduled release is canceled or its time/volume updated, when the next ingestion runs, then the curated dataset reflects the cancellation/change within ≤15 minutes and prior values are superseded (no duplicate records). - Ingested scheduled releases include fields: workspace_id, schedule_id, release_datetime_utc, source_timezone, channel, planned_image_count, version; primary key uniqueness is enforced on (workspace_id, schedule_id, version).
Historical Volumes by Channel Ingestion
- Given historical processed image events exist, when backfill runs with backfill_months=12 (default), then the curated dataset contains ≥365 days of daily processed counts per workspace and channel. - When backfill_months is set to 18, then ≥18 months of daily data are present; no unaccounted gaps exist other than flagged missing days. - For each day+channel+workspace, the aggregated processed count reconciles within ±0.5% against source event totals. - The ingestion captures fields: workspace_id, date_utc, channel, hist_processed_count; records are immutable after load except for data quality corrections logged with audit_id.
Preset and Style Usage History Ingestion
- Given processed image events with applied preset/style, when daily aggregation runs, then the curated dataset contains counts per workspace, channel, preset_id (or "none"), and image_type for the configured backfill window and ongoing. - Reprocessed images for the same asset on the same day are deduplicated: only the latest processed_at per asset_id+date_utc counts toward hist_processed_count. - Aggregated preset/style totals per day reconcile within ±1.0% of the sum of qualifying source events after deduplication.
Time Series Construction with Breakdowns and Configurable Backfill
- The curated dataset exposes a daily time series per workspace with required dimensions: date_utc, channel, preset_id, image_type; and metrics: hist_processed_count, scheduled_planned_count, promotions_flag, holiday_flag. - Backfill is configurable via backfill_months in the range [1,36]; default=12. When set, the dataset contains at least the requested number of full months where source data exists; unavailable months are flagged as missing. - Records include catalog linkage fields (e.g., catalog_category_id or collection_id) and are correctly mapped from source metadata ≥99% of the time in validation samples. - The dataset stores only observed combinations (sparse); no synthetic zero rows are persisted.
Data Quality: Deduplication, Missing Data Interpolation, and Timezone Alignment
- Deduplication: Duplicate source events (same workspace_id+source_event_id) are collapsed to a single record; dedup_rate per daily load is reported and remains <0.1% for steady-state runs. - Missing data interpolation: For gaps ≤3 consecutive days per workspace+channel, hist_processed_count is linearly interpolated; for gaps >3 days, values remain null and a data-quality alert is emitted within ≤10 minutes of detection. - Timezone alignment: All timestamps are normalized to UTC in the curated dataset using the workspace’s configured timezone; daily totals across DST boundaries match source local-date totals within ±1 event. Unit tests cover US/Eastern, Europe/Berlin, and Asia/Kolkata DST/offset transitions.
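The interpolation rule above can be sketched as follows (a minimal illustration assuming a daily series per workspace+channel represented as a list with `None` marking missing days; gaps longer than `max_gap` are left null so the alert path can handle them):

```python
def interpolate_gaps(series, max_gap=3):
    """Fill interior runs of None of length <= max_gap by linear interpolation
    between the surrounding observed values; longer runs stay None."""
    out = list(series)
    i = 0
    while i < len(out):
        if out[i] is None:
            start = i
            while i < len(out) and out[i] is None:
                i += 1
            gap = i - start
            # only interior gaps bounded by observations on both sides qualify
            if gap <= max_gap and start > 0 and i < len(out):
                lo, hi = out[start - 1], out[i]
                step = (hi - lo) / (gap + 1)
                for k in range(gap):
                    out[start + k] = round(lo + step * (k + 1), 2)
        else:
            i += 1
    return out
```

For example, `[10, None, None, 16]` fills to `[10, 12.0, 14.0, 16]`, while a four-day gap is left untouched for the data-quality alert.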
Seasonality Markers and Promotions Tagging
- Workspace locale holiday calendars are applied: for a US-locale workspace, 2024-11-28 (Thanksgiving), 2024-11-29 (Black Friday), and 2024-12-02 (Cyber Monday) are flagged appropriately (holiday_flag and/or promotions_flag) in the curated dataset. - When a workspace creates or updates a promotion window (start/end/timezone), the curated dataset sets promotions_flag=1 for all overlapping local dates within ≤15 minutes; deletions clear the flag within ≤15 minutes. - Marker accuracy is ≥99% across a 100-day stratified sample per locale against reference calendars; discrepancies are logged with audit details.
Exposure of Curated Dataset via Internal API to Forecast Planner
- An internal API endpoint GET /internal/forecasting/curated exists and requires workspace_id and date_range (start,end); supports filters channels[], presets[], image_types[]; responds with JSON and header x-dataset-version. - Performance: For queries spanning ≤180 days and returning ≤200k rows, P95 latency ≤500 ms and P99 ≤900 ms; pagination is supported via cursor with max_page_size=10,000. - Data integrity: Aggregates computed from API responses match curated storage within ±0.1% for sampled queries; schema changes are versioned and backward-compatible for minor revisions. - Security and isolation: Requests return only data for the specified workspace_id; cross-workspace access attempts are rejected (HTTP 403) or return empty results, verified via multi-tenant tests.
Pricing & Cost Engine with Preset Sensitivity
"As an operations manager, I want forecasted costs to reflect our tiers and preset mix so that budget planning aligns with what we’ll actually be billed."
Description

Builds a deterministic cost calculation engine that translates forecasted volumes into spend, accounting for tiered pricing, channel-specific rates, preset complexity (e.g., background removal, retouch strength), and discounts. Supports stepwise volume tiers per billing period, effective-date pricing catalogs, and region-specific taxes/fees. Provides an API to evaluate baseline and what-if scenarios and returns totals plus per-dimension breakdowns (by channel, preset, period). Ensures parity with the Billing service and maintains a versioned price catalog so forecasts match actual invoices.

Acceptance Criteria
Stepwise Tiered Pricing per Billing Period
Given a price catalog with tiered rates for channel "Web" and preset "Basic": 1–100 @$0.10, 101–500 @$0.08, 501+ @$0.05 within billing period 2025-09 And a forecast volume of 650 images on Web using Basic in 2025-09 When the engine evaluates baseline spend Then the total for 2025-09 equals 49.50 USD And the period breakdown shows tier allocations: 100@$0.10, 400@$0.08, 150@$0.05 And all monetary values are rounded to 2 decimals
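The tier allocation in this scenario can be reproduced with a simple stepwise walk (an illustrative sketch; the `(upper_bound, unit_price)` tuple representation is an assumption, not the engine's actual catalog schema):

```python
def tiered_cost(volume, tiers):
    """tiers: ascending list of (upper_bound or None for open-ended, unit_price).
    Returns (rounded total, per-tier allocations as (qty, price) pairs)."""
    total = 0.0
    allocations = []
    prev_upper = 0
    remaining = volume
    for upper, price in tiers:
        band = (upper - prev_upper) if upper is not None else remaining
        qty = min(remaining, band)
        if qty <= 0:
            break
        allocations.append((qty, price))
        total += qty * price
        remaining -= qty
        if upper is not None:
            prev_upper = upper
    return round(total, 2), allocations

tiers = [(100, 0.10), (500, 0.08), (None, 0.05)]  # 1–100, 101–500, 501+
total, allocs = tiered_cost(650, tiers)
# 100 @ $0.10 + 400 @ $0.08 + 150 @ $0.05 = $49.50
```

Floats are used here for brevity; a real billing engine would likely use `Decimal` with an explicit finance-policy rounding mode.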
Channel Rates and Preset Complexity Multipliers
Given base rates in the price catalog: Web.Basic = $0.06/image, Marketplace.Basic = $0.08/image And preset "Studio-Pro" has complexity multiplier 1.5x across channels And volumes: Web.Studio-Pro = 100 images; Marketplace.Basic = 50 images When the engine evaluates spend Then Web.Studio-Pro cost equals 9.00 USD (100 * 0.06 * 1.5) And Marketplace.Basic cost equals 4.00 USD (50 * 0.08) And the breakdown by channel and preset returns these line totals And the grand total equals 13.00 USD
Effective-Date Price Catalog Selection and Versioned Output
Given catalog version v2025.02 effective until 2025-02-28T23:59:59Z and v2025.03 effective from 2025-03-01T00:00:00Z And volumes: 100 images on Web.Basic in 2025-02 and 100 images on Web.Basic in 2025-03 And rates: v2025.02 Web.Basic = $0.10, v2025.03 Web.Basic = $0.12 When the engine evaluates baseline spend across both periods Then the 2025-02 total equals 10.00 USD using catalogVersionId v2025.02 And the 2025-03 total equals 12.00 USD using catalogVersionId v2025.03 And the response includes catalogVersionId per period in the breakdown
Region-Specific Taxes and Fees Application
Given region = "EU-DE" with VAT rate 19% and prices defined as pre-tax And a computed pre-tax period subtotal of 100.00 EUR When taxes and fees are applied Then tax equals 19.00 EUR and total-with-tax equals 119.00 EUR And the response includes tax details per region with type = "VAT" and rate = 0.19
Discount Ordering and Capping
Given a contract-level discount of 10% applied to the combined pre-tax subtotal before any channel promos And a promo code "MKT5" that applies $5.00 off the Marketplace channel subtotal once per period, not below zero And pre-discount, pre-tax subtotals: Web = 60.00 USD, Marketplace = 20.00 USD When discounts are applied Then the contract discount reduces the combined subtotal (80.00) by 8.00 to 72.00 USD And the discounted channel subtotals are Web = 54.00 USD and Marketplace = 18.00 USD And the promo reduces Marketplace by 5.00 to 13.00 USD (Web remains 54.00 USD) And taxes are computed after all discounts And the response shows a discount breakdown by type and channel
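The ordering in this scenario (contract discount on the combined pre-tax subtotal first, then per-channel promos floored at zero, taxes last) can be sketched as follows (illustrative; the dict shapes are assumptions and the real engine's data model may differ):

```python
def apply_discounts(subtotals, contract_rate, promos):
    """subtotals: {channel: pre-tax amount}. The contract discount applies
    pro rata to every channel first; promo amounts then reduce their channel,
    never below zero. Taxes are computed after this step."""
    discounted = {ch: round(amt * (1 - contract_rate), 2)
                  for ch, amt in subtotals.items()}
    for channel, amount_off in promos.items():
        if channel in discounted:
            discounted[channel] = max(0.0, round(discounted[channel] - amount_off, 2))
    return discounted

result = apply_discounts({"Web": 60.00, "Marketplace": 20.00},
                         contract_rate=0.10, promos={"Marketplace": 5.00})
# Web: 60 * 0.9 = 54.00; Marketplace: 20 * 0.9 = 18.00, then -5.00 = 13.00
```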
API What-If Scenario Evaluation and Delta Breakdown
Given a baseline request for period 2025-09 with Web.Basic volume = 500 images and rate = $0.08/image And a what-if request changing to volume = 700 images and preset = Studio-Pro with multiplier = 1.5x (effective rate = $0.12/image) When the API evaluates baseline and what-if and computes a delta Then baseline total equals 40.00 USD and what-if total equals 84.00 USD And delta.total equals 44.00 USD (84.00 - 40.00) And both results include breakdowns by channel, preset, and period And in each result, the sum of breakdown line totals equals the reported total within 0.01 USD
Parity With Billing Service and Determinism
Given input payload X and price catalog version v2025.09 with a Billing service invoice total of 2,345.67 USD for the same inputs and catalog When the engine evaluates payload X Then the total equals 2,345.67 USD and each breakdown line matches Billing with 0.00 tolerance And repeated evaluations with identical inputs produce identical totals and breakdowns And the response includes catalogVersionId = v2025.09 for audit parity
Scenario Builder & Comparison
"As a marketing lead, I want to run and compare multiple what-if scenarios so that I can choose the most cost-effective plan for upcoming launches."
Description

Delivers UI and API to create, edit, and save forecast scenarios with adjustable inputs: volume overrides by channel, preset/style mix, release dates, and optional effects toggles. Computes KPIs (total spend, per-image cost, variance vs. baseline, cap utilization) and supports side-by-side comparison of up to three scenarios. Includes cloning, labeling, and persistence per workspace/user, with guardrails for valid date ranges and dependencies on scheduled releases. Presents clear visualizations (trend lines, stacked bars by preset/channel) to enable quick selection of the most cost-effective plan.

Acceptance Criteria
Create & Save Scenario (UI)
Given I have edit permission in a workspace, When I open Scenario Builder and enter a unique label, select a release linked to a scheduled release, set volume overrides by channel, preset/style mix, and effects toggles, Then the Save action is enabled. Given all required inputs are valid, When I click Save, Then a scenario is persisted with owner, workspaceId, inputs, computed KPIs, createdAt, updatedAt, and it appears in the workspace list within 2 seconds p95. Given a duplicate label within the same workspace, When I attempt to Save, Then a blocking validation error is shown and the scenario is not saved. Given any required field is missing or invalid (empty label, invalid date, negative volume), When I attempt to Save, Then inline errors identify the field(s) and Save is blocked.
Edit & Clone Scenario
Given an existing scenario I own or can edit, When I modify inputs and Save, Then the scenario updates in place, updatedAt changes, KPIs recompute, and the list reflects new values. Given a scenario, When I click Clone, Then a new scenario opens in edit state with all inputs copied and the label prefilled as "<Original Label> (Copy)". Given the cloned scenario, When I Save with a unique label, Then a distinct scenario is created and listed; when I attempt to Save with a duplicate label, Then a 409/inline duplicate-label error prevents save. Given I attempt to clone or edit across a workspace I lack access to, When I submit via UI/API, Then the action is rejected with 403 and no data changes occur.
KPI Computation & Cap Utilization
Given a scenario with volumes by channel, preset/style mix, and effects toggles, When KPIs are computed, Then total spend, per-image cost, variance vs baseline, and cap utilization are calculated using the current cost model, stored to 4 decimal places and displayed rounded to 2 decimals. Given a designated baseline scenario, When viewing KPIs, Then variance absolute equals ScenarioTotal − BaselineTotal and variance percent equals (ScenarioTotal − BaselineTotal) / BaselineTotal, both displayed. Given no baseline is designated, When viewing KPIs, Then variance vs baseline displays as N/A without errors. Given cap utilization exceeds 100%, When viewing KPIs, Then utilization shows the overage (e.g., 112%) with warning styling.
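The variance rule above can be sketched as follows (a minimal illustration; `None` stands in for an undesignated baseline, which the UI would render as N/A):

```python
def variance_vs_baseline(scenario_total, baseline_total):
    """Returns (absolute, percent) variance, or (None, None) when no baseline
    is designated. Values are stored to 4 decimal places per the KPI rule;
    display rounding to 2 decimals happens in the UI layer."""
    if baseline_total is None:
        return None, None
    absolute = round(scenario_total - baseline_total, 4)
    percent = round(absolute / baseline_total, 4)
    return absolute, percent
```

For example, a scenario total of 84.00 against a 40.00 baseline yields an absolute variance of 44.00 and a ratio of 1.1 (i.e., +110%).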
Side-by-Side Comparison (up to 3 Scenarios)
Given I select up to three scenarios, When I open the Comparison view, Then all selected scenarios render with aligned KPIs and shared time axis visualizations. Given three scenarios are already selected, When I attempt to add a fourth, Then the UI prevents selection and shows "Select up to 3 scenarios". Given I remove one scenario from Comparison, When I add another, Then the view updates within 500 ms and remains aligned. Given values in Comparison, When cross-checked against each scenario’s detail, Then currency fields match within ±0.01 and percentage fields within ±0.1%.
Visualization (Trend Lines & Stacked Bars)
Given a scenario, When I view visualizations, Then a trend line presents spend over time and stacked bars present spend breakdown by preset and channel consistent with inputs. Given I toggle a channel or preset in the legend, When the view updates, Then charts and KPI totals reflect the filter without full page reload and within 300 ms for up to 24 time buckets. Given I hover or focus a data point, When a tooltip appears, Then it displays date, channel, preset, volume, unit cost, and subtotal matching underlying data within rounding rules. Given accessibility requirements, When navigating charts, Then keyboard focus order is logical, ARIA labels are present, and series colors meet 4.5:1 contrast for text/keys.
API CRUD & Validation
Given I POST /scenarios with a valid payload (label, workspaceId, inputs: volumes by channel, preset/style mix, effects, releaseDates), When processed, Then the API returns 201 with the created scenario including computed KPIs and timestamps. Given invalid input (duplicate label, out-of-range dates, negative volumes), When I POST/PATCH, Then the API returns 4xx with machine-readable codes (e.g., 409_duplicate_label, 400_invalid_date, 400_invalid_volume) and no partial writes occur. Given I GET /scenarios?workspaceId=..., When authorized, Then only scenarios in that workspace are returned, paginated and sorted by updatedAt desc by default; unauthorized access returns 403. Given I PATCH /scenarios/{id} with valid changes, When processed, Then 200 is returned and updatedAt changes; unknown id returns 404. Given I POST /scenarios/{id}/clone, When valid, Then 201 is returned with a new id and inputs copied; providing a label ensures uniqueness or a 409 is returned. Given I include an Idempotency-Key on POST, When the same request is retried within 24h, Then no duplicate scenario is created and the original resource is returned.
Guardrails for Dates & Release Dependencies
Given scheduled releases define valid date ranges, When a scenario’s release date falls outside any valid range, Then validation blocks save and explains the valid range. Given a scenario references a scheduled release that was removed or shifted, When I open or try to save the scenario, Then an "Out of sync" warning appears and I must select a valid release before saving. Given workspace timezone settings, When entering dates, Then the UI enforces the workspace timezone and prevents selecting disallowed past dates per policy. Given all dependencies are satisfied, When saving, Then the scenario saves without warnings and all visualizations/KPIs reflect the chosen release window.
Budget Caps & Proactive Alerts
"As a business owner, I want proactive alerts when my planned spend will exceed budget caps so that I can adjust before incurring unexpected costs."
Description

Enables setting monthly or custom-period budget caps at workspace/account level, with soft and hard thresholds. Validates planned releases and active scenarios against caps in real time and triggers alerts at configurable thresholds (e.g., 80%, 100%, 110%). Surfaces warnings in the Planner UI and Scheduling flow, and sends notifications via in-app, email, and Slack integrations. Provides a breach impact view that shows which channels/presets drive overage and the time window at risk, supporting quick adjustments to stay within budget.

Acceptance Criteria
Workspace Budget Cap Setup and Validation
Given I am a workspace admin When I create a monthly or custom-period budget cap with soft and hard thresholds Then the cap saves successfully, is visible in Planner and Scheduling settings, and the active period is computed using the workspace timezone And soft threshold < hard threshold and both are > 0; invalid inputs (missing period, non-numeric, soft >= hard) are blocked with inline errors And edits to cap values are applied immediately to future validations and captured in an audit entry with editor, timestamp, and before/after values And if no cap is defined for the active period, no cap-related warnings or alerts are shown anywhere
Real-Time Cap Check in Planner Scenarios
Given an active budget cap exists for the workspace/account And I am in the Planner adjusting volumes, channels, or presets for a scenario When the projected spend is recalculated Then the utilization vs cap updates within 500 ms at P95 and shows current %, projected spend, remaining budget, and overage (if any) And if utilization >= configured soft threshold (e.g., 80%), an inline warning appears with utilization % and overage amount and a link to the breach impact view And if utilization >= hard threshold (e.g., 100%), the Schedule/Commit action is disabled with an error explaining the cap breach and linking to adjustment options
Threshold Alerts and Notifications (In-app, Email, Slack)
Given utilization crosses a configured threshold (e.g., 80%, 100%, 110%) for the first time in the current period When the crossing event is detected Then send one in-app alert, one email, and one Slack message within 60 seconds including: period label, cap amount, consumed amount, projected amount, utilization %, overage amount, and threshold crossed And do not resend for the same threshold again in the same period unless the next higher threshold is crossed or the period resets And if Slack delivery fails, email and in-app still deliver and an integration error is logged for review
Configurable Threshold Levels per Workspace
Given default thresholds are 80/100/110 When I open Cap Settings as a workspace admin Then I can add, remove, and reorder threshold levels (integer percentage 1–300, ascending, max 5 total) And invalid inputs (duplicates, non-integer, out of range, descending order) are blocked with inline errors And when I save, the new thresholds persist and take effect for all subsequent validations and alerts within 10 seconds
Breach Impact View: Drivers and Time Window
Given a scenario or schedule projects a cap breach When I open the breach impact view from the warning/alert Then I see: active period window, cap amount, projected spend, utilization %, and overage amount And I see a ranked list of drivers by channel and preset with each driver’s projected cost and contribution %, covering at least 95% of the overage total And I can sort by projected cost or contribution % And after I adjust plan parameters and reopen the view, figures update within 1 second and match Planner totals within ±1%
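The ranked-driver list can be sketched as follows (illustrative; here "contribution" is computed as each driver's share of total projected cost and the list is truncated once cumulative coverage reaches 95% — one possible reading of the coverage rule, not a confirmed formula):

```python
def rank_overage_drivers(driver_costs, coverage=0.95):
    """driver_costs: {(channel, preset): projected_cost}. Returns drivers
    ranked by projected cost with contribution %, truncated once cumulative
    contribution covers the `coverage` fraction."""
    total = sum(driver_costs.values())
    ranked = sorted(driver_costs.items(), key=lambda kv: kv[1], reverse=True)
    out, covered = [], 0.0
    for key, cost in ranked:
        share = cost / total
        out.append({"driver": key, "projected_cost": cost,
                    "contribution_pct": round(share * 100, 1)})
        covered += share
        if covered >= coverage:
            break
    return out
```

Sorting by contribution % is equivalent to sorting by cost here; the UI's alternate sort order would only reorder the already-selected rows.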
Active Run Monitoring and Over-Cap Detection
Given releases are actively running under an active cap When actual spend to date plus remaining forecast crosses a configured threshold Then trigger alerts as per notification rules and show current utilization and projected overage in the Planner and Scheduling views And if utilization exceeds the hard threshold during an active run, jobs are not auto-cancelled; a high-severity warning is shown with overage estimates and a link to adjust or pause upcoming batches And when the cap period resets, utilization returns to 0%, historical alerts archive, and new crossings trigger fresh alerts
Account vs Workspace Cap Precedence and Enforcement
Given an account-level cap and a workspace-level cap both apply to the same period When validating a plan or active run in that workspace Then enforcement uses the stricter effective remaining budget (the lower of the two) And alerts indicate which cap(s) are at risk and show remaining budget for each And if only one cap exists, only that cap is enforced and referenced in warnings/alerts
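The stricter-cap rule can be sketched as follows (illustrative; `None` models an undefined cap, and the scope label drives which cap the warning references):

```python
def effective_remaining(account_cap, account_spent, workspace_cap, workspace_spent):
    """Enforce the stricter of account- and workspace-level caps: the lower
    remaining budget wins. A missing cap (None) imposes no limit; with no
    caps at all, (None, None) signals that no enforcement applies."""
    remains = []
    if account_cap is not None:
        remains.append(("account", account_cap - account_spent))
    if workspace_cap is not None:
        remains.append(("workspace", workspace_cap - workspace_spent))
    if not remains:
        return None, None
    scope, remaining = min(remains, key=lambda r: r[1])
    return scope, remaining
```

For example, with an account cap of 1000 (700 spent) and a workspace cap of 500 (300 spent), the workspace cap is stricter: 200 remaining versus 300.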
Auto Optimization Suggestions
"As a boutique owner, I want PixelLift to suggest batch splits or lighter presets to meet my budget so that I save time and maintain acceptable image quality."
Description

Generates automated recommendations to keep plans within budget with minimal quality impact, including splitting batches across periods, shifting channel volumes, and switching to lighter presets where allowed. Uses a constraint solver that respects deadlines, channel-specific style requirements, and minimum quality rules defined by the brand. Displays estimated savings, quality impact, and timeline changes, with one-click apply to update the active scenario. Provides explainability (what changed and why) and allows rollback of applied suggestions.

Acceptance Criteria
Suggest When Budget Cap Exceeded
Given an active scenario with forecasted spend exceeding the budget cap by any amount When Auto Optimization is run Then at least one suggestion is produced that, if applied, reduces forecasted spend to ≤ the cap without violating deadlines, channel style requirements, or minimum quality rules And each suggestion includes estimated savings in currency and percentage, estimated quality impact score change, and any timeline change in days And suggestions are generated within 5 seconds for plans ≤ 5,000 items and within 10 seconds for plans ≤ 20,000 items
One-Click Apply Updates Active Scenario
Given a user viewing optimization suggestions When the user clicks Apply on a suggestion Then the active scenario is updated atomically to reflect the suggestion’s changes to batch splits, channel volumes, and preset selections And the scenario’s total spend, quality score, and timeline metrics update immediately and remain within constraints And a success banner summarizing savings, quality impact, and timeline change is displayed And a version snapshot with a diff is stored and labeled with user, timestamp, and suggestion ID
Explainability: What Changed and Why
Given a generated optimization suggestion When the user expands its details Then the UI displays a rationale referencing the budget gap and constraints considered And shows a diff listing each change with before/after values for spend, quality score, volume, and dates And provides links to affected batches and channels And enumerates any assumptions or trade-offs
Constraint Compliance and No-Feasible-Solution Handling
Given a plan where meeting the budget cap would require violating minimum quality, channel style, or deadlines When Auto Optimization is run Then no suggestion is produced that violates constraints And a "No feasible optimization" message is displayed identifying the blocking constraints And the runtime remains within 10 seconds for plans ≤ 20,000 items
Rollback of Applied Suggestions
Given a suggestion has been applied to the active scenario When the user clicks Rollback Then the scenario reverts to the exact prior version with identical batch IDs, channel allocations, presets, and totals And forecasts and KPIs recompute within 3 seconds for plans ≤ 5,000 items And an audit log entry records the rollback with user, timestamp, and reason
Suggestion Coverage: Split, Shift, or Switch
Given an over-cap plan with flexible schedule, channel volumes, and preset allowances When Auto Optimization is run Then the suggestion set includes at least one of each applicable action type: split batches across periods, shift channel volumes, and switch to lighter presets where permitted by brand rules And inapplicable action types are omitted and labeled as not applicable with reason codes And each suggestion indicates estimated delivery impact in days and per-channel effects
Assumptions Management & Versioning
"As an operations lead, I want versioned assumptions with a clear change history so that my team can audit forecasts and confidently reuse scenarios."
Description

Introduces structured management of scenario assumptions (growth rates, seasonality multipliers, channel mix, acceptance rates, price catalog version, holidays). Versions every scenario change, logging editor, timestamp, and rationale, with the ability to add notes and revert to prior versions. Displays assumption diffs and their impact on spend to improve auditability and team collaboration. Ensures consistent forecasting by tying each scenario to explicit, reviewable inputs.

Acceptance Criteria
Create Baseline Assumptions Version
Given an authenticated user creates a new scenario with no existing versions When they populate all required assumption fields (growth rates, seasonality multipliers, channel mix, acceptance rates, price catalog version, holiday calendar) and click Save Then versionNumber = 1 is created with a unique versionId, editor identity, ISO 8601 timestamp, and a read-only snapshot of all fields And the version appears in the history list labeled "v1 (current)" And the saved values exactly match the entered values (no unintended rounding or coercion beyond defined display formatting)
Assumption Change Creates New Version With Mandatory Rationale
Given a scenario has at least one existing version When a user modifies any assumption field and attempts to Save Then the system requires a non-empty rationale of at least 10 characters before allowing Save And upon Save, a new version is created with versionNumber incremented by 1, capturing full snapshot and diff of changed fields And the change log records editor, timestamp, rationale text, and list of changed fields And earlier versions remain immutable and visible in history
View Diff Between Versions With Spend Impact
Given a scenario contains two or more versions and a forecast period is selected When the user selects version A and version B to compare Then the diff view lists each changed assumption with old value, new value, and delta (absolute and % where applicable) And the system displays the projected spend delta for the selected period attributable to the assumption changes, including currency and +/− sign And the user can toggle to see per-dimension impact (e.g., by channel) if applicable And the diff can be exported as CSV including all displayed fields
Revert Scenario To Prior Version
Given a scenario contains multiple versions When the user selects version N and clicks Revert and provides a rationale Then the system creates a new version N+1 whose snapshot exactly equals version N's snapshot and marks it as current And the change log entry states "Reverted to vN" with editor, timestamp, and rationale And no historical versions are altered or deleted
Collaborative Notes Attached To Versions
Given a version detail view is open When a user adds a note (1–1000 characters) and clicks Save Then the note is stored with author identity and ISO 8601 timestamp and is attached to that specific version And notes display in reverse chronological order under the version And blank notes or notes exceeding 1000 characters are rejected with a validation message
Forecast Runs Pinned To Explicit Assumptions Version
Given a user runs a forecast from a scenario with a current version vX When they click Run Forecast Then the forecast request includes versionId of vX and the results store a reference to vX And the results view displays the version label (e.g., "Assumptions vX") with timestamp And subsequent creation of vX+1 does not change the stored results, which remain linked to vX
Validation And Integrity Rules For Required Assumptions
Given a user attempts to Save assumptions or Run Forecast When any required assumption is missing or invalid (e.g., growth rates not numeric 0–500%, channel mix not totaling 100% ±0.1, acceptance rates outside 0–100%, price catalog version or holiday calendar not selected) Then the system blocks the action, highlights offending fields with messages, and keeps Save/Run disabled And no new version is created And errors clear immediately once inputs are corrected
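The validation rules above (numeric ranges, channel mix totaling 100% ±0.1, required selections) can be sketched as a single pass that returns per-field errors; field names here are illustrative, not the product's actual schema:

```python
def validate_assumptions(a: dict) -> dict:
    """Return {field: message} for every invalid field; empty dict means valid."""
    errors = {}
    g = a.get("growth_rate_pct")
    if not isinstance(g, (int, float)) or not (0 <= g <= 500):
        errors["growth_rate_pct"] = "must be numeric, 0-500%"
    mix = a.get("channel_mix_pct", {})
    if abs(sum(mix.values()) - 100) > 0.1:
        errors["channel_mix_pct"] = "must total 100% (±0.1)"
    acc = a.get("acceptance_rate_pct")
    if not isinstance(acc, (int, float)) or not (0 <= acc <= 100):
        errors["acceptance_rate_pct"] = "must be 0-100%"
    for ref in ("price_catalog_version", "holiday_calendar"):
        if not a.get(ref):
            errors[ref] = "selection required"
    return errors
```

Save/Run stays disabled while the returned dict is non-empty, and no version is created.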

Seat Flex

Only pay for the seats you use. Seats prorate by day, auto‑park inactive users after a chosen idle period, and offer temporary burst seats for peak weeks. Add viewer‑only or approve‑only roles at low or no cost to keep collaboration high and wasted seat spend low.

Requirements

Daily Prorated Seat Billing
"As a workspace billing admin, I want seat charges to prorate by the exact days used so that my team only pays for what we actually consume and finance reports reconcile cleanly."
Description

Implement a billing engine that calculates seat charges at daily granularity for monthly and annual plans. The service must apply immediate prorations for mid-cycle seat additions/removals and role downgrades/upgrades (e.g., editor to viewer), generate itemized invoice lines, handle multiple currencies and tax rules, and provide accurate cost previews before changes are confirmed. It integrates with the existing payment gateway, maintains an auditable, idempotent ledger of adjustments, respects the account’s time zone for day boundaries, and exposes a read API for finance reporting. Edge cases include retroactive seat backfills, plan changes mid-cycle, refunds/credits for early removals, and rounding rules consistent with finance policy.

Acceptance Criteria
Mid-cycle Editor Seat Addition — Monthly Plan Proration with Cost Preview and Immediate Charge
Given an account on a monthly plan in America/New_York with cycle 2025-09-01 00:00 to 2025-09-30 23:59 local, editor price USD 30.00/month, no tax And rounding is to 2 decimals, half-up, and billable days are calendar days the seat is active for any portion When an admin adds 1 editor seat on 2025-09-17 15:20 local and opens the cost preview Then the preview shows a prorated charge for 14 billable days (2025-09-17 through 2025-09-30) at daily rate 30.00/30 = 1.00, subtotal USD 14.00, tax USD 0.00, total USD 14.00 And the preview itemizes: role=editor, quantity=1, service period=2025-09-17–2025-09-30, currency=USD When the admin confirms, the system posts a ledger debit and an invoice line matching the preview amounts and service period exactly And the payment gateway captures a USD 14.00 charge; the invoice is marked paid And the read API returns the ledger entry and invoice line within the account within 60 seconds of confirmation
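The proration rule in this scenario (billable days are calendar days the seat is active for any portion; 2 decimals, half-up) can be sketched as follows; the function name and signature are illustrative:

```python
from datetime import date
from decimal import Decimal, ROUND_HALF_UP

def prorated_charge(monthly_price: str, cycle_start: date,
                    cycle_end: date, active_from: date) -> Decimal:
    """Charge for a seat added mid-cycle: daily rate x billable days,
    rounded to 2 decimals half-up per the stated finance policy."""
    total_days = (cycle_end - cycle_start).days + 1     # cycle length, inclusive
    billable = (cycle_end - active_from).days + 1       # activation day counts
    daily = Decimal(monthly_price) / total_days
    return (daily * billable).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

For the scenario's numbers (USD 30.00/month, seat added 2025-09-17), this yields the expected USD 14.00.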
Mid-cycle Editor Seat Removal — Prorated Credit/Refund Respecting Local Day Boundaries
Given an account as above with an editor seat active on 2025-09-01–2025-09-30 local And the rounding policy is 2 decimals, half-up, and billable days are any calendar day with seat activity When the admin removes the seat on 2025-09-20 10:00 local and opens the cost preview Then the preview shows a prorated credit for 10 unused days (2025-09-21 through 2025-09-30) at daily rate 30.00/30 = 1.00, subtotal USD -10.00, tax USD 0.00, total USD -10.00 And the day of removal (2025-09-20) remains billed; no credit is applied for that day When the admin confirms, the system posts a ledger credit and an invoice line matching the preview and updates the account balance by USD -10.00 And if the account has no outstanding balance and auto-refund is enabled, a USD 10.00 refund is created via the payment gateway and linked to the invoice
Role Downgrade Editor→Viewer Mid-Cycle — Proration and Itemized Invoice Lines
Given an account on a 30-day monthly cycle with editor price USD 30.00/month and viewer price USD 5.00/month, rounding 2 decimals half-up And an editor seat is downgraded to viewer on 2025-09-10 09:00 local within the 2025-09-01–2025-09-30 cycle When the admin opens the cost preview Then the preview itemizes two lines: credit for editor unused portion 21 days × (30/30) = USD -21.00, and charge for viewer remaining portion 21 days × (5/30) = USD 3.50 And the preview shows net total USD -17.50 with service period 2025-09-10–2025-09-30 and role change details When confirmed, the ledger records two entries (one credit, one debit) with an idempotency key; re-submitting with the same key results in no duplicate entries And the invoice lines exactly match the preview amounts and periods
Plan Change Mid-Cycle — Old Plan Credit, New Plan Debit, Idempotent Ledger
Given an account with 10 editor seats on Plan A (monthly price_per_seat_A) switching to Plan B (annual price_per_seat_B) on 2025-09-15 12:00 local And the system computes proration at daily granularity respecting the account time zone and currency minor units When the admin opens the cost preview Then the preview shows for all active seats: a credit equal to (unused_days_in_current_month/total_days_in_current_month) × (10 × price_per_seat_A) and a debit equal to (remaining_days_in_new_annual_period/total_days_in_new_annual_period) × (10 × price_per_seat_B) And both lines show service periods, roles, quantities, currency, taxes, and rounded totals consistent with the rounding policy When confirmed with idempotency key "plan-change-2025-09-15", the ledger creates exactly two entries (one credit, one debit); retrying with the same key creates no additional entries; submitting the same financial event under a different key is flagged as a duplicate and rejected And the read API exposes both entries filterable by date range and idempotency key
Multi-Currency and Tax — Accurate VAT/GST Application and Rounding
Given an EU account billed in EUR with VAT 20% and editor price EUR 25.00/month in a 30-day month When 1 editor seat is added on 2025-09-21 local and the preview is opened Then the preview shows base = (25.00/30) × 10 = EUR 8.33 (rounded), tax = 20% of 8.33 = EUR 1.67 (rounded), total = EUR 10.00 And the invoice itemizes currency=EUR, tax jurisdiction, tax rate, tax amount, and base amount And for a JP account billed in JPY with 0 decimal minor units and consumption tax 10%, when the same action occurs with price JPY 3000/month, the totals are base JPY 1000, tax JPY 100, total JPY 1100 with correct 0-decimal rounding
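The per-currency rounding above can be sketched with an explicit minor-units table (the currency subset is an assumption); base, tax, and total are each rounded to the currency's minor unit, half-up:

```python
from decimal import Decimal, ROUND_HALF_UP

MINOR_UNITS = {"USD": 2, "EUR": 2, "JPY": 0}  # assumed subset of supported currencies

def round_currency(amount: Decimal, currency: str) -> Decimal:
    """Quantize to the currency's minor unit (e.g., 0.01 for EUR, 1 for JPY)."""
    exp = Decimal(1).scaleb(-MINOR_UNITS[currency])
    return amount.quantize(exp, rounding=ROUND_HALF_UP)

def tax_line(base_raw: Decimal, rate: Decimal, currency: str):
    """Round the base first, then compute tax on the rounded base, per the scenario."""
    base = round_currency(base_raw, currency)
    tax = round_currency(base * rate, currency)
    return base, tax, base + tax
```

This reproduces both worked examples: EUR 8.33 / 1.67 / 10.00 and JPY 1000 / 100 / 1100 with 0-decimal rounding.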
Retroactive Seat Backfill in Current Cycle — Catch-up Charges and API Exposure
Given today is 2025-09-25 local within a 2025-09-01–2025-09-30 cycle and editor price USD 30.00/month When an admin backfills 1 editor seat with a start date of 2025-09-05 Then the preview shows a catch-up charge for 20 billable days (2025-09-05 through 2025-09-24) at daily rate 30/30 = USD 1.00, total USD 20.00, itemized with the backdated service period When confirmed, the ledger records a debit with service period 2025-09-05–2025-09-24 and the invoice includes a corresponding line And the finance read API returns this adjustment when filtered by service_period_start=2025-09-01..2025-09-30 and entry_type=proration
Gateway Integration & Preview Parity — Single Capture on Retry and Exact Match to Preview
Given a seat change that produces a preview total of USD 14.00 and an idempotency key "op-123" When the confirmation is submitted and a transient network failure occurs after the ledger write but before gateway acknowledgment Then the system retries the payment with the same idempotency key and captures exactly one USD 14.00 charge at the gateway And the ledger contains exactly one financial event for "op-123" and the invoice total equals the preview total exactly (difference = 0.00) And all events are audit-stamped with user, timestamp, idempotency key, and request payload hash
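The single-capture-on-retry behavior rests on an idempotent ledger keyed by the operation's idempotency key. A minimal in-memory sketch (a real implementation would persist keys and enforce uniqueness at the database layer):

```python
class Ledger:
    """Idempotent ledger sketch: at most one financial event per key."""

    def __init__(self):
        self._events = {}  # idempotency_key -> recorded event

    def post(self, key: str, event: dict) -> dict:
        if key in self._events:
            # Replay: return the original event; no duplicate entry, no second charge.
            return self._events[key]
        self._events[key] = event
        return event
```

Retrying "op-123" after a transient failure replays the stored event, so the gateway sees exactly one capture and the invoice total equals the preview total.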
Configurable Idle Auto‑Park
"As a workspace owner, I want inactive users to auto‑park after a chosen idle period so that I don’t pay for unused seats without manually auditing activity."
Description

Provide an automated mechanism that detects user inactivity based on last meaningful activity signals (login, job submission, approval action, asset download/API token usage). After a configurable idle period (e.g., 7–60 days) at the workspace level, the system moves the user to a Parked state that preserves access to view personal/profile info but disables paid editing capabilities. Auto‑park triggers notifications to the user and admins, supports one‑click unpark by admins, and optionally auto‑unparks on user return if a free seat is available. The feature must exclude viewer/approver light roles by default, log all state transitions for audit, and ensure no active batch jobs are orphaned; queued jobs from parked users are paused with a clear recovery path. All behavior is available via UI and API and is resilient to time zone differences.

Acceptance Criteria
Idle Auto‑Park Enforced by Configurable Workspace Period
Given a workspace owner sets the idle auto‑park threshold to N days between 7 and 60 and the workspace time zone to TZ And a paid user has no meaningful activity for N consecutive days (activity timestamps stored in UTC and evaluated in TZ) When the idle evaluation job runs Then the user’s state changes to Parked within 15 minutes of threshold crossing And the user’s paid seat is released to the free pool immediately And the state change timestamp is recorded in UTC and displayed in TZ in the UI
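The idle check above can be sketched as a calendar-day comparison in the workspace time zone, with activity timestamps stored in UTC; names and the calendar-day interpretation are illustrative:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def should_park(last_activity_utc: datetime, now_utc: datetime,
                threshold_days: int, tz: str) -> bool:
    """Park once the user has been idle threshold_days calendar days,
    evaluated in the workspace time zone (timestamps stored in UTC)."""
    if not 7 <= threshold_days <= 60:
        raise ValueError("threshold must be 7-60 days")
    z = ZoneInfo(tz)
    idle_days = (now_utc.astimezone(z).date()
                 - last_activity_utc.astimezone(z).date()).days
    return idle_days >= threshold_days
```

Converting both timestamps to the workspace zone before taking dates is what makes the evaluation time-zone-correct for users whose UTC timestamps straddle a local midnight.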
Meaningful Activity Signals Reset Idle Timer
Given the system tracks meaningful activity types: interactive login, job submission, approval action, asset download, and API token usage When any of these activity types occur for a user Then the user’s idle timer is reset to the activity timestamp And non‑qualifying events (e.g., email opens, failed logins, UI navigation without action) do not reset the timer And activities performed via API and via UI are treated equivalently for idle reset
Viewer/Approver Light Roles Excluded from Auto‑Park
Given a user has role Viewer or Approver (light roles) When the idle evaluation job runs Then the user is not auto‑parked regardless of inactivity duration And if the user’s role changes from light to a paid editing role, the idle timer starts from the role‑change timestamp
Parked State Preserves Profile Access and Disables Paid Editing
Given a user is in Parked state Then the user can sign in and view personal/profile information and billing receipts And the user cannot access paid editing capabilities (retouch, background removal, style presets, batch job submission) And UI controls for paid editing are disabled or replaced with a prompt explaining Parked status And API calls that initiate paid editing return HTTP 403 with error code USER_PARKED and do not execute
Auto‑Park Notifications and Audit Logging
Given the system auto‑parks a user due to inactivity Then an email and in‑app notification are sent to the user and all workspace admins within 5 minutes, including: reason (idle N days), park time (in TZ), and unpark instructions And an audit log entry is created with user ID, previous state, new state, trigger (idle), actor (system), correlation ID, and timestamps in UTC And duplicate notifications are suppressed for the same park event And audit log entries are retrievable via UI and API with consistent details
Admin One‑Click Unpark and Optional Auto‑Unpark on User Return
Given a user is Parked and at least one paid seat is available When an admin clicks Unpark in the UI or calls POST /v1/users/{userId}/unpark Then the user returns to Active within 1 minute and regains paid editing capabilities And the action is recorded in the audit log with admin actor and timestamp And if no paid seat is available, the request is blocked with HTTP 409 SEAT_UNAVAILABLE and no state change occurs Given auto‑unpark on return is enabled for the workspace When the Parked user logs in or performs a meaningful activity and a free seat exists Then the system auto‑unparks the user and sends confirmations to the user and admins; if no seat exists, the user remains Parked and is shown a request-a-seat flow
Job Safety: Active Jobs Continue; Queued Jobs Paused with Recovery
Given a user is auto‑parked Then all active batch jobs continue to completion without error or cancellation And all queued jobs for the user transition to status "Paused — Parked User" And the UI shows a Resume button for each paused job; the API exposes POST /v1/jobs/{jobId}/resume which succeeds after unpark And while Parked, attempts to enqueue new jobs are blocked with clear messaging (UI) and HTTP 403 USER_PARKED (API) And upon unpark, the user can resume paused jobs with a single action, and job state transitions are logged for audit
Temporary Burst Seats Scheduling
"As an admin, I want to schedule temporary burst seats for peak weeks so that my team can scale up editing without committing to permanent seats."
Description

Enable admins to pre-schedule temporary burst seats for specified date ranges to cover peak periods (e.g., product drops, seasonal sales). Burst seats are allocated instantly during the window, billed per day at a defined burst rate, and automatically expire at the end of the window. The system supports caps, approval workflows, and cost previews, and blocks overage with clear prompts to extend or purchase additional capacity. Usage is tracked in real time; when burst seats are exhausted, new invites/concurrent editors are limited according to policy. Integrates with SSO/invite flows, honors role-based permissions, exposes utilization metrics and exports, and reconciles billing with daily prorations.

Acceptance Criteria
Schedule Activation Within Date Window
Given an admin schedules burst seats with start_date and end_date in the account timezone and a seat_cap > 0 When the system clock reaches 00:00 on start_date in the account timezone Then the additional burst seats equal to seat_cap become available for assignment immediately without requiring a restart Given the burst window is active When the system clock reaches 23:59:59 on end_date in the account timezone Then the burst seats expire automatically and the available seat count reverts to baseline Given overlapping burst schedules are active When capacity is calculated Then available burst capacity equals the sum of active schedules' seat_caps Given a scheduled window has not yet started When the admin cancels it Then no charges are incurred and the schedule does not activate Given a scheduled window is active When the admin cancels it mid-window Then burst seats deactivate immediately and only days up to cancellation are billable
Daily Burst Billing and Proration Reconciliation
Given a proposed schedule with seat_cap S, daily_burst_rate R, and N active days When the admin opens the cost preview Then the system displays Estimated Cost = S * R * N in the billing currency with the date range and assumptions Given a schedule is active When a day completes Then a daily burst line item for that day is recorded with quantity S, unit price R, and subtotal S*R, visible in billing logs within 24 hours Given a schedule is edited to change dates or seat_cap When billing is reconciled Then only the actual active days and seat_caps in effect per day are charged, and prior estimates are adjusted with clear audit entries Given no burst usage occurs on days outside the window When invoices are generated Then no burst charges appear for days before start_date or after end_date
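The Estimated Cost formula above (S × R × N, with N counting both endpoint days) can be sketched as:

```python
from datetime import date
from decimal import Decimal

def burst_estimate(seat_cap: int, daily_rate: Decimal,
                   start: date, end: date) -> Decimal:
    """Estimated Cost = seat_cap * daily_rate * active_days (endpoints inclusive).
    Validation of end >= start and seat_cap > 0 mirrors the scheduling rules."""
    if end < start or seat_cap <= 0:
        raise ValueError("end must not precede start; seat_cap must be positive")
    n_days = (end - start).days + 1
    return Decimal(seat_cap) * daily_rate * n_days
```

Reconciliation then bills only the days actually active (e.g., up to a mid-window cancellation), replacing this estimate with per-day line items.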
Cap Enforcement and Overage Blocking
Given a burst schedule with seat_cap S and U used burst seats When U < S Then invites and seat upgrades can proceed until U reaches S Given U == S When an admin attempts to invite an additional editor or upgrade a viewer Then the action is blocked and a prompt offers options: Extend Window, Increase Cap (if permitted), or Purchase Additional Capacity Given U > S due to a cap reduction attempt When the admin saves the reduced cap Then the save is prevented with an error explaining current usage must be reduced or a later effective date chosen Given a blocking prompt is shown When the admin selects Extend Window Then the date picker opens pre-filled with the current window and displays an updated cost preview before confirmation
Approval Workflow with Cost Preview
Given org policy requires approval when Estimated Cost exceeds threshold T or seat_cap exceeds threshold S_max When an admin submits a schedule meeting those conditions Then an approval request is created, the schedule enters Pending Approval status, and no activation or billing occurs until approval Given a Pending Approval schedule When an approver reviews it Then they can Approve or Reject after seeing the cost preview and policy flags; approval instantly schedules activation; rejection cancels with no charges Given an approved schedule When the admin edits seat_cap, dates, or rate Then the change triggers re-approval if thresholds are crossed; otherwise the schedule updates without requiring re-approval, and cost preview reflects changes Given an approval request is not acted on within SLA X hours When escalation rules run Then the request escalates to the next approver and notifications are sent
Real-Time Usage Tracking and Exhaustion Behavior
Given a burst window is active When seats are consumed or released Then the usage dashboard updates utilized and available counts within 60 seconds, and an event is logged with timestamp and actor Given burst seats are exhausted When a new editor attempts to start a concurrent editing session beyond the allowed editors Then the session is blocked with a message stating "Burst Seats Exhausted" and suggestions to extend or purchase more Given burst seats are exhausted When an admin attempts to send a new editor invite requiring a seat Then the invite flow blocks assignment and presents options consistent with overage policy (extend window, increase cap, purchase capacity) Given burst seats are replenished by extending the window or increasing cap When the action completes Then previously blocked actions can proceed without reloading the admin console
Role-Based Permissions and SSO/Invite Flow Integration
Given role-based permissions are enforced When a Billing Admin or Workspace Admin opens Seat Flex settings Then they can create, edit, and cancel burst schedules; Approvers can only approve/reject; other roles cannot create schedules and see a permission error Given SSO just-in-time provisioning is enabled and a burst window is active When a new user signs in via SSO and baseline seats are full Then a burst seat is auto-assigned if available; if not, the user is provisioned as viewer-only (if allowed) or blocked with a clear message Given an invitation is sent via the Invite flow during an active window When the invite requires a seat Then the system allocates from burst seats first (if available), otherwise follows overage blocking and approval policies, showing the cost preview to the inviter if they have permission
Utilization Reporting and Export
Given an admin selects a date range and clicks Export When the export is generated Then a CSV is downloaded containing columns: date, schedule_id, schedule_name, seat_cap, used_burst_seats, available_burst_seats, utilization_pct, daily_rate, billed_amount, currency, timezone, approval_status Given days with no active burst window exist within the range When the export is generated Then those dates appear with zeros for used_burst_seats and billed_amount Given multiple overlapping schedules exist When the export is generated Then rows are produced per schedule per day, and a summary row per day aggregates totals with utilization_pct = total_used/total_available rounded to one decimal place Given the admin opens the analytics dashboard When charts load Then utilization metrics (daily utilization, peak concurrent, spend to date) reflect the last 30 days and match the CSV export values within 1%
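The utilization_pct column can be computed as sketched below, guarding the zero-capacity case for days with no active burst window (that guard is an assumption; the spec does not define it):

```python
def utilization_pct(total_used: int, total_available: int) -> float:
    """Percentage utilization rounded to one decimal place, per the export spec."""
    if total_available == 0:
        return 0.0  # assumed behavior for days with no active burst window
    return round(total_used / total_available * 100, 1)
```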
Low‑Cost Viewer & Approver Roles
"As a creative lead, I want low‑cost viewer/approver roles so that stakeholders can collaborate and approve work without paying for full editing seats."
Description

Introduce two light roles—Viewer (read-only access to assets, presets, and job status) and Approver (can review and approve/reject batches and leave feedback without initiating edits). These roles are free or discounted relative to full editor seats and do not consume paid editing capacity. Implement a clear permissions matrix across web and API, including gated actions, watermark or download restrictions as configured, and safe escalation paths to convert a light role to a full editor with a cost preview and proration. Ensure role assignment is available in bulk, supports SSO group mapping, and is fully auditable with reversible changes.

Acceptance Criteria
Viewer Read-Only Access Across Web and API
- Given an active Viewer user, when they access assets, presets, and job status via web UI or GET API endpoints, then they can view metadata, thumbnails, and non-downloadable previews without error and with p95 response time ≤ 300 ms for cached assets.
- Given a Viewer attempts any edit, upload, delete, approve/reject, preset modification, or POST/PATCH/DELETE API call, when the request is made, then the action is blocked and a 403 is returned with error code PERMISSION_DENIED and no data is mutated.
- Given organization download policy is "watermark only for light roles", when a Viewer attempts to download, then only watermarked versions are downloadable and original downloads are hidden or disabled in UI and API.
- Given audit logging is enabled, when a Viewer performs any view or blocked action, then an audit record is written with userId, role, action, resourceId, timestamp (UTC), and outcome (allowed or blocked).
Approver Review and Feedback Without Edit Permissions
- Given an Approver opens a batch awaiting approval, when they view images, then previews load with the configured watermark and metadata, and edit tools, style presets, and reprocessing actions are not visible or are disabled.
- Given an Approver submits Approve, when confirmation occurs, then the batch status transitions to Approved, a notification is sent to the batch owner and watchers within 60 seconds, and an audit record is created.
- Given an Approver submits Reject, when they do not enter a rejection reason of at least 10 characters, then the submission is blocked with inline validation; when a valid reason is provided, then the status transitions to Changes Requested, the comment is saved, and notifications are sent within 60 seconds.
- Given an Approver uses the API, when they call POST /batches/{id}/approve with a valid token, then the request succeeds with 200 and updates status; when they call any edit/job creation endpoints, then the request is rejected with 403 and no job is created.
Light Roles Do Not Consume Paid Editing Capacity
- Given the billing usage report for the current cycle, when only Viewer and Approver users are active, then paid Editor seat count remains unchanged and additional seat charges for light roles are $0.00 or the configured discounted rate, and editor capacity metrics are unaffected.
- Given 100 concurrent light role sessions, when users perform allowed actions, then no "seat capacity reached" events are logged and job processing concurrency limits are unchanged.
- Given a light role user attempts to start a new edit job or re-run a job, when triggered via UI or API, then the action is blocked with 403 and no job or rerun record is created.
- Given invoice generation at period close, when line items are produced, then Editor seats and Light roles are listed separately with quantities and unit prices (Editor at standard rate; Light roles at $0 or configured discount) and totals reflect zero consumption of paid editing capacity by light roles.
Role Escalation to Full Editor with Cost Preview and Proration
- Given an org admin selects Upgrade to Editor for a Viewer or Approver, when the dialog opens, then it displays unit price, proration by remaining days in the billing period, effective date/time (immediate), and projected first charge amount in org currency using: prorated_amount = unit_price × remaining_days ÷ total_days.
- Given the admin confirms upgrade, when they click Confirm, then the user role updates within 5 seconds without requiring re-login and a billing event is queued with an amount equal to the displayed proration (±$0.01 tolerance), and an audit entry is recorded.
- Given the admin cancels, when the dialog is closed without confirmation, then no role change or billing event occurs.
- Given the API upgrade endpoint is called with preview=true, when processed, then a non-mutating 200 response returns the proration details and projected amount; when called with preview=false, then the role updates and the charged amount matches the most recent preview (±$0.01).
- Given a 30-day period with 10 days remaining and unit price $30, when preview is requested, then projected amount equals $10.00.
Bulk Role Assignment and Reversal with Full Audit Trail
- Given an admin uploads a CSV mapping up to 1,000 users to Viewer or Approver, when processed, then at least 95% complete within 2 minutes and a summary report shows successes, failures, and per-row failure reasons.
- Given bulk assignment via API with an Idempotency-Key header, when the same request is retried within 24 hours, then no duplicate changes are applied and the original result is returned.
- Given any role change (assign, remove, upgrade, downgrade), when the change completes, then an audit record is created capturing actor, source (UI, API, SSO), beforeRole, afterRole, timestamp (UTC), correlationId, and is exportable and queryable for 1 year.
- Given an incorrect bulk assignment within the last 7 days, when an admin selects Revert for selected users, then prior roles are restored and a reversal audit entry links to the original change.
SSO Group Mapping to Viewer/Approver with Conflict Handling
- Given SSO is enabled with group-to-role mappings, when a user signs in via SSO, then their role is set to the mapped light role within 5 seconds of login and reflected in the UI and API tokens.
- Given both SSO mapping and manual assignment exist, when Enforce SSO is OFF, then manual role prevails; when Enforce SSO is ON, then the SSO role overwrites the local role at login and during hourly sync.
- Given scheduled sync runs hourly, when upstream group membership changes, then user roles update within 60 minutes and an audit entry is created with source=SSO_SYNC and details of the mapping applied.
- Given an SSO group maps to an unknown role, when the user logs in, then access is denied with error RoleNotMapped and no role change is applied.
Watermark and Original Download Restrictions by Role
- Given org policy Watermark=On for light roles with preset W1, when a Viewer or Approver previews or downloads allowed assets, then watermark W1 is applied and originals are hidden; UI labels indicate Watermarked Download.
- Given org policy Allow original download for Approver, when an Approver downloads, then originals are available only to Approver while Viewer remains watermarked; permission checks enforce at API and signed URL layers.
- Given asset download uses signed URLs, when a light role requests a download, then URL expiry is ≤ 10 minutes and bound to that user's role; when the role is elevated, then any prior signed URLs are invalidated immediately.
- Given a light role attempts to bypass watermark via URL parameter tampering, when the CDN/origin is requested with watermark disabled, then the request is rejected or returns a watermarked asset and no original leak occurs.
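One way to bind signed URLs to a user's role and invalidate them immediately on elevation is to include a role version in the HMAC payload: bumping the version on any role change breaks every previously issued URL. A hypothetical sketch (signing key, URL shape, and parameter names are all assumptions):

```python
import hashlib
import hmac
import time

SECRET = b"hypothetical-signing-key"  # assumed; would come from a secret store

def sign_download_url(asset_id: str, user_id: str, role: str,
                      role_version: int, now: int = None) -> str:
    """Issue a signed download URL expiring in 10 minutes, bound to user + role."""
    expires = (now if now is not None else int(time.time())) + 600
    payload = f"{asset_id}|{user_id}|{role}|{role_version}|{expires}"
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return (f"/assets/{asset_id}?exp={expires}&u={user_id}"
            f"&r={role}&rv={role_version}&sig={sig}")

def verify(url_sig: str, asset_id: str, user_id: str, role: str,
           current_role_version: int, expires: int, now: int) -> bool:
    """Recompute the signature with the user's *current* role version;
    any role change since issuance makes verification fail."""
    payload = f"{asset_id}|{user_id}|{role}|{current_role_version}|{expires}"
    good = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(good, url_sig) and now < expires
```

Because verification recomputes against the current role version, elevation (or demotion) invalidates outstanding URLs without any revocation list.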
Seat Management Dashboard
"As an account admin, I want a single dashboard to manage seats, roles, and costs so that I can keep collaboration high while controlling spend."
Description

Deliver an admin dashboard to view and manage all seats across the workspace: active vs. parked users, roles, burst seat schedules, idle timers, and historical activity. Provide bulk actions (assign roles, park/unpark, invite/remove), inline cost previews before changes, filters/search, and CSV export. Include real-time utilization charts and projected end-of-cycle costs based on current allocations and scheduled bursts. Surface policy settings (idle threshold, auto‑unpark, role defaults) with guardrails and contextual help. All actions emit audit events and require appropriate admin permissions.

Acceptance Criteria
Seat Roster Overview, Filters, and Search
Given I am an Org Admin on the Seat Management Dashboard When the page loads Then I see total seat counts segmented by Active, Parked, and Burst seats and a breakdown by role (Editor, Viewer-only, Approve-only) And each user row shows current status (Active/Parked), role, last active timestamp, idle days counter, and any scheduled burst assignment Given I enter a search term matching a user’s name or email When I apply the search Then the roster returns only matching users and the result count reflects the filtered set Given I select filters for Status, Role, Idle Days (range), and Activity Window (date range) When I apply the filters Then the roster updates to show only users meeting all selected criteria and the segment counts reflect the filtered data Given I clear all filters and search When I reset the view Then the roster returns to the full unfiltered dataset
Bulk Park/Unpark and Role Assignment with Inline Cost Preview and Proration
Given I select multiple users from the roster When I choose Park as a bulk action Then an inline cost preview displays the projected current-cycle spend before/after and the delta reflecting daily proration from the effective time of the change And I must confirm before changes are applied Given I bulk Unpark previously parked users When I preview costs Then the preview shows updated prorated spend increase for the remainder of the cycle and highlights any policy conflicts (e.g., auto-park thresholds) Given I change roles in bulk (e.g., Editor to Viewer-only or Approve-only) When I preview costs Then the preview reflects role pricing rules (low/no cost roles do not increase paid seat count) and shows the net delta for the current cycle Given the bulk action is confirmed When processing completes Then all selected users reflect the new state consistently And any partial failures are reported per user with a retry option And changes are applied atomically per user and auditable
CSV Export of Seat Roster with Applied Filters
Given I have applied specific filters and/or a search on the roster When I click Export CSV Then the downloaded CSV contains only the currently visible (filtered) rows and includes all visible columns in the same order with a header row And date/time fields are in ISO 8601 with timezone designator And numeric fields (costs, idle days) are unformatted (no currency symbols/commas) Given the dataset exceeds the page size When I export Then the CSV includes the full filtered dataset, not just the current page Given there are zero matching rows When I export Then the CSV contains only the header row
Burst Seat Scheduling and Temporary Seat Management
Given I open Burst Seat scheduling When I add a burst with start and end dates and a seat quantity Then validation prevents end before start and seat quantity must be a positive integer And the UI shows a projected cost impact for the current billing cycle and the next if the range overlaps Given an overlapping burst exists for the same period When I attempt to create a conflicting burst Then I am prevented with a clear error explaining the overlap and pointing to the existing burst Given a burst is scheduled When I view the schedule Then I can edit or cancel future bursts And the projected cost and utilization update immediately upon change
Real-time Utilization Charts Accuracy and Refresh
Given active seat usage changes (e.g., users parked/unparked, bursts added/edited) When the change is saved Then the utilization charts (Active vs Parked vs Burst, by role) update within 5 seconds And chart totals match the roster counts for the same filters Given I apply filters (role, status, date window for activity) When charts render Then they reflect the same filtered scope as the roster and include legends and tooltips with exact counts and percentages
Projected End-of-Cycle Cost Calculation
Given current seat allocations, parked statuses, role assignments, and scheduled bursts When I open the cost projection panel Then I see the projected end-of-cycle spend and a breakdown by seat type (paid, low/no cost) and by change driver (park/unpark, role changes, bursts) And the projection includes daily proration effects for scheduled future changes within the cycle Given I stage changes (without saving) that affect billing When the inline preview is shown Then the projection reflects staged changes distinctly from committed state with a clear before/after and net delta Given the billing engine computes the authoritative amount When engineering runs a consistency check Then the dashboard projection is within ±1% or $1 (whichever is smaller) of the billing engine for the same inputs
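The ±1%-or-$1 consistency check from the last criterion can be expressed directly (a sketch; the function and argument names are illustrative):

```python
def projections_consistent(dashboard: float, billing_engine: float) -> bool:
    """Dashboard projection must match the billing engine within
    plus/minus 1% or $1, whichever is smaller, for the same inputs."""
    tolerance = min(abs(billing_engine) * 0.01, 1.00)
    return abs(dashboard - billing_engine) <= tolerance
```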
Governance: Permissions, Audit Events, and Policy Settings Guardrails
Given my role is Org Admin When I view and edit policy settings (idle threshold, auto-unpark, role defaults) Then inputs are validated against allowed ranges and dependencies And invalid values are blocked with inline error messaging and contextual help links/tooltips And unsaved changes are clearly indicated and require explicit Save Given my role lacks Admin permissions When I attempt seat-affecting actions (park/unpark, role change, burst scheduling, removals) Then the actions are disabled or result in a permission error without applying changes Given any seat-affecting action is performed (including bulk) When the action completes (success or failure) Then an audit event is recorded with actor, timestamp, target(s), action type, before/after values, and correlation ID for bulk operations And the event is immutable, queryable by date range and user, and visible in the historical activity view
Seat Usage Notifications & Alerts
"As a billing admin, I want timely notifications about parking and burst seat changes so that I can prevent surprises on our invoice and keep the team informed."
Description

Provide configurable email and in‑app notifications for key seat events: upcoming auto‑park warnings, successful parking/unparking, burst seat start/ending reminders, seat cap approaching/reached, and mid‑cycle cost change summaries. Include daily/weekly digest options, per-admin preferences, localization, accessible templates, and secure deep links to the dashboard. Implement rate limiting and deduplication to prevent alert fatigue, and ensure all events are logged for audit and can be queried via API.

Acceptance Criteria
Upcoming Auto‑Park Warning Notifications
Given an admin has enabled the “Auto‑park warnings” category and auto‑park is configured for 14 days idle with a 72‑hour pre‑warning, When a user’s seat reaches 11 days idle (72 hours before auto‑park), Then an auto‑park warning is delivered within 5 minutes via the admin’s selected channels (email and/or in‑app). And the notification content includes: user display name, seat role, last active timestamp, scheduled park datetime in the admin’s timezone, and a secure deep link to the Seat Management view filtered to that user. And admins who have opted out of this category receive no warning. When the user becomes active before the scheduled park time, Then any pending warning banners are withdrawn and no further warnings are sent for that auto‑park cycle.
Parking/Unparking Confirmation Alerts
Given a seat is parked (auto‑park) or unparked by an authorized admin, When the action completes successfully, Then a confirmation notification is sent within 2 minutes to opted‑in admins via their selected channels. And the notification includes: action type (parked/unparked), actor (system/admin display name), effective timestamp, affected user, seat role, and a secure deep link to the Seat Management audit view focused on the event. And each admin receives at most one confirmation per channel per event. When the action fails, Then an error notification with failure reason and a deep link to retry/resolve is sent to the initiating admin only.
Burst Seat Start/End Reminders
Given burst seats are activated for an account with a scheduled end time T, When activation occurs, Then a start confirmation is sent within 5 minutes to opted‑in admins including start time, burst seat count, and a deep link to burst settings. When the time reaches T−24h, and again at T−1h, Then reminder notifications are sent to opted‑in admins. When the scheduled end time changes, Then previously scheduled reminders are canceled and new reminders are scheduled accordingly. When T is reached and burst seats end, Then an end confirmation is sent within 5 minutes including the final burst seat count and a deep link to usage history.
Given burst seats are activated for an account with a scheduled end time T, When activation occurs, Then a start confirmation is sent within 5 minutes to opted‑in admins including start time, burst seat count, and a deep link to burst settings. When the time reaches T−24h, and again at T−1h, Then reminder notifications are sent to opted‑in admins. When the scheduled end time changes, Then previously scheduled reminders are canceled and new reminders are scheduled accordingly. When T is reached and burst seats end, Then an end confirmation is sent within 5 minutes including the final burst seat count and a deep link to usage history.
Seat Cap Approaching/Reached Alerts
Given an account has a configured seat cap of C and opted‑in admins for “Seat cap alerts”, When active seats first cross 80% of C within a billing cycle, Then an “approaching cap” alert is sent within 5 minutes, including current usage, cap, percentage, and a deep link to the Seat Management view. And no additional “approaching cap” alerts are sent for 24 hours unless usage falls below 75% and then exceeds 80% again. When active seats reach C, Then a “cap reached” alert is sent immediately, including guidance to add burst seats and a deep link to enable burst. And only opted‑in admins receive these alerts.
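The 80%/75% hysteresis plus 24-hour cooldown described above can be modeled as a small state machine (a sketch under the thresholds stated in the criterion; the class and method names are hypothetical):

```python
class CapAlerter:
    """'Approaching cap' alerting with hysteresis: alert at >=80% of
    the cap, suppress repeats for 24h, but re-arm immediately once
    usage falls below 75%."""
    COOLDOWN = 24 * 3600  # seconds

    def __init__(self, cap):
        self.cap = cap
        self.rearmed = True        # first crossing always fires
        self.last_alert_at = None

    def observe(self, active_seats, now):
        """Return True when an 'approaching cap' alert should fire."""
        pct = active_seats / self.cap
        if pct < 0.75:
            self.rearmed = True    # usage dropped below 75%: re-arm
        if pct >= 0.80:
            cooldown_over = (self.last_alert_at is None or
                             now - self.last_alert_at >= self.COOLDOWN)
            if self.rearmed or cooldown_over:
                self.rearmed = False
                self.last_alert_at = now
                return True
        return False
```

A "cap reached" alert at 100% would be a separate, always-immediate path per the criterion.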
Mid‑Cycle Cost Change Summary Notifications
Given an account is mid‑cycle and an admin has enabled “Cost change summaries” with a weekly schedule (default) at 09:00 in their timezone, When the scheduled time arrives and the projected seat cost has changed since the last summary, Then a summary notification is delivered that includes: cycle start date, previous projection, current projection, net change (amount and percentage), and a breakdown by driver (parks/unparks, burst consumption, cap overages), all in account currency. And if no net change has occurred since the last summary, Then no summary is sent. When an admin switches the summary cadence to daily or changes delivery time/timezone, Then the new schedule takes effect within 5 minutes and the next summary adheres to the updated configuration.
Daily/Weekly Digest & Per‑Admin Preferences
Given an admin has selected a daily or weekly digest with a delivery time and timezone and has configured category‑level notification preferences, When the digest time arrives, Then a single digest is sent that aggregates eligible events since the last digest for the categories the admin enabled and excludes items that were already delivered as immediate notifications per that admin’s preferences. And the digest groups items by category, includes counts, and provides secure deep links to the relevant dashboard views. When an admin updates preferences (enable/disable categories, change channels), Then changes take effect for new events within 5 minutes and are reflected in the next digest. And admins who disable digests receive none while still receiving immediate notifications for categories they keep enabled.
Notification Quality, Safety & Observability (Localization, Accessibility, Deep Links, Rate Limiting, Dedup, Audit/API)
Given an admin’s language is set to a supported locale and a timezone is configured, When any notification is generated, Then subject and body are localized to that language with dates/times in the admin’s timezone and currency in the account currency, falling back to English if a string is unavailable. And email templates meet WCAG 2.1 AA (semantic HTML, sufficient contrast, alt text) and in‑app notifications are fully keyboard navigable with visible focus and ARIA labels. And deep links are HTTPS, contain signed tokens that expire within 15 minutes, require authentication, and do not include PII in query parameters; expired links redirect to login and then return to the target view. Given multiple events of the same type occur for the same admin within 60 minutes, Then no more than 2 immediate notifications per type per admin are delivered in that window; additional events are rolled into the next digest. Given the same event_id is processed more than once, Then deduplication ensures at most one notification per channel per admin per event_id is sent. Given any notification attempt occurs, Then an immutable audit record is created with event_id, type, actor_id, subject_user_id, account_id, payload hash, channels attempted, per‑channel delivery status, locale, created_at, and correlation_id, retained for at least 400 days. Given an API client with scope seat.events:read, When it calls GET /api/v1/seat-events with filters (type, admin_id, date range, delivery_status) and pagination (limit, cursor), Then the API responds within 2 seconds for <=10k records, includes next_cursor when applicable, and omits PII; requesting a specific event_id returns full delivery history.
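The dedup-by-event_id and 2-per-type-per-60-minutes rate limit described above might look like this in outline (illustrative names; a real service would persist this state rather than hold it in memory):

```python
from collections import defaultdict

class NotificationGate:
    """Dedup by event_id and rate-limit immediate sends: at most 2
    per type per admin per rolling 60-minute window; excess events
    are deferred to the next digest."""
    WINDOW = 3600
    MAX_IMMEDIATE = 2

    def __init__(self):
        self.seen = set()              # (admin, channel, event_id)
        self.sent = defaultdict(list)  # (admin, type) -> send timestamps

    def decide(self, admin, channel, event_id, event_type, now):
        """Return 'send', 'duplicate', or 'digest'."""
        key = (admin, channel, event_id)
        if key in self.seen:
            return "duplicate"         # at most one per channel per event_id
        self.seen.add(key)
        window = [t for t in self.sent[(admin, event_type)]
                  if now - t < self.WINDOW]
        self.sent[(admin, event_type)] = window
        if len(window) >= self.MAX_IMMEDIATE:
            return "digest"            # roll into the next digest
        window.append(now)
        return "send"
```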

Spend Guard

Add approval gates and alerts for cost thresholds. Flag batches that would exceed caps or surpass per‑project limits, show clear cost diffs vs. baseline, and require one‑click approval before processing. Slack/Email notifications and API webhooks keep finance and ops aligned in real time.

Requirements

Threshold Policy Engine
"As a finance admin, I want to configure workspace-, project-, and batch-level spend caps (soft/hard) so that we prevent overruns while keeping teams productive."
Description

Provide a policy engine to configure workspace-, project-, and batch-level spend caps with soft and hard thresholds. Support absolute currency or credit units, per-period limits (monthly/weekly), and effective date windows with inheritance and overrides. Expose a settings UI and secure API to create, edit, test, and validate policies, including conflict resolution and preview of effective limits. Policies integrate with the pricing estimator and batch-processing pipeline to evaluate spend pre-execution and at run time.

Acceptance Criteria
Create Policy with Thresholds, Units, and Date Window (UI)
Given I am a workspace admin on the Policies UI When I create a Project-scoped policy with unit=currency, monthlyHardCap=1200.00 USD, monthlySoftCap=900.00 USD, weeklyHardCap=300.00 USD, weeklySoftCap=250.00 USD, effectiveFrom=2025-10-01T00:00:00Z, effectiveTo=2026-10-01T00:00:00Z Then the policy saves with status=Scheduled (today < effectiveFrom), an immutable id, and audit fields (createdBy, createdAt) And numeric fields accept up to 2 decimal places and reject negatives or non-numeric input with inline validation And soft caps must be <= their corresponding hard caps or the save is blocked with message "Soft cap must be less than or equal to hard cap" And the saved record can be retrieved via UI list and matches all submitted values
Inheritance and Overrides Across Workspace → Project → Batch
Given a Workspace policy with unit=currency and monthly caps hard=1000, soft=800, effective now And a Project policy with monthly caps hard=700, soft=600 effective now And a Batch policy with weekly caps hard=300, soft=250 effective now When I request the effective limits for that batch context at now Then the effective monthly caps are hard=700, soft=600 and the effective weekly caps are hard=300, soft=250 And no effective cap exceeds its parent; if a child attempts hard=1500 where parent hard=1000, the save is rejected with 409 Conflict and code=ChildCapExceedsParent
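The workspace → project → batch resolution above can be sketched as a fold over scopes, with a guard that rejects any child hard cap above its parent (the flat `monthly_hard`/`weekly_soft` field names are assumptions):

```python
def effective_limits(*levels) -> dict:
    """Resolve effective caps from outermost to innermost scope
    (e.g. workspace, project, batch). Each level is a dict such as
    {"monthly_hard": 1000, "monthly_soft": 800}; a child value
    overrides its parent but a hard cap may not exceed the parent's."""
    effective = {}
    for level in levels:
        for key, value in level.items():
            if key.endswith("_hard") and key in effective \
                    and value > effective[key]:
                raise ValueError(f"ChildCapExceedsParent: {key}")
            effective[key] = value
    return effective
```

The `ValueError` corresponds to the 409 Conflict with code=ChildCapExceedsParent in the criterion.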
Mixed-Unit Policy Conflict Detection
Given a Workspace policy with unit=currency (USD) effective now When I attempt to create a Project policy with unit=credits under that workspace Then the request is rejected with 422 Unprocessable Entity and code=MixedUnitsInInheritance and includes pointers to conflicting policy ids And creating a sibling Project policy with unit=currency is allowed and evaluates correctly
Pre-Execution Evaluation via Pricing Estimator
Given effective monthly caps hard=700, soft=600 and accruedMonthlySpend=200 And the pricing estimator returns predictedBatchCost=450 (USD) When the policy engine evaluates the batch pre-execution Then result.status=SOFT_THRESHOLD_EXCEEDED, result.block=false, result.approvalRequired=true And result.totals.projectedPeriodSpend=650, result.remainingToSoft=-50, result.remainingToHard=50, result.unit=USD And the engine returns a diff object enumerating which thresholds were crossed and by how much
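This pre-execution evaluation reduces to a simple classification; a sketch reproducing the result fields named in the criterion (the exact response schema is an assumption):

```python
def evaluate_pre_execution(hard_cap, soft_cap, accrued_spend,
                           predicted_cost, unit="USD") -> dict:
    """Classify a batch against soft/hard caps before execution."""
    projected = accrued_spend + predicted_cost
    if projected > hard_cap:
        status, block, approval = "HARD_THRESHOLD_EXCEEDED", True, False
    elif projected > soft_cap:
        status, block, approval = "SOFT_THRESHOLD_EXCEEDED", False, True
    else:
        status, block, approval = "WITHIN_LIMITS", False, False
    return {"status": status, "block": block, "approvalRequired": approval,
            "totals": {"projectedPeriodSpend": projected,
                       "remainingToSoft": soft_cap - projected,
                       "remainingToHard": hard_cap - projected},
            "unit": unit}
```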
Runtime Spend Evaluation and Events
Given a running batch with effective weekly caps hard=300, soft=250 and initial period spend=0 When cumulative actual spend first reaches 255 during execution Then the engine emits one SOFT_THRESHOLD_CROSSED event once with payload including batchId, period=weekly, amount=255, softCap=250, unit And processing continues When cumulative actual spend reaches 301 Then the engine emits a HARD_THRESHOLD_CROSSED event and returns a decision action=BLOCK with reason=HardCapExceeded and the pipeline halts further tasks for the batch
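The emit-once runtime crossing detection needs only two latches (a sketch; the real engine would attach batchId, period, amount, cap, and unit to each event payload):

```python
class RuntimeSpendMonitor:
    """Emit each threshold-crossed event exactly once as cumulative
    actual spend rises during execution."""
    def __init__(self, soft_cap, hard_cap):
        self.soft_cap, self.hard_cap = soft_cap, hard_cap
        self.soft_emitted = self.hard_emitted = False

    def on_spend(self, cumulative):
        events = []
        if cumulative > self.soft_cap and not self.soft_emitted:
            self.soft_emitted = True
            events.append("SOFT_THRESHOLD_CROSSED")   # processing continues
        if cumulative > self.hard_cap and not self.hard_emitted:
            self.hard_emitted = True
            events.append("HARD_THRESHOLD_CROSSED")   # caller halts the batch
        return events
```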
Secure Policy API: Create/Edit/Test/Preview
Given I authenticate with a token scoped policy:write as a workspace admin When I POST /policies with a valid payload Then I receive 201 Created with the policy body and Location header within 500ms p95 And unauthorized tokens receive 401; authenticated users without admin rights receive 403 When I PATCH /policies/{id} to adjust soft caps only within parent bounds Then I receive 200 OK and the audit trail records who changed what fields When I POST /policies/{id}/test with an evaluation context (periodSpend, estimate, timestamp) Then I receive 200 OK with evaluation.status in {WITHIN_LIMITS, SOFT_THRESHOLD_EXCEEDED, HARD_THRESHOLD_EXCEEDED} and numerical diffs When I GET /limits/preview?workspaceId=...&projectId=...&batchId=...&timestamp=... Then I receive 200 OK with computed effective caps and the sourcePolicyIds used
Effective Limits Preview Consistency
Given I call /limits/preview for a specific context with timestamp T When I perform a pre-execution evaluation and a runtime evaluation at the same timestamp T with matching inputs Then the effective caps and decision statuses returned by evaluation match the preview exactly And any change to a source policy after T does not alter the previously returned preview or evaluation results for T
Real-time Cost Estimation & Flagging
"As an operations manager, I want PixelLift to estimate batch processing costs in real time and flag overruns so that I can adjust scope or seek approval before processing."
Description

During batch upload and preset selection, calculate estimated processing cost using image counts, selected style-presets, and add-ons. Display current remaining budget versus estimate, highlight potential overruns, and block or warn based on policy (soft vs hard). Provide clear reason codes (e.g., "exceeds project cap by $124") and suggestions (reduce images, change preset). Must scale to hundreds of images, remain responsive, and gracefully handle missing data by falling back to safe defaults.

Acceptance Criteria
Real-time Estimate Updates During Batch Configuration
Given a batch of 200 images, a preset priced at $0.08/image, and an add-on priced at $0.03/image When the user enables the add-on Then Estimated Total displays $22.00 within 500 ms And Per-Image Estimated Cost displays $0.11 to 2 decimal places When the user disables the add-on Then Estimated Total updates to $16.00 within 500 ms And values persist across navigation back to the batch setup screen during the current session
Budget Remaining and Overrun Difference Display
Given a project budget cap of $500.00 and spent-to-date of $380.00 And the current batch estimated total is $150.00 When the user views the batch configuration summary Then Remaining Budget displays $120.00 And Over Budget Diff displays $30.00 highlighted as a warning And the warning tooltip text contains "exceeds project cap by $30"
Soft Policy Warning Allows Proceed with Explicit Acknowledgment
Given the policy is set to Soft (warn on overrun) And the estimated total exceeds the remaining budget by any amount When the user clicks Process Then a modal appears showing the estimate, remaining budget, overrun amount, and reason code OVER_PROJECT_CAP And the modal requires an explicit "Proceed Anyway" action to continue And upon confirmation, processing begins and the modal closes And if the user cancels, no processing starts
Hard Policy Block Requires Pre-Approval
Given the policy is set to Hard (block on overrun) And the estimated total exceeds the remaining budget When the user clicks Process Then processing is blocked And the primary action changes to "Request Approval" And a banner displays reason code EXCEEDS_CAP with the overrun amount And processing only becomes enabled after an approval flag is present
Reason Codes and Actionable Suggestions on Overrun
Given an overrun condition is detected When the overrun message renders Then it displays a reason code in plain language including the dollar overage (e.g., "exceeds project cap by $124") And at least two suggestions are shown with estimated savings (e.g., "Reduce images by 50 to save $5.50", "Switch to Basic preset to save $12.00") And clicking a suggestion applies the change and recalculates the estimate within 500 ms And the user can undo the applied suggestion
Performance and Scalability for Large Batches
Given a batch of 1,000 images and up to 3 add-ons selected When the user toggles any preset or add-on Then the estimated total and per-image cost recalculate and render within 800 ms at the 95th percentile on supported browsers And the UI remains responsive with no single interaction blocked longer than 100 ms And calculations produce the same totals as server-side verification within $0.01 tolerance
Graceful Handling of Missing or Stale Pricing Data
Given pricing for a selected preset or add-on is missing or unavailable When the user configures the batch Then the system uses a safe default per-image price equal to the highest active per-image price for the workspace And displays a non-blocking warning "Price unavailable—using safe default" And if policy is Hard, the Process action remains disabled until valid pricing is available or configuration changes resolve the issue And no processing request is submitted while any pricing uses safe defaults
One-click Approval Gate & Overrides
"As a project owner, I want a one-click approval workflow when a batch exceeds limits so that processing is controlled without unnecessary delays."
Description

Introduce an approval step when a batch triggers a policy. Present a review dialog summarizing estimate, baseline, deltas, and policy breaches. Allow authorized approvers to approve/deny with one click, optionally adding justification and setting an override limit or expiration. Support routing rules (project owner, finance role), capture identity/time/IP, and unblock processing instantly on approval. Provide equivalent API endpoints for programmatic approvals and ensure pending batches are queued safely until resolution.

Acceptance Criteria
Approval Dialog Summary on Policy Trigger
Given a batch triggers a Spend Guard policy during cost estimation When the user attempts to process the batch Then processing is blocked and a review dialog is shown And the dialog displays: estimated cost, baseline cost, absolute delta, percentage delta And the dialog lists each breached policy with its threshold and current value And the dialog shows batch ID, project name, and item count And currency and number formatting follow the project locale
One-Click Approve/Deny with Optional Justification
Given the review dialog is displayed to an authorized approver When the approver clicks Approve without entering a justification Then the approval is recorded and batch processing starts immediately When the approver clicks Deny with or without a justification Then the denial is recorded and the batch status becomes Rejected And the dialog closes and the batch no longer attempts processing
Set Override Cap and Expiration on Approval
Given the review dialog for a policy-triggered batch When the approver enters an override cost cap and/or expiration datetime and clicks Approve Then the override values are recorded with the approval And the batch processes only if the approval has not expired at start time and the latest estimated cost is less than or equal to the override cap (if provided) And if the approval expires before processing starts, the batch returns to Pending Approval And if re-estimated cost exceeds the override cap, processing is halted and the batch returns to Pending Approval
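The start-time gate implied by this criterion (unexpired approval, latest re-estimate within any override cap) can be written as one predicate (illustrative signature; `None` means the approver left that field unset):

```python
def can_start_processing(approved, now, expires_at,
                         override_cap, latest_estimate) -> bool:
    """A batch may start only if its approval is unexpired and the
    latest re-estimate is within the override cap, when one is set.
    Any False here returns the batch to Pending Approval."""
    if not approved:
        return False
    if expires_at is not None and now > expires_at:
        return False   # approval expired before processing started
    if override_cap is not None and latest_estimate > override_cap:
        return False   # re-estimated cost exceeds the override cap
    return True
```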
Approval Routing to Project Owner and Finance Role
Given a batch requires approval Then the approval request is routed to the project owner and all users with the Finance role for that organization And only routed approvers and organization admins can approve or deny And users without permission see a read-only dialog with disabled approval actions And if no project owner exists, organization admins are the default approvers
Audit Trail Captures Identity, Time, and IP
Given any approve, deny, or override action Then an immutable audit entry is stored with actor user ID, display name, role, UTC timestamp (ISO 8601), and source IP address And the audit entry includes decision (approved/denied), justification text (if any), override cap, and expiration (if any) And the audit entry is visible in the batch history UI and retrievable via the API
Safe Queueing and Instant Unblock on Approval
Given a batch is pending approval Then it is held in a durable queue and not processed until approved And the batch remains in Pending Approval across service restarts When an approval is recorded Then the batch transitions to Processing immediately And concurrent approval attempts are idempotent, resulting in a single processing start
Programmatic Approvals via API
Given an API client with approvals:read scope When it requests the list of pending approvals Then it receives batch ID, project, estimate, baseline, deltas, and breached policies for each item Given an API client with approvals:write scope When it submits an approve request with optional justification, override cap, and expiration Then the approval is recorded and the batch begins processing When it submits a deny request with optional justification Then the batch is rejected And all write requests support an Idempotency-Key header and require OAuth2 authentication
Baseline & Cost Diff Visualization
"As a budget owner, I want clear cost deltas versus baseline so that I can quickly understand the drivers and make the right approval decision."
Description

Compute and display a baseline cost (configured per project or derived from recent comparable batches) and surface total and per-image deltas. Attribute differences to drivers (volume, preset cost multipliers, add-ons) with color-coded indicators and concise explanations. Show this context in the upload flow, approval dialog, notifications, and reports, and expose values via API for downstream systems.

Acceptance Criteria
Project-Configured and Derived Baseline Calculation
Given a project has an explicit baseline cost configuration When a user uploads a new batch Then the system uses the project-configured baseline per image and computes the total baseline for the batch size Given a project lacks an explicit baseline configuration When a user uploads a new batch Then the system derives the baseline from at least the last 3 comparable batches (same preset and add-ons, within ±20% volume, within 30 days) And if fewer than 3 comparable batches exist, the system falls back to the platform default baseline per preset And the response indicates the baseline_source as one of configured|derived|default
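The derived-baseline rule (at least 3 comparable batches: same preset and add-ons, volume within ±20%, completed within 30 days; otherwise the platform default) might be sketched as follows (the batch field names are assumptions):

```python
def derive_baseline(new_batch, history, default_per_image, now_day):
    """Per-image baseline: average of >=3 comparable batches,
    else the platform default. Returns (baseline, baseline_source)."""
    def comparable(b):
        return (b["preset"] == new_batch["preset"]
                and b["addons"] == new_batch["addons"]
                and abs(b["volume"] - new_batch["volume"])
                    <= 0.2 * new_batch["volume"]
                and now_day - b["day"] <= 30)

    matches = [b for b in history if comparable(b)]
    if len(matches) >= 3:
        avg = sum(b["per_image_cost"] for b in matches) / len(matches)
        return round(avg, 2), "derived"
    return default_per_image, "default"
```

A project-configured baseline would short-circuit this with `baseline_source="configured"`.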
Upload Flow Delta Display and Explanations
Given a batch reaches the upload summary step When pricing is shown Then the UI displays per-image baseline, per-image estimate, per-image delta (amount and %), total baseline, total estimate, and total delta And deltas are color-coded: green if estimate <= baseline, amber if 0% < overage <= 10%, red if overage > 10% And an info tooltip lists drivers (volume, preset multipliers, add-ons) with amounts that sum to the total delta within $0.01 tolerance
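The color buckets are a direct translation of the thresholds above (a sketch; per the accessibility criterion later in this section, color would be paired with a non-color indicator):

```python
def delta_color(baseline, estimate) -> str:
    """Green if estimate <= baseline, amber if the overage is in
    (0%, 10%], red above 10%."""
    if estimate <= baseline:
        return "green"
    overage = (estimate - baseline) / baseline
    return "amber" if overage <= 0.10 else "red"
```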
Approval Dialog Context with Baseline & Diff
Given a batch requires approval per Spend Guard rules When the approval dialog opens Then it shows project name, batch ID, baseline total, estimated total, delta amount and %, and driver breakdown And values match the upload summary within $0.01 tolerance And the dialog provides a link to view per-image deltas
Notifications Contain Baseline and Cost Diffs
Given notifications are enabled for the project When a batch is created or enters approval state Then Slack and Email notifications include baseline total, estimated total, delta amount and %, top 3 drivers by absolute impact, and a deep link to the approval or batch page And numeric values and currency format match the project settings
Reporting and Export Reconcile Baseline vs Actuals
Given completed batches exist When a user views the Cost Reports page Then each row shows baseline total, actual total, and delta (amount and %), with a drivers summary And report totals reconcile to the sum of rows within $0.01 tolerance And exporting CSV includes per-batch baseline, actual, delta, baseline_source, and driver columns with the same values as the UI
API Exposure of Baseline, Deltas, and Drivers
Given an authenticated client requests GET /batches/{id}/costs When the batch exists Then the API returns baseline_total, estimate_total (or actual_total if processed), delta, baseline_source (configured|derived|default), and an array of per-image entries with baseline, estimate, delta, and drivers And all monetary values are in the project currency with 2 decimal places And the API values match the UI within $0.01 tolerance
Accessibility, Localization, and Non-Color Indicators
Given users with varying accessibility needs When cost deltas are displayed Then color indicators meet WCAG AA contrast And non-color indicators (signs, labels, or icons) are present for over/under baseline states And currency, number formatting, and translated driver labels follow the project locale settings
Slack/Email Actionable Alerts
"As a finance controller, I want actionable Slack/email alerts for thresholds and approvals so that I can approve or intervene immediately."
Description

Send real-time, configurable Slack and email notifications for key events: threshold approaching, approval required, approved, denied, cap reset. Include concise context (project, estimate, baseline, diffs, policy breached) and deep links to approve or review. Support Slack interactive actions (approve/deny) with secure verification, notification throttling to avoid spam, per-project channel mapping, templates, and delivery retries with backoff.

Acceptance Criteria
Slack Threshold Approaching Alert
Given a project has approaching_threshold_percent set to 80 and a spend cap C When a new batch estimate E causes projected spend to cross from <80% to >=80% and <100% of C Then send a Slack message to the mapped channel within 15 seconds with event_type="threshold_approaching" including project_id, project_name, batch_id, cap_amount, projected_amount, baseline_amount, diff_amount, diff_percent, policy_name, and deep_link_url And send the message exactly once per batch per threshold crossing And do not send a message if alerts.threshold_approaching is disabled for the project
Approval Required Notifications with Deep Links and Mapping
Given a batch’s projected cost exceeds the project’s approval gate or cap and alerts.approval_required is enabled When the system marks the batch as approval_required Then send a Slack interactive message to the mapped channel and an email to mapped recipients within 15 seconds And include deep links for Approve and Review with signed tokens that expire after 30 minutes And resolve destinations via mapping priority: override > project default > workspace default; if Slack channel is invalid, DM the Project Owner and flag mapping_error; invalid email addresses are skipped and logged And include a unique approval_id to correlate responses And do not emit duplicate notifications for the same approval_id within the dedup window
Slack Interactive Approve/Deny Secure Verification
Given a user clicks Approve or Deny in Slack for approval_id X When the interactive payload is received Then verify Slack signing secret (v2) and timestamp skew <= 5 minutes; reject with 401 on failure And verify the user has permission to approve for the project; on failure send an ephemeral denial and do not change status And process idempotently by approval_id; repeated clicks return the existing decision And update the approval status and post a threaded confirmation in Slack within 10 seconds And send email confirmations to mapped recipients only on the first successful decision
Event Coverage and Message Content Accuracy
Given events approved, denied, and cap_reset occur When each event is emitted Then send notifications (Slack and email as configured) within 15 seconds containing project_id, project_name, event_type, actor, timestamp (UTC ISO8601), batch_id (if applicable), cap_amount_before/after (for cap_reset), baseline_amount, projected_amount, diff_amount, diff_percent, policy_name, and deep_link_url And render messages using templates v1.0 with correct variable substitution and localization for en-US and en-GB And include approver/denier identity and optional reason for approved/denied And include reset_source and new_cap_effective_date for cap_reset
Notification Throttling and Deduplication
Given multiple identical events for the same project and subject (approval_id or batch_id) within the throttle window When notifications are generated Then coalesce duplicates so only one message per channel is sent within a 2-minute window and include a +N more count in a digest after 15 minutes And enforce a maximum of 5 notifications per project per channel per hour; excess roll into the next digest And allow an admin override to send the next notification immediately, bypassing throttling once And record dedup keys and throttle decisions in logs for auditability
Delivery Retries with Exponential Backoff and DLQ
Given a transient delivery failure (HTTP 5xx, 429, or network timeout) occurs when sending Slack or email When attempting delivery Then retry up to 3 times with backoff delays of 1m, 5m, and 15m while respecting provider rate limits And stop retrying on success or permanent failures (HTTP 4xx except 429) and mark final status And place failures after final attempt into a dead-letter queue with error details and next_action And emit metrics (success_rate, retry_count, dlq_count) and logs with correlation_id
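The retry/DLQ classification can be sketched as a pure function over HTTP status and attempt count (a sketch of the policy stated here; timeouts and network errors are modeled as `status=None`, and `attempt` counts retries already made):

```python
def next_action(status, attempt):
    """Return (action, delay_seconds): transient failures (5xx, 429,
    timeout) retry up to 3 times with 1m/5m/15m backoff; other 4xx
    are permanent; after the final attempt the event goes to the DLQ."""
    BACKOFF = [60, 300, 900]
    if status is not None and 200 <= status < 300:
        return "success", None
    transient = status is None or status >= 500 or status == 429
    if not transient:
        return "permanent_failure", None   # e.g. 4xx other than 429
    if attempt < 3:
        return "retry", BACKOFF[attempt]
    return "dead_letter", None             # record error details + next_action
```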
Per-Project Channel/Recipient Mapping Administration and Validation
Given a project admin updates Slack channel and email recipient mappings When the configuration is saved Then validate Slack channel IDs via Slack API and emails via syntax and MX checks before persisting And store audit fields (updated_by, updated_at) and apply the new mappings to subsequent notifications immediately And provide test buttons that send test messages to the mapped Slack channel and email and record outcomes And if no valid destination resolves at send time, raise a configuration_error and surface it in UI
Spend Event Webhooks
"As a systems integrator, I want signed webhooks for spend events so that our ERP and procurement tools stay in sync."
Description

Provide secure webhooks for external systems to receive spend-related events (threshold_crossed, approval_required, approved, denied, cap_updated). Include signed payloads with idempotency keys, batch and project metadata, estimates, baselines, diffs, and policy details. Offer a management UI for endpoints, rotating secrets, test sends, delivery logs, and configurable retries with exponential backoff.

Acceptance Criteria
Secure Signature Verification and HTTPS Enforcement
Given a webhook endpoint is created with an HTTPS URL When PixelLift sends an event POST with Content-Type application/json over TLS 1.2+ Then the request includes header X-PixelLift-Signature with fields t (Unix epoch seconds) and v1 (HMAC-SHA256 of `${t}.${raw_body}` using the endpoint’s active secret) And the header X-PixelLift-Event-Id is a UUIDv4 unique per event And the receiver can verify the signature within a ±300 second tolerance on t And if the endpoint URL is not HTTPS, creation/edit is blocked with a validation error and no requests are sent And request bodies are UTF-8 encoded, uncompressed by default, with optional gzip when the endpoint’s Accept-Encoding includes gzip
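The signing scheme above can be checked on the receiving side with a few lines of HMAC code. The comma-separated `t=...,v1=...` header serialization below is an assumption (the criteria name the fields but not their exact encoding); a receiver-side sketch:

```python
import hashlib
import hmac
import time

TOLERANCE_SECONDS = 300  # matches the ±300 second window in the criterion

def verify_signature(header: str, raw_body: bytes, secret: str, now=None) -> bool:
    """Verify an X-PixelLift-Signature header assumed to look like
    't=<epoch seconds>,v1=<hex HMAC-SHA256>'."""
    fields = dict(part.split("=", 1) for part in header.split(","))
    t, received = fields["t"], fields["v1"]
    if abs((now if now is not None else time.time()) - int(t)) > TOLERANCE_SECONDS:
        return False  # outside tolerance: treat as a possible replay
    signed_payload = f"{t}.".encode() + raw_body  # HMAC is over `${t}.${raw_body}`
    expected = hmac.new(secret.encode(), signed_payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, received)
```

Note the constant-time comparison (`hmac.compare_digest`) and that the HMAC must be computed over the exact raw bytes received, before any JSON parsing.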
Idempotency and Replay Protection
Given each delivery includes header X-Idempotency-Key unique per event per endpoint When a delivery is retried due to timeout, network error, 5xx, 429, or 408 Then the same X-Idempotency-Key is reused for every retry of that event to that endpoint And receivers returning any 2xx status are treated as success and will not be retried And deliveries older than 300 seconds based on signature timestamp t are rejected by receivers as replays, while PixelLift will still record the attempt in logs And PixelLift will never generate two different events with the same X-Idempotency-Key
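On the receiver side, the idempotency contract above suggests keeping a store of processed keys so a retried delivery returns 2xx without reapplying the event. A minimal in-memory sketch (a real receiver would persist keys durably):

```python
class IdempotentReceiver:
    """Treat repeat deliveries of the same X-Idempotency-Key as already
    processed: return the original 2xx without re-running side effects."""

    def __init__(self):
        self._seen = {}  # idempotency_key -> HTTP status returned first time

    def handle(self, idempotency_key: str, process) -> int:
        if idempotency_key in self._seen:
            return self._seen[idempotency_key]  # replayed retry: no reprocessing
        process()  # apply the event's side effects exactly once
        self._seen[idempotency_key] = 200
        return 200
```

Because PixelLift reuses the same key on every retry and never reuses it across distinct events, this lookup is sufficient to collapse at-least-once delivery into effectively-once processing.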
Event Emission and Ordering Across Spend Events
Given Spend Guard detects a state change that matches an event type in {threshold_crossed, approval_required, approved, denied, cap_updated} When the state change is committed Then PixelLift emits exactly one event record for that state change (delivery semantics: at-least-once) within 5 seconds And the payload includes event_sequence that increments per (project_id, batch_id) scope starting at 1 And deliveries to a single endpoint are made in sequence order per (project_id, batch_id); later sequence numbers are not delivered before earlier ones And events include causal_ids referencing prior related event_id(s) when applicable (e.g., approval_required cites threshold_crossed)
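The per-scope `event_sequence` rule can be illustrated with a counter keyed by (project_id, batch_id). This in-memory version is a sketch only; a production emitter would need an atomic, durable counter so sequences survive restarts and concurrent writers.

```python
from collections import defaultdict
from itertools import count

class EventSequencer:
    """Assigns event_sequence per (project_id, batch_id) scope, starting at 1,
    as the criterion specifies."""

    def __init__(self):
        self._counters = defaultdict(lambda: count(1))

    def next_sequence(self, project_id: str, batch_id: str) -> int:
        return next(self._counters[(project_id, batch_id)])
```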
Event Payload Schema and Field Accuracy
Given an event is emitted When the receiver inspects the JSON body Then it contains required fields with types: event_id (UUIDv4), event_type (enum), occurred_at (ISO-8601 UTC), idempotency_key (string), event_sequence (integer), project_id (UUID/slug), project_name (string), batch_id (UUID/slug), currency (ISO 4217), baseline_cost_cents (integer), estimated_cost_cents (integer), diff_cents (integer), diff_percent (number), policy {threshold_cents (integer), cap_cents (integer), approval_required (boolean)}, actor {id (string|null), role (string|null)} where applicable, causal_ids (array of UUID), meta {version (string), environment (string)} And event_type is one of: threshold_crossed, approval_required, approved, denied, cap_updated And diff_cents = estimated_cost_cents - baseline_cost_cents and diff_percent = round((estimated_cost_cents - baseline_cost_cents) / max(1, baseline_cost_cents) * 100, 2) And fields not applicable to an event_type are present as null or omitted as per schema documentation, with schema version provided in meta.version And the HMAC signature is computed over the exact raw JSON body sent
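The diff formulas in this criterion are self-contained and can be expressed directly:

```python
def spend_diff(estimated_cost_cents: int, baseline_cost_cents: int) -> tuple:
    """diff_cents and diff_percent exactly as defined in the payload schema;
    max(1, baseline) avoids division by zero when the baseline is 0 cents."""
    diff_cents = estimated_cost_cents - baseline_cost_cents
    diff_percent = round(diff_cents / max(1, baseline_cost_cents) * 100, 2)
    return diff_cents, diff_percent
```

Keeping amounts in integer cents sidesteps floating-point currency errors; only the percentage is a rounded float.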
Configurable Retry Policy with Exponential Backoff
Given an endpoint is configured with retry settings: max_attempts (default 6; allowed 0–10), initial_delay_seconds (default 30; allowed 1–300), backoff_multiplier (default 2.0; allowed 1.1–5.0), max_delay_seconds (default 600; allowed 30–3600), jitter_percent (default 20; allowed 0–50) When a delivery fails due to timeout, network error, HTTP 5xx, 429, or 408 Then PixelLift schedules retries using exponential backoff: delay_n = min(max_delay_seconds, initial_delay_seconds * backoff_multiplier^(n-1)) with ±jitter_percent applied And HTTP 2xx marks success (no further retries); HTTP 4xx other than 408/429 is not retried And after max_attempts are exhausted, delivery status is Final-Failed and no further attempts are made And the next scheduled retry time is recorded in delivery logs
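The retry-delay formula above, with the documented defaults, can be sketched as:

```python
import random

def retry_delay(attempt: int,
                initial_delay_seconds: float = 30,
                backoff_multiplier: float = 2.0,
                max_delay_seconds: float = 600,
                jitter_percent: float = 20) -> float:
    """delay_n = min(max_delay, initial * multiplier^(n-1)), then apply
    +/- jitter_percent. Defaults mirror the documented endpoint defaults;
    attempt is 1-based."""
    base = min(max_delay_seconds,
               initial_delay_seconds * backoff_multiplier ** (attempt - 1))
    jitter = random.uniform(-jitter_percent, jitter_percent) / 100
    return base * (1 + jitter)
```

With the defaults, attempts 1 through 6 have base delays of 30, 60, 120, 240, 480, and 600 seconds (the last capped by max_delay_seconds), each then jittered by up to 20% to avoid thundering-herd retries.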
Endpoint Management UI for Webhook Endpoints
Given a user with appropriate permissions opens Spend Guard Webhook settings When they create an endpoint Then the UI requires a valid HTTPS URL, generates a 32-byte secret, and saves the endpoint as Enabled And a Test Send action allows selecting an event_type and sends a sample payload with header X-PixelLift-Test: true; result (HTTP code, latency) is shown inline and logged And a Rotate Secret action issues a new secret and for 24 hours sends dual signatures (both old and new) in X-PixelLift-Signature; after finalization, only the new secret is active And an Enable/Disable toggle prevents deliveries while Disabled and surfaces a warning in the UI And the UI displays last delivery status, last HTTP code, and last attempt time per endpoint
Delivery Logs, Search, and Export
Given deliveries occur to one or more endpoints When a user views Delivery Logs Then they can filter by time range, endpoint, status (Success, Retrying, Final-Failed), event_type, project_id, batch_id, and search by event_id or idempotency_key And each log row shows: event_id, event_type, occurred_at, endpoint URL (masked), attempt number, HTTP code, latency_ms, status, next_retry_at (if any), idempotency_key, signature_version, and response_snippet (first 512 chars) And a detail view shows full request body, headers (excluding full secrets), and per-attempt history with timestamps And logs are retained for at least 30 days and can be exported as CSV or JSON for the selected range
Audit Log & Spend Reports
"As a compliance lead, I want an auditable history of policies and approvals so that we can satisfy audits and investigate anomalies."
Description

Maintain an immutable audit trail for policy changes, approvals/denials, overrides, and notifications with actor, timestamp, IP, and before/after values. Provide searchable, filterable reports by project, user, date range, and policy, with CSV export and API access. Enforce role-based visibility and retention policies, and ensure logs are tamper-evident to satisfy compliance and internal review needs.

Acceptance Criteria
Immutable Audit Trail for Policy Changes
- Given an authorized admin updates any spend policy field, When the change is saved, Then an audit event is appended capturing: policy_id, org_id, actor_user_id, actor_role, actor_ip, user_agent, event_type=policy.updated, timestamp (UTC ISO 8601 with ms), before_values and after_values for all changed fields, content_hash, and previous_hash.
- And the system prevents updates or deletions of existing audit events; any attempt results in HTTP 403 and a new audit event with event_type=audit.write_denied including actor metadata.
- And retrieving the event by policy_id returns all persisted fields and hash values; recomputing the hash matches content_hash.
- And if a stored event is altered in persistence, then the integrity check endpoint for the affected range returns integrity_status="failed" with first_mismatch_event_id.
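The content_hash/previous_hash pairing described here is a hash chain: each event commits to its predecessor, so any in-place mutation is detectable by recomputation. A sketch assuming canonical JSON serialization (the criteria do not fix one):

```python
import hashlib
import json

def chain_event(event: dict, previous_hash: str) -> dict:
    """Append an event to the chain: store previous_hash and a content_hash
    over the event's own canonicalized fields."""
    body = dict(event, previous_hash=previous_hash)
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    body["content_hash"] = hashlib.sha256(canonical.encode()).hexdigest()
    return body

def verify_chain(events: list, genesis_hash: str = "0" * 64):
    """Return None if the chain verifies, else the first mismatched event_id."""
    prev = genesis_hash
    for ev in events:
        recomputed = chain_event({k: v for k, v in ev.items()
                                  if k not in ("content_hash", "previous_hash")}, prev)
        if recomputed["content_hash"] != ev["content_hash"] or ev["previous_hash"] != prev:
            return ev.get("event_id")
        prev = ev["content_hash"]
    return None
```

This is the mechanism behind the integrity-check endpoint's first_mismatch_event_id: walk the chain from a known root and report the first event whose recomputed hash disagrees with what is stored.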
Comprehensive Logging for Approvals, Denials, and Overrides
- Given a batch decision occurs via UI, API, or Slack, When the decision is recorded (approved, denied, overridden), Then an audit event is appended containing: batch_id, project_id, event_type in {approval.approved, approval.denied, approval.overridden}, previous_state, new_state, request_cost_estimate, policy_ids_evaluated, decision_reason (required for denied/overridden), actor_user_id, actor_role, actor_ip, timestamp, and source_channel in {web, api, slack}.
- And overrides require a non-empty justification of at least 10 characters; otherwise the override action is rejected with HTTP 422 and no audit event is written.
- And associated notifications (Slack/Email/Webhook) generate separate audit events with event_type=notification.sent including recipient, channel, delivery_status, provider_message_id, and a correlation_id linking to the decision event.
Searchable, Filterable Spend Audit Reports
- Given a Finance Analyst with permission view_audit_all, When they query reports with filters (project_id, user_id, date_range, policy_id, action_type), Then the API returns only matching events, paginated (default page_size=100, max=1000), sorted by timestamp desc, with next_page_token when more results exist.
- And queries over up to 100,000 events return the first page within 2 seconds at p95 in the staging dataset; total_count is returned as an estimate within ±5%.
- And the UI supports combined filters and free-text search over actor_user_email, batch_id, and event_type; empty-state messaging appears when no results match.
CSV Export and Reporting API Access
- Given any filtered audit query, When the user selects Export CSV, Then a background job generates a CSV within 60 seconds for up to 100,000 rows and returns a signed download URL valid for 24 hours.
- And the CSV includes a header row with at least: event_id, org_id, project_id, batch_id, policy_id, event_type, timestamp, actor_user_id, actor_role, actor_ip, before_values, after_values, source_channel, content_hash, previous_hash.
- And when the result set exceeds 100,000 rows, the export runs asynchronously; progress is visible and a notification is sent on completion.
- And the Reporting API supports filters (project_id, user_id, date_from, date_to, policy_id, event_type), pagination (page_size, page_token), ordering (order_by=timestamp:asc|desc), conforms to the published schema (OpenAPI), and returns HTTP 429 when rate limits are exceeded.
Role-Based Visibility and Field-Level Redaction
- Given a user with project_viewer role (not org_admin), When accessing audit logs, Then only events for projects they can access are returned; other projects return HTTP 403 and the attempt is audited.
- And IP addresses are fully visible only to org_admin and security_analyst roles; other roles see IP masked to /24 (e.g., 203.0.113.xxx) across UI, CSV, and API.
- And export and API responses enforce identical field-level redaction rules; no unredacted IP or sensitive fields appear for unauthorized roles.
- And RBAC is enforced server-side using access token scopes; removing client-side checks does not grant access.
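The /24 masking rule itself is a one-liner; this sketch handles IPv4 only, matching the example format in the criterion (IPv6 handling is out of scope here):

```python
def mask_ip_v4(ip: str) -> str:
    """Mask an IPv4 address to its /24, e.g. 203.0.113.45 -> 203.0.113.xxx."""
    octets = ip.split(".")
    return ".".join(octets[:3] + ["xxx"])
```

Applying the mask in one server-side serialization layer, rather than per surface, is what keeps UI, CSV, and API redaction identical.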
Retention Policy Enforcement and Purge Proofs
- Given an organization retention policy (e.g., 365 days), When events exceed the retention period, Then they are purged by a scheduled job and replaced by a purge-proof record containing: range (first_event_id, last_event_id), count_purged, time_window, and a Merkle root over the purged events' content_hashes.
- And after purge, queries and exports for the purged window return no events and include a reference to the purge-proof; retention configuration changes are themselves audited with before/after values.
- And manual purge runs require org_admin, return a dry-run count before execution, and emit event_type=retention.purge.started and retention.purge.completed with outcome and counts.
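A purge-proof's Merkle root over the purged events' content_hashes might be computed as follows; the pairing convention for odd node counts is an assumption, since the criteria do not specify one:

```python
import hashlib

def merkle_root(content_hashes: list) -> str:
    """Merkle root over hex-encoded content_hashes. Leaves are paired and
    hashed level by level; an odd node is promoted unpaired to the next level."""
    if not content_hashes:
        return hashlib.sha256(b"").hexdigest()
    level = [bytes.fromhex(h) for h in content_hashes]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(hashlib.sha256(level[i] + level[i + 1]).digest())
            else:
                nxt.append(level[i])  # odd node promoted unpaired
        level = nxt
    return level[0].hex()
```

Retaining only this root lets an auditor later confirm that a claimed set of purged hashes matches the proof without the events themselves being kept.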
Tamper-Evidence Integrity Verification and Alerts
- Given the daily integrity verification job, When it runs, Then it recomputes the ledger hash chain (or Merkle tree) per organization and records event_type=integrity.check.completed with integrity_status in {passed, failed}, checked_count, and ledger_root.
- And if integrity_status=failed, Then Slack/Email alerts and a webhook are sent with details; the Audit Reports UI displays a warning banner until the next passing check.
- And GET /v1/audit/integrity?from=...&to=... returns ledger_root, last_event_id, and verification result for the requested range; a deliberately mutated event causes HTTP 409 with first_mismatch_event_id.

Product Ideas

Innovative concepts that could enhance this product's value proposition.

Preset Lockbox

Role-based preset permissions with review gates. Lock edits to approved owners, require approvals for changes, and log usage per brand to prevent off-brand processing.

Idea

Brand Preset Sprint

Guided 10-minute onboarding builds a brand style from five sample images—background, crop, lighting, and retouch levels—then validates on a test batch with instant feedback.

Idea

Compliance Sentinel

Preflight scans every image against Amazon/Etsy rules, auto-corrects background, margins, DPI, and shadow, and outputs a pass/fail report with reasons before you upload.

Idea

Ghost Mannequin Mode

Removes mannequins and reconstructs interior neck/arms for apparel, keeps fabric edges crisp, and matches true garment color to a reference swatch.

Idea

Supplier Fingerprint Router

Auto-detects supplier source from EXIF, logo hints, or lighting profile, then routes images to the correct preset bundle and folder, no manual sorting.

Idea

Style Splitter

One click generates multiple style variants per product (shadow/no-shadow, crop ratios, backgrounds) and pushes A/B assignments to Shopify meta-fields for automatic testing.

Idea

Fair-Flow Billing

Hybrid seat-plus-usage billing with image meters, monthly caps, and auto top-ups; show real-time cost estimates in the batch queue to prevent surprises.

Idea

Press Coverage

Imagined press coverage for this groundbreaking product concept.


PixelLift debuts Brand Preset Sprint to build and validate a studio‑quality catalog look in 10 minutes

Imagined Press Article

PixelLift today announced Brand Preset Sprint, a guided onboarding flow that helps independent online sellers and boutique teams create, test, and publish an on‑brand product‑image style in just 10 minutes. Designed for fast‑moving shops that batch‑upload weekly drops or seasonal collections, Brand Preset Sprint eliminates the trial‑and‑error of photo editing by combining AI retouching, background removal, and preset controls with real‑time feedback on consistency and marketplace compliance. With Brand Preset Sprint, users start by selecting five representative images from a recent upload. PixelLift’s Smart Sampler automatically suggests a diverse set—different lighting conditions, product types, and edge cases—to ensure the preset reflects real‑world catalog variability. As users adjust background, crop, lighting, and retouch levels, the Style Coach provides plain‑language guidance and guardrails that explain trade‑offs, so even non‑designers can dial in an on‑brand look with confidence. The Preview Grid shows instant before/after results across all five samples at once. Teams can toggle elements like shadow, crop ratio, background tone, and retouch strength to compare variants side‑by‑side without reprocessing delays. A built‑in Consistency Meter scores uniformity across the sample set and predicts how well the preset will generalize to the rest of the catalog, flagging outliers with quick‑fix suggestions (e.g., “increase crop margin by 2%”). To help sellers pass marketplace checks the first time, Channel Targets lets teams declare where images will go—Amazon, Etsy, Shopify, or social—and PixelLift automatically enforces compliant bounds for margins, backgrounds, DPI, and aspect ratios. The Batch Validator then runs the new preset on a small test set, returning visual diffs, pass/fail reasons, and estimated processing time and cost. Users can accept or tweak with one click and promote the preset to production when it meets standards. 
Brand Preset Sprint respects the way modern teams work. A Draft Sandbox provides a safe workspace to iterate without touching live outputs, while Role Matrix permissions let admins control who can view, edit, approve, or publish presets per brand or collection. Smart Approvals keep launches on schedule with change diffs and one‑click approvals in email or Slack. When it’s time to go live, Release Scheduler time‑locks new versions and allows instant rollback to avoid surprises during high‑velocity drops. “Independent sellers tell us they’re spending hours nudging backgrounds, crops, and retouch levels—only to discover the look falls apart on the next batch,” said Maya Chen, CEO of PixelLift. “Brand Preset Sprint removes the guesswork. You get a reliable, cohesive style that’s validated for your channels in minutes, not days.” “PixelLift’s guidance is written like a coach, not a manual,” said Taylor Brooks, a growth marketer at an apparel marketplace. “The tips helped us set a seasonally on‑brand look, and the Consistency Meter kept us honest. We cut editing time by more than half and saw a double‑digit lift in add‑to‑cart after we standardized our main images.” The release is built for a range of user types: Solo Catalog Sprinters who need one‑click background removal and preset branding to list faster; Boutique Brand Curators who guard a signature aesthetic across collections; and In‑House Content Generalists who must deliver web‑ready assets across product, email, and ads without a photo team. For studios and agencies, the ability to simulate a user’s access, test with a sandbox, and attach release notes streamlines collaboration without risking brand integrity. Early PixelLift users report time savings up to 80% on editing and 10–20% lifts in listing conversion after standardizing product image styles. 
While results vary by category and channel, Brand Preset Sprint shortens the path to a dependable, compliant look by front‑loading validation and removing manual rework.

Availability and pricing

Brand Preset Sprint is available today in the PixelLift web app and via API. Channel Targets and Batch Validator are included; Draft Sandbox, Role Matrix, and Release Scheduler are available on advanced plans. Existing customers can access the new flow at no additional cost during rollout. For plan details, visit pixellift.ai/pricing.

About PixelLift

PixelLift automatically enhances and styles e‑commerce product photos for independent online sellers and boutique owners who batch‑upload catalogs, delivering studio‑quality, brand‑consistent images in minutes. Its AI retouches, removes backgrounds, and applies style presets to batch‑process hundreds of photos, cutting editing time and boosting listing conversions.

Media contact

Press: press@pixellift.ai
Media inquiries: +1‑415‑555‑0137
Website: www.pixellift.ai/press
Company: PixelLift, Inc., San Francisco, CA


PixelLift launches Compliance Sentinel Suite to eliminate surprise marketplace rejections across Amazon, Etsy, and more

Imagined Press Article

PixelLift today introduced Compliance Sentinel Suite, a comprehensive set of AI‑powered checks and auto‑fixes that helps sellers pass marketplace image rules on the first try. Built for high‑volume Marketplace Compliance Listers and recommerce teams operating against tight SLAs, the suite pairs always‑current rule updates with category‑aware validation and safe auto‑corrections—reducing rejection rates and last‑minute rework.

Compliance Sentinel Suite combines several PixelLift capabilities into a single, cohesive workflow:

- RuleStream: An always‑current rule engine that auto‑syncs marketplace specs by region and category. Sellers receive plain‑language change alerts, automatic revalidation of impacted images, and a clear list of what needs updating.
- Category IQ: Automatic identification of the correct product category from image cues and listing metadata to ensure precise, category‑specific checks—cutting false flags between, for example, apparel, jewelry, and home goods.
- Batch Validator: Fast, preflight testing that returns pass/fail reasons, visual diffs, and estimated processing time and cost before images are pushed to marketplaces.
- FixFlow: A configurable auto‑fix pipeline that resolves common failures—background, margins, DPI, shadows—within safe thresholds, with one‑click previews and optional human review.
- Crosscheck Matrix: Multi‑marketplace validation to visualize conflicts and generate either a compromise export or channel‑specific variants.
- CleanSlate Detect and Proof Pack: High‑accuracy detection and removal of banned overlays (watermarks, text, borders, stickers) with exportable, rule‑by‑rule compliance dossiers for each image or batch.

“The cost of a rejected batch isn’t just a delay; it’s lost rank, lost buy‑box, and a support spiral,” said Rahul Menon, VP of Product at PixelLift. “Compliance Sentinel Suite removes the guesswork and lets sellers ship with confidence.
It’s like having a living spec sheet and a fixer in the loop—so teams can focus on merchandising, not minutiae.” For recommerce operations and high‑mix catalogs, Category IQ pairs with RuleStream to apply the right checks without manual mapping. Edge cases—like semi‑transparent accessories or reflective surfaces—are flagged with high‑fidelity previews so ops leads can make informed, fast decisions. When rules conflict across destinations, Crosscheck Matrix provides a clear recommendation: either export a safe compromise or auto‑generate channel‑specific variants with the correct crops, backgrounds, and DPI for each platform. “Before PixelLift, we were reacting to surprise rule changes and burning hours on re‑uploads,” said Riley Grant, operations lead at a multi‑marketplace consignment brand. “Now, we get ahead of updates, batch‑fix violations, and ship once. Our approval rates improved, and we freed our team to focus on sourcing and pricing.” Compliance Sentinel Suite integrates with Draft Sandbox and Smart Approvals, enabling quick policy reviews by brand managers. Approvers receive concise visual summaries and can approve directly from email or Slack. Release Scheduler lets ops teams time‑lock compliant presets to go live after a drop, and instant rollback prevents mid‑campaign surprises. For agencies and multi‑brand teams, Audit Insights provides tamper‑proof activity trails with usage analytics by brand, user, and preset version. Anomaly alerts—such as off‑brand attempts—surface issues early, and exportable compliance reports map usage to cost centers for clean accountability. The suite supports the broader PixelLift value proposition: automatic background removal, AI retouching, and brand‑scoped style presets that deliver studio‑quality images in minutes. Sellers who standardized their imagery and preflighted with Batch Validator have reported lower rejection rates and faster time to live listings—often on the first submit. 
Availability and pricing

Compliance Sentinel Suite is available today within PixelLift. RuleStream, Category IQ, Batch Validator, FixFlow, Crosscheck Matrix, CleanSlate Detect, and Proof Pack can be enabled per brand or collection. Pricing varies by plan and usage. To learn more or request a demo, visit pixellift.ai/compliance.

About PixelLift

PixelLift enhances and styles e‑commerce product photos for independent sellers and boutique brands, delivering studio‑quality, brand‑consistent images at scale. Its AI retouches, removes backgrounds, applies style presets, and validates against marketplace standards to reduce editing time and lift conversions.

Media contact

Press: press@pixellift.ai
Media inquiries: +1‑415‑555‑0137
Website: www.pixellift.ai/press
Company: PixelLift, Inc., San Francisco, CA


PixelLift unveils Ghost Mannequin Pro with NeckForge, SleeveFill, and SeamFlow to deliver premium apparel imagery at scale

Imagined Press Article

PixelLift today announced Ghost Mannequin Pro, an advanced apparel‑focused pipeline that removes mannequins and reconstructs realistic interior necklines and sleeves—without expensive reshoots or manual cloning. The release brings together NeckForge, SleeveFill, SeamFlow, EdgeGuard, SwatchMatch, ContourShadow, and SizeSync to help fashion brands and studios deliver studio‑grade, color‑true apparel images across entire size runs and color variants. Ghost Mannequin Pro addresses a long‑standing challenge for apparel merchants: turning mixed‑quality product photos into cohesive, high‑conversion listings. NeckForge uses AI to rebuild the interior neckline for an elegant invisible‑mannequin look, with precise controls for depth, curve, collar spread, and label visibility. SleeveFill automatically reconstructs interior sleeves and armholes with natural drape and symmetry, while SeamFlow maintains pattern and seam continuity through reconstructed areas using smart warping and anchor points—avoiding the visual breaks that make garments look cheap. To protect garment detail, EdgeGuard provides thread‑aware matting that preserves delicate fabric edges—lace, mesh, frayed hems—while eliminating halos and fringing on white or colored backgrounds. SwatchMatch ensures color‑true finishing by matching garments to a provided swatch photo or hex value, with ΔE accuracy scores and per‑batch profiling. ContourShadow adds physically plausible interior and ground shadows to restore depth after mannequin removal, with marketplace‑safe presets and automatic shadow/no‑shadow variants. SizeSync locks ghosting parameters across size runs and colorways so collection grids look cohesive and easy to compare at a glance. “Merchants shouldn’t have to choose between speed and craftsmanship,” said Maya Chen, CEO of PixelLift. “Ghost Mannequin Pro delivers a polished, premium look in minutes, while preserving the details that matter—grain, stitching, and true color. 
It scales the best of studio work to the realities of daily e‑commerce.” Studio managers and agencies will appreciate how Ghost Mannequin Pro fits into existing PixelLift workflows. The Preview Grid shows instant before/after results across a five‑image sample set. Draft Sandbox enables iterative tuning without affecting live outputs, while Role Matrix and Brand Binding ensure the right presets are applied to the right brands and channels. Smart Approvals provide concise visual summaries so creative directors can approve changes from email or Slack, and Release Scheduler time‑locks go‑lives to reduce mid‑campaign surprises.

“Before PixelLift, our ghosting pipeline relied on manual retouching and was the bottleneck on every catalog,” said Sam Parker, manager at a commercial photo studio. “With NeckForge and SleeveFill, we pre‑proof thousands of assets a month and still deliver the craftsmanship our clients expect. The result is cleaner grids, fewer returns due to color surprises, and faster delivery.”

Ghost Mannequin Pro is especially effective for:

- Boutique Brand Curators who need to maintain a signature aesthetic across lookbooks and drops.
- Line‑Sheet Listers assembling seasonal B2B catalogs that demand consistent cutouts and color‑true images sized precisely for PDFs and buyer portals.
- Private‑Label managers reworking manufacturer shots into on‑brand assets without costly reshoots.

The release pairs naturally with Compliance Sentinel Suite for marketplace‑specific requirements, ensuring mannequin‑free images meet strict backgrounds, margins, DPI, and shadow rules. For multi‑channel sellers, Crosscheck Matrix helps decide when to ship a single compromise export or auto‑generate channel‑specific variants. As with all PixelLift features, Ghost Mannequin Pro benefits from the platform’s batch processing and automation.
Sellers can process hundreds of photos in minutes, with AI retouching, background removal, and style presets applied consistently across large catalogs. Early adopters report significant reductions in editing time and measurable improvements in conversion after standardizing apparel presentation.

Availability and pricing

Ghost Mannequin Pro is available today to PixelLift customers via the web app and API. NeckForge, SleeveFill, SeamFlow, EdgeGuard, SwatchMatch, ContourShadow, and SizeSync can be enabled per brand or collection. Pricing varies by plan and usage. To request a studio demo, visit pixellift.ai/apparel.

About PixelLift

PixelLift enhances and styles e‑commerce product photos for independent sellers and boutique brands, delivering studio‑quality, brand‑consistent images at scale. Its AI retouches, removes backgrounds, applies style presets, and validates against marketplace standards to reduce editing time and lift conversions.

Media contact

Press: press@pixellift.ai
Media inquiries: +1‑415‑555‑0137
Website: www.pixellift.ai/press
Company: PixelLift, Inc., San Francisco, CA


PixelLift introduces Supplier Fingerprint Router to auto‑normalize dropship imagery and route batches with precision

Imagined Press Article

PixelLift today introduced Supplier Fingerprint Router, a routing and normalization system that detects supplier sources from visual and metadata signals and automatically maps incoming images to the right preset bundle, destination folders, and channel variants—no manual sorting required. Purpose‑built for Dropship Catalog Harmonizers and Client Portfolio Editors, the system turns chaotic mixed uploads into organized, predictable workflows while preserving brand consistency across large catalogs.

Supplier Fingerprint Router builds a robust signature for each supplier using Fingerprint Builder, which extracts EXIF patterns, logo placements, background hues, lighting histograms, and crop ratios from a handful of sample images. Confidence Gate lets teams set per‑supplier thresholds with human‑readable evidence—logo match percentages, EXIF time‑zone matches, and lighting profile similarity. Images that pass are auto‑routed; borderline cases queue for quick review to keep throughput high without risking misroutes.

Once a supplier is recognized, Auto Bind maps detected sources to the correct preset bundle, destination folders, and channel variants with one click. Drift Watch continuously monitors incoming images for shifts from the saved fingerprint—new studio lighting, background color changes, or watermark updates—and alerts operators with suggested updates or a branch‑to‑new‑version workflow (Supplier v2) to keep routing precision tight as vendors evolve. Correction Memory learns from each manual reassignment, adapting weights and rules to reduce repeat errors over time.

“Dropship catalogs are built from inconsistent supplier outputs that change without notice,” said Rahul Menon, VP of Product at PixelLift. “Supplier Fingerprint Router creates a source of truth for each vendor and automatically steers images through the right style and compliance paths. It’s the difference between firefighting every week and running a dependable operation.”

Batch Splitter completes the picture by automatically separating a mixed upload into supplier‑specific sub‑batches. Operators see counts, ETA, and cost per supplier, then process each with the correct presets and optionally re‑merge for export. Fallback Rules provide a smart hierarchy for low‑confidence cases—SKU prefixes, folder names, CSV maps, or API tags—ensuring no asset stalls while still respecting brand and channel constraints.

For Automation Architect Avery and ops engineers, the router exposes transparent evidence and clean API endpoints, making it simple to wire PixelLift into listing pipelines and hit daily SLAs. For agency workflows, Role Matrix and Brand Binding ensure that brand‑scoped presets cannot be misapplied across clients, while Audit Insights offers tamper‑proof trails and usage analytics by supplier, user, and preset version for clear accountability and simpler audits.

“Routing used to be a manual, error‑prone step we dreaded,” said Priya Shah, a private‑label brand manager who onboards multiple vendors each quarter. “With Fingerprint Builder plus Auto Bind, we onboard a supplier in minutes and trust that every image will pick up the right style and marketplace variant automatically.”

Supplier Fingerprint Router pairs naturally with PixelLift’s core value: batch processing that enhances and styles product photos with studio‑quality consistency in minutes. For multi‑channel sellers, Crosscheck Matrix and Channel Targets apply the correct constraints by destination, while FixFlow resolves common failures with safe auto‑corrections.

Availability and pricing

Supplier Fingerprint Router is available today to all PixelLift customers. Fingerprint Builder, Confidence Gate, Auto Bind, Drift Watch, Correction Memory, Batch Splitter, and Fallback Rules can be enabled per workspace, brand, or supplier group. Pricing varies by plan and usage. Learn more at pixellift.ai/suppliers.

About PixelLift

PixelLift enhances and styles e‑commerce product photos for independent sellers and boutique brands, delivering studio‑quality, brand‑consistent images at scale. Its AI retouches, removes backgrounds, applies style presets, and validates against marketplace standards to reduce editing time and lift conversions.

Media contact

Press: press@pixellift.ai
Media inquiries: +1‑415‑555‑0137
Website: www.pixellift.ai/press
Company: PixelLift, Inc., San Francisco, CA
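As a technical aside, the confidence-gated routing described in this release can be sketched as a weighted combination of per-signal similarity scores checked against a per-supplier threshold, with a borderline band that queues images for human review. The signal names, weights, and threshold values below are illustrative assumptions, not PixelLift's published implementation:

```python
# Hypothetical sketch of Confidence Gate-style routing: combine weighted
# similarity scores for a supplier fingerprint, then compare against a
# per-supplier threshold. Signal names and weights are illustrative.

SIGNAL_WEIGHTS = {
    "logo_match": 0.40,       # logo placement/appearance similarity
    "exif_timezone": 0.20,    # EXIF time-zone pattern match
    "lighting_profile": 0.25, # lighting histogram similarity
    "crop_ratio": 0.15,       # crop/aspect-ratio match
}

def route_decision(signals: dict[str, float], threshold: float,
                   review_band: float = 0.1) -> str:
    """Return 'auto_route', 'review', or 'fallback' for one image.

    signals: per-signal similarity scores in [0, 1].
    threshold: per-supplier confidence threshold.
    review_band: margin below threshold that queues for quick review.
    """
    score = sum(SIGNAL_WEIGHTS[name] * signals.get(name, 0.0)
                for name in SIGNAL_WEIGHTS)
    if score >= threshold:
        return "auto_route"           # confident match: route automatically
    if score >= threshold - review_band:
        return "review"               # borderline: human review queue
    return "fallback"                 # low confidence: apply Fallback Rules
```

In this sketch, anything below the review band would fall through to a fallback hierarchy (SKU prefixes, folder names, CSV maps, or API tags), so no asset stalls.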


PixelLift rolls out Style Splitter Growth Suite to test, learn, and auto‑promote winning product‑image variants

Imagined Press Article

PixelLift today rolled out Style Splitter Growth Suite, a complete experimentation toolkit that generates on‑brand image variants, routes traffic intelligently, and auto‑promotes winners—so growth teams can learn faster with less revenue risk. The suite brings together Variant Matrix, Smart Allocator, Audience Splits, Significance Guard, Inventory Sync, Metafield Mapper, and Auto Promote for a tightly integrated testing workflow from creation to rollout.

With Variant Matrix, teams define the knobs they want to test—background, crop, shadow, and retouch—and let PixelLift auto‑generate a clean matrix of valid combinations. The system avoids off‑brand or noncompliant pairs, suggests a minimal set to isolate effects, and batch‑produces all variants in one click. Audience Splits write the correct flags to Shopify metafields so themes serve the right variant to the right audience by device, geo, campaign, customer tag, or price band.

Smart Allocator uses a multi‑armed bandit strategy to continuously rebalance traffic between style variants. High‑performing variants get more exposure while weak variants are automatically deprioritized, within min/max bounds and safe‑start limits set by the team. Significance Guard provides built‑in sample‑size planning and significance checks with plain‑language guidance (e.g., “Need ~480 more views for 95% confidence”) and automatic pauses for underpowered or lopsided tests.

“Creative testing is powerful but fragile,” said Maya Chen, CEO of PixelLift. “Too many teams guess, over‑interpret small numbers, or burn inventory on a losing look. Style Splitter Growth Suite turns experiments into a dependable system that protects revenue while surfacing clear winners.”

To ensure testing plays well with inventory realities, Inventory Sync ties experiment pacing to stock levels. The system throttles or stops tests when items near low stock, shifts traffic to stable variants, and delays new tests until replenishment—ideal for social‑commerce drops and recommerce where availability can change by the hour. Metafield Mapper connects variant flags to themes, page builders, and third‑party apps with zero code, reducing setup time from hours to minutes and preventing theme regressions.

When a style variant reaches statistical confidence, Auto Promote can automatically set it as the default for the product, collection, or supplier fingerprint. PixelLift updates Shopify metafields, archives losing variants, and can backfill future batches with the winning preset—complete with rollback and version notes for safety and auditability. For multi‑brand teams, Role Matrix and Brand Binding ensure tests stay within guardrails, and Audit Insights logs decisions for clean reporting across clients.

“Style Splitter let us answer questions we’d been debating for months—shadow or no shadow, tighter or looser crop—without endless Slack threads,” said Taylor Brooks, a growth marketer at a DTC accessories brand. “The bandit approach made learning quick, and Auto Promote removed the manual follow‑up that used to stall rollouts.”

Style Splitter Growth Suite complements PixelLift’s core strengths—AI retouching, background removal, and brand‑scoped style presets—to deliver a full loop from creation to validated adoption. Sellers who standardize on PixelLift report faster listing workflows, consistent on‑brand presentation, and 10–20% lifts in conversion after tuning hero images for their audience and channel.

Availability and pricing

Style Splitter Growth Suite is available today to PixelLift customers on supported platforms. Variant Matrix, Smart Allocator, Audience Splits, Significance Guard, Inventory Sync, Metafield Mapper, and Auto Promote can be enabled per brand or collection. Pricing varies by plan and usage. Request access at pixellift.ai/splitter.

About PixelLift

PixelLift enhances and styles e‑commerce product photos for independent sellers and boutique brands, delivering studio‑quality, brand‑consistent images at scale. Its AI retouches, removes backgrounds, applies style presets, and validates against marketplace standards to reduce editing time and lift conversions.

Media contact

Press: press@pixellift.ai
Media inquiries: +1‑415‑555‑0137
Website: www.pixellift.ai/press
Company: PixelLift, Inc., San Francisco, CA
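For readers curious how a multi‑armed bandit allocator of the kind described in this release might work, here is a minimal Thompson‑sampling sketch: each variant keeps a Beta posterior over its conversion rate, traffic share is estimated by repeated posterior sampling, and shares are clamped to min/max bounds. The function name, bounds, and statistics format are hypothetical, not PixelLift's actual API:

```python
import random

# Illustrative Thompson-sampling traffic allocator (assumed design, not
# PixelLift's implementation). stats maps variant -> (conversions, views).

def allocate(stats: dict[str, tuple[int, int]],
             min_share: float = 0.05, max_share: float = 0.8,
             draws: int = 2000, seed: int = 0) -> dict[str, float]:
    rng = random.Random(seed)
    wins = {v: 0 for v in stats}
    for _ in range(draws):
        # Sample a plausible conversion rate per variant from its
        # Beta(conversions + 1, failures + 1) posterior; the variant
        # with the highest sample "wins" this draw.
        best = max(stats, key=lambda v: rng.betavariate(
            stats[v][0] + 1, stats[v][1] - stats[v][0] + 1))
        wins[best] += 1
    raw = {v: wins[v] / draws for v in stats}
    # Clamp each share to [min_share, max_share] (the "min/max bounds"),
    # then renormalize so shares sum to 1.
    clamped = {v: min(max(s, min_share), max_share) for v, s in raw.items()}
    total = sum(clamped.values())
    return {v: s / total for v, s in clamped.items()}
```

With data like `{"shadow": (50, 500), "no_shadow": (30, 500)}`, the stronger variant receives most of the traffic while the weaker one keeps a small exploratory floor, which is the behavior the release attributes to Smart Allocator.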


PixelLift introduces Fair‑Flow Billing and Cost Controls to keep creative throughput high and spend predictable

Imagined Press Article

PixelLift today introduced Fair‑Flow Billing and Cost Controls, a transparent and flexible approach to pricing, budgeting, and seat management that keeps creative throughput high without surprise bills. Built for in‑house teams, agencies, and multi‑brand operators, the release combines Live Cost Meter, Smart Caps, Top‑Up Rules, Usage Pools, Forecast Planner, Seat Flex, and Spend Guard into one cohesive financial toolkit.

The Live Cost Meter shows real‑time, per‑batch and month‑to‑date costs as teams queue uploads. Users see per‑image rates by preset, applied discounts, taxes, and remaining cap in one place. Color‑coded warnings and “process within budget” checks prevent overruns before they happen. Smart Caps let admins set soft and hard monthly or daily caps by workspace, brand, project, or client, with clear actions at each threshold: auto‑queue to the next cycle, pause high‑cost steps like ghosting, or request approval.

Top‑Up Rules automate credit replenishment with guardrails. Finance teams can define amounts, max frequency, funding source, and required approvers, enable just‑in‑time micro top‑ups to keep batches flowing, and add spend locks during off‑hours. If a payment fails, the system alerts instantly and follows a safe fallback path so work never stalls.

Usage Pools enable shared credit pools with sub‑allocations per brand, client, or campaign. Teams can reserve credits for scheduled drops, allow carryover or expirations, and transfer balances between pools with audit trails. Agencies and multi‑brand teams get clean cost attribution and fewer end‑of‑month scrambles.

“PixelLift customers asked for the same clarity in billing that we bring to image quality,” said Linh Alvarez, Head of Finance at PixelLift. “Fair‑Flow puts guardrails on spend without adding friction to creative work. It prevents messy surprises and makes it easy for finance and ops to stay aligned.”

Forecast Planner predicts upcoming spend using scheduled releases, historical volumes, and feature choices. Teams can run what‑if scenarios—volume changes, channel mix, effects—to see cost impact, receive alerts if plans will exceed caps, and accept auto‑suggestions to split batches or switch to lighter presets to fit the budget. Seat Flex ensures teams only pay for the seats they use: seats prorate by day, idle users auto‑park after a chosen period, and temporary burst seats accommodate peak weeks. Low‑cost viewer‑only or approve‑only roles keep collaboration high with minimal seat spend.

Spend Guard adds approval gates and alerts for cost thresholds. Batches that would exceed caps or surpass per‑project limits are flagged with clear cost diffs versus baseline and require one‑click approval before processing. Slack and email notifications and API webhooks keep finance and ops informed in real time. Audit Insights rounds out the toolkit with tamper‑proof trails and usage analytics by brand, user, and preset version, simplifying audits and providing confidence in cost allocation.

For Solo Catalog Sprinters and Social Drop Launchers, cost clarity supports rapid release cycles without fear of overage. For Client Portfolio Editors at agencies, Usage Pools and clean attribution remove friction with clients and streamline billing. And for In‑House Content Generalists and ops leaders, the combination of Live Cost Meter and Forecast Planner makes it easy to choose the most cost‑effective settings before hitting run.

Fair‑Flow Billing and Cost Controls complement PixelLift’s core capabilities—AI retouching, background removal, and brand‑scoped style presets—so teams can process hundreds of photos in minutes with total visibility into spend. Customers who standardized imagery and used budget guardrails report more predictable invoices and uninterrupted workflows, even during peak launch weeks.

Availability and pricing

Fair‑Flow Billing and Cost Controls are available today to PixelLift customers. Live Cost Meter, Smart Caps, Top‑Up Rules, Usage Pools, Forecast Planner, Seat Flex, and Spend Guard can be enabled per workspace. For plan information, visit pixellift.ai/billing or contact sales.

About PixelLift

PixelLift enhances and styles e‑commerce product photos for independent sellers and boutique brands, delivering studio‑quality, brand‑consistent images at scale. Its AI retouches, removes backgrounds, applies style presets, and validates against marketplace standards to reduce editing time and lift conversions.

Media contact

Press: press@pixellift.ai
Media inquiries: +1‑415‑555‑0137
Website: www.pixellift.ai/press
Company: PixelLift, Inc., San Francisco, CA
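A rough sketch of the "process within budget" logic described for Live Cost Meter and Smart Caps: estimate a batch's cost from per‑image preset rates, then choose an action against soft and hard caps. The preset names, rates, and action labels here are illustrative assumptions, not PixelLift's actual pricing or API:

```python
# Hypothetical per-image rates by preset (illustrative values only).
PRESET_RATES = {"basic_retouch": 0.05, "background_removal": 0.08, "ghosting": 0.20}

def batch_cost(counts: dict[str, int]) -> float:
    """Estimate batch cost; counts maps preset name -> number of images."""
    return sum(PRESET_RATES[preset] * n for preset, n in counts.items())

def cap_action(month_to_date: float, batch: float,
               soft_cap: float, hard_cap: float) -> str:
    """Decide what to do with a queued batch given soft/hard spend caps."""
    projected = month_to_date + batch
    if projected > hard_cap:
        return "queue_next_cycle"   # hard cap: defer batch to next cycle
    if projected > soft_cap:
        return "request_approval"   # soft cap: one-click approval gate
    return "process"                # within budget: run immediately
```

For example, a batch of 100 basic retouches plus 10 ghosting images costs 7.00 under these assumed rates; whether it processes, pauses for approval, or queues depends on month‑to‑date spend relative to the caps.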
