---
name: hermes-creative
description: Operate Hermes Creative - Alex's brand brain, media vault, creative direction, image ideation, campaign strategy, and Telegram-first handoff layer for Hermes Social and Hermes Ads.
version: 0.1.0
author: Hermes Agent
license: MIT
metadata:
  hermes:
    tags: [hermes-creative, brand, creative-direction, media-vault, telegram, social-strategy, paid-ads, image-generation]
    related_skills: [social-console, paid-ads-agent-platforms, vps-app-deployment]
---

# Hermes Creative Operator

Use this skill whenever Alex asks to ideate, brand, concept, generate/review media, save assets to a media vault, create brand direction, or turn brand direction into organic/paid campaign strategy.

Hermes Creative is the **upstream brand intelligence and creative studio** for Alex's Hermes ecosystem:

```text
Telegram capture + user intent
→ Hermes Creative brand/project context
→ Brand Architect / Creative Director / Strategists
→ Hermes Media Vault
→ Hermes Social organic drafts
→ Hermes Ads paid drafts
→ performance learnings back into strategy
```

## Current deployment target

- App path: `/home/avalon/apps/hermes-creative`
- URL: `https://hermes-creative.apps.poofc.com`
- PM2: `hermes-creative`
- Port: `4030`
- DB: `/home/avalon/apps/hermes-creative/data/hermes-creative.sqlite`
- Media vault root: `/home/avalon/hermes-media-vault`
- Repo: `firemountain/hermes-creative`

## Core posture

- Telegram is fast capture, command, and approval.
- Web UI is the studio, vault browser, review board, and strategy cockpit.
- Skills are the specialist employees/procedures.
- Media vault is durable creative memory.
- Hermes Social and Hermes Ads remain downstream execution arms.
- Creative approval is NOT publishing approval and NOT spend approval.
- Hermes Creative is first a **brand imagery and creative-direction system**, not just an app/UI generator.
  Most visual work may become media, content, campaign imagery, identity assets, motion/graphic language, and social/ads material. UI is only one downstream expression when the project calls for it.
- When developing visuals, separate the layers explicitly: **brand identity** (logo, marks, palette, typography), **brand imagery** (campaign/content visual language, generated media, illustrations, moodboards), **content system** (post/ad formats, recurring motifs), and **product/UI expression** (screens, components, interactions). Do not collapse brand imagery into UI unless Alex asks for UI.
- Visual design should feel like the Hermes Ads / Hermes Social suite unless Alex explicitly requests a divergent concept: warm cream/light background, soft white cards, dark navy/black primary controls, muted gold accents, rounded glassy panels, and compact mobile-first spacing. Avoid dark purple prototype styling for the production Hermes Creative UI.
- Hermes Creative is now **collections-first**, not package-driven. Avoid rigid Brand → Campaign → Set → Piece hierarchy unless Alex explicitly reintroduces it. The durable model is flexible `asset_collections` + many-to-many `collection_assets`, with logos, palettes, typography references, campaign images, imported boards, and generated assets represented as assets carrying agent-readable `metadata_json` and `asset_kind`.
- The old Visual Packages wizard was intentionally disconnected from active navigation/API. Preserve reusable UI ideas for a future skill-like guided workflow, but do not rebuild `/visual-packages` endpoints or the Visual Packages tab by default. New work should route through Vault collections and asset metadata.
- Review queue items must be openable before approval. A card summary alone is insufficient context; include rationale, palette/typography/prompt language for directions, file/source/notes for assets/packages/drafts, and an explicit approval question.
- Preferred Review UX: keep the queue as a compact scrollable list, but tapping an item should open a focused non-scrollable swipe-review overlay/deck. Prominent asset/content/package preview on top, approval question visible in the black question box, optional reason/context textbox, explicit Approve/Reject/Delete actions, swipe right approve, swipe left reject, and a separate scrollable “More info” panel for details. Do not make the focused overlay itself scroll into other review items.
- For image assets, the Review queue must show actual visual previews, not just titles, notes, or file paths. Asset review cards should include a thumbnail in the collapsed row and a large preview in the expanded detail. If Alex says images are not shown in Review, debug `review_items` + `assets.media_url` before assuming the assets are missing.

## Default agent roles

### Brand Architect

Senior brand strategist. Defines:

- positioning
- audience
- emotional territory
- story
- archetype
- brand promise
- enemy / what the brand refuses to be
- high-level visual territories

Ask questions like:

- What world does this product live in?
- Who is it for?
- What does it refuse to be?
- What emotional territory should it own?

### Creative Director

Turns brand direction into visual language:

- moodboards
- creative territories
- color/type/image rules
- image prompts
- visual do/don't examples

### Media Vault Librarian

Maintains vault organization:

- references
- generated assets
- approved/rejected folders
- metadata sidecars
- brand context exports
- decision history

### Content Strategist

Turns brand + business goals + assets + stats into organic/paid plans:

- content pillars
- hook banks
- campaign concepts
- social draft ideas
- ad draft ideas
- testing matrix

### Paid Ads Strategist

Creates paid concepts for Hermes Ads only as local drafts unless Alex explicitly approves platform writes.

### Organic Social Strategist

Creates organic post/campaign concepts for Hermes Social as local drafts unless Alex explicitly approves publishing/scheduling.

### Performance Analyst

Reads Ads/Social stats and produces creative learnings and next tests.

## Telegram interaction patterns

Alex may write naturally. For brand-development sessions, use a **visual-first interview** style: generate/show a small number of visual cues, let Alex approve/reject/comment, then ask **one targeted question at a time** based on that feedback. Do not send long lists of abstract brand questions unless Alex explicitly asks for a worksheet.

Interpret messages into one of five intents:

1. **Capture**: save text/link/image/video/voice as a project reference.
2. **Analyze**: ask Brand Architect or Creative Director to interpret references.
3. **Generate**: create directions, prompts, images, strategies, or campaigns.
4. **Review**: approve/reject/compare directions/assets/drafts.
5. **Handoff**: create local drafts in Hermes Social/Hermes Ads.

Examples:

```text
Save this to Astro Mage references and have Brand Architect analyze the vibe.
```

```text
Generate 5 brand directions for Magi from the current vault.
```

```text
Approve direction 2 as primary. Reject 4 as too SaaS.
```

```text
Turn the approved direction into a 14-day organic content plan.
```

```text
Create paid campaign drafts in Hermes Ads, local only, do not push to Meta.
```

## UI response pattern

For Telegram responses, keep it compact:

1. What was saved/created.
2. Best recommendation.
3. What needs review.
4. Direct UI link.
5. Clear next actions.

Never dump huge galleries in Telegram. Use UI links for bulk review.

## App shell / project selector UI preferences

Hermes Creative should be tool-first, not hero-first. Do **not** add or preserve a large repeated hero/marketing section across tabs.
Alex explicitly asked to remove the old persistent “Brand Architect MVP” / “Brand Architect + Creative Director” style hero because it wastes vertical space and repeats on every tab.

Preferred shell pattern:

1. Put the Hermes Creative mark/logo and active project selector in a compact sticky top bar.
2. Keep the project selector globally available while moving through tabs.
3. Include “Create new project…” inside the selector/dropdown, not as a large Home-tab setup form.
4. Creating a project opens a modal/overlay. Cancel returns to the previous project/selection; successful create switches into the new project immediately.
5. Do **not** keep a generic Home tab. The app should open directly into the first real working surface (currently Architect) and the bottom nav should only contain functional workspace tabs.

- Preserve mobile safe-area padding and verify on an iPhone-sized viewport that the header remains compact, the selector/modal are usable, and no Home tab or workspace-summary copy remains.
- For mobile modals/drawers, treat overlays as viewport-rooted surfaces, not descendants of app-shell spacing. Use `visualViewport`-driven `--app-viewport-height` **and** `--app-viewport-top` CSS variables, force overlays to `position:fixed; top:var(--app-viewport-top); height:var(--app-viewport-height)`, lock `body.modal-open`/`body.review-modal-open` with a fixed-position scroll-preserving body lock (not only `overflow:hidden`), and put scrolling on one internal panel with `-webkit-overflow-scrolling:touch`. This prevents the recurring top gap and address-bar minimize scroll glitches. See `references/mobile-modal-viewport-gap-fix-2026-05-19.md`.

See `references/compact-project-topbar-2026-05-17.md` for the implementation notes and verification recipe from the first app-shell overhaul.

See `references/no-home-tab-app-shell-2026-05-17.md` for the follow-up removal pattern and verification notes.

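The viewport-rooted overlay rule above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the app's actual code: the helper names `viewportVars`, `bodyLockStyles`, and `bindViewportVars` are hypothetical, and only the two CSS variable names come from this document.

```javascript
// Pure helper: map a VisualViewport-like object to the two CSS variables.
function viewportVars(vv) {
  return {
    '--app-viewport-height': `${Math.round(vv.height)}px`,
    '--app-viewport-top': `${Math.round(vv.offsetTop)}px`,
  };
}

// Pure helper: inline styles for a fixed-position body lock that keeps
// the page visually at the same scroll offset while a modal is open.
function bodyLockStyles(scrollY) {
  return { position: 'fixed', top: `-${scrollY}px`, left: '0', right: '0', width: '100%' };
}

// Browser-only wiring (hypothetical name). Returns an unbind function.
// Does nothing where `window.visualViewport` is unavailable, e.g. in Node.
function bindViewportVars(root) {
  if (typeof window === 'undefined' || !window.visualViewport) return () => {};
  const vv = window.visualViewport;
  const apply = () => {
    for (const [name, value] of Object.entries(viewportVars(vv))) {
      root.style.setProperty(name, value);
    }
  };
  // The visual viewport fires both events as the address bar or keyboard moves.
  vv.addEventListener('resize', apply);
  vv.addEventListener('scroll', apply);
  apply();
  return () => {
    vv.removeEventListener('resize', apply);
    vv.removeEventListener('scroll', apply);
  };
}
```

In a browser this would presumably be called once as `bindViewportVars(document.documentElement)`. When the modal closes, the lock styles are cleared and the saved offset is restored with `window.scrollTo(0, savedY)`.
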
## Vault UI design rules

The Vault section should feel compact and professional, not “jumbo.” It is now the active visual operating console. When adding reference/import/asset-management UI:

- Treat collections as the main organizing unit, not packages. Show collections as compact visual cards with asset thumbnails/previews. Tapping a collection should open a full-screen collection overlay with Close, Upload, collection metadata, and that collection's visual asset grid.
- Keep the top-level Vault command area extremely compact: only New collection and Pinterest. These are disclosure actions: their forms/options should be hidden by default and only appear after tapping the matching button; tapping again may collapse. Do not leave New collection inputs open on initial Vault load.
- Do not expose Upload or Reference note at the top level; upload belongs inside the open collection overlay. Inside a collection, the default upload/generation UI should show only primary actions such as Upload and Generate. Detailed fields like upload notes, asset kind/logo classification, title overrides, and metadata should appear only after the user has chosen an upload and can see its preview.
- Any upload/generation/ingest action that may take noticeable time must surface a visible processing state in the UI (e.g. uploading spinner/banner, completion/failure status). Do not silently block after file selection.
- For AI generation/editing, use persisted `generation_jobs` as the source of truth for UI feedback. A local busy banner is not enough: the Vault should poll/list jobs from SQLite, show a compact job tray with status/errors/reference counts, and recover after page refresh. Do not globally disable upload/generate/edit while one job runs; users must be able to queue parallel work and see all active jobs. The job tray should be collapsed by default and visually subordinate to the asset grid; never let an expanded log panel dominate the collection header on mobile.
- Generation should open a focused overlay/session rather than stuffing a large inline form into the collection grid. Put the dynamic placeholder/preview above the prompt controls, show in-progress feedback on that preview, use `object-fit: contain` for arbitrary output dimensions, collapse completed job logs by default, and derive “running” spinners only from non-terminal persisted jobs. A fresh **Generate** click means a fresh session: do not fall back to the last completed job/result image when no active job is selected; show a neutral placeholder until the user enters a prompt or selects a job. Prompt textareas must be disclosed only after the user taps an action such as “Enter prompt” / “Edit with AI”, not shown everywhere by default. After a generation completes, “edit image” should stay in the same session and use the generated asset as the edit source/reference. See `references/parallel-generation-edit-analysis-ux-2026-05-18.md` and `references/minimal-disclosed-ai-generation-asset-ui-2026-05-19.md`.
- For asset edit/variation flows, default to a reference-capable edit model and include the source asset as the first reference. If an edit “does not use the reference image,” inspect `/api/assets/:id/edit`, `reference_asset_ids_json`, model selection, and `applyReferenceUrlsToFalBody()` before changing prompts.
- Keep add-reference forms in compact rows/columns with small labels, restrained padding, and shorter textarea heights.
- Prioritize the visual asset grid over form chrome; imported image assets should show thumbnails immediately.
- For large/high-asset collections, compact mode should remove text/actions chrome but **must not make imagery unreadable**. Do not force tiny 3-column mobile thumbnails for tall brand/reference assets; prefer 2 readable columns on phones, fixed square tile regions, `object-fit: contain`, and internal overlay scrolling. See `references/vault-large-collection-compact-grid-2026-05-17.md`.
- Collection overlay cards should be **clean visual tiles**, not review cards: show the image, not label clutter. Hide filename, path, notes, source URL, approve/reject actions, and confusing status/AI badge pairs such as `pending` + `Ready` in the collection grid. Approval/rejection belongs in the Review queue/deck, and status/details belong behind the asset drawer’s disclosed info panels. See `references/collection-overlay-and-fal-queue-2026-05-18.md` and `references/minimal-disclosed-ai-generation-asset-ui-2026-05-19.md`.
- Use `media_url` from the assets API for thumbnails; do not make the frontend reconstruct filesystem paths.
- Asset detail/edit should expose `asset_kind`, status, notes, and JSON metadata so Hermes can later query assets like “approved dark-mode horizontal logo,” but **do not push raw metadata/JSON into the primary creative UI**. Default asset views should be clean and action-disclosed: image-first, minimal close/action controls, and no already-open edit forms, prompt boxes, raw metadata, or redundant labels unless the user explicitly taps **Edit details**, **Edit with AI**, **Analyze**, **Info**, or **Technical**. Show clean human-readable image notes, brand role, fit/mismatch notes, tags, and simple status chips only inside disclosed info surfaces; put raw JSON, paths, IDs, provider/model data, and errors behind a collapsed read-only `Technical` / `More info` disclosure.
- Prefer user-facing labels like “Analyze image”, “Image notes”, “Ready”, “Applied”, and “Needs setup” over developer-centric labels like “AI metadata”, “AI needs review”, or “technical fallback”. If real vision analysis is blocked, show a clear setup state rather than pretending fallback metadata is successful enrichment.
- Prefer two-pane or stacked layouts where capture controls are secondary and the asset board is primary.
- Preserve the Hermes Creative visual tone: eggshell/graphite surfaces, muted gold accents, minimal linework, and compact mobile-first spacing.
- When fixing screenshot-reported Vault UI bugs, identify the exact visible failure first (for example overlap vs crop vs scroll clipping). Do not ship repeated approximate CSS fixes from inspection alone; Alex expects the screenshot symptom to be directly addressed.
- For dense/compact image grids on iPhone Safari, do not rely only on `aspect-ratio` on button-based thumbnails. Give the asset card a real height, force the preview button/image to `height:100%`, use `object-fit:contain`, and keep `overflow:hidden` so images cannot overlap into neighboring rows. Prefer 2 readable columns on mobile over 3 unreadable columns.

See `references/collections-first-vault-overhaul-2026-05-17.md` for the schema migration, API surface, UI pattern, smoke test, and pitfalls from the package-to-collections pivot.

See `references/vault-disclosure-upload-flow-2026-05-17.md` for the follow-up correction: top-level New collection/Pinterest forms hidden by default, upload metadata shown only after file preview, and visible processing states.

See `references/vault-compact-grid-overlap-2026-05-17.md` for the iPhone Safari compact-grid overlap root cause and durable CSS pattern.

## Reversible creative phase work

When Alex asks to move into a next phase after visual exploration, create a versioned/reversible work package before pushing further:

- Use a phase slug under `brand/phases//`.
- Write the main phase output under `brand/`.
- Include `manifest.json` with created files, approved seed assets, model/provider defaults, and timestamp.
- Include `rollback/rollback.sh` that removes/reverts only the files created by that phase.
- State what the rollback preserves (e.g. previously approved assets) vs. what it deletes/reverts.

This is a first-class workflow preference: Alex wants a clean, thorough retry path if he dislikes a direction.

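One way to keep a phase reversible is to generate the rollback script from the manifest, so the script can only touch files the phase itself recorded. The sketch below assumes that approach. The function names and manifest keys (`phase`, `created_at`, `created_files`, `approved_seed_assets`, `defaults`) are hypothetical, and it only handles removing created files, not reverting edited ones.

```javascript
// Quote a path for POSIX sh using single quotes.
function shQuote(path) {
  return "'" + String(path).replace(/'/g, "'\\''") + "'";
}

// Build the manifest object (hypothetical key names) for one phase.
function buildPhaseManifest({ phaseSlug, createdFiles, approvedSeedAssets, defaults, createdAt }) {
  return {
    phase: phaseSlug,
    created_at: createdAt ?? new Date().toISOString(),
    created_files: [...createdFiles],
    approved_seed_assets: [...approvedSeedAssets],
    defaults: { ...defaults },
  };
}

// Build rollback.sh text. It removes only files listed in created_files,
// so previously approved assets are never touched.
// Assumption: the script is run from the project's vault root.
function buildRollbackScript(manifest) {
  const lines = [
    '#!/usr/bin/env bash',
    'set -euo pipefail',
    '# Removes only the files created by this phase.',
    ...manifest.created_files.map((file) => `rm -f -- ${shQuote(file)}`),
  ];
  return lines.join('\n') + '\n';
}
```

Writing `JSON.stringify(manifest, null, 2)` to `manifest.json` and the script text to `rollback/rollback.sh` would then cover the two files required above.
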
## Review queue / swipe review implementation preference

For Hermes Creative UI work, the review queue should support focused review rather than forcing decisions inside a long page:

1. Keep the Review tab as a compact, scrollable list of pending items.
2. Tap an item to open a focused Tinder-style overlay/deck.
3. Lock background/body scrolling while the overlay is open.
4. Keep the asset/content preview prominent and the approval question visible without scrolling.
5. Provide optional reason/context text before approve/reject/delete.
6. Support swipe-right approve and swipe-left reject, plus explicit buttons.
7. Decisions must automatically advance to the next pending review item; approve/reject/delete should **not** close the overlay. Only the Close button exits back to the list.
8. Put detailed metadata in a separate scrollable More Info panel; the focused overlay itself should remain non-scrollable.
9. Delete from review should preserve audit/provenance and mark linked entities consistently.
10. Tune swipe feel for speed/smoothness: no transition while actively dragging, short ease-out after release, `will-change: transform`, and a short busy state to avoid double actions.

See `references/reversible-phases-and-swipe-review-2026-05.md` for the original, session-specific implementation notes and pitfalls, and `references/review-deck-and-media-url-pitfalls-2026-05.md` for deck-advance/media-preview debugging details.

### Retired Visual Package wizard notes

The Visual Package wizard guidance below is historical. As of the collections-first pivot, do **not** expose a Visual Packages tab or rebuild the `/visual-packages` API by default. If Alex asks for the wizard again, reinterpret it as a future guided **skill/workflow** that creates/edits `asset_collections`, `collection_assets`, and asset metadata, not a revival of the old rigid package hierarchy.

When building or fixing a future collection-backed guided wizard:

1. A new Brand Visual Identity wizard should immediately `POST /api/projects/:slug/visual-packages` with `status='draft'`; do not wait until the last step to create the DB row, because Hermes chat, uploads, saved step state, and generation batches need a package id.
2. Final confirmation should complete the draft (typically set `status='in_review'` plus `structured_payload.export.confirmed=true`) rather than auto-approving or setting it primary. Only explicit “Set primary” should call `/set-primary`.
3. Hermes chat inside the wizard should be able to trigger generation when the prompt asks for “generate”, “variation(s)”, “batch”, “options”, or “selectable”. Show generated outputs as selectable batches attached to the active step, and persist selected asset ids in that step’s `structured_payload`.
4. Mobile wizard bottom actions should be a full-width bottom-docked bar, not an indented/floating pill. Use `position: fixed; left: 0; right: 0; width: 100vw; bottom: 0;` plus `env(safe-area-inset-bottom/left/right)` padding and enough card bottom padding so content is not hidden.
5. Mobile landing cards must aggressively prevent horizontal overflow: wrap long vault paths/badges/buttons, set `min-width:0` on cards/rows/grids, `overflow-x:hidden` on the app/body, and avoid single-line buttons that exceed the viewport.
6. **Do not call the mobile wizard fixed just because the footer is fixed.** Verify the active step's section controls and Hermes chat are actually visible/reachable on iPhone. A common regression is global mobile `.card { overflow:hidden }` overriding `.wizard-card { overflow:auto }`, clipping everything below the header/Save button. Force `.wizard-card { overflow-y:auto !important; -webkit-overflow-scrolling:touch !important; }` at mobile breakpoints when needed, and place `Save step` after the section-specific controls so the visible UI is not just a save button.
7. **For mobile, the Brand Visual Identity wizard should feel like a bottom-sheet app screen, not a nested desktop modal.** Remove redundant floating info/description panels inside the wizard, put the package name in the top bar alongside Close, make step selectors a horizontal scrolling pill list, anchor the wizard shell to the bottom of the viewport, and aggressively reduce nested card gutters/padding. The top-level card can keep a small edge margin, but the wizard content itself should not create multiple inset panels that waste phone width.
8. **Avoid `svh` + sticky/grid rows for the mobile wizard sheet.** This caused a huge top gap and vertically clipped step pills on iPhone even after the header was moved inside the sheet. Prefer a true bottom-sheet surface: fixed full-screen overlay, flex `align-items:flex-end`, shell height `calc(100dvh - max(8px, env(safe-area-inset-top)))`, card as a flex column with normal non-sticky topbar/steps, step row `min-height` around 54px, and step pills `height/min-height` around 38px. Verify geometry on a live mobile viewport, not just by inspecting CSS.

See `references/visual-package-wizard-draft-batches-mobile-2026-05.md` for the session-specific implementation diff and verification notes.

See `references/visual-package-wizard-mobile-clipping-2026-05-17.md` for the clipped-controls follow-up.

See `references/visual-package-wizard-bottom-sheet-mobile-2026-05-17.md` for Alex's bottom-sheet/no-gutters correction.

See `references/visual-package-wizard-mobile-dvh-bottom-sheet-2026-05-17.md` for the final iPhone gap/clipped-step root cause and Playwright verification recipe.

## API quick reference

Base local URL: `http://127.0.0.1:4030`

Important endpoints:

```bash
GET /api/health
GET /api/projects
POST /api/projects
GET /api/projects/:slug
GET /api/projects/:slug/references
POST /api/projects/:slug/references
POST /api/projects/:slug/pinterest/import
GET /api/projects/:slug/brand-kit
PATCH /api/projects/:slug/brand-kit
POST /api/projects/:slug/directions/generate
GET /api/projects/:slug/directions
POST /api/directions/:id/approve
POST /api/directions/:id/reject
GET /api/projects/:slug/assets
POST /api/projects/:slug/assets
PATCH /api/assets/:id
POST /api/assets/:id/approve
POST /api/assets/:id/reject
GET /api/projects/:slug/collections
POST /api/projects/:slug/collections
PATCH /api/collections/:id
POST /api/collections/:id/upload
POST /api/collections/:id/assets
DELETE /api/collections/:id/assets/:assetId
POST /api/projects/:slug/strategy/generate
GET /api/projects/:slug/review
POST /api/review/:id/decision
DELETE /api/review/:id
POST /api/hermes/ingest
```

Pitfall: `PATCH /api/projects/:slug/brand-kit` behaves like a full-field update for brand-kit text fields in the current app; omitted fields may be blanked. Always GET the existing brand kit first, merge changes client-side, and send all core fields (`audience`, `offer`, `positioning`, `voice`, `visual_direction`, `colors`, `typography`, `dos`, `donts`, `notes`).

Hermes ingest example:

```bash
curl -fsS -X POST http://127.0.0.1:4030/api/hermes/ingest \
  -H 'Content-Type: application/json' \
  --data '{
    "project_slug":"astro-mage",
    "type":"url",
    "title":"Reference landing page",
    "source_url":"https://example.com",
    "notes":"Premium but not the exact color direction."
  }'
```

Pinterest board import example:

```bash
curl -fsS -X POST http://127.0.0.1:4030/api/projects/north-star/pinterest/import \
  -H 'Content-Type: application/json' \
  --data '{"url":"https://www.pinterest.com/firemnt/celestial/","limit":12,"delay_ms":700}'
```

Use this for moodboards/reference boards when Alex wants vision AI analysis without hammering Pinterest. The importer caches board metadata, downloads images into the media vault, registers `reference-image` assets, and creates review rows. If a valid board imports 0 images, do not retry aggressively; some Pinterest boards expose no RSS/unauthenticated image data, so use an authenticated/browser fallback or ask for exported references.

## AI-assisted asset intelligence / KB / generation architecture

When Alex asks about upload metadata, vision models, Telegram-first asset capture, LLM Wiki/knowledge-base integration, or borrowing from the image generation app, treat this as a core Hermes Creative architecture task, not a manual form/UI tweak.

Current baseline as of commit `84b930f` plus the 2026-05-18 repair: Hermes Creative has durable technical metadata on upload, `asset_ai_enrichments`, project KB scaffold/pages, enrichment apply/reject endpoints, `generation_jobs`, `asset_versions`, fal generation/edit adapter plumbing, collection Generate UI, asset drawer AI/generation surfaces, and OpenAI/Codex subscription-backed vision enrichment. The real vision provider path should default to Hermes/OpenAI subscription auth (`openai-codex`, usually `gpt-5.5`) rather than direct `OPENAI_API_KEY` quota. Local/technical fallback must not be presented as successful visual enrichment.

Preferred target pattern:

1. Save uploaded media immediately and never block upload success on AI provider availability.
2. Queue AI vision enrichment that drafts title, kind, tags, visual semantics, brand role, collection placement, prompt fragments, negative prompt fragments, fit/mismatch against current brand context, do/don’t rules, and proposed brand-rule candidates.
3. Store each AI attempt/proposal in `asset_ai_enrichments`, but on successful real vision analysis also merge the structured fields into canonical `assets.metadata_json` so future Hermes skills/agents can query assets directly without requiring a manual “apply” step. Proposed durable brand rules still require review/approval before updating brand memory.
4. Maintain project-scoped KB files (`kb/SCHEMA.md`, `kb/index.md`, `kb/log.md`, raw/assets/brand/concepts/comparisons/queries) so asset knowledge compounds like an LLM Wiki.
5. Automatically write asset KB pages, but only update durable brand rules/decisions after explicit approval.
6. Borrow fal Studio's server-side generation/edit adapter patterns, not its whole UI. Generated/edited media must become Hermes Creative vault files, assets, collection members, review items, KB pages, generation job records, and automatic vision-enriched asset metadata. FAL queue endpoints may return only `request_id`/`status_url`/`response_url` at first; poll the queue until a real image/video URL exists, treat “still in progress” as non-terminal, and clear stale `generation_jobs.error` on successful reruns. See `references/collection-overlay-and-fal-queue-2026-05-18.md`.
7. For editing current media, default to “Save as new asset”; “Replace current media” requires explicit confirmation and a version snapshot.
8. For AI generation/edit jobs, the UI must be backed by `generation_jobs`, not temporary component state. Poll/list jobs from the project endpoint so active/failed/completed work survives refresh, and show all simultaneous jobs in a compact tray. See `references/parallel-generation-edit-analysis-ux-2026-05-18.md`.
9. For asset edit/variation, default to **OpenAI subscription-backed GPT Image 2** (`openai-codex/gpt-image-2`) when Codex/ChatGPT OAuth is available, and include the source asset as reference id 1. The UI should expose a model picker so Alex can switch to GPT Image 2 High or FAL edit models. If OpenAI/Codex image generation fails, automatically fall back to a reference-capable FAL edit model (currently `fal-ai/nano-banana-2/edit`) and record both requested model and actual fallback provider/model in job/asset metadata. Verify reference images reach the provider: OpenAI/Codex uses `input_image` content, while FAL uses `image_urls`/`image_url` via `applyReferenceUrlsToFalBody()`.

### Vision analysis quality bar

Alex explicitly corrected that AI should **not** merely say “picture of a star” or provide a generic caption. The analysis must answer brand-system questions:

- What role does this image play in the brand system: logo reference, moodboard/reference, campaign image, UI/product expression, generated output, edit output, typography/color/texture reference, etc.?
- Does it fit or conflict with the current brand kit, and why?
- What collections should it belong to?
- What reusable prompt fragments and negative prompt fragments should be saved?
- What visual do/don’t rules does it imply?
- Should it become a proposed brand rule for review?

The implemented prompt version is `asset-enrichment-v2-brand-system` in `src/lib/ai/visionAdapter.mjs`; keep future provider adapters aligned with that contract.

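The quality bar above can be enforced mechanically before an analysis is accepted as real enrichment. The sketch below is illustrative only: the field names (`brandRole`, `brandFit`, `collections`, and so on) are hypothetical stand-ins and are not taken from the `asset-enrichment-v2-brand-system` contract, so map them to the real schema before use.

```javascript
// Hypothetical field names, one per quality-bar question.
const REQUIRED_FIELDS = [
  'brandRole',               // role in the brand system
  'brandFit',                // fit or conflict with the brand kit, and why
  'collections',             // collections it should belong to
  'promptFragments',         // reusable prompt fragments
  'negativePromptFragments', // reusable negative prompt fragments
  'doRules',                 // visual "do" rules it implies
  'dontRules',               // visual "don't" rules it implies
  'proposeBrandRule',        // explicit true/false answer
];

// Null/undefined, blank strings, and empty arrays count as unanswered.
function isEmpty(value) {
  if (value == null) return true;
  if (typeof value === 'string') return value.trim() === '';
  if (Array.isArray(value)) return value.length === 0;
  return false;
}

// Return the quality-bar fields an analysis failed to answer.
function missingBrandSystemFields(analysis) {
  return REQUIRED_FIELDS.filter((field) => isEmpty(analysis?.[field]));
}
```

A caption-only result such as `{ caption: 'picture of a star' }` would fail every field, which is the case the quality bar is meant to catch.
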
### Vision provider and “Needs setup” debugging

If generated/uploaded/edited images show **Needs setup**, inspect backend enrichment state before changing UI labels:

```bash
cd /home/avalon/apps/hermes-creative
# --input-type=module makes Node treat the stdin script as ESM, so `import` works.
node --input-type=module <<'NODE'
import Database from 'better-sqlite3';

const db = new Database('./data/hermes-creative.sqlite');

// Latest 20 enrichment attempts, joined to their assets and the canonical ai.status.
const rows = db.prepare(`
  select e.id, e.asset_id, e.status, e.provider, e.model,
         substr(e.error, 1, 180) error,
         a.title,
         json_extract(a.metadata_json, '$.ai.status') ai_status
  from asset_ai_enrichments e
  join assets a on a.id = e.asset_id
  order by e.id desc
  limit 20
`).all();
console.log(rows);
NODE
```

Known pitfall: direct OpenAI API quota failures (`OPENAI_API_KEY` / `gpt-4o-mini`) produce legitimate `provider_needed`/failed states even though Hermes has ChatGPT subscription access. The fix is to route vision through `openai-codex`/Codex OAuth (Python bridge if needed), then merge successful `proposedMetadata` into canonical `assets.metadata_json` and clear stale `ai.error`. See `references/openai-codex-vision-asset-enrichment-2026-05-18.md`.

See `references/ai-vision-kb-generation-architecture-2026-05-17.md` for the inspected current state, proposed tables, KB layout, brand-aware enrichment schema, generation/edit flow, and implementation order. A full repo-local plan also exists at `/home/avalon/apps/hermes-creative/docs/plans/2026-05-17-ai-vision-kb-generation-architecture.md` (commit `5c01d30`).

See `references/asset-intelligence-foundation-2026-05-18.md` for the deployed foundation, exact verification recipe, and the brand-system analysis correction.

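The merge step in the known pitfall above (fold successful `proposedMetadata` into `assets.metadata_json` and drop the stale `ai.error`) might look roughly like this. It is a sketch under assumptions: the function name is hypothetical, the `ai` object shape beyond `status` and `error` is guessed, and the `'ready'` status value in the test is illustrative rather than the app's real value.

```javascript
// Merge a successful vision result into an asset's metadata_json string.
// - metadataJson: current assets.metadata_json (string, may be empty or null)
// - proposedMetadata: structured fields from a real, successful analysis
// - aiInfo: new ai.* values to record, e.g. { status, provider, model }
function mergeSuccessfulEnrichment(metadataJson, proposedMetadata, aiInfo) {
  const current = metadataJson ? JSON.parse(metadataJson) : {};
  // Drop the stale error left by an earlier failed attempt; keep other ai.* keys.
  const { error: _staleError, ...previousAi } = current.ai ?? {};
  const merged = {
    ...current,
    ...proposedMetadata,
    ai: { ...previousAi, ...aiInfo },
  };
  return JSON.stringify(merged);
}
```

This should only run after a real provider success. A local/technical fallback result must not pass through it, per the rule that fallback metadata is not successful enrichment.
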
## Vault conventions

Vault root:

```text
/home/avalon/hermes-media-vault/projects//
```

Canonical folders:

```text
brand/
references/uploads/
references/screenshots/
references/youtube/
moodboards/
generated/images/
approved/
rejected/
campaigns/
exports/
```

Every major asset/reference should preserve:

- source/provenance
- project
- direction/campaign if relevant
- prompt/model if generated
- status
- tags
- notes
- approval/rejection rationale

## Brand context pack

Each project should maintain:

```text
brand/brand-brief.md
brand/brand-context.md
brand/brand-glossary.md
brand/visual-rules.md
brand/voice-and-tone.md
brand/content-pillars.md
brand/decisions.md
```

The compact `brand-context.md` is the preferred context to inject into future Social/Ads/Creative tasks.

## Safety rules

- Do not publish social content from Hermes Creative.
- Do not push ads to Meta from Hermes Creative.
- Handoffs create local drafts only unless Alex explicitly requests execution in the downstream app.
- Separate approvals:
  - Approve creative direction.
  - Approve asset.
  - Create local draft.
  - Publish/schedule social.
  - Push paused ad.
  - Activate/spend.
- Track provenance. External references are inspiration unless usage rights are clear.
- Never log or expose API tokens/secrets.
- For vision enrichment, real provider failure is not pseudo-success. If Venice/OpenAI/etc. is missing credentials, has no balance, or returns a provider error, mark the asset/enrichment as needing setup and keep the technical error in logs/details; do not merge local technical metadata as if it were a completed brand-system analysis.

## Preferred workflow

1. Create/select project.
2. Capture references via Telegram or UI.
3. For brand exploration, lead with visual cues and approval/rejection loops rather than questionnaires. Ask at most one focused question per turn, derived from the specific visual feedback just given.
4. Ask Brand Architect for synthesis.
5. Keep the current layer clear: if Alex is using Hermes Creative for brand work, stay in brand strategy / creative direction mode and do not drift into detailed app UX/build specs unless explicitly asked. First-open copy may be discussed as brand messaging, not implementation.
6. Generate creative directions.
7. Approve/reject direction.
8. Export/update brand context pack.
9. Generate or upload media assets.
10. Review/approve/reject assets.
11. When Alex approves/rejects a batch and asks what it means, analyze the actual DB decisions, create approved/rejected contact sheets for image sets, compare patterns, save a synthesis note under `brand/`, and create 3–5 new `creative_directions` plus review rows. Do not reduce the signal to a crude binary if approved/rejected sets share motifs; find the sharper boundary between approved and rejected executions.
12. Generate campaign strategy.
13. Handoff local drafts to Hermes Social/Hermes Ads.
14. Pull performance learning later and update strategy.

## Visual branding approval loop

When Alex asks to move into visual branding/image generation from Hermes Creative:

0. First classify the requested visual layer before generating: **brand mark/logo**, **brand imagery/content system**, **campaign/social/ad creative**, **moodboard/reference**, or **UI/product expression**. If UI is not explicitly requested, keep outputs as brand imagery/media direction rather than app screens.
1. Create a compact set of visual directions in `creative_directions`, scoped to the active project.
2. Generate images for those directions. For brand imagery, prompt for reusable visual systems and media/content treatments, not just UI mockups.
3. Copy generated files into the project media vault, usually `generated/images/`.
4. Register every image as an `assets` row using `POST /api/projects/:slug/assets` with `direction_id`, `file_path`, and prompt/provenance notes; this creates review queue rows.
5. Send Telegram-native `MEDIA:/absolute/path` images with short labels so Alex can approve/reject/comment.
6. Apply feedback to actual Hermes Creative asset/direction statuses before generating the next round.
7. Only translate approved brand imagery into UI screens/components when Alex asks for UI or when the project phase is explicitly product design.

Pitfall: do not treat generated images as detached chat artifacts. They must be tracked in Hermes Creative’s DB, media vault, and review queue so approvals/rejections become durable creative memory.

Pitfall: do not over-index on UI. Alex may be building a full brand where the primary visual output is content/media/identity; UI should be treated as a downstream expression of the brand system, not the center of the creative process.

Pitfall: Review queue image assets can be present in `assets` with valid image files but still show no preview if `media_url` is not computed. Current app does **not** store `assets.media_url`; it computes it from `file_path`. The resolver must support both absolute vault paths and vault-relative paths like `projects//generated/images/foo.png`. See `references/review-queue-image-assets-2026-05.md` and `references/review-deck-and-media-url-pitfalls-2026-05.md` for the debug/fix recipe.

## DB maintenance patterns

When Alex asks to remove generated project data shown in the app, act directly on the SQLite DB after inspecting schema and backing up when the change is broad:

- Creative direction cards live in `creative_directions`; their review queue rows live in `review_items` with `item_type='direction'` and matching `item_id`.
- To clear generated starter directions for a project, delete matching `review_items` first, then `creative_directions`, scoped by `project_id`; verify both counts are zero.
- When renaming a project, update `projects.slug`, `projects.name`, `projects.brief`, and `projects.vault_path`; rename the vault folder under `/home/avalon/hermes-media-vault/projects/` if present.
- Also replace old visible names/slugs in attached rows: `brand_kits`, `references`, `campaigns`, and `audit_events` payloads/text fields.
- If a duplicate old project exists, archive/rename it instead of leaving old branding active in the project list.
- Verify with `GET /api/projects`, `GET /api/projects/:new_slug/brand-kit`, and confirm the old slug returns 404.

## When generating media

If Alex provides many source/reference images, do not arbitrarily limit to 3. Use the full provided set or clearly state any provider/model cap first.

### North Star visual style: minimal organic line / block-print (Alex correction)

For the North Star project, the approved visual direction is **organic minimal line art with restrained block-print weight, minimal shading, and lots of negative space**. Earlier overly-ornate "modern talisman" demos were rejected. When generating for North Star:

- Start MINIMAL first. One small primary symbol (eight-point star) + at most one supporting element (hand OR sun circle OR crescent, not all).
- No borders, captions, rays, dot clusters, decorative micro-ornament, or "oracle card" framing unless explicitly requested.
- Slightly imperfect carved/handmade line. Warm cream/eggshell background, flat black ink, optional one muted ochre/rust accent.
- Negate aggressively: "no glossy vector, no detail flare, no captions, no extra symbols, no ornate rays."
- **Banned style words / framings:** Alex does not want the word **"folk"** (or "folkloric", "folk-art") used in prompts or descriptions for North Star or related brand work. Prefer "modern block print", "line art talisman", "minimal block print", or "organic minimal line art". When describing visuals back to Alex, mirror his vocabulary, not stock art-direction phrases.
- See `brand/visual-decision-synthesis.md` in the project vault for the full approval/rejection synthesis and the explicit correction note.
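
The North Star prompt rules above lend themselves to a small pre-flight check. The sketch below is an illustration only: `lint_north_star_prompt`, `BANNED_TERMS`, and `NEGATIVE_SUFFIX` are hypothetical names, not part of Hermes Creative. It rejects the banned "folk" wording and appends the negation phrase before a prompt is sent to any provider.

```python
import re

# Banned words from the North Star correction; matches "folk", "folkloric", "folk-art", etc.
BANNED_TERMS = re.compile(r"\bfolk\w*", re.IGNORECASE)

# Negation phrase quoted from the North Star rules.
NEGATIVE_SUFFIX = (
    "no glossy vector, no detail flare, no captions, "
    "no extra symbols, no ornate rays"
)


def lint_north_star_prompt(prompt: str) -> str:
    """Return the prompt with the negation suffix, or raise on banned wording."""
    hits = BANNED_TERMS.findall(prompt)
    if hits:
        found = sorted(set(h.lower() for h in hits))
        raise ValueError(f"banned style words in prompt: {found}")
    base = prompt.strip().rstrip(".")
    if NEGATIVE_SUFFIX in base:
        return base  # already negated; do not append twice
    return f"{base}. {NEGATIVE_SUFFIX}."


if __name__ == "__main__":
    print(lint_north_star_prompt(
        "Organic minimal line art, one eight-point star, warm cream background"
    ))
```

Raising on banned words, rather than silently rewriting them, keeps the correction visible so the prompt author switches to Alex's preferred vocabulary.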
### DO NOT invent motif lists from scratch (Alex correction)

When Alex asks for "more talisman options" or "another round of brand imagery", **do not** open by inventing a list of stock motifs (eye+triangle, ouroboros, phoenix, wolf+moon phases, all-seeing eye, lotus halos, etc.) and fanning them across models. Alex has explicitly said this produces "cheesy and generic" results that betray the unique references he has already curated. This failure mode happens even when the prompt language sounds correct ("modern block print line art talisman"), because the *motif inventory* is the actual generic ingredient, not the surface adjectives.

Correct opening move for any "more options" or "different references" request on a project that already has approved assets:

1. Query the project's approved assets first, sampling 12 to 15 rows (a literal `LIMIT 12-15` evaluates to a negative number, which SQLite treats as no limit): `sqlite3 .../hermes-creative.sqlite "SELECT a.file_path FROM assets a JOIN projects p ON p.id=a.project_id WHERE p.slug='' AND a.status='approved' ORDER BY RANDOM() LIMIT 15;"`
2. Surface a small contact sheet of those approved references (via the `/media/` URLs) and ask Alex to pick 2-3 as seeds.
3. Only then do reference-guided generation against those seeds.

Never substitute "Hermes invents 10 fresh concepts" for "Hermes inherits from Alex's curation." If Alex's reply is a frustration signal ("not liking", "generic", "rewind", "back to creative direction"), treat that as an order to stop generating and re-anchor on the vault, not to retry with different adjectives.

For the **seven classical planetary gods** workflow, avoid making Jupiter/Mars/Sun/Venus/Mercury/Moon by editing the approved Saturn image. That produces a coherent style but accidentally transfers Saturn's posture and silhouette into the other gods. Correct sequence: generate a fresh text-only base for each god using the compressed ancient-classical → isolated subject → simplified etching prompt, then apply the approved style variants to that god's own base image.
See `references/planetary-god-fresh-base-correction-2026-05-16.md`.

When Alex reviews individual planetary gods, treat his character-direction notes as iconographic corrections, not just "style" tweaks. Known corrections: Zeus/Jupiter should read more powerful/kingly/thunderous; Sun should read as powerful Apollo/solar god; Venus should be more alluring/classically beautiful but provider prompts must use safe museum-classical wording; Luna should be lunar mother goddess first with Artemis/Diana cues second. For Luna specifically, avoid AI bow/hand artifacts: no bow in hand, no extra hand, no bow emerging from the dress; place a bow separately on the ground or omit it. See `references/planetary-god-v3-v4-feedback-2026-05-17.md`.

### Multi-engine image generation with vault references (preferred pattern)

This is the **default** pattern for any brand-imagery generation round on a project with approved references, not a fallback. Text-only prompts should be reserved for the very first exploration round on a brand-new project with no curated references yet.

When generating brand imagery from approved references, the strongest pattern is to feed real approved Hermes Creative assets to reference-capable edit endpoints rather than relying on text-only prompts.

1. Confirm the vault assets are publicly reachable at `https://hermes-creative.apps.poofc.com/media/projects//...` (the server mounts `/media` on the vault root).
2. Pick 1-3 approved references that best embody the target style (filter on `assets.status='approved'`).
3. Submit in parallel to multiple engines for comparison:
   - **fal `nano-banana/edit`**: `image_urls` array, accepts multiple refs, best for multi-ref style transfer.
   - **fal `qwen-image-edit`**: single `image_url`, fast and obedient to minimal prompts.
   - **fal `flux-pro/kontext`**: single `image_url`, good for clean line preservation.
   - **Venice `/image/multi-edit`**: up to 3 images, but uses `modelId` (not `model`) and only certain model ids work (`qwen-edit`, not `qwen-image-2`).
4. Register every output as an asset row pointing into `generated/images//` with provenance in notes.

Provider quirks worth remembering:

- **OpenAI gpt-image-2** has three quality tiers: `low`, `medium`, `high` (set via `hermes config set image_gen.model gpt-image-2-high`). `medium` tends to over-decorate even when prompts say minimal; `high` is meaningfully more obedient. Codex auth shares a usage budget and will 429; have FAL ready as a fallback.
- **Venice** requires balance on the account or all `/image/*` endpoints return 402 `Insufficient USD or Diem balance`; check before relying on it.
- **Replicate** requires `REPLICATE_API_TOKEN` in `~/.hermes/.env`; without it, skip Replicate entirely rather than guessing.

See `references/north-star-visual-branding-loop-2026-05.md` and the project vault's `brand/visual-decision-synthesis.md` for the full North Star approval/rejection mapping.

## Deployment notes

Build/restart:

```bash
cd /home/avalon/apps/hermes-creative
npm run build
PORT=4030 pm2 start /home/avalon/apps/hermes-creative/server.mjs --name hermes-creative --cwd /home/avalon/apps/hermes-creative
pm2 save
curl -sS http://127.0.0.1:4030/api/health
```

Pitfall: `pm2 restart hermes-creative --update-env` can preserve or reapply a stale `PORT` from another app if the process was previously started incorrectly. If `:4030` health fails after restart, check `pm2 env | grep '^PORT'` and recreate the process with `PORT=4030 pm2 start ... --cwd ...`, then `pm2 save`.

PWA app shell should not rely on nginx Basic Auth; use app-level auth for sensitive APIs when added.
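
The stale `PORT` pitfall in the deployment notes above can be checked mechanically. This is a minimal sketch, assuming the `pm2 env` output lists one variable per line as either `PORT: 4030` or `PORT=4030` (the exact format may differ between PM2 versions). `find_port`, `port_is_stale`, and `EXPECTED_PORT` are hypothetical names.

```python
import re
from typing import Optional

EXPECTED_PORT = 4030  # Hermes Creative port from the deployment target


def find_port(env_text: str) -> Optional[int]:
    """Extract PORT from `pm2 env` style output, or None if it is absent."""
    for line in env_text.splitlines():
        # Accept both "PORT: 4030" and "PORT=4030"; the format is an assumption.
        match = re.match(r"^PORT\s*[:=]\s*(\d+)\s*$", line.strip())
        if match:
            return int(match.group(1))
    return None


def port_is_stale(env_text: str, expected: int = EXPECTED_PORT) -> bool:
    """True when the process environment has a missing or different PORT."""
    return find_port(env_text) != expected


if __name__ == "__main__":
    sample = "NODE_ENV: production\nPORT: 3000\n"
    print(port_is_stale(sample))
```

Feed it the captured text of the `pm2 env` check from the pitfall. If `port_is_stale` returns true, recreate the process with `PORT=4030 pm2 start ...` as described, then run `pm2 save`.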
## References

- `references/openai-codex-vision-asset-enrichment-2026-05-18.md` - Root-cause/fix recipe for assets stuck at **Needs setup** because vision used quota-exhausted direct OpenAI instead of Hermes/OpenAI Codex OAuth; includes Node→Python bridge pattern, canonical metadata merge, stale error clearing, and DB verification queries.
- `references/openai-codex-gpt-image-2-edit-default-2026-05-18.md` - Correction/pattern for making GPT Image 2 via OpenAI/Codex subscription the default image edit/variation model, while keeping a UI model picker and FAL automatic fallback with reference preservation.
- `references/parallel-generation-edit-analysis-ux-2026-05-18.md` - durable pattern for persisted `generation_jobs` UI feedback, parallel AI job trays, refresh recovery, edit-reference defaults, and clear image-analysis setup/status labels.
- `references/clean-asset-metadata-ui-and-venice-vision-2026-05-18.md` - UX and provider correction for asset intelligence: hide raw metadata/JSON behind read-only Technical details, use clean “Image notes” labels/status chips, wire Venice `qwen3-vl-235b-a22b`, and surface Venice billing/config errors as **Needs setup** rather than pseudo-success.
- `references/telegram-workflows.md` - Telegram capture/review/handoff examples.
- `references/ui-review-queue-2026-05.md` - design-system correction and expandable review queue pattern from the MVP launch.
- `references/north-star-brand-onboarding-2026-05.md` - North Star / North Star OS naming, language, and first-open onboarding direction from the former Astro Mage brand discussion.
- `references/north-star-visual-branding-loop-2026-05.md` - North Star visual-branding approval loop: directions → generated images → media vault/assets/review queue → Telegram approvals.
- `references/pinterest-board-import-2026-05.md` - Pinterest board import implementation, RSS/HTML fallback quirks, verification recipe, and PM2 `PORT` pitfall.
- `references/multi-engine-reference-guided-generation.md` - Multi-engine (FAL/Venice/OpenAI/Replicate) reference-guided image generation pattern: pick approved vault assets, expose via `/media/`, fan out to ref-capable edit endpoints, register outputs back as pending assets. Includes provider quirks and prompt rules for sparse/minimal output.
- `references/image-gen-provider-fallback-and-multimodel-comparison.md` - Codex 429 fallback recipe (`hermes config set image_gen.provider/model`), recommended FAL model ids per aesthetic (Flux Pro Ultra / Recraft v3 / Ideogram v3 / Seedream 4), and the multi-model comparison batch workflow with labeled galleries.
- `references/reference-seeded-generation-from-approved-vault-2026-05-16.md` - When Alex says "more options / different references / rewind", DO NOT invent motif lists; query approved vault assets, let Alex pick 2-3 seeds, fan across nano-banana/qwen-edit/flux-kontext. Includes FAL `.env` parsing gotcha and the asset-registration sequence.
- `references/review-queue-image-assets-2026-05.md` - Debug/fix recipe for Hermes Creative Review queue image assets: `review_items` vs `assets.media_url`, React preview rendering, CSS thumbnail/full-preview styles, and deploy verification.
- `references/reversible-brand-marketing-pipeline-and-continuous-review-deck-2026-05.md` - Reversible brand/marketing pipeline phase pattern plus continuous Tinder-style review deck implementation: advance-after-swipe behavior, smooth drag CSS, decision notes, delete semantics, and reviewable marketing concept registration.
- `references/classical-planetary-god-etching-pipeline.md` - Alex's iterative ChatGPT-style deity-image workflow for the seven classical planetary gods: ancient alchemical/classical image → isolate subject → minimal line sketch with subtle shading → etching; includes Mercury prompt and model-verification note.
- `references/planetary-god-v3-v4-feedback-2026-05-17.md` - targeted review corrections for the planetary gods: stronger Zeus/Apollo, safer Venus prompt wording, Luna mother-goddess direction, Luna bow/hand artifact avoidance, and batch idempotency lessons.
- `references/planetary-god-style-variation-transparent-assets-2026-05-16.md` - Saturn variation session: one-prompt vs edit-chain findings, rougher style prompts for block print/marginalia/leadpoint, direct OpenAI/Codex `gpt-image-2` edit invocation, and transparent square bottom-centered asset cleanup workflow. Helper: `scripts/line_art_transparent_square.py`.
- `references/planetary-god-fresh-base-correction-2026-05-16.md` - important correction for the seven planetary gods workflow: do not use Saturn as the reference image for the other gods; create a fresh text-only base per god first, then apply the approved style directives to that god's own base image.
- `references/approval-rejection-synthesis-2026-05.md` - workflow for turning asset approval/rejection batches into contact-sheet analysis, decision synthesis, and new reviewable creative directions.
- `references/visual-identity-packages-foundation-2026-05.md` - package-driven Hermes Creative architecture and implementation notes, including Alex’s correction that brand visual identity packages must be structured visual kits (logo/color/type/layout/inspiration), not copied Brand Kit prose; covers tables/APIs, Visual Packages tab, brand wizard, upload endpoint, dynamic lower-level packages, and next universal-review steps.
- `references/visual-package-wizard-draft-batches-mobile-2026-05.md` - follow-up correction for Visual Package wizard semantics: new wizard creates a live draft immediately; Hermes chat can trigger selectable variation batches; final confirm marks ready/in-review rather than primary; mobile footer and landing overflow CSS fixes.
- `references/visual-package-wizard-verification-smoke-2026-05.md` - historical/deprecated deploy/smoke verification recipe for the old wizard: build + PM2 + health checks, API-created temporary draft, final-confirm patch to `in_review` (not `approved`), archive cleanup, and browser-sandbox fallback checks. Do not use this as the default architecture after the collections pivot.
- `references/collections-first-vault-overhaul-2026-05-17.md` - package-to-collections pivot: schema migration from `visual_identity_packages`/`package_assets` into `asset_collections`/`collection_assets`, new collection APIs, Vault command/grid UI, richer asset metadata fields, smoke tests, and pitfalls.
- `references/project-clone-and-collection-seeding-2026-05-17.md` - workflow for creating a new Hermes Creative project from an existing project's brand kit/assets, cloning assets into the new vault, registering Telegram chat-uploaded logo images, and seeding flexible collections with provenance metadata.