---
name: fal-studio-dynamic-media-inputs
description: Add fal.ai video models and dynamic upload sections to fal-studio or similar apps by driving the UI from endpoint schemas, separating first/last/general/storyboard references, and mapping uploads to the correct API fields.
version: 1.0.0
author: Hermes Agent
license: MIT
metadata:
  hermes:
    tags: [fal-studio, fal.ai, video models, image-to-video, first-last-frame, dynamic forms, reference images, schema-driven ui]
    related_skills: [fal-replicate-model-inventory, vps-app-deployment, multi-provider-api-resilience]
---

# fal-studio Dynamic Media Inputs

Use this when updating fal-studio or any similar generation app to support:
- text-to-video models
- image-to-video models
- first/last-frame video models
- multi-reference image or video models
- special storyboard/grid reference sections
- model-specific settings panels derived from real endpoint fields

## Core approach

Do NOT hardcode assumptions from memory about a model's inputs.

Instead:
1. Find the endpoint IDs you want to support.
2. Pull the endpoint OpenAPI schema directly from fal.ai.
3. Use the schema as the source of truth for:
   - required fields
   - enum choices
   - default values
   - whether prompt is required
   - whether first frame / last frame / general refs are needed
   - whether output is image or video
4. Use `llms.txt` only as a supplement for pricing notes, prompt conventions, and human-facing descriptions.

## Best discovery workflow

For each fal endpoint:

```text
https://fal.ai/api/openapi/queue/openapi.json?endpoint_id=
```

Then extract:
- request body schema title
- required array
- properties object
- output schema (`video`, `image`, `output`, etc.)

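
The extraction step above can be sketched roughly as follows. This is a minimal sketch, not fal-studio's actual code: it assumes the returned document follows the standard OpenAPI 3 layout (a POST operation whose JSON request body points at a schema under `components.schemas`), and the helper names `resolveRef`, `summarizeInputSchema`, and `fetchEndpointSchema` are hypothetical. It covers only the input side; the output schema would need a similar pass.

```javascript
// Sketch only: assumes a standard OpenAPI 3 document. All helper names are hypothetical.

// Resolve a local "#/components/schemas/Name" reference; return other schemas unchanged.
function resolveRef(doc, schema) {
  if (!schema || typeof schema.$ref !== "string") return schema;
  return schema.$ref
    .replace(/^#\//, "")
    .split("/")
    .reduce((node, key) => (node ? node[key] : undefined), doc);
}

// Find the first POST operation with a JSON request body and summarize its input schema.
function summarizeInputSchema(doc) {
  for (const pathItem of Object.values(doc.paths || {})) {
    const body = pathItem.post?.requestBody?.content?.["application/json"]?.schema;
    const schema = resolveRef(doc, body);
    if (!schema) continue;
    const required = schema.required || [];
    const properties = schema.properties || {};
    return {
      title: schema.title,
      required,
      promptRequired: required.includes("prompt"),
      fields: Object.entries(properties).map(([name, prop]) => ({
        name,
        type: prop.type,
        enum: prop.enum,
        default: prop.default,
        required: required.includes(name),
      })),
    };
  }
  return null; // no POST operation with a JSON body was found
}

// Fetch and summarize one endpoint's schema (uses the global fetch available in Node 18+).
async function fetchEndpointSchema(endpointId) {
  const url =
    "https://fal.ai/api/openapi/queue/openapi.json?endpoint_id=" +
    encodeURIComponent(endpointId);
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Schema fetch failed: ${res.status}`);
  return summarizeInputSchema(await res.json());
}
```

The summary's `required`, `enum`, and `default` values can then seed a model's `params` form and `promptRequired` flag instead of values typed in from memory. If a real schema turns out to be shaped differently, adjust the lookup in `summarizeInputSchema` rather than the UI.
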
Useful companion page:

```text
https://fal.ai/models//llms.txt
```

Use `llms.txt` for:
- human-readable price text
- prompt notes
- examples like Kling O1's `@Image1` / `@Image2`

## Reusable UI model design

Represent each model with:
- `id`
- `name`
- `family`
- `type`
- `mediaType` (`image` or `video`)
- `price` / `priceUnit`
- `promptRequired`
- optional `alternatePromptParam`
- `params` object for the settings form
- `uploads` object for upload sections

Recommended upload section keys:
- `generalReferences`
- `firstFrame`
- `lastFrame`
- `storyboardReference`

Each upload section should include:
- `label`
- `required`
- `maxFiles`
- `apiParam`
- `help`

## Important validation rule

The generate button should NOT only check for a non-empty text prompt.

Use logic like:
- if `promptRequired !== false`, require text prompt
- if `promptRequired === false` and model has `alternatePromptParam`, allow either prompt OR that alternate field
- also require all upload sections marked `required`

This matters for models like:
- Kling O3, where `multi_prompt` can substitute for `prompt`

## Upload handling pattern

Frontend:
- store uploads by section key, not by a single `imageFile`
- keep separate previews for each section
- render each section independently
- show max file count per section

Backend:
- use `multer.any()` instead of `upload.single('image')`
- group files by field name
- upload each file to fal CDN
- map grouped uploads into the correct request fields

## Known upload-to-API mappings discovered

### Single general image reference

Map to one of:
- `image_url`

Used by examples like:
- Flux Kontext
- Flux Dev Img2Img
- Qwen Image Edit

### Multi general references

Map to:
- `image_urls`
- or `reference_image_urls`

Used by examples like:
- Qwen Image 2 Edit
- Nano Banana edit models
- Vidu reference-to-video

### First-frame only video

Map to one of:
- `image_url`
- sometimes `start_image_url`
- sometimes `first_frame_url`

Examples:
- Wan 2.5 image-to-video -> `image_url`
- Veo 3 / Veo 3.1 image-to-video -> `image_url`
- Ovi image-to-video -> `image_url`

### First + last frame video

Map to one of:
- `start_image_url` + `end_image_url`
- `first_frame_url` + `last_frame_url`
- `image_url` + `end_image_url`
- `image_url` + `tail_image_url`

Examples:
- Wan FLF2V -> `start_image_url`, `end_image_url`
- Veo 3.1 FLF -> `first_frame_url`, `last_frame_url`
- Kling O1 -> `start_image_url`, optional `end_image_url`
- Kling O3 -> `image_url`, optional `end_image_url`
- Kling 2.5 Turbo Pro -> `image_url`, optional `tail_image_url`
- Hailuo 02 -> `image_url`, optional `end_image_url`
- Vidu start-end -> `start_image_url`, `end_image_url`

### Special storyboard/grid reference section

If the model doesn't expose a dedicated field, you can still provide a separate UI section and merge that file into the model's normal reference array.

Concrete example discovered:
- Vidu `reference-to-video` uses `reference_image_urls`
- a dedicated `storyboardReference` section can still be offered in the UI
- backend merges `generalReferences + storyboardReference` into `reference_image_urls`

## Result handling pattern

Do not assume image-only responses.

Extract both image and video outputs:
- image candidates: `images[0].url`, `image.url`, `output.url` when content type is image
- video candidates: `video.url`, `videos[0].url`, `output.url` when content type is video

Frontend should:
- render `` for image results
- render `