---
name: fal-studio-dynamic-media-inputs
description: Add fal.ai video models and dynamic upload sections to fal-studio or similar apps by driving the UI from endpoint schemas, separating first/last/general/storyboard references, and mapping uploads to the correct API fields.
version: 1.0.0
author: Hermes Agent
license: MIT
metadata:
  hermes:
    tags: [fal-studio, fal.ai, video models, image-to-video, first-last-frame, dynamic forms, reference images, schema-driven ui]
    related_skills: [fal-replicate-model-inventory, vps-app-deployment, multi-provider-api-resilience]
---

# fal-studio Dynamic Media Inputs

Use this when updating fal-studio or any similar generation app to support:
- text-to-video models
- image-to-video models
- first/last-frame video models
- multi-reference image or video models
- special storyboard/grid reference sections
- model-specific settings panels derived from real endpoint fields

## Core approach

Do NOT hardcode assumptions from memory about a model's inputs.

Instead:
1. Find the endpoint IDs you want to support.
2. Pull the endpoint OpenAPI schema directly from fal.ai.
3. Use the schema as the source of truth for:
   - required fields
   - enum choices
   - default values
   - whether prompt is required
   - whether first frame / last frame / general refs are needed
   - whether output is image or video
4. Use `llms.txt` only as a supplement for pricing notes, prompt conventions, and human-facing descriptions.

## Best discovery workflow

For each fal endpoint:

```text
https://fal.ai/api/openapi/queue/openapi.json?endpoint_id=<ENDPOINT_ID>
```

Then extract:
- request body schema title
- required array
- properties object
- output schema (`video`, `image`, `output`, etc.)
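
The extraction step above can be sketched roughly as follows. This is a minimal sketch, assuming the downloaded document is standard OpenAPI 3 in which the submit operation is a `POST` with an `application/json` request body that points at `#/components/schemas/...` via `$ref`. The helper names `resolveRef` and `summarizeInputSchema` are hypothetical, not part of any fal.ai SDK.

```javascript
// Resolve a local "#/a/b/c" reference against the OpenAPI document.
// Nodes without a $ref are returned unchanged.
function resolveRef(doc, node) {
  if (!node || typeof node.$ref !== 'string') return node;
  const parts = node.$ref.replace(/^#\//, '').split('/');
  return parts.reduce((acc, key) => (acc ? acc[key] : undefined), doc);
}

// Find the first POST operation with a JSON request body and summarize
// the parts the UI needs: title, required array, and properties object.
function summarizeInputSchema(doc) {
  for (const [path, item] of Object.entries(doc.paths ?? {})) {
    const body = item?.post?.requestBody?.content?.['application/json']?.schema;
    if (!body) continue;
    const schema = resolveRef(doc, body);
    if (!schema) continue;
    const required = schema.required ?? [];
    return {
      path,
      title: schema.title ?? null,
      required,
      properties: schema.properties ?? {},
      // Assumption: a prompt is required only when the schema lists it.
      promptRequired: required.includes('prompt'),
    };
  }
  return null;
}
```

Feed it the parsed JSON from the OpenAPI URL above, then inspect `properties` for `enum` and `default` values when building the settings form.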

Useful companion page:

```text
https://fal.ai/models/<ENDPOINT_ID>/llms.txt
```

Use `llms.txt` for:
- human-readable price text
- prompt notes
- examples like Kling O1's `@Image1` / `@Image2`

## Reusable UI model design

Represent each model with:
- `id`
- `name`
- `family`
- `type`
- `mediaType` (`image` or `video`)
- `price` / `priceUnit`
- `promptRequired`
- optional `alternatePromptParam`
- `params` object for the settings form
- `uploads` object for upload sections

Recommended upload section keys:
- `generalReferences`
- `firstFrame`
- `lastFrame`
- `storyboardReference`

Each upload section should include:
- `label`
- `required`
- `maxFiles`
- `apiParam`
- `help`
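
As an illustration, one registry entry in this shape might look like the sketch below. The `name`, `family`, `type`, `promptRequired`, `required`, `maxFiles`, and `help` values are placeholders chosen for the example and should be confirmed against the endpoint schema. Only the endpoint ID and the two `apiParam` names come from the mappings listed later in this document. `requiredSections` is a hypothetical helper.

```javascript
// Hypothetical registry entry for a first/last-frame video model.
const wanFlf2v = {
  id: 'fal-ai/wan-flf2v',
  name: 'Wan FLF2V',            // placeholder display name
  family: 'wan',                // placeholder
  type: 'first-last-frame',     // placeholder
  mediaType: 'video',
  price: null,                  // fill in from llms.txt
  priceUnit: null,
  promptRequired: true,         // assumption: confirm in the schema's required array
  params: {},                   // settings form fields derived from schema properties
  uploads: {
    firstFrame: {
      label: 'First frame',
      required: true,           // assumption
      maxFiles: 1,
      apiParam: 'start_image_url',
      help: 'Image used as the opening frame.',
    },
    lastFrame: {
      label: 'Last frame',
      required: true,           // assumption
      maxFiles: 1,
      apiParam: 'end_image_url',
      help: 'Image used as the closing frame.',
    },
  },
};

// List the upload section keys that must contain a file before generating.
function requiredSections(model) {
  return Object.entries(model.uploads ?? {})
    .filter(([, section]) => section.required)
    .map(([key]) => key);
}
```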

## Important validation rule

The generate button must check more than whether the text prompt is non-empty.

Use logic like:
- if `promptRequired !== false`, require a text prompt
- if `promptRequired === false` and the model has an `alternatePromptParam`, accept either the prompt OR that alternate field
- if `promptRequired === false` and there is no `alternatePromptParam`, treat the prompt as optional
- in every case, also require a file in each upload section marked `required`

This matters for models such as Kling O3, where `multi_prompt` can substitute for `prompt`.
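
The validation rule could be captured in a small pure function like the one below, which is easy to cover with `node:test` before wiring the UI. This is a sketch only: the function name `canGenerate` and the shape of the `state` argument (`prompt`, `alternateValue`, `uploads` keyed by section) are assumptions made for the example.

```javascript
// Decide whether the generate button should be enabled.
// state.uploads maps section keys to arrays of selected files.
function canGenerate(model, state) {
  const { prompt = '', alternateValue = null, uploads = {} } = state;
  const hasPrompt = prompt.trim().length > 0;
  const hasAlternate = Array.isArray(alternateValue)
    ? alternateValue.length > 0
    : Boolean(alternateValue);

  let promptOk;
  if (model.promptRequired !== false) {
    promptOk = hasPrompt;                    // text prompt is mandatory
  } else if (model.alternatePromptParam) {
    promptOk = hasPrompt || hasAlternate;    // either field is enough
  } else {
    promptOk = true;                         // prompt is optional
  }

  // Every upload section marked required must hold at least one file.
  const uploadsOk = Object.entries(model.uploads ?? {}).every(
    ([key, section]) => !section.required || (uploads[key]?.length ?? 0) > 0
  );

  return promptOk && uploadsOk;
}
```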

## Upload handling pattern

Frontend:
- store uploads by section key, not by a single `imageFile`
- keep separate previews for each section
- render each section independently
- show max file count per section

Backend:
- use `upload.any()` on the multer instance (`const upload = multer(...)`) instead of `upload.single('image')`
- group files by field name
- upload each file to fal CDN
- map grouped uploads into the correct request fields
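
The "group files by field name" step can be sketched as below. With `upload.any()`, multer populates `req.files` as an array of file objects that each carry a `fieldname`. Because `any()` accepts every field a client sends, this sketch also drops fields that are not declared upload sections. The helper name and the `allowedKeys` parameter are assumptions for illustration.

```javascript
// Group multer file objects by the form field they arrived under,
// keeping only fields that match a declared upload section key.
function groupFilesByField(files, allowedKeys) {
  const allowed = new Set(allowedKeys);
  const grouped = {};
  for (const file of files ?? []) {
    if (!allowed.has(file.fieldname)) continue; // ignore unexpected fields
    (grouped[file.fieldname] ??= []).push(file);
  }
  return grouped;
}
```

In an Express handler this might be called as `groupFilesByField(req.files, Object.keys(model.uploads))` before uploading each file to the fal CDN.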

## Known upload-to-API mappings discovered

### Single general image reference
Map to:
- `image_url`

Used by examples like:
- Flux Kontext
- Flux Dev Img2Img
- Qwen Image Edit

### Multi general references
Map to:
- `image_urls`
- or `reference_image_urls`

Used by examples like:
- Qwen Image 2 Edit
- Nano Banana edit models
- Vidu reference-to-video

### First-frame only video
Map to one of:
- `image_url`
- sometimes `start_image_url`
- sometimes `first_frame_url`

Examples:
- Wan 2.5 image-to-video -> `image_url`
- Veo 3 / Veo 3.1 image-to-video -> `image_url`
- Ovi image-to-video -> `image_url`

### First + last frame video
Map to one of:
- `start_image_url` + `end_image_url`
- `first_frame_url` + `last_frame_url`
- `image_url` + `end_image_url`
- `image_url` + `tail_image_url`

Examples:
- Wan FLF2V -> `start_image_url`, `end_image_url`
- Veo 3.1 FLF -> `first_frame_url`, `last_frame_url`
- Kling O1 -> `start_image_url`, optional `end_image_url`
- Kling O3 -> `image_url`, optional `end_image_url`
- Kling 2.5 Turbo Pro -> `image_url`, optional `tail_image_url`
- Hailuo 02 -> `image_url`, optional `end_image_url`
- Vidu start-end -> `start_image_url`, `end_image_url`

### Special storyboard/grid reference section
If the model doesn't expose a dedicated field, you can still provide a separate UI section and merge that file into the model's normal reference array.

Concrete example discovered:
- Vidu `reference-to-video` uses `reference_image_urls`
- a dedicated `storyboardReference` section can still be offered in the UI
- backend merges `generalReferences + storyboardReference` into `reference_image_urls`
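
One way to express the mapping, including the storyboard merge, is sketched below. It assumes each upload section also carries a hypothetical `isArray` flag (not part of the field list above) marking array-valued API fields, and that `urlsBySection` holds the CDN URLs produced after upload. Sections that share an `apiParam` are concatenated in declaration order.

```javascript
// Build the media portion of the request body from uploaded URLs.
function buildMediaInput(model, urlsBySection) {
  const input = {};
  for (const [key, section] of Object.entries(model.uploads ?? {})) {
    const urls = (urlsBySection[key] ?? []).slice(0, section.maxFiles);
    if (urls.length === 0) continue;
    if (section.isArray) {
      // Sections sharing an apiParam merge into a single array.
      input[section.apiParam] = [...(input[section.apiParam] ?? []), ...urls];
    } else {
      input[section.apiParam] = urls[0];
    }
  }
  return input;
}
```

For a Vidu `reference-to-video` entry where both `generalReferences` and `storyboardReference` use `apiParam: 'reference_image_urls'`, the result is one `reference_image_urls` array with the storyboard image appended last.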

## Result handling pattern

Do not assume image-only responses.

Extract both image and video outputs:
- image candidates: `images[0].url`, `image.url`, `output.url` when content type is image
- video candidates: `video.url`, `videos[0].url`, `output.url` when content type is video
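
A sketch of that extraction, checking the candidates listed above in order. It assumes the `output` object exposes its MIME type in a `content_type` field; verify the field name against the endpoint's output schema. `extractResult` and `kindFromContentType` are hypothetical helper names.

```javascript
// Classify a MIME type as 'image', 'video', or null when unknown.
function kindFromContentType(contentType) {
  if (typeof contentType !== 'string') return null;
  if (contentType.startsWith('image/')) return 'image';
  if (contentType.startsWith('video/')) return 'video';
  return null;
}

// Return { resultKind, url } for the first usable output, or null.
function extractResult(result) {
  const video = result?.video?.url ?? result?.videos?.[0]?.url;
  if (video) return { resultKind: 'video', url: video };

  const image = result?.images?.[0]?.url ?? result?.image?.url;
  if (image) return { resultKind: 'image', url: image };

  // Generic `output` field: rely on the content type to pick the kind.
  const kind = kindFromContentType(result?.output?.content_type);
  if (kind && result?.output?.url) {
    return { resultKind: kind, url: result.output.url };
  }
  return null;
}
```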

Frontend should:
- render `<img>` for image results
- render `<video controls>` for video results
- save both to gallery with a `resultKind`

Gallery should:
- support image and video previews
- preserve `resultKind`
- use video file extension when saving videos locally
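
For the file-extension point, a small helper along these lines may be enough. It prefers the extension found in the result URL's path and otherwise falls back by `resultKind`. The `.mp4` and `.png` fallbacks and the name `galleryFileName` are assumptions for the example.

```javascript
// Pick a local file name whose extension matches the result kind.
function galleryFileName(baseName, resultKind, url) {
  const fallback = resultKind === 'video' ? '.mp4' : '.png';
  let ext = '';
  try {
    // Query strings are excluded because only the pathname is inspected.
    const match = new URL(url).pathname.match(/\.[a-z0-9]{2,5}$/i);
    ext = match ? match[0].toLowerCase() : '';
  } catch {
    ext = ''; // not a parseable URL; use the fallback
  }
  return `${baseName}${ext || fallback}`;
}
```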

## Local verification workflow that worked

1. Add a tiny utility module for form logic.
2. Add a lightweight Node test file using `node:test`.
3. Verify validation logic before wiring the UI.
4. Run:
   - `node --test model-utils.test.mjs`
   - `npm run build`
5. Verify API status locally:
   - `http://127.0.0.1:4016/api/status`
6. Verify live API status after PM2 restart:
   - `https://fal-studio.apps.poofc.com/api/status`

## Deployment notes specific to fal-studio

- This app is a Vite + Express single-port app on port 4016.
- PM2 process name: `fal-studio`
- Restart with:
```bash
pm2 restart fal-studio
```
- Always verify dist timestamps after build.
- Always commit and push after changes.

## Pitfalls

- Browser automation against a page opened with HTTP Basic credentials in the URL can still produce fetch quirks. If the browser page disagrees with the live `/api/status` endpoint or the PM2 output, trust `/api/status` and PM2.
- Some fal models return 401 because the user's fal account has not activated that model yet. Preserve a friendly activation message in the backend.
- `upload.single('image')` is too limiting for multi-reference and first/last-frame workflows. Use `upload.any()` on the multer instance.
- If you keep only one `imageFile` state in React, you cannot correctly support model-specific upload sections. Replace it with upload state keyed by section.
- Result extraction must support both images and videos, or video models will falsely appear broken.

## Good candidate endpoints to support

Grounded examples that worked well for this pattern:
- `fal-ai/wan-25-preview/text-to-video`
- `fal-ai/wan-25-preview/image-to-video`
- `fal-ai/wan-flf2v`
- `fal-ai/kling-video/v2.5-turbo/pro/image-to-video`
- `fal-ai/kling-video/o3/standard/image-to-video`
- `fal-ai/kling-video/o1/standard/image-to-video`
- `fal-ai/veo3.1/first-last-frame-to-video`
- `fal-ai/veo3.1/image-to-video`
- `fal-ai/veo3/image-to-video`
- `fal-ai/minimax/hailuo-02/standard/image-to-video`
- `fal-ai/vidu/start-end-to-video`
- `fal-ai/vidu/reference-to-video`
- `fal-ai/ovi/image-to-video`