Use this when updating fal-studio or any similar generation app to support:
- text-to-video models
- image-to-video models
- first/last-frame video models
- multi-reference image or video models
- special storyboard/grid reference sections
- model-specific settings panels derived from real endpoint fields
Do NOT hardcode assumptions from memory about a model's inputs.
Instead:
1. Find the endpoint IDs you want to support.
2. Pull the endpoint OpenAPI schema directly from fal.ai.
3. Use the schema as the source of truth for:
- required fields
- enum choices
- default values
- whether prompt is required
- whether first frame / last frame / general refs are needed
- whether output is image or video
4. Use llms.txt only as a supplement for pricing notes, prompt conventions, and human-facing descriptions.
For each fal endpoint, fetch the schema from:
https://fal.ai/api/openapi/queue/openapi.json?endpoint_id=<ENDPOINT_ID>
Then extract:
- request body schema title
- required array
- properties object
- output schema (video, image, output, etc.)
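The extraction steps above can be sketched as a pure function over an already-fetched OpenAPI document. This is a minimal sketch under stated assumptions: the helper names `resolveRef` and `extractInputSchema` are hypothetical, and it assumes the request body sits under a POST operation at `requestBody.content['application/json'].schema`, possibly as a `$ref` into `components.schemas`. That is the usual OpenAPI 3 layout, but it should be checked against what fal actually returns. The sample document is hand-written for illustration and is not a real fal schema.

```javascript
// Resolve a local "#/components/schemas/Name" reference.
// Returns the node unchanged when it is not a $ref.
function resolveRef(doc, node) {
  if (!node || typeof node.$ref !== 'string') return node;
  return node.$ref
    .replace(/^#\//, '')
    .split('/')
    .reduce((cur, key) => (cur == null ? undefined : cur[key]), doc);
}

// Find the first POST operation with a JSON request body and summarize it.
function extractInputSchema(doc) {
  for (const pathItem of Object.values(doc.paths || {})) {
    const body = pathItem.post && pathItem.post.requestBody;
    const media = body && body.content && body.content['application/json'];
    if (!media || !media.schema) continue;
    const schema = resolveRef(doc, media.schema);
    if (!schema) continue;
    const required = schema.required || [];
    return {
      title: schema.title || null,
      required,
      properties: schema.properties || {},
      promptRequired: required.includes('prompt'),
    };
  }
  return null;
}

// Hand-written sample (assumption, for illustration only).
const sampleDoc = {
  paths: {
    '/example/endpoint': {
      post: {
        requestBody: {
          content: {
            'application/json': {
              schema: { $ref: '#/components/schemas/ExampleInput' },
            },
          },
        },
      },
    },
  },
  components: {
    schemas: {
      ExampleInput: {
        title: 'ExampleInput',
        type: 'object',
        required: ['image_url'],
        properties: {
          prompt: { type: 'string' },
          image_url: { type: 'string' },
          duration: { type: 'string', enum: ['5', '10'], default: '5' },
        },
      },
    },
  },
};

const summary = extractInputSchema(sampleDoc);
```

In a real run the document would come from the URL above, for example via `fetch(url).then((r) => r.json())` on a Node version that ships a global `fetch`. Enum choices and defaults can then be read straight from `summary.properties`.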
Useful companion page:
https://fal.ai/models/<ENDPOINT_ID>/llms.txt
Use llms.txt for:
- human-readable price text
- prompt notes
- examples like Kling O1's @Image1 / @Image2
Represent each model with:
- id
- name
- family
- type
- mediaType (image or video)
- price / priceUnit
- promptRequired
- optional alternatePromptParam
- params object for the settings form
- uploads object for upload sections
Recommended upload section keys:
- generalReferences
- firstFrame
- lastFrame
- storyboardReference
Each upload section should include:
- label
- required
- maxFiles
- apiParam
- help
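As an illustration of the shape described above, here is one possible model entry. The endpoint ID and the `image_url` / `tail_image_url` mapping come from the Kling 2.5 Turbo Pro example in this document; everything else (the `family` and `type` strings, the `params` entry shape, the `promptRequired` value, the help text) is a placeholder assumption that should be replaced with values read from the endpoint schema and llms.txt. The checker `findIncompleteSections` is a hypothetical helper.

```javascript
// Example model entry. Values marked "placeholder" are assumptions,
// not facts about the real endpoint.
const exampleModel = {
  id: 'fal-ai/kling-video/v2.5-turbo/pro/image-to-video',
  name: 'Kling 2.5 Turbo Pro',
  family: 'kling', // placeholder grouping label
  type: 'image-to-video', // placeholder type label
  mediaType: 'video',
  price: null, // fill in from llms.txt
  priceUnit: null, // fill in from llms.txt
  promptRequired: true, // placeholder: confirm against the schema's required array
  params: {
    // placeholder entry: copy real enum choices and defaults from the schema
    duration: { type: 'enum', options: ['5', '10'], default: '5' },
  },
  uploads: {
    firstFrame: {
      label: 'First frame',
      required: true,
      maxFiles: 1,
      apiParam: 'image_url',
      help: 'Image the video starts from.',
    },
    lastFrame: {
      label: 'Last frame',
      required: false,
      maxFiles: 1,
      apiParam: 'tail_image_url',
      help: 'Optional image the video ends on.',
    },
  },
};

const REQUIRED_SECTION_KEYS = ['label', 'required', 'maxFiles', 'apiParam', 'help'];

// List "section.field" entries that are missing from a model's upload sections.
function findIncompleteSections(model) {
  const problems = [];
  for (const [key, section] of Object.entries(model.uploads || {})) {
    for (const field of REQUIRED_SECTION_KEYS) {
      if (!(field in section)) problems.push(`${key}.${field}`);
    }
  }
  return problems;
}
```

Running a check like this over the whole model list at startup is one way to catch a section that was added without an `apiParam`.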
The generate button should NOT check only for a non-empty text prompt.
Use logic like:
- if promptRequired !== false, require text prompt
- if promptRequired === false and model has alternatePromptParam, allow either prompt OR that alternate field
- also require all upload sections marked required
This matters for models like:
- Kling O3, where multi_prompt can substitute for prompt
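The rules above could be expressed as a single predicate. This is a sketch, not the app's actual code: `canGenerate`, `isFilled`, and the `state` shape (`prompt`, `alternatePrompt`, `uploads`) are hypothetical names. It also assumes that when `promptRequired === false` and no `alternatePromptParam` is declared, the prompt is simply optional, and that an alternate value counts as present when it is a non-empty string or non-empty array.

```javascript
// True when a value looks "filled in": non-blank string, non-empty array,
// or any other truthy value.
function isFilled(value) {
  if (typeof value === 'string') return value.trim() !== '';
  if (Array.isArray(value)) return value.length > 0;
  return Boolean(value);
}

// Decide whether the generate button should be enabled.
function canGenerate(model, state) {
  const hasPrompt = isFilled(state.prompt);
  const hasAlternate = isFilled(state.alternatePrompt);

  let promptOk;
  if (model.promptRequired !== false) {
    promptOk = hasPrompt; // default: a text prompt is required
  } else if (model.alternatePromptParam) {
    promptOk = hasPrompt || hasAlternate; // either field satisfies the model
  } else {
    promptOk = true; // assumption: prompt is optional for this model
  }

  const uploads = state.uploads || {};
  const uploadsOk = Object.entries(model.uploads || {}).every(
    ([key, section]) => !section.required || (uploads[key] || []).length > 0
  );

  return promptOk && uploadsOk;
}
```

For a Kling O3 style entry with `promptRequired: false` and `alternatePromptParam: 'multi_prompt'`, this enables the button once either the prompt or the multi-prompt value is filled and every required upload section has a file.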
Frontend:
- store uploads by section key, not by a single imageFile
- keep separate previews for each section
- render each section independently
- show max file count per section
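One way to keep upload state keyed by section is a pair of pure update helpers that a React `useState` setter can call. The names `setSectionFiles` and `removeSectionFile` are hypothetical, and the choice to truncate silently at `maxFiles` (rather than reject the selection) is an assumption made for brevity.

```javascript
// Replace the files for one section, keeping at most maxFiles.
// Returns a new object so React sees a state change.
function setSectionFiles(uploads, sectionKey, files, maxFiles) {
  const limited = Array.from(files).slice(0, maxFiles);
  return { ...uploads, [sectionKey]: limited };
}

// Remove one file from a section by index, leaving other sections untouched.
function removeSectionFile(uploads, sectionKey, index) {
  const current = uploads[sectionKey] || [];
  return {
    ...uploads,
    [sectionKey]: current.filter((_, i) => i !== index),
  };
}
```

A component would call something like `setUploads((prev) => setSectionFiles(prev, 'firstFrame', event.target.files, section.maxFiles))`. Previews can be held in a parallel object with the same section keys.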
Backend:
- use multer.any() instead of upload.single('image')
- group files by field name
- upload each file to fal CDN
- map grouped uploads into the correct request fields
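The grouping step can be sketched independently of Express. With `multer.any()`, uploaded files arrive as an array on `req.files`, and each entry carries a `fieldname`. The helpers below (`groupFilesByField`, `pickSectionFiles`) are hypothetical names. Because `any()` accepts files from every field, the sketch also drops fields that the model does not declare and caps each section at its `maxFiles`, which is a design assumption rather than something multer does for you.

```javascript
// Group multer file objects by the form field they were posted under.
function groupFilesByField(files) {
  const grouped = {};
  for (const file of files || []) {
    if (!grouped[file.fieldname]) grouped[file.fieldname] = [];
    grouped[file.fieldname].push(file);
  }
  return grouped;
}

// Keep only the sections the model declares, capped at each section's maxFiles.
function pickSectionFiles(grouped, uploadsConfig) {
  const picked = {};
  for (const [key, section] of Object.entries(uploadsConfig || {})) {
    const files = grouped[key] || [];
    if (files.length > 0) picked[key] = files.slice(0, section.maxFiles);
  }
  return picked;
}
```

In a route handler this would be called as `pickSectionFiles(groupFilesByField(req.files), model.uploads)` before each remaining file is uploaded to the fal CDN and its URL placed under the section's `apiParam`.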
A single reference image maps to:
- image_url
Used by, for example:
- Flux Kontext
- Flux Dev Img2Img
- Qwen Image Edit
Multiple reference images map to:
- image_urls
- or reference_image_urls
Used by, for example:
- Qwen Image 2 Edit
- Nano Banana edit models
- Vidu reference-to-video
A first frame for image-to-video maps to one of:
- image_url
- sometimes start_image_url
- sometimes first_frame_url
Examples:
- Wan 2.5 image-to-video -> image_url
- Veo 3 / Veo 3.1 image-to-video -> image_url
- Ovi image-to-video -> image_url
A first/last-frame pair maps to one of:
- start_image_url + end_image_url
- first_frame_url + last_frame_url
- image_url + end_image_url
- image_url + tail_image_url
Examples:
- Wan FLF2V -> start_image_url, end_image_url
- Veo 3.1 FLF -> first_frame_url, last_frame_url
- Kling O1 -> start_image_url, optional end_image_url
- Kling O3 -> image_url, optional end_image_url
- Kling 2.5 Turbo Pro -> image_url, optional tail_image_url
- Hailuo 02 -> image_url, optional end_image_url
- Vidu start-end -> start_image_url, end_image_url
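The per-model field names above can be kept in data instead of branching code. The sketch below encodes three of the listed examples, pairing each mapping with its endpoint ID from the list of grounded examples; the table name `FRAME_PARAMS` and the helper `buildFrameFields` are hypothetical, and in practice the mapping would live in each model's `uploads.firstFrame.apiParam` and `uploads.lastFrame.apiParam`.

```javascript
// First/last-frame request field names, taken from the examples above.
const FRAME_PARAMS = {
  'fal-ai/wan-flf2v': {
    firstFrame: 'start_image_url',
    lastFrame: 'end_image_url',
  },
  'fal-ai/veo3.1/first-last-frame-to-video': {
    firstFrame: 'first_frame_url',
    lastFrame: 'last_frame_url',
  },
  'fal-ai/kling-video/v2.5-turbo/pro/image-to-video': {
    firstFrame: 'image_url',
    lastFrame: 'tail_image_url',
  },
};

// Build the frame-related request fields for one endpoint.
// Frames that were not uploaded are left out of the result.
function buildFrameFields(endpointId, urls) {
  const params = FRAME_PARAMS[endpointId];
  if (!params) throw new Error(`No frame mapping for ${endpointId}`);
  const fields = {};
  if (urls.firstFrame) fields[params.firstFrame] = urls.firstFrame;
  if (urls.lastFrame) fields[params.lastFrame] = urls.lastFrame;
  return fields;
}
```

Leaving an optional last frame out of the payload, rather than sending `null`, is an assumption here; the endpoint schema should confirm whether the field may be omitted.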
If the model doesn't expose a dedicated storyboard/grid field, you can still provide a separate UI section and merge that file into the model's normal reference array.
Concrete example discovered:
- Vidu reference-to-video uses reference_image_urls
- a dedicated storyboardReference section can still be offered in the UI
- backend merges generalReferences + storyboardReference into reference_image_urls
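The merge described above might look like the following. `mergeReferenceSections` is a hypothetical helper; it assumes each section has already been uploaded and reduced to an array of URLs, and it drops duplicate URLs, which is a convenience choice rather than a requirement of the endpoint.

```javascript
// Merge several UI sections into one array-valued request field.
// Sections are concatenated in the order given; duplicate URLs are dropped.
function mergeReferenceSections(urlsBySection, sectionKeys, apiParam) {
  const merged = [];
  for (const key of sectionKeys) {
    for (const url of urlsBySection[key] || []) {
      if (!merged.includes(url)) merged.push(url);
    }
  }
  // Omit the field entirely when nothing was uploaded.
  return merged.length > 0 ? { [apiParam]: merged } : {};
}

// Example input with placeholder URLs.
const viduFields = mergeReferenceSections(
  {
    generalReferences: ['https://example.com/ref1.png', 'https://example.com/ref2.png'],
    storyboardReference: ['https://example.com/grid.png'],
  },
  ['generalReferences', 'storyboardReference'],
  'reference_image_urls'
);
```

Any limit on the combined array length should be taken from the endpoint schema and enforced across the merged result, not per section.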
Do not assume image-only responses.
Extract both image and video outputs:
- image candidates: images[0].url, image.url, output.url when content type is image
- video candidates: video.url, videos[0].url, output.url when content type is video
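The candidate paths above can be checked in order by one small function. This is a sketch: `extractResult` is a hypothetical name, and it assumes that the content type of a generic `output` object is exposed as `output.content_type`, which should be verified against the endpoint's output schema.

```javascript
// Return { resultKind, url } for the first matching candidate, or null.
function extractResult(data) {
  const d = data || {};
  const firstUrl = (arr) => (Array.isArray(arr) && arr[0] ? arr[0].url : undefined);
  // Assumed field name for the generic output's MIME type.
  const outputType = String((d.output && d.output.content_type) || '');

  const videoUrl =
    (d.video && d.video.url) ||
    firstUrl(d.videos) ||
    (outputType.startsWith('video/') ? d.output.url : undefined);
  if (videoUrl) return { resultKind: 'video', url: videoUrl };

  const imageUrl =
    firstUrl(d.images) ||
    (d.image && d.image.url) ||
    (outputType.startsWith('image/') ? d.output.url : undefined);
  if (imageUrl) return { resultKind: 'image', url: imageUrl };

  return null;
}
```

The frontend can then switch on `resultKind` to choose between `<img>` and `<video controls>`, and store the same value with the gallery entry.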
Frontend should:
- render <img> for image results
- render <video controls> for video results
- save both to gallery with a resultKind
Gallery should:
- support image and video previews
- preserve resultKind
- use video file extension when saving videos locally
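Choosing the local file extension from `resultKind` could be done as follows. `galleryFileName` is a hypothetical helper, and the extension lists and the `mp4` / `png` fallbacks are assumptions; the extension in the result URL is used only when it matches the result kind.

```javascript
const VIDEO_EXTS = ['mp4', 'webm', 'mov'];
const IMAGE_EXTS = ['png', 'jpg', 'jpeg', 'webp', 'gif'];

// Build a local file name whose extension always matches the result kind.
function galleryFileName(id, resultKind, url) {
  const allowed = resultKind === 'video' ? VIDEO_EXTS : IMAGE_EXTS;
  let ext = allowed[0]; // fallback: mp4 for videos, png for images
  try {
    const match = /\.([a-z0-9]+)$/i.exec(new URL(url).pathname);
    const candidate = match ? match[1].toLowerCase() : '';
    if (allowed.includes(candidate)) ext = candidate;
  } catch {
    // Not an absolute URL: keep the fallback extension.
  }
  return `${id}.${ext}`;
}
```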
Tests use node:test:
- node --test model-utils.test.mjs
- npm run build
- http://127.0.0.1:4016/api/status
- https://fal-studio.apps.poofc.com/api/status
- PM2 process fal-studio: pm2 restart fal-studio
Trust the /api/status endpoint and PM2 verification more than the browser page alone if they disagree.
upload.single('image') is too limiting for multi-reference and first/last-frame workflows. Use multer.any().
If you keep a single imageFile state in React, you cannot correctly support model-specific upload sections. Replace it with upload state keyed by section.
Grounded examples that worked well for this pattern:
- fal-ai/wan-25-preview/text-to-video
- fal-ai/wan-25-preview/image-to-video
- fal-ai/wan-flf2v
- fal-ai/kling-video/v2.5-turbo/pro/image-to-video
- fal-ai/kling-video/o3/standard/image-to-video
- fal-ai/kling-video/o1/standard/image-to-video
- fal-ai/veo3.1/first-last-frame-to-video
- fal-ai/veo3.1/image-to-video
- fal-ai/veo3/image-to-video
- fal-ai/minimax/hailuo-02/standard/image-to-video
- fal-ai/vidu/start-end-to-video
- fal-ai/vidu/reference-to-video
- fal-ai/ovi/image-to-video