Large model gallery
Browse a library of 1,000+ production-ready models across image, video, audio, and 3D tasks, including model pages with Try it now and docs links.
fal is a generative media platform for developers that offers model APIs, serverless inference, and dedicated GPU compute for image, video, audio, and 3D workloads. It helps teams discover models, run generation tasks, and scale custom AI jobs with usage-based or hourly pricing.
fal is a generative media platform for developers that brings image, video, 3D, audio, and voice models into one product surface. The site positions it as a place to run production-ready models, call them through model APIs, and scale custom AI workloads with serverless GPUs or dedicated compute.
The homepage emphasizes a workflow for developers who want to integrate models quickly without managing much infrastructure. In practice, fal separates workloads into model APIs for direct generation, Serverless for autoscaling inference endpoints, and Compute for sustained GPU access such as training, fine-tuning, batch processing, and distributed workloads.
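The three-way split described above can be sketched as a small decision helper. This is only an illustration of the reasoning, not a fal API: the `Workload` fields and the `pick_product` function are hypothetical names, and the rules are a simplified reading of the description.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a job; field names are illustrative only."""
    uses_hosted_model: bool      # a gallery model covers the task as-is
    sustained_gpu: bool          # training, fine-tuning, batch, or distributed work
    needs_ssh: bool = False      # requires shell access to the machine


def pick_product(w: Workload) -> str:
    """Map a workload to one of the three product areas named in the text."""
    # Sustained or SSH-driven jobs fit dedicated GPU instances.
    if w.sustained_gpu or w.needs_ssh:
        return "Compute"
    # Direct generation with an existing gallery model fits the model APIs.
    if w.uses_hosted_model:
        return "Model APIs"
    # Custom inference endpoints that scale with traffic fit Serverless.
    return "Serverless"


if __name__ == "__main__":
    print(pick_product(Workload(uses_hosted_model=True, sustained_gpu=False)))
    print(pick_product(Workload(uses_hosted_model=False, sustained_gpu=False)))
    print(pick_product(Workload(uses_hosted_model=False, sustained_gpu=True)))
```

Real decisions will also depend on cost, latency, and model availability, which this sketch ignores.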
Call models directly through a simple API; the homepage describes a unified developer workflow in which many models need no fine-tuning or setup.
Run on-demand inference on serverless GPUs that scale automatically from zero to thousands of GPUs, without planning for cold starts on your own infrastructure.
Provision dedicated GPU instances for training, fine-tuning, batch jobs, and long-running workloads that need full SSH access and predictable hourly billing.
Deploy private or fine-tuned models and bring your own weights on enterprise-ready infrastructure with private endpoints.
Use output-based pricing for many model APIs, with pricing normalized by output unit on the pricing page for easier comparison across models.
Build apps that generate or edit images and videos through model APIs, using the gallery to pick a model that fits the task.
Run production inference endpoints that scale automatically with traffic and require minimal infrastructure management.
Train or fine-tune models on dedicated GPU instances when jobs need continuous access to hardware and SSH control.
Use 8xH100 Compute instances for distributed training or multi-GPU inference that benefits from InfiniBand-linked nodes.
Explore new models from a single catalog and compare output-based pricing across image and video options before integrating them.
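Comparing output-based pricing comes down to multiplying a per-unit price by the number of output units a job produces. The sketch below shows that arithmetic. The model names and prices are placeholders invented for illustration and are not taken from fal's pricing page; `estimate_cost` and `cheapest` are hypothetical helpers.

```python
# Placeholder prices in USD per output unit. These values are made up for
# illustration and do NOT reflect fal's actual pricing.
HYPOTHETICAL_PRICES = {
    "example-video-model-a": {"unit": "second", "usd_per_unit": 0.10},
    "example-video-model-b": {"unit": "second", "usd_per_unit": 0.25},
}


def estimate_cost(model: str, units: float) -> float:
    """Return the estimated cost of producing `units` output units."""
    return HYPOTHETICAL_PRICES[model]["usd_per_unit"] * units


def cheapest(models: list[str], units: float) -> str:
    """Pick the lowest-cost model among candidates billed in the same unit."""
    billed_units = {HYPOTHETICAL_PRICES[m]["unit"] for m in models}
    if len(billed_units) != 1:
        raise ValueError("candidates must be billed in the same output unit")
    return min(models, key=lambda m: estimate_cost(m, units))


if __name__ == "__main__":
    candidates = ["example-video-model-a", "example-video-model-b"]
    for name in candidates:
        print(name, estimate_cost(name, units=5))
    print("cheapest:", cheapest(candidates, units=5))
```

The unit check matters because a per-image price and a per-second price are not directly comparable without first fixing the size of the job.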
fal is a generative media platform for developers. It provides model APIs, a serverless runtime, and dedicated compute for running image, video, audio, and 3D workloads.
The site describes a unified API and SDKs but does not list specific language SDKs or setup steps. The homepage says developers can call models directly, and the compute documentation explains SSH-based access for dedicated GPU instances.
The homepage and model gallery emphasize image, video, audio, and 3D models. The gallery also shows model pages for tasks such as text-to-image, image-to-video, editing, upscaling, background removal, and music generation.
fal prices model APIs on a pay-per-use basis and prices serverless and compute separately. According to the pricing page, compute is billed hourly, while some model APIs are billed by output-based units.
Compute is designed for training, fine-tuning, batch processing, and other workloads that need sustained access to GPU hardware. The documentation contrasts it with serverless, which is meant for autoscaling and on-demand inference.
DDS Hub is an AI API platform for Claude and OpenAI-family model workflows, with token-based pricing, model selection, and Claude Code setup guidance. It is aimed at developers who want API access, usage-based billing, and basic troubleshooting in one place.
NavtoAI API is a unified AI API gateway that lets developers and teams route requests across 200+ models through one account and one API shape. The collected pages also show API key usage lookup, routing controls, and centralized management for keys, quota, billing, users, and observability.
EvoLink is an AI model API platform that gives developers one OpenAI-compatible endpoint for accessing text, image, video, and music models from multiple providers. It is positioned for production apps, agents, and workflows that need model comparison, routing, and usage-based access.
ZenMux is an enterprise LLM platform with a unified API for multiple model providers, automatic prompt-based routing, and usage-based or subscription pricing. It is aimed at developers and teams building AI applications that need multi-model access, cost visibility, and compensation for certain model failures.
PoYo.ai is a unified AI API platform for developers that provides image, video, music, chat, 3D, and utility model access through one async workflow. Pricing is presented as credit-based and pay-as-you-use, with model comparison pages and docs for integration.
Kie.ai is a developer-focused AI API platform for accessing chat, image, video, and music models through one interface. It combines model browsing, API keys, billing, usage logs, and per-model pricing for integration-focused workflows.