Filter

All Vendors

gemini-2.5-flash-lite

Input $0.1/M Output $0.4/M

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, "thinking" (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the Reasoning API parameter to selectively trade off cost for intelligence.

Tiered Pricing

gemini-3.1-flash-image-preview

Input $0.5/M Output $3.0/M

Designed for speed and efficiency, the Gemini 3.1 Flash Image generation model is effective for quick, interactive responses and high throughput. Preview models may change before becoming stable and have more restrictive rate limits.

Tiered Pricing

gemini-2.5-flash-image

inputTokensPrice:0.30 outputTokensPrice:30.00

Gemini 2.5 Flash Image, a.k.a. "Nano Banana," is now generally available. It is a state of the art image generation model with contextual understanding. It is capable of image generation, edits, and multi-turn conversations.

Usage-Based Pricing

gemini-2.5-pro

Input $1.25/M Output $10.0/M

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy and nuanced context handling. Gemini 2.5 Pro achieves top-tier performance on multiple benchmarks, including first-place positioning on the LMArena leaderboard, reflecting superior human-preference alignment and complex problem-solving abilities.

Tiered Pricing

gemini-2.5-flash

Input $0.3/M Output $2.5/M

Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater accuracy and nuanced context handling.

Tiered Pricing

gpt-5.2-codex

inputTokensPrice:1.750 outputTokensPrice:14.000

GPT-5.2-Codex is an upgraded version of GPT-5.2 optimized for agentic coding tasks in Codex or similar environments. GPT-5.2-Codex supports low, medium, high, and xhigh reasoning effort settings.

Usage-Based Pricing

gpt-5.2-chat

inputTokensPrice:1.750 outputTokensPrice:14.000

GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively “think” on harder queries, improving accuracy on math, coding, and multi-step tasks without slowing down typical conversations. The model is warmer and more conversational by default, with better instruction following and more stable short-form reasoning. GPT-5.2 Chat is designed for high-throughput, interactive workloads where responsiveness and consistency matter more than deep deliberation.

Usage-Based Pricing

Filter

All Vendors

gemini-2.5-flash-lite

gemini-3.1-flash-image-preview

gemini-2.5-flash-image

gemini-2.5-pro

gemini-2.5-flash

gpt-5.2-codex

gpt-5.2-chat

gpt-5.2

gpt-5.5

gpt-5.4

gpt-5.4-pro

gpt-5.3-codex

Filter

All Vendors

Model list

gemini-2.5-flash-lite

gemini-3.1-flash-image-preview

gemini-2.5-flash-image

gemini-2.5-pro

gemini-2.5-flash

gpt-5.2-codex

gpt-5.2-chat

gpt-5.2

gpt-5.5

gpt-5.4

gpt-5.4-pro

gpt-5.3-codex