Now GA · Launched at Google I/O 2026 · May 19, 2026

Gemini 3.5 Flash — FreePro-Level Agentic AI at Flash Speed

Gemini 3.5 Flash is Google DeepMind's fastest frontier model, launched May 19, 2026. It delivers Pro-level reasoning depth with a 1M token context window and runs 4× faster than comparable frontier models at less than half the cost — try it free on Felo AI right now.

Free to use on Felo AI — no credit card required

81.2%
MMMU-Pro Score
Global #1 multimodal benchmark
$0.50
Input Price
$1.50 / 1M tokens via API
0.2s
Speed Advantage
4× faster than comparable models
1M
Context Window
Tokens in a single request

What Makes Gemini 3.5 Flash Different

The first Flash model to surpass its own Pro predecessor on agentic and coding benchmarks — while maintaining Flash-level speed and cost.

Speed icon

Built for Agentic Workflows

Gemini 3.5 Flash is Google's most capable agentic and coding model to date. It reliably executes long-horizon tasks lasting hours or weeks, handles multi-step tool use, and coordinates sub-agents via Google's Antigravity framework — making large-scale agentic systems economically viable.

Thinking icon

Dynamic Thinking — Configurable Reasoning Depth

Thinking is enabled by default with a new thinking_level parameter (default: medium). Gemini 3.5 Flash performs internal multi-step planning before responding, delivering reasoning depth that rivals flagship Pro models on math, coding, and logic tasks — with the depth tunable per request.

Context icon

1M Token Context Window

Feed in an entire codebase, hours of video, or a year's worth of financial contracts in a single request. The 1M input token window paired with 64K output tokens means complex tasks stay complete — nothing gets truncated. MRCR v2 long-context score of 26.6% leads all comparable models.

Cost icon

4× Faster, Less Than Half the Cost

Gemini 3.5 Flash runs 4× faster than comparable frontier models at less than half the cost. At $1.50 per million input tokens with context caching at $0.15/M, running AI agents around the clock becomes practical — not just technically possible.

The New Pareto Frontier: Speed × Intelligence

For years, faster meant less capable. Gemini 3.5 Flash breaks that tradeoff — it sits at the top-right of the speed-intelligence curve, outpacing models that cost far more.

Intelligence vs Speed chart showing Gemini 3.5 Flash at the Pareto frontier

Gemini 3.5 Flash leads the intelligence-vs-speed Pareto frontier among frontier models. Source: Artificial Analysis, May 2026.

Performance Benchmarks

Gemini 3.5 Flash vs Claude Opus 4.7 vs GPT-5.5

Official model card results. Gemini 3.5 Flash leads on multimodal understanding, agentic tool use, and long-context retrieval.

Benchmark
Gemini 3.5 Flash
Claude Opus 4.7
GPT-5.5
MMMU-Pro
83.6%
75.2%
81.2%
CharXiv Reasoning
84.2%
82.1%
84.1%
MCP Atlas
83.6%
79.1%
75.3%
Terminal-Bench 2.1
76.2%
66.1%
78.2%
OSWorld-Verified
78.4%
78.0%
78.7%
MRCR v2 (1M ctx)
26.6%

Source: Gemini 3.5 Flash Model Card — Google DeepMind, May 2026.

Technical Specifications

Everything you need to know before integrating Gemini 3.5 Flash into your application.

Context Window

1,048,576 tokens input
65,536 tokens output

API Pricing

$1.50 / 1M input tokens
$9.00 / 1M output tokens
$0.15 / 1M cached tokens

General Availability

May 19, 2026 — Google I/O

Knowledge Cutoff

January 2026

Thinking Mode

On by default. Configurable via thinking_level: low / medium (default) / high. Thought Preservation retains reasoning across multi-turn conversations.

Tool Use & APIs

Function calling, structured output, code execution, Google Search grounding, context caching — all supported natively.

Input Modalities

Text, images, audio, video, PDF — native multimodal, no preprocessing required.

Native Multimodal — One Model, Every Input Type

Gemini 3.5 Flash processes text, images, audio, and video natively — no separate pipelines, no stitching together multiple models.

Text & PDF

Parses million-word documents with high accuracy. Handles complex tables, code, and structured data in a single pass.

Image Understanding

MMMU-Pro score of 83.6% — global #1. Analyzes architectural blueprints, charts, and detailed visual content in real time.

Video Analysis

Supports up to 1 hour of video input. Captures key changes frame by frame for summarization, QA, and content analysis.

Audio Processing

Recognizes emotion, ambient sound, and multilingual conversation. Powers real-time translation and voice assistants.

Available Everywhere You Build

Gemini 3.5 Flash is deeply integrated across Google's developer and consumer ecosystem — from API access to the default model powering billions of users.

Developer Platforms

  • Gemini API
  • Google AI Studio
  • Android Studio
  • Google Antigravity
  • Gemini Enterprise Agent Platform (Vertex AI)

Consumer Products

  • Gemini App — global default model
  • Google Search AI Mode — default model
  • Gemini Spark — personal AI agent
  • Felo AI — free access via search & playground
Intelligence vs Cost chart showing Gemini 3.5 Flash as the best value frontier model

Gemini 3.5 Flash leads the intelligence-vs-cost frontier. Less than half the cost of comparable models for equivalent task performance. Source: Artificial Analysis, May 2026.

Who Uses Gemini 3.5 Flash

From individual developers to enterprise teams, Gemini 3.5 Flash fits wherever you need fast, capable AI at scale.

Agentic Coding

Terminal-Bench 2.1 score of 76.2% with low latency. Coding agents complete tasks faster with fewer logic gaps — iterative code generation, debugging, and A/B testing at Flash speed.

Financial & Tax Processing

Process a full year of contracts and statements in one request. Xero uses it to handle 1099 tax forms; Ramp uses its multimodal OCR for complex invoice processing.

Enterprise Agent Platforms

Salesforce integrates it into Agentforce to accelerate enterprise agent deployment. Databricks uses it to monitor real-time data and diagnose issues autonomously.

Long-Horizon Business Tasks

Shopify uses it for merchant growth forecasting. Reliably executes complex workflows lasting hours or weeks — the kind of tasks that previously required human oversight at every step.

Multimodal Content Analysis

Analyze video, images, and documents together in a single request. CharXiv Reasoning score of 84.2% means it extracts insights from complex charts and mixed-media content accurately.

Consumer AI Products

Now the default model in the Gemini app and Google Search AI Mode — serving billions of monthly active users. Fast Mode delivers near-instant responses on mobile.

What Teams Are Saying

Its long-context performance is exceptional for processing large-scale unstructured multimodal datasets.

Bridgewater Associates

We integrated Gemini 3.5 Flash into Agentforce to accelerate enterprise agent deployment — the speed-to-capability ratio is unlike anything we've seen before.

Salesforce

Gemini 3.5 Flash lets us monitor real-time information and diagnose issues autonomously in our agentic workflows.

Databricks

Two Ways to Use Gemini 3.5 Flash on Felo

Felo AI Search

Open Felo AI Search and select the Gemini 3.5 Flash model. Ask questions, search the web with AI, and get cited answers — powered by Google's fastest frontier model.

Open Felo AI Search

Felo LLM Playground

Open Felo LLM Playground, select Gemini 3.5 Flash, and start chatting. Compare outputs from multiple models side by side to see the speed and reasoning difference firsthand.

Open Playground

Frequently Asked Questions

Try Gemini 3.5 Flash Free — Right Now

Launched at Google I/O 2026. Open Felo AI and start using Google's fastest frontier model today.

Open Gemini 3.5 Flash on Felo

Free to use — no credit card required