Can I try Gemini 3.5 Flash for free on Felo AI?

Yes. Felo AI offers free access to Gemini 3.5 Flash. Sign up for a free account to get started — no credit card required.

When was Gemini 3.5 Flash officially released?

Gemini 3.5 Flash was officially launched (GA) on May 19, 2026 at Google I/O. It is now available through the Gemini API, Google AI Studio, and Felo AI.

How does Gemini 3.5 Flash compare to Gemini 3.1 Pro?

On agentic and coding benchmarks, Gemini 3.5 Flash actually surpasses Gemini 3.1 Pro — for example, MCP Atlas 83.6% vs 78.2%, and Terminal-Bench 2.1 76.2% vs 70.3%. It runs 4× faster at less than half the cost. For pure academic reasoning tasks, Gemini 3.1 Pro still holds a slight edge.

What is the thinking_level parameter and how does it work?

Thinking is enabled by default in Gemini 3.5 Flash. The new thinking_level parameter (values: low, medium, high) replaces the old thinking_budget, letting you control reasoning depth per request. The default is medium, which balances speed and depth for most tasks.

What is Thought Preservation?

Thought Preservation automatically retains intermediate reasoning across multi-turn conversations. This improves performance on iterative tasks like debugging and code refactoring, where context from earlier reasoning steps matters.

What is the pricing for Gemini 3.5 Flash via API?

Input: $1.50 per million tokens. Output: $9.00 per million tokens. Cached input: $0.15 per million tokens. Context caching makes repeated long-context tasks significantly more cost-effective.

Does the 1M token context window slow down responses?

No. Gemini 3.5 Flash uses specialized streaming optimizations for long-context inputs. Response speed remains fast even when processing large documents or codebases.

What can I do with Gemini 3.5 Flash on Felo?

Felo integrates Gemini 3.5 Flash across its core features — AI-powered search, deep research, and topic exploration are all ready to use. You can also use the model freely in Felo LLM Playground to chat, compare outputs, or test your own prompts.

Now GA · Launched at Google I/O 2026 · May 19, 2026

Gemini 3.5 Flash — FreePro-Level Agentic AI at Flash Speed

Gemini 3.5 Flash is Google DeepMind's fastest frontier model, launched May 19, 2026. It delivers Pro-level reasoning depth with a 1M token context window and runs 4× faster than comparable frontier models at less than half the cost — try it free on Felo AI right now.

Try Gemini 3.5 Flash Free

Free to use on Felo AI — no credit card required

81.2%

MMMU-Pro Score

Global #1 multimodal benchmark

$0.50

Input Price

$1.50 / 1M tokens via API

0.2s

Speed Advantage

4× faster than comparable models

Context Window

Tokens in a single request

What Makes Gemini 3.5 Flash Different

The first Flash model to surpass its own Pro predecessor on agentic and coding benchmarks — while maintaining Flash-level speed and cost.

Built for Agentic Workflows

Gemini 3.5 Flash is Google's most capable agentic and coding model to date. It reliably executes long-horizon tasks lasting hours or weeks, handles multi-step tool use, and coordinates sub-agents via Google's Antigravity framework — making large-scale agentic systems economically viable.

Dynamic Thinking — Configurable Reasoning Depth

Thinking is enabled by default with a new thinking_level parameter (default: medium). Gemini 3.5 Flash performs internal multi-step planning before responding, delivering reasoning depth that rivals flagship Pro models on math, coding, and logic tasks — with the depth tunable per request.

1M Token Context Window

Feed in an entire codebase, hours of video, or a year's worth of financial contracts in a single request. The 1M input token window paired with 64K output tokens means complex tasks stay complete — nothing gets truncated. MRCR v2 long-context score of 26.6% leads all comparable models.

4× Faster, Less Than Half the Cost

Gemini 3.5 Flash runs 4× faster than comparable frontier models at less than half the cost. At $1.50 per million input tokens with context caching at $0.15/M, running AI agents around the clock becomes practical — not just technically possible.

The New Pareto Frontier: Speed × Intelligence

For years, faster meant less capable. Gemini 3.5 Flash breaks that tradeoff — it sits at the top-right of the speed-intelligence curve, outpacing models that cost far more.

Intelligence vs Speed chart showing Gemini 3.5 Flash at the Pareto frontier

Gemini 3.5 Flash leads the intelligence-vs-speed Pareto frontier among frontier models. Source: Artificial Analysis, May 2026.

Performance Benchmarks

Gemini 3.5 Flash vs Claude Opus 4.7 vs GPT-5.5

Official model card results. Gemini 3.5 Flash leads on multimodal understanding, agentic tool use, and long-context retrieval.

Benchmark

Gemini 3.5 Flash

Claude Opus 4.7

GPT-5.5

MMMU-Pro

83.6%

75.2%

81.2%

CharXiv Reasoning

84.2%

82.1%

84.1%

MCP Atlas

83.6%

79.1%

75.3%

Terminal-Bench 2.1

76.2%

66.1%

78.2%

OSWorld-Verified

78.4%

78.0%

78.7%

MRCR v2 (1M ctx)

26.6%

—

Source: Gemini 3.5 Flash Model Card — Google DeepMind, May 2026.

Technical Specifications

Everything you need to know before integrating Gemini 3.5 Flash into your application.

Context Window

1,048,576 tokens input

65,536 tokens output

API Pricing

$1.50 / 1M input tokens

$9.00 / 1M output tokens

$0.15 / 1M cached tokens

General Availability

May 19, 2026 — Google I/O

Knowledge Cutoff

January 2026

Thinking Mode

On by default. Configurable via thinking_level: low / medium (default) / high. Thought Preservation retains reasoning across multi-turn conversations.

Tool Use & APIs

Function calling, structured output, code execution, Google Search grounding, context caching — all supported natively.

Input Modalities

Text, images, audio, video, PDF — native multimodal, no preprocessing required.

Native Multimodal — One Model, Every Input Type

Gemini 3.5 Flash processes text, images, audio, and video natively — no separate pipelines, no stitching together multiple models.

Text & PDF

Parses million-word documents with high accuracy. Handles complex tables, code, and structured data in a single pass.

Image Understanding

MMMU-Pro score of 83.6% — global #1. Analyzes architectural blueprints, charts, and detailed visual content in real time.

Video Analysis

Supports up to 1 hour of video input. Captures key changes frame by frame for summarization, QA, and content analysis.

Audio Processing

Recognizes emotion, ambient sound, and multilingual conversation. Powers real-time translation and voice assistants.

Available Everywhere You Build

Gemini 3.5 Flash is deeply integrated across Google's developer and consumer ecosystem — from API access to the default model powering billions of users.

Developer Platforms

Gemini API
Google AI Studio
Android Studio
Google Antigravity
Gemini Enterprise Agent Platform (Vertex AI)

Consumer Products

Gemini App — global default model
Google Search AI Mode — default model
Gemini Spark — personal AI agent
Felo AI — free access via search & playground

Intelligence vs Cost chart showing Gemini 3.5 Flash as the best value frontier model

Gemini 3.5 Flash leads the intelligence-vs-cost frontier. Less than half the cost of comparable models for equivalent task performance. Source: Artificial Analysis, May 2026.

Who Uses Gemini 3.5 Flash

From individual developers to enterprise teams, Gemini 3.5 Flash fits wherever you need fast, capable AI at scale.

Agentic Coding

Terminal-Bench 2.1 score of 76.2% with low latency. Coding agents complete tasks faster with fewer logic gaps — iterative code generation, debugging, and A/B testing at Flash speed.

Financial & Tax Processing

Process a full year of contracts and statements in one request. Xero uses it to handle 1099 tax forms; Ramp uses its multimodal OCR for complex invoice processing.

Enterprise Agent Platforms

Salesforce integrates it into Agentforce to accelerate enterprise agent deployment. Databricks uses it to monitor real-time data and diagnose issues autonomously.

Long-Horizon Business Tasks

Shopify uses it for merchant growth forecasting. Reliably executes complex workflows lasting hours or weeks — the kind of tasks that previously required human oversight at every step.

Multimodal Content Analysis

Analyze video, images, and documents together in a single request. CharXiv Reasoning score of 84.2% means it extracts insights from complex charts and mixed-media content accurately.

Consumer AI Products

Now the default model in the Gemini app and Google Search AI Mode — serving billions of monthly active users. Fast Mode delivers near-instant responses on mobile.

What Teams Are Saying

“Its long-context performance is exceptional for processing large-scale unstructured multimodal datasets.”

— Bridgewater Associates

“We integrated Gemini 3.5 Flash into Agentforce to accelerate enterprise agent deployment — the speed-to-capability ratio is unlike anything we've seen before.”

— Salesforce

“Gemini 3.5 Flash lets us monitor real-time information and diagnose issues autonomously in our agentic workflows.”

— Databricks

Two Ways to Use Gemini 3.5 Flash on Felo

Felo AI Search

Open Felo AI Search and select the Gemini 3.5 Flash model. Ask questions, search the web with AI, and get cited answers — powered by Google's fastest frontier model.

Open Felo AI Search

Felo LLM Playground

Open Felo LLM Playground, select Gemini 3.5 Flash, and start chatting. Compare outputs from multiple models side by side to see the speed and reasoning difference firsthand.

Open Playground

Frequently Asked Questions

Try Gemini 3.5 Flash Free — Right Now

Launched at Google I/O 2026. Open Felo AI and start using Google's fastest frontier model today.

Open Gemini 3.5 Flash on Felo

Free to use — no credit card required