IG1 | AI API — Kimi K2.6 Frontier AI, Sovereign Performance

Performance

Kimi K2.6
Performance Signals.

Official Kimi K2.6 benchmarks highlight state-of-the-art coding, long-horizon execution, agent swarm workflows, and strong multimodal reasoning.

AIME 2026

96.4% reasoning benchmark

Kimi K2.6 posts a very strong score on AIME 2026 in Moonshot AI’s official benchmark table, confirming frontier-level mathematical reasoning.

—Thinking-mode benchmark result
—Strong complex problem solving
—No data retention, fully sovereign
—Frontier reasoning through IG1 AI API

Context Window

262K tokens

Kimi K2.6 supports an official 262,144-token context length, making it suitable for long documents, large codebases, and multi-step agent workflows.

—262,144-token official context length
—Long-context document and code analysis
—89.6% on LiveCodeBench v6

Agent Swarm

300 sub-agents

Kimi K2.6 expands agent swarm workflows for coordinated research, analysis, content generation, slides, spreadsheets, and websites.

—Up to 4,000 coordinated steps
—Long-horizon autonomous execution
—Reusable skills from documents

Frontier Model

Kimi K2.6.
The New Flagship.

Kimi K2.6 becomes the frontier model of IG1 AI API: a high-end reasoning and coding engine designed for complex enterprise work, delivered with sovereign inference and transparent per-token pricing.

Model	Positioning	Best for	Offered by IG1 AI
Small model	Compact	Simple automation	No
Efficient frontier	Efficient	Fast enterprise workloads	✓ Qwen 3.5 122B-A10B
Frontier model	Kimi K2.6	World-class reasoning, code, and agents	✓ Yes

What Kimi K2.6 changes in practice

Kimi K2.6 is built for tasks where quality, precision, and persistence matter more than a quick generic answer. Here's where the difference is immediately visible:

Long contract analysis

Kimi K2.6 can follow cross-references, nested clauses, exceptions, and definitions across very long documents while keeping the final answer grounded in the full context.

Multi-file code generation

From architecture planning to implementation details, Kimi K2.6 is tuned for coherent multi-file development, refactoring, debugging, and product-to-code workflows.

Multi-constraint reasoning

Budget, timeline, stack, compliance, security, UX, scalability: Kimi K2.6 keeps simultaneous constraints active and turns them into clear decisions.

Complex instructions over long context

For long prompts with nested requirements, Kimi K2.6 is the model to use when your team needs the answer to stay faithful to every instruction through the end of generation.

Why choose Kimi K2.6?

Use Kimi K2.6 when the work is strategic: hard reasoning, product ideation, advanced coding, agentic planning, deep analysis, and documents where small mistakes are expensive. It is the premium path in the IG1 AI catalog.

Two frontier paths, one sovereign API

Kimi K2.6 is the flagship frontier model at 1,20 € / 1M input tokens and 5,00 € / 1M output tokens. Qwen 3.5 122B-A10B is the efficient high-performance option for production workloads.

Efficient Frontier

Qwen 3.5 122B-A10B.
Frontier Capability, Production Economics.

Kimi K2.6 is the flagship model. Qwen 3.5 122B-A10B is the high-volume innovation: a large Mixture-of-Experts model designed to deliver strong reasoning, coding, and multimodal workflows with far better cost efficiency.

122B total, 10B active

The A10B architecture activates only a fraction of the total model per token. You get a large-model knowledge base with a much lighter inference footprint.

Half the Kimi token price

At 0,60 € input and 2,50 € output per 1M tokens, Qwen 3.5 122B-A10B is built for production traffic, internal copilots, and cost-sensitive agent pipelines.

Production-grade versatility

Use it for chat, coding assistance, OCR/vision workflows, structured extraction, summarization, and tool-using agents when Kimi-level premium reasoning is not required.

Long-context workhorse

Qwen 3.5 122B-A10B keeps the long-context enterprise promise while giving teams a model they can use broadly across recurring workloads without making every request premium-priced.

Use case	Recommended model	Why
Hard reasoning, strategic coding, complex agents	Kimi K2.6	Maximum capability
Production chat, extraction, copilots, high-volume agents	Qwen 3.5 122B-A10B	Best performance / cost balance
Hybrid routing	Kimi + Qwen	Use Kimi only when the task deserves it

Use Cases

What Does This Mean
for Your Team?

Everyone in your team can adopt AI — not just your engineers. Marketing, sales, analysts, developers. IG1 AI API meets you where you are.

Chat

Integrate conversational AI into any application. Build smart assistants, customer support bots, and interactive workflows.

Code

Ship features faster with AI-assisted development. Generate, review, refactor and debug code at enterprise scale.

Summarize

Transform thousands of documents into actionable insights. Extract, condense, and analyze enterprise knowledge at scale.

Create

Generate content and analysis at scale. From marketing copy to financial reports, empower every department with AI-powered creation.

Data Sovereignty

What About
Your Data?

The question every enterprise asks. IG1 AI API is built precisely for this. Your security team says yes. Your legal team says yes. And you can move fast with confidence.

Zero Data Retention

Your prompts are processed, then immediately erased. Nothing is stored, nothing is logged.

No Model Training

We never use your data to train models. Your intellectual property belongs to you.

100% Sovereign

Your data never leaves your enterprise boundaries. Fully compliant with GDPR, EU AI Act, and ISO 27001.

Your Prompt

IG1 AI Processes

Immediately Erased

No logs • No storage • No training

Open-Weight Models

Powered by the
Best Open Models.

Leveraging the most powerful open-weight models available today. No vendor lock-in. Full transparency.

Kimi K2.6 Frontier

by Moonshot AI — Beijing, China

Frontier LLM & Coding

Premium reasoning, code, product planning, long-context analysis, and agentic workflows.

Qwen 3.5

by Alibaba Cloud — Hangzhou, China

122B-A10B LLM

Efficient high-performance model for production chat, code, reasoning, and multi-turn workflows.

Qwen Image

by Alibaba Cloud — Hangzhou, China

Vision & Multimodal

Image understanding, visual analysis, and multimodal reasoning for enterprise workflows.

Qwen3-VL-Embedding-8B SOTA

by Alibaba Cloud — Hangzhou, China

Multimodal Embedding

State-of-the-art retrieval with multimodal embeddings and dynamic dimensions up to 4096 — far beyond fixed-size text encoders.

BGE

by BAAI — Beijing, China

Fast Embedding & Reranking

Fast, fixed-dimension text embeddings and reranking — a quality reference for high-throughput semantic search and RAG pipelines.

Model Pricing

Our Models

Transparent per-token pricing. All models are sovereign, hosted in France, with no hidden fees and no minimum commitment.

Language Models

Model	Description	1M Input Tokens	1M Output Tokens
Kimi K2.6	Frontier general purpose model for reasoning, text, code, OCR/vision, and complex enterprise workflows. Max context: 256K tokens.	1,20 €	5,00 €
Kimi K2.6 Thinking	Same as Kimi K2.6 with thinking mode enforced for deep multi-step reasoning. Max context: 256K tokens.	1,20 €	5,00 €
Qwen3.5-122B-A10B	Efficient high-performance model for production chat, code, and reasoning at lower cost. Default instruct profile.	0,60 €	2,50 €
Qwen3.5-122B-A10B Creative	Same as Qwen3.5-122B-A10B with hyperparameters tuned for creative ideation, synthesis, and high-quality content generation.	0,60 €	2,50 €
Qwen3.5-122B-A10B Thinking	Same as Qwen3.5-122B-A10B with thinking mode enforced for structured reasoning and analysis. Max context: 256K tokens.	0,60 €	2,50 €
Qwen3.5-122B-A10B Thinking Coder	Qwen3.5-122B-A10B configured for advanced coding, architecture, refactoring, debugging, and thinking-enforced development workflows.	0,60 €	2,50 €
BGE-m3	Text embeddings for RAG and semantic search.	0,05 €	N/A
BGE-reranker-v2-m3	Reranking for search result optimization.	0,30 €	N/A
Document	Convert any document in Markdown to be used by LLM. Useful for CAG – Cached RAG including vision RAG use cases. Document images uses IG1 Standard to be described as text.	N/A	2,00 €

Image Models

Model	Description	1M Input Tokens	1M Output Tokens
Qwen Image	Image generation. Token count based on image resolution and quality level.	0,60 €	2,50 €
Qwen Image PE	Image generation with Prompt Enhancing by LLM. Prompt enhancement uses Qwen3.5-122B-A10B (charged separately). Token count based on resolution and quality level.	0,60 €	2,50 €
Qwen Image Edit	Image editing. Token count based on source and generated image resolution and quality level.	0,60 €	2,50 €
Qwen Image Edit PE	Image editing with Prompt Enhancing by LLM. Prompt enhancement uses Qwen3.5-122B-A10B (charged separately). Token count based on source and generated image resolution and quality.	0,60 €	2,50 €

	Self-Hosting	IG1 AI API
GPU CAPEX	$500K – $1M	$0
Specialized Staff	~$200K / year	$0
Infrastructure	Months of setup	Instant
R&D Overhead	Significant	None
Time to Value	6–12 months	Day one

FAQ

Frequently Asked
Questions.

Everything you need to know about IG1 AI API — sovereignty, models, pricing, and getting started.

Sovereignty & Security

Where is my data hosted?

Exclusively in IG1’s own facilities in France, on Dell Technologies and NVIDIA hardware. Your data never leaves our European infrastructure.

Do you retain or use my data to train models?

No. Zero data retention. Zero data training. Zero data sharing. When your request is processed, it’s gone. Period.

Is IG1 AI compliant with GDPR?

Yes. Also ISO 27001 and the EU AI Act. Our sovereign, EU-hosted architecture with strict no-data-retention guarantees is designed for compliance by default.

How is this different from using ChatGPT, Claude or Gemini?

OpenAI, Anthropic, and Google are US companies subject to US jurisdiction (including the CLOUD Act). Your data transits through and may be stored on US infrastructure. With IG1 AI, your data stays in France, under European law, with zero retention, zero data sharing, zero data training.

Models & Performance

What models does IG1 AI use?

We run the best open-weight models available, led by Kimi K2.6 for frontier reasoning, coding, and agentic workflows. The catalog also includes Qwen3.5-122B-A10B for efficient high-performance language tasks, Qwen3-VL-Embedding-8B and BGE for embeddings and reranking, and Qwen Image for visual generation.

Can I choose which model to use?

Yes. Choose Kimi K2.6 for premium frontier reasoning and coding, Qwen3.5-122B-A10B for cost-efficient production workloads, or the dedicated Qwen3-VL-Embedding-8B / BGE embedding and reranking models and Qwen Image for specialized tasks.

Are your models quantized?

Yes — deliberately, and chosen to preserve quality. Where a higher-precision release exists (e.g. Qwen), we serve FP8: the quality impact is negligible while GPU memory and cost are roughly halved, which is the right engineering trade-off. Some frontier models are only released quantized — Kimi K2.6 ships natively in INT4, so there is no higher-precision version to run. What we never do is push aggressive low-bit quantization that would degrade the hardest tasks (multi-step reasoning, complex document analysis, large-scale code generation). The benchmark figures we publish reflect the exact precision we serve in production.

Pricing & Billing

How does pricing work?

Two parts. A flat 100 € / month subscription that includes 100 € of usage across every model in the catalog. Beyond that, usage is metered at the published per-token rates (Kimi K2.6 at 1,20 € input / 5,00 € output, Qwen3.5-122B-A10B at 0,60 € / 2,50 €), up to a 1 000 € monthly ceiling. No hidden fees.

What happens if I hit my cap?

Your monthly ceiling is 1 000 €. When you reach it, access pauses until the next billing cycle — or until you settle the 1 000 €. You are never charged beyond the ceiling you control. If you consistently hit it, we’ll discuss raising your limit.

How is overage billed?

On the 1st of each month we bill the previous month’s overage — usage above your included allowance — together with the coming month’s subscription. Payment is by SEPA direct debit or saved card.

Why a subscription plus metered usage?

You get the predictability of a fixed monthly base and the flexibility to scale when a project demands it — no renegotiation. Light months stay light, heavy months stay capped. You only pay for what you use, up to a ceiling you set.

Can I commit to a higher plan?

Yes. Higher monthly subscription tiers — with a larger included allowance and ceiling — are available for teams that need more. Contact sales to arrange a commitment that fits your volume.

Is there a free trial?

Getting Started

How long does setup take?

Zero infrastructure setup on your side. You can be live the same day.

Do I need a technical team to use IG1 AI?

No. IG1 AI is designed for every team — marketing, sales, HR, legal, analysts, and developers. If you can use a chat interface, you can use IG1 AI.

Can I integrate IG1 AI into my existing tools?

Yes. IG1 AI provides API access compatible with standard AI chat applications (Msty, Open WebUI) and IDE integrations for developers. Connect your existing workflows without changing your tools.

What if a better AI platform comes along?

No lock-in. Monthly billing with no commitment means you can leave anytime. But because we continuously upgrade to the best open-weights models, you’re always on the frontier without switching.

Kimi K2.6
Sovereign Frontier AI.

See IG1 AI API
in Action.

Kimi K2.6
Performance Signals.