18 frontier models · one agentic endpoint

The agentic gateway to every frontier model

One OpenAI-compatible API key for DeepSeek-R1, Llama 3.3, Qwen Coder, Mistral, FLUX and more — with native tool-calling and streaming for agents. Usage-based pricing, instant keys, built-in analytics. Zero infrastructure.

No credit card required · $1 of free usage every month

chat.completions
curl https://api.speka.online/v1/chat/completions \
  -H "Authorization: Bearer sk-speka-live-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/deepseek-r1",
    "messages": [{"role":"user","content":"Explain quantum entanglement simply."}]
  }'

Already using OpenAI? Just change the base_url and key.
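As a minimal sketch of that swap, the Python below builds the same request as the curl example using only the standard library. The endpoint and model id come from that example; the `SPEKA_API_KEY` environment variable and the helper names `build_chat_request` and `send` are assumptions made for illustration. With the official OpenAI Python SDK, the equivalent change would be passing `base_url` and `api_key` to the `OpenAI(...)` client constructor.

```python
import json
import os
import urllib.request

# Base URL taken from the curl example; an SDK-style base URL ends at /v1.
BASE_URL = "https://api.speka.online/v1"


def build_chat_request(prompt, model="deepseek-ai/deepseek-r1", api_key=None):
    """Build (but do not send) a chat completions request."""
    # SPEKA_API_KEY is a hypothetical variable name chosen for this sketch.
    key = api_key or os.environ.get("SPEKA_API_KEY", "")
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["choices"][0]["message"]["content"]
```

Calling `send(build_chat_request("Explain quantum entanglement simply."))` with a valid key should mirror the curl call, assuming the response follows the OpenAI Chat Completions shape.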

Frontier models from the labs you already trust

DeepSeek · NVIDIA · Mistral AI · Qwen · Google · Meta · Microsoft · Black Forest Labs
  • 18+ models available
  • 99.9% uptime target
  • <1s time to first token*
  • 100% OpenAI-compatible

Why Speka

Built for agents, ready for production

Skip the infrastructure, the rate-limit headaches and the per-vendor SDKs. One gateway, one bill, full observability — so you can ship agents, not plumbing.

Every model, one key

Reasoning, chat, code, vision, embeddings and image generation — all behind a single endpoint and key.

Agent-native API

Native tool-calling, streaming and JSON mode. Point the OpenAI SDK at our base URL — no rewrites, no lock-in.
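To illustrate what tool-calling looks like from the agent side, here is a rough sketch assuming the gateway follows the OpenAI function-calling message format, as the compatibility claim suggests. The `get_weather` tool, its stub implementation, and the helper names are hypothetical and exist only for this example.

```python
import json

# Tool schema in the OpenAI function-calling format.
# get_weather is a hypothetical tool invented for this sketch.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]


def get_weather(city):
    # Stand-in implementation; a real tool would call a weather service.
    return {"city": city, "forecast": "sunny"}


HANDLERS = {"get_weather": get_weather}


def build_body(messages, model="deepseek-ai/deepseek-r1", stream=False):
    """Request body for /v1/chat/completions with the tools attached."""
    return {"model": model, "messages": messages, "tools": TOOLS, "stream": stream}


def run_tool_calls(assistant_message):
    """Run each requested tool and return the `tool` role messages to send back."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        name = call["function"]["name"]
        # In this format, arguments arrive as a JSON-encoded string.
        args = json.loads(call["function"]["arguments"])
        output = HANDLERS[name](**args)
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(output),
        })
    return results
```

An agent loop would append the assistant message and the returned `tool` messages to the conversation, then call the endpoint again until the model answers without requesting a tool.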

Predictable rate limits

Per-plan RPM and concurrency, enforced fairly with clear headers so you never get surprised.
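A client can stay within those limits by backing off when it is throttled. The sketch below assumes the gateway answers rate-limited requests with HTTP 429 and may send the standard `Retry-After` header in seconds; the page does not name its headers, so treat both as assumptions. The function name `retry_delay` is hypothetical.

```python
import random


def retry_delay(status, headers, attempt, base=0.5, cap=30.0):
    """Return seconds to wait before retrying, or None if no retry is needed.

    Assumes HTTP 429 signals a rate limit and that Retry-After, when present,
    is given in seconds. Neither detail is confirmed by the page.
    """
    if status != 429:
        return None
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return min(float(retry_after), cap)
        except ValueError:
            pass  # HTTP-date form of Retry-After; fall back to backoff.
    # Exponential backoff with full jitter:
    # uniform in [0, min(cap, base * 2**attempt)].
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

Honoring the server hint first and falling back to jittered backoff keeps concurrent workers from retrying in lockstep.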

Usage-based pricing

Pay per token. Monthly included usage on every plan plus pay-as-you-go overage. No lock-in.

Built-in analytics

Track spend, tokens and latency by key, model and day — right in your dashboard.

Secure by design

Keys are hashed at rest, scoped per project, and revocable instantly. Account data is isolated at the row level.

How it works

Live in three steps

01 · Create an account

Sign up in seconds and get a free API key with usage included — no card needed.

02 · Point your SDK

Set the base URL to our gateway and use any model id. Works with the OpenAI SDK, LangChain and Vercel AI SDK.

03 · Ship & scale

Monitor usage, set limits and upgrade as you grow. We handle routing, failover and infrastructure.

Trust & security

Secure by default

Your keys and data are protected at every layer — so you can build with confidence.

Hashed API keys

Keys are hashed at rest and shown only once. Scope them per project and revoke instantly.
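For readers unfamiliar with the pattern, hashing keys at rest generally works along these lines. This is a generic sketch, not Speka's implementation: the page does not say which hash function is used, so SHA-256 is an assumption, and `issue_key` and `verify_key` are hypothetical names. The key prefix is borrowed from the curl example.

```python
import hashlib
import hmac
import secrets


def issue_key(prefix="sk-speka-live-"):
    """Create a key, returning (plaintext shown once, digest to store)."""
    plaintext = prefix + secrets.token_urlsafe(32)
    digest = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return plaintext, digest


def verify_key(presented, stored_digest):
    """Check a presented key against the stored digest in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_digest)
```

Because only the digest is stored, the plaintext cannot be shown again after creation, which is why a lost key is typically revoked and reissued rather than recovered.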

Encrypted in transit

All traffic is TLS-encrypted on every hop: between your app and our gateway, and between our gateway and model providers.

Row-level isolation

Your usage, keys and metadata are isolated per account at the database level.

Prompts never trained on

We pass prompts through to generate responses — never to train models. Delete your data anytime.

From the field

Teams ship faster on Speka

Swapped our OpenAI base URL and had DeepSeek-R1 in prod the same afternoon. The tool-calling just worked.

A. Nakamura
Staff Engineer

One bill for every model we test means we stopped juggling six vendor dashboards. Spend is finally legible.

M. Okafor
Founder, AI startup

Failover across capacity has kept our agents up through provider outages. That reliability earned our trust.

J. Petrov
Platform Lead

Pricing

Start free. Scale when you do.

Every plan includes monthly usage. Only pay more when you outgrow it.

Free

$0.00 /mo

Kick the tires. No card required.

  • $1 of model usage / month
  • 10 requests / minute
  • 1 API key
  • Access to all open models

Starter

$19.00 /mo

For indie hackers and side projects.

  • $25 of model usage included
  • 60 requests / minute
  • 5 API keys
  • All models incl. vision & image

Scale

$399.00 /mo

For high-volume, latency-sensitive workloads.

  • $750 of model usage included
  • 1,200 requests / minute
  • 200 API keys
  • Highest-priority routing

See the full breakdown on the pricing page.
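The billing arithmetic can be sketched as follows, using only the three plans shown on this page. It assumes usage beyond the allowance is billed dollar for dollar at the standard rates, which matches the FAQ statement that there are no overage penalties. Whether the Free plan permits overage at all is not stated, and `monthly_bill` is a hypothetical helper.

```python
# Plan figures copied from the pricing cards on this page (USD per month).
PLANS = {
    "Free": {"fee": 0.00, "included": 1.00},
    "Starter": {"fee": 19.00, "included": 25.00},
    "Scale": {"fee": 399.00, "included": 750.00},
}


def monthly_bill(plan, usage):
    """Plan fee plus any usage beyond the included allowance, in USD."""
    p = PLANS[plan]
    # Assumption: overage is billed dollar for dollar at standard rates.
    overage = max(0.0, usage - p["included"])
    return round(p["fee"] + overage, 2)
```

Under these assumptions, $40 of model usage on Starter comes to $19 + $15 = $34, while $10 of usage stays at the flat $19.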

FAQ

Questions, answered

Speka is OpenAI-compatible. We implement the OpenAI Chat Completions, Embeddings and Images endpoints. Point any OpenAI SDK or client at our base URL, set your Speka key, and use any model id from the catalog, with no other code changes.

Models run on our managed inference infrastructure across multiple capacity providers. We route each request to healthy capacity and fail over automatically, so you get one stable endpoint instead of managing several vendors.

Every plan includes a monthly usage allowance measured in real model spend. You're billed per token at transparent published rates. If you exceed your allowance, you pay the standard per-token rates, with no overage penalties.

You can control your spend. Set hard spend caps and per-key rate limits from your dashboard, and get alerts as you approach them. Keys can be scoped per project and revoked instantly.

We transmit prompt content to model providers solely to generate responses. We store request metadata (model, token counts, latency) for analytics and billing, but we do not use your prompts to train models.

The Free plan gives you $1 of model usage every month, 1 API key, 10 requests per minute and access to all open models, which is enough to prototype and test. Upgrade only when you need more.

Build with frontier AI today

Join developers shipping faster with Speka. Your first key is free and takes 30 seconds.