Every model, one key
Reasoning, chat, code, vision, embeddings and image generation — all behind a single endpoint and key.
One OpenAI-compatible API key for DeepSeek-R1, Llama 3.3, Qwen Coder, Mistral, FLUX and more — with native tool-calling and streaming for agents. Usage-based pricing, instant keys, built-in analytics. Zero infrastructure.
No credit card required · $1 of free usage every month
curl https://api.speka.online/v1/chat/completions \
-H "Authorization: Bearer sk-speka-live-..." \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek-ai/deepseek-r1",
"messages": [{"role":"user","content":"Explain quantum entanglement simply."}]
}'
Already using OpenAI? Just change the base_url and key.
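For readers who prefer Python to curl, here is a minimal sketch of the same request using only the standard library. The endpoint, header names and body come from the curl example above; the function names `build_chat_request` and `send` are illustrative, and the key you pass in is your own.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.speka.online/v1/chat/completions"


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl example."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (needs network and a valid key)."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Calling `send(build_chat_request(key, "deepseek-ai/deepseek-r1", "Explain quantum entanglement simply."))` should return a JSON object. If the response follows the usual OpenAI-compatible shape, the reply text sits under `choices[0]["message"]["content"]`, though that shape is an assumption here rather than something this page spells out.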
Frontier models from the labs you already trust
Skip the infrastructure, the rate-limit headaches and the per-vendor SDKs. One gateway, one bill, full observability — so you can ship agents, not plumbing.
Native tool-calling, streaming and JSON mode. Point the OpenAI SDK at our base URL — no rewrites, no lock-in.
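As a rough illustration of those three features, the sketch below assembles a request body with the field names used by the OpenAI chat-completions format (`tools`, `stream`, `response_format`) and pulls text out of a streamed line. It assumes the gateway accepts those fields unchanged. The `get_weather` tool and both helper names are hypothetical.

```python
import json

# Hypothetical tool definition in the OpenAI function-calling schema.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative name, not a built-in
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_payload(model, prompt, *, tools=None, stream=False, json_mode=False):
    """Assemble a chat-completions body with the optional features switched on."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    if tools:
        payload["tools"] = tools
    if stream:
        payload["stream"] = True
    if json_mode:
        payload["response_format"] = {"type": "json_object"}
    return payload


def text_from_stream_line(line: str) -> str:
    """Extract the text delta from one server-sent-events line, or "" if none."""
    if not line.startswith("data:"):
        return ""  # comments and blank keep-alive lines carry no text
    data = line[len("data:"):].strip()
    if not data or data == "[DONE]":
        return ""  # "[DONE]" marks the end of an OpenAI-style stream
    chunk = json.loads(data)
    choices = chunk.get("choices") or [{}]
    return choices[0].get("delta", {}).get("content") or ""
```

Each streamed line is handled on its own, so an agent loop can print text as it arrives and stop when it sees `[DONE]`.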
Per-plan RPM and concurrency, enforced fairly with clear headers so you never get surprised.
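This page does not name the rate-limit headers, so the sketch below leans on the standard HTTP `Retry-After` header and falls back to exponential backoff with jitter when it is missing. Treat the header choice, the 429 status check and the `retry_delay` helper as assumptions about how a client might react, not as documented gateway behavior.

```python
import random


def retry_delay(status, headers, attempt, base=1.0, cap=30.0):
    """Return seconds to wait before retrying, or None if no retry is needed.

    Only HTTP 429 (Too Many Requests) is treated as retryable in this sketch.
    """
    if status != 429:
        return None
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    retry_after = lowered.get("retry-after")
    if retry_after is not None:
        try:
            return max(0.0, min(float(retry_after), cap))
        except ValueError:
            pass  # Retry-After can also be an HTTP date; fall back to backoff
    # Exponential backoff with full jitter: uniform in [0, base * 2**attempt], capped.
    return random.uniform(0.0, min(cap, base * 2 ** attempt))
```

A caller would sleep for the returned number of seconds and try again, giving up after a fixed number of attempts.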
Pay per token. Monthly included usage on every plan plus pay-as-you-go overage. No lock-in.
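To make the included-usage-plus-overage idea concrete, here is a small worked calculation. The per-token prices are made up for illustration and are not Speka's rates. The $1 of included usage matches the free monthly allowance mentioned at the top of the page.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)


def usage_cost(input_tokens: int, output_tokens: int,
               input_price: Decimal, output_price: Decimal) -> Decimal:
    """Dollar cost of a month's tokens; prices are dollars per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / MILLION


def overage(cost: Decimal, included: Decimal) -> Decimal:
    """Amount billed beyond the plan's included monthly usage (never negative)."""
    return max(Decimal(0), cost - included)


# Hypothetical prices: $0.50 per million input tokens, $1.50 per million output tokens.
cost = usage_cost(3_000_000, 1_000_000, Decimal("0.50"), Decimal("1.50"))
extra = overage(cost, included=Decimal("1.00"))
```

With these made-up prices, 3M input tokens and 1M output tokens cost $3.00, of which $2.00 is overage once the $1.00 of included usage is subtracted.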
Track spend, tokens and latency by key, model and day — right in your dashboard.
Keys are hashed at rest, scoped per project, and revocable instantly. Row-level isolated data.
Frontier open models, production-ready and priced transparently.
State-of-the-art open reasoning model with transparent chain-of-thought. Excels at math, logic and multi-step problem solving.
deepseek-ai/deepseek-r1
NVIDIA's reward-tuned Llama variant optimized for helpfulness and instruction following. A strong general-purpose workhorse.
nvidia/llama-3.1-nemotron-70b-instruct
Meta's flagship 70B instruct model, with 405B-class quality at a fraction of the cost. Great default for production chat.
meta/llama-3.3-70b-instruct
Best-in-class open code model. Excellent at generation, completion, refactoring and bug fixing.
qwen/qwen2.5-coder-32b-instruct
Large multimodal model for image understanding, document Q&A, charts and visual reasoning.
meta/llama-3.2-90b-vision-instruct
High-fidelity text-to-image generation with excellent prompt adherence and typography.
black-forest-labs/flux.1-dev
Sign up in seconds and get a free API key with usage included, no card needed.
Set the base URL to our gateway and use any model id. Works with the OpenAI SDK, LangChain and Vercel AI SDK.
Monitor usage, set limits and upgrade as you grow. We handle routing, failover and infrastructure.
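The second step above, pointing the OpenAI SDK at the gateway, might look like the sketch below. The base URL is inferred from the curl endpoint (the SDK appends `/chat/completions` itself), so treat it as an assumption. `make_client` and `ask` are illustrative helper names, and the `openai` package must be installed separately.

```python
# Assumed base URL: the curl endpoint without the /chat/completions suffix.
SPEKA_BASE_URL = "https://api.speka.online/v1"


def make_client(api_key: str):
    """Return an OpenAI SDK client that talks to the gateway instead of api.openai.com."""
    from openai import OpenAI  # third-party package: pip install openai

    return OpenAI(base_url=SPEKA_BASE_URL, api_key=api_key)


def ask(client, model: str, prompt: str) -> str:
    """Send one user message and return the assistant's reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

Switching models is then a matter of changing the `model` string, for example `ask(make_client(key), "meta/llama-3.3-70b-instruct", "Hello")`.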
Your keys and data are protected at every layer — so you can build with confidence.
Keys are hashed at rest and shown only once. Scope them per project and revoke instantly.
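For the curious, "hashed at rest and shown only once" generally means the service stores a digest of the key rather than the key itself. The sketch below shows one common way to do that. It is an illustration under that assumption, not a description of Speka's actual implementation, and every function name in it is hypothetical.

```python
import hashlib
import hmac
import secrets

# Prefix seen in the curl example; the random part is illustrative.
KEY_PREFIX = "sk-speka-live-"


def hash_key(plaintext: str) -> str:
    """SHA-256 hex digest of a key; this is what would be stored."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def issue_key():
    """Create a new key and return (plaintext to show once, digest to store)."""
    plaintext = KEY_PREFIX + secrets.token_urlsafe(32)
    return plaintext, hash_key(plaintext)


def verify_key(presented: str, stored_digest: str) -> bool:
    """Check a presented key against the stored digest in constant time."""
    return hmac.compare_digest(hash_key(presented), stored_digest)
```

A plain SHA-256 digest is a reasonable fit for long random keys like these. Low-entropy secrets such as passwords would call for a slow key-derivation function instead.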
All traffic is encrypted with TLS in transit, both between your app and our gateway and between our gateway and model providers.
Your usage, keys and metadata are isolated per account at the database level.
We pass prompts through to generate responses — never to train models. Delete your data anytime.
“Swapped our OpenAI base URL and had DeepSeek-R1 in prod the same afternoon. The tool-calling just worked.”
“One bill for every model we test means we stopped juggling six vendor dashboards. Spend is finally legible.”
“Failover across capacity has kept our agents up through provider outages. That reliability earned our trust.”
Every plan includes monthly usage. Only pay more when you outgrow it.
Kick the tires. No card required.
Start free
For indie hackers and side projects.
Get Starter
For production apps and growing teams.
Get Pro
For high-volume, latency-sensitive workloads.
Get Scale
See the full breakdown on the pricing page.
Join developers shipping faster with Speka. Your first key is free and takes 30 seconds.