No hidden fees. No vendor lock-in. NexToken routes your requests to the optimal provider — automatically.
nex-auto
Every chat completion now passes upstream prompt-cache savings through automatically.
Near-duplicate prompts bill at 5% of normal retail via our self-hosted semantic cache.
Switch model: "nex-auto" and let the gateway pick the right tier per prompt.
No client code changes required.
nex-auto smart router · batch endpoint 30% off · tokenize + estimate-costTransparent breakdown of what you pay via NexToken versus going direct. NexToken native models beat the market — third-party access adds a flat 20% routing margin that covers infrastructure, compliance, and reliability.
| Model | NexToken (You Pay) input / output per 1M | Official Direct | vs. Direct |
|---|---|---|---|
| ⭐ NexToken Native Models | |||
| nex-pro 32k ctx | $0.12 / $0.48 | vs GPT-4o-mini: $0.15 / $0.60OpenAI.com | −20% input |
| nex-embed-zh 512 ctx | $0.012 / — | text-embedding-3-small: $0.020OpenAI.com | −40% |
| OpenAI — via NexToken vs. platform.openai.com | |||
| gpt-4o | $3.00 / $12.00 | $2.50 / $10.00platform.openai.com | Trial Rate |
| gpt-4o-mini | $0.18 / $0.72 | $0.15 / $0.60platform.openai.com | Trial Rate |
| gpt-4.1 | $2.40 / $9.60 | $2.00 / $8.00platform.openai.com | Trial Rate |
| o3-mini / o4-mini | $1.32 / $5.28 | $1.10 / $4.40platform.openai.com | Trial Rate |
| Anthropic — via NexToken vs. anthropic.com/pricing | |||
| claude-sonnet-4-6 | $3.60 / $18.00 | $3.00 / $15.00anthropic.com/pricing | Trial Rate |
| claude-haiku-4-5 | $0.96 / $4.80 | $0.80 / $4.00anthropic.com/pricing | Trial Rate |
| Google DeepMind — via NexToken vs. ai.google.dev | |||
| gemini-2.5-pro | $1.50 / $12.00 | $1.25 / $10.00ai.google.dev | Trial Rate |
| gemini-2.5-flash | $0.18 / $4.20 | $0.15 / $3.50ai.google.dev | Trial Rate |
| DeepSeek — via NexToken vs. platform.deepseek.com | |||
| deepseek-v3 | $0.324 / $1.32 | $0.27 / $1.10platform.deepseek.com | Trial Rate |
| deepseek-r1 | $0.66 / $2.64 | $0.55 / $2.20platform.deepseek.com | Trial Rate |
Prices shown are what you pay at Developer tier. Prices per 1,000,000 tokens (1M). Input / Output billed separately.
| Model | NexTokenin/out per 1M | Comparable Providerin/out per 1M | Best For | Capabilities |
|---|---|---|---|---|
| nex-pro32k ★ Default | $0.12 / $0.48 | vs GPT-4o-mini$0.15 / $0.60 | The default Nex model. Chat, code, content, summarisation. Self-hosted Singapore GPU. Strong Chinese + English. ~96% cheaper than GPT-4o | stream tools |
| nex-embed-zh512 | $0.012 | vs text-embedding-3-small$0.020 | Chinese-strong embeddings, 1024-dim. BGE-M3, self-hosted Singapore. ~40% cheaper than text-embedding-3-small | /v1/embeddings |
nex-smart and nex-coder transparently route to nex-pro — no migration needed.
| Model | NexTokenTrial · in/out per 1M | Official Directin/out per 1M | Context | Capabilities |
|---|---|---|---|---|
| gpt-4o128k | $3.00 / $12.00 | $2.50 / $10.00 | 128K | stream vision tools |
| gpt-4o-mini128k | $0.18 / $0.72 | $0.15 / $0.60 | 128K | stream vision tools |
| gpt-4.11M | $2.40 / $9.60 | $2.00 / $8.00 | 1M | stream vision tools |
| gpt-4.1-mini1M | $0.48 / $1.92 | $0.40 / $1.60 | 1M | stream tools |
| gpt-4.1-nano1M | $0.12 / $0.48 | $0.10 / $0.40 | 1M | stream tools |
| o3200k | $12.00 / $48.00 | $10.00 / $40.00 | 200K | stream |
| o3-mini200k | $1.32 / $5.28 | $1.10 / $4.40 | 200K | stream |
| o4-mini200k | $1.32 / $5.28 | $1.10 / $4.40 | 200K | stream |
| Model | NexTokenTrial · in/out per 1M | Official Directin/out per 1M | Context | Capabilities |
|---|---|---|---|---|
| claude-opus-4-6200k | $18.00 / $90.00 | $15.00 / $75.00 | 200K | stream vision tools |
| claude-sonnet-4-6200k | $3.60 / $18.00 | $3.00 / $15.00 | 200K | stream vision tools |
| claude-haiku-4-5-20251001200k | $0.96 / $4.80 | $0.80 / $4.00 | 200K | stream vision tools |
| Model | NexTokenTrial · in/out per 1M | Official Directin/out per 1M | Context | Capabilities |
|---|---|---|---|---|
| gemini-2.5-pro1M | $1.50 / $12.00 | $1.25 / $10.00 | 1M | stream vision tools |
| gemini-2.5-flash1M | $0.18 / $4.20 | $0.15 / $3.50 | 1M | stream vision tools |
| Model | NexTokenTrial · in/out per 1M | Official Directin/out per 1M | Context | Capabilities |
|---|---|---|---|---|
| deepseek-v3128k | $0.324 / $1.32 | $0.27 / $1.10 | 128K | stream tools |
| deepseek-r1128k | $0.66 / $2.64 | $0.55 / $2.20 | 128K | stream |
| Model | NexTokenTrial · in/out per 1M | Official Directin/out per 1M | Context | Capabilities |
|---|---|---|---|---|
| llama-4-scout512k | $0.18 / $0.72 | $0.15 / $0.60 | 512K | stream tools |
| llama-4-maverick256k | $0.60 / $2.40 | $0.50 / $2.00 | 256K | stream tools |
| llama-3.3-70b131k | $0.708 / $0.948 | $0.59 / $0.79 | 131K | stream tools |
| Model | NexTokenTrial · in/out per 1M | Official Directin/out per 1M | Context | Capabilities |
|---|---|---|---|---|
| mistral-large-latest128k | $2.40 / $7.20 | $2.00 / $6.00 | 128K | stream tools |
| mistral-medium-latest128k | $0.96 / $2.88 | $0.80 / $2.40 | 128K | stream tools |
| mistral-small-latest128k | $0.24 / $0.72 | $0.20 / $0.60 | 128K | stream tools |
| codestral-latest256k | $0.36 / $1.08 | $0.30 / $0.90 | 256K | stream |
| mixtral-8x7b-3276832k | $0.288 / $0.288 | $0.24 / $0.24 | 32K | stream |
| Model | NexTokenTrial · in/out per 1M | Official Directin/out per 1M | Context | Capabilities |
|---|---|---|---|---|
| qwen-3.5-397b131k | $1.08 / $1.08 | $0.90 / $0.90 | 131K | stream |
| qwen-max32k | $1.92 / $7.68 | $1.60 / $6.40 | 32K | stream tools |
| qwen-plus128k | $0.48 / $1.44 | $0.40 / $1.20 | 128K | stream tools |
| qwen-turbo128k | $0.24 / $0.72 | $0.20 / $0.60 | 128K | stream tools |
| qwen-long10M | $0.60 / $2.40 | $0.50 / $2.00 | 10M | stream |
| Model | NexTokenTrial · in/out per 1M | Official Directin/out per 1M | Context | Capabilities |
|---|---|---|---|---|
| glm-4-plus128k | $0.84 / $0.84 | $0.70 / $0.70 | 128K | stream tools |
| glm-4128k | $0.84 / $0.84 | $0.70 / $0.70 | 128K | stream |
| glm-4-air128k | $0.168 / $0.168 | $0.14 / $0.14 | 128K | stream |
| glm-4-flash128k | $0.084 / $0.084 | $0.07 / $0.07 | 128K | stream |
| Model | NexTokenTrial · in/out per 1M | Official Directin/out per 1M | Context | Capabilities |
|---|---|---|---|---|
| command-r-plus128k | $3.00 / $12.00 | $2.50 / $10.00 | 128K | stream tools |
| command-r128k | $0.18 / $0.72 | $0.15 / $0.60 | 128K | stream tools |
| Model | NexTokenTrial · in/out per 1M | Official Directin/out per 1M | Context | Capabilities |
|---|---|---|---|---|
| sonar128k | $0.12 / $0.12 | $0.10 / $0.10 | 128K | stream |
| sonar-pro200k | $3.60 / $18.00 | $3.00 / $15.00 | 200K | stream |
| Model | NexTokenTrial · in/out per 1M | Official Directin/out per 1M | Context | Capabilities |
|---|---|---|---|---|
| grok-3131k | $3.60 / $18.00 | $3.00 / $15.00 | 131K | stream tools |
| grok-3-mini131k | $0.36 / $0.60 | $0.30 / $0.50 | 131K | stream tools |
| Model | NexTokenTrial · in/out per 1M | Official Directin/out per 1M | Context | Capabilities |
|---|---|---|---|---|
| gemma2-9b-it8k | $0.24 / $0.24 | $0.20 / $0.20 | 8K | stream |
The higher your monthly token spend, the lower your effective markup. Tiers reset on the 1st of each month.
| Tier | Monthly Token Spend | Markup Rate | Effective Saving | Unlocks |
|---|---|---|---|---|
| Developer | $0 – $500 | Standard | — | 3 keys, 20 RPM |
| Pro | $500 – $5,000 | −1% | Up to $50/mo | 200 RPM, analytics |
| Business | $5,000 – $50,000 | −2.5% | Up to $1,250/mo | Custom routing, SLA |
| Enterprise | $50,000+ | Negotiated | Up to 15%+ | Dedicated cluster, custom terms |
* Billing tiers are independent from Loyalty tiers. Billing tiers reflect monthly spend volume; Loyalty tiers reflect cumulative top-up history.
Cumulative top-up milestones unlock permanent wallet bonuses. Tiers do not reset — they track your total lifetime top-up.
Loyalty bonus credits are applied to your wallet at time of top-up. Example: Gold user tops up $1,000 → receives $1,080 wallet balance (+8% bonus). Bonuses do not stack with promotional codes.
Optional extras available on Pro and Business plans.
Detailed breakdown of what's included in each plan.
| Feature | Developer | Pro | Business | Enterprise |
|---|---|---|---|---|
| Limits & Access | ||||
| Monthly free credits | $5 one-time | $10 incl. | $30 incl. | Custom |
| Requests per minute (RPM) | 20 | 200 | 1,000 | Custom |
| API keys | 3 | 20 | Unlimited | Unlimited |
| Sub-keys per key | ✕ | 3 | 10 | Unlimited |
| Context window support | Up to 128k | Up to 200k | Up to 1M | Up to 1M+ |
| Routing & Intelligence | ||||
| Smart auto-routing | ✓ | ✓ | ✓ | ✓ |
| Provider fallback | ✕ | ✓ | ✓ | ✓ |
| Custom routing rules | ✕ | ✕ | ✓ | ✓ |
| Dedicated routing cluster | ✕ | ✕ | ✕ | ✓ |
| Streaming (SSE) | ✓ | ✓ | ✓ | ✓ |
| Budget & Controls | ||||
| Per-key budget limits | ✕ | ✓ | ✓ | ✓ |
| Auto top-up | ✕ | ✓ | ✓ | ✓ |
| Spend alerts | Email only | ✓ | ✓ | ✓ |
| Multi-wallet (teams) | ✕ | ✕ | ✓ | ✓ |
| Observability | ||||
| Request logs retention | 7 days | 30 days | 90 days | 365 days |
| Usage analytics dashboard | Basic | Standard | Advanced | Advanced + export |
| Cost attribution labels | ✕ | ✕ | ✓ | ✓ |
| SLA & Support | ||||
| Uptime target / SLA | ✕ | ✕ | 99.9% target | 99.99% SLA (contracted) |
| Support channel | Community | Priority email | Dedicated Slack | |
| Response time | Best effort | <48h | <8h | <2h |
| Custom invoicing / PO | ✕ | ✕ | ✕ | ✓ |
Cete Ventures Pte. Ltd. (UEN: 202421160G) is GST-registered in Singapore. Platform subscription fees are subject to 9% GST for Singapore-based customers. Model usage credits for SG GST-registered businesses with a valid GST number may qualify for input tax claim — contact us to provide your registration number. Non-SG customers are invoiced without GST. A valid GST invoice is issued for every transaction.
Join hundreds of developers and teams using NexToken to reduce LLM costs and improve reliability.