💎 Transparent Pricing

Pay only for what you use.
Route smarter, spend less.

No hidden fees. No vendor lock-in. NexToken routes your requests to the optimal provider — automatically.

Monthly

Annual Save 20%

−96%

nex-pro vs GPT-4o

−40%

vs OpenAI Embeddings

+20%

Flat routing margin (3rd-party)

12+

Providers on one API key

Developer

Free forever. Start building immediately.

Start Free

✓$5 one-time credit · no card needed

✓100 RPM rate limit

✓3 API keys

✓Standard routing

✓REST API access

✗Custom routing rules

✗SLA guarantee

✗Priority support

Pro

$29/mo

For indie developers and growing projects.

Start Pro

✓$10 monthly credit included

✓200 RPM rate limit

✓20 API keys

✓Smart routing + nex-auto

✓Prompt-cache savings auto-applied

✓Streaming + batch + vision

✓Usage analytics

✗SLA guarantee

⭐ Most Popular

Business

$149/mo

For teams shipping production AI products.

Start Business

✓$30 monthly credit included

✓1,000 RPM rate limit

✓Unlimited API keys

✓Smart router + semantic cache (95% off hits)

✓PII redaction + content moderation (PDPA-ready)

✓Batch API (30% discount, 100 items/call)

✓Prompt templates + budget controls

✓99.9% uptime target · priority email support · live status

Enterprise

Custom

Negotiated pricing for large-scale deployments.

Contact Sales

✓Unlimited credits (prepaid)

✓Custom RPM limits

✓Dedicated routing cluster

✓Volume discounts negotiable

✓99.99% uptime SLA (contracted, with credits)

✓Dedicated Slack channel

✓Custom invoicing / PO

✓Reserved throughput (custom RPM floor)

✓Fine-tune endpoint + DPA

★On-premise option

Platform fee + pass-through model costs. Subscription fees cover your NexToken platform access, rate limits, and features. Model usage is billed separately at cost-plus-markup rates shown in the table below. All prices in USD. Singapore users are charged 9% GST on platform fees per IRAS regulations.

🚀 NEW · MAY 2026

23 upgrades shipped — same API, lower bills.

Every chat completion now passes upstream prompt-cache savings through automatically. Near-duplicate prompts bill at 5% of normal retail via our self-hosted semantic cache. Switch model: "nex-auto" and let the gateway pick the right tier per prompt. No client code changes required.

Cost

Prompt-cache pass-through · semantic cache 95% off · nex-auto smart router · batch endpoint 30% off · tokenize + estimate-cost

Compliance

Content moderation · PII redaction (PDPA / GDPR) · prompt-injection defence · context-window pre-flight

Reliability

Redis-backed circuit breaker · in-provider retry + backoff · region-aware routing · multi-key load balance · live P95 scoring

Reach

DALL-E 3 · Whisper · TTS · vision token math · Cohere · Perplexity · xAI Grok · prompt templates · fine-tune API

All additive. Zero breaking changes. View API docs →

NexToken vs. Official Provider Prices

Transparent breakdown of what you pay via NexToken versus going direct. NexToken native models beat the market — third-party access adds a flat 20% routing margin that covers infrastructure, compliance, and reliability.

⭐ NexToken Native

Market-beating prices on nex-series models

−96%

nex-pro vs GPT-4o
$0.12 vs $2.50 per 1M input tokens

−20%

nex-pro vs GPT-4o-mini
$0.12 vs $0.15 per 1M input tokens

−40%

nex-embed-zh vs text-embedding-3-small
$0.012 vs $0.020 per 1M tokens

Self-hosted Singapore GPU · OpenAI SDK drop-in · Strong Chinese & English · PDPA compliant · Zero external API dependency

⏳ 采购谈判中 · Procurement In Progress

Target: below-market prices + routing

Third-party model prices currently reflect pre-contract spot rates — no bulk procurement discounts are yet applied. Once wholesale contracts are finalised:

−10%+

NexToken target: below official prices
Buy at <80% of market → sell at <96% of market (still includes 20% routing margin)

Procurement at <80% of official rates currently under negotiation
Routing, fallback, APAC edge, PDPA compliance included in margin
Prompt-cache savings from providers passed through automatically
Unified billing — one wallet, one invoice for all providers

nex-pro & nex-embed-zh already beat the market today — self-hosted Singapore GPU, no procurement deal needed.

Model	NexToken (You Pay) input / output per 1M	Official Direct	vs. Direct
⭐ NexToken Native Models
nex-pro 32k ctx	$0.12 / $0.48	vs GPT-4o-mini: $0.15 / $0.60OpenAI.com	−20% input
nex-embed-zh 512 ctx	$0.012 / —	text-embedding-3-small: $0.020OpenAI.com	−40%
OpenAI — via NexToken vs. platform.openai.com
gpt-4o	$3.00 / $12.00	$2.50 / $10.00platform.openai.com	Trial Rate
gpt-4o-mini	$0.18 / $0.72	$0.15 / $0.60platform.openai.com	Trial Rate
gpt-4.1	$2.40 / $9.60	$2.00 / $8.00platform.openai.com	Trial Rate
o3-mini / o4-mini	$1.32 / $5.28	$1.10 / $4.40platform.openai.com	Trial Rate
Anthropic — via NexToken vs. anthropic.com/pricing
claude-sonnet-4-6	$3.60 / $18.00	$3.00 / $15.00anthropic.com/pricing	Trial Rate
claude-haiku-4-5	$0.96 / $4.80	$0.80 / $4.00anthropic.com/pricing	Trial Rate
Google DeepMind — via NexToken vs. ai.google.dev
gemini-2.5-pro	$1.50 / $12.00	$1.25 / $10.00ai.google.dev	Trial Rate
gemini-2.5-flash	$0.18 / $4.20	$0.15 / $3.50ai.google.dev	Trial Rate
DeepSeek — via NexToken vs. platform.deepseek.com
deepseek-v3	$0.324 / $1.32	$0.27 / $1.10platform.deepseek.com	Trial Rate
deepseek-r1	$0.66 / $2.64	$0.55 / $2.20platform.deepseek.com	Trial Rate

Official prices are provider list prices as of June 2026. Third-party model prices shown are current trial rates (pre-contract spot pricing). NexToken is actively negotiating wholesale procurement contracts targeting <80% of official rates — upon finalisation, NexToken prices will be below official market prices. NexToken native models (nex-pro, nex-embed-zh) already beat market rates. Pro (−1%), Business (−2.5%), and Enterprise (up to −15%) tiers reduce the effective markup further.

Model Token Pricing

Prices shown are what you pay at Developer tier. Prices per 1,000,000 tokens (1M). Input / Output billed separately.

🔄 试用价格

价格谈判中，敬请期待更大折扣 · Price negotiation in progress 第三方模型当前价格为现货试用费率，尚未包含批量采购折扣。NexToken 正积极与各大提供商进行批量采购协议谈判，目标采购价格低于官网公开价格的 80%。协议生效后，NexToken 路由价格将低于官网直接购买价格，同时涵盖路由、故障切换、APAC 优化及合规保障。
Third-party models currently reflect pre-contract spot rates. Procurement contracts targeting <80% of market rates are under active negotiation — final prices will undercut direct provider pricing.

✓ nex-pro 和 nex-embed-zh 已低于市场价格 — 新加坡 GPU 自托管，无需等待采购协议生效。

NexToken Native Models Singapore GPU

Model	NexTokenin/out per 1M	Comparable Providerin/out per 1M	Best For	Capabilities
nex-pro32k ★ Default	$0.12 / $0.48	vs GPT-4o-mini$0.15 / $0.60	The default Nex model. Chat, code, content, summarisation. Self-hosted Singapore GPU. Strong Chinese + English. ~96% cheaper than GPT-4o	stream tools
nex-embed-zh512	$0.012	vs text-embedding-3-small$0.020	Chinese-strong embeddings, 1024-dim. BGE-M3, self-hosted Singapore. ~40% cheaper than text-embedding-3-small	/v1/embeddings

Quick pick

💬 Chat, code, anything → nex-pro

🇨🇳 Chinese embeddings → nex-embed-zh

Aliases nex-smart and nex-coder transparently route to nex-pro — no migration needed.

OpenAI

Model	NexTokenTrial · in/out per 1M	Official Directin/out per 1M	Context	Capabilities
gpt-4o128k	$3.00 / $12.00	$2.50 / $10.00	128K	stream vision tools
gpt-4o-mini128k	$0.18 / $0.72	$0.15 / $0.60	128K	stream vision tools
gpt-4.11M	$2.40 / $9.60	$2.00 / $8.00	1M	stream vision tools
gpt-4.1-mini1M	$0.48 / $1.92	$0.40 / $1.60	1M	stream tools
gpt-4.1-nano1M	$0.12 / $0.48	$0.10 / $0.40	1M	stream tools
o3200k	$12.00 / $48.00	$10.00 / $40.00	200K	stream
o3-mini200k	$1.32 / $5.28	$1.10 / $4.40	200K	stream
o4-mini200k	$1.32 / $5.28	$1.10 / $4.40	200K	stream

Anthropic

Model	NexTokenTrial · in/out per 1M	Official Directin/out per 1M	Context	Capabilities
claude-opus-4-6200k	$18.00 / $90.00	$15.00 / $75.00	200K	stream vision tools
claude-sonnet-4-6200k	$3.60 / $18.00	$3.00 / $15.00	200K	stream vision tools
claude-haiku-4-5-20251001200k	$0.96 / $4.80	$0.80 / $4.00	200K	stream vision tools

Google DeepMind

Model	NexTokenTrial · in/out per 1M	Official Directin/out per 1M	Context	Capabilities
gemini-2.5-pro1M	$1.50 / $12.00	$1.25 / $10.00	1M	stream vision tools
gemini-2.5-flash1M	$0.18 / $4.20	$0.15 / $3.50	1M	stream vision tools

DeepSeek

Model	NexTokenTrial · in/out per 1M	Official Directin/out per 1M	Context	Capabilities
deepseek-v3128k	$0.324 / $1.32	$0.27 / $1.10	128K	stream tools
deepseek-r1128k	$0.66 / $2.64	$0.55 / $2.20	128K	stream

Meta Llama

Model	NexTokenTrial · in/out per 1M	Official Directin/out per 1M	Context	Capabilities
llama-4-scout512k	$0.18 / $0.72	$0.15 / $0.60	512K	stream tools
llama-4-maverick256k	$0.60 / $2.40	$0.50 / $2.00	256K	stream tools
llama-3.3-70b131k	$0.708 / $0.948	$0.59 / $0.79	131K	stream tools

Mistral AI

Model	NexTokenTrial · in/out per 1M	Official Directin/out per 1M	Context	Capabilities
mistral-large-latest128k	$2.40 / $7.20	$2.00 / $6.00	128K	stream tools
mistral-medium-latest128k	$0.96 / $2.88	$0.80 / $2.40	128K	stream tools
mistral-small-latest128k	$0.24 / $0.72	$0.20 / $0.60	128K	stream tools
codestral-latest256k	$0.36 / $1.08	$0.30 / $0.90	256K	stream
mixtral-8x7b-3276832k	$0.288 / $0.288	$0.24 / $0.24	32K	stream

Qwen / Alibaba

Model	NexTokenTrial · in/out per 1M	Official Directin/out per 1M	Context	Capabilities
qwen-3.5-397b131k	$1.08 / $1.08	$0.90 / $0.90	131K	stream
qwen-max32k	$1.92 / $7.68	$1.60 / $6.40	32K	stream tools
qwen-plus128k	$0.48 / $1.44	$0.40 / $1.20	128K	stream tools
qwen-turbo128k	$0.24 / $0.72	$0.20 / $0.60	128K	stream tools
qwen-long10M	$0.60 / $2.40	$0.50 / $2.00	10M	stream

ZhiPu GLM

Model	NexTokenTrial · in/out per 1M	Official Directin/out per 1M	Context	Capabilities
glm-4-plus128k	$0.84 / $0.84	$0.70 / $0.70	128K	stream tools
glm-4128k	$0.84 / $0.84	$0.70 / $0.70	128K	stream
glm-4-air128k	$0.168 / $0.168	$0.14 / $0.14	128K	stream
glm-4-flash128k	$0.084 / $0.084	$0.07 / $0.07	128K	stream

Cohere

Model	NexTokenTrial · in/out per 1M	Official Directin/out per 1M	Context	Capabilities
command-r-plus128k	$3.00 / $12.00	$2.50 / $10.00	128K	stream tools
command-r128k	$0.18 / $0.72	$0.15 / $0.60	128K	stream tools

Perplexity

Model	NexTokenTrial · in/out per 1M	Official Directin/out per 1M	Context	Capabilities
sonar128k	$0.12 / $0.12	$0.10 / $0.10	128K	stream
sonar-pro200k	$3.60 / $18.00	$3.00 / $15.00	200K	stream

xAI Grok

Model	NexTokenTrial · in/out per 1M	Official Directin/out per 1M	Context	Capabilities
grok-3131k	$3.60 / $18.00	$3.00 / $15.00	131K	stream tools
grok-3-mini131k	$0.36 / $0.60	$0.30 / $0.50	131K	stream tools

Google Gemma (via Groq)

Model	NexTokenTrial · in/out per 1M	Official Directin/out per 1M	Context	Capabilities
gemma2-9b-it8k	$0.24 / $0.24	$0.20 / $0.20	8K	stream

Pricing note: All prices are retail at Developer tier (wholesale × 1.20). Pro, Business, and Enterprise tiers receive volume discounts of 1%, 2.5%, and up to 15% — see Volume Billing Tiers below. Third-party model prices are current trial rates pending procurement contracts. Prompt-cache discounts from OpenAI, Anthropic, DeepSeek, and Google pass through to your wallet automatically.

Volume Billing Tiers

The higher your monthly token spend, the lower your effective markup. Tiers reset on the 1st of each month.

Tier	Monthly Token Spend	Markup Rate	Effective Saving	Unlocks
Developer	$0 – $500	Standard	—	3 keys, 20 RPM
Pro	$500 – $5,000	−1%	Up to $50/mo	200 RPM, analytics
Business	$5,000 – $50,000	−2.5%	Up to $1,250/mo	Custom routing, SLA
Enterprise	$50,000+	Negotiated	Up to 15%+	Dedicated cluster, custom terms

* Billing tiers are independent from Loyalty tiers. Billing tiers reflect monthly spend volume; Loyalty tiers reflect cumulative top-up history.

Loyalty Tiers

Cumulative top-up milestones unlock permanent wallet bonuses. Tiers do not reset — they track your total lifetime top-up.

🥉

Bronze

≥ $500 cumulative top-up

Bonus credits on every top-up

+3%

🥈

Silver

≥ $2,000 cumulative top-up

Bonus credits + priority routing queue

+5%

🥇

Gold

≥ $10,000 cumulative top-up

Bonus credits + dedicated routing + SLA

+8%

💎

Platinum

≥ $50,000 cumulative top-up

Maximum bonus + enterprise SLA + custom terms

+12%

Loyalty bonus credits are applied to your wallet at time of top-up. Example: Gold user tops up $1,000 → receives $1,080 wallet balance (+8% bonus). Bonuses do not stack with promotional codes.

Add-Ons

Optional extras available on Pro and Business plans.

🔒

Extended Audit Logs

$19 / month

Retain full request/response logs for 365 days. Export to S3 or GCS. Required for SOC 2 audits.

📊

Advanced Analytics

$29 / month

Cost attribution by team, project, or custom labels. CSV export, Grafana-compatible metrics endpoint.

🚨

Spend Alerts & PagerDuty

$9 / month

Multi-channel alerts (Slack, email, SMS, PagerDuty) with configurable thresholds and escalation policies.

🌐

Dedicated IP Egress

$49 / month

Route all traffic through a static IP pool for firewall whitelisting. Required for some financial and healthcare orgs.

🤝

Shared Slack Support

$99 / month

Join a shared Slack Connect channel with the Nex engineering team. <4h response time, Mon–Fri 9–6 SGT.

⚡

Higher Rate Limits

From $49 / month

Burst capacity packs: 5k, 20k, or 50k additional RPM. Stacks on top of your base plan limit.

Full Feature Comparison

Detailed breakdown of what's included in each plan.

Feature	Developer	Pro	Business	Enterprise
Limits & Access
Monthly free credits	$5 one-time	$10 incl.	$30 incl.	Custom
Requests per minute (RPM)	20	200	1,000	Custom
API keys	3	20	Unlimited	Unlimited
Sub-keys per key	✕	3	10	Unlimited
Context window support	Up to 128k	Up to 200k	Up to 1M	Up to 1M+
Routing & Intelligence
Smart auto-routing	✓	✓	✓	✓
Provider fallback	✕	✓	✓	✓
Custom routing rules	✕	✕	✓	✓
Dedicated routing cluster	✕	✕	✕	✓
Streaming (SSE)	✓	✓	✓	✓
Budget & Controls
Per-key budget limits	✕	✓	✓	✓
Auto top-up	✕	✓	✓	✓
Spend alerts	Email only	✓	✓	✓
Multi-wallet (teams)	✕	✕	✓	✓
Observability
Request logs retention	7 days	30 days	90 days	365 days
Usage analytics dashboard	Basic	Standard	Advanced	Advanced + export
Cost attribution labels	✕	✕	✓	✓
SLA & Support
Uptime target / SLA	✕	✕	99.9% target	99.99% SLA (contracted)
Support channel	Community	Email	Priority email	Dedicated Slack
Response time	Best effort	<48h	<8h	<2h
Custom invoicing / PO	✕	✕	✕	✓

🇸🇬

Singapore GST Notice (9%)

Cete Ventures Pte. Ltd. (UEN: 202421160G) is GST-registered in Singapore. Platform subscription fees are subject to 9% GST for Singapore-based customers. Model usage credits for SG GST-registered businesses with a valid GST number may qualify for input tax claim — contact us to provide your registration number. Non-SG customers are invoiced without GST. A valid GST invoice is issued for every transaction.

Frequently Asked Questions

Do I pay provider API costs separately?

No. NexToken handles all provider API relationships on your behalf. You top up your NexToken wallet and we pay the providers. Your wallet is debited at our cost-plus-markup rates shown in the pricing table above. You never need to sign up with OpenAI, Anthropic, or Google directly.

What happens if my wallet balance hits zero?

API calls will return HTTP 402 (Payment Required) immediately. No debt is accumulated — NexToken operates on a strict prepaid model. You can enable auto top-up on Pro and Business plans to avoid interruptions. Existing in-flight streaming requests will complete (up to 120 seconds) before being terminated.

How does smart routing decide which provider to use?

NexToken's routing engine evaluates real-time provider health scores, current latency, your requested model, and your configured routing preferences. On Business plans, you can pin specific providers per API key or set custom fallback chains. The router runs sub-5ms decisions before proxying your request.

Can I get a refund on unused wallet credits?

Yes. Unused top-up credits (excluding free promotional credits) are refundable within 30 days of the top-up transaction. Refunds are processed back to your original payment method within 5–10 business days. Loyalty bonus credits are non-refundable. See our full Refund & Cancellation Policy.

What's the difference between Billing Tiers and Loyalty Tiers?

Billing Tiers reflect your monthly token spend volume and reduce your per-token markup — they reset on the 1st of each month. Loyalty Tiers reflect your cumulative total top-up history and add bonus credits to your wallet when you top up — they never reset. The two systems are completely independent.

Is there a free trial for paid plans?

Every new account gets $5 in one-time free credits — no credit card required. Simply sign up and start making API calls immediately. Pro and Business plans offer a 14-day money-back guarantee on the subscription fee. Enterprise plans are negotiated individually.

Do prices include streaming responses?

Yes. Streaming (SSE) is supported at no additional cost. Token pricing is identical for streaming and non-streaming requests. Token counts are calculated using the provider's reported usage field where available; otherwise NexToken uses tiktoken-based estimation.

Start routing smarter today

Join hundreds of developers and teams using NexToken to reduce LLM costs and improve reliability.

Get Started Free Try the Playground Talk to Sales

Pay only for what you use.Route smarter, spend less.