NovaMLX Cloud

Local inference is free forever — run any model on your Mac, unlimited, fully private. When you need more power, upgrade to cloud inference with the same API you already know.

Larger models hitting your Mac's limits? Cloud plans give you access to datacenter-grade GPUs — same endpoint, bigger models, faster results.

Local Inference

Free & Open Source

Run models 100% on your Mac — no cloud, no limits, fully private.

  • Run 50+ model families locally on Apple Silicon
  • OpenAI & Anthropic compatible API
  • Vision (VLM), structured output & tool calling
  • Unlimited usage — your hardware, your rules
  • 100% private, zero data leaves your machine
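Because the local server speaks the OpenAI API shape, existing client code can target it with only a base-URL change. The sketch below builds an OpenAI-style chat completions payload; the port (8080), path, and model name are illustrative assumptions, not documented NovaMLX defaults — check your local server configuration for the real values.

```python
import json

# Hypothetical local endpoint -- NovaMLX's actual port/path may differ.
BASE_URL = "http://localhost:8080/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible chat completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# POST this JSON to BASE_URL with any HTTP client or the OpenAI SDK.
payload = build_chat_request("llama-3-8b", "Hello!")
print(json.dumps(payload, indent=2))
```

Any OpenAI-compatible client library should accept the same payload unchanged.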
Cloud Pro (Popular)

$9/mo or $90/yr (save 17% with annual billing)

Same NovaMLX API, cloud-powered — run bigger models, faster.

Everything in Local Inference, plus:

  • Cloud inference on datacenter-grade GPU hardware
  • Run models too large for local memory
  • Priority support
  • API key management
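"Same endpoint, bigger models" means client code stays identical when you move to cloud inference: only the base URL (and credentials) change. A minimal sketch of that switch — the URLs and environment-variable name are assumptions for illustration, not documented NovaMLX values:

```python
import os

# Hypothetical endpoints -- substitute the real NovaMLX URLs.
LOCAL_URL = "http://localhost:8080/v1"
CLOUD_URL = "https://cloud.novamlx.example/v1"

def resolve_base_url(use_cloud: bool) -> str:
    """Pick local or cloud inference; request code is identical either way."""
    return CLOUD_URL if use_cloud else LOCAL_URL

# e.g. flip via an (assumed) environment variable:
base_url = resolve_base_url(os.getenv("NOVAMLX_CLOUD") == "1")
print(base_url)
```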

Cloud Team

$29/mo

Cloud inference and collaboration for your whole team.

Everything in Pro, plus:

  • Team seats & management
  • Usage analytics dashboard
  • Custom model hosting
  • SSO & audit logs