Local inference is free forever: run any model on your Mac, with unlimited usage and full privacy. When you need more power, upgrade to cloud inference with the same API you already know.
Running models too large for your Mac? Cloud plans give you access to datacenter-grade GPUs: same endpoint, bigger models, faster results.
Run models 100% on your Mac — no cloud, no usage caps, fully private.
Same NovaMLX API, cloud-powered — run bigger models, faster.
Everything in Local Inference, plus:
Cloud inference and collaboration for your whole team.
Everything in Pro, plus: