Changelog

Releases, announcements, and updates.

NovaMLX v1.0.2 Latest April 26, 2026

## What's New in v1.0.2

### Safe Inference Defaults

- **Default frequencyPenalty=0.5**: prevents repetition collapse in small quantized models without requiring user parameters
- **Default maxTokens lowered to 2048**: prevents runaway generation for bare API requests
- **Fixed temperature=0 override bug**: explicit temp=0 was being overridden to 0.6

### FusedBatchScheduler Improvements

- **Frequency penalty in fused decode loop**: GPU-based scatter_add penalty prevents repetition collapse
- **Accumulated batch decode**: fixes whitespace stripping in SentencePiece tokenized models
- **Control token filtering**: protocol tokens no longer leak into output

### Other Changes

- Agent-aware context scaling with ClientDetector
- N-gram speculative decoding in FusedBatchScheduler
- ProcessMemoryEnforcer for memory pressure handling
- OCROptimizer with model-specific sampling overrides
- UI overhaul, deadlock fix
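
The safe inference defaults and the frequency penalty in the v1.0.2 notes can be illustrated with a minimal Python sketch. This is not NovaMLX's code: the names `resolve_params`, `SamplingParams`, and `apply_frequency_penalty` are hypothetical, the 0.6 fallback temperature is assumed from the bug description, and the penalty uses the common "subtract penalty times occurrence count from the logit" formulation on the CPU rather than the GPU scatter_add path the notes mention.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional

# Values taken from the v1.0.2 notes. Treating 0.6 as the fallback
# temperature is an assumption based on the described override bug.
DEFAULT_FREQUENCY_PENALTY = 0.5
DEFAULT_MAX_TOKENS = 2048
DEFAULT_TEMPERATURE = 0.6


@dataclass
class SamplingParams:
    temperature: float
    max_tokens: int
    frequency_penalty: float


def resolve_params(
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    frequency_penalty: Optional[float] = None,
) -> SamplingParams:
    """Fill in defaults only for parameters the caller left unset.

    Comparing against None (instead of writing `temperature or 0.6`)
    keeps an explicit temperature of 0 intact, since 0 is falsy in Python.
    """
    return SamplingParams(
        temperature=DEFAULT_TEMPERATURE if temperature is None else temperature,
        max_tokens=DEFAULT_MAX_TOKENS if max_tokens is None else max_tokens,
        frequency_penalty=(
            DEFAULT_FREQUENCY_PENALTY
            if frequency_penalty is None
            else frequency_penalty
        ),
    )


def apply_frequency_penalty(
    logits: List[float], generated_ids: List[int], penalty: float
) -> List[float]:
    """Lower each token's logit in proportion to how often it was generated."""
    penalized = list(logits)  # copy so the caller's logits are not mutated
    for token_id, count in Counter(generated_ids).items():
        penalized[token_id] -= penalty * count
    return penalized


if __name__ == "__main__":
    # A bare request gets the safe defaults; an explicit temperature of 0 survives.
    print(resolve_params())
    print(resolve_params(temperature=0))
    # Token 2 was generated twice and token 0 once, so both are pushed down.
    print(apply_frequency_penalty([1.0, 2.0, 3.0], [2, 2, 0], penalty=0.5))
```

Under this formulation a token that keeps recurring accumulates a larger penalty each time, which makes a repetition loop progressively less likely to continue.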

v1.0.0 April 25, 2026

**Full Changelog**: https://github.com/cnshsliu/novamlx/compare/v1.0.1...v1.0.0

NovaMLX v1.0.1 April 23, 2026

The first release of NovaMLX
