Unsloth Updates
Unsloth Changelog for our latest releases, improvements and fixes.
To use the latest changes, update Unsloth.
New Important Updates
It’s only been 2 days since our previous release, but we’ve got some more important updates:
Inference is now 20–30% faster. Previously, tool-calling and repeat penalty could slow inference below normal speeds. Inference tokens/s should now perform the same as llama-server/llama.cpp.
Now auto-detects older or pre-existing models downloaded from LM Studio, Hugging Face, and similar sources.
Inference token/s speed is now calculated correctly. Previously, tokens/s included startup time, which made the displayed speed look slower than it actually was. It should now reflect 'true' inference speed.
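The corrected measurement can be sketched like this: count throughput only across the interval from the first generated token to the last, so startup (model load and prompt processing) is excluded. This is an illustrative sketch, not Studio's actual code.

```python
def decode_tokens_per_second(token_timestamps):
    """Decode-only tokens/s: elapsed time runs from the first generated
    token to the last, so startup time never enters the denominator."""
    if len(token_timestamps) < 2:
        return 0.0
    elapsed = token_timestamps[-1] - token_timestamps[0]
    return (len(token_timestamps) - 1) / elapsed
```

Note that including a timestamp taken before generation starts (the old behaviour) would inflate the denominator and under-report the true speed.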
CPU usage no longer spikes. Previously, the inline querier identity changed every render, causing useLiveQuery to resubscribe continuously.
Unsloth Studio now has a shutdown x button and shuts down properly. Previously, closing it after opening from the desktop icon would not close it properly. Now, launching from the shortcut also opens the terminal, and closing that terminal fully exits Unsloth Studio. If you still have it open from a previous session, you can restart your computer or run lsof -i :8888 then kill -9 <PID>.
Even better tool-calling and web search with reduced errors.
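The manual cleanup above can also be scripted. The helper below is a sketch that shells out to lsof (assumed to be installed) and force-kills any PID bound to the port, equivalent to running lsof -i :8888 followed by kill -9 <PID>:

```python
import os
import signal
import subprocess

def kill_port(port: int) -> bool:
    """Force-kill whatever is listening on `port`.
    Returns True if at least one process was found and signalled."""
    try:
        # -t prints bare PIDs only, one per line
        out = subprocess.run(
            ["lsof", "-t", "-i", f":{port}"],
            capture_output=True, text=True, check=False,
        ).stdout.split()
    except FileNotFoundError:  # lsof not installed
        return False
    for pid in out:
        os.kill(int(pid), signal.SIGKILL)
    return bool(out)
```

For example, kill_port(8888) frees the default Unsloth Studio port; it returns False if nothing was listening.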
Updated documentation with lots of new info on deleting models, uninstalling etc.
Cleaner, smarter install and setup logging across Windows and Linux. Output is now easier to read with consistent formatting, quieter by default for a smoother experience, and supports richer --verbose diagnostics when you want full technical detail.
You can now view your training history!
First Release Since Unsloth Studio
Hey guys, this is our first release since we launched Unsloth Studio. Lots of new features and fixes:
You can now update Unsloth Studio! Please update via: unsloth studio update
Windows CPU or GPU now works seamlessly. Please reinstall!
App shortcuts. Once installed, you can now launch on Windows, MacOS and Linux via a shortcut icon in the Start menu / Launchpad and on the Desktop.
Pre-compiled llama.cpp binaries and mamba_ssm - 6x faster installs! Also <300MB in size for binaries.
50% reduced installation sizes (-7GB or more savings), 2x faster installs and faster resolving. 50% smaller PyPI sizes.
Tool calling improved. Better llama.cpp parsing, no raw tool markup in chat, faster inference, a new Tool Outputs panel, timers.
MacOS and CPU now have Data Recipes enabled with multi-file uploading.
Preliminary AMD support for Linux-only machines - auto-detects.
Settings sidebar redesign. Settings are now grouped into Model, Sampling, Tools, and Preferences.
Context length now adjustable. Keep in mind this is not needed as llama.cpp smartly uses the exact context you need via --fit on
Multi-file upload. Data recipes now support multiple drag-and-drop uploads for PDF, DOCX, TXT, and MD, with backend extraction, saved uploads, and improved previews.
Colab with free T4 GPUs now works with Unsloth Studio! Try it here. Thanks to pre-compiled binaries, it's also 20x faster!
Better chat observability. Studio now shows llama-server timings and usage, a context-window usage bar, and richer source hover cards.
Better UX overall - clickable links, better LaTeX parsing, tool / code / web tooltips for default cards and much more!
LiteLLM - Unsloth Studio and Unsloth were NOT affected by the recent LiteLLM compromise. Nemo Data Designer used LiteLLM only up to 1.80, not the affected 1.82.7 or 1.82.8, and has since removed it entirely.
We now have a new one-line install command, just run:
Fixes:
Windows/setup improvements. Fixed silent Windows exits, Anaconda/conda-forge startup crashes, broken non-NVIDIA Windows installs, and missing early CUDA/stale-venv setup checks.
System prompts fixed. They work again for non-GGUF text and vision inference.
Persistent system prompts and presets. Custom system prompts and chat presets now persist across reloads and page changes.
GGUF export expanded. Full fine-tunes, not just LoRA/PEFT, can now export to GGUF. Base model resolution is more reliable, and unsupported export options are disabled in the UI.
Chat scroll/layout fixes. Fixed scroll-position issues during generation, thinking-panel layout shift, and viewport jumps when collapsing reasoning panels.
Smarter port conflict detection. Studio now detects loopback conflicts, can identify the blocking process when possible, and gives clearer fallback-port messages.
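A loopback conflict check like the one described can be sketched with the standard library: attempt a TCP connection to 127.0.0.1 on the port and treat a successful connect as "something is already listening". This is an illustrative sketch, not Studio's actual detection code.

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a process is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.25)
        # connect_ex returns 0 on success instead of raising
        return s.connect_ex((host, port)) == 0
```

When the check reports a conflict, a launcher can then either identify the blocking process or fall back to the next free port.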
New Tool calling + Windows Stability
Claude Artifacts now works, so HTML (for example, a snake game) can be executed inside the chat
+30% more accurate tool calls, especially for small models + a timer for tool calls
Tool + Web Search outputs can be saved + toggle the auto-healing tool on/off
Many bug fixes - Windows CPU works, Mac is more seamless, and installs are faster and smaller
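The per-tool-call timer mentioned above can be sketched as a small context manager that records wall-clock duration for each call. The helper and its name are hypothetical, not Studio's API:

```python
import time
from contextlib import contextmanager

@contextmanager
def tool_timer(timings: dict, name: str):
    """Record how long the wrapped tool call took, in seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = time.perf_counter() - start
```

Wrapping each invocation, e.g. with tool_timer(timings, "web_search"), yields per-call durations that a UI can display next to the tool output.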