BitsAndBytes
AI & LLMs: Accessible large language models via k-bit quantization for PyTorch.
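The core idea behind k-bit quantization can be illustrated with a minimal absmax int8 sketch in plain NumPy. This is a conceptual illustration only, under the assumption of simple per-tensor absmax scaling; it is not bitsandbytes' actual kernels or API.

```python
import numpy as np

# Conceptual sketch of absmax int8 quantization, the idea behind k-bit
# quantization of model weights (NOT bitsandbytes' real implementation).
def absmax_quantize(x: np.ndarray):
    # Map the largest-magnitude value onto the int8 limit (+/-127).
    scale = 127.0 / float(np.max(np.abs(x)))
    q = np.round(x * scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover an approximation of the original float values.
    return q.astype(np.float32) / scale

weights = np.array([0.5, -1.2, 3.4, -0.05], dtype=np.float32)
q, scale = absmax_quantize(weights)
restored = dequantize(q, scale)
# Round-trip error is bounded by half a quantization step.
assert np.max(np.abs(weights - restored)) <= 0.5 / scale
```

Each value is stored as a single int8 plus one shared scale factor, which is roughly a 4x memory reduction over float32; the library's 4-bit and block-wise schemes refine this same idea.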
Release History
continuous-release_main (1 feature): This pre-release provides the latest development wheels for all supported platforms, automatically rebuilt on each commit to the main branch. Installation requires using a specific wheel URL matching the target operating system and architecture.
0.49.1 (1 fix): This patch release updates AMD targets and adds a safety guard for the quantization state attribute.
0.49.0 (Breaking, 4 fixes, 5 features): This release brings significant performance boosts for x86-64 CPUs, introduces experimental ROCm support via PyPI wheels, and adds compatibility for macOS 14+. Support for Python 3.9 and Maxwell GPUs has been dropped.
0.48.2 (2 fixes, 1 feature): Version 0.48.2 fixes critical bugs related to quantization indexing and CPU/disk offloading regressions, and introduces Windows build support for SYCL kernels on XPU.
0.48.1 (2 fixes): Version 0.48.1 addresses a critical regression in LLM.int8() affecting inference with pre-quantized checkpoints and fixes an issue with 8-bit parameter device movement.
0.48.0 (Breaking, 3 fixes, 9 features): This release introduces official support for Intel GPUs and Intel Gaudi accelerators, alongside significant performance improvements for CUDA 4-bit dequantization kernels and compatibility updates for PyTorch and CUDA versions. Support for PyTorch 2.2 and Maxwell GPUs has been dropped.
0.47.0 (8 fixes, 9 features): This release introduces FSDP2 compatibility for Params4bit and significantly expands hardware support by improving CPU/XPU coverage and adding Volta support to recent CUDA builds. Several bugs related to 4-bit quantization and documentation have also been resolved.
0.46.1 (2 fixes, 1 feature): This release improves torch.compile compatibility for Params4bit, fixes documentation issues, and adds support for CUDA 12.9 builds. It also streamlines the build process by automatically invoking CMake during PEP 517 builds.
0.46.0 (Breaking, 8 fixes, 6 features): This release introduces significant improvements to `torch.compile` compatibility for both LLM.int8() and 4-bit quantization, alongside a major refactoring to integrate with PyTorch Custom Operators. Support for Python 3.8 and older PyTorch versions has been dropped.
continuous-release_multi-backend-refactor: No release notes provided.
0.45.5 (1 fix, 1 feature): This minor release restores the CPU build of bitsandbytes, which was omitted from the v0.45.4 wheels.
0.45.4 (1 fix): This minor release improves CPU-only usage of bitsandbytes, featuring a bug fix and better system compatibility on Linux achieved by adjusting the build environment.
0.45.3 (4 fixes, 1 feature): This patch release introduces support for NVIDIA Blackwell GPUs via a new CUDA 12.8 build and includes several minor bug fixes.
0.45.2 (1 fix): This patch release resolves a RuntimeError raised during bitsandbytes import in PyTorch 2.6 environments where Triton was installed but no GPU was present.
0.45.1 (Breaking, 2 fixes, 2 features): This patch release focuses on dependency compatibility, notably setting the minimum PyTorch version to 2.0.0 and ensuring compatibility with triton>=3.2.0. It also includes build system updates and packaging cleanup.