
Migrating to vLLM v0.11.1

vLLM v0.11.1 introduces 4 breaking changes. This guide details how to update your code.

Released: 11/18/2025

4 Breaking Changes · 5 Migration Steps · 9 Affected Symbols

⚠️ Check Your Code

If you use any of these symbols, you need to read this guide:

vllm.worker, VllmConfig, MotifForCausalLM, get_input_embeddings_v0, Phi4FlashForCausalLM, NixlConnector, torch.compile, MiDashengLM, GLM4 MoE Reasoning Parser

Breaking Changes

Issue #1

Removed vllm.worker module; update imports to use the new internal structure.
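Where code imports directly from the removed module, only the import path should need to change. A minimal before/after sketch, assuming the GPU worker now lives under the V1 engine package; the exact module path is an assumption, so confirm it against your installed version:

```python
# Old import path (removed in v0.11.1):
# from vllm.worker.worker import Worker

# Assumed new location under the V1 worker package -- verify against your install:
from vllm.v1.worker.gpu_worker import Worker
```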

Issue #2

Removed MotifForCausalLM model support.

Issue #3

Consolidated speculative decode method names for MTP, which may affect custom implementations relying on old naming conventions.

Issue #4

Removed V0 conditions for multimodal embeddings merging, requiring migration to V1 logic.

Migration Steps

1. Update torch to 2.9.0 and CUDA to 12.9.1 to use the new default build features (a quick version check appears after the Release Summary).
2. Replace imports from vllm.worker with the updated core worker locations.
3. Update speculative decoding implementations to use the consolidated MTP method names.
4. If you use Anthropic clients, point them at the new /v1/messages endpoint on vllm serve (see the client example after this list).
5. Update custom model loaders to use VllmConfig from config/vllm.py instead of config/__init__.py (see the import sketch after this list).
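For step 4, here is a hedged sketch of pointing the official Anthropic Python SDK at a local vllm serve instance; the host, port, model name, and placeholder API key are illustrative assumptions, not values mandated by the release.

```python
from anthropic import Anthropic

# Assumes `vllm serve <model>` is running locally on the default port 8000.
# The SDK appends /v1/messages to the base URL, which is the new endpoint.
client = Anthropic(
    base_url="http://localhost:8000",
    api_key="EMPTY",  # placeholder; vLLM only checks keys if started with --api-key
)

response = client.messages.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # whichever model your server loaded
    max_tokens=128,
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.content[0].text)
```

For step 5, a minimal before/after for the VllmConfig import in a custom model loader, assuming the new definition is importable as vllm.config.vllm (an assumption derived from the config/vllm.py path named above):

```python
# Old import, resolved through vllm/config/__init__.py:
# from vllm.config import VllmConfig

# Assumed new import path based on the config/vllm.py location -- verify locally:
from vllm.config.vllm import VllmConfig
```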

Release Summary

This release updates vLLM to PyTorch 2.9.0 and CUDA 12.9.1, introduces Anthropic API compatibility, and significantly improves the stability of async scheduling and torch.compile integration.
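After upgrading (step 1 of the migration steps), a quick way to confirm your environment matches the versions this release targets is to inspect the installed PyTorch build:

```python
import torch

# vLLM v0.11.1 targets PyTorch 2.9.0 with a CUDA 12.9 runtime, so on a CUDA
# build the version string should look roughly like "2.9.0+cu129".
print("torch version:", torch.__version__)
print("cuda runtime:", torch.version.cuda)
print("cuda available:", torch.cuda.is_available())
```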

Need More Details?

View the full release notes and all changes for vLLM v0.11.1.
