Migrating to vLLM v0.13.0
Version v0.13.0 introduces 7 breaking changes. This guide details how to update your code.
Released: 12/19/2025
⚠️ Check Your Code
If you use any of these symbols, you need to read this guide:
AttentionConfig, VLLM_ATTENTION_BACKEND, PassConfig, ModelConfig, embed_input_ids, embed_multimodal, selective_state_update, compile_ranges, encoding_format

Breaking Changes
● Issue #1: PassConfig flags have been renamed per RFC #27995.
● Issue #2: The environment variable VLLM_ATTENTION_BACKEND has been removed; use the --attention-backend CLI argument instead (see the sketch after this list).
● Issue #3: The -O.xx flag has been removed.
● Issue #4: Deprecated plugin and compilation fields have been removed.
● Issue #5: Deprecated task, seed, and multi-modal (MM) settings have been removed.
● Issue #6: The embed_input_ids and embed_multimodal fallback mechanisms have been removed.
● Issue #7: The tokenizer setter has been removed.
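As a minimal sketch of Issue #2, the snippet below contrasts the old and new ways of launching a server from a Python deployment script. It assumes the vllm CLI is installed and on your PATH; the model name and the FLASHINFER backend value are placeholders, not recommendations.

```python
import os
import subprocess

# Before v0.13.0: the backend was selected through an environment variable
# (removed in this release).
# subprocess.run(
#     ["vllm", "serve", "meta-llama/Llama-3.1-8B-Instruct"],
#     env={**os.environ, "VLLM_ATTENTION_BACKEND": "FLASHINFER"},
# )

# v0.13.0 and later: pass the backend explicitly on the command line.
subprocess.run([
    "vllm", "serve", "meta-llama/Llama-3.1-8B-Instruct",
    "--attention-backend", "FLASHINFER",
])
```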
Migration Steps
1. Update deployment scripts to replace the VLLM_ATTENTION_BACKEND environment variable with the --attention-backend CLI flag.
2. Review and update PassConfig flag names in custom configurations to match RFC #27995.
3. Replace usage of --convert reward with --convert embed (see the sketch after this list).
4. Remove any usage of the deprecated -O.xx optimization flags.
5. Ensure tokenizer initialization does not rely on the removed tokenizer setter.
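A hedged sketch of step 3, again launching the server via subprocess; the model id is hypothetical, and the note on -O.xx reflects only that the flag is gone (step 4), not a specific replacement.

```python
import subprocess

model = "my-org/my-reward-model"  # hypothetical model id; substitute your own

# Before v0.13.0 the same model would have been served with:
#   vllm serve my-org/my-reward-model --convert reward
# From v0.13.0, request the embedding conversion instead:
subprocess.run(["vllm", "serve", model, "--convert", "embed"])

# Any -O.xx style optimization flags in the same launch script should simply be
# dropped (step 4); check the v0.13.0 docs if you still need compilation tuning.
```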
Release Summary
vLLM v0.13.0 introduces support for NVIDIA Blackwell Ultra and DeepSeek-V3.2, alongside a major performance overhaul for Whisper models. This release transitions attention configuration from environment variables to CLI arguments and includes significant core engine optimizations such as Model Runner V2.
Need More Details?
View the full release notes and all changes for vLLM v0.13.0.