Maintenance on a Proxmox node — firmware updates, RAM upgrades, or kernel patches — traditionally means downtime for every VM running on it. With live migration in Proxmox VE, that is no longer the case. VMs are moved from one node to another while they continue running, without users or services noticing a thing. In this article, we cover how migration works, what prerequisites apply, and how to maximize speed and reliability.
What Is Live Migration and Why Does It Matter?
During a live migration, the memory of a running VM is incrementally copied to a target node while the VM keeps running. Only in the final step — the switchover phase — is the VM paused for a few milliseconds, the last modified memory pages transferred, and the VM resumed on the target node. For network connections and applications, this process is transparent.
This enables:
- Maintenance without downtime: Nodes can be updated or repaired without interrupting operations.
- Load balancing: VMs can be moved at runtime to less utilized nodes.
- Hardware replacement: Failing nodes can be drained before shutdown.
Prerequisites for Live Migration
Several conditions must be met for live migration to work reliably:
Shared storage: All VM disks must reside on shared storage accessible from both nodes, for example Ceph, NFS, iSCSI, or GlusterFS. With purely local disks, Proxmox falls back to copying the disks in full: by default this is an offline migration, although qm migrate can also move local disks during an online migration via the --with-local-disks option.
Compatible CPU family: The target node must use the same CPU architecture and family. A VM on an Intel Xeon node cannot be live-migrated to an AMD EPYC node. Within the same family (e.g., different Xeon generations), setting the CPU type to x86-64-v2-AES or another generic type helps avoid compatibility issues.
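Switching to the generic model is a single qm set call; a sketch, assuming VM ID 101 as an example (the new CPU type only takes effect after the VM is restarted):

```shell
# Set a generic CPU type so the VM can migrate across Xeon generations
# (VM ID 101 is an example; requires a VM restart to take effect)
qm set 101 --cpu x86-64-v2-AES
```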
Network connectivity: Both nodes must be reachable via the cluster network. For production environments, we recommend a dedicated migration network (10 GbE or faster) to avoid impacting VM traffic.
Online vs. Offline Migration
| Feature | Online Migration | Offline Migration |
|---|---|---|
| VM status | Keeps running | Shut down |
| Downtime | Milliseconds | Minutes |
| Shared storage | Required | Not required |
| Disk copy | RAM only | Complete VM disks |
| Duration | Seconds to minutes | Minutes to hours |
Offline migration is the right choice when shared storage is unavailable, or when you want to move a VM to a different storage type in the same step.
Migration via the Web Interface
The simplest method: right-click the VM in the Proxmox web interface and select Migrate. Choose the target node and — for local disks — the target storage. Proxmox detects whether an online or offline migration is possible and shows the appropriate options.
For running VMs on shared storage, the Online option appears. Clicking Migrate starts the process. Progress is shown in the task log in real time.
CLI Migration with qm migrate
For automation and scripting, the CLI is the preferred approach:
```shell
# Live-migrate a running VM (ID 101) to node pve2
qm migrate 101 pve2 --online

# Offline migration with disk move to target storage
qm migrate 101 pve2 --targetstorage local-zfs

# Migration over a dedicated network with bandwidth limit
# (--bwlimit is in KiB/s; 512000 KiB/s = 500 MiB/s)
qm migrate 101 pve2 --online --migration_network 10.10.10.0/24 --bwlimit 512000
```
The --online parameter forces live migration. Without it, the VM is shut down, moved, and restarted on the target node.
Bulk Migration Strategies
Before planned maintenance, you often need to move all VMs off a node. A simple bash script handles this:
```shell
# Migrate all running VMs from this node to pve2
for VMID in $(qm list | awk 'NR>1 && $3=="running" {print $1}'); do
    echo "Migrating VM $VMID to pve2..."
    qm migrate "$VMID" pve2 --online
done
```
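Before starting maintenance, it is worth confirming that the loop really drained the node; a small sketch using the same qm list parsing as above:

```shell
# Wait until no VMs remain running on this node
while qm list | awk 'NR>1 && $3=="running" {print $1}' | grep -q .; do
    echo "VMs still running, waiting..."
    sleep 10
done
echo "Node drained, safe to start maintenance."
```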
Alternatively, you can use the HA Manager to put the node into maintenance mode — more on that below.
Storage Migration: From Local to Shared
Sometimes VM disks need to be moved from local storage (e.g., local-lvm) to shared storage (e.g., Ceph or NFS) to enable live migration in the first place:
```shell
# Move VM disk from local-lvm to Ceph pool
qm disk move 101 scsi0 ceph-pool

# Then perform live migration
qm migrate 101 pve2 --online
```
This operation can be performed while the VM is running — the disk is copied in the background and seamlessly switched over.
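Before migrating, you can confirm the disk now lives on the shared pool by inspecting the VM configuration (VM 101 and the scsi0 disk from the example above; note that qm disk move keeps the old volume attached as an unused disk unless --delete is passed):

```shell
# Show where the scsi0 disk resides after the move
qm config 101 | grep ^scsi0

# The old local-lvm volume remains as unusedN unless --delete was given
qm config 101 | grep ^unused
```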
Tuning Migration Speed
Live migration duration depends primarily on the VM’s RAM size and available network bandwidth. The following measures speed up the process:
Dedicated migration network: Under Datacenter > Options > Migration Settings, configure a separate network for migrations. A 10 GbE link reduces migration time for a 32 GB RAM VM from several minutes to under 30 seconds.
```shell
# Configure migration network via CLI
# (the migration option is a property string, not JSON)
pvesh set /cluster/options --migration "network=10.10.10.0/24,type=secure"
```
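The ballpark figure above can be sanity-checked with simple arithmetic; this sketch estimates the pure RAM copy time and ignores dirty-page retransmission and protocol overhead (both input values are assumptions):

```shell
RAM_GB=32       # VM memory size in GB (assumption)
LINK_GBPS=10    # migration link speed in Gbit/s (assumption)
# 32 GB * 8 bit/byte / 10 Gbit/s = 25.6 s of raw transfer time
EST=$(awk "BEGIN{printf \"%.1f\", $RAM_GB * 8 / $LINK_GBPS}")
echo "Pure RAM copy: about ${EST} seconds"
```

In practice, dirty pages and encryption overhead add to this, which is why the article quotes "under 30 seconds" rather than the raw 25.6.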
Adjust bandwidth limits: By default, Proxmox limits migration bandwidth. For planned maintenance windows, you can increase or remove the limit.
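For a one-off migration during a maintenance window, the limit can be overridden per command; qm migrate takes --bwlimit in KiB/s, and the cluster-wide default lives in the bwlimit datacenter option (the values here are examples):

```shell
# Allow up to ~1 GiB/s for this migration (bwlimit is in KiB/s)
qm migrate 101 pve2 --online --bwlimit 1048576

# Raise the cluster-wide default migration limit instead
pvesh set /cluster/options --bwlimit "migration=1048576"
```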
Write-intensive VMs: VMs with high RAM write rates (e.g., databases) require more iterations, as modified pages need to be retransmitted. In extreme cases, Proxmox can force a brief freeze to complete the migration.
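For such write-heavy VMs, the tolerated switchover pause can be raised so the migration converges; migrate_downtime is a per-VM option (default 0.1 seconds; VM 101 is an example):

```shell
# Allow up to half a second of downtime so dirty pages can catch up
qm set 101 --migrate_downtime 0.5
```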
HA Manager and Automatic Migration
The Proxmox HA Manager monitors all nodes in the cluster. If a node fails, its HA-managed VMs are automatically restarted on surviving nodes. For planned maintenance, the HA Manager offers a maintenance mode:
```shell
# Put node into maintenance mode: all HA-managed VMs are migrated away
ha-manager crm-command node-maintenance enable pve1

# Disable maintenance mode after work is complete
ha-manager crm-command node-maintenance disable pve1
```
In maintenance mode, the HA Manager migrates all managed VMs to other nodes in a controlled fashion. Afterward, VMs can be moved back manually or per policy.
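Progress of the drain can be followed with ha-manager status, which lists every HA service together with its current node and state:

```shell
# Check that all HA services have left the node under maintenance
ha-manager status
```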
Troubleshooting Common Issues
CPU incompatibility: The error kvm: warning: host doesn't support requested feature indicates different CPU generations. Solution: set the CPU type to a generic type like x86-64-v2-AES.
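When this error appears, comparing the physical CPU models of source and target quickly confirms a generation mismatch; a sketch assuming pve2 as the target and SSH access between the nodes:

```shell
# Compare host CPU models of the local node and the migration target
lscpu | grep 'Model name'
ssh pve2 "lscpu | grep 'Model name'"
```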
Lock files: A failed migration sometimes leaves a lock file that blocks further actions. Remove it manually:
```shell
# Remove a stale migration lock from VM 101
qm unlock 101
```
Network timeouts: Timeouts during migration point to network problems or excessive write load in the VM. Check the network connection between nodes and consider a dedicated migration network.
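To rule out the link itself, measure the raw throughput between the nodes; a sketch with iperf3, assuming it is installed on both nodes and 10.10.10.2 is the target's address on the migration network:

```shell
# On the target node (pve2): start a listener
iperf3 -s

# On the source node: measure throughput to the migration address
iperf3 -c 10.10.10.2 -t 10
```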
Local resources: VMs with PCI passthrough devices (e.g., GPUs) or local USB devices cannot be live-migrated. Remove the device assignment before migration or use offline migration.
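Before a bulk drain, it helps to know which VMs are pinned by such local resources; this sketch scans all VMs on the node for hostpci and usb entries in their configuration:

```shell
# List VMs whose config references PCI passthrough or USB devices
for VMID in $(qm list | awk 'NR>1 {print $1}'); do
    if qm config "$VMID" | grep -qE '^(hostpci|usb)[0-9]+:'; then
        echo "VM $VMID uses local passthrough devices; cannot be live-migrated"
    fi
done
```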
Monitoring with DATAZONE Control
In clusters with many nodes and VMs, keeping track of migration activity quickly becomes challenging. DATAZONE Control logs all migrations centrally, monitors cluster health, and alerts on failed migrations or HA failover events. This ensures you maintain visibility even in complex multi-node environments.
Planning a Proxmox cluster with live migration or need help optimizing your setup? Contact us — we design and manage your Proxmox infrastructure for maximum availability.