High Availability
High Availability (HA) ensures that critical VMs are automatically restarted on another node if the node they are running on fails.
Prerequisites:
- A working cluster with at least 3 nodes
- Shared storage accessible by all nodes (so the VM disk can be accessed after failover)
- The HA manager services (pve-ha-crm and pve-ha-lrm) must be running on all nodes
Note: HA requires quorum, meaning a majority of the cluster nodes must be online and able to reach each other. With 3 nodes, the cluster tolerates 1 node failure. With 2 nodes, losing either node also loses quorum, so HA cannot recover VMs after a failure.
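The quorum arithmetic in the note can be sketched in shell. This is a minimal illustration that assumes every node has exactly one vote; the function names votes_needed and tolerable_failures are made up for this example and are not Proxmox commands. On a live cluster, pvecm status reports the actual vote counts.

```shell
# Votes needed for quorum in a cluster of N nodes (assumption: one vote per node).
votes_needed() {
    echo $(( $1 / 2 + 1 ))
}

# Number of nodes that can fail while the remaining nodes still hold a majority.
tolerable_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 2 3 4 5; do
    echo "$n nodes: quorum=$(votes_needed "$n"), tolerates $(tolerable_failures "$n") failure(s)"
done
```

For 2 nodes the loop reports a tolerance of 0, and for 3 nodes a tolerance of 1, which matches the note above. A 4-node cluster still tolerates only 1 failure, which is why odd node counts are commonly chosen.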
Enabling HA for a VM
- Go to Datacenter > HA
- Click Add under Resources
- Select the VM ID you want to protect
- Set the HA State:
• Started: HA always tries to keep this VM running
• Stopped: HA keeps the VM stopped, but still relocates it if its node fails
• Disabled: HA keeps the VM stopped and does not relocate it if its node fails
- Set Max Restart: how many times to retry starting the VM on the same node before relocating it
- Set Max Relocate: how many relocation attempts to make before giving up
- Click Add
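The same steps can probably be done from a shell on any cluster node with the ha-manager tool. The sketch below is an assumption-laden example, not the only way to do it: VM ID 100 is a placeholder, and run and ha_add_vm are hypothetical helper names. By default the script only prints the command; setting APPLY=1 would execute it.

```shell
# Hypothetical wrapper: print the command unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$@"; fi
}

# Register one VM as an HA resource.
# Arguments: VM ID, HA state, max restarts, max relocations (last three optional).
ha_add_vm() {
    vmid=$1; state=${2:-started}; max_restart=${3:-1}; max_relocate=${4:-1}
    run ha-manager add "vm:$vmid" --state "$state" \
        --max_restart "$max_restart" --max_relocate "$max_relocate"
}

# VM ID 100 is a placeholder; substitute the ID of the VM to protect.
ha_add_vm 100 started 1 1
```

Printing first makes it easy to review the exact command before it touches the cluster.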
HA Groups
HA groups define which nodes are preferred or required for specific VMs.
- Go to Datacenter > HA > Groups and click Add
- Name the group and select nodes
- Set priority (higher = preferred) for each node
- Assign a VM to the group in HA Resources
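A command-line sketch of the group steps follows, assuming a release whose ha-manager still provides the groupadd subcommand and assuming the VM is already an HA resource. The group name prefer_node1, the node names node1 and node2, and VM ID 100 are placeholders; run and ha_group_assign are hypothetical helper names. Priorities are written as node:priority pairs, and the commands are only printed unless APPLY=1 is set.

```shell
# Hypothetical wrapper: print the command unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$@"; fi
}

# Create an HA group and assign an existing HA resource to it.
# Arguments: group name, comma-separated node:priority list, VM ID.
ha_group_assign() {
    group=$1; nodes=$2; vmid=$3
    run ha-manager groupadd "$group" --nodes "$nodes"
    run ha-manager set "vm:$vmid" --group "$group"
}

# node1 has the higher priority (2), so it is preferred over node2 (1).
ha_group_assign prefer_node1 "node1:2,node2:1" 100
```

To make the listed nodes required rather than merely preferred, the group would additionally need its restricted option enabled.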
Testing High Availability
- Start an HA-protected VM on one node
- Simulate a node failure by powering off that node or running:
systemctl stop pve-cluster corosync
- Watch the Proxmox web interface. The VM should start automatically on another node once the failed node has been fenced, which can take a couple of minutes
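The recovery can also be followed from a shell on a surviving node by polling ha-manager status. The helper below is a rough sketch: watch_ha_service and the HA_STATUS_CMD override are hypothetical names introduced for this example, and the filter simply assumes that the status output mentions the resource ID (for example vm:100) on the line describing that resource.

```shell
# Command used to fetch HA status; overridable so the helper can be tried
# without a cluster (HA_STATUS_CMD is a hypothetical variable for this sketch).
HA_STATUS_CMD=${HA_STATUS_CMD:-ha-manager status}

# Print the status line(s) for one resource, polling a fixed number of times.
# Arguments: resource ID, number of polls (default 30), seconds between polls (default 5).
watch_ha_service() {
    sid=$1; polls=${2:-30}; interval=${3:-5}
    i=0
    while [ "$i" -lt "$polls" ]; do
        # -F matches the ID literally, -w avoids matching vm:1000 for vm:100.
        # "|| true" keeps the loop going while the resource is not listed.
        $HA_STATUS_CMD | grep -Fw -- "$sid" || true
        i=$((i + 1))
        if [ "$i" -lt "$polls" ]; then sleep "$interval"; fi
    done
}

# Example on a surviving node: watch_ha_service vm:100 30 5
```

While the failed node is being fenced, the printed line should keep showing the old node; once recovery completes it should show the new node with the VM started.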