The firewall is the central chokepoint of any business IT infrastructure. If it goes down, the entire network stops — no internet, no VPN connections, no communication between sites. For SMBs that depend on permanent availability, a single OPNsense firewall represents an unacceptable risk. The solution: a high-availability cluster with CARP failover, where a second firewall stands by and takes over within seconds.
Why High Availability for Firewalls Is Essential
A firewall outage affects far more than internet access. VoIP telephony drops, cloud services become unreachable, site-to-site VPNs collapse, and internal services between VLANs are interrupted. Depending on the industry, extended outages lead to SLA violations, revenue loss, or even compliance breaches.
Hardware defects, faulty firmware updates, or power failures can hit any firewall — regardless of the quality of the components. An HA cluster eliminates this single point of failure and ensures that a primary firewall outage remains virtually invisible to users.
CARP: The Protocol Behind Failover
CARP (Common Address Redundancy Protocol) originates from the OpenBSD project and is the backbone of the OPNsense HA solution. The principle is straightforward: two or more firewalls share a common virtual IP address (VIP). The primary firewall regularly sends CARP advertisements to the network. If these cease, the secondary firewall takes over the VIP and with it all network traffic.
CARP is complemented by two additional components:
- pfSync — synchronizes the state table (active connections) between both firewalls so that existing TCP sessions survive a failover without dropping.
- XMLRPC Sync — automatically replicates the configuration (firewall rules, NAT, VPN, aliases) from the primary to the secondary.
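The takeover logic described above can be illustrated with a small Python toy model. It is a sketch for intuition only, not the CARP implementation: the class name `BackupNode` and the fixed 3-second timeout are assumptions, and preemption is left out.

```python
from dataclasses import dataclass


@dataclass
class BackupNode:
    """Toy model of a CARP backup that promotes itself when advertisements stop."""

    master_down_timeout: float = 3.0  # assumed value, in seconds
    last_advert_at: float = 0.0
    state: str = "BACKUP"

    def on_advertisement(self, now: float) -> None:
        # An advertisement from the primary resets the timer and keeps
        # (or puts) this node in backup mode. Priority comparison is omitted.
        self.last_advert_at = now
        self.state = "BACKUP"

    def tick(self, now: float) -> str:
        # No advertisement within the timeout: take over the virtual IP.
        silent_for = now - self.last_advert_at
        if self.state == "BACKUP" and silent_for > self.master_down_timeout:
            self.state = "MASTER"
        return self.state


if __name__ == "__main__":
    node = BackupNode()
    node.on_advertisement(now=0.0)
    print(node.tick(now=1.0))  # primary was heard one second ago
    print(node.tick(now=5.0))  # primary has been silent for five seconds
```

The point of the model is that the backup needs no explicit "I am down" message from the primary; silence alone triggers the takeover.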
Prerequisites and Network Architecture
An OPNsense HA cluster requires the following:
| Component | Description |
|---|---|
| 2x OPNsense firewalls | Identical or comparable hardware |
| WAN interface | Both firewalls on the same uplink switch |
| LAN interface | Both firewalls on the same LAN switch |
| Sync interface | Dedicated interface for pfSync and XMLRPC |
| 3 IP addresses per subnet | One per firewall + one CARP VIP |
The network topology looks like this schematically:
                Internet
                   |
           [ Uplink Switch ]
             |            |
        [ FW1-WAN ]    [ FW2-WAN ]     CARP VIP: 203.0.113.1
             |            |
        [ FW1-SYNC ]--[ FW2-SYNC ]     Sync: 10.0.0.1 <-> 10.0.0.2
             |            |
        [ FW1-LAN ]    [ FW2-LAN ]     CARP VIP: 192.168.1.1
             |            |
            [ LAN Switch ]
The dedicated sync interface is critical: it isolates synchronization traffic from the production network and prevents pfSync data from traversing untrusted networks. A direct crossover cable or a separate VLAN is ideal for this purpose.
Configuration Step by Step
1. Set Up the Sync Interface
Configure the sync interface on both firewalls — for example, 10.0.0.1/30 on FW1 and 10.0.0.2/30 on FW2. This network is used exclusively for pfSync and XMLRPC traffic.
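A /30 is a natural fit here because it contains exactly two usable host addresses. The following sketch uses Python's standard `ipaddress` module to confirm that for the example network; the helper names `sync_hosts` and `same_subnet` are made up for illustration.

```python
import ipaddress


def sync_hosts(cidr: str) -> list[str]:
    """Return the usable host addresses of the given network."""
    network = ipaddress.ip_network(cidr, strict=False)
    return [str(host) for host in network.hosts()]


def same_subnet(cidr: str, *addresses: str) -> bool:
    """Check that every address lies inside the given network."""
    network = ipaddress.ip_network(cidr, strict=False)
    return all(ipaddress.ip_address(addr) in network for addr in addresses)


if __name__ == "__main__":
    print(sync_hosts("10.0.0.1/30"))
    print(same_subnet("10.0.0.0/30", "10.0.0.1", "10.0.0.2"))
```

The same `same_subnet` check can be applied to the WAN and LAN subnets to confirm that both node addresses and the CARP VIP share one network.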
2. Create CARP Virtual IPs
Under Interfaces > Virtual IPs > Settings, create the CARP VIPs. A virtual IP is defined for each subnet (WAN and LAN):
Type: CARP
Interface: WAN
Address: 203.0.113.1/24
VHID Group: 1
Password: <secure-shared-secret>
Adv. Skew: 0 (Primary)
On the secondary firewall, configure the same VIP with a higher Adv. Skew value (e.g., 100). The lower skew value determines which node becomes the primary.
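Why the lower skew wins can be shown with a little arithmetic. In CARP the skew is measured in 1/256 of a second and added to the base advertisement interval, so a node with a higher skew advertises slightly more slowly. The sketch below assumes a base interval of 1 second; the function names and the node names `fw1` and `fw2` are hypothetical.

```python
def advert_interval(advbase: int = 1, advskew: int = 0) -> float:
    """Advertisement interval in seconds: base plus skew in 1/256 s units."""
    return advbase + advskew / 256


def preferred_master(skews: dict[str, int], advbase: int = 1) -> str:
    """Return the node that advertises most frequently (lowest interval)."""
    return min(skews, key=lambda name: advert_interval(advbase, skews[name]))


if __name__ == "__main__":
    skews = {"fw1": 0, "fw2": 100}
    for name, skew in skews.items():
        print(name, advert_interval(advskew=skew))
    print(preferred_master(skews))
```

With a skew of 0 versus 100, the two intervals differ by 100/256 of a second, which is enough to make the election deterministic.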
3. Enable pfSync
Under System > High Availability > Settings, enable pfSync and select the sync interface. Enter the IP of the other firewall as the sync peer. This replicates active connection states in real time.
4. XMLRPC Configuration Sync
In the same menu, configure XMLRPC synchronization on the primary. Enter the secondary’s IP along with a username and password. Select which configuration areas should be synchronized — typically all of them:
- Firewall rules and NAT
- Aliases and schedules
- VPN configurations (IPsec, OpenVPN, WireGuard)
- DHCP server and DNS
- Intrusion detection (Suricata)
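From this point on, changes made on the primary are replicated to the secondary. Depending on the OPNsense version, the sync may have to be triggered from System > High Availability > Status or scheduled as a cron job rather than running on every save, so confirm on your release that a test change actually arrives on the secondary.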
5. Adjust Gateway and NAT Settings
Ensure that all clients use the CARP VIP as their default gateway, not the physical IP of an individual firewall. Outbound NAT rules should also translate to the CARP VIP rather than to the interface address, so that return traffic always reaches whichever firewall is currently active.
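A quick audit of client gateway settings can catch hosts that still point at a physical firewall address. This is a minimal sketch assuming you already have an inventory of clients and their configured gateways; the function name, the client names, and the address 192.168.1.2 (standing in for FW1's own LAN IP) are hypothetical.

```python
import ipaddress


def clients_bypassing_vip(client_gateways: dict[str, str], carp_vip: str) -> list[str]:
    """Return the clients whose default gateway is not the CARP VIP."""
    vip = ipaddress.ip_address(carp_vip)
    return sorted(
        name
        for name, gateway in client_gateways.items()
        if ipaddress.ip_address(gateway) != vip
    )


if __name__ == "__main__":
    # Hypothetical inventory; 192.168.1.1 is the LAN CARP VIP from the diagram.
    gateways = {"pc-01": "192.168.1.1", "printer": "192.168.1.2"}
    print(clients_bypassing_vip(gateways, "192.168.1.1"))
```

Any client returned by this check would lose connectivity during a failover, because its gateway address does not move to the secondary.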
Testing Failover
An HA setup that has not been tested is no HA setup at all. Perform the following tests:
- Pull the cable — Disconnect the WAN cable from the primary firewall. Within 2-3 seconds, the secondary should take over the VIP.
- Shut down the primary — This simulates a complete hardware failure. Existing connections should persist thanks to pfSync.
- Verify failback: After restoring the primary, it should automatically reclaim the master role because of its lower skew value, provided preemption has not been disabled.
- Long-duration test — Let the secondary run as master for several hours and verify VPN tunnels, NAT, and IDS functionality.
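To put a number on the failover time during these tests, you can probe a service behind the VIP at short intervals and record the longest gap. The sketch below is one way to do that, not a polished tool: `tcp_probe` and `longest_outage` are hypothetical helpers, and the clock and sleep functions are injectable so the logic can be checked without a network.

```python
import socket
import time
from typing import Callable


def tcp_probe(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def longest_outage(
    probe: Callable[[], bool],
    samples: int,
    interval: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Longest time in seconds from a first failed probe to the next success."""
    longest = 0.0
    down_since = None
    for _ in range(samples):
        now = clock()
        if probe():
            if down_since is not None:
                longest = max(longest, now - down_since)
                down_since = None
        elif down_since is None:
            down_since = now
        sleep(interval)
    if down_since is not None:
        # Still down at the end of the run: count the open outage as well.
        longest = max(longest, clock() - down_since)
    return longest
```

During a test you might call `longest_outage(lambda: tcp_probe("192.168.1.1", 443), samples=600, interval=0.2)` from a LAN client and then pull the cable; port 443 is an assumption and should be replaced with a service that is actually reachable through the firewall. The resolution of the result is limited by the probe interval and timeout.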
Monitoring HA Status with DATAZONE Control
A failover is of limited use if nobody notices it has occurred. With DATAZONE Control, the CARP status of both firewalls can be monitored continuously. Typical checks include:
- CARP status — Is the expected primary actually the master? Is the secondary in backup mode?
- pfSync status — Are states being synchronized? How large is the state table?
- Interface health — Are all relevant interfaces (WAN, LAN, Sync) active?
- Config drift — Does the configuration on both nodes match?
When an unexpected failover occurs, an alert is triggered immediately so the IT team can investigate the cause before the secondary also fails. Combined with comprehensive network monitoring, this creates a complete picture of the entire infrastructure.
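Independently of any monitoring product, the CARP status check can be sketched in a few lines. The example below parses `ifconfig`-style text and compares it with the expected roles. It assumes the status lines look like `carp: MASTER vhid 1 advbase 1 advskew 0`, as FreeBSD-based systems typically print them; verify the format on your own firewall before relying on it. The function names are hypothetical.

```python
import re

# Assumed line format: "carp: MASTER vhid 1 advbase 1 advskew 0"
CARP_LINE = re.compile(
    r"carp:\s+(?P<state>MASTER|BACKUP|INIT)\s+vhid\s+(?P<vhid>\d+)"
    r"\s+advbase\s+(?P<advbase>\d+)\s+advskew\s+(?P<advskew>\d+)"
)


def parse_carp_states(ifconfig_output: str) -> dict[int, str]:
    """Map each VHID found in the text to its reported CARP state."""
    return {int(m["vhid"]): m["state"] for m in CARP_LINE.finditer(ifconfig_output)}


def unexpected_states(states: dict[int, str], expected: dict[int, str]) -> list[int]:
    """Return the VHIDs whose state is missing or differs from the expected one."""
    return sorted(vhid for vhid, want in expected.items() if states.get(vhid) != want)


if __name__ == "__main__":
    sample = (
        "\tcarp: MASTER vhid 1 advbase 1 advskew 0\n"
        "\tcarp: BACKUP vhid 2 advbase 1 advskew 0\n"
    )
    states = parse_carp_states(sample)
    print(unexpected_states(states, {1: "MASTER", 2: "MASTER"}))
```

A non-empty result on the primary indicates that at least one VIP has failed over and is worth an alert.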
Common Pitfalls
Asymmetric routing: If inbound traffic flows through FW1 but the response goes through FW2, packets will be dropped. The fix: consistently point all gateways and NAT rules to the CARP VIPs.
State synchronization under heavy load: With very large state tables (>500,000 entries), pfSync can become a bottleneck. A dedicated gigabit sync interface and sufficient RAM on both nodes resolve this.
XMLRPC errors after updates: After an OPNsense update, XMLRPC sync errors may occur if versions differ. Always update the secondary first, then the primary — this keeps the cluster operational during the update process.
VHID conflicts: Each CARP VIP requires a unique VHID group per broadcast domain. When running multiple HA clusters on the same network, ensure distinct VHIDs are used.
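The reason duplicate VHIDs are harmful is that the VHID forms the last byte of the virtual MAC address (00:00:5e:00:01:xx), so two VIPs with the same VHID on one broadcast domain end up fighting over the same MAC. A minimal sketch of a conflict check follows; the input format and the VIP names are hypothetical.

```python
from collections import defaultdict


def carp_virtual_mac(vhid: int) -> str:
    """Virtual MAC address derived from a VHID (1-255)."""
    if not 1 <= vhid <= 255:
        raise ValueError("VHID must be between 1 and 255")
    return f"00:00:5e:00:01:{vhid:02x}"


def vhid_conflicts(vips: list[tuple[str, str, int]]) -> dict[tuple[str, int], list[str]]:
    """Find VHIDs used by more than one VIP in the same broadcast domain.

    Each entry is (vip_name, broadcast_domain, vhid). Both nodes of one
    cluster share a VIP name, so they do not count as a conflict.
    """
    owners = defaultdict(set)
    for vip_name, domain, vhid in vips:
        owners[(domain, vhid)].add(vip_name)
    return {key: sorted(names) for key, names in owners.items() if len(names) > 1}


if __name__ == "__main__":
    plan = [
        ("clusterA-wan", "uplink", 1),
        ("clusterB-wan", "uplink", 1),  # same domain, same VHID
        ("clusterA-lan", "lan", 2),
    ]
    print(carp_virtual_mac(1))
    print(vhid_conflicts(plan))
```

Running such a check against the address plan before adding a second cluster is cheaper than debugging flapping VIPs afterwards.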
Conclusion
OPNsense High Availability with CARP is a proven, cost-effective method for eliminating the firewall as a single point of failure. With two firewalls, a dedicated sync interface, and a clean configuration, SMBs can achieve availability levels typically reserved for enterprise solutions. The key lies in correct setup, thorough testing, and continuous monitoring — so that failover actually works seamlessly when it matters most.
Looking to run your OPNsense firewalls in a high-availability configuration? We plan and implement HA clusters including monitoring — get in touch.