What are Fault Domains and Update Domains in Azure Availability Sets?
When you first create a Virtual Machine in Azure, you are asked to choose an Availability Option.
There are three options:
- Availability Zone
- Virtual Machine Scale Set
- Availability Set
In this post I’ll discuss Azure Availability Sets and two properties that often cause confusion: Fault Domains and Update Domains.
Fault vs. Update domain
Azure Virtual Machines run in physical data centres. These data centres contain physical servers, and each server sits in a server rack. A single data centre might contain hundreds of server racks.
When you provision a Virtual Machine, the Availability Sets blade lets you choose how many Fault Domains and Update Domains you want.
Fault Domains are designed to protect your Virtual Machines from physical power or network outages. A Fault Domain is a logical representation of a physical server rack: everything in it shares the same power and network source. Going forward, it’s easier to imagine Fault Domains as physical server racks.
Now, suppose you had three Virtual Machines and placed them all in a single Fault Domain. If the physical rack hosting them went down, you would lose access to all three Virtual Machines.
By default, Microsoft sets up two Fault Domains when provisioning an Availability Set, although you can still manually choose to have a single Fault Domain.
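To make those two settings concrete, here is a minimal Python sketch that builds an ARM-template-style resource for an Availability Set. The property names `platformFaultDomainCount` and `platformUpdateDomainCount` are the ones the `Microsoft.Compute/availabilitySets` resource type uses; everything else is an assumption for illustration: the helper name, the `web-avset` and `uksouth` values, the `apiVersion`, and the default of five Update Domains.

```python
import json

# Assumed limits: at most 3 fault domains (region dependent)
# and at most 20 update domains per Availability Set.
MAX_FAULT_DOMAINS = 3
MAX_UPDATE_DOMAINS = 20


def availability_set_resource(name, location, fault_domains=2, update_domains=5):
    """Build an ARM-template-style resource dict for an Availability Set.

    Hypothetical helper for illustration; it only builds a dict and
    does not call Azure.
    """
    if not 1 <= fault_domains <= MAX_FAULT_DOMAINS:
        raise ValueError("fault_domains must be between 1 and 3")
    if not 1 <= update_domains <= MAX_UPDATE_DOMAINS:
        raise ValueError("update_domains must be between 1 and 20")
    return {
        "type": "Microsoft.Compute/availabilitySets",
        "apiVersion": "2023-03-01",  # assumed; use the version your tooling targets
        "name": name,
        "location": location,
        "sku": {"name": "Aligned"},  # "Aligned" is the SKU used with managed disks
        "properties": {
            "platformFaultDomainCount": fault_domains,
            "platformUpdateDomainCount": update_domains,
        },
    }


if __name__ == "__main__":
    # Placeholder name and region, chosen only for the example.
    print(json.dumps(availability_set_resource("web-avset", "uksouth"), indent=2))
```

Passing `fault_domains=1` reproduces the single Fault Domain case described above. Asking for more than three raises an error in this sketch, whereas in Azure the real upper limit depends on the region.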
With two or more Fault Domains, if a single rack hosting some of your Virtual Machines went down, you would still have access to the Virtual Machines placed in the second or third Fault Domain.
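The difference between one and two Fault Domains can be sketched with a small simulation. This is a toy model, not Azure's placement logic: I'm assuming Virtual Machines are spread across Fault Domains in round-robin order, and the function names and VM names are made up for the example.

```python
def place_round_robin(vm_names, fault_domain_count):
    """Assign each VM to a fault domain in turn, numbering domains from zero."""
    return {vm: index % fault_domain_count for index, vm in enumerate(vm_names)}


def survivors(placement, failed_domain):
    """Return the VMs that are not in the failed fault domain."""
    return [vm for vm, domain in placement.items() if domain != failed_domain]


if __name__ == "__main__":
    vms = ["vm-a", "vm-b", "vm-c"]
    for count in (1, 2):
        placement = place_round_robin(vms, count)
        print(f"{count} fault domain(s): {placement}")
        print(f"  still reachable if fault domain 0 fails: {survivors(placement, 0)}")
```

With a single Fault Domain, the failure of domain 0 leaves no survivors. With two, `vm-b` sits in domain 1 and stays reachable.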
Operating systems often require patches and updates, and installing them often means rebooting the machine. This is where Update Domains help.
Update Domains are designed to protect you when Virtual Machines need to restart, by making sure they do not all restart at the same time.
When you provision a new Virtual Machine, you’re given a choice on how you want to manage Guest OS updates. One of the options is Azure-orchestrated. This essentially staggers the updates so that Virtual Machines in different Update Domains are not updated at the same time.
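The staggering can be sketched as a rolling update that handles one Update Domain at a time. This is a simplified model under my own assumptions: Virtual Machines are assigned to Update Domains round-robin, and the domains are processed in numeric order, which Azure does not promise. The function names are hypothetical.

```python
from collections import defaultdict


def group_by_update_domain(vm_names, update_domain_count):
    """Group VMs by update domain, assigning domains round-robin from zero."""
    groups = defaultdict(list)
    for index, vm in enumerate(vm_names):
        groups[index % update_domain_count].append(vm)
    return dict(groups)


def rolling_update(vm_names, update_domain_count):
    """Yield (update_domain, rebooting, still_running), one domain at a time."""
    groups = group_by_update_domain(vm_names, update_domain_count)
    for domain in sorted(groups):
        rebooting = groups[domain]
        still_running = [vm for vm in vm_names if vm not in rebooting]
        yield domain, rebooting, still_running


if __name__ == "__main__":
    vms = [f"vm-{i}" for i in range(6)]
    for domain, rebooting, still_running in rolling_update(vms, 5):
        print(f"update domain {domain}: rebooting {rebooting}, running {still_running}")
```

With six Virtual Machines and five Update Domains, the sixth machine shares Update Domain 0 with the first, so those two reboot together while the other four keep running.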
- An Availability Set is only a logical grouping of your Virtual Machines.
- Virtual Machines are not copied or cloned in an Availability Set.
- You should still use a Load Balancer to balance network traffic loads between machines to provide greater redundancy.
- Not all regions support the maximum of three Fault Domains; some offer only two.
- The VMs in a common Availability Set are never all updated at the same time.
- Updates are applied within Update Domain boundaries: VMs in the same Update Domain may be updated together, but VMs in different Update Domains are not updated concurrently.
- Inside the Availability Set resource, you can view how your Virtual Machines are placed across Fault and Update Domains.
- Fault Domains are numbered from zero.
- You still need a Load Balancer or similar to route traffic to a healthy Virtual Machine when one of the others is down.
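To illustrate why the Load Balancer points above matter, here is a toy sketch of routing around a Virtual Machine that has gone down. It uses plain round-robin with a health flag, which is my own simplification and not how Azure Load Balancer distributes traffic; the class and method names are hypothetical.

```python
from itertools import cycle


class ToyLoadBalancer:
    """Round-robin over backends, skipping any that are marked unhealthy."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        """Record that a backend failed its health check."""
        self.healthy.discard(backend)

    def mark_up(self, backend):
        """Record that a known backend is healthy again."""
        if backend in self.backends:
            self.healthy.add(backend)

    def route(self):
        """Return the next healthy backend, or raise if none are left."""
        if not self.healthy:
            raise RuntimeError("no healthy backends")
        # One full pass over the ring is enough to find a healthy backend.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")


if __name__ == "__main__":
    lb = ToyLoadBalancer(["vm-a", "vm-b"])
    print([lb.route() for _ in range(3)])
    lb.mark_down("vm-a")  # e.g. the rack hosting vm-a loses power
    print([lb.route() for _ in range(2)])
```

The Availability Set keeps `vm-b` running on a different rack, but it is the Load Balancer that stops sending requests to `vm-a` once it is marked down.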
Availability Zone vs. Availability Set
Whilst Availability Sets give you some guarantee that a local rack failure will not bring all of your Virtual Machines down, Availability Zones disperse your Virtual Machines geographically within a region. Each Availability Zone is made up of one or more data centres that host those Virtual Machines.
When you create a new Virtual Machine and set the Availability Option to Availability Zone, you are given three zones to choose from. Each zone corresponds to a separate data centre location.
Availability Zones do not guarantee any specific distance between them. Depending on the region, they may be many miles apart or considerably closer together.