Combining AlwaysOn AG with Failover Cluster Instances
In this article we'll look at how AlwaysOn availability groups combine with Failover Cluster Instances. We'll look individually at Failover Cluster Instance composition, Windows Server Failover Clusters and AlwaysOn availability groups, what they are and how they interact. For the purposes of this article the following are used;
FCI A failover cluster instance of SQL Server
AO Always On availability groups, the new SQL Server 2012 feature
WSFC Windows Server Failover Cluster, a feature of the Windows 2008 operating system
NIC Network interface card
SAN Storage area network
LUN Logical unit number
For many years now Windows Server Failover Clusters and Failover Cluster Instances have been a popular configuration for highly available SQL Servers. Later, with SQL Server 2005 came database mirroring. Although mirroring was limited, it offered an extra level of redundancy. Now with SQL Server 2012, we take a huge leap forward in highly available SQL Servers.
AlwaysOn is the new high availability feature in SQL Server 2012 and provides the ability to create multiple copies of a highly available database. In order for AlwaysOn to provide this level of availability it utilises the functionality within Windows Server Failover Clusters. It doesn't by any means replace Failover Cluster Instances of SQL Server, rather, it shares the use of the core technology presented by the Windows operating system. Just as you could with database mirroring, you have the option to combine an AlwaysOn availability group with a SQL Server Failover Cluster Instance, but it adds a great deal of complexity due to the strict AO implementation rules and the sharing of the Windows operating system failover cluster feature.
In this article we're going to look at 2 different scenarios, the first provides a basic AlwaysOn configuration, the second will show how an FCI can combine with AlwaysOn groups to provide an extremely resilient high availability option for your SQL Server system.
Before we start, and to help us on our way, let's look briefly and individually at each of these technologies to understand a little more about them.
What is a Windows Server Failover Cluster?
Windows Server Failover Clusters are a feature of the Windows server 2008 operating system. They provide the ability to combine multiple computer nodes (physical and\or virtual) to service a set of applications for high availability, in this case a SQL Server instance. The application is made highly available by presenting to clients a virtual access point comprising a unique IP address and a unique computer name or "virtual network name". These become resources in an application group and are passed between participating nodes like tokens.
The computer nodes use shared storage, which is presented to each node, usually from a SAN. The disks are added to clustered application\group and will failover between the nodes when the group is moved (either manually or through automatic failover). Throughout normal operation, only one node may obtain access to the storage disks at any one time (the node that owns the particular resources) to avoid volume corruption, this is also true for all of the remaining resources in the cluster group too. Disk access is controlled by the cluster service once the disks have been added as cluster resources.
The WSFC requires some form of mediation to control cluster resource ownership, and uses Quorum to maintain cluster stability. In Windows Server 2008 WSFC, this Quorum takes the form of a node voting system with the majority required to maintain Quorum. You may also use additional quorum resources in the form of a disk for localised clusters or a remote file share for multi site clusters.
A critical hardware failure of the active computer node would result in the loss of group service and would automatically start the group on an available partner node. At a high level, the client access point details are transferred along with any disk and service resources, etc to a failover partner node. A failover of the clustered instance causes disconnection of client connections, these may then reconnect once the service is available on a partner node.
Common failures usually are;
Public NIC or network failure
Power supply failure
Motherboard failure
CPU failure
What is a Failover Cluster Instance?
A Failover Cluster Instance is an instance of SQL Server, default or named, that has been installed onto a WSFC as a clustered application. The clustered application typically has the following resources as a minimum;
IP address
Network name
Shared disk(s)
SQL Server service
SQL Server agent service
A clustered instance of SQL server will utilise any shared storage that has been presented to the Windows Server Failover Cluster nodes, usually this storage will take the form of LUNs presented from a SAN. A Failover Cluster Instance of SQL Server is deployed by launching the "New SQL Server Failover Cluster Installation" wizard on the first computer node that will participate in the FCI. Once this has been performed you would then launch the "Add Node to a SQL Server Failover Cluster" wizard on any computer node in the WSFC that you wish to participate in the new SQL Server FCI. For Standard edition of SQL Server you are limited to 2 nodes, Enterprise and upwards support the OS maximum (8 nodes in Windows 2003 and 16 nodes in Windows 2008).
Note: Standard edition limits the FCI to 2 nodes but this does not dictate how many nodes have membership of the Windows cluster, it is merely at the installer level.
What is an AlwaysOn Availability Group?
An AlwaysOn Availability Group, is a group of one or more databases configured on a primary replica (or SQL Server instance) for high availability. This will typically include one or more further replicas which will service a copy of the highly available database(s). Partner databases may be either readable or standby and may also use either asynchronous or synchronous commit modes.
In AlwaysOn a replica refers to an instance of SQL Server that is participating in the AlwaysOn availability group.
AlwaysOn relies on the WSFC core functionality to achieve the high availability that AO offers, but does not require any of the following shared resources associated with an FCI.
Shared disks
Shared IP address
Shared network name
Shared SQL Server and SQL Server agent resources
There is one exception to this rule, when creating an AlwaysOn group listener this will create a set of resources shared by the AO group replicas, but note that this application\cluster group has no relation to a Failover Cluster Instance.
If you are using a listener for your AO group you are able to provide a central client access point for entry into the availability group, this provides a great deal of resilience when used in your application connection strings.
AlwaysOn is a feature configured at the individual service level by using SQL Server Configuration Manager. A computer node MUST first be a member of a valid WSFC before you may enable the AlwaysOn service feature. Availability replicas MUST be situated on separate physical nodes. Given this, in an example, INST1 must reside on Node1 and INST2 must reside on Node2. You cannot install INST1 and INST2 on to Node1 and configure an AO group.
For a basic AlwaysOn configuration, each node has local disk storage and non clustered instances of SQL Server installed. Ultimately, AO removes the storage single point of failure that is common place with shared disks. The nodes are completely stand alone apart from their membership of a valid Windows Server Failover Cluster.
Now that we have a little background we may start to look at how these technologies interact.
How Do Failover Cluster Instances Integrate With AlwaysOn Groups?
We first need to understand how this affects the WSFCs.
All participating nodes in an AlwaysOn group need to be joined to the same WSFC. Yes, that means that the servers must be joined to the same Windows domain to participate in the same WSFC. WSFCs, Failover Cluster Instances and AlwaysOn groups do not span Windows domains.
In a basic example of an AlwaysOn configuration we have the following node and IP address configurations:
Computer Nodes
AONode1.GM.co.uk 192.168.1.41
AONode2.GM.co.uk 192.168.1.42
WSFC Joined
SQLCluster01.GM.co.uk 192.168.1.45
Non Clustered SQL Server Instances
AOINST1 on AONode1
AOINST2 on AONode2
AlwaysOn Group
SQLAOGrp1
AlwaysOn Listener
SQLAOListener.GM.co.uk 192.168.1.46
How does this look?
The following diagram shows a graphical representation of how your system would look
For this scenario, each cluster node listed above has the following local drive storage and a single non clustered instance of SQL Server installed.
The instances installed are;
AONode1\AOINST1
AONode2\AOINST2
The locally attached disks in use on each node are;
C: = 78GB
D: = 300GB
E: = 300GB
F: = 300GB
The key here is that both nodes have separate storage resources to each other but are both joined to the Windows cluster shown below;
SQLCluster01.GM.co.uk
The AlwaysOn group has a listener using the virtual network name and IP detailed above. This would represent a standard AlwaysOn configuration and does not use a SQL Server FCI or shared storage resources.
Remember, the goal in AlwaysOn is to remove the storage single point of failure by utilising separate storage resources between participating computer nodes.
Now let's take our second scenario to include a corporate FCI into our AlwaysOn configuration.
The architect team have directed that the primary SQL server instance must be hardware protected on the primary site and there should be provision for read only replicas for reporting and maintenance. The following configuration incorporates an FCI to provide hardware protection on the primary site. These are the node and IP address configurations in use;
Computer Nodes
AONode1.GM.co.uk 192.168.1.41
AONode2.GM.co.uk 192.168.1.42
AONode3.GM.co.uk 192.168.1.43
AONode4.GM.co.uk 192.168.1.44
WSFC Joined
SQLCluster01.GM.co.uk 192.168.1.50
Cluster SQL Server Instance
SQL-CL-INST\SiteLive 192.168.1.51
Non Clustered SQL Server Instances
AONode1\DRRepl
AONode2\CLMaint
AlwaysOn Group
SQLAOGrp1
AlwaysOn Listener
SQLAOListener.GM.co.uk 192.168.1.46
How does this look?
The diagram below shows a graphical representation of the AlwaysOn availability incorporating a SQL Server Failover Cluster Instance.
Each server has the following locally attached drive storage.
C: = 78GB
D: = 300GB
E: = 300GB
F: = 300GB
In addition to the above locally attached disks, AONode3 and SAONode4 have the following shared LUNs attached from a SAN.
G: = 500GB
H: = 250GB
L: = 300GB
We have our WSFC titled "SQLCluster01.GM.co.uk", this is the cluster entry point for all 4 nodes and each server is joined to this cluster.
A Failover Cluster Instance of SQL Server has been installed first to AONode3 and then AONode4 has been added as a failover partner node, the instance is titled
"SQL-CL-INST\SiteLive" as detailed above. The FCI is using the following shared drives
G:
H:
L:
2 non clustered instances of SQL Server have been deployed as follows:
AONode1 has an instance named "DRRepl"
AONode2 has an instance named "CLMaint"
We have created an availability group titled SQLAOGrp1 on the clustered SQL Server instance called SQL-CL-INST\SiteLive and we include the following instances as replicas
AONode1\DRRepl
AONode2\CLMaint
This completes our AlwaysOn availability group incorporating FCI's.
Note here that AONode3 and AONode4 are part of the same Windows Server Failover Cluster as AONode1 and AONode2, only AONode3 and AONode4 are set up as failover partners for the clustered instance SQL-CL-INST\SiteLive (AONode1 and AONode2 have not had the SQL Server "Add node" wizard executed upon them).
As you can see, incorporating an FCI into an availability group requires an increased number of cluster nodes to ensure that the availability replicas meet the standard requirement whereby, each replica (SQL Server instance) must reside on a separate physical node. Understanding the different technologies allows you to pull them together to create one incredibly resilient SQL Server system.
Creating yourself a virtual test system will give you a better grasp on failover clusters and availability groups and as always have fun in your work. Post back if you're still unsure and I'll help all I can.
FCI A failover cluster instance of SQL Server
AO Always On availability groups, the new SQL Server 2012 feature
WSFC Windows Server Failover Cluster, a feature of the Windows 2008 operating system
NIC Network interface card
SAN Storage area network
LUN Logical unit number
For many years now Windows Server Failover Clusters and Failover Cluster Instances have been a popular configuration for highly available SQL Servers. Later, with SQL Server 2005 came database mirroring. Although mirroring was limited, it offered an extra level of redundancy. Now with SQL Server 2012, we take a huge leap forward in highly available SQL Servers.
AlwaysOn is the new high availability feature in SQL Server 2012 and provides the ability to create multiple copies of a highly available database. In order for AlwaysOn to provide this level of availability it utilises the functionality within Windows Server Failover Clusters. It doesn't by any means replace Failover Cluster Instances of SQL Server, rather, it shares the use of the core technology presented by the Windows operating system. Just as you could with database mirroring, you have the option to combine an AlwaysOn availability group with a SQL Server Failover Cluster Instance, but it adds a great deal of complexity due to the strict AO implementation rules and the sharing of the Windows operating system failover cluster feature.
In this article we're going to look at 2 different scenarios, the first provides a basic AlwaysOn configuration, the second will show how an FCI can combine with AlwaysOn groups to provide an extremely resilient high availability option for your SQL Server system.
Before we start, and to help us on our way, let's look briefly and individually at each of these technologies to understand a little more about them.
What is a Windows Server Failover Cluster?
Windows Server Failover Clusters are a feature of the Windows server 2008 operating system. They provide the ability to combine multiple computer nodes (physical and\or virtual) to service a set of applications for high availability, in this case a SQL Server instance. The application is made highly available by presenting to clients a virtual access point comprising a unique IP address and a unique computer name or "virtual network name". These become resources in an application group and are passed between participating nodes like tokens.
The computer nodes use shared storage, which is presented to each node, usually from a SAN. The disks are added to clustered application\group and will failover between the nodes when the group is moved (either manually or through automatic failover). Throughout normal operation, only one node may obtain access to the storage disks at any one time (the node that owns the particular resources) to avoid volume corruption, this is also true for all of the remaining resources in the cluster group too. Disk access is controlled by the cluster service once the disks have been added as cluster resources.
The WSFC requires some form of mediation to control cluster resource ownership, and uses Quorum to maintain cluster stability. In Windows Server 2008 WSFC, this Quorum takes the form of a node voting system with the majority required to maintain Quorum. You may also use additional quorum resources in the form of a disk for localised clusters or a remote file share for multi site clusters.
A critical hardware failure of the active computer node would result in the loss of group service and would automatically start the group on an available partner node. At a high level, the client access point details are transferred along with any disk and service resources, etc to a failover partner node. A failover of the clustered instance causes disconnection of client connections, these may then reconnect once the service is available on a partner node.
Common failures usually are;
Public NIC or network failure
Power supply failure
Motherboard failure
CPU failure
What is a Failover Cluster Instance?
A Failover Cluster Instance is an instance of SQL Server, default or named, that has been installed onto a WSFC as a clustered application. The clustered application typically has the following resources as a minimum;
IP address
Network name
Shared disk(s)
SQL Server service
SQL Server agent service
A clustered instance of SQL server will utilise any shared storage that has been presented to the Windows Server Failover Cluster nodes, usually this storage will take the form of LUNs presented from a SAN. A Failover Cluster Instance of SQL Server is deployed by launching the "New SQL Server Failover Cluster Installation" wizard on the first computer node that will participate in the FCI. Once this has been performed you would then launch the "Add Node to a SQL Server Failover Cluster" wizard on any computer node in the WSFC that you wish to participate in the new SQL Server FCI. For Standard edition of SQL Server you are limited to 2 nodes, Enterprise and upwards support the OS maximum (8 nodes in Windows 2003 and 16 nodes in Windows 2008).
Note: Standard edition limits the FCI to 2 nodes but this does not dictate how many nodes have membership of the Windows cluster, it is merely at the installer level.
What is an AlwaysOn Availability Group?
An AlwaysOn Availability Group, is a group of one or more databases configured on a primary replica (or SQL Server instance) for high availability. This will typically include one or more further replicas which will service a copy of the highly available database(s). Partner databases may be either readable or standby and may also use either asynchronous or synchronous commit modes.
In AlwaysOn a replica refers to an instance of SQL Server that is participating in the AlwaysOn availability group.
AlwaysOn relies on the WSFC core functionality to achieve the high availability that AO offers, but does not require any of the following shared resources associated with an FCI.
Shared disks
Shared IP address
Shared network name
Shared SQL Server and SQL Server agent resources
There is one exception to this rule, when creating an AlwaysOn group listener this will create a set of resources shared by the AO group replicas, but note that this application\cluster group has no relation to a Failover Cluster Instance.
If you are using a listener for your AO group you are able to provide a central client access point for entry into the availability group, this provides a great deal of resilience when used in your application connection strings.
AlwaysOn is a feature configured at the individual service level by using SQL Server Configuration Manager. A computer node MUST first be a member of a valid WSFC before you may enable the AlwaysOn service feature. Availability replicas MUST be situated on separate physical nodes. Given this, in an example, INST1 must reside on Node1 and INST2 must reside on Node2. You cannot install INST1 and INST2 on to Node1 and configure an AO group.
For a basic AlwaysOn configuration, each node has local disk storage and non clustered instances of SQL Server installed. Ultimately, AO removes the storage single point of failure that is common place with shared disks. The nodes are completely stand alone apart from their membership of a valid Windows Server Failover Cluster.
Now that we have a little background we may start to look at how these technologies interact.
How Do Failover Cluster Instances Integrate With AlwaysOn Groups?
We first need to understand how this affects the WSFCs.
All participating nodes in an AlwaysOn group need to be joined to the same WSFC. Yes, that means that the servers must be joined to the same Windows domain to participate in the same WSFC. WSFCs, Failover Cluster Instances and AlwaysOn groups do not span Windows domains.
In a basic example of an AlwaysOn configuration we have the following node and IP address configurations:
Computer Nodes
AONode1.GM.co.uk 192.168.1.41
AONode2.GM.co.uk 192.168.1.42
WSFC Joined
SQLCluster01.GM.co.uk 192.168.1.45
Non Clustered SQL Server Instances
AOINST1 on AONode1
AOINST2 on AONode2
AlwaysOn Group
SQLAOGrp1
AlwaysOn Listener
SQLAOListener.GM.co.uk 192.168.1.46
How does this look?
The following diagram shows a graphical representation of how your system would look
For this scenario, each cluster node listed above has the following local drive storage and a single non clustered instance of SQL Server installed.
The instances installed are;
AONode1\AOINST1
AONode2\AOINST2
The locally attached disks in use on each node are;
C: = 78GB
D: = 300GB
E: = 300GB
F: = 300GB
The key here is that both nodes have separate storage resources to each other but are both joined to the Windows cluster shown below;
SQLCluster01.GM.co.uk
The AlwaysOn group has a listener using the virtual network name and IP detailed above. This would represent a standard AlwaysOn configuration and does not use a SQL Server FCI or shared storage resources.
Remember, the goal in AlwaysOn is to remove the storage single point of failure by utilising separate storage resources between participating computer nodes.
Now let's take our second scenario to include a corporate FCI into our AlwaysOn configuration.
The architect team have directed that the primary SQL server instance must be hardware protected on the primary site and there should be provision for read only replicas for reporting and maintenance. The following configuration incorporates an FCI to provide hardware protection on the primary site. These are the node and IP address configurations in use;
Computer Nodes
AONode1.GM.co.uk 192.168.1.41
AONode2.GM.co.uk 192.168.1.42
AONode3.GM.co.uk 192.168.1.43
AONode4.GM.co.uk 192.168.1.44
WSFC Joined
SQLCluster01.GM.co.uk 192.168.1.50
Cluster SQL Server Instance
SQL-CL-INST\SiteLive 192.168.1.51
Non Clustered SQL Server Instances
AONode1\DRRepl
AONode2\CLMaint
AlwaysOn Group
SQLAOGrp1
AlwaysOn Listener
SQLAOListener.GM.co.uk 192.168.1.46
How does this look?
The diagram below shows a graphical representation of the AlwaysOn availability incorporating a SQL Server Failover Cluster Instance.
Each server has the following locally attached drive storage.
C: = 78GB
D: = 300GB
E: = 300GB
F: = 300GB
In addition to the above locally attached disks, AONode3 and SAONode4 have the following shared LUNs attached from a SAN.
G: = 500GB
H: = 250GB
L: = 300GB
We have our WSFC titled "SQLCluster01.GM.co.uk", this is the cluster entry point for all 4 nodes and each server is joined to this cluster.
A Failover Cluster Instance of SQL Server has been installed first to AONode3 and then AONode4 has been added as a failover partner node, the instance is titled
"SQL-CL-INST\SiteLive" as detailed above. The FCI is using the following shared drives
G:
H:
L:
2 non clustered instances of SQL Server have been deployed as follows:
AONode1 has an instance named "DRRepl"
AONode2 has an instance named "CLMaint"
We have created an availability group titled SQLAOGrp1 on the clustered SQL Server instance called SQL-CL-INST\SiteLive and we include the following instances as replicas
AONode1\DRRepl
AONode2\CLMaint
This completes our AlwaysOn availability group incorporating FCI's.
Note here that AONode3 and AONode4 are part of the same Windows Server Failover Cluster as AONode1 and AONode2, only AONode3 and AONode4 are set up as failover partners for the clustered instance SQL-CL-INST\SiteLive (AONode1 and AONode2 have not had the SQL Server "Add node" wizard executed upon them).
As you can see, incorporating an FCI into an availability group requires an increased number of cluster nodes to ensure that the availability replicas meet the standard requirement whereby, each replica (SQL Server instance) must reside on a separate physical node. Understanding the different technologies allows you to pull them together to create one incredibly resilient SQL Server system.
Creating yourself a virtual test system will give you a better grasp on failover clusters and availability groups and as always have fun in your work. Post back if you're still unsure and I'll help all I can.
Comments