1 – Introduction
This article explains how to deploy a SQL Server AlwaysOn Failover Cluster Instance (FCI) in Azure IaaS in SAN-less mode (without shared storage such as a SAN or NAS) using the StarWind Virtual SAN solution.
StarWind Virtual SAN presents shared volumes to a WSFC cluster from two or more nodes without a physical shared storage solution (SAN, NAS …). The DAS storage (physical or virtual) of each node is used to create clustered volumes managed by Virtual SAN. It can be used for Hyper-V clusters, SQL Server clusters, VMware clusters …
For SQL Server, the advantage of this solution is that you can deploy an AlwaysOn Failover Cluster Instance (which only requires SQL Server Standard Edition licenses) instead of an AlwaysOn Availability Group cluster (which requires SQL Server Enterprise Edition licenses and is therefore more expensive).
Links:
StarWind Virtual SAN: https://www.starwindsoftware.com/starwind-virtual-san/fr
StarWind Resource Library: https://www.starwindsoftware.com/resource-library
Azure Marketplace VM: https://azure.microsoft.com/en-us/marketplace/partners/starwind/starwindvirtualsan-starwindbyol/
Overview of Architecture:
I will deploy three VMs in Azure:
- One AD DC/DNS Server used to create a forest.
- Two SQL Server nodes in cluster (with SQL Server 2014 STANDARD Edition).
Note:
- In Azure, you can also use the Azure SQL Database service directly (based on AlwaysOn); the infrastructure is managed by Azure (PaaS mode).
- You can also deploy a SQL AlwaysOn Availability Group, but this feature requires SQL Server ENTERPRISE licenses (more expensive).
SQL VM Prerequisites:
To deploy the StarWind Virtual SAN solution, each VM must have:
- Two NICs (a minimum of two subnets is required: one for data synchronization and one for heartbeat).
- One dedicated data disk (this is not strictly required: you could create the shared volumes on each VM's system volume (C:\), but this is not a best practice). A provisioning sketch is shown below.
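For illustration, here is a minimal sketch of how these prerequisites could be provisioned at VM creation time with the classic (Service Management) Azure PowerShell module. The VM name, subnet names, credentials and disk size are lab examples, not required values; a second NIC also requires a VM size that supports multiple NICs (Large/A3 or bigger).
# Pick a Windows Server 2012 R2 image (classic/ASM cmdlets)
$img = (Get-AzureVMImage |
        Where-Object { $_.ImageFamily -eq "Windows Server 2012 R2 Datacenter" } |
        Sort-Object PublishedDate -Descending |
        Select-Object -First 1).ImageName

# VM configuration: primary NIC on the Prod subnet, a second NIC on the Heartbeat
# subnet and a dedicated data disk to host the StarWind Virtual SAN volumes
# ("Large" = A3: 4 cores, 7 GB)
$vm = New-AzureVMConfig -Name "l1-sqlfci-1" -InstanceSize "Large" -ImageName $img |
      Add-AzureProvisioningConfig -Windows -AdminUsername "labadmin" -Password "P@ssw0rd2015!" |
      Set-AzureSubnet -SubnetNames "Prod" |
      Add-AzureNetworkInterfaceConfig -Name "NIC-Heartbeat" -SubnetName "Heartbeat" |
      Add-AzureDataDisk -CreateNew -DiskSizeInGB 128 -DiskLabel "StarWind" -LUN 0

# -VNetName is only needed when deploying the first VM of the cloud service
New-AzureVM -ServiceName "tc-lab1-cs-sqlsrv" -VNetName "tc-lab1-lan" -VMs $vm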
StarWind Virtual SAN overview:
Virtual SAN must be installed on all nodes that will participate in the cluster.
On each node a Virtual SAN volume is created (file extension: .swdsk), and Virtual SAN replicates this volume between the two nodes (synchronous replication). The Virtual SAN volume is presented to the cluster nodes over the iSCSI protocol, using MPIO.
Note about cache: several cache modes can be applied to a Virtual SAN volume. This part is not covered in this article, but you can, for example, configure a cache on an SSD disk on each node to accelerate your IOPS.
You can also configure a Virtual SAN volume in thick- or thin-provisioned mode.
Several Virtual SAN volumes can be created on the same underlying volume; this is just a question of performance and usage.
Configuration:
In this article, each SQL VM is configured with two disks (one for the system and one to host the Virtual SAN volumes). I will configure two Virtual SAN volumes: one dedicated to the cluster quorum and one dedicated to the SQL FCI data (DB + log).
Overview of the Virtual SAN clustered disks and iSCSI configuration:
1.1 – Azure environment preparation
The environment will be composed:
- 1x Azure subscription (as a reminder, you can create a trial account with €150 of credit available for 30 days).
- 1x Azure Resource Group:
A Resource Group (RG) is a logical container used to group the Azure resources associated with an application. It provides centralized management and monitoring of these resources (lifecycle, cost calculation, provisioning, access control …).
name | type | location
RG-TCLAB1 | Resource Group | West Europe
- 1x Azure Storage Account (required to host the VM disks):
name | type | resource group | account type
tclab1storage | Storage Account | RG-TCLAB1 | Standard-LRS (Locally Redundant)
- 1x Virtual Network (VNET) with three subnets:
name | type | resource group | address space | subnet | subnet range | description
tc-lab1-lan | Virtual Network | RG-TCLAB1 | 172.16.0.0/16 | Prod | 172.16.0.0/24 | PROD subnet
 | | | | Gateway (Azure) | 172.16.1.0/29 | Used for VPN (P2S or S2S)
 | | | | Heartbeat | 172.16.10.0/24 | Cluster / Virtual SAN heartbeat
- 2x Cloud Services. As a reminder, all VMs in a Cloud Service must have the same number of NICs, so with two Cloud Services I don’t need to create the AD DC VM with the Heartbeat subnet. In addition, Cloud Services provide scalability options.
name | type | resource group | description
tc-lab1-cs | Cloud Service | RG-TCLAB1 | Used for basic servers (AD DC …)
tc-lab1-cs-sqlsrv | Cloud Service | RG-TCLAB1 | Used for SQL Servers
- 3x Virtual Machines:
name | type | resource group | dns name | private IP | size | description
l1-dc-1 | Virtual Machine | RG-TCLAB1 | tc-lab1-cs.cloudapp.net | 172.16.0.4 | Basic A0 (0.25 Core, 0.75 GB) | AD DC / DNS Server
l1-sqlfci-1 | Virtual Machine | RG-TCLAB1 | tc-lab1-cs-sqlsrv.cloudapp.net | 172.16.0.5 | Standard A3 (4 Cores, 7 GB) | SQL AlwaysOn FCI Node 1
l1-sqlfci-2 | Virtual Machine | RG-TCLAB1 | tc-lab1-cs-sqlsrv.cloudapp.net | 172.16.0.6 | Standard A3 (4 Cores, 7 GB) | SQL AlwaysOn FCI Node 2
- The two SQL nodes will be created with two vNICs and two disks each.
If you are new to Azure, read my detailed article (it explains all the steps to create the Azure environment: VM creation, Virtual Network configuration …): How to create an Azure Cloud environment.
2 – Installation of StarWind Virtual SAN
In this part I will configure two virtual volumes (replicated between the two SQL nodes). At the end, both volumes can be added to the WSFC cluster.
You can use the procedure below to configure CSV disks for a Hyper-V cluster or any other WSFC cluster.
Note: You can install only the “Virtual SAN” components on the servers that will participate in the replication, and install the “Management Console” component on an administration server or client.
On the first SQL Server, launch the Setup, select “StarWind Virtual SAN Service” and check the “Management Console”:
Enter your license:
https://www.starwindsoftware.com/registration-starwind-virtual-san
Click Finish:
Note that a Firewall Rule is created:
At first start, the Management Console asks you to configure the storage pool for virtual disks:
Select the disk previously prepared:
Repeat the operation on the second SQL Server and close the console.
2.1 – Virtual Disk “Quorum” – Creation
In the Management Console, select the first SQL server and click “Connect”:
Click “Add Device (advanced)”:
Select “Hard Disk Device”:
Select “Virtual disk” and click “Next”.
Check the virtual disk location, change the name and specify the size:
Select “Thick-provisioned”:
Configure the cache policy and specify the cache size.
Note: StarWind recommends using 1 GB of cache per 1 TB of storage.
Define the L2 cache policy if needed.
Note: StarWind recommends using SSDs for the L2 cache; if it is used, the formula is 1 GB of cache (sum of L1 and L2) per 1 TB of storage.
Enter a “Target Alias”; if you want, you can also change the target name:
Click “Create”:
Wait for completion and click Close:
Note that on the disk, two files (.img and .swdsk) are created:
2.2 – Virtual Disk “Quorum” – Configure Replication
Select the device you just created and click “Replication Manager”.
In the “Replication Manager” window click “Add Replica”. Select “Synchronous two-way replication” and click “Next”:
Enter the FQDN of the second SQL server:
Select the target disk for the second SQL Server:
Same disk as the first SQL Server:
Choose “Create new Partner Device” and click “Next”.
Leave the default options (check the drive letter).
Click “Change network settings“:
Select a Network for “Synchronization and HA” and the other Network for Heartbeat:
Note: if you have more than two networks, you can configure several networks for Synchronization/HA flows (or Heartbeat).
You can also modify the ALUA (Asymmetric Logical Unit Access, a multipathing method) settings, but it is recommended to keep “ALUA Optimized” for both targets.
Click “Create Replica”:
Click “Close”:
Wait for the end of Synchronization and click “Close”:
On the Management Console, add the second SQL Server to manage it:
Click “Add Server”:
Enter its FQDN:
Click “Connect”:
Now we can see the status of our replicated virtual disk. The second SQL server's priority is set to “Second”, and you can retrieve the local path of the virtual disk:
Note that the name “Image1” has changed to “HAImage1”.
2.3 – Virtual Disk “SQL Data” – Creation & Replication Configuration
Repeat the procedure to create the “SQL Data” disk:
Then configure Replication:
Wait for Synchronization and click “Close”:
The two disks are now ready; after the iSCSI configuration we can add them to the WSFC cluster:
3 – Enable iSCSI MPIO
To allow iSCSI multipath (the configuration of several paths between the iSCSI initiator and the target), we must configure the MPIO driver (installed previously) on both nodes.
Start the MPIO Console: mpiocpl.exe
Go to “Discover Multi-Paths”, select “Add support for iSCSI devices” and click “Add”:
Restart the computer.
After the restart, re-run “mpiocpl” and verify that “MSFT2005iSCSIBusType_0x9” has been added to “Devices”:
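If you prefer to script this step, here is a minimal PowerShell sketch using the built-in MPIO cmdlets (assuming Windows Server 2012 R2; a reboot is still required):
# Install the Multipath I/O feature if it is not already present
Install-WindowsFeature -Name Multipath-IO

# Claim iSCSI devices with the Microsoft DSM (equivalent to "Add support for iSCSI devices")
Enable-MSDSMAutomaticClaim -BusType iSCSI

# Reboot to apply the MPIO configuration
Restart-Computer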
4 – Configure iSCSI
The next step is to configure the iSCSI initiator on both nodes to present the two target disks in multipath mode.
Each SQL node will end up with 2x iSCSI targets for the Quorum disk + 2x iSCSI targets for the SQL-Data disk.
Reminder:
Hostname | IP Prod | IP Heartbeat |
l1-sqlfci-1 | 172.16.0.5 | 172.16.10.5 |
l1-sqlfci-2 | 172.16.0.6 | 172.16.10.6 |
4.1 – Present Disks to SQL Node 1
On the first SQL Server (L1-SQLFCI-1), start the “iSCSI initiator” configuration: iscsicpl
Configure Discovery
Go to “Discovery” tab and click “Discover Portal…”:
First add the host itself: enter the loopback address (127.0.0.1) and click “Advanced”:
Select “Microsoft iSCSI Initiator” and set the Initiator IP to “Default”:
Next repeat the procedure to add the other SQL Server:
Enter the IP address of the second SQL Server, click “Advanced” and set the Initiator IP to the local IP of the server.
For additional high availability, you can also add the Heartbeat network:
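The same discovery can also be scripted with the iSCSI initiator cmdlets. A sketch for the first node, using the portal IPs of this lab (adapt them for the second node):
# Make sure the iSCSI initiator service is running and starts automatically
Set-Service -Name MSiSCSI -StartupType Automatic
Start-Service -Name MSiSCSI

# Discovery portals: the local loopback and the second SQL node (Prod network)
New-IscsiTargetPortal -TargetPortalAddress "127.0.0.1"
New-IscsiTargetPortal -TargetPortalAddress "172.16.0.6" -InitiatorPortalAddress "172.16.0.5"

# Optional: the Heartbeat network, for additional availability
New-IscsiTargetPortal -TargetPortalAddress "172.16.10.6" -InitiatorPortalAddress "172.16.10.5"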
Connect Targets
Go to the “Targets” tab. You can see that the two disks are listed (through two different paths) and that the connections are “Inactive”.
Select the first target on the local server itself and click “Connect”:
Check “Enable multi-path”, click “Advanced” and set the Initiator IP to default and the Target Portal to the loopback:
Repeat the procedure for the second path of the Quorum disk. Set the Initiator IP to the local IP of the server and the Target Portal IP to the IP of the other SQL Server:
Repeat the procedure for the second iSCSI target (the SQL-Data disk): the first path on the local server and the second path on the other server. At the end, all targets must show a “Connected” status:
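Connecting the targets can also be done from PowerShell. The sketch below shows the two paths of the Quorum target on node 1; the IQNs are only examples (use the values returned by Get-IscsiTarget), and the same pattern applies to the SQL-Data target:
# List the discovered targets (two per disk: one exposed by each StarWind node)
Get-IscsiTarget | Format-Table NodeAddress, IsConnected -AutoSize

# First path: the target published by the local StarWind service, through the loopback
Connect-IscsiTarget -NodeAddress "iqn.2008-08.com.starwindsoftware:l1-sqlfci-1-quorum" `
    -TargetPortalAddress "127.0.0.1" -InitiatorPortalAddress "127.0.0.1" `
    -IsMultipathEnabled $true -IsPersistent $true

# Second path: the partner's target, through the Prod network
Connect-IscsiTarget -NodeAddress "iqn.2008-08.com.starwindsoftware:l1-sqlfci-2-quorum" `
    -TargetPortalAddress "172.16.0.6" -InitiatorPortalAddress "172.16.0.5" `
    -IsMultipathEnabled $true -IsPersistent $true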
Configure MPIO
Select the first Target of the “Quorum disk” and click “Devices…”
Click “MPIO”:
By default the “Load balance policy” is configured to “Round robin”.
Change it to “Fail Over Only” and check that the active path is the localhost: select the active path, click “Details” and verify the Source/Target Portal:
Repeat the same procedure for the second Target “SQLData”:
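As an alternative to the GUI, the default MSDSM load-balance policy can be set from PowerShell. Note that this changes the default applied to claimed iSCSI devices rather than a single LUN (FOO stands for Fail Over Only):
# Set the default MPIO load-balance policy to Fail Over Only
Set-MSDSMGlobalDefaultLoadBalancePolicy -Policy FOO

# Check the current default policy
Get-MSDSMGlobalDefaultLoadBalancePolicy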
4.2 – Present Disks to SQL Node 2
Repeat the full procedure to configure the iSCSI targets on the second SQL Server.
Configure Discovery:
Connect all targets (for each disk, one path locally and the second to the other server):
Configure the MPIO “Load Balance Policy” to “Fail Over Only” on both targets:
Quorum Disk:
SQLData Disk:
5 – Prepare Disk
Now we can see that the two volumes are visible to both SQL nodes (with multipath):
On one node, initialize both disks:
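This step can also be scripted. A minimal sketch, assuming the two StarWind LUNs are the only RAW disks on the node:
# Bring the new iSCSI disks online and initialize them with a GPT partition table
Get-Disk | Where-Object PartitionStyle -eq 'RAW' | ForEach-Object {
    Set-Disk -Number $_.Number -IsOffline $false
    Set-Disk -Number $_.Number -IsReadOnly $false
    Initialize-Disk -Number $_.Number -PartitionStyle GPT
}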
Now we are ready to mount the SQL Cluster!
6 – Create the WSFC Cluster
Go to the first node, start the WSFC console and select “Validate Cluster Configuration”:
Add the two SQL Nodes:
Run all tests, check “Create the cluster now…” and click Finish:
Enter the Cluster Name:
Note that the Cluster IP is configured in DHCP Mode.
Uncheck “Add all eligible storage to cluster”:
Click Finish:
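For reference, the validation and creation can also be scripted with the FailoverClusters module (the cluster name below is only an example; the DHCP IP issue described just after still applies):
# Validate the configuration on both nodes
Test-Cluster -Node "l1-sqlfci-1", "l1-sqlfci-2"

# Create the cluster without automatically adding the eligible storage
New-Cluster -Name "l1-cluster" -Node "l1-sqlfci-1", "l1-sqlfci-2" -NoStorage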
At the end of the cluster configuration, there is an error:
This is “normal” on Azure: the issue is due to the DHCP mode of the cluster IP. The cluster receives the same IP address as the node where the cluster was created, so there is an IP conflict.
Go to the second SQL node and start the WSFC Console:
Edit the cluster core “IP Address” resource:
Change the IP to a static IP (for the moment, there is no way to reserve it in Azure):
Click “Yes”:
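The same change can be made with the cluster PowerShell module. A sketch, where “Cluster IP Address” is the default core resource name and 172.16.0.100 is only an example address in the Prod subnet:
# Take the core IP offline, switch it from DHCP to a static address, then bring the core resources back
Stop-ClusterResource -Name "Cluster IP Address"
Get-ClusterResource -Name "Cluster IP Address" |
    Set-ClusterParameter -Multiple @{ "Address" = "172.16.0.100"; "SubnetMask" = "255.255.255.0"; "EnableDhcp" = 0 }
Start-ClusterResource -Name "Cluster IP Address"
Start-ClusterResource -Name "Cluster Name"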
6.1 – Configure Network
Go to Networks, rename them and check the configuration (cluster use):
6.2 – Add Disk to cluster
Go to Storage and select “Add Disk”:
Add the two Virtual Disks (managed by StarWind Virtual SAN):
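The PowerShell equivalent, run from one cluster node:
# Add all available (not yet clustered) disks - here the two StarWind HA devices - to the cluster
Get-ClusterAvailableDisk | Add-ClusterDisk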
Start the “Server Manager” console and create new volumes on these disks:
Create Quorum Volume (Q:):
Create “SQL Data” Volume (G:):
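A scripted equivalent for both volumes (the disk numbers are only an assumption, check them with Get-Disk; run this from the node that currently owns the cluster disks):
# Quorum volume (Q:) - assuming disk number 2 is the StarWind Quorum LUN
New-Partition -DiskNumber 2 -DriveLetter Q -UseMaximumSize |
    Format-Volume -FileSystem NTFS -NewFileSystemLabel "Quorum" -Confirm:$false

# SQL Data volume (G:) - assuming disk number 3 is the StarWind SQL-Data LUN
New-Partition -DiskNumber 3 -DriveLetter G -UseMaximumSize |
    Format-Volume -FileSystem NTFS -NewFileSystemLabel "SQL-Data" -Confirm:$false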
6.3 – Configure Quorum
Edit “Cluster Quorum Settings”:
Select “Advanced”:
Keep “All Nodes” for “Voting Configuration” and select “Configure a disk witness”:
Select the Q: Volume and click “Finish”:
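The scripted equivalent (“Cluster Disk 1” is the default name of the first clustered disk; check yours with Get-ClusterResource):
# Configure a disk witness on the Quorum clustered disk, keeping all nodes as voters
Set-ClusterQuorum -DiskWitness "Cluster Disk 1"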
We now have one disk used for the cluster quorum and one disk available for SQL Server:
7 – SQL Server – Install first Node
On the first node, mount the SQL Server 2014 (or 2012) Standard ISO and select “Installation\New SQL Server failover cluster installation”:
Select “Database Engine Services” and “Management Tools – Basic”:
Select “Default instance” and enter a SQL Server Network Name:
Keep or change the SQL Cluster group:
Select the Cluster Disk:
Configure an IP Address for the SQL cluster:
Configure your service accounts:
Set “Collation”:
Configure your Authentication mode and Administrators Group:
Configure the SQL paths on the cluster disk (except for TempDB: select a local path). Normally you should configure TempDB, logs, etc. on separate disks:
Click “Yes”:
Start Installation:
Wait for installation and click “Close”:
Now the SQL cluster instance is ready, with the clustered disk:
8 – SQL Server – Install the second Node
Go to the second node, mount SQL ISO and select “Add node to a SQL Server failover cluster”:
Check the Cluster Node and Network Configuration:
Configure Service Accounts:
Click “Install” and wait for completion:
9 – Connect to Instance
Now, if you try to connect to the instance directly from a node, it works:
But if you try to connect from a client, you get an error:
This is normal: in Azure you cannot connect directly to a clustered IP address, you have to configure an ILB (Internal Load Balancer). To access the SQL cluster instance, clients must connect to the ILB instead of the cluster IP.
10 – Create an Internal Load Balancer (ILB)
Start “Azure PowerShell” and run:
1 – Create the ILB:
Change the variables to match your environment and choose a load-balanced set name.
$SQLClusterName = "sqlfci-1"
$SQLClusterIP = "172.16.0.115"
$CloudServiceSQL = "tc-lab1-cs-sqlsrv"
$LBSetName = "LBS-SQLFCI"
Add-AzureInternalLoadBalancer -InternalLoadBalancerName $SQLClusterName -SubnetName "default" -ServiceName $CloudServiceSQL -StaticVNetIPAddress $SQLClusterIP
Note: Check ILB in a Cloud Service
Get-AzureInternalLoadBalancer -ServiceName "tc-lab1-cs-sqlsrv" -Verbose:$false
Note: Get VM Azure Endpoint
Get-AzureVM -Name "l1-sqlfci-1" -ServiceName "tc-lab1-cs-sqlsrv" -Verbose:$false | Get-AzureEndpoint | ft Name,protocol,port,localport,ProbeProtocol,ProbePort, ProbeIntervalInSeconds,InternalLoadBalancerName -AutoSize
By default, PowerShell and RDP endpoints are created on a VM:
2 – Add load balanced endpoint to the first cluster SQL node:
Note: Choose your own probe port (here: 31433). The same probe port must be configured on the endpoint of both VMs and on the SQL IP Address cluster resource.
Get-AzureVM -ServiceName $CloudServiceSQL -Name "l1-sqlfci-1" | Add-AzureEndpoint -Name "SQL" -LBSetName $LBSetName -Protocol "TCP" -LocalPort 1433 -PublicPort 1433 -ProbePort 31433 -ProbeProtocol tcp -ProbeIntervalInSeconds 10 –DirectServerReturn $true -InternalLoadBalancerName $SQLClusterName | Update-AzureVM
Now if we check Endpoints:
3 – Add load balanced endpoint to the second SQL cluster node:
Get-AzureVM -ServiceName $CloudServiceSQL -Name "l1-sqlfci-2" | Add-AzureEndpoint -Name "SQL" -LBSetName $LBSetName -Protocol "TCP" -LocalPort 1433 -PublicPort 1433 -ProbePort 31433 -ProbeProtocol tcp -ProbeIntervalInSeconds 10 –DirectServerReturn $true -InternalLoadBalancerName $SQLClusterName | Update-AzureVM
MSDN Links:
- LoadBalancerProbe Schema: https://msdn.microsoft.com/en-us/library/azure/jj151530.aspx
- Add-AzureEndpoint: https://msdn.microsoft.com/en-us/library/azure/dn495300.aspx
NOTE – View Load Balanced Set configuration through the Azure Portal:
Edit VM Settings. You can see the new “Load-Balanced” endpoint:
Go to “Load balanced sets”; you can see the ILB:
In addition, you can edit the load-balanced set:
You can view the member VMs and manage the ACLs:
10.1 – Configure SQL IP address Cluster Resource
Now the last step is to configure the cluster.
As a reminder, during the SQL setup I set the static IP address 172.16.0.115 on the SQL Server instance role, and I configured the ILB with the same IP. The last step is to add the probe port defined in the ILB to the SQL IP Address cluster resource.
On a cluster node, start a PowerShell console with elevated privileges. Retrieve the name of the “IP Address” resource of the SQL Server cluster group:
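For example, to list the resources of the SQL cluster group (the group name below is the default for a default instance; adjust it to your own):
Get-ClusterGroup "SQL Server (MSSQLSERVER)" | Get-ClusterResource | Format-Table Name, ResourceType, State -AutoSize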
Configure Probe Port (here 31433) on the SQL IP Address cluster resource:
Get-ClusterResource "SQL IP Address 1 (sqlfci-1)" | Set-ClusterParameter -Multiple @{Address="172.16.0.115";ProbePort="31433";SubnetMask="255.255.255.255";Network="Cluster Network - PROD";OverrideAddressMatch=1;EnableDhcp=0}
Check the SQL IP Address cluster resource configuration:
Get-ClusterResource "SQL IP Address 1 (sqlfci-1)" | Get-ClusterParameter
Note: The probe port's job is to find out which node currently hosts the IP address (the SQL role) in the cluster. The load balancer sends probe pings over TCP port 31433 to every node in the cluster (by default, every 10 seconds).
For more information about ILB and Probe Port configuration, read this excellent article: https://blogs.technet.com/b/askcore/archive/2015/06/24/building-windows-server-failover-cluster-on-azure-iaas-vm-part-2-network.aspx
Restart the SQL Server Cluster Role:
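For example, from an elevated PowerShell console (same assumption on the group name as above):
# Take the SQL Server role offline and bring it back online so the probe port is taken into account
Stop-ClusterGroup -Name "SQL Server (MSSQLSERVER)"
Start-ClusterGroup -Name "SQL Server (MSSQLSERVER)"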
Now you can connect to the SQL Cluster Instance through a Client.