HyperConvergence – Tech-Coffee

Don’t do it: enable performance history in an Azure Stack HCI mixed mode cluster

Lately I worked for a customer to add two nodes to an existing 2-node Storage Spaces Direct cluster. The existing nodes run Windows Server 2016 while the new ones run Windows Server 2019. So, when I integrated the new nodes into the cluster, it entered mixed operating system mode because two different versions of Windows Server were present in the cluster. For further information about this process, you can read this topic.

After the integration of the new nodes, I left the customer because the data in the cluster had been replicated and spread across them. During this period, the customer ran this command:

Start-ClusterPerformanceHistory

This command starts performance history in a Storage Spaces Direct cluster. In a native Windows Server 2019 cluster, a Cluster Shared Volume called ClusterPerformanceHistory is created to store performance metrics. Because the cluster was in mixed operating system mode and not in native Windows Server 2019 mode, this resulted in unexpected behavior: several ClusterPerformanceHistory CSVs were created. Even when they were deleted, new ClusterPerformanceHistory volumes kept being created indefinitely.


The customer tried to run the following cmdlet without success:

Stop-ClusterPerformanceHistory -DeleteHistory

How to resolve the performance history issue

To solve this issue, the customer ran these cmdlets:

$StorageSubSystem = Get-StorageSubSystem Cluster*
$StorageSubSystem | Set-StorageHealthSetting -Name "System.PerformanceHistory.AutoProvision.Enabled" -Value "False"

The option System.PerformanceHistory.AutoProvision.Enabled is set to True when the cmdlet Start-ClusterPerformanceHistory is run. However, the cmdlet Stop-ClusterPerformanceHistory doesn’t disable this setting.
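
If you want to verify the state of this setting before and after the change, the following minimal sketch may help (it assumes the default clustered storage subsystem returned by Get-StorageSubSystem Cluster*):

# Get the clustered storage subsystem
$StorageSubSystem = Get-StorageSubSystem Cluster*

# Display the current value of the automatic provisioning setting for performance history
Get-StorageHealthSetting -InputObject $StorageSubSystem -Name "System.PerformanceHistory.AutoProvision.Enabled"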

Keep Dell Azure Stack HCI hardware up to date with WSSD Catalog

Firmware and driver management can be a pain during the lifecycle of an Azure Stack HCI solution. Some firmware versions are not supported, while others must be installed to solve an issue. For Dell hardware, a support matrix is available here. If you look at that matrix, you’ll see firmware and drivers for storage devices, Host Bus Adapters, switches, network adapters and so on. It’s nice to have that support matrix, but should I find and download each driver or firmware package manually? Of course not.

For a few months now, Dell has provided a WSSD catalog that lets you download only the latest supported firmware and drivers for Azure Stack HCI and for your hardware. You can use this catalog from OpenManage (OME) or from Dell EMC Repository Manager. I prefer the second option because not all my customers have deployed OME. Dell EMC Repository Manager can be downloaded from this link.

Download the WSSD Catalog

The best way to download the WSSD Catalog is from this webpage. Download the file and unzip it. You should get two files: the catalog and the signature file.

Add the catalog to Dell EMC Repository Manager

Now that you have the WSSD catalog file, you can add it to Dell EMC Repository Manager. When you open it, just click on Add Repository.

Specify a repository name and click on Choose File under Base Catalog. Then select the WSSD catalog file.

Then you have to choose the Repository Type: either Manual or Integration. Integration is nice because you can specify an iDRAC name or IP address; only the firmware and drivers specific to that hardware are then downloaded. You can also choose Manual for a new infrastructure, to prepare your deployment. In this example, I choose Manual and select the R740XD model and Windows Server 2019. When you have finished, click on Add.

Create a custom SUU

Once the repository is added, you should see firmware and drivers. Select them and click on Export.

Then select the SUU ISO tab and choose the location where the SUU file will be exported.

Once the export job is finished, you get an SUU image file to update your Azure Stack HCI servers. You just have to copy it to each server, mount the ISO and run suu.cmd -e. You can also create a script to package and deploy firmware and drivers automatically.
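
As an illustration, here is a minimal sketch of applying the SUU from a node with PowerShell; the local ISO path is an assumption, and suu.cmd options may vary between SUU releases:

# Mount the exported SUU ISO on the node and run the updater
$IsoPath = "C:\Temp\SUU_AzureStackHCI.iso"                  # assumed path on the node
$Mount   = Mount-DiskImage -ImagePath $IsoPath -PassThru
$Letter  = ($Mount | Get-Volume).DriveLetter

# -e applies the supported updates, as mentioned above
Start-Process -FilePath "$($Letter):\suu.cmd" -ArgumentList "-e" -Wait

Dismount-DiskImage -ImagePath $IsoPath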

Conclusion

The WSSD Catalog provided by Dell eases the management of firmware and drivers in an Azure Stack HCI solution. They have to be updated several times a year and, before, this was time consuming. Now it’s straightforward and you have no excuse not to update your platform.

Storage Spaces Direct: performance tests between 2-Way Mirroring and Nested Resiliency

Microsoft has released Windows Server 2019 with a new resiliency mode called nested resiliency. This mode can handle two failures in a two-node S2D cluster. Nested resiliency comes in two flavors: nested two-way mirroring and nested mirror-accelerated parity. I’m certain that nested two-way mirroring is faster than nested mirror-accelerated parity, but the former provides only 25% of usable capacity while the latter provides around 40%. After discussing with some customers, they prefer to improve usable capacity rather than performance. Therefore, I expect to deploy more nested mirror-accelerated parity than nested two-way mirroring.

Before Windows Server 2019, two-way mirroring (which provides 50% of usable capacity) was mandatory in a two-node S2D cluster. Now, with Windows Server 2019, we have the choice. So, I wanted to compare performance between two-way mirroring and nested mirror-accelerated parity. Moreover, I wanted to know whether deduplication and compression have an impact on performance and CPU usage.

N.B.: I executed these tests in my lab, which is composed of do-it-yourself servers. What I want to show is a “trend”: what could be the bottleneck in some cases and whether nested resiliency has an impact on performance. So please, don’t blame me in the comment section 🙂

Test platform

I ran my tests on the following platform, composed of two nodes:

  • CPU: 1x Xeon 2620v2
  • Memory: 64GB of DDR3 ECC Registered
  • Storage:
    • OS: 1x Intel SSD 530 128GB
    • S2D HBA: Lenovo N2215
    • S2D storage: 6x SSD Intel S3610 200GB
  • NIC: Mellanox Connectx 3-Pro (Firmware 5.50)
  • OS: Windows Server 2019 GA build

Both servers are connected to two Ubiquiti ES-16-XG switches. Even though these switches don’t support PFC/ETS and so on, RDMA is working (I tested it with the Test-RDMA script). I don’t have enough traffic in my lab to disturb RDMA without a proper configuration. Even though I implemented it this way in my lab, it is not supported and you should not configure it like this for production use. On the Windows Server side, I added both Mellanox network adapters to a SET switch and created three virtual network adapters (a minimal sketch of this configuration follows the list):

  • 1x Management vNIC for RDP, AD and so on (routed)
  • 2x SMB vNICs for live migration and SMB traffic (not routed). Each vNIC is mapped to a pNIC.
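
Below is a minimal sketch of that network configuration; the physical adapter names (pNIC01/pNIC02) are assumptions from my lab:

# SET switch over both Mellanox adapters (adapter names are assumptions)
New-VMSwitch -Name "SW-SET" -NetAdapterName "pNIC01","pNIC02" -EnableEmbeddedTeaming $true -AllowManagementOS $false

# One management vNIC (routed) and two SMB vNICs (not routed)
Add-VMNetworkAdapter -ManagementOS -SwitchName "SW-SET" -Name "Management"
Add-VMNetworkAdapter -ManagementOS -SwitchName "SW-SET" -Name "SMB01"
Add-VMNetworkAdapter -ManagementOS -SwitchName "SW-SET" -Name "SMB02"

# Map each SMB vNIC to one physical adapter
Set-VMNetworkAdapterTeamMapping -ManagementOS -VMNetworkAdapterName "SMB01" -PhysicalNetAdapterName "pNIC01"
Set-VMNetworkAdapterTeamMapping -ManagementOS -VMNetworkAdapterName "SMB02" -PhysicalNetAdapterName "pNIC02"

# Enable RDMA on the SMB vNICs
Enable-NetAdapterRdma -Name "vEthernet (SMB01)","vEthernet (SMB02)"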

To test the solution, I used VMFleet. First I created two-way mirroring volumes without deduplication, then I enabled deduplication. Next, I deleted and recreated the volumes in nested mirror-accelerated parity without deduplication. Finally, I enabled compression and deduplication.

I ran VMFleet with a block size of 4KB, 30 outstanding I/Os and 2 threads per VM.
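
VMFleet drives DISKSPD inside the guest VMs, so the workload above roughly corresponds to a DISKSPD invocation like the following sketch (the executable path, target file and duration are assumptions):

# 4KB random I/O, 2 threads, 30 outstanding I/Os per thread, 30% write, caching disabled, latency stats
C:\run\diskspd.exe -b4K -t2 -o30 -w30 -r -d300 -Sh -L -c10G C:\run\testfile.dat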

Two-Way Mirroring without deduplication results

First, I ran the test without write workloads to see the “maximum” performance I could get. My cluster is able to deliver 140K IOPS with a CPU usage of 82%.

In the following test, I added 30% write workloads. The total IOPS is almost 97K, with 87% CPU usage.

As you can see, RSS and VMMQ are well configured because all cores are used.

Two-Way Mirroring with deduplication

First, you can see that deduplication is efficient because I saved 70% of total storage.

Then I ran a VMFleet test and, as you can see, there is a huge drop in performance. By looking closely at the screenshot below, you can see it’s because my CPU reaches almost 97%. I’m sure that with a better CPU I could get better performance. So, first trend: deduplication has an impact on CPU usage, and if you plan to use this feature, don’t choose a low-end CPU.

With 30% write added, I can’t expect better performance: the CPU still limits the overall cluster performance.

Nested Mirror-Accelerated Parity without deduplication

After I recreated the volumes, I ran a test with 100% read. Compared to two-way mirroring, there is a slight drop: I lost “only” 17K IOPS, reaching 123K IOPS. The CPU usage is 82%. You can also see that the latency is great (2ms).

Then I added 30% write and we can see the performance drop compared to two-way mirroring. My CPU usage reached 95%, which limits performance (but the latency is contained to 6ms on average). So nested mirror-accelerated parity requires more CPU than two-way mirroring.

Nested Mirror-Accelerated Parity with deduplication

First, deduplication also works great on a nested mirror-accelerated parity volume: I saved 75% of storage.

As with two-way mirroring with deduplication, I get poor performance because of my CPU (97% usage).

Conclusion

First, deduplication works great if you need to save space, at the cost of higher CPU usage. Secondly, nested mirror-accelerated parity requires more CPU, especially when there are write workloads. The following charts illustrate the CPU bottleneck. With deduplication, the latency always increases, and I think it is because of the CPU bottleneck. This is why I recommend being careful about the CPU choice.

Another interesting point is that nested mirror-accelerated parity produces only a slight performance drop compared to 2-way mirroring, but brings the ability to support two failures in the cluster. With deduplication enabled, we can save space and increase the usable capacity. In a two-node configuration, I’ll recommend nested mirror-accelerated parity to customers, while paying attention to the CPU.

Support two failures in 2-node S2D cluster with nested resiliency

Microsoft just released Windows Server 2019 with a lot of improvements for Storage Spaces Direct. One of these improvements is nested resiliency, which is specific to 2-node S2D clusters. Thanks to this feature, a 2-node S2D cluster can now support two failures, at the cost of the storage dedicated to resiliency. Nested resiliency comes in two flavors:

  • Nested two-way mirroring: it’s more or less 4-way mirroring, which provides 25% of usable capacity
  • Nested mirror-accelerated parity: it’s a volume with a nested mirror tier and a nested parity tier.

The following slide comes from a deck presented at Ignite.

To support two failures, a huge amount of storage is consumed by resiliency. Fortunately, Windows Server 2019 allows running deduplication on ReFS volumes. But be careful about CPU usage and storage device performance. I’ll talk about that in a future topic.
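
For reference, a minimal sketch of enabling deduplication on such a volume; the CSV path is an assumption based on the volume created below:

# The deduplication feature must be installed on every node first
Install-WindowsFeature -Name FS-Data-Deduplication

# Enable deduplication on the ReFS CSV with the Hyper-V usage profile
Enable-DedupVolume -Volume "C:\ClusterStorage\CSV-01" -UsageType HyperV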

Create a Nested Two-Way Mirror volume

To create a nested two-way mirror volume, you have to create a storage tier and then a volume. Below is an example from my lab (an all-flash solution) where the storage pool is called VMPool:

New-StorageTier -StoragePoolFriendlyName VMPool -FriendlyName Nested2wMirroringTier -ResiliencySettingName Mirror -NumberOfDataCopies 4 -MediaType SSD

New-Volume -StoragePoolFriendlyName VMPool -FriendlyName CSV-01 -StorageTierFriendlyNames Nested2wMirroringTier -StorageTierSizes 500GB

Create a Nested Mirror-Accelerated Parity volume

To create a nested mirror-accelerated parity volume, you need to create two tiers and then volumes composed of these tiers. In the example below, I create the two tiers and two nested mirror-accelerated parity volumes:

New-StorageTier -StoragePoolFriendlyName VMPool -FriendlyName NestedMirror -MediaType SSD -ResiliencySettingName Mirror -NumberOfDataCopies 4

New-StorageTier -StoragePoolFriendlyName VMPool -FriendlyName NestedParity -ResiliencySettingName Parity -NumberOfDataCopies 2 -PhysicalDiskRedundancy 1 -NumberOfGroups 1 -FaultDomainAwareness StorageScaleUnit -ColumnIsolation PhysicalDisk -MediaType SSD

New-Volume -StoragePoolFriendlyName VMPool -FriendlyName PYHYV01 -StorageTierFriendlyNames NestedMirror,NestedParity -StorageTierSizes 80GB, 150GB

New-Volume -StoragePoolFriendlyName VMPool -FriendlyName PYHYV02 -StorageTierFriendlyNames NestedMirror,NestedParity -StorageTierSizes 80GB, 150GB
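
Once the volumes are created, a quick way to see how much pool capacity each resiliency setting really consumes is to compare the provisioned size of each virtual disk with its footprint on the pool; a minimal sketch:

# Compare provisioned size with the capacity consumed in the pool for each volume
Get-VirtualDisk | Sort-Object FriendlyName |
    Format-Table FriendlyName, ResiliencySettingName, Size, FootprintOnPool -AutoSize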

Conclusion

Some customers didn’t want to deploy a 2-node S2D cluster in branch offices because it couldn’t support two failures. Thanks to nested resiliency, we can now support two failures in a 2-node cluster. However, be careful about the storage consumed by resiliency and the overall cluster performance if you enable deduplication.

Monitor and troubleshoot VMware vSAN performance issue

When you deploy VMware vSAN in a vSphere environment, the solution comes with several tools to monitor, find performance bottlenecks and troubleshoot vSAN issues. All the information I’ll introduce in this topic is built into vCenter. Unfortunately, not all vSAN configuration, metrics and alerts are available yet from the HTML5 client, so the screenshots were taken from the vCenter Flash-based Web Client.

Check the overall health of VMware vSAN

A lot of information is available from the vSAN cluster pane. VMware has added a dedicated tab for vSAN and some performance counters. In the screenshot below, I show the overall vSAN health. VMware has included several tests to validate the cluster health, such as hardware compatibility, the network, the physical disks, the cluster and so on.

The hardware compatibility list is downloaded from VMware to validate whether vSAN is supported on your hardware. If you take a look at the screenshot below, you can see that my lab is not really supported because my HBAs are not referenced by VMware. Regarding the network, several tests are also run, such as correct IP configuration, the MTU, whether ping is working and so on. Thanks to this single pane, we can check whether the cluster is healthy or not.

In the capacity section, you get information about the storage consumption and the deduplication ratio.

In the same pane, you also get a chart that shows the storage usage by object type (before deduplication and compression).

The next pane is useful when a node was down because of an outage or for updates. When you restart a node in a vSAN cluster, it must resync its data from its buddy: while the node was down, a lot of data changed on the storage and the node must resync that data. This pane indicates which vSAN objects must be resynced to comply with the chosen RAID level and FTT (Failures To Tolerate). During a resync, this pane shows how many components must be resynced, the remaining bytes to resync and an estimated time for this process. You can also manage the resync throttling.

In the Virtual Objects pane, you can get the health state of each vSAN object. You can also check whether the object is compliant with the VM storage policy you have defined (FTT, RAID level, cache pinning, etc.). Moreover, in the physical disk placement tab, you get the component placement and which components are active or not. In my lab, I have a two-node vSAN cluster and I have defined RAID 1 with FTT=1 in my storage policy. So for each object, I have three components: the data twice and a witness.

In the Physical Disks pane, you can list the physical disks involved in vSAN for each node. You can also see which components are stored on which physical disks.

In the proactive tests, you can test a VM creation to validate that everything is working. For example, this test once helped me troubleshoot an MTU issue between hosts and switches.

vSAN performance counters

Sometimes you get poor performance when you expected better, so you need to find the performance bottleneck. The performance counters can help you troubleshoot the issue. In the Performance tab, you get the classic performance counters for CPU, memory and so on.

VMware has also added two sections dedicated to vSAN performance counters: vSAN – Virtual Machine Consumption and vSAN – Backend. The screenshot below shows you the first section. It is useful because it indicates the throughput, the latency and the congestion.

The other section presents performance counters related to the backend. You can get the throughput consumed by resync jobs, and the IOPS and latency of the vSAN backend.
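
If you prefer the command line, a similar health and capacity overview can be pulled with the vSAN cmdlets available in recent PowerCLI releases; a minimal sketch, where the vCenter and cluster names are assumptions:

# Connect to vCenter and select the vSAN cluster (names are assumptions)
Connect-VIServer -Server "vcsa.lab.local"
$Cluster = Get-Cluster -Name "vSAN-Cluster"

# Run the vSAN health checks and display space usage (including dedup savings)
Test-VsanClusterHealth -Cluster $Cluster
Get-VsanSpaceUsage -Cluster $Cluster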

Use Honolulu to manage your Microsoft hyperconverged cluster

A few months ago, I wrote a topic about the next-gen Microsoft management tool called Project Honolulu. Honolulu provides management for standalone Windows Server, failover clusters and hyperconverged clusters. Currently, hyperconverged management works only on Windows Server Semi-Annual Channel (SAC) versions (I’m crossing my fingers for Honolulu support on Windows Server LTSC). I have upgraded my lab to the latest technical preview of Windows Server SAC to show you how to use Honolulu to manage your Microsoft hyperconverged cluster.

As part of my job, I have deployed dozens of Microsoft hyperconverged clusters and, to be honest, the main disadvantage of this solution is the management. The Failover Cluster Manager console is archaic and you have to use PowerShell to manage the infrastructure. Even though the Microsoft solution provides high-end performance and good reliability, the day-to-day management is tricky.

Thanks to Project Honolulu, we now have a modern management tool that can compete with other solutions on the market. Currently Honolulu is still a preview version and some features are not yet available, but it’s going in the right direction. Moreover, Project Honolulu is free and can be installed on your laptop or on a dedicated server, as you wish!

Honolulu dashboard for hyperconverged cluster

Once you have added the cluster connection to Honolulu, you get a new entry with the type Hyper-Converged Cluster. By clicking on it, you can access a dashboard.

This dashboard provides a lot of useful information, such as the latest alerts raised by the Health Service, the overall performance of the cluster, the resource usage and information about servers, virtual machines, volumes and drives. You can see that the cluster performance charts currently indicate No data available. This is because the preview of Windows Server that I have installed doesn’t provide this information yet.

From my point of view, this dashboard is pretty clear and provides global information about the cluster. At a glance, you get the overall health of the cluster.

N.B.: the memory usage indicates -35.6% because of a custom motherboard that doesn’t report the memory installed in the node.

Manage Drives

By clicking on Drives, you get information about the raw storage of your cluster and your storage devices. You get the total number of drives (I know I don’t follow the requirements because I have 5 drives on one node and 4 on another, but it is a lab). Honolulu also provides the drive health and the raw capacity of the cluster.

By clicking on Inventory, you get detailed information about your drives, such as the model, the size, the type, the storage usage and so on. At a glance, you know whether you have to run an Optimize-StoragePool.
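
For reference, a minimal sketch of that rebalance operation; the pool name wildcard is an assumption (S2D pools are usually named "S2D on <cluster name>"):

# Rebalance data across all drives of the S2D pool and follow the job
Get-StoragePool -FriendlyName "S2D*" | Optimize-StoragePool
Get-StorageJob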

By clicking on a drive, you get further information about it. Moreover, you can act on it: for example, you can turn the light on, retire the disk or update the firmware. For each drive, you can get performance and capacity charts.

Manage volumes

By clicking on Volumes, you can get information about your Cluster Shared Volumes. At a glance, you get the health, the overall performance and the number of volumes.

In the inventory, you get further information about the volumes such as the status, the file system, the resiliency, the size and the storage usage. You can also create a volume.

By clicking on create a new volume, you get this:

By clicking on a volume, you get more information about it and you can take actions such as open, resize, take offline and delete.

Manage virtual machines

From Honolulu, you can also manage virtual machines. When you click on Virtual Machines | Inventory, you get the following information. You can also manage the VMs (start, stop, turn off, create a new one, etc.). All chart values are shown in real time.

vSwitches management

From the Hyper-Converged Cluster pane, you have information about virtual switches. You can create a new one, or delete, rename and change the settings of an existing one.

Node management

Honolulu also provides information about your nodes in the Servers pane. At a glance, you get the overall health of all your nodes and their resource usage.

In the inventory, you have further information about your nodes.

If you click on a node, you can pause it for updates or hardware maintenance. You also get detailed information such as performance charts, the drives connected to the node and so on.

Conclusion

Project Honolulu is the future of Windows Server management. This product presents great information about Windows Server, failover clusters and hyperconverged clusters in a web-based form. From my point of view, Honolulu eases the management of the Microsoft hyperconverged solution and can help administrators. Some features are missing, but Microsoft listens to the community. Honolulu is modular because it is based on extensions, so without a doubt Microsoft will add features regularly. I’m just crossing my fingers for Honolulu support on Windows Server 2016, released in October 2016, but I am optimistic.

Storage Spaces Direct dashboard

Today I’m releasing a special Christmas gift for you. For some time, I have been writing a PowerShell script to generate a Storage Spaces Direct dashboard. This dashboard enables you to validate each important setting of an S2D cluster.

I decided to write this PowerShell script to avoid running hundreds of PowerShell cmdlets and checking the returned values manually. With this dashboard, you get almost all the information you need.

Where can I download the script

The script is available on GitHub. You can download the documentation and the script from this link. Please read the documentation before running the script.

Storage Spaces Direct dashboard

The screenshot below shows a dashboard example. This dashboard was generated from the 2-node cluster in my lab.

Roadmap

I plan to improve the script next year by adding support for the disaggregated S2D deployment model and information such as the cache/capacity ratio and the reserved capacity.

Special thanks

I’d like to thank Dave Kawula, Charbel Nemnom, Kristopher Turner and Ben Thomas. Thanks for helping me resolve most of the issues by running the script on your S2D infrastructures.

Dell R730XD bluescreen with S130 adapter and S2D

This week I worked for a customer who had an issue with his Storage Spaces Direct (S2D) cluster. When he restarted a node, Windows Server didn’t start and a blue screen appeared. This is because the operating system disks were plugged into the S130 controller while the Storage Spaces Direct devices were connected to the HBA330mini. This is an unsupported configuration, especially with the Dell R730XD. In this topic, I’ll describe how I changed the configuration to make it supported.

This topic was co-written with my colleague Frederic Stefani (Dell – Solution Architect). Thanks for the HBA330 image 🙂

Symptom

You have several Dell R730XD servers added to a Windows Server 2016 failover cluster where S2D is enabled. Moreover, the operating system is installed on two storage devices connected to an S130 in software RAID. When you reboot a node, you get the following blue screen.

How to resolve the issue

This issue occurs because running S2D with the operating system connected to the S130 is not supported. You have to connect the operating system to the HBA330mini. This means that the operating system will no longer be installed on a RAID 1, but an S2D node can be redeployed quickly if you have written a proper PowerShell script.

To make the hardware change, you need a 50cm SFF-8643 cable to connect the operating system disk to the HBA330mini. Moreover, you have to reinstall the operating system (sorry about that). Then an HBA330mini firmware image must be applied, otherwise the enclosure will not be visible in the operating system.

Connect the operating system disk to HBA330mini

First, place the node into maintenance mode: in Failover Cluster Manager, right-click on the node and select Pause and Drain Roles.

Then stop the node. When the node is shut down, you can evict it from the cluster and delete the Active Directory computer object related to the node (since you have to reinstall the operating system).
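
The same steps can be scripted; a minimal sketch, where Cluster01 and Node02 are hypothetical names:

# Drain the roles and pause the node
Suspend-ClusterNode -Cluster "Cluster01" -Name "Node02" -Drain

# Shut down the node, then evict it from the cluster
Stop-Computer -ComputerName "Node02" -Force
Remove-ClusterNode -Cluster "Cluster01" -Name "Node02" -Force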

First, we have to remove the cable whose connectors are circled in red in the picture below. This cable connects both operating system storage devices to the S130.

To connect the operating system device, you need an SFF-8643 cable, as shown below.

So disconnect the SAS cable between the operating system devices and the S130.

Then we need to remove these fans to be able to plug the SFF-8643 cable into the backplane. To remove the fans, turn the blue items that are circled in red in the picture below.

Then connect the operating system device to the backplane with the SFF-8643 cable. Plug the cable into the SAS A1 port on the backplane. Also remove the left operating system device from the server (the top left one in the picture below). This is now your spare device.

Put the fans back in the server and turn the blue items.

Start the server and open the BIOS settings. Navigate to SATA Settings and set Embedded SATA to Off. Restart the server.

Enter the BIOS settings again and open Device Settings. Check that the S130 is no longer in the menu and select the HBA330mini device.

Check that an additional physical disk is connected to the HBA, as shown below.

Reinstall operating system and apply HBA330mini image

Now that the operating system disk is connected to the HBA330mini, it’s time to reinstall the operating system. Use your favorite way to install Windows Server 2016. Once the OS is installed, connect to the virtual media from the iDRAC and mount this image as the removable virtual media:

Next, change the next boot device to Virtual Floppy.

On the next boot, the image is loaded and applied to the system. The server then considers that the HBA has been replaced.

Add the node to the cluster

Now you can add the node to the cluster again. You should now see the enclosure.
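
A minimal sketch of adding the node back and checking the enclosure (cluster and node names are hypothetical):

# Add the reinstalled node back to the cluster
Add-ClusterNode -Cluster "Cluster01" -Name "Node02"

# The HBA330mini should now expose the enclosure on that node
Get-StorageEnclosure -CimSession (New-CimSession -ComputerName "Node02")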

Conclusion

When you build an S2D solution based on the Dell R730XD, don’t connect your operating system disks to the S130: you will get a blue screen on reboot. If you have already bought servers with the S130, you can follow this topic to resolve the issue. If you plan to deploy S2D on the R740XD, you can connect your operating system disks to the BOSS card and the S2D devices to the HBA330+.

Storage Spaces Direct: plan the storage for hyperconverged

When a customer calls me to design or validate the hardware configuration for a hyperconverged infrastructure with Storage Spaces Direct, there is often a misunderstanding about the remaining usable capacity, the required cache capacity and ratio, and the different resilience modes. With this topic, I’ll try to help you plan the storage for hyperconverged deployments and clarify some points.

Hardware consideration

Before sizing the storage devices, you should be aware of some limitations. First, you can’t exceed 26 storage devices per node. Windows Server 2016 can’t handle more than 26 storage devices, so if you deploy your operating system on two storage devices, 24 are available for Storage Spaces Direct. However, storage devices are getting bigger and bigger, so 24 storage devices per node is enough (I have never seen a deployment with more than 16 storage devices for Storage Spaces Direct).

Secondly, you have to pay attention to your HBA (Host Bus Adapter). With Storage Spaces Direct, the operating system is in charge of handling resilience and cache; this is a software-defined solution after all. So, there is no reason for the HBA to manage RAID and cache. In the Storage Spaces Direct case, the HBA is mainly used to add more SAS ports. So, don’t buy an HBA with RAID and cache, because you will not use these features: Storage Spaces Direct storage devices will be configured in JBOD mode. If you buy Lenovo servers, you can choose the N2215 HBA; if you choose Dell, you can select the HBA330. The HBA must provide the following features:

  • Simple pass-through SAS HBA for both SAS and SATA drives
  • SCSI Enclosure Services (SES) for SAS and SATA drives
  • Any direct-attached storage enclosures must present Unique ID
  • Not Supported: RAID HBA controllers or SAN (Fibre Channel, iSCSI, FCoE) devices

Thirdly, there are requirements regarding the storage devices. Only NVMe, SAS and SATA devices are supported. If you have old SCSI storage devices, you can drop them :). These storage devices must be physically attached to only one server (local-attached devices). If you choose to implement SSDs, these devices must be enterprise-grade with power loss protection. So please, don’t install a hyperconverged solution with Samsung 850 Pro drives. If you plan to install cache storage devices, these SSDs must have at least 3 DWPD, which means the device can be entirely written three times per day.

Finally, you have to respect a minimum number of storage devices. You must implement at least 4 capacity storage devices per node. If you plan to install cache storage devices, you have to deploy at least two of them per node. Each node in the cluster must have the same kind of storage devices: if you choose to deploy NVMe in one server, all servers must have NVMe. As much as possible, keep the same configuration across all nodes. The list below gives the minimum number of storage devices per node for each configuration:

  • All NVMe (same model): 4 NVMe
  • All SSD (same model): 4 SSD
  • NVMe + SSD: 2 NVMe + 4 SSD
  • NVMe + HDD: 2 NVMe + 4 HDD
  • SSD + HDD: 2 SSD + 4 HDD
  • NVMe + SSD + HDD: 2 NVMe + 4 others

Cache ratio and capacity

Cache ratio and capacity are an important part of the design when you choose to deploy a cache. I have seen a lot of wrong designs because of the cache mechanism. The first thing to know is that the cache is not mandatory: as shown in the list above, you can implement an all-flash configuration without a cache. However, if you choose to deploy a solution based on HDDs, you must implement a cache. When the capacity devices behind the cache are HDDs, the cache is set to read/write mode; otherwise, it is set to write-only mode.

The cache capacity must be at least 10% of the raw capacity: if each node has 10TB of raw capacity, you need at least 1TB of cache. Moreover, if you deploy a cache, you need at least two cache storage devices per node to ensure the high availability of the cache. When Storage Spaces Direct is enabled, capacity devices are bound to cache devices in a round-robin manner. If a cache storage device fails, all its capacity devices are bound to another cache storage device.

Finally, you must respect a ratio between the number of cache devices and the number of capacity devices: the number of capacity devices must be a multiple of the number of cache devices. This ensures that each cache device serves the same number of capacity devices.

Reserved capacity

When you design the storage pool capacity and choose the number of storage devices, keep in mind that you need some unused capacity in the storage pool. This is the reserved capacity used by the repair process. If a capacity device fails, the storage pool duplicates the blocks that were written on this device to restore the resilience mode, and this process requires free space. Microsoft recommends leaving the equivalent of one capacity device per node empty, up to four drives.

For example, if I have 6 nodes with 4x 4TB HDDs per node, I leave 4x 4TB empty in the storage pool (one per node, up to four drives) as reserved capacity.

Example of storage design

You should know that in a hyperconverged infrastructure, the storage and the compute are related because these components reside in the same box. So, before calculating the required raw capacity, you should have evaluated two things: the number of nodes you plan to deploy and the usable storage capacity required. For this example, let’s say that we need four nodes and 20TB of usable capacity.

First, you have to choose a resilience mode. In hyperconverged deployments, usually 2-way mirroring or 3-way mirroring is implemented. If you choose 2-way mirroring (one fault tolerated), you get 50% of usable capacity. If you choose 3-way mirroring (recommended, two faults tolerated), you get only 33% of usable capacity.

PS: At the time of writing, Microsoft has announced deduplication for ReFS volumes in the next Windows Server release.

So, if you need 20TB of usable capacity and you choose 3-way mirroring, you need at least 60TB (20 x 3) of raw storage capacity. That means each node of the 4-node cluster needs 15TB of raw capacity.

Now that you know you need 15TB of raw storage per node, you need to define the number of capacity storage devices. If you need maximum performance, you can choose only NVMe devices, but this solution will be very expensive. For this example, I choose SSDs for the cache and HDDs for the capacity.

Next, I need to define which kind of HDD to select. If I choose 4x 4TB HDDs per node, I will have 16TB of raw capacity per node, and I need to add an additional 4TB HDD for the reserved capacity. But this solution is not good regarding the cache ratio: no cache ratio can be respected with five capacity devices. In this case, I need to add another 4TB HDD to get a total of 6x 4TB HDDs per node (24TB of raw capacity), and I can respect the cache ratio with 1:2 or 1:3.

The other solution is to select 2TB HDDs. I need 8x 2TB HDDs to get the required raw capacity, then I add an additional 2TB HDD for the reserved capacity. I get 9x 2TB HDDs and I can respect the cache ratio with 1:3. I prefer this solution because it is closest to the specifications.

Now we need to design the cache devices. For our solution, we need 3 cache devices for a total capacity of at least 1.8TB (10% of the raw capacity per node). So I choose to buy 800GB SSDs (because my favorite cache SSD, the Intel S3710, exists in 400GB or 800GB :)). 3x 800GB = 2.4TB of cache capacity per node, which is above the requirement.

So, each node will be installed with 3x 800GB SSDs and 9x 2TB HDDs, with a cache ratio of 1:3. The total raw capacity is 72TB and the reserved capacity is 8TB. The usable capacity will be 21.12TB ((72 - 8) x 0.33).
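
The same arithmetic in a few lines of PowerShell, for the 4-node example above (3-way mirroring efficiency taken as 33%):

# Storage sizing for the example: 4 nodes, 9x 2TB capacity HDDs per node
$Nodes            = 4
$DrivesPerNode    = 9
$CapacityPerDrive = 2TB
$Efficiency       = 0.33                                   # 3-way mirroring

$RawCapacity      = $Nodes * $DrivesPerNode * $CapacityPerDrive
$ReservedCapacity = $Nodes * $CapacityPerDrive             # one capacity drive per node, up to four
$UsableCapacity   = ($RawCapacity - $ReservedCapacity) * $Efficiency

"{0:N2} TB usable" -f ($UsableCapacity / 1TB)              # ~21.12 TB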

About Fault Domain Awareness

I have made this demonstration with fault domain awareness at the node level. If you choose to configure fault domain awareness at the chassis or rack level, the calculation is different. For example, if you configure fault domain awareness at the rack level, you need to divide the total raw capacity across the number of racks, and you also need exactly the same number of nodes per rack. With this configuration and the above case, you need 15TB of raw capacity per rack.

Upgrade VMware vSAN to 6.6

Yesterday VMware released vSAN 6.6. vSAN 6.6 brings a lot of new features and improvements, such as encryption, increased performance and simplified management. You can get the release notes here. Currently my lab is running vSAN 6.5 and I have decided to upgrade to vSAN 6.6. In this topic, I’ll show you how to upgrade VMware vSAN from 6.5 to 6.6.

Step 1: upgrade your vCenter Server Appliance

In my lab, I have deployed a vCenter Server Appliance. So, to update the VCSA, I connect to the Appliance Management interface (https://<IP or DNS of VCSA>:5480). Then I navigate to Update and click on Check Updates from Repository.

Once the update is installed, click on the Summary tab and reboot the VCSA. You should then see the new version.

Step 2: Update ESXi nodes

Manage patch baseline in Update Manager

My configuration consists of two ESXi 6.5 nodes and one vSAN 6.5 witness appliance. To update these hosts, I use Update Manager. To create or edit a baseline, open Update Manager from the “hamburger” menu.

I have created an update baseline called ESXi 6.5 updates.

This baseline is dynamic, which means that patches are added automatically according to the criteria.

The criteria are any patches for the product VMware ESXi 6.5.0.

Update nodes

Once the baseline is created, you can attach it to the nodes. Navigate to Hosts and Clusters, select the cluster (or a node) and open the Update Manager tab. In this tab, you can attach the baseline. Then you can click on Scan for Updates to verify whether the node is compliant with the baseline (in other words, whether the node has the latest patches).
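
If you prefer to script this part, the Update Manager PowerCLI cmdlets can attach the baseline and scan the hosts; a minimal sketch, where the cluster and baseline names are assumptions:

# Attach the patch baseline to the cluster, scan it and check compliance
$Cluster  = Get-Cluster -Name "vSAN-Cluster"
$Baseline = Get-Baseline -Name "ESXi 6.5 updates"

Attach-Baseline -Baseline $Baseline -Entity $Cluster
Scan-Inventory -Entity $Cluster
Get-Compliance -Entity $Cluster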

My configuration is specific because it is a lab: I run a configuration that is absolutely not supported, because the witness appliance is hosted on the same vSAN cluster. To avoid issues, I manually put the node I want to update into maintenance mode and move the VMs to the other node. Then I click on Remediate in the Update Manager tab.

Next, I select the baseline and click on Next.

Then I select the target node.

Two patches are not installed on the node. These patches are related to vSAN 6.6.

I don’t want to schedule this update for later, so I just click on Next.

In the host remediation options tab, you can change the VM power state. I prefer not to change the VM power state and to run a vMotion instead.

In the next screen, I choose to disable the HA admission control as recommended by the wizard.

Next, you can run a pre-check remediation. Once you have validated the options, you can click on Finish to install the updates on the node.

The node is rebooted and, when the update is finished, you can exit maintenance mode. I repeat these steps for the second node and the witness appliance.

Note: in a production infrastructure, you just have to run Update Manager at the cluster level and not for each node. I put the node into maintenance mode and move the VMs manually because my configuration is specific and not supported.

Step 3: Upgrade disk configurations

Now that the nodes and vCenter are updated, we have to upgrade the on-disk format version. To upgrade the disks, select your cluster and navigate to Configure | General. Then run a Pre-check Upgrade to validate the configuration.

If the pre-check is successful, you should see something like the screenshot below. Then click on Upgrade.

The disks are then upgraded…

Once all disks are upgraded, they should be on on-disk format version 5.0.

That’s all. Now you can enjoy VMware vSAN 6.6.
