TestDrive

VMware vSAN Express Storage Architecture


VMware vSAN Express Storage Architecture (ESA) is a new hyperconverged infrastructure (HCI) storage architecture designed to deliver space-efficient and highly resilient storage without compromising performance. It is optimized to exploit the full potential of the latest hardware and unlocks new capabilities for our customers.

Here are some of the key features and benefits of vSAN ESA:

  • Space efficiency:  vSAN ESA uses adaptive erasure coding to achieve space efficiency without compromising performance. This means you can store more data on your vSAN cluster with fewer storage devices.
  • Resilience:  vSAN ESA uses a distributed fault tolerance model to ensure your data is always available, even if one or more disks fail. This makes it an ideal choice for mission-critical applications.
  • Performance:  vSAN ESA delivers high performance for all workloads, including databases, virtual desktop infrastructure (VDI), and big data. This is due to high-speed NVMe disks and a new caching mechanism that reduces the impact of I/O latency.
  • Simplicity:  vSAN ESA is easy to deploy and administer using hosts from various OEMs in the VMware vSAN ReadyNode program. It is managed using the same tools you use for other vSAN deployments.

vSAN ESA is a robust new architecture that delivers space efficiency, high resilience, and high performance for all workloads. It is an ideal choice for organizations looking for a cost-effective and reliable way to store their data.

Before you Begin

To complete this product walkthrough, make sure you have the following:

  • A valid account in the VMware TestDrive environment. Sign up here if you do not have one.

Accessing the vSAN Environment

To log in to the vSAN environment, perform the following steps.

First, open a web browser of your choice and navigate to portal.vmtestdrive.com. Select LOGIN. If you do not already have an account, refer to the instructions found here.

Enter your TestDrive Username and Password, then click ENTER.

Next, locate the vSAN product under the Accelerate Cloud Journey tab.

Click LAUNCH, then LAUNCH VIA WORKSPACE ONE.

A new tab will open with Workspace ONE.

Enter your TestDrive username and password, then click Sign In.

Next, go to the Apps section and search for the vSAN desktop. Click the computer icon to open the desktop using either HTML access or Horizon Client access.

You are now in a session on the vSAN RDSH desktop.

Inside the desktop, double-click the vSphere Client shortcut icon.

Click LAUNCH VSPHERE CLIENT, then log in with the credentials listed below.

Username: (listed in text file "vSAN Demo Credentials.txt" on the vSAN Desktop)

Password: (listed in text file "vSAN Demo Credentials.txt" on the vSAN Desktop)

vSAN Datastore Capacity

Let's begin by looking at the vSAN datastore capacity.

Navigate to the Hosts and Clusters view, expand the vCenter vdc-vcsa-wl01com > WDC-Workload, and select WDC-WL-Cluster02.

Click the Monitor tab and then select vSAN > Capacity (scroll down if needed).

You can see total usable capacity, how much has been consumed, and free space. There are additional details, such as compression savings and historical use. The What if Analysis lets you see how these capacity metrics would change if a new storage policy were assigned to the virtual machines on the vSAN datastore. Scrolling down further provides more information on capacity usage, including snapshots.
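The arithmetic behind a what-if estimate can be sketched roughly as follows. This is an illustrative Python sketch, not how the vSphere Client computes its figures. The function name, the policy labels, and the example capacity value are assumptions made for this example. The multipliers are the nominal raw-capacity overheads of each layout and ignore metadata and reserved capacity.

```python
# Nominal raw-capacity multipliers per fault tolerance layout.
# The labels are hypothetical; real storage policies carry more rules than this.
POLICY_OVERHEAD = {
    "RAID-1 FTT=1": 2.0,   # two full copies of the data
    "RAID-1 FTT=2": 3.0,   # three full copies of the data
    "RAID-5 (4+1)": 1.25,  # four data components plus one parity
    "RAID-6 (4+2)": 1.5,   # four data components plus two parity
}


def effective_free_gb(raw_free_gb: float, policy: str) -> float:
    """Estimate how much new VM data fits in the raw free space under a policy."""
    return raw_free_gb / POLICY_OVERHEAD[policy]


if __name__ == "__main__":
    raw_free = 12_000.0  # assumed example value, in GB
    for name in POLICY_OVERHEAD:
        print(f"{name}: ~{effective_free_gb(raw_free, name):,.0f} GB effective free")
```

Comparing the results for two policies gives a rough sense of why the effective free space shown by the What if Analysis changes when a different policy is selected.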

Storage Policies

Storage policies dictate the resilience and capacity consumption levels for objects stored on a vSAN datastore. While virtual machine disks (VMDKs) are the most common objects on a vSAN datastore, other object types exist. These include snapshots, the vSAN performance database, and first-class disks used with Kubernetes containers.

Click the three horizontal bars in the top left corner of the UI and select Policies and Profiles.

Scroll down the list of storage policies and check the box for vSAN ESA Default Policy - RAID6.  Details about this storage policy will appear under the list of storage policies.

Many rules can be included in a storage policy. The two main rules most administrators are concerned with are Site Disaster Tolerance and Failures to Tolerate. The Site Disaster Tolerance rule defines data placement for a standard vSAN cluster, a vSAN 2-node cluster, or a vSAN stretched cluster.

When you edit the storage policy, this rule offers additional settings for more specific data placement.

Note:  In this TestDrive walkthrough, the Edit option is not available, so pictures of the settings have been included here for your reference.

For example, selecting Site Mirroring - Stretched Cluster instructs vSAN to place copies of the data at both sites to enable recovery from a site failure. The vSAN documentation contains more information on the effects of each policy rule.

Failures to Tolerate specifies how many drive or host failures can be tolerated before the data becomes inaccessible. RAID-6 (Erasure Coding) distributes the data to maintain accessibility, even if two hosts in the cluster are offline. The erasure coding fault tolerance method is preferred over mirroring with vSAN ESA as it minimizes capacity consumption while maintaining resilience and performance.

Storage policies are assigned to virtual machines and individual virtual disks. Policy assignments can be changed without disruption. This enables precise management of capacity and resilience on a per-VM basis.

vSAN Skyline Health

vSAN makes monitoring a cluster's health easy through 50+ metrics that check for issues with networking, configuration, performance, capacity, object health, hardware compatibility, firmware versions, etc. These metrics are consolidated into a cluster health score so the condition of the cluster can be easily verified. It is recommended to check the cluster health before and after making changes to a cluster, including vSphere upgrades and patches, changing host hardware, and making modifications to the network.

To access vSAN Skyline Health, click the three bars in the top left corner, then select Inventory to return to the cluster view.

While still in the vSAN cluster, click the Monitor tab, and scroll down to Skyline Health under vSAN.

If there is an issue, Skyline Health aids with troubleshooting by providing information on why the issue is occurring and guidance for fixing the problem.

The Ask VMware link takes you to the relevant VMware Knowledge Base article.

Performance

vSAN provides robust performance monitoring capabilities. You can view metrics at the cluster, host, drive, and virtual machine levels. The time range can be real-time (current) or historical. You also have the option to save performance information for future analysis.

While still in the Monitor section of the vSphere Client, click Performance under vSAN.

We are currently viewing performance information at the cluster level. With VM highlighted, use the drop-down menu to select Cluster Level Metrics, Top Contributors, or Show Specific VMs. Scroll down to see data, such as IOPS, throughput, and latency. Take a few moments to select various options. You can also modify the Time Range if desired.
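When reading these graphs, it can help to remember how the three metrics relate. The following Python sketch is a generic illustration, not something taken from vSAN itself: throughput is IOPS multiplied by the I/O size, and Little's law gives the average number of I/Os in flight from IOPS and latency. The function names and the sample numbers in the comments are assumptions.

```python
def throughput_mib_per_s(iops: float, io_size_kib: float) -> float:
    """Throughput implied by an IOPS figure at a given I/O size."""
    return iops * io_size_kib / 1024


def outstanding_io(iops: float, latency_ms: float) -> float:
    """Little's law: average I/Os in flight = completion rate x time per I/O."""
    return iops * latency_ms / 1000


# Example: a workload doing 10,000 IOPS of 8 KiB I/O at 2 ms latency
# (assumed numbers, not measurements from the TestDrive cluster).
mib_s = throughput_mib_per_s(10_000, 8)
in_flight = outstanding_io(10_000, 2)
```

A rise in latency with flat IOPS therefore implies more I/Os queued in flight, which is one reason the graphs are worth reading together rather than one at a time.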

Backend refers to the performance of the vSAN cluster from a backend perspective (versus “frontend” virtual machine metrics). For example, vSAN might need to update or rebuild data because of a drive failure. The traffic generated by this activity would show on the Backend graphs.

Host-level metrics are available after selecting a host in the vSAN cluster. Click one of the hosts in the vSAN ESA cluster, click Monitor, scroll down to vSAN, and click Performance.

Many metrics can be observed at the host level, including VMs on that host, drive performance, and networking details. Click through the various options to better understand the information available for vSAN at the host level.

vSAN performance data for individual virtual machines are also available. Select a virtual machine from the inventory, click Monitor if needed, scroll down to vSAN, and click Performance.

This lets you see performance information at the virtual machine and virtual disk levels, including IOPS, throughput, and latency.

vSAN provides comprehensive views of performance data from the cluster level down to individual virtual disks. This makes it easy to monitor your workloads and troubleshoot performance issues should they arise.

Data Placement for Resilience

vSAN automatically replicates and distributes data across the cluster to ensure resilience against maintenance operations and hardware failures (planned and unplanned downtime). The storage policy assigned to the virtual machine determines the fault tolerance method (mirroring or erasure coding) and how many failures can be tolerated before the data becomes inaccessible.

Let’s consider a virtual disk with a RAID-6 erasure coding policy assigned. This policy protects against two failures while minimizing capacity consumption. Data is stored in vSAN components, which can be up to 255GB in size. Multiple components are used to scale the data storage and distribute the data across drives and hosts to comply with the assigned storage policy. In the example below, a 100GB virtual disk has an assigned RAID-6 erasure coding policy. Six components are created for this virtual disk. More than 50% (four or more of the six) of the erasure-coded components must be healthy to access the data they contain. The components are distributed across six hosts in the cluster, so the virtual disk is still accessible if one or two hosts are offline.

You can explore component placement by selecting the vSAN cluster, clicking Monitor, scrolling down to vSAN, and clicking Virtual Objects.
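The numbers in this example can be checked with a few lines of Python. This is a minimal sketch that assumes the standard RAID-6 layout of four data components plus two parity components, with one component per host. The function names are hypothetical, and metadata overhead is ignored.

```python
DATA_COMPONENTS = 4    # RAID-6 stripes data across four components
PARITY_COMPONENTS = 2  # plus two parity components
TOTAL_COMPONENTS = DATA_COMPONENTS + PARITY_COMPONENTS


def raw_capacity_gb(vmdk_gb: float) -> float:
    """Raw capacity consumed by a virtual disk under RAID-6 (4+2)."""
    return vmdk_gb * TOTAL_COMPONENTS / DATA_COMPONENTS


def is_accessible(hosts_offline: int) -> bool:
    """With one component per host, data stays readable while four of six survive."""
    healthy = TOTAL_COMPONENTS - hosts_offline
    return healthy >= DATA_COMPONENTS


# The 100GB virtual disk from the example consumes about 150GB of raw capacity
# and remains accessible with up to two hosts offline.
consumed = raw_capacity_gb(100)
still_up = [is_accessible(n) for n in range(4)]
```

For comparison, tolerating two failures with RAID-1 mirroring would require three full copies, or roughly 300GB of raw capacity for the same disk.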

Select a virtual machine from the list and click View Placement Details.

Notice that there is another set of components for each object. These components are small, and they are used for caching. vSAN ESA uses a log-structured file system to reduce IO amplification and maximize performance. Data is written to the caching tier or “performance leg” of vSAN ESA before being de-staged to the components we saw previously in the “capacity leg.” Data is mirrored in the performance leg to provide very fast write acknowledgment. This method also prepares the data to be saved optimally and space-efficiently while reducing wear on the NVMe storage devices.
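The write path described above can be pictured with a toy model. The Python sketch below is a deliberate simplification and not vSAN's implementation: the class name, the stripe size, and the rule of de-staging only whole stripes are assumptions chosen to illustrate the idea. It omits mirroring, parity calculation, and the fact that de-staging happens asynchronously in a real system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyObject:
    """Toy model: buffer small writes, then de-stage them as whole stripes."""
    stripe_bytes: int                                          # assumed full-stripe size
    performance_leg: List[bytes] = field(default_factory=list)  # buffered writes
    capacity_leg: List[bytes] = field(default_factory=list)     # full stripes written out

    def write(self, payload: bytes) -> str:
        # The write lands in the performance leg, which is what allows a fast
        # acknowledgment. De-staging is run inline here only to keep the toy short.
        self.performance_leg.append(payload)
        self._destage_full_stripes()
        return "ack"

    def _destage_full_stripes(self) -> None:
        buffered = b"".join(self.performance_leg)
        # Move data to the capacity leg only in whole stripes, so each stripe
        # is written once instead of being updated piecemeal.
        while len(buffered) >= self.stripe_bytes:
            self.capacity_leg.append(buffered[:self.stripe_bytes])
            buffered = buffered[self.stripe_bytes:]
        self.performance_leg = [buffered] if buffered else []


obj = ToyObject(stripe_bytes=8)
obj.write(b"abc")    # stays buffered: less than one stripe
obj.write(b"defgh")  # completes a stripe, which is de-staged
```

Accumulating small writes and flushing them as whole stripes is what lets a log-structured design avoid the read-modify-write cycles that erasure coding would otherwise incur on small updates.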

vSAN Services

vSAN provides multiple services that enhance functionality and suit various use cases. Some services, such as the Performance Service (performance monitoring) and the Historical Health Service, are enabled by default. Others must be enabled to use them.

Select the vSAN cluster, click Configure, scroll down to vSAN, and click Services.

Scroll down on the right side of the UI to see the services.

vSAN File Service provides the SMB, NFS v3, and NFS v4.1 protocols for sharing files. It comprises three parts: the vSAN Distributed File System, which provides the underlying scalable filesystem by aggregating vSAN objects; a Storage Services Platform, which provides resilient file server endpoints; and a control plane for deployment, management, and monitoring. File shares are integrated into the existing vSAN Storage Policy-Based Management (SPBM) framework, and policies are applied on a per-share basis. vSAN File Service enables hosting the file shares directly on the vSAN cluster. File Services VMs running in the cluster manage the file shares and contain a file server that provides NFS and SMB services.

Data Services include space efficiency and encryption. Deduplication and compression can be enabled to reduce the amount of raw capacity consumed by the workloads running on the vSAN datastore. Compression can be added to a storage policy to provide per-VM management of data compression. Encryption can be enabled for data at rest and in transit to provide an additional layer of security.

See the vSAN documentation for more details on these and other vSAN services.

Conclusion

vSAN ESA combines the space efficiency of RAID-5/6 erasure coding with the performance capabilities of RAID-1 mirroring. It introduces Adaptive RAID-5 erasure coding, which guarantees space savings even in clusters with as few as three hosts. Additionally, storage policy-based data compression offers significantly better compression ratios than vSAN’s original storage architecture.
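The idea of Adaptive RAID-5 can be sketched as a simple selection rule. The host-count thresholds below (a 2+1 layout for three to five hosts, a 4+1 layout for six or more) are an assumption not stated in this walkthrough and should be checked against the vSAN documentation. The function names are hypothetical.

```python
from typing import Tuple


def adaptive_raid5_layout(host_count: int) -> Tuple[int, int]:
    """Return (data, parity) component counts for a RAID-5 policy.

    Thresholds are assumed for illustration: 2+1 on small clusters,
    4+1 once the cluster has six or more hosts.
    """
    if host_count < 3:
        raise ValueError("RAID-5 needs at least three hosts")
    if host_count >= 6:
        return (4, 1)
    return (2, 1)


def capacity_multiplier(data: int, parity: int) -> float:
    """Raw capacity consumed per unit of data stored."""
    return (data + parity) / data


small = capacity_multiplier(*adaptive_raid5_layout(3))  # 2+1 layout
large = capacity_multiplier(*adaptive_raid5_layout(8))  # 4+1 layout
```

Under these assumptions, both layouts consume less raw capacity than the 2x required by RAID-1 mirroring with one failure to tolerate, which is the space saving the paragraph above refers to.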

To enhance data security, vSAN ESA incorporates encryption measures that safeguard data during transit and at rest while imposing minimal overhead. Furthermore, it features adaptive network traffic shaping, prioritizing VM performance during resynchronizations.

Finally, by eliminating dedicated cache devices and adopting a flexible architecture utilizing a single tier, vSAN ESA substantially reduces the total cost of ownership (TCO) and provides an HCI platform suitable for nearly all workloads.