Enterprises running hypervisors on hyper-converged infrastructure (HCI) systems typically have backup options available to them that are not available to those running on generic hardware. Such customers may also have additional backup challenges depending on the HCI vendor and hypervisor they have chosen. Let’s take a look.
When backing up physical servers you might run a full backup on all of them at one time, and that’s fine. It’s an entirely different situation if all of those servers are actually virtual machines sharing the same physical server. Even running several full-file incremental backups (backing up the entire file even if only one byte has changed) at the same time can significantly affect the performance of your hypervisor. That’s why most customers using server-virtualization products such as VMware or Hyper-V have switched to more hypervisor-friendly methods.
Options include block-level incremental or source-side deduplication methods. While they're not specifically designed for VM backups, they can be very helpful because both reduce the I/O requirements of a backup by an order of magnitude or more. They also make it possible to run VM-level backups without impacting the overall performance of the hypervisor. One downside is that this approach reduces the efficiency of virtualization, because it requires installing and maintaining client software on each VM. That's why most people backing up VMs opt for hypervisor-level backups.
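To make the source-side deduplication idea concrete, here is a minimal sketch in Python. The chunking scheme, class name, and in-memory "server" are all hypothetical simplifications (real products use variable-size chunking and a networked dedupe store); the point is that the client hashes each chunk and ships only chunks the server has not already seen.

```python
import hashlib

CHUNK_SIZE = 4096  # illustrative fixed-size chunks; real products use variable-size chunking


class DedupBackupClient:
    """Hypothetical sketch of source-side deduplication: hash each chunk
    locally and transfer only chunks the backup server has never seen."""

    def __init__(self):
        self.server_store = {}  # hash -> chunk bytes (stands in for the backup server)
        self.bytes_sent = 0     # bytes actually transferred "over the wire"

    def backup(self, data: bytes):
        manifest = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.server_store:  # only new data crosses the network
                self.server_store[digest] = chunk
                self.bytes_sent += len(chunk)
            manifest.append(digest)
        return manifest  # the ordered list of hashes is enough to restore

    def restore(self, manifest):
        return b"".join(self.server_store[d] for d in manifest)
```

Backing up nearly identical data a second time transfers almost nothing, which is why the I/O reduction can reach an order of magnitude or more.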
Hypervisor-level backups utilize software or APIs at the hypervisor level. Each major hypervisor offers such an API. Backup systems interfacing with these APIs are typically able to ask the hypervisor for the blocks that have changed since the last successful backup. The backup system backs up only those changed blocks, significantly reducing the I/O requirement as well as the amount of CPU needed to identify and locate changed blocks. The combined effect of these two features significantly reduces the impact of backups on the hypervisor.
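A rough sketch of how a backup product consumes such a changed-block-tracking API is shown below. The `MockHypervisor` class and its change-ID scheme are hypothetical stand-ins (VMware's real API, for instance, is `QueryChangedDiskAreas` in the vSphere SDK); the shape of the conversation is what matters: the backup system presents the change ID from its last backup, receives only the blocks modified since then, and records the new change ID for next time.

```python
class MockHypervisor:
    """Hypothetical stand-in for a hypervisor changed-block-tracking (CBT)
    API. Block numbers and the change-ID scheme are illustrative only."""

    def __init__(self, disk_blocks):
        self.disk = dict(enumerate(disk_blocks))  # block number -> data
        self.change_log = []                      # (change_id, block_number)
        self.next_change_id = 1

    def write_block(self, n, data):
        self.disk[n] = data
        self.change_log.append((self.next_change_id, n))
        self.next_change_id += 1

    def query_changed_blocks(self, since_change_id):
        """Return block numbers modified after the given change ID, plus the
        current change ID for the backup system to record."""
        changed = sorted({n for cid, n in self.change_log if cid > since_change_id})
        return changed, self.next_change_id - 1


def incremental_backup(hv, last_change_id):
    """Ask the hypervisor which blocks changed, then copy only those."""
    changed, new_id = hv.query_changed_blocks(last_change_id)
    return {n: hv.disk[n] for n in changed}, new_id
```

The hypervisor does the bookkeeping as writes happen, so the backup system never has to scan the whole disk to find changes, which is where the CPU savings come from.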
Some storage products have connected hypervisor-backup APIs with their snapshot capability to create a backup methodology. All customers need to do is put their datastore on the storage system in question and provide the appropriate level of authentication into the hypervisor. At the agreed-upon schedule, the snapshot system interfaces with the hypervisor, places the various VMs in the appropriate backup mode, and takes a storage-level snapshot. The snapshot takes only a few seconds to create, after which the VMs can be taken out of backup mode. This is faster than the previous backup methods and has a lower impact on hypervisor performance.
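The sequence above can be sketched as orchestration code. Everything here is a hypothetical mock (the class names, the `in_backup_mode` flag, and the array's `snapshot` call are illustrative, not any vendor's API); the key property being demonstrated is that the VMs are quiesced only for the seconds the snapshot itself takes, and are guaranteed to come back out of backup mode even if the snapshot fails.

```python
import contextlib


class VM:
    def __init__(self, name):
        self.name = name
        self.in_backup_mode = False  # quiesced, application-consistent state


class StorageArray:
    def __init__(self):
        self.snapshots = []

    def snapshot(self, label):
        # A real array snapshot is near-instant: it freezes block pointers
        # rather than copying data.
        self.snapshots.append(label)


@contextlib.contextmanager
def backup_mode(vms):
    """Hold VMs in backup mode only as long as the enclosed block runs."""
    for vm in vms:
        vm.in_backup_mode = True
    try:
        yield
    finally:
        for vm in vms:  # always released, even if the snapshot raises
            vm.in_backup_mode = False


def snapshot_backup(array, vms, label):
    with backup_mode(vms):
        array.snapshot(label)  # VMs are quiesced only during this call
```

Because the window of impact is just the snapshot call itself, this approach can run far more often than a traditional backup, tightening recovery points.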
The snapshots would need to be replicated to another storage system in order to be considered a valid backup. Such replication typically requires very little bandwidth and CPU and is relatively easy to accomplish. This allows enterprises using this backup method to have both an on-premises and off-premises copy without ever having to perform what most people consider to be a backup.
Snapshot-based backups – as long as they are replicated to another location – offer some of the fastest recovery time objectives and tightest recovery point objectives in the industry. One downside to using them is that they traditionally require a separate storage product, one that might be quite expensive.
Many hyper-converged infrastructure systems take care of this downside by bundling compute, network, and storage in a single package that also typically includes snapshot-based data-protection mechanisms. They use the snapshot-based backup method but without requiring a separate storage system. This single, integrated system makes it easier to create and manage VMs while also ensuring that backups are happening. Instead of compute, networking, storage, and backup systems from four different vendors, the HCI world offers a single vendor that accomplishes all of that. This is a contributing reason why many companies, especially smaller ones, have really taken to HCI.
Some take integrated data protection even further, and integrate these backups into the cloud, providing a DR function as well. This allows you to recover your entire datacenter to the cloud, without ever running a traditional backup or replicating data the way you would in a typical DR scenario.
Some HCI vendors do not use Hyper-V or VMware. For example, Scale Computing uses the KVM hypervisor and Nutanix uses the Acropolis Hypervisor (AHV), although Nutanix also supports VMware. The potential concern here is whether these hypervisors have the same level of data-protection APIs offered by VMware and Hyper-V and whether backup vendors write to those APIs.
Customers using HCI vendors built on nonmainstream hypervisors have two basic choices for data protection: find a backup-software vendor that supports the hypervisor or use the integrated data-protection features available in the HCI product. A few vendors address the backup needs of this market. The integrated snapshot-based backup systems available in both Scale Computing and Nutanix are on par with the snapshot-based backup systems found in other HCI platforms.
The integrated data-protection and disaster-recovery features from some HCI vendors meet or exceed what is possible using third-party tools. Such companies argue that it’s simply one more thing they are simplifying, and that’s a solid argument. If a single product could meet your compute, networking, and storage needs, while also making sure you’re protected in case of an outage or disaster – that’s a compelling offer.
This story, "How to backup hyperconverged infrastructure," was originally published by
W. Curtis Preston is an expert in backup, storage and recovery, having worked in the space since 1993. He has been an end-user, consultant, and analyst, and has recently joined the team at Druva, a cloud-based data protection company.
Copyright © 2020 IDG Communications, Inc.