AutoVerify Troubleshooting Guide - x360Recover BDR

Written By Tami Sutcliffe (Super Administrator)

Updated at July 14th, 2021

Overview

AutoVerify is a new and improved method of validating and testing the integrity and recoverability of protected systems. It was released with x360Recover version 8.3.0.

With AutoVerify, each nightly Boot VM check performs a series of deep system integrity checks within the running virtual machine, to ensure that the protected system backup is both healthy and ready to recover in the event of a disaster.

AutoVerify provides a more thorough backup integrity check than the legacy Boot VM Check.  It also has the advantage of performing automated validation of the results rather than relying on manual interpretation.

To resolve issues reported by AutoVerify, consult this guide or contact Axcient Support for assistance.

This guide describes the issues that AutoVerify may report and explains how to troubleshoot and correct them.

Understanding the AutoVerify Process

AutoVerify is an enhancement to the existing nightly Boot VM check process.

When Boot VM checks are enabled for a protected system (this is strongly recommended), AutoVerify checks are performed as part of the process.

Before launching the Virtual Machine (VM), a communications driver is injected into the protected system image.  Once the VM is booted, the appliance communicates with the running system and issues commands for Windows to perform specific tests.

This process can fail at multiple points along the way:

  • The appliance may be unable to start the VM
  • AutoVerify may be unable to modify the snapshot image
  • Communications with the VM may fail
  • Specific AutoVerify test cases may fail
  • The source protected system may have corrupted data 

Tests Performed by AutoVerify

As of the 8.3.0 release, there are two primary tests performed by AutoVerify:

  • Heartbeat – Successful heartbeat tests indicate that the appliance was able to communicate with the protected system VM via the communications layer
  • Chkdsk – Successful chkdsk tests indicate that all protected system disk volumes present within the backup set were successfully scanned for file system problems and no problems were found

If a chkdsk test fails for a protected system, the appliance automatically schedules a new full backup scan (if one has not been performed in the last 90 days) in an attempt to self-heal the failure. A full backup scan synchronizes the protected system with the backup image by examining the entire file system of every disk volume. Only blocks that differ between the source protected system and the latest backup are written.
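The self-heal rule described above can be sketched as a small decision function. This is only an illustration of the stated rule, not the appliance's actual implementation: the function name, its parameters, and the handling of the exact 90-day boundary are assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional

# From the guide: a full scan is scheduled only if none ran "in the last 90 days".
FULL_SCAN_WINDOW = timedelta(days=90)


def should_schedule_full_scan(chkdsk_passed: bool,
                              last_full_scan: Optional[datetime],
                              now: datetime) -> bool:
    """Return True when a failed chkdsk test should trigger a full backup scan.

    Hypothetical helper: how the appliance treats a scan that is exactly
    90 days old is not documented, so the boundary here is an assumption.
    """
    if chkdsk_passed:
        return False  # nothing to heal
    if last_full_scan is None:
        return True   # no full scan on record
    return now - last_full_scan > FULL_SCAN_WINDOW
```

For example, a failed chkdsk test with a full scan from two weeks ago would not trigger another scan, while one with a full scan from six months ago would.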

Troubleshooting Failures

A global overview of AutoVerify status can be found on the Health or Trouble report pages of the Global Management Portal. 


Detailed status results can be found on the Details page of each protected system.


  • Note: Ensure that the installed agent version is 2.23 or higher.
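When checking agent versions in a script, compare the numeric components rather than the raw text, because "2.3" is older than "2.23" even though it sorts higher as a string or a decimal number. The sketch below assumes the agent version is a plain dotted string such as "2.23" or "2.23.1"; the helper names are hypothetical and not part of any Axcient tool.

```python
from typing import Tuple

# From the guide: agent version 2.23 or higher is required.
MINIMUM_AGENT_VERSION = (2, 23)


def parse_version(text: str) -> Tuple[int, ...]:
    """Convert a dotted version string such as '2.23.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def agent_meets_minimum(version_text: str) -> bool:
    """Return True if the version is at least MINIMUM_AGENT_VERSION."""
    return parse_version(version_text) >= MINIMUM_AGENT_VERSION
```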

Clicking on the results displays a detailed breakdown of the steps performed during the check.

Possible Failure Results 

AutoVerify Failed to Initiate

In this case, the x360Recover BDR was unable to perform a test for the protected system.

Possible causes might be:

  • Failure to inject communications drivers into snapshot
  • Failure to delete an existing snapshot clone
  • Snapshot is already virtualized and running

Examine the protected system on the BDR and test the StartVM feature.  If the failure reason is not obvious, contact Axcient Support for assistance with further troubleshooting.

Heartbeat Failed

The heartbeat check verifies that the Virtual Machine communication driver is able to connect to the running operating system.

Possible reasons for failure: 

  • The protected system image is corrupt or otherwise unable to boot as a VM
  • The heartbeat communication service was blocked by an antivirus scanner
  • Startup of the operating system took too long and the operation timed out
  • Long-running pending Windows updates caused a timeout

1. Examine the protected system on the BDR and test the StartVM feature.

2. If pending Windows update installation is causing long delays, reboot the source protected system to allow the Windows update installation to complete, and then run a new incremental backup of the system.

3. If the protected system will not boot, exhibits a blue-screen crash, or has other errors, contact Axcient Support for assistance in troubleshooting the underlying Boot VM failure.

  • Some antivirus scanners (notably SentinelOne) will sometimes block the execution of the heartbeat communication and control service. If AutoVerify Heartbeat checks are failing, try adding an antivirus scanning exclusion for %systemroot%\system32\axecomsvc.exe on the protected system. Note that this service is only installed within the virtual machine image during an AutoVerify operation and is not otherwise present on the machine.

chkdsk Failed

The chkdsk test is the most complex to troubleshoot, as it can fail for a variety of reasons.

(chkdsk is a Windows utility that checks the integrity of a file system.)

The backup image data could be corrupt on the BDR, or the source protected system could have file system errors.

In rare cases, the Volume Shadow Copy Service (VSS) may not start within the virtual machine. Chkdsk requires a VSS snapshot to check the running file system, so a VSS failure causes a failed AutoVerify check. In this event, the state of the file system is essentially unknown because the chkdsk results are invalid.

To troubleshoot chkdsk failures:

  • First, examine the detailed output from chkdsk and identify the specifics of the failure. Identify the failing volume(s) and run chkdsk (without /f) on the source protected system to determine whether the problem lies with the backup or with the protected system.
  • If the problem is on the source system, run chkdsk on the affected volume(s) (with /f) and schedule a reboot, if necessary, to complete the repair.
  • After repairing the file system and verifying that chkdsk completes without reporting errors, perform a full backup scan of the protected system by selecting Schedule Now and choosing Full from the Details page on the appliance.
  • If the source protected system tests clean, attempt to repair the problem by performing a full backup scan of the protected system to synchronize any differences between it and the backup image.  Select Schedule Now and choose Full from the Details page on the appliance.  Note:  A full scan may already have been triggered automatically by AutoVerify. Check the jobs log before starting another full scan.
  • Verify that the protected system is running at least agent version 2.23. Earlier agent versions have a known bug that can sometimes cause false chkdsk warnings due to incomplete NTFS metadata. If you need to upgrade a protected system manually to agent 2.23 or higher, run a new incremental backup after the upgrade.
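The triage steps above can be summarized as a short sketch. The chkdsk invocations (a volume letter, optionally followed by /f) are standard Windows usage; the function names, the example volume, and the wording of the returned steps are illustrative assumptions only.

```python
from typing import List


def chkdsk_command(volume: str, fix: bool = False) -> List[str]:
    """Build a chkdsk command line: read-only by default, repair mode with /f."""
    command = ["chkdsk", volume]
    if fix:
        command.append("/f")
    return command


def next_steps(source_has_errors: bool, full_scan_already_queued: bool) -> List[str]:
    """Return the follow-up actions from the guide for a failed chkdsk test.

    Hypothetical helper: the inputs are judgments made by the technician
    after reading the chkdsk output and the jobs log.
    """
    if source_has_errors:
        return [
            "Run chkdsk with /f on the affected volume(s); reboot if required",
            "Re-run chkdsk and confirm it reports no errors",
            "Select Schedule Now and choose Full on the Details page",
        ]
    if full_scan_already_queued:
        return ["Wait for the full scan already triggered by AutoVerify"]
    return ["Select Schedule Now and choose Full on the Details page"]
```

For instance, `chkdsk_command("C:")` gives the read-only check used to decide where the problem lies, and `chkdsk_command("C:", fix=True)` gives the repair form used only when the source system is at fault.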

If subsequent AutoVerify chkdsk tests fail, open a ticket with Axcient Support for assistance.

If you see the following message in the chkdsk output, the VSS services were unable to start: "The volume is in use by another process. Chkdsk might report errors when no corruption is present." In this case, open a support ticket with Axcient Support.
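If you collect chkdsk output in bulk, a simple text match can flag results that are inconclusive because of this VSS condition. The sketch below is a minimal example that looks for the first sentence of the message quoted in this guide; the function name is hypothetical, and the match assumes the output is in English.

```python
# First sentence of the message quoted in this guide.
VSS_FAILURE_MARKER = "The volume is in use by another process"


def chkdsk_result_is_inconclusive(chkdsk_output: str) -> bool:
    """Return True if the output contains the VSS 'volume in use' warning.

    Whitespace is collapsed and case is ignored so that line wrapping in
    the captured output does not hide the message.
    """
    normalized = " ".join(chkdsk_output.split()).lower()
    return VSS_FAILURE_MARKER.lower() in normalized
```

A match means the chkdsk findings should not be trusted either way, so the next step is a support ticket rather than a repair attempt.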

Other Boot VM Check Failures

For any other issues with AutoVerify or Boot VM checks in general, please contact Axcient Support for assistance.