Local cache verification for D2C - x360Recover

Written By Tami Sutcliffe (Super Administrator)

Updated at December 14th, 2021

x360Recover agent release 2.31 (in conjunction with x360Recover release 10.11.0) now incorporates automated local cache health verification and also provides automated "self-healing" for missing local cache data. 

What is local cache verification

Before x360Recover v.10.11, the health status of x360Recover local cache was relatively invisible, running in the background with limited status reporting.

By providing local cache health verification, x360Recover now makes this information visible in the user interface and, additionally, performs automated self-healing for any data missing in the local cache, ensuring a reliable recovery experience.

Review the full details on local cache here

How does verification work?

Local cache verification efficiently compares (a) the content index of the local cache repository with (b) the change block hash table (used in every backup to indicate which blocks and data have already been sent to the backup server).

A single CPU core can compare about 1 TB of data per minute, so the total verification process requires very little time to complete on almost any system.
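The comparison described above can be sketched roughly as follows. This is only an illustration, not the agent's actual implementation: the data structures (a set of hashes for the cache index, a block-number-to-hash mapping for the hash table) and the function names find_missing_blocks and estimated_minutes are assumptions made for this example.

```python
from typing import Dict, List, Set


def find_missing_blocks(cache_index: Set[str], block_hashes: Dict[int, str]) -> List[int]:
    """Return the block numbers whose hash is not present in the cache index."""
    return sorted(num for num, digest in block_hashes.items() if digest not in cache_index)


def estimated_minutes(data_tb: float, tb_per_minute: float = 1.0) -> float:
    """Rough verification time at the article's figure of about 1 TB per minute per core."""
    return data_tb / tb_per_minute


# Hypothetical example data: three blocks on disk, two already in the cache.
cache_index = {"a1", "b2"}
block_hashes = {0: "a1", 1: "b2", 2: "c3"}
print(find_missing_blocks(cache_index, block_hashes))  # [2]
print(estimated_minutes(4.0))                          # 4.0
```

Because only index entries and hashes are compared, rather than the block data itself, a check of this shape stays fast even on large volumes.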

How does "self-healing" work?

If missing blocks are found (e.g. data on the protected system hard drive that is not already in the local cache), a bitmap of missing blocks is created and saved. The x360Recover agent will process the bitmap during the next backup cycle. Any missing blocks will be populated to the local cache repository to ‘self-heal’ the missing data.
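As a minimal sketch of the bitmap idea, assuming one bit per block and a simple byte array as storage (the on-disk format the agent really uses is not described here, and the names build_bitmap and blocks_to_heal are hypothetical):

```python
from typing import Iterable, Iterator


def build_bitmap(missing_blocks: Iterable[int], total_blocks: int) -> bytearray:
    """Set one bit for every block number that verification found missing."""
    bitmap = bytearray((total_blocks + 7) // 8)
    for num in missing_blocks:
        bitmap[num // 8] |= 1 << (num % 8)
    return bitmap


def blocks_to_heal(bitmap: bytearray) -> Iterator[int]:
    """Yield the block numbers flagged in the bitmap, in ascending order."""
    for byte_index, byte in enumerate(bitmap):
        for bit in range(8):
            if byte & (1 << bit):
                yield byte_index * 8 + bit


# Verification flags blocks 1, 9 and 10 out of 16 (hypothetical numbers).
saved = build_bitmap([1, 9, 10], total_blocks=16)
# During the next backup cycle, the flagged blocks would be copied into the cache.
print(list(blocks_to_heal(saved)))  # [1, 9, 10]
```

A bitmap keeps the saved state small (one bit per block), which is presumably why it is a convenient way to carry the list of missing blocks over to the next backup.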

Why might data be missing? 

Many factors can cause missing blocks, including intermittent write errors to the local cache, or backups taken while the local cache device was detached or unavailable.

What happens next?

The local cache verification and self-healing process then:

  • confirms that all data necessary for a recovery is held within the cache
  • repairs any missing block data
  • scores the results of testing to report it to the backup server

How to configure the agent

Local cache verification adds several new configuration parameters to the agent for enabling and managing local cache verification checks.

Note: By default, verification jobs are run once every 24 hours, preferably not during business hours.

Configuration parameters:

LOCAL_CACHE_VERIFY_FREQUENCY_HOURS: Default: 24, Maximum: 720

This setting specifies that if a verification job has not been performed within the last x hours, a verification job should be performed during the next backup cycle.

LOCAL_CACHE_VERIFY_ALLOW_DURING_BUSINESS_HOURS: Default: False

By default, verification is deferred when it would otherwise run during the ‘Business Hours’ defined by the assigned backup schedule.

However, if verification is overdue by more than 24 hours beyond the configured verification frequency interval (that is, no ‘After-hours’ opportunity to run a verification check has occurred in over 24 hours), this option is overridden and verification proceeds during business hours. This override ensures that verification checks are not blocked indefinitely.
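The interaction between the frequency setting and the business-hours setting can be sketched as a small decision function. This is an interpretation of the rules described above, not the agent's code; the function name should_verify and its parameters are hypothetical.

```python
OVERRIDE_GRACE_HOURS = 24  # overdue margin after which business hours are ignored


def should_verify(hours_since_last: float,
                  in_business_hours: bool,
                  frequency_hours: int = 24,
                  allow_during_business_hours: bool = False) -> bool:
    """Decide whether the next backup cycle should include a verification job."""
    if hours_since_last < frequency_hours:
        return False  # a verification ran recently enough
    if not in_business_hours or allow_during_business_hours:
        return True   # due, and nothing defers it
    # Due but inside business hours: run only once it is more than 24 hours overdue.
    return hours_since_last > frequency_hours + OVERRIDE_GRACE_HOURS


print(should_verify(30, in_business_hours=False))  # True
print(should_verify(30, in_business_hours=True))   # False
print(should_verify(50, in_business_hours=True))   # True
```

With the defaults, a system that is only ever backed up during business hours would still be verified once the last check is more than 48 hours old.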

LOCAL_CACHE_VERIFIY_ALLOW_SELF_HEALING: Default: True

When enabled, any ‘missing’ blocks discovered during verification will be flagged for the agent to incorporate during the next backup.

This option exists for troubleshooting purposes only.

To ensure your local cache data is kept up to date and reliable, this option should not be disabled.

Note: We recommend keeping the default agent settings for most partners. 

Currently, these settings can only be set manually in the agent aristos.cfg configuration file on the protected system. 

A future enhancement to agent orchestration will add these parameters to the UI for remote management where necessary.


Local cache reporting and monitoring

Scan results from local cache verification are sent to the backup server for reporting. 

The results of the latest local cache verification are displayed on the Protected Systems page in the Status column.

Additional details about the local cache verification are displayed on the Protected System Details page.


What do the status icons mean?

Green indicates that local cache testing passed successfully.
Yellow indicates that local cache testing has passed successfully in the past, but the most recent successful test is 48 to 72 hours older than the scheduled testing interval. (Beyond 72 hours, the status is elevated to red.)
Red indicates that the most recent local cache test failed (more than 1% of total data blocks on the protected system were not found in the local cache), or that the most recent successful test is more than 72 hours older than the scheduled testing interval. Missing blocks should be pushed into the cache on the next backup, and the next local cache verification test should then be successful.
Gray indicates that (a) local cache is not enabled for this system, or (b) local cache verification testing has not yet been performed.
  • Direct-to-Cloud endpoints are expected to have local cache enabled. 
  • Non-Direct-to-Cloud endpoints are not expected to have local cache enabled and will not show the gray status icon.
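The icon rules for a Direct-to-Cloud endpoint can be summarized in a short sketch. The function cache_status and its inputs are hypothetical, and two details are assumptions rather than documented behavior: red conditions are checked before yellow, and a test is treated as passed when no more than 1% of blocks are missing.

```python
from typing import Optional

FAIL_MISSING_FRACTION = 0.01  # "more than 1%" of blocks missing
WARN_OVERDUE_HOURS = 48       # yellow window starts here (boundary handling assumed)
FAIL_OVERDUE_HOURS = 72       # beyond this, the status is elevated to red


def cache_status(cache_enabled: bool,
                 latest_missing_fraction: Optional[float],
                 hours_since_last_success: Optional[float],
                 frequency_hours: int = 24) -> str:
    """Map verification results to the icon color shown in the Status column."""
    if not cache_enabled or latest_missing_fraction is None:
        return "gray"  # cache disabled, or verification has never run
    # Hours by which the last success exceeds the configured interval.
    overdue = (float("inf") if hours_since_last_success is None
               else hours_since_last_success - frequency_hours)
    if latest_missing_fraction > FAIL_MISSING_FRACTION or overdue > FAIL_OVERDUE_HOURS:
        return "red"
    if overdue >= WARN_OVERDUE_HOURS:
        return "yellow"
    return "green"


print(cache_status(True, 0.0, 2))    # green
print(cache_status(True, 0.0, 80))   # yellow
print(cache_status(True, 0.05, 2))   # red
```

With the default 24-hour interval, the yellow window in this sketch runs from 72 to 96 hours since the last successful test.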

Troubleshooting

Local cache is disabled

The status icon will be gray if you have not yet enabled local cache for your Direct-to-Cloud (D2C) endpoints (or if local cache verification has not yet been run). 

  • D2C is intended to leverage local cache. This allows you to perform rapid recovery and virtualization using Recovery Center and provide the full 'no-hardware BDR experience' to protected systems. If you have not yet configured local cache for your endpoints, refer to this knowledgebase article for assistance: Local cache for D2C

If you have already configured local cache and are receiving a gray icon, verify that the agent installed on the endpoint is version 2.31.877 or higher.

Local cache warning: The most recent successful test is 48-72 hours older than the configured verification testing frequency.

A yellow status icon indicates that (a) at least one successful local cache verification test has been performed in the past, but (b) the most recent successful test is 48-72 hours older than the configured verification testing frequency. 

Example: The default testing interval is every 24 hours. A yellow icon indicates that the most recent successful test is at least 72 hours old (but no more than 96 hours old).

Either no more recent test has been attempted, or the most recent attempt failed. If the most recent successful test is older than 72 hours plus the testing frequency, the status is elevated to failed.

Local cache failed: More than 1% of the total data blocks were found to be missing from the cache during the verification run and/or the last successful test is older than 72 hours plus the testing interval.

A red icon indicates that the local cache is considered to be in a failed state. 

If more than 1% of total data blocks were found to be missing from the cache during the verification run, and/or if the last successful verification is older than 72 hours plus the testing interval, the local cache is considered to be in a failed state.

  • Check to make sure the local cache storage device is accessible and not out of storage space.
  • Contact Axcient Support for additional assistance in troubleshooting local cache failures.

 


 SUPPORT | 720-204-4500 | 800-352-0248
