Disaster Recovery Plan

In this article, we describe the main disaster recovery strategies for the Regula IDV Platform and provide guidance on how to implement them effectively.

Four principal disaster-recovery (DR) strategies are commonly used:

Backup and Restore: Take regular backups of data and applications and restore them after an incident.
Pilot Light: Keep a minimal, always-on core of the environment while other components remain offline until needed.
Warm Standby: Maintain a reduced-capacity but fully functional environment that can be quickly scaled to full production.
Active-Active: Operate multiple fully functional environments in different locations and route traffic between them in real time.

Let's discuss the pros and cons of each strategy.

Backup and Restore

Pros

Simple to implement and manage.
Cost-effective for small environments.
Suitable for both data and application recovery.

Cons

High recovery time objective (RTO) and recovery point objective (RPO): recovery can take hours or even days, and any data written after the last backup is lost.
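
To make the RTO/RPO trade-off concrete, the following minimal sketch estimates both values for a Backup and Restore setup. The figures (backup interval, data size, restore throughput, provisioning time) are hypothetical placeholders, not measurements of the IDV Platform; substitute numbers from your own restore tests.

```python
from dataclasses import dataclass


@dataclass
class BackupPlan:
    backup_interval_hours: float     # time between two consecutive backups
    data_size_gb: float              # amount of data that must be restored
    restore_rate_gb_per_hour: float  # restore throughput (assumed, measure it in a drill)
    provisioning_hours: float        # time to rebuild servers before the restore starts


def worst_case_rpo_hours(plan: BackupPlan) -> float:
    """A failure just before the next backup loses one full interval of data."""
    return plan.backup_interval_hours


def estimated_rto_hours(plan: BackupPlan) -> float:
    """Rebuild the infrastructure first, then copy the data back."""
    return plan.provisioning_hours + plan.data_size_gb / plan.restore_rate_gb_per_hour


# Hypothetical example values.
plan = BackupPlan(backup_interval_hours=24.0, data_size_gb=500.0,
                  restore_rate_gb_per_hour=100.0, provisioning_hours=3.0)
print(worst_case_rpo_hours(plan))  # 24.0
print(estimated_rto_hours(plan))   # 8.0
```

With these assumed inputs, a nightly backup implies up to a day of lost data and roughly eight hours of downtime, which is why the other strategies keep part of the environment running.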

Pilot Light

Pros

Faster recovery time than Backup and Restore.
Cost-effective for small to medium environments.

Cons

Requires more resources than Backup and Restore.

Warm Standby

Pros

Faster recovery time than Pilot Light.
Suitable for both data and application recovery.

Cons

Requires more resources than Pilot Light.
More complex to implement and manage.
Higher cost than Pilot Light.

Active-Active

Pros

Lowest RTO/RPO – near-instant failover.
Can handle large-scale disasters.
Provides high availability and load balancing.
Supports both data and application recovery.

Cons

Most complex to implement and manage.
Requires significant resources and infrastructure.
Highest cost of all strategies.

Implementation Guidance

Regula provides an on-premises solution that can also be deployed in cloud or hybrid environments. Because it is not cloud-native, the DR strategy must be implemented at the infrastructure level and remains the customer’s responsibility.

Recommendations:

All components of the IDV solution are horizontally scalable; therefore any of the DR strategies above can be implemented.
Choose a strategy that matches your recovery objectives, budget, and operational capacity.
If you can leverage cloud-native services, Active-Active offers the best performance and availability, at the highest cost and complexity of the four strategies.
When budget or resources are limited, start with Backup and Restore and evolve to Pilot Light or Warm Standby over time.
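
The recommendations above can be sketched as a simple lookup: pick the cheapest strategy whose achievable RTO fits your target. The RTO figures in the table are illustrative assumptions only; they are not published Regula values and should be replaced with results from your own DR drills.

```python
from enum import Enum


class Strategy(Enum):
    BACKUP_AND_RESTORE = "Backup and Restore"
    PILOT_LIGHT = "Pilot Light"
    WARM_STANDBY = "Warm Standby"
    ACTIVE_ACTIVE = "Active-Active"


# Hypothetical RTO (in minutes) that each strategy is assumed to achieve,
# ordered from cheapest to most expensive.
ASSUMED_RTO_MINUTES = [
    (Strategy.BACKUP_AND_RESTORE, 24 * 60),
    (Strategy.PILOT_LIGHT, 4 * 60),
    (Strategy.WARM_STANDBY, 30),
    (Strategy.ACTIVE_ACTIVE, 1),
]


def cheapest_strategy(target_rto_minutes: float) -> Strategy:
    """Return the cheapest strategy whose assumed RTO fits within the target."""
    for strategy, achievable_minutes in ASSUMED_RTO_MINUTES:
        if achievable_minutes <= target_rto_minutes:
            return strategy
    # Nothing cheaper fits, so fall back to the fastest option.
    return Strategy.ACTIVE_ACTIVE


print(cheapest_strategy(8 * 60).value)  # Pilot Light
print(cheapest_strategy(10).value)      # Active-Active
```

A real decision would also weigh RPO, budget, and operational capacity; this sketch covers the RTO dimension only.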

Stateful-Component Replication

If you select Pilot Light, Warm Standby, or Active-Active, replicate stateful workloads between the primary and secondary sites:

MongoDB database: Use replica sets to replicate data; see the MongoDB replication documentation.
Object storage: Replicate file storage (for example, Amazon S3 or an on-premises S3-compatible platform such as MinIO) using the storage provider's native replication or a custom mechanism.
Vector/Text store: IDV uses either MongoDB Atlas or OpenSearch; both provide built-in replication. See the MongoDB Atlas or OpenSearch documentation.
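RabbitMQ message broker: Enable RabbitMQ clustering or federation. Clustering is intended for nodes on the same low-latency network, so federation is usually the better fit between geographically separate sites; refer to the RabbitMQ clustering and federation guides.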
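
Once replication is in place, it helps to monitor how far the secondary site trails the primary, because that lag is effectively your RPO. The sketch below computes the lag from a document shaped like the reply of MongoDB's replSetGetStatus command (fields members, name, stateStr, and optimeDate). The sample data and host names are made up; in a live setup, one way to obtain the document would be PyMongo's client.admin.command("replSetGetStatus").

```python
from datetime import datetime, timezone


def secondary_lag_seconds(status: dict) -> dict:
    """Map each SECONDARY member name to how far it trails the PRIMARY, in seconds."""
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Hand-written sample in the shape of a replSetGetStatus reply (hypothetical hosts).
sample_status = {
    "set": "rs0",
    "members": [
        {"name": "primary-site:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2024, 1, 1, 12, 0, 10, tzinfo=timezone.utc)},
        {"name": "dr-site:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2024, 1, 1, 12, 0, 4, tzinfo=timezone.utc)},
    ],
}

print(secondary_lag_seconds(sample_status))  # {'dr-site:27017': 6.0}
```

An alert on this value, with a threshold matched to your RPO target, gives early warning that the secondary site would lose data in a failover.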

Stateless-Component Deployment

All other IDV components are stateless and can be deployed in multiple sites by running additional instances and fronting them with load balancers:

Web Server
Workflow Service
Audit Service
Indexer
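
Because these components hold no state, failover between sites reduces to routing traffic to whichever site passes a health check. In production this is the load balancer's or DNS service's job; the following minimal sketch only illustrates the logic. The health-check URLs are hypothetical and do not refer to a documented IDV endpoint.

```python
from typing import Callable, Optional, Sequence
from urllib.request import urlopen


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200; network errors and timeouts count as unhealthy."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        return False


def pick_site(sites: Sequence[str],
              probe: Callable[[str], bool] = http_probe) -> Optional[str]:
    """Return the first site, in priority order, whose health check passes."""
    for site in sites:
        if probe(site):
            return site
    return None


# Hypothetical health endpoints; simulate an outage of the primary site
# with a fake probe so the example runs without network access.
sites = ["https://idv.primary.example.com/health",
         "https://idv.dr.example.com/health"]
down = {"https://idv.primary.example.com/health"}
print(pick_site(sites, probe=lambda url: url not in down))
# https://idv.dr.example.com/health
```

The probe is passed in as a parameter so that the selection logic can be tested in isolation, as the usage example does.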