Having a disaster recovery (DR) framework in place is critical for any company with digital infrastructure and data. Unplanned outages, at best, can lead to poor customer experience. At worst, customer data may be lost or compromised, or your critical applications could be down for protracted periods, costing your business millions of dollars in lost revenue.
IT leadership should closely evaluate their current DR plan. Has it been tested recently? Has it been updated to modern standards? Legacy, on-premises infrastructure may not have the SLA and uptime needed for modern application architecture and business processes.
More and more companies have turned to the cloud, not just for their production infrastructure but for disaster recovery as well. The cloud offers plenty of advantages over legacy infrastructure, but utilizing the cloud effectively requires careful management and a well-scoped objective.
What Are the Goals of Disaster Recovery?
Disaster recovery is a broad term and can encompass many different facets of an organization, including physical, logistical, and digital. At a high level, however, disaster recovery has three primary goals: business continuity, compliance, and security. Almost any DR effort should be made with these three goals in mind.
Business Continuity
For an organization to last, the business must “continue,” even during times of adversity. Business continuity is about making sure the core functionality of your business can persist under non-optimal conditions.
One of the top objectives in business continuity should be to maintain customer experience—and thus hopefully revenue stability. For example, imagine an e-commerce site that experiences an outage in one of its data centers. It’s very likely numerous supporting services and applications are degraded or have become totally non-functional. Is it better to have the site completely down while the issue is being triaged, or should customers still be able to place orders?
In the highly competitive e-commerce market, a bad customer experience, like the site being inaccessible, will simply drive customers (and revenue) to a competing site. Worse still would be a loss of customer data, like order history and payment options. Lost customer trust is very hard to win back, and it doesn’t take much of an outage to generate a significant impact on your bottom line and possibly future growth as well.
Compliance
Most compliance frameworks require organizations to have demonstrable disaster recovery infrastructure and procedures in place. Auditors will apply a critical eye to your DR procedures. Evidence of recent and regularly scheduled testing, along with documented lessons learned and action items, is important to receiving certification.
Organizations that can demonstrate they have successfully implemented compliance frameworks are able to offer an extra layer of trust and gain access to markets and activities that require special data handling.
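As a rough illustration of tracking "recent and regularly scheduled testing," the sketch below flags a DR test that is overdue. The function name `dr_test_status` and the 180-day interval are assumptions made for this example, not requirements taken from any compliance framework; substitute whatever cadence your own policy defines.

```python
from datetime import date, timedelta


def dr_test_status(last_test, today, max_interval_days=180):
    """Report how old the last DR test is and whether a new one is overdue.

    max_interval_days is a hypothetical policy value, not a regulatory one.
    """
    age_days = (today - last_test).days
    next_due = last_test + timedelta(days=max_interval_days)
    return {
        "age_days": age_days,
        "next_due": next_due,
        "overdue": today > next_due,
    }


# Example dates are made up for illustration.
status = dr_test_status(last_test=date(2024, 1, 15), today=date(2024, 9, 1))
print(status)
```

A check like this could run on a schedule and feed the audit trail, so the date of the last test is recorded rather than remembered.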
Security
Disaster recovery isn’t just about being able to recover if a construction crew cuts a critical network line or flooding and storms take a data center offline. Disasters can exist entirely in the digital realm. Important data might be accidentally deleted or corrupted. In a worst-case scenario, an outside attacker may compromise, steal and possibly destroy crucial data.
Circling back to compliance and the hypothetical e-commerce site from the earlier example, accepting customer payments puts the site firmly in the purview of PCI compliance for processing sensitive financial data. PCI DSS requirement 12.10.1 calls for an incident response plan in the event of a breach, and that plan should address things like business recovery and continuity.
Once an attacker has penetrated the network boundary to where customer data lies, it’s generally wise to assume that any part of the existing infrastructure is compromised and potentially a vector for a future attack. In these cases, organizations are forced to treat their own architecture as hostile. After an attack, having clean, safe backups and a well-thought-out DR plan could mean the difference between business continuity and business failure.
Why Cloud?
The next logical step in the evolution from the classic data center and on-premises infrastructure is the cloud. Cloud platforms offer a broad variety of infrastructure and managed services and are continually growing their offerings.
Aside from the latest and greatest in scalability and computing, the cloud gives enterprises new ways to tackle building a solid DR process. It also offers several advantages right out of the gate. Cloud services are often designed to be highly available and fault-tolerant within and across geographic regions. Many data and storage services build on this design and expose it through easy-to-use features, usually at additional cost. Most of these services also offer some form of encryption at rest, typically enabled by default, and many offer encryption in transit as well.
Managed cloud services providers (MCSPs) provide a “batteries-included” solution. Rather than having to dedicate development and operational cycles to configuring and deploying bespoke solutions, teams can leverage managed cloud services, outsourcing the operational overhead to the provider. Amazon S3 is a good example. Instead of maintaining complex layers of backup hardware, teams can upload files to S3 and have them managed automatically through retention policies and storage classes. Other cloud platforms like Azure offer similar resources to support robust data protection strategies.
Disaster Recovery as a Service (DRaaS) solutions have existed for a number of years and have matured to protect critical business workloads, applications, and virtualized infrastructure. Demand for cloud platforms has also grown sharply, driven in part by the need to use those platforms as targets for recovery data.
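To make the retention-policy idea concrete, here is a minimal sketch that builds an S3 lifecycle configuration as a plain Python dictionary. The helper name `build_backup_lifecycle`, the `backups/` prefix, the rule ID, and the day counts are all assumptions chosen for illustration. The dictionary follows the shape accepted by boto3's `put_bucket_lifecycle_configuration`, but the call itself is left as a comment so the example runs without AWS credentials.

```python
import json


def build_backup_lifecycle(prefix, warm_after_days, cold_after_days, expire_after_days):
    """Build an S3 lifecycle configuration for backup objects under `prefix`.

    Objects move to cheaper storage classes as they age and are deleted
    once the retention period ends. All thresholds are in days.
    """
    if not 0 < warm_after_days < cold_after_days < expire_after_days:
        raise ValueError("expected 0 < warm < cold < expire (in days)")
    return {
        "Rules": [
            {
                "ID": "dr-backup-tiering",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": warm_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": cold_after_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


config = build_backup_lifecycle("backups/", 30, 90, 365)
print(json.dumps(config, indent=2))

# With boto3 installed and credentials configured, the dict could be applied with:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-dr-backups",  # hypothetical bucket name
#       LifecycleConfiguration=config,
#   )
```

Keeping the policy in code means it can be reviewed and versioned alongside the rest of the DR plan.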
The major cloud providers also offer nearly ubiquitous API access to their services. This makes it easy to automate deployment, configuration, and usage of DRaaS using code. This programmatic approach additionally makes it easier and more efficient to lift and shift on-premises resources to the cloud.
Potential Downsides to Cloud DR
For all the benefits the cloud offers, it’s not without some negatives. Like any tool, the cloud is at its best when used correctly for the right situation. Organizations that equate the cloud with having more guardrails or being inherently safer than on-premises infrastructure could be in for a rude awakening if they don’t choose the right provider and approach to their DR in the cloud.
Cost is always a concern with cloud infrastructure. While the value proposition of the cloud certainly warrants additional cost overhead versus overseeing the infrastructure yourself, it still needs to be carefully managed. Infrastructure costs can grow rapidly. Network transport can become a major cost factor on monthly bills, with companies seeing huge charges for data transfer, depending on whether the chosen cloud platform charges for such elements. This can be especially jarring in a DR scenario when large data transfers are often needed during restoration.
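A back-of-the-envelope estimate helps size that restoration bill before an incident happens. The sketch below multiplies the data to be restored by a per-gigabyte transfer rate. The function name, the $0.09 per GB rate, and the 100 GB free allowance are placeholder assumptions, not any provider's published pricing; replace them with figures from your own provider's price list.

```python
def estimate_restore_transfer_cost(data_gb, price_per_gb, free_gb=0.0):
    """Estimate the data transfer charge for pulling `data_gb` out of the cloud.

    price_per_gb and free_gb are inputs you must take from your provider's
    current pricing; the values used below are placeholders.
    """
    if data_gb < 0 or price_per_gb < 0 or free_gb < 0:
        raise ValueError("inputs must be non-negative")
    billable_gb = max(data_gb - free_gb, 0.0)
    return billable_gb * price_per_gb


# Hypothetical scenario: restoring 50 TB (50,000 GB) to an on-premises site.
cost = estimate_restore_transfer_cost(data_gb=50_000, price_per_gb=0.09, free_gb=100)
print(f"Estimated restore transfer cost: ${cost:,.2f}")
```

Running the same estimate for a partial restore and for a full restore gives a range to budget against.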
Complexity and/or unintended sprawl can also be a stumbling block. Smaller, leaner engineering teams may suddenly find that they have a sprawling mix of vanilla compute infrastructure, data storage, and APIs. It can be difficult to clearly document and track this usage, and doubly difficult to craft a cohesive DR plan that adequately addresses all the potential failure modes.
Disaster Recovery: Having a Cloud Strategy is Critical
Disaster recovery is crucial for any software organization, and for any company whose business data and applications have a low tolerance for operational downtime. Organizations that are looking to upgrade their legacy infrastructure and disaster recovery systems often look to the cloud, which offers state-of-the-art services and a path to lower capital expenditure, but it’s not without potential issues.
Teams and organizations that lack the experience to fully tackle DR design and management can engage third-party partners that provide expertise and fully featured DRaaS offerings focused solely on disaster recovery in a cloud-enabled world. These third parties can free up development and operations teams to focus on iterating and delivering features that their customers want.
The Rub
A solid disaster recovery framework, including procedures and infrastructure, is critical for any organization. Do you have one? When was the last time your disaster recovery process was tested? Have you been revising it regularly to encompass changes in your environment?
Maintaining business continuity during a downtime event can mean the difference between the long-term success and failure of your business. There is considerable evidence that organizations without a cohesive DR strategy that is regularly tested and updated risk going out of business when significant downtime occurs.
The cloud offers capabilities that self-managed, legacy infrastructure does not. It’s not, however, without some unique challenges to consider.
At the end of the day, the best value is often gained by engaging an MCSP with expertise and resources that are fully focused on disaster recovery solutions.
Learn more about how RapidScale can help you improve your DR strategy and data protection, with a DRaaS solution backed by deep cloud and DR expertise and industry-leading, 100% RapidResponse Support.