In my 25-plus years as a business continuity management professional, I’ve noticed that the biggest failures in organizations’ ability to recover from disruptions usually come from two problems: half-baked recovery strategies and inadequate testing. In today’s post, we’ll look at why these BCM failures and shortcomings can be so costly and share tips on how to avoid them.
There are many reasons BCM programs fail when you need them most. In my experience, two shortcomings lie behind both the greatest number of failures and the most serious ones: underdeveloped recovery strategies and a lack of realistic testing.
When these two problems occur together, they deliver a one-two punch that can knock a company to its knees whenever there’s a disruption.
Let’s look more closely at why these areas are so important and how you can strengthen your program’s performance in each one.
We’ll start with recovery strategies.
Underdeveloped Recovery Strategies
When I meet with companies to talk about improving their BCM programs, the inadequacy of their business continuity recovery strategy is often the elephant in the room. It’s the big problem that everyone knows about and no one wants to acknowledge.
A recovery strategy is an overall approach that will be used to restore a business or IT process.
Recovery strategies set forth the steps an organization should take to resume its mission-critical business processes and computer systems and applications in the event of a disruption.
Unfortunately, most companies’ recovery strategies range from half-baked to nonexistent.
Companies in that situation have little to no idea what they need to do to recover the business, if and when they are hit with an outage.
What are the most common deficiencies found in companies’ recovery strategies?
- The company has not established any standards to guide its stakeholders in applying its strategies.
- Even though the company knows—as a result of having conducted a Business Impact Analysis—which of its business units are the most critical, it doesn’t take the findings of the BIA into account in devising its recovery strategies.
- The company either has not conducted a Threat and Risk Assessment, or it ignores the conclusions of the TRA, so its strategies are not based on the relevant threats and risks.
- The organization starts strong and then peters out, failing to fully implement the strategies.
- The company does not budget sufficient resources to fully implement its strategies.
Any one of these problems can be enough to keep a program from working when you need it most.
What are some things an organization can do to maximize its recovery strategies?
- Establish and document an enterprise standard for implementation of the recovery strategies for the business processes and IT.
- Take the BIA and Threat and Risk Assessment results into account in devising recovery strategies that make sense.
- Make sure the recovery strategies are sufficiently robust to function when the company is at peak work volume.
- Develop a wide spectrum of strategies to fit the full range of business processes and IT systems and applications, from those that are mission-critical to those that can be deferred for an extended period.
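
The idea of matching a spectrum of strategies to BIA results can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tier names, the hour thresholds, and the `Process` and `assign_tier` names are hypothetical and do not come from any standard. Your own BIA and enterprise standard should supply the real values.

```python
from dataclasses import dataclass

# Hypothetical tiers: (maximum RTO in hours, strategy description).
# The thresholds are placeholders; real ones come from your own BIA.
TIERS = [
    (4, "Tier 1: hot standby, ready for immediate failover"),
    (24, "Tier 2: warm site restored from recent backups"),
    (72, "Tier 3: cold site or rebuild from backups"),
]
DEFERRED = "Tier 4: defer and restore when resources allow"


@dataclass
class Process:
    name: str
    rto_hours: float  # recovery time objective identified in the BIA


def assign_tier(process: Process) -> str:
    """Return the first tier whose maximum RTO covers this process."""
    for max_hours, strategy in TIERS:
        if process.rto_hours <= max_hours:
            return strategy
    return DEFERRED


if __name__ == "__main__":
    # Made-up example processes, for illustration only.
    processes = [
        Process("Payroll", 48),
        Process("Order entry", 2),
        Process("Archive reporting", 240),
    ]
    for p in sorted(processes, key=lambda p: p.rto_hours):
        print(f"{p.name}: {assign_tier(p)}")
```

A table like this does not replace the judgment that goes into a strategy, but it makes the standard explicit and keeps every process, including the deferrable ones, assigned to some strategy.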
If an organization’s recovery strategies are underdeveloped, it is essentially gambling with its future. If the strategies are sound and fully implemented, the organization is well on its way to being able to face the future with confidence.
Inadequate Recovery Exercises
The proof is in the pudding, as the saying goes. In business continuity, the pudding—and hence the proof—resides in the recovery exercises.
Do your strategies really work? The only way to find out is to put them to the test through realistic exercises. The failure of many companies to do this is the second major cause of BCM program failure. Combined with underdeveloped recovery strategies, it amounts to a one-two punch that can knock your company out cold.
What are some of the most common problems with companies’ BCM testing and exercise programs?
- They only perform tabletop exercises, which don’t fully validate recovery capability. See “Let’s Get Real: The Limitations of Tabletop Recovery Exercises” for more on the limits of tabletop exercises.
- They don’t conduct BCM exercises regularly or with sufficient frequency.
- When they do exercise, they don’t go through all scenarios or test for simultaneous events.
- Program managers don’t test in an unannounced manner.
- They don’t follow the relevant documentation in implementing the strategies during the test.
- Managers don’t document the progress of the exercise as it’s underway, noting gaps and creating action items for later implementation.
- Management doesn’t review or validate the exercise.
- Exercises for each business unit are never integrated with those of the units upstream and downstream.
- The exercises don’t take peak work volumes into account.
Any organization whose testing program has these problems is not really testing anything. It’s only fooling itself, squandering precious opportunities to identify gaps and close them, or both.
What can a company do to make sure its exercise program truly validates its recovery capability?
- Follow the enterprise recovery exercise standard.
- Take the BIA and Threat and Risk Assessment results into account in planning and carrying out recovery exercises.
- Ensure that mission-critical business processes and IT systems and applications are fully exercised.
- Conduct exercises regularly, frequently, and comprehensively.
- Secure the participation of management. Managers should take responsibility for the successful execution of exercises and the resolution of exceptions.
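
Several of the problems listed earlier (infrequent exercises, tabletop-only validation, no testing at peak volume) are easy to track once exercise results are recorded. The sketch below shows one possible way to flag them. The record fields, the 365-day limit, and the `find_gaps` function are all hypothetical assumptions made for illustration, not part of any BCM standard or tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

MAX_AGE_DAYS = 365  # hypothetical limit; set this from your own exercise standard


@dataclass
class ExerciseRecord:
    process: str
    mission_critical: bool
    last_exercise: Optional[date]  # None if the process was never exercised
    exercise_type: str             # e.g. "tabletop", "functional", "full"
    tested_at_peak_volume: bool


def find_gaps(record: ExerciseRecord, today: date) -> List[str]:
    """List the exercise gaps for one process, based on the checks above."""
    if record.last_exercise is None:
        return ["never exercised"]
    gaps = []
    if (today - record.last_exercise).days > MAX_AGE_DAYS:
        gaps.append("exercise overdue")
    if record.mission_critical and record.exercise_type == "tabletop":
        gaps.append("mission-critical but tabletop only")
    if not record.tested_at_peak_volume:
        gaps.append("peak volume not tested")
    return gaps


if __name__ == "__main__":
    # Made-up record, for illustration only.
    record = ExerciseRecord(
        process="Order entry",
        mission_critical=True,
        last_exercise=date(2023, 1, 15),
        exercise_type="tabletop",
        tested_at_peak_volume=False,
    )
    for gap in find_gaps(record, today=date(2024, 6, 1)):
        print(f"{record.process}: {gap}")
```

Each gap the check reports is a candidate action item for management to review and resolve after the exercise.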
By taking these steps, an organization can ensure that its testing program provides meaningful validation rather than false comfort.
Becoming More Resilient
The two problems that cause the most BCM failures, and the most serious ones, are undercooked recovery strategies and inadequate testing. Together, these problems amount to a one-two punch that can knock any organization to the canvas whenever an event occurs. Companies that take steps to strengthen their position in these two areas reap the benefits of greater resiliency and recoverability.
For more information on BCM program failure, common BCM mistakes, and other hot topics in BCM and IT/disaster recovery, check out these recent posts from BCMMETRICS and MHA Consulting:
- Double Trouble: How to Handle Multiple Business Disruptions
- Weighing the Danger: The Continuing Value of the Threat and Risk Assessment
- Testing 1-2-3: Three Things You Should Know About Business Continuity Testing
- How Do BCM Offices Fail? Let Us Count the Ways
- Let’s Get Real: The Limitations of Tabletop Recovery Exercises
- 8 Dos and 1 Don’t for Conducting Disaster Recovery Tests