Blog | BCMMetrics

The 4-3-3 Rule for Writing Business Recovery Checklists

Written by Michael Herrera | Mar 1, 2018 1:11:30 PM

When it comes to your organization’s recovery plan, your business recovery checklists might just be the single most important ingredient. They are the engine of your recovery plan.

As we state in MHA’s Complete Guide to Creating and Implementing a Business Recovery Plan, “Recovery checklists guide you step-by-step through the process of getting your business back up and running” after a disruption. Without such checklists, your team would have no direction on the steps needed to respond to a disruption, recover from it, and resume business operations. I urge you to take checklists seriously.

If you’re still reading, I will assume that means you are taking them seriously. Great. Now, let’s roll our sleeves up and get to the heart of today’s post.

Having accepted the importance of recovery checklists, you might be wondering how to develop them for your own organization.

It all comes down to what I call the 4-3-3 Rule for Writing Business Recovery Checklists.

This rule states that in writing recovery checklists for your organization, you should:

  • Develop a recovery checklist for each of the 4 major impact scenarios
  • Make sure each checklist covers the 3 phases of recovery
  • Make sure your checklist for each phase follows the 3 key checklist qualities

That’s it: that’s the 4-3-3 Rule.

But you might be asking yourself: what exactly are impact scenarios, recovery phases, and checklist qualities?

Let’s take them one by one.

IMPACT SCENARIOS

Impact scenarios are the ways that negative events can damage your business or operations. There are four main scenarios, and our aforementioned guide breaks them down like this:

  • Loss of building or geographic region. Typically this is the result of a natural disaster, though it may also be brought on by other events: fires, floods, severe weather, air contaminants, hazardous spills, or acts of terrorism.
  • Loss of technology or telecommunication equipment. This situation may be brought on by technical issues such as equipment and software failure, communications failure, cybersecurity attacks, or data breaches.
  • Loss of resources (specifically, people) or a pandemic-related event. This category applies when 40% or more of your people are unavailable, for instance because of a pandemic or a workplace violence incident.

You should devise a separate checklist for each of these four scenarios.

Note that we do not develop checklists for specific events, such as a hurricane or cyber attack. As always in business continuity, the event is almost beside the point. What we should focus on is the impact, and remedying and recovering from the impact.

Theoretically, there is an infinite number of events that could cause you problems, but the possible impact of those problems is basically limited to the four scenarios given above.

RECOVERY PHASES

Recovery phases are the subtasks that make up the larger task of recovering from an event. As adapted from the MHA guide previously mentioned, the three phases of any recovery are:

  • Response phase. The activation, notification, and assessment of the situation. In this phase, you will need to evaluate the severity of the situation, formally activate your plan, communicate with your employees about the situation, identify any deviations from normal working procedures, and determine next steps.
  • Recovery phase. Implementing the requirements for operating within a non-business-as-usual scenario. Once the response phase is over, tasks in this phase might include deciding how and where to relocate so that critical business units can continue to operate, going to the bank to get checks (if yours are inaccessible inside the building), communicating with critical vendors, carrying out the specific steps to restore each critical process in the business unit, and notifying the post office to hold mail or picking it up yourself.
  • Restoration phase. The tasks involved in returning to business as usual. For example, you may need to ensure that forwarded phone lines are returned to normal settings, notify customers your site has returned to normal operations, notify vendors (like FedEx, UPS, and others) if you’ve been using a forwarding address, and update your business recovery plan with any learning you’ve gained from experiencing the situation firsthand.

Each of your four recovery checklists (remember, you’re going to develop a checklist for each of the four impact scenarios) should include steps covering these three phases.
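One way to keep yourself honest about phase coverage is to tag every step with its phase and check the draft mechanically. The following is a minimal Python sketch under that assumption. The `Step` class, the `missing_phases` function, and the sample steps are hypothetical names made up for illustration; they are not part of the MHA guide.

```python
from dataclasses import dataclass

# The three recovery phases named in this post, in order.
PHASES = ("Response", "Recovery", "Restoration")


@dataclass
class Step:
    """One checklist line (hypothetical structure for illustration)."""
    phase: str
    description: str


def missing_phases(steps):
    """Return the phases that have no steps yet, in phase order."""
    unknown = sorted({s.phase for s in steps} - set(PHASES))
    if unknown:
        raise ValueError(f"unknown phase(s): {unknown}")
    covered = {s.phase for s in steps}
    return [p for p in PHASES if p not in covered]


# A draft checklist for one impact scenario; the steps are made-up examples.
draft = [
    Step("Response", "Assess the severity of the event."),
    Step("Response", "Notify affected employees."),
    Step("Recovery", "Ask vendors to hold or reroute deliveries."),
]

print(missing_phases(draft))  # ['Restoration']
```

Running the same check on each of your scenario checklists gives a quick list of the phases that still need steps before the checklist is ready.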

CHECKLIST QUALITIES

We are now ready to put the final “3” in the 4-3-3 Rule: the three key checklist qualities. These are the qualities that your checklists must have in order to be useful when it counts.

In my opinion, your checklists should be:

  • Relevant. Make sure every step contributes to what needs to be done; don’t waste time on extraneous tasks.
  • Easy to Understand. Recovery steps should be written in easy-to-understand language. Aim for writing at about the 6th grade reading level.
  • Comprehensive Yet Concise. The checklists need not exhaustively cover every task required to complete the recovery. Rather, they should lay out the tasks that a seasoned professional might forget. Condense your checklists into the critical path necessary to restore operations.
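If you want a rough check on the reading level of a step, one common measure is the Flesch-Kincaid grade level: 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The Python sketch below is one possible way to estimate it. The function names are hypothetical, and the syllable counter is a crude vowel-group heuristic that I am assuming is good enough for a first pass, not an exact count.

```python
import re


def count_syllables(word):
    """Estimate syllables by counting vowel groups (rough heuristic)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent, except in endings like "table".
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def fk_grade(text):
    """Approximate Flesch-Kincaid grade level of a block of text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


# Flag draft steps that score above the 6th grade target.
steps = [
    "Call your team lead. Tell them the site is down.",
    "Upon receipt of the appropriate building permits, engineering "
    "certificates, and/or building inspections, re-occupancy to the "
    "affected site can be granted.",
]
for step in steps:
    flag = "REWRITE" if fk_grade(step) > 6 else "ok"
    print(f"{flag}: {step[:40]}")
```

Because the syllable count is approximate, treat the score as a signal for which steps to reread, not as a precise measurement.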

That’s all there is to the 4-3-3 Rule for Writing Business Recovery Checklists. To recap, in writing recovery checklists for your organization, you should:

  • Develop a recovery checklist for each of the 4 major impact scenarios
  • Make sure each checklist covers the 3 phases of recovery
  • Make sure your checklist for each phase follows the 3 key checklist qualities

Are you interested in learning more about what makes a good checklist? If so, I recommend that you check out The Checklist Manifesto, by Atul Gawande (link goes to author’s website). It’s full of interesting stories and explains how well-designed checklists can improve outcomes in many endeavors.

Sample recovery checklist:

| Task | Description and Other Information | By Whom | Initial When Complete |
|------|-----------------------------------|---------|-----------------------|
| Response Phase | | | |
| 1 | Based on the severity of the event, assess and determine the immediate impacts to critical finance and administration activities and report findings to the Crisis Management Team (CMT) as needed. | | ______ |
| 2 | The CMT will coordinate efforts with corporate communications to ensure the appropriate notification is provided to affected internal personnel (e.g., employees, contractors) and external IT stakeholders as needed. | | ______ |
| 3 | | | ______ |
| 4 | | | ______ |
| 5 | | | ______ |
| Recovery Phase | | | |
| 6 | In the event of a long-term outage, the CMT leader may execute the enterprise recovery strategy that includes relocation of personnel (e.g., home) in order to continue supporting core activities. | | ______ |
| 7 | As needed, reprioritize current workloads and/or suspend nonessential (Recovery Time Objective 4) activities throughout the duration of the disruption. | | ______ |
| 8 | Obtain approval for potential exceptions to business-as-usual activities by the appropriate corporate stakeholders (e.g., risk management, information security) as applicable to the event. | | ______ |
| 9 | Identify orders that have been placed and contact vendors to request they hold and/or reroute deliveries as applicable to the event. | | ______ |
| 10 | If needed, coordinate efforts with the CMT to gain access to emergency checks and endorsement stamps that are maintained in the CMT command centers (on- and off-site). | | ______ |
| Restoration Phase | | | |
| 11 | Upon receipt of the appropriate building permits, engineering certificates, and/or building inspections, re-occupancy to the affected site can be granted. The CMT will work with the required stakeholders (e.g., facility management) to implement the applicable measures in order to establish re-occupancy. | | ______ |
| 12 | Upon resuming business-as-usual operations at the primary work site: (a) Ensure validation of all required systems, applications, and other IT-related components. Direct staff to report exceptions to the help desk. (b) Implement a process to ensure that any manually tracked data is keyed online and brought current. | | ______ |
| 13 | The phone system administrator will complete the task of rerouting phone lines back to the primary facility, a process that will be transparent to customers. | | ______ |
| 14 | | | ______ |
| 15 | | | ______ |