When Does Your Clock Start? The Business Continuity RTO Conundrum

Your Disaster Recovery Plan has a Recovery Time Objective (RTO), or perhaps multiple RTOs, one for each Application or Service being recovered. Your Business Continuity Plans have RTOs for the underlying functions or business processes they are designed to recover.

But do these RTOs really mean the same thing? Probably not, and if your Business Continuity Recovery Teams don’t understand that difference, they may be in for a very rude surprise when a disruption occurs.

Many organizations’ Information Technology (IT) Disaster Recovery Plans are predicated on failing over, or rebuilding, at an alternate site. There is no middle ground in the DR Plan – initiate the alternate site plan or do nothing. To DR or not to DR is a difficult choice; declaring a disaster (activating the alternate site plan) triggers an obligation to expend time, manpower, money and other resources that, once begun, usually can’t be reined in or scaled back.

So deciding whether to ‘declare’ involves a financial commitment that may require senior management signoff (and perhaps a fresh cost/benefit analysis and group consensus). All of that may require hours or days, depending on the severity of the situation: a complete meltdown of the data center makes for an easy decision, while a non-catastrophic power outage may result in a wait-and-see situation. (There’s another dirty little secret that often remains hidden: most DR plans don’t include a plan to return from the alternate site, which is one more reason to make certain a declaration is absolutely necessary.)

Regardless, the clock that tracks the RTO commitment(s) shown in the DR plan doesn’t start ticking until that declaration is made.

Meanwhile, for every critical business process or function, the clock started ticking the moment the lights went out or the computer screens went blank. Meeting customer or regulatory obligations is not dependent upon a declaration; the BC plan must be activated the moment the disruption occurs.

Yet few BCM programs (I’d say “most BCM programs have not” but I have no empirical data to back it up) have ever discussed the disconnect between the DR declaration and the DR RTO, or its impact on BC Plan RTOs. Fewer still have worked out the mechanics of reconciliation.

So the result is an assumption: IT provides an RTO for each Application, and the Business processes/functions which are dependent upon them create BC Plans assuming access to those Applications within that same timeframe. What the BC Plans don’t take into account is the decision-making time that may elapse between the moment of disruption and the moment a disaster declaration triggers the DR plan. The latter is sometimes referred to as T-zero and is not a fixed point; it will be determined once the impact and prognosis of the disruption are known and analyzed.
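The arithmetic behind that gap is simple enough to sketch. The snippet below is only an illustration, not part of any DR standard or tool: the function name business_outage and all of the dates and durations are hypothetical, chosen to show how the time the business actually waits is the decision delay (disruption to T-zero) plus the DR plan's RTO.

```python
from datetime import datetime, timedelta


def business_outage(disruption: datetime, declaration: datetime,
                    dr_rto: timedelta) -> timedelta:
    """Time the business waits for its applications.

    The DR RTO clock starts at the declaration (T-zero), so the business
    experiences the decision delay plus the DR RTO.
    """
    if declaration < disruption:
        raise ValueError("declaration cannot precede the disruption")
    decision_delay = declaration - disruption
    return decision_delay + dr_rto


# Hypothetical figures, for illustration only.
disruption = datetime(2024, 1, 8, 9, 0)     # lights go out
declaration = datetime(2024, 1, 10, 5, 0)   # T-zero, 44 hours later
dr_rto = timedelta(hours=4)                 # RTO stated in the DR plan

outage = business_outage(disruption, declaration, dr_rto)
print(outage)  # 2 days, 0:00:00
```

With these assumed numbers, an application with a stated 4-hour RTO is unavailable to the business for two full days, because the 44 hours spent deciding are not counted against the DR plan's commitment.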

But if BC Plan owners understand that IT may not start the T-zero clock running when the disruption occurs (and may delay T-zero for hours or even days), they can begin to develop strategies for what they can and will do during that undetermined interim period. Manual processing, workload shifting, holding statements and other temporary strategies must all be sustainable for extended periods (not just the assumed RTO timeframe). Communications must be preplanned – both internally and with external stakeholders.

The bottom line: every organization’s Business Continuity Managers need to coordinate a conversation between the IT team responsible for making DR declarations and the business leaders on whom the responsibility for Business Continuity preparedness falls.

Promote a mutual understanding of the decision-making process that will be employed to determine when a ‘disaster’ will be declared. Try to come to a consensus on the maximum amount of time between the disruption and the declaration.

Without this mutual understanding, business process owners will continue to base their BC Plan strategies on the stated RTOs for the IT applications on which they rely, and will continue to be shocked when that 4-hour RTO turns into a 2-day waiting period.

When does the RTO clock really start ticking? When IT says it starts. So make sure your business units are prepared to prolong their temporary recovery measures while they wait for IT to decide when (or whether) to declare a disaster.