
6 Steps to Manage Business Continuity and Disaster Recovery

If your plans aren't complete, consider building a better business case to put some muscle behind your disaster recovery and business continuity plans.

Photo: Shane Van Boxtel

“Disaster recovery plans grow along with the changes in your business; they need to be up on the priority list,” says Colbert Packaging’s Pascal Majon.

Managing disaster recovery and business continuity is a far more complicated and involved process than simply making sure your servers are backed up regularly.

Business continuity involves detailed policies and procedures, numerous supporting technologies, and a commitment to redundancy and security. Along with all of that, it requires an actual plan. Business continuity can be expensive in terms of time, personnel, raw materials and above all, budget dollars, making it imperative to create a plan that maximizes benefits to the organization and clearly lays out the business case for continuity.

“People do not take disaster recovery seriously enough, or they leave projects half-finished and just do a piece here and there,” says Pascal Majon, IT manager at Colbert Packaging, a 300-employee packaging firm in Lake Forest, Ill. “Disaster recovery plans grow along with the changes in your business; they need to be up on the priority list even though a lot of IT managers are busy as it is and they’re in constant ‘break/fix’ mode.”

Colbert Packaging’s well-documented disaster recovery plan is also a requirement of doing business with many of the large pharmaceutical companies that Colbert services. “It makes you more attractive [to] a client if you can impress a customer with your quality controls, disaster recovery plans and security,” Majon says. “We use a server configuration standard with naming conventions, specific paths and scripts. Any change to that configuration needs to be documented and reviewed and signed off on.”

Unfortunately, business decision-makers often put business continuity on a back burner as more pressing problems take priority. But underfunded business continuity and disaster recovery plans can have serious consequences. All too often, organizations that cut corners or lack the right IT infrastructure and business policies create substantial risks.

In reality, business continuity operates on a de facto pass-fail system: Either the environment functions correctly in an emergency and the organization keeps operating, or it doesn’t. Here’s a checklist of steps that can help an organization manage business continuity and disaster recovery issues effectively:

Inventory Resources — Hardware and Software: The first step in devising an effective strategy is to understand the organization’s business needs and the systems that map to those needs. Key hardware, configurations and passwords should be included in inventory.

What do you feel is the biggest vulnerability in your company’s disaster recovery plan?

32%	We rely too heavily on tape backups
19%	Insufficient funding to make it effective
16%	Our plan is not updated frequently
14%	We don't have a disaster recovery plan
4%	Our data is not centralized to facilitate better backups
15%	Other

Source: CDW poll of 318 BizTech readers

Establish Recovery Objectives: Recovery time objectives (RTOs) and recovery point objectives (RPOs) are the foundation on which an effective business continuity solution is built. An RTO sets how quickly a system must be back online; an RPO sets how much recent data the organization can afford to lose. A few years ago, an organization might have established a two- or three-day window for getting systems back online. Today, the timeframe is usually a few hours. When an organization establishes clear RTOs and RPOs, it’s possible to match hardware and software with business processes and data recovery needs. In some cases, it’s important to have systems back online within minutes; in other cases, hours or days will suffice.
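As a rough illustration of how those objectives can be checked against an actual backup setup, here is a minimal Python sketch. The system name, the targets and the helper `meets_objectives` are hypothetical and do not come from Colbert Packaging or any product. The sketch assumes that the worst-case data loss is about one full backup interval and that the restore time comes from the most recent recovery test.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RecoveryObjective:
    """Recovery targets for one system (illustrative values only)."""
    system: str
    rto: timedelta  # longest tolerable time to restore service
    rpo: timedelta  # longest tolerable window of lost data


def meets_objectives(obj, backup_interval, measured_restore_time):
    """Return (rpo_ok, rto_ok) for one system.

    Assumption: data written since the last backup is lost, so the
    backup interval must not exceed the RPO. The restore time measured
    in the last recovery test must not exceed the RTO.
    """
    rpo_ok = backup_interval <= obj.rpo
    rto_ok = measured_restore_time <= obj.rto
    return rpo_ok, rto_ok


# Hypothetical order-entry system: back online in 4 hours,
# no more than 15 minutes of lost data.
order_entry = RecoveryObjective(
    system="order-entry",
    rto=timedelta(hours=4),
    rpo=timedelta(minutes=15),
)

# A nightly backup restored in 3 hours meets the RTO but not the RPO.
result = meets_objectives(
    order_entry,
    backup_interval=timedelta(hours=24),
    measured_restore_time=timedelta(hours=3),
)
print(result)  # (False, True)
```

In this example the nightly backup fails the 15-minute RPO even though the restore is fast enough, which is the kind of gap that clear objectives are meant to expose.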

Without specific RTOs, Colbert Packaging wouldn’t remain a top vendor with its customers. “Many of the big pharmaceutical companies that we work with are big on business continuity,” says Majon. “They want their partners to have reliable systems to provide uninterrupted service. There’s pressure to be compliant with their standards for disaster recovery and business continuity.

“The main concern is supply chain interruption,” he continues. “Our customers don’t want to withhold a product because we had a fire. They need to know that we can ship out of multiple warehouses and match all the specifications, whether it’s coming from a plant in Indiana or Illinois.”

Budget Adequately: Underfunded initiatives aren’t likely to provide the level of protection that’s desired or required. A business must quantify risks and understand the costs of redundant systems, backup power, redundant Web connections, spare servers and offsite storage.
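One common way to quantify a risk, though not one the article prescribes, is to estimate the expected yearly loss from an incident and compare it with the yearly cost of the safeguard. The Python sketch below shows that arithmetic. The function names and all dollar figures are hypothetical and chosen only for illustration.

```python
def annualized_loss(cost_per_incident, incidents_per_year):
    """Expected yearly loss: cost of one incident times how often it occurs."""
    return cost_per_incident * incidents_per_year


def net_annual_benefit(cost_per_incident, rate_before, rate_after, annual_cost):
    """Yearly loss avoided by a safeguard, minus what the safeguard costs."""
    avoided = (annualized_loss(cost_per_incident, rate_before)
               - annualized_loss(cost_per_incident, rate_after))
    return avoided - annual_cost


# Hypothetical figures: a $50,000 outage expected once every two years,
# reduced to once every eight years by a redundant Web connection
# that costs $6,000 per year.
benefit = net_annual_benefit(
    cost_per_incident=50_000,
    rate_before=0.5,
    rate_after=0.125,
    annual_cost=6_000,
)
print(benefit)  # 12750.0
```

A positive result suggests the safeguard pays for itself under these assumptions; the estimate is only as good as the incident cost and frequency figures that go into it.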

Develop a Response Plan: When an incident takes place, it’s essential that IT administrators and employees have a clear understanding of how to handle the situation and manage work with minimal disruption. An effective plan spells out tasks, responsibilities and roles — and it covers an array of situations that could demand entirely different responses. It’s not enough to ensure that machines are turned on and operating; an organization must establish how employees will access systems and data during an outage or emergency.

“We keep a physical copy of our inventory, passwords and configuration standards, which we would use in the event that something went wrong offsite,” says Majon. “It gives us something to work from and ensures that the critical hours after [a disaster strikes aren’t] wasted.”

Designate a Recovery Team: The unpredictability of a disaster requires an organization to establish a team to lead employees through the recovery process. Armed with phone trees, mobile technology and a clear response plan, these individuals are able to make quick decisions and change course on the fly.

Revisit Business Continuity Often: Because business conditions, processes and technology constantly change, it’s vital to re-examine business continuity and update a plan on a regular basis. An organization must also test systems periodically — every quarter or at least once a year — to ensure that it hasn’t overlooked anything and that the plan works. Finally, it’s crucial to update and upgrade systems periodically to fit changing requirements.
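The testing schedule described above is easy to track automatically. The following Python sketch flags a plan whose last test is older than a chosen interval. The helper `is_test_overdue`, the 91-day approximation of a quarter and the dates are all assumptions made for the example.

```python
from datetime import date, timedelta

QUARTERLY = timedelta(days=91)   # rough length of one quarter (assumption)
YEARLY = timedelta(days=365)


def is_test_overdue(last_test, today, max_interval=YEARLY):
    """Return True if more than max_interval has passed since the last test."""
    return today - last_test > max_interval


# Hypothetical dates: the plan was last tested on Jan. 15, 2024.
last_test = date(2024, 1, 15)
today = date(2024, 6, 1)

print(is_test_overdue(last_test, today, QUARTERLY))  # True
print(is_test_overdue(last_test, today, YEARLY))     # False
```

Here the plan still satisfies a once-a-year policy but has already missed a quarterly one.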

Power Plays

One of the most basic but overlooked aspects of business continuity is maintaining electrical power during a blackout or disaster. Many data centers, computer rooms and wiring closets are prepared for power incidents that last a fraction of a second or a few seconds. Some are prepared for incidents that last a few minutes or a few hours, using some combination of power protection and conditioning, an uninterruptible power supply (UPS) and possibly a generator.

“Most IT people in the United States don’t understand what extended outages really are,” says Dave Slotten, senior product manager at Tripp Lite. “For most people, a long disruptive outage is 45 minutes to one hour. So they have an inappropriate peace of mind — they’ve bought a UPS for their servers, and some of their network has battery backup. But they haven’t thought about longer outages, like three to four hours, which battery solutions can handle, or even longer ones, which mean generator-based backup power.”
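To get a feel for whether a battery-based setup can ride out a longer outage, a first-pass estimate divides usable battery energy by the connected load. The Python sketch below does that. The battery capacity, the load and the 90 percent inverter efficiency are hypothetical figures, and real runtime also depends on battery age, temperature and discharge rate, so a manufacturer's runtime chart should take precedence over this arithmetic.

```python
def estimated_runtime_hours(battery_wh, load_watts, inverter_efficiency=0.9):
    """Rough runtime: usable battery energy (Wh) divided by load (W).

    The default efficiency is an assumed figure, not a vendor spec.
    """
    if load_watts <= 0:
        raise ValueError("load_watts must be positive")
    return battery_wh * inverter_efficiency / load_watts


def covers_outage(battery_wh, load_watts, outage_hours, inverter_efficiency=0.9):
    """Return True if the estimated runtime lasts at least outage_hours."""
    runtime = estimated_runtime_hours(battery_wh, load_watts, inverter_efficiency)
    return runtime >= outage_hours


# Hypothetical setup: 1,200 Wh of battery feeding a 600 W server load.
runtime = estimated_runtime_hours(battery_wh=1_200, load_watts=600)
print(f"{runtime:.1f} hours")

print(covers_outage(1_200, 600, outage_hours=1))  # one-hour outage
print(covers_outage(1_200, 600, outage_hours=4))  # four-hour outage
```

With these numbers the batteries cover a one-hour outage but not a four-hour one, which would call for more battery capacity or generator-based backup.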

There are three main types of UPS systems:

Standby or Offline UPS: It powers IT equipment directly from the AC outlet. If a power disturbance occurs, whether it’s a blackout, surge or sag, a standby UPS will switch to battery power to protect the technology. A standby UPS is the simplest, most affordable UPS and is best for inexpensive or noncritical computers.

Line Interactive or Automatic Voltage Regulation (AVR) UPS: When an overvoltage or undervoltage occurs, a line interactive UPS corrects the voltage without switching over to battery power. That makes it a step up from a standby UPS, which switches to battery power whenever a voltage problem occurs. Because a line interactive UPS draws on its battery less often, battery life is extended.

Double Conversion Online UPS: This UPS takes the incoming AC power, converts it to DC, passes it through the battery circuit and then reconverts it to AC to power the IT equipment. The conversion filters out problems such as electrical line noise and provides clean power to IT equipment. If there’s an outage, the battery continues to power the equipment until it is drained. With other UPS systems, there is a short interruption in service as the device switches from incoming power to the battery; with a double conversion UPS, there is no switchover.
