Tactical Advice

De-Duplication Is the New Word in Backup Technology

This story appears in the December 2007 issue of BizTech Magazine.

Critical data is growing at an exponential rate, and tapes are no longer the only option for backup. Data de-duplication technology (also called data reduction or commonality factoring) allows users to store more information on fewer physical disks than has been possible in the past, making the cost of disk backup competitive with tape.

“Although the technology is fairly new, de-duplication is becoming widespread,” says Stephanie Balaouras, an analyst at Forrester Research. “Right now, disk space is three to four times as expensive as tape, but de-duplication can reduce data that needs to be backed up by a ratio of 20 to one. The big question is whether this technology is what puts the last nail in the coffin of tape backup.”
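The arithmetic behind that comparison can be sketched in a few lines. This is a rough illustration only: the function name is hypothetical, and it assumes the tape copy is stored with no reduction at all and that the 20-to-one ratio is actually achieved, neither of which the analyst guarantees.

```python
def effective_disk_cost(disk_to_tape_price_ratio, dedup_ratio):
    """Cost of protecting one unit of data on disk, relative to tape.

    Assumes (for illustration) that tape stores the data unreduced and
    that disk stores it reduced by dedup_ratio.
    """
    return disk_to_tape_price_ratio / dedup_ratio


# Disk at three to four times the price of tape, reduced 20 to one.
low = effective_disk_cost(3, 20)
high = effective_disk_cost(4, 20)
print(f"Disk costs {low:.2f} to {high:.2f} times as much as tape per unit protected")
```

Under those assumptions, disk falls to roughly 15 to 20 percent of the cost of tape for the same protected data. Real-world ratios depend heavily on how repetitive the data is.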

As the name suggests, the goal of de-duplication is to eliminate redundant data from backups. The technology replaces duplicate copies with much smaller pointers to a shared record. This can take place at the level of either whole records or smaller unique data segments.

For example, if someone e-mails a 10MB Excel file to 10 people on a network and each of them stores it, that translates into 100MB of backup disk space without de-duplication. With whole-record de-duplication, one copy would be stored along with 10 reference pointers. If, however, one of the users changes the name of the file or alters the contents in even the slightest way, that user's entire copy will be backed up again as a new record. Using sub-record level de-duplication, only the changes to the altered file would be saved, with pointers to the original. Both de-duplication methods are usually used in conjunction with traditional compression algorithms, standard backup tactics that further reduce the space consumed on the backup disk.
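The difference between the two approaches can be illustrated with a short Python sketch. It is a simplified model, not how any particular product works: the function names, the 4,096-byte segment size and the use of SHA-256 digests as pointers are all assumptions made for the example, and the 10MB spreadsheet is scaled down to a 40KB stand-in. Fixed-size segments also only catch in-place edits; commercial systems generally use more sophisticated segmenting.

```python
import hashlib

CHUNK_SIZE = 4096  # illustrative segment size, not taken from any product


def whole_record_dedup(files):
    """Keep one copy of each distinct file; each name points to a digest."""
    store, pointers = {}, {}
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        store.setdefault(digest, data)   # stored only the first time it is seen
        pointers[name] = digest
    return store, pointers


def sub_record_dedup(files, chunk_size=CHUNK_SIZE):
    """Keep one copy of each distinct segment; each name points to a digest list."""
    store, pointers = {}, {}
    for name, data in files.items():
        digests = []
        for start in range(0, len(data), chunk_size):
            chunk = data[start:start + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)
            digests.append(digest)
        pointers[name] = digests
    return store, pointers


def stored_bytes(store):
    """Bytes of unique data held (pointer overhead is ignored here)."""
    return sum(len(blob) for blob in store.values())


def restore(store, digests):
    """Rebuild a file from its list of segment pointers."""
    return b"".join(store[d] for d in digests)


def demo_files():
    """Ten users hold the same file; one user changes a single byte."""
    original = b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(1280))
    files = {f"user{n}/report.xls": original for n in range(10)}
    edited = bytearray(original)
    edited[5000] ^= 0xFF                 # the "slightest" alteration
    files["user3/report.xls"] = bytes(edited)
    return files


files = demo_files()
raw = sum(len(data) for data in files.values())
whole_store, _ = whole_record_dedup(files)
sub_store, sub_pointers = sub_record_dedup(files)
print(raw, stored_bytes(whole_store), stored_bytes(sub_store))
```

With these inputs the backup holds 409,600 bytes with no de-duplication, 81,920 bytes with whole-record de-duplication (the original plus the full altered copy) and 45,056 bytes with sub-record de-duplication (the original plus one changed segment).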

The trend is toward sub-record level de-duplication. A wide range of systems that provide de-duplication is already available, including hardware such as Quantum’s DXi or Cybernetics’ iSCSI SAN and software such as Veritas NetBackup PureDisk. Along with dramatically reducing backup storage space consumption, these technologies cut restore time and eliminate the need to wade through incremental backup tapes. Most systems allow users to restore back to a specific date and time, and some make decentralized backups possible.

Proceed With Care

Balaouras warns that while data de-duplication is fast becoming a standard feature in backup systems, the technology is new enough, and there are enough variations among applications, that buyers should proceed with care. A key distinction is whether the data reduction takes place at the source (the backup server) or the target (a virtual tape library or disk appliance). Source-based processing uses much less bandwidth and provides for either local or global backup, but it often requires users to replace their current backup systems or run one system for central office backup and another for remote locations.
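The bandwidth difference can be sketched as follows. This is a toy model under stated assumptions: the function names are hypothetical, a Python dictionary stands in for the backup target, segments are a fixed 4,096 bytes, and the only network traffic counted is segment data plus a 32-byte digest per segment for the source-side lookup.

```python
import hashlib

CHUNK_SIZE = 4096  # illustrative segment size


def segments(data, chunk_size=CHUNK_SIZE):
    """Yield (digest, segment) pairs for a byte string."""
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        yield hashlib.sha256(chunk).digest(), chunk


def target_side_backup(data, target_store):
    """Everything crosses the network; the target discards duplicates."""
    for digest, chunk in segments(data):
        target_store.setdefault(digest, chunk)
    return len(data)  # bytes sent


def source_side_backup(data, target_store):
    """The source hashes first and sends only segments the target lacks."""
    bytes_sent = 0
    for digest, chunk in segments(data):
        bytes_sent += len(digest)        # the lookup costs one digest
        if digest not in target_store:
            target_store[digest] = chunk
            bytes_sent += len(chunk)
    return bytes_sent


# 32,768 bytes of non-repeating stand-in data, backed up twice unchanged.
data = b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(1024))

target_store = {}
target_first = target_side_backup(data, target_store)
target_second = target_side_backup(data, target_store)

source_store = {}
source_first = source_side_backup(data, source_store)
source_second = source_side_backup(data, source_store)

print(target_first, target_second, source_first, source_second)
```

In this model both approaches end up storing the same unique segments. The target-side run sends the full 32,768 bytes on every backup, while the source-side run sends slightly more on the first pass (data plus digests) and only 256 bytes of digests on the second.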

Whether de-duplication occurs during or after the backup is also a serious concern. Data reduction is very CPU-intensive, and performing it while the backup is running can slow the backup down. Performing the de-duplication after the initial backup has been completed, however, requires more disk space and means that the data reduction must be finished before the next scheduled backup.
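The disk-space side of that tradeoff can be roughed out with simple arithmetic. The sketch below is a worst-case model built on assumptions, not a vendor sizing formula: it assumes a post-process system must hold the full unreduced backup and its reduced copy at the same time, and the function name and figures are hypothetical.

```python
def peak_disk_needed_gb(backup_size_gb, dedup_ratio, post_process):
    """Rough peak disk space needed for one backup, in gigabytes.

    Assumption: post-processing lands the full backup first, then writes
    the reduced copy before the landing area is freed.
    """
    reduced = backup_size_gb / dedup_ratio
    if post_process:
        return backup_size_gb + reduced
    return reduced


# A hypothetical 1,000GB backup reduced 20 to one.
inline_peak = peak_disk_needed_gb(1000, 20, post_process=False)
post_peak = peak_disk_needed_gb(1000, 20, post_process=True)
print(inline_peak, post_peak)
```

Under these assumptions, reducing the data during the backup needs about 50GB, while reducing it afterward needs about 1,050GB at its peak, in exchange for not slowing the backup itself.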

Users should also investigate scalability before they buy, says Balaouras, along with the data integrity issues raised by the number of times most systems run data through de-duplication and checking algorithms. But de-duplication is here to stay, and it’s accelerating movement toward disk backup, especially among SMBs without large investments in legacy tape systems.

“Tape will be around for a while — for one thing it’s got a better power and cooling profile than disk, and that’s important in today’s data center,” says Balaouras. “But data de-duplication is a reality — it will take some time to sort out the approaches, but it definitely changes the comparison with tape.”

IT Takeaway

To narrow your options, consider the following criteria:

• Location of the de-duplication — backup source or target
• Data integrity
• Scalability
• Maturity of the vendor offering (some systems have included de-duplication for several years, but in others it’s a new feature)

Jeff Gross is an IT manager at Tucker Industries in Bensalem, Pa.