Tactical Advice

Data Deduplication: The Case for Post-Process Deduping

There are real, distinct performance advantages to post-process deduplication with a grid architecture.
When choosing a disk-based backup deduplication solution, how do you evaluate the tradeoffs between post-process deduplication with a grid architecture and inline deduplication with a fixed controller?

Marc Crespi, vice president of product management for ExaGrid Systems, sees the benefits of grid deduplication in three key areas:

Highest Performance for Shortest Backup Window

Post-process deduplication in a grid with full servers offers the fastest backups for two reasons: the system deduplicates data only after it has landed on disk, and each full server brings its own CPU, memory, disk and Gigabit Ethernet. Post-process also enables the fastest restores because the disk backup system keeps a full copy of the most recent backup available in high-speed cache for immediate recovery. In contrast, with inline deduplication, the disk backup system performs the dedupe process before data is fully protected on disk, and for a full system restore the data must first be “rehydrated.”
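To make the post-process idea concrete, here is a minimal sketch in Python. It is an illustration only, not ExaGrid's implementation: the class `PostProcessStore`, its method names, the fixed 4-byte chunk size and the use of SHA-256 are all assumptions chosen to keep the example small. It shows the two properties described above: ingest does no dedupe work, and the most recent backup stays whole so it can be restored without rehydration.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical; tiny so the example is easy to follow


def chunks(data: bytes, size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks (a simplification for illustration)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class PostProcessStore:
    """Hypothetical store: lands backups in full, deduplicates them later."""

    def __init__(self):
        self.landing_zone = {}  # backup name -> full, undeduplicated bytes
        self.chunk_store = {}   # SHA-256 digest -> unique chunk bytes
        self.recipes = {}       # backup name -> ordered list of digests

    def write_backup(self, name: str, data: bytes) -> None:
        # Ingest path: no hashing or lookups, the data simply lands on disk.
        self.landing_zone[name] = data

    def dedupe(self, keep_latest: str) -> None:
        # Runs after the backup window; the latest backup is left whole.
        for name in list(self.landing_zone):
            if name == keep_latest:
                continue
            data = self.landing_zone.pop(name)
            recipe = []
            for chunk in chunks(data):
                digest = hashlib.sha256(chunk).hexdigest()
                self.chunk_store.setdefault(digest, chunk)  # store each unique chunk once
                recipe.append(digest)
            self.recipes[name] = recipe

    def restore(self, name: str) -> bytes:
        if name in self.landing_zone:
            return self.landing_zone[name]  # full copy: no rehydration needed
        # Older backups are reassembled ("rehydrated") from their chunks.
        return b"".join(self.chunk_store[d] for d in self.recipes[name])


store = PostProcessStore()
store.write_backup("monday", b"AAAABBBBCCCC")
store.write_backup("tuesday", b"AAAABBBBDDDD")
store.dedupe(keep_latest="tuesday")
```

After `dedupe` runs, restoring "tuesday" returns the landed copy directly, while restoring "monday" walks its recipe and rebuilds the data from the chunk store. An inline design would instead do the hashing and lookup work inside `write_backup`, on the ingest path.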

Performance Maintained as Data Grows

Grid architecture solutions maintain high performance as the disk backup system scales because you add full appliances including processor power, memory, bandwidth and disk matched to the amount of backup data. When the system needs to expand, additional full appliance nodes are attached to the grid, thereby maintaining all aspects of performance as data grows. With the inline (controller/disk shelf) model, all of the processing power, memory, and bandwidth are contained in the controller, so when data increases and IT staff expands the system by adding only disk shelves, backup performance degrades.
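The scaling argument comes down to simple arithmetic: the backup window is roughly the amount of data divided by ingest throughput. The sketch below compares the two models under idealized assumptions. Every figure in it (10 TB of capacity and 2 TB per hour of ingest per grid node, 2 TB per hour for the controller) is a made-up number for illustration, not a vendor specification, and it assumes grid throughput grows linearly with node count.

```python
import math

# Hypothetical figures chosen only to illustrate the arithmetic.
NODE_CAPACITY_TB = 10            # capacity added by each full grid appliance
NODE_INGEST_TB_PER_HOUR = 2      # ingest throughput added by each appliance
CONTROLLER_INGEST_TB_PER_HOUR = 2  # fixed, no matter how many shelves are attached


def backup_window_hours(data_tb: float, ingest_tb_per_hour: float) -> float:
    """Backup window is data volume divided by ingest throughput."""
    return data_tb / ingest_tb_per_hour


def grid_window_hours(data_tb: float) -> float:
    # Each added node brings capacity AND throughput, so ingest scales with data.
    nodes = math.ceil(data_tb / NODE_CAPACITY_TB)
    return backup_window_hours(data_tb, nodes * NODE_INGEST_TB_PER_HOUR)


def controller_window_hours(data_tb: float) -> float:
    # Disk shelves add capacity only; throughput stays at the controller's limit.
    return backup_window_hours(data_tb, CONTROLLER_INGEST_TB_PER_HOUR)


for data_tb in (10, 20, 40):
    print(data_tb, grid_window_hours(data_tb), controller_window_hours(data_tb))
```

Under these assumptions the grid's window stays at 5 hours as data grows from 10 TB to 40 TB, while the controller's window grows from 5 hours to 20 hours. Real systems will not scale perfectly linearly, so treat the output as a way to reason about the tradeoff rather than a prediction.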

Control Costs at Scale

Disk backup systems with deduplication that are built on a grid architecture are the most cost-effective to scale: as data grows, full servers can be added to the grid in modular increments as needed, without replacing existing nodes. Grid capacity is typically load-balanced automatically, which maintains a virtual pool of storage that is shared across all nodes. This contrasts with the controller/disk shelf model, which adds disk to a fixed-capacity controller as data grows, resulting in longer backup windows. In this scenario, the controller must eventually be replaced with the next larger controller in a costly forklift upgrade.

For more on the benefits of grid-based deduplication, read Crespi’s post on Data Center Knowledge.

About the Author

Ricky Ribeiro

Online Content Manager

Ricky publishes and manages the content on BizTech magazine's web site. He's a writer, technology enthusiast, social media lover and all-around digital guy.