Tactical Advice

Data Deduplication: The Case for Post-Process Deduping

There are real, distinct performance advantages to post-process deduplication with a grid architecture.

When choosing a disk-based backup deduplication solution, how do you evaluate the tradeoffs between post-process deduplication with a grid architecture and inline deduplication with a fixed controller?

Marc Crespi, vice president of product management for ExaGrid Systems, sees the benefits of grid deduplication in three key areas:

Highest Performance for Shortest Backup Window

Post-process deduplication in a grid with full servers offers the fastest backups because the system deduplicates data only after it has landed on disk, and because each full server brings its own CPU, memory, disk and Gigabit Ethernet. Post-process also enables the fastest restores because the disk backup system keeps a full copy of the most recent backup in high-speed cache for immediate recovery. In contrast, with inline deduplication, the disk backup system performs the dedupe process before data is fully protected on disk, and for a full system restore the data must first be “rehydrated.”
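
The difference in write and restore paths can be illustrated with a minimal sketch. It is a toy model, not ExaGrid's implementation: the class name PostProcessStore, the landing_zone dictionary, the fixed 4-byte chunk size and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # assumed, and unrealistically small so the example is easy to follow


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Split data into fixed-size chunks (an assumption; real products differ)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class PostProcessStore:
    """Hypothetical store that lands a backup untouched and deduplicates it later."""

    def __init__(self):
        self.landing_zone = {}  # backup name -> full, undeduplicated bytes
        self.chunk_store = {}   # SHA-256 digest -> unique chunk bytes
        self.recipes = {}       # backup name -> ordered list of digests

    def write(self, name: str, data: bytes) -> None:
        # The backup job only pays for a plain write; no hashing happens here.
        self.landing_zone[name] = data

    def deduplicate(self, name: str) -> None:
        # Runs after the backup job has finished.
        recipe = []
        for chunk in split_chunks(self.landing_zone[name]):
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunk_store.setdefault(digest, chunk)  # store each unique chunk once
            recipe.append(digest)
        self.recipes[name] = recipe

    def evict(self, name: str) -> None:
        # Drop the full copy, for example when a newer backup replaces it.
        self.landing_zone.pop(name, None)

    def restore(self, name: str) -> bytes:
        if name in self.landing_zone:
            # Most recent backup: returned as-is, no rehydration needed.
            return self.landing_zone[name]
        # Older backup: rehydrate by reassembling chunks from the recipe.
        return b"".join(self.chunk_store[d] for d in self.recipes[name])


store = PostProcessStore()
store.write("monday", b"AAAABBBBAAAA")
store.deduplicate("monday")
print(len(store.chunk_store), "unique chunks stored")
print(store.restore("monday"))
```

An inline design would, in this model, run the hashing loop inside write itself and keep no full copy, so every restore would take the rehydration branch.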

Performance Maintained as Data Grows

Grid architecture solutions maintain high performance as the disk backup system scales because you add full appliances including processor power, memory, bandwidth and disk matched to the amount of backup data. When the system needs to expand, additional full appliance nodes are attached to the grid, thereby maintaining all aspects of performance as data grows. With the inline (controller/disk shelf) model, all of the processing power, memory, and bandwidth are contained in the controller, so when data increases and IT staff expands the system by adding only disk shelves, backup performance degrades.
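
The scaling argument can be made concrete with a back-of-the-envelope model. The sketch below assumes ideal linear scaling across grid nodes, and every figure in it (controller throughput, per-node throughput, per-node capacity) is a made-up illustrative value, not a vendor measurement.

```python
import math

# All three constants are illustrative assumptions, not measured figures.
CONTROLLER_TB_PER_HOUR = 2.0  # fixed ingest rate of a single controller
NODE_TB_PER_HOUR = 1.0        # ingest rate contributed by each grid node
NODE_CAPACITY_TB = 10.0       # backup data each grid node is sized for


def controller_window_hours(data_tb: float) -> float:
    """Backup window when only disk shelves are added: throughput stays fixed."""
    return data_tb / CONTROLLER_TB_PER_HOUR


def grid_window_hours(data_tb: float) -> float:
    """Backup window when a full node is added for each increment of data."""
    nodes = math.ceil(data_tb / NODE_CAPACITY_TB)
    return data_tb / (nodes * NODE_TB_PER_HOUR)


for data_tb in (10, 20, 40):
    print(data_tb, "TB:",
          "controller", controller_window_hours(data_tb), "h,",
          "grid", grid_window_hours(data_tb), "h")
```

Under these assumptions the controller's window grows in direct proportion to the data, while the grid's window stays flat because throughput is added along with capacity. Real systems will not scale perfectly linearly, so the model is best read as a way to frame questions for a vendor rather than as a prediction.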

Control Costs at Scale

Disk backup systems with deduplication that are based on a grid architecture are the most cost-effective to scale because, as data grows, full servers can be added to the grid in modular increments as needed without replacing existing nodes. Grid capacity is typically load-balanced automatically, which maintains a virtual pool of storage that is shared across all nodes. This contrasts with the controller/disk shelf model, which adds disk to a fixed-capacity controller as data grows, so backup windows lengthen. In that scenario, the controller must eventually be replaced with the next larger controller through a costly forklift upgrade.

For more on the benefits of grid-based deduplication, read Crespi’s post on Data Center Knowledge.

About the Author

Ricky Ribeiro

Online Content Manager

Ricky publishes and manages the content on BizTech magazine's website. He's a writer, technology enthusiast, social media lover and all-around digital guy. You can learn more by following him on Google+ or Twitter.
