Chapter 7: Data Protection: Sources, Policies, and Protection Groups

Learning Objectives

If the previous chapters built the platform — clusters, networks, identity — this chapter is where Cohesity finally earns its keep. Data Protection is the day-job: pulling backups from a sprawling, heterogeneous estate of hypervisors, file servers, databases, and SaaS tenants; storing those copies efficiently; and proving, at 3 a.m. on the worst day of someone's career, that the data can come back.

The CCAE exam tests three intersecting constructs: Sources (what you protect), Policies (how often, how long, where copies go), and Protection Groups (the binding object that stitches the two together). Master those three nouns, and most of the data-protection blueprint falls into place.

Figure 7.1: End-to-end data protection object model

flowchart TD
    A["Source<br/>vCenter / Hyper-V / Prism / Physical / NAS / DB"] --> B["Protection Group<br/>Membership: Static / Container / Tag"]
    B --> C["Policy<br/>SLA Contract"]
    C --> D["Schedule<br/>Frequency / RPO"]
    C --> E["Retention<br/>GFS Hierarchy + DataLock"]
    D --> F["Snapshots<br/>Local Cluster Storage"]
    E --> F
    F --> G["Replication<br/>DR Cluster"]
    F --> H["Archive<br/>CloudArchive / S3 Glacier"]
    style A fill:#1f6feb,color:#fff
    style C fill:#238636,color:#fff
    style F fill:#8957e5,color:#fff
Animation 1: Data Flow — Source through Protection Group, Policy, Snapshots, and beyond

Section 1: Source Registration and Discovery

Before Cohesity can protect anything, it must know that the source exists, hold credentials to talk to it, and understand its API surface. Source registration is the moment a "production system" becomes a "discoverable, protectable inventory" inside Cohesity.

vCenter, SCVMM, and Nutanix Prism Integration

For VMware environments, the primary handshake is at the vCenter level. You register vCenter once, and Cohesity walks the entire managed inventory — datacenters, clusters, hosts, resource pools, folders, datastores, tags, and individual VMs. Registration requires a service account with sufficient privileges to read inventory, snapshot VMs, and (for some restore paths) attach virtual disks. Most architects create a dedicated svc-cohesity account in vCenter rather than reusing a domain admin.

Critically, registration is the moment to set per-datastore stream caps. After Cohesity discovers all datastores, you can override global stream limits by enabling a Cap and setting a maximum number of concurrent backup streams per datastore. This is one of the most commonly overlooked exam topics: a small, hot all-flash datastore hosting tier-1 transactional workloads should not be saturated by a 32-stream backup job hammering its queues. For Microsoft Hyper-V, registration goes through SCVMM (or directly to standalone Hyper-V hosts), and Cohesity uses Resilient Change Tracking (RCT) instead of CBT for incremental detection. Nutanix AHV registers via Prism Element or Prism Central; Cohesity then uses Nutanix's native snapshot APIs.
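To make the per-datastore stream cap concrete, here is a minimal sketch of how an override might interact with a global limit. It is a toy model, not Cohesity's implementation: the function names, the datastore names, and the global default of 32 are all assumptions chosen for illustration.

```python
from typing import Dict

GLOBAL_STREAM_LIMIT = 32  # hypothetical cluster-wide default, not a documented value


def effective_stream_limit(datastore: str, caps: Dict[str, int],
                           global_limit: int = GLOBAL_STREAM_LIMIT) -> int:
    """Return the concurrent-stream ceiling for one datastore.

    A per-datastore cap, when present, overrides the global limit.
    """
    cap = caps.get(datastore)
    if cap is None:
        return global_limit
    if cap < 1:
        raise ValueError(f"cap for {datastore!r} must be at least 1")
    return cap


def admit_streams(requested: Dict[str, int], caps: Dict[str, int]) -> Dict[str, int]:
    """Clamp the streams a backup run asks for to each datastore's ceiling."""
    return {ds: min(n, effective_stream_limit(ds, caps)) for ds, n in requested.items()}


# Hypothetical datastores: one small flash tier with a cap, one uncapped.
caps = {"ds-flash-tier1": 4}
requested = {"ds-flash-tier1": 32, "ds-capacity-01": 32}
admitted = admit_streams(requested, caps)
print(admitted)
```

In this model the capped flash datastore is held to 4 streams while the uncapped datastore still receives the full 32, which is the behavior the registration-time cap is meant to achieve.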

Physical Agent for Linux and Windows

Not everything is virtualized, and Cohesity's physical agent handles the rest. The Cohesity Agent is a lightweight binary that runs on Linux or Windows and provides three modes:

NAS Sources via SMB/NFS and NDMP

NAS protection has two main flavors. For modern NAS (NetApp ONTAP, Dell PowerScale/Isilon, Pure FlashBlade, generic Linux NFS exporters, Windows file servers), Cohesity registers the share over SMB or NFS and walks the namespace. For legacy or large enterprise NAS where snapshot-and-stream is preferable, Cohesity drives backups via NDMP — talking directly to the array's tape-out protocol but redirecting the stream into Cohesity instead of physical tape.

Database Sources: Oracle, SQL, SAP HANA, Exchange

Key Points

Pre-Quiz: Source Registration

1. An architect is registering a vCenter source on Cohesity. Which configuration step at registration time is most commonly overlooked but materially impacts production I/O?

2. Which change-tracking mechanism does Cohesity use for Microsoft Hyper-V incremental backups?

Post-Quiz: Source Registration

1. An architect is registering a vCenter source on Cohesity. Which configuration step at registration time is most commonly overlooked but materially impacts production I/O?

2. Which change-tracking mechanism does Cohesity use for Microsoft Hyper-V incremental backups?

Section 2: Policies and Schedules

A Cohesity Protection Policy is the SLA expressed in code. It encapsulates how often a backup runs (RPO), how long copies are kept (retention), where copies go (replication and archival targets), and any immutability rules. Critically, a single policy can express the entire lifecycle of a backup — from the first snapshot on local cluster storage all the way through replication to a DR site and archival to S3 Glacier seven years later.
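One way to picture "the SLA expressed in code" is as a small immutable record. The sketch below is illustrative only: the class names, field names, target names, and the 30-day DR retention are assumptions, not Cohesity's object schema.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List, Optional


@dataclass(frozen=True)
class CopyTarget:
    name: str             # hypothetical target label, e.g. a DR cluster or archive
    retention: timedelta  # how long the copy is kept on that target


@dataclass(frozen=True)
class ProtectionPolicy:
    name: str
    frequency: timedelta             # how often a backup runs (drives RPO)
    local_retention: timedelta       # how long snapshots stay on the local cluster
    replication: Optional[CopyTarget] = None
    archive: Optional[CopyTarget] = None
    datalock: Optional[str] = None   # "compliance", "governance", or None

    def lifecycle(self) -> List[str]:
        """Describe each stage a backup passes through under this policy."""
        steps = [f"snapshot every {self.frequency}, keep locally "
                 f"{self.local_retention.days} days"]
        if self.replication:
            steps.append(f"replicate to {self.replication.name}, keep "
                         f"{self.replication.retention.days} days")
        if self.archive:
            steps.append(f"archive to {self.archive.name}, keep "
                         f"{self.archive.retention.days} days")
        return steps


gold = ProtectionPolicy(
    name="Gold",
    frequency=timedelta(minutes=15),
    local_retention=timedelta(days=30),
    replication=CopyTarget("dr-cluster", timedelta(days=30)),   # assumed value
    archive=CopyTarget("s3-glacier", timedelta(days=7 * 365)),
    datalock="compliance",
)
for step in gold.lifecycle():
    print(step)
```

Making the record frozen mirrors the idea that a policy is a contract: it is replaced deliberately rather than mutated in passing.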

Frequency, Retention, and Lock Attributes

The minimum RPO Cohesity can express in a standard policy is 15 minutes for hypervisor-based backups using Redirect-on-Write (RoW) snapshots. Tighter RPOs — sub-minute, even continuous — are achievable when integrated with primary array snapshots through SmartCopy.

Retention is configured as "Keep for N days/weeks/months/years" and supports DataLock attributes for compliance and ransomware resilience. Two flavors exist: Compliance Lock (truly immutable, legally enforceable) and Governance Lock (soft-immutable, can be overridden by a quorum of admins). For SOX, HIPAA, and PCI workloads, Compliance Lock is mandatory.
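The difference between the two DataLock flavors can be sketched as a single deletion check. This is a toy model built only from the description above; the function name and the quorum size of 2 are assumptions, and a real cluster enforces these rules internally rather than through user code.

```python
from datetime import datetime
from typing import Optional


def can_delete(lock: Optional[str], locked_until: datetime, now: datetime,
               approvals: int = 0, quorum: int = 2) -> bool:
    """Decide whether a snapshot may be deleted at time `now`.

    lock is "compliance", "governance", or None (no DataLock).
    quorum is a hypothetical number of admin approvals, not a product default.
    """
    if lock is None or now >= locked_until:
        return True
    if lock == "compliance":
        return False  # immutable until the lock expires, no override
    if lock == "governance":
        return approvals >= quorum
    raise ValueError(f"unknown lock type: {lock!r}")


until = datetime(2031, 1, 1)
today = datetime(2025, 6, 1)
print(can_delete("compliance", until, today, approvals=5))  # False
print(can_delete("governance", until, today, approvals=2))  # True
```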

Hierarchical Retention (GFS)

Cohesity policies natively support the Grandfather-Father-Son (GFS) retention model — you can promote the first snapshot of each day, week, month, and year into longer-retention buckets. Combined with global variable-length deduplication, the storage cost of a 7-year monthly retention is far lower than naive arithmetic suggests, because unchanged blocks are stored once across the entire chain.
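The promotion rule (first snapshot of each day, week, month, and year) can be sketched in a few lines. The function name and bucket labels are assumptions for illustration, and weeks are taken to be ISO weeks starting on Monday, which may differ from how a given policy defines its week.

```python
from datetime import datetime
from typing import Dict, List


def gfs_promotions(snapshots: List[datetime]) -> Dict[datetime, List[str]]:
    """Tag the first snapshot of each day, ISO week, month, and year."""
    seen = {"daily": set(), "weekly": set(), "monthly": set(), "yearly": set()}
    tags: Dict[datetime, List[str]] = {}
    for ts in sorted(snapshots):
        iso_year, iso_week, _ = ts.isocalendar()
        keys = {
            "daily": ts.date(),
            "weekly": (iso_year, iso_week),
            "monthly": (ts.year, ts.month),
            "yearly": ts.year,
        }
        for bucket, key in keys.items():
            if key not in seen[bucket]:
                seen[bucket].add(key)
                tags.setdefault(ts, []).append(bucket)
    return tags


snaps = [
    datetime(2024, 1, 1, 0, 0),   # Monday: first of its day, week, month, year
    datetime(2024, 1, 1, 12, 0),  # same day: not promoted
    datetime(2024, 1, 2, 0, 0),   # new day only
    datetime(2024, 2, 1, 0, 0),   # new day, week, and month
]
tags = gfs_promotions(snaps)
for ts, buckets in tags.items():
    print(ts.isoformat(), buckets)
```

Snapshots that earn no tag simply age out under the base retention; only the tagged ones move into the longer-retention buckets.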

Policy Templates and Re-Use

A core architectural principle: one policy per SLA tier, not per workload. If you have 50 SQL servers, 200 file shares, and 1,200 VMs all in the "Gold" tier, they should all reference the same Gold policy. When the SLA changes — and it will — you edit one object instead of 1,450.

The SLA Analogy: A protection policy is the SLA contract; a Protection Group is the customer roster. The same Gold contract can be sold to a hundred customers (Protection Groups), and changing the contract terms automatically updates all subscribers.
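The contract-and-roster idea reduces to groups holding a reference to a policy rather than a copy of it. A minimal sketch, with hypothetical group names and settings:

```python
from typing import Dict

# One policy object per SLA tier (values are illustrative, not recommendations).
policies: Dict[str, Dict[str, int]] = {
    "gold": {"frequency_min": 15, "local_retention_days": 30},
}

# Each Protection Group stores only the name of the policy it subscribes to.
groups: Dict[str, str] = {
    "pg-sql-gold": "gold",
    "pg-files-gold": "gold",
    "pg-vms-gold": "gold",
}


def effective_settings(group: str) -> Dict[str, int]:
    """Resolve a group's settings through its policy reference."""
    return policies[groups[group]]


# The SLA changes: edit the single Gold policy, not every group.
policies["gold"]["local_retention_days"] = 45

for name in groups:
    print(name, effective_settings(name))
```

Because no group carries its own copy of the settings, there is nothing to drift out of sync when the tier definition changes.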

Reference SLA Tier Design (Gold/Silver/Bronze)

Tier | Frequency (RPO) | Local Retention | Replication | Archive | Target RTO
Gold | Every 15 min via SmartCopy | 30 days, app-consistent | Async every cycle to DR | Monthly, 7+ yrs, Compliance Lock | Minutes (Instant Mass Restore)
Silver | Every 4–6 hrs, CBT/RCT | 14–30 days | Async daily to DR | Monthly, 3–5 yrs | < 1 hour
Bronze | Daily, crash-consistent OK | 7–14 days | None or weekly | Quarterly, 1 yr | < 4 hours

Figure 7.2: Tiered policy decision tree

graph TD
    A[Workload SLA Requirement] --> B{RPO needed?}
    B -->|"<= 15 min"| C{RTO needed?}
    B -->|"4-6 hours"| D[Silver Tier]
    B -->|">= 24 hours"| E[Bronze Tier]
    C -->|"Minutes<br/>Instant Mass Restore"| F[Gold Tier]
    C -->|"< 1 hour"| D
    F --> G["SmartCopy + Pure<br/>30d local + DR replication<br/>7y archive + Compliance Lock"]
    D --> H["CBT/RCT Hypervisor<br/>14-30d local + DR daily<br/>3-5y archive"]
    E --> I["Daily crash-consistent<br/>7-14d local<br/>1y archive"]
    style F fill:#d4af37,color:#000
    style D fill:#c0c0c0,color:#000
    style E fill:#cd7f32,color:#fff
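The decision tree in Figure 7.2 can be written as a small function. This is a sketch under stated assumptions: the tree leaves gaps (for example, an RPO of one hour), so any RPO under 24 hours that does not qualify for Gold is mapped to Silver here, and "Minutes" of RTO is interpreted as 15 minutes or less. Both choices are mine, not the source's.

```python
GOLD_RTO_MINUTES = 15  # assumed meaning of "Minutes" in the tree


def choose_tier(rpo_minutes: float, rto_minutes: float) -> str:
    """Map an RPO/RTO requirement to Gold, Silver, or Bronze per Figure 7.2."""
    if rpo_minutes <= 15 and rto_minutes <= GOLD_RTO_MINUTES:
        return "Gold"
    if rpo_minutes < 24 * 60:
        return "Silver"   # includes the tree's "RPO <= 15 min, RTO < 1 hour" branch
    return "Bronze"


print(choose_tier(rpo_minutes=15, rto_minutes=5))      # Gold
print(choose_tier(rpo_minutes=15, rto_minutes=45))     # Silver
print(choose_tier(rpo_minutes=300, rto_minutes=45))    # Silver
print(choose_tier(rpo_minutes=1440, rto_minutes=200))  # Bronze
```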

Key Points

Pre-Quiz: Policies and Schedules

3. Which DataLock flavor is required for SOX, HIPAA, and PCI workloads where backup deletion must be impossible even with admin override?

4. An enterprise has 50 SQL servers, 200 file shares, and 1,200 VMs all classified as Gold-tier. What is the recommended policy design?

5. What is the minimum RPO Cohesity supports in a standard policy for hypervisor-based backups?

Post-Quiz: Policies and Schedules

3. Which DataLock flavor is required for SOX, HIPAA, and PCI workloads where backup deletion must be impossible even with admin override?

4. An enterprise has 50 SQL servers, 200 file shares, and 1,200 VMs all classified as Gold-tier. What is the recommended policy design?

5. What is the minimum RPO Cohesity supports in a standard policy for hypervisor-based backups?

Section 3: Protection Groups

A Protection Group is the binding object that connects a set of source objects to a single policy. It also holds operational settings such as proxy assignment, indexing options, pre/post scripts, application-quiesce flags, and exclude lists. The architectural decision that dominates Protection Group design is how membership is determined: statically, by container, or by tag.

Static Membership vs. Tag-Based Auto-Protection

Static membership means the administrator hand-picks individual objects at job-creation time. The Protection Group's scope never changes unless someone edits it. This is highly deterministic and auditable — but the risk is silent under-protection: a new VM provisioned by a junior engineer in a regulated environment may not appear in any Protection Group for weeks until someone notices.

Auto-protect binds the Protection Group to selected parent objects (datacenters, folders, clusters, hosts, resource pools) rather than to individual VMs, and supports vSphere tags for inclusion and exclusion. Any VM later added to a selected container is swept into the next backup run automatically.

Auto-Protect with vSphere Tags — The AND/OR Quirk

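Tag-based auto-protect has a non-obvious behavior that frequently appears on the CCAE exam: how tags are combined depends on how they were added to the filter. Tags added one at a time, in separate operations, are combined with OR; tags added together in a single operation are combined with AND.

If you want to exclude everything tagged dev or lab, add the two tags one by one (OR). If you want to exclude only VMs that are tagged both dev and decommissioned, add the two tags together (AND). Misreading this distinction has caused architects to either over-protect (sweeping an entire dev fleet into Gold) or under-protect (silently excluding production workloads).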
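The AND/OR behavior is easier to remember as set logic. In the sketch below, each "add" operation is modeled as one group of tags: tags inside a group must all match (AND), and a VM is excluded if any group matches (OR). The function name and data shapes are assumptions for illustration, not Cohesity's filter format.

```python
from typing import FrozenSet, List, Set


def is_excluded(vm_tags: Set[str], exclude_groups: List[FrozenSet[str]]) -> bool:
    """True if every tag of at least one exclusion group is present on the VM."""
    return any(group <= vm_tags for group in exclude_groups)


# Two separate add operations: dev OR lab.
separate = [frozenset({"dev"}), frozenset({"lab"})]
# One add operation with both tags: dev AND decommissioned.
together = [frozenset({"dev", "decommissioned"})]

print(is_excluded({"dev"}, separate))                    # True
print(is_excluded({"lab", "tier=gold"}, separate))       # True
print(is_excluded({"dev"}, together))                    # False
print(is_excluded({"dev", "decommissioned"}, together))  # True
```

Note the third case: a VM tagged only dev survives the AND filter, which is exactly how a dev fleet can end up protected by mistake.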

Figure 7.3: Auto-protect via vSphere tags — dynamic membership update flow

flowchart LR
    A["vCenter Tag<br/>tier=gold"] --> B[Cohesity Inventory Sync]
    B --> C{"Tag Filter<br/>Include / Exclude"}
    C -->|Match| D["Dynamic Membership<br/>Update"]
    C -->|No Match| E[Excluded from PG]
    D --> F["Protection Group<br/>pg-gold-auto"]
    F --> G["Next Backup Run<br/>New VMs Auto-Swept"]
    H["New VM<br/>Provisioned + Tagged"] -.-> A
    I["Untagged VM<br/>Removed"] -.-> E
    style A fill:#1f6feb,color:#fff
    style D fill:#238636,color:#fff
    style F fill:#8957e5,color:#fff
Animation 2: Auto-Protect via vSphere Tags — VM appears, tag detected, filter matched, included in PG

When to Use Each Membership Model

Membership Model | Best For | Risks | Audit Posture
Static | PCI/HIPAA/SOX-scoped VMs; small, slow-changing high-value sets | New VMs silently unprotected | Strongest (explicit list)
Container auto-protect | Well-organized vSphere with folder-per-business-unit | Folder reorganization can shift scope | Good (provided folder hygiene)
Tag auto-protect | Cross-cutting concerns (tier=gold, app=sql) where folder structure is contested | Tag drift, AND/OR confusion | Moderate (requires tag governance)

App-Consistent vs. Crash-Consistent Backups

A crash-consistent backup captures the disk state as if the system had been suddenly powered off. An app-consistent backup pauses the application briefly so it can flush buffers, checkpoint state, and quiesce I/O before the snapshot is taken. On Windows, this is VSS; for databases, it is the engine's own quiesce API (RMAN, VDI, Backint). Architects should default to app-consistent for any VM running a database or transactional system.

Key Points

Pre-Quiz: Protection Groups

6. An architect adds two exclusion tags dev and lab to a Cohesity Protection Group's auto-protect filter, in two separate "add" operations. What is the resulting exclusion logic?

7. A regulated PCI environment requires explicit, auditable proof that every cardholder-data VM is included in a Protection Group. Which membership model best meets this requirement?

8. A Protection Group backs up a VM hosting a SQL Server transactional database. The architect should configure the backup as:

Post-Quiz: Protection Groups

6. An architect adds two exclusion tags dev and lab to a Cohesity Protection Group's auto-protect filter, in two separate "add" operations. What is the resulting exclusion logic?

7. A regulated PCI environment requires explicit, auditable proof that every cardholder-data VM is included in a Protection Group. Which membership model best meets this requirement?

8. A Protection Group backs up a VM hosting a SQL Server transactional database. The architect should configure the backup as:

Section 4: Performance and Concurrency

A perfectly designed policy is worthless if backups run hot enough to crash production. Performance tuning sits at the intersection of source impact, network bandwidth, proxy capacity, and cluster ingest throughput.

SmartCopy and Storage Snapshot Integration

SmartCopy is Cohesity's snapshot-based copy and replication mechanism that integrates directly with primary storage arrays — most prominently Pure Storage FlashArray, but also NetApp, HPE Nimble/Primera, and Dell PowerStore via partner integrations. Rather than running a hypervisor-side or in-guest backup that competes with production I/O, Cohesity drives the array's own snapshot APIs and ingests data from those snapshots.

The architecture flow:

  1. Discovery — Register the Pure FlashArray as a source.
  2. Policy assignment — Assign a Protection Policy with snapshot frequency, retention on the array, retention on Cohesity, replication, and archive.
  3. Snapshot creation — At schedule time, Cohesity calls the Pure REST API. Optional pre/post scripts quiesce SQL/Oracle/Exchange.
  4. Mount and read — Cohesity mounts the snapshot via iSCSI, reads only changed blocks, and ingests through inline dedupe and compression.
  5. Retention tiering — Recent snapshots remain on Pure for instant restore at flash speed; older snapshots tier to Cohesity for long-term recovery.
  6. Recovery — Volume-level restore back to any Pure FlashArray, file-level via SmartFiles mount, or cross-platform to native cloud VMs.
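Steps 3 and 4 above (quiesce, snapshot, release, then read only changed blocks) can be sketched with in-memory stand-ins. Nothing here calls a real API: FakeApp and FakeArray are hypothetical classes invented for this example and do not represent the Pure REST API or any Cohesity interface.

```python
from typing import Dict, List, Optional, Tuple


class FakeApp:
    """Stand-in for a database that supports quiesce and release."""
    def __init__(self) -> None:
        self.quiesced = False

    def quiesce(self) -> None:
        self.quiesced = True

    def release(self) -> None:
        self.quiesced = False


class FakeArray:
    """In-memory stand-in for a storage array volume and its snapshots."""
    def __init__(self, blocks: Dict[int, bytes]) -> None:
        self.volume = dict(blocks)
        self.snapshots: List[Dict[int, bytes]] = []

    def take_snapshot(self) -> int:
        self.snapshots.append(dict(self.volume))  # point-in-time copy
        return len(self.snapshots) - 1

    def changed_blocks(self, snap_id: int, base_id: Optional[int]) -> Dict[int, bytes]:
        snap = self.snapshots[snap_id]
        base = self.snapshots[base_id] if base_id is not None else {}
        return {i: b for i, b in snap.items() if base.get(i) != b}


def run_backup(app: FakeApp, array: FakeArray,
               last_snap_id: Optional[int]) -> Tuple[int, Dict[int, bytes]]:
    """Quiesce, snapshot, release, then read only blocks changed since last run."""
    app.quiesce()
    try:
        snap_id = array.take_snapshot()
    finally:
        app.release()  # never leave the application quiesced on failure
    return snap_id, array.changed_blocks(snap_id, last_snap_id)


app = FakeApp()
array = FakeArray({0: b"a", 1: b"b"})
s0, d0 = run_backup(app, array, None)   # first run reads every block
array.volume[1] = b"B"                  # production write between runs
s1, d1 = run_backup(app, array, s0)     # second run reads only block 1
print(sorted(d0), sorted(d1))
```

The quiesce window covers only the snapshot call, not the data transfer, which is why the application sees a brief pause rather than a long one.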

The exam-relevant point: SmartCopy enables sub-15-minute RPOs with zero hypervisor overhead and is the canonical Gold-tier mechanism for transactional databases sitting on Pure.

Figure 7.4: SmartCopy with Pure FlashArray — orchestration sequence

sequenceDiagram
    participant App as SQL/Oracle App
    participant Cohesity as Cohesity Cluster
    participant Pure as Pure FlashArray
    participant Archive as CloudArchive (S3)
    Cohesity->>App: Pre-script: Quiesce (VSS/RMAN)
    App-->>Cohesity: Quiesce ACK
    Cohesity->>Pure: REST API: Take Snapshot
    Pure-->>Pure: Native array snapshot created
    Pure-->>Cohesity: Snapshot ID
    Cohesity->>App: Post-script: Release quiesce
    Cohesity->>Pure: Mount snapshot (iSCSI)
    Pure-->>Cohesity: Stream changed blocks only
    Cohesity-->>Cohesity: Inline dedupe + compression
    Note over Pure: Recent snapshots retained<br/>on flash for instant restore
    Note over Cohesity: Older snapshots tier<br/>to Cohesity DataPlatform
    Cohesity->>Archive: Tier monthly to S3 Glacier
Animation 3: SmartCopy — snapshot stays on Pure flash for fast restore, then tiers to Cohesity

CBT/RCT and Incremental Forever

For non-array-integrated VM backups, Cohesity uses CBT on VMware and RCT on Hyper-V. The hypervisor maintains a bitmap of changed blocks since the last backup; Cohesity reads only those changed blocks. Combined with global variable-length deduplication on ingest, this delivers an Incremental Forever model: a single full backup at job inception, then deltas only.
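The Incremental Forever model can be sketched as a block map plus a content-addressed store. This is a deliberate simplification: it uses fixed-size blocks and SHA-256 fingerprints, whereas the text describes variable-length deduplication, and the class and function names are hypothetical.

```python
import hashlib
from typing import Dict, List, Set


class DedupeStore:
    """Content-addressed store: identical data is kept once."""
    def __init__(self) -> None:
        self.chunks: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.chunks.setdefault(digest, data)  # no-op if already stored
        return digest


def incremental_backup(disk: List[bytes], changed: Set[int],
                       previous: Dict[int, str], store: DedupeStore) -> Dict[int, str]:
    """Build a full block map by reading only the blocks flagged as changed."""
    current = dict(previous)            # unchanged blocks reuse old references
    for idx in changed:
        current[idx] = store.put(disk[idx])
    return current


store = DedupeStore()
disk = [b"A", b"B", b"A"]

# Job inception: every block counts as changed, so this is the one full backup.
full = incremental_backup(disk, {0, 1, 2}, {}, store)

# Later run: the change bitmap reports only block 1.
disk[1] = b"C"
inc = incremental_backup(disk, {1}, full, store)

print(len(store.chunks))   # unique chunks stored across both runs
print(inc[0] == full[0])   # unchanged block shares the same reference
```

Each run still yields a complete block map, so any point in time can be restored directly without replaying a chain of incrementals.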

Throttling and QoS

Cohesity supports time-windowed bandwidth throttling for replication and archive traffic — for example, capping replication to 200 Mbps during business hours and lifting the cap overnight. Per-policy QoS lets you mark Gold backups higher-priority than Bronze on a shared cluster.
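A time-windowed throttle is essentially a lookup from the current time to a bandwidth cap. The sketch below assumes business hours of 08:00 to 18:00 and a tuple-based window format; both are illustrative choices, not Cohesity settings.

```python
from datetime import time
from typing import List, Optional, Tuple

Window = Tuple[time, time, int]  # (start, end, cap in Mbps)


def bandwidth_cap(now: time, windows: List[Window]) -> Optional[int]:
    """Return the cap in Mbps that applies at `now`, or None if uncapped."""
    for start, end, cap in windows:
        if start <= end:
            active = start <= now < end
        else:  # window wraps past midnight, e.g. 22:00 to 06:00
            active = now >= start or now < end
        if active:
            return cap
    return None


business_hours = [(time(8, 0), time(18, 0), 200)]
print(bandwidth_cap(time(10, 30), business_hours))  # 200
print(bandwidth_cap(time(23, 0), business_hours))   # None
```

Returning None for "no cap" keeps the overnight case explicit instead of encoding it as an arbitrarily large number.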

Key Points

Pre-Quiz: Performance and Concurrency

9. A Gold-tier SQL database lives on a Pure FlashArray and requires a 5-minute RPO with minimal hypervisor overhead. Which Cohesity mechanism best meets this requirement?

10. In a SmartCopy + Pure deployment, where do the most recent snapshots reside, and why?

11. A small all-flash datastore hosting a tier-1 OLTP database is being saturated by a 32-stream Cohesity backup job. What is the most appropriate fix?

Post-Quiz: Performance and Concurrency

9. A Gold-tier SQL database lives on a Pure FlashArray and requires a 5-minute RPO with minimal hypervisor overhead. Which Cohesity mechanism best meets this requirement?

10. In a SmartCopy + Pure deployment, where do the most recent snapshots reside, and why?

11. A small all-flash datastore hosting a tier-1 OLTP database is being saturated by a 32-stream Cohesity backup job. What is the most appropriate fix?

Chapter Summary

This chapter unpacked the trio of objects that drive Cohesity data protection: Sources, Policies, and Protection Groups. You learned how to register vCenter, SCVMM, Prism, physical hosts, NAS, and databases, and why per-datastore stream caps should be set at registration time. You walked through GFS hierarchical retention, DataLock immutability, and the tiered Gold/Silver/Bronze SLA model. You compared static membership against container- and tag-based auto-protect, including the AND/OR tag logic that catches careless architects on the exam. And you traced how SmartCopy with Pure FlashArray enables sub-15-minute RPOs without hypervisor overhead.

Hold the analogy in mind: a policy is the SLA contract; a Protection Group is the customer roster.
