Chapter 8: Application-Aware Backup and Recovery Patterns

Cohesity CCAE Exam Prep · Interactive Study Guide

Learning Objectives

Pre-Quiz — Test Your Starting Knowledge

1. Which interface does Cohesity register itself as to integrate with Oracle's Recovery Manager?

  a. VDI (Virtual Device Interface)
  b. SBT (System Backup to Tape)
  c. Backint shared library
  d. VADP (vStorage APIs for Data Protection)

2. Which two TCP ports does the Cohesity Linux Agent require for Oracle backup and self-monitoring?

  a. 443 and 22
  b. 1433 and 1521
  c. 50051 and 59999
  d. 8080 and 9000

3. Approximately how long does Cohesity Instant Mass Restore take from clicking "Recover" to recoveries being in progress?

  a. ~30 seconds
  b. ~5 minutes
  c. ~30 minutes
  d. ~2 hours

4. Which Microsoft 365 workload requires an admin to manually re-create the underlying container before a fully deleted entity can be restored?

  a. Exchange Online mailbox
  b. OneDrive for Business
  c. SharePoint Online site
  d. Microsoft Teams (Group)

5. What restore throughput does Microsoft 365 Backup Storage (MBS) integration deliver for OneDrive content?

  a. Up to 100 GB/hour
  b. Up to 500 GB/hour
  c. Up to 3 TB/hour
  d. Up to 30 TB/hour

8.1 Database Workloads: Oracle, SQL Server, and SAP HANA

Database protection on Cohesity is fundamentally about meeting the database engine on its own terms. Oracle wants to drive its own backup via RMAN. SQL Server expects an application to call its VDI. SAP HANA mandates a Backint-compliant target. Cohesity speaks each engine's native API so backups and restores remain application-consistent and supportable by the database vendor.

Oracle RMAN Integration

Cohesity registers as an SBT (System Backup to Tape) target — the same interface RMAN uses for tape libraries — while still providing disk-class performance and global deduplication. The Cohesity Remote Adapter consolidates RMAN scripts, schedules, and alerts under a single console. Channel count is the primary throughput lever: more channels increase parallelism but also drive up CPU and network load on the Oracle host.

| Path | How RMAN Sees It | Where Dedupe Happens | When to Use |
|---|---|---|---|
| Target-side | NFS view mounted on Oracle host; RMAN writes pieces to mount | Inline on Cohesity cluster as data lands | Default for low-latency LANs; CPU-constrained Oracle hosts |
| Source-side | Cohesity SBT plugin installed on Oracle host | On Oracle host before bytes traverse network | WAN-attached DB servers; bandwidth-constrained sites |
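To make the channel-count lever concrete, here is a minimal sketch that generates an RMAN RUN block with a chosen number of SBT channels. The RMAN statements (ALLOCATE CHANNEL, BACKUP DATABASE PLUS ARCHIVELOG, RELEASE CHANNEL) are standard RMAN syntax. The helper name `build_rman_backup_script` and the library path are hypothetical placeholders, not values from Cohesity documentation.

```python
def build_rman_backup_script(channels: int, sbt_library: str) -> str:
    """Return an RMAN RUN block that allocates `channels` SBT channels."""
    if channels < 1:
        raise ValueError("RMAN needs at least one channel")
    lines = ["RUN {"]
    for i in range(1, channels + 1):
        # Each channel is one parallel backup stream through the SBT library.
        lines.append(
            f"  ALLOCATE CHANNEL c{i} DEVICE TYPE sbt "
            f"PARMS 'SBT_LIBRARY={sbt_library}';"
        )
    lines.append("  BACKUP DATABASE PLUS ARCHIVELOG;")
    for i in range(1, channels + 1):
        lines.append(f"  RELEASE CHANNEL c{i};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical library path; substitute the path from your own install.
    print(build_rman_backup_script(4, "/path/to/libsbt.so"))
```

Raising the first argument adds streams, and with them CPU and network load on the Oracle host, so the value is usually tuned per host rather than set once globally.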

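Network requirements: TCP 50051 for backup operations and TCP 59999 for agent self-monitoring. Changed block tracking (CBT; in Oracle terms, block change tracking) is strongly recommended for large databases so that incremental backups read only the blocks that changed. Archive log backups are scheduled independently of datafile backups within the same protection policy.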
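A quick reachability probe for the two agent ports can save time before the first backup run. The sketch below uses only the Python standard library. It assumes the probe runs from a machine on the cluster's network toward the Oracle host, and the hostname in the usage note is a hypothetical placeholder.

```python
import socket


def check_tcp_ports(host: str, ports: list[int], timeout: float = 2.0) -> dict[int, bool]:
    """Return {port: True/False} depending on whether a TCP connect succeeds."""
    results = {}
    for port in ports:
        try:
            # create_connection performs a full TCP handshake, then we close.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Refused, timed out, or unresolvable host all count as unreachable.
            results[port] = False
    return results
```

For example, `check_tcp_ports("oracle-host.example.com", [50051, 59999])` returns a mapping showing which of the two ports accepted a connection. A `False` entry points at a firewall rule or a stopped agent, not at RMAN itself.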

SQL Server: VDI and Always On Availability Groups

For Microsoft SQL Server, Cohesity uses the Virtual Device Interface (VDI). The Cohesity Windows Agent registers as a VDI client; SQL Server streams its own backup to the agent. AAG-aware protection can target the preferred backup replica, primary, or secondary with lowest backup priority — backing up from a secondary offloads I/O from the primary while still producing a usable backup chain. Cohesity reconstructs the log chain across replica failovers.

SAP HANA Backint

Backint is the SAP-certified shared library that HANA dynamically loads. Cohesity provides a Backint agent; HANA streams its native backup format directly to the cluster. This is the only SAP-supported third-party backup path for production HANA — VM-level snapshots are unsupported by SAP for production restores. Backint covers data, log, and catalog backups, plus per-tenant protection in MDC deployments.

Log Backups and PITR

For Oracle, SQL Server, and HANA, Cohesity protection policies expose log backup frequency as a dial independent of the full/incremental cadence. A typical tier-1 policy combines a weekly full, daily incrementals, 15-minute log backups, and a monthly archive copy. This pattern delivers a 15-minute RPO with fine-grained PITR across a 7-day window.
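The arithmetic behind that policy can be sketched in a few lines. The `ProtectionPolicy` class below is a hypothetical illustration, not a Cohesity API object. It shows that the worst-case RPO is set by the shortest backup interval, which is the log interval.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ProtectionPolicy:
    full_every: timedelta
    incremental_every: timedelta
    log_every: timedelta
    pitr_window: timedelta

    def worst_case_rpo(self) -> timedelta:
        # A failure can lose at most the data written since the last backup
        # of any kind, so the shortest interval bounds the loss.
        return min(self.full_every, self.incremental_every, self.log_every)

    def log_backups_per_day(self) -> int:
        return int(timedelta(days=1) / self.log_every)

    def log_backups_in_window(self) -> int:
        return int(self.pitr_window / self.log_every)


# Values taken from the tier-1 pattern described above.
TIER1 = ProtectionPolicy(
    full_every=timedelta(weeks=1),
    incremental_every=timedelta(days=1),
    log_every=timedelta(minutes=15),
    pitr_window=timedelta(days=7),
)

if __name__ == "__main__":
    print("Worst-case RPO:", TIER1.worst_case_rpo())
    print("Log backups per day:", TIER1.log_backups_per_day())
    print("Log backups in PITR window:", TIER1.log_backups_in_window())
```

Without the log dial, the same policy would fall back to the daily incremental interval, which is why log frequency is the setting that turns a daily backup into minute-grained recovery.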

Animation 1 — Oracle RMAN: Channel Orchestration through SBT to Cohesity

DBA triggers RMAN, which allocates parallel channels through the SBT library to the Cohesity Linux Agent (ports 50051/59999), landing in the cluster's protection view.


Mermaid: RMAN Sequence

sequenceDiagram
    participant DBA as DBA / Scheduler
    participant RMAN as Oracle RMAN
    participant SBT as SBT Library
    participant Agent as Cohesity Linux Agent
    participant Cluster as Cohesity Cluster
    DBA->>RMAN: BACKUP DATABASE PLUS ARCHIVELOG
    RMAN->>RMAN: Allocate N channels
    RMAN->>SBT: sbtopen / sbtwrite
    alt Source-side dedupe
        SBT->>SBT: Variable-length fingerprint
        SBT->>Agent: Send unique blocks only
    else Target-side dedupe
        SBT->>Agent: Stream all blocks via NFS
        Agent->>Agent: Inline dedupe at landing
    end
    Agent->>Cluster: Write to protection view
    Cluster-->>Agent: Ack + catalog metadata
    Agent-->>SBT: sbtwrite OK
    SBT-->>RMAN: Piece complete
    RMAN-->>DBA: Backup successful

Key Points — 8.1 Databases

8.2 Microsoft 365 and SaaS Workloads

Microsoft 365 occupies a unique architectural position: data lives entirely in Microsoft's cloud, accessed only through Graph API and EWS, with platform-imposed throttling and a shared-responsibility model that explicitly puts responsibility for backup on the customer.

| Workload | Granularity | Notable Behavior |
|---|---|---|
| Exchange Online | Mailbox / folder / message / attachment | Independent retention; global keyword search |
| OneDrive for Business | File / folder with full ACL fidelity | MBS delivers up to 3 TB/hour bypassing Graph throttling |
| SharePoint Online | Site / library / list item / page | Restore to original or alternate location |
| Microsoft Teams | Channel messages / files / tabs | Caveat: deleted Teams require admin to re-create the Group container first |

MFA, Graph API Limits, and Authentication

Cohesity registers the tenant via an Entra ID service principal using application-permission tokens (no interactive MFA at runtime). Throttling is mitigated by parallelizing across users, exponential 429 backoff, MBS for OneDrive/SharePoint/Teams files, and Cohesity-side indexing so search and selective restore make zero extra Graph calls.
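The 429 backoff technique can be sketched independently of any HTTP library. In the example below, `request` is a hypothetical callable returning a `(status, retry_after_seconds, body)` tuple. The function name, defaults, and jitter factor are illustrative assumptions, not Cohesity's actual retry settings. Graph API throttling responses do carry a Retry-After header, which is why the sketch prefers that hint when present.

```python
import random
import time
from typing import Callable, Optional, Tuple

# (HTTP status, Retry-After in seconds or None, response body)
Response = Tuple[int, Optional[float], object]


def call_with_backoff(
    request: Callable[[], Response],
    max_attempts: int = 6,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> object:
    """Retry `request` while it is throttled (429), backing off exponentially."""
    for attempt in range(max_attempts):
        status, retry_after, body = request()
        if status != 429:
            # Any non-throttled response is returned as-is; real code would
            # also inspect other error statuses here.
            return body
        if retry_after is not None:
            delay = retry_after  # honour the server's hint when given
        else:
            delay = min(max_delay, base_delay * 2 ** attempt)
        # Jitter keeps parallel workers from retrying in lockstep.
        sleep(delay + random.uniform(0, 0.1 * delay))
    raise RuntimeError("still throttled after max_attempts")
```

Because throttling limits apply per mailbox or per user, running one such loop per user in parallel scales better than pushing more requests per second at a single user.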

Auto-Protection and Granular Restore

With policy-driven auto-protection, when a new user is provisioned, their mailbox and OneDrive are auto-discovered and added to the protection group. This eliminates the operational drift that affects static include-list designs.
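The drift problem reduces to a set comparison between what the tenant currently contains and what the protection group covers. The following is a minimal sketch with a hypothetical function name and made-up user addresses. It illustrates the idea and does not describe how Cohesity implements discovery.

```python
def reconcile_protection(
    discovered: set[str], protected: set[str]
) -> tuple[set[str], set[str]]:
    """Compare tenant inventory with the protection group.

    Returns (to_add, no_longer_in_tenant).
    """
    to_add = discovered - protected               # new users not yet covered
    no_longer_in_tenant = protected - discovered  # covered, but gone from tenant
    return to_add, no_longer_in_tenant


if __name__ == "__main__":
    tenant_users = {"ana@example.com", "raj@example.com", "new.hire@example.com"}
    protection_group = {"ana@example.com", "raj@example.com", "left@example.com"}
    add, gone = reconcile_protection(tenant_users, protection_group)
    print("Add to protection:", sorted(add))
    print("No longer in tenant:", sorted(gone))
```

A static include list only changes when someone edits it, so the first set tends to grow silently. Auto-protection effectively runs this comparison on every discovery cycle.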

Animation 2 — Microsoft 365 Parallel Protection Fanout

Cohesity acquires an Entra ID token, then fans out four parallel branches: Mailbox + OneDrive + SharePoint + Teams. OneDrive/SharePoint/Teams files use MBS (3 TB/hr); messages and Teams metadata flow through Graph API.


Mermaid: M365 Sequence

sequenceDiagram
    participant Cohesity as Cohesity DataProtect
    participant Entra as Entra ID
    participant Graph as Graph API
    participant MBS as M365 Backup Storage
    participant Tenant as M365 Workloads
    Cohesity->>Entra: Acquire app-permission token
    Entra-->>Cohesity: OAuth2 access token
    Cohesity->>Graph: Enumerate users / sites / teams
    Graph-->>Cohesity: Object inventory
    par Mailbox
        Cohesity->>Graph: Read messages (per-user parallel)
        Graph->>Tenant: Exchange Online
        Tenant-->>Cohesity: Indexed mailbox data
    and OneDrive / SharePoint
        Cohesity->>MBS: Snapshot file content
        MBS->>Tenant: OneDrive / SharePoint
        Tenant-->>Cohesity: Files + ACLs (3 TB/hr)
    and Teams
        Cohesity->>Graph: Channel messages + tabs
        Graph->>Tenant: Teams
        Tenant-->>Cohesity: Teams payload
    end
    Note over Cohesity,Graph: 429 backoff with exponential delay

Key Points — 8.2 Microsoft 365

8.3 Instant Recovery Mechanics

Backups become recovery products through two primitives: Instant Mass Restore (IMR) for VMs and Clone for VMs and databases. Both rely on Cohesity's SnapTree metadata structure providing O(1) snapshot access — every snapshot is a fully hydrated, instantly mountable view, not a delta chain.
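The contrast between a delta chain and a fully hydrated snapshot can be shown with a toy model. This sketch is not Cohesity's SnapTree implementation. The class names, the two-level layout, and the leaf size are assumptions chosen only to show why read cost grows with chain depth in one design and stays constant in the other.

```python
class DeltaChainSnapshot:
    """Stores only changed blocks and points at its parent snapshot."""

    def __init__(self, changes, parent=None):
        self.changes = dict(changes)
        self.parent = parent

    def read(self, block_id):
        hops, snap = 0, self
        while snap is not None:  # walk back until some ancestor has the block
            hops += 1
            if block_id in snap.changes:
                return snap.changes[block_id], hops
            snap = snap.parent
        raise KeyError(block_id)


LEAF_SIZE = 4  # blocks per leaf in this toy layout


class TreeSnapshot:
    """Each snapshot has its own root; unchanged leaves are shared."""

    def __init__(self, leaves):
        self.leaves = leaves  # root: leaf index -> dict of blocks

    @classmethod
    def from_blocks(cls, blocks):
        leaves = {}
        for block_id, data in blocks.items():
            leaves.setdefault(block_id // LEAF_SIZE, {})[block_id] = data
        return cls(leaves)

    def snapshot_with(self, changes):
        new_leaves = dict(self.leaves)  # copy root pointers, share the leaves
        for block_id, data in changes.items():
            idx = block_id // LEAF_SIZE
            leaf = dict(new_leaves.get(idx, {}))  # copy only the touched leaf
            leaf[block_id] = data
            new_leaves[idx] = leaf
        return TreeSnapshot(new_leaves)

    def read(self, block_id):
        # Always two hops: root lookup, then leaf lookup.
        return self.leaves[block_id // LEAF_SIZE][block_id], 2


base_blocks = {i: f"v0-{i}" for i in range(8)}
chain = DeltaChainSnapshot(base_blocks)
tree = TreeSnapshot.from_blocks(base_blocks)
for gen in range(1, 51):  # 50 later snapshots, each changing block 1
    change = {1: f"v{gen}-1"}
    chain = DeltaChainSnapshot(change, parent=chain)
    tree = tree.snapshot_with(change)

if __name__ == "__main__":
    print("delta chain read of block 7:", chain.read(7))
    print("tree snapshot read of block 7:", tree.read(7))
```

In the chain, reading an unchanged block from the newest snapshot walks every ancestor back to the base. In the tree model the hop count is fixed regardless of how many snapshots exist, which is the property that makes any point in time equally fast to mount.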

IMR Workflow (5 automated stages)

  1. Present an NFS datastore from the Cohesity cluster to ESXi hosts.
  2. Create new VMs from backup metadata, register with vCenter.
  3. Power on VMs — workloads serve users from Cohesity NFS in ~30 seconds.
  4. Storage vMotion running VMs back to primary storage at chosen pace.
  5. Clean up the temporary NFS datastore once migration completes.

Animation 3 — Instant Mass Restore: 200 VMs Online in ~30 Seconds

Cohesity exports an NFS view, ESXi hosts mount it, 200 VMs power on rapidly, and Storage vMotion drains them back to primary storage in the background.


Mermaid: IMR Flow

flowchart LR
    A[Cohesity Cluster<br/>SnapTree snapshot] -->|NFS export| B[ESXi Hosts<br/>mount datastore]
    B -->|register from<br/>backup metadata| C[vCenter<br/>VM inventory]
    C -->|power on VMs<br/>~30 seconds| D[Live Workloads]
    D -->|Storage vMotion<br/>staggered| E[Primary Storage]
    E -->|migration complete| F[Auto-cleanup]

IMR vs. VMware vSphere Native Instant Recovery

| Dimension | Cohesity IMR | VMware vSphere Instant Recovery |
|---|---|---|
| Scale | Unlimited concurrent VMs (demonstrated to 200) | One or a handful |
| Storage backing | Distributed scale-out cluster | Single replica appliance |
| Migration | Automated Storage vMotion orchestrated by Cohesity | Manual operator-driven |
| Cleanup | Automatic NFS export removal | Manual |
| Point-in-time | Any snapshot (O(1) via SnapTree) | Latest replica only |
| Performance | 3x transactions/min vs. Veeam-from-target | Performance cliff under load |

Clone vs. IMR (the git analogy)

A Cohesity clone is to a backup what a git branch is to a commit: a cheap, isolated, fully writable workspace forked off a known-good point-in-time. IMR is one step further — like a git checkout of that branch into a running production environment, with the migration step being the eventual git merge back into primary storage.

Key Points — 8.3 Instant Recovery

8.4 Granular File and Item Recovery

Mermaid: Granular Recovery Decision Tree

flowchart TD
    Start[Recovery Request] --> Q1{Scope of loss?}
    Q1 -->|Whole VM<br/>or many VMs| VM[Instant Mass Restore]
    Q1 -->|Single file or version| File[Indexed Search<br/>Yoda service]
    Q1 -->|Mailbox / message| Item{Exchange type?}
    Q1 -->|DB object| DB{Engine?}
    Item -->|On-prem Exchange| ItemOn[VSS + Exchange API]
    Item -->|Exchange Online| ItemCloud[Graph API + EWS]
    DB -->|SQL Server| SQL[Mount as DB<br/>bcp / INSERT-SELECT]
    DB -->|Oracle| Oracle[RMAN RECOVER TABLE]
    DB -->|SAP HANA| HANA[Clone tenant<br/>SQL export]
    VM --> Done[Recovery complete]
    File --> Done
    ItemOn --> Done
    ItemCloud --> Done
    SQL --> Done
    Oracle --> Done
    HANA --> Done

When indexing is enabled, Cohesity walks file-system metadata at backup time and pushes filenames, paths, sizes, timestamps, and (optionally) full-text content into the Yoda search service. Selected files restore to the original VM, an alternate VM, or to the admin's workstation — no full VM restore required.
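The metadata walk described above can be approximated with the standard library. This is a toy sketch only: `FileRecord`, `build_index`, and `search` are hypothetical names, the index is an in-memory list, and nothing here reflects how the Yoda service stores or queries its data.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    name: str
    path: str
    size: int
    mtime: float


def build_index(root: str) -> list[FileRecord]:
    """Walk a directory tree and record filename, path, size, and mtime."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            records.append(FileRecord(name, full, st.st_size, st.st_mtime))
    return records


def search(index: list[FileRecord], term: str) -> list[FileRecord]:
    """Case-insensitive substring match on the filename."""
    needle = term.lower()
    return [r for r in index if needle in r.name.lower()]
```

Because the index is built at backup time, a later search touches only the index and never the source system, which is what allows a single file to be located without mounting or restoring the whole VM.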

Item-level for Exchange: mailbox / folder / message / attachment, with destinations of original mailbox, alternate mailbox, or PST export. Object-level for SQL/Oracle: mount the backup as a database, navigate via Cohesity's object browser, extract via bcp or INSERT...SELECT for SQL, RMAN RECOVER TABLE or clone+expdp for Oracle.
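The INSERT...SELECT extraction pattern is easy to try end to end. In the sketch below, Python's built-in sqlite3 module stands in for SQL Server, and an attached in-memory database plays the role of the mounted backup copy. Table and column names are hypothetical.

```python
import sqlite3

# "Production" database; the attached schema stands in for the mounted backup.
prod = sqlite3.connect(":memory:")
prod.execute("ATTACH DATABASE ':memory:' AS recovered")

prod.execute("CREATE TABLE main.orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
prod.execute("CREATE TABLE recovered.orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")

# Production kept one row; the backup copy still holds all three.
prod.execute("INSERT INTO main.orders VALUES (1, 'acme', 120.0)")
prod.executemany(
    "INSERT INTO recovered.orders VALUES (?, ?, ?)",
    [(1, "acme", 120.0), (2, "globex", 75.5), (3, "initech", 310.0)],
)

# Copy back only the rows that are missing from production.
prod.execute(
    """
    INSERT INTO main.orders (id, customer, total)
    SELECT r.id, r.customer, r.total
    FROM recovered.orders AS r
    WHERE NOT EXISTS (SELECT 1 FROM main.orders AS o WHERE o.id = r.id)
    """
)
prod.commit()

restored = prod.execute("SELECT id FROM main.orders ORDER BY id").fetchall()
print(restored)
```

The NOT EXISTS filter is the design choice that matters: it makes the copy safe to re-run and leaves rows that survived in production untouched.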

Self-service via Helios: end users browse their own VM's backup history, restore single files, and view RBAC-scoped audit logs.

Key Points — 8.4 Granular Recovery

8.5 Worked Example: Three Recovery Scenarios

Scenario A — Single Exchange Mailbox Recovery

Finance manager deleted an audit folder 3 weeks ago, past native retention. Search Helios → filter by user/source/type → pick snapshot from day before deletion → restore to "Recovered_Audit" subfolder. ~5 min, 80 MB, zero impact on other users.

Scenario B — Ransomware Mass VM Recovery

47 VMs encrypted overnight. DataHawk anomaly alert at 03:47, SOC declares incident at 04:15. IMR triggered on all 47 from the 22:00 snapshot. VMs powering on ~30 sec later from Cohesity NFS on quarantine VLAN. Storage vMotion at 4 concurrent/host, completing over 8 hrs. RTO < 1 hr; RPO 6 hrs.
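The quoted RPO can be checked against the timeline with simple date arithmetic. The calendar dates below are hypothetical; only the clock times come from the scenario.

```python
from datetime import datetime, timedelta

# Hypothetical dates; the 22:00, 03:47, and 04:15 times are from the scenario.
snapshot_taken = datetime(2024, 1, 9, 22, 0)
anomaly_alert = datetime(2024, 1, 10, 3, 47)
incident_declared = datetime(2024, 1, 10, 4, 15)

# Everything written after the snapshot is lost when restoring from it.
loss_at_alert = anomaly_alert - snapshot_taken
loss_at_declaration = incident_declared - snapshot_taken
quoted_rpo = timedelta(hours=6)

print("Loss window at alert:", loss_at_alert)
print("Loss window at declaration:", loss_at_declaration)
print(
    "Quoted RPO falls between them:",
    loss_at_alert <= quoted_rpo <= loss_at_declaration,
)
```

The two windows bracket the 6-hour figure, so the stated RPO reads as a rounded value for the gap between the last clean snapshot and the point at which the incident was recognized.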

Scenario C — Oracle PITR after Bad Transaction

Developer ran DELETE FROM orders at 14:32 against an 8 TB DB. Cohesity invokes RMAN with SBT, restores most-recent incremental, applies archive logs to 14:31:59. 8 channels, 10 GbE network. Restore in 1 hr 12 min; RPO 1 min.

8.6 Native App Integration Comparison

| Application | Native Interface | Agent | Key Ports | PITR |
|---|---|---|---|---|
| Oracle | RMAN via SBT | Linux Agent (+ optional source-side dedupe plugin) | 50051, 59999 | Yes (archive logs) |
| SQL Server | VDI | Windows Agent | 50051, 59999 | Yes (log backups across AAG replicas) |
| SAP HANA | Backint | Backint agent | 50051, 59999 | Yes (log backups + catalog) |
| Exchange (on-prem) | VSS + Exchange APIs | Windows Agent | SMB / RPC | Per-database log replay |
| Exchange Online | Graph + EWS | Service principal | HTTPS | Snapshot granularity |
| OneDrive / SharePoint | Graph + MBS | Service principal | HTTPS | Snapshot granularity |
| Teams | Graph + SharePoint | Service principal | HTTPS | Snapshot (Group must pre-exist) |
| VMware VMs | VADP, CBT | Agentless (or VMware Tools) | VMware ports | Per-snapshot |

Post-Quiz — Validate Your Understanding

1. Which interface does Cohesity register itself as to integrate with Oracle's Recovery Manager?

  a. VDI (Virtual Device Interface)
  b. SBT (System Backup to Tape)
  c. Backint shared library
  d. VADP (vStorage APIs for Data Protection)

2. Which two TCP ports does the Cohesity Linux Agent require for Oracle backup and self-monitoring?

  a. 443 and 22
  b. 1433 and 1521
  c. 50051 and 59999
  d. 8080 and 9000

3. Approximately how long does Cohesity Instant Mass Restore take from clicking "Recover" to recoveries being in progress?

  a. ~30 seconds
  b. ~5 minutes
  c. ~30 minutes
  d. ~2 hours

4. Which Microsoft 365 workload requires an admin to manually re-create the underlying container before a fully deleted entity can be restored?

  a. Exchange Online mailbox
  b. OneDrive for Business
  c. SharePoint Online site
  d. Microsoft Teams (Group)

5. What restore throughput does Microsoft 365 Backup Storage (MBS) integration deliver for OneDrive content?

  a. Up to 100 GB/hour
  b. Up to 500 GB/hour
  c. Up to 3 TB/hour
  d. Up to 30 TB/hour

6. Which Cohesity primitive is best described as analogous to a git branch against a backup snapshot?

  a. Instant Mass Restore
  b. Clone
  c. Replication
  d. Archive copy

7. Which SAP HANA backup interface is the only third-party path SAP supports for production restores?

  a. VSS
  b. VDI
  c. Backint
  d. SBT

8. Which RMAN dedupe path is best for a WAN-attached Oracle host where network bandwidth is the bottleneck and the host has spare CPU?

  a. Target-side dedupe via NFS mount
  b. Source-side dedupe via Cohesity SBT plugin
  c. Client-side compression only
  d. Disable dedupe to avoid CPU load

9. Which feature distinguishes Cohesity IMR from VMware's native vSphere Instant Recovery in terms of scale?

  a. Both support unlimited concurrent VMs equally
  b. VMware native is designed for hundreds of VMs; Cohesity is limited to one
  c. Cohesity supports unlimited concurrent VMs (demonstrated to 200); VMware native is designed for one or a handful
  d. Both top out at 50 VMs

10. What is the recommended Cohesity strategy for handling Microsoft Graph API throttling on M365 mailbox backups?

  a. Increase requests per second per user as high as possible
  b. Parallelize across users (not requests per single user) and apply exponential 429 backoff
  c. Disable Graph and use SMB to mailbox stores
  d. Run all users serially in a single thread

11. In a tier-1 database protection policy, which dial converts a daily backup into minute-grained PITR?

  a. Compression ratio
  b. Independent log-backup frequency
  c. Number of RMAN catalogs
  d. Disabling CBT

12. Which step in the Instant Mass Restore workflow is NOT executed by Cohesity itself?

  a. Presenting the NFS datastore
  b. Registering VMs with vCenter from backup metadata
  c. Storage vMotion of running VMs back to primary storage
  d. Auto-cleanup of the temporary NFS export

13. Which AAG-aware design minimises I/O impact on the primary SQL Server replica?

  a. Always back up from the primary replica
  b. Back up from the secondary replica with the lowest backup priority and reconstruct the log chain across failovers
  c. Disable AAG awareness and treat each replica as standalone
  d. Take crash-consistent VM snapshots only

14. Cohesity's SnapTree provides O(1) snapshot access. What architectural advantage does this give over delta-chain backup designs?

  a. Every snapshot is a fully hydrated, instantly mountable view regardless of snapshot depth
  b. Snapshots take longer the older they get
  c. Restores require walking the full chain
  d. SnapTree limits the cluster to 10 snapshots per source

15. Which authentication construct does Cohesity use against M365 for unattended nightly backup jobs?

  a. Interactive MFA prompts each night
  b. An Entra ID service principal with application-permission tokens
  c. Stored end-user passwords
  d. Anonymous Graph access


Answer Explanations