Chapter 1: CCAE Exam Overview and Cohesity Platform Architecture

Learning Objectives

Section 1: CCAE Exam Blueprint and Study Strategy

Architects who pursue the Cohesity Certified Architect Expert (CCAE) credential are signaling that they can do more than operate a backup platform — they can size, design, and defend a Cohesity Data Cloud deployment in front of customers, security teams, and CIOs. This section unpacks what the exam tests, who it is for, and how to study for it efficiently.

Pre-Quiz: Section 1

1. Which CCAE exam domain carries the largest weight, and what does that emphasize about the exam's character?

Domain 1 (Platform Architecture, 22%) — the exam is about memorizing services and APIs.
Domain 2 (Solution Discovery and Design, 35%) — the exam is design-oriented, not CLI-trivia.
Domain 3 (Security-Focused Solutions, 18%) — the exam is primarily a security certification.
Domain 4 (Third-Party Integration, 13%) — the exam focuses on API and connector mastery.

2. An architect with deep DataProtect operations background but limited security knowledge is planning a 30-day study plan. What is the most defensible re-allocation of study time?

Spend the bulk of time on Domain 1 since it most aligns with their operational background.
Spread time evenly across the four domains regardless of background.
Invert the time allocation slightly and over-invest in Domain 3 (security) where scenarios are unfamiliar.
Skip Domain 4 entirely because it carries the smallest weight.

3. Which combination most accurately describes the CCAE exam delivery facts?

120 minutes, $300 USD, 70% pass, 1-year validity, in-person only.
90 minutes, $200 USD, 60% pass, 2-year validity, online proctored.
60 minutes, $150 USD, 65% pass, 3-year validity, in-person only.
90 minutes, $200 USD, 80% pass, lifetime validity, online proctored.

4. How does CCAE most fundamentally differ from CCPE in terms of audience and mindset?

CCPE is for sales engineers; CCAE is for cluster administrators.
CCPE focuses on "how to fly the plane daily"; CCAE focuses on "why design the airframe and mission profile this way".
CCPE is the expert-tier capstone; CCAE is the foundational/associate credential.
CCPE and CCAE cover identical material but CCAE is more expensive.

5. Which lab setup gives the highest study signal for CCAE preparation, given the exam's design emphasis?

A read-only Helios sandbox tenant with no compute resources connected.
A Helios sandbox tenant connected to a Virtual Edition cluster (or Cloud Edition trial) for hands-on workflow practice.
PowerPoint slides reviewed without any clusters available.
A bare-metal physical ReadyNode in a colo facility used only for capacity testing.

Figure: Cohesity Certification Track Progression (CCA → CCSE/CCPE → CCAE)

CCA (Cohesity Certified Associate): Operators · Daily ops
CCSE (Sales Engineer): Positioning · Demo · Light design
CCPE (Professional Engineer): Implementation · Day-2 ops
CCAE (Architect Expert): End-to-end design

Domain Weightings

| Domain | Weight | Focus |
| --- | --- | --- |
| 1. Cohesity Data Cloud Data Management Platform Architecture | 22% | Products, technology use cases, designing a DataProtect platform |
| 2. Cohesity Architecture Solution Discovery and Design | 35% | Sizing, workload protection, hybrid/multi-cloud, Helios Self-Managed, business alignment |
| 3. Design Security-Focused Solutions | 18% | Cyber-resiliency, immutability, encryption, ransomware patterns |
| 4. Integrate Third-party Solutions with Cohesity | 13% | Integration patterns and APIs |
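
The weightings above can seed a first-pass study plan. The following is a minimal sketch, not an official planning tool: the function name `allocate_hours`, the `boost` multiplier, and the 60-hour budget are all illustrative assumptions. The listed weights sum to 88 rather than 100, so the sketch normalizes them before splitting the hours.

```python
# Domain weights (percent) copied from the table above.
DOMAIN_WEIGHTS = {
    "1. Platform Architecture": 22,
    "2. Solution Discovery and Design": 35,
    "3. Security-Focused Solutions": 18,
    "4. Third-Party Integration": 13,
}


def allocate_hours(total_hours, weights, boost=None):
    """Split total_hours in proportion to the (optionally boosted) weights.

    boost is a hypothetical {domain: multiplier} knob for over-investing
    in a domain the candidate finds unfamiliar.
    """
    boost = boost or {}
    adjusted = {d: w * boost.get(d, 1.0) for d, w in weights.items()}
    total = sum(adjusted.values())  # normalize; the raw weights do not sum to 100
    return {d: round(total_hours * w / total, 1) for d, w in adjusted.items()}


if __name__ == "__main__":
    # Assumed budget: 60 hours over 30 days, with extra emphasis on security.
    plan = allocate_hours(
        60, DOMAIN_WEIGHTS, boost={"3. Security-Focused Solutions": 1.5}
    )
    for domain, hours in plan.items():
        print(f"{domain}: {hours} h")
```

Leaving `boost` empty reproduces a purely weight-proportional plan; a multiplier above 1.0 shifts hours toward a weak domain without dropping any domain entirely.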

Exam Delivery and Recertification

| Attribute | Value |
| --- | --- |
| Duration | 90 minutes |
| Cost | $200 USD |
| Passing score | 60% |
| Validity | 2 years |
| Retake policy | 14-day waiting period |
| Delivery | Online proctored |

Certification Track Comparison

| Cert | Audience | Focus | Depth |
| --- | --- | --- | --- |
| CCA | Operators, junior admins | Cluster basics, daily operations | Foundational |
| CCSE | Pre-sales SEs | Positioning, demo, light design | Associate |
| CCPE | Senior admins, consultants | Implementation, deployment, Day-2 | Professional |
| CCAE | Solutions architects | End-to-end design, sizing, security, multi-cloud | Expert |

flowchart TD
    CCA["CCA - Cohesity Certified Associate<br/>Operators and junior admins"]
    CCSE["CCSE - Sales Engineer<br/>Positioning, demo, light design"]
    CCPE["CCPE - Professional Engineer<br/>Implementation, Day-2 ops"]
    CCAE["CCAE - Architect Expert<br/>End-to-end design, sizing, security"]
    CCA --> CCSE
    CCA --> CCPE
    CCSE --> CCAE
    CCPE --> CCAE
    style CCA fill:#1f6feb,stroke:#58a6ff,color:#ffffff
    style CCSE fill:#1f6feb,stroke:#58a6ff,color:#ffffff
    style CCPE fill:#1f6feb,stroke:#58a6ff,color:#ffffff
    style CCAE fill:#238636,stroke:#58a6ff,color:#ffffff

Key Points — Section 1

Post-Quiz: Section 1

1. Which CCAE exam domain carries the largest weight, and what does that emphasize about the exam's character?

Domain 1 (Platform Architecture, 22%) — the exam is about memorizing services and APIs.
Domain 2 (Solution Discovery and Design, 35%) — the exam is design-oriented, not CLI-trivia.
Domain 3 (Security-Focused Solutions, 18%) — the exam is primarily a security certification.
Domain 4 (Third-Party Integration, 13%) — the exam focuses on API and connector mastery.

2. An architect with deep DataProtect operations background but limited security knowledge is planning a 30-day study plan. What is the most defensible re-allocation of study time?

Spend the bulk of time on Domain 1 since it most aligns with their operational background.
Spread time evenly across the four domains regardless of background.
Invert the time allocation slightly and over-invest in Domain 3 (security) where scenarios are unfamiliar.
Skip Domain 4 entirely because it carries the smallest weight.

3. Which combination most accurately describes the CCAE exam delivery facts?

120 minutes, $300 USD, 70% pass, 1-year validity, in-person only.
90 minutes, $200 USD, 60% pass, 2-year validity, online proctored.
60 minutes, $150 USD, 65% pass, 3-year validity, in-person only.
90 minutes, $200 USD, 80% pass, lifetime validity, online proctored.

4. How does CCAE most fundamentally differ from CCPE in terms of audience and mindset?

CCPE is for sales engineers; CCAE is for cluster administrators.
CCPE focuses on "how to fly the plane daily"; CCAE focuses on "why design the airframe and mission profile this way".
CCPE is the expert-tier capstone; CCAE is the foundational/associate credential.
CCPE and CCAE cover identical material but CCAE is more expensive.

5. Which lab setup gives the highest study signal for CCAE preparation, given the exam's design emphasis?

A read-only Helios sandbox tenant with no compute resources connected.
A Helios sandbox tenant connected to a Virtual Edition cluster (or Cloud Edition trial) for hands-on workflow practice.
PowerPoint slides reviewed without any clusters available.
A bare-metal physical ReadyNode in a colo facility used only for capacity testing.

Section 2: Cohesity DataPlatform Architecture Pillars

Every CCAE exam scenario eventually traces back to four architectural pillars: a single distributed file system, a hyperconverged scale-out node model, MapReduce-style background services, and strict consistency. If you internalize these pillars, you can derive most design answers from first principles.

Pre-Quiz: Section 2

1. SpanFS supports unlimited snapshots and clones with effectively zero performance penalty. Which underlying technology delivers this property?

RAID-6 with periodic full clones.
SnapTree, which implements Distributed Redirect-on-Write (D-ROW) on a B+ tree metadata store.
VMware vSphere CBT replicated across nodes.
Tape-out staging with linear read-only locks.

2. Why do typical Cohesity clusters require a minimum of three or four nodes depending on form factor?

To physically fit the chassis backplane.
Because Paxos-style quorum requires a majority of nodes to remain healthy to accept writes; smaller clusters cannot tolerate a single-node loss.
Because each protocol (NFS, SMB, S3) requires its own dedicated node.
Because Apollo MapReduce jobs require exactly three workers.

3. A customer protects 500 Windows VMs averaging 80 GB each (FETB ~40 TB) with significant OS overlap. Which cluster behavior most accurately explains why effective stored capacity lands near 5-7 TB?

Bridge applies fixed-block local dedupe per node.
Apollo’s MapReduce post-process re-dedupe and garbage collection sustain global variable-length dedupe (4-6x) plus inline compression (1.5-2x).
Magneto compresses VMs at the source agent before transmit only.
Iris caches duplicate blocks at the management layer.

4. What does “strict consistency” in SpanFS mean for client behavior?

Clients must always connect to the master metadata node for the latest data.
Any node can serve any I/O for any object and clients always see the latest committed state.
Reads are eventually consistent; only writes are strict.
Clients see committed data only after a 30-second propagation delay.

5. Compared with a traditional two-tier backup architecture, which property is unique to a Cohesity hyperconverged scale-out cluster?

A dedicated metadata controller decouples capacity and compute scaling.
Add-capacity workflows are re-rack and migrate operations.
Throughput and metadata capacity scale linearly with node count, with no privileged master node.
A single-node failure typically halts the entire array.

SpanFS Layered Subsystems

  1. Access Layer — exposes NFS, SMB, S3 (plus OST and DirectIO for NetBackup) on the same volumes via virtual IPs, with no master node and no protocol-specific choke point.
  2. I/O Engine — chunks data, performs variable-length global dedupe (inline or post-process), compresses, encrypts, indexes, and tiers blocks across SSD, HDD, and cloud.
  3. Metadata Management — distributed key-value store on a patented B+ tree, replicated and sharded consistently. SnapTree delivers Distributed Redirect-on-Write (D-ROW) for unlimited snaps and clones.
  4. Storage and Distribution — fully distributed across hyperconverged x86 nodes, dynamically rebalanced, protected by erasure coding or replication.
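
The I/O Engine's dedupe and compression stages compound, which is why a front-end footprint can shrink so sharply. The sketch below works through the arithmetic using the figures quoted in this section's quiz (500 VMs at 80 GB, 4-6x dedupe, 1.5-2x compression). The function name `stored_capacity_tb` is illustrative, decimal units (1 TB = 1000 GB) are assumed, and the estimate deliberately ignores erasure-coding or replication overhead and retention growth.

```python
def stored_capacity_tb(front_end_tb, dedupe_ratio, compression_ratio):
    """Estimate stored capacity when dedupe and compression ratios multiply."""
    if dedupe_ratio <= 0 or compression_ratio <= 0:
        raise ValueError("ratios must be positive")
    return front_end_tb / (dedupe_ratio * compression_ratio)


if __name__ == "__main__":
    vm_count, avg_vm_gb = 500, 80
    fetb = vm_count * avg_vm_gb / 1000  # front-end TB, decimal units assumed

    # Conservative end of both ranges versus optimistic end of both ranges.
    conservative = stored_capacity_tb(fetb, dedupe_ratio=4, compression_ratio=1.5)
    optimistic = stored_capacity_tb(fetb, dedupe_ratio=6, compression_ratio=2)

    print(f"FETB: {fetb:.1f} TB")
    print(f"Stored capacity: {optimistic:.1f} to {conservative:.1f} TB")
```

With these inputs the combined reduction runs from 6x to 12x, so the conservative end of the estimate (about 6.7 TB) falls inside the 5-7 TB range the quiz scenario describes.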

Hyperconverged vs. Traditional

| Property | Traditional Two-Tier Backup | Cohesity Hyperconverged |
| --- | --- | --- |
| Compute / storage scaling | Independent, often imbalanced | Coupled, balanced per node |
| Metadata controller | Dedicated server, bottleneck risk | Distributed across all nodes |
| Add-capacity workflow | Re-rack, re-license, migrate | Add ReadyNode, auto-rebalance |
| Failure blast radius | Often whole-array | Single node, EC-bounded |
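
The single-node blast radius depends on a majority of nodes staying healthy, which is the quorum argument behind the minimum cluster sizes discussed in this section's quiz. The sketch below is generic majority-quorum arithmetic, not a description of Cohesity's internal consensus implementation; the function names are illustrative.

```python
def majority(n_nodes):
    """Smallest node count that forms a strict majority of n_nodes."""
    return n_nodes // 2 + 1


def tolerated_failures(n_nodes):
    """How many nodes can fail while a majority still remains."""
    return n_nodes - majority(n_nodes)


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(
            f"{n} nodes: quorum={majority(n)}, "
            f"tolerates {tolerated_failures(n)} failure(s)"
        )
```

A two-node cluster needs both nodes for a majority and so tolerates no failures, while three nodes is the smallest size that survives the loss of one.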

Key Points — Section 2

Post-Quiz: Section 2

1. SpanFS supports unlimited snapshots and clones with effectively zero performance penalty. Which underlying technology delivers this property?

RAID-6 with periodic full clones.
SnapTree, which implements Distributed Redirect-on-Write (D-ROW) on a B+ tree metadata store.
VMware vSphere CBT replicated across nodes.
Tape-out staging with linear read-only locks.

2. Why do typical Cohesity clusters require a minimum of three or four nodes depending on form factor?

To physically fit the chassis backplane.
Because Paxos-style quorum requires a majority of nodes to remain healthy to accept writes; smaller clusters cannot tolerate a single-node loss.
Because each protocol (NFS, SMB, S3) requires its own dedicated node.
Because Apollo MapReduce jobs require exactly three workers.

3. A customer protects 500 Windows VMs averaging 80 GB each (FETB ~40 TB) with significant OS overlap. Which cluster behavior most accurately explains why effective stored capacity lands near 5-7 TB?

Bridge applies fixed-block local dedupe per node.
Apollo’s MapReduce post-process re-dedupe and garbage collection sustain global variable-length dedupe (4-6x) plus inline compression (1.5-2x).
Magneto compresses VMs at the source agent before transmit only.
Iris caches duplicate blocks at the management layer.

4. What does “strict consistency” in SpanFS mean for client behavior?

Clients must always connect to the master metadata node for the latest data.
Any node can serve any I/O for any object and clients always see the latest committed state.
Reads are eventually consistent; only writes are strict.
Clients see committed data only after a 30-second propagation delay.

5. Compared with a traditional two-tier backup architecture, which property is unique to a Cohesity hyperconverged scale-out cluster?

A dedicated metadata controller decouples capacity and compute scaling.
Add-capacity workflows are re-rack and migrate operations.
Throughput and metadata capacity scale linearly with node count, with no privileged master node.
A single-node failure typically halts the entire array.

Section 3: Core Services and Software Stack

Beneath the CCAE exam’s scenario language is a handful of cooperating services. Memorizing what they do — and what they don’t do — is one of the highest-yield activities for Domain 1.

Pre-Quiz: Section 3

1. A scenario states: “The cluster reclaims unused capacity overnight, runs post-process dedupe across all data, and rebuilds file analytics indices.” Which service owns this behavior?

Bridge.
Apollo — cluster-wide MapReduce-style background services.
Magneto — data protection orchestration.
Iris — UI and RBAC.

2. Which service is responsible for chunking, dedupe, compression, encryption, erasure coding, tiering, and serving the NFS/SMB/S3 protocol stacks?

Magneto.
Yoda.
Bridge — the SpanFS data path.
ScribeStore.

3. A customer wants to find every PDF named contract-2024.pdf across 14 clusters in 8 regions and have results returned in seconds. Which service makes this possible?

Iris — per-cluster UI search.
ScribeStore — key-value metadata.
Yoda — the global indexing and search service surfaced through Helios.
Bridge — data path scan on each cluster.

4. Which statement about Helios is correct?

Helios stores all customer backup data centrally for SaaS access.
Helios is the SaaS multicloud control and insight plane; it does not host customer backup data, only control and observability.
Helios runs only inside a customer’s data center as on-premises software with no SaaS option.
Helios is a backup agent installed on protected workloads.

5. The Cohesity App Framework runs containerized third-party apps on cluster nodes. What is its primary architectural advantage?

It moves data off-cluster to dedicated analytics servers.
It exploits data gravity — apps run where the data already lives, avoiding petabyte-scale network transfers, and are sandboxed away from the data path.
It bypasses RBAC for performance reasons.
It replaces Bridge for protocol serving.

Figure: Layered Cohesity DataPlatform, from hardware up to Helios

Hardware layer: x86 nodes · CPU · memory · NVMe/SSD · HDD
Storage foundation: SpanFS (NFS/SMB/S3/OST) · ScribeStore distributed KV metadata
Core services: Bridge · Apollo · Magneto · Iris · Yoda
Workload products: DataProtect · SmartFiles · SiteContinuity
Control plane: Helios SaaS multicloud management · AI insights · Self-Managed option

Core Services Cheat Sheet

| Service | Role | Responsibilities |
| --- | --- | --- |
| Bridge | SpanFS data path | Chunking, dedupe, compression, encryption, erasure coding, tiering, NFS/SMB/S3 stacks |
| Apollo | Background analytics & MapReduce | Garbage collection, post-process dedupe, indexing, scrubbing, file analytics |
| Magneto | Data protection orchestration | Backups, snapshots, replication, archive, recovery; integrates with vCenter, DBs, NAS, cloud |
| Iris | Management UI/control plane | Web UI, REST API, CLI dispatch, RBAC enforcement |
| ScribeStore | Distributed KV metadata | Inodes, chunk locations, snapshot trees |
| Yoda | Global search | Cross-cluster file/object search, surfaced via Helios |

flowchart LR
    HW["Hardware Layer<br/>x86 Nodes"] --> SP["SpanFS + ScribeStore"]
    SP --> CS["Core Services<br/>Bridge / Apollo / Magneto / Iris / Yoda"]
    CS --> WP["Workload Products<br/>DataProtect / SmartFiles / SiteContinuity"]
    WP --> H["Helios SaaS Control Plane<br/>(or Helios Self-Managed)"]
    style HW fill:#1f6feb,stroke:#58a6ff,color:#ffffff
    style SP fill:#1f6feb,stroke:#58a6ff,color:#ffffff
    style CS fill:#1f6feb,stroke:#58a6ff,color:#ffffff
    style WP fill:#1f6feb,stroke:#58a6ff,color:#ffffff
    style H fill:#238636,stroke:#58a6ff,color:#ffffff
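
One way to drill the cheat sheet is to treat it as a lookup from responsibility to owning service. The sketch below is a study aid only: the `SERVICES` mapping copies the responsibilities listed in the cheat sheet, while the data structure and the `owners` helper are illustrative assumptions, not a Cohesity API.

```python
# Responsibilities copied from the cheat sheet; lower-cased for matching.
SERVICES = {
    "Bridge": {
        "chunking", "dedupe", "compression", "encryption",
        "erasure coding", "tiering", "nfs/smb/s3 stacks",
    },
    "Apollo": {
        "garbage collection", "post-process dedupe", "indexing",
        "scrubbing", "file analytics",
    },
    "Magneto": {"backups", "snapshots", "replication", "archive", "recovery"},
    "Iris": {"web ui", "rest api", "cli dispatch", "rbac enforcement"},
    "ScribeStore": {"inodes", "chunk locations", "snapshot trees"},
    "Yoda": {"cross-cluster file/object search"},
}


def owners(responsibility):
    """Return the services whose listed duties include this exact phrase."""
    key = responsibility.strip().lower()
    return sorted(name for name, duties in SERVICES.items() if key in duties)


if __name__ == "__main__":
    for phrase in ("Garbage collection", "Tiering", "RBAC enforcement", "Replication"):
        print(f"{phrase}: {owners(phrase)}")
```

Matching is exact on purpose: "dedupe" resolves to Bridge and "post-process dedupe" to Apollo, which mirrors the distinction between the data path and the background services.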

Key Points — Section 3

Post-Quiz: Section 3

1. A scenario states: “The cluster reclaims unused capacity overnight, runs post-process dedupe across all data, and rebuilds file analytics indices.” Which service owns this behavior?

Bridge.
Apollo — cluster-wide MapReduce-style background services.
Magneto — data protection orchestration.
Iris — UI and RBAC.

2. Which service is responsible for chunking, dedupe, compression, encryption, erasure coding, tiering, and serving the NFS/SMB/S3 protocol stacks?

Magneto.
Yoda.
Bridge — the SpanFS data path.
ScribeStore.

3. A customer wants to find every PDF named contract-2024.pdf across 14 clusters in 8 regions and have results returned in seconds. Which service makes this possible?

Iris — per-cluster UI search.
ScribeStore — key-value metadata.
Yoda — the global indexing and search service surfaced through Helios.
Bridge — data path scan on each cluster.

4. Which statement about Helios is correct?

Helios stores all customer backup data centrally for SaaS access.
Helios is the SaaS multicloud control and insight plane; it does not host customer backup data, only control and observability.
Helios runs only inside a customer’s data center as on-premises software with no SaaS option.
Helios is a backup agent installed on protected workloads.

5. The Cohesity App Framework runs containerized third-party apps on cluster nodes. What is its primary architectural advantage?

It moves data off-cluster to dedicated analytics servers.
It exploits data gravity — apps run where the data already lives, avoiding petabyte-scale network transfers, and are sandboxed away from the data path.
It bypasses RBAC for performance reasons.
It replaces Bridge for protocol serving.

Section 4: Hardware, Cloud, and Virtual Edition Form Factors

Domain 2 routinely asks you to choose a form factor. The wrong hardware decision can sink an otherwise-correct architecture, so understand the trade-offs across Cohesity-branded appliances, ReadyNodes, Virtual Edition, Cloud Edition, and Robo Edition.

Pre-Quiz: Section 4

1. An insurance company runs 18 branch offices, each with ~2 TB of data, no local IT, and a need to replicate to a regional hub. Which Cohesity form factor is the architect-grade pick?

A pair of physical ReadyNodes per branch with full local IT staffing.
Robo Edition replicating back to the regional hub, centrally managed by Helios.
Cloud Edition in AWS us-east-1 for each branch.
Virtual Edition on a single laptop per branch.

2. A regulated, classified, dark-site customer cannot use any SaaS dependency. Which combination is appropriate?

Cloud Edition in AWS GovCloud + Helios SaaS.
Physical cluster + Helios SaaS only.
Physical cluster + Helios Self-Managed for on-prem fleet management.
Virtual Edition on a public cloud hypervisor + Helios SaaS.

3. Which statement about Cloud Edition is most accurate?

Cloud Edition runs the same Bridge/Apollo/Magneto stack as physical clusters but uses cloud-provider block storage; it is the foundation of CloudReplicate and CloudSpin patterns.
Cloud Edition replaces SpanFS with native AWS S3 to avoid software dependencies.
Cloud Edition is a SaaS-only offering with no customer-controlled VMs.
Cloud Edition runs only on-premises with cloud archive disabled.

4. A Cisco UCS shop wants Cohesity but must align procurement with their existing OEM partnership. Which form factor fits?

Cohesity-branded appliance, no OEM involvement.
Cisco-branded ReadyNode — OEM hardware running Cohesity software with Cohesity software TAC.
Custom whitebox server self-procured outside the HCL.
Robo Edition centrally managed.

5. Which sequence best describes how form-factor decisions should be made?

Pick the cheapest hardware first, then derive RPO/RTO from what it supports.
Decide on a single form factor for the entire fleet regardless of site type.
Form-factor choice is downstream of business requirements (RPO, RTO, sovereignty, branch IT capability) — derive the form factor from the requirement.
Pick the form factor based on the architect’s personal vendor preference.

Figure: Form-Factor Decision Branching (DC / Branch / Cloud / Lab / Air-Gapped)

Starting point: new Cohesity workload; identify the deployment location.
Appliance / ReadyNode: data center deployment; Cohesity-branded or Cisco/HPE/Dell ReadyNode
Robo Edition: branch / ROBO; 1- or 3-node small footprint; replicates to regional hub
Virtual Edition: lab / POC / management cluster; VMware vSphere or Hyper-V; zero hardware cost
Cloud Edition: public cloud DR target; native AWS / Azure / GCP; CloudReplicate & CloudSpin
Physical + Self-Managed: sovereign / air-gapped; Helios Self-Managed; no SaaS dependency

Form-factor choice is downstream of business and security requirements (RPO, RTO, branch capability, sovereignty), not the other way around.

Form Factor Selection Matrix

| Scenario | Recommended Form Factor | Why |
| --- | --- | --- |
| 200 TB enterprise DC, mixed VM/DB | ReadyNode (Cisco/HPE) or branded appliance | Predictable performance, dense capacity, partner support |
| 50 TB classified dark site | Physical cluster + Helios Self-Managed | No SaaS dependency, on-prem control |
| 30 retail branch offices @ 2 TB | Robo Edition replicating to regional hub | Small footprint, central Helios management |
| AWS-resident DR target | Cloud Edition | Native cloud, supports CloudReplicate & CloudSpin |
| Lab / proof-of-concept | Virtual Edition on existing vSphere | Zero hardware cost, fast spin-up |
| Cisco UCS shop in production | Cisco-branded ReadyNode | Fits procurement and ops model |

flowchart TD
    START["New Cohesity Workload"]
    Q1{"Where will the<br/>cluster physically run?"}
    APPLIANCE["Appliance / ReadyNode<br/>Data center"]
    ROBO["Robo Edition<br/>Branch office"]
    CLOUDED["Cloud Edition<br/>Public cloud"]
    VE["Virtual Edition<br/>Lab / POC / Mgmt"]
    DARKSITE["Physical + Helios Self-Managed<br/>Sovereign / air-gapped"]
    START --> Q1
    Q1 -->|On-prem DC| APPLIANCE
    Q1 -->|Remote branch| ROBO
    Q1 -->|Public cloud| CLOUDED
    Q1 -->|Test / mgmt| VE
    Q1 -->|Air-gapped| DARKSITE
    style START fill:#238636,stroke:#58a6ff,color:#ffffff
    style Q1 fill:#1f6feb,stroke:#58a6ff,color:#ffffff
    style APPLIANCE fill:#1f6feb,stroke:#58a6ff,color:#ffffff
    style ROBO fill:#1f6feb,stroke:#58a6ff,color:#ffffff
    style CLOUDED fill:#1f6feb,stroke:#58a6ff,color:#ffffff
    style VE fill:#1f6feb,stroke:#58a6ff,color:#ffffff
    style DARKSITE fill:#238636,stroke:#58a6ff,color:#ffffff
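
The decision branching above can be expressed as a small lookup, which makes the "requirement first, form factor second" ordering explicit. This is a sketch of the flowchart only: the `Location` enum and the `recommend` function are illustrative names, and a real design would also weigh RPO, RTO, capacity, and procurement constraints that this mapping leaves out.

```python
from enum import Enum


class Location(Enum):
    """Where the cluster will physically run (branches of the flowchart)."""
    ON_PREM_DC = "on-prem data center"
    REMOTE_BRANCH = "remote branch"
    PUBLIC_CLOUD = "public cloud"
    TEST_OR_MGMT = "test / management"
    AIR_GAPPED = "air-gapped"


# One entry per branch of the flowchart.
FORM_FACTOR = {
    Location.ON_PREM_DC: "Appliance / ReadyNode",
    Location.REMOTE_BRANCH: "Robo Edition",
    Location.PUBLIC_CLOUD: "Cloud Edition",
    Location.TEST_OR_MGMT: "Virtual Edition",
    Location.AIR_GAPPED: "Physical + Helios Self-Managed",
}


def recommend(location):
    """Return the form factor the flowchart assigns to a deployment location."""
    return FORM_FACTOR[location]


if __name__ == "__main__":
    for loc in Location:
        print(f"{loc.value}: {recommend(loc)}")
```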

Key Points — Section 4

Post-Quiz: Section 4

1. An insurance company runs 18 branch offices, each with ~2 TB of data, no local IT, and a need to replicate to a regional hub. Which Cohesity form factor is the architect-grade pick?

A pair of physical ReadyNodes per branch with full local IT staffing.
Robo Edition replicating back to the regional hub, centrally managed by Helios.
Cloud Edition in AWS us-east-1 for each branch.
Virtual Edition on a single laptop per branch.

2. A regulated, classified, dark-site customer cannot use any SaaS dependency. Which combination is appropriate?

Cloud Edition in AWS GovCloud + Helios SaaS.
Physical cluster + Helios SaaS only.
Physical cluster + Helios Self-Managed for on-prem fleet management.
Virtual Edition on a public cloud hypervisor + Helios SaaS.

3. Which statement about Cloud Edition is most accurate?

Cloud Edition runs the same Bridge/Apollo/Magneto stack as physical clusters but uses cloud-provider block storage; it is the foundation of CloudReplicate and CloudSpin patterns.
Cloud Edition replaces SpanFS with native AWS S3 to avoid software dependencies.
Cloud Edition is a SaaS-only offering with no customer-controlled VMs.
Cloud Edition runs only on-premises with cloud archive disabled.

4. A Cisco UCS shop wants Cohesity but must align procurement with their existing OEM partnership. Which form factor fits?

Cohesity-branded appliance, no OEM involvement.
Cisco-branded ReadyNode — OEM hardware running Cohesity software with Cohesity software TAC.
Custom whitebox server self-procured outside the HCL.
Robo Edition centrally managed.

5. Which sequence best describes how form-factor decisions should be made?

Pick the cheapest hardware first, then derive RPO/RTO from what it supports.
Decide on a single form factor for the entire fleet regardless of site type.
Form-factor choice is downstream of business requirements (RPO, RTO, sovereignty, branch IT capability) — derive the form factor from the requirement.
Pick the form factor based on the architect’s personal vendor preference.

Answer Explanations