Chapter 13: Helios SaaS, Marketplace Apps, and Automation

CCAE Exam Preparation — Interactive Study Guide

Learning Objectives

If Chapters 1 through 12 taught you how to design, deploy, secure, and protect data on individual Cohesity clusters, this chapter zooms out to the operating model an architect actually inherits in production: a fleet. Real CCAE-scale customers run anywhere from three or four clusters to several hundred, and they cannot afford for the same Gold protection policy to mean three different things in three different regions. The architectural answer is the Helios SaaS control plane plus the Marketplace and the automation stack.

Think of it this way: a single Cohesity cluster is a server. Helios is the cloud console for the whole fleet, and the automation tools are the deployment pipelines that keep that fleet in compliance with whatever the architecture document says it should be.

Section 1: Helios SaaS Control Plane

Pre-Quiz — Helios SaaS Control Plane

1. Which network direction does Helios require for cluster connectivity?

Inbound TCP/443 from Cohesity SaaS to each cluster
Outbound HTTPS (TCP/443) from cluster to Helios
Bidirectional VPN tunnel
Inbound on a dedicated port range opened in the customer firewall

2. What flows from a managed cluster up to the Helios SaaS service?

Backup data and deduplication chunks
Operational metadata only (job status, capacity, alerts, source inventory, policy IDs)
Customer primary production data for analytics
SpanFS chunk files mirrored for disaster recovery

3. A federal classified site forbids any outbound SaaS connectivity. Which Helios option fits?

Disable Helios entirely and manage clusters individually
Use Helios over a dedicated MPLS leased line
Helios Self-Managed — customer-hosted variant on customer infrastructure
DMaaS in a GovCloud region

Helios Architecture and Tenancy Model

Cohesity Helios is delivered as a SaaS service that aggregates and centralizes management of every Cohesity cluster a customer owns, regardless of whether those clusters live on-premises, in a public cloud, or in a hybrid topology. From an architect's standpoint, Helios is the single pane of glass that turns a fleet of independent clusters into one logical, policy-governed estate.

Architecturally, Helios is a multi-tenant cloud service hosted by Cohesity. Each customer gets a Helios account (a tenant), and clusters are bound to that account during onboarding. Connectivity from cluster to Helios is outbound HTTPS over TCP/443 — Cohesity does not require any inbound openings on customer firewalls, which is exactly what enterprise security architects want to hear. The cluster establishes a long-lived TLS session to Helios, sends telemetry, accepts pushed configuration, and exposes a relay channel for "Launch Cluster UI" proxying.

Helios Layer | Responsibility | Lives Where
Helios UI / API | Single pane of glass; entry point for admins, APIs, dashboards | Cohesity SaaS
Helios services (telemetry, search index, reporting) | Aggregate cluster metadata, run anomaly detection | Cohesity SaaS
Cluster agent / Helios connector | Outbound HTTPS tunnel from each cluster | On each managed cluster
Data plane (SpanFS, Bridge, Magneto) | Stores and protects data; Helios does not see data, only metadata | On each managed cluster

Animation 13.1 — Helios fleet management: outbound HTTPS convergence
[Animation: on-premises clusters (Cluster A NYC, Cluster B Frankfurt, Cluster C Singapore), public-cloud Cloud Edition instances (AWS us-east-1, Azure westeurope), and Cohesity-operated DMaaS / FortKnox each connect over HTTPS / TCP-443 to the Helios SaaS control plane (UI / API / Reporting, DataHawk, FortKnox), which the admin reaches from a browser.]
Multi-source convergence: each cluster opens an outbound TLS tunnel; backup data stays local, only operational metadata flows up.

Onboarding, Dashboards, and Federated Capabilities

Onboarding is intentionally trivial: in the cluster UI under Settings → Helios Registration, the admin enters Helios credentials, Helios issues a registration token, and the cluster establishes the outbound tunnel. For dark sites or air-gapped environments where outbound HTTPS is forbidden, Cohesity offers Helios Self-Managed, a customer-hosted variant of the Helios services that runs on customer infrastructure and provides equivalent fleet management without depending on Cohesity's SaaS. It is the standard answer for federal classified environments and certain regulated banks.

Helios presents a unified, real-time view that aggregates health, capacity, protection status, SLA compliance, and performance metrics from every managed cluster. Federated global search lets an admin search for a VM, file, mailbox object, or database backup by name across the entire fleet. Federated RBAC layers on top: granular roles can be scoped to specific clusters, regions, organizations, or object types, sourced from Okta, Azure AD/Entra ID, AD, or any SAML 2.0 IdP.
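
The scoping idea behind federated RBAC can be sketched as a small check. This is an illustrative model only, not the Helios RBAC engine: the RoleScope type, its field names, and the sample cluster and region labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class RoleScope:
    """Hypothetical role scope; an empty set means 'no restriction' on that dimension."""
    clusters: FrozenSet[str] = frozenset()
    regions: FrozenSet[str] = frozenset()
    object_types: FrozenSet[str] = frozenset()

    def allows(self, cluster: str, region: str, object_type: str) -> bool:
        def permitted(allowed: FrozenSet[str], value: str) -> bool:
            # An unrestricted dimension permits everything.
            return not allowed or value in allowed

        return (
            permitted(self.clusters, cluster)
            and permitted(self.regions, region)
            and permitted(self.object_types, object_type)
        )


# A role limited to VM objects in the (hypothetical) EMEA region, on any cluster.
emea_vm_operator = RoleScope(
    regions=frozenset({"EMEA"}),
    object_types=frozenset({"VM"}),
)

print(emea_vm_operator.allows("frankfurt-01", "EMEA", "VM"))  # True
print(emea_vm_operator.allows("nyc-01", "AMER", "VM"))        # False
```

Every dimension must pass, so widening a role means relaxing one dimension explicitly rather than granting fleet-wide access by default.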

Helios-Only Features

Figure 13.1 — Helios SaaS topology (mermaid)

flowchart LR
  subgraph OnPrem["On-Premises Data Centers"]
    C1[Cluster A NYC]
    C2[Cluster B Frankfurt]
    C3[Cluster C Singapore]
  end
  subgraph Cloud["Public Cloud"]
    CE1[Cloud Edition AWS]
    CE2[Cloud Edition Azure]
  end
  subgraph DMaaS["Cohesity-Operated"]
    DM[DMaaS Tenant]
    FK[FortKnox]
  end
  C1 -- "outbound HTTPS 443" --> Helios
  C2 -- "outbound HTTPS 443" --> Helios
  C3 -- "outbound HTTPS 443" --> Helios
  CE1 -- "outbound HTTPS" --> Helios
  CE2 -- "outbound HTTPS" --> Helios
  DM --> Helios
  FK --> Helios
  Helios[("Helios SaaS Control Plane")]
  Helios -- "HTTPS" --> Browser[Admin Browser / API Client]

Key Points

Key Takeaway: Helios converts N independent Cohesity clusters into one managed estate over outbound-HTTPS only. Backup data stays put; metadata flows up; and Self-Managed Helios covers the dark-site case.
Post-Quiz — Helios SaaS Control Plane

1. Which network direction does Helios require for cluster connectivity?

Inbound TCP/443 from Cohesity SaaS to each cluster
Outbound HTTPS (TCP/443) from cluster to Helios
Bidirectional VPN tunnel
Inbound on a dedicated port range opened in the customer firewall

2. What flows from a managed cluster up to the Helios SaaS service?

Backup data and deduplication chunks
Operational metadata only (job status, capacity, alerts, source inventory, policy IDs)
Customer primary production data for analytics
SpanFS chunk files mirrored for disaster recovery

3. A federal classified site forbids any outbound SaaS connectivity. Which Helios option fits?

Disable Helios entirely and manage clusters individually
Use Helios over a dedicated MPLS leased line
Helios Self-Managed — customer-hosted variant on customer infrastructure
DMaaS in a GovCloud region

Section 2: Helios as a Service (HaaS) and DMaaS

Pre-Quiz — HaaS and DMaaS

4. In a DMaaS subscription, who owns cluster ops (upgrades, hardware, capacity)?

The customer, same as on-prem DataProtect
A reseller or VAR partner
Cohesity — clusters, capacity, upgrades, and SLA are managed by Cohesity
It depends on the cloud provider chosen

5. An EU customer is backing up M365 mailboxes via DMaaS. From a residency standpoint, which placement is acceptable?

Any DMaaS region; backup data is exempt from primary residency rules
A region in the EU selected at subscription time and pinned for the lifetime of the tenant
A US region for cost savings, with replication back to the EU
Whichever region has the lowest egress price

6. Which is the most accurate cost-model contrast between on-prem DataProtect and DMaaS?

DMaaS is CapEx-only; on-prem is OpEx-only
On-prem is CapEx + maintenance; DMaaS is OpEx subscription (typically per-FETB/month or per-workload)
Both are perpetual licenses with identical cost structures
DMaaS bills only for egress; the rest is free

Helios is not just a console; it is also the entry point and management surface for Cohesity's Data Management as a Service (DMaaS) offerings. DMaaS shifts the operating model from "I run the cluster" to "I subscribe to backup outcomes": Cohesity operates the underlying clusters in the cloud, and the customer consumes them through Helios.

DataProtect Delivered as a Service

DMaaS bundles DataProtect, replication, archive, and recovery as a fully managed SaaS service. The customer points sources (VMs, M365 tenants, databases, NAS) at the DMaaS endpoint; Cohesity provisions the underlying SpanFS capacity, runs the backup jobs, manages upgrades, and meters consumption. There is no cluster to bootstrap, no node to replace, and no version to upgrade — the SLA covers the platform.

Operating Model | Customer Owns | Cohesity Owns
Self-managed DataProtect (on-prem cluster) | Hardware, OS, network, cluster software, policies, sources | Software releases, support, Helios SaaS
Self-managed DataProtect (Cloud Edition) | Cloud VM/IaaS bill, cluster software ops, policies | Software, Helios SaaS
DMaaS | Sources, policies, RBAC, data residency choice | Cluster, capacity, upgrades, SLA, infrastructure
FortKnox cyber vault | Vault policies, recovery decisions | Vault infrastructure (immutable, air-gapped)

Region Selection and Data Residency

DMaaS is provisioned into specific cloud regions (AWS or Azure, depending on offering). The architect's job is to map regulatory boundaries — GDPR for EU data, sovereignty laws in countries like Germany, Switzerland, India, Australia, Canada — to a region selection. Backup data carries the same residency obligations as primary data; picking a US region for an EU tenant's M365 backups is not an option you can quietly hand-wave past an auditor. Helios surfaces region choice during DMaaS subscription and pins data residency for the lifetime of the tenant.
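
The mapping from regulatory boundary to region can be expressed as a simple pre-subscription check. The sketch below is an assumption-laden illustration: the jurisdiction codes and the region lists are placeholders (the identifiers are AWS-style region names chosen for the example), and actual DMaaS region availability has to be confirmed with Cohesity.

```python
# Hypothetical residency table: jurisdiction -> regions an auditor would accept.
# The region lists are illustrative and do NOT reflect real DMaaS availability.
ALLOWED_REGIONS = {
    "EU": {"eu-central-1", "eu-west-1"},
    "US": {"us-east-1", "us-east-2"},
    "AU": {"ap-southeast-2"},
    "CA": {"ca-central-1"},
}


def region_is_compliant(jurisdiction: str, region: str) -> bool:
    """Return True if the proposed region satisfies the jurisdiction's residency rule."""
    try:
        allowed = ALLOWED_REGIONS[jurisdiction]
    except KeyError:
        # Fail loudly: an unmapped jurisdiction should block the subscription.
        raise ValueError(f"no residency rule defined for {jurisdiction!r}") from None
    return region in allowed


print(region_is_compliant("EU", "eu-central-1"))  # True
print(region_is_compliant("EU", "us-east-1"))     # False
```

Because the region is pinned for the lifetime of the tenant, a check like this belongs before subscription, not after the first audit finding.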

Subscription and Licensing Implications

DMaaS is sold on a subscription basis (typically per FETB-month or per workload tier), whereas self-managed DataProtect commonly uses a perpetual or term license. This shifts the economics from CapEx plus ongoing maintenance to pure OpEx. Architects sizing DMaaS apply the same FETB and change-rate inputs from Chapter 3, but must additionally model the DMaaS-specific concerns contrasted in the comparison below, notably WAN transfer of the daily change rate and region selection for residency.

On-Prem vs. SaaS — Comparison

Concern | On-Prem DataProtect | DMaaS
Cluster ops (upgrades, hardware) | Customer | Cohesity
Capacity planning | Customer (sizing tool, refresh cycles) | Cohesity (elastic)
Network ingress | LAN-speed to cluster | WAN egress to cloud (mind change rate)
Data residency | Wherever you put the cluster | Region selection at subscription time
Cost model | CapEx + maintenance | OpEx subscription
Best fit | Large, dense, predictable workloads; latency-sensitive recoveries | M365, branch offices, cloud-native apps, fast time-to-value
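
The "mind change rate" caveat in the network row is easy to quantify. The following back-of-the-envelope sketch estimates how long the daily changed data takes to cross the WAN. The function name, the 80% link-efficiency default, and the sample inputs are assumptions for illustration; the estimate also ignores deduplication and compression, which would reduce the bytes actually sent.

```python
def daily_transfer_hours(
    fetb_tb: float,
    daily_change_rate: float,
    wan_mbps: float,
    efficiency: float = 0.8,
) -> float:
    """Hours needed to ship one day's changed data over the WAN.

    fetb_tb            front-end terabytes protected (decimal TB)
    daily_change_rate  fraction of data that changes per day, e.g. 0.02
    wan_mbps           nominal WAN bandwidth in megabits per second
    efficiency         assumed usable share of the link (hypothetical default)
    """
    changed_bits = fetb_tb * 1e12 * daily_change_rate * 8
    usable_bits_per_second = wan_mbps * 1e6 * efficiency
    return changed_bits / usable_bits_per_second / 3600


# 100 TB front-end, 2% daily change, 1 Gbps link at 80% efficiency.
print(round(daily_transfer_hours(100, 0.02, 1000), 2))  # 5.56
```

If the result approaches or exceeds the backup window, the workload is a weaker fit for DMaaS than the subscription price alone suggests.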

Key Points

Key Takeaway: DMaaS is DataProtect-as-a-managed-SaaS. Architects pick a region for residency, subscribe by FETB/workload, and consume backup as an outcome rather than an appliance.
Post-Quiz — HaaS and DMaaS

4. In a DMaaS subscription, who owns cluster ops (upgrades, hardware, capacity)?

The customer, same as on-prem DataProtect
A reseller or VAR partner
Cohesity — clusters, capacity, upgrades, and SLA are managed by Cohesity
It depends on the cloud provider chosen

5. An EU customer is backing up M365 mailboxes via DMaaS. From a residency standpoint, which placement is acceptable?

Any DMaaS region; backup data is exempt from primary residency rules
A region in the EU selected at subscription time and pinned for the lifetime of the tenant
A US region for cost savings, with replication back to the EU
Whichever region has the lowest egress price

6. Which is the most accurate cost-model contrast between on-prem DataProtect and DMaaS?

DMaaS is CapEx-only; on-prem is OpEx-only
On-prem is CapEx + maintenance; DMaaS is OpEx subscription (typically per-FETB/month or per-workload)
Both are perpetual licenses with identical cost structures
DMaaS bills only for egress; the rest is free

Section 3: Marketplace Apps

Pre-Quiz — Marketplace Apps

7. What runtime executes Marketplace apps on a Cohesity cluster?

A KVM hypervisor running per-app micro-VMs
Apollo — the cluster's Docker container runtime
Bridge running app code as kernel modules
Magneto job workers

8. In an AppSpec YAML, what does network.egress: false guarantee?

The app cannot read backup data
The app cannot make outbound network calls — it is air-gapped from the internet
The app runs at higher CPU priority
The app may only be scheduled on one node

9. How does a Marketplace app gain access to backup data?

Direct SpanFS file-system access from inside the container
Through admin-authorized View mounts (NFS/SMB) declared in the AppSpec
By calling Bridge RPCs directly
By replicating data to an external store first

The Cohesity Marketplace is the storefront and delivery mechanism for first- and third-party applications that run directly on the Cohesity DataPlatform. The architectural value proposition is compute at data: instead of egressing backup data to a separate analytics, AV, or compliance system, applications run in containers next to SpanFS where the data already lives.

App Framework, Apollo, and Isolation

Apps are packaged as Docker images plus a Cohesity AppSpec — a Kubernetes-style YAML descriptor extended with Cohesity-specific fields. The cluster's container runtime, Apollo (introduced in the Pegasus 6.3 release line), executes the image. The AppSpec declares resources, view mounts, network requirements, and lifecycle hooks.

apiVersion: cohesity.com/v1
kind: App
metadata:
  name: clamav-scanner
  version: 1.4.0
spec:
  image: cohesity-marketplace/clamav:1.4.0
  resources:
    cpu: "2"
    memory: "4Gi"
  views:
    - name: vm-backups
      mountPath: /mnt/backups
      mode: ReadOnly
  network:
    egress: false
  lifecycle:
    onStart: /opt/clamav/scan.sh

Three security properties matter:

  1. views — the only sanctioned data path. The app sees backup data through view mounts (NFS/SMB-backed), restricted to admin-authorized views. There is no direct SpanFS access.
  2. network.egress: false — apps can be locked into air-gapped operation, supporting dark-site and high-security deployments.
  3. resources — Apollo enforces CPU/memory quotas, so a misbehaving app cannot starve the cluster.
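
These three properties lend themselves to an automated pre-deployment review. The sketch below audits an AppSpec that has already been parsed into a Python dict; it is a hypothetical helper, not Cohesity's appspecvalidator, and it assumes the field layout shown in the example AppSpec above.

```python
def audit_appspec(appspec: dict) -> list:
    """Return a list of findings; an empty list means all three checks passed."""
    findings = []
    body = appspec.get("spec", {})

    # 1. views: every declared mount should be read-only.
    for view in body.get("views", []):
        if view.get("mode") != "ReadOnly":
            findings.append(f"view {view.get('name')!r} is not mounted ReadOnly")

    # 2. network.egress: must be explicitly false; a missing key counts as open.
    if body.get("network", {}).get("egress", True):
        findings.append("network.egress is not explicitly false")

    # 3. resources: both quotas must be declared.
    resources = body.get("resources", {})
    for quota in ("cpu", "memory"):
        if quota not in resources:
            findings.append(f"resources.{quota} quota is missing")

    return findings


clamav = {
    "spec": {
        "resources": {"cpu": "2", "memory": "4Gi"},
        "views": [{"name": "vm-backups", "mountPath": "/mnt/backups", "mode": "ReadOnly"}],
        "network": {"egress": False},
    }
}
print(audit_appspec(clamav))  # []
```

Treating a missing egress key as a finding is a deliberate choice in this sketch: it forces the author to state the network posture rather than inherit a default.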
Animation 13.2 — Marketplace deployment: AppSpec → Apollo → Container
[Animation: four steps. 1. AppSpec YAML (image, resources, views, egress=false, lifecycle hooks). 2. Helios push (EULA accepted, target cluster(s), image ref + spec). 3. Apollo runtime (register, enforce CPU/memory quotas, docker run). 4. Container running (isolated, view-mounted, onStart hook fires, SDK init). SDK callbacks use the App SDK, Management SDK, and REST v2; data is reached through SpanFS Views (NFS/SMB, ReadOnly). Security posture: isolated container, admin-authorized views only, resource quotas, optional egress=false. Defense in depth = vetting gate + container isolation + view-scoped data path.]
AppSpec YAML flows through Helios to Apollo, which spins a Docker container; the only data path is admin-mounted Views, and SDK callbacks return through REST API v2.

Marketplace Access and Vetted Apps

App | Purpose | Typical Use Case
SentinelOne | Endpoint/AV scanning of backup data, no internet egress required | Validate backups are clean before relying on them for recovery
ClamAV | Open-source antivirus scanning of NAS and VM snapshots | Cost-effective AV for compliance check-the-box
Sophos | Commercial AV alternative | Enterprise environments standardized on Sophos
Splunk Enterprise | Log analytics / SIEM ingest running on the cluster | Search audit logs and backup metadata in place
Imanis Data | NoSQL/Hadoop backup integration | MongoDB, Cassandra, Hadoop, Couchbase backup

Custom App Development

Cohesity ships two SDKs: the App SDK (primitives like cohesity_mount for view mounting) and the Management SDK (REST API surface from inside the container). The publish flow is build → AppSpec validate (appspecvalidator) → Cohesity vetting (developer@cohesity.com) → list. Vetting + isolation + view scoping form a defense-in-depth posture that makes "third-party code on my backup cluster" an acceptable architectural decision.

Key Points

Key Takeaway: Marketplace turns a Cohesity cluster into a compute-at-data platform — Docker containers isolated by Apollo, scoped to admin-authorized views, with optional internet air-gap.
Post-Quiz — Marketplace Apps

7. What runtime executes Marketplace apps on a Cohesity cluster?

A KVM hypervisor running per-app micro-VMs
Apollo — the cluster's Docker container runtime
Bridge running app code as kernel modules
Magneto job workers

8. In an AppSpec YAML, what does network.egress: false guarantee?

The app cannot read backup data
The app cannot make outbound network calls — it is air-gapped from the internet
The app runs at higher CPU priority
The app may only be scheduled on one node

9. How does a Marketplace app gain access to backup data?

Direct SpanFS file-system access from inside the container
Through admin-authorized View mounts (NFS/SMB) declared in the AppSpec
By calling Bridge RPCs directly
By replicating data to an external store first

Section 4: Automation Stack

Pre-Quiz — Automation Stack

10. An architect wants declarative IaC with state and drift detection for cluster-side policies and protection groups. Which tool fits best?

Raw REST API v2 with cron jobs
Ansible cohesity.dataprotect collection
Terraform cohesity/cohesity provider
PowerShell module ad-hoc scripts

11. Which task is the most natural fit for the Ansible cohesity.dataprotect collection?

Defining storage domains and views as code in Git
Rolling agent installs across a fleet of Linux/Windows servers and registering them as sources
Building a custom self-service ServiceNow portal
Providing drift detection for protection policies

12. A ServiceNow workflow needs to trigger a Cohesity recovery from an incident ticket. Which integration approach is most appropriate?

Embed Terraform inside the ServiceNow flow
Push a Helios button manually each time
Call REST API v2 directly via webhook (Bearer token / Helios API key)
Use Ansible Tower from inside the ServiceNow form

13. Which statement is TRUE about Cohesity's automation tooling layering?

Each tool talks to a different proprietary protocol; there is no shared API
Terraform, Ansible, PowerShell, and direct REST are all thin layers over REST API v2
PowerShell bypasses the API and uses iris_cli directly over SSH
Terraform talks to Helios while the others talk to clusters; they cannot mix

Helios and Marketplace solve the interactive operating model. The programmatic operating model is REST API v2 plus the language-specific wrappers — Terraform, Ansible, and PowerShell. At CCAE scale, automating policy and protection-group management is the only scalable path; manually clicking through hundreds of policies across dozens of clusters is both error-prone and untraceable.

REST API v2 — the bedrock

All higher-level tools sit on top of the cluster's REST API v2 (available on clusters running 6.3.1 or later). It covers protection groups, policies, sources, storage domains, views, alerts, recoveries, and tenants. Direct REST is the right choice when no higher-level wrapper exists, or for purpose-built integrations with ITSM (ServiceNow), SIEM (Splunk, Sentinel), or custom self-service portals.

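# Request an access token. Read the password from COHESITY_PASSWORD rather than
# PWD, which the shell always sets to the current working directory.
# -k skips TLS certificate verification; drop it once the cluster has a trusted certificate.
TOKEN=$(curl -sk -X POST https://cluster.example.com/v2/mcm/access-tokens \
  -H 'Content-Type: application/json' \
  -d '{"username":"admin","password":"'"$COHESITY_PASSWORD"'","domain":"LOCAL"}' \
  | jq -r .accessToken)

# List protection groups using the bearer token.
curl -sk -H "Authorization: Bearer $TOKEN" \
  https://cluster.example.com/v2/data-protect/protection-groups
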
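
For integrations written in a general-purpose language, the same two calls can be sketched with the Python standard library. The endpoint paths and the accessToken field are taken from the curl example above; the function names are hypothetical, and building the requests separately from sending them is a choice made here so the request shape can be inspected without a live cluster.

```python
import json
import urllib.request


def build_token_request(base_url, username, password, domain="LOCAL"):
    """Build (but do not send) the POST that exchanges credentials for a token."""
    body = json.dumps(
        {"username": username, "password": password, "domain": domain}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v2/mcm/access-tokens",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_list_groups_request(base_url, token):
    """Build (but do not send) the GET that lists protection groups."""
    return urllib.request.Request(
        f"{base_url}/v2/data-protect/protection-groups",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def fetch_json(request, timeout=30):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage against a reachable cluster (not executed here):
#   token = fetch_json(build_token_request(url, user, password))["accessToken"]
#   groups = fetch_json(build_list_groups_request(url, token))
```

Unlike curl with -k, urllib verifies TLS certificates by default, so this sketch fails against a cluster that still presents an untrusted certificate.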
Animation 13.3 — Terraform pipeline: commit → plan → apply → fleet
[Animation: git commit (main.tf change, PR merged) → CI trigger (GitHub Actions / Jenkins / GitLab) → tf plan (diff, drift preview, approval gate) → tf apply (cohesity provider over REST v2, .tfstate updated) → fleet (Frankfurt, Singapore, NYC, DMaaS, ...). The sample plan output lists cohesity_protection_policy.gold and cohesity_protection_group.tier1_vms as resources to be created: "Plan: 2 to add, 0 to change, 0 to destroy."]
A single PR cascades through CI to every Cohesity cluster in the fleet — guaranteeing identical Gold semantics globally.

Tool Selection — Comparison

Dimension | Terraform | Ansible | PowerShell | Raw REST v2
Paradigm | Declarative IaC | Imperative + idempotent push | Imperative scripting | Direct HTTP
State | Yes (.tfstate) | None | None | None
Best for | Cluster config: policies, groups, views, RBAC, replication | Source-side: agent installs, source registration, ad-hoc jobs | Windows ops, ad-hoc reporting | ITSM/SIEM webhooks, custom portals, gap-fill
CI/CD fit | Excellent (plan/apply, drift) | Good (AAP / Tower) | Moderate | Excellent (any HTTP-aware tool)
Drift detect | First-class | Re-runs converge | Manual | Manual
Skill | Medium (HCL, state) | Low–medium (YAML) | Low for Windows admins | High for non-trivial flows
Owner | Platform / SRE | Server / source team | Windows ops | Integration / dev team

Architect rule of thumb: Terraform for cluster-side IaC, Ansible for source-side push, PowerShell for Windows-native ops, REST when nothing else fits. All four are thin layers over REST API v2 — pick the right ergonomics for the job.

Figure 13.4 — Tool selection decision tree (mermaid)

flowchart TD
  Start{What are you automating?} --> Q1{Cluster-side config?}
  Q1 -- Yes --> Q2{Need declarative state plus drift detection?}
  Q2 -- Yes --> TF["Terraform cohesity/cohesity provider"]
  Q2 -- No --> Q5
  Q1 -- No --> Q3{Source-side push? agents, registration}
  Q3 -- Yes --> AN["Ansible cohesity.dataprotect"]
  Q3 -- No --> Q4{Windows-native shop?}
  Q4 -- Yes --> PS["PowerShell Module"]
  Q4 -- No --> Q5{Webhook from ITSM/SIEM?}
  Q5 -- Yes --> REST["Raw REST API v2"]
  Q5 -- No --> PS
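
The decision tree in Figure 13.4 translates directly into a small function, which can be handy for self-testing on scenario questions. The function and parameter names are hypothetical; the branch order follows the figure.

```python
def pick_tool(
    cluster_side: bool,
    needs_state_and_drift: bool = False,
    source_side_push: bool = False,
    windows_native: bool = False,
    itsm_or_siem_webhook: bool = False,
) -> str:
    """Walk the Figure 13.4 decision tree and return the suggested tool."""
    if cluster_side:
        if needs_state_and_drift:
            return "Terraform"
        # Cluster-side work without state needs falls through to the webhook question.
        return "REST" if itsm_or_siem_webhook else "PowerShell"
    if source_side_push:
        return "Ansible"
    if windows_native:
        return "PowerShell"
    return "REST" if itsm_or_siem_webhook else "PowerShell"


print(pick_tool(cluster_side=True, needs_state_and_drift=True))   # Terraform
print(pick_tool(cluster_side=False, source_side_push=True))       # Ansible
print(pick_tool(cluster_side=False, itsm_or_siem_webhook=True))   # REST
```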

Worked Example: Terraform Module — Gold Policy + Protection Group

The architectural payoff: every Cohesity cluster in the fleet has a Gold-1h-30d-1y policy that means exactly the same thing — 1-hour incremental, 30-day local retention, 1-year archive, app-consistent VMware snapshots, indexed for search, weekly archive cadence. A change request to extend retention to 45 days is a one-line PR that cascades to every cluster on merge.

resource "cohesity_protection_policy" "gold" {
  name        = "Gold-1h-30d-1y"
  description = "Tier 1: 1h RPO, 30d local retention, 1y archive"

  backup_policy {
    regular {
      # Hourly incrementals give the 1h RPO.
      incremental {
        schedule {
          unit = "Hours"
          hour_schedule { frequency = 1 }
        }
      }
      # 30-day local retention.
      retention {
        unit     = "Days"
        duration = 30
      }
    }
  }

  # Weekly archive, kept for one year.
  remote_target_policy {
    archival_targets {
      target_id = var.archive_target_id
      schedule {
        unit      = "Weeks"
        frequency = 1
      }
      retention {
        unit     = "Years"
        duration = 1
      }
    }
  }
}

resource "cohesity_protection_group" "tier1_vms" {
  name        = "Tier1-VMware-Gold"
  policy_id   = cohesity_protection_policy.gold.id
  environment = "kVMware"

  vmware_params {
    source_id               = var.vcenter_source_id
    object_ids              = var.vm_object_ids
    app_consistent_snapshot = true
    indexing_policy { enable_indexing = true }
  }
}
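
To make the drift-detection idea concrete, the sketch below compares one desired Gold definition against what each cluster reports. It is a conceptual illustration only, not how Terraform computes a plan: the flat dictionary keys and the cluster names are hypothetical stand-ins for the policy attributes above.

```python
def find_drift(desired: dict, actual_by_cluster: dict) -> dict:
    """Map each drifted cluster to {attribute: (desired, actual)} for mismatches."""
    drift = {}
    for cluster, actual in actual_by_cluster.items():
        mismatches = {
            key: (want, actual.get(key))
            for key, want in desired.items()
            if actual.get(key) != want
        }
        if mismatches:
            drift[cluster] = mismatches
    return drift


# Desired Gold semantics, flattened into hypothetical attribute names.
GOLD = {
    "incremental_hours": 1,
    "local_retention_days": 30,
    "archive_retention_years": 1,
}

# One cluster was hand-edited to 14-day retention.
fleet = {
    "nyc": dict(GOLD),
    "frankfurt": {**GOLD, "local_retention_days": 14},
    "singapore": dict(GOLD),
}

print(find_drift(GOLD, fleet))
# {'frankfurt': {'local_retention_days': (30, 14)}}
```

An empty result is the fleet-wide guarantee the chapter describes: every cluster's Gold policy means the same thing.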

Key Points

Key Takeaway: Treat backup configuration as code. Pair Terraform (cluster-side) with Ansible (source-side), use PowerShell for Windows ops, and call REST when nothing else fits. Every cluster in the fleet gets the same Gold policy from the same Git PR.
Post-Quiz — Automation Stack

10. An architect wants declarative IaC with state and drift detection for cluster-side policies and protection groups. Which tool fits best?

Raw REST API v2 with cron jobs
Ansible cohesity.dataprotect collection
Terraform cohesity/cohesity provider
PowerShell module ad-hoc scripts

11. Which task is the most natural fit for the Ansible cohesity.dataprotect collection?

Defining storage domains and views as code in Git
Rolling agent installs across a fleet of Linux/Windows servers and registering them as sources
Building a custom self-service ServiceNow portal
Providing drift detection for protection policies

12. A ServiceNow workflow needs to trigger a Cohesity recovery from an incident ticket. Which integration approach is most appropriate?

Embed Terraform inside the ServiceNow flow
Push a Helios button manually each time
Call REST API v2 directly via webhook (Bearer token / Helios API key)
Use Ansible Tower from inside the ServiceNow form

13. Which statement is TRUE about Cohesity's automation tooling layering?

Each tool talks to a different proprietary protocol; there is no shared API
Terraform, Ansible, PowerShell, and direct REST are all thin layers over REST API v2
PowerShell bypasses the API and uses iris_cli directly over SSH
Terraform talks to Helios while the others talk to clusters; they cannot mix

Chapter Summary

Helios is the architectural answer to fleet sprawl — a SaaS control plane with outbound-HTTPS-only connectivity that converts any number of independent Cohesity clusters into one managed estate, with shared dashboards, federated RBAC, global search, anomaly detection, and centralized policy authoring. Helios Self-Managed delivers the same model on customer infrastructure for dark sites. DMaaS extends Helios into a fully managed subscription where Cohesity operates the cluster — region selection pins residency.

The Marketplace brings third-party compute to where the data already lives: Apollo (Docker runtime) executes apps declared by AppSpec YAML, isolated by container boundaries, scoped to admin-authorized view mounts, optionally air-gapped. The automation stack — REST API v2 plus Terraform, Ansible, and PowerShell — is the programmatic counterpart, with Git, CI/CD, and policy review keeping the fleet honest.
