Chapter 11: Security, Encryption, and Ransomware Resilience

Learning Objectives

Backups used to be the last thing an attacker thought about. Today they are the first. Modern ransomware operators routinely spend days or weeks inside an environment specifically hunting for backup admin credentials. For a Cohesity architect, security is the central design axis around which encryption, immutability, isolation, detection, and recovery are organized.

Defense-in-Depth Layers

Figure: Defense in Depth, five concentric layers building outward from hardware to data immutability: Hardware (SED + FIPS modules), OS (hardened Linux), Software (SpanFS + TLS), Identity (SSO + MFA + quorum), Data (DataLock).

Section 1: Encryption at Rest and In Transit

Pre-Section Check - Encryption

1. A federal agency requires that pulling drives from a Cohesity chassis must yield ciphertext, with crypto-erase capability that wipes a drive instantly. Which encryption approach best fits?

Software AES at the SpanFS layer with internal KMS
Self-Encrypting Drives (SEDs) on FIPS-validated ReadyNodes
SMB3 encryption for all Views
TLS 1.3 with mutual authentication

2. What happens to encrypted data on a Cohesity cluster if the external KMIP/KMS revokes the KEK?

The cluster falls back to internal keys automatically
Data becomes inaccessible - acting as a global kill-switch
Replication continues but local reads fail
Only new writes are blocked; existing data is unaffected

3. Which is true about FIPS 140-2/140-3 mode on a Cohesity cluster?

It can be set per-View for tenant isolation
It is a cluster-wide setting and most easily enabled at deployment
It only affects in-transit traffic, not at-rest encryption
It permits MD5 and SHA-1 for backwards compatibility

Software Encryption vs. Self-Encrypting Drives

Cohesity supports two parallel at-rest encryption paths. Software encryption is performed by SpanFS itself: every chunk is encrypted with AES-256 using a Data Encryption Key (DEK) wrapped by a Key Encryption Key (KEK). The advantage is portability - it works on any node, physical, virtual, or cloud - with small CPU overhead absorbed by AES-NI. Self-Encrypting Drives push encryption into drive firmware; the drive holds the Media Encryption Key and refuses to release plaintext without an authentication key. Pulling a drive yields ciphertext.

Factor | Software Encryption | SED
Where it runs | SpanFS / Cohesity software | Drive firmware (FIPS 140-2/3)
Form factor | All (physical, VE, Cloud) | Physical appliances and ReadyNodes only
Performance impact | Small (AES-NI accelerated) | None on the host
Crypto-erase | Re-key wipes data logically | PSID revert wipes drive instantly

Customer-Managed Keys via KMIP

At enterprise scale, KEKs live in a customer-controlled KMS (Thales CipherTrust, Entrust KeyControl, HashiCorp Vault via KMIP, AWS KMS, Azure Key Vault). The cluster generates DEKs locally; the KMS wraps and unwraps them with the KEK over KMIP/TLS. Revoking the KEK acts as a global kill-switch: no DEK can be unwrapped, so the encrypted data becomes inaccessible.

sequenceDiagram
    participant Cluster as Cohesity Cluster
    participant KMS as KMIP / KMS Server
    participant Disk as SpanFS Chunk File
    Cluster->>Cluster: Generate local DEK
    Cluster->>KMS: Request KEK wrap (KMIP/TLS)
    KMS-->>Cluster: Wrapped DEK released
    Cluster->>Disk: Encrypt chunk with DEK
    Note over Cluster,KMS: On read: cluster requests unwrap
    Cluster->>KMS: Unwrap DEK request
    KMS-->>Cluster: Plaintext DEK (in-memory)
    Cluster->>Disk: Decrypt chunk
    Note over KMS: Revoke KEK = global kill-switch
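The wrap/unwrap flow can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions, not Cohesity's implementation: `ToyKMS`, `toy_cipher`, `write_chunk`, and `read_chunk` are hypothetical names, and a SHA-256 keystream XOR stands in for AES-256 so the example runs on the standard library alone. It is not secure and only demonstrates the key hierarchy and the kill-switch behavior.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter-mode keystream.

    Stand-in for AES-256 (assumption made so the sketch needs no third-party
    library). NOT secure. Applying it twice with the same key restores the input.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


class ToyKMS:
    """Hypothetical external KMS that holds the KEK."""

    def __init__(self) -> None:
        self._kek = secrets.token_bytes(32)
        self.revoked = False

    def _require_active(self) -> None:
        if self.revoked:
            raise PermissionError("KEK revoked: DEKs can no longer be unwrapped")

    def wrap(self, dek: bytes) -> bytes:
        self._require_active()
        return toy_cipher(self._kek, dek)

    def unwrap(self, wrapped_dek: bytes) -> bytes:
        self._require_active()
        return toy_cipher(self._kek, wrapped_dek)


def write_chunk(kms: ToyKMS, plaintext: bytes) -> tuple[bytes, bytes]:
    """Generate a DEK locally, encrypt the chunk, and keep only the wrapped DEK."""
    dek = secrets.token_bytes(32)
    return kms.wrap(dek), toy_cipher(dek, plaintext)


def read_chunk(kms: ToyKMS, wrapped_dek: bytes, ciphertext: bytes) -> bytes:
    """Ask the KMS to unwrap the DEK, then decrypt the chunk in memory."""
    return toy_cipher(kms.unwrap(wrapped_dek), ciphertext)
```

Setting `kms.revoked = True` makes every later `read_chunk` call raise `PermissionError`, even though the ciphertext and wrapped DEK are still on disk. That is the kill-switch effect in miniature.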

TLS, SMB3, NFS Encryption, and FIPS Mode

Every interface that leaves a node is TLS-protected (1.2 or 1.3 with administrator-installed CA-signed certificates). Replication is encrypted, compressed, and deduplicated on the wire. SMB3 supports per-session encryption, and NFSv4.1 with Kerberos krb5p provides privacy - configured per View. FIPS mode is cluster-wide, restricts cipher suites to FIPS-approved algorithms, disables MD5/SHA-1 for signatures, forces SEDs to FIPS-validated configuration, and is most easily enabled at deployment time.
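As an illustration of the "TLS 1.2 or 1.3 with CA-signed certificates" requirement from the client side, the following sketch builds a Python `ssl` context that refuses anything older than TLS 1.2 and verifies the server certificate. The function name `make_client_context` and the optional `ca_file` parameter are assumptions for this example; nothing here is a Cohesity API.

```python
import ssl
from typing import Optional


def make_client_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Return a client-side TLS context that requires TLS 1.2 or newer.

    ca_file is a hypothetical path to the CA bundle that signed the cluster
    certificate; when omitted, the system trust store is used.
    """
    ctx = ssl.create_default_context(cafile=ca_file)
    # create_default_context already enables certificate and hostname checks;
    # the floor below rejects TLS 1.0 and 1.1 handshakes.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

A script that calls a cluster's REST endpoint could pass this context to `urllib.request.urlopen(url, context=ctx)`.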

Key Points

Section 2: Immutability and DataLock

Pre-Section Check - Immutability

4. A regulator demands that no individual, regardless of role, can delete records before retention expires. Which DataLock mode applies?

Governance lock
Compliance lock
Legal hold
SpanFS baseline immutability only

5. Once a DataLock policy locks a snapshot, which of these is permitted during the lock window?

Cluster admin deletes the snapshot
Security Officer shortens retention
Extending retention further
In-place modification of the gold copy

SpanFS Baseline Immutability

By default every Cohesity backup snapshot is stored as a read-only, immutable object. The original gold copy cannot be mounted, modified, encrypted in place, or deleted by any external system. Read/write workflows (instant recovery, dev/test) get zero-cost, redirect-on-write clones; the gold copy stays intact, defeating the most common ransomware pattern of overwriting backup files in place.

DataLock Policies and WORM

DataLock adds a hardened, time-bound, role-bound enforcement layer applied by a Security Officer - a role distinct from the cluster admin. Once applied, a snapshot enters WORM state for the configured retention. During the lock window it cannot be deleted or modified by any user, including the cluster admin or any other account with full privileges. Retention can always be extended; in Compliance mode it can never be shortened or removed, even by the Security Officer who applied the lock, while Governance mode lets the Security Officer shorten or remove it. DataLock applies equally to copies tiered or archived to CloudArchive and FortKnox.

DataLock Lifecycle

Figure: DataLock lifecycle states: Created (snapshot taken) -> Locked (WORM, retention countdown running, cannot delete or shorten) -> Expired (timer ends) -> Released (deletable). A delete attempt during the Locked window is blocked. In Compliance mode, even the Security Officer cannot shorten or remove the lock during the Locked state.

Compliance vs. Governance

Mode | Who can shorten retention | Use case | Mapping
Governance | Security Officer can shorten/remove | Internal policy, operational immutability | Best-practice hygiene
Compliance | Nobody - not even Security Officer | SEC 17a-4, FINRA, HIPAA retention floors | Strict regulatory immutability
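The difference between the two modes comes down to one rule about who may shorten retention. The sketch below models that rule. It is a simplified illustration, not Cohesity's enforcement logic; `LockMode`, `LockedSnapshot`, and the role string `"security_officer"` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class LockMode(Enum):
    GOVERNANCE = "governance"
    COMPLIANCE = "compliance"


@dataclass
class LockedSnapshot:
    mode: LockMode
    locked_until: datetime

    def change_retention(self, new_until: datetime, role: str) -> None:
        """Extend freely; shorten only in Governance mode as Security Officer."""
        if new_until >= self.locked_until:
            self.locked_until = new_until  # extending is always permitted
            return
        if self.mode is LockMode.GOVERNANCE and role == "security_officer":
            self.locked_until = new_until
            return
        raise PermissionError(f"{role} cannot shorten a {self.mode.value} lock")

    def delete(self, now: datetime) -> None:
        """Deletion is blocked for every role while the lock window is open."""
        if now < self.locked_until:
            raise PermissionError("snapshot is in WORM state until lock expiry")
```

In this model a Compliance-mode snapshot rejects a shortened date from every role, while the same call succeeds for a Governance-mode snapshot when the role is the Security Officer.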

Legal Hold and Quorum Approval

Legal hold is an indefinite freeze applied when litigation is anticipated; it extends retention until explicitly released, regardless of the DataLock timer. For sensitive operations outside the locked window - Protection Group deletion, retention shortening, target removal, legal hold release - Cohesity supports quorum approval (e.g., 2-of-3 or 3-of-5). A compromised account can request the destructive action; it cannot finish it without independent approvers.
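A k-of-n quorum gate is easy to model. The sketch below assumes that the requester's own approval does not count and that only named approvers may vote; both are assumptions made for illustration. `QuorumRequest` and its fields are hypothetical names, not a Cohesity API.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Set


@dataclass
class QuorumRequest:
    action: str
    requester: str
    eligible: FrozenSet[str]          # the n designated approvers
    required: int                     # the k approvals needed
    approvals: Set[str] = field(default_factory=set)

    def approve(self, user: str) -> None:
        if user == self.requester:
            raise PermissionError("requester cannot approve their own request")
        if user not in self.eligible:
            raise PermissionError(f"{user} is not a designated approver")
        self.approvals.add(user)      # a set ignores duplicate votes

    @property
    def approved(self) -> bool:
        return len(self.approvals) >= self.required
```

With `required=2` and three eligible approvers, a stolen account can open the request, but `approved` stays false until two other people vote.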

Key Points

Section 3: Cyber Vaulting with Cohesity FortKnox

Pre-Section Check - FortKnox

6. Which architectural feature most distinguishes FortKnox from CloudArchive?

Lower cost per TB
Persistent network connection to cloud
Virtual air gap with mandatory multi-person quorum
Customer-managed S3 buckets only

7. A customer needs a 7-year cost-optimized cold archive replacing tape. Which is appropriate?

FortKnox
CloudArchive
SmartFiles tier
CloudReplicate to Cloud Edition

FortKnox is a SaaS-delivered cyber vault - Data Isolation and Recovery as a Service (DIRaaS) - that stores an immutable, isolated tertiary copy of backup data in a Cohesity-managed cloud tenant on AWS, Azure, or Google Cloud. The customer subscribes; Cohesity operates the vault.

The Virtual Air Gap

Figure: FortKnox virtual air gap cycle. The gap between the source Cohesity cluster and the FortKnox vault (WORM storage on AWS, Azure, or GCP, guarded by 2-of-3 quorum) stays closed by default, opens only for the transfer window (for example 02:00-04:00) so the snapshot can flow to the vault, then closes again. A ransomware attack that arrives while the gap is closed is blocked at the gap.
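The transfer-window idea reduces to a time-of-day check. The sketch below is a toy model of that check only; the function name `gap_is_open` and the default 02:00-04:00 window are illustrative assumptions, and the real service controls connectivity in ways this does not attempt to represent.

```python
from datetime import time


def gap_is_open(now: time, start: time = time(2, 0), end: time = time(4, 0)) -> bool:
    """Return True if `now` falls inside the transfer window [start, end)."""
    if start <= end:
        return start <= now < end
    # Window wraps past midnight, e.g. 23:00-01:00.
    return now >= start or now < end
```

Outside the window the function returns `False`, which in this model means no path to the vault exists for an attacker to use.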

Mandatory Multi-Person Quorum

FortKnox enforces multi-person quorum approval for sensitive operations - recoveries, retention changes, vault configuration - at the vault level. Two or more authorized users must approve before the operation proceeds. The control is purpose-built for insider-threat and stolen-credential scenarios.

FortKnox vs. CloudArchive

Dimension | FortKnox (DIRaaS) | CloudArchive
Primary purpose | Isolated, immutable cyber-recovery vault | Long-term cloud tiering / archive
Connectivity | Virtual air gap; transfer windows only | Persistent connection to cloud target
Approval | Mandatory multi-person quorum | Standard MFA + RBAC
Operating model | Cohesity-managed SaaS | Customer's own buckets
Use case | 3-2-1-1-0 ransomware copy | 7-year retention, tape replacement

Heuristic: scenarios mentioning "cyber recovery," "isolated copy," "air gap," "ransomware blast radius" -> FortKnox. Scenarios mentioning "long-term retention," "tape replacement," "cold storage" -> CloudArchive.

Key Points

Section 4: Ransomware Detection and Recovery with DataHawk

Pre-Section Check - DataHawk

8. Which DataHawk capability identifies the last-known-good recovery point during a suspected encryption attack?

BigID classification
Anomaly detection (entropy + change rate)
Threat intelligence feed
YARA rule authoring

9. Which DataHawk feature provides 100,000+ daily-refreshed indicators of compromise to find malware in backups?

Anomaly detection
Threat intelligence (with YARA + CrowdStrike feeds)
BigID classification
SpanFS deduplication

10. Which DataHawk feature would be used to scope a HIPAA breach notification by identifying PHI exposure in affected files?

Anomaly detection
Threat intelligence
BigID classification (200+ patterns, 50+ policies)
DataLock Compliance mode

DataHawk is the AI/ML-driven security service in the Cohesity Data Cloud, packaging three capabilities into a SaaS that answers the three incident questions: Is there an attack? Where is the malware? What sensitive data was exposed?

graph TD
    DH[Cohesity DataHawk AI/ML Security SaaS]
    DH --> AD[Anomaly Detection]
    DH --> TI[Threat Intelligence]
    DH --> CL[BigID Classification]
    AD --> AD1[Entropy analysis]
    AD --> AD2[Change-rate baselines]
    AD --> AD3[Clean snapshot recommendation]
    TI --> TI1[100K+ IOCs daily]
    TI --> TI2[YARA + CrowdStrike feeds]
    TI --> TI3[Malware hash matching]
    CL --> CL1[200+ patterns]
    CL --> CL2[50+ compliance policies]
    CL --> CL3[PII / PHI / PCI / GDPR]

Anomaly Detection

ML models trained on per-workload baselines inspect data entropy (encrypted-in-place data has near-uniform byte distribution), file/object change rates, write patterns, and file-extension patterns (e.g., .locked, .crypt). Anomaly scores drive the clean snapshot recommendation, pointing administrators to the last-known-good recovery point - critical because naive "most recent backup" restore often restores the encryption itself.
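The entropy signal is straightforward to compute. The sketch below calculates Shannon entropy in bits per byte; encrypted data approaches the maximum of 8. The function names and the 7.5 threshold are illustrative assumptions, not DataHawk's model. Compressed files also score high, which is one reason a per-workload baseline matters more than a fixed cutoff.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def looks_encrypted(data: bytes, threshold: float = 7.5) -> bool:
    """Flag data whose byte distribution is close to uniform.

    The threshold is a hypothetical value chosen for illustration.
    """
    return shannon_entropy(data) >= threshold
```

Plain text such as log files typically scores well below the threshold, so a sudden jump toward 8 bits per byte across many files in one snapshot is a strong hint of in-place encryption.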

Threat Intelligence and YARA

DataHawk consumes a continuously updated feed of 100,000+ IOCs, refreshed daily from 160,000+ sources, including curated YARA rules, CrowdStrike Falcon Intelligence, and Cohesity-curated default libraries. Customers do not author rules. DataHawk identifies malware hashes, the specific files containing them, and the variant/family.
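Hash matching, the simplest form of IOC scanning, can be sketched as follows. The function names and the in-memory `files` mapping are assumptions for illustration; a real feed also carries YARA rules and other indicator types that this example does not cover.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def scan_for_iocs(files: dict[str, bytes], ioc_hashes: set[str]) -> list[str]:
    """Return the sorted paths whose content hash appears in the IOC set.

    `files` maps a path to file contents; `ioc_hashes` is a hypothetical set
    of known-bad SHA-256 digests taken from a threat feed.
    """
    return sorted(path for path, data in files.items()
                  if sha256_hex(data) in ioc_hashes)
```

Because the lookup is a set membership test, scan cost grows with the number of files rather than with the size of the indicator feed.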

BigID-Powered Classification

Classification ships with 200+ predefined patterns and 50+ out-of-the-box compliance policies (PII, PHI/HIPAA, PCI, GDPR). Classification reports tell responders exactly which categories of sensitive data lived in affected files - essential for meeting HIPAA's 60-day and GDPR's 72-hour breach notification deadlines.
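Pattern-based classification can be illustrated with two regular expressions. These patterns are simplified stand-ins written for this example, not BigID's actual patterns, and `PATTERNS` and `classify` are hypothetical names.

```python
import re

# Simplified, illustrative patterns; production classifiers add validation
# and context checks to cut false positives.
PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def classify(text: str) -> dict[str, int]:
    """Count matches per sensitive-data category, omitting categories with no hits."""
    counts = {name: len(rx.findall(text)) for name, rx in PATTERNS.items()}
    return {name: n for name, n in counts.items() if n}
```

Running `classify` over the text of each affected file yields the per-category counts a responder needs to scope a notification.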

Clean-Room Recovery

  1. Identify the clean recovery point via DataHawk's ML recommendation.
  2. Provision an isolated environment (alternate cluster, alternate VLAN, recovery-only VPC).
  3. Restore from FortKnox or DataLock-protected snapshot into the clean room.
  4. Run threat intel scans, AV, and integrity validation.
  5. Cut over only after validation passes.
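Step 1 above, choosing the clean recovery point, can be sketched as picking the newest snapshot taken before the first anomalous one. The `Snapshot` type, the score scale, and the 0.8 threshold are assumptions for illustration and do not reflect DataHawk's actual scoring.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class Snapshot:
    taken_at: datetime
    anomaly_score: float  # hypothetical scale: 0.0 (normal) to 1.0 (anomalous)


def last_known_good(snapshots: Iterable[Snapshot],
                    threshold: float = 0.8) -> Optional[Snapshot]:
    """Return the newest snapshot older than the first anomalous one.

    Returns None when even the oldest snapshot is anomalous.
    """
    clean = None
    for snap in sorted(snapshots, key=lambda s: s.taken_at):
        if snap.anomaly_score >= threshold:
            break  # everything from here on may contain encrypted data
        clean = snap
    return clean
```

Stopping at the first anomalous snapshot, rather than taking the newest low-scoring one, avoids restoring a later snapshot that merely looks quiet because the attack had already finished.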

Key Points

Section 5: Hardening, Compliance, and Defense-in-Depth

Pre-Section Check - Hardening

11. What is the primary architectural value of quorum approval (e.g., 2-of-3) on destructive Cohesity operations?

Speeds up DR runbooks
Provides resilience against single-credential compromise
Replaces MFA
Enables tenant self-service

12. Which Cohesity controls best support SEC 17a-4 / FINRA WORM retention requirements?

DataLock Governance + standard MFA
DataLock Compliance mode + legal hold + immutable audit trail
CloudArchive with lifecycle policy only
SpanFS baseline only

Identity, MFA, and Quorum

Audit Logging and SIEM

Forward audit logs to the enterprise SIEM (Splunk, Sentinel, Chronicle, QRadar) and build alerts on DataLock changes, quorum approval requests, failed MFA attempts, role assignment changes, and KMIP fetch failures.
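An alert rule over forwarded audit events might look like the sketch below. The JSON-lines format, the `event_type` field, and the event names are all hypothetical, invented for this example; the actual audit schema should be taken from the product documentation.

```python
import json
from typing import Iterable, Iterator

# Hypothetical event names, one per alert condition listed above.
ALERT_EVENT_TYPES = {
    "datalock_changed",
    "quorum_approval_requested",
    "mfa_failed",
    "role_assignment_changed",
    "kmip_fetch_failed",
}


def alert_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed audit events whose type warrants a SIEM alert."""
    for line in lines:
        event = json.loads(line)
        if event.get("event_type") in ALERT_EVENT_TYPES:
            yield event
```

In practice this filter would live in the SIEM's own rule language. The point is that the alert set is small and explicit, which keeps the rule easy to audit.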

Compliance Mappings

Framework | Cohesity controls
HIPAA | DataLock retention floors, encryption at rest/in transit, audit logging, BigID PHI classification
PCI-DSS | FIPS-validated encryption, KMIP/KMS separation, MFA, role separation
FedRAMP | FIPS 140-2/3 mode, audit logging, FedRAMP-authorized IdPs
SEC 17a-4 / FINRA | DataLock Compliance mode (WORM), legal hold, immutable audit trail
GDPR | BigID classification (PII), data subject access via search, retention controls

Worked Example: 500 TB Hospital

  1. FIPS-validated SED ReadyNodes; KMIP to Thales CipherTrust; FIPS 140-2 mode.
  2. Three policies (Clinical, Fileshares, M365) all with DataLock Compliance mode for 7-year retention floor.
  3. Replicate to DR cluster + FortKnox vault on AWS, transfer window 02:00-04:00, 2-of-3 quorum (CISO, VP Infra, Compliance).
  4. DataHawk subscribed: anomaly dashboard daily, IOC scanning, BigID with HIPAA/PII/PCI policies.
  5. Azure AD SAML SSO, mandatory MFA, audit to Microsoft Sentinel.
  6. Quarterly clean-room recovery rehearsal of EHR DB from FortKnox.

Key Points
