Chapter 12: SmartFiles — Files, Objects, and Unstructured Data Services
Learning Objectives
Architect SmartFiles for primary file and object workloads on top of an existing Cohesity cluster, choosing the right View Box and Storage Domain shape per workload class.
Compare SMB3, NFSv3/v4, and S3 access semantics on a Cohesity View, including cross-protocol identity mapping and locking.
Apply quotas, QoS policies, and hot/cold tiering policies to Views and explain why QoS must be selected up front.
Design data protection for SmartFiles, including snapshots, DR replication, file audit logging, and ICAP-based antivirus scanning.
Plan migrations from legacy NetApp or Isilon using a combination of cold-data tiering, the Cohesity NAS File Migration Service, and backup-driven cross-filer restores.
For most enterprises, the largest pool of "data sprawl" is unstructured. Cohesity SmartFiles is the product that turns the same DataPlatform you already use as a backup target into a primary, multi-protocol unstructured-data service. SmartFiles is not a separate appliance to learn — it is a consumption mode of the cluster you have already designed.
1. SmartFiles Architecture
Pre-Section Quiz — SmartFiles Architecture
1. A View in SmartFiles is best described as:
A separate physical volume per protocol (one for SMB, one for NFS, one for S3).
A logical container in SpanFS that can be exposed simultaneously as SMB3, NFSv3/v4, and S3 against the same data.
A protection group definition for backup workloads only.
A network VIP used to load-balance NAS clients.
2. The View Box (Storage Domain) primarily defines:
The DNS name a client uses to mount.
The Active Directory forest used for authentication.
Storage efficiency, resiliency (RF/EC), encryption, tiering, and default quotas for the Views inside it.
The replication target cluster.
3. How does SmartFiles handle the "writable S3 clone" pattern (parallel writes from S3 against a live NFS/SMB View)?
By taking byte-range locks on the S3 side that block NFS/SMB writers.
By spawning an instant zero-copy SnapTree clone that is exposed as a separate writable S3 bucket.
By making the live View read-only across all protocols whenever S3 writes occur.
It is not supported — S3 writes always block NFS access.
From SpanFS to View: One File System, Many Faces
SmartFiles is not a separate product riding on top of Cohesity — it is a way of consuming SpanFS, the same distributed file system that holds backup snapshots, archived databases, and replicated VMs. SpanFS exposes a single global namespace across every node in the cluster with strict consistency. That single-namespace property is what lets the same logical object be a file to an NFS client, a share to an SMB client, and an object to an S3 client at the same time, without copying data into protocol-specific silos.
The central SmartFiles construct is the View. A View is a logical container that lives inside a View Box (Storage Domain) and that can be exposed simultaneously as:
An NFS export, with mount-point / volume semantics (NFSv3 and NFSv4 are both supported).
An SMB share, with Windows file share semantics (SMB3 with signing and encryption).
An S3 bucket, where files become S3 objects keyed by their path in the View.
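The "one file, three names" idea above can be made concrete with a small sketch. It is illustrative only: the host name, the View name, and the assumption that the bucket name equals the View name are hypothetical, not taken from Cohesity documentation.

```python
from pathlib import PurePosixPath


def view_paths(view: str, relative_path: str, host: str) -> dict:
    """Return the protocol-specific names for one file stored in a View.

    Assumes (hypothetically) that the NFS export, SMB share, and S3 bucket
    are all named after the View itself.
    """
    rel = PurePosixPath(relative_path)
    return {
        # NFS: host:/export/path
        "nfs": f"{host}:/{view}/{rel}",
        # SMB: UNC path with backslash separators
        "smb": "\\\\" + host + "\\" + view + "\\" + str(rel).replace("/", "\\"),
        # S3: the object key is the file's path inside the View
        "s3": f"s3://{view}/{rel}",
    }


paths = view_paths("projects", "design/specs/v2.pdf", "cluster-vip.example.com")
for protocol, name in paths.items():
    print(protocol, name)
```

All three names resolve to the same underlying data; nothing is copied per protocol.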
flowchart LR
SpanFS[SpanFS Distributed File System Single Global Namespace]
VB[View Box / Storage Domain Policy Boundary]
V[View Logical Container]
SMB[SMB3 Windows Shares]
NFS[NFSv3 / NFSv4 UNIX Mounts]
S3[S3 Object Buckets]
SpanFS --> VB
VB --> V
V --> SMB
V --> NFS
V --> S3
SMB -.same data.-> NFS
NFS -.same data.-> S3
The View Box — which newer documentation calls a Storage Domain — is the policy container for the Views inside it. The View Box is where you define storage efficiency, resiliency (RF2/RF3 or erasure coding), encryption, tiering policy, and default quotas. A common architectural pattern is to maintain at least two Storage Domains: one tuned for backup landing (high dedupe, erasure-coded, HDD-biased) and one tuned for primary file/object workloads (SSD-biased, lower dedupe ratio target) so that backup ingest cannot starve a busy SmartFiles user share.
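As a planning aid, the two-domain pattern can be written down as data before anything is configured. The sketch below is a design worksheet, not a Cohesity API: the class, field names, domain names, and the specific RF/EC and quota choices are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StorageDomain:
    name: str
    resiliency: str                       # "RF2", "RF3", or "EC"
    encryption: bool
    media_bias: str                       # "SSD" or "HDD"
    default_view_quota_gib: Optional[int]  # None means no default quota


# Hypothetical design choices for the two workload classes.
BACKUP_LANDING = StorageDomain("sd-backup", "EC", True, "HDD", None)
PRIMARY_FILES = StorageDomain("sd-files", "RF2", True, "SSD", 1024)


def domain_for(workload: str) -> StorageDomain:
    """Route backup ingest and primary file/object Views to separate domains."""
    return BACKUP_LANDING if workload == "backup" else PRIMARY_FILES


print(domain_for("backup").name)
print(domain_for("smb-user-share").name)
```

Keeping the mapping explicit makes it harder for a new View to land in the backup domain by accident.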
Each protocol carries its own identity model:
NFS uses UID/GID (and in NFSv4, name@domain principals).
SMB3 uses Windows SIDs and Kerberos principals.
S3 uses bucket policies and IAM-style access keys.
Cohesity bridges these via AD/LDAP integration. Cross-protocol locking is enforced inside SpanFS: an SMB3 oplock blocks conflicting NFS access per the locking semantics SpanFS exposes.
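The identity bridge can be pictured as a lookup that ties one directory entry to its per-protocol forms. The following is a minimal sketch with a hypothetical in-memory directory; in practice AD/LDAP holds these attributes, and the record layout here is an assumption for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Identity:
    principal: str  # name@domain form (NFSv4, Kerberos)
    sid: str        # Windows SID seen by SMB3
    uid: int        # POSIX UID seen by NFSv3
    gid: int        # POSIX primary GID


# Hypothetical directory contents.
DIRECTORY = [
    Identity("alice@example.com", "S-1-5-21-1111-2222-3333-1001", 10001, 10000),
    Identity("bob@example.com", "S-1-5-21-1111-2222-3333-1002", 10002, 10000),
]


def by_uid(uid: int) -> Optional[Identity]:
    return next((i for i in DIRECTORY if i.uid == uid), None)


def by_sid(sid: str) -> Optional[Identity]:
    return next((i for i in DIRECTORY if i.sid == sid), None)


def same_user(uid: int, sid: str) -> bool:
    """True when an NFS UID and an SMB SID resolve to the same directory entry."""
    nfs_side, smb_side = by_uid(uid), by_sid(sid)
    return nfs_side is not None and nfs_side == smb_side


print(same_user(10001, "S-1-5-21-1111-2222-3333-1001"))
```

If the directory is unreachable, neither lookup resolves, which is why directory availability shows up again in DR planning.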
Animation: Multi-Protocol Simultaneous Access — SMB Write + NFS Read + S3 Read on the Same Data
Key Points — SmartFiles Architecture
SmartFiles is a consumption mode of SpanFS, not a separate product or appliance.
The View is the multi-protocol logical container; the View Box / Storage Domain is the policy boundary that sets dedupe, RF/EC, encryption, tiering, and default quotas.
A single View can expose SMB3, NFSv3/v4, and S3 simultaneously against the same underlying data; SpanFS enforces cross-protocol locking and AD/LDAP mediates identity translation.
Use separate Storage Domains for backup landing vs. primary file/object workloads to prevent backup ingest from starving SmartFiles latency.
The "writable S3 clone" pattern uses an instant zero-copy SnapTree clone exposed as a separate writable bucket; the live View stays unaffected.
Post-Section Quiz — SmartFiles Architecture
1. A View in SmartFiles is best described as:
A separate physical volume per protocol (one for SMB, one for NFS, one for S3).
A logical container in SpanFS that can be exposed simultaneously as SMB3, NFSv3/v4, and S3 against the same data.
A protection group definition for backup workloads only.
A network VIP used to load-balance NAS clients.
2. The View Box (Storage Domain) primarily defines:
The DNS name a client uses to mount.
The Active Directory forest used for authentication.
Storage efficiency, resiliency (RF/EC), encryption, tiering, and default quotas for the Views inside it.
The replication target cluster.
3. How does SmartFiles handle the "writable S3 clone" pattern (parallel writes from S3 against a live NFS/SMB View)?
By taking byte-range locks on the S3 side that block NFS/SMB writers.
By spawning an instant zero-copy SnapTree clone that is exposed as a separate writable S3 bucket.
By making the live View read-only across all protocols whenever S3 writes occur.
It is not supported — S3 writes always block NFS access.
2. Quotas, QoS, and Tiering
Pre-Section Quiz — Quotas, QoS, and Tiering
1. In SmartFiles, the relationship between quota and alert-limit is best described as:
The alert-limit is the hard cap; the quota is the soft warning.
The quota is the enforced cap; the alert-limit is the soft warning that fires before enforcement.
Both are advisory; SpanFS never enforces either.
The quota only applies to S3 buckets; the alert-limit only applies to NFS/SMB.
2. Why must the QoS policy be selected at View creation time?
Because changing it on a busy View is non-trivial and may require data movement.
Because QoS is a billing-only setting and locks at view creation.
Because Cohesity's UI does not let you create a View without picking one.
Because QoS determines the SMB share name.
3. After a cold block has been tiered to cloud, what happens when an SMB client opens that file?
The client sees an error and must explicitly rehydrate the file.
SmartFiles transparently recalls the block; the client sees the same path, may experience first-read latency, and may incur egress cost.
The file appears truncated to zero bytes until tier-up completes.
The cluster automatically promotes the entire View back to local storage.
4. Which QoS policy best matches an active SMB user share that needs SSD-class latency?
Backup Target Low.
TestAndDev High.
Archive / general-purpose HDD.
CloudArchive Cold.
Quotas: Capacity Governance
SmartFiles supports both per-View quotas and per-user quotas inside a View, with audit logs of usage and Helios REST endpoints (getViewUserQuotas) to drive reporting and chargeback. Storage-Domain defaults are typically configured via the CLI parameters default-view-quota (in GiB) and default-view-quota-alert-limit.
A subtlety the CCAE exam can test: Cohesity does not sharply distinguish "soft" from "hard" quotas in the NetApp sense. Instead, think of:
The quota itself as the enforced cap.
The alert-limit as the soft trigger — the value at which an operator gets a warning.
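The two thresholds can be expressed as a simple three-state check. This sketch assumes both values are absolute sizes in GiB and that the alert-limit sits at or below the quota; the function name and state labels are hypothetical.

```python
def quota_state(used_gib: float, quota_gib: float, alert_limit_gib: float) -> str:
    """Classify usage against the alert-limit (warning) and the quota (enforced cap)."""
    if alert_limit_gib > quota_gib:
        raise ValueError("alert-limit should not exceed the quota")
    if used_gib >= quota_gib:
        return "enforced"   # the cap has been reached
    if used_gib >= alert_limit_gib:
        return "alert"      # operator is warned, writes still succeed
    return "ok"


for used in (700, 850, 1000):
    print(used, quota_state(used, quota_gib=1000, alert_limit_gib=800))
```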
QoS: Workload Placement
Two CCAE design rules: (1) Pick the QoS at View creation, because changing it on a busy View is non-trivial and may require data movement. (2) Match QoS to workload, not to who paid for it. Putting an active SMB user share on "Backup Target Low" tanks latency; putting a backup target on "TestAndDev High" wastes SSD.
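A design review can catch mismatches before Views are created. The sketch below uses only the two policy names mentioned above; the workload-class labels and the function are hypothetical, and a real design would cover more classes.

```python
# Recommended pairing taken from the two examples in the text.
RECOMMENDED_QOS = {
    "backup_target": "Backup Target Low",
    "active_smb_share": "TestAndDev High",
}


def qos_mismatch(workload_class: str, assigned_policy: str) -> bool:
    """True when the planned policy differs from the recommended one."""
    return RECOMMENDED_QOS[workload_class] != assigned_policy


planned = [
    ("active_smb_share", "Backup Target Low"),  # would hurt latency
    ("backup_target", "Backup Target Low"),     # matches
]
for workload, policy in planned:
    print(workload, policy, "MISMATCH" if qos_mismatch(workload, policy) else "ok")
```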
Tiering: Hot/Cold Placement
SmartFiles applies policy-driven tiering across SSD, HDD, and S3-compatible cloud targets. Cold blocks move out without breaking the namespace — clients still see the file or object at the same path, and access triggers a transparent recall. Tiering is configured at the Storage Domain or View level and applies to all protocols simultaneously: a file tiered to S3 that is then read via SMB or NFS or the S3 API behaves identically.
Architects should explicitly model:
Recall latency — first read after tier-down hits cloud egress latency.
Working set sizing — local SSD/HDD must still hold the active working set.
Egress cost — repeated recalls of the same dataset can dwarf the storage savings.
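The egress point is easiest to see with arithmetic. The sketch below compares monthly storage savings with monthly recall cost; every price and volume is a made-up placeholder, so substitute real figures from your provider and cluster.

```python
def monthly_net_savings(
    cold_tib: float,
    local_cost_per_tib: float,
    cloud_cost_per_tib: float,
    recalled_tib: float,
    egress_cost_per_tib: float,
) -> float:
    """Storage savings from tiering minus the egress cost of recalls, per month."""
    storage_savings = cold_tib * (local_cost_per_tib - cloud_cost_per_tib)
    egress_cost = recalled_tib * egress_cost_per_tib
    return storage_savings - egress_cost


# Hypothetical numbers: 100 TiB tiered, 20 vs. 5 per TiB-month, 90 per TiB egress.
light_recall = monthly_net_savings(100, 20.0, 5.0, recalled_tib=10, egress_cost_per_tib=90.0)
heavy_recall = monthly_net_savings(100, 20.0, 5.0, recalled_tib=20, egress_cost_per_tib=90.0)
print(light_recall)  # positive: tiering pays off
print(heavy_recall)  # negative: recalls cost more than the storage saved
```

With these placeholder prices, tiering stops paying for itself once more than about 16.7 TiB is recalled per month.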
Key Points — Quotas, QoS, and Tiering
Tiering moves cold blocks transparently across all protocols; the namespace is preserved and access triggers automatic recall.
Model recall latency, working-set sizing, and egress cost when designing tiering policies.
Post-Section Quiz — Quotas, QoS, and Tiering
1. In SmartFiles, the relationship between quota and alert-limit is best described as:
The alert-limit is the hard cap; the quota is the soft warning.
The quota is the enforced cap; the alert-limit is the soft warning that fires before enforcement.
Both are advisory; SpanFS never enforces either.
The quota only applies to S3 buckets; the alert-limit only applies to NFS/SMB.
2. Why must the QoS policy be selected at View creation time?
Because changing it on a busy View is non-trivial and may require data movement.
Because QoS is a billing-only setting and locks at view creation.
Because Cohesity's UI does not let you create a View without picking one.
Because QoS determines the SMB share name.
3. After a cold block has been tiered to cloud, what happens when an SMB client opens that file?
The client sees an error and must explicitly rehydrate the file.
SmartFiles transparently recalls the block; the client sees the same path, may experience first-read latency, and may incur egress cost.
The file appears truncated to zero bytes until tier-up completes.
The cluster automatically promotes the entire View back to local storage.
4. Which QoS policy best matches an active SMB user share that needs SSD-class latency?
Backup Target Low.
TestAndDev High.
Archive / general-purpose HDD.
CloudArchive Cold.
3. Data Protection for SmartFiles
Pre-Section Quiz — Data Protection for SmartFiles
1. ICAP integration in SmartFiles is best described as:
An asynchronous batch scan that runs nightly.
A synchronous protocol that fans out write paths to one or more configured ICAP servers (Trellix, Symantec, Sophos) for AV scanning before commit.
A bolt-on appliance that sits in front of the cluster.
A replacement for AD/LDAP authentication.
2. For SmartFiles DR replication to a remote cluster, which prerequisite is most often overlooked?
Adding extra disk capacity at the source cluster.
Ensuring AD/LDAP is reachable from the DR site so SMB and NFSv4 identities resolve post-failover.
Disabling deduplication on replicated streams.
Removing snapshots before failover.
3. SmartFiles snapshots are powered by:
A traditional copy-on-write block store with capacity penalties.
SnapTree, providing zero-copy snapshots and clones.
The legacy NAS array's hardware snapshot engine.
A separate snapshot service running outside SpanFS.
Snapshots and Policies
Every View can be snapshotted on a schedule using the same Protection Policies covered in Chapter 7 — frequency, hierarchical retention (daily / weekly / monthly / yearly), and lock attributes. SnapTree gives near-zero overhead for snapshots, so retention windows can be aggressive without paying capacity penalties. Snapshots are mountable as read-only Views, which makes "previous versions" workflows straightforward for SMB users.
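Hierarchical retention can be sketched as "each snapshot is kept for the longest tier it qualifies for". The example below is a generic grandfather-father-son model, not Cohesity's Protection Policy implementation; the tier rules (Sunday is weekly, the 1st is monthly, 1 January is yearly) and the retention lengths are assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Retention:
    daily_days: int = 14
    weekly_days: int = 56
    monthly_days: int = 365
    yearly_days: int = 2555


def keep_days(snapshot_day: date, r: Retention) -> int:
    """Return the longest retention tier this snapshot qualifies for."""
    if snapshot_day.month == 1 and snapshot_day.day == 1:
        return r.yearly_days
    if snapshot_day.day == 1:
        return r.monthly_days
    if snapshot_day.weekday() == 6:  # Sunday
        return r.weekly_days
    return r.daily_days


def is_retained(snapshot_day: date, today: date, r: Retention) -> bool:
    return (today - snapshot_day).days <= keep_days(snapshot_day, r)


policy = Retention()
today = date(2024, 3, 25)
for day in (date(2024, 3, 5), date(2024, 3, 3), date(2024, 3, 1), date(2024, 1, 1)):
    print(day, keep_days(day, policy), is_retained(day, today, policy))
```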
For ransomware resilience, combine snapshot policies with DataLock (Chapter 11) so that a locked snapshot cannot be deleted until its lock period elapses or a quorum approves the deletion.
DR Replication for SmartFiles
Views replicate to a remote cluster using the same replication engine that DataProtect uses. Architectural notes:
AD/LDAP must be reachable from the DR site for SMB and NFSv4 access to resolve identities post-failover.
DNS / VIP planning matters more than for backup workloads — clients connect on protocol VIPs.
S3 endpoint URLs must be planned end-to-end; cloud-native applications often hard-code bucket endpoints.
File Audit Logging
SmartFiles ships native file audit logging that records per-event activity (open, read, write, rename, delete, ACL change) on Views. Audit events can be forwarded to syslog or to a SIEM platform, which removes the need for bolt-on third-party audit appliances. Audit logging is also a control for ransomware detection: an unusually high rate of rename-then-delete operations on a user share is a classic encryption signature.
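The rename-then-delete signature can be detected with a sliding-window counter on the SIEM side. This is a minimal sketch: the event tuple layout, window, and threshold are hypothetical and do not reflect the actual SmartFiles audit record format.

```python
from collections import defaultdict, deque


def flag_suspicious_users(events, window_seconds=60, threshold=100):
    """Flag users whose rename/delete count inside the window reaches the threshold.

    `events` is an iterable of (timestamp_seconds, user, operation) tuples,
    assumed to be sorted by timestamp.
    """
    recent = defaultdict(deque)  # user -> timestamps of recent rename/delete events
    flagged = set()
    for ts, user, op in events:
        if op not in ("rename", "delete"):
            continue
        window = recent[user]
        window.append(ts)
        # Drop events that have fallen out of the sliding window.
        while window and ts - window[0] > window_seconds:
            window.popleft()
        if len(window) >= threshold:
            flagged.add(user)
    return flagged


# Synthetic data: one user renames and deletes 200 files in about 20 seconds.
burst = [(i // 10, "mallory", "rename" if i % 2 == 0 else "delete") for i in range(200)]
normal = [(5, "alice", "write"), (9, "alice", "rename"), (30, "alice", "read")]
events = sorted(burst + normal, key=lambda e: e[0])
print(flag_suspicious_users(events))
```

Thresholds need tuning per share: a bulk reorganization by an administrator can look similar, so treat a hit as a trigger for investigation rather than proof.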
Anti-Virus and ICAP Integration
SmartFiles integrates antivirus scanning natively through ICAP (Internet Content Adaptation Protocol). When ICAP AV is enabled on a View, writes are fanned out to one or more configured ICAP servers (Trellix, Symantec, Sophos) and scanned before the data is committed.
For the CCAE exam, remember:
ICAP is a synchronous scan; size scanner pools accordingly or scope ICAP to specific Views (e.g., user shares but not render scratch).
ICAP scan results integrate with audit logs and DataHawk where deployed.
Cohesity's ICAP support is a feature of SmartFiles itself; it is not an additional appliance.
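Because the scan sits in the write path, scanner pool sizing is a queueing question. A rough first estimate can use Little's law (scans in flight = arrival rate × mean scan time). The sketch below is a back-of-envelope aid; the rates, scan time, per-server concurrency, and headroom factor are hypothetical inputs, not vendor guidance.

```python
import math


def icap_servers_needed(
    files_per_second: float,
    mean_scan_seconds: float,
    scans_per_server: int,
    headroom: float = 0.7,
) -> int:
    """Estimate ICAP server count from Little's law with a utilization headroom."""
    in_flight = files_per_second * mean_scan_seconds   # average concurrent scans
    usable_per_server = scans_per_server * headroom    # keep servers below saturation
    return max(1, math.ceil(in_flight / usable_per_server))


# Hypothetical user-share workload: 400 file writes/s, 50 ms per scan, 8 scans per server.
print(icap_servers_needed(400, 0.05, 8))
```

Scoping ICAP to fewer Views lowers `files_per_second`, which is the lever the first bullet above describes.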
Key Points — Data Protection for SmartFiles
SmartFiles inherits SnapTree-based zero-copy snapshots and the same replication engine DataProtect uses.
Pair snapshot policies with DataLock for WORM immutability against ransomware.
For DR, AD/LDAP reachability, VIP/DNS strategy, and S3 endpoint planning are first-class design decisions.
ICAP AV is synchronous and built in — size scanner pools accordingly and scope ICAP to Views that need it.
Post-Section Quiz — Data Protection for SmartFiles
1. ICAP integration in SmartFiles is best described as:
An asynchronous batch scan that runs nightly.
A synchronous protocol that fans out write paths to one or more configured ICAP servers (Trellix, Symantec, Sophos) for AV scanning before commit.
A bolt-on appliance that sits in front of the cluster.
A replacement for AD/LDAP authentication.
2. For SmartFiles DR replication to a remote cluster, which prerequisite is most often overlooked?
Adding extra disk capacity at the source cluster.
Ensuring AD/LDAP is reachable from the DR site so SMB and NFSv4 identities resolve post-failover.
Disabling deduplication on replicated streams.
Removing snapshots before failover.
3. SmartFiles snapshots are powered by:
A traditional copy-on-write block store with capacity penalties.
SnapTree, providing zero-copy snapshots and clones.
The legacy NAS array's hardware snapshot engine.
A separate snapshot service running outside SpanFS.
4. Migration and Modernization
Pre-Section Quiz — Migration and Modernization
1. Cohesity's packaged NAS File Migration Service is sized at approximately:
3 TB per migration event.
30 TB per migration event.
300 TB per migration event.
Unlimited per migration event.
2. The "transparent cold-data tiering" path lets architects:
Move only cold blocks off the legacy filer to SmartFiles or cloud while the legacy NAS keeps serving hot data.
Force a full cutover of all data immediately.
Replace the legacy filer's controller hardware in place.
Use S3 as a primary tier for hot data only.
3. The strongest architectural argument SmartFiles makes against NetApp ONTAP and Dell PowerScale is:
SmartFiles uses faster network cards.
SmartFiles is the only platform that supports SMB.
Consolidation: the same platform serves backup, files, objects, archive, and DR with global dedupe and native audit/ICAP.
SmartFiles is free.
Three Migration Paths
NAS File Migration Service — packaged Professional Services engagement covering cluster prep, planning, cutover, and end-state docs, sized at ~30 TB per migration event. For larger estates, plan multiple cutover events.
Transparent cold-data tiering — SmartFiles scans the source NAS, classifies data by access pattern, and policy-tiers cold blocks to Cohesity (or cloud) without rehydration on access. The legacy NAS keeps serving hot data; SmartFiles silently absorbs the cold tail.
Backup-driven cross-filer restore — back up the source NAS to Cohesity, then restore directly into a SmartFiles View (or even into a different NAS array). Especially useful when SMB/NFS permission preservation matters.
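Choosing among the three paths is usually done per share. A minimal sketch of such a rule follows; the ordering of the checks and the 70% cold threshold are assumptions for illustration, not Cohesity guidance.

```python
def migration_path(cold_fraction: float, permissions_critical: bool,
                   cold_threshold: float = 0.7) -> str:
    """Pick a migration path for one share (hypothetical decision rule)."""
    if not 0.0 <= cold_fraction <= 1.0:
        raise ValueError("cold_fraction must be between 0 and 1")
    if permissions_critical:
        # The text singles out this path when SMB/NFS permission preservation matters.
        return "backup-driven cross-filer restore"
    if cold_fraction >= cold_threshold:
        # Mostly cold: let the legacy NAS keep the hot remainder.
        return "transparent cold-data tiering"
    return "NAS File Migration Service cutover"


print(migration_path(0.9, permissions_critical=False))
print(migration_path(0.2, permissions_critical=False))
print(migration_path(0.5, permissions_critical=True))
```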
Cohesity SmartFiles vs. NetApp ONTAP and Dell PowerScale (Isilon)
| Capability | Cohesity SmartFiles | NetApp ONTAP | Dell PowerScale (Isilon) |
| --- | --- | --- | --- |
| Architecture | SpanFS, hyperconverged | HA pairs (cluster of pairs) | OneFS scale-out |
| Single namespace | Cluster-wide | Per SVM | Cluster-wide |
| Multi-protocol same data | NFSv3/v4 + SMB3 + S3 | NFS + SMB + S3 (bolt-on) | NFS + SMB + (S3 via OneFS or ECS) |
| Global dedupe | Yes, cluster-wide | Volume/aggregate | Limited; per-volume |
| Cold-tier cloud | Yes, transparent recall, all protocols | FabricPool | CloudPools |
| Native ICAP AV | Yes | Yes (Vscan) | Yes (CAVA / ICAP) |
| Native audit | Yes | FPolicy | Audit subsystem |
| Backup target | Native | Possible, not designed for | Possible, not designed for |
| Single platform: backup + primary | Yes | No | No |
Lift-and-Shift Sequence
Discovery. Run SmartFiles analytics against source filers; produce hot/warm/cold map per share.
Decision tree. Per share: tier-off vs. full cutover vs. backup-driven restore.
Identity and AV. Stand up AD/LDAP and ICAP scanner pools before any cutover.
QoS placement. Map each share to a QoS class up front.
Cutover windows. Plan at the 30-TB-per-event sizing; use replication seeding to minimize cutover time.
Decommission. Retire source filer once SmartFiles View is authoritative and snapshots have aged.
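For the cutover-window step, shares can be grouped into events that respect the 30 TB sizing. The sketch below uses first-fit decreasing bin packing, which is one reasonable heuristic and not part of the Cohesity service; the share names and sizes are hypothetical.

```python
def plan_cutover_events(shares: dict, event_tb: float = 30.0) -> list:
    """Group shares into cutover events of at most `event_tb` using first-fit decreasing."""
    events = []  # each entry: [remaining_capacity_tb, [share names]]
    for name, size in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
        if size > event_tb:
            raise ValueError(f"{name} exceeds one event; split it before planning")
        for event in events:
            if event[0] >= size:
                event[0] -= size
                event[1].append(name)
                break
        else:
            # No existing event has room, so open a new one.
            events.append([event_tb - size, [name]])
    return [names for _, names in events]


shares_tb = {"home": 22, "eng": 18, "media": 12, "finance": 8}
plan = plan_cutover_events(shares_tb)
for number, names in enumerate(plan, start=1):
    print(f"event {number}: {names}")
```

Here 60 TB of shares fits into two events; a share larger than one event has to be split (for example by top-level directory) before it can be scheduled.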
flowchart TD
Legacy[Legacy NAS NetApp ONTAP / Dell Isilon]
Disc[Discovery and Analytics hot/warm/cold map]
Decision{Decision tree per share}
Cutover[Full Cutover Path NAS Migration Service ~30 TB per event]
Tier[Cold-Data Tier-Off Path policy-driven tiering legacy keeps hot data]
Backup[Backup-Driven Path NAS backup + cross-filer restore]
Prep[Pre-cutover prep AD/LDAP + ICAP + QoS]
SF[SmartFiles View SMB3 + NFSv4 + S3]
Decom[Decommission legacy]
Legacy --> Disc
Disc --> Decision
Decision --> Cutover
Decision --> Tier
Decision --> Backup
Cutover --> Prep
Backup --> Prep
Tier --> SF
Prep --> SF
SF --> Decom
Animation: NAS Migration Workflow — Legacy NAS to File Migration Service to SmartFiles
Key Points — Migration and Modernization
Three migration paths: NAS File Migration Service (~30 TB/event), transparent cold-data tier-off, and backup-driven cross-filer restore.
Cold-data tier-off lets the legacy NAS keep serving hot data while SmartFiles silently absorbs the cold tail.
Backup-driven restore preserves SMB/NFS ACLs better than rsync/robocopy in many cases.
Stand up AD/LDAP and ICAP before any cutover — not after.
The strategic argument vs. NetApp/Isilon is consolidation: one platform for backup, files, objects, archive, and DR.
Post-Section Quiz — Migration and Modernization
1. Cohesity's packaged NAS File Migration Service is sized at approximately:
3 TB per migration event.
30 TB per migration event.
300 TB per migration event.
Unlimited per migration event.
2. The "transparent cold-data tiering" path lets architects:
Move only cold blocks off the legacy filer to SmartFiles or cloud while the legacy NAS keeps serving hot data.
Force a full cutover of all data immediately.
Replace the legacy filer's controller hardware in place.
Use S3 as a primary tier for hot data only.
3. The strongest architectural argument SmartFiles makes against NetApp ONTAP and Dell PowerScale is:
SmartFiles uses faster network cards.
SmartFiles is the only platform that supports SMB.
Consolidation: the same platform serves backup, files, objects, archive, and DR with global dedupe and native audit/ICAP.