Chapter 9: Monitoring, Automation, and the Network Analytics Engine (NAE)
Learning Objectives
Compare SNMPv2c and SNMPv3 and explain why authPriv is the only acceptable production profile.
Configure sFlow sampling and polling on AOS-CX, and pick reasonable rates for access vs uplink ports.
Filter Syslog by RFC 5424 severity and forward selectively to a SIEM over the management VRF.
Authenticate to the AOS-CX REST API with session cookies and perform CRUD operations on /system objects.
Describe the NAE script -> agent -> monitor -> condition -> action hierarchy and write a small NAE script skeleton.
Explain the Aruba Central ZTP onboarding sequence and the role of activate.arubanetworks.com.
Choose between Aruba Central (cloud) and NetEdit (on-prem) for fleet orchestration.
9.1 Traditional Monitoring Tools
SNMP, sFlow, Syslog, and mirror sessions are the four traditional pillars. Each answers a different operational question, and AOS-CX supports all four natively so you can integrate with existing NMS, flow collectors, SIEMs, and packet brokers.
SNMPv2c vs SNMPv3
SNMP is primarily poll-based: the manager sends a GET for an OID and the agent replies. Traps and informs are unsolicited messages that the agent pushes when an event occurs; an inform is acknowledged by the receiver, a trap is not. AOS-CX runs v2c and v3 simultaneously.
Feature | SNMPv2c | SNMPv3
Authentication | Community string (cleartext) | User-based MD5/SHA, SHA-2
Encryption | None | DES, 3DES, AES-128/192/256
Integrity | None | HMAC
Security Levels | n/a | noAuthNoPriv, authNoPriv, authPriv
Production? | No | Yes (authPriv only)
switch(config)# snmp-server vrf mgmt
switch(config)# snmpv3 user netops auth sha auth-pass plaintext "S3curePass!" priv aes priv-pass plaintext "Pr1vKey!"
switch(config)# snmp-server host 10.10.10.50 trap version v3 user netops
sFlow Sampling
sFlow samples 1-in-N packets at the ASIC and exports the first ~128 bytes of each sampled packet, plus periodic interface counters, to a collector. The result is statistically representative top-talker data at very low CPU cost. Aruba guidance: 1-in-4096 for 1 GbE access ports; 1-in-8192 to 1-in-16384 for 10/25 GbE uplinks.
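The arithmetic behind 1-in-N sampling can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the split between 1-in-8192 for 10 GbE and 1-in-16384 for 25 GbE is an assumption made here to pick a single value from the range quoted above.

```python
def recommended_sampling_rate(port_speed_gbps):
    """Return N for 1-in-N sampling, using the rates quoted in the text.

    Assumption: 10 GbE takes the low end (8192) and anything faster takes
    the high end (16384) of the range given for 10/25 GbE uplinks.
    """
    if port_speed_gbps <= 1:
        return 4096
    if port_speed_gbps <= 10:
        return 8192
    return 16384


def estimate_total_packets(sampled_packets, sampling_rate):
    """Scale a sample count back up to an estimate of total packets seen."""
    return sampled_packets * sampling_rate


def expected_samples_per_second(packets_per_second, sampling_rate):
    """Average number of samples the switch exports per second on a port."""
    return packets_per_second / sampling_rate


# 250 samples collected at 1-in-4096 represent roughly a million packets.
print(estimate_total_packets(250, recommended_sampling_rate(1)))
```

A higher N on faster ports keeps the export volume roughly constant as line rate grows, which is why the uplink rates are larger than the access rate.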
Syslog Severity
RFC 5424 numbers severities from 0 (Emergency) through 7 (Debug); a lower number means a more severe event. Forward severities 0 through 5 (Emergency through Notice) to a SIEM and reserve Debug for troubleshooting. Configure the clock timezone and NTP first; without synchronized timestamps, timeline correlation across devices breaks.
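The severity filter described above can be sketched as follows. The helper names are hypothetical; the PRI decoding (PRI = facility * 8 + severity) and the severity names come from RFC 5424, and the default threshold of 5 mirrors the forwarding rule in the text.

```python
# Severity names as defined by RFC 5424.
SEVERITIES = {
    0: "Emergency", 1: "Alert", 2: "Critical", 3: "Error",
    4: "Warning", 5: "Notice", 6: "Informational", 7: "Debug",
}


def severity_from_pri(pri):
    """Extract the severity from a syslog PRI value (facility * 8 + severity)."""
    return pri % 8


def should_forward(severity, threshold=5):
    """True when the message is at least as severe as the threshold.

    Lower numbers are more severe, so "Notice and more severe" is <= 5.
    """
    if severity not in SEVERITIES:
        raise ValueError(f"not an RFC 5424 severity: {severity}")
    return severity <= threshold


# PRI 165 is facility 20 (local4), severity 5 (Notice): forwarded.
print(should_forward(severity_from_pri(165)))
```

Keeping the threshold as a parameter makes it easy to tighten forwarding to Warning (4) if the SIEM is ingestion-limited.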
Mirror Sessions
When sampling is not enough (security investigations, IDS feeds), a mirror session copies 100% of the traffic from its source interfaces to a destination port, or to a remote endpoint over a GRE tunnel (the ERSPAN equivalent).
Key Points
SNMPv2c uses cleartext community strings — never use it in production.
9.2 REST API
The API lives at https://<switch-ip>/rest/v10.04/ (the version suffix follows your firmware). An interactive Swagger UI at /api/ documents and exercises every endpoint live.
Session-Cookie Authentication
You POST credentials to /login; a 200 response sets a cookie that subsequent calls reuse until you POST /logout. requests.Session in Python handles the cookie jar automatically.
Animation 1 - REST API Session-Cookie Authentication
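The login, reuse, logout sequence can be wrapped in a small context manager. This is a minimal sketch, not the SDK's implementation: aoscx_session is a hypothetical helper, the form field names username and password are an assumption about the /login payload, and the session argument is expected to behave like requests.Session (any object with a compatible post method works).

```python
from contextlib import contextmanager


def base_url(switch_ip, version="v10.04"):
    """Build the REST root; the version suffix follows the firmware."""
    return f"https://{switch_ip}/rest/{version}"


@contextmanager
def aoscx_session(session, switch_ip, username, password, version="v10.04"):
    """Log in, hand back the REST root URL, and always log out afterwards.

    The session object keeps the cookie set by /login, so every call made
    through it inside the with-block is authenticated.
    """
    url = base_url(switch_ip, version)
    # Assumption: credentials are sent as form fields.
    response = session.post(
        f"{url}/login", data={"username": username, "password": password}
    )
    response.raise_for_status()
    try:
        yield url
    finally:
        # Release the session slot on the switch even if a call failed.
        session.post(f"{url}/logout")


# Typical use with the requests library (address and credentials are examples):
#
#   import requests
#   with requests.Session() as s:
#       with aoscx_session(s, "192.0.2.10", "admin", "example-password") as url:
#           vlans = s.get(f"{url}/system/vlans").json()
```

Putting the logout in a finally clause matters because sessions left open on the switch stay allocated until they expire.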
CRUD on the Configuration Tree
Objects are addressable as URIs under /system. GET reads, POST creates, PUT replaces, PATCH merges, DELETE removes. The selector query parameter selects configuration, status, or default views.
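To make the verb-to-operation mapping concrete, the sketch below builds the method, URL, and body for a request without sending it. The helper names are hypothetical, and the VLAN payload fields (id, name, description) are assumptions used for illustration rather than a documented schema.

```python
from urllib.parse import urlencode

# Operation names from the paragraph above, mapped to HTTP verbs.
VERB_FOR = {
    "read": "GET",
    "create": "POST",
    "replace": "PUT",
    "merge": "PATCH",
    "remove": "DELETE",
}


def object_url(switch_ip, path, selector=None, version="v10.04"):
    """Return the URI of an object under /system, optionally with a selector."""
    url = f"https://{switch_ip}/rest/{version}/system/{path.strip('/')}"
    if selector:
        url += "?" + urlencode({"selector": selector})
    return url


def plan_request(operation, switch_ip, path, body=None, selector=None):
    """Describe a REST call as a (method, url, body) tuple."""
    return VERB_FOR[operation], object_url(switch_ip, path, selector), body


# Create VLAN 100, then merge a description into it (payloads are assumed).
print(plan_request("create", "192.0.2.10", "vlans", {"id": 100, "name": "VLAN100"}))
print(plan_request("merge", "192.0.2.10", "vlans/100", {"description": "Users"}))
print(plan_request("read", "192.0.2.10", "vlans/100", selector="configuration"))
```

Note that a create targets the collection (/system/vlans) while replace, merge, and remove target the individual object (/system/vlans/100).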
The pyaoscx SDK wraps REST in idiomatic Python objects (Vlan, Interface, Vrf, etc.) and handles cookies and materialization. The arubanetworks.aoscx Ansible Collection provides declarative modules built on top of pyaoscx for fleet orchestration.
Key Points
REST API is bound to a VRF; bind to mgmt for OOB.
Auth = POST /login -> Set-Cookie -> reuse on every call -> POST /logout.
Standard HTTP verbs: POST = create (201), PUT = replace, PATCH = merge, DELETE = remove.
pyaoscx is the official SDK; Ansible Collection uses it under the hood.
Pre-Quiz: REST API
1. After a successful POST /login, how does the AOS-CX REST API authenticate subsequent requests?
An HTTP Basic auth header on every call
A Bearer token returned in the JSON body
A session cookie set in the Set-Cookie response header
Mutual TLS using a client certificate
2. Which HTTP verb creates a new VLAN object under /rest/v10.04/system/vlans?
GET
POST
PUT
DELETE
3. You need a Python script to manage VLANs across a fleet. Which option requires the least manual cookie handling and payload construction?
Raw curl commands wrapped in subprocess
The pyaoscx SDK
A custom requests wrapper you write from scratch
SNMPv3 SET operations
4. Which command enables the REST API in read-write mode on the management VRF?
https-server rest access-mode read-only
ip http secure-server + vrf forwarding mgmt
https-server rest access-mode read-write and https-server vrf mgmt
snmp-server rest enable vrf mgmt
9.3 The Network Analytics Engine (NAE)
NAE runs Python directly on AOS-CX to monitor, alert, and remediate — no external server, no polling latency, no separate license. The hierarchy is tidy:
Concept | Role
Script | Python file with Manifest + ParameterDefinitions
Agent | Running instance of a script with parameter values
Monitor | Subscription to a switch state URI
Condition | Boolean expression on monitor values; fires once on transition
HPE publishes a public, tested library of NAE scripts at github.com/aruba/nae-scripts covering BGP/OSPF flaps, DHCP snooping anomalies, high CPU, MAC churn, PoE budget, microbursts, and STP topology changes.
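A small script skeleton shows how the pieces of the hierarchy fit together. Treat this as a hedged sketch: on the switch the NAE framework supplies NAE, Monitor, Rule, ActionSyslog, and ActionCLI, so the stand-in definitions at the top are hypothetical stubs that exist only to let the skeleton run on a workstation. The class name Agent, the monitor URI, and the link_state attribute are assumptions modeled on the style of the public script library, not a verified API reference.

```python
# --- Off-box stand-ins (hypothetical; remove before uploading) -------------
ACTION_LOG = []


class NAE:
    params = {}  # the framework fills this with the agent's parameter values


class Monitor:
    def __init__(self, uri, name, params=None):
        self.uri, self.name, self.params = uri, name, params or []


class Rule:
    def __init__(self, name):
        self.name, self.conditions, self.callbacks = name, [], []

    def condition(self, expression, monitors):
        self.conditions.append((expression, monitors))

    def action(self, callback):
        self.callbacks.append(callback)


def ActionSyslog(message, params=None):
    ACTION_LOG.append(("syslog", message.format(*(params or []))))


def ActionCLI(command, params=None):
    ACTION_LOG.append(("cli", command.format(*(params or []))))


# --- Script proper ----------------------------------------------------------
Manifest = {
    'Name': 'link_down_example',
    'Description': 'Illustrative skeleton: alert when an interface goes down',
    'Version': '0.1',
    'Author': 'example',
}

ParameterDefinitions = {
    'interface_id': {
        'Name': 'Interface ID',
        'Description': 'Interface to watch',
        'Type': 'string',
        'Default': '1/1/1',
    }
}


class Agent(NAE):
    def __init__(self):
        # Monitor: subscribe to one piece of switch state (URI is assumed).
        uri = '/rest/v1/system/interfaces/{}?attributes=link_state'
        self.m1 = Monitor(uri, 'Link state', [self.params['interface_id']])
        # Condition: Boolean rule evaluated on the monitor value.
        self.r1 = Rule('Link went down')
        self.r1.condition('{} == "down"', [self.m1])
        # Action: callback invoked when the condition fires.
        self.r1.action(self.on_link_down)

    def on_link_down(self, event):
        ActionSyslog('Interface {} is down', [self.params['interface_id']])
        ActionCLI('show interface {}', [self.params['interface_id']])
```

Each agent created from this script supplies its own interface_id, which is how one script can watch many interfaces.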
Key Points
A condition fires once when it transitions from false to true, not continuously.
Actions can include ActionSyslog, ActionCLI, ActionShell, webhooks, and REST self-heal calls.
NAE is on-box and license-free. NAE-Lite is the reduced variant on lower-tier platforms.
Time-series charts in the Web UI mark each condition firing with a vertical line linked to action output.
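The fire-once-on-transition behavior is ordinary edge detection, which the following sketch illustrates outside of NAE. The function name is hypothetical, and treating the state before the first evaluation as false is an assumption made for the example.

```python
def firings(evaluations):
    """Return the indexes at which a condition goes from false to true.

    `evaluations` is the sequence of Boolean results the condition produced
    over time. Staying true does not fire again; only the edge does.
    """
    fired = []
    previous = False  # assumption: the condition starts out false
    for index, current in enumerate(evaluations):
        if current and not previous:
            fired.append(index)
        previous = current
    return fired


link_states = ["up", "down", "down", "up", "down"]
print(firings([state == "down" for state in link_states]))
```

In this run the link is down for three of the five samples, yet the action would execute only twice, once per transition into the down state.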
Pre-Quiz: NAE
1. In NAE terminology, what is the difference between a Script and an Agent?
A Script runs continuously; an Agent runs once
A Script is the Python template; an Agent is a running instance with parameter values
A Script lives on Aruba Central; an Agent lives on the switch
A Script writes to syslog; an Agent writes to a database
2. In an NAE script, the line self.r1.condition('{} == "down"', [self.m1]) defines what?
A monitor that subscribes to a REST URI
A Python action that runs show interface
A Boolean rule evaluated on the monitor value, firing once on transition to true
A Manifest entry describing the script's author
3. Which is NOT a typical NAE Action?
ActionSyslog to write a message at a chosen severity
ActionCLI to run a show command and capture the output
A REST callback to AOS-CX to remediate configuration
An OSPF LSA flood to neighbor switches
4. Which statement about NAE licensing and platform support is correct?
NAE requires a separate per-switch license sold with AOS-CX
NAE is free and bundled; NAE-Lite is the reduced variant on lower-tier platforms
NAE only runs on Aruba Central, not on the switch
NAE is included only with HPE GreenLake subscriptions
9.4 Aruba Central and NetEdit
For multi-switch fleets, HPE offers two complementary tools: Aruba Central (cloud-managed) and NetEdit (on-prem). Both validate config and detect drift, but only Central does ZTP.
Aruba Central ZTP Onboarding
1. Pre-provision the device in Central with its serial number and MAC address.
2. Apply HPE GreenLake licensing.
3. Place the device in a template group with a JSON variables file keyed by serial/MAC.
4. Power on at the remote site. The factory-default switch obtains an address via DHCP, contacts activate.arubanetworks.com, is redirected to your tenant by serial number, pulls its template and variables, applies the configuration, and establishes an IPsec tunnel back to Central.
5. Verify in the Central UI (status "up") and on the device with show system.
Animation 3 - Aruba Central ZTP Onboarding Sequence
UI Group vs Template Group
Aspect | UI Group | Template Group
Editor | GUI form-driven | Text template + JSON vars
Best for | Small fleets, simple configs | Large fleets, scripted ops
Per-device customization | Limited | Full (one JSON entry/device)
Drift detection | Yes | Yes
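The template-plus-variables idea behind a template group can be sketched with the Python standard library. This is an illustration of the concept only: string.Template stands in for Central's own template engine, and the serial numbers, variable names, and JSON layout are hypothetical rather than Central's actual schema.

```python
import json
from string import Template

# One shared template for the whole group (placeholder syntax is a stand-in).
TEMPLATE = Template(
    "hostname $hostname\n"
    "vlan $mgmt_vlan\n"
    "interface vlan $mgmt_vlan\n"
    "    ip address $mgmt_ip\n"
)

# One JSON entry per device, keyed by serial number (example values).
VARIABLES_JSON = """
{
  "SG00EXAMPLE1": {"hostname": "branch1-sw1", "mgmt_vlan": "10", "mgmt_ip": "10.1.10.2/24"},
  "SG00EXAMPLE2": {"hostname": "branch2-sw1", "mgmt_vlan": "20", "mgmt_ip": "10.2.20.2/24"}
}
"""


def render_for(serial, variables_json=VARIABLES_JSON):
    """Render the group template with the variables of one device."""
    per_device = json.loads(variables_json)
    if serial not in per_device:
        raise KeyError(f"no variables entry for serial {serial}")
    # substitute() raises KeyError if the entry is missing a variable.
    return TEMPLATE.substitute(per_device[serial])


print(render_for("SG00EXAMPLE1"))
```

A lookup that fails loudly on an unknown serial mirrors the typo-in-serial failure mode: a device with no matching entry cannot be given a configuration.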
NetEdit (On-Prem)
For air-gapped or regulated environments, Aruba NetEdit is the on-premises virtual appliance that talks to AOS-CX over REST. Capabilities: configuration validation, multi-switch atomic transactions (all-or-nothing rollback), golden-config compliance, visual diffs, and change history. Critically: no ZTP from activate.arubanetworks.com — that capability is exclusive to Central.
Key Points
ZTP requires Central. Pre-provision serial + MAC, apply GreenLake license, place in template group.
The factory-default switch contacts activate.arubanetworks.com, which redirects to your tenant by serial.
Common ZTP failure causes: no internet from default gateway, firewall blocking activate, typo in serial/MAC.
Template groups use a Jinja2-style template + JSON variables file keyed by serial or MAC.
NetEdit = on-prem, multi-switch atomic transactions, no ZTP.
Pre-Quiz: Aruba Central / NetEdit
1. A factory-default AOS-CX switch is plugged in at a remote site for ZTP. Which DNS name does it contact first?
portal.central.arubanetworks.com
activate.arubanetworks.com
aoscx.update.hpe.com
greenlake.hpe.com
2. Which two values must you pre-provision in Aruba Central before a ZTP-eligible switch ships?
Hostname and management IP
Serial number and MAC address (plus GreenLake license)
SSH public key and admin password
VLAN list and SNMP community
3. A regulated customer requires that all switch management remain inside their data center. Which orchestration tool is appropriate?
Aruba Central in template-group mode
Aruba Central in UI-group mode
Aruba NetEdit on-prem virtual appliance
A direct sFlow + Syslog pipeline
4. Which capability is exclusive to Aruba Central and NOT available in NetEdit?
Configuration validation before deploy
Multi-switch atomic transactions
Drift detection / compliance checking
Zero Touch Provisioning via activate.arubanetworks.com
Chapter Summary
Traditional telemetry: SNMPv3 authPriv for OID polling and traps; sFlow for top-talker visibility; Syslog filtered by severity to a SIEM; mirror sessions for full packet capture.