HPE7-A01: Aruba Certified Switching Associate Exam Preparation
A comprehensive intermediate-level study guide covering AOS-CX switching, Layer 2/3 protocols, VSX, security, and operational best practices for the HPE Aruba ACSA certification.
Table of Contents
- Chapter 1: Introduction to HPE Aruba CX Switching and the ACSA Exam
- Chapter 2: AOS-CX CLI, Configuration Management, and Initial Setup
- Chapter 3: Layer 2 Switching: VLANs, Trunks, and MAC Learning
- Chapter 4: Spanning Tree Protocol and Loop Prevention on AOS-CX
- Chapter 5: Link Aggregation, LACP, and VSX Multi-Chassis LAG
- Chapter 6: Layer 3 Routing: Static Routes, OSPF, and VRRP/Active-Gateway
- Chapter 7: Switch Security: Authentication, Access Control, and Port Security
- Chapter 8: Quality of Service, Multicast Snooping, and DHCP Services
- Chapter 9: Monitoring, Automation, and the Network Analytics Engine (NAE)
- Chapter 10: Troubleshooting, Upgrade Workflows, and Exam Strategy
Chapter 1: Introduction to HPE Aruba CX Switching and the ACSA Exam
Learning Objectives
- Describe the HPE7-A01 exam format, blueprint, and passing score so you can plan a study schedule that matches how the exam actually weights its domains.
- Identify the Aruba CX switch families (6000, 6100, 6200, 6300, 6400, 8000, 8100, 9300, 10000) and recognize which target use case each was engineered for.
- Explain the AOS-CX operating system architecture, paying particular attention to the database-driven design that underpins the CLI, REST API, and Network Analytics Engine (NAE).
- Recognize where the Aruba Certified Switching Associate (ACSA) credential fits within the broader HPE Aruba certification track and how it sets you up for ACP and ACE level work later.
HPE7-A01 Exam Overview
If you have arrived at this book hunting for the “HPE7-A01” code on the HPE certification site, you may already have noticed that the catalog is in flux. The most current associate-level switching exam in the HPE Aruba portfolio is the Aruba Certified Associate - Switching (ACA-Switching), currently coded HPE6-A86, and the underlying body of knowledge — AOS-CX fundamentals — is the same content that “HPE7-A01” maps to in legacy and third-party study material [Source: https://certification-learning.hpe.com/tr/datasheet/course/0001208243?version=1]. Whether your voucher reads HPE7-A01 or HPE6-A86, the blueprint, the question style, and the recommended preparation path are the same. This book treats the two codes as interchangeable and uses ACSA (the credential name) as the canonical label.
Exam objectives and weighting
The ACSA blueprint is built around five weighted domains. Memorize these percentages early — they tell you exactly how many study hours each topic deserves [Source: https://open-exam-prep.com/practice/hpe-aruba-associate].
| Domain | Weight | Representative topics |
|---|---|---|
| Switching & VLANs | 30% | AOS-CX switching, VLANs, 802.1Q trunking, LAG/LACP, VSX, STP/RSTP/MSTP, PoE |
| Routing & OSPF | 20% | Static routing, OSPF areas, DR/BDR, cost calculation, inter-VLAN routing, DHCP relay, VRRP |
| Security & ACLs | 20% | Standard/extended ACLs, 802.1X, MAC authentication, port security, DHCP snooping, DAI, ClearPass integration |
| Aruba Central Management | 15% | Cloud management, Zero-Touch Provisioning (ZTP), firmware/group configuration, AI Insights, reporting |
| Monitoring & API | 15% | NAE, REST API automation, sFlow, port mirroring |
Diagram opportunity: A pie chart of the five domains is one of the highest-value visuals you can build for this chapter. An animated reveal that highlights each slice in turn (0.4 s per step, 0.15 s stagger) drives home the relative weighting better than a table alone.
A practical reading of the table: 70 percent of the exam is “classic” Layer 2 and Layer 3 networking expressed in AOS-CX terms (switching, routing, security). The remaining 30 percent is what makes AOS-CX modern — cloud management through Aruba Central and on-box programmability through NAE and REST. If you come from a Cisco or legacy ProCurve background, expect the first 70 percent to feel familiar in concept but unfamiliar in syntax, and the last 30 percent to be genuinely new material.
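The 70/30 reading can be checked directly from the table. The following is a minimal Python sketch, assuming a 60-question exam (the approximate count quoted in this chapter) and assuming item counts track the weights exactly; the per-domain question counts are an illustration of proportional weighting, not published figures.

```python
# Blueprint weights from the domain table, in percent.
WEIGHTS = {
    "Switching & VLANs": 30,
    "Routing & OSPF": 20,
    "Security & ACLs": 20,
    "Aruba Central Management": 15,
    "Monitoring & API": 15,
}

# Grouping used in this chapter: the first three domains are "classic" L2/L3.
CLASSIC = ("Switching & VLANs", "Routing & OSPF", "Security & ACLs")


def share(domains):
    """Return the combined weight, in percent, of the named domains."""
    return sum(WEIGHTS[name] for name in domains)


def expected_items(total_questions=60):
    """Estimate items per domain, assuming items track the weights exactly."""
    return {name: total_questions * pct // 100 for name, pct in WEIGHTS.items()}


classic = share(CLASSIC)
modern = sum(WEIGHTS.values()) - classic
print(f"classic={classic}%  modern={modern}%")
for name, count in expected_items().items():
    print(f"{name}: about {count} questions")
```

Under those assumptions the largest domain accounts for roughly twice as many questions as either of the two smallest, which is a useful sanity check when deciding what to review last.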
Figure 1.1: ACSA exam domain weighting
graph TD
EXAM["ACSA Exam<br/>100%"]
EXAM --> SW["Switching & VLANs<br/>30%"]
EXAM --> RT["Routing & OSPF<br/>20%"]
EXAM --> SEC["Security & ACLs<br/>20%"]
EXAM --> CEN["Aruba Central<br/>15%"]
EXAM --> MON["Monitoring & API<br/>15%"]
SW --> CLASSIC["Classic L2/L3 = 70%"]
RT --> CLASSIC
SEC --> CLASSIC
CEN --> MODERN["Modern AOS-CX = 30%"]
MON --> MODERN
Question format and duration
- Number of questions: approximately 60 multiple-choice items. Some third-party prep sites list 100 scored questions; that figure conflates ACSA with longer ACP/ACE exams [Source: https://open-exam-prep.com/practice/hpe-aruba-associate].
- Duration: 90 minutes, which works out to roughly 90 seconds per question. That is generous enough that you can re-read scenario items, but tight enough that you cannot afford to deeply puzzle over every command-line snippet.
- Passing score: approximately 66 percent (HPE adjusts the cut score periodically based on item statistics) [Source: https://open-exam-prep.com/practice/hpe-aruba-associate].
- Delivery: proctored at Pearson VUE test centers, with online proctoring also available in most regions.
Real-world analogy: Think of the 90-minute clock as a single highway commute. You want to maintain a steady cruising speed (about 1 question every 90 seconds), pull over briefly for the few “construction zone” scenario questions, and avoid the temptation to camp on a single hard problem the way an inexperienced driver might brake hard at every minor bump.
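The pacing arithmetic above can be turned into a quick self-check for timed practice sessions. This is a sketch under the chapter's approximate figures (60 questions, 90 minutes, a cut score near 66 percent); the function names are illustrative, and the real cut score may differ.

```python
def seconds_per_question(duration_min=90, questions=60):
    """Average time budget per question, in seconds."""
    return duration_min * 60 / questions


def min_correct(questions=60, pass_pct=66):
    """Smallest number of correct answers that reaches pass_pct (ceiling division)."""
    return -(-questions * pass_pct // 100)


def on_pace(elapsed_min, answered, duration_min=90, questions=60):
    """True if the answered questions cover at least the elapsed time at average pace."""
    return answered * seconds_per_question(duration_min, questions) >= elapsed_min * 60


print(seconds_per_question())   # average seconds available per question
print(min_correct())            # correct answers needed at a 66% cut score
print(on_pace(elapsed_min=45, answered=25))
```

With these inputs, a candidate who has answered 25 questions at the 45-minute mark is behind the average pace and should stop lingering on hard items.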
Recommended experience level
There are no formal prerequisites for the ACSA, but HPE strongly recommends the AOS-CX Switching Fundamentals course (Rev. 24.31, course ID 0001208242) before you sit for the exam [Source: https://redinter.services/wp-content/uploads/2024/09/AOS-CX-Switching-Fundamentals-Rev-24.31-_-Certification-and-Learning.pdf]. The course is five days long and split roughly 60 percent lecture / 40 percent hands-on lab — an honest reflection of an exam where roughly half the items are scenario- or configuration-driven.
Practically, candidates who succeed have:
- 6 to 12 months of switching experience (any vendor counts; the concepts transfer).
- Comfort reading and writing CLI configurations (subnetting, VLAN tagging, basic OSPF).
- A willingness to think in terms of databases and APIs rather than just CLI commands — a mindset shift that AOS-CX rewards.
Study resources and lab practice
The ACSA blueprint cannot be conquered by reading alone. Build a lab early.
| Resource type | Recommended option | Notes |
|---|---|---|
| Official course | AOS-CX Switching Fundamentals (Rev. 24.31) | Aligned to the blueprint; best single resource [Source: https://certification-learning.hpe.com/tr/datacard/course/0001208241] |
| Textbook | This study guide | Concept reinforcement and worked examples |
| Practice exams | HPE practice tests, ExamTopics, third-party banks | Validate readiness, expose weak domains [Source: https://www.examtopics.com/exams/hp/hpe7-a08/view/19/] |
| Lab environment | HPE Networking Lab Engine, EVE-NG/GNS3 with AOS-CX VM, or physical 6100/6200/6300 hardware | CLI fluency is non-negotiable |
| Cloud management | Aruba Central trial or partner sandbox | The 15% Central domain is hard to fake without console time |
Worked example: building a 6-week study plan. Suppose you have 10 hours per week, for 60 hours in total. Allocate study time in proportion to the blueprint weights; the week labels are approximate, because the 12-hour and 9-hour blocks spill slightly across week boundaries:
- Weeks 1-2: Switching & VLANs, 18 hours (30%)
- Week 3: Routing & OSPF, 12 hours (20%)
- Week 4: Security & ACLs, 12 hours (20%)
- Week 5: Aruba Central, 9 hours (15%)
- Week 6: Monitoring, NAE, and REST, 9 hours (15%); schedule practice exams on top of this budget
This proportional allocation is the simplest defensible study plan and reflects the same logic exam-builders use to weight item counts.
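A strictly proportional split can be computed in a few lines. This is a minimal sketch assuming 6 weeks at 10 hours each (60 hours in total) and no separate budget for review or practice exams; rounding up or adding review time will shift the numbers.

```python
# Blueprint weights from the domain table, in percent.
WEIGHTS = {
    "Switching & VLANs": 30,
    "Routing & OSPF": 20,
    "Security & ACLs": 20,
    "Aruba Central Management": 15,
    "Monitoring & API": 15,
}


def allocate_hours(total_hours, weights):
    """Split total_hours across domains in proportion to their weights."""
    total_weight = sum(weights.values())
    return {name: total_hours * w / total_weight for name, w in weights.items()}


plan = allocate_hours(6 * 10, WEIGHTS)
for domain, hours in plan.items():
    print(f"{domain}: {hours:.0f} hours")
```

Changing the first argument (for example, 8 weeks at 6 hours) rescales every domain without altering the relative emphasis.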
Key Takeaway: The ACSA exam is a 60-question, 90-minute, ~66%-to-pass associate-level credential whose blueprint is dominated by Layer 2/3 fundamentals (70%) but distinguished by 30% modern content on Aruba Central and on-box programmability. Plan your study time in proportion to the published domain weights and book lab time before you book the exam.
Aruba CX Switch Portfolio
AOS-CX runs across the entire Aruba CX portfolio, from a 24-port closet switch up to a hyperscale data-center chassis. The portfolio is intentionally tiered: the 6000, 6100, and 6200 serve campus access; the 6300, 6400, 8100, 8325, and 8360 serve aggregation and core; and the 8400, 9300, and 10000 serve the data center [Source: https://support.hpe.com/hpesc/public/docDisplay?docId=sd00007177en_us&page=GUID-4776FF26-8C76-4126-9D4A-D421FB0FFCAD.html&docLocale=en_US]. Knowing the portfolio is more than trivia; the exam routinely asks “which platform is appropriate for X scenario,” and you need a mental shortcut to answer quickly.
Figure 1.2: Aruba CX switch portfolio hierarchy by network tier
graph TD
PORTFOLIO["Aruba CX Portfolio<br/>(single AOS-CX image)"]
PORTFOLIO --> ACCESS["Campus Access Layer"]
PORTFOLIO --> AGGCORE["Aggregation / Campus Core"]
PORTFOLIO --> DC["Data Center Core"]
ACCESS --> CX6000["CX 6000 / 6100 / 6200<br/>1G + PoE, mGig uplinks"]
AGGCORE --> CX6300["CX 6300 (VSF stack)"]
AGGCORE --> CX6400["CX 6400 / 8100 / 8325 / 8360<br/>(VSX active-active)"]
DC --> CX8400["CX 8400 (legacy DC core)"]
DC --> CX9300["CX 9300 (modern spine, 400G)"]
DC --> CX10000["CX 10000<br/>Pensando DPU integration"]
Diagram opportunity: A layered campus-to-DC architecture diagram with three rows (Access / Aggregation and Core / DC) and the CX series numbers placed in their appropriate row makes a perfect animated reveal. Light up each layer in sequence (0.6 s major transition between layers) so the learner internalizes the hierarchy.
Access switches (6000/6100/6200)
The 6000-class switches are the workhorses of the wiring closet. They terminate user devices — laptops, IP phones, wireless APs, IoT — and they dominate by sheer count in any campus deployment [Source: https://www.router-switch.com/faq/aruba-switch-stacking-guide.html].
| Series | Form factor | Typical port mix | Uplinks | Primary role |
|---|---|---|---|---|
| CX 6000 | Fixed | 12/24/48 × 1G with PoE options | 1G/10G | Small office / branch access |
| CX 6100 | Fixed | 24/48 × 1G | 4 × 10G SFP+ | Cost-optimized campus access |
| CX 6200 | Fixed | 24/48 × 1G or 1G/2.5G mGig | 4 × 10G or 25G | Modern campus access (Wi-Fi 6/6E APs) |
The differentiator inside the 6000-class line is PoE budget and uplink speed. The 6100 is the budget-conscious choice for low-power endpoints. The 6200 handles Wi-Fi 6/6E access points that need 2.5G mGig downlinks and 25G uplinks back to aggregation — increasingly the default ask in enterprise refreshes.
Real-world analogy: The 6000-class line is like the ground floor of an office building. Most of the people you serve never see anything else; they just want their badge to scan, their phone to ring, and their Wi-Fi to work. The “fanciness” lives upstairs.
Aggregation/Core (6300/6400/8100/8325/8360)
Above the wiring closet sits the aggregation/core layer, where switches consolidate dozens of access uplinks and provide Layer 3 routing into the data center or WAN.
| Series | Form factor | Ports | Uplinks | Stacking | Primary role |
|---|---|---|---|---|---|
| CX 6300 | Fixed (1U), stackable | 24-48 × 1G/10G/25G | 10G/25G/50G | VSF ring (up to 10) | Stackable access/aggregation [Source: https://airheads.hpe.com/discussion/vsf-best-practices-for-the-aruba-cx-6300-switch-series] |
| CX 6400 | Modular chassis | up to 96 × 1G/10G/25G per slot | 25G/40G/100G | VSX | Campus aggregation/core |
| CX 8100 | Fixed | 48 × 10G/25G | 40G/100G | VSX | Mid-size aggregation/core |
| CX 8325 | Fixed | 48 × 10G/25G | 40G/100G | VSX | Campus core, small DC ToR |
| CX 8360 | Fixed | 32-130 × 10G/40G | 100G/400G | VSX | Core, mixed campus/DC |
Two acronyms recur here:
- VSF (Virtual Switching Framework) lets up to 10 fixed switches behave as a single logical switch with a single control plane. It is the campus-stacking story for the 6200 and 6300 series (the 10-member limit applies to the 6300) and is most commonly deployed as a ring topology using 50G DAC cables on the 6300 [Source: https://www.youtube.com/watch?v=TjYSi4l-2OM].
- VSX (Virtual Switching Extension) is the active-active, dual-control-plane alternative for the 6400 and the 8000-series. Unlike VSF, each VSX peer keeps its own control plane and reboots independently, which gives you non-disruptive upgrades and fast failover [Source: https://www.youtube.com/watch?v=Qa07KX5lF74].
Worked example — choosing between VSF and VSX. A retail customer wants two CX 6300s in the closet of each store to provide redundancy. They want the simplest possible operations and accept a brief outage during firmware upgrades. → VSF is appropriate: single management plane, simple ring, lowest operations cost. A hospital customer wants two CX 8325s in the data-center core and refuses any maintenance window. → VSX is required: independent control planes mean firmware can be upgraded one peer at a time without dropping traffic.
Data center (8400/9300/10000)
The 8400, 9300, and 10000 are aimed at data-center cores and large campus headends: the 8400 is a modular chassis, while the 9300 and 10000 are fixed-configuration switches [Source: https://www.router-switch.com/faq/aruba-switch-stacking-guide.html].
| Series | Form factor | Density | Top uplink | Distinguishing capability |
|---|---|---|---|---|
| CX 8400 | Modular (8/16-slot) | Hundreds of ports | 100G | Established core; VSX active-active |
| CX 9300 | Fixed (1U) | 32 × 400G | 400G | Modern DC core/spine |
| CX 10000 | Fixed (1U) | 48 × 10G/25G plus 6 × 40G/100G | 100G | Pensando DPU integration for stateful services at line rate |
The CX 10000 is the headline product of the data-center line: integrated Pensando DPUs allow it to run stateful firewalling, microsegmentation, and telemetry on the switch itself (in the DPUs rather than on a separate appliance), eliminating “service hairpin” trips to a centralized appliance. For ACSA you do not need to engineer a Pensando deployment, but you should be able to recognize the 10000 as “the DPU-equipped DC switch” on a multiple-choice question.
Choosing the right platform
A pragmatic decision tree for the exam:
- Is the switch terminating end-user devices? → 6100/6200 (or 6000 for branch).
- Is the switch a closet aggregation pair that should look like one switch? → 6300 with VSF.
- Is the switch a campus aggregation/core pair that needs non-disruptive upgrades? → 6400, 8100, 8325, or 8360 with VSX.
- Is the switch a data-center core? → 8400 (legacy), 9300 (modern), or 10000 (DPU-equipped).
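The four questions above can be encoded as an ordered series of checks. The sketch below is a study aid only: the function name and keyword arguments are hypothetical, and the model lists simply repeat the tiers named in this chapter.

```python
def choose_platform(*, end_user_access=False, branch=False,
                    closet_stack=False, campus_core_pair=False,
                    dc_core=False):
    """Walk the four decision-tree questions in order and return candidate series."""
    if end_user_access:
        # Question 1: terminating end-user devices (6000 for branch sites).
        return ["6000"] if branch else ["6100", "6200"]
    if closet_stack:
        # Question 2: closet aggregation that should look like one switch (VSF).
        return ["6300"]
    if campus_core_pair:
        # Question 3: campus aggregation/core needing non-disruptive upgrades (VSX).
        return ["6400", "8100", "8325", "8360"]
    if dc_core:
        # Question 4: data-center core.
        return ["8400", "9300", "10000"]
    raise ValueError("no scenario selected")


print(choose_platform(end_user_access=True))
print(choose_platform(campus_core_pair=True))
```

The order of the checks matters: as in the list, the first question that matches decides the answer, which mirrors how a scenario question usually leads with the deciding detail.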
Diagram opportunity: A flowchart of the four questions above, animated as a step-by-step reveal (0.4 s per node), is one of the most useful visuals you can put in front of an exam-day candidate.
Key Takeaway: The CX portfolio is intentionally tiered: 6000-series for access, 6300-6400/8100-8360 for aggregation and campus core, and 8400/9300/10000 for the data center. Memorize the tier each model belongs to and recognize VSF as the campus stacking story versus VSX as the active-active redundancy story.
AOS-CX Architecture Fundamentals
If you remember only one thing from this chapter, make it this: AOS-CX is database-driven, not CLI-driven. Every switch capability — VLANs, OSPF neighbors, interface counters, even the running configuration — lives as rows and columns in a centralized in-memory database. The CLI, REST API, and NAE Python agents are simply different “lenses” on the same underlying data [Source: https://redinter.services/wp-content/uploads/2024/09/AOS-CX-Switching-Fundamentals-Rev-24.31-_-Certification-and-Learning.pdf]. This single design choice cascades into every other AOS-CX capability you will study.
Figure 1.3: AOS-CX database-driven architecture layers
flowchart TD
subgraph CONSUMERS["Management & Automation Consumers"]
CLI["CLI<br/>(interactive)"]
REST["REST API<br/>(automation)"]
NAE["NAE<br/>(on-box Python)"]
CENTRAL["Aruba Central<br/>(cloud)"]
end
subgraph DB["Centralized Database Layer"]
CFGDB["OVSDB-style<br/>Config & State DB"]
TSDB["Time-Series<br/>Telemetry DB"]
end
subgraph DAEMONS["Modular Linux Daemons"]
OSPF["OSPF"]
LACP["LACP"]
STP["STP"]
REST_D["REST Server"]
end
HW["Hardened Linux Kernel + Switch ASIC"]
CLI --> CFGDB
REST --> CFGDB
NAE --> TSDB
CENTRAL --> CFGDB
CFGDB <--> OSPF
CFGDB <--> LACP
CFGDB <--> STP
CFGDB <--> REST_D
OSPF --> HW
LACP --> HW
STP --> HW
HW --> TSDB
Time-series database (OVSDB-based)
AOS-CX uses an OVSDB-style centralized configuration and state database combined with a time-series database for telemetry [Source: https://redinter.services/wp-content/uploads/2024/09/AOS-CX-Switching-Fundamentals-Rev-24.31-_-Certification-and-Learning.pdf]. OVSDB (Open vSwitch Database protocol) was popularized by the Open vSwitch project; HPE adopted the same general schema-driven, transaction-oriented model so that AOS-CX could expose all configuration and state in a programmatically queryable way.
Two databases work together:
- The configuration/state database holds the current declared configuration plus live operational state (interface up/down, neighbor adjacencies, MAC address table).
- The time-series database records timestamped samples of the same data — interface counters every 5 seconds, CPU every 10 seconds, queue depth every second. This is what lets NAE detect “the interface error rate has been climbing for the last 30 minutes,” a question impossible to answer from a CLI snapshot.
Real-world analogy: Think of a hospital. The configuration/state database is the patient’s current chart — name, room number, current medications. The time-series database is the bedside monitor — heart rate every second, oxygen saturation every five seconds. Doctors (CLI users), automation (REST clients), and on-call alerts (NAE agents) all read the same charts and monitors; they do not maintain their own private copies.
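To see why a time-series store answers questions a snapshot cannot, consider a toy model in Python. This is not the AOS-CX implementation: the class, its method names, and the sample values are all hypothetical, and "climbing" is defined here simply as every sample in the window exceeding the one before it.

```python
from collections import deque


class TimeSeries:
    """Toy in-memory store of (timestamp_seconds, value) samples."""

    def __init__(self, max_samples=1000):
        # Oldest samples are discarded once max_samples is reached.
        self._samples = deque(maxlen=max_samples)

    def record(self, timestamp_s, value):
        self._samples.append((timestamp_s, value))

    def latest(self):
        """What a one-off snapshot would show: only the newest value."""
        return self._samples[-1][1]

    def window(self, now_s, span_s):
        """Samples whose timestamps fall within the last span_s seconds."""
        return [(t, v) for t, v in self._samples if now_s - span_s <= t <= now_s]

    def climbing(self, now_s, span_s):
        """True if every sample in the window is higher than the one before it."""
        values = [v for _, v in self.window(now_s, span_s)]
        return len(values) >= 2 and all(b > a for a, b in zip(values, values[1:]))


errors = TimeSeries()
for t, count in zip(range(0, 1801, 300), [1, 2, 4, 7, 11, 16, 22]):
    errors.record(t, count)  # one sample every 5 minutes for 30 minutes

print(errors.latest())                          # the snapshot view
print(errors.climbing(now_s=1800, span_s=1800)) # the trend view
```

The snapshot reports a single number with no context, while the windowed query can state that the counter rose at every sample over the last 30 minutes.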
Diagram opportunity: An animated diagram of the central database surrounded by three “consumers” (CLI, REST API, NAE) with arrows pulsing inward and outward beautifully illustrates the architecture. Use a 0.6 s major transition for arrows and 0.15 s stagger between the three consumers.
Modular daemons and process isolation
AOS-CX runs as a collection of independent processes (daemons) on a hardened Linux kernel [Source: https://redinter.services/wp-content/uploads/2024/09/AOS-CX-Switching-Fundamentals-Rev-24.31-_-Certification-and-Learning.pdf]. Each daemon — the OSPF process, the LACP process, the spanning-tree process, the REST API server — runs in its own protected memory space and communicates with peers exclusively via the central database.
Three operational benefits follow:
- Process isolation. A bug in OSPF cannot corrupt LACP state and cannot crash the switch. The faulty daemon restarts; the rest of the box keeps forwarding traffic.
- Hot-patch and ISSU. Because daemons are independent, AOS-CX supports in-service software upgrades on supported platforms — you replace one daemon at a time without taking the chassis down.
- Common image across the portfolio. The same AOS-CX binary set runs across the entire CX line, from 6000 to 10000. Skills you build on a $2,000 closet switch transfer directly to a $200,000 DC chassis.
Worked example — daemon isolation in action. Suppose a malformed OSPF Type 5 LSA arrives that triggers a regression in the OSPF daemon. On a legacy monolithic NOS the entire control plane could panic and reboot the switch. On AOS-CX, the OSPF daemon crashes, the supervisor restarts it within seconds, the database flags the event in the time-series store, and an NAE agent can email the on-call engineer — all while traffic continues to forward in hardware because the data plane was never affected.
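The isolation idea in this example can be mimicked with a toy supervisor. The sketch below uses ordinary Python objects and exceptions rather than real operating-system processes, so it only illustrates the principle; the class names, the "malformed-lsa" message, and the restart bookkeeping are all hypothetical.

```python
class Daemon:
    """Toy stand-in for an independent protocol process."""

    def __init__(self, name, restarts=0):
        self.name = name
        self.restarts = restarts

    def handle(self, message):
        # Simulated regression: only the OSPF daemon chokes on this input.
        if self.name == "ospf" and message == "malformed-lsa":
            raise RuntimeError("parser regression")
        return f"{self.name} handled {message}"


class Supervisor:
    """Delivers messages and restarts only the daemon that failed."""

    def __init__(self, names):
        self.daemons = {name: Daemon(name) for name in names}
        self.events = []  # stands in for events recorded by the switch

    def deliver(self, name, message):
        try:
            return self.daemons[name].handle(message)
        except Exception as exc:
            # Replace the faulty daemon; every other daemon is left untouched.
            old = self.daemons[name]
            self.daemons[name] = Daemon(name, restarts=old.restarts + 1)
            self.events.append((name, str(exc)))
            return None


sup = Supervisor(["ospf", "lacp", "stp"])
sup.deliver("ospf", "malformed-lsa")
print(sup.deliver("lacp", "pdu"))
print(sup.events)
```

After the simulated crash, the OSPF entry has been replaced and an event has been logged, while the LACP and STP entries are the same objects they were before.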
REST API and Python on-box scripting
Because the database is the source of truth, every CLI-configurable element is also reachable via a REST URI. AOS-CX exposes a versioned REST API; a typical interaction is GET /rest/v10.13/system/vlans to list every VLAN configured on the switch [Source: https://redinter.services/wp-content/uploads/2024/09/AOS-CX-Switching-Fundamentals-Rev-24.31-_-Certification-and-Learning.pdf]. Whatever you can do with show running-config you can do with curl, Postman, Ansible, or a Python script.
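As a rough illustration of the Python route, the sketch below builds the VLAN collection URL quoted above and parses a reply using only the standard library. Several things here are assumptions rather than documented facts: that the reply is a JSON object keyed by VLAN ID, that the caller supplies an already-authenticated opener (session login and TLS certificate handling are omitted), and the helper names themselves, which are hypothetical.

```python
import json
from urllib.request import OpenerDirector


def vlans_url(host, version="v10.13"):
    """Build the VLAN collection URI for a given switch and API version."""
    return f"https://{host}/rest/{version}/system/vlans"


def parse_vlan_ids(body):
    """Extract sorted VLAN IDs, assuming a JSON object keyed by VLAN ID."""
    return sorted(int(vlan_id) for vlan_id in json.loads(body))


def fetch_vlan_ids(host, opener: OpenerDirector, version="v10.13"):
    """GET the VLAN collection through an already-authenticated opener."""
    with opener.open(vlans_url(host, version)) as response:
        return parse_vlan_ids(response.read())


# Offline demonstration with a hand-written reply (assumed shape).
sample_reply = '{"1": "/rest/v10.13/system/vlans/1", "200": "/rest/v10.13/system/vlans/200"}'
print(vlans_url("192.0.2.10"))
print(parse_vlan_ids(sample_reply))
```

Keeping URL construction and parsing separate from the network call means both can be exercised in a lab notebook without a switch attached.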
| Interface | Best for | Typical user |
|---|---|---|
| CLI | Interactive troubleshooting, ad-hoc changes | Network engineer at a console |
| REST API | Bulk automation, integration with ITSM/IaC | DevOps / NetOps automation |
| NAE (on-box Python) | Closed-loop monitoring and remediation | Network reliability engineer |
| Aruba Central | Multi-site fleet management, ZTP | NOC, managed-service provider |
The same configuration change (say, “add VLAN 200”) can be made through any of these interfaces and the result is identical because all four converge on the same database transaction.
NAE overview
The Network Analytics Engine (NAE) runs Python-based agents on the switch itself [Source: https://redinter.services/wp-content/uploads/2024/09/AOS-CX-Switching-Fundamentals-Rev-24.31-_-Certification-and-Learning.pdf]. An NAE agent is a Python script that:
- Subscribes to the time-series database for one or more metrics (interface error rate, CPU, BGP neighbor state).
- Evaluates a condition every sampling period (e.g., “error rate > 1% for 60 seconds”).
- When the condition fires, takes action — log an event, post to a webhook, run a CLI command, or open a ticket.
NAE is what turns AOS-CX from “a switch that exposes data” into “a switch that takes action on its own data.” Real-world examples include automatic ARP-storm detection, OSPF flap correlation, and PoE budget alerts.
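The subscribe, evaluate, act loop described above can be sketched in plain Python. This is deliberately not written against the real NAE agent framework: the class, its parameters, and the callback signature are hypothetical, and the samples are fed in by hand instead of arriving from the time-series database.

```python
class ThresholdAgent:
    """Fires an action once a metric stays above a threshold for hold_s seconds."""

    def __init__(self, threshold_pct, hold_s, action):
        self.threshold_pct = threshold_pct
        self.hold_s = hold_s
        self.action = action
        self._breach_started = None  # timestamp of the first over-threshold sample
        self._fired = False

    def sample(self, timestamp_s, error_rate_pct):
        """Evaluate one sample; return True only when the action fires."""
        if error_rate_pct <= self.threshold_pct:
            # Condition cleared: reset so a later breach can fire again.
            self._breach_started = None
            self._fired = False
            return False
        if self._breach_started is None:
            self._breach_started = timestamp_s
        held = timestamp_s - self._breach_started
        if held >= self.hold_s and not self._fired:
            self._fired = True
            self.action(timestamp_s, error_rate_pct)
            return True
        return False


alerts = []
agent = ThresholdAgent(threshold_pct=1.0, hold_s=60,
                       action=lambda t, rate: alerts.append((t, rate)))
for t in range(0, 80, 10):
    agent.sample(t, 1.5)   # error rate stays above 1% from t=0 to t=70
agent.sample(80, 0.2)      # recovery resets the agent
print(alerts)
```

The hold timer is what separates a sustained problem from a single noisy sample, and the fired flag prevents the same incident from raising an alert at every sampling period.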
Real-world analogy: NAE is the on-call doctor who lives inside the hospital instead of being paged from home. The data never leaves the building, the response is measured in seconds, and only true anomalies escalate to humans.
Diagram opportunity: A small state diagram showing an NAE agent transitioning from “sampling” to “thresholded” to “action” — three nodes, two arrows, with a Replay button — is an ideal animated visual for this section.
Key Takeaway: AOS-CX is built around a centralized OVSDB-style database plus a time-series telemetry store. Modular Linux daemons read and write through the database, the REST API and NAE expose the same data programmatically, and the entire CX portfolio runs the same image — making AOS-CX the foundation that the rest of this book elaborates on.
Aruba Certification Pathway
The ACSA is a single rung on a four-rung HPE Aruba certification ladder. Knowing where you sit and where you can go next is useful both for career planning and for the exam, which occasionally tests recognition of the broader track [Source: https://certificationpractice.com/exam-overviews].
Figure 1.4: HPE Aruba switching certification pathway
graph LR
START(["Junior Engineer<br/>6-12 mo experience"])
START --> ACSA["ACSA / ACA<br/>Associate<br/>HPE7-A01 / HPE6-A86"]
ACSA --> ACSP["ACSP / ACP-Switching<br/>Professional<br/>HPE7-A08"]
ACSP --> ACSE["ACSE / ACE-Switching<br/>Expert"]
ACSE --> ACMX["ACMX (optional)<br/>Master"]
ACSP -.renews.-> ACSA
ACSE -.renews.-> ACSP
ACA, ACSA, ACSP, ACSE progression
| Level | Acronym | Example exam | Audience |
|---|---|---|---|
| Associate | ACA / ACSA | HPE6-A86 / HPE7-A01 (this exam) | Junior engineers, career changers, ~6-12 months experience |
| Professional | ACSP (or ACP-Switching) | HPE7-A08 | Engineers designing and operating mid-to-large networks [Source: https://www.examtopics.com/exams/hp/hpe7-a08/view/19/] |
| Expert | ACSE (or ACE-Switching) | Expert-level practical/written exam | Architects, principal engineers |
| Master | ACMX (where offered) | Capstone/board interview | Pre-sales/solution architects |
Diagram opportunity: A staircase animation — four steps lighting up in sequence (0.4 s each) with the ACSA step highlighted — gives readers a visceral sense of where they are in the journey.
The progression is cumulative: each level assumes mastery of the level below. ACP-Switching, for example, expects you to already know the ACSA OSPF material cold, then layers on PIM-SM multicast, user-based tunneling, and deeper VSX/EVPN-VXLAN content [Source: https://www.examtopics.com/exams/hp/hpe7-a08/view/19/].
Cross-track relevance
HPE Aruba runs parallel tracks for switching, wireless (mobility), and network security/ClearPass. Skills transfer freely. A few cross-track points worth knowing:
- ClearPass appears in the ACSA blueprint under Security & ACLs (RADIUS / 802.1X with ClearPass) — meaning even a pure-switching engineer needs basic ClearPass literacy.
- Aruba Central is the unified cloud management plane for all tracks. Your ACSA work in Central is directly reusable when you study mobility.
- AI Insights in Central is increasingly cross-track — the same alerts surface campus, mobility, and security signals.
In other words, ACSA work is not a dead-end skill; it doubles as the foundation for any other Aruba certification you pursue.
Recertification policies
HPE certifications have historically followed a three-year recertification cycle. You retain a credential by:
- Re-passing the same exam, or
- Passing a higher-level exam in the same track (e.g., earning ACP-Switching automatically renews ACSA), or
- Passing a delta/refresh exam if HPE publishes one for a major release wave.
Always confirm the current policy on the HPE Certification & Learning portal at the time you certify; HPE periodically refreshes both the exam codes (HPE6 → HPE7 → HPE8 generations) and the recertification rules.
Worked example — career trajectory. A network technician earns ACSA in year 1. In year 2 she leads a 6300/6400 deployment and earns ACP-Switching. ACP-Switching automatically extends her ACSA certification, and her growing exposure to multicast and VSX prepares her to attempt ACE-Switching by year 4. She never has to “re-take” ACSA because each upward step renews it.
Key Takeaway: ACSA is the entry point of a four-tier ladder (ACA → ACP → ACE → optional Master). Each level cumulatively assumes the one below, skills transfer across switching/wireless/security tracks via Aruba Central, and you can renew ACSA simply by climbing one rung higher within the recertification window.
Chapter Summary
This chapter framed the three things you need before you open Chapter 2: the exam, the hardware, and the operating system. The ACSA exam (HPE7-A01 / HPE6-A86) is a 60-question, 90-minute, ~66%-to-pass associate-level credential weighted 30/20/20/15/15 across Switching & VLANs, Routing & OSPF, Security & ACLs, Aruba Central, and Monitoring & API. Plan study time in proportion to those weights and book lab time as soon as you book the exam.
The Aruba CX hardware portfolio is intentionally tiered. The 6000/6100/6200 land in the wiring closet; the 6300, 6400, 8100, 8325, and 8360 occupy aggregation and campus core; the 8400, 9300, and 10000 anchor data-center cores, with the 10000 distinguished by integrated Pensando DPUs. Stacking in the campus is delivered by VSF, while VSX provides active-active redundancy with non-disruptive upgrades on aggregation/core models. Critically, all of these platforms run the same AOS-CX image — making the operating system the unifying skill.
That operating system is database-driven from the ground up. An OVSDB-style central configuration/state database plus a time-series telemetry store sits at the core; modular Linux daemons, the REST API, NAE Python agents, and Aruba Central are all “lenses” on that same data. This architecture is the reason AOS-CX supports streaming telemetry, on-box automated remediation, and ISSU, capabilities that legacy ArubaOS-Switch (the ProVision-lineage predecessor) did not provide. Finally, the ACSA itself is a single rung on a four-tier HPE Aruba ladder; passing it is the start of a path that goes ACSA → ACSP → ACSE, with skills that transfer across the wireless and security tracks.
Key Terms
| Term | Definition |
|---|---|
| AOS-CX | The modern, Linux-based, database-driven network operating system that runs on every switch in the Aruba CX portfolio (6000 through 10000). Replaces the legacy ArubaOS-Switch. |
| ACSA | Aruba Certified Switching Associate — the entry-level switching credential earned by passing the HPE7-A01 / HPE6-A86 exam covered by this book. |
| OVSDB | Open vSwitch Database protocol; the schema-driven, transaction-oriented database model that AOS-CX uses to centralize all configuration and operational state, enabling consistent CLI/REST/NAE views. |
| NAE | Network Analytics Engine — an on-box Python agent runtime that subscribes to the time-series database, evaluates conditions, and triggers automated remediation (logs, webhooks, CLI actions). |
| REST API | A first-class HTTP/JSON interface on AOS-CX. Every CLI-configurable element is also reachable via REST URIs because both interfaces operate against the same underlying database. |
| Aruba Central | HPE Aruba’s cloud-based management plane providing Zero-Touch Provisioning, group configuration, firmware orchestration, AI Insights, and cross-track (switching/wireless/security) operations. |
| ArubaOS-Switch (legacy) | The ProVision-lineage predecessor NOS to AOS-CX. CLI/SNMP-centric, monolithic image, limited programmability. Out of scope for the ACSA exam except as a contrast point. |
| Time-series database | The telemetry store inside AOS-CX that records timestamped samples of operational state (interface counters, CPU, queue depth) and feeds NAE, streaming telemetry, and historical troubleshooting workflows. |
| VSF (Virtual Switching Framework) | Stacking technology for the 6200 and 6300 series with a single control plane; up to 10 members on the 6300, commonly deployed as a 50G DAC ring. |
| VSX (Virtual Switching Extension) | Active-active dual-control-plane redundancy for the 6400 and 8000-series; supports non-disruptive upgrades and fast failover without full stacking. |
| HPE7-A01 / HPE6-A86 | The exam codes (legacy and current, respectively) for the Aruba Certified Switching Associate credential. Treated interchangeably throughout this book. |
Chapter 2: AOS-CX CLI, Configuration Management, and Initial Setup
Learning Objectives
By the end of this chapter, you will be able to:
- Navigate AOS-CX CLI command modes and use context-sensitive help to discover commands quickly.
- Perform initial switch setup, including hostname assignment, management IP configuration, local user creation, and SSH enablement.
- Manage running, startup, and checkpoint configurations using AOS-CX-specific workflows like write memory, copy running-config checkpoint, and rollback.
- Use diagnostic tools such as show tech-support, event logs, debug commands, and syslog forwarding to troubleshoot effectively.
If Chapter 1 introduced the architecture and platforms of AOS-CX, this chapter is where you put your hands on the keyboard. Think of Chapter 1 as the blueprints of a house and Chapter 2 as the day you walk through the front door, find the light switches, and learn the alarm code. Everything that follows in this book — VLANs, routing, security, virtualization — assumes you can confidently get into a switch, configure it, save your work, and back out gracefully when you make a mistake.
AOS-CX CLI Fundamentals
The AOS-CX command-line interface (CLI) will feel familiar to anyone who has worked with Cisco IOS, Juniper Junos, or HPE’s older ProVision/Comware operating systems — but it has its own personality. Aruba designed it to be modern, predictable, and tightly integrated with the underlying database-driven architecture you learned about in Chapter 1.
Operator, Manager, and Configuration Modes
AOS-CX organizes commands into three primary modes, each with its own level of privilege and its own prompt character. You can think of these modes like the floors of a secure office building: the lobby is open to anyone with a badge, the staff floor requires elevated credentials, and the executive suite is locked behind another layer of access.
| Mode | Prompt | Purpose | Typical User |
|---|---|---|---|
| Operator | switch> | Read-only commands, basic show, ping | Help-desk operator |
| Manager (Privileged Exec) | switch# | Full diagnostics, reload, copy, save | Network administrator |
| Configuration | switch(config)# | Make configuration changes | Configuring engineer |
| Sub-config (e.g., interface) | switch(config-if)# | Configure a specific feature scope | Configuring engineer |
A subtle but important detail: AOS-CX accepts configure as well as the longer Cisco-style configure terminal to enter global configuration mode. Both forms work, but configure alone is the documented Aruba shorthand [Source: https://www.kareemccie.com/2021/01/aruba-switches-cx-initial-configuration.html].
switch> enable
switch# configure
switch(config)# interface 1/1/1
switch(config-if)# exit
switch(config)# end
switch#
The end command is your express elevator: from any depth of sub-configuration, end jumps you all the way back to Manager mode. exit only goes up one level. This distinction will save you from the embarrassment of typing exit six times to escape a deeply nested ACL.
Figure 2.1: AOS-CX CLI mode hierarchy and transitions
stateDiagram-v2
[*] --> Operator
Operator --> Manager: "enable"
Manager --> Operator: "disable"
Manager --> Configuration: "configure"
Configuration --> Manager: "end"
Configuration --> SubConfig: "interface 1/1/1"
SubConfig --> Configuration: "exit"
SubConfig --> Manager: "end"
Operator: "Operator (switch>)"
Manager: "Manager (switch#)"
Configuration: "Configuration (switch(config)#)"
SubConfig: "Sub-config (switch(config-if)#)"
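The transitions in Figure 2.1 can be rehearsed as a tiny state machine. The following is a minimal Python sketch, not anything that runs on the switch: the mode names and the `next_mode` and `walk` helpers are hypothetical, and only the handful of commands discussed above are modeled.

```python
# Hypothetical model of the mode hierarchy in Figure 2.1; not an Aruba API.
MODES = ["operator", "manager", "config", "sub-config"]

def next_mode(mode: str, command: str) -> str:
    """Return the mode reached after typing `command` in `mode`."""
    level = MODES.index(mode)
    if command == "enable" and mode == "operator":
        return "manager"
    if command == "disable" and mode == "manager":
        return "operator"
    if command == "configure" and mode == "manager":
        return "config"
    if command.startswith("interface ") and mode == "config":
        return "sub-config"
    if command == "end" and level >= 2:
        return "manager"          # express elevator from any config depth
    if command == "exit" and level >= 2:
        return MODES[level - 1]   # exactly one level up
    return mode                   # anything else leaves the mode unchanged

def walk(commands, start="operator"):
    """Apply a sequence of commands and return the final mode."""
    mode = start
    for command in commands:
        mode = next_mode(mode, command)
    return mode
```

Feeding it `["enable", "configure", "interface 1/1/1", "exit"]` lands in `config`, while ending the same sequence with `end` instead lands in `manager`, which is the exit-versus-end distinction in one line.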
Context-Sensitive Help and Command Completion
AOS-CX has one of the most forgiving help systems in the industry. There are three tricks to memorize, and they will carry you through the entire exam:
- Question mark (`?`): Lists every command or argument valid at the current cursor position.
- Tab key: Auto-completes a partially typed command, or shows options if multiple commands match.
- Partial commands: As long as your input is unambiguous, AOS-CX accepts shortened keywords (`int 1/1/1` for `interface 1/1/1`).
switch(config)# in?
include-credentials interface
switch(config)# inter<TAB>
switch(config)# interface
Analogy: Context-sensitive help is like typing into a search engine. You don’t have to know the full URL — you just need a hint, and the system suggests the rest. New engineers should treat ? as a permanent companion. Veterans use it to discover features added in new firmware releases.
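To see why `int` works but a bare `in` would not, it helps to model the "unambiguous prefix" rule. This is a rough Python sketch under stated assumptions: the `resolve` helper and the three-word keyword list are hypothetical, chosen only for illustration, and the real parser is more elaborate.

```python
def resolve(prefix: str, keywords: list[str]) -> str:
    """Expand an abbreviated keyword, the way a CLI accepts unambiguous prefixes."""
    if prefix in keywords:                 # an exact match always wins
        return prefix
    matches = [k for k in keywords if k.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown keyword: {prefix!r}")
    raise ValueError(f"ambiguous keyword {prefix!r}: {', '.join(sorted(matches))}")

# Hypothetical keyword set, for illustration only.
KEYWORDS = ["include-credentials", "interface", "ip"]
```

With this list, `resolve("int", KEYWORDS)` expands to `interface`, while `resolve("in", KEYWORDS)` raises an error naming both candidates, which is the moment you would press `?` on a real switch.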
Command Piping and Filtering
Show output on a switch with a thousand routes can run into hundreds of lines. AOS-CX supports Unix-style pipes to filter and format output:
switch# show running-config | include vlan
switch# show interface brief | begin 1/1/24
switch# show events | exclude DEBUG
switch# show running-config | section interface
| Pipe Modifier | Meaning |
|---|---|
| `include <pattern>` | Show only lines matching the pattern |
| `exclude <pattern>` | Hide lines matching the pattern |
| `begin <pattern>` | Start output at first matching line |
| `count` | Display only the line count |
| `section <pattern>` | Display the entire stanza beginning with the match |
The section modifier is especially useful for dumping just the configuration of one interface or VLAN without scrolling through the entire running-config.
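If the pipe modifiers feel abstract, you can practice their logic offline. Here is a small Python sketch of include, exclude, begin, and count as described in the table; the `apply_pipe` function and the sample lines are hypothetical, and the sketch assumes the pattern is treated as a regular expression.

```python
import re

def apply_pipe(lines, modifier, pattern=None):
    """Mimic one pipe modifier on a list of output lines (pattern is a regex)."""
    if modifier == "count":
        return [str(len(lines))]
    if pattern is None:
        raise ValueError(f"{modifier} needs a pattern")
    rx = re.compile(pattern)
    if modifier == "include":
        return [ln for ln in lines if rx.search(ln)]
    if modifier == "exclude":
        return [ln for ln in lines if not rx.search(ln)]
    if modifier == "begin":
        for i, ln in enumerate(lines):
            if rx.search(ln):
                return lines[i:]       # first match and everything after it
        return []
    raise ValueError(f"unsupported modifier: {modifier}")

# Made-up sample output, for illustration only.
SAMPLE = ["hostname sw01", "vlan 10", "    name DATA",
          "interface 1/1/1", "    vlan access 10"]
```

Calling `apply_pipe(SAMPLE, "include", "vlan")` keeps the two lines that mention a VLAN, while `apply_pipe(SAMPLE, "begin", "interface")` returns the interface line and everything below it.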
CLI Sessions, Aliases, and Banners
AOS-CX allows multiple concurrent CLI sessions and supports both aliases (custom shorthand commands) and banners (login messages). Banners are not just decorative — many regulatory frameworks (PCI-DSS, HIPAA) require a legal warning banner on management interfaces.
switch(config)# alias shrun show running-config
switch(config)# alias shint show interface brief
switch(config)# banner motd #
Authorized access only. All activity is logged.
#
After defining shrun, typing it at the Manager prompt is identical to typing show running-config. Aliases save thousands of keystrokes in a typical operations day.
Key Takeaway: AOS-CX organizes commands into Operator, Manager, and Configuration modes; mastering `?`, `Tab`, and the pipe modifiers (`include`, `exclude`, `begin`, `section`) is the single fastest way to become productive on the platform.
Initial Switch Provisioning
A factory-default Aruba CX switch ships with no static IP address and a single admin account whose password has not yet been set. The very first thing you do with one is connect to the console, and that workflow is the same whether you bought a 6300 stack for an IDF closet or an 8400 chassis for a data-center spine.
Console and Out-of-Band Management (OOBM)
Every CX switch has at least two ways to reach it before the network is configured:
- Console port: A serial connection (RJ-45 or USB-C on newer models) at 115200 baud, 8N1, no flow control. This is your “always works” lifeline.
- OOBM port: A dedicated 1 GbE management interface on the front panel, typically labeled `mgmt`. It lives in its own dedicated VRF, isolated from production traffic.
Analogy: The console port is like the master key kept in a fireproof box; the OOBM port is like the building’s service entrance, used daily but separate from the main lobby. If the front-of-house network collapses, you can still reach the switch through either side door.
The OOBM interface is configured under interface mgmt, and uniquely uses the command ip static rather than ip address. This is one of the most common gotchas for engineers crossing over from Cisco or older Aruba ProVision platforms [Source: https://www.youtube.com/watch?v=4F1RaMOV2FU].
switch(config)# interface mgmt
switch(config-if-mgmt)# no shutdown
switch(config-if-mgmt)# ip static 192.168.50.201/24
switch(config-if-mgmt)# default-gateway 192.168.50.1
switch(config-if-mgmt)# exit
Note that the default gateway is configured inside the interface mgmt context, not at the global level, which is another departure from traditional CLIs. The same ip static command accepts either an IPv4 or an IPv6 address, so a dual-stack management interface is configured by entering the command once per address family.
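A frequent provisioning slip is typing a gateway that is not actually inside the management subnet. The check is easy to reason about with Python's standard `ipaddress` module; the `gateway_is_on_link` helper below is a hypothetical name for an offline sanity check, not a switch feature.

```python
import ipaddress

def gateway_is_on_link(static_ip: str, gateway: str) -> bool:
    """True when the gateway sits inside the subnet of the interface address."""
    iface = ipaddress.ip_interface(static_ip)   # e.g. 192.168.50.201/24
    gw = ipaddress.ip_address(gateway)
    # Same address family, inside the subnet, and not the interface's own address.
    return gw.version == iface.version and gw in iface.network and gw != iface.ip
```

For the values used in the listing above, `gateway_is_on_link("192.168.50.201/24", "192.168.50.1")` is true, whereas a gateway of 192.168.51.1 would fail the check.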
Hostname, Domain, and DNS
The hostname is set globally, while DNS settings must specify the management VRF if your name servers are reachable only through OOBM:
switch(config)# hostname ACCESS-IDF1-SW01
ACCESS-IDF1-SW01(config)# ip dns server-address 192.168.50.10 vrf mgmt
ACCESS-IDF1-SW01(config)# ip dns domain-name corp.example.com vrf mgmt
ACCESS-IDF1-SW01(config)# ntp server pool.ntp.org
ACCESS-IDF1-SW01(config)# ntp vrf mgmt
The prompt updates immediately after hostname runs — a small but pleasant detail that confirms the change took effect. If you forget the vrf mgmt keyword, AOS-CX will configure DNS for the default VRF (data plane), which usually has no path to your management subnet, leaving you wondering why name resolution silently fails [Source: https://airheads.hpe.com/discussion/aruba-cx-not-using-configured-hostname-in-dhcp-request].
Local User Accounts and Role-Based Access Control (RBAC)
AOS-CX ships with a single default admin account whose password must be set on first login. Beyond that, you can create additional local users and assign them to groups that map to roles. AOS-CX supports up to 29 user-defined groups in addition to the predefined ones.
| Built-in Group | Privilege |
|---|---|
| `administrators` | Full read/write access to all commands |
| `operators` | Read-only access (show commands, ping) |
| `auditors` | Read-only with audit-log visibility |
ACCESS-IDF1-SW01(config)# user netadmin group administrators password plaintext H@rdPass!1
ACCESS-IDF1-SW01(config)# user helpdesk group operators password plaintext View0nly!
Worked Example: Creating a Custom Group. Suppose your security policy requires a “VLAN-only” administrator who can create VLANs but cannot reload the switch. You define a user-defined group, add command-authorization rules to it, and assign a user to that group:
switch(config)# user-group vlan-only
switch(config-usr-grp-vlan-only)# 10 permit cli command "configure terminal"
switch(config-usr-grp-vlan-only)# 20 permit cli command "vlan .*"
switch(config-usr-grp-vlan-only)# 30 permit cli command "show .*"
switch(config-usr-grp-vlan-only)# exit
switch(config)# user vlan-jr group vlan-only password plaintext Vlan0nly!
The rule patterns are regular expressions, so verify the exact syntax against the security guide for your release. This pattern of a group, its rules, and a user is the foundation of AOS-CX RBAC and is common on the exam [Source: https://netz.schulon.org/fw/CX/Dokumentation/10.14/security_6200-6300-6400.pdf].
Enabling SSH and Disabling Telnet
By default, SSH is enabled on the management VRF and Telnet is disabled. Verifying and adjusting these settings is a routine hardening task:
ACCESS-IDF1-SW01(config)# ssh server vrf mgmt
ACCESS-IDF1-SW01(config)# ssh server vrf default
ACCESS-IDF1-SW01(config)# no telnet-server vrf mgmt
ACCESS-IDF1-SW01# show ssh server vrf mgmt
For tighter security, apply a control-plane ACL that only permits SSH from your jump-host subnet:
switch(config)# access-list ip MGMT-ACL
switch(config-acl-ip)# 10 permit tcp 10.10.10.0/24 any eq 22
switch(config-acl-ip)# 40 deny tcp any any eq 22
switch(config-acl-ip)# 50 permit any any any
switch(config-acl-ip)# exit
switch(config)# apply access-list ip MGMT-ACL control-plane vrf mgmt
This ACL allows SSH only from 10.10.10.0/24 and silently drops all other SSH attempts at the control plane — a common requirement in audited environments [Source: https://airheads.hpe.com/discussion/how-to-apply-access-restriction-for-ssh-in-aruba-cx].
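ACL questions on the exam usually come down to first-match evaluation. The sketch below models the three entries of MGMT-ACL in Python so you can trace a packet through them by hand. It is a simplified illustration: the `Entry` type and `evaluate` function are hypothetical, only protocol, source, and destination port are matched, and a final implicit deny is assumed for an ACL with no matching entry.

```python
import ipaddress
from typing import NamedTuple, Optional

class Entry(NamedTuple):
    seq: int
    action: str             # "permit" or "deny"
    proto: str              # "tcp" or "any"
    src: str                # CIDR prefix or "any"
    dport: Optional[int]    # destination port, or None for any

# Mirrors the three entries of MGMT-ACL shown above.
MGMT_ACL = [
    Entry(10, "permit", "tcp", "10.10.10.0/24", 22),
    Entry(40, "deny", "tcp", "any", 22),
    Entry(50, "permit", "any", "any", None),
]

def evaluate(acl, proto: str, src_ip: str, dport: int) -> str:
    """Return the action of the first matching entry, lowest sequence first."""
    addr = ipaddress.ip_address(src_ip)
    for e in sorted(acl, key=lambda e: e.seq):
        proto_ok = e.proto in ("any", proto)
        src_ok = e.src == "any" or addr in ipaddress.ip_network(e.src)
        port_ok = e.dport is None or e.dport == dport
        if proto_ok and src_ok and port_ok:
            return e.action
    return "deny"  # assumed implicit deny when nothing matches
```

An SSH attempt from 10.10.10.5 stops at entry 10 and is permitted; the same attempt from 172.16.0.9 falls through to entry 40 and is denied; any non-SSH traffic reaches entry 50 and is permitted.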
Key Takeaway: Initial provisioning on AOS-CX revolves around the OOBM `interface mgmt` (with `ip static` and an inline `default-gateway`), the management VRF for DNS/NTP/SSH, and group/role-based local users. Memorize the `vrf mgmt` keyword; its omission is the single most common configuration error for newcomers.
Configuration Management
If initial setup is the blueprint, configuration management is the version-control system. AOS-CX has a richer model than many competing platforms, blending the familiar running-config / startup-config pair with a powerful checkpoint system that acts like a built-in time machine.
Running vs. Startup Configuration
The running-config lives in RAM and reflects every change as it is typed. The startup-config lives in flash and is loaded on boot. Changes made in config mode do not persist across reboots until you save them.
| Configuration | Where It Lives | Persists Across Reboot? | Command to View |
|---|---|---|---|
| Running-config | Volatile DB / RAM | No | show running-config |
| Startup-config | Non-volatile flash | Yes | show startup-config |
| Checkpoint | Non-volatile flash | Yes (named snapshot) | show checkpoint <name> |
To save the running-config to startup-config, AOS-CX gives you two equivalent commands — write memory and the shorter save [Source: https://www.examtopics.com/discussions/hp/view/65649-exam-hpe6-a72-topic-1-question-18-discussion/]:
ACCESS-IDF1-SW01# write memory
ACCESS-IDF1-SW01# save
If you have used Cisco IOS, write memory is the precise equivalent of copy running-config startup-config. Aruba kept the historical command as a courtesy to muscle memory.
Checkpoints: Snapshots Beyond Startup-Config
A checkpoint is a named, point-in-time snapshot of the running-config that is independent of startup-config. You can keep many of them (up to a platform-defined limit), name them meaningfully, and roll back to any one of them without a reboot.
switch# copy running-config checkpoint pre-vlan-change
switch# show checkpoints
switch# show checkpoint pre-vlan-change
switch# checkpoint rollback pre-vlan-change
AOS-CX also generates automatic checkpoints, named with a timestamp prefix like CPC202604291843. By default, the system creates one roughly five minutes after the last config change if you have not saved [Source: https://airheads.hpe.com/discussion/arubaos-cx-automatically-save-running-configuration].
Analogy: Think of checkpoints as the “Save As” feature of a word processor: startup-config is the file you saved when you closed the program, but checkpoints are every named version you preserved along the way (“draft-monday”, “before-edit”, “final-review”). The auto-checkpoint is like auto-save in modern editors — it catches you if you forget.
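The relationship between the three stores is easier to remember once you have watched a change survive (or not survive) a reboot. The following Python toy model is only a mental aid: the `ConfigStore` class and its method names are hypothetical and say nothing about how AOS-CX implements its database.

```python
import copy

class ConfigStore:
    """Toy model of running-config, startup-config, and named checkpoints."""

    def __init__(self):
        self.running = {}        # volatile: lost on reboot unless saved
        self.startup = {}        # persistent: loaded at boot
        self.checkpoints = {}    # persistent: named snapshots

    def save(self):
        """Copy running-config to startup-config."""
        self.startup = copy.deepcopy(self.running)

    def checkpoint(self, name):
        """Snapshot running-config under a name, independent of startup."""
        self.checkpoints[name] = copy.deepcopy(self.running)

    def rollback(self, name):
        """Replace running-config with a named snapshot."""
        self.running = copy.deepcopy(self.checkpoints[name])

    def reboot(self):
        """Unsaved running-config changes disappear; startup is reloaded."""
        self.running = copy.deepcopy(self.startup)
```

Save a hostname, take a checkpoint, add a VLAN, and roll back: the VLAN is gone. Add another VLAN without saving and reboot: that one is gone too, while the saved hostname remains.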
Figure 2.2: Configuration management workflow across running, startup, and checkpoint stores
flowchart LR
A["Running-config (RAM)"] -->|"write memory / save"| B["Startup-config (flash)"]
A -->|"copy running-config checkpoint <name>"| C["Named checkpoint (flash)"]
A -.->|"auto checkpoint after ~5 min"| D["Auto checkpoint (CPCyyyymmdd)"]
C -->|"checkpoint rollback <name>"| A
B -->|"loaded at boot"| A
E["TFTP / SFTP / SCP / USB"] -->|"copy ... running-config vrf mgmt"| A
A -->|"copy running-config ... vrf mgmt"| E
Auto-Checkpoint on Commit (Safe Testing)
The most powerful — and exam-favored — feature is checkpoint auto, which arms a deadman timer between 1 and 60 minutes. If you don’t confirm your changes before the timer expires, the switch automatically rolls back to the pre-change state.
switch# checkpoint auto 5
switch# configure
switch(config)# router ospf 1
switch(config-ospf-1)# area 0
switch(config-ospf-1)# exit
switch(config)# end
switch# show ip ospf neighbor
... (verify connectivity)
switch# checkpoint auto confirm
If, after starting checkpoint auto 5, you fat-finger a command that severs your SSH session, the switch waits out the five-minute timer, receives no checkpoint auto confirm, and rolls back automatically, which restores your access. This is the AOS-CX answer to Cisco’s reload in 5 trick, but cleaner, because nothing actually reboots [Source: https://www.sikich.com/insight/how-to-use-aruba-cx-checkpoint-auto-commands/].
Worked Example — A Safe Remote Change:
- `ssh netadmin@10.20.30.40`
- `checkpoint auto 10`
- Make ACL changes that might lock you out.
- Verify SSH still works (open a second session).
- `checkpoint auto confirm`: your changes persist.
- Or do nothing for ten minutes, and the switch reverts and saves you.
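The steps above boil down to a deadman timer: snapshot, change, then either confirm or let the clock run out. Here is a minimal Python sketch of that logic. The `DeadmanRollback` class is hypothetical, time is passed in explicitly (in seconds) so the example is deterministic, and the configuration is modeled as a flat dictionary.

```python
class DeadmanRollback:
    """Toy deadman timer: revert to a snapshot unless confirmed in time."""

    def __init__(self, config: dict):
        self.config = config
        self._snapshot = None
        self._deadline = None

    def arm(self, minutes: int, now: float) -> None:
        if not 1 <= minutes <= 60:
            raise ValueError("timer must be between 1 and 60 minutes")
        self._snapshot = dict(self.config)      # pre-change state (flat copy)
        self._deadline = now + minutes * 60

    def confirm(self) -> None:
        self._snapshot = None                   # keep the changes
        self._deadline = None

    def tick(self, now: float) -> bool:
        """Advance the clock; return True if a rollback happened."""
        if self._deadline is not None and now >= self._deadline:
            self.config.clear()
            self.config.update(self._snapshot)  # restore the pre-change state
            self.confirm()                      # disarm after reverting
            return True
        return False
```

Arm it for ten minutes, change a value, and call `tick(600)` without confirming: the original value comes back. Call `confirm()` first and no later tick will ever revert the change.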
Copying Configurations: TFTP, SFTP, and USB
AOS-CX supports several transfer protocols for backup and restore. The general syntax is copy <source> <destination>:
switch# copy running-config tftp://192.168.50.20/sw01-backup.cfg vrf mgmt
switch# copy startup-config sftp://backup@192.168.50.20/configs/sw01.cfg vrf mgmt
switch# copy tftp://192.168.50.20/sw01-restore.cfg running-config vrf mgmt
switch# copy usb:sw01.cfg running-config
| Protocol | Encryption | Common Use |
|---|---|---|
| TFTP | None | Lab, secured management VLAN |
| SFTP | Yes (SSH) | Production backups |
| SCP | Yes (SSH) | Production backups |
| USB | Local | Bench setup, air-gapped sites |
Notice the trailing vrf mgmt — without it, the switch tries to reach the file server through the default VRF and fails. This is the same gotcha that bites you with DNS.
Key Takeaway: AOS-CX persistence has three layers: running-config, startup-config, and named checkpoints. Use `write memory` to save, `copy running-config checkpoint <name>` to snapshot, and `checkpoint auto N` plus `checkpoint auto confirm` for any change you would not want to drive to the data center to fix.
Diagnostics and Logging
When something breaks at 2 a.m., your job isn’t to remember every command — it’s to know where the truth lives. AOS-CX consolidates diagnostic information into a small set of go-to commands that any new engineer should rehearse before they ever need them in anger.
show tech-support and Core Dumps
show tech-support is the kitchen-sink diagnostic dump. It runs hundreds of show commands and concatenates the output into a single file you can hand to TAC.
switch# show tech-support
switch# show tech-support local-file
switch# copy show-tech tftp://192.168.50.20/sw01-tech.txt vrf mgmt
The local-file variant writes the output to onboard storage so you can review or transfer it later. For deeper failures, AOS-CX also stores core dumps — memory snapshots of crashed processes — that engineering can analyze:
switch# show core-dump
switch# copy core-dump <id> tftp://192.168.50.20/ vrf mgmt
Analogy: show tech-support is the airplane’s flight-data recorder. You don’t read it in flight — you hand the whole black box to investigators after a problem. Core dumps are the cockpit voice recorder: rarer, more detailed, and aimed at specific failure modes.
Event Logs and Severity Levels
The event log is your real-time narrative of what the switch is doing. It is structured and severity-tagged — much friendlier to grep than a free-form syslog stream:
switch# show events
switch# show events -d ospfd
switch# show events -s warning
switch# show events | include link
| Severity | Numeric | Meaning |
|---|---|---|
| `emergency` | 0 | System unusable |
| `alert` | 1 | Immediate action required |
| `critical` | 2 | Critical conditions |
| `error` | 3 | Error conditions |
| `warning` | 4 | Warning conditions |
| `notice` | 5 | Normal but significant |
| `informational` | 6 | Informational messages |
| `debug` | 7 | Debug-level messages |
Each daemon (lldpd, ospfd, lacpd, hpe-fand, etc.) emits its own events, which you can filter using -d <daemon>. This is far more surgical than scrolling through the entire log looking for the one line that matters.
Debug Commands
When event logs aren’t enough, AOS-CX has per-feature debugging that you can toggle live without affecting other modules. Unlike older OSes where debug all could melt the CPU, CX debugging is daemon-scoped:
switch# debug ospfv2 all
switch# debug bgp updates
switch# show debug
switch# no debug ospfv2 all
switch# no debug all
A best practice: always pair debug with terminal monitor if you want to see output live in your SSH session, and always turn debugging off when you’re finished. Forgotten debug sessions have caused real outages.
Syslog Forwarding
For long-term retention and central correlation, forward events to an external syslog server. AOS-CX supports UDP (514), TCP (1470), and TLS (6514) targets:
switch(config)# logging 192.168.50.30 vrf mgmt severity info
switch(config)# logging 192.168.50.30 vrf mgmt udp 514 severity info
switch(config)# logging 192.168.50.30 vrf mgmt tls 6514 severity warning
| Transport | Default Port | Use Case |
|---|---|---|
| UDP | 514 | High-volume logging where occasional message loss is acceptable |
| TCP | 1470 | Reliable delivery |
| TLS | 6514 | Encrypted forwarding (regulated industries) |
Worked Example — Compliance-Grade Logging Pipeline:
- Enable NTP so timestamps are accurate: `ntp server pool.ntp.org` together with `ntp vrf mgmt`.
- Forward warnings and above over TLS to a SIEM: `logging 10.50.50.10 vrf mgmt tls 6514 severity warning`.
- Verify the connection: `show logging`.
- Generate a test event: `clear events`, then trigger a known condition.
- Confirm the event arrived in the SIEM with matching timestamp.
This pipeline ensures auditors see the same timeline the switch saw, encrypted in transit, with no gaps.
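One detail trips people up: "warnings and above" means numerically lower or equal, because severity 0 is the most severe. The Python sketch below encodes the severity table from earlier in this section; the `should_forward` helper is a hypothetical name used only to illustrate the threshold comparison.

```python
# Severity names in order, so each name's index equals its numeric level (0-7).
SEVERITIES = ["emergency", "alert", "critical", "error",
              "warning", "notice", "informational", "debug"]
LEVEL = {name: number for number, name in enumerate(SEVERITIES)}

def should_forward(event_severity: str, threshold: str) -> bool:
    """Forward when the event is at least as severe as the threshold."""
    return LEVEL[event_severity] <= LEVEL[threshold]
```

With a threshold of `warning`, an `error` event (3) and a `warning` event (4) are forwarded, while a `notice` event (5) stays on the box.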
Figure 2.3: Compliance-grade syslog forwarding flow over TLS
sequenceDiagram
participant D as "Daemon (ospfd, lacpd, ...)"
participant E as "Event log (on-box)"
participant L as "Logging subsystem"
participant N as "NTP server"
participant S as "SIEM / syslog server"
N->>L: "Time sync (vrf mgmt)"
D->>E: "Emit event (severity=warning)"
E->>L: "Forward if >= configured severity"
L->>S: "TLS 6514 (vrf mgmt)"
S-->>L: "TCP/TLS ack"
Note over L,S: "Encrypted in transit; timestamps aligned via NTP"
S->>S: "Correlate, retain, alert auditors"
Key Takeaway: Diagnostics in AOS-CX form a layered toolbox: `show tech-support` for full snapshots, `show events` with severity filtering for everyday troubleshooting, daemon-scoped `debug` for deep dives, and syslog forwarding for long-term, central retention. Knowing which tool to reach for first is what separates fast troubleshooters from slow ones.
Chapter Summary
This chapter took you from a powered-on but unconfigured CX switch to a fully provisioned, hardened, and observable network device.
You learned that AOS-CX organizes its CLI into Operator, Manager, and Configuration modes, with configure as the entry point and end as the express elevator back. Context-sensitive help (?, Tab, partial commands) and pipe modifiers (include, exclude, begin, section) make the CLI faster than it looks at first glance.
You walked through initial provisioning: connecting via console, configuring the OOBM interface mgmt with the unique ip static syntax and inline default-gateway, setting hostname/DNS/NTP scoped to vrf mgmt, and creating local users mapped to RBAC groups and roles. You then enabled SSH on the management VRF and optionally locked it down with a control-plane ACL.
For configuration management, you saw three layers: running-config (RAM), startup-config (flash), and checkpoints (named snapshots in flash). You learned that write memory and save are equivalent, that copy running-config checkpoint <name> lets you snapshot before risky changes, and that checkpoint auto N plus checkpoint auto confirm provides a deadman-timer safety net for remote work. Backups travel via TFTP, SFTP, SCP, or USB, and remember to specify vrf mgmt.
Finally, diagnostics and logging rest on show tech-support for TAC-level dumps, show events for severity-filtered real-time troubleshooting, daemon-scoped debug commands, and external syslog forwarding (UDP/TCP/TLS) for compliance-grade retention.
These skills become second nature in the lab, and the rest of the book (VLANs in Chapter 3, routing in Chapter 6, security and ClearPass integration later) builds on every habit you started forming here.
Key Terms
| Term | Definition |
|---|---|
| Configuration mode | The CLI mode entered with configure (prompt (config)#) where running-config edits occur. |
| Checkpoint | A named, point-in-time snapshot of the running-config stored in flash, distinct from startup-config; can be restored with checkpoint rollback <name>. |
| Running-config | The active configuration in volatile memory; reflects every change as it is typed but is lost on reboot unless saved. |
| Startup-config | The configuration loaded automatically at boot; updated by write memory or save. |
| OOBM | Out-of-Band Management — the dedicated mgmt interface on AOS-CX switches, isolated in its own management VRF for administrative traffic separate from data plane. |
| Show tech-support | A diagnostic command that runs and concatenates many show commands into a single output for TAC analysis; show tech-support local-file writes the output to onboard storage. |
| RBAC | Role-Based Access Control — AOS-CX security model that ties users to groups (e.g., administrators, operators) and groups to roles, supporting up to 29 user-defined groups. |
| Syslog | Standard protocol for forwarding event logs to a central server over UDP (514), TCP (1470), or TLS (6514); configured in AOS-CX with the logging command and a VRF. |
Chapter 3: Layer 2 Switching: VLANs, Trunks, and MAC Learning
Layer 2 switching — VLAN segmentation, 802.1Q trunking, MAC learning, and IP-phone provisioning — is the day-to-day fabric of every campus and branch network. The HPE7-A01 exam expects fluency here, and so does any production engineer who has ever traced “the user can’t reach the printer” back to a missing vlan trunk allowed entry on an uplink.
Learning Objectives
By the end of this chapter, you should be able to:
- Configure VLANs, access ports, and 802.1Q trunk ports on AOS-CX, including allowed-VLAN lists and the native VLAN.
- Explain how MAC address learning, aging, and table behavior work on a transparent bridge.
- Implement voice VLANs and LLDP-MED so an IP phone auto-provisions VLAN, 802.1p, and DSCP without user touch.
- Troubleshoot common VLAN and trunk misconfigurations using `show` commands and structured reasoning.
Section 1: VLAN Concepts and Configuration
A VLAN (Virtual LAN) is a Layer 2 broadcast domain identified by a 12-bit VLAN ID in the range 1-4094. Two ports in the same VLAN behave as if they were plugged into the same dumb hub; two ports in different VLANs are as isolated as two unconnected switches. The VLAN ID lives in the IEEE 802.1Q tag, a 4-byte header inserted right after the source MAC of an Ethernet frame, and it gives modern networks the segmentation, isolation, and policy boundaries that physical wiring used to provide [Source: https://www.arubanetworks.com/techdocs/AOS-CX/10.13/HTML/l2_bridging/Content/Chp_VLAN/con-vlan.htm].
Analogy: VLANs as floors in an office building. Without VLANs, every cubicle on every floor opens onto the same lobby — anyone can wander into anyone else’s department. VLANs give each department its own keycard-controlled floor that shares the elevators (the trunks). The elevator needs a label (the 802.1Q tag) to know which floor to open for each rider.
The Default VLAN
When AOS-CX boots a fresh configuration, exactly one VLAN exists: VLAN 1. This is the default VLAN, and every Layer 2 switch port is implicitly a member of it in access mode. VLAN 1 cannot be deleted, but for security and operational reasons, you should never use it for production traffic. Two reasons drive this advice:
- VLAN 1 is the default everywhere in the industry, so untagged traffic that leaks through a misconfigured trunk often lands in VLAN 1. If your real users are there, the leak goes unnoticed.
- Several vendors have historically carried control-plane and management traffic in VLAN 1 (Cisco's CDP, VTP, and DTP are the classic examples); using it for users mixes user traffic with that traffic.
The convention is to leave VLAN 1 unused, create explicit data, voice, management, and IoT VLANs, and ensure VLAN 1 is not in the allowed list on production trunks.
Creating VLANs and Assigning Names
VLANs are created in global configuration mode. AOS-CX strongly encourages naming and describing VLANs — the running config is far more readable when you can see “VOICE” instead of just “20”:
switch# configure terminal
switch(config)# vlan 10
switch(config-vlan-10)# name DATA
switch(config-vlan-10)# description "User Data VLAN"
switch(config-vlan-10)# no shutdown
switch(config-vlan-10)# exit
switch(config)# vlan 20
switch(config-vlan-20)# name VOICE
switch(config-vlan-20)# exit
switch(config)# vlan 99
switch(config-vlan-99)# name MANAGEMENT
switch(config-vlan-99)# exit
You can also create VLANs in batches: vlan 10,20,30 or vlan 100-110. AOS-CX VLANs are administratively up by default; the shutdown / no shutdown command at the VLAN level controls whether the VLAN is operationally active. A VLAN that exists but is shut down will not pass traffic [Source: https://www.arubanetworks.com/techdocs/AOS-CX/10.10/HTML/fundamentals_guide/Content/Chp_VLANs/VLA-cmds.htm].
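When you script VLAN provisioning, the list and range notation above is worth validating before it ever reaches a switch. Here is a short Python sketch of such a check; `expand_vlan_list` is a hypothetical helper, and it assumes only comma-separated IDs and simple `low-high` ranges within 1-4094.

```python
def expand_vlan_list(spec: str) -> list[int]:
    """Expand a list such as '10,20,30' or '100-110' into sorted VLAN IDs."""
    vlans = set()
    for part in spec.split(","):
        part = part.strip()
        first, sep, last = part.partition("-")
        lo, hi = int(first), int(last) if sep else int(first)
        if lo > hi:
            raise ValueError(f"descending range: {part}")
        if lo < 1 or hi > 4094:
            raise ValueError(f"VLAN ID out of range 1-4094: {part}")
        vlans.update(range(lo, hi + 1))   # ranges are inclusive on both ends
    return sorted(vlans)
```

`expand_vlan_list("100-110")` yields eleven IDs, and overlapping input such as `"10,10-12"` collapses to `[10, 11, 12]`; an ID of 4095 is rejected because it is reserved.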
Access vs. Trunk Modes
Layer 2 ports on AOS-CX operate in one of two modes:
| Mode | Tag handling on ingress | Tag handling on egress | Typical use case |
|---|---|---|---|
| Access | Untagged frames assigned to the access VLAN; tagged frames dropped (with a few exceptions like voice) | Frames sent untagged | End-host port (PC, printer, camera) |
| Trunk | Tagged frames assigned to the tag’s VLAN; untagged frames assigned to the native VLAN | Frames in non-native VLANs sent tagged; native VLAN sent untagged unless tag keyword used | Inter-switch links, hypervisor uplinks, AP uplinks, phone+PC ports |
A critical AOS-CX detail catches many newcomers: most CX platform interfaces default to routed (Layer 3) mode. You must run no routing on the interface before VLAN-membership commands take effect. If vlan access 10 returns an error or silently does nothing, this is almost always why.
switch(config)# interface 1/1/1
switch(config-if)# no shutdown
switch(config-if)# no routing
switch(config-if)# vlan access 10
switch(config-if)# exit
The Native VLAN
On a trunk, exactly one VLAN can be designated the native VLAN. Frames in the native VLAN are sent untagged across the wire; conversely, any untagged frame received on a trunk is mapped into the native VLAN. The default is VLAN 1.
Native VLAN matters for two reasons:
- Compatibility: Devices that don’t understand 802.1Q tagging (legacy hubs, some embedded systems, the management ports of certain appliances) can still talk on a trunk if their VLAN is the native one.
- Security and stability: A native VLAN mismatch — e.g., one switch has native 1, the peer has native 99 — causes silent traffic merging. Frames that left as untagged from VLAN 1 on switch A arrive at switch B and get classified into VLAN 99. Worse, the symptom is intermittent connectivity rather than a hard error.
Best practice is to either (a) set the native VLAN to an unused VLAN that carries no traffic, or (b) force tagging of the native VLAN with vlan trunk native <id> tag so all frames are tagged regardless. The latter mirrors Cisco’s vlan dot1q tag native behavior [Source: https://www.arubanetworks.com/techdocs/AOS-CX/10.10/HTML/fundamentals_guide/Content/Chp_VLANs/vla-trk-int.htm].
Worked Example: Building a Three-VLAN Edge Switch
Suppose a small office has 24 user PCs, an uplink to the core, and a need for separate Data (10), Voice (20), and Management (99) VLANs.
switch(config)# vlan 10
switch(config-vlan-10)# name DATA
switch(config-vlan-10)# exit
switch(config)# vlan 20
switch(config-vlan-20)# name VOICE
switch(config-vlan-20)# exit
switch(config)# vlan 99
switch(config-vlan-99)# name MANAGEMENT
switch(config-vlan-99)# exit
!
switch(config)# interface 1/1/1-1/1/24
switch(config-if-<1/1/1-1/1/24>)# no routing
switch(config-if-<1/1/1-1/1/24>)# vlan access 10
switch(config-if-<1/1/1-1/1/24>)# exit
!
switch(config)# interface 1/1/48
switch(config-if)# no routing
switch(config-if)# vlan trunk native 99
switch(config-if)# vlan trunk native 99 tag
switch(config-if)# vlan trunk allowed 10,20,99
Here, ports 1-24 are user access ports on VLAN 10; port 48 is a trunk to the core that carries the Data, Voice, and Management VLANs. The native VLAN is set to Management, but tagging is forced, so untagged frames arriving on the trunk are dropped rather than silently classified into a VLAN.
Figure 3.1: Three-VLAN edge switch with access ports and uplink trunk
flowchart LR
PC1["PC (VLAN 10)"] -->|untagged| P1["Port 1/1/1<br/>Access VLAN 10"]
PC2["PC (VLAN 10)"] -->|untagged| P2["Port 1/1/2<br/>Access VLAN 10"]
PHONE["IP Phone (VLAN 20)"] -->|tagged 20| P5["Port 1/1/5<br/>Access 10 + Trunk 20"]
MGMT["Mgmt Host (VLAN 99)"] -->|untagged| P24["Port 1/1/24<br/>Access VLAN 99"]
P1 --> SW{{"Edge Switch"}}
P2 --> SW
P5 --> SW
P24 --> SW
SW -->|"Trunk: tagged 10,20<br/>native 99 tagged"| P48["Port 1/1/48<br/>Trunk to Core"]
P48 --> CORE["Core Switch"]
Key Takeaways: VLAN Configuration
- VLANs span IDs 1-4094; VLAN 1 is the default and should not carry production traffic.
- `vlan <id>` creates the VLAN; it must exist before being applied to any port.
- `no routing` is required on most CX interfaces before VLAN commands take effect.
- Access ports use `vlan access <id>`; trunk ports use `vlan trunk native <id>` and `vlan trunk allowed <list>`.
- Native VLAN mismatches silently merge traffic between VLANs, so tag the native VLAN or use an unused one.
Section 2: 802.1Q Trunking
VLANs solve the “one switch, many broadcast domains” problem. Trunks solve the “many switches, same VLANs” problem. An 802.1Q trunk is a single Ethernet link that carries frames from multiple VLANs by inserting a VLAN tag into each frame, replacing what would otherwise be one cable per VLAN.
The 802.1Q Tag in Detail
The IEEE 802.1Q tag is a 4-byte header inserted in the Ethernet frame between the source MAC address and the EtherType / Length field:
+-----------+-----------+--------+--------+-------+---------+-----------+---------+-----------+
| Dest MAC  | Src MAC   | TPID   | PCP    | DEI   | VLAN ID | EtherType | Payload | FCS       |
| (6 bytes) | (6 bytes) | 0x8100 | 3 bits | 1 bit | 12 bits | (2 bytes) | ...     | (4 bytes) |
+-----------+-----------+--------+--------+-------+---------+-----------+---------+-----------+
| Field | Bits | Purpose |
|---|---|---|
| TPID | 16 | Tag Protocol Identifier; 0x8100 indicates a standard 802.1Q tag. |
| PCP | 3 | Priority Code Point — the 802.1p Class of Service (0-7), used by QoS. |
| DEI | 1 | Drop Eligible Indicator (formerly CFI). Marks a frame as drop-eligible under congestion. |
| VLAN ID | 12 | The VLAN identifier (0-4095; 0 and 4095 reserved, so 1-4094 usable). |
Adding the 4-byte tag raises the maximum standard frame size from 1518 to 1522 bytes; modern switches accept these frames, but legacy gear may not.
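To make the bit layout concrete, the field table can be turned into a few lines of Python using the standard `struct` module. This is an illustrative sketch: `build_tag` and `parse_tag` are hypothetical helper names, and the code handles only the 4-byte tag itself, not a complete Ethernet frame.

```python
import struct

TPID_8021Q = 0x8100

def build_tag(vlan_id: int, pcp: int = 0, dei: int = 0) -> bytes:
    """Pack TPID plus the 16-bit TCI (PCP:3, DEI:1, VID:12) into 4 bytes."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("usable VLAN IDs are 1-4094")
    if not 0 <= pcp <= 7 or dei not in (0, 1):
        raise ValueError("PCP must be 0-7 and DEI 0 or 1")
    tci = (pcp << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)   # network byte order

def parse_tag(tag: bytes) -> dict:
    """Unpack a 4-byte 802.1Q tag into its fields."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError(f"not an 802.1Q tag: TPID 0x{tpid:04x}")
    return {"pcp": tci >> 13, "dei": (tci >> 12) & 1, "vlan_id": tci & 0x0FFF}
```

For example, VLAN 10 with priority 5 packs to the bytes `81 00 a0 0a`: the first two bytes are the TPID, and `0xa00a` is priority 5 shifted into the top three bits combined with VLAN ID 10 in the low twelve.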
Tagged vs. Untagged Behavior
Picture a trunk port handling two frames:
- Frame A belongs to VLAN 10 (a non-native VLAN). The switch inserts the 802.1Q tag with VLAN ID 10 and sends it. The receiving switch reads the tag, strips it, and forwards the frame inside VLAN 10.
- Frame B belongs to VLAN 1 (the native VLAN). The switch sends it untagged. The receiving switch sees no tag and assigns it to its native VLAN.
This asymmetry is the source of native VLAN mismatch bugs. If switch A’s native is 1 and switch B’s native is 99, frame B from switch A’s VLAN 1 arrives at switch B and lands in VLAN 99.
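The mismatch is easy to reproduce on paper. The sketch below models only the tagging decision on each side of a trunk; the function names are hypothetical, and real switches add further checks (allowed lists, ingress filtering) that are left out here.

```python
from typing import Optional


def trunk_egress(vlan: int, native_vlan: int) -> Optional[int]:
    """Return the VLAN ID written into the tag, or None if sent untagged."""
    return None if vlan == native_vlan else vlan


def trunk_ingress(tag: Optional[int], native_vlan: int) -> int:
    """Tagged frames keep their VLAN ID; untagged frames join the
    receiving port's native VLAN."""
    return native_vlan if tag is None else tag


def cross_trunk(vlan: int, native_a: int, native_b: int) -> int:
    """VLAN a frame lands in on switch B after leaving switch A."""
    return trunk_ingress(trunk_egress(vlan, native_a), native_b)


print(cross_trunk(10, native_a=1, native_b=1))    # 10  tagged, unaffected
print(cross_trunk(1, native_a=1, native_b=1))     # 1   natives agree
print(cross_trunk(1, native_a=1, native_b=99))    # 99  VLAN 1 leaks into VLAN 99
```

Only the untagged frame changes VLAN, which is why tagged VLANs keep working and the fault goes unnoticed.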
Figure 3.2: 802.1Q frame tagging behavior across a trunk boundary
sequenceDiagram
participant H1 as "Host A (VLAN 10)"
participant SA as "Switch A"
participant SB as "Switch B"
participant H2 as "Host B (VLAN 10)"
H1->>SA: "Untagged frame ingress on access VLAN 10"
Note over SA: "Lookup egress: trunk to Switch B"
SA->>SB: "Tagged frame (TPID 0x8100, VID=10) on trunk"
Note over SB: "Read tag, classify into VLAN 10"
SB->>H2: "Untagged frame egress on access VLAN 10"
Note over SA,SB: "Native VLAN frames travel UNTAGGED — mismatch silently merges VLANs"
Allowed VLAN Lists
A trunk does not automatically carry every VLAN in the system; you must explicitly authorize each one. The vlan trunk allowed command takes a list of VLAN IDs:
switch(config-if)# vlan trunk allowed 10,20,99
switch(config-if)# vlan trunk allowed 100-110 add
switch(config-if)# vlan trunk allowed 99 remove
switch(config-if)# vlan trunk allowed all
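Note the subtle but important add and remove keywords: without them, the new list replaces the existing list. Forgetting add is a classic outage cause: you intended to permit VLAN 30 in addition to VLANs 10, 20, 99, but instead you removed everything except 30 [Source: https://www.arubanetworks.com/techdocs/AOS-CX/10.10/HTML/fundamentals_guide/Content/Chp_VLANs/vla-trk-int.htm]. Syntax for editing the list differs between platforms and releases (the AOS-CX command reference also documents a no vlan trunk allowed <list> form for removing VLANs), so run show running-config interface after every change to confirm what the list actually contains.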
The all keyword permits every VLAN that exists on the switch. It’s convenient for lab work but discouraged in production: pruning unused VLANs from trunks reduces broadcast overhead and limits the blast radius of configuration mistakes.
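Before changing an allowed list, it helps to compute exactly which VLANs the change would drop. The following sketch expands the comma-and-range list syntax into a set and compares the current list with a proposed one; parse_vlan_list is an illustrative helper, not a switch API.

```python
def parse_vlan_list(text):
    """Expand a list such as '10,20,100-110' into a set of VLAN IDs."""
    vlans = set()
    for part in text.split(","):
        part = part.strip()
        if "-" in part:
            low, high = (int(x) for x in part.split("-", 1))
        else:
            low = high = int(part)
        if not 1 <= low <= high <= 4094:
            raise ValueError(f"invalid VLAN range: {part!r}")
        vlans.update(range(low, high + 1))
    return vlans


current = parse_vlan_list("10,20,99")
proposed = parse_vlan_list("30")     # what a replacing command would leave behind

print(sorted(current - proposed))    # [10, 20, 99]      VLANs that would be dropped
print(sorted(current | proposed))    # [10, 20, 30, 99]  what was actually intended
```

A non-empty "would be dropped" result is the cue to stop and re-check the command before committing it.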
VLAN Translation
Some AOS-CX platforms support VLAN translation on trunks: a frame arriving with VLAN ID X is rewritten to VLAN ID Y before forwarding. This is useful at administrative boundaries — e.g., a service provider mapping the customer’s local VLAN 100 to the provider’s transport VLAN 2200. Consult the platform feature matrix for support.
Common Trunking Pitfalls
| Symptom | Likely cause | Fix |
|---|---|---|
| Hosts in VLAN X cannot reach hosts in VLAN X across switches | VLAN X not in vlan trunk allowed list on intermediate trunks | vlan trunk allowed X add |
| Sporadic connectivity, traffic in wrong VLAN | Native VLAN mismatch | Align native VLAN at both ends or use vlan trunk native <id> tag |
| Trunk doesn’t come up | Forgot no routing; or VLAN doesn’t exist; or peer is not in trunk mode (no DTP on AOS-CX!) | Add no routing, create the VLAN, configure trunk mode statically on both ends |
| Phones get IP from data DHCP scope | Voice VLAN not allowed on the trunk path | Add voice VLAN to all vlan trunk allowed lists end-to-end |
| Replaced switch dropped a VLAN silently | New vlan trunk allowed overwrote prior list (forgot add) | Always use add / remove when modifying the allowed list in place |
AOS-CX does not implement Cisco’s DTP. Always configure both sides of a trunk explicitly and identically [Source: https://community.arubanetworks.com/discussion/aos-cx-vlan-configuration].
Verification Workflow
When a trunk misbehaves, walk through these commands in order:
switch# show vlan
switch# show vlan port 1/1/48
switch# show interface 1/1/48 vlan
switch# show running-config interface 1/1/48
switch# show lldp neighbor-info 1/1/48 detail
show vlan port is especially useful: it reports every VLAN currently active on the port and whether each is tagged or untagged, which is exactly the level at which trunk problems are debugged.
Key Takeaways: 802.1Q Trunking
- An 802.1Q tag is 4 bytes; TPID 0x8100, 3-bit PCP, 1-bit DEI, 12-bit VLAN ID.
- The native VLAN’s traffic is sent untagged — mismatches merge VLANs silently.
- Always explicitly manage the allowed VLAN list with add/remove.
- AOS-CX has no DTP; statically configure trunk mode on both endpoints.
- Verify with show vlan port and show interface ... vlan.
Section 3: MAC Address Table
The MAC address table is what makes a switch a switch and not a hub: instead of flooding every frame to every port, the switch learns where each MAC lives and sends frames only where they need to go.
Analogy: The MAC table as a building’s mailroom. A new mailroom has no list of who sits where. The first time it sees outbound mail from Alice in office 1404, it scribbles “Alice -> 1404” on a sticky note. Now letters addressed to Alice go straight to 1404 instead of being run to every floor. If nobody mails from Alice for five minutes, the note is thrown away (aging). If “Alice” appears on a different floor, the note is updated (MAC move). Static entries are permanent nameplates — they never expire.
Dynamic Learning
When a frame enters a switch port, AOS-CX inspects the source MAC and the ingress {VLAN, port} pair. It then installs (or refreshes) a row in the MAC address table:
{ VLAN, source MAC } -> ingress port
That entry is marked dynamic. From then on, any frame whose destination MAC matches that entry — in the same VLAN — is unicast-forwarded out the matching port instead of being flooded.
Forwarding decisions follow this short logic ladder:
- Destination MAC is in the table (same VLAN): forward out the matching port.
- Destination MAC is broadcast (FF:FF:FF:FF:FF:FF): flood within VLAN.
- Destination MAC is multicast (and IGMP/MLD snooping is not constraining it): flood within VLAN.
- Destination MAC is unknown unicast: flood within VLAN, hoping to provoke a reply that completes the learning.
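The learning rule and the forwarding ladder fit in a small model. This is a toy sketch, not how AOS-CX implements its hardware tables: it assumes every port belongs to every VLAN, treats multicast like unknown unicast, and uses the 300-second default aging time described in this section. The class name MacTable is illustrative.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class MacTable:
    """Toy model of per-VLAN source-MAC learning with aging."""

    def __init__(self, ports, age_time=300):
        self.ports = set(ports)
        self.age_time = age_time
        self.entries = {}  # (vlan, mac) -> (port, last_seen)

    def receive(self, port, vlan, src, dst, now):
        """Learn from the source MAC, then return the set of egress ports."""
        self.entries[(vlan, src.lower())] = (port, now)   # install or refresh
        self._age_out(now)
        hit = self.entries.get((vlan, dst.lower()))
        if dst.lower() != BROADCAST and hit is not None:
            return {hit[0]} - {port}        # known unicast: one port only
        return self.ports - {port}          # broadcast or unknown: flood

    def _age_out(self, now):
        stale = [key for key, (_, seen) in self.entries.items()
                 if now - seen >= self.age_time]
        for key in stale:
            del self.entries[key]


sw = MacTable(ports=["1/1/1", "1/1/2", "1/1/3"])
a, b = "00:00:00:00:00:0a", "00:00:00:00:00:0b"
print(sorted(sw.receive("1/1/1", 10, a, b, now=0)))    # ['1/1/2', '1/1/3']  B unknown, flood
print(sorted(sw.receive("1/1/2", 10, b, a, now=1)))    # ['1/1/1']           A already learned
print(sorted(sw.receive("1/1/1", 10, a, b, now=2)))    # ['1/1/2']           B learned from reply
print(sorted(sw.receive("1/1/1", 10, a, b, now=400)))  # ['1/1/2', '1/1/3']  B aged out, flood
```

The last call shows why a silent endpoint causes flooding: B sent nothing for more than 300 seconds, so its entry was purged even though A was still talking to it.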
Figure 3.3: MAC learning lifecycle for a dynamic entry
stateDiagram-v2
[*] --> Unknown: "Switch boots / table empty"
Unknown --> Flooding: "Frame to dest MAC X arrives"
Flooding --> Learning: "Reply seen with src MAC X on port P"
Learning --> Active: "Install {VLAN, X} -> port P (dynamic)"
Active --> Active: "Refresh on each frame from X (reset 300s timer)"
Active --> Moved: "Same MAC X seen on port Q (MAC move)"
Moved --> Active: "Update entry to port Q; log move"
Active --> Aged: "No frames from X for 300s"
Aged --> [*]: "Entry purged from table"
Active --> Static: "Admin pins MAC (mac-address-table static)"
Static --> Static: "Survives reboot, never ages"
Dynamic vs. Static Entries
| Type | Created by | Persists across reboot? | Ages out? | Typical use |
|---|---|---|---|---|
| Dynamic | Source-MAC learning | No | Yes | Normal end-host traffic |
| Static | Administrator (mac-address-table static) | Yes (in saved config) | No | Bind a known device’s MAC to a port; suppress flooding for silent endpoints |
| System / management | Switch software at boot | Yes | No | The switch’s own management MAC, SVI/RVI MACs |
Static entry configuration:
switch(config)# mac-address-table static 00:11:22:33:44:55 vlan 10 interface 1/1/5
Static entries override learning: the same MAC seen on a different port will not update the table, and the switch may log or drop the offending frames depending on the platform [Source: https://www.arubanetworks.com/techdocs/AOS-CX/10.10/HTML/l2_bridging/Content/Chp_MAC_addr_tbl/mac-add-tbl-cmds.htm].
MAC Aging Timer
Default MAC aging on AOS-CX is 300 seconds (5 minutes). Each dynamic entry has its own age clock, which is reset every time a frame from that source MAC is seen. After 300 seconds without a refresh, the entry is purged.
switch(config)# mac-address-table age-time 600
The valid range is roughly 10-1,000,000 seconds, with 0 disabling aging entirely. Why might you change it?
- Increase aging (e.g., 1800 s) when endpoints transmit infrequently — IoT sensors, printers — and you want to avoid unnecessary unknown-unicast flooding.
- Decrease aging (e.g., 60 s) in environments where endpoints move quickly between ports, like virtual desktop infrastructure or high-mobility wireless backhaul.
- Disable aging only in tightly controlled lab or static environments; this risks stale entries pinned forever.
The MAC aging timer should be at least slightly longer than the ARP aging timer of upstream routers. Otherwise a router whose ARP entry is still valid keeps sending unicast frames to a MAC the switch has already aged out, and each of those frames is flooded as unknown unicast until the host transmits again [Source: https://community.arubanetworks.com/discussion/mac-aging-time-aos-cx].
Port Security: mac-lockout, mac-limit
AOS-CX provides several mechanisms to constrain which MAC addresses are allowed on a port and how many can appear:
switch(config-if)# port-access security violation action shutdown
switch(config-if)# port-access security client-limit 2
The client-limit caps the number of MAC addresses that can be learned on the port. When the cap is exceeded, the configured violation action runs — the choices typically include drop new MACs, log only, or err-disable the entire port. Combine this with 802.1X (port-access authenticator) or MAC authentication (port-access mac-auth) for true admission control [Source: https://www.arubanetworks.com/techdocs/AOS-CX/10.10/HTML/security/Content/Chp_port_security/con-port-sec.htm].
For environments that must block specific MACs (e.g., a stolen laptop), a static “lockout” entry can blackhole any frame matching the offending MAC.
MAC Move Detection
When the same source MAC appears on a different port within the same VLAN, the switch updates the MAC table to reflect the new port. Each move is normally fine — your laptop unplugged and re-plugged into a different port — but rapid, repeated moves are a strong signal of one of three problems:
- A Layer 2 loop: the switch is seeing its own frames echoed back from a different port.
- A duplicate MAC: two different devices are accidentally using the same Ethernet address.
- A flapping link: the device alternates between two access paths (e.g., redundant NIC misconfigured).
AOS-CX logs frequent MAC moves and many platforms can rate-limit or alarm on excessive move counts. The companion feature loop-protect proactively detects loops by emitting probe frames; if a probe returns, the originating port is err-disabled.
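The difference between an ordinary move and a loop symptom is the rate. The sketch below counts moves per MAC inside a sliding window and flags the MAC once a threshold is exceeded; the threshold of 3 moves in 10 seconds is a made-up illustrative value, not an AOS-CX default, and MoveDetector is a hypothetical name.

```python
from collections import defaultdict, deque


class MoveDetector:
    """Flag a MAC that changes port too often inside a sliding time window."""

    def __init__(self, max_moves=3, window_s=10):
        self.max_moves = max_moves
        self.window_s = window_s
        self.location = {}                  # (vlan, mac) -> current port
        self.moves = defaultdict(deque)     # (vlan, mac) -> timestamps of moves

    def observe(self, vlan, mac, port, now):
        """Record a sighting; return True when the move rate looks suspicious."""
        key = (vlan, mac)
        previous = self.location.get(key)
        self.location[key] = port
        if previous is None or previous == port:
            return False                    # first sighting or plain refresh
        history = self.moves[key]
        history.append(now)
        while history and now - history[0] > self.window_s:
            history.popleft()               # forget moves outside the window
        return len(history) > self.max_moves


det = MoveDetector(max_moves=3, window_s=10)
mac = "00:11:22:33:44:55"
print(det.observe(10, mac, "1/1/1", now=0))     # False  first sighting
print(det.observe(10, mac, "1/1/2", now=100))   # False  one ordinary move
flaps = [det.observe(10, mac, port, now=200 + i)
         for i, port in enumerate(["1/1/1", "1/1/2", "1/1/1", "1/1/2"])]
print(flaps)                                    # [False, False, False, True]
```

A laptop moved between desks produces a single move and stays below the threshold; a MAC bouncing between two ports every second crosses it almost immediately.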
Table Size and Hardware Considerations
MAC entries on Aruba CX are stored in hardware (TCAM / L2 forwarding tables) for line-rate lookup. Capacity varies by platform:
| Platform family | Approximate MAC table size |
|---|---|
| 6100 / 6300 | 16K - 32K |
| 6400 / 8325 | 96K - 288K |
| 8400 / 10000 | 288K - 1M+ |
[Source: https://www.arubanetworks.com/assets/ds/DS_AOS-CX.pdf]
Exhaustion is rare but possible in very large flat L2 domains; the cure is segmentation or a routed core.
show mac-address-table
The single most useful command for L2 troubleshooting:
switch# show mac-address-table
switch# show mac-address-table vlan 10
switch# show mac-address-table port 1/1/5
switch# show mac-address-table address 00:11:22:33:44:55
switch# show mac-address-table count
switch# show mac-address-table dynamic
Output columns include MAC, VLAN, type, and port. The count form summarizes capacity utilization. To clear stale dynamic entries during testing:
switch# clear mac-address-table dynamic
switch# clear mac-address-table dynamic vlan 10
switch# clear mac-address-table dynamic port 1/1/5
Worked Example: “Where is this MAC?”
A user reports they can’t reach the file server. From the user’s switch:
switch# show mac-address-table address 00:50:56:aa:bb:cc
No entries found.
switch# show vlan port 1/1/8
Port VLAN Mode Tagging
1/1/8 10 Access Untagged
switch# show mac-address-table vlan 10 | count
Total dynamic entries: 1 (only the user!)
The user is the only MAC in VLAN 10 from this switch’s perspective — the file server isn’t reachable at L2. Move up to the uplink trunk:
upstream# show vlan port 1/1/24
Port VLAN Mode Tagging
1/1/24 20 Trunk Tagged
1/1/24 99 Trunk Tagged
(no VLAN 10!)
The uplink trunk does not include VLAN 10. Adding it (vlan trunk allowed 10 add) restores connectivity. “Trace the MAC up the path, find the broken link” is the most common L2 troubleshooting workflow.
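The same hop-by-hop check can be scripted once the per-port VLAN membership has been collected. This sketch assumes you have already gathered that data (for example from show vlan port on each hop) into a simple list; the data structure and the function name first_missing_hop are illustrative.

```python
def first_missing_hop(path, vlan):
    """Return the first (switch, port) on the path that does not carry the
    VLAN, or None when the VLAN is carried end to end."""
    for switch, port, carried_vlans in path:
        if vlan not in carried_vlans:
            return switch, port
    return None


# Hypothetical inventory mirroring the worked example above.
path = [
    ("switch",   "1/1/8",  {10}),       # user access port
    ("upstream", "1/1/24", {20, 99}),   # uplink trunk, VLAN 10 missing
]
print(first_missing_hop(path, 10))      # ('upstream', '1/1/24')

path[1] = ("upstream", "1/1/24", {10, 20, 99})   # after allowing VLAN 10
print(first_missing_hop(path, 10))      # None
```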
Key Takeaways: MAC Address Table
- AOS-CX learns dynamic entries from source MACs and ages them out after 300 s by default.
- Static entries persist forever and override learning.
- Unknown unicast and broadcast are flooded within the VLAN; storm-control can rate-limit unknown-unicast.
- Frequent MAC moves usually mean a loop or duplicate MAC; loop-protect can err-disable looped ports.
- Use show mac-address-table with filters to walk the L2 path during troubleshooting.
Section 4: Voice VLAN and LLDP-MED
The most “interesting” L2 port in a modern enterprise is the desk port shared by an IP phone and a daisy-chained PC. That port must (1) carry the phone on a tagged voice VLAN with QoS markings appropriate for real-time media, (2) carry the PC on an untagged data VLAN with no special handling, and (3) auto-provision the phone with VLAN, 802.1p priority, and DSCP — without anyone touching the phone. LLDP-MED is the standards-based protocol that makes #3 possible.
LLDP and LLDP-MED Basics
LLDP (IEEE 802.1AB) is a vendor-neutral neighbor-discovery protocol. Each device periodically multicasts a frame describing itself: chassis ID, port ID, system name, capabilities, and management address. Switches store what they’ve heard and surface it via show lldp neighbor-info.
LLDP-MED (ANSI/TIA-1057) extends LLDP for IP phones and similar endpoints. It adds Network Policy TLVs that let a switch tell a phone which VLAN, 802.1p priority, and DSCP to use for voice media and signaling. The phone arrives unconfigured; seconds later, it knows what it needs [Source: https://www.arubanetworks.com/techdocs/AOS-CX/10.10/HTML/security/Content/Chp_LLDP/con-lldp-med.htm].
Analogy: LLDP-MED as the hotel concierge. When a guest (the phone) checks in, the concierge (the switch) hands them a card explaining which floor (VLAN), which entrance (priority), and which elevator key (DSCP) to use. A PC, with no reservation, just uses the lobby (the data VLAN) like everyone else.
Voice VLAN Auto-Configuration
The standard model is tagged voice + untagged data on the same port:
- Voice VLAN: tagged. The phone learns the VLAN ID via LLDP-MED and tags its own RTP/SIP traffic.
- Data VLAN: untagged. The PC daisy-chained behind the phone (using the phone’s PC pass-through port) is VLAN-unaware and sees a normal access port.
On AOS-CX, this is configured by combining vlan access with vlan trunk allowed:
switch(config)# vlan 10
switch(config-vlan-10)# name DATA
switch(config-vlan-10)# exit
switch(config)# vlan 20
switch(config-vlan-20)# name VOICE
switch(config-vlan-20)# voice
switch(config-vlan-20)# exit
!
switch(config)# interface 1/1/5
switch(config-if)# no shutdown
switch(config-if)# no routing
switch(config-if)# vlan access 10
switch(config-if)# vlan trunk allowed 20
The voice keyword inside the VLAN context flags VLAN 20 as a voice VLAN; this is what LLDP-MED Network Policy TLVs reference. The interface configuration says: “untagged frames are VLAN 10, but tagged VLAN 20 frames are also accepted.” AOS-CX treats access and trunk as separate port modes, so if show running-config interface shows that the second command displaced the first, express the same intent in trunk form with vlan trunk native 10 and vlan trunk allowed 10,20. The phone, having learned VLAN 20 from LLDP-MED, will tag its own traffic with VLAN 20. The PC, with no VLAN awareness, sends untagged frames that land in VLAN 10 [Source: https://www.arubanetworks.com/techdocs/AOS-CX/10.10/HTML/fundamentals_guide/Content/Chp_VLANs/voice-vlan.htm].
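The ingress rule for a shared phone and PC port reduces to one decision per frame. The sketch below models that decision only; the function name classify and the drop behavior for unexpected tags are simplifying assumptions for illustration.

```python
from typing import Optional


def classify(tag: Optional[int], untagged_vlan: int, tagged_vlans: set) -> Optional[int]:
    """Return the VLAN a frame joins on ingress, or None if it is dropped."""
    if tag is None:
        return untagged_vlan                      # PC traffic: no tag, data VLAN
    return tag if tag in tagged_vlans else None   # tagged traffic must be allowed


DATA, VOICE = 10, 20
print(classify(None, DATA, {VOICE}))   # 10    PC behind the phone
print(classify(20, DATA, {VOICE}))     # 20    phone after LLDP-MED
print(classify(30, DATA, {VOICE}))     # None  tag not allowed on this port
```

The phone and the PC never need to know about each other: the presence or absence of a tag is enough to separate them.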
Enabling LLDP and LLDP-MED
LLDP is enabled by default globally on most AOS-CX switches. To enable LLDP-MED and advertise a network policy:
switch(config)# lldp
switch(config)# lldp configure med
switch(config)# lldp med-tlv-select network-policy
!
switch(config)# interface 1/1/5
switch(config-if)# lldp transmit
switch(config-if)# lldp receive
switch(config-if)# lldp med-tlv-select network-policy
switch(config-if)# lldp med network-policy voice vlan 20 priority 5 dscp 46
switch(config-if)# exit
The Network Policy TLV carries the application type, VLAN ID, 802.1p priority, and DSCP value. Common application types include:
| Application type | Typical priority | Typical DSCP |
|---|---|---|
| voice | 5 | 46 (EF) |
| voice-signaling | 3 | 24 (CS3) |
| guest-voice | 5 | 46 |
| softphone-voice | 5 | 46 |
| video-conferencing | 4 | 34 (AF41) |
| streaming-video | 4 | 26 (AF31) |
| video-signaling | 3 | 24 (CS3) |
[Source: https://www.arubanetworks.com/techdocs/AOS-CX/10.13/HTML/fundamentals/Content/Chp_LLDP/lldp-cmds.htm; https://global.tia.org/standard/ANSI-TIA-1057]
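The DSCP numbers in the table are not arbitrary. As a quick check, the sketch below derives them from the standard code-point formulas (AFxy = 8x + 2y, CSx = 8x) and shows the value that appears in the IP ToS byte, where DSCP occupies the upper six bits. The helper names are illustrative.

```python
def af(klass, drop):
    """Assured Forwarding code point AFxy = 8x + 2y."""
    return 8 * klass + 2 * drop


def cs(klass):
    """Class Selector code point CSx = 8x."""
    return 8 * klass


def tos_byte(dscp):
    """DSCP sits in the upper six bits of the IP ToS / Traffic Class byte."""
    return dscp << 2


EF = 46  # Expedited Forwarding

print(cs(3), hex(tos_byte(cs(3))))            # 24 0x60
print(af(3, 1), hex(tos_byte(af(3, 1))))      # 26 0x68
print(af(4, 1), hex(tos_byte(af(4, 1))))      # 34 0x88
print(EF, hex(tos_byte(EF)))                  # 46 0xb8
```

The ToS values matter in packet captures: an RTP packet marked EF shows 0xb8 in the ToS field, not 46.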
QoS Marking from LLDP-MED
The Network Policy tells the phone what to mark; you also need the switch to honor those markings. The simplest method is to trust the phone’s DSCP markings and queue accordingly:
switch(config)# qos trust dscp
switch(config)# interface 1/1/5
switch(config-if)# qos trust dscp
Alternatively, classify and apply a policy that places EF traffic into a strict-priority queue:
switch(config)# class ip VOICE-RTP
switch(config-class-ip)# match dscp ef
switch(config-class-ip)# exit
switch(config)# policy VOICE-POLICY
switch(config-policy)# class ip VOICE-RTP action priority strict
switch(config-policy)# exit
switch(config)# interface 1/1/5
switch(config-if)# apply policy VOICE-POLICY in
Strict-priority (or low-latency) queuing keeps voice packets ahead of bursty data. The objective is to keep one-way latency below ~150 ms, jitter below ~30 ms, and loss below 1% — beyond those numbers, call quality degrades audibly [Source: https://www.arubanetworks.com/techdocs/AOS-CX/10.10/HTML/qos/Content/Chp_QoS/qos-trust.htm].
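What "strict priority" means is easiest to see in a queue model. The sketch below is a conceptual scheduler with eight priority levels, not a description of AOS-CX queue hardware; the class name and the packet labels are illustrative.

```python
from collections import deque


class StrictPriorityScheduler:
    """Always serve the highest-priority non-empty queue first."""

    def __init__(self, levels=8):
        self.queues = [deque() for _ in range(levels)]   # index = priority 0-7

    def enqueue(self, packet, priority):
        self.queues[priority].append(packet)

    def dequeue(self):
        for queue in reversed(self.queues):              # priority 7 first, 0 last
            if queue:
                return queue.popleft()
        return None                                      # nothing waiting


sched = StrictPriorityScheduler()
for pkt in ("data-1", "data-2", "data-3"):
    sched.enqueue(pkt, priority=0)
sched.enqueue("rtp-1", priority=5)
sched.enqueue("data-4", priority=0)
sched.enqueue("rtp-2", priority=5)

print([sched.dequeue() for _ in range(6)])
# ['rtp-1', 'rtp-2', 'data-1', 'data-2', 'data-3', 'data-4']
```

Both voice packets leave before any data packet, even though three data packets arrived first. The trade-off is that a misbehaving high-priority flow can starve everything below it, which is why only trusted voice traffic should land in that queue.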
Verifying Phone Discovery
show lldp neighbor-info ... detail reveals everything the phone advertised back, including the LLDP-MED TLVs it accepted:
switch# show lldp neighbor-info 1/1/5 detail
Port : 1/1/5
Neighbor Entries : 1
Chassis-ID Type : MAC address
Chassis-ID : 00:21:5a:11:22:33
Port-ID Type : MAC address
Port-ID : 00:21:5a:11:22:33
System Name : SEP00215A112233
System Description : Aruba IP Phone, Firmware: 7.4.1
Capabilities Supported : Bridge, Telephone
Capabilities Enabled : Bridge, Telephone
LLDP-MED Capabilities
Device Type : Endpoint Class III
Capabilities : Capabilities, Network Policy, Inventory
Network Policy
Application : voice
VLAN ID : 20
Priority : 5
DSCP : 46
Inventory
Manufacturer : HPE Aruba
Model : AP-IP501
Firmware : 7.4.1
If the Network Policy block matches the intended VLAN, priority, and DSCP, the LLDP-MED handshake succeeded. If missing or wrong, check show lldp configuration (LLDP enabled?), show lldp tlvs-tx (network-policy TLV included?), show vlan port (voice VLAN allowed?), and show running-config interface (policy actually configured?).
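When many phone ports need checking, the comparison can be automated. This is a minimal sketch that assumes the "Key : Value" layout of the sample output shown above; real output formatting varies by release, so treat the parser as a starting point. The function name parse_network_policy is hypothetical.

```python
def parse_network_policy(output):
    """Pull Application / VLAN ID / Priority / DSCP out of 'Key : Value' lines."""
    wanted = {"Application", "VLAN ID", "Priority", "DSCP"}
    policy = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if sep and key in wanted:
            policy[key] = int(value) if value.isdigit() else value
    return policy


sample = """\
Network Policy
Application : voice
VLAN ID : 20
Priority : 5
DSCP : 46
"""
expected = {"Application": "voice", "VLAN ID": 20, "Priority": 5, "DSCP": 46}

found = parse_network_policy(sample)
mismatches = {k: (found.get(k), v) for k, v in expected.items() if found.get(k) != v}
print(mismatches or "LLDP-MED policy matches")   # LLDP-MED policy matches
```

An empty mismatch dictionary means the phone reported the intended policy; any entry shows the observed value next to the expected one.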
Worked Example: A Complete Phone+PC Port
Putting it all together for an Aruba CX 6100 supporting an IP phone with PoE and a daisy-chained PC:
vlan 10
name DATA
vlan 200
name VOICE
voice
!
interface 1/1/10
no shutdown
no routing
vlan access 10
vlan trunk allowed 200
lldp med network-policy voice vlan 200 priority 5 dscp 46
qos trust dscp
power-over-ethernet
Sequence when the phone is plugged in: (1) PoE negotiation supplies power; (2) within ~10 seconds the switch announces the Network Policy TLV (VLAN 200, priority 5, DSCP 46); (3) the phone tags its traffic with VLAN 200 and marks RTP with DSCP 46; (4) the phone requests DHCP on the voice VLAN and registers with the call manager; (5) frames from the PC behind the phone arrive untagged and land in VLAN 10.
Figure 3.4: LLDP-MED voice VLAN auto-provisioning sequence
sequenceDiagram
participant Phone as "IP Phone"
participant SW as "AOS-CX Switch"
participant DHCP as "DHCP Server"
participant CM as "Call Manager"
Phone->>SW: "Link up + PoE classification"
SW-->>Phone: "PoE class 3/4 power supplied"
SW->>Phone: "LLDP-MED Network Policy TLV<br/>(VLAN 200, PCP 5, DSCP 46)"
Note over Phone: "Phone configures voice VLAN tag"
Phone->>SW: "Tagged DHCP DISCOVER on VLAN 200"
SW->>DHCP: "Forward DHCP request (voice scope)"
DHCP-->>Phone: "DHCP OFFER: IP, TFTP, call-mgr address"
Phone->>CM: "SIP REGISTER (DSCP 46 on voice VLAN)"
CM-->>Phone: "200 OK — phone ready"
Note over SW: "PC behind phone sends untagged → VLAN 10 (data)"
Common Voice VLAN Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| Phone never gets a voice VLAN | lldp med network-policy not applied; voice VLAN not in vlan trunk allowed | Apply lldp med network-policy voice vlan ... priority ... dscp ...; allow VLAN |
| show lldp neighbor-info empty | LLDP disabled on port; phone not powered | lldp transmit/lldp receive; verify PoE |
| Voice quality choppy | DSCP not trusted; voice not in priority queue | qos trust dscp on port; verify priority queueing |
| PC behind phone has no connectivity | Data VLAN not configured as untagged on the port | vlan access <data-vlan> on the interface |
| Native VLAN warnings on uplink | Trunk native VLAN mismatch | Align native or vlan trunk native <id> tag |
[Source: https://community.arubanetworks.com/discussion/voice-vlan-cx-lldp-med]
Key Takeaways: Voice VLAN and LLDP-MED
- The voice/data design is tagged voice + untagged data on the same port, with the phone learning its VLAN via LLDP-MED.
- Mark VLANs as voice with the voice keyword and apply lldp med network-policy voice vlan <id> priority 5 dscp 46.
- qos trust dscp makes the switch honor the phone’s EF marking and queue voice ahead of data.
- Verify with show lldp neighbor-info <port> detail, looking for the Network Policy TLV the phone accepted.
- Most voice problems trace back to either a missing allowed-VLAN entry or a missing LLDP-MED policy on the interface.
Chapter Summary
Layer 2 switching on AOS-CX is the foundation everything else rests on. VLANs partition a single physical fabric into many logical broadcast domains, identified by 12-bit VLAN IDs and signaled across switches by the IEEE 802.1Q tag. Access ports carry untagged traffic for one VLAN, trunks carry tagged traffic for many, and the native VLAN is the trunk’s one untagged exception — a feature that, mismanaged, becomes the most common silent-misconfiguration in real networks.
The MAC address table turns Ethernet from a hub into a switch. Source-MAC learning, 300-second aging, dynamic and static entries, MAC moves, port-security limits, and unknown-unicast flooding all interact to deliver line-rate forwarding while still adapting to a changing topology. show mac-address-table and its filtered variants are the workhorse troubleshooting commands.
Voice VLAN and LLDP-MED bring it all together at the user-facing edge. With the voice keyword, an LLDP-MED network policy, and qos trust dscp, an IP phone plugs in, learns its VLAN/priority/DSCP automatically, and shares its port with a daisy-chained PC — all without phone-side configuration. Verify with show lldp neighbor-info detail, show vlan port, and show interface.
AOS-CX has its own conventions you must internalize:
- Most interfaces default to routed; you must enter no routing before applying VLAN commands.
- AOS-CX does not speak DTP; every trunk is statically configured on both ends.
- VLANs must exist globally (vlan <id>) before being applied to a port.
- The vlan trunk allowed command replaces the list unless you use add/remove.
These four habits, more than any other knowledge, separate engineers who are productive on AOS-CX from engineers who are constantly puzzled by it.
Key Terms
| Term | Definition |
|---|---|
| VLAN | Virtual LAN; a Layer 2 broadcast domain identified by a 12-bit VLAN ID (1-4094) on AOS-CX. |
| 802.1Q | IEEE standard for VLAN tagging on Ethernet; inserts a 4-byte tag (TPID 0x8100, PCP, DEI, VLAN ID) after the source MAC. |
| Trunk | A port that carries traffic from multiple VLANs by tagging frames with 802.1Q; configured with vlan trunk commands. |
| Access port | A port that carries untagged traffic for a single VLAN; configured with vlan access <id>. |
| Native VLAN | The one VLAN per trunk whose traffic is sent untagged. Mismatches between trunk endpoints cause silent traffic merging. |
| MAC address table | The forwarding table mapping {VLAN, MAC} -> egress port. Populated by source-MAC learning; ages out after 300 s by default. |
| LLDP-MED | Link Layer Discovery Protocol for Media Endpoint Devices (ANSI/TIA-1057); advertises Network Policy TLVs to IP phones. |
| Voice VLAN | A VLAN flagged with the voice keyword that LLDP-MED Network Policy TLVs reference; carries tagged phone traffic on a shared port. |
Sources Cited
AOS-CX VLAN, MAC, LLDP, QoS, and port-security documentation at arubanetworks.com/techdocs (10.10 and 10.13 release branches); IEEE 802.1AB-2016; ANSI/TIA-1057; AOS-CX datasheet; Aruba community discussions on AOS-CX VLAN configuration, MAC aging, and LLDP-MED voice VLANs (URLs listed inline above).
Chapter 4: Spanning Tree Protocol and Loop Prevention on AOS-CX
Layer 2 networks built with Ethernet have a wonderful property — broadcast domains scale gracefully because every switch learns MAC addresses dynamically and floods unknowns until it knows where they live. They also have a terrible property: a single redundant cable can collapse the entire network in seconds. Without a loop-prevention protocol, broadcast frames cycle endlessly, multiplying with every switch they pass, until CPUs hit 100%, MAC tables churn uncontrollably, and even ping fails between hosts in the same VLAN.
Spanning Tree Protocol (STP) and its successors are the antidote. They build a logical, loop-free tree on top of the physical mesh, blocking just enough ports to break loops while leaving alternate paths armed for fast failover. On AOS-CX, the modern incarnation is Multiple Spanning Tree Protocol (MSTP), which extends classic STP with VLAN-grouped instances and convergence speeds measured in seconds — sometimes milliseconds. This chapter teaches you how MSTP works, how to configure it on AOS-CX, how to harden it with edge protections, and how to troubleshoot it when something inevitably goes sideways.
Learning Objectives
By the end of this chapter, you will be able to:
- Compare STP, RSTP, and MSTP and identify the default spanning-tree behavior on AOS-CX switches.
- Configure MSTP regions, instances, and VLAN-to-instance mappings on AOS-CX.
- Apply edge-port, BPDU guard, root guard, and loop guard protections in the right places.
- Diagnose and resolve common spanning-tree issues using show and debug commands.
Section 1: STP/RSTP/MSTP Fundamentals
Why a Spanning Tree?
Imagine three switches cabled in a triangle so that any two have a direct link. That triangle is a Layer 2 loop. When a single broadcast frame (say, an ARP request) leaves a host, every switch floods it out every port except the one it arrived on. The frame travels around the triangle and returns. Each switch floods it again. Ethernet has no Time-to-Live field, so nothing ever removes those copies: within seconds the links are saturated with frames that circulate endlessly, and in denser meshes every pass multiplies them.
Spanning tree’s job is to elect a root bridge (the trunk of the tree) and then have every other switch select a single best path back to that root, blocking all other paths. The result is a tree topology — loop-free by construction — that can be re-computed if any link fails.
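A short simulation makes the loop problem concrete. The sketch below floods one broadcast through a set of inter-switch links and counts the copies in flight after each hop. It is a deliberately simplified model: one link per switch pair, no host ports, no link-rate limits, and the switch names are illustrative.

```python
from itertools import combinations


def flood_history(links, blocked=(), start="SW1", max_hops=5):
    """Return the number of copies of one broadcast in flight after each hop."""
    blocked = {frozenset(link) for link in blocked}
    neighbors = {}
    for a, b in links:
        if frozenset((a, b)) in blocked:
            continue                          # link held in discarding state
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    in_flight = [(None, start)]               # (came from, currently at)
    history = []
    for _ in range(max_hops):
        in_flight = [(here, nxt)
                     for came_from, here in in_flight
                     for nxt in neighbors.get(here, ())
                     if nxt != came_from]     # flood everywhere except ingress
        history.append(len(in_flight))
        if not in_flight:
            break
    return history


triangle = [("SW1", "SW2"), ("SW2", "SW3"), ("SW1", "SW3")]
full_mesh = list(combinations(["SW1", "SW2", "SW3", "SW4"], 2))

print(flood_history(triangle))                             # [2, 2, 2, 2, 2]
print(flood_history(full_mesh))                            # [3, 6, 12, 24, 48]
print(flood_history(triangle, blocked=[("SW2", "SW3")]))   # [2, 0]
```

In the triangle the two copies never die; in the four-switch mesh they double on every hop; with one link blocked the topology is a tree and the flood ends after reaching every switch once.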
Figure 4.1: Root bridge election and path selection sequence
sequenceDiagram
participant SW1 as Switch 1<br/>(BID 32768:AAAA)
participant SW2 as Switch 2<br/>(BID 32768:BBBB)
participant SW3 as Switch 3<br/>(BID 4096:CCCC)
SW1->>SW2: BPDU (Root=SW1, Cost=0)
SW2->>SW1: BPDU (Root=SW2, Cost=0)
SW3->>SW1: BPDU (Root=SW3, Cost=0)
SW3->>SW2: BPDU (Root=SW3, Cost=0)
Note over SW1,SW3: Compare BIDs - lowest priority wins
SW1->>SW1: Accept SW3 as root (lower priority)
SW2->>SW2: Accept SW3 as root (lower priority)
SW1->>SW3: BPDU (Root=SW3, Cost=2000)
SW2->>SW3: BPDU (Root=SW3, Cost=2000)
Note over SW1,SW3: Each non-root selects best root port<br/>and blocks redundant paths
Analogy: Picture a city’s road network with several bridges across a river. To prevent traffic chaos during a parade, the mayor designates one bridge as the official route, marks alternates as “emergency only,” and posts officers to redirect traffic. Spanning tree is that mayor: it picks a single forwarding path and parks the rest until needed.
Bridge ID, Root Election, and Path Cost
Every switch participating in spanning tree has a Bridge ID (BID) composed of two parts:
- Priority — a 16-bit value (default 32768 in classic STP, expressed as multiples of 4096 with the lower 12 bits reserved for the VLAN/instance).
- MAC address — the switch’s base Ethernet address, used as a tiebreaker.
The switch with the lowest Bridge ID wins the root election. Once a root is chosen, every other switch calculates the cost of reaching the root over each of its links and selects the lowest-cost path. The port carrying that path becomes the root port. Costs are based on link bandwidth — for example, in the IEEE 802.1D-2004 cost table a 1 Gbps link costs 20,000, while 10 Gbps costs 2,000. Lower is better.
| Link Speed | IEEE 802.1D-2004 Cost (Long) | Original 802.1D Cost (Short) |
|---|---|---|
| 10 Mbps | 2,000,000 | 100 |
| 100 Mbps | 200,000 | 19 |
| 1 Gbps | 20,000 | 4 |
| 10 Gbps | 2,000 | 2 |
| 100 Gbps | 200 | 1 |
Example: A switch named ACCESS-1 is connected to two distribution switches: to DIST-A (the root) over a 10 Gbps link, and to DIST-B over a 1 Gbps link. DIST-B in turn reaches the root over its own 10 Gbps link to DIST-A. The path cost via DIST-A is 2,000; the path cost via DIST-B is 20,000 + 2,000 = 22,000. ACCESS-1 elects the link to DIST-A as its root port.
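Both the election and the cost arithmetic can be reproduced in a few lines. This sketch uses the long-format costs from the table above; the full MAC addresses are hypothetical expansions of the abbreviated bridge IDs in Figure 4.1, and comparing them as strings works only because they share the same lower-case, fixed-width format.

```python
# Long-format (IEEE 802.1D-2004) port costs keyed by link speed in Mbps.
LONG_COST = {10: 2_000_000, 100: 200_000, 1_000: 20_000, 10_000: 2_000, 100_000: 200}


def elect_root(bridges):
    """bridges maps name -> (priority, mac); the lowest (priority, MAC) wins."""
    return min(bridges, key=lambda name: bridges[name])


def path_cost(link_speeds_mbps):
    """Sum the cost of every link on the way to the root."""
    return sum(LONG_COST[speed] for speed in link_speeds_mbps)


bridges = {
    "SW1": (32768, "00:00:00:00:aa:aa"),
    "SW2": (32768, "00:00:00:00:bb:bb"),
    "SW3": (4096,  "00:00:00:00:cc:cc"),
}
print(elect_root(bridges))                # SW3  lowest priority wins

via_dist_a = path_cost([10_000])          # ACCESS-1 to DIST-A
via_dist_b = path_cost([1_000, 10_000])   # ACCESS-1 to DIST-B to DIST-A
print(via_dist_a, via_dist_b)             # 2000 22000
```

If SW3 were removed, SW1 and SW2 would tie on priority and the lower MAC address (SW1) would become root, which is why leaving priorities at the default makes the outcome depend on hardware addresses.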
BPDUs: How Switches Talk Spanning Tree
Switches exchange information using Bridge Protocol Data Units (BPDUs) — small frames sent every 2 seconds (default Hello time) to the multicast address 01:80:C2:00:00:00. Two main types exist:
- Configuration BPDUs — carry root ID, sender BID, root path cost, and timers. Used for elections.
- Topology Change Notification (TCN) BPDUs — sent toward the root when a link goes up/down, telling the root to flush MAC tables network-wide.
Port Roles and States
Classic STP defines five port states; RSTP collapses them to three. AOS-CX uses RSTP/MSTP semantics by default.
| State (RSTP/MSTP) | Forwards Data? | Learns MACs? | Sends/receives BPDUs? |
|---|---|---|---|
| Discarding (replaces Disabled/Blocking/Listening) | No | No | Yes |
| Learning | No | Yes | Yes |
| Forwarding | Yes | Yes | Yes |
Figure 4.2: RSTP/MSTP port state transitions
stateDiagram-v2
[*] --> Discarding: Port enabled
Discarding --> Learning: Forward delay<br/>or proposal/agreement
Learning --> Forwarding: Forward delay<br/>expires
Forwarding --> Discarding: Link down<br/>or role change
Learning --> Discarding: Role change to<br/>alternate/backup
Discarding --> Forwarding: Edge port<br/>(skip listen/learn)
Forwarding --> [*]: Port disabled
note right of Discarding
No data forwarding
No MAC learning
BPDUs sent/received
end note
note right of Forwarding
Full data forwarding
MAC learning active
BPDUs sent/received
end note
Port roles in RSTP/MSTP:
- Root — best path to the root bridge (one per non-root switch).
- Designated — best path on a segment, sending BPDUs out to that segment.
- Alternate — backup root port; goes Forwarding fast if the root port fails.
- Backup — backup designated port on a shared segment (rare on point-to-point links).
RSTP Convergence
Classic 802.1D STP took 30–50 seconds to converge — far too slow for modern networks. Rapid Spanning Tree Protocol (802.1w) introduces three key improvements:
- Proposal/Agreement handshake on point-to-point links — replaces the listening/learning timer with a sub-second negotiation between two switches.
- Edge ports — ports declared edge (toward hosts) skip listening/learning and go straight to forwarding.
- Alternate and backup ports — pre-computed, so failover is near-instant when a root port fails.
Result: convergence in under a second for most failures, instead of half a minute.
MSTP Regions and Instances
Per-VLAN spanning tree (PVST+) runs one tree instance per VLAN, which doesn’t scale — 4,000 VLANs means 4,000 BPDU streams. Multiple Spanning Tree Protocol (802.1s/MSTP) solves this by grouping VLANs into a small number of Multiple Spanning Tree Instances (MSTIs). Typically you have one MSTI for VLANs that should follow path A and another for VLANs that should follow path B, achieving load balancing without the BPDU overhead of PVST+.
A group of switches with identical region configuration (config-name + revision + VLAN-to-instance map) forms an MST region. To the outside world, the entire region looks like a single bridge in the Common and Internal Spanning Tree (CIST), also called Instance 0. Inside the region, MSTIs run independently, optimizing forwarding for their VLAN groups.
graph TD
subgraph CIST [CIST / Instance 0 - VLANs 1, 99 default]
Root[Root Bridge<br/>DIST-A priority 0]
end
subgraph MSTI1 [MSTI 1 - VLANs 10-20]
Root1[Root: DIST-A priority 4096]
end
subgraph MSTI2 [MSTI 2 - VLANs 30-40]
Root2[Root: DIST-B priority 4096]
end
[Source: https://www.sikich.com/insight/understanding-stp-options-in-aruba-cx-switches/] [Source: https://support.hpe.com/hpesc/public/docDisplay?docId=sf000098115en_us&docLocale=en_US]
Default AOS-CX Behavior
A critical fact: spanning tree is disabled by default on AOS-CX switches. Engineers coming from platforms that enable it out of the box (Cisco Catalyst, for example, runs PVST+ by default) are often caught out by this. If you cable two AOS-CX switches together with redundant links and forget to enable spanning tree, you will create an instant Layer 2 loop. Always enable spanning tree before adding redundant uplinks, and confirm the state with show spanning-tree on a factory-default unit, since defaults can vary by platform and release.
When you do enable it, AOS-CX defaults to MSTP mode. RPVST+ (per-VLAN rapid STP) is also available for interop with Cisco environments.
[Source: https://www.sikich.com/insight/understanding-stp-options-in-aruba-cx-switches/]
Key Takeaways — Section 1
- Spanning tree elects a single root bridge (lowest Bridge ID wins) and computes a loop-free tree by blocking redundant ports.
- BPDUs carry root ID, sender BID, and path cost; TCNs notify the root of topology changes so MAC tables can be flushed.
- RSTP collapses states to Discarding/Learning/Forwarding and converges in under a second using proposal/agreement handshakes.
- MSTP groups VLANs into instances within a region, scaling beyond PVST+ while still allowing load balancing.
- AOS-CX has spanning tree disabled by default — you must explicitly enable it before deploying redundant Layer 2 links.
Section 2: MSTP Configuration on AOS-CX
Enabling MSTP and the Region
Three settings define an MST region: config-name (a string of up to 32 characters), config-revision (an integer 0–65535), and the VLAN-to-instance mapping. All switches in the region must agree on all three. The mapping is hashed into a 16-byte config digest, and each BPDU carries the name, revision, and digest together; if two switches disagree on any of them, they treat each other as separate regions and only the CIST runs between them.
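To make the region check concrete, here is a minimal Python sketch of how two switches could decide whether they share a region. It assumes the IEEE 802.1Q scheme, in which the digest is an HMAC-MD5 over a 4096-entry table of instance IDs while name and revision are compared alongside it. The key constant is believed to match the one published in the standard, but treat it as an assumption and check the specification; the function names are illustrative.

```python
import hashlib
import hmac

# Assumption: signature key as published in IEEE 802.1Q for the MST digest.
DIGEST_KEY = bytes.fromhex("13AC06A62E47FD51F95D2BA243CD0346")


def config_digest(vlan_to_instance: dict) -> str:
    """HMAC-MD5 over 4096 two-byte instance IDs, indexed by VLAN ID."""
    table = bytearray()
    for vid in range(4096):
        # VLAN IDs 0 and 4095 are reserved and always map to instance 0.
        msti = 0 if vid in (0, 4095) else vlan_to_instance.get(vid, 0)
        table += msti.to_bytes(2, "big")
    return hmac.new(DIGEST_KEY, bytes(table), hashlib.md5).hexdigest()


def same_region(a: tuple, b: tuple) -> bool:
    """Each argument is (config_name, revision, vlan_to_instance)."""
    name_a, rev_a, map_a = a
    name_b, rev_b, map_b = b
    return (name_a, rev_a, config_digest(map_a)) == (
        name_b, rev_b, config_digest(map_b))


mapping = {vlan: 1 for vlan in range(10, 21)}
print(same_region(("CAMPUS-CORE", 1, mapping), ("CAMPUS-CORE", 1, mapping)))
print(same_region(("CAMPUS-CORE", 1, mapping), ("campus-core", 1, mapping)))
```

The second comparison fails only because of letter case in the name, which is the kind of mismatch that is easy to miss when typing the region settings by hand.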
Begin by enabling spanning tree, choosing MSTP mode, and naming the region:
switch(config)# spanning-tree
switch(config)# spanning-tree mode mstp
switch(config)# spanning-tree config-name CAMPUS-CORE
switch(config)# spanning-tree config-revision 1
The first command turns spanning tree on globally. The mode line locks it to MSTP. The config-name and revision must match on every switch you want to share a region with — typo-sensitive, case-sensitive.
VLAN-to-Instance Mapping
By default, all VLANs map to Instance 0 (CIST/IST). Create additional instances and explicitly map VLANs to balance load:
switch(config)# spanning-tree instance 1 vlan 10-20,30
switch(config)# spanning-tree instance 2 vlan 40-50,99
Any VLAN not explicitly mapped stays in Instance 0. A common design pattern uses two instances and alternates VLANs so that DIST-A roots one and DIST-B roots the other, splitting traffic across both uplinks rather than wasting a blocked path.
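The mapping rule can be modelled in a few lines of Python. This is only a sketch of the lookup logic, using the two VLAN lists from the commands above; the helper names are made up for illustration.

```python
def parse_vlans(spec: str) -> set:
    """Expand a VLAN list such as '10-20,30' into a set of VLAN IDs."""
    vlans = set()
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            vlans.update(range(int(low), int(high) + 1))
        else:
            vlans.add(int(part))
    return vlans


# Mappings taken from the two spanning-tree instance commands.
instance_vlans = {1: parse_vlans("10-20,30"), 2: parse_vlans("40-50,99")}


def instance_for(vlan: int) -> int:
    """Return the MSTI carrying a VLAN; unmapped VLANs stay in instance 0."""
    for instance, vlans in instance_vlans.items():
        if vlan in vlans:
            return instance
    return 0


print(instance_for(15), instance_for(99), instance_for(200))  # 1 2 0
```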
Tuning Bridge Priority
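Priority is set per instance and moves in steps of 4096 (the lower 12 bits of the priority field carry the instance number). Lower wins. On AOS-CX the priority keyword takes a multiplier from 0 to 15 rather than the raw value, so 1 means 4096 and the default of 8 means 32768; the CIST is set with the global spanning-tree priority command. Confirm the accepted range with ? on your release.
switch(config)# spanning-tree priority 0 ! CIST: bridge priority 0
switch(config)# spanning-tree instance 1 priority 1 ! MSTI 1: bridge priority 4096
switch(config)# spanning-tree instance 2 priority 2 ! MSTI 2: bridge priority 8192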
Reserve priority 0 for the intended root and priority 4096 for the secondary root in each instance. If every switch is left at the default 32768, the election falls to the lowest MAC address, which has nothing to do with where a switch sits in the topology, and adding or replacing hardware can move the root without warning.
Analogy: Setting bridge priority is like assigning seat numbers at a wedding. If you leave it to chance, the loudest cousin ends up at the head table. Set explicit priorities and the right people sit where you want them.
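The election itself is a plain numeric comparison. The Python sketch below builds a comparable bridge ID from the priority value, the instance number, and the MAC address, then picks the lowest. The switch names and MAC addresses are hypothetical; only the priority values follow the text.

```python
def bridge_id(priority: int, instance: int, mac: str) -> tuple:
    """Build a comparable bridge ID: 16-bit priority field, then the MAC."""
    if priority % 4096 != 0 or not 0 <= priority <= 61440:
        raise ValueError("priority must be a multiple of 4096 up to 61440")
    # The low 12 bits of the priority field carry the instance number.
    field = priority + instance
    return (field, int(mac.replace(":", ""), 16))


def elect_root(bridges: dict, instance: int) -> str:
    """Return the name of the bridge with the lowest bridge ID."""
    return min(
        bridges,
        key=lambda name: bridge_id(bridges[name][0], instance, bridges[name][1]),
    )


# Hypothetical addresses from the documentation MAC range.
defaults = {"DIST-A": (32768, "00:00:5e:00:53:20"),
            "ACCESS-7": (32768, "00:00:5e:00:53:01")}
tuned = {"DIST-A": (0, "00:00:5e:00:53:20"),
         "ACCESS-7": (32768, "00:00:5e:00:53:01")}

print(elect_root(defaults, 1))  # ACCESS-7
print(elect_root(tuned, 1))     # DIST-A
```

With default priorities the access switch wins purely because its MAC is lower; once DIST-A is set to priority 0, the MAC address no longer matters.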
Root and Secondary Root — A Two-Switch Example
Consider a campus core with DIST-A and DIST-B as redundant aggregation switches, both members of the CAMPUS-CORE region. We want DIST-A to root MSTI 1 (VLANs 10–20) and DIST-B to root MSTI 2 (VLANs 40–50).
DIST-A configuration:
spanning-tree
spanning-tree mode mstp
spanning-tree config-name CAMPUS-CORE
spanning-tree config-revision 1
spanning-tree instance 1 vlan 10-20
spanning-tree instance 2 vlan 40-50
spanning-tree instance 1 priority 0 ! root for MSTI 1
spanning-tree instance 2 priority 1 ! secondary for MSTI 2 (multiplier: 1 x 4096 = 4096)
DIST-B configuration:
spanning-tree
spanning-tree mode mstp
spanning-tree config-name CAMPUS-CORE
spanning-tree config-revision 1
spanning-tree instance 1 vlan 10-20
spanning-tree instance 2 vlan 40-50
spanning-tree instance 1 priority 1 ! secondary for MSTI 1 (multiplier: 1 x 4096 = 4096)
spanning-tree instance 2 priority 0 ! root for MSTI 2
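Access switches join the region only when they are configured with the same config-name, revision, and VLAN-to-instance mappings; nothing is inherited from the distribution pair. Once they match, their uplinks to DIST-A and DIST-B take root or alternate roles per instance: VLANs 10–20 forward via DIST-A, VLANs 40–50 forward via DIST-B, and if either fails, the other takes over in under a second.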
Per-Port Cost and Priority
You can override the automatic path cost on a specific port to influence which uplink wins:
switch(config)# interface 1/1/49
switch(config-if)# spanning-tree instance 1 cost 1000
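switch(config-if)# spanning-tree instance 1 port-priority 1
Use this sparingly. Most designs work fine with bandwidth-derived costs and bridge priority alone. Note that AOS-CX takes port-priority as a multiplier from 0 to 15 (steps of 16, default 8), so 1 here yields a port priority of 16; confirm the accepted range with ? on your release.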
Verification
switch# show spanning-tree
switch# show spanning-tree mst-config
switch# show spanning-tree mst 1
switch# show spanning-tree detail
show spanning-tree mst-config echoes the config-name, revision, and digest, which makes it the easiest way to spot a region mismatch. show spanning-tree mst 1 lists port roles, states, costs, and the elected root for that instance.
[Source: https://www.scribd.com/document/616571061/AOS-CX-Simulator-MSTP-Lab-Guide] [Source: https://www.youtube.com/watch?v=l195crT_Rxc]
Key Takeaways — Section 2
- An MST region is defined by config-name, config-revision, and VLAN-to-instance mapping; a mismatch in any of the three (a mapping difference shows up as a different digest) splits the region.
- VLANs not mapped to a numbered MSTI live in Instance 0 (CIST).
- Priority is set per instance, must be a multiple of 4096, and lower wins; use 0 for the intended root and 4096 for secondary.
- Splitting VLANs across two MSTIs with opposing roots achieves active/active load balancing across redundant uplinks.
Section 3: STP Edge Protection Features
Spanning tree by itself protects against most loops, but the protocol assumes every switch in the network plays fair. In real deployments, three threats break that assumption:
- Rogue switches plugged into user ports — could win the root election and re-shape your topology.
- Unidirectional links — an SFP that transmits but doesn’t receive looks healthy to spanning tree but blackholes BPDUs, fooling the partner into forwarding both ways.
- End-user mistakes — a desktop bridged to two access ports, or a cable patched between two wall jacks.
AOS-CX provides four guard features to address each threat. Each goes in a specific place; using them in the wrong location either does nothing useful or causes outages.
Admin-Edge-Port and BPDU Guard
An edge port is a port connected to a host, not another switch. Declaring a port edge makes RSTP skip the listening/learning delay and transition straight to forwarding. This is what eliminates the dreaded 30-second delay laptops used to face when DHCP timed out at boot.
But an edge declaration is a promise that no switch will ever be plugged in here. BPDU guard (bpdu-guard on AOS-CX; the older ArubaOS-Switch CLI calls it bpdu-protection) enforces that promise. If any BPDU arrives on an edge port, BPDU guard immediately err-disables the port (state: BpduError). This protects against rogue switches and accidental loop-back patches.
switch(config)# interface 1/1/1
switch(config-if)# spanning-tree bpdu-guard
switch(config-if)# spanning-tree port-type admin-edge
The older ArubaOS-Switch (ProVision) CLI names the same feature bpdu-protection, which is why both terms appear in study material; on AOS-CX, confirm what was accepted with show running-config interface 1/1/1. Either way, the behavior is identical: the moment a BPDU arrives, the port goes down.
Figure 4.3: BPDU guard activation flow on an edge port
flowchart TD
Start([Frame arrives on edge port]) --> Check{Frame is<br/>a BPDU?}
Check -->|No - normal data frame| Forward[Forward normally<br/>per VLAN/MAC table]
Check -->|Yes - BPDU detected| Guard{BPDU guard<br/>enabled?}
Guard -->|No| Process[Process BPDU<br/>Port may transition role]
Guard -->|Yes| Disable[Err-disable port<br/>State: BpduError]
Disable --> Log[Log event:<br/>BPDU received on edge port]
Log --> Notify[SNMP trap<br/>generated]
Notify --> Wait{Auto-recovery<br/>configured?}
Wait -->|No| Manual[Port stays down until<br/>admin runs 'no shutdown']
Wait -->|Yes| Timer[Wait recovery timer<br/>then re-enable port]
Forward --> End([Done])
Process --> End
Manual --> End
Timer --> End
style Disable fill:#8b0000,color:#fff
style Log fill:#8b0000,color:#fff
[Source: https://www.flomain.de/2018/01/protect-from-spanning-tree-loops-access-area/]
Recovery: A BpduError port stays disabled until you manually re-enable it (typically by bouncing it with shutdown followed by no shutdown under interface 1/1/1) or until the auto-recovery timer expires, if one is configured. Until then, the user has to open a help-desk ticket, which is the right outcome.
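The guard and recovery behavior in Figure 4.3 can be expressed as a small state machine. The Python class below is a toy model, not switch code: the class name, method names, and the 300-second timer are assumptions chosen for illustration.

```python
from typing import Optional


class EdgePort:
    """Toy model of an admin-edge port with BPDU guard enabled."""

    def __init__(self, recovery_seconds: Optional[int] = None):
        # None means no auto-recovery: an administrator must step in.
        self.recovery_seconds = recovery_seconds
        self.state = "Forwarding"
        self.disabled_at: Optional[int] = None

    def receive_bpdu(self, now: int) -> str:
        """Any BPDU on a guarded edge port disables it immediately."""
        if self.state == "Forwarding":
            self.state = "BpduError"
            self.disabled_at = now
        return self.state

    def tick(self, now: int) -> str:
        """Re-enable the port once the auto-recovery timer has run out."""
        if (self.state == "BpduError"
                and self.recovery_seconds is not None
                and now - self.disabled_at >= self.recovery_seconds):
            self.state = "Forwarding"
        return self.state

    def admin_reenable(self) -> str:
        """Manual recovery by an administrator."""
        self.state = "Forwarding"
        return self.state


manual = EdgePort()
manual.receive_bpdu(now=0)
print(manual.tick(now=3600))                   # BpduError

auto = EdgePort(recovery_seconds=300)
auto.receive_bpdu(now=0)
print(auto.tick(now=299), auto.tick(now=300))  # BpduError Forwarding
```

If the rogue device is still attached when the timer expires, the next BPDU simply trips the guard again, so auto-recovery is a convenience rather than a fix.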
Root Guard
Root guard prevents a downstream switch from becoming the root bridge through your port. It’s typically applied on aggregation/core downlinks toward access switches. If the port receives a BPDU advertising a better root than the current one, the port transitions to root-inconsistent and stops forwarding until the offending BPDUs disappear.
switch(config)# interface 1/1/47
switch(config-if)# spanning-tree root-guard
Use root guard on every downlink from an aggregation/core switch to an access switch. Never use it on inter-core links, where you want the actual root election to happen freely.
Analogy: Root guard is like a velvet rope at a club. The bouncer (the core switch) doesn’t kick anyone out — but if a customer claims to be the owner, the bouncer pretends not to hear them.
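In code terms, root guard is a single comparison made each time a BPDU arrives on the protected port. The sketch below assumes a root is identified by a (priority, MAC) pair and uses hypothetical addresses; it models the decision only, not the full protocol.

```python
def as_key(priority: int, mac: str) -> tuple:
    """Turn a (priority, MAC) pair into something that sorts correctly."""
    return (priority, int(mac.replace(":", ""), 16))


def root_guard_state(current_root: tuple, advertised_root: tuple) -> str:
    """State of a root-guarded port after it hears a BPDU."""
    # A lower key is a better (superior) root claim.
    if as_key(*advertised_root) < as_key(*current_root):
        return "root-inconsistent"
    return "forwarding"


current = (0, "00:00:5e:00:53:20")                               # intended root
print(root_guard_state(current, (32768, "00:00:5e:00:53:01")))   # forwarding
print(root_guard_state(current, (0, "00:00:5e:00:53:01")))       # root-inconsistent
```

The second call shows why priority 0 alone is not a complete defense: a rogue switch at priority 0 with a lower MAC still presents a superior claim, and root guard is what stops it.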
Loop Guard
Loop guard protects against unidirectional link failures. If a blocked (alternate) port stops receiving BPDUs because the fiber broke in one direction or the partner switch hung, spanning tree ages out the stored information, concludes that no other bridge is on the segment, and moves the port to designated/forwarding. Because the link still carries traffic one way, that creates a loop. Loop guard instead places the port in loop-inconsistent state, blocking until BPDUs return.
switch(config)# interface 1/1/48
switch(config-if)# spanning-tree loop-guard
Apply loop guard to access-to-aggregation uplinks — places where you depend on receiving BPDUs from upstream to know the topology is healthy.
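A minimal Python sketch of the decision loop guard changes, assuming RSTP/MSTP behavior where stored BPDU information expires after three missed hellos (2-second default). The state names are descriptive labels, not CLI output.

```python
HELLO_SECONDS = 2


def blocked_port_state(seconds_since_last_bpdu: float, loop_guard: bool) -> str:
    """What happens to an alternate port once BPDUs stop arriving."""
    timeout = 3 * HELLO_SECONDS
    if seconds_since_last_bpdu < timeout:
        # BPDUs are still fresh: the port keeps blocking as usual.
        return "alternate (discarding)"
    if loop_guard:
        return "loop-inconsistent (blocking)"
    # Without loop guard the port assumes the segment is empty and forwards.
    return "designated (forwarding)"


print(blocked_port_state(1, loop_guard=False))   # alternate (discarding)
print(blocked_port_state(10, loop_guard=False))  # designated (forwarding)
print(blocked_port_state(10, loop_guard=True))   # loop-inconsistent (blocking)
```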
Loop-Protect (Aruba-specific)
loop-protect is an Aruba enhancement that operates independently of spanning tree. It periodically sends a small loop-detection frame out the port and listens for it on any other port. If the same frame returns, a loop exists and the port is disabled. This is invaluable for protecting against unmanaged switches or end-user cable loops where standard BPDUs aren’t exchanged.
switch(config)# interface 1/1/1
switch(config-if)# loop-protect
Use loop-protect on access ports alongside BPDU guard; together they catch BPDU-emitting rogues (BPDU guard) and BPDU-silent loops (loop-protect).
[Source: https://www.flomain.de/2018/01/protect-from-spanning-tree-loops-access-area/]
Topology Change Notification (TCN) Behavior
In classic 802.1D, when a port transitions to forwarding (or down from forwarding) on a non-edge link, the local switch sends a TCN BPDU toward the root. The root then sets the TC flag in its outbound BPDUs for MaxAge + ForwardDelay (20 + 15 = 35 seconds by default), which causes every switch in the tree to shorten its MAC aging time to ForwardDelay seconds. RSTP and MSTP speed this up: the switch that detects the change floods TC-flagged BPDUs itself, and receivers flush the affected MAC entries immediately. Either way, hosts get re-learned quickly after a topology change.
Excessive TCNs are a symptom of flapping links or unstable end-host ports. Edge ports do not generate TCNs, which is another reason to declare host ports as admin-edge. Without that declaration, every laptop that connects or disconnects can set off a topology change that flushes MAC tables across the campus.
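The classic 802.1D timer arithmetic is easy to check. The sketch below uses the standard default timers; the 300-second normal MAC aging time is an assumption (a common default) and may differ on your platform.

```python
MAX_AGE = 20
FORWARD_DELAY = 15
DEFAULT_MAC_AGING = 300  # assumption: common default aging time in seconds


def tc_window() -> int:
    """Seconds the root keeps the TC flag set in classic 802.1D."""
    return MAX_AGE + FORWARD_DELAY


def mac_aging(tc_active: bool) -> int:
    """MAC aging time in effect while a topology change is signalled."""
    return FORWARD_DELAY if tc_active else DEFAULT_MAC_AGING


print(tc_window())       # 35
print(mac_aging(True))   # 15
print(mac_aging(False))  # 300
```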
Filtering vs Guarding — A Comparison
It’s worth distinguishing guarding (detect-and-react) from filtering (drop silently).
| Feature | Trigger Condition | Action on Trigger | Where to Apply | Re-enable Method |
|---|---|---|---|---|
| Admin-edge-port | Always (declaration) | Skip listen/learn → forwarding | Host-facing access ports | N/A |
| BPDU Guard | Any BPDU received on edge port | Err-disable (BpduError) | Host-facing access ports | Manual or auto-recovery |
| BPDU Filter | Any BPDU received | Drop silently, send no BPDUs out | Rare; very specific cases | Remove config |
| Root Guard | Superior BPDU received on protected port | Root-inconsistent (block) | Core/aggregation downlinks | Auto when superior BPDU stops |
| Loop Guard | BPDUs stop arriving on non-edge port | Loop-inconsistent (block) | Access uplinks | Auto when BPDUs resume |
| Loop-Protect | Loop-detect frame returns | Disable port | Access endpoint ports | Manual or auto-recovery |
BPDU filter silently drops BPDUs without disabling the port — dangerous because it makes a switch invisible to spanning tree. Use it only in well-understood lab or carrier scenarios; never confuse it with BPDU guard.
Best-Practice Placement Diagram
graph TD
Core[Core / Distribution]
Agg[Aggregation Switch]
Acc[Access Switch]
Host[End Host / Phone]
Core -- "root-guard" --> Agg
Agg -- "root-guard" --> Acc
Acc -- "loop-guard (uplink)" --> Agg
Acc -- "admin-edge + bpdu-guard + loop-protect" --> Host
Translation:
- Core/aggregation downlinks → root-guard, no edge declarations.
- Access uplinks → loop-guard, no BPDU guard (you want BPDUs here).
- Access host ports → admin-edge + BPDU guard + loop-protect.
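The placement rules lend themselves to a lookup table. The Python sketch below renders the interface commands used in this section for a given port role; the role names and interface numbers are illustrative labels, not AOS-CX keywords.

```python
# Guard commands per port role, as used in this section.
GUARDS = {
    "aggregation-downlink": ["spanning-tree root-guard"],
    "access-uplink": ["spanning-tree loop-guard"],
    "host": [
        "spanning-tree port-type admin-edge",
        "spanning-tree bpdu-guard",
        "loop-protect",
    ],
}


def render(interface: str, role: str) -> list:
    """Return the config lines for one interface in the given role."""
    return [f"interface {interface}"] + [f"    {cmd}" for cmd in GUARDS[role]]


for line in render("1/1/1", "host") + render("1/1/48", "access-uplink"):
    print(line)
```

Keeping the role-to-guard mapping in one place makes it harder to put a guard on the wrong kind of port, which is the failure mode this section warns about.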
Key Takeaways — Section 3
- Declare host ports as admin-edge so they go forwarding immediately and don’t trigger TCNs; pair with BPDU guard to enforce the no-switch promise.
- Root guard belongs on core/aggregation downlinks toward access; it stops a downstream switch from becoming root.
- Loop guard belongs on access uplinks; it blocks the port if BPDUs vanish on a non-edge link, preventing unidirectional-link loops.
- Loop-protect is an Aruba access-layer feature that catches BPDU-silent loops via probe frames.
- BPDU filter and BPDU guard are different — guard disables the port on a BPDU; filter silently drops them. Use guard, not filter, in nearly all designs.
Section 4: Troubleshooting Spanning Tree
Spanning tree problems usually present as one of three symptoms: the wrong switch is root, a port is unexpectedly blocked or err-disabled, or users see periodic 1–10 second outages matching topology changes. The trick is reading the right show command and recognizing the pattern.
The Essential Show Commands
switch# show spanning-tree
switch# show spanning-tree mst 1
switch# show spanning-tree mst-config
switch# show spanning-tree detail
switch# show spanning-tree interface 1/1/49
show spanning-tree gives you the global mode, the elected root for the CIST, and a per-port summary. show spanning-tree mst 1 zooms into a specific instance. show spanning-tree mst-config displays config-name, revision, and digest — the first thing to compare across switches when a region won’t form. show spanning-tree detail is the verbose view, including TC counters, last TC time, and timer values.
Identifying Unstable Roots
A common failure mode is the root flapping between two switches. Symptoms:
- show spanning-tree shows different “We are the root” answers depending on which switch you query.
- show spanning-tree detail shows TC counters incrementing every few seconds.
- Logs contain repeated “topology change detected” or “root changed” messages.
Figure 4.4: Unstable root diagnosis decision tree
flowchart TD
Start([Root flapping reported]) --> Cmd1[Run: show spanning-tree mst id<br/>on each switch]
Cmd1 --> Pri{Bridge priorities<br/>both default 32768?}
Pri -->|Yes| Fix1[Set priority 0 on intended root<br/>priority 4096 on secondary]
Pri -->|No| Tied{Priorities tied<br/>at non-default?}
Tied -->|Yes| Fix2[Enforce hierarchy:<br/>differentiate priorities]
Tied -->|No| Region[Run: show spanning-tree mst-config]
Region --> Digest{Digest matches<br/>across switches?}
Digest -->|No| Fix3[Align config-name, revision,<br/>and VLAN-to-instance map]
Digest -->|Yes| Link[Run: show interface brief<br/>on uplinks]
Link --> Flap{Link transitions<br/>recent/frequent?}
Flap -->|Yes| Fix4[Replace SFP/cable<br/>or shut flapping port]
Flap -->|No| Deep[Run: show spanning-tree detail<br/>inspect TC counters and timers]
Fix1 --> Done([Root stable])
Fix2 --> Done
Fix3 --> Done
Fix4 --> Done
Deep --> Done
style Fix1 fill:#1f6feb,color:#fff
style Fix2 fill:#1f6feb,color:#fff
style Fix3 fill:#1f6feb,color:#fff
style Fix4 fill:#1f6feb,color:#fff
Diagnosis path:
- Compare bridge priorities. Run show spanning-tree mst <id> on each switch. If two are tied at the default 32768, the lower MAC wins, and the root moves whenever that switch reboots or is replaced. Fix: explicitly set priority 0 on the intended root and priority 4096 on the secondary.
- Check MAC tiebreakers when priorities match. If you’ve configured both switches to priority 0 by accident, the lower-MAC switch wins whether or not it is the one you intended. Give the two switches different priorities to enforce the hierarchy.
- Inspect for region mismatches. Run show spanning-tree mst-config on each switch. If the digest differs, switches see each other as separate regions and only the CIST runs between them, which can cause unexpected root choices.
- Look for flapping uplinks. Run show interface 1/1/49 brief and watch for link transitions; a flapping uplink generates TCNs and can shift root path costs.
[Source: https://www.youtube.com/watch?v=duCLjs99_qw]
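The decision tree in Figure 4.4 can be written as a short function. This is a sketch of the triage order only; the inputs are assumed to have been gathered by hand from the show commands, and the parameter names are hypothetical.

```python
DEFAULT_PRIORITY = 32768


def diagnose(priorities: list, digests: list, uplink_flapping: bool) -> str:
    """Walk the unstable-root checks in the same order as the figure."""
    if all(p == DEFAULT_PRIORITY for p in priorities):
        return "set priority 0 on the intended root and 4096 on the secondary"
    best = min(priorities)
    if best != DEFAULT_PRIORITY and priorities.count(best) > 1:
        return "differentiate the tied priorities"
    if len(set(digests)) > 1:
        return "align config-name, revision, and VLAN-to-instance map"
    if uplink_flapping:
        return "replace the SFP/cable or shut the flapping port"
    return "inspect show spanning-tree detail for TC counters and timers"


print(diagnose([32768, 32768], ["d1", "d1"], False))
print(diagnose([0, 0], ["d1", "d1"], False))
print(diagnose([0, 4096], ["d1", "d2"], False))
print(diagnose([0, 4096], ["d1", "d1"], True))
```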
Reading Inconsistency States
| Inconsistency Marker | Meaning | Likely Cause |
|---|---|---|
| BpduError | BPDU guard triggered on an edge port. | Switch plugged into a host port. |
| Root Inconsistent | Root guard triggered — superior BPDU received from a downstream port. | Misconfigured downstream switch with low priority. |
| Loop Inconsistent | Loop guard triggered — non-edge port stopped receiving BPDUs. | Unidirectional fiber, distant switch crashed. |
| Disputed | Conflicting BPDU information on a designated port. | Two switches each claim designated role. |
Example diagnosis: A user calls about no network on port 1/1/3.
switch# show spanning-tree interface 1/1/3
Port 1/1/3
Status: BpduError Role: Disabled Cost: 20000
BPDU guard: enabled
The port is err-disabled because BPDU guard fired. Either the user plugged in a switch (rogue device — investigate) or there was a brief loop. After verifying there’s no rogue switch, re-enable:
switch(config)# interface 1/1/3
switch(config-if)# shutdown
switch(config-if)# no shutdown
Storm Control and Broadcast Suppression
Even with spanning tree healthy, broadcast/multicast storms from misbehaving hosts can hammer a network. AOS-CX provides storm control through per-interface rate limits on broadcast, multicast, or unknown-unicast traffic. On AOS-CX this is configured with the interface-level rate-limit command rather than the Cisco-style storm-control keyword that appears in some study material; confirm keywords and units with ? on your release:
switch(config)# interface 1/1/1
switch(config-if)# rate-limit broadcast 1000 pps
switch(config-if)# rate-limit multicast 1000 pps
switch(config-if)# rate-limit unknown-unicast 500 pps
Limits are expressed in kilobits per second (kbps) or packets per second (pps); whether a percentage form is available depends on platform and release. When the threshold is exceeded, excess traffic is dropped; some platforms can also err-disable the port.
Storm control is a defense-in-depth complement to spanning tree, not a replacement. Spanning tree prevents loops; storm control limits the blast radius of bursty hosts.
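To see what a packets-per-second limit does to a burst, here is a deliberately simplified Python model. Real hardware typically uses token buckets with finer granularity; this sketch just caps each one-second interval, and the traffic figures are made up.

```python
def apply_pps_limit(arrivals_per_second: list, limit_pps: int) -> list:
    """Return (forwarded, dropped) packet counts for each one-second interval."""
    return [
        (min(arrived, limit_pps), max(arrived - limit_pps, 0))
        for arrived in arrivals_per_second
    ]


# Quiet second, broadcast burst, then back to normal.
print(apply_pps_limit([200, 5000, 900], 1000))
# [(200, 0), (1000, 4000), (900, 0)]
```

Normal traffic passes untouched and only the excess during the burst is dropped, which is why a rate limit contains damage without preventing the underlying loop.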
Common Loop Patterns
| Pattern | Symptom | Fix |
|---|---|---|
| Two access switches cabled together with STP off | Broadcast storm; CPU at 100%; entire VLAN down | Enable spanning tree; consider MC-LAG instead. |
| User patches two wall jacks together | Rapid link-up/link-down; access switch CPU high | BPDU guard on host ports + loop-protect. |
| Misconfigured MSTP region (digest mismatch) | Switches form their own regions; unexpected blocking | Match config-name, revision, and VLAN-to-instance map exactly. |
| Rogue switch plugged into user port | Root re-election; users on other VLANs see drops | Root guard on aggregation downlinks; BPDU guard on host ports. |
| Unidirectional fiber | Slow drift; periodic outages | Loop guard on access uplinks; UDLD where supported. |
| Forgot to declare host ports as edge | Every laptop disconnect causes campus-wide MAC flush | Set spanning-tree port-type admin-edge on host ports. |
Debug Commands
For real-time visibility into BPDU exchange and TC events:
switch# debug spanning-tree mstp all
switch# debug spanning-tree events
switch# show log | include spanning-tree
Use debug sparingly — output volume on a live core switch can cripple the CPU. Prefer show spanning-tree detail and log filtering for non-emergency diagnosis.
Walkthrough: An Outage Post-mortem
A campus reports that VLAN 30 (printers) loses connectivity for ~3 seconds every 4 minutes. Other VLANs are fine. Steps:
- show spanning-tree mst 2 (VLAN 30 is mapped to MSTI 2). The root is DIST-B as expected, but the topology change counter is incrementing every 4 minutes.
- show spanning-tree detail | include topology confirms TCs at regular intervals.
- show log | include spanning-tree reveals: Topology change received on interface 1/1/24, MSTI 2.
- show interface 1/1/24 shows a recent last-state-change: the link is flapping.
- show interface 1/1/24 transceiver reveals an SFP at 7% optical receive level, a failing optic.
Replace the SFP. TCs stop. Users stop calling. Lesson: a flapping link in MSTI 2 generated TCs that flushed the MSTI 2 MAC table — including printers, which cache poorly — even though the actual root never moved.
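Spotting the regular rhythm is what cracked this case, and it can be checked mechanically. The Python sketch below takes topology-change timestamps (in seconds, assumed to have been extracted from logs by hand) and tests whether the gaps between them are nearly constant. The sample values and the 10% tolerance are illustrative.

```python
from statistics import mean, pstdev


def tc_intervals(timestamps: list) -> list:
    """Gaps in seconds between consecutive topology changes."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def looks_periodic(timestamps: list, tolerance: float = 0.1) -> bool:
    """True when the gaps vary by no more than `tolerance` of their mean."""
    gaps = tc_intervals(timestamps)
    if len(gaps) < 2:
        return False
    return pstdev(gaps) <= tolerance * mean(gaps)


flapping_link = [0, 240, 481, 720, 962]   # roughly every 4 minutes
random_events = [0, 30, 400, 410, 2000]

print(tc_intervals(flapping_link))   # [240, 241, 239, 242]
print(looks_periodic(flapping_link)) # True
print(looks_periodic(random_events)) # False
```

A steady period points toward a single misbehaving component, such as a marginal optic, rather than user activity, which tends to be irregular.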
Key Takeaways — Section 4
- show spanning-tree, show spanning-tree mst <id>, and show spanning-tree mst-config are your first three commands for any STP issue.
- Inconsistency markers (BpduError, Root Inconsistent, Loop Inconsistent) point directly to which guard tripped and why.
- An unstable root almost always traces to default priorities, region digest mismatches, or flapping uplinks — set explicit priorities and verify config-name/revision.
- Storm control is the right complement to spanning tree for taming broadcast bursts; it doesn’t prevent loops, only the damage they cause.
- Debug commands work but are heavy; use them only when show commands aren’t enough.
Chapter Summary
Spanning tree is the protocol that lets Layer 2 networks have redundancy without self-destructing. On AOS-CX, MSTP is the default mode when you turn spanning tree on — and you must turn it on, because AOS-CX ships with spanning tree disabled. Configure a region by matching config-name, config-revision, and VLAN-to-instance mapping across all participating switches; mismatches in any of those split the region.
Tune bridge priorities explicitly: priority 0 for the intended root, 4096 for the secondary, in multiples of 4096 thereafter. Use multiple instances to distribute VLANs across redundant uplinks for active/active load balancing.
Harden the topology with the right guard in the right place: admin-edge + BPDU guard + loop-protect on host ports, root guard on aggregation downlinks, loop guard on access uplinks. Each guard solves a specific failure mode; placing them incorrectly either does nothing or causes outages.
When troubleshooting, lead with show spanning-tree, show spanning-tree mst, and show spanning-tree mst-config. Inconsistency states tell you which guard tripped. TC counters and log filtering pinpoint flapping links. Storm control is your last line of defense, capping broadcast/multicast/unknown-unicast bursts before they overwhelm a switch.
A well-engineered Layer 2 fabric — spanning tree enabled, regions consistent, priorities deterministic, edge protections in place — converges in well under a second on link failure and shrugs off the kinds of human mistakes (rogue switches, cable loops, bad SFPs) that bring down the unprotected.
Key Terms
- STP (Spanning Tree Protocol, 802.1D) — Original IEEE protocol that builds a loop-free tree on a switched network; converges in 30–50 seconds.
- RSTP (Rapid Spanning Tree Protocol, 802.1w) — Faster successor to STP using proposal/agreement handshakes; sub-second convergence.
- MSTP (Multiple Spanning Tree Protocol, 802.1s) — Groups VLANs into instances within a region for scalable load balancing; the default mode on AOS-CX.
- BPDU (Bridge Protocol Data Unit) — The frame switches use to exchange spanning-tree information, sent every 2 seconds by default.
- Root Bridge — The switch with the lowest Bridge ID; the trunk of the spanning tree from which all path costs are measured.
- Bridge ID (BID) — Combination of priority (16-bit) plus MAC address; lowest BID wins root election.
- MST Region — Group of switches sharing identical config-name, config-revision, and VLAN-to-instance map; appears as a single bridge to the outside CIST.
- CIST / Instance 0 — Common and Internal Spanning Tree; default instance carrying all unmapped VLANs and inter-region spanning tree decisions.
- MSTI (Multiple Spanning Tree Instance) — A numbered instance (1–64) that runs its own spanning tree for a group of VLANs inside a region.
- Config Digest: 16-byte hash (HMAC-MD5) of the region’s VLAN-to-instance map; carried in BPDUs alongside the config-name and revision to verify region membership.
- Edge Port (admin-edge-port) — Port toward a host that skips listening/learning and goes straight to forwarding; doesn’t generate TCNs.
- BPDU Guard (bpdu-guard; bpdu-protection on the older ArubaOS-Switch CLI): Err-disables an edge port the instant any BPDU arrives, protecting against rogue switches.
- Root Guard — Blocks a port (root-inconsistent state) when superior BPDUs arrive, preventing downstream switches from becoming root.
- Loop Guard — Blocks a non-edge port (loop-inconsistent state) when BPDUs stop arriving, defending against unidirectional links.
- Loop-Protect — Aruba-specific feature that sends loop-detect frames and disables ports if a loop is detected; works without BPDUs.
- TCN (Topology Change Notification) — BPDU sent toward the root on link state changes; triggers MAC table flush across the tree.
- Storm Control — Per-interface rate limiter for broadcast, multicast, and unknown-unicast traffic; complements spanning tree.
Sources cited in this chapter:
- [Source: https://www.sikich.com/insight/understanding-stp-options-in-aruba-cx-switches/]
- [Source: https://support.hpe.com/hpesc/public/docDisplay?docId=sd00007461en_us&page=GUID-70CAA050-CD03-4672-8074-419345E99DB7.html&docLocale=en_US]
- [Source: https://support.hpe.com/hpesc/public/docDisplay?docId=sf000098115en_us&docLocale=en_US]
- [Source: https://www.scribd.com/document/616571061/AOS-CX-Simulator-MSTP-Lab-Guide]
- [Source: https://www.youtube.com/watch?v=l195crT_Rxc]
- [Source: https://www.youtube.com/watch?v=rhXMXTkVYn8]
- [Source: https://www.youtube.com/watch?v=duCLjs99_qw]
- [Source: https://www.flomain.de/2018/01/protect-from-spanning-tree-loops-access-area/]
- [Source: https://airheads.hpe.com/discussion/2-trunks-per-switch-what-is-the-best-configuraton-between-bpdu-guard-root-guard-loop-guard]
- [Source: https://cadinc.com/wp-content/uploads/2024/08/CLI-Ref-Guide-2019v1-HPE-Aruba-Cisco-CX-from-Carolina-Advanced-Digtial.pdf]
Chapter 5: Link Aggregation, LACP, and VSX Multi-Chassis LAG
A single uplink between two switches is a single point of failure and a single bottleneck. Modern campus and data center designs eliminate both problems by bonding multiple physical links into one logical pipe and by pairing two physical chassis so they appear as one logical forwarding entity. On Aruba AOS-CX, the building blocks for that design are Link Aggregation Groups (LAGs), the Link Aggregation Control Protocol (LACP), and Virtual Switching Extension (VSX). This chapter walks through each layer — starting at the wire and ending at a coordinated, dual-chassis upgrade that doesn’t drop a packet.
Learning Objectives
By the end of this chapter, you should be able to:
- Configure static and LACP-based link aggregation groups (LAGs) on AOS-CX, including hash algorithm and member-port consistency considerations.
- Explain the VSX architecture, including the Inter-Switch Link (ISL), the keepalive, primary/secondary roles, and active-active forwarding via active-gateway.
- Configure VSX LAGs that span two physical switches so a downstream device sees a single logical aggregation.
- Plan VSX upgrades using Live Software Upgrades (LSU/ISSU), including prerequisites, rollback considerations, and how VSX compares to traditional stacking and VSF.
Section 1 — LAG Fundamentals
Think of a LAG like a multi-lane highway built between two cities. One lane (a single physical link) might carry the load most days, but during rush hour, or when a lane is closed for maintenance, the highway needs more capacity and redundancy. A LAG bundles multiple physical interfaces into one logical highway: bandwidth scales (close to) linearly, and a single lane closure doesn’t shut down the road.
Static vs LACP
AOS-CX supports two LAG flavors:
- Static LAG — both ends are configured manually. There is no negotiation. If you wire it correctly and the configs match, traffic flows. If a member is misconfigured or the wrong port is patched, the switch happily forwards into a black hole.
- Dynamic LAG (LACP, IEEE 802.3ad) — both ends exchange LACPDUs (Link Aggregation Control Protocol Data Units) to verify the partner, agree on which member ports are eligible, and detect mis-cabling before forwarding starts.
Analogy: A static LAG is like a handshake agreement between two contractors: fast, but if either side forgets, the work doesn’t line up. LACP is the same handshake plus a signed contract that is re-checked on every LACPDU (every second at the fast rate, every 30 seconds by default). It costs a tiny bit of overhead, but you find out about mistakes quickly rather than after a packet capture.
In AOS-CX, every LAG starts as static. You opt in to LACP by adding lacp mode active (or passive) under the LAG interface [Source: https://egstory.net/en/aos-cx-cli-%EB%AA%85%EB%A0%B9%EC%96%B4-2/].
Figure 5.1: LACP negotiation handshake between two switches
sequenceDiagram
participant A as Switch A (Active)
participant B as Switch B (Active)
A->>B: LACPDU (System ID, Key, Port Priority)
B->>A: LACPDU (System ID, Key, Port Priority)
A->>A: Verify partner System ID matches expectations
B->>B: Verify partner System ID matches expectations
A->>B: LACPDU (Sync=1, Collecting=1, Distributing=1)
B->>A: LACPDU (Sync=1, Collecting=1, Distributing=1)
Note over A,B: LAG members InSync, Collecting, Distributing
A-->>B: Heartbeat LACPDU (every 1s with 'lacp rate fast')
B-->>A: Heartbeat LACPDU (every 1s with 'lacp rate fast')
Note over A,B: Failure detected in ~3s if heartbeats stop
switch# configure terminal
switch(config)# interface lag 1
switch(config-lag-if)# description "Static LAG to Peer"
switch(config-lag-if)# no shutdown
switch(config-lag-if)# exit
switch(config)# interface 1/1/1
switch(config-if)# lag 1
switch(config-if)# no shutdown
switch(config)# interface 1/1/2
switch(config-if)# lag 1
switch(config-if)# no shutdown
That’s a static LAG. An LACP LAG is built the same way, with the mode added under the LAG interface (shown here on a separate LAG 20; its member ports join with lag 20 exactly as above):
switch(config)# interface lag 20
switch(config-lag-if)# description "LACP LAG to Peer"
switch(config-lag-if)# lacp mode active
switch(config-lag-if)# lacp rate fast
switch(config-lag-if)# no shutdown
lacp rate fast shortens the LACPDU heartbeat from the default 30 seconds to 1 second, allowing failure detection in roughly 3 seconds [Source: https://www.scribd.com/document/430131860/LACP-and-DT].
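The detection time is simply three missed LACPDUs. A small Python check, using the two standard LACP intervals; the dictionary and function names are illustrative.

```python
# LACPDU transmit interval in seconds for each rate.
LACP_INTERVAL = {"fast": 1, "slow": 30}


def lacp_timeout(rate: str) -> int:
    """A partner is declared lost after three missed LACPDUs."""
    return 3 * LACP_INTERVAL[rate]


print(lacp_timeout("fast"))  # 3
print(lacp_timeout("slow"))  # 90
```

At the default slow rate a silent failure, where the link stays up but the partner stops responding, can go unnoticed for a minute and a half, which is the practical argument for the fast rate on switch-to-switch links.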
LACP Active vs Passive
LACP defines two modes that determine who initiates negotiation:
| Mode | Behavior |
|---|---|
| Active | Sends LACPDUs continuously. Initiates negotiation. |
| Passive | Only responds to LACPDUs. Will not bring up the LAG without an active partner. |
The rule: at least one side must be active. Passive-on-passive is the silent-treatment failure mode — both ends sit waiting for the other to speak first, and the LAG never comes up. The safest default for switch-to-switch links is active on both sides; use passive only when policy dictates that a switch should never initiate aggregation toward an unknown neighbor (e.g., on edge ports facing third-party devices).
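The active/passive rule reduces to a one-line check. This sketch uses a hypothetical helper name and only models who sends the first LACPDU, not the full negotiation:

```python
def lag_negotiates(mode_a: str, mode_b: str) -> bool:
    """True when at least one end is active, so someone sends the first LACPDU."""
    valid = {"active", "passive"}
    if mode_a not in valid or mode_b not in valid:
        raise ValueError("mode must be 'active' or 'passive'")
    return "active" in (mode_a, mode_b)


for pair in [("active", "active"), ("active", "passive"), ("passive", "passive")]:
    print(pair, "->", "LAG forms" if lag_negotiates(*pair) else "LAG stays down")
```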
LAG Hash Algorithms
When traffic enters a LAG, the switch must decide which member port carries each frame. It does this with a hash function over fields in the packet — typically source/destination MAC, source/destination IP, and L4 ports — producing a consistent member selection. Consistency matters because reordering frames within a flow breaks TCP performance.
AOS-CX uses an L2/L3/L4 hash by default, which works well for most traffic. The trade-off is that all frames from a single flow (same 5-tuple) ride the same physical link — so you cannot exceed the bandwidth of a single member for a single flow. Two 10G members in a LAG yield 20G of aggregate capacity but still only 10G for any one TCP session.
Analogy: A LAG hash is like a toll plaza directing cars by license plate. Every car with plate “ABC-123” is sent to lane 2 — always the same lane, so they stay in order. But you can’t make that one car drive in two lanes at once.
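To see why one flow can never use more than one member, it helps to model the selection yourself. The sketch below is an illustration only: it uses CRC32 as a stand-in hash, whereas the real hash runs in the switch ASIC and its exact function is not described in this guide. The helper name and sample addresses are hypothetical.

```python
import zlib


def pick_member(flow: tuple, members: list) -> str:
    """Map a flow 5-tuple to one LAG member; the same flow always gets the same member."""
    key = "|".join(str(field) for field in flow).encode()
    return members[zlib.crc32(key) % len(members)]


members = ["1/1/1", "1/1/2"]
flow = ("10.0.0.10", "10.0.1.20", 6, 49152, 443)  # src IP, dst IP, protocol, src port, dst port

# Every packet of this flow lands on the same member, so ordering is preserved.
assert len({pick_member(flow, members) for _ in range(1000)}) == 1
print("flow pinned to", pick_member(flow, members))
```

Because the selection depends only on the header fields, adding more members raises aggregate capacity but does nothing for a single large flow.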
Member Port Consistency
For a LAG to work, member ports must match in:
- Speed and duplex — you cannot mix 1G and 10G members.
- VLAN membership — all members carry the same trunk/access VLAN configuration.
- MTU and storm control — applied at the LAG level, inherited by members.
- L2 vs L3 mode — a routed LAG must have all members in routed mode.
AOS-CX rejects mismatched configs at commit time, but a partial reconfiguration (e.g., changing one member’s VLAN by mistake) can knock the member out of the LAG until consistency is restored. Always configure VLAN/L3 settings on the LAG interface, never on individual members.
LACP Fallback-Static
Beginning with AOS-CX 10.02, the lacp fallback-static option lets a single member port forward traffic if the LACP partner is unresponsive [Source: https://www.youtube.com/watch?v=mjy4mUlzwQ0]. This solves the classic PXE-boot problem: a server’s NICs come up before the OS loads its LACP driver, so the LAG would normally stay down and the boot would fail. With fallback-static, one member becomes a temporary static link until LACPDUs start arriving, then the LAG converges normally.
switch(config)# interface lag 30
switch(config-lag-if)# lacp mode active
switch(config-lag-if)# lacp fallback-static
Verification
Two commands tell you almost everything:
switch# show interface lag 1
switch# show lacp interfaces
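show lacp interfaces displays a state column with flag strings such as ALFNCD, which the legend printed with the output decodes as Active, Long-timeout, Aggregable, InSync, Collecting, Distributing (S, Short-timeout, replaces L when lacp rate fast is set). A healthy member shows N, C, and D for both the actor and the partner; a stuck negotiation typically shows O (OutofSync) or is missing C and D.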
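If you want to practice reading the state column, a small decoder helps. The letter-to-state mapping below follows the legend that the command prints above its table, as best it can be reproduced here; treat it as an assumption and confirm it against the output of your own software release. The function names are hypothetical.

```python
# Flag legend for the `show lacp interfaces` state column (assumed; verify on your release).
LACP_FLAGS = {
    "A": "Active", "P": "Passive",
    "F": "Aggregable", "I": "Individual",
    "S": "Short-timeout", "L": "Long-timeout",
    "N": "InSync", "O": "OutofSync",
    "C": "Collecting", "D": "Distributing",
    "X": "State machine expired", "E": "Default neighbor state",
}


def decode_lacp_state(flags: str) -> list:
    """Expand a flag string such as 'ALFNCD' into readable state names."""
    return [LACP_FLAGS.get(ch, f"unknown({ch})") for ch in flags]


def is_forwarding(flags: str) -> bool:
    """A member carries traffic only when it is in sync, collecting, and distributing."""
    return all(ch in flags for ch in "NCD")


print(decode_lacp_state("ALFNCD"))
print(is_forwarding("ALFNCD"), is_forwarding("ALFOE"))
```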
Section 2 — VSX Architecture
A LAG handles redundancy within one switch chassis. But what if the chassis itself fails? The traditional answer was stacking — combining multiple switches into one logical unit. Stacking simplifies management but creates a shared control plane: a control-plane bug or a stack-master crash can take down every switch in the group. VSX takes a different approach.
VSX (Virtual Switching Extension) clusters exactly two AOS-CX switches with independent control planes that synchronize state over a dedicated link. From a downstream device’s perspective, the pair acts as one switch — but each VSX peer runs its own copy of OSPF, BGP, STP, and management processes. If one peer reboots or hits a bug, the other keeps forwarding [Source: https://www.scribd.com/document/586039876/TB-VSX].
Analogy: Stacking is one brain controlling two bodies — efficient until the brain has a stroke. VSX is two pilots in a cockpit, each fully qualified, sharing notes constantly. If one passes out, the other already knows the plan.
Figure 5.2: VSX architecture with ISL, keepalive, and downstream multi-chassis LAG
flowchart LR
KA[Keepalive L3 Link<br/>VRF 'ka' or OOBM]
VSX1[VSX-1<br/>Primary<br/>Independent Control Plane]
VSX2[VSX-2<br/>Secondary<br/>Independent Control Plane]
ISL{{ISL LAG<br/>L2 sync, MAC sync, fallback}}
ToR[Downstream Device<br/>Server / ToR / Router]
VSX1 -.keepalive.- KA
VSX2 -.keepalive.- KA
VSX1 === ISL === VSX2
VSX1 -- VSX LAG member 1 --> ToR
VSX2 -- VSX LAG member 2 --> ToR
ToR -. sees one logical switch .- VSX1
VSX Components Diagram
                ┌───────────────────┐
                │  Keepalive (L3)   │  ← detects peer aliveness
                │ Separate VRF/OOBM │    (prevents split-brain)
                └─────────┬─────────┘
                          │
       ┌──────────────────┴──────────────────┐
       │                                     │
┌──────┴──────┐                       ┌──────┴──────┐
│    VSX-1    │ ═════ ISL (LAG) ═════ │    VSX-2    │
│  (primary)  │   L2 sync, MAC sync   │ (secondary) │
│             │  data fallback path   │             │
└──┬───────┬──┘                       └──┬───────┬──┘
   │       │                             │       │
   │       └────────────┐   ┌────────────┘       │
   │                    │   │                    │
   │              ┌─────┴───┴─────┐              │
   │              │    VSX LAG    │ (MC-LAG)     │
   │              │  Downstream   │              │
   │              └───────────────┘              │
ISL — Inter-Switch Link
The ISL is a LAG (an ordinary one, not a multi-chassis LAG) between the two VSX peers carrying:
- L2 control sync: MAC table, ARP, IGMP snooping
- Configuration sync: anything tagged vsx-sync
- Data fallback: traffic destined to a peer-only egress (e.g., a VSX LAG member is down on one peer)
Sizing the ISL matters. In a worst case where every VSX LAG loses one of its two member links, all traffic that should have gone out the failed link redirects across the ISL. A common rule: ISL bandwidth should equal the largest single VSX LAG capacity, or higher. Typical deployments use 2× to 4× 100G or 4× 25G ISLs.
interface lag 100
description ISL
no shutdown
no routing
vlan trunk native none
vlan trunk allowed all
lacp mode active
interface 1/1/4
description ISL
no shutdown
lag 100
interface 1/1/5
description ISL
no shutdown
lag 100
vsx
inter-switch-link lag 100
The ISL is an ordinary LAG; inter-switch-link lag 100 under the vsx context is what designates it as the ISL. The multi-chassis keyword is reserved for VSX LAGs that face downstream devices [Source: https://www.youtube.com/watch?v=bz4n0MwMt7s].
Keepalive
The keepalive is a dedicated L3 link (not the ISL) used to detect whether the peer is alive. If the ISL goes down, both peers consult the keepalive. If the keepalive is still up, each peer knows the other is alive but unreachable over the ISL, and the secondary disables its VSX LAG member ports so that only the primary forwards. If both ISL and keepalive are down, each peer assumes its partner is dead and takes over fully.
The keepalive lives in its own VRF (commonly named ka) or, in AOS-CX 10.10 and later, on the OOBM (out-of-band management) port for redundancy [Source: https://www.coursehero.com/file/247065217/Configuration-Switch-Aruba-Copypptx/].
vrf ka
interface 1/1/6
vrf attach ka
description keepalive
ip address 192.168.0.1/30 ! Peer uses .2
no shutdown
vsx
keepalive peer 192.168.0.2 source 192.168.0.1 vrf ka
Analogy: The ISL is the office Slack channel where the two pilots compare notes constantly. The keepalive is the second radio they each carry — if Slack goes down, the radio confirms the other pilot is still flying.
Primary vs Secondary Role
VSX peers are configured with a deterministic role:
vsx
system-mac 02:01:00:00:01:00
inter-switch-link lag 100
role primary ! Or 'secondary' on the other peer
keepalive peer ...
Both peers actively forward data. The role matters for:
- Configuration sync direction — primary pushes vsx-sync state to secondary
- Tie-breaking — during simultaneous events, the primary makes decisions
- LSU orchestration — the upgrade flow starts on the primary
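A shared system MAC is required so downstream LACP partners see one System ID across both chassis. If the two peers advertised different System IDs, the downstream device would treat them as separate partners and the VSX LAG would not form. Setting system-mac explicitly also keeps that ID stable if a chassis is later replaced.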
Active-Gateway and Active-Forwarding
For first-hop routing, VSX uses active-gateway — both peers respond to the same virtual IP and MAC on a VLAN, so any host can route locally through whichever peer it reaches first. There is no failover delay because there is no failover — both gateways are always active.
vlan 10
name employee
interface vlan 10
description employee
vsx-sync active-gateways
ip address 172.17.0.2/24
active-gateway ip 172.17.0.1 mac 12:00:00:00:00:01
The peer switch uses ip address 172.17.0.3/24 with the same active-gateway ip 172.17.0.1. Hosts learn 172.17.0.1 via DHCP or static config and don’t know (or care) that two switches answer for it.
This eliminates the FHRP failover delay you’d see with VRRP or HSRP — both peers route every packet, immediately.
vsx-sync Configuration
vsx-sync is a per-feature command applied under VLANs, ACLs, QoS profiles, route-maps, etc., that pushes configuration from primary to secondary. Without vsx-sync, a config you make on the primary stays only on the primary.
| Common vsx-sync targets | Why |
|---|---|
| VLANs | Both peers must know all VLANs that traverse the ISL or VSX LAGs |
| Active-gateways | Both peers must answer the same gateway IP/MAC |
| ACLs | Symmetric forwarding requires symmetric policy |
| QoS classifiers and queues | Asymmetric QoS produces asymmetric latency |
| Route maps and prefix lists | Routing policy must match on both peers |
Treat vsx-sync as a discipline: every time you add a VLAN, ACL, or active-gateway on the primary, ensure vsx-sync is enabled or replicate manually on the secondary.
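One way to practice that discipline in a lab is to compare what each peer holds. The sketch below is a plain set difference over hypothetical, simplified config labels; it does not parse real switch output and is not how vsx-sync itself works internally.

```python
def missing_on_secondary(primary: set, secondary: set) -> set:
    """Return config items present on the primary but absent on the secondary."""
    return primary - secondary


# Hypothetical, simplified labels standing in for real configuration objects.
primary_cfg = {"vlan 10", "vlan 20", "acl EMPLOYEE-IN", "active-gateway vlan 10"}
secondary_cfg = {"vlan 10", "active-gateway vlan 10"}

for item in sorted(missing_on_secondary(primary_cfg, secondary_cfg)):
    print("not replicated:", item)
```

Anything the check reports is a candidate for either enabling vsx-sync on that feature or replicating the config by hand.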
Section 3 — VSX LAG Configuration
A VSX LAG (also called MC-LAG, multi-chassis LAG) is a single logical LAG whose member ports are split across the two VSX peers. The downstream device — a server, an access switch, another router — believes it is talking to one switch with two links. In reality, it’s connected to two independent chassis.
Analogy: Imagine you have two phone lines from two different carriers, but a magic phone that lets you publish one number that rings on both. Your callers don’t know there are two carriers — they just know the line is always reachable.
Defining VSX Peers and ISL — Putting It All Together
A complete minimal VSX bring-up looks like this on the primary:
! 1. Define the ISL LAG
interface lag 100
no shutdown
no routing
vlan trunk native none
vlan trunk allowed all
lacp mode active
interface 1/1/49
no shutdown
lag 100
interface 1/1/50
no shutdown
lag 100
! 2. Define keepalive interface in its own VRF
vrf ka
interface 1/1/48
vrf attach ka
ip address 192.168.0.1/30
no shutdown
! 3. Bind it all under VSX
vsx
system-mac 02:01:00:00:01:00
inter-switch-link lag 100
role primary
keepalive peer 192.168.0.2 source 192.168.0.1 vrf ka
linkup-delay-timer 180 ! seconds before bringing up VSX LAGs after reboot
The secondary mirrors everything except role secondary and the keepalive IPs swapped (peer 192.168.0.1 source 192.168.0.2). The system-mac must match.
The linkup-delay-timer is critical. After a reboot, it prevents the rebooting peer from forwarding on its VSX LAG members until it has fully synced state from the partner. Without it, the rebooted peer might attract traffic toward an empty MAC table and black-hole it for 30+ seconds.
Multi-Chassis LAGs — Configuring a VSX LAG
With VSX up and ISL/keepalive healthy, defining a VSX LAG is just a matter of using the same LAG ID and multi-chassis keyword on both peers, with one member port from each peer:
On both peers (identical config):
interface lag 10 multi-chassis
description "VSX LAG to ToR-1"
no shutdown
no routing
vlan trunk native 1
vlan trunk allowed 10,20,30
lacp mode active
On the primary:
interface 1/1/1
description "ToR-1 link 1"
no shutdown
lag 10
On the secondary:
interface 1/1/1
description "ToR-1 link 2"
no shutdown
lag 10
The downstream device sees one LACP partner (because the system-mac is shared) and brings up a 2-member LAG. From its perspective, this is a normal aggregation. Internally, VSX coordinates which switch forwards which frame and uses the ISL only when a local egress isn’t available.
Split-Brain Prevention
The nightmare scenario is split-brain — both peers think the other is dead and both claim primary, both answer for the active-gateway IP, and both forward duplicate frames. VSX prevents this through the keepalive + ISL state machine:
| ISL State | Keepalive State | Peer Behavior |
|---|---|---|
| Up | Up | Normal active-active forwarding |
| Down | Up | Secondary disables its VSX LAG member ports (split-brain prevention). Primary continues forwarding alone. |
| Up | Down | Warning logged; forwarding continues; admin should fix keepalive immediately |
| Down | Down | Each peer assumes partner is dead; both forward independently (true network split — but at least the active half stays up) |
The crucial case is the second row. When the ISL fails but the keepalive confirms the peer is alive, the secondary voluntarily shuts down its VSX LAG members. This avoids duplicate forwarding and forces all downstream traffic onto the primary’s still-working links.
This is why the keepalive must be on a different physical path from the ISL. If both ride the same fiber bundle, you’ve designed a single point of failure that defeats the entire mechanism.
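The four rows of the table above can be rehearsed as a small decision function. This is a study aid built only from that table; the function name and return strings are illustrative, and the real state machine has more detail than is modeled here.

```python
def vsx_lag_action(isl_up: bool, keepalive_up: bool, role: str) -> str:
    """Return what a peer does with its VSX LAG members, per the split-brain table."""
    if isl_up and keepalive_up:
        return "forward (normal active-active)"
    if isl_up and not keepalive_up:
        return "forward (log warning, fix keepalive)"
    if not isl_up and keepalive_up:
        # Peer is alive but unsynchronized: only the primary keeps forwarding.
        return "forward alone" if role == "primary" else "disable VSX LAG members"
    return "forward independently (peer presumed dead)"


for isl in (True, False):
    for ka in (True, False):
        for role in ("primary", "secondary"):
            print(f"ISL={isl!s:5} KA={ka!s:5} {role:9} -> {vsx_lag_action(isl, ka, role)}")
```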
Figure 5.3: VSX peer state transitions based on ISL and keepalive health
stateDiagram-v2
[*] --> Booting
Booting --> LinkupDelay: chassis powers on
LinkupDelay --> ActiveActive: linkup-delay-timer expires<br/>ISL up, keepalive up
ActiveActive --> ISLDown_KAUp: ISL fails
ActiveActive --> ISLUp_KADown: keepalive fails
ISLDown_KAUp --> SecondaryShutdownVSXLAGs: secondary disables<br/>VSX LAG members
SecondaryShutdownVSXLAGs --> ActiveActive: ISL restored
ISLUp_KADown --> ActiveActive: keepalive restored<br/>(forwarding never stopped)
ActiveActive --> SplitBrain: ISL and keepalive both fail
SplitBrain --> ActiveActive: both links restored<br/>renegotiate roles
ActiveActive --> [*]: graceful shutdown
Verification
switch# show vsx status
switch# show vsx brief
switch# show vsx config
switch# show vsx config keepalive
switch# show interface lag 10
switch# show lacp interfaces
switch# show lacp aggregates
A healthy show vsx status confirms:
- ISL is up and operational
- Keepalive is established
- Roles are primary and secondary (one each)
- Configuration sync is in-sync
- Linkup-delay timer expired
[Source: https://www.youtube.com/watch?v=YYgmlCjSOJI]
Section 4 — VSX Lifecycle Operations
Standing up a VSX pair is the easy part. The real test is upgrading it without taking the network down.
Live Software Upgrades (LSU / ISSU)
Live Software Upgrade (LSU) — also called In-Service Software Upgrade (ISSU) — is a single-command rolling upgrade of a VSX pair. The command runs on the primary; the system orchestrates the rest:
switch# vsx update-software tftp://10.1.1.5/CX-10.07.swi secondary vrf mgmt
The flow:
- Image staging — the secondary node downloads the image to its secondary partition. The primary’s running code is untouched.
- Secondary reboots to the new image. The primary handles all traffic during this 1–3 minute window via VSX LAG redistribution and ISL.
- Secondary rejoins with the new code, takes over forwarding.
- Primary upgrades and reboots to the same image. The (newly upgraded) secondary now handles traffic.
- Primary rejoins with the new code. Roles return to original.
Throughout, traffic continues — every flow has a path through at least one peer at all times [Source: https://www.youtube.com/watch?v=_aDsAc2GzTE].
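The ordering guarantee in that flow can be checked with a toy simulation. The sketch below assumes the sequence described above (secondary first, then primary) and uses hypothetical names; it models only which peer is forwarding at each step, not image staging or timers.

```python
def rolling_upgrade(new_version: str, old_version: str = "old") -> list:
    """Simulate the LSU order and record which peers forward at each step."""
    peers = {"primary": old_version, "secondary": old_version}
    timeline = []
    for name in ("secondary", "primary"):  # the secondary is upgraded first
        forwarding = [p for p in peers if p != name]  # the rebooting peer is out
        timeline.append({"rebooting": name, "forwarding": forwarding})
        peers[name] = new_version  # the peer rejoins on the new image
    timeline.append({"rebooting": None, "forwarding": list(peers), "versions": dict(peers)})
    return timeline


steps = rolling_upgrade("new")
# At no step is the set of forwarding peers empty.
assert all(step["forwarding"] for step in steps)
for step in steps:
    print(step)
```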
Figure 5.4: LSU rolling upgrade workflow across a VSX pair
flowchart TD
Start([vsx update-software issued on Primary]) --> Check{Pre-flight checks<br/>pair healthy, image accessible,<br/>disk space ok?}
Check -- No --> Abort[Abort: refuse to start LSU]
Check -- Yes --> Stage[Secondary downloads image<br/>to secondary partition]
Stage --> RebootSec[Secondary reboots<br/>to new image]
RebootSec --> PrimaryAlone[Primary handles all traffic<br/>via VSX LAG redistribution + ISL<br/>~1-3 min window]
PrimaryAlone --> SecRejoin[Secondary rejoins<br/>with new code, takes over forwarding]
SecRejoin --> StagePri[Primary stages new image<br/>on secondary partition]
StagePri --> RebootPri[Primary reboots<br/>to new image]
RebootPri --> SecAlone[Newly upgraded Secondary<br/>handles all traffic]
SecAlone --> PriRejoin[Primary rejoins<br/>with new code]
PriRejoin --> Verify[show vsx status<br/>both peers in-sync,<br/>roles restored]
Verify --> End([Upgrade complete])
Pre-flight checklist:
| Check | Command | Why |
|---|---|---|
| Both peers in sync | show vsx status | LSU refuses to start on an unhealthy pair |
| Sufficient disk space | show images | Secondary partition must have room |
| Image accessible | Test TFTP/SCP from both peers | Both peers must reach the image server |
| Linkup-delay-timer set | show vsx config | Prevents black-holing on rejoin |
| Backup config | copy running-config tftp://... | Standard change control |
LSU is hardware-specific. Virtual AOS-CX (the OVA distribution) cannot be live-upgraded — it requires deploying a new VM and migrating config [Source: https://www.youtube.com/watch?v=g_Vy0K-pOek].
VSX Restore and Recovery
If a peer fails permanently (bad hardware, dead PSU, water damage), the recovery steps are:
- Replace the chassis (identical model required).
- Restore configuration from backup.
- Reconnect ISL and keepalive cables before powering on, if possible, or expect a brief flap as VSX renegotiates.
- The healthy peer detects the new chassis via keepalive and ISL.
- vsx-sync pushes synchronized configuration to the new peer automatically.
- Verify with show vsx status and confirm both peers are in-sync before declaring complete.
The replacement chassis must:
- Match the model exactly
- Run the same software version as the surviving peer (upgrade it before connecting if needed)
- Use the same system-mac in VSX config
VSX vs Stacking vs VSF — Comparison
This is one of the most exam-relevant tables in the book.
| Aspect | Traditional Stacking (legacy) | VSF (Virtual Switching Framework) | VSX (Virtual Switching Extension) |
|---|---|---|---|
| Control planes | One shared (master + members) | One shared (master elected from stack) | Two independent, synchronized |
| Member count | Varies (typically up to 8) | 2–10 switches | Exactly 2 |
| Failure domain | Stack master fault = stack risk | Stack master fault = stack risk | Peer fault contained — partner keeps forwarding |
| Topology | Chain or ring | Chain or ring | Point-to-point (ISL + keepalive) |
| Use case | Access layer simplification | Access/aggregation simplification | Core, aggregation, data center resilience |
| MC-LAG | LAG within stack | LAG within stack | True MC-LAG with active-active |
| Upgrade impact | Often requires full reboot | Rolling on some platforms | LSU — minimal data plane impact |
| AOS-CX 8320/8325 | N/A | Not supported | Supported (only option) |
| AOS-CX 6300 | N/A | Supported (only option) | Not supported |
| AOS-CX 6400 | N/A | Not supported | Supported |
| Mgmt simplicity | Single IP/CLI | Single IP/CLI | Two IPs/CLIs (one per peer) |
| Best fit | Edge closets | Edge/IDF stacking | Core, distribution, server access leaf-spine |
[Source: https://www.examtopics.com/discussions/hp/view/60644-exam-hpe6-a72-topic-1-question-79-discussion/] [Source: https://airheads.hpe.com/discussion/why-is-vsx-considered-stacking]
The trade-off boils down to: VSF is simpler to manage; VSX is more resilient. Use VSF where a single management plane is more valuable than fault isolation (access switches in a closet). Use VSX where uptime is critical and the network team can handle two CLIs (core, distribution, data center top-of-rack).
Best Practices Summary
- Always use LACP between switches and to dual-homed servers: a static LAG cannot detect a miswire, and silent miswires are a common cause of VSX LAG outages.
- Always set linkup-delay-timer: without it, a rebooted peer can black-hole traffic for 30+ seconds.
- Keepalive on a different physical path than the ISL: same bundle = shared single point of failure.
- vsx-sync everything: VLANs, active-gateways, ACLs. Asymmetric config produces asymmetric forwarding.
- Match models exactly: VSX requires identical hardware. An 8325 cannot peer with an 8320.
- Pre-stage images before LSU: verify TFTP/SCP reachability from both peers before scheduling a maintenance window.
- Test failover in lab: pull the ISL, pull the keepalive, kill a peer. Watch what happens before production teaches you.
- Document the role mapping: knowing which physical chassis is “primary” matters when you’re on a 3 AM bridge call.
Chapter Summary
Link aggregation is the foundation of resilient L2 design: bundling multiple physical links into one logical pipe with bandwidth scaling and member-failure tolerance. AOS-CX makes static LAGs the default and adds LACP via a one-line lacp mode active toggle, with lacp fallback-static solving the PXE-boot edge case.
VSX takes the same idea up a level — instead of redundant links inside a chassis, redundant chassis that present as one. Two AOS-CX switches synchronize state over an Inter-Switch Link, monitor each other via a separate keepalive, and forward in active-active mode. Hosts get a single virtual gateway via active-gateway, downstream devices get true multi-chassis LAGs via the multi-chassis keyword on a LAG interface, and the operator gets a network that survives a chassis failure with no FHRP convergence pause.
Live Software Upgrade ties it together: a single command rolls a new image through both peers, one at a time, while traffic continues. Compared to stacking (one shared brain) and VSF (one shared brain across more switches), VSX is the choice when fault isolation matters more than management simplicity — which, in core and data center designs, is almost always.
The exam will test all of this: the difference between active and passive LACP, the role of the keepalive when the ISL fails, why vsx-sync exists, and which platforms support which clustering technology. Memorize the comparison table; configure a VSX pair in the lab; pull cables and watch the recovery state machine work. That hands-on confirmation is what separates passing from guessing.
Key Terms
- LAG (Link Aggregation Group): A logical interface bundling multiple physical links for combined bandwidth and member-level redundancy. AOS-CX supports up to 8 members per LAG.
- LACP (Link Aggregation Control Protocol): IEEE 802.3ad protocol that negotiates LAG membership via LACPDUs. Modes are active (initiates negotiation) and passive (responds only). At least one peer must be active.
- VSX (Virtual Switching Extension): Aruba AOS-CX clustering technology that pairs exactly two switches with independent control planes, synchronizing state over an ISL and using a keepalive to prevent split-brain. Supported on 8320, 8325, 6400.
- ISL (Inter-Switch Link): A LAG between the two VSX peers carrying L2 sync, MAC sync, configuration sync, and data fallback traffic. Configured as an ordinary (non-multi-chassis) LAG and designated with inter-switch-link lag <ID> under the vsx context.
- Keepalive: A dedicated L3 link between VSX peers (typically in its own VRF or on OOBM) used to confirm peer aliveness independent of the ISL. Without it, an ISL failure could cause split-brain.
- Active-gateway: A virtual IP/MAC configured on a VLAN interface across both VSX peers so each peer locally answers as the gateway for connected hosts. Eliminates FHRP failover delay.
- vsx-sync: A per-feature CLI command (under VLANs, ACLs, active-gateways, QoS, etc.) that pushes configuration from the primary peer to the secondary peer to ensure symmetrical state.
- Live Software Upgrade (LSU / ISSU): A single-command rolling upgrade (vsx update-software) that upgrades the secondary VSX peer first, fails traffic over, then upgrades the original primary, minimizing data plane disruption during a software refresh.
Chapter 6: Layer 3 Routing — Static Routes, OSPF, and VRRP/Active-Gateway
Layer 2 stops at the broadcast domain. The moment a host in VLAN 10 needs to reach a server in VLAN 30, or a campus user needs to reach the data center across the building, you cross into Layer 3. AOS-CX is a fully routed switching platform — every Aruba CX 6000, 8000, and 10000 series switch can be a Layer 3 device on every port — and the design choices you make here determine whether your network converges in 200 milliseconds or 30 seconds when something goes wrong.
This chapter walks through the four pillars of Layer 3 on AOS-CX: how to build the interfaces (SVIs, routed ports, loopbacks); how to configure the simplest routing protocol (static routes); how to scale out with a real interior gateway protocol (OSPFv2); and how to keep hosts on the network when a default gateway dies (VRRP and VSX active-gateway). By the end, you should be able to look at a CX configuration and predict its routing table, and look at a routing table and predict the configuration.
A useful analogy before we dive in: think of a Layer 2 switch as a building’s hallway system — every door on the same floor (VLAN) is reachable directly. A Layer 3 switch adds elevators and stairwells (routed interfaces) that connect the floors. Static routes are like printed signs on the wall (“Cafeteria — Floor 3”). OSPF is like a building directory that updates itself when a new tenant moves in. And VRRP/active-gateway is the redundant elevator — when one breaks, the other carries the load without anyone in the lobby noticing.
6.1 Layer 3 Interface Types
AOS-CX exposes four kinds of Layer 3 interfaces, and a Layer 3 design uses some combination of all of them. Before you can route anything, the global feature must be on:
switch(config)# ip routing
Without ip routing, the switch will accept Layer 3 configuration on interfaces but will not actually forward packets between subnets [Source: https://support.hpe.com/hpesc/public/docDisplay?docId=sd00007415en_us].
Switched Virtual Interfaces (SVIs)
An SVI — interface vlan <id> in AOS-CX — is a logical Layer 3 interface bound to a VLAN. It is the gateway for hosts in that VLAN. SVIs are the workhorse of inter-VLAN routing.
switch(config)# vlan 10
switch(config-vlan-10)# name Users
switch(config-vlan-10)# exit
switch(config)# interface vlan 10
switch(config-if-vlan)# description Users_VLAN_Gateway
switch(config-if-vlan)# ip address 192.168.10.1/24
switch(config-if-vlan)# no shutdown
A few details that trip people up:
- no shutdown is mandatory. SVIs come up administratively down by default.
- The VLAN must exist in the VLAN database, and at least one switchport (access or trunk) must be in the forwarding state for the SVI to be operationally up. If every port carrying VLAN 10 is shut down or in STP blocking, the SVI will go line-protocol down, even though the IP configuration is correct.
- Both /24 prefix-length and dotted-decimal masks are accepted. The running-config will normalize to the prefix form.
Inter-VLAN routing then happens automatically: as soon as two SVIs are up and ip routing is enabled, the switch will forward packets between their subnets at hardware line rate.
Routed (Layer 3) Ports
A routed port is a physical port that has been stripped of its switching behavior. It has its own IP address and is not a member of any VLAN. Routed ports are typically used for point-to-point uplinks between switches or to firewalls and routers.
switch(config)# interface 1/1/48
switch(config-if)# description Uplink_to_Core
switch(config-if)# routing
switch(config-if)# ip address 10.0.0.1/30
switch(config-if)# no shutdown
The routing command puts the port in Layer 3 mode and removes any Layer 2 (VLAN) configuration from it; it is the counterpart of the no routing command used on the Layer 2 LAGs in Chapter 5. Some platforms ship with ports already in routed mode, in which case the command changes nothing. Reverting is done with no routing, which moves the port back into Layer 2 mode and removes any IP address.
Routed ports are preferred over SVIs for transit links because they:
- Skip the broadcast domain. No spanning-tree, no flooding, no MAC learning on a routed port — it is a clean point-to-point Layer 3 hop.
- Have predictable up/down semantics. A routed port goes down the moment the cable is unplugged. An SVI may stay up because some other port still carries the VLAN.
- Have lower MTU overhead. No 802.1Q tag is added on a non-trunked routed port.
Loopback Interfaces
A loopback is a virtual Layer 3 interface that is always up as long as the switch is running. It has no associated physical port, so cable failures do not bring it down.
switch(config)# interface loopback 0
switch(config-if-loopback)# ip address 1.1.1.1/32
Loopbacks are used for two purposes that matter on the exam:
- Stable management address. When you SSH to 1.1.1.1, you do not care which physical port is carrying the traffic: as long as any path to the switch is working, the loopback is reachable. This is invaluable for monitoring, NetEdit/Aruba Central inventory, and AAA source-interface configuration.
- Stable OSPF/BGP router-id. OSPF picks the highest IP on a loopback (if any exist) as its router-id by default. Setting router-id explicitly is best practice, but the loopback is conventionally the source.
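The selection order described above can be sketched in a few lines. The helper name is hypothetical, and the sketch covers only the two cases this section mentions (an explicit router-id, otherwise the highest loopback address); what a given release does when no loopback exists is not modeled.

```python
import ipaddress
from typing import Iterable, Optional


def choose_router_id(explicit: Optional[str], loopbacks: Iterable[str]) -> Optional[str]:
    """Explicit router-id wins; otherwise take the numerically highest loopback IP."""
    if explicit:
        return explicit
    addresses = [ipaddress.IPv4Address(a) for a in loopbacks]
    return str(max(addresses)) if addresses else None


print(choose_router_id(None, ["1.1.1.1", "10.255.0.2", "9.9.9.9"]))  # 10.255.0.2
print(choose_router_id("1.1.1.1", ["10.255.0.2"]))                   # 1.1.1.1
```

Note that "highest" is a numeric comparison of the address, not a text comparison: 10.255.0.2 outranks 9.9.9.9.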
Inter-VLAN Routing — Putting It Together
A complete distribution-layer L3 configuration combines all three interface types: SVIs for user/server VLANs, routed ports for the upstream/inter-switch links, and a loopback for management.
switch(config)# ip routing
switch(config)# vlan 10
switch(config-vlan-10)# name Users
switch(config-vlan-10)# vlan 20
switch(config-vlan-20)# name Servers
switch(config-vlan-20)# exit
switch(config)# interface vlan 10
switch(config-if-vlan)# ip address 192.168.10.1/24
switch(config-if-vlan)# no shutdown
switch(config-if-vlan)# interface vlan 20
switch(config-if-vlan)# ip address 192.168.20.1/24
switch(config-if-vlan)# no shutdown
switch(config-if-vlan)# exit
switch(config)# interface 1/1/1-1/1/12
switch(config-if-<>)# no shutdown
switch(config-if-<>)# no routing
switch(config-if-<>)# vlan access 10
switch(config-if-<>)# exit
switch(config)# interface 1/1/48
switch(config-if)# routing
switch(config-if)# ip address 10.0.0.1/30
switch(config-if)# no shutdown
switch(config-if)# exit
switch(config)# interface loopback 0
switch(config-if-loopback)# ip address 1.1.1.1/32
Verify with:
switch# show ip interface brief
switch# show ip route
switch# show vlan
show ip interface brief is the workhorse — it lists every Layer 3 interface, its IP, and its administrative/operational state in one screen.
Table 6.1 — AOS-CX Layer 3 Interface Types
| Interface Type | Command | Use Case | Up When |
|---|---|---|---|
| SVI | interface vlan <id> | Default gateway for a VLAN; inter-VLAN routing | A member port of the VLAN is forwarding and no shutdown |
| Routed port | interface 1/1/x then routing + ip address | Point-to-point inter-switch or upstream link | Cable is up and no shutdown |
| Loopback | interface loopback <n> | Stable router-id, management address | Always (while switch is running) |
| Sub-interface | interface 1/1/x.<vid> (router-on-a-stick) | Trunked Layer 3 to a single port | Parent port up and encapsulation dot1Q <vid> set |
6.2 Static and Default Routing
Static routes are the simplest Layer 3 plumbing on the switch. You manually tell the routing table, “to reach prefix X, send the packet to next-hop Y.” They are deterministic, require no protocol overhead, and are perfect for stub networks, default gateways, and floating backups.
Syntax
switch(config)# ip route <prefix>/<length> <next-hop-ip> [distance]
Examples:
ip route 10.1.0.0/16 192.168.10.254
ip route 10.2.0.0/16 192.168.10.254 100
ip route 0.0.0.0/0 10.0.0.254
The first installs a /16 route via 192.168.10.254 with the default administrative distance (1 for static). The second installs the same kind of route but at AD 100 — a “floating static” that will only be installed if no better source advertises 10.2.0.0/16. The third is the default route that catches all unmatched destinations.
Administrative Distance
Administrative distance (AD) is how AOS-CX decides which source to trust when two protocols know about the same prefix. Lower AD wins. The defaults match the industry standard:
Table 6.2 — Default Administrative Distance on AOS-CX
| Source | AD |
|---|---|
| Connected (interface up with IP) | 0 |
| Static route | 1 |
| eBGP | 20 |
| OSPF (intra/inter-area) | 110 |
| iBGP | 200 |
| Floating static (typical override) | 200+ |
Connected routes always win — that is why an SVI you bring up with ip address 10.10.10.1/24 immediately appears in show ip route even before any protocol runs. Static routes beat OSPF, OSPF beats iBGP, and so on.
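The "lower AD wins" rule can be sketched in a few lines of Python. This is an illustrative model only, not how AOS-CX implements its routing table: the Candidate class and best_candidates function are hypothetical names, and the default distances are copied from Table 6.2.

```python
from dataclasses import dataclass
from typing import Optional

# Default administrative distances, copied from Table 6.2.
DEFAULT_AD = {"connected": 0, "static": 1, "ebgp": 20, "ospf": 110, "ibgp": 200}

@dataclass(frozen=True)
class Candidate:
    source: str                      # key into DEFAULT_AD
    next_hop: str
    distance: Optional[int] = None   # explicit override, e.g. a floating static

    @property
    def ad(self) -> int:
        # An explicit distance replaces the protocol default.
        return DEFAULT_AD[self.source] if self.distance is None else self.distance

def best_candidates(candidates):
    """Return the candidates with the lowest AD for one prefix (lower AD wins)."""
    if not candidates:
        return []
    lowest = min(c.ad for c in candidates)
    return [c for c in candidates if c.ad == lowest]

ospf = Candidate("ospf", "10.0.1.2")
floating = Candidate("static", "198.51.100.1", distance=200)
print([c.source for c in best_candidates([ospf, floating])])  # ['ospf']
print([c.source for c in best_candidates([floating])])        # ['static']
```

The second call shows the floating-static behavior: the AD 200 static is only selected once the OSPF candidate is no longer offered.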
Recursive and Floating Static Routes
A recursive static has a next-hop that is itself reached via another route. The switch resolves that next-hop when it installs the route, not once per packet: it looks up the next-hop IP in the routing table and programs the FIB with the real outgoing interface and MAC. Most static routes are recursive in this sense, because you point at a next-hop IP rather than at an interface.
A floating static is a static route configured with an artificially high AD so that it stays out of the routing table while a more-preferred source (typically OSPF) is alive. The moment the OSPF route disappears, the floating static is installed. This is the classic “backup link” pattern — primary path runs OSPF over the WAN, floating static at AD 200 points at a 4G modem.
ip route 0.0.0.0/0 203.0.113.1 ! primary, AD 1
ip route 0.0.0.0/0 198.51.100.1 200 ! backup, AD 200
Only the primary appears in the routing table while it is reachable. When 203.0.113.1 stops responding (interface goes down or BFD/IP-SLA tracking flips), the AD-200 default takes over.
Equal-Cost Multipath (ECMP)
If you configure multiple static routes for the same prefix with the same AD and metric, AOS-CX installs them all and load-balances flows across them. The same applies to OSPF: when two equal-cost OSPF paths exist, both are installed.
ip route 10.10.0.0/16 10.0.0.1
ip route 10.10.0.0/16 10.0.0.5
The switch will hash flows (typically using a 5-tuple of src IP, dst IP, src port, dst port, protocol) and send roughly half of them via each next-hop; the split is per flow, so it is only as even as the traffic mix allows. The maximum number of ECMP paths is platform-dependent (commonly 8 or 16 on CX 8000-series) and configurable.
ECMP is what makes a leaf-spine fabric work: every leaf has equal-cost paths to every spine, and traffic spreads evenly across all uplinks without any single link being a bottleneck.
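Per-flow hashing can be illustrated with a short Python sketch. It assumes a software hash (SHA-256) purely for demonstration; real switch ASICs use their own hardware hash functions, and pick_next_hop is a hypothetical helper, not an AOS-CX API.

```python
import hashlib

def pick_next_hop(src_ip, dst_ip, src_port, dst_port, protocol, next_hops):
    """Map a flow's 5-tuple to one next-hop. The same flow always maps to the
    same next-hop, which is what keeps packets within a flow in order."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{protocol}".encode()
    digest = hashlib.sha256(key).digest()          # stand-in for the hardware hash
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]

hops = ["10.0.0.1", "10.0.0.5"]   # the two next-hops from the example above
flow = ("10.1.1.10", "10.10.2.20", 40000, 443, "tcp")
print(pick_next_hop(*flow, hops))
print(pick_next_hop(*flow, hops))  # identical to the line above: same flow, same path
```

Different flows (for example, the same hosts using different source ports) land on different next-hops, so the aggregate load spreads across the paths even though each individual flow stays on one.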
Figure 6.4: ECMP load balancing across a leaf-spine fabric
flowchart TD
L1[Leaf-1] -->|Path 1| S1[Spine-1]
L1 -->|Path 2| S2[Spine-2]
L1 -->|Path 3| S3[Spine-3]
L1 -->|Path 4| S4[Spine-4]
S1 --> L2[Leaf-2]
S2 --> L2
S3 --> L2
S4 --> L2
H[Host A on Leaf-1] --> L1
L2 --> H2[Host B on Leaf-2]
Each Leaf has four equal-cost OSPF paths to every other Leaf via the four Spines. Per-flow hashing distributes sessions evenly without reordering packets within a flow.
Diagram 6.1 — Static Route Decision Flow
flowchart TD
A[Packet arrives] --> B{Longest match in routing table?}
B -->|Yes| C{Multiple equal-cost paths?}
C -->|Yes| D[Hash flow across paths - ECMP]
C -->|No| E[Forward via single next-hop]
B -->|No| F{Default route 0.0.0.0/0 exists?}
F -->|Yes| E
F -->|No| G[Drop packet - ICMP unreachable if enabled]
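The decision flow above can be sketched with Python's standard ipaddress module. This is a simplified model: the routes dictionary and the lookup function are hypothetical, and the prefixes reuse the examples from this section. Note that the default route needs no special case, because 0.0.0.0/0 is simply the shortest possible match.

```python
import ipaddress

def lookup(routes, destination):
    """Return the next-hop list of the longest matching prefix, or None (drop)."""
    dst = ipaddress.ip_address(destination)
    best_prefix, best_len = None, -1
    for prefix in routes:
        net = ipaddress.ip_network(prefix)
        if dst in net and net.prefixlen > best_len:
            best_prefix, best_len = prefix, net.prefixlen
    return None if best_prefix is None else routes[best_prefix]

routes = {
    "10.1.0.0/16": ["192.168.10.254"],
    "10.10.0.0/16": ["10.0.0.1", "10.0.0.5"],   # two next-hops: an ECMP group
    "0.0.0.0/0": ["10.0.0.254"],                # default route
}

print(lookup(routes, "10.1.2.3"))    # ['192.168.10.254']
print(lookup(routes, "10.10.5.5"))   # ['10.0.0.1', '10.0.0.5']
print(lookup(routes, "8.8.8.8"))     # ['10.0.0.254']
```

If the default route is removed from the table, a lookup for 8.8.8.8 returns None, which corresponds to the "Drop packet" branch of the diagram.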
6.3 OSPFv2 on AOS-CX
Static routes are fine for two switches. They become a maintenance disaster at twenty. OSPFv2 (Open Shortest Path First, version 2, the IPv4 version) is the link-state interior gateway protocol of choice for Aruba campus and data center designs. Every router in an OSPF area builds an identical map of the area’s topology, then runs Dijkstra’s shortest-path-first algorithm to compute its own routing table. When a link goes down physically, the attached routers react immediately; when a neighbor fails silently, the dead timer (40 seconds by default) detects the loss. Either way, the change is flooded and every router in the area reruns SPF.
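For readers who have not seen Dijkstra's algorithm in code, here is a minimal Python sketch of the SPF calculation. The four-router topology and its link costs are invented for illustration, and a real OSPF implementation works from the LSDB rather than from a plain dictionary.

```python
import heapq

def spf(graph, root):
    """Dijkstra: lowest total cost from root to every reachable router."""
    cost = {root: 0}
    queue = [(0, root)]
    while queue:
        c, node = heapq.heappop(queue)
        if c > cost.get(node, float("inf")):
            continue  # stale queue entry; a cheaper path was already found
        for neighbor, link_cost in graph.get(node, {}).items():
            new_cost = c + link_cost
            if new_cost < cost.get(neighbor, float("inf")):
                cost[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return cost

# Hypothetical topology: router -> {neighbor: interface cost}
topology = {
    "R1": {"R2": 10, "R3": 1},
    "R2": {"R1": 10, "R3": 1, "R4": 5},
    "R3": {"R1": 1, "R2": 1},
    "R4": {"R2": 5},
}

print(spf(topology, "R1"))  # {'R1': 0, 'R2': 2, 'R3': 1, 'R4': 7}
```

R1 reaches R2 at cost 2 through R3 rather than at cost 10 over the direct link, which is exactly the kind of decision SPF makes from the shared topology map.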
Areas and the Backbone
OSPF organizes routers into areas to limit the scope of LSA flooding and SPF recalculation. Area 0 (also written as 0.0.0.0) is the backbone: every other area must connect to area 0, either directly or via a virtual link. Routers with interfaces in two or more areas are Area Border Routers (ABRs).
The analogy here is a corporate org chart. Each department (area) handles its own internal communication and only sends summary reports up to corporate (the backbone). The CFO (the ABR) is the one person with feet in both meetings.
Configuration on AOS-CX
This is one of the most common HPE7-A01 exam traps: AOS-CX does not use Cisco-style network statements under the OSPF process. Instead, OSPF is enabled per interface.
switch(config)# router ospf 1
switch(config-ospf-1)# router-id 1.1.1.1
switch(config-ospf-1)# area 0
switch(config-ospf-1)# exit
switch(config)# interface 1/1/1
switch(config-if)# no shutdown
switch(config-if)# ip address 10.0.1.1/30
switch(config-if)# ip ospf 1 area 0
switch(config-if)# ip ospf network point-to-point
The router ospf 1 command starts process 1 (locally significant — process IDs do not need to match between neighbors). router-id 1.1.1.1 sets a stable identifier explicitly; if you omit it, OSPF picks the highest active loopback IP, or the highest interface IP if no loopbacks exist [Source: https://mytechmemo.net/building-basic-bgp-ospf-network-with-aos-cx/]. Always set the router-id explicitly — relying on automatic selection can cause weird behavior when interfaces flap.
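ip ospf 1 area 0 on the interface attaches it to OSPF process 1, area 0. ip ospf network point-to-point declares this is a point-to-point link between two routers, so no DR/BDR election is needed. The adjacency typically reaches FULL within a few seconds, whereas a broadcast interface can first sit through the Wait timer (equal to the dead interval, 40 seconds by default) before it elects a DR and proceeds.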
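The router-id fallback order described above (explicit value, then highest loopback IP, then highest interface IP) can be sketched as follows. The function name and arguments are hypothetical; the point of the example is that "highest" means numerically highest, not alphabetically highest.

```python
import ipaddress

def select_router_id(explicit=None, loopbacks=(), interfaces=()):
    """Pick a router-id using the fallback order described in the text."""
    if explicit:
        return explicit
    for pool in (loopbacks, interfaces):       # loopbacks are preferred
        if pool:
            # Compare as addresses, not strings: 10.0.2.1 > 9.0.0.1.
            return str(max(ipaddress.IPv4Address(ip) for ip in pool))
    return None

print(select_router_id(explicit="1.1.1.1", loopbacks=["10.255.0.1"]))     # 1.1.1.1
print(select_router_id(loopbacks=["1.1.1.1", "10.255.0.1"],
                       interfaces=["192.168.1.1"]))                       # 10.255.0.1
print(select_router_id(interfaces=["9.0.0.1", "10.0.2.1"]))               # 10.0.2.1
```

The second and third calls show why automatic selection is fragile: adding or removing a single address changes the identifier, which is the argument for always configuring router-id explicitly.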
Loopbacks in OSPF
Always advertise loopbacks into OSPF. They give every router a stable, always-up address.
interface loopback 0
ip address 1.1.1.1/32
ip ospf 1 area 0
Multi-Area OSPF
When the network grows beyond a few dozen routers, splitting into multiple areas reduces LSDB size and SPF run time.
router ospf 1
router-id 2.2.2.2
area 0
area 1
interface 1/1/1
ip ospf 1 area 0 ! to backbone
interface vlan 100
ip ospf 1 area 1 ! to access layer
This switch is now an ABR — area 0 sees its area 0 interfaces, area 1 sees its area 1 interfaces, and the ABR generates summary LSAs (Type 3) into each area to advertise the prefixes from the other.
Figure 6.5: Multi-area OSPF design with backbone, stub, and NSSA
graph TD
subgraph Area0[Area 0 - Backbone]
BB1[Core-1]
BB2[Core-2]
end
subgraph Area1[Area 1 - Standard]
R1[Dist-A]
R2[Dist-B]
end
subgraph Area2[Area 2 - Stub]
R3[Branch-A]
end
subgraph Area3[Area 3 - NSSA]
R4[Edge-Router]
EXT[External Static Routes]
end
BB1 --- BB2
BB1 --- R1
BB2 --- R2
BB1 --- R3
BB2 --- R4
EXT -.redistribute.-> R4
R4 -.Type 7 LSA.-> BB2
Every non-backbone area touches Area 0 through an ABR. The Stub area still receives inter-area summaries (Type 3) but no Type 5 LSAs; a default route from the ABR stands in for all external destinations. The NSSA permits local redistribution via Type 7 LSAs, which the ABR translates to Type 5 before flooding into Area 0.
LSA Types
Each OSPF router and ABR generates and floods specific Link-State Advertisements (LSAs). Memorizing the LSA types is essential exam material.
Table 6.3 — OSPFv2 LSA Types
| Type | Name | Originated By | Scope | Carries |
|---|---|---|---|---|
| 1 | Router LSA | Every OSPF router | Within an area | All of the router’s links and their costs in that area |
| 2 | Network LSA | DR on a multi-access network | Within an area | List of routers attached to the multi-access segment |
| 3 | Summary LSA | ABR | Into a non-originating area | Inter-area prefixes (advertised from another area) |
| 4 | ASBR Summary | ABR | Into a non-originating area | How to reach an ASBR |
| 5 | AS-External | ASBR | Throughout AS (except stub/NSSA) | External (redistributed) prefixes |
| 7 | NSSA External | ASBR within an NSSA | Within the NSSA only (translated to Type 5 at ABR) | External prefixes injected at the NSSA edge |
Stub, Totally Stubby, and NSSA Areas
Stub-area variants exist to keep external LSAs out of areas that don’t need them — typically branch offices or access-layer pods that only need a default route.
Table 6.4 — OSPF Stub Area Variants
| Area Type | Allows Type 3 (inter-area)? | Allows Type 5 (external)? | Allows Type 7 (NSSA external)? | Use Case |
|---|---|---|---|---|
| Standard | Yes | Yes | No | Default — full LSDB |
| Stub | Yes | No (replaced by default route) | No | Branch office with no external redistribution needed |
| Totally Stubby | No (only default route) | No | No | Smallest LSDB; everything reached via default |
| NSSA | Yes | No | Yes | Stub area that itself has an ASBR (e.g., redistributing static into OSPF at the edge) |
| Totally NSSA | No (only default route) | No | Yes | NSSA with maximum LSA filtering |
Configuration is done under the area:
router ospf 1
area 1 stub
Or for totally stubby:
router ospf 1
area 1 stub no-summary
Every router in a stub area must agree on the stub flag — a mismatch breaks adjacency.
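Table 6.4 can be encoded as a small lookup, which is a handy way to quiz yourself. This Python sketch is a study aid only: the names are hypothetical, Type 4 is assumed to travel only where Type 5 does, and the ABR-injected default route is not modelled.

```python
# LSA types admitted into each area type, following Table 6.4.
# Types 1 and 2 are intra-area and always present.
ALLOWED_LSAS = {
    "standard":       {1, 2, 3, 4, 5},
    "stub":           {1, 2, 3},
    "totally-stubby": {1, 2},
    "nssa":           {1, 2, 3, 7},
    "totally-nssa":   {1, 2, 7},
}

def lsa_admitted(area_type, lsa_type):
    """True if an LSA of this type may exist inside the given area type."""
    return lsa_type in ALLOWED_LSAS[area_type]

def stub_flags_match(local_area_type, neighbor_area_type):
    """Neighbors must agree on the area type, or the adjacency will not form."""
    return local_area_type == neighbor_area_type

print(lsa_admitted("stub", 5))             # False
print(lsa_admitted("nssa", 7))             # True
print(stub_flags_match("stub", "standard"))  # False
```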
Authentication and Timers
OSPF supports MD5 authentication per interface to prevent rogue neighbors from injecting routes.
interface 1/1/1
ip ospf message-digest-key 1 md5 plaintext MyOspfKey123
ip ospf authentication message-digest
You can also enforce authentication area-wide so any interface in that area must use it:
router ospf 1
area 0 authentication message-digest
Default OSPF timers on broadcast networks are hello 10 seconds and dead 40 seconds (4 × hello). On point-to-point networks they are also 10/40. To tighten convergence:
interface 1/1/1
ip ospf hello-interval 1
ip ospf dead-interval 4
Both neighbors must use the same hello and dead timers. A router discards hellos whose timers differ from its own, so with a mismatch the adjacency never forms at all. In modern designs, BFD is preferred over aggressive OSPF timers for sub-second detection because it is purpose-built and lighter weight.
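A quick way to reason about adjacency failures is to compare the parameters that both ends must agree on. The Python sketch below models the ones this chapter discusses (area, timers, stub flag, authentication key); the OspfInterface class and hello_mismatches function are hypothetical, and the check is a simplification of what a real OSPF implementation validates.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OspfInterface:
    area: str
    hello: int = 10       # seconds, default from the text
    dead: int = 40        # seconds, 4 x hello by default
    stub: bool = False
    auth_key: str = ""    # empty string means no authentication

def hello_mismatches(a, b):
    """List the parameters on which two interfaces disagree."""
    fields = ("area", "hello", "dead", "stub", "auth_key")
    return [f for f in fields if getattr(a, f) != getattr(b, f)]

left = OspfInterface(area="0", hello=1, dead=4)   # tuned as in the example above
right = OspfInterface(area="0")                   # still on the 10/40 defaults
print(hello_mismatches(left, right))  # ['hello', 'dead']
```

An empty list means the two ends agree on everything modelled here; any entry in the list is a reason for the neighbor to be rejected.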
Passive Interfaces
User-facing SVIs should not form OSPF adjacencies — there are no OSPF speakers in the user VLAN, and a malicious laptop pretending to be an OSPF neighbor is exactly the kind of thing you don’t want. Mark them passive:
interface vlan 10
ip ospf 1 area 0
ip ospf passive
The interface still has its subnet advertised into OSPF (so other routers know how to reach the VLAN), but no hellos are sent and no adjacency forms [Source: https://www.networkacademy.io/ccna/ospf/passive-interface]. A common shortcut is to flip the default with passive-interface default under the OSPF process and then explicitly mark transit interfaces non-passive — fewer mistakes that way.
Verification Commands
switch# show ip ospf
switch# show ip ospf neighbor
switch# show ip ospf interface
switch# show ip route ospf
switch# show ip ospf database
show ip ospf neighbor is the first thing you check. A healthy point-to-point neighbor sits in the FULL state; a healthy multi-access neighbor sits in FULL/DR or FULL/BDR. A neighbor that never appears, or one stuck in INIT or EXSTART, is a problem, as is 2-WAY on a point-to-point link or toward the DR or BDR. The usual causes are a timer, area, or authentication mismatch (hellos are rejected) or an MTU mismatch (the adjacency stalls in EXSTART).
Diagram 6.2 — OSPF Neighbor State Progression
flowchart LR
A[Down] --> B[Init]
B --> C[2-Way]
C --> D[ExStart]
D --> E[Exchange]
E --> F[Loading]
F --> G[Full]
A point-to-point neighbor walks through every state to FULL. On a multi-access network, every router reaches FULL with the DR and the BDR; non-DR/non-BDR routers stay in 2-WAY with each other, which is normal.
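The steady-state rule for neighbor states can be written as a tiny function. This is a study aid with hypothetical names, assuming the standard OSPF behavior that a full adjacency forms on a multi-access segment only when at least one side is the DR or BDR.

```python
def expected_state(network_type, local_role=None, neighbor_role=None):
    """Healthy steady-state neighbor state for a given pair of routers."""
    if network_type == "point-to-point":
        return "FULL"                     # no DR/BDR election on these links
    # Multi-access: full adjacency only when at least one side is DR or BDR.
    if {local_role, neighbor_role} & {"DR", "BDR"}:
        return "FULL"
    return "2-WAY"                        # DROTHER to DROTHER is normal

print(expected_state("point-to-point"))                    # FULL
print(expected_state("broadcast", "DROTHER", "DR"))        # FULL
print(expected_state("broadcast", "DROTHER", "DROTHER"))   # 2-WAY
```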
6.4 First-Hop Redundancy
A host has a single default gateway. If that gateway disappears, the host’s traffic stops — no DHCP renew, no ARP storm, no magical recovery. First-hop redundancy protocols (FHRPs) solve this by presenting a virtual gateway IP that two physical switches both stand ready to answer for.
AOS-CX supports two: VRRP (RFC 5798, the standards-based protocol) and VSX active-gateway (Aruba’s proprietary, VSX-only feature). Understanding the difference between them is one of the most heavily tested topics on the HPE7-A01 exam.
VRRP (Virtual Router Redundancy Protocol)
VRRP elects a master from a group of routers, and the master answers ARP for the virtual IP (VIP). The backup sits idle, monitoring the master via VRRP advertisements (multicast 224.0.0.18). When the master stays silent for the master-down interval (roughly three advertisement intervals), the backup becomes master and takes over the VIP and the virtual MAC.
switch(config)# interface vlan 10
switch(config-if-vlan)# ip address 10.10.10.2/24
switch(config-if-vlan)# vrrp 10 address-family ipv4
switch(config-if-vrrp)# address 10.10.10.1 primary
switch(config-if-vrrp)# priority 150
switch(config-if-vrrp)# preempt
switch(config-if-vrrp)# no shutdown
On the second switch, configure the same VRID (10), the same VIP (10.10.10.1), and a lower priority (the default is 100):
switch2(config)# interface vlan 10
switch2(config-if-vlan)# ip address 10.10.10.3/24
switch2(config-if-vlan)# vrrp 10 address-family ipv4
switch2(config-if-vrrp)# address 10.10.10.1 primary
switch2(config-if-vrrp)# priority 100
switch2(config-if-vrrp)# no shutdown
The switch with priority 150 wins. preempt means it will forcibly take back the master role if it boots after the lower-priority switch has already become master.
VRRP supports interface tracking: if the uplink that is critical for forwarding goes down, VRRP can drop its priority below the peer’s so the peer takes over.
vrrp 10 address-family ipv4
track interface 1/1/1 decrement 60
If 1/1/1 goes down, priority drops by 60 (from 150 to 90), the peer at 100 wins, and traffic flips.
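The arithmetic behind tracking is easy to check in code. The Python sketch below uses the priorities and decrement from this example; the function names are hypothetical, the floor of 1 is an assumption of the sketch, and the election assumes preemption is enabled. The tie-break on the higher primary IP follows the VRRP specification.

```python
import ipaddress

def effective_priority(configured, tracked_up, decrement):
    """Configured priority, lowered while the tracked interface is down."""
    if tracked_up:
        return configured
    return max(1, configured - decrement)   # floor of 1 is an assumption here

def elect_master(routers):
    """routers maps name -> (priority, primary_ip). Highest priority wins;
    equal priorities fall back to the higher primary IP address."""
    return max(routers, key=lambda name: (routers[name][0],
                                          ipaddress.ip_address(routers[name][1])))

def scenario(uplink_up):
    return elect_master({
        "switch1": (effective_priority(150, uplink_up, 60), "10.10.10.2"),
        "switch2": (100, "10.10.10.3"),
    })

print(scenario(uplink_up=True))    # switch1 (150 beats 100)
print(scenario(uplink_up=False))   # switch2 (90 loses to 100)
```

The sketch also makes one design constraint visible: the decrement must be larger than the gap between the two priorities (here more than 50), or the tracked failure will not move the master role.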
Figure 6.6: VRRP master election and failover
sequenceDiagram
participant H as Host (GW 10.10.10.1)
participant M as Switch-1 (Pri 150)
participant B as Switch-2 (Pri 100)
M->>B: VRRP advertisement (multicast 224.0.0.18, Pri 150)
Note over M,B: Switch-1 wins election (highest priority)
M->>H: ARP reply for 10.10.10.1 = vMAC 00:00:5E:00:01:0A
H->>M: Forwards traffic to vMAC
Note over M: Tracked uplink 1/1/1 fails
M->>M: Priority decremented 150 - 60 = 90
M->>B: VRRP advertisement (Pri 90)
Note over B: B (100) > M (90), B becomes Master
B->>H: Gratuitous ARP for 10.10.10.1 = same vMAC
H->>B: Forwards traffic to vMAC (no host re-ARP needed)
Note over M: Uplink restored, Pri returns to 150
M->>B: VRRP advertisement (Pri 150)
Note over M,B: With preempt enabled, M reclaims Master role
Hosts never see a gateway change because the virtual MAC is preserved across failover; only the switch answering for it changes. Without preempt, Switch-2 would remain Master indefinitely after recovery.
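The virtual MAC in the diagram is not arbitrary: for IPv4, VRRP builds it from the fixed prefix 00:00:5E:00:01 followed by the VRID as one hexadecimal byte. A short Python helper (a hypothetical name, for illustration) shows the derivation.

```python
def vrrp_ipv4_vmac(vrid):
    """Return the VRRP virtual MAC for an IPv4 virtual router ID (1-255)."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be between 1 and 255")
    return f"00:00:5E:00:01:{vrid:02X}"

print(vrrp_ipv4_vmac(10))   # 00:00:5E:00:01:0A
```

Because the MAC depends only on the VRID, both switches derive the same address without exchanging it, which is why the host's ARP entry survives a failover.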
VSX Active-Gateway
VSX active-gateway is Aruba’s elegant simplification of FHRP for VSX clusters. Instead of electing a master, both VSX peers actively forward traffic for the same virtual IP and virtual MAC. Hosts ARP for the gateway, get the shared virtual MAC, and any frame destined to that MAC is routed locally by whichever VSX peer happens to receive it.
! Configure identically on both VSX peers
interface vlan 10
ip address 10.10.10.2/24 ! unique per peer (.2 on left, .3 on right)
active-gateway ip 10.10.10.1 mac 02:00:00:00:01:00
That is the entire configuration. There is no priority, no preemption, no advertisement timer to tune. The VSX control channel (the ISL) keeps the two peers in sync, and both respond to ARP with the same virtual MAC.
This works because of how traffic is hashed onto a VSX MC-LAG. A downstream host’s frame travels up an MC-LAG link and lands on whichever peer the downstream device’s LAG hash selected (LACP negotiates the bundle; it does not choose the link per frame). That peer routes it locally, with no ISL traversal and no protocol gymnastics. If a VSX peer fails, the surviving peer simply continues to forward, and the failed peer’s MC-LAG members re-hash to the survivor.
VRRP vs. Active-Gateway
Table 6.5 — VRRP vs. VSX Active-Gateway
| Feature | VRRP | VSX Active-Gateway |
|---|---|---|
| Forwarding mode | Active-standby (master only) | Active-active (both peers forward) |
| Configuration | VRID, priority, VIP, timers, preempt per VLAN | Single line per SVI: active-gateway ip <vip> mac <vmac> |
| Virtual MAC | Master owns; transfers on failover | Shared by both peers continuously |
| Protocol overhead | Multicast advertisements on every VLAN (every second by default) | None; uses the VSX control channel |
| ISL traversal | Backup sends north-bound traffic across ISL to master | None — each peer routes locally |
| Convergence | Sub-second with aggressive timers | Sub-second; both already forwarding |
| Compatibility | Standards-based; works between any vendors | VSX-only; mutually exclusive with VRRP per VLAN |
| Best fit | Non-VSX redundancy; multi-vendor; standalone cores | VSX pairs (campus distribution, DC ToR) |
[Source: https://egstory.net/en/acp-dc-%EA%B5%90%EC%9C%A19-vsx-1/] [Source: https://community.arubanetworks.com/viewdocument/?DocumentKey=90070d8f-038e-4fc2-915a-537f68a3a1c7]
Diagram 6.3 — Active-Gateway vs. VRRP Traffic Flow
flowchart TB
subgraph AG[VSX Active-Gateway]
H1[Host A] --> L1[VSX-1 routes locally]
H2[Host B] --> L2[VSX-2 routes locally]
end
subgraph V[VRRP]
H3[Host C] --> M[Master forwards]
H4[Host D] --> B[Backup must hop ISL to master]
B --> M
end
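The difference shown in the diagram can be expressed as the list of hops a routed packet takes inside the VSX pair. This Python sketch follows the model described in this chapter; the function, the peer names, and the choice of VSX-1 as VRRP master are all illustrative assumptions.

```python
def forwarding_path(fhrp, receiving_peer, vrrp_master="VSX-1"):
    """Hops inside the VSX pair before a host's packet is routed."""
    if fhrp == "active-gateway":
        return [receiving_peer]                    # routed where it lands
    if fhrp == "vrrp":
        if receiving_peer == vrrp_master:
            return [receiving_peer]
        return [receiving_peer, "ISL", vrrp_master]  # backup hands off to master
    raise ValueError(f"unknown FHRP: {fhrp}")

print(forwarding_path("active-gateway", "VSX-2"))  # ['VSX-2']
print(forwarding_path("vrrp", "VSX-2"))            # ['VSX-2', 'ISL', 'VSX-1']
```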
Why Active-Gateway Wins in VSX
- Active-active forwarding. With VRRP in a VSX pair, the backup peer cannot route L3 traffic that hashed to it through the MC-LAG; it has to switch that traffic across the ISL to the master. Active-gateway lets each peer route locally, freeing ISL bandwidth for genuine east-west traffic (and keeping latency lower).
- Configuration simplicity. A single line per SVI versus six or more for VRRP, multiplied by every VLAN you have. On a campus with 50 VLANs, that is a meaningful operational difference.
- No protocol overhead. Active-gateway state is synchronized via the VSX control channel, which you already have. VRRP sends multicast advertisements every second on every VLAN.
- DHCP relay friendly. Both peers can act as the relay agent for the same VIP simultaneously.
- Symmetric ECMP. Northbound, both peers contribute equal-cost paths to the OSPF/BGP fabric. With VRRP, only the master routes host traffic, so the backup peer’s uplinks carry little of it and part of the upstream capacity sits idle.
When VRRP Is the Right Choice
- The L3 redundancy is between switches that are not in a VSX pair (two standalone cores; a CX peering with a Cisco/Juniper device).
- Multi-vendor environments where you must use a standards-based FHRP.
- Designs where active-standby semantics are explicitly required (e.g., a strict primary/secondary policy for license or compliance reasons).
Active-gateway and VRRP are mutually exclusive on the same VLAN. Choose one per SVI [Source: https://airheads.hpe.com/discussion/aruba-cx-vrrp-between-vsx-clusters].
Tracking and Preemption
The two mechanisms react to failures differently:
- VRRP tracking decrements priority when a tracked interface or route goes down. The peer whose priority is now higher takes over.
- Active-gateway does not need explicit tracking — if a VSX peer dies, MC-LAG removes its links from the bond and the surviving peer naturally absorbs the traffic. ISL failure between peers triggers VSX split-recovery logic, not a gateway re-election.
Preemption in VRRP means a higher-priority router that comes back online forcibly reclaims the master role. Without preempt, whoever is master stays master until it fails. In a campus design, preemption is usually desirable — you want the bigger core box to be primary when both are healthy.
Verification
switch# show vrrp
switch# show vrrp brief
switch# show interface vlan 10
switch# show vsx status
switch# show vsx active-gateway
switch# show ip route
For VRRP, you want to see one peer as Master and the other as Backup, with the correct VIP and priorities. For active-gateway, you want to see the VIP/VMAC active on both peers and the VSX status as “in-sync.”
Chapter Summary
Layer 3 on AOS-CX rests on a small but powerful set of primitives. SVIs serve as default gateways for VLANs, routed ports give you clean point-to-point inter-switch links, and loopbacks anchor the device with a stable identity that does not depend on any single physical port. Static routes — including default routes, floating backups, and ECMP groups — handle deterministic plumbing where simplicity matters more than dynamics.
OSPFv2 scales those plumbing decisions from a few switches to an entire enterprise. AOS-CX configures OSPF per interface (ip ospf <id> area <area>), not via Cisco-style network statements. Backbone area 0 anchors the design; additional areas reduce LSDB size and isolate flooding. Stub and NSSA variants further restrict LSAs into edge areas. MD5 authentication, passive interfaces, and tuned timers harden and tune the protocol, and a clear set of show ip ospf commands makes troubleshooting tractable.
For first-hop redundancy, VRRP gives you a standards-based active-standby model that works anywhere, while VSX active-gateway gives you an active-active, lower-overhead model that wins inside a VSX pair. The exam will expect you to recognize that active-gateway is the default choice in VSX deployments and that VRRP belongs in non-VSX or multi-vendor designs. Tracking and preemption are VRRP concepts; active-gateway has no priority to adjust and relies on VSX and MC-LAG state instead. Each is verified with its own dedicated show commands.
Together, these topics turn AOS-CX from a pile of switches into a routed, redundant, self-healing network — the kind that survives a midnight cable cut without anyone having to wake up.
Key Takeaways
- ip routing is the global on/off switch for inter-VLAN forwarding.
- SVIs (interface vlan <id>) are gateways for VLANs; routed ports (routing) are L3 transit links; loopbacks are always-up logical interfaces.
- Static routes use ip route <prefix>/<len> <next-hop> [distance]; equal-cost statics or OSPF routes form ECMP automatically.
- AOS-CX OSPF is enabled per interface with ip ospf <id> area <id>; there are no network statements under the OSPF process.
- Always set router-id explicitly and advertise loopbacks into area 0.
- ip ospf network point-to-point on inter-switch links removes DR/BDR election and speeds adjacency.
- Passive interfaces (ip ospf passive) suppress hellos on user-facing SVIs while still advertising the subnet.
- VRRP is active-standby; VSX active-gateway is active-active and uses a shared VMAC across both peers.
- Active-gateway is preferred inside a VSX pair; VRRP is preferred for non-VSX or multi-vendor designs.
- VRRP and active-gateway are mutually exclusive on the same VLAN.
Key Terms
- SVI (Switched Virtual Interface): A logical L3 interface bound to a VLAN; created with interface vlan <id>.
- Routed port: A physical port stripped of switching behavior and given an IP directly; configured with routing on AOS-CX (the counterpart of no switchport on Cisco IOS).
- Loopback: A virtual L3 interface that is always up; used for stable router-id and management.
- Static route: A manually configured route entry; default AD is 1.
- Floating static: A static route configured with high AD so it is only installed when the preferred dynamic route is gone.
- ECMP (Equal-Cost Multipath): Multiple equal-cost paths to the same prefix installed simultaneously and load-balanced.
- OSPF (Open Shortest Path First): A link-state interior gateway protocol; OSPFv2 carries IPv4 routes.
- Area: OSPF’s scope of LSDB synchronization; area 0 is the backbone.
- ABR (Area Border Router): A router with interfaces in two or more OSPF areas; generates Type 3 summary LSAs.
- ASBR (Autonomous System Boundary Router): A router that redistributes external routes into OSPF; generates Type 5 (or Type 7 in NSSA) LSAs.
- LSA (Link-State Advertisement): OSPF’s unit of topology information; types 1, 2, 3, 4, 5, 7 are tested material.
- Stub / Totally Stubby / NSSA: OSPF area types that filter external and/or inter-area LSAs to shrink the edge LSDB.
- VRRP (Virtual Router Redundancy Protocol): Standards-based active-standby first-hop redundancy.
- VSX active-gateway: Aruba VSX active-active first-hop redundancy using a shared virtual MAC across both peers.
- VIP (Virtual IP): The IP address hosts use as their default gateway, owned by the VRRP master or shared by both VSX peers.
- VMAC (Virtual MAC): The MAC address that hosts learn for the gateway VIP.
- Preemption: A higher-priority VRRP router reclaiming the master role after recovering.
- Tracking: Reducing priority (VRRP) or triggering failover (active-gateway via VSX state) when a monitored interface or route fails.
Chapter 7: Switch Security: Authentication, Access Control, and Port Security
A campus switch is the front door of the enterprise network. Every printer, kiosk, IP phone, laptop, contractor tablet, and IoT camera plugs into it, and once a frame is forwarded the rest of the network has to trust the sender. The HPE Aruba Networking Switching Associate (HPE7-A01) exam expects you to treat that front door like a security checkpoint: authenticate the user or device, hand it a role that says what it is allowed to do, enforce that role with ACLs and Classifier policies, and finally lock the management plane so no attacker can simply log into the switch and erase the rules. This chapter walks through each of those steps in the order the packet experiences them — from EAPOL handshake to RADIUS-authorized VLAN to ACL hit-count to TACACS+ admin login.
Think of the switch as the lobby of a corporate building. The 802.1X handshake is the security guard checking your badge. The user role is the visitor sticker that says which floors you can ride to. The ACL is the elevator’s card reader that enforces it. And the management plane hardening is the locked supply closet behind reception where the building blueprints live — the place even legitimate staff need a separate key for. Lose any one of those, and the whole lobby model collapses.
7.1 Port-Based Authentication
Port-based authentication means the switch refuses to forward user traffic on an access port until the connecting device proves who it is. AOS-CX implements this with the port-access framework, which unifies 802.1X, MAC authentication, and captive portal under a single authenticator state machine.
7.1.1 The 802.1X EAP Flow
802.1X is an IEEE standard that wraps the Extensible Authentication Protocol (EAP) inside Layer-2 frames called EAPOL (EAP over LAN). Three actors are involved:
- Supplicant — the client, typically the OS-native 802.1X service on a laptop or phone.
- Authenticator — the AOS-CX switch port. It is the gatekeeper: it relays EAP between the supplicant and the RADIUS server but does not itself decide whether the user is allowed in.
- Authentication Server — the RADIUS server, almost always Aruba ClearPass Policy Manager in HPE designs.
The handshake follows a strict sequence. The supplicant opens with an EAPOL-Start, or the switch, on detecting link-up, sends an EAP-Request/Identity unprompted; only the supplicant ever sends EAPOL-Start. The supplicant returns its identity in EAP-Response/Identity, which the switch encapsulates inside a RADIUS Access-Request and forwards to ClearPass. ClearPass and the supplicant then negotiate an EAP method, most commonly EAP-TLS (certificates), PEAP-MSCHAPv2 (AD password inside a TLS tunnel), or EAP-TTLS. The switch is a transparent relay during this exchange. When ClearPass is satisfied it returns a RADIUS Access-Accept that may include vendor-specific attributes (VSAs) such as the Aruba user role or a VLAN ID, and the switch finally moves the port from the unauthenticated to the authorized state [Source: https://h22260.www2.hpe.com/labguides/campus-CX%20-%20TE1/te1221/te1221-security/te1-221-5.1.html] [Source: https://mytechmemo.net/simple-802-1x-authentication-on-aos-cx/].
Figure 7.1: 802.1X EAP exchange across supplicant, authenticator, and RADIUS server
sequenceDiagram
participant S as Supplicant<br/>(Laptop)
participant A as Authenticator<br/>(AOS-CX Switch)
participant R as RADIUS Server<br/>(ClearPass)
Note over S,A: EAPOL (Layer 2)
Note over A,R: RADIUS (UDP 1812)
S->>A: EAPOL-Start
A->>S: EAP-Request/Identity
S->>A: EAP-Response/Identity
A->>R: RADIUS Access-Request (EAP)
R->>A: RADIUS Access-Challenge (EAP)
A->>S: EAP-Request (method: TLS/PEAP)
S->>A: EAP-Response (credentials)
A->>R: RADIUS Access-Request
Note over S,R: Inner EAP method exchange<br/>(TLS handshake / MSCHAPv2)
R->>A: RADIUS Access-Accept<br/>(VSA: role, VLAN)
A->>S: EAP-Success
Note over A: Port moves to<br/>AUTHORIZED state
A minimal AOS-CX configuration to make this happen looks like this:
radius-server host 10.10.0.105 key plaintext aruba123 vrf mgmt
aaa authentication port-access dot1x authenticator
vlan 10
port-access role LUR
vlan access 10
interface 1/1/1
no shutdown
no routing
vlan access 1
aaa authentication port-access dot1x authenticator
enable
Three things deserve attention. First, the vrf mgmt qualifier on the RADIUS server tells the switch to send authentication traffic out the dedicated out-of-band management VRF, which is the recommended hardening pattern. Second, the port-access role LUR block defines a Local User Role (LUR) — a named policy stored on the switch itself that pins the authenticated client into VLAN 10. Third, aaa authentication port-access dot1x authenticator enable under the interface activates the authenticator role on that specific port [Source: https://h22260.www2.hpe.com/labguides/campus-CX%20-%20TE1/te1221/te1221-security/te1-221-5.1.html].
When ClearPass returns its own role via the Aruba VSA, the local role is overridden. AOS-CX flags this on the CLI by suffixing the role name with an asterisk (for example test_LUR*), which is your visual cue that the role came from RADIUS, not from the running config [Source: https://h22260.www2.hpe.com/labguides/campus-CX%20-%20TE1/te1221/te1221-security/te1-221-5.1.html].
7.1.2 MAC Authentication as Fallback
802.1X assumes the device has a supplicant. Plenty of devices do not — printers, badge readers, label printers, building-management sensors, and many older IP cameras have no concept of EAP. For these, AOS-CX supports MAC authentication (often called MAC-Auth or MAB, MAC Authentication Bypass). The switch simply sends the device’s MAC address to RADIUS as both username and password (or password derived from a shared secret), and ClearPass decides whether that MAC belongs to a known endpoint, often consulting an internal endpoint database or an external profiling source.
The combined dot1x plus mac-auth deployment is the production norm. The switch tries 802.1X first; if no EAPOL is heard within a timeout window, or if 802.1X fails, the port falls back to MAC authentication. On AOS-CX both methods are enabled per interface and the order is configurable [Source: https://airheads.hpe.com/discussion/aos-cx-mac-auth-and-8021x]:
interface 1/1/1
aaa authentication port-access dot1x authenticator
enable
aaa authentication port-access mac-auth
enable
The analogy here is helpful: 802.1X is the staff entrance where employees badge in with a chip card. MAC authentication is the loading dock where a recognized truck (by license plate) is allowed to back in even though the driver has no badge. Both are doors to the same building, both are logged, but they accept different proof of identity.
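The fallback logic can be sketched as a loop over the configured method order. This Python model is deliberately simplified: the authenticate function and its boolean inputs are hypothetical stand-ins for what the switch and RADIUS server actually exchange, and timeouts are reduced to a single "was EAPOL seen" flag.

```python
def authenticate(eapol_seen, dot1x_ok, mac_known, order=("dot1x", "mac-auth")):
    """Return the method that authorized the client, or None if none did."""
    for method in order:
        if method == "dot1x" and eapol_seen and dot1x_ok:
            return "dot1x"
        if method == "mac-auth" and mac_known:
            return "mac-auth"
    return None   # no method succeeded; the port stays unauthorized

# Laptop with a working supplicant
print(authenticate(eapol_seen=True, dot1x_ok=True, mac_known=False))   # dot1x
# Printer with no supplicant, MAC known to the RADIUS server
print(authenticate(eapol_seen=False, dot1x_ok=False, mac_known=True))  # mac-auth
# Unknown device with no supplicant
print(authenticate(eapol_seen=False, dot1x_ok=False, mac_known=False)) # None
```

Reversing the order tuple models a port configured to try MAC authentication first, in which case a known MAC is authorized without ever attempting 802.1X.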
7.1.3 Multi-Domain and Multi-Host Modes
A single access port often serves more than one device. The classic case is a desk where an IP phone and a PC share the same wall jack; the PC dangles off the phone’s pass-through port. AOS-CX provides port-access modes to handle this:
| Mode | Behavior | Typical Use |
|---|---|---|
| Single-host | Exactly one MAC may authenticate; additional MACs are denied. | High-security ports, server racks. |
| Multi-host | First MAC authenticates; once authorized, the port is open to all subsequent MACs without authentication. | Legacy mode; rarely recommended. |
| Multi-domain | Two MACs allowed: one in the voice domain (VLAN tagged, typically the IP phone) and one in the data domain (untagged, the PC). Each authenticates independently. | Phone-plus-PC desks. |
| Multi-auth | Every MAC behind the port authenticates separately; each can land in its own VLAN/role. | Mini-conference rooms, hubs, virtualized hosts with multiple VMs sharing a NIC. |
The voice/data split in multi-domain mode lets the switch trust a tagged voice VLAN from a Cisco/Polycom/Aruba phone while still forcing the PC behind it to do its own 802.1X dance. That separation is what lets a single wall jack carry two different security postures without compromising either.
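The behaviors in the table above can be condensed into one decision function. This Python sketch mirrors the table's descriptions only; the admit function and its string labels are hypothetical and do not correspond to AOS-CX commands or outputs.

```python
def admit(mode, existing, domain="data"):
    """Decide what happens to a newly seen MAC on a port.

    existing: list of domains ('data' or 'voice') of MACs already on the port.
    Returns 'authenticate', 'allow' (no authentication), or 'deny'.
    """
    if mode == "single-host":
        return "authenticate" if not existing else "deny"
    if mode == "multi-host":
        return "authenticate" if not existing else "allow"
    if mode == "multi-domain":
        # One MAC per domain: a phone (voice) plus a PC (data).
        return "authenticate" if domain not in existing else "deny"
    if mode == "multi-auth":
        return "authenticate"      # every MAC proves itself separately
    raise ValueError(f"unknown mode: {mode}")

print(admit("multi-domain", ["voice"], domain="data"))   # authenticate
print(admit("multi-domain", ["voice", "data"], "data"))  # deny
print(admit("multi-host", ["data"]))                     # allow
```

The multi-host result is the reason that mode is rarely recommended: after the first device authenticates, every later device is allowed without proving anything.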
7.1.4 Captive Portal Integration
For guests and BYOD devices, certificate-based or AD-based 802.1X is overkill. AOS-CX integrates with ClearPass Guest (or any captive portal) via a redirect role. The flow is:
- Unknown MAC connects, MAC-Auth is sent to ClearPass.
- ClearPass returns a “guest-redirect” role: VLAN with limited reachability (DHCP, DNS, ClearPass) plus a redirect URL.
- The user opens a browser; the switch captures HTTP traffic and redirects to the ClearPass Guest portal.
- User registers, accepts the AUP, or logs in via sponsor.
- ClearPass issues a Change-of-Authorization (CoA) over RADIUS dynamic authorization, and the switch re-authenticates the client into a “guest-authorized” role with full Internet (but not internal) reachability.
CoA requires radius dyn-authorization enable on the switch so ClearPass can push state changes asynchronously [Source: https://airheads.hpe.com/discussion/aos-cx-mac-auth-and-8021x]. Without it the switch only knows what it learned at the original Access-Accept moment and ClearPass cannot revoke or upgrade a session at runtime.
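The role transition in the guest flow, and the effect of leaving dynamic authorization off, can be sketched as follows. The classes and method names are hypothetical; only the role names and the dependency on dynamic authorization come from the text above.

```python
from dataclasses import dataclass


@dataclass
class GuestSession:
    mac: str
    role: str = "unauthenticated"


class Switch:
    def __init__(self, dyn_authorization: bool):
        # Models whether RADIUS dynamic authorization is enabled on the switch.
        self.dyn_authorization = dyn_authorization

    def on_access_accept(self, session: GuestSession, role: str) -> None:
        """Apply the role returned with the original Access-Accept."""
        session.role = role

    def on_coa(self, session: GuestSession, new_role: str) -> bool:
        """Apply a Change-of-Authorization; ignored when dyn-authorization is off."""
        if not self.dyn_authorization:
            return False
        session.role = new_role
        return True


if __name__ == "__main__":
    sw = Switch(dyn_authorization=False)
    s = GuestSession(mac="aa:bb:cc:dd:ee:ff")
    sw.on_access_accept(s, "guest-redirect")
    sw.on_coa(s, "guest-authorized")
    print(s.role)  # the session is still in the redirect role
```

With dynamic authorization disabled, the session stays in the redirect role even after the user registers, which is the usual symptom of a missing CoA configuration.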
Verification commands every administrator should memorize:
show port-access clients
show aaa authentication port-access interface 1/1/1 client-status
show running-config port-access
show port-access clients lists every authenticated session with VLAN, role, authentication method, and session timer. The detailed client-status form adds EAP method, RADIUS attributes received, and any CoA history [Source: https://h22260.www2.hpe.com/labguides/campus-CX%20-%20TE1/te1221/te1221-security/te1-221-5.1.html].
7.2 User Roles and Device Profiles
Authentication tells the switch who connected. Authorization tells it what they can do. AOS-CX expresses authorization with roles — named policies bundling VLAN, ACLs, QoS, captive portal redirect, session timeout, and reauthentication interval. Roles come in two flavors and are complemented by device profiles for unauthenticated discovery scenarios.
7.2.1 Local User Roles (LUR)
A Local User Role lives in the switch’s running configuration. It is defined once with port-access role <name> and referenced either by the local fallback config or by the name returned in a RADIUS VSA. Because the role is local, the switch can apply it instantly with no extra round-trip.
port-access role employee-data
vlan access 100
reauth-period 3600
session-timeout 28800
auth-mode client-mode
LURs are perfect for small deployments and for fallback when ClearPass is unreachable. The drawback is sprawl: every change to the policy requires touching every switch. For a campus with hundreds of access switches, LURs alone become a maintenance burden, which is why DUR exists.
7.2.2 Downloadable User Roles (DUR)
A Downloadable User Role is defined entirely in ClearPass and pulled to the switch on demand over HTTPS/REST after a successful authentication. The switch caches the role in volatile memory; it is cleared on reboot or on policy update. DUR centralizes policy in ClearPass — change a rule once, and every switch that authenticates a client into that role gets the new version without any switch-side configuration push [Source: https://www.flomain.de/2022/06/aruba-downloadable-user-roles/] [Source: https://airheads.hpe.com/discussion/aos-cx-downloadable-user-role-dur-simple-steps-to-configure].
Three switch-side prerequisites:
- Trust Anchor (TA) profile for ClearPass HTTPS: the switch must validate the ClearPass certificate before downloading any role config.
crypto pki ta-profile cp
crypto pki ta-import cp common <certificate-file> password <password>
- A read-only ClearPass admin account dedicated to DUR. The switch authenticates to ClearPass’s REST API as this user.
radius-server host 10.10.0.105 key plaintext aruba123
radius-server clearpass username dur-reader password plaintext <pwd>
- HTTPS reachability from the switch’s source VRF to ClearPass on TCP/443.
On the ClearPass side, the administrator builds the role with the Aruba Downloadable Role Enforcement template (under Configuration > Enforcement > Profiles), selects AOS-CX as the product, and authors the role body in the same CLI grammar the switch expects. A typical ClearPass-defined DUR body looks like:
class ip http-only
10 match tcp any any eq 80
20 match tcp any any eq 443
policy http-only-policy
10 class ip http-only action permit
20 class ip any action drop
port-access role contractor
vlan access 200
policy http-only-policy
After authentication the switch logs Type: clearpass and Status: Completed for the downloaded role, visible in show port-access clients detail and show user-role <name> [Source: https://airheads.hpe.com/discussion/aos-cx-downloadable-user-role-dur-simple-steps-to-configure] [Source: https://support.hpe.com/hpesc/public/docDisplay?docId=sf000094631en_us&docLocale=en_US].
Figure 7.2: Downloadable User Role flow between switch and ClearPass
sequenceDiagram
participant C as Client
participant SW as AOS-CX Switch
participant CP as ClearPass<br/>(RADIUS + REST)
C->>SW: 802.1X / MAC-Auth
SW->>CP: RADIUS Access-Request
CP->>SW: RADIUS Access-Accept<br/>(VSA: role-name = "contractor")
Note over SW: Role not in local config<br/>Trigger DUR fetch
SW->>CP: HTTPS GET /api/role/contractor<br/>(TA profile validates cert)
CP->>SW: 200 OK<br/>(role body: VLAN, class, policy)
Note over SW: Cache role in volatile memory<br/>show port-access clients<br/>Type: clearpass
SW->>C: Port AUTHORIZED into role
Note over CP,SW: Later: policy change in ClearPass
CP->>SW: RADIUS CoA (Disconnect/Reauth)
SW->>CP: Re-fetch updated DUR via HTTPS
| Aspect | Local User Role (LUR) | Downloadable User Role (DUR) |
|---|---|---|
| Storage | Switch running-config | ClearPass, downloaded volatilely |
| Change management | Per-switch CLI push | Single ClearPass edit |
| Survives reboot? | Yes (in startup-config) | No — re-downloaded after auth |
| Requires HTTPS to ClearPass? | No | Yes |
| Best for | Small sites, fallback | Campus-wide consistent policy |
| Visible suffix when applied via RADIUS | * | * plus Type: clearpass |
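The difference between the two role types comes down to where the switch looks for the role body. The sketch below is one way to picture it, assuming a hypothetical `fetch_from_clearpass` callable that stands in for the HTTPS download; the class name and cache structure are illustrative, not a description of AOS-CX internals.

```python
class RoleResolver:
    def __init__(self, local_roles: dict, fetch_from_clearpass):
        self.local_roles = dict(local_roles)  # roles in the running configuration
        self.fetch = fetch_from_clearpass     # stand-in for the HTTPS download
        self.dur_cache = {}                   # volatile: lost on reboot

    def resolve(self, name: str):
        """Return (source, role_body) for the role name sent by RADIUS."""
        if name in self.local_roles:
            return ("local", self.local_roles[name])  # no extra round-trip
        if name not in self.dur_cache:
            self.dur_cache[name] = self.fetch(name)   # download on demand
        return ("clearpass", self.dur_cache[name])

    def reboot(self) -> None:
        """Local roles survive; downloaded roles must be fetched again."""
        self.dur_cache.clear()


if __name__ == "__main__":
    resolver = RoleResolver({"employee-data": {"vlan": 100}},
                            lambda name: {"vlan": 200})
    print(resolver.resolve("employee-data"))
    print(resolver.resolve("contractor"))
```

The first lookup is answered from local configuration; the second triggers a download and is cached until the next reboot or policy update.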
7.2.3 Device Profiles via LLDP
Not every device authenticates. A new wireless access point plugged into a switch port has no 802.1X supplicant, but it does send LLDP-MED announcements identifying itself as an Aruba AP. A device profile is an AOS-CX feature that watches LLDP TLVs and applies a pre-built role when a match is found, before any authentication is attempted. Common matches include the LLDP system description (e.g., “Aruba AP-515”), system capabilities, or specific OUI patterns in the chassis ID.
port-access lldp-group aruba-ap
match sys-desc Aruba
port-access device-profile aps
associate role ap-role
associate lldp-group aruba-ap
enable
When the switch sees LLDP from an AP, it instantly drops the port into ap-role (perhaps a trunk allowing AP management VLAN and SSID-mapped VLANs) before the AP ever passes user traffic. Device profiles solve the chicken-and-egg problem of provisioning APs on ports that would otherwise be locked down by 802.1X. They are also widely used for Aruba IoT gateways and certain certified phones.
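The matching step can be pictured with a short sketch. It assumes a plain substring match against the LLDP system description; the switch's actual matching rules may differ, and the dictionary keys used here are hypothetical.

```python
def match_device_profile(lldp_sys_desc: str, profiles: list):
    """Return the role of the first enabled profile whose pattern matches."""
    for profile in profiles:
        if not profile["enabled"]:
            continue
        # Assumption: a simple substring test on the system description.
        if any(p in lldp_sys_desc for p in profile["sys_desc_patterns"]):
            return profile["role"]
    return None  # no match: the port stays under normal port-access rules


if __name__ == "__main__":
    profiles = [{"name": "aps", "enabled": True,
                 "sys_desc_patterns": ["Aruba"], "role": "ap-role"}]
    print(match_device_profile("Aruba AP-515", profiles))
```

A device whose description matches no profile returns `None`, which corresponds to the port falling through to 802.1X or MAC authentication.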
7.2.4 Role-Based VLAN and ACL Assignment
Whether local or downloaded, every role can pin three things:
- VLAN: vlan access <id> for untagged data, vlan trunk for tagged.
- Classifier policy / ACL: applied as an inline policy under the role, restricting the client’s traffic.
- Session attributes: reauth period, idle timeout, captive portal redirect URL.
This is the heart of the AOS-CX zero-trust posture: a single port on a single switch can host an executive (full VLAN 10 access), a contractor (Internet-only via VLAN 200 + permit-http policy), and an IoT camera (VLAN 30 + deny-everything-but-NVR policy) simultaneously, each authenticated and constrained by its role.
7.3 ACLs and Classifier Policies
Roles describe intent; ACLs and classifier policies enforce it in hardware. AOS-CX supports both named ACLs for stateless packet filtering and classifier policies that combine match (class-map) and action (policy-map) for more granular treatment such as QoS marking or traffic redirection.
7.3.1 Standard vs Extended ACLs
In legacy IOS-style usage, standard ACLs match only on source IP and extended ACLs match on source, destination, protocol, and port. AOS-CX collapses this distinction: every ACL is a named list whose ACE syntax can match any combination of L3/L4 fields. The historical labels still appear in documentation, but practically:
- Use a “standard”-style ACL (source-only) when filtering management plane sources or simple route-map matches.
- Use an “extended”-style ACL (full 5-tuple) when filtering data plane traffic between hosts.
The CLI grammar is uniform [Source: https://support.hpe.com/hpesc/public/docDisplay?docId=sd00007196en_us&page=GUID-E87F12E9-712B-494C-B924-326995B7A2D4.html&docLocale=en_US]:
access-list ip BLOCK-RFC1918
10 deny ip 10.0.0.0/8 any
20 deny ip 172.16.0.0/12 any
30 deny ip 192.168.0.0/16 any
40 permit ip any any
For IPv6 the keyword changes to ipv6:
access-list ipv6 V6-FILTER
10 permit ipv6 2001:db8::/32 any
20 deny ipv6 any any
7.3.2 Stateless Filtering on AOS-CX
AOS-CX ACLs are stateless: each packet is evaluated independently against the ACE list, top-down, first-match-wins, with an implicit deny any at the end of every list. There is no flow table tracking the TCP three-way handshake the way a stateful firewall would. This has practical consequences:
- Return traffic must be permitted explicitly: a permit tcp any any eq 443 for outbound HTTPS does not automatically permit the SYN-ACK back. You add permit tcp any eq 443 any established (matches packets with ACK or RST set) for the return path.
- ACLs scale to wire speed because they are programmed into TCAM. A 6300 or 8360 evaluates millions of packets per second against an ACL with no measurable latency.
- The trade-off: you cannot do payload inspection or application-layer filtering. For that, you still need a stateful firewall north of the access layer.
The “first-match-wins” rule rewards careful ordering. A common mistake is putting a permissive permit ip any any early in the list, which silently bypasses every later deny. Read your ACL top-to-bottom the way the switch does, every time you edit it.
Figure 7.3: ACL evaluation pipeline (top-down, first-match-wins)
flowchart TD
A[Packet arrives at<br/>port / VLAN / SVI / VNI] --> B{ACL applied<br/>in this direction?}
B -- No --> Z[Forward normally]
B -- Yes --> C[ACE 10:<br/>match attempt]
C -- Match --> C1{Action?}
C1 -- permit --> P[Forward<br/>increment hit-count]
C1 -- deny --> D[Drop<br/>increment hit-count]
C -- No match --> E[ACE 20:<br/>match attempt]
E -- Match --> E1{Action?}
E1 -- permit --> P
E1 -- deny --> D
E -- No match --> F[ACE N:<br/>match attempt]
F -- Match --> F1{Action?}
F1 -- permit --> P
F1 -- deny --> D
F -- No match --> G[Implicit<br/>deny any any]
G --> D
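The evaluation pipeline in Figure 7.3 can be expressed in a few lines. The sketch below matches on source address only and reuses the BLOCK-RFC1918 entries from Section 7.3.1; the tuple layout and the `evaluate` function are illustrative, and real ACEs match on more fields.

```python
import ipaddress

# (sequence number, action, source network), mirroring BLOCK-RFC1918
ACL = [
    (10, "deny", "10.0.0.0/8"),
    (20, "deny", "172.16.0.0/12"),
    (30, "deny", "192.168.0.0/16"),
    (40, "permit", "0.0.0.0/0"),
]


def evaluate(acl: list, src_ip: str, hits: dict) -> str:
    """Walk the ACEs in sequence order; the first match decides."""
    addr = ipaddress.ip_address(src_ip)
    for seq, action, net in sorted(acl):
        if addr in ipaddress.ip_network(net):
            hits[seq] = hits.get(seq, 0) + 1  # per-ACE hit counter
            return action
    return "deny"  # implicit deny at the end of every list


if __name__ == "__main__":
    counters = {}
    for ip in ("10.1.2.3", "8.8.8.8"):
        print(ip, evaluate(ACL, ip, counters))
    print(counters)
```

Moving entry 40 to sequence 5 in this model would make every packet match the permit first, which is the ordering mistake described above.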
7.3.3 Class Maps and Policy Maps (Classifier Policies)
A classifier policy is the AOS-CX equivalent of Cisco’s MQC (Modular QoS CLI): a named class-map matches traffic, and a named policy-map binds classes to actions. Where an ACL only permits or denies, a classifier policy can also mark DSCP, rate-limit, mirror, or redirect.
class ip web-traffic
10 match tcp any any eq 80
20 match tcp any any eq 443
class ip voice-traffic
10 match udp any any range 16384 32767
policy CAMPUS-QOS
10 class ip voice-traffic action dscp ef
20 class ip web-traffic action dscp af31
30 class ip any action dscp default
Historically, classifier policies were the primary method for traffic filtering on L3 VNI contexts (routed VXLAN), but modern AOS-CX has extended native ACLs to those same contexts on platforms like CX 6300, 6400, 8100, and 8360, so classifier policies are now reserved for actions richer than permit/deny [Source: https://support.hpe.com/hpesc/public/docDisplay?docId=sd00007196en_us&page=GUID-E87F12E9-712B-494C-B924-326995B7A2D4.html&docLocale=en_US].
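As a rough model of how the CAMPUS-QOS policy above chooses a marking, the sketch below walks the policy entries in order and returns the DSCP value of the first matching class. The data structures and the `dscp_for` function are hypothetical; the DSCP values are the standard decimal codes for EF (46), AF31 (26), and default (0).

```python
# Class definitions: each maps (protocol, destination port) to match / no match.
CLASSES = {
    "voice-traffic": lambda proto, port: proto == "udp" and 16384 <= port <= 32767,
    "web-traffic": lambda proto, port: proto == "tcp" and port in (80, 443),
    "any": lambda proto, port: True,
}

# Policy entries: (sequence, class name, DSCP to set)
POLICY = [
    (10, "voice-traffic", 46),  # EF
    (20, "web-traffic", 26),    # AF31
    (30, "any", 0),             # default
]


def dscp_for(proto: str, dst_port: int):
    """Return the DSCP value set by the first policy entry whose class matches."""
    for _seq, cls, dscp in sorted(POLICY):
        if CLASSES[cls](proto, dst_port):
            return dscp
    return None


if __name__ == "__main__":
    print(dscp_for("udp", 20000), dscp_for("tcp", 443), dscp_for("tcp", 22))
```

The catch-all entry at sequence 30 plays the same role as a final permit in an ACL: anything not classified earlier still receives a defined marking.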
7.3.4 Applying ACLs to Ports and VLANs
AOS-CX supports four application contexts:
| Context | Command Location | Apply Command (direction) | Use Case |
|---|---|---|---|
| Physical interface | interface 1/1/1 | apply access-list ip <name> in/out | Per-port ingress filter, e.g., guest port lockdown. |
| SVI / VLAN interface | interface vlan 10 | apply access-list ip <name> in/out | Filter routed traffic entering/leaving a VLAN. |
| L2 VLAN | vlan 10 | apply access-list ip <name> in/out | Filter switched (intra-VLAN) traffic between endpoints. |
| L3 VNI (VXLAN) | inside interface vxlan 1 / VNI context | apply access-list ip <name> in/out | Filter routed traffic inside an EVPN-VXLAN fabric. |
For Layer-2 traffic between two endpoints in the same VLAN, attach the ACL to the VLAN itself, not the SVI. For routed traffic crossing the VLAN boundary, attach to the SVI. For routed-VXLAN, attach to the L3 VNI [Source: https://support.hpe.com/hpesc/public/docDisplay?docId=sd00007196en_us&page=GUID-E87F12E9-712B-494C-B924-326995B7A2D4.html&docLocale=en_US].
7.3.5 RADIUS-Based Filter Rules (NAS Filter Rule)
When ClearPass returns a per-session ACL via the RADIUS NAS-Filter-Rule attribute (a “downloadable ACL” or dACL), AOS-CX applies it to the authenticated session. The attribute can carry both IPv4 and IPv6 rules, and each rule statement can match port lists, port ranges, and source/destination addresses; in practice, IPv4 and IPv6 traffic each get their own rule entries [Source: https://www.youtube.com/watch?v=htOjEF50vOc]. Like DUR, this delivers per-user filter rules without ever pre-staging them on the switch.
7.3.6 Hit-Counts and Verification
Every AOS-CX ACL keeps per-ACE hit counters. The single most useful troubleshooting command for “is my ACL even matching?” is:
show access-list hit-counts ip BLOCK-RFC1918
A line that should be matching but shows zero hits is an instant flag — either the traffic is not arriving where you think it is, or an earlier ACE is catching it [Source: https://www.youtube.com/watch?v=lOjEFhqHC5k]. Reset the counters with clear access-list hit-counts between tests.
7.4 Management Plane Hardening
The data plane carries user traffic; the management plane carries the commands that change how user traffic is treated. An attacker who reaches the management plane does not need to evade ACLs — they can simply rewrite them. Hardening the management plane is therefore a security multiplier for everything else in this chapter.
7.4.1 SSH Key-Based Authentication
Telnet is disabled on AOS-CX by default and should never be re-enabled. SSH is the only acceptable interactive shell. By default SSH accepts password authentication, but key-based auth is far stronger because there is no password to phish, brute-force, or shoulder-surf.
Generate an Ed25519 keypair on the admin’s workstation, then on the switch:
ssh server vrf mgmt
ssh password-authentication disable
user admin authorized-key "ssh-ed25519 AAAAC3Nz...comment"
The flow is now: admin presents the private key, switch validates against the stored public key, session opens. Even if the admin’s password is leaked, the attacker cannot SSH in without the private key file, and that file should itself be protected by a passphrase and ideally an OS keychain or hardware token.
Restrict the management VRF and bind SSH to it. The switch’s data VRF (default) should not run an SSH listener at all unless a deliberate exception is documented.
7.4.2 HTTPS Certificates
The AOS-CX REST API and Web UI run over HTTPS. Out of the box the switch generates a self-signed certificate, which produces browser warnings and does not protect against MITM if an attacker can convince a client to ignore the warning. In production you replace this with a certificate signed by your enterprise CA:
crypto pki application https-server
crypto pki certificate enroll <profile-name>
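A typical workflow uses a crypto pki ta-profile to trust your enterprise root, generates a CSR from the switch, has your CA sign it, and installs the signed cert as the HTTPS server certificate. Browsers and REST API clients then get a clean TLS handshake with no warnings. The DUR download in Section 7.2.2 runs in the opposite direction: there the switch is the HTTPS client, and it is the TA profile, not this server certificate, that lets it validate ClearPass.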
7.4.3 AAA TACACS+ for Admin Login
RADIUS handles network-edge authentication; TACACS+ handles administrator login. The two protocols look similar but differ in critical ways:
| Feature | RADIUS | TACACS+ |
|---|---|---|
| Transport | UDP 1812/1813 | TCP 49 |
| Encryption | Password only | Entire payload |
| AAA separation | Auth+Authz combined | Auth, Authz, Accounting separate |
| Per-command authorization | No | Yes |
| Common use | 802.1X, MAC-Auth, VPN | Switch/router admin login |
| Vendor | Open standard | Cisco-originated, broadly adopted |
TACACS+ encrypts the entire packet body (only the short header stays in clear text), so the username, the typed commands, and the responses are all hidden from anyone sniffing the management network. More importantly, TACACS+ supports per-command authorization: when a junior network engineer types erase startup-config, the switch can ask the TACACS+ server “is this user allowed to run this command?” and refuse if the answer is no. RADIUS has no equivalent.
tacacs-server host 10.10.0.110 key plaintext tacacs123 vrf mgmt
aaa authentication login default group tacacs local
aaa authorization commands default group tacacs local
aaa accounting all default start-stop group tacacs
The local fallback is critical: if the TACACS+ server is unreachable, the switch falls back to local authentication using the locally-defined admin user. Without it, a TACACS+ outage locks every administrator out of every switch — exactly when you most need to log in.
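The method-list behavior can be sketched as follows. The sketch assumes that the switch moves to the next method only when the TACACS+ server is unreachable, and that a reject from a reachable server is final; all names and the plain-text password comparison are illustrative only.

```python
def login(user, password, tacacs_reachable, tacacs_users, local_users,
          methods=("tacacs", "local")):
    """Walk the login method list in order and return True on success."""
    for method in methods:
        if method == "tacacs":
            if not tacacs_reachable:
                continue  # server unreachable: try the next method in the list
            # Assumption: a reachable server's verdict is final, accept or reject.
            return tacacs_users.get(user) == password
        if method == "local":
            return local_users.get(user) == password
    return False  # every listed method was unavailable


if __name__ == "__main__":
    tacacs = {"alice": "pw1"}
    local = {"admin": "pw2"}
    print(login("admin", "pw2", False, tacacs, local))
    print(login("admin", "pw2", False, tacacs, local, methods=("tacacs",)))
```

The second call shows the lockout: with only TACACS+ in the list and the server down, even a valid local account cannot log in.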
This pairs with Role-Based Access Control (RBAC) on the switch. AOS-CX ships with built-in roles (administrators, operators, auditors) and supports custom roles. TACACS+ can return a role name in its authorization response, mapping the AD group “NetOps-L1” to the AOS-CX operators role with read-only and limited-write privileges.
7.4.4 Audit Logging
Every privileged action should land in a log that is shipped off-box. The switch’s local event log is a forensic starting point but is volatile and bounded; centralized syslog is the system of record.
logging 10.10.0.120 vrf mgmt severity info
logging 10.10.0.120 vrf mgmt include-auditable-events
include-auditable-events captures CLI commands, login attempts (success and failure), AAA decisions, and configuration changes. Combined with TACACS+ accounting (which records every command typed by every admin), this gives you a tamper-evident trail that SOC analysts and compliance auditors both expect.
A practical analogy: SSH key auth is the lock on the door, HTTPS certs are the tamper-evident seal on the window, TACACS+ is the visitor log at the front desk that records who came in and what they touched, and syslog is the security-camera DVR sitting in a different building. Any one of those alone is fragile; together they are a defensible posture.
Figure 7.4: Management plane hardening layers on AOS-CX
graph TD
A[Administrator] --> B[Transport Layer]
B --> B1[SSH key-based auth<br/>password-auth disable<br/>bound to mgmt VRF]
B --> B2[HTTPS w/ enterprise<br/>CA-signed cert<br/>crypto pki ta-profile]
B1 --> C[Authentication Layer]
B2 --> C
C --> C1[TACACS+ over TCP/49<br/>full payload encryption]
C --> C2[Local fallback user<br/>group tacacs local]
C1 --> D[Authorization Layer]
C2 --> D
D --> D1[Per-command authz<br/>aaa authorization commands]
D --> D2[RBAC roles<br/>administrators / operators / auditors]
D1 --> E[Accounting Layer]
D2 --> E
E --> E1[TACACS+ accounting<br/>start-stop all commands]
E --> E2[Syslog off-box<br/>include-auditable-events]
E1 --> F[Hardened<br/>Management Plane]
E2 --> F
Chapter Summary
Switch security on AOS-CX is a layered story that follows the packet from the edge inward. At the access port, 802.1X authenticates the user via EAP between supplicant, authenticator (the switch), and ClearPass; MAC authentication handles supplicant-less devices as a fallback; multi-domain mode lets a single port serve a phone and a PC under independent identities. Once authenticated, the client receives a role — a Local User Role stored on the switch, a Downloadable User Role pulled from ClearPass on demand, or a device profile matched on LLDP — that pins the session to a VLAN with associated ACLs and session timers. ACLs and classifier policies enforce the role’s intent in TCAM at wire speed, applied per-port, per-VLAN, per-SVI, or per-L3-VNI, with hit-count counters revealing whether traffic is being matched as designed. Above all of this sits the management plane, hardened with SSH key authentication, signed HTTPS certificates, TACACS+ with per-command authorization and local fallback, and centralized syslog with auditable-event capture. The exam expects you to recognize the CLI shape of each piece, know which command goes in which context, and understand which mechanism solves which problem — 802.1X for who, role for what, ACL for enforcement, TACACS+ for admin accountability.
Key Takeaways
- 802.1X requires three actors: supplicant, authenticator (switch), authentication server (ClearPass/RADIUS). EAP frames are encapsulated in EAPOL on the access link and in RADIUS toward the server.
- Use radius-server host <ip> key plaintext <secret> vrf mgmt to point AOS-CX at ClearPass over the management VRF; enable the authenticator on each access port with aaa authentication port-access dot1x authenticator enable.
- MAC authentication is the universal fallback for devices without supplicants. Production deployments combine 802.1X and MAC auth on the same port.
- Multi-domain mode supports one voice and one data MAC on a single port; multi-auth handles many MACs each in their own role.
- Local User Roles live on the switch, Downloadable User Roles live on ClearPass and are downloaded over HTTPS; the asterisk suffix on a role name indicates RADIUS override.
- DUR prerequisites: TA profile trusting ClearPass, dedicated read-only ClearPass admin user (radius-server clearpass username/password), HTTPS reachability, and radius dyn-authorization enable for CoA.
- Device profiles use LLDP TLVs to apply a role before authentication runs, which is the standard way to bring up Aruba APs on locked-down access ports.
- AOS-CX ACLs are stateless, first-match-wins with implicit deny-all; apply them to physical interfaces, VLAN interfaces (SVIs), L2 VLANs, or L3 VNI contexts.
- Use classifier policies (class-map plus policy-map) when you need actions richer than permit/deny: DSCP marking, rate limiting, redirection.
- show access-list hit-counts ip <name> is the first command to run when an ACL is not behaving as expected.
- TACACS+ over TCP/49 encrypts the entire packet body and supports per-command authorization, which is why it is preferred over RADIUS for switch administrator login.
- Always configure a local fallback in aaa authentication login so administrators can recover the box during a TACACS+ outage.
- SSH key-based auth, signed HTTPS certificates, and centralized syslog with include-auditable-events complete the management plane hardening.
Key Terms
- 802.1X — IEEE standard for port-based network access control. Defines the supplicant/authenticator/authentication-server triad and the EAPOL framing used between supplicant and switch.
- RADIUS — Remote Authentication Dial-In User Service. UDP-based AAA protocol used by AOS-CX to consult ClearPass for 802.1X, MAC-Auth, and dynamic role/VLAN assignment.
- ClearPass — HPE Aruba Networking’s policy management platform. Acts as the RADIUS authentication server, the source of truth for user/device profiles, and the host of Downloadable User Roles.
- User role — A named AOS-CX policy bundling VLAN, ACL/classifier policy, session timers, and reauth interval. May be a Local User Role (defined on the switch) or a Downloadable User Role (defined on ClearPass).
- Device profile — An AOS-CX construct that matches LLDP TLVs (or other discovery data) to apply a role to an unauthenticated port — typically used for APs and IoT gateways.
- ACL — Access Control List. A stateless, first-match-wins ordered set of permit/deny rules applied to interfaces, VLANs, SVIs, or L3 VNI contexts on AOS-CX.
- Classifier policy — A class-map plus policy-map construct that matches traffic and applies actions richer than permit/deny — DSCP marking, rate limiting, redirection — historically used for L3 VNI filtering.
- TACACS+ — Terminal Access Controller Access-Control System Plus. TCP-based AAA protocol used for switch administrator login on AOS-CX, supporting full payload encryption and per-command authorization.
Chapter 8: Quality of Service, Multicast Snooping, and DHCP Services
A modern campus switch does much more than simply forward Ethernet frames. It must protect a voice call from being trampled by a backup job, deliver a multicast video stream only to interested receivers, hand out IP addresses safely from servers across a routed network, and refuse to be fooled by an attacker armed with a laptop and a free DHCP-server tool. This chapter is about those four jobs on AOS-CX: Quality of Service (QoS), IGMP/multicast snooping, DHCP relay/snooping, and a stack of Layer 2 mitigations (DAI, IP Source Guard, loop protect, storm control).
Think of the switch as a busy airport. QoS is the boarding-pass priority lane that lets first-class (voice) and business (video) passengers board ahead of economy. IGMP snooping is the gate agent who only opens the jet bridge for passengers actually flying that route. DHCP relay is the shuttle that runs travelers from a remote terminal to the right check-in counter. And the L2 mitigations are airport security — the metal detectors and ID checks that keep impostors off the planes. Each system runs constantly, mostly invisibly, and each fails noisily when misconfigured.
8.1 QoS on AOS-CX
Why QoS exists at all
Ethernet, by default, is a best-effort medium. When a 1 Gb uplink fills up — a backup hits the wire, a software update fans out — every flow on that link suffers equally. For bulk traffic that means a slower transfer; for a voice call it means dropouts, jitter, and angry users. QoS is the policy machinery that lets the switch say “voice goes first, video next, file copies last” when the pipe is congested.
QoS is only meaningful at points of contention. On an uncongested link there is nothing to schedule — every packet leaves immediately. The classic point of contention in a campus is the access-to-aggregation uplink, where many 1 Gb edge ports converge onto a smaller number of 10/25 Gb uplinks. That is where QoS pays for itself.
The AOS-CX QoS pipeline
AOS-CX implements QoS in a four-stage pipeline that mirrors the IETF DiffServ model. Knowing the names of the stages and the CLI building block at each stage is the quickest route to passing any QoS question on HPE7-A01:
- Classify: class-map. Identify the traffic (by DSCP, CoS, or ACL).
- Mark/queue: policy-map. Set DSCP/CoS, assign to an internal queue.
- Schedule: schedule-profile. Decide who serves first (strict vs. DWRR).
- Shape/drop: queue-profile. Per-queue rate limits and WRED.
[Source: https://www.bhphotovideo.com/lit_files/1146748.pdf]
flowchart LR
A[Ingress packet] --> B[Trust mode<br/>cos/dscp/none]
B --> C[class-map<br/>match dscp/cos/ACL]
C --> D[policy-map<br/>set dscp/cos, queue N]
D --> E[Internal queue 0-7]
E --> F[schedule-profile<br/>strict / DWRR]
F --> G[queue-profile<br/>shape / WRED]
G --> H[Egress port]
Trust modes — the gate that decides whether to believe the marking
When a packet arrives, the very first QoS decision is do I trust the DSCP/CoS the sender wrote? AOS-CX offers three answers:
| Trust mode | What it does | Typical use |
|---|---|---|
| qos trust none | Default. Ignore incoming markings; classify everything to the default internal priority. | Untrusted user ports where you don’t want clients setting their own priority. |
| qos trust cos | Use the 802.1p CoS bits in the VLAN tag for queue selection. | Trunk ports between L2 switches; uplinks from access points. |
| qos trust dscp | Use the DSCP bits in the IP header for queue selection. | L3 boundaries, server-to-server links, IP phone uplinks (after the phone marks DSCP). |
[Source: https://www.bhphotovideo.com/lit_files/1146748.pdf]
The default of trust none is a common gotcha. A new Aruba CX deployment with IP phones will not honor those phones’ DSCP 46 (EF) markings until you either configure qos trust dscp on the access port or use a class-map/policy-map to remark the traffic explicitly. Without trust, the phone’s “this is voice” markings are ignored for queue selection and the call rides the default queue alongside YouTube.
switch(config)# interface 1/1/10
switch(config-if)# qos trust dscp
Multiple trusts can coexist — the switch can use CoS for internal queueing and still preserve DSCP on egress. Global trust can also be set:
switch(config)# qos trust dscp
Per-interface settings override global.
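The precedence rule and the three trust modes can be summarized in a short sketch. The function and its arguments are hypothetical; it only models which marking, if any, the switch would consult for queue selection.

```python
from typing import Optional


def marking_used(packet: dict, global_trust: str,
                 port_trust: Optional[str] = None):
    """Return (field, value) for the marking consulted for queue selection."""
    # A per-interface setting, when present, overrides the global trust mode.
    mode = port_trust if port_trust is not None else global_trust
    if mode == "dscp" and "dscp" in packet:
        return ("dscp", packet["dscp"])
    if mode == "cos" and "cos" in packet:
        return ("cos", packet["cos"])
    # trust none, or the trusted field is absent: use the default priority.
    return ("default", None)


if __name__ == "__main__":
    voice = {"dscp": 46, "cos": 5}
    print(marking_used(voice, global_trust="none"))
    print(marking_used(voice, global_trust="none", port_trust="dscp"))
```

The first call shows the gotcha described above: with the default trust mode, a correctly marked voice packet is treated like any other traffic.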
Classification — class-map
A class-map names a group of traffic. AOS-CX class-maps can match by DSCP, CoS, or by referencing an IPv4/IPv6 ACL for full 5-tuple classification:
class-map match-any CLASS_VOICE
match dscp 46
class-map match-any CLASS_VIDEO
match cos 5
class-map match-any CLASS_SCAVENGER
match ip access-group SCAVENGER_ACL
match-any means any of the listed criteria triggers the class; match-all requires all of them. Most exam-style classification uses match-any.
Marking and queuing — policy-map
A policy-map binds actions to class-maps. Two actions matter most: remark (write a new DSCP/CoS) and queue (force into a specific internal queue 0–7):
policy-map QOS-POLICY
class CLASS_VOICE
set dscp 46
queue 7
class CLASS_VIDEO
set cos 5
queue 5
class CLASS_SCAVENGER
set dscp 8
queue 1
Apply on the ingress, egress, or both directions of an interface:
interface 1/1/10
service-policy input QOS-POLICY
[Source: https://www.bhphotovideo.com/lit_files/1146748.pdf]
A common pattern at the access edge: untrust user ports with qos trust none, then use a policy-map to explicitly remark untrusted user traffic to DSCP 0, while a separate class-map matches the IP phone’s voice VLAN and remarks DSCP 46. This way users cannot smuggle high-priority markings, but real voice flows still get express service.
Eight queues, one of them strict — DSCP/CoS to queue
AOS-CX exposes 8 internal traffic classes (queues 0–7). Each queue is associated with a local priority. By convention, queue 7 is the highest and is used as a strict-priority queue for voice and other real-time traffic; queue 0 is the lowest and is typically used for scavenger/background traffic.
Although the exact default mappings depend on the platform and software train, a representative DSCP-to-CoS-to-queue mapping looks like this:
| DSCP value | DSCP name | 802.1p CoS | AOS-CX queue | Typical traffic |
|---|---|---|---|---|
| 46 | EF (Expedited Forwarding) | 5 | 7 (strict) | Voice (RTP) |
| 34 | AF41 | 4 | 6 | Interactive video |
| 26 | AF31 | 3 | 5 | Multimedia signaling / SIP |
| 18 | AF21 | 2 | 4 | Transactional data |
| 10 | AF11 | 1 | 3 | Bulk data |
| 0 | BE (Best Effort) | 0 | 2 | Default / Internet |
| 8 | CS1 | 1 | 1 | Scavenger |
| — | — | — | 0 | Background |
The takeaway is the shape of the table, not the exact numbers: voice (DSCP 46/EF) lives at the top, the assured-forwarding classes fill the middle, and default best-effort sits near the bottom, just above scavenger and background.
Scheduling — schedule-profile
A schedule-profile is the answer to “when two queues both have packets, who goes first?” AOS-CX supports two scheduling disciplines:
- Strict priority — this queue is always served first as long as it has packets. Used for voice/real-time. Risk: a misbehaving strict-priority queue can starve everything else, so cap it with shaping.
- DWRR (Deficit Weighted Round Robin) — queues are served in proportion to their weights. Higher weight = more bandwidth share when congested.
schedule-profile S-PROFILE
queue 7 strict
queue 6 dwrr weight 50
queue 5 dwrr weight 40
queue 0 dwrr weight 10
The factory-default schedule profile typically places queue 7 in strict priority and the remaining queues under DWRR with predefined weights — adequate for many small/medium deployments and a frequent reference point on the exam.
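To see how strict priority and DWRR interact, the sketch below drains a static snapshot of queued packets. It is a textbook deficit-round-robin model, not the switch's hardware scheduler: the `quantum` parameter, the function name, and the assumption that no new packets arrive while draining are all simplifications.

```python
from collections import deque


def schedule(queues: dict, strict: list, weights: dict, quantum: int = 10) -> list:
    """Return the order in which (queue, size) packets leave the port."""
    pending = {q: deque(sizes) for q, sizes in queues.items()}
    out = []
    # Strict-priority queues are always served first while they hold packets.
    for q in strict:
        while pending.get(q):
            out.append((q, pending[q].popleft()))
    # DWRR: each round a queue earns credit proportional to its weight and
    # sends packets for as long as the head packet fits in that credit.
    deficit = {q: 0 for q in weights}
    while any(pending.get(q) for q in weights):
        for q in sorted(weights, reverse=True):
            if not pending.get(q):
                deficit[q] = 0
                continue
            deficit[q] += quantum * weights[q]
            while pending[q] and pending[q][0] <= deficit[q]:
                size = pending[q].popleft()
                deficit[q] -= size
                out.append((q, size))
            if not pending[q]:
                deficit[q] = 0  # an emptied queue does not bank credit
    return out


if __name__ == "__main__":
    snapshot = {7: [200, 200], 6: [500] * 10, 0: [500] * 2}
    order = schedule(snapshot, strict=[7], weights={6: 50, 0: 10})
    print([q for q, _ in order])
```

With weights of 50 and 10 and equal packet sizes, queue 6 sends five packets for every one from queue 0, and both wait until the strict queue 7 is empty.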
Shaping and drop policy — queue-profile
A queue-profile sets per-queue caps and congestion-management behavior. Two key knobs:
- Shaping (CIR/CBS) — rate-limit a queue so that even strict-priority traffic can’t consume the entire link.
- WRED (Weighted Random Early Detection) — start dropping (or marking) packets before the queue is completely full, signaling TCP senders to slow down. Tuned per drop-precedence color.
queue-profile Q-PROFILE
queue 7 shaping cir 1000000 cbs 32000
queue 0 wred green-percentage 80 yellow-percentage 100
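Shaping with a committed rate and a burst size is usually explained as a token bucket. The sketch below is a generic single-rate token bucket in Python, not the switch's actual implementation. It assumes CIR is expressed in bits per second and CBS in bytes, which may differ from the units the CLI expects, and it takes the clock as a parameter so the behavior is deterministic.

```python
class TokenBucket:
    """Single-rate token bucket: CIR refills the bucket, CBS caps its depth."""

    def __init__(self, cir_bps: int, cbs_bytes: int):
        self.rate_bytes_per_s = cir_bps / 8   # tokens are counted in bytes
        self.capacity = cbs_bytes
        self.tokens = float(cbs_bytes)        # bucket starts full
        self.last = 0.0                       # time of the previous check

    def allow(self, size_bytes: int, now: float) -> bool:
        """Return True if a packet of size_bytes conforms at time now (seconds)."""
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, never beyond the burst size.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False                          # non-conforming: delay or drop
```

The burst size decides how much back-to-back traffic passes untouched; the committed rate decides the long-term average once the burst allowance is spent.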
Apply both profiles to an interface (or globally as factory-default replacement):
interface 1/1/10
qos queue-profile Q-PROFILE
qos schedule-profile S-PROFILE
QoS for voice and video — the canonical recipe
Putting the pieces together, the standard “voice on the edge” recipe looks like this:
class-map match-any CLASS_VOICE
match dscp 46
class-map match-any CLASS_VIDEO
match dscp 34
policy-map EDGE-IN
class CLASS_VOICE
queue 7
class CLASS_VIDEO
queue 6
interface 1/1/10
description "IP phone + PC daisy chain"
qos trust dscp
service-policy input EDGE-IN
Then on the uplink toward the aggregation switch, keep qos trust dscp configured so the marking is preserved end-to-end through the campus. Applying the schedule-profile on every uplink ensures queue 7 (voice) wins any contention.
Verification
| Command | What it shows |
|---|---|
| show qos trust | Current trust mode (global and per-interface) |
| show qos schedule-profile [name] | Scheduling configuration |
| show qos queue-profile [name] | Queue/shaping configuration |
| show qos interface 1/1/10 | Effective QoS state on an interface |
| show interface 1/1/10 queues | Per-queue counters and drops |
show interface 1/1/10 queues is the single most useful operational command — if voice is glitching, the per-queue drop counters tell you instantly whether queue 7 is overrunning.
[Source: https://www.bhphotovideo.com/lit_files/1146748.pdf]
8.2 IGMP and Multicast Snooping
The problem snooping solves
Layer 2 switches treat multicast frames the same way they treat broadcast: flood out every port in the VLAN. That means if a single host on VLAN 10 subscribes to a 50 Mb/s video multicast, every other host on that VLAN receives, and discards, that 50 Mb/s stream. With dozens of hosts and several streams, an unsnooped multicast environment can saturate access links with traffic that nobody asked for.
IGMP snooping is the fix. The switch listens (snoops) on IGMP control messages — Membership Reports, Leaves, and General Queries — and dynamically programs its L2 forwarding tables so that each multicast group is forwarded only to ports where a host actually subscribed (plus the multicast router port).
The analogy: think of multicast as a magazine subscription service. Without snooping, the post office crams every issue of every magazine into every mailbox in the building (broadcast). With snooping, the building’s mailroom (the switch) reads who actually subscribes and delivers each title only to those tenants.
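The forwarding decision can be modeled in a few lines of Python. This is a toy model under simplifying assumptions: group membership is tracked per port only, timers and queries are ignored, and a group with no members is sent only to mrouter ports. The class name SnoopingTable and the port labels are hypothetical.

```python
from collections import defaultdict


class SnoopingTable:
    """Toy model of per-VLAN IGMP snooping state."""

    def __init__(self, vlan_ports, mrouter_ports=()):
        self.vlan_ports = set(vlan_ports)
        self.mrouter_ports = set(mrouter_ports)
        self.members = defaultdict(set)       # group address -> member ports

    def report(self, group, port):
        """A Membership Report was seen on this port."""
        self.members[group].add(port)

    def leave(self, group, port):
        """A Leave was seen on this port."""
        self.members[group].discard(port)

    def egress_ports(self, group, ingress_port, snooping=True):
        """Ports a frame for this group is copied to."""
        if not snooping:
            out = set(self.vlan_ports)        # flood like broadcast
        else:
            out = self.members.get(group, set()) | self.mrouter_ports
        return out - {ingress_port}           # never send back out the ingress port
```

With two hosts joined to 239.1.1.1 and a stream arriving on the mrouter port, only the two member ports receive copies; with snooping disabled, every other port in the VLAN does.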
IGMP versions and the querier role
IGMP comes in three versions on a campus:
| Version | Notes |
|---|---|
| IGMPv1 | RFC 1112. No leave message — relies on query timeout. Rare today. |
| IGMPv2 | RFC 2236. Adds explicit Leave messages and Group-Specific Queries. Still common. |
| IGMPv3 | RFC 3376. Adds source-specific multicast (SSM). Default on AOS-CX. |
Every multicast-snooping VLAN needs a querier — a device that periodically sends IGMP General Queries so that hosts respond with Membership Reports. Without queries, the snooping table eventually times out and groups go quiet.
In a routed multicast network, the PIM router on the segment automatically takes the querier role. In a flat L2 network with no PIM router, you must enable a snooping querier on the switch — usually on the SVI for that VLAN.
Figure 8.3: IGMP snooping topology — querier, mrouter port, and selective forwarding
flowchart LR
Q[IGMP Querier<br/>PIM router or SVI] -->|General Query| SW[Snooping Switch<br/>VLAN 10]
SW -->|mrouter port<br/>1/1/24| Q
SW -->|forward group A| H1[Host 1<br/>joined 239.1.1.1]
SW -->|forward group A| H2[Host 2<br/>joined 239.1.1.1]
SW -.->|drop / no forward| H3[Host 3<br/>not subscribed]
H1 -->|Membership Report| SW
H2 -->|Membership Report| SW
H3 -->|Leave / silent| SW
[Source: https://www.youtube.com/watch?v=hq_porjLunE]
Snooping behavior per VLAN
IGMP snooping is disabled by default on AOS-CX, so multicast frames are flooded across the VLAN until you enable it. The default version when you do enable it is v3.
There are two ways to turn it on:
Global enable — the simplest:
switch(config)# ip igmp snooping all-vlans
This turns snooping on for every existing VLAN (including VLAN 1) and for any VLAN added afterward. While all-vlans is in effect, the global setting takes precedence over the per-VLAN enable command. Whether per-VLAN overrides such as the version remain configurable alongside it depends on platform and release, so confirm against the multicast guide for your software version.
Per-VLAN enable — finer control:
switch(config)# vlan 10
switch(config-vlan-10)# ip igmp snooping enable
If you have IGMPv2-only senders or receivers (older IPTV gear, certain industrial systems), drop the version on a per-VLAN basis:
switch(config-vlan-10)# ip igmp snooping version 2
Configuring a snooping querier
If there is no PIM router on the segment, designate one switch as the querier:
switch(config)# interface vlan 10
switch(config-if-vlan)# ip address 10.10.10.1/24
switch(config-if-vlan)# ip igmp snooping querier
Only one querier should be active per VLAN. If two querier-capable devices exist, IGMP elects the lowest IP address as the active querier. The other(s) become standby.
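The election rule is a numeric comparison of addresses, which is easy to get wrong when comparing them as text. A minimal Python sketch, with the hypothetical helper name elect_querier and made-up candidate addresses:

```python
import ipaddress


def elect_querier(candidates):
    """Return the candidate with the numerically lowest IPv4 address."""
    return min(candidates, key=ipaddress.IPv4Address)


candidates = ["10.10.10.2", "10.10.10.10"]
winner = elect_querier(candidates)

# Plain string comparison gives a different (wrong) answer because
# "10.10.10.10" sorts before "10.10.10.2" character by character.
naive = min(candidates)
```

Here 10.10.10.2 wins the election, even though a text sort would put 10.10.10.10 first.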
Static mrouter ports
By default, snooping detects multicast router ports automatically (it learns from PIM hellos and IGMP queries). But you can statically nail a port as a multicast router uplink — useful when the upstream device doesn’t speak PIM/IGMP in a way snooping can detect, or when you want to ensure all multicast streams always reach a particular monitoring or analytics port:
switch(config)# vlan 10
switch(config-vlan-10)# ip igmp snooping mrouter interface 1/1/24
A statically configured mrouter port receives every snooped multicast group that exists in the VLAN.
Verifying multicast group state
| Command | What it shows |
|---|---|
| show ip igmp snooping | Global and per-VLAN snooping status |
| show ip igmp snooping vlan 10 | Detail for one VLAN |
| show ip igmp snooping groups | Currently joined groups and member ports |
| show ip igmp snooping mrouter | Learned and static mrouter ports |
| show ip igmp snooping statistics | Counters for queries, reports, leaves |
When troubleshooting “the video stream isn’t reaching host X,” the workflow is:
1. show ip igmp snooping vlan 10: confirm snooping is enabled on the VLAN.
2. show ip igmp snooping mrouter: confirm a multicast router/querier is known.
3. show ip igmp snooping groups: confirm host X’s port is listed for the group.
If step 3 shows no membership but the host claims to have joined, the host’s IGMP report is being lost or dropped — check ACLs and port state.
The ip igmp snooping all-vlans command is supported on AOS-CX platforms including the CX 6200, 6300, 8320, and 8325, with platform-specific support timelines documented by HPE.
[Source: https://www.youtube.com/watch?v=YqoDxSmz0uI]
8.3 DHCP Services
Two distinct features — relay and snooping
DHCP on a campus switch comes in two flavors that solve different problems:
- DHCP relay lets clients on a routed VLAN reach a DHCP server that lives on a different subnet. Without it, DHCPDISCOVER broadcasts die at the SVI.
- DHCP snooping is a security feature: it inspects DHCP traffic, blocks rogue DHCP servers, and builds a binding database that DAI and IP Source Guard rely on.
Both can — and usually should — coexist on the same access switch. The relay forwards the legitimate request to the right server; snooping makes sure no impostor server answers from inside the VLAN.
DHCP relay — ip helper-address
When a client sends a DHCPDISCOVER, it goes out as an L2 broadcast and an L3 broadcast (255.255.255.255). Routers do not forward broadcasts. So in a typical campus where the DHCP server lives in the data center, the access SVI must catch that broadcast and relay it as a unicast to the server.
switch(config)# interface vlan 10
switch(config-if-vlan)# ip address 10.10.10.1/24
switch(config-if-vlan)# ip helper-address 10.50.50.10
Multiple ip helper-address lines can be configured for redundancy — the relay forwards a copy to each:
switch(config-if-vlan)# ip helper-address 10.50.50.10
switch(config-if-vlan)# ip helper-address 10.50.50.11
[Source: https://www.youtube.com/watch?v=5V4l4nm4I8U]
The packet flow:
sequenceDiagram
participant H as Host (10.10.10.x)
participant R as Switch SVI VLAN 10<br/>10.10.10.1
participant S as DHCP server<br/>10.50.50.10
H->>R: DHCPDISCOVER (broadcast)
R->>S: DHCPDISCOVER (unicast, giaddr=10.10.10.1)
S->>R: DHCPOFFER (unicast)
R->>H: DHCPOFFER (unicast/broadcast)
H->>R: DHCPREQUEST (broadcast)
R->>S: DHCPREQUEST (unicast)
S->>R: DHCPACK
R->>H: DHCPACK
The relay sets the giaddr field in the DHCP packet to the SVI’s IP address so the server knows which scope to draw from.
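From the server's side, scope selection is a subnet-membership test on giaddr. The Python sketch below is illustrative only: the SCOPES table, the second scope, and the function name select_scope are hypothetical, and a real DHCP server has its own configuration model.

```python
import ipaddress

# Hypothetical server-side scopes keyed by subnet.
SCOPES = {
    "10.10.10.0/24": "VLAN 10 users",
    "10.20.20.0/24": "VLAN 20 voice",
}


def select_scope(giaddr: str):
    """Return (subnet, name) for the scope containing giaddr, or None."""
    addr = ipaddress.ip_address(giaddr)
    for cidr, name in SCOPES.items():
        if addr in ipaddress.ip_network(cidr):
            return cidr, name
    return None   # no matching scope: the server cannot offer a lease
```

A relayed DISCOVER with giaddr 10.10.10.1 lands in the 10.10.10.0/24 scope; a giaddr outside every configured subnet matches nothing, which is why a misconfigured SVI address typically shows up as clients that never get an offer.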
DHCP snooping — blocking rogue servers
DHCP snooping divides ports into trusted and untrusted. All ports are untrusted by default. On an untrusted port, the switch:
- Permits client-side DHCP messages (DISCOVER, REQUEST, RELEASE, DECLINE).
- Drops server-side messages (OFFER, ACK, NAK) — this is what blocks rogue DHCP servers, because a rogue server’s reply tries to come from an untrusted access port.
- Builds entries in the binding database as it sees clients lease addresses.
Trusted ports allow all DHCP messages — these are the uplinks toward the legitimate DHCP server (or relay).
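The trusted/untrusted rule set above reduces to a small decision function. This Python sketch covers only the message types listed in this section; the function name snooping_verdict is hypothetical, and rejecting unlisted types with an error is a choice made for the sketch, not a statement about switch behavior.

```python
CLIENT_MESSAGES = {"DISCOVER", "REQUEST", "RELEASE", "DECLINE"}
SERVER_MESSAGES = {"OFFER", "ACK", "NAK"}


def snooping_verdict(message_type: str, port_trusted: bool) -> str:
    """Return 'forward' or 'drop' for a DHCP message arriving on a port."""
    if message_type not in CLIENT_MESSAGES | SERVER_MESSAGES:
        raise ValueError(f"message type not modeled: {message_type}")
    if port_trusted:
        return "forward"              # trusted ports pass all DHCP messages
    if message_type in SERVER_MESSAGES:
        return "drop"                 # server replies from an access port
    return "forward"                  # client-side messages are permitted
```

The same OFFER is forwarded when it arrives on the trusted uplink and dropped when it arrives on an untrusted access port, which is the whole mechanism behind rogue-server blocking.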
Figure 8.4: DHCP snooping — trusted vs untrusted ports
graph TD
DS[Legitimate DHCP Server<br/>10.50.50.10]
UP[Uplink port 1/1/24<br/>TRUSTED<br/>permits OFFER/ACK/NAK]
SW[Access Switch<br/>dhcpv4-snooping enabled<br/>VLAN 10]
AP1[Access port 1/1/1<br/>UNTRUSTED<br/>drops server-side DHCP]
AP2[Access port 1/1/2<br/>UNTRUSTED]
AP3[Access port 1/1/3<br/>UNTRUSTED]
C1[Legit Client<br/>DISCOVER/REQUEST OK]
R[Rogue DHCP server<br/>OFFER dropped at switch]
C2[Legit Client]
DS --- UP
UP --- SW
SW --- AP1
SW --- AP2
SW --- AP3
AP1 --- C1
AP2 --- R
AP3 --- C2
Configuration:
switch(config)# dhcpv4-snooping
switch(config)# vlan 10
switch(config-vlan-10)# dhcpv4-snooping
switch(config)# interface 1/1/24
switch(config-if)# description "Uplink to DHCP server"
switch(config-if)# dhcpv4-snooping trust
The pattern is three steps: global enable, per-VLAN enable, and trust on the uplink(s) only. Forgetting trust on the uplink is a common mistake: clients then fail to get addresses because the legitimate OFFER is dropped along with the rogue ones.
The binding database
The DHCP snooping binding database is auto-populated as clients lease addresses through the snooping path. Each entry records:
| Field | Description |
|---|---|
| MAC address | Client hardware address |
| IP address | Address leased by the server |
| VLAN | VLAN ID the lease belongs to |
| Interface | Ingress port the client is on |
| Lease time | When the binding expires |
switch# show dhcpv4-snooping binding
[Source: https://support.hpe.com/hpesc/public/docDisplay?docId=sf000098853en_us]
This binding database is the foundation for both Dynamic ARP Inspection and IP Source Guard. When DAI sees an ARP on an untrusted port, it consults this table to verify the IP/MAC pair. When IPSG sees a data packet, it consults the same table to verify the source IP/MAC/port tuple.
Option 82 handling
DHCP Option 82 (Relay Agent Information) is metadata that a relay or snooping switch can insert into a DHCP packet to identify the originating port and switch. Servers that understand Option 82 can use it to:
- Assign IPs from specific scopes based on switch port (e.g., conference rooms get a smaller pool).
- Log which physical port leased each address.
- Reject DHCP requests that don’t carry the expected relay information.
When DHCP relay is enabled, the relay typically inserts Option 82 by default. When DHCP snooping is enabled, the snooping switch may also insert Option 82 on packets passing through. Behavior is configurable — for environments where the upstream server rejects packets with Option 82, you can disable insertion. The exact CLI is in the AOS-CX IP Services Guide.
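On the wire, Option 82 is a nested type-length-value structure. The Python sketch below follows the layout in RFC 3046, where sub-option 1 is the Agent Circuit ID and sub-option 2 is the Agent Remote ID. The contents of those fields are implementation-defined, so the byte strings used here ("1/1/5" and "sw1") are purely hypothetical, as are the function names.

```python
def _tlv(code: int, value: bytes) -> bytes:
    """Encode one code/length/value triple (length must fit in one byte)."""
    if len(value) > 255:
        raise ValueError("value too long for a one-byte length field")
    return bytes([code, len(value)]) + value


def encode_option82(circuit_id: bytes, remote_id: bytes) -> bytes:
    """Build DHCP option 82 with sub-option 1 (Circuit ID) and 2 (Remote ID)."""
    return _tlv(82, _tlv(1, circuit_id) + _tlv(2, remote_id))


def decode_option82(data: bytes) -> dict:
    """Parse option 82 back into {sub-option code: value}."""
    if data[0] != 82:
        raise ValueError("not option 82")
    body = data[2:2 + data[1]]
    subs, i = {}, 0
    while i < len(body):
        code, length = body[i], body[i + 1]
        subs[code] = body[i + 2:i + 2 + length]
        i += 2 + length
    return subs


encoded = encode_option82(b"1/1/5", b"sw1")
```

A server that keys scopes on the originating port would parse the option in the same way as decode_option82 and match on the Circuit ID value.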
Verification
| Command | Purpose |
|---|---|
| show dhcpv4-snooping | Global and per-VLAN status, trusted ports |
| show dhcpv4-snooping binding | Current bindings |
| show dhcpv4-snooping statistics | Counters: drops, forwards, validations |
8.4 Layer 2 Threat Mitigation
The DHCP snooping binding database, once built, becomes the truth source for two more security features that close common L2 attack vectors: Dynamic ARP Inspection and IP Source Guard. Add loop protect and storm control and, counting DHCP snooping itself, you have a five-feature L2 hardening stack that should be standard on every access port.
Threats and the matching mitigations
| Threat | What an attacker does | Mitigation |
|---|---|---|
| Rogue DHCP server | Plug in a laptop running dnsmasq; hand out bogus IPs/gateway. | DHCP snooping (untrusted ports drop server-side DHCP). |
| ARP spoofing / MITM | Send unsolicited ARP replies claiming to be the gateway. | Dynamic ARP Inspection (validates against binding DB). |
| IP spoofing | Send packets with a forged source IP from a compromised host. | IP Source Guard (filters by binding DB). |
| L2 loop | Two ports inadvertently bridged outside the switch’s STP domain. | Loop protect (LP). |
| Broadcast/multicast/unknown-unicast storm | Flood from a chatty/malicious device. | Storm control. |
Dynamic ARP Inspection (DAI)
DAI inspects every ARP packet on untrusted ports and compares the sender’s IP/MAC/port against the DHCP snooping binding database. If the tuple isn’t in the table, the ARP is dropped — an attacker’s forged ARP cannot match because the attacker never received that IP from DHCP.
DAI requires DHCP snooping to be enabled because it has nothing to validate against without the binding database.
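The validation step is an exact-match lookup of the ARP sender's details in the binding table. A minimal Python sketch, assuming bindings are held as a set of tuples and MAC addresses are stored in lowercase; the names Binding and dai_verdict and all addresses are hypothetical.

```python
from typing import NamedTuple


class Binding(NamedTuple):
    mac: str
    ip: str
    vlan: int
    port: str


def dai_verdict(bindings, sender_mac, sender_ip, vlan, port, port_trusted=False):
    """Return 'forward' or 'drop' for an ARP packet seen on a port."""
    if port_trusted:
        return "forward"              # trusted ports bypass the check
    candidate = Binding(sender_mac.lower(), sender_ip, vlan, port)
    return "forward" if candidate in bindings else "drop"


bindings = {Binding("00:11:22:33:44:55", "10.10.10.50", 10, "1/1/5")}
```

A host that leased 10.10.10.50 on port 1/1/5 passes. An attacker on 1/1/6 claiming the gateway address 10.10.10.1 has no matching binding and is dropped, and so is any statically addressed host, which is the deployment risk discussed later in this section.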
Figure 8.5: Dynamic ARP Inspection — validation flow against the snooping binding database
sequenceDiagram
participant H as Host (untrusted port 1/1/5)
participant SW as Switch (DAI on VLAN 10)
participant DB as DHCP Snooping Binding DB
participant GW as Gateway / Target
H->>SW: ARP packet (sender MAC/IP, port)
SW->>DB: Lookup {MAC, IP, VLAN, port}
alt Tuple matches binding
DB-->>SW: Match found
SW->>GW: Forward ARP
else Tuple absent or mismatched
DB-->>SW: No match
SW--xH: Drop ARP, increment violation counter
end
Note over SW,DB: Trusted uplink ports bypass<br/>this check entirely
switch(config)# vlan 10
switch(config-vlan-10)# arp inspection
switch(config)# interface 1/1/24
switch(config-if)# arp inspection trust
[Source: https://airheads.hpe.com/discussion/ip-snooping-configuration-on-aruba-switches]
As with DHCP snooping, the access ports stay untrusted (default) and uplinks are trusted. Verification:
show arp inspection
show arp inspection statistics vlan 10
In static-IP environments, DAI will black-hole the static hosts unless you add manual ip source-binding entries. This is the single biggest deployment gotcha — turning on DAI in a server VLAN where servers have static IPs is how you trigger an outage.
IP Source Guard / IP Source Lockdown
IP Source Guard (called “IP Source Lockdown” on the older AOS-Switch ProVision line; configured via ip source-binding on AOS-CX) filters IP traffic on access ports against the DHCP snooping binding table. Only traffic whose {MAC, IP, VLAN, port} tuple matches a binding is forwarded; everything else drops.
This blocks IP spoofing at the access edge: a compromised host cannot pretend to be another IP because the binding table won’t agree.
Like DAI, IPSG depends on DHCP snooping and breaks for static-IP hosts unless manual bindings are configured.
switch(config)# ip source-binding 00:11:22:33:44:55 vlan 10 ipv4 10.10.10.50 interface 1/1/5
[Source: https://airheads.hpe.com/discussion/ip-snooping-configuration-on-aruba-switches]
The combined security stack
| Feature | Layer | What it blocks |
|---|---|---|
| DHCP snooping | DHCP control | Rogue DHCP servers, DHCP starvation |
| Dynamic ARP Inspection | ARP | ARP spoofing, MITM |
| IP Source Guard | IP data plane | IP spoofing |
Best practice:
- Enable on access switches only. Aggregation and core have no end users; turning on snooping there just creates trust-port bookkeeping with no security benefit.
- Untrust access ports, trust uplinks. Both for DHCP and DAI.
- Beware of static IPs. Provision ip source-binding entries first or skip these features on the affected VLANs.
- Roll out together. DHCP snooping first (build the database), then DAI and IPSG layered on top.
Loop protect
Loop protect (LP) is AOS-CX’s L2-loop safety net. Even with STP running, an inadvertently bridged pair of access ports (a user plugging both ends of a patch cable into two wall jacks, or a rogue mini-switch) can create a forwarding loop on a single VLAN that STP didn’t see.
Loop protect periodically transmits a loop-detect frame on the port. If the same frame returns on any port, the originating port is shut (or rate-limited, depending on configuration). Standard recipe:
switch(config)# loop-protect
switch(config)# interface 1/1/1-1/1/24
switch(config-if-<>)# loop-protect
Apply on every access port; do not apply on uplinks (where loops are managed by STP/MSTP).
Storm control
Storm control rate-limits the volume of broadcast, multicast, and/or unknown-unicast frames a port may forward. When the rate exceeds a configured threshold, the switch either drops the excess or shuts the port down for a recovery interval.
The classic use cases: a NIC stuck in broadcast, a worm fan-out, or a misconfigured device flooding ARP requests. Without storm control, a single chatty access port can saturate every uplink in the building.
switch(config)# interface 1/1/1
switch(config-if)# storm-control broadcast level pps 100
switch(config-if)# storm-control multicast level pps 200
switch(config-if)# storm-control unknown-unicast level pps 100
Tune levels by what’s normal in your environment — too low and legitimate ARP/DHCP traffic gets clipped; too high and the feature does nothing useful.
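To see what a packets-per-second threshold does to a burst, the following Python sketch applies a fixed one-second counting window. This is a deliberate simplification: hardware meters may use different intervals or algorithms, and the function name storm_control is hypothetical.

```python
def storm_control(arrival_times, limit_pps):
    """Mark each frame True (forwarded) or False (dropped).

    arrival_times must be sorted, in seconds. Frames are counted per
    whole-second window; anything beyond limit_pps in a window is dropped.
    """
    verdicts = []
    window, count = None, 0
    for t in arrival_times:
        second = int(t)
        if second != window:          # a new one-second window starts
            window, count = second, 0
        count += 1
        verdicts.append(count <= limit_pps)
    return verdicts


# Four broadcasts in the first second, one in the next, limit of 3 pps.
result = storm_control([0.0, 0.1, 0.2, 0.3, 1.0], limit_pps=3)
```

The fourth frame in the first second exceeds the limit and is dropped, while the frame in the next second is forwarded because the counter has reset.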
Chapter Summary
QoS on AOS-CX is a four-stage pipeline: classify with class-maps, mark/queue with policy-maps, schedule with schedule-profiles, and shape/drop with queue-profiles. Trust modes (qos trust dscp/cos/none) decide whether to honor incoming markings — the default is none, which silently overrides phone DSCP unless you change it. Eight queues (0–7) carry traffic; queue 7 is conventionally strict-priority for voice. The factory schedule profile combines strict-on-7 with DWRR on the rest.
IGMP snooping is off by default — meaning multicast floods every port until you turn it on. Enable globally with ip igmp snooping all-vlans or per-VLAN; default version is v3. Every snooping VLAN needs a querier; if no PIM router exists on the segment, configure ip igmp snooping querier on the SVI. Verify with show ip igmp snooping groups and show ip igmp snooping mrouter.
DHCP relay (ip helper-address on the SVI) carries client broadcasts to off-segment servers as unicasts. DHCP snooping is the security cousin: untrust by default on every port, trust only the uplink toward the real server, and let the switch build a binding database. The binding database then powers Dynamic ARP Inspection (blocks ARP spoofing) and IP Source Guard (blocks IP spoofing). Loop protect catches L2 loops STP misses, and storm control caps broadcast/multicast/unknown-unicast rates.
The whole stack — DHCP snooping + DAI + IPSG + loop protect + storm control — belongs on access switches only, and you must remember that DAI/IPSG break static-IP hosts unless you configure manual bindings.
Key Terms
- QoS (Quality of Service) — Mechanism for prioritizing traffic during congestion via classification, marking, queuing, and scheduling.
- DSCP (Differentiated Services Code Point) — 6-bit field in the IP header used for L3 QoS classification (e.g., 46/EF for voice).
- CoS (Class of Service) — 3-bit 802.1p priority field in the VLAN tag used for L2 QoS classification.
- Trust mode — Per-interface setting (qos trust dscp/cos/none) that decides whether to honor incoming QoS markings.
- Class-map — AOS-CX construct that names a group of traffic by DSCP/CoS/ACL match.
- Policy-map — AOS-CX construct that applies actions (set DSCP, queue) to class-maps.
- Queue-profile / Schedule-profile — AOS-CX profiles that govern egress shaping/WRED and scheduling (strict / DWRR).
- DWRR (Deficit Weighted Round Robin) — Weighted scheduling discipline that shares bandwidth in proportion to queue weights.
- IGMP (Internet Group Management Protocol) — Protocol hosts use to subscribe/unsubscribe from IP multicast groups (v1/v2/v3).
- IGMP snooping — Switch feature that listens to IGMP and forwards multicast only to subscribed ports.
- Querier — Device (PIM router or snooping switch) that periodically sends IGMP queries to keep membership state alive.
- Mrouter port — Multicast router port; receives all snooped multicast groups in a VLAN.
- DHCP relay (ip helper-address) — Forwards DHCP broadcasts as unicasts to off-segment DHCP servers.
- DHCP snooping — Security feature that drops server-side DHCP messages on untrusted ports, blocking rogue servers.
- Binding database — Table of {MAC, IP, VLAN, port, lease} entries built by DHCP snooping; used by DAI and IPSG.
- Trusted vs. untrusted port — In DHCP snooping/DAI: trusted = uplink toward legitimate infrastructure; untrusted = access port (default).
- Option 82 — DHCP Relay Agent Information option; carries port/switch identifiers from relay to server.
- DAI (Dynamic ARP Inspection) — Validates ARP packets against the DHCP snooping binding database; blocks ARP spoofing.
- IP Source Guard / IP Source Lockdown — Filters IP traffic on access ports against the binding database; blocks IP spoofing.
- Loop protect (LP) — AOS-CX feature that detects L2 loops STP misses by sending and watching for loop-detect frames.
- Storm control — Rate-limits broadcast/multicast/unknown-unicast traffic per port to contain floods.
Chapter 9: Monitoring, Automation, and the Network Analytics Engine (NAE)
A network you cannot see is a network you cannot defend, troubleshoot, or scale. For decades, the answer to that problem was a tireless engineer hunched over a CLI, typing show interface every few minutes during an outage. That model does not survive in a modern campus where a single switch may carry tens of thousands of flows, hundreds of VLANs, and dozens of dynamic segmentation policies. AOS-CX was built from day one with this in mind: every operational state lives in a structured database, every database object is reachable through a REST API, and a built-in analytics engine can watch any of those objects and react in Python.
This chapter walks you up the monitoring and automation stack — from classical SNMP, sFlow, and Syslog all the way to the Network Analytics Engine (NAE) and cloud-based orchestration with Aruba Central and NetEdit. By the end, you will be comfortable picking the right tool for the right job and explaining why AOS-CX makes telemetry feel less like archeology and more like instrumentation.
Analogy: Think of monitoring tools as different sensors on a car. SNMP is the tachometer — a periodic poll of “what’s the RPM right now?” sFlow is the dashcam — a sampled video record of what passed by. Syslog is the OBD-II diagnostic log — events that the car decided to write down. NAE is the modern self-driving brain — it watches all the sensors continuously, decides when something is wrong, and takes action.
9.1 Traditional Monitoring Tools
Even in a world of streaming telemetry and APIs, the classics still pay the bills. Most network management systems (NMS), SIEMs, and flow collectors speak SNMP, sFlow, and Syslog natively, so AOS-CX supports all three. Understanding their strengths, weaknesses, and configuration footprint is non-negotiable for the HPE7-A01 exam.
9.1.1 SNMP — Simple Network Management Protocol
SNMP is a poll-based protocol. A management station (the manager) sends a GET for a specific Object Identifier (OID), and the switch’s SNMP agent replies with the current value. The switch can also push unsolicited messages (traps or informs) when a defined event occurs — a link going down, a temperature crossing a threshold, an authentication failure.
AOS-CX supports SNMPv2c and SNMPv3 simultaneously. The version you choose has serious security implications.
Table 9-1. SNMPv2c vs SNMPv3 at a glance
| Feature | SNMPv2c | SNMPv3 |
|---|---|---|
| Authentication | Community string (cleartext) | User-based (MD5/SHA, SHA-2) |
| Encryption | None | DES, 3DES, AES-128/192/256 |
| Integrity | None | HMAC ensures messages aren’t altered |
| Identifier | Community (“public”, “private”) | User name + security level |
| Security Levels | n/a | noAuthNoPriv, authNoPriv, authPriv |
| Best For | Lab and read-only legacy NMS | Production, regulated environments |
SNMPv2c essentially relies on a shared password sent in the clear. Capturing one packet on a span port reveals the community string, after which an attacker can read — and possibly write — your entire OID tree. SNMPv3 fixes this with the User-based Security Model (USM): each user has authentication credentials (HMAC) and optionally privacy (encryption) credentials. The combination defines the security level: authPriv (auth + encryption) is the only one you should accept in production.
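In an SNMPv3 message the security level is carried in two flag bits of the msgFlags field defined in RFC 3412: bit 0 signals authentication and bit 1 signals privacy. The Python sketch below decodes those bits; the function names are hypothetical and the production check simply encodes this chapter's recommendation.

```python
AUTH_FLAG = 0x01   # authFlag
PRIV_FLAG = 0x02   # privFlag (0x04 is the reportable flag, ignored here)


def security_level(msg_flags: int) -> str:
    """Map SNMPv3 msgFlags bits to a security level name."""
    auth = bool(msg_flags & AUTH_FLAG)
    priv = bool(msg_flags & PRIV_FLAG)
    if priv and not auth:
        raise ValueError("privacy without authentication is not a valid level")
    if auth and priv:
        return "authPriv"
    if auth:
        return "authNoPriv"
    return "noAuthNoPriv"


def acceptable_in_production(msg_flags: int) -> bool:
    """Accept only authPriv, per the guidance in this section."""
    return security_level(msg_flags) == "authPriv"
```

When reading a packet capture, a msgFlags value of 0x03 or 0x07 indicates authPriv; 0x00 or 0x04 means the message is neither authenticated nor encrypted.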
Example: configuring SNMPv3 on AOS-CX
switch(config)# snmp-server vrf mgmt
switch(config)# snmpv3 user netops auth sha auth-pass-plaintext "S3curePass!" priv aes priv-pass-plaintext "Pr1vKey!"
switch(config)# snmp-server host 10.10.10.50 trap version v3 user netops
switch(config)# snmp-server enable traps link
switch(config)# snmp-server enable traps snmp authentication
The user netops authenticates with SHA and encrypts with AES — the recommended combination. Traps are sent over the management VRF to the NMS at 10.10.10.50.
Tip: Pair SNMPv3 with read-only views when you can. The NMS rarely needs to write configuration; restricting it to a read view limits blast radius if credentials leak.
9.1.2 sFlow — Sampled Flow Telemetry
SNMP tells you “the interface is at 60% utilization.” It does not tell you which conversation is consuming that bandwidth. That is the job of flow telemetry. AOS-CX implements sFlow, an industry-standard technology originally described in RFC 3176 (the current version 5 specification is published by sFlow.org), which uses statistical packet sampling and periodic counter polls to give you a representative view of traffic without the CPU expense of mirroring every packet.
How sFlow works on AOS-CX:
- The switch ASIC samples every Nth packet (the sampling rate, e.g., 1 in 4096).
- The first ~128 bytes of each sampled packet — enough to reveal L2/L3/L4 headers — are exported in a UDP datagram to an sFlow collector.
- In parallel, counter samples export interface counters every polling-interval seconds.
- The collector reconstructs flow records from the samples and presents top talkers, application mix, etc.
Example: enabling sFlow globally and on an interface
switch(config)# sflow collector 10.10.10.60 vrf mgmt
switch(config)# sflow sampling 4096
switch(config)# sflow polling 30
switch(config)# interface 1/1/1
switch(config-if)# sflow enable
The sampling rate is the knob that matters. Too aggressive (e.g., 1-in-256) and you flood the collector and burn CPU. Too sparse (e.g., 1-in-65536) and short-lived flows never get seen. Aruba’s general guidance is 1-in-4096 for 1 GbE access ports and 1-in-8192 to 1-in-16384 for uplinks at 10/25 GbE and above.
Analogy: sFlow is like an exit poll. You don’t ask every voter, but if you sample randomly and at scale you get a statistically valid picture of the election. The bigger the population (traffic), the sparser the sampling can be (a larger N in 1-in-N) without losing accuracy.
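The trade-off can be put into numbers. A collector scales each sample by the sampling rate to estimate totals, and the accuracy of that estimate depends on how many samples landed in the traffic class of interest. The Python sketch below uses the approximation commonly quoted by sFlow.org for 95% confidence (percent error at most 196 × sqrt(1/c) for c samples); that formula is brought in here for illustration and is not taken from this guide, and the function names are hypothetical.

```python
import math


def estimate_total(samples: int, sampling_rate: int) -> int:
    """Scale a sample count up to an estimated packet count."""
    return samples * sampling_rate


def percent_error_95(samples: int) -> float:
    """Approximate worst-case percent error at 95% confidence."""
    if samples <= 0:
        raise ValueError("need at least one sample")
    return 196 * math.sqrt(1 / samples)


# 100 samples of one conversation at 1-in-4096 sampling.
packets = estimate_total(100, 4096)
error_small = percent_error_95(100)
error_large = percent_error_95(10_000)
```

One hundred samples give an estimate that is good to within roughly 20%; ten thousand samples tighten that to about 2%. This is why busy uplinks tolerate sparser sampling: they accumulate enough samples anyway.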
9.1.3 Syslog — Event Logging
Syslog is the timeline of “things that happened.” Every config change, authentication attempt, link state change, and protocol event the switch decides is worth recording is written as a syslog message. Each message has a severity — a numeric priority from 0 to 7 — that lets you filter the firehose down to what you care about.
Table 9-2. Syslog severity levels (RFC 5424)
| Level | Keyword | Meaning | Typical AOS-CX Example |
|---|---|---|---|
| 0 | Emergency | System unusable | Kernel panic |
| 1 | Alert | Immediate action needed | PSU failure |
| 2 | Critical | Critical condition | Stack split brain |
| 3 | Error | Error condition | OSPF neighbor flap |
| 4 | Warning | Warning condition | High CPU |
| 5 | Notice | Normal but significant | Config commit |
| 6 | Informational | Informational | Link up/down |
| 7 | Debug | Debug-level | Protocol traces |
AOS-CX can log to local persistent storage, to the console, to a remote syslog server (TCP or UDP, optionally over TLS), and to multiple destinations simultaneously with different filters per destination.
Example: forwarding to a SIEM at notice or higher
switch(config)# logging 10.10.10.70 vrf mgmt severity notice
switch(config)# logging persistent-log severity warning
switch(config)# logging filter siem-filter
switch(config-logging-filter)# include event-id LINK-3-LINKDOWN severity error
Sending levels 5 (Notice) and below — meaning numerically lower, more severe — to your SIEM keeps the volume manageable while still catching anything operationally interesting. Reserve Debug (level 7) for active troubleshooting because it is extremely chatty.
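Because lower numbers mean higher severity, the filter is a less-than-or-equal comparison, which is easy to invert by accident. The Python sketch below shows the comparison along with the RFC 5424 priority calculation (facility × 8 + severity) that appears in angle brackets at the start of a syslog message. The function names are hypothetical, and facility 23 (local7) is only an example.

```python
SEVERITY = {
    "emergency": 0, "alert": 1, "critical": 2, "error": 3,
    "warning": 4, "notice": 5, "informational": 6, "debug": 7,
}


def passes(message_severity: str, threshold: str) -> bool:
    """True if the message is at least as severe as the threshold."""
    return SEVERITY[message_severity] <= SEVERITY[threshold]


def pri(facility: int, severity: str) -> int:
    """RFC 5424 PRI value: facility * 8 + severity."""
    return facility * 8 + SEVERITY[severity]
```

With a threshold of notice, an error message is forwarded and an informational one is not. A notice from facility 23 carries PRI 189.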
Tip: Set clock timezone and configure NTP before you turn on remote syslog. A SIEM full of log entries with the wrong timestamp is worse than no logs at all when you are correlating an incident.
9.1.4 Mirror Sessions (Port Mirroring / SPAN)
When sampled telemetry is not enough — a security investigation, a packet-capture-driven troubleshoot, an IDS feed — you need every packet, not a sample. AOS-CX provides mirror sessions (the AOS-CX term for SPAN/RSPAN-style port mirroring). A mirror session captures traffic on one or more source interfaces, optionally in one or both directions, and replicates it to a destination interface or remote endpoint over GRE (ERSPAN-equivalent).
Example: local mirror to a capture port
switch(config)# mirror session 1
switch(config-mirror-1)# source interface 1/1/10 both
switch(config-mirror-1)# destination interface 1/1/48
switch(config-mirror-1)# no shutdown
This sends ingress and egress traffic from port 1/1/10 to port 1/1/48, where a Wireshark host or NDR appliance lives. Be aware that the destination port becomes dedicated — it cannot forward normal traffic — and that mirror traffic is unidirectional in the sense that the destination cannot reply on that interface.
Table 9-3. Choosing the right traditional monitoring tool
| Need | Best Tool | Why |
|---|---|---|
| Periodic counters (utilization, errors) | SNMPv3 | Standard NMS integration |
| Top talkers, conversations, app mix | sFlow | Sampled flows, low overhead |
| Audit trail and event timeline | Syslog | Persistent, correlatable |
| Full packet capture for deep dive | Mirror Session | 100% of bytes, all headers |
| Real-time threshold alerting on switch state | NAE (next section) | On-box, programmable |
Figure 9.1: Monitoring tool selection decision tree
flowchart TD
Start[What do you need to monitor?] --> Q1{Need every packet<br/>for forensics or IDS?}
Q1 -->|Yes| Mirror[Mirror Session<br/>SPAN/RSPAN/ERSPAN]
Q1 -->|No| Q2{Need traffic patterns<br/>and top talkers?}
Q2 -->|Yes| SFlow[sFlow<br/>sampled packets +<br/>counter polls]
Q2 -->|No| Q3{Need event timeline<br/>or audit trail?}
Q3 -->|Yes| Syslog[Syslog<br/>severity-filtered<br/>to SIEM]
Q3 -->|No| Q4{Need periodic<br/>counters / OIDs?}
Q4 -->|Yes| SNMP[SNMPv3 authPriv<br/>traps + polls]
Q4 -->|No| Q5{Need on-box<br/>reactive alerting?}
Q5 -->|Yes| NAE[NAE Agent<br/>see Section 9.3]
Q5 -->|No| Rethink[Reconsider<br/>requirements]
style Mirror fill:#1f4068,stroke:#58a6ff,color:#fff
style SFlow fill:#1f4068,stroke:#58a6ff,color:#fff
style Syslog fill:#1f4068,stroke:#58a6ff,color:#fff
style SNMP fill:#1f4068,stroke:#58a6ff,color:#fff
style NAE fill:#1f4068,stroke:#58a6ff,color:#fff
[Source: https://developer.arubanetworks.com/aoscx/docs/nae-getting-started]
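The decision tree in Figure 9.1 can also be read as an ordered series of checks. The sketch below is only a study aid: the function name and its keyword arguments are hypothetical, and the return values are the tool names from Table 9-3.

```python
def choose_monitoring_tool(need_every_packet=False, need_traffic_patterns=False,
                           need_event_timeline=False, need_counters=False,
                           need_onbox_alerting=False):
    """Walk the questions of Figure 9.1 in order and return the first match."""
    if need_every_packet:
        return "Mirror Session"
    if need_traffic_patterns:
        return "sFlow"
    if need_event_timeline:
        return "Syslog"
    if need_counters:
        return "SNMPv3"
    if need_onbox_alerting:
        return "NAE"
    return "Reconsider requirements"
```

Because the checks run top to bottom, a requirement for full packet capture wins over every other need, which matches the order of the questions in the figure.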
9.2 REST API and Automation
The CLI is not going anywhere — you will spend plenty of time in it — but the future of operations is programmatic. AOS-CX exposes its entire configuration and state through a versioned REST API. Anything you can do in the CLI you can do via HTTPS calls, and anything in the database is reachable via a URI.
9.2.1 Enabling and Authenticating to the AOS-CX REST API
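Before you can call the API, confirm that it is enabled and writable. Defaults vary by platform and firmware release: on some, the REST server starts in read-only mode or is not enabled on any VRF (check with show https-server). To enable read-write access and bind the server to the management VRF: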
switch(config)# https-server rest access-mode read-write
switch(config)# https-server vrf mgmt
The API is then reachable at https://<switch-ip>/rest/v10.04/ (the version suffix depends on your firmware — v10.08, v10.09, etc.). Each switch also hosts an interactive Swagger UI at /api/ that documents every endpoint and lets you “try it now” against the live device.
Authentication is session-cookie based. You POST your credentials to /login, and a successful response (HTTP 200) sets a cookie that you reuse on every subsequent call until you POST /logout.
import requests
import json
base_url = "https://10.10.10.5/rest/v10.04"
# AOS-CX expects form-encoded "username" and "password" fields at /login
creds = {"username": "admin", "password": "ArubaNet!"}
session = requests.Session()
# verify=False skips TLS certificate validation; acceptable only in a lab
r = session.post(f"{base_url}/login", data=creds, verify=False)
print("Login:", r.status_code)  # 200 = success, cookie stored in session
The requests.Session object handles the cookie jar automatically — every later call through session.get(...) or session.post(...) carries the auth cookie without any manual work on your part.
Figure 9.2: REST API session-cookie authentication flow
sequenceDiagram
participant C as Client (Python)
participant S as AOS-CX Switch<br/>(REST API)
participant DB as Switch DB
C->>S: POST /login<br/>username, password
S->>S: Validate credentials
S-->>C: 200 OK<br/>Set-Cookie: session=...
Note over C: Cookie stored in<br/>requests.Session jar
C->>S: GET /system/vlans<br/>Cookie: session=...
S->>DB: Query VLAN objects
DB-->>S: VLAN list
S-->>C: 200 OK<br/>JSON payload
C->>S: POST /system/vlans<br/>{id, name, admin}<br/>Cookie: session=...
S->>DB: Create VLAN object
DB-->>S: Object created
S-->>C: 201 Created
C->>S: POST /logout<br/>Cookie: session=...
S->>S: Invalidate session
S-->>C: 200 OK<br/>Cookie cleared
[Source: https://developer.arubanetworks.com/aoscx/v10.04/docs/python-getting-started]
9.2.2 GET and POST Against the Configuration Tree
The AOS-CX object model is a tree. /system is the root for switch-level objects, with children like /system/vlans, /system/interfaces, /system/vrfs, and so on. A GET returns configuration, status, or statistics attributes depending on the selector query parameter; a POST creates a new child object; PUT replaces; PATCH merges; DELETE removes.
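Object names that contain a slash, such as interface 1/1/1, have to be percent-encoded before they can appear as a single path segment. A minimal sketch of a URI builder follows; build_uri is a hypothetical helper written for this guide, not part of any HPE library.

```python
from urllib.parse import quote, urlencode

def build_uri(base_url, *segments, **query):
    """Join path segments under base_url, percent-encoding each one."""
    # safe="" forces "/" inside a segment (e.g. 1/1/1) to become %2F
    path = "/".join(quote(str(s), safe="") for s in segments)
    uri = f"{base_url.rstrip('/')}/{path}"
    if query:
        uri += "?" + urlencode(query)
    return uri
```

For example, build_uri(base, "system", "interfaces", "1/1/1", attributes="link_state") yields a path ending in /system/interfaces/1%2F1%2F1?attributes=link_state.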
Listing all VLANs
r = session.get(f"{base_url}/system/vlans?selector=configuration", verify=False)
print(json.dumps(r.json(), indent=2))
Creating VLAN 200
new_vlan = {
    "id": 200,
    "name": "Guest",
    "admin": "up",
    "type": "static"
}
r = session.post(f"{base_url}/system/vlans", data=json.dumps(new_vlan), verify=False)
print("Create VLAN:", r.status_code) # 201 = created
Equivalent with curl
# Login (save the cookie); AOS-CX expects form-encoded username and password
curl -k -c cookie.txt \
  -d 'username=admin&password=ArubaNet!' \
  https://10.10.10.5/rest/v10.04/login
# Create VLAN 200
curl -k -b cookie.txt -H "Content-Type: application/json" \
-d '{"id":200,"name":"Guest","admin":"up","type":"static"}' \
-X POST https://10.10.10.5/rest/v10.04/system/vlans
Always log out when you are done:
session.post(f"{base_url}/logout", verify=False)
A leaked cookie is just as dangerous as a leaked password — clean up.
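One way to make the logout hard to forget is to wrap the login/logout pair in a context manager, so the logout runs even when the work in between raises an exception. This is a sketch under assumptions: rest_session is a hypothetical helper, session is any object with a requests-style post method, and credentials is whatever login body your switch expects.

```python
from contextlib import contextmanager

@contextmanager
def rest_session(session, base_url, credentials):
    """Log in, hand the session to the caller, and always log out afterwards."""
    r = session.post(f"{base_url}/login", data=credentials, verify=False)
    r.raise_for_status()  # a failed login raises here, before any logout
    try:
        yield session
    finally:
        # Runs on normal exit and on exceptions alike
        session.post(f"{base_url}/logout", verify=False)
```

Usage would look like `with rest_session(requests.Session(), base_url, creds) as s:` followed by the GET and POST calls shown earlier.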
9.2.3 The pyaoscx Python SDK
Writing raw requests calls is fine for one-offs, but for serious automation HPE provides the pyaoscx SDK, a Python library that wraps the REST API in idiomatic Python objects.
pip install pyaoscx
from pyaoscx.session import Session
from pyaoscx.vlan import Vlan
s = Session("10.10.10.5", "10.04")
s.open("admin", "ArubaNet!")
# Object-oriented: create a VLAN
v = Vlan(s, vlan_id=300, name="IoT", admin_state="up")
v.apply() # POST under the hood
# Read state
v.get()
print(v.materialized, v.name)
s.close()
pyaoscx handles cookie management, error parsing, payload construction, and even materialization (lazy loading of children). It is what you should reach for inside Ansible modules, NAE callbacks, or any Python-based orchestration.
[Source: https://github.com/aruba/pyaoscx]
9.2.4 On-Box Python and Ansible Collections
AOS-CX can run Python on the switch, not just from your laptop. On-box Python is most often invoked from NAE scripts (covered in §9.3), but you can also drop a Python script into the switch’s user storage and execute it from the CLI for one-off automation.
For multi-switch orchestration, HPE publishes the arubanetworks.aoscx Ansible Collection on Ansible Galaxy:
ansible-galaxy collection install arubanetworks.aoscx
A small playbook to push a VLAN to a fleet:
- hosts: aoscx_switches
  collections: [arubanetworks.aoscx]
  tasks:
    - name: Ensure Guest VLAN exists
      aoscx_vlan:
        vlan_id: 200
        name: Guest
        admin_state: up
        state: create
The collection uses pyaoscx under the hood, so you get the same authentication and object model with a declarative front-end. For organizations already on Ansible Tower / AAP, this is the path of least resistance.
Analogy: Think of the layers like building blocks. Raw curl is hand-laying bricks. requests + JSON is using a trowel. pyaoscx is using a brick-laying machine. Ansible is hiring a contractor with a brick-laying machine and a blueprint.
9.3 The Network Analytics Engine (NAE)
If §9.2 was about you automating the switch, this section is about the switch automating itself. The Network Analytics Engine (NAE) is a built-in framework that runs Python scripts directly on AOS-CX to monitor, alert on, and remediate network conditions — without an external server, without polling latency, and without a separate license.
9.3.1 Agent Architecture
The five NAE building blocks form a tidy hierarchy:
Table 9-4. NAE concepts
| Concept | Role | Example |
|---|---|---|
| Script | Python file that defines monitors, conditions, actions | interface_link_flap.py |
| Agent | An instance of a script with parameter values | "MonitorPort1/1/1" agent created from the script |
| Monitor | Subscribes to switch state via REST URI | CPU utilization, interface link state |
| Condition | Boolean expression on monitor values | cpu > 90 for 30 seconds |
| Action | Code executed when a condition fires | Send syslog, run CLI, call webhook |
A script is a template; an agent is a running instance. You can instantiate the same script as ten agents, each watching a different interface or VRF, with different thresholds. The agent runs natively on the switch, reads from the switch’s config and time-series databases, and writes its own time-series data for the Web UI to graph.
Figure 9.3: NAE script -> agent -> monitor -> condition -> action hierarchy
graph TD
Script[Script<br/>interface_link_flap.py<br/>Python file with Manifest<br/>+ ParameterDefinitions]
Script --> AgentA[Agent A<br/>MonitorPort1/1/1<br/>params: interface=1/1/1]
Script --> AgentB[Agent B<br/>MonitorPort1/1/24<br/>params: interface=1/1/24]
Script --> AgentC[Agent C<br/>MonitorUplink<br/>params: interface=1/1/49]
AgentA --> M1[Monitor m1<br/>URI: /system/interfaces/<br/>1%2F1%2F1?attributes=link_state]
M1 --> Cond1[Condition r1<br/>m1 == 'down']
Cond1 --> ActA1[Action: ActionSyslog<br/>severity=WARNING]
Cond1 --> ActA2[Action: ActionCLI<br/>show interface 1/1/1]
AgentB --> M2[Monitor m1<br/>same URI template<br/>different interface]
M2 --> Cond2[Condition r1<br/>m1 == 'down']
Cond2 --> ActB[Action: webhook +<br/>self-heal via REST API]
style Script fill:#1f4068,stroke:#58a6ff,color:#fff
style AgentA fill:#264d73,stroke:#58a6ff,color:#fff
style AgentB fill:#264d73,stroke:#58a6ff,color:#fff
style AgentC fill:#264d73,stroke:#58a6ff,color:#fff
style M1 fill:#2d5a87,stroke:#58a6ff,color:#fff
style M2 fill:#2d5a87,stroke:#58a6ff,color:#fff
style Cond1 fill:#3d6a97,stroke:#58a6ff,color:#fff
style Cond2 fill:#3d6a97,stroke:#58a6ff,color:#fff
style ActA1 fill:#4d7aa7,stroke:#58a6ff,color:#fff
style ActA2 fill:#4d7aa7,stroke:#58a6ff,color:#fff
style ActB fill:#4d7aa7,stroke:#58a6ff,color:#fff
+-----------------------------------------------+
| AOS-CX switch |
| |
| +--------+ +-----------+ +-----------+ |
| | Script | ->| Agent A |-->| Monitor | |
| +--------+ | (CPU) | | cpu_util | |
| \ +-----------+ +-----------+ |
| \ | | |
| \ Condition Time-series DB |
| \ cpu > 90 | |
| \ | | |
| v +--------+ +----------+ |
| -->| Action | | Web UI | |
| | syslog | | graphs | |
| +--------+ +----------+ |
+-----------------------------------------------+
9.3.2 Monitors, Conditions, and Actions
A monitor is just a subscription to a REST URI. The switch already exposes /rest/v10.04/system/interfaces/1%2F1%2F1?attributes=link_state — wrap that in Monitor() and the agent gets a callback every time the value changes.
A condition is a Python expression evaluated on the monitor’s current value. A condition fires once when it transitions from false to true.
Actions are arbitrary Python — you can call ActionSyslog, ActionCLI, ActionShell, send an email, hit a webhook, or even call back into the AOS-CX REST API to change configuration (this is how self-healing works).
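The "fires once on a false-to-true transition" behavior is worth seeing in isolation. The class below is a plain-Python illustration of edge triggering, not the NAE framework's own implementation; EdgeTriggeredCondition and its method names are hypothetical.

```python
class EdgeTriggeredCondition:
    """Fires its action once on each false -> true transition of a predicate."""

    def __init__(self, predicate, action):
        self.predicate = predicate
        self.action = action
        self.active = False  # truth value at the previous update

    def update(self, value):
        """Feed a new monitor value; return True if the action fired."""
        now_true = bool(self.predicate(value))
        fired = now_true and not self.active
        if fired:
            self.action(value)
        self.active = now_true
        return fired
```

Feeding the values up, down, down, up, down fires the action on the second and fifth updates only: a link that stays down does not re-alert until it has recovered and failed again.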
Skeleton of a tiny NAE script
# The NAE framework on the switch supplies NAE, Monitor, Rule, ActionSyslog,
# ActionCLI, and SYSLOG_WARNING when it loads the script; no imports are needed.
Manifest = {
    'Name': 'interface_link_monitor',
    'Description': 'Alert when a monitored interface goes down',
    'Version': '1.0',
    'Author': 'HPE7-A01 Study',
}

ParameterDefinitions = {
    'interface_id': {
        'Name': 'Interface',
        'Description': 'Interface to monitor (e.g. 1/1/1)',
        'Type': 'string',
        'Default': '1/1/1'
    }
}

class Agent(NAE):
    def __init__(self):
        # The parameter list fills the {} placeholder in the URI
        uri = '/rest/v10.04/system/interfaces/{}?attributes=link_state'
        self.m1 = Monitor(uri, 'Link State', [self.params['interface_id']])
        self.r1 = Rule('Interface Down')
        self.r1.condition('{} == "down"', [self.m1])
        self.r1.action(self.action_alert)

    def action_alert(self, event):
        ActionSyslog('Interface {} went DOWN', [self.params['interface_id']],
                     severity=SYSLOG_WARNING)
        ActionCLI('show interface {}', [self.params['interface_id']])
When the link drops, the agent emits a syslog message and runs show interface against itself, capturing diagnostic output to the agent’s history pane in the Web UI — exactly the data the on-call engineer would have collected manually.
9.3.3 Pre-Built and Custom Scripts
You do not have to write everything from scratch. HPE maintains a public, tested library of NAE scripts at github.com/aruba/nae-scripts covering:
- BGP/OSPF neighbor monitoring with auto-remediation hooks
- DHCP snooping anomalies
- High CPU and memory pressure detection
- MAC table churn detection
- PoE budget alerts
- Buffer congestion and microburst tracking
- Loop and STP topology change detection
Workflow to deploy:
- Download the .py script from the GitHub repo.
- In the AOS-CX Web UI, navigate to Analytics > Scripts > Upload. The UI base64-encodes for you. (Via REST API you must base64-encode the file body yourself.)
- From Analytics > Agents, create an agent from the uploaded script. Set its parameters (which interface, threshold, etc.).
- Enable the agent.
Admin privileges are required for both script upload and agent creation. There is no separate NAE license — it is included with AOS-CX. A simplified variant called NAE-Lite ships on lower-end platforms for basic log monitoring; full NAE on the higher-end CX models supports the complete Python execution environment.
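For the REST upload path, the base64 step mentioned in the workflow is a one-liner in Python. The sketch below covers only the encoding; encode_script is a hypothetical helper, and the upload endpoint and payload shape are deliberately left out because they should be taken from the switch's own API documentation.

```python
import base64

def encode_script(source_bytes):
    """Return the base64 text form of a script's raw bytes."""
    return base64.b64encode(source_bytes).decode("ascii")
```

In practice you would read the file in binary mode (open(path, "rb").read()) and pass the result to encode_script before placing it in the request body.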
9.3.4 Time-Series Visualization
Every monitor automatically populates a time-series chart in the Web UI. When a condition fires, a vertical marker is drawn on the chart with a link to the action output, so you can see the moment the spike crossed the line and what the agent did about it. This visual correlation between configuration change, state change, and remediation is the headline feature that distinguishes NAE from “just running a Python script every minute via cron.”
Analogy: NAE is an embedded SRE. It never sleeps, never blinks, has the same access to the switch you do, and follows a runbook you wrote — including paging itself, gathering diagnostics, and (if you trust it enough) fixing the problem.
[Source: https://developer.arubanetworks.com/aoscx/docs/nae-getting-started] [Source: https://github.com/aruba/nae-scripts]
9.4 Aruba Central and NetEdit
NAE handles a single switch beautifully. But what about a campus of 200 switches across 15 sites? That is the job of network-wide orchestration, and HPE offers two complementary tools: Aruba Central (cloud-managed) and NetEdit (on-premises).
9.4.1 Aruba Central — Cloud Onboarding and ZTP
Aruba Central is HPE’s cloud-based network management platform. For AOS-CX it offers monitoring, group-based configuration, firmware management, AI Insights, and — the killer feature — Zero Touch Provisioning (ZTP).
ZTP lets you ship a switch directly from HPE to a remote site, have the local hands plug in two cables (uplink and power), and have the switch auto-configure itself with no field engineer required. The flow:
- Pre-provision in Central. From portal.central.arubanetworks.com, go to Account Home > Device Inventory > Add Devices and enter the serial number and MAC address (from the box label or HPE’s order export). Up to 30 at once, or upload a CSV.
- Assign a license. HPE GreenLake licensing must be applied to each device or it will not show up in pre-provisioning lists.
- Place in a template group. Create a group with a configuration template plus a JSON variables file keyed by serial or MAC. Per-device values (hostname, management IP, VLANs) come from this file, while the template provides the structure.
- Power on at the remote site. A factory-default switch with internet-reachable DHCP will:
  - Get a DHCP lease.
  - Resolve and connect to activate.arubanetworks.com.
  - Be redirected to your Central tenant based on its serial.
  - Pull its template + variables, apply config, and establish an IPSec tunnel back to Central for ongoing management.
- Verify. In Central UI the device transitions from “down” to “up.” On the device, show system confirms IP, serial, MAC, and licenses.
[Switch boot]
|
v
DHCP lease (Internet) -> activate.arubanetworks.com
|
v
Redirected to your Central tenant (matched by serial)
|
v
Pull template + JSON variables -> apply running-config
|
v
Establish IPSec tunnel -> appear "up" in Central
If the switch never connects, the usual culprits are: no internet from the DHCP-provided default gateway, a firewall blocking activate.arubanetworks.com, or a typo in the serial number / MAC in the inventory.
Figure 9.4: Aruba Central ZTP onboarding sequence
sequenceDiagram
participant Admin as Network Admin
participant Central as Aruba Central<br/>(portal)
participant Activate as activate.<br/>arubanetworks.com
participant DHCP as Site DHCP /<br/>Internet
participant SW as Factory-Default<br/>AOS-CX Switch
Admin->>Central: Add device:<br/>Serial + MAC
Admin->>Central: Apply GreenLake<br/>license
Admin->>Central: Place in template<br/>group + JSON vars
Note over SW: Switch shipped to<br/>remote site, plugged in
SW->>DHCP: DHCP DISCOVER
DHCP-->>SW: Lease + gateway + DNS
SW->>Activate: HTTPS connect<br/>(serial in cert/req)
Activate->>Central: Lookup tenant<br/>by serial
Activate-->>SW: Redirect to<br/>tenant URL
SW->>Central: Connect with serial
Central->>Central: Match device,<br/>render template<br/>with JSON vars
Central-->>SW: Push running-config
SW->>SW: Apply config,<br/>reboot interfaces
SW->>Central: Establish IPSec tunnel
Central-->>Admin: Device shows "up"<br/>in dashboard
Admin->>SW: show system<br/>(verify)
SW-->>Admin: IP, serial, MAC,<br/>license OK
9.4.2 Configuration Groups in Central
Central organizes devices into groups. A group has a configuration model (UI-based or template-based) and a license tier. Devices in a UI-managed group inherit settings configured graphically; devices in a template-managed group inherit a Jinja2-style template rendered with their per-device variables.
Table 9-5. UI groups vs Template groups
| Aspect | UI Group | Template Group |
|---|---|---|
| Editor | GUI form-driven | Text template + JSON vars |
| Best for | Small fleets, simple configs | Large fleets, scripted ops |
| Per-device customization | Limited | Full (one JSON entry per device) |
| Drift detection | Automatic | Automatic |
| Audit trail | Yes | Yes |
| Skill required | Junior network admin | Engineer comfortable with templating |
A common pattern is a UI group for a small lab and template groups for production sites where consistency and version control matter.
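The template-group idea (one shared template, one variables entry per device) can be sketched with Python's standard string.Template. This is a stand-in for illustration only: Central's real template syntax and variables-file schema differ, and the serial numbers, variable names, and render_for_device helper below are all hypothetical.

```python
from string import Template

def render_for_device(template_text, variables_by_serial, serial):
    """Render the shared template with one device's variables."""
    device_vars = variables_by_serial[serial]  # KeyError if the device is missing
    # substitute() raises KeyError when a placeholder has no value, which
    # surfaces an incomplete variables entry before anything is deployed
    return Template(template_text).substitute(device_vars)

TEMPLATE = "hostname $hostname\nvlan $guest_vlan\n    name Guest\n"
VARIABLES = {
    "SG12345678": {"hostname": "site1-acc01", "guest_vlan": "200"},
    "SG87654321": {"hostname": "site2-acc01", "guest_vlan": "210"},
}
```

Rendering the same TEMPLATE for each serial produces two configurations with identical structure and different hostnames and VLAN IDs, which is the consistency argument for template groups.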
9.4.3 NetEdit for On-Prem Orchestration
Some customers cannot or will not put their switches in the cloud — government, certain regulated industries, air-gapped networks. Aruba NetEdit is HPE’s on-premises alternative. It is a virtual appliance (VM) that you deploy in your own data center and that talks to switches over the AOS-CX REST API.
Key NetEdit capabilities:
- Configuration validation before deployment (catch typos, missing references, conflicts)
- Multi-switch transactions — apply a change to many switches atomically; if any fails, all roll back
- Compliance checking — define a “golden config” and continuously check switches against it
- Visual diff before and after a change with policy-aware syntax checking
- Change history with undo
NetEdit is particularly powerful for change windows where you must, for example, push a new ACL to 50 access switches simultaneously and absolutely must not leave the fleet half-changed.
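The all-or-nothing behavior can be pictured as "apply in order, undo in reverse on the first failure." The sketch below illustrates that pattern with in-memory stand-ins; it is not how NetEdit is implemented, and FakeSwitch and apply_atomically are hypothetical names.

```python
class FakeSwitch:
    """In-memory stand-in for a managed switch."""

    def __init__(self, name, fail=False):
        self.name, self.fail, self.config = name, fail, []

    def apply(self, change):
        if self.fail:
            raise RuntimeError(f"{self.name} rejected the change")
        self.config.append(change)

    def rollback(self):
        self.config.pop()  # undo the most recent change

def apply_atomically(devices, change):
    """Apply change to every device; undo all of them if any one fails."""
    applied = []
    try:
        for dev in devices:
            dev.apply(change)
            applied.append(dev)
    except Exception:
        # Roll back in reverse order so no device keeps the partial change
        for dev in reversed(applied):
            dev.rollback()
        return False
    return True
```

If the second of three switches rejects the ACL, the first is rolled back and the third is never touched, so the fleet ends in its original state.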
9.4.4 Compliance and Validation
Both Central (in template groups) and NetEdit continuously compare actual device state against intended configuration. Drift — caused by an admin making a CLI change directly on the switch — is flagged in the dashboard and can be auto-remediated by reapplying the template. This closes the loop between intent (what the template says) and reality (what the switch is actually running) and is the single biggest operational win of moving away from CLI-only management.
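At its simplest, drift detection is a diff between the intended text and what the device reports. A minimal sketch using Python's difflib follows; config_drift is a hypothetical helper, and real tools compare structured configuration rather than raw lines.

```python
import difflib

def config_drift(intended, running):
    """Return unified-diff lines describing how running differs from intended."""
    return list(difflib.unified_diff(
        intended.splitlines(), running.splitlines(),
        fromfile="intended", tofile="running", lineterm=""))
```

An empty list means the device matches its baseline; any lines starting with - or + show what an out-of-band CLI change removed or added.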
Tip: A common exam question contrasts the two: Central is cloud-hosted; NetEdit is on-prem. Both validate config and detect drift, but only Central does ZTP from activate.arubanetworks.com.
[Source: https://support.hpe.com/hpesc/public/docDisplay?docId=sf000099376en_us&docLocale=en_US]
Chapter Summary
Modern AOS-CX operations are layered:
- SNMP, sFlow, Syslog, and mirror sessions remain the workhorses for integration with third-party NMS, flow collectors, SIEMs, and packet brokers. SNMPv3 with authPriv is the only acceptable SNMP profile in production. sFlow gives you statistically sound traffic visibility at low overhead. Syslog with a meaningful severity filter feeds your SIEM without drowning it.
- The AOS-CX REST API turns the switch into a programmable platform. Login is a session-cookie POST; the configuration tree is reachable under /rest/vXX.YY/system/...; CRUD operations follow standard HTTP verbs. The pyaoscx SDK and the arubanetworks.aoscx Ansible Collection make this comfortable for both scripts and infrastructure-as-code.
- The Network Analytics Engine runs Python on the switch itself, defining monitors, conditions, and actions. Agents are instances of scripts that continuously evaluate state from the AOS-CX databases and act when thresholds are crossed, including invoking the REST API to self-heal.
- Aruba Central delivers cloud-based ZTP, group configuration, drift detection, and AI Insights. Pre-provision serial + MAC, assign a GreenLake license, place in a template group, and a factory-default switch can boot itself into your network with no truck roll.
- NetEdit is the on-prem equivalent for organizations that keep management inside their own perimeter, with strong compliance, validation, and atomic multi-switch transactions.
The exam-relevant summary: pick SNMPv3, sFlow, and remote Syslog for telemetry; pick REST API + pyaoscx for ad-hoc automation; pick NAE for on-switch reactive monitoring; pick Aruba Central for cloud-managed ZTP and Central groups; pick NetEdit when you need on-prem orchestration with rollback.
Key Terms
- SNMP (Simple Network Management Protocol): Polling protocol for device metrics; v2c uses cleartext community strings, v3 adds authentication and encryption.
- SNMPv3 Security Levels: noAuthNoPriv, authNoPriv, authPriv. Only authPriv is acceptable in production.
- sFlow: Sampled-packet plus counter-poll telemetry standard (RFC 3176) used to identify traffic patterns and top talkers with low CPU overhead.
- Sampling Rate: Frequency of sFlow packet sampling, expressed as 1-in-N (e.g., 1-in-4096); a lower N samples more packets, giving more detail at higher CPU cost.
- Syslog: Event logging protocol; severity 0 (Emergency) through 7 (Debug); AOS-CX can forward to multiple servers with per-destination filters.
- Mirror Session: AOS-CX port mirroring (SPAN/RSPAN/ERSPAN equivalent) for full packet capture.
- REST API: HTTPS-based programmatic interface to AOS-CX; session-cookie authentication via POST /login.
- pyaoscx: Official Python SDK that wraps the AOS-CX REST API with object-oriented constructs.
- Swagger UI: On-switch interactive REST API documentation served at /api/.
- NAE (Network Analytics Engine): Built-in AOS-CX framework that runs Python scripts to monitor, alert, and remediate. No additional license required.
- Script / Agent / Monitor / Condition / Action: The five NAE abstractions: a script defines the logic; an agent is a running instance; monitors subscribe to state; conditions are Boolean triggers; actions are Python remediation steps.
- NAE-Lite: Reduced NAE feature set on lower-tier platforms, focused on log monitoring.
- Aruba Central: HPE’s cloud-managed network platform; supports ZTP, group configuration, AI Insights, and drift detection.
- ZTP (Zero Touch Provisioning): Factory-default switch boots, contacts activate.arubanetworks.com, and pulls config from Central with no manual on-site configuration.
- Template Group: Central group type that combines a config template with a JSON variables file keyed by serial/MAC.
- NetEdit: HPE’s on-prem orchestration tool for AOS-CX; provides validation, atomic multi-switch transactions, and compliance checking.
- Compliance / Drift Detection: Continuous comparison of running config against an intended baseline; both Central and NetEdit support this.
[Source: https://developer.arubanetworks.com/aoscx/docs/nae-getting-started] [Source: https://github.com/aruba/pyaoscx] [Source: https://github.com/aruba/nae-scripts] [Source: https://pyaoscx.readthedocs.io/en/stable/] [Source: https://developer.arubanetworks.com/aoscx/docs/aos-cx-swagger-ui] [Source: https://support.hpe.com/hpesc/public/docDisplay?docId=sf000099376en_us&docLocale=en_US]
Chapter 10: Troubleshooting, Upgrade Workflows, and Exam Strategy
By this point you have built campus topologies, secured them, and watched them auto-heal. The final mile of the HPE7-A01 journey is what to do when something breaks, when it is time to push new code, and when the green “Start Exam” button appears in front of you at Pearson VUE. This chapter binds those three skills together: a structured troubleshooting methodology, the AOS-CX upgrade and recovery toolbox, and a study/exam-day plan tuned to the official blueprint.
Think of this chapter as the post-flight checklist for everything you have learned. A pilot does not improvise when an engine surges — they reach for a known sequence. A network engineer in a Friday-night change window does the same.
Section 1: Structured Troubleshooting
1.1 The OSI Methodology in Practice
When a user reports “the network is broken,” that statement is almost always wrong in its scope. Something specific is broken — a cable, an optic, a VLAN, a route, a DNS lookup, a TLS handshake. The OSI model is the universal triage tool because it forces you to ask the question one layer at a time, from the bottom up.
Bottom-up troubleshooting starts at Layer 1 because cheaper-to-check problems live at the physical layer and because every higher layer assumes the one below it is healthy. There is no point debugging OSPF adjacency if the optic is missing.
Top-down troubleshooting starts at the application layer and is faster when the symptom is clearly application-level (e.g., one web app fails while ICMP, DNS, and other apps work). For HPE7-A01, expect to see questions where the symptom dictates the direction.
Divide-and-conquer picks Layer 3 or 4 first (“can you ping the gateway?”), then narrows up or down based on the answer. It is the most common approach for experienced engineers.
Figure 10.1: OSI-layer troubleshooting decision tree
flowchart TD
Start([User reports problem]) --> Scope{Symptom scope?}
Scope -->|Physical / link down| L1[L1: show interface<br/>show interface transceiver]
Scope -->|App-specific failure| L7[L7: DNS, AAA, cert checks]
Scope -->|Reachability unclear| L3[L3: ping gateway<br/>show ip route]
L1 -->|Errors / no light| FixL1[Replace cable / optic<br/>check DOM dBm]
L1 -->|Link OK| L2[L2: show vlan<br/>show mac-address-table<br/>show spanning-tree]
L2 -->|VLAN / STP issue| FixL2[Correct trunk / STP role]
L2 -->|L2 OK| L3
L3 -->|No route / no neighbor| L3Fix[show ip ospf neighbor<br/>show ip interface brief]
L3 -->|L3 OK| L4[L4: show access-list hitcounts<br/>TCP/UDP port checks]
L4 -->|Drops / ACL hits| L4Fix[Adjust ACL / firewall]
L4 -->|L4 OK| L7
L7 -->|Auth / DNS fail| L7Fix[show aaa authentication<br/>RADIUS test, ClearPass logs]
L7 -->|All layers clean| Capture[Mirror session +<br/>show tech for TAC]
| Layer | Typical Symptoms | First-Look AOS-CX Commands |
|---|---|---|
| L1 Physical | Link down, CRC errors, FCS errors, intermittent flaps | show interface, show interface transceiver, show interface <id> error-statistics |
| L2 Data Link | VLAN mismatch, MAC flapping, STP topology change | show vlan, show mac-address-table, show spanning-tree |
| L3 Network | No route, wrong next hop, OSPF down | show ip route, show ip ospf neighbor, show ip interface brief |
| L4 Transport | TCP resets, blocked ports, ACL drops | show access-list hitcounts, show tech (selective) |
| L5–L7 | DNS, AAA, certificate failures | show aaa authentication, RADIUS test, ClearPass logs |
Analogy. OSI troubleshooting is like diagnosing a house with no water pressure. You do not start by inspecting the showerhead. You walk to the street, then the meter, then the main valve, then the hot-water tank, then the branch line. Each step rules out an entire family of causes.
[Source: https://support.hpe.com/hpesc/public/docDisplay?docId=sd00007167en_us&page=GUID-37E0392B-4299-4AA0-9832-FBBA00D389D5.html] [Source: https://www.youtube.com/watch?v=A1i8iTroCec]
1.2 Cabling, Optics, and Transceiver DOM
Layer 1 is responsible for a surprising share of “complex” outages. AOS-CX exposes Digital Optical Monitoring (DOM) data on every supported transceiver — temperature, supply voltage, bias current, TX power, and RX power. These five values catch dying optics before they cause an outage.
switch# show interface 1/1/49 transceiver detail
Interface 1/1/49
Transceiver Type : 10GBASE-SR
Vendor Name : HPE
Part Number : J9150D
Serial Number : MY12345678
Connector Type : LC
Wavelength : 850 nm
Diagnostic Information:
Temperature : 38.4 C (warn 0/70, alarm -5/75)
Voltage : 3.30 V
TX Bias : 6.4 mA
TX Power : -2.1 dBm (warn -8.2/0.5)
RX Power : -4.7 dBm (warn -11.3/0.5)
Healthy multimode 10G optics typically run TX/RX between -1 dBm and -7 dBm. An RX value drifting toward -10 dBm signals a dirty connector, a bent fiber, or a failing far-end laser. Memorize: lower (more negative) dBm = weaker signal.
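Because dBm is a logarithmic unit referenced to 1 mW, a short conversion makes the "more negative = weaker" rule concrete. The helpers below are a sketch; the function names are hypothetical, and the thresholds you pass in should come from the transceiver's own warning values as shown in the DOM output.

```python
def dbm_to_mw(dbm):
    """Convert optical power from dBm to milliwatts (0 dBm = 1 mW)."""
    return 10 ** (dbm / 10)

def rx_status(rx_dbm, warn_low, warn_high):
    """Classify an RX reading against the low/high warning thresholds."""
    if rx_dbm < warn_low:
        return "low"
    if rx_dbm > warn_high:
        return "high"
    return "ok"
```

Every 10 dB drop is a tenfold loss in power, so a reading of -10 dBm carries one tenth of the power of 0 dBm, and -3 dBm is roughly half.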
Counters tell the same story over time. show interface <id> reports input/output errors, runts, giants, CRC, and collisions. Persistent CRC errors point to cabling or optic issues; runts often indicate a duplex mismatch; broadcast storms suggest a Layer 2 loop that STP failed to break.
switch# show interface 1/1/12
1/1/12 is up
Admin state is up
Link speed 1000 Mb/s, Duplex full
Input packets 12,034,221 Bytes 8.4 GB
Input errors 0 CRC/FCS 0 Runts 0 Giants 0
Output packets 9,201,887 Bytes 6.1 GB
Output errors 0 Drops 14 Collisions 0
To clear counters before a focused test: clear interface 1/1/12 counters. Then run traffic and re-check — counters that climb during a known-good test pinpoint the bad segment.
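When clearing counters is not an option (for example, when an NMS depends on them), the same test works by comparing two snapshots. A minimal sketch, assuming the counters have already been parsed into dictionaries; counter_deltas and the key names are hypothetical.

```python
def counter_deltas(before, after):
    """Return only the counters that increased between two snapshots."""
    return {name: after[name] - before.get(name, 0)
            for name in after
            if after[name] > before.get(name, 0)}
```

With a before snapshot of {"crc": 0, "drops": 14} and an after snapshot of {"crc": 3, "drops": 14}, only crc is reported, which points at cabling or optics rather than congestion.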
1.3 Mirror Sessions for Packet Captures
When counters and DOM cannot answer the question, capture the packets. AOS-CX implements port mirroring through mirror sessions that copy traffic from a source port (or VLAN) to a destination port where Wireshark, tcpdump, or a probe is connected.
switch(config)# mirror session 1
switch(config-mirror-1)# source interface 1/1/10 both
switch(config-mirror-1)# destination interface 1/1/48
switch(config-mirror-1)# no shutdown
switch(config-mirror-1)# end
switch# show mirror
Key options:
- source ... rx | tx | both: choose ingress, egress, or both directions.
- source vlan 30: mirror an entire VLAN’s traffic.
- Remote mirror (ERSPAN-like): encapsulate mirrored traffic in GRE and forward to a remote analyzer. Aruba access points can also be configured to mirror wireless client traffic to a UDP listener (commonly port 5555) where Wireshark decodes the stream.
Limit mirror session source counts and never mirror a 10G port to a 1G destination — the destination will tail-drop and your capture will be incomplete.
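The oversubscription risk is simple arithmetic: add up what the sources could send at worst and compare it with the destination speed. The sketch below assumes worst-case line-rate traffic and counts a "both" source twice (ingress plus egress); the function names are hypothetical.

```python
def mirror_load_gbps(sources):
    """Worst-case mirrored load from (speed_gbps, direction) pairs."""
    # "both" copies ingress and egress, so it can reach twice line rate
    return sum(speed * (2 if direction == "both" else 1)
               for speed, direction in sources)

def destination_can_keep_up(sources, dest_speed_gbps):
    """True when the destination port can carry the worst-case mirrored load."""
    return mirror_load_gbps(sources) <= dest_speed_gbps
```

A single 10G source mirrored in both directions can offer up to 20 Gb/s, so even a 10G destination can drop packets under sustained load; real traffic is usually far below line rate, but the check shows where the ceiling is.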
[Source: https://support.hpe.com/hpesc/public/docDisplay?docId=sf000094341en_us&docLocale=en_US] [Source: https://wifiromigh.wordpress.com/2018/03/22/capturing-client-traffic-on-aruba-campus-and-instant-access-points-using-wireshark/]
1.4 The “Show Tech” Safety Net
When you must escalate to TAC, you will be asked for show tech. This single command bundles version, configuration, hardware inventory, log buffers, interface statistics, and protocol state into a comprehensive snapshot. Pipe it to a file:
switch# show tech | redirect tftp://10.0.0.5/sw01-showtech.txt
show tech for specific subsystems (show tech ospf, show tech vsx) generates a focused bundle, ideal when you know the failure domain. Always capture show tech before you start poking at a problem — it gives TAC a baseline.
1.5 Troubleshooting Checklist
| Step | Action | Command(s) |
|---|---|---|
| 1 | Define the problem precisely (what works, what doesn’t, when did it start) | — |
| 2 | Verify physical layer | show interface, show interface transceiver detail |
| 3 | Verify L2 reachability | show mac-address-table, show vlan, show lldp neighbor |
| 4 | Verify L3 reachability | show ip route, ping, traceroute |
| 5 | Verify protocol state | show ip ospf neighbor, show vsx status, show lacp aggregate |
| 6 | Check for drops/security | show access-list hitcounts, show events |
| 7 | Capture if needed | mirror session, show tech |
| 8 | Document & escalate | copy running-config tftp://... |
Section 2: Image Management and Upgrades
AOS-CX ships with a dual-partition firmware architecture: every switch has a primary and a secondary flash partition, each capable of holding a complete firmware image [Source: https://support.hpe.com/hpesc/public/docDisplay?docId=sd00002116en_us&page=GUID-F5189734-3F06-4501-AEC4-8A75A30F020E.html&docLocale=en_US]. One is active; the other is a parachute.
2.1 Primary and Secondary Images
Analogy. Think of primary/secondary like a dual-engine aircraft. You upgrade one engine at a time. If the new engine misbehaves, you fly home on the old one.
The cardinal rule: always upload new firmware to the non-active partition. If the switch is running from primary, stage the new image to secondary. This guarantees the running image stays untouched and rollback is one boot away.
switch# show version
Version : FL.10.13.1000
Active Image : primary
Service OS Version : FL.01.06.0001
switch# show images
Primary Image
Version : FL.10.13.1000
Date : 2026-01-15
Secondary Image
Version : FL.10.11.1010
Date : 2025-08-20
Default Boot : primary
Stage the new image to the alternate partition:
switch# copy sftp://admin@10.0.0.5/ArubaOS-CX_6300_10_14_0001.swi secondary vrf mgmt
Then verify:
switch# verify signature flash secondary
Signature verification PASSED.
[Source: https://support.hpe.com/hpesc/public/docDisplay?docId=sd00007177en_us&page=GUID-B0DACEAE-FD68-428B-AE7B-95029E9FE167.html&docLocale=en_US] [Source: https://www.tenable.com/audits/items/CIS_HPE_Aruba_Networking_CX_Switch_v1.0.1_L1.audit:db2ab023863f7285eea4f164d3f5c913]
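The cardinal rule (stage to the non-active partition) is easy to automate once you can read the active partition from show version. The sketch below assumes the output contains an "Active Image" line shaped like the sample earlier in this section; real output may differ by platform and release, and the function names are illustrative.

```python
import re

# Sample text modeled on the show version excerpt in this section.
SAMPLE_SHOW_VERSION = """\
Version            : FL.10.13.1000
Active Image       : primary
Service OS Version : FL.01.06.0001
"""


def active_partition(show_version_output: str) -> str:
    """Extract 'primary' or 'secondary' from an 'Active Image' line."""
    match = re.search(
        r"^Active Image\s*:\s*(primary|secondary)\s*$",
        show_version_output,
        re.MULTILINE,
    )
    if match is None:
        raise ValueError("no 'Active Image' line found")
    return match.group(1)


def staging_target(show_version_output: str) -> str:
    """Return the partition that is NOT running, i.e. where to stage."""
    running = active_partition(show_version_output)
    return "secondary" if running == "primary" else "primary"


if __name__ == "__main__":
    print(staging_target(SAMPLE_SHOW_VERSION))
```

Raising an error when the line is missing is deliberate: guessing a partition here could overwrite the running image.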
2.2 Boot Order, Image Verification, and Cryptographic Signing
Every AOS-CX image is signed with RSA-3072 / SHA-256 by HPE. The signature is checked twice: at download (rejected if invalid) and at every boot (drops you into ServiceOS if invalid). This is non-negotiable — there is no flag to bypass it.
Set the default boot partition (choose one). On AOS-CX this is boot set-default, run from the privileged prompt; boot system is the command that actually reboots the switch into the named partition:
switch# boot set-default primary
switch# boot set-default secondary
switch# copy running-config startup-config
After verification and boot configuration, reload into the staged image:
switch# boot system secondary
This will reboot the entire switch and disconnect your current session.
Continue (y/n)? y
Figure 10.2: AOS-CX boot/image management state
stateDiagram-v2
[*] --> BootMenu: Power on
BootMenu --> PrimaryBoot: default boot=primary
BootMenu --> SecondaryBoot: default boot=secondary
BootMenu --> ServiceOS: press 0 during countdown
PrimaryBoot --> SigCheckP: verify RSA-3072 / SHA-256
SecondaryBoot --> SigCheckS: verify RSA-3072 / SHA-256
SigCheckP --> RunningPrimary: signature OK
SigCheckP --> ServiceOS: signature FAIL
SigCheckS --> RunningSecondary: signature OK
SigCheckS --> ServiceOS: signature FAIL
RunningPrimary --> Staging: copy sftp://... secondary
RunningSecondary --> Staging: copy sftp://... primary
Staging --> RunningPrimary: boot system primary + reload
Staging --> RunningSecondary: boot system secondary + reload
RunningPrimary --> Sync: copy primary secondary
RunningSecondary --> Sync: copy secondary primary
Sync --> RunningPrimary
Sync --> RunningSecondary
ServiceOS --> RunningPrimary: boot
ServiceOS --> RunningSecondary: boot secondary
ServiceOS --> [*]: erase zeroize / USB recovery
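When the switch comes back, immediately verify the running version, then synchronize partitions so both hold the new image. Copy from the partition you just booted (secondary in this example) to the other one; copying in the opposite direction would overwrite the new image with the old one. Some shops instead keep the previous image as an explicit fallback; choose the policy that fits your change-management strategy.
switch# show version
switch# copy secondary primary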
[Source: https://support.hpe.com/hpesc/public/docDisplay?docId=sd00007458en_us&page=GUID-DB61250B-62A6-4958-9CAD-930CD614B648.html&docLocale=en_US] [Source: https://support.hpe.com/hpesc/public/docDisplay?docId=sd00007217en_us&page=GUID-F8569AA7-22D4-4FE6-BC34-96B080E4ABA2.html&docLocale=en_US]
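The direction of the post-upgrade sync is where mistakes happen, so it helps to derive it rather than type it from memory. This is a minimal sketch, assuming you already know which partition the switch booted from and which version string you expected. The version string in the example is illustrative, inferred from the image file name used in this section.

```python
OTHER = {"primary": "secondary", "secondary": "primary"}


def post_boot_action(active: str, running_version: str, expected_version: str) -> str:
    """Decide what to do after the reload.

    Sync only when the switch is actually running the expected image;
    the copy always goes FROM the active partition TO the other one.
    """
    if active not in OTHER:
        raise ValueError(f"unknown partition: {active!r}")
    if running_version != expected_version:
        # Keep the other partition untouched: it is the fallback.
        return "hold: unexpected running version, do not sync"
    return f"copy {active} {OTHER[active]}"


if __name__ == "__main__":
    # Booted from secondary on the newly staged image.
    print(post_boot_action("secondary", "FL.10.14.0001", "FL.10.14.0001"))
```

Holding when the version does not match preserves the rollback path, which is the whole reason for the dual-partition design.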
2.3 The Eight-Step Traditional Upgrade
| Step | What You Do | Command |
|---|---|---|
| 1 | Identify active partition | show version, show images |
| 2 | Transfer new image to inactive partition | copy sftp://... secondary vrf mgmt |
| 3 | Verify signature | verify signature flash secondary |
| 4 | Set default boot partition | boot set-default secondary |
| 5 | Save config | copy running-config startup-config |
| 6 | Reboot | boot system secondary |
| 7 | Post-upgrade validation | show version, show interface brief, show ip ospf neighbor |
| 8 | Sync partitions (from the newly booted partition to the other) | copy secondary primary |
2.4 Unsafe Updates
Some releases include component-level updates to the bootloader, PoE controller, or PHY firmware. These are flagged as unsafe updates because interrupting them can permanently brick the affected component. They are disabled by default. To allow them within a 30-minute window:
switch(config)# allow-unsafe-updates 30
switch(config)# end
switch# boot system secondary
Never remove power during an unsafe update. The switch may reboot multiple times — this is expected.
[Source: https://airheads.hpe.com/discussion/aos-cx-tech-tips-allow-unsafe-updates]
2.5 ServiceOS / ROMmon Basics
ServiceOS is a tiny recovery OS embedded on a separate flash partition. Its job is to boot when nothing else can. You reach it via the console (9600 baud, 8-N-1) and pressing 0 during the boot menu countdown. From there:
ServiceOS> dir
ServiceOS> boot # boot primary AOS-CX image
ServiceOS> boot secondary
ServiceOS> password reset
ServiceOS> erase zeroize
ServiceOS> copy usb /usb/ArubaOS-CX_6300_10_14_0001.swi primary
ServiceOS is also where the switch lands automatically if both AOS-CX images fail signature verification. Treat it as the “BIOS recovery mode” of the switch.
[Source: https://airheads.hpe.com/discussion/firmware-corrupted] [Source: https://www.youtube.com/watch?v=unGwCiN-3bA]
2.6 VSX Live Software Upgrade
Single-switch upgrades take the switch offline. For redundant pairs (CX 8100/8300/8325/8400), VSX Live Upgrade orchestrates a hitless transition. The headline number: 12–19 ms of measurable impact during the active/standby switchover.
The Live Upgrade flow:
Figure 10.3: VSX Live Upgrade sequence
sequenceDiagram
participant Admin
participant Active as VSX Active
participant Standby as VSX Standby
Admin->>Standby: copy image to secondary
Admin->>Standby: boot system secondary (Live Upgrade)
Standby->>Standby: reboot to new image
Standby-->>Active: rejoin VSX (now on new code)
Active->>Active: drain traffic via LACP (12-19ms)
Active->>Standby: traffic now flowing through Standby
Admin->>Active: boot system secondary
Active->>Active: reboot to new image
Active-->>Standby: rejoin VSX (both on new code)
Pre-flight: confirm show vsx status shows In-Sync on the ISL and the keepalive, and that both switches have the new image staged on their inactive partition. Live Upgrade fails fast if either side is out of sync — this is a feature, not a bug.
| Platform | Hitless Mechanism | Min AOS-CX |
|---|---|---|
| CX 6400 chassis | ISSU (redundant management modules + line-card hot patching) | 10.10 |
| CX 8300/8325/8400 (VSX pair) | VSX Live Upgrade (LACP traffic drain) | 10.06 |
| CX 6300 stack (VSF) | Enhanced Software Upgrade (ESU) | 10.11 |
| CX 6300 stack (VSF) | Hitless ISSU (no conductor reboot) | 10.13 |
[Source: https://solutiontechlab.com/2024/01/05/aruba-cx-live-upgrade/] [Source: https://airheads.hpe.com/discussion/aruba-cx-1013-hitless-issu-and-required-firmware-updates]
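When checking whether a given release qualifies for one of these mechanisms, compare release numbers as integers, not as strings or floats. A float comparison treats 10.6 as larger than 10.13, which is wrong for release numbering. The sketch below encodes the table above. The platform labels are shorthand chosen for this example.

```python
from typing import List, NamedTuple, Tuple


class HitlessOption(NamedTuple):
    platform: str
    mechanism: str
    min_release: Tuple[int, int]


# Rows transcribed from the table above; platform labels are illustrative.
OPTIONS: List[HitlessOption] = [
    HitlessOption("CX 6400 chassis", "ISSU", (10, 10)),
    HitlessOption("CX 8300/8325/8400 VSX pair", "VSX Live Upgrade", (10, 6)),
    HitlessOption("CX 6300 VSF stack", "Enhanced Software Upgrade (ESU)", (10, 11)),
    HitlessOption("CX 6300 VSF stack", "Hitless ISSU", (10, 13)),
]


def parse_release(text: str) -> Tuple[int, int]:
    """Turn '10.13' or '10.13.1000' into (10, 13)."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def available(platform: str, release: str) -> List[str]:
    """List the mechanisms the table allows for a platform at a release."""
    have = parse_release(release)
    return [
        option.mechanism
        for option in OPTIONS
        if option.platform == platform and have >= option.min_release
    ]


if __name__ == "__main__":
    print(available("CX 6300 VSF stack", "10.11"))
    print(available("CX 6300 VSF stack", "10.13.1000"))
```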
Section 3: Recovery Procedures
When the upgrade you carefully planned does not go as planned, or when someone forgot the password, or when a power blip corrupted both images, you need recovery. AOS-CX exposes four recovery tiers; pick the least destructive one that fixes the problem.
Figure 10.4: Recovery procedure decision tree
flowchart TD
Start([Switch in trouble]) --> Q1{What is broken?}
Q1 -->|Forgot admin password| Tier1[Tier 1: Password Recovery<br/>Console 9600/8-N-1<br/>Press 0 at boot menu<br/>ServiceOS: password reset]
Q1 -->|Bad config / change<br/>switch boots OK| Tier4[Tier 4: Checkpoint Rollback<br/>checkpoint rollback name<br/>no reboot required]
Q1 -->|Re-deploying / wipe needed| Tier2[Tier 2: Factory Reset<br/>ServiceOS: erase zeroize<br/>config + certs wiped<br/>images preserved]
Q1 -->|Both images corrupt /<br/>signature fails| Tier3[Tier 3: USB Recovery<br/>FAT32 USB with .swi<br/>ServiceOS: copy usb ... primary<br/>RSA-3072 verified]
Tier1 --> Done1([Login admin / blank<br/>set new password])
Tier4 --> Done4([Verify with<br/>show running-config])
Tier2 --> ZTP([Boots into ZTP-ready<br/>DHCP mgmt + Aruba Central])
Tier3 --> Done3([Boot, then SFTP<br/>known-good config])
Done1 --> Verify[Post-recovery checks:<br/>show version<br/>show interface brief<br/>show ip ospf neighbor]
Done4 --> Verify
ZTP --> Verify
Done3 --> Verify
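The decision tree in Figure 10.4 reduces to a lookup from symptom to tier. A minimal sketch follows. The enum names are invented for this example, while the tier labels, actions, and side effects are taken from the figure.

```python
from enum import Enum
from typing import NamedTuple


class Symptom(Enum):
    FORGOT_PASSWORD = "forgot admin password"
    BAD_CONFIG = "bad config or change, switch boots OK"
    REDEPLOY = "re-deploying, wipe needed"
    IMAGES_CORRUPT = "both images corrupt or signature fails"


class Recovery(NamedTuple):
    tier: str
    action: str
    wipes_config: bool


# One entry per branch of the figure.
RECOVERY = {
    Symptom.FORGOT_PASSWORD: Recovery("Tier 1: Password Recovery", "ServiceOS: password reset", False),
    Symptom.BAD_CONFIG: Recovery("Tier 4: Checkpoint Rollback", "checkpoint rollback <name>", False),
    Symptom.REDEPLOY: Recovery("Tier 2: Factory Reset", "ServiceOS: erase zeroize", True),
    Symptom.IMAGES_CORRUPT: Recovery("Tier 3: USB Recovery", "ServiceOS: copy usb ... primary", False),
}


def choose_recovery(symptom: Symptom) -> Recovery:
    """Return the recovery tier the decision tree assigns to a symptom."""
    return RECOVERY[symptom]


if __name__ == "__main__":
    print(choose_recovery(Symptom.FORGOT_PASSWORD))
```

Only the factory-reset branch is marked as wiping configuration, which is why it is the wrong choice for a simple password reset.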
3.1 Password Recovery via ServiceOS
Forgotten admin password — the most common recovery scenario. ServiceOS handles this without erasing configuration.
- Connect a console cable (9600/8-N-1).
- Reboot the switch: a soft press of the Reset button or reload from a privileged session.
- During the boot menu countdown, press 0 to enter ServiceOS.
- At the ServiceOS login: prompt, type admin with no password.
- Reset the credential and reload:
ServiceOS> password reset
Reset admin password? (y/n): y
Admin password reset.
ServiceOS> reload
When the switch boots, log in as admin with a blank password and immediately set a new one. Configuration, VLANs, routes, certificates — everything is preserved.
[Source: https://airheads.hpe.com/discussion/basics-cx-6100-6200-6300-6400-restore-switch-to-factory-default] [Source: https://www.youtube.com/watch?v=XupP2Le_DYg]
3.2 Factory Default and Zeroize
When the switch is destined for a different role (or different customer) and configuration must be wiped:
ServiceOS> erase zeroize
This will erase all configuration including certificates and management files.
Continue? (y/n): y
erase zeroize (or erase all zeroize) wipes startup configuration, certificates, RADIUS shared secrets, SSH keys, and management state. The firmware images themselves are preserved. The switch reboots into ZTP-ready state with DHCP on the management interface and admin/blank credentials. This is the right command before redeploying a switch — and the wrong command if you only need a password reset.
[Source: https://www.router-switch.com/faq/aruba-cx-6000-series-factory-reset-guide.html]
3.3 USB-Based Image Recovery
When both flash images are corrupt and the switch sits at the ServiceOS prompt with no usable AOS-CX, USB recovery is your bootstrap:
- Format a USB drive as FAT32 and copy the desired AOS-CX .swi to the root.
- Insert the USB into the switch’s USB port.
- From ServiceOS:
ServiceOS> dir usb
ServiceOS> copy usb /ArubaOS-CX_6300_10_14_0001.swi primary
ServiceOS> boot
The image is verified (RSA-3072) before it is written, so a tampered file cannot revive a switch. After boot, you can SFTP-pull a known-good configuration from a checkpoint server and you are back in business.
[Source: https://www.youtube.com/watch?v=unGwCiN-3bA]
3.4 Configuration Restoration from Checkpoint
AOS-CX maintains a built-in checkpoint subsystem — think of it as Time Machine for switch configuration. Every committed change can be saved as a named checkpoint, and any checkpoint can be restored without a reboot.
switch# copy running-config checkpoint pre-ospf-redesign
switch# show checkpoint list
Name Date User
startup-config 2026-04-29T09:00:00 system
running-config 2026-04-29T09:14:22 admin
pre-ospf-redesign 2026-04-29T09:14:22 admin
switch# checkpoint diff running-config pre-ospf-redesign
switch# checkpoint rollback pre-ospf-redesign
When integrated with the Network Analytics Engine (NAE), every checkpoint creation appears as a purple diamond on performance graphs, giving you instant visual correlation between configuration changes and behavior shifts.
For longer-term safety, copy startup-config to an external server before any change window:
switch# copy startup-config sftp://admin@10.0.0.5/backups/sw01-2026-04-29.cfg vrf mgmt
[Source: https://airheads.hpe.com/discussion/arubaos-cx-configuration-checkpoints-and-auto-rollback]
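To build intuition for what a checkpoint diff tells you, the sketch below compares two configuration texts offline with Python's difflib. This is an analogy only: it is not how the switch computes its diff, the output format differs from the CLI, and the two configuration snippets are made up for illustration.

```python
import difflib
from typing import List

# Hypothetical configuration fragments, before and after a change.
CHECKPOINT_CONFIG = "router ospf 1\n    area 0.0.0.0\n"
RUNNING_CONFIG = "router ospf 1\n    area 0.0.0.0\n    area 0.0.0.1\n"


def config_diff(before: str, after: str) -> List[str]:
    """Return a unified diff of two configuration texts, line by line."""
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="pre-ospf-redesign",
            tofile="running-config",
            lineterm="",
        )
    )


if __name__ == "__main__":
    for line in config_diff(CHECKPOINT_CONFIG, RUNNING_CONFIG):
        print(line)
```

Lines prefixed with + exist only in the running configuration. Those are the lines a rollback to the checkpoint would remove.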
Section 4: HPE7-A01 Exam Strategy
The exam is winnable. It is also unforgiving if you walk in unprepared for the format.
4.1 Exam Specifications
| Attribute | Value |
|---|---|
| Exam code | HPE7-A01 |
| Title | HPE Aruba Networking Certified Professional – Campus Access |
| Delivery | Pearson VUE (proctored, online or test center) |
| Format | Multiple-choice + scenario/simulation |
| Questions | 75 |
| Time | 120 minutes |
| Passing score | 68% (~51/75 correct) |
| Cost | USD 350 (USD 195 in emerging markets) |
| Languages | English, Japanese, Latin American Spanish |
| Ideal candidate | NOC L2/L3 with 2–5 years HPE Aruba experience |
[Source: https://certification-learning.hpe.com/tr/datacard/exam/HPE7-A01] [Source: https://certification-learning.hpe.com/tr/datasheet/exam/HPE7-A01]
4.2 Mapping the Blueprint to Study Time
The blueprint is your study budget. Allocate hours proportional to weight, then adjust for personal weak areas.
| Domain | Weight | Hours per 60-hour Plan | Hours per 120-hour Plan |
|---|---|---|---|
| WLAN | 17% | 10.2 | 20.4 |
| Switching | 14% | 8.4 | 16.8 |
| Routing | 13% | 7.8 | 15.6 |
| Security | 9% | 5.4 | 10.8 |
| Authentication & Authorization | 8% | 4.8 | 9.6 |
| Network Resiliency & Device Virtualization | 8% | 4.8 | 9.6 |
| Connectivity | 8% | 4.8 | 9.6 |
| Performance Optimization | 7% | 4.2 | 8.4 |
| Managing & Monitoring | 6% | 3.6 | 7.2 |
| Troubleshooting | 6% | 3.6 | 7.2 |
| Network Stack | 4% | 2.4 | 4.8 |
Example. A network engineer with strong wired/routing background but limited WLAN exposure looks at the table and immediately shifts 4–6 hours from Routing into WLAN. Time allocation is proportional, not religious — bend it to your gaps.
[Source: https://www.certfun.com/hpe/hpe-aruba-network-campus-access-professional-acp-camacss-exam-syllabus]
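The hour columns in the table are simply weight times total hours, and the rebalancing in the example is a transfer between two rows. A small sketch using the weights from the table follows. The function names are illustrative.

```python
from typing import Dict

# Domain weights (percent) transcribed from the blueprint table.
BLUEPRINT: Dict[str, int] = {
    "WLAN": 17,
    "Switching": 14,
    "Routing": 13,
    "Security": 9,
    "Authentication & Authorization": 8,
    "Network Resiliency & Device Virtualization": 8,
    "Connectivity": 8,
    "Performance Optimization": 7,
    "Managing & Monitoring": 6,
    "Troubleshooting": 6,
    "Network Stack": 4,
}


def study_hours(total_hours: float) -> Dict[str, float]:
    """Split a study budget across domains in proportion to weight."""
    return {d: round(total_hours * w / 100, 1) for d, w in BLUEPRINT.items()}


def shift_hours(plan: Dict[str, float], source: str, target: str, hours: float) -> Dict[str, float]:
    """Move hours from a strong domain to a weak one; the total is unchanged."""
    if hours > plan[source]:
        raise ValueError("cannot move more hours than the source domain has")
    adjusted = dict(plan)
    adjusted[source] = round(plan[source] - hours, 1)
    adjusted[target] = round(plan[target] + hours, 1)
    return adjusted


if __name__ == "__main__":
    plan = study_hours(60)
    print(shift_hours(plan, "Routing", "WLAN", 4))
```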
4.3 Practice Labs and Emulators
The exam emphasizes scenario-based questions where you must recognize a configuration’s outcome, not just recite syntax. Hands-on practice converts study into instinct.
AOS-CX Switch Simulator runs in:
- EVE-NG — strong topology editor, easy multi-switch topologies (1 vCPU, 2 GB RAM, 8 GB disk per switch).
- GNS3 with GNS3-VM or VirtualBox — free, well-documented, large community.
- vSphere/ESXi — requires more resources (2 vCPU, 4 GB RAM) but gives production-like behavior.
Build, in order, the following lab scenarios — each maps to a heavily weighted domain:
- Two-switch VLAN + LACP + RSTP topology (Switching).
- Three-switch OSPF area design with policy-based routing (Routing).
- VSX pair with active gateway and ISL (Resiliency).
- VSF stack of three CX 6300s — practice ESU upgrade (Switching + Upgrade).
- 802.1X with ClearPass (or FreeRADIUS) and dynamic VLAN assignment (Authentication).
- Aruba Central onboarding of a virtual switch (Management).
- Mirror session capturing wireless client traffic from a simulated AP into Wireshark (Troubleshooting + Monitoring).
[Source: https://airheads.hpe.com/discussion/using-the-aos-cx-switch-simulator-lab-guides] [Source: https://airheads.hpe.com/discussion/creating-your-simulation-environment]
4.4 Time Management on Exam Day
75 questions in 120 minutes is 96 seconds per question on average. The trap is to spend 8 minutes on question 3 and run out of time at question 60.
Use the 60-30-10 rule:
| Phase | Time Slice | Goal |
|---|---|---|
| Pass 1 | First 60% (~72 min) | Answer every easy/medium question. Flag anything that takes >2 min. Skip and flag hard questions. |
| Pass 2 | Next 30% (~36 min) | Tackle flagged hard questions one by one. |
| Pass 3 | Final 10% (~12 min) | Review flagged answers, sanity-check, do not change answers without a concrete reason. |
Analogy. Treat the exam like a buffet line, not a tasting menu. Walk the whole line picking up everything you obviously want (easy points). Come back for the items you had to think about. Only at the end do you debate dessert.
A few question-attack tactics:
- Eliminate, don’t pick. Cross out two clearly-wrong answers first; the correct answer often becomes obvious.
- Watch for absolutes. “Always,” “never,” and “only” are usually wrong; “typically” and “in most cases” are usually right.
- Read the last sentence first in long scenario questions — it tells you exactly what is being asked.
- For sims, run the simplest verification first: show vsx status, show ip ospf neighbor, show interface brief.
[Source: https://www.certfun.com/blog/hpe7-a01-exam-simplified-your-ultimate-preparation-guide] [Source: https://www.youtube.com/watch?v=p8pX_ZvU3LA]
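The pacing numbers in this section are straightforward arithmetic, and working them out for your own exam length is worthwhile if the question count or duration on your datacard differs. A minimal sketch, using the 75-question, 120-minute figures quoted above:

```python
from typing import Tuple


def pace_seconds(questions: int, minutes: int) -> float:
    """Average seconds available per question."""
    return minutes * 60 / questions


def phase_minutes(total_minutes: int, split: Tuple[int, int, int] = (60, 30, 10)) -> Tuple[float, ...]:
    """Minutes for pass 1, pass 2, and pass 3 under a percentage split."""
    if sum(split) != 100:
        raise ValueError("split must add up to 100 percent")
    return tuple(total_minutes * share / 100 for share in split)


if __name__ == "__main__":
    print(pace_seconds(75, 120))
    print(phase_minutes(120))
```

The split is a parameter so you can try alternatives (for example 70-20-10) during timed practice exams and keep whichever one leaves you the fewest unanswered questions.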
4.5 Final-Week Review Checklist
The week before the exam is for consolidation, not new material. Cramming raises cortisol and lowers recall.
| Day | Activity | Output |
|---|---|---|
| Day -7 | Take a full 75-question timed practice exam | Identify weakest 2 domains |
| Day -6 | Targeted review of weakest domain (re-read chapter + run lab) | Notes on shaky concepts |
| Day -5 | Targeted review of second weakest domain | Notes on shaky concepts |
| Day -4 | Hands-on: VSX, OSPF, 802.1X end-to-end | Confidence check |
| Day -3 | Read every “Key Takeaways” / “Key Terms” section in this textbook | Mental map |
| Day -2 | Second full timed practice exam | Aim for ≥75% |
| Day -1 | Light review only. Verify Pearson VUE check-in: ID, system test, quiet room | Logistics done |
| Exam Day | 8 hours sleep, normal breakfast, water, arrive 30 min early | Pass |
The day before is logistics, not learning. Confirm your testing room (or Pearson VUE travel route), test your webcam if online, charge your laptop, and clear your desk — Pearson VUE proctors will scan it.
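If you prefer calendar dates to "Day -7" offsets, the checklist converts directly once the exam date is known. A short sketch follows. The activity labels are abbreviated from the table and the exam date in the example is arbitrary.

```python
from datetime import date, timedelta
from typing import List, Tuple

# (days before the exam, activity), abbreviated from the checklist table.
FINAL_WEEK: List[Tuple[int, str]] = [
    (7, "Full 75-question timed practice exam"),
    (6, "Review weakest domain"),
    (5, "Review second weakest domain"),
    (4, "Hands-on: VSX, OSPF, 802.1X end-to-end"),
    (3, "Read Key Takeaways and Key Terms"),
    (2, "Second full timed practice exam"),
    (1, "Light review and Pearson VUE logistics"),
]


def final_week_schedule(exam_day: date) -> List[Tuple[date, str]]:
    """Map each checklist row to a calendar date."""
    return [(exam_day - timedelta(days=offset), activity) for offset, activity in FINAL_WEEK]


if __name__ == "__main__":
    for day, activity in final_week_schedule(date(2026, 6, 15)):
        print(day.isoformat(), activity)
```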
4.6 Recommended Study Resources
- HPE Aruba Networking Certified Professional – Campus Access Official Certification Study Guide HPE7-A01 (HPE Press, hardcover or eBook). [Source: https://hpepress.hpe.com/product/HPE+Aruba+Networking+Certified+Professional+-+Campus+Access+Official+Certification+Study+Guide+HPE7-A01-Hardcover-19786]
- HPE7-A01 Practice Test (HPE Press eBook, 30 questions with explanations). [Source: https://hpepress.hpe.com/product/HPE7-A01+Practice+Test+HPE+Aruba+Networking+Certified+Professional++Campus+Access-eBook-19845]
- Implementing Campus Access, Rev. 23.11 (instructor-led or self-paced training).
- Airheads Community lab guides for VSF, VSX, VXLAN/GBP, EAP-TLS. [Source: https://airheads.hpe.com/discussion/using-the-aos-cx-switch-simulator-lab-guides]
- HPE Aruba Networking Hardening Guides for security baseline knowledge. [Source: https://airheads.hpe.com/blogs/pjortiz/2026/04/16/hpe-aruba-networking-hardening-guides]
Chapter Summary
This chapter closed the loop from “things break” to “you fix them” to “you walk into Pearson VUE and pass.” The structured troubleshooting methodology (OSI bottom-up, divide-and-conquer, top-down) gives you a repeatable process that does not depend on memory or luck. AOS-CX’s dual-partition firmware architecture, combined with show version, show images, boot system, and signed image verification, keeps a failed upgrade recoverable, provided you always stage to the inactive partition so the running image stays untouched. Password reset and erase zeroize in ServiceOS, USB image recovery, and checkpoint rollback give you four recovery tiers, ranging from “I forgot the password” to “both images are dead.” For the HPE7-A01 exam itself, the formula is: study time proportional to blueprint weights, hands-on labs in EVE-NG/GNS3, full-length timed practice tests in the final week, and the 60-30-10 time-management rule on exam day. Candidates who respect both the technical depth and the test-taking discipline give themselves the best chance of passing.
Key Takeaways
- Troubleshooting is layered, not random. Bottom-up for L1 symptoms, top-down for application symptoms, divide-and-conquer otherwise.
- DOM data on transceivers (TX/RX dBm, temperature, bias) catches dying optics before users feel them.
- Mirror sessions are your packet capture lifeline — source ports, source VLANs, and remote (ERSPAN-like) destinations are all supported.
- Always upload firmware to the non-active partition and verify the RSA-3072 signature before changing the boot setting.
- VSX Live Upgrade delivers 12–19 ms of measurable disruption; ISSU on CX 6400 chassis and Hitless ISSU on CX 6300 VSF (10.13+) are the zero-disruption paths.
- Unsafe updates require allow-unsafe-updates 30 and an uninterrupted power source; never pull power during one.
- ServiceOS is reached by pressing 0 during the boot menu at 9600/8-N-1; it provides password reset, erase zeroize, boot, and copy usb.
- Checkpoints roll configuration back without a reboot; integrate with NAE for visual change correlation.
- Blueprint weights drive study time: WLAN 17%, Switching 14%, Routing 13%, Security 9%, Auth 8%, Resiliency 8%, Connectivity 8%, Perf 7%, Mgmt 6%, Troubleshooting 6%, Network Stack 4%.
- 60-30-10 time rule: 60% of exam time on easy/medium, 30% on hard, 10% reviewing flagged. Average pace: 96 seconds per question.
Key Terms
- ServiceOS: Minimal recovery operating system embedded on a separate flash partition, accessed via console at 9600 baud by pressing 0 during boot. Provides password reset, erase zeroize, boot, and USB image-recovery commands when AOS-CX cannot start.
- Primary/Secondary image: The two flash partitions on every AOS-CX switch, each holding a complete firmware image. Best practice: upload new firmware to the non-active partition, set boot system, reload, then sync.
- ZTP recovery: Zero-Touch Provisioning state after erase zeroize where the switch boots with default credentials and DHCP-acquired management address, ready for re-onboarding via DHCP option 60/Aruba Central.
- Mirror session: Port-mirroring construct that copies traffic from source ports/VLANs to a destination port for capture. Supports rx, tx, both, and remote (GRE-encapsulated) destinations.
- Password recovery: ServiceOS procedure that resets the admin credential without erasing configuration; uses password reset then reload.
- Show tech: Comprehensive diagnostic bundle (show tech or subsystem-specific variants) that captures version, configuration, hardware inventory, logs, and protocol state for TAC escalation.
- Live upgrade: VSX non-disruptive firmware upgrade orchestration that updates standby first, drains traffic via LACP (12–19 ms impact), then upgrades the former active. Generalizes to ISSU on CX 6400 and Hitless ISSU on CX 6300 VSF.
- Exam blueprint: The official HPE7-A01 domain weighting (WLAN 17%, Switching 14%, Routing 13%, Security 9%, Authentication 8%, Resiliency 8%, Connectivity 8%, Performance 7%, Management 6%, Troubleshooting 6%, Network Stack 4%) used to allocate study time and predict question distribution.