Cassian Gate v79 — Operator Cheat Sheet

(Operator reference — supporting surface; execution and artifacts remain authoritative)

This document describes the user-facing execution model for Cassian Gate.

It reflects implemented CLI behavior only and does not replace deterministic execution or authoritative artifacts.

Cassian Gate is a deterministic, artifact-authoritative network change-validation gate.

It is for:

network engineers validating planned changes before production
platform and infrastructure engineers using a CI-safe network gate
operators who need explicit pass/fail artifacts and deterministic execution

It is not yet for:

users seeking a broad network automation platform
users expecting generic multi-vendor feature parity
users wanting exploratory labs or AI output to act as deployment authority

Cassian Gate is a:

Deterministic Network Change Validation Gate

Execution is:

deterministic
reproducible
artifact-backed
CI-safe
non-heuristic

1️⃣ What Cassian Gate Is (and Is Not)

Cassian Gate IS

a network change validation gate
a deterministic execution engine
a CI pipeline safety check
a behavior validation system

Cassian Gate IS NOT

a general network lab builder
a chaos framework
a retry system
a configuration merge engine
an AI decision system

2️⃣ Command Index

Environment

cassian doctor
cassian validate <topology.yaml>
cassian validate-contrib <path>
cassian preflight <topology.yaml>

Execution (Validation)

cassian test <topology.yaml>
cassian replay <artifacts-dir>
cassian run <topology.yaml>
cassian up <topology.yaml>
cassian down <lab>
cassian cleanup --all

Inspection

cassian status <lab>
cassian exec <lab> <node>
cassian vty <lab> <node> "<command>"
cassian collect <lab>

DevOps Integration

cassian adapt terraform
cassian adapt ansible

AI Assistance (optional / advisory only)

cassian ai --lab <lab-name> "<question>"
cassian ai --artifacts <path> "<question>"
cassian ai "<question>"
cassian ai --lab <lab-name> --online "<question>"
cassian ai --artifacts <path> --online "<question>"

AI never affects execution or verdicts.

3️⃣ Two Execution Modes (CRITICAL)

Understanding this distinction is mandatory.

🔷 Gate Mode (Authoritative Validation)

Command:

cassian test <topology.yaml>

Gate mode automatically performs:

Clean-state destroy (if needed)
Deploy
Provision
Execute tests
Collect artifacts
Destroy lab

Returns deterministic exit codes.
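
A CI wrapper might branch on the gate's exit code roughly as sketched below. Only exit code 2 (usage / contract error) is stated in this sheet. Treating 0 as PASS and every other non-zero code as a failing gate are assumptions, and classify_exit is a hypothetical helper, not part of the cassian CLI.

```python
def classify_exit(code: int) -> str:
    """Map a `cassian test` exit code to a coarse CI outcome (illustrative only)."""
    if code == 0:
        return "pass"         # assumption: conventional success code
    if code == 2:
        return "usage-error"  # documented here: usage / contract error
    return "fail"             # assumption: any other non-zero code blocks the change


for code in (0, 1, 2):
    print(code, classify_exit(code))
```

Keeping the mapping in one place means the pipeline never has to parse human-readable output to decide whether to proceed.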

Gate mode is used for:

production validation
CI pipelines
change validation
baseline vs candidate comparison

You do NOT run cassian up first.

Gate mode owns the lifecycle.

Important summary boundary

The human-readable results.summary.txt file is not the verdict authority.

Use:

results.json for authoritative verdict sharing in CI, tickets, and PRs
results.summary.txt for human-readable explanation only

The summary now explicitly states:

what PASS means
what PASS does not mean
what FAIL means
which artifact to share
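
As a rough illustration of reading the verdict from results.json rather than from the summary text, the sketch below derives an overall outcome from per-test entries. The top-level "tests" list is an assumption about the file layout; only the per-entry name and verdict fields appear in this sheet. If the real file carries its own overall verdict field, prefer that.

```python
import json


def gate_verdict(results_text: str) -> str:
    """Return "pass" only if at least one entry exists and every entry passed."""
    results = json.loads(results_text)
    entries = results.get("tests", [])  # assumption: top-level "tests" list
    if not entries:
        return "fail"  # never treat an empty result set as success
    return "pass" if all(e.get("verdict") == "pass" for e in entries) else "fail"


sample = json.dumps({"tests": [
    {"name": "r1_to_r2_ping", "kind": "ping", "verdict": "pass"},
    {"name": "r2_sees_1_1_1_1_32_with_localpref_200", "kind": "invariant",
     "type": "bgp_localpref_equals", "verdict": "fail"},
]})
print(gate_verdict(sample))  # fail
```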

Zero-assertion gate runs are rejected

If the topology contains:

no tests
no scenarios

then:

ERROR: no assertions defined

A validation gate must include at least one test or scenario.
This run would produce a vacuous PASS and is therefore rejected.

Meaning:

cassian test <topology.yaml> requires at least one declared assertion
a zero-assertion topology is not a valid validation gate
no PASS or FAIL verdict is produced
no lifecycle execution begins
no lab/artifacts are created
exit code is 2 (usage / contract error)
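
The rule can be pictured as a pre-execution check on the parsed topology. The sketch below is a hypothetical helper, not Cassian Gate's implementation. It looks only at the tests and scenarios keys, and how packs interact with this rule is not covered by it.

```python
def has_declared_assertions(topology: dict) -> bool:
    """True if the topology declares at least one test or scenario."""
    tests = topology.get("tests") or []
    scenarios = topology.get("scenarios") or []
    return len(tests) + len(scenarios) > 0


empty = {"name": "demo-lab", "nodes": [], "links": [], "tests": []}
print(has_declared_assertions(empty))  # False: this would be a vacuous PASS
```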

Important boundary:

this rule applies to authoritative gate execution with cassian test <topology.yaml>
it does not block cassian run <topology.yaml>
it does not change replay behavior

cassian replay — Deterministic replay of prior artifacts

Replay re-executes a previous Cassian Gate run from previously generated artifacts.

Replay is a reproduction/analysis surface, not a new authority path.

Authority is preserved from the replayed source context.

Inputs

Replay consumes artifacts from a previous run:

topology.resolved.yaml
results.json

These are generated replay inputs.

Important boundary:

artifact reuse for replay does not make replay a new source of authority
shared artifact shape does not imply shared authority
authority still depends on the replay mode and source context

Gate replay (authoritative context preserved)

Replay a prior authoritative gate run:

cassian replay labs/clab-<lab> --gate

This preserves gate / authoritative context.

Meaning:

authoritative validation path
clean-state lifecycle context is preserved from the source gate run
CI-safe verdict semantics remain tied to the original authoritative context

You can also verify deterministic result equivalence:

cassian replay labs/clab-<lab> --gate --verify-results

With --verify-results, replay checks deterministic result equivalence against the source artifacts and fails on mismatch.
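
One way to think about result equivalence is a name-to-verdict comparison between the source and replayed results. This is a sketch under assumptions: the fields the real --verify-results check compares are not listed here, and the top-level "tests" list is a guess at the results.json layout.

```python
def verdict_map(results: dict) -> dict:
    """Index result entries by test name."""
    return {e["name"]: e["verdict"] for e in results.get("tests", [])}


def results_equivalent(source: dict, replayed: dict) -> bool:
    """True if both runs recorded the same verdict for the same set of tests."""
    return verdict_map(source) == verdict_map(replayed)


source = {"tests": [{"name": "r1_to_r2_ping", "verdict": "pass"}]}
same = {"tests": [{"name": "r1_to_r2_ping", "verdict": "pass"}]}
drift = {"tests": [{"name": "r1_to_r2_ping", "verdict": "fail"}]}
print(results_equivalent(source, same), results_equivalent(source, drift))  # True False
```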

Non-gate replay (non-authoritative context preserved)

Replay without --gate keeps replay in a non-authoritative exploration context.

Example:

cassian replay labs/clab-<lab>

Meaning:

replayed exploration context remains non-authoritative
useful for inspection/debugging only

This path is useful for:

inspection
investigation
iterative debugging
bringing replayed runtime up for manual follow-up commands

This does not upgrade exploration artifacts into gate proof.

When to use replay

Use replay when you want deterministic reproduction of a prior run.

Typical uses:

reproducing a prior authoritative gate result
replaying a prior exploration run for investigation
checking deterministic stability
debugging unexpected behavior from existing artifacts

Replay summary boundary

Replay preserves the same authority boundary messaging in results.summary.txt.

Meaning:

replay does not create a new authority model
results.json remains authoritative
results.summary.txt remains explanatory only

Important boundary

Replay:

preserves prior context
does not create a parallel authority model
does not make exploration authoritative
does not change verdict/exit semantics by itself

🔷 Exploration Mode (Non-Authoritative)

Used for interactive debugging and inspection.

Two approaches exist.

Option A — `run`

cassian run <topology.yaml>

Meaning:

exploration only
non-authoritative
useful for debugging, not for proof

Typical workflow shape:

up → test → collect → destroy

By default the lab is destroyed.

Keep the lab running:

cassian run <topology.yaml> --keep

Exploration summary boundary

Even when run mode produces results artifacts, results.summary.txt remains explanatory only.

Use results.json as the authoritative verdict artifact when you need the exact recorded result. Run mode itself remains non-authoritative as a workflow mode.

Option B — Explicit Lifecycle

cassian up <topology.yaml>
cassian status <lab>
cassian exec <lab> <node>
cassian down <lab>

Use this when you want:

a persistent exploratory lab
manual inspection
iterative debugging

Important boundary:

this path is for exploration and inspection
authoritative validation still runs through cassian test <topology.yaml>

Lifecycle Comparison

Feature	Gate Mode	Exploration
Clean-state enforced	Yes	Optional
Auto destroy	Yes	Optional
CI-safe	Yes	No
Interactive inspection	No	Yes
Authoritative verdict	Yes	No

4️⃣ Topology vs Lab Name

Some commands take a topology file; others take a lab name.

Commands That Use a Topology File

cassian gen <topology.yaml>
cassian validate <topology.yaml>
cassian preflight <topology.yaml>
cassian up <topology.yaml>
cassian run <topology.yaml>
cassian test <topology.yaml>

Commands That Use a Lab Name

cassian status <lab>
cassian exec <lab> <node>
cassian vty <lab> <node> "<command>"
cassian collect <lab>
cassian down <lab>

Where does lab name come from?

Defined inside topology:

name: demo-lab

Displayed during execution:

Lab: demo-lab

5️⃣ Topology Authoring

Cassian Gate consumes YAML topology definitions.

Minimal Example

name: demo-lab

nodes:
  - name: r1
    type: frr

  - name: r2
    type: frr

links:
  - endpoints: ["r1:eth1", "r2:eth1"]

tests:
  - name: r1_to_r2_ping
    kind: ping
    src: r1
    dst: 10.0.0.1
    count: 2
    expect: pass

Required Keys

Required:

name
nodes
links

Optional:

tests
scenarios
packs
fabric
candidate_changes
vlans

Invariant Packs (Loaded and Expanded During Resolve)

Cassian Gate supports declarative invariant packs that are loaded from the supported local pack surface, compatibility-checked, and then expanded into explicit invariant declarations during Resolve.

Packs are optional authoring shortcuts. The authoritative validation still comes later from the expanded invariant verdicts.

Packs are:

declarative only
loaded locally and deterministically
compatibility-checked before expansion
expanded deterministically during Resolve
written as explicit tests in topology.resolved.yaml
non-authoritative by themselves

Packs do not:

execute code
change lifecycle behavior
introduce runtime-only semantics
change authority boundaries
load from remote registries
use fallback or best-match lookup

Later validation still comes from the resulting invariant verdicts.

Pack Declaration

Example:

packs:
  - datacenter-bgp-safety

Rules:

packs must be a list
each pack entry must be a non-empty string
pack lookup is deterministic and local only
unknown pack names fail fast with exit code 2
incompatible pack contents fail fast with exit code 2
pack expansion must be deterministic

Current Supported Pack

datacenter-bgp-safety

Typical behavior for supported pack usage:

loads from the supported local pack surface
undergoes compatibility checks before expansion
expands during Resolve into explicit invariant tests
later phases consume the expanded invariants

Example

name: pack-local-compatibility-ok

packs:
  - datacenter-bgp-safety

fabric:
  evpn:
    enabled: true
    mode: vlan-aware
    asn: 65100

nodes:
  - name: spine1
    type: frr
    role: spine
    evpn_rr: true
    router_id: 10.255.0.1

  - name: leaf1
    type: frr
    role: leaf
    router_id: 10.255.0.11

  - name: leaf2
    type: frr
    role: leaf
    router_id: 10.255.0.12

  - name: host1
    type: host
    attach: leaf1
    vlan: 10
    ip: 10.10.10.11/24
    gw: 10.10.10.1
    mac: "00:11:22:33:44:55"

  - name: host2
    type: host
    attach: leaf2
    vlan: 10
    ip: 10.10.10.12/24
    gw: 10.10.10.1
    mac: "00:11:22:33:44:66"

links:
  - endpoints: ["spine1:eth1", "leaf1:eth1"]
    ipv4: ["172.16.0.0/31", "172.16.0.1/31"]

  - endpoints: ["spine1:eth2", "leaf2:eth1"]
    ipv4: ["172.16.0.2/31", "172.16.0.3/31"]

  - endpoints: ["host1:eth1", "leaf1:eth2"]
  - endpoints: ["host2:eth1", "leaf2:eth2"]

vlans:
  10:
    vni: 10100

tests: []

Operator Commands

Validate local pack loading and compatibility enforcement:

cassian validate topologies/pack_local_compatibility_ok.yaml

Run authoritative gate execution of the accepted expanded invariants:

cassian test topologies/pack_local_compatibility_ok.yaml

Negative misuse proofs:

cassian validate topologies/neg/pack_unknown_reference.yaml
cassian validate topologies/neg/pack_incompatible_contents.yaml

Typical outcomes:

valid local pack topology is accepted
unknown pack references are rejected
incompatible pack contents are rejected

Artifact Note

After Resolve, the expanded invariant list appears explicitly in:

labs/clab-<lab-name>/topology.resolved.yaml

These expanded tests are generated inputs for later execution only.

Authority still comes from the later invariant verdicts in:

results.json

6️⃣ Nodes

Supported node types:

Type	Description
frr	FRR router
host	Linux host
nft-fw	nftables firewall
sonic-vm	SONiC VM runtime

7️⃣ Links

Example:

- endpoints: ["r1:eth1", "r2:eth1"]
  ipv4: ["10.0.0.0/31", "10.0.0.1/31"]

If ipv4 is omitted:

/31 addresses are auto-assigned

View assigned addresses:

labs/clab-<lab>/topology.resolved.yaml
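
To sanity-check a link's ipv4 pair, explicit or read back from topology.resolved.yaml, the sketch below uses Python's standard ipaddress module to confirm that both ends sit in the same /31. same_p2p_subnet is a hypothetical helper. It does not reproduce how Cassian Gate chooses auto-assigned addresses.

```python
import ipaddress


def same_p2p_subnet(a: str, b: str) -> bool:
    """True if a and b are distinct addresses inside one /31 network."""
    ia, ib = ipaddress.ip_interface(a), ipaddress.ip_interface(b)
    return (ia.network == ib.network
            and ia.network.prefixlen == 31
            and ia.ip != ib.ip)


print(same_p2p_subnet("10.0.0.0/31", "10.0.0.1/31"))  # True
print(same_p2p_subnet("10.0.0.1/31", "10.0.0.2/31"))  # False: different /31 networks
```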

8️⃣ EVPN Runtime Substrate (Generation Support)

Cassian Gate supports a deterministic EVPN topology/config generation substrate for a limited, explicit proof shape.

This support exists to produce runtime EVPN control-plane state for later validation work.

It does not make EVPN generation itself authoritative.

Generated EVPN state is supporting runtime substrate only.

Truth still comes from:

tests
invariants

Supported EVPN Intent Surface

Declare EVPN only under:

fabric:
  evpn:
    enabled: true
    mode: vlan-aware
    asn: 65100

Required EVPN fields:

fabric.evpn.enabled
fabric.evpn.mode
fabric.evpn.asn

Supported mode:

vlan-aware

Supported Node Shape

EVPN participants currently use frr nodes with explicit roles.

Example:

nodes:
  - name: spine1
    type: frr
    role: spine
    evpn_rr: true
    router_id: 10.255.0.1

  - name: leaf1
    type: frr
    role: leaf
    router_id: 10.255.0.11

  - name: leaf2
    type: frr
    role: leaf
    router_id: 10.255.0.12

Rules:

EVPN participant nodes must use type: frr
spine nodes must declare evpn_rr: true
leaf nodes must not declare evpn_rr: true
EVPN participant nodes require router_id
leaves must have an explicit direct link to at least one RR spine
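
The node rules above can be expressed as a short checker over parsed nodes and links. This is an illustrative sketch, not the validator Cassian Gate ships. check_evpn_nodes is a hypothetical name, and treating only nodes with role spine or leaf as EVPN participants is an assumption.

```python
def check_evpn_nodes(nodes: list, links: list) -> list:
    """Return rule violations for EVPN participants (spine/leaf roles)."""
    errors = []
    rr_spines = {n["name"] for n in nodes
                 if n.get("role") == "spine" and n.get("evpn_rr") is True}
    # Build adjacency from "node:iface" endpoint strings.
    adjacent = {}
    for link in links:
        a, b = (ep.split(":")[0] for ep in link["endpoints"])
        adjacent.setdefault(a, set()).add(b)
        adjacent.setdefault(b, set()).add(a)
    for n in nodes:
        name, role = n["name"], n.get("role")
        if role not in ("spine", "leaf"):
            continue  # assumption: only spine/leaf nodes are EVPN participants
        if n.get("type") != "frr":
            errors.append(f"{name}: EVPN participant must use type: frr")
        if "router_id" not in n:
            errors.append(f"{name}: router_id is required")
        if role == "spine" and n.get("evpn_rr") is not True:
            errors.append(f"{name}: spine must declare evpn_rr: true")
        if role == "leaf" and n.get("evpn_rr") is True:
            errors.append(f"{name}: leaf must not declare evpn_rr: true")
        if role == "leaf" and not (adjacent.get(name, set()) & rr_spines):
            errors.append(f"{name}: no direct link to an RR spine")
    return errors


nodes = [
    {"name": "spine1", "type": "frr", "role": "spine", "evpn_rr": True, "router_id": "10.255.0.1"},
    {"name": "leaf1", "type": "frr", "role": "leaf", "router_id": "10.255.0.11"},
    {"name": "leaf2", "type": "frr", "role": "leaf", "router_id": "10.255.0.12"},
]
links = [
    {"endpoints": ["spine1:eth1", "leaf1:eth1"]},
    {"endpoints": ["spine1:eth2", "leaf2:eth1"]},
]
print(check_evpn_nodes(nodes, links))  # []
```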

VLAN ↔ VNI Mapping

EVPN requires a top-level vlans mapping.

Example:

vlans:
  10:
    vni: 10100

Rules:

each VLAN must map to exactly one VNI
duplicate VNI reuse is rejected
invalid or missing VNI fails fast
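
A minimal sketch of the mapping rules follows. The accepted VNI range (1 to 16777215) is an assumption based on the 24-bit VXLAN VNI field; Cassian Gate's actual bounds are not stated in this sheet. check_vlan_vni is a hypothetical helper.

```python
def check_vlan_vni(vlans: dict) -> list:
    """Return errors for missing, invalid, or reused VNIs in a vlans mapping."""
    errors, seen = [], {}
    for vlan, body in vlans.items():
        vni = body.get("vni") if isinstance(body, dict) else None
        # bool is an int subclass in Python, so exclude it explicitly.
        if not isinstance(vni, int) or isinstance(vni, bool) or not 1 <= vni <= 0xFFFFFF:
            errors.append(f"vlan {vlan}: missing or invalid vni")
            continue
        if vni in seen:
            errors.append(f"vlan {vlan}: vni {vni} already used by vlan {seen[vni]}")
        else:
            seen[vni] = vlan
    return errors


print(check_vlan_vni({10: {"vni": 10100}}))                      # []
print(check_vlan_vni({10: {"vni": 10100}, 20: {"vni": 10100}}))  # duplicate VNI reported
```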

Host Attachment Requirements

Host attachment must be explicit.

Example:

- name: host1
  type: host
  attach: leaf1
  vlan: 10
  ip: 10.10.10.11/24
  gw: 10.10.10.1
  mac: "00:11:22:33:44:55"

Required host fields for EVPN proof substrate:

attach
vlan
ip
mac

Rules:

attached host must connect explicitly to an EVPN leaf
host VLAN must exist in the declared VLAN/VNI map
host MAC must be explicit
host must have exactly one explicit link to its attached leaf
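
The host rules can be sketched the same way. check_host is a hypothetical helper operating on already-parsed topology data; it checks the required fields and the one-link rule, and does not validate IP or MAC syntax.

```python
REQUIRED_HOST_FIELDS = ("attach", "vlan", "ip", "mac")


def check_host(host: dict, leaves: set, vlans: dict, links: list) -> list:
    """Return violations of the EVPN host attachment rules for one host."""
    name = host["name"]
    errors = [f"{name}: missing {f}" for f in REQUIRED_HOST_FIELDS if f not in host]
    if errors:
        return errors
    if host["attach"] not in leaves:
        errors.append(f"{name}: attach target {host['attach']} is not an EVPN leaf")
    if host["vlan"] not in vlans:
        errors.append(f"{name}: vlan {host['vlan']} is not in the VLAN/VNI map")
    # Count links whose two endpoints are exactly this host and its leaf.
    pair = {name, host["attach"]}
    direct = [l for l in links if {ep.split(":")[0] for ep in l["endpoints"]} == pair]
    if len(direct) != 1:
        errors.append(f"{name}: expected exactly one link to {host['attach']}, found {len(direct)}")
    return errors


host1 = {"name": "host1", "type": "host", "attach": "leaf1", "vlan": 10,
         "ip": "10.10.10.11/24", "gw": "10.10.10.1", "mac": "00:11:22:33:44:55"}
host_links = [{"endpoints": ["host1:eth1", "leaf1:eth2"]}]
print(check_host(host1, {"leaf1", "leaf2"}, {10: {"vni": 10100}}, host_links))  # []
```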

Minimal Supported Proof Shape

Supported proof shape is intentionally narrow:

leaf/spine only
explicit RR spine
explicit host attachment
one VLAN is sufficient
deterministic MAC/IP declarations required

This support is intended to produce:

EVPN BGP control-plane sessions
deterministic VLAN/VNI configuration
deterministic host attachment semantics
deterministic runtime substrate for later MAC-route observation

Unsupported / Rejected Shapes

Cassian Gate fails fast on unsupported EVPN topology intent.

Examples include:

EVPN declared outside fabric.evpn
ambiguous EVPN participant selection
unsupported node role combinations
missing RR spine
missing or invalid VNI
missing explicit host attachment semantics
shapes requiring out-of-band configuration
heuristic peer inference

These are misuse / invalid-topology errors.

Example EVPN Runtime Generation Topology

name: evpn-runtime-generation

fabric:
  evpn:
    enabled: true
    mode: vlan-aware
    asn: 65100

nodes:
  - name: spine1
    type: frr
    role: spine
    evpn_rr: true
    router_id: 10.255.0.1

  - name: leaf1
    type: frr
    role: leaf
    router_id: 10.255.0.11

  - name: leaf2
    type: frr
    role: leaf
    router_id: 10.255.0.12

  - name: host1
    type: host
    attach: leaf1
    vlan: 10
    ip: 10.10.10.11/24
    gw: 10.10.10.1
    mac: "00:11:22:33:44:55"

  - name: host2
    type: host
    attach: leaf2
    vlan: 10
    ip: 10.10.10.12/24
    gw: 10.10.10.1
    mac: "00:11:22:33:44:66"

links:
  - endpoints: ["spine1:eth1", "leaf1:eth1"]
    ipv4: ["172.16.0.0/31", "172.16.0.1/31"]

  - endpoints: ["spine1:eth2", "leaf2:eth1"]
    ipv4: ["172.16.0.2/31", "172.16.0.3/31"]

  - endpoints: ["host1:eth1", "leaf1:eth2"]
  - endpoints: ["host2:eth1", "leaf2:eth2"]

vlans:
  10:
    vni: 10100

tests: []

Operator Commands

Validate the EVPN topology:

cassian validate topologies/evpn_runtime_generation.yaml

Bring up EVPN runtime substrate:

cassian up topologies/evpn_runtime_generation.yaml

Run authoritative gate proof:

cassian test topologies/evpn_runtime_generation.yaml

Replay deterministically:

cassian replay labs/clab-evpn-runtime-generation --gate --verify-results

Negative misuse proofs:

cassian test topologies/neg/evpn_invalid_vni.yaml
cassian test topologies/neg/evpn_invalid_roles.yaml

Artifact Note

topology.resolved.yaml may include additive EVPN-resolved fields for the generated proof substrate.

These fields remain generated and non-authoritative.

They support deterministic execution only.

Important Boundary

EVPN topology/config generation support:

configures deterministic EVPN runtime substrate
does not prove EVPN correctness by itself
does not validate dataplane forwarding
does not validate EVPN invariants by itself
does not change authority semantics

Use later tests/invariants to establish truth.

9️⃣ Tests and Invariants

Cassian Gate supports both:

active behavior tests
deterministic invariant checks

Both produce standard authoritative results in gate mode.

Standard test kinds

Supported kinds:

ping
tcp
invariant — see "Invariant tests" below for the supported invariant type values

Ping Example

- name: r1_to_r2
  kind: ping
  src: r1
  dst: 10.0.0.1
  count: 2
  expect: pass

Required fields:

name
kind
src
dst

TCP Example

- name: tcp_test
  kind: tcp
  src: h1
  dst: r2
  port: 443
  listener: true
  expect: pass

Required fields:

name
kind
src
dst

Invariant tests

Invariant tests use:

kind: invariant

They validate declared truth conditions and return authoritative pass/fail results like any other test.

Blocked declared validation items

If a declared test or selected scenario reaches authoritative execution scope but cannot execute normally because execution is blocked later in the gate path, Cassian Gate records that item explicitly in results.json.

This prevents omission from being misread as success.

Typical blocked representation:

observed: blocked
verdict: fail
error: blocked before execution

Example meaning:

the declared validation item existed
it was in authoritative scope
it did not run normally
the result was recorded explicitly rather than omitted
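
A reviewer or CI step could surface blocked items separately from ordinary failures, as in the sketch below. It assumes a top-level "tests" list in results.json and uses the observed / verdict / error fields from the typical representation above. blocked_items is a hypothetical helper.

```python
def blocked_items(results: dict) -> list:
    """Names of declared items that were recorded as blocked rather than executed."""
    return [e["name"] for e in results.get("tests", [])
            if e.get("observed") == "blocked"]


results = {"tests": [
    {"name": "r1_to_r2", "verdict": "pass"},
    {"name": "tcp_test", "observed": "blocked", "verdict": "fail",
     "error": "blocked before execution"},
]}
print(blocked_items(results))  # ['tcp_test']
```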

Routing Invariants

Routing invariants validate specific routing truth on a named node.

They are useful when you need to prove policy outcome, path preference, route advertisement boundaries, or route attributes.

BGP Local Preference Invariant

Invariant type:

bgp_localpref_equals

Purpose:

Verify that a BGP route installed on a node has the expected LOCAL_PREF value.

This is useful for validating routing policy behavior such as:

inbound route-maps
outbound policy manipulation
policy-based path preference
iBGP policy consistency

Typical required fields:

Field	Description
node	Node where the route must be observed
prefix	Prefix being validated
expected	Expected BGP local preference value

Example:

tests:
  - name: r2_sees_1_1_1_1_32_with_localpref_200
    kind: invariant
    type: bgp_localpref_equals
    node: r2
    prefix: 1.1.1.1/32
    expected: 200
    expect: pass

Behavior:

The invariant inspects the routing information on the specified node.
The route must exist and contain the declared LOCAL_PREF value.
If the route is present but the LOCAL_PREF differs from the expected value, the invariant fails.
If the invariant definition itself is invalid, the run fails with misuse exit code 2.

Typical exit semantics follow the standard Cassian Gate model:

satisfied invariant → passing gate outcome
invariant mismatch → validation failure
invalid invariant declaration → usage / contract error

Artifacts produced:

The invariant result is recorded in the standard artifacts:

labs/<lab>/results.json
labs/<lab>/results.summary.txt

Example result entry:

{
  "name": "r2_sees_1_1_1_1_32_with_localpref_200",
  "kind": "invariant",
  "type": "bgp_localpref_equals",
  "verdict": "pass"
}

Determinism notes:

invariant evaluation occurs during the TEST phase
replay is intended to preserve the same authority semantics as the source gate context

Route Advertised To Invariant

Invariant type:

route_advertised_to

Purpose:

Verify that a specific route is being advertised from the specified node to the specified peer.

This is useful for validating routing advertisement boundaries such as:

expected route export to a peer
intended prefix propagation across a boundary
prevention of missing outbound advertisements
verification that a route is actually being sent to a named neighbor

Required fields:

Field	Description
node	Node where the route advertisement is checked
peer	Named peer that must receive the route
prefix	Prefix being validated

Example:

tests:
  - name: r1_advertises_10_10_10_0_24_to_r2
    kind: invariant
    type: route_advertised_to
    node: r1
    peer: r2
    prefix: 10.10.10.0/24
    expect: pass

Behavior:

The invariant inspects supported structured advertisement evidence on the specified node.
It passes when the specified prefix is observed as advertised to the named peer.
It fails when the prefix is not observed as advertised to that peer.
If the invariant definition itself is invalid, the run fails with misuse exit code 2.

Typical exit semantics follow the standard Cassian Gate model:

satisfied invariant → passing gate outcome
invariant mismatch → validation failure
invalid invariant declaration → usage / contract error

Artifacts produced:

The invariant result is recorded in the standard artifacts:

labs/<lab>/results.json
labs/<lab>/results.summary.txt

Replay:

This invariant can be checked again using standard gate replay workflows.

cassian replay labs/clab-route-advertised-to --gate --verify-results

Scope boundary:

This invariant validates only peer-scoped route advertisement presence.

It does not by itself prove:

generic routing policy correctness
attribute correctness
community / AS-path behavior
broader route-map intent

Route Not Advertised To Invariant

Invariant type:

route_not_advertised_to

Purpose:

Verify that a specific route is not being advertised from the specified node to the specified peer.

This is useful for validating routing advertisement boundaries such as:

expected suppression of a prefix to a peer
prevention of route leaks
verification that a route is withheld from a named neighbor
confirming that local route presence does not imply outbound advertisement

Required fields:

Field	Description
node	Node where the route advertisement is checked
peer	Named peer that must not receive the route
prefix	Prefix being validated

Example:

tests:
  - name: r1_does_not_advertise_10_10_10_0_24_to_r2
    kind: invariant
    type: route_not_advertised_to
    node: r1
    peer: r2
    prefix: 10.10.10.0/24
    expect: pass

Behavior:

The invariant inspects supported structured advertisement evidence on the specified node.
It passes when the specified prefix is not observed as advertised to the named peer.
It fails when the prefix is observed as advertised to that peer.
If the invariant definition itself is invalid, the run fails with misuse exit code 2.

Typical exit semantics follow the standard Cassian Gate model:

satisfied invariant → passing gate outcome
invariant mismatch → validation failure
invalid invariant declaration → usage / contract error

Artifacts produced:

The invariant result is recorded in the standard artifacts:

labs/<lab>/results.json
labs/<lab>/results.summary.txt

Replay:

This invariant can be checked again using standard gate replay workflows.

cassian replay labs/clab-route-not-advertised-to --gate --verify-results

Scope boundary:

This invariant validates only peer-scoped route advertisement absence.

It does not by itself prove:

generic routing policy correctness
attribute correctness
community / AS-path behavior
broader route-map intent

BGP Session Up Invariant

Invariant type:

bgp_session_up

Purpose:

Verify that an IPv4-AFI BGP session from the specified node to a declared neighbor IPv4 address is in the FRR Established state.

This is useful for validating BGP session establishment such as:

iBGP session presence to a known neighbor
eBGP session presence to a known neighbor
post-change BGP session re-establishment
guarded assertion of session up before further routing-policy invariants

Required fields:

Field	Description
node	Node where the BGP session is checked (FRR-typed)
neighbor	IPv4 literal of the BGP neighbor on that node (canonical alias `dst` accepted)

Example:

tests:
  - name: r1_bgp_up_to_10_0_0_2
    kind: invariant
    type: bgp_session_up
    node: r1
    neighbor: 10.0.0.2
    expect: pass

Behavior:

The invariant runs vtysh -c 'show bgp summary json' on the specified node and parses the structured output.
It passes when the queried neighbor is present in FRR's BGP summary and its session state is Established.
It fails when the session is in any other FRR FSM state (Idle, Active, Connect, OpenSent, OpenConfirm), when the neighbor is not configured (engine-synthesized state literal NotConfigured), or when vtysh fails or its output is not parseable as JSON (engine-synthesized state literal Unknown).
If the invariant definition itself is invalid (missing or malformed neighbor IPv4 literal, also accepted under the alias dst), the run fails with misuse exit code 2.
The retry policy mirrors the existing bgp_neighbor test surface: retries are bounded by the test's timeout_s (default 15 seconds) and retry_interval_s (default 1.0 seconds). The loop stops at the first successful vtysh return code, and the parsed state is read after the loop.
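
The state classification described above can be sketched as follows. The JSON layout (an "ipv4Unicast" object holding a "peers" map keyed by neighbor address, each with a "state" field) is an assumption modelled on FRR's summary output, and bgp_session_state is a hypothetical helper rather than the engine's parser.

```python
import json


def bgp_session_state(summary_json: str, neighbor: str) -> str:
    """Classify a neighbor's state from `show bgp summary json`-style output."""
    try:
        data = json.loads(summary_json)
        peers = data["ipv4Unicast"]["peers"]  # assumption: FRR-style layout
    except (ValueError, KeyError, TypeError):
        return "Unknown"          # output missing or not parseable
    if not isinstance(peers, dict):
        return "Unknown"
    peer = peers.get(neighbor)
    if not isinstance(peer, dict):
        return "NotConfigured"    # neighbor absent from the summary
    return peer.get("state", "Unknown")


sample = '{"ipv4Unicast": {"peers": {"10.0.0.2": {"state": "Established"}}}}'
for nbr in ("10.0.0.2", "10.0.0.9"):
    state = bgp_session_state(sample, nbr)
    print(nbr, state, "pass" if state == "Established" else "fail")
```

Only Established maps to a pass; every other literal, including the two synthesized ones, lands on the fail path.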

Typical exit semantics follow the standard Cassian Gate model:

satisfied invariant → passing gate outcome
invariant mismatch → validation failure
invalid invariant declaration → usage / contract error

Artifacts produced:

The invariant result is recorded in the standard artifacts:

labs/<lab>/results.json
labs/<lab>/results.summary.txt

When verdict: fail, the test record carries a structured observed_state payload with deterministic keys (type, peer, state, last_error, source_node). See "Failed-Invariant Observed State" below and docs/topology-schema-v1.5.md §4.1 for the full per-type schema.

Positive proof example:

cassian test topologies/bgp_invariant.yaml

Replay:

This invariant can be checked again using standard gate replay workflows.

cassian replay labs/clab-bgp-invariant --gate --verify-results

Scope boundary:

This invariant validates only IPv4-AFI BGP session-Established truth from one node to one neighbor IPv4.

It does not by itself prove:

EVPN-AFI session state (use evpn_bgp_session_up)
route presence or attribute correctness
route advertisement boundaries
generic routing policy correctness

Route Present Invariant

Invariant type:

route_present

Purpose:

Verify that a specific route is present in the specified node's IPv4 routing table.

This is useful for validating route installation such as:

expected RIB presence after policy or session establishment
verification that a prefix is actually installed on the node
guarded assertion of route presence before further routing-policy invariants

Required fields:

Field	Description
node	Node where the route presence is checked
prefix	Prefix being validated (canonical IPv4 CIDR)

Example:

tests:
  - name: r1_has_10_10_10_0_24
    kind: invariant
    type: route_present
    node: r1
    prefix: 10.10.10.0/24
    expect: pass

Behavior:

The invariant inspects the IPv4 routing table on the specified node.
It passes when the queried prefix is observed in the routing table.
It fails when the queried prefix is not observed.
If the invariant definition itself is invalid, the run fails with misuse exit code 2.

Artifacts produced:

labs/<lab>/results.json
labs/<lab>/results.summary.txt

When verdict: fail, the test record carries a structured observed_state payload (type, prefix, routes, source_node). See docs/topology-schema-v1.5.md §4.3 for the full per-type schema.

Replay:

cassian replay labs/clab-route-present-missing --gate --verify-results

Route Absent Invariant

Invariant type:

route_absent

Purpose:

Verify that a specific route is not present in the specified node's IPv4 routing table.

This is useful for validating intentional route absence such as:

prefix-blackhole effectiveness
expected suppression after withdrawal
verification that a route is genuinely not installed
negative-complement of route_present

Required fields:

Field	Description
node	Node where the route absence is checked
prefix	Prefix being validated (canonical IPv4 CIDR)

Example:

tests:
  - name: r1_does_not_have_10_20_20_0_24
    kind: invariant
    type: route_absent
    node: r1
    prefix: 10.20.20.0/24
    expect: pass

Behavior:

The invariant inspects the IPv4 routing table on the specified node.
It passes when the queried prefix is not observed in the routing table.
It fails when the queried prefix is observed.
If the invariant definition itself is invalid, the run fails with misuse exit code 2.

Artifacts produced:

labs/<lab>/results.json
labs/<lab>/results.summary.txt

When verdict: fail, the test record carries a structured observed_state payload (type, prefix, routes, source_node). See docs/topology-schema-v1.5.md §4.3 for the full per-type schema.

BGP MED Equals Invariant

Invariant type:

bgp_med_equals

Purpose:

Verify that a BGP route installed on a node has the expected MED (Multi-Exit Discriminator) value.

This is useful for validating routing policy behavior such as:

inbound MED-rewriting policy
expected MED preservation across boundaries
iBGP MED propagation consistency
companion to bgp_localpref_equals for full attribute coverage

Required fields:

Field	Description
node	Node where the route must be observed
prefix	Prefix being validated
expected	Expected BGP MED value (integer)

Example:

tests:
  - name: r2_sees_1_1_1_1_32_with_med_50
    kind: invariant
    type: bgp_med_equals
    node: r2
    prefix: 1.1.1.1/32
    expected: 50
    expect: pass

Behavior:

The invariant inspects the BGP route entry on the specified node.
The route must exist and contain the declared MED value.
If the route is present but the MED differs from the expected value, the invariant fails.
If the invariant definition itself is invalid, the run fails with misuse exit code 2.

Artifacts produced:

labs/<lab>/results.json
labs/<lab>/results.summary.txt

When verdict: fail, the test record carries a structured observed_state payload (type, prefix, peer, actual, expected, source_node). See docs/topology-schema-v1.5.md §4.5 for the full per-type schema.

OSPF Neighbor Up Invariant

Invariant type:

ospf_neighbor_up

Purpose:

Verify that an OSPF neighbor adjacency from the specified node to a declared peer router-ID has reached the expected FSM state (default Full).

This is useful for validating OSPF adjacency establishment such as:

backbone-area neighbor convergence
post-change OSPF re-adjacency
guarded assertion of OSPF Full adjacency before further routing-policy invariants

This invariant is FRR-only; declaring ospf_neighbor_up against a non-FRR src node is rejected at validation with exit code 2.

Required fields:

Field	Description
src	Node where the OSPF neighbor table is checked (must be `type: frr`)
neighbor	IPv4 literal of the peer's OSPF router-ID (NOT a node name)

Optional fields:

Field	Description
state	Expected FSM state literal; one of `Down`, `Attempt`, `Init`, `2-Way`, `ExStart`, `Exchange`, `Loading`, `Full`. Default `Full` materialised at Resolve.

The companion node-level ospf: block (declared on FRR nodes) carries area (int ≥ 0, required) and networks (non-empty list of canonical IPv4 CIDRs, required); declaring ospf: requires the node to also declare top-level router_id. Timer customization (hello-interval, dead-interval, spf-delay) and passive-interface posture are out of scope; FRR defaults govern. See docs/topology-schema-v1.md §3.1 (Optional ospf: block) for the topology-side schema.
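
The ospf: block constraints can be sketched with the standard ipaddress module. Here "canonical" is taken to mean a prefix with no host bits set, written exactly as Python renders it. That reading is an assumption, and check_ospf_node is a hypothetical helper.

```python
import ipaddress


def is_canonical_ipv4_cidr(prefix) -> bool:
    """True for strings like "10.0.0.0/16" with no host bits set."""
    if not isinstance(prefix, str):
        return False
    try:
        net = ipaddress.IPv4Network(prefix, strict=True)
    except ValueError:
        return False
    return str(net) == prefix  # rejects forms such as a bare address without /len


def check_ospf_node(node: dict) -> list:
    """Return violations of the node-level ospf: block rules."""
    ospf = node.get("ospf")
    if ospf is None:
        return []
    errors = []
    if "router_id" not in node:
        errors.append("ospf requires top-level router_id")
    area = ospf.get("area")
    if not isinstance(area, int) or isinstance(area, bool) or area < 0:
        errors.append("ospf.area must be an int >= 0")
    networks = ospf.get("networks")
    if not isinstance(networks, list) or not networks:
        errors.append("ospf.networks must be a non-empty list")
    else:
        errors += [f"ospf.networks: not a canonical IPv4 CIDR: {n!r}"
                   for n in networks if not is_canonical_ipv4_cidr(n)]
    return errors


r1 = {"name": "r1", "type": "frr", "router_id": "1.1.1.1",
      "ospf": {"area": 0, "networks": ["10.0.0.0/16", "1.1.1.1/32"]}}
print(check_ospf_node(r1))  # []
```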

Example:

nodes:
  - name: r1
    type: frr
    router_id: 1.1.1.1
    ospf:
      area: 0
      networks:
        - 10.0.0.0/16
        - 1.1.1.1/32
  - name: r2
    type: frr
    router_id: 2.2.2.2
    ospf:
      area: 0
      networks:
        - 10.0.0.0/16
        - 2.2.2.2/32

links:
  - endpoints: ["r1:eth1", "r2:eth1"]

tests:
  - name: r1_neighbor_up_to_r2
    kind: invariant
    type: ospf_neighbor_up
    src: r1
    neighbor: 2.2.2.2
    expect: pass

Behavior:

The invariant runs vtysh -c 'show ip ospf neighbor json' on the specified src node and parses the structured output.
It passes when the queried neighbor's router-ID is present in FRR's neighbor table and its FSM state matches state (default Full).
It fails when the FSM state differs (engine-synthesized state literals NotConfigured and Unknown may also appear on the FAIL path).
If the invariant definition is invalid (non-FRR src, non-IPv4 neighbor, undeclared state literal), the run fails with misuse exit code 2.
Retry policy: when expect: pass, retries are bounded by the test's timeout_s (default 60 seconds, chosen with OSPF dead-interval timing in mind) and retry_interval_s (default 1.0 seconds). When expect: fail, a single attempt is made.
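
The retry bound described above can be sketched as follows. This is an illustration under stated assumptions, not the engine's implementation: probe is a hypothetical callable standing in for the vtysh query, and the injectable clock and sleep parameters exist only so the sketch can be exercised without real delays.

```python
import time
from typing import Callable, Optional


def observe_neighbor_state(
    probe: Callable[[], Optional[str]],  # hypothetical: returns the observed FSM state
    expected_state: str = "Full",
    expect: str = "pass",
    timeout_s: float = 60.0,
    retry_interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if the neighbor was observed in expected_state."""
    if expect == "fail":
        # expect: fail gets exactly one attempt, no retries.
        return probe() == expected_state
    deadline = clock() + timeout_s
    while True:
        if probe() == expected_state:
            return True
        if clock() >= deadline:
            return False  # budget exhausted: timeout is a failure
        sleep(retry_interval_s)
```

The function reports only whether the condition was observed; comparing that observation with expect to produce the verdict is left to the caller.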

Artifacts produced:

labs/<lab>/results.json
labs/<lab>/results.summary.txt

When verdict: fail, the test record carries a structured six-key observed_state payload (type, neighbor, state, expected_state, last_error, source_node). See docs/topology-schema-v1.5.md §4.8 for the full per-type schema, including the comprehensive 10-FSM-literal closed-set documentation (8 declarable + 2 observed-only).

Positive proof example:

cassian test topologies/ospf_neighbor_up.yaml

Replay:

cassian replay labs/clab-ospf-neighbor-up --gate --verify-results

Scope boundary:

This invariant validates only OSPFv2 single-area FRR adjacency from one node to one neighbor router-ID.

It does not by itself prove:

OSPFv3 / IPv6 OSPF adjacency
multi-area OSPF design correctness
OSPF LSA-level inspection
non-FRR (SONiC, Arista) OSPF — src must be type: frr
area-mismatch as an invariant (the negative proof topology demonstrates the FAIL pathology, but no ospf_area_match invariant exists in v1.5)
routing policy or attribute correctness

Interface State Invariant

Invariant type:

interface_state

Purpose:

Verify that an interface declared by a links: endpoint has the expected administrative/operational state inside its node's network namespace.

This is useful for validating interface posture such as:

post-deploy confirmation that all topology interfaces came up
post-fault confirmation that a fault: interface_down step actually brought the interface down
pre-test posture gate before subsequent reachability invariants

This invariant is NOS-agnostic: it uses the Linux primitive ip -j link show <iface> and works on any node type with a Linux network namespace (frr, host, nft-fw).

Required fields:

Field	Description
node	Node whose namespace is probed
interface	Interface name as seen inside the node namespace (e.g. `eth1`)

Optional fields:

Field	Description
state	Expected state literal; one of `up`, `down`. Default `up` materialised at Resolve.

The verdict predicate is asymmetric:

state: up requires admin_state == "up" AND operstate == "UP" (conjunction; both must hold).
state: down requires admin_state == "down" OR operstate != "UP" (disjunction; either suffices).

Carrier (link-layer signal) is reported in observed_state for diagnostic clarity but does NOT participate in the verdict.
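
The asymmetric predicate can be written as a small pure function. The function name and signature are hypothetical; only the two rules above are taken from this document.

```python
def interface_state_verdict(expected: str, admin_state: str, operstate: str) -> bool:
    """Apply the asymmetric predicate; carrier is deliberately not an input."""
    if expected == "up":
        # Conjunction: both conditions must hold.
        return admin_state == "up" and operstate == "UP"
    if expected == "down":
        # Disjunction: either condition suffices.
        return admin_state == "down" or operstate != "UP"
    raise ValueError(f"invalid state literal: {expected!r}")
```

For example, an administratively up interface whose operstate is LOWERLAYERDOWN satisfies state: down but not state: up.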

iproute2 capability dependency: the probe requires an ip binary supporting the -j JSON flag. BusyBox ip (the default in alpine:latest, the engine default for host and nft-fw) does NOT support -j. Topologies exercising interface_state on host or nft-fw nodes MUST pin a compatible image (e.g. nicolaka/netshoot:v0.15) explicitly in the node declaration. FRR's default image already includes full iproute2.

Example:

nodes:
  - name: r1
    type: frr
  - name: h1
    type: host
    image: nicolaka/netshoot:v0.15
    ip: 192.168.10.10/24
    gw: 192.168.10.1
  - name: fw1
    type: nft-fw
    image: nicolaka/netshoot:v0.15
    routed: true
    interfaces:
      eth1: 10.0.0.1/31
      eth2: 192.168.10.1/24
    allow_icmp: true

links:
  - endpoints: ["r1:eth1", "fw1:eth1"]
    ipv4: ["10.0.0.0/31", "10.0.0.1/31"]
  - endpoints: ["h1:eth1", "fw1:eth2"]
    ipv4: ["192.168.10.10/24", "192.168.10.1/24"]

tests:
  - name: r1_eth1_up
    kind: invariant
    type: interface_state
    node: r1
    interface: eth1
    state: up
    expect: pass

Behavior:

The invariant runs ip -j link show <iface> on the specified node and parses the JSON output.
It passes when the kernel-reported admin_state and operstate together satisfy the asymmetric predicate above.
It fails when the predicate does not hold OR when the probe itself fails (closed-set last_error literal indicates which path: capability-probe failure, interface-not-present, ip-command-failure, JSON parse failure, structural surprise, missing field).
A per-(lab, node) capability probe runs at most once per gate run on first use of interface_state against that node; capability-probe failures short-circuit with last_error: "ip -j flag not supported by node's iproute2".
If the invariant definition is invalid (missing node, missing interface, unknown node reference, invalid state literal, unknown key), the run fails with misuse exit code 2.
Retry policy: bounded by the test's timeout_s (default 10 seconds) and retry_interval_s (default 0.5 seconds) when expect: pass. Single attempt for expect: fail.

Artifacts produced:

labs/<lab>/results.json
labs/<lab>/results.summary.txt

When verdict: fail, the test record carries a structured eight-key observed_state payload (type, interface, expected_state, admin_state, operstate, carrier, last_error, source_node). See docs/topology-schema-v1.5.md §4.9 for the full per-type schema, including the closed-set documentation for all four state-axis fields (admin_state, operstate, carrier, last_error).

Positive proof example:

cassian test topologies/interface_state_up.yaml

Replay:

cassian replay labs/clab-interface-state-up --gate --verify-results

Scope boundary:

This invariant validates only kernel-reported interface administrative/operational state inside a node's Linux network namespace.

It does not by itself prove:

L2 reachability across the link (use ping for that)
L3 reachability or routing-table correctness (use ping, route_present, or BGP invariants)
MTU, speed, duplex, error counters, or other interface-level metrics
vendor NOS-specific interface state (the probe is a Linux primitive; SONiC/Arista VM nodes are out of scope)
carrier-level signal — carrier is reported in observed_state for diagnostic clarity but is NOT part of the verdict predicate

EVPN Invariants

Cassian Gate supports deterministic EVPN invariant checks as standard authoritative test results.

EVPN MAC Route Present

Validates that a specific MAC route is present for the specified VNI on the specified node.

Example:

tests:
  - name: leaf2_sees_host1_mac_route
    kind: invariant
    type: evpn_mac_route_present
    node: leaf2
    mac: "00:11:22:33:44:55"
    vni: 10100
    expect: pass

Required fields:

kind: invariant
type: evpn_mac_route_present
node
mac
vni

EVPN MAC Route Absent

Validates that a specific MAC route is absent for the specified VNI on the specified node.

Example:

tests:
  - name: leaf2_does_not_see_mac_route
    kind: invariant
    type: evpn_mac_route_absent
    node: leaf2
    mac: "00:11:22:33:44:55"
    vni: 10100
    expect: pass

Required fields:

kind: invariant
type: evpn_mac_route_absent
node
mac
vni

EVPN VNI Route Present

Validates that an EVPN control-plane route is present for the specified VNI on the specified node.

Example:

tests:
  - name: leaf2_sees_vni_10100
    kind: invariant
    type: evpn_vni_route_present
    node: leaf2
    vni: 10100
    expect: pass

Required fields:

kind: invariant
type: evpn_vni_route_present
node
vni

EVPN BGP Session Up

Validates that the EVPN BGP session to the specified peer is up on the specified node.

Example:

tests:
  - name: leaf1_evpn_session_to_spine1_up
    kind: invariant
    type: evpn_bgp_session_up
    node: leaf1
    peer: spine1
    expect: pass

Required fields:

kind: invariant
type: evpn_bgp_session_up
node
peer

Expected outcomes

These invariants behave like other authoritative test results:

expect: pass means the declared invariant should be observed as true
mismatch leads to a validation failure
invalid invariant declarations are treated as usage / contract errors

Evidence and authority

For EVPN invariants:

runtime EVPN route/session data is supporting evidence
the invariant verdict in results.json is authoritative

The check is intended to preserve deterministic authority semantics and replay consistency.

Positive proof examples

cassian test topologies/evpn_mac_route_present.yaml
cassian test topologies/evpn_vni_route_present.yaml
cassian test topologies/evpn_bgp_session_up.yaml

Negative validation example

cassian test topologies/evpn_mac_route_absent_expected_present.yaml

Negative misuse example

cassian test topologies/neg/evpn_invalid_mac_invariant.yaml

Replay

These invariants can be checked again using standard gate replay workflows:

cassian replay labs/clab-evpn-mac-route-present --gate --verify-results
cassian replay labs/clab-evpn-vni-route-present --gate --verify-results
cassian replay labs/clab-evpn-bgp-session-up --gate --verify-results

Scope boundary

EVPN invariants validate only the declared EVPN truth being tested.

They do not by themselves prove:

full dataplane forwarding
broader EVPN feature correctness
non-EVPN control-plane behavior

Failed-Invariant Observed State

When an invariant test produces verdict: fail, the test record in results.json carries a structured observed_state payload alongside the existing observed string. This is the authoritative deterministic failure-reason artifact.

Where it appears:

on records in results["tests"] whose kind == "invariant" AND verdict == "fail"
on records in results["events"] whose type == "scenario_test_run" AND kind == "invariant" AND verdict == "fail"

Where it does NOT appear:

on passing-invariant records
on non-invariant test kinds (ping, tcp)
on prereq failure paths (those surface as hard_failure: in the summary)
on records with observed: blocked, verdict: fail, error: blocked before execution (those are recorded explicitly per the Blocked declared validation items rules above)
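
A minimal sketch of reading these records out of a loaded results.json, assuming only the fields named above (tests, kind, verdict, name, observed_state). The helper name is hypothetical.

```python
def failed_invariant_states(results: dict) -> list:
    """Collect (name, observed_state) pairs for failed invariant test records."""
    found = []
    for record in results.get("tests", []):
        if (
            record.get("kind") == "invariant"
            and record.get("verdict") == "fail"
            # Blocked records also carry verdict: fail but no observed_state.
            and "observed_state" in record
        ):
            found.append((record["name"], record["observed_state"]))
    return found
```

Load the artifact with json.load and pass the resulting dict to the helper; passing records and non-invariant kinds are skipped.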

Determinism contract:

every value in observed_state is derived from declared topology / test inputs or from deterministically-computable scalars in parsed vtysh JSON
environmental nondeterminism (host clocks, container IDs, runtime PIDs, hostnames-of-the-runner, containerlab-allocated veth MAC addresses) MUST NOT enter observed_state
two clean runs of the same topology produce byte-identical observed_state payloads

Per-record byte ceiling:

a single record's observed_state is bounded at 8192 bytes of canonical JSON
when a payload would exceed the ceiling, the engine deterministically suffix-drops trailing entries from the longest list field until it fits and sets observed_state_truncated: true on the record
the supporting evidence field still carries the full pre-truncation list
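
The suffix-drop rule can be sketched as below. Two details are assumptions made for illustration and are not confirmed by this document: canonical JSON is taken to mean sorted keys with compact separators encoded as UTF-8, and ties between equally long lists are broken by key name.

```python
import json

CEILING_BYTES = 8192


def canonical_size(payload: dict) -> int:
    # Assumption: canonical JSON = sorted keys, compact separators, UTF-8.
    text = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return len(text.encode("utf-8"))


def truncate_observed_state(payload: dict, ceiling: int = CEILING_BYTES):
    """Return (payload_copy, truncated_flag) without mutating the input."""
    state = {k: (list(v) if isinstance(v, list) else v) for k, v in payload.items()}
    truncated = False
    while canonical_size(state) > ceiling:
        list_keys = sorted(k for k, v in state.items() if isinstance(v, list) and v)
        if not list_keys:
            break  # nothing left to drop; this case is outside the sketch
        # Longest list first; max() keeps the first key among ties (assumption).
        longest = max(list_keys, key=lambda k: len(state[k]))
        state[longest].pop()  # drop the trailing entry
        truncated = True
    return state, truncated
```

The returned flag corresponds to observed_state_truncated on the record.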

Example failed-invariant record shape in results.json:

{
  "name": "leaf2_evpn_mac_route_for_unknown_mac",
  "kind": "invariant",
  "type": "evpn_mac_route_present",
  "verdict": "fail",
  "observed": "fail",
  "observed_state": {
    "type": "evpn_mac_route_present",
    "mac": "de:ad:be:ef:00:01",
    "vni": 10100,
    "evpn_routes": [
      {"mac": "00:11:22:33:44:55", "vni": 10100, "rd": "", "prefix": "", "route_type": 2}
    ],
    "source_node": "leaf2"
  }
}

Summary rendering in results.summary.txt:

Each failed-invariant line in the failed_tests: block is followed by an indented observed: block. Indentation is fixed: header at 4-space, key/value lines at 6-space, list entries at 8-space. List values are capped at 5 entries with a trailing (+<N> more) over-cap line. When the record carries observed_state_truncated: true, the renderer emits a literal trailing line (observed_state truncated; full payload in results.json) at 6-space indent.

Example summary block:

failed_tests:
 - leaf2_evpn_mac_route_for_unknown_mac (invariant) leaf2-> : evpn_mac_route_present mismatch (expected pass, observed fail)
    observed:
      evpn_routes:
        - mac=00:11:22:33:44:01, prefix=, rd=, route_type=2, vni=10100
        - mac=00:11:22:33:44:02, prefix=, rd=, route_type=2, vni=10100
        (+58 more)
      mac: de:ad:be:ef:00:01
      source_node: leaf2
      type: evpn_mac_route_present
      vni: 10100
      (observed_state truncated; full payload in results.json)
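
A sketch of a renderer that follows the stated indentation and 5-entry cap. Alphabetical ordering of keys and of fields within a list entry is inferred from the example block and is an assumption; the function name is hypothetical.

```python
def render_observed(state: dict, truncated: bool = False, cap: int = 5) -> list:
    """Render observed_state as summary lines (explanatory output only)."""
    lines = ["    observed:"]  # header at 4-space indent
    for key in sorted(state):
        value = state[key]
        if isinstance(value, list):
            lines.append(f"      {key}:")
            for entry in value[:cap]:
                if isinstance(entry, dict):
                    text = ", ".join(f"{k}={entry[k]}" for k in sorted(entry))
                else:
                    text = str(entry)
                lines.append(f"        - {text}")  # list entries at 8-space indent
            if len(value) > cap:
                lines.append(f"        (+{len(value) - cap} more)")
        else:
            lines.append(f"      {key}: {value}")  # key/value at 6-space indent
    if truncated:
        lines.append("      (observed_state truncated; full payload in results.json)")
    return lines
```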

Authority boundary unchanged:

results.json observed_state field = authoritative structured failure reason
results.summary.txt observed: block = explanatory rendering only

For the per-type observed_state schema (which keys are required for each invariant type) see docs/topology-schema-v1.5.md §4.

🔟 Scenarios (Failure Choreography)

Scenarios define ordered fault injection sequences.

Example:

scenarios:
  - id: failover
    steps:

      - run: r1_to_r2

      - fault:
          link_down:
            endpoints: ["r1:eth1", "r2:eth1"]

      - wait_for_bgp:
          node: r1
          timeout: 30

      - run: r1_to_r2

Step Types

Currently implemented scenario step types:

run
fault
wait
wait_for
wait_for_bgp
pcap_start
pcap_stop

`wait` (explicit elapsed-time pause)

Canonical form:

- wait:
    seconds: 5

Rules:

payload must be a mapping
payload must contain exactly one field: seconds
seconds must be a positive integer
scalar form such as - wait: 5 is invalid
extra keys are invalid
wait executes only as an explicit elapsed-time pause
wait does not prove readiness, BGP convergence, reachability, or service health

Use wait_for or wait_for_bgp when you want condition-based convergence checks.

No implicit retries. Timeout = failure.
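
The wait payload rules can be checked with a few lines. This is a sketch of the stated rules only; the function name and error messages are hypothetical.

```python
def validate_wait_payload(payload) -> int:
    """Return the validated pause in seconds, or raise ValueError."""
    if not isinstance(payload, dict):
        raise ValueError("wait payload must be a mapping (scalar form is invalid)")
    if set(payload) != {"seconds"}:
        raise ValueError("wait payload must contain exactly one field: seconds")
    seconds = payload["seconds"]
    # bool is an int subclass in Python, so reject it explicitly.
    if isinstance(seconds, bool) or not isinstance(seconds, int) or seconds <= 0:
        raise ValueError("seconds must be a positive integer")
    return seconds
```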

`wait_for` (condition-based convergence)

Polls a deterministic predicate until satisfied or timeout. Records a scenario step verdict (no test verdict).

Required keys (every wait_for step):

type — one of the nine condition types listed below
from — source node name
expect — pass or fail
timeout — int (seconds)
interval_s — number (polling cadence)

Optional: per_attempt_timeout_s.

Accepted condition types:

ping — ICMP from from to to (node name or IPv4 literal)
tcp — TCP from from to to:port
route_prefix — prefix (CIDR) present in RIB on from
bgp_session_up — BGP session to dst (IPv4 neighbor) reaches Established
route_present — prefix (CIDR) present in BGP RIB on from
route_advertised_to — prefix (CIDR) advertised toward peer (node name)
evpn_bgp_session_up — EVPN BGP session to peer (node name) reaches Established
evpn_vni_route_present — EVPN type-2/3 route present for vni (integer)
evpn_mac_route_present — EVPN type-2 route for mac + vni is present

Per-type parameter requirements: see docs/topology-schema-v1.md §6 (### wait_for) and docs/topology-schema-v1.5.md §2.

Notes:

A successful wait_for step does not count as a passing test. Verdicts come only from items in tests:.
expect: fail inverts the convergence semantics (succeeds if the condition does not become satisfied within timeout).
wait_for: bgp_session_up is a single-neighbor check (explicit dst); wait_for_bgp is a coarse all-neighbors-of-one-node readiness check. Both remain available.

Prefer wait_for with an invariant condition over fixed wait: { seconds: N } for convergence purposes.
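
A sketch of a pre-flight check for the keys required on every wait_for step. It covers only the common keys and the accepted type and expect literals listed above; per-type parameters (to, port, prefix, and so on) are left to the schema documents. The helper name is hypothetical.

```python
WAIT_FOR_TYPES = frozenset({
    "ping", "tcp", "route_prefix", "bgp_session_up", "route_present",
    "route_advertised_to", "evpn_bgp_session_up",
    "evpn_vni_route_present", "evpn_mac_route_present",
})
REQUIRED_KEYS = ("type", "from", "expect", "timeout", "interval_s")


def wait_for_problems(step: dict) -> list:
    """Return a list of human-readable problems; empty means the common keys look fine."""
    problems = [f"missing required key: {k}" for k in REQUIRED_KEYS if k not in step]
    if "type" in step and step["type"] not in WAIT_FOR_TYPES:
        problems.append(f"unknown type: {step['type']}")
    if "expect" in step and step["expect"] not in ("pass", "fail"):
        problems.append(f"invalid expect: {step['expect']}")
    return problems
```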

Grey Failures (Deterministic Degradation)

Grey failures are scenario-only capabilities, not standalone CLI commands.

Scenarios can model partial network degradation, not only full outages.

Supported grey-failure actions:

packet_loss
latency
bandwidth_cap
prefix_blackhole

These actions are:

deterministic
explicit
replay-stable
recorded in results.json

Grey failures affect the network condition, not the verdict logic.

Verdicts still come from the test results that run after the fault step.

Example: Packet Loss

scenarios:
  - id: loss5_ping_still_passes
    steps:
      - fault:
          packet_loss:
            node: h1
            if: eth1
            loss: 5

      - run: h1_to_fw1_ping

Meaning:

Apply 5% packet loss on h1:eth1, then run the declared test.

Example: Latency

scenarios:
  - id: delayed_path
    steps:
      - fault:
          latency:
            node: h1
            if: eth1
            latency_ms: 100

      - run: app_check

Example: Bandwidth Cap

scenarios:
  - id: slow_link
    steps:
      - fault:
          bandwidth_cap:
            node: h1
            if: eth1
            bandwidth_mbps: 10

      - run: transfer_check

Example: Prefix Blackhole

scenarios:
  - id: blackhole_prefix
    steps:
      - fault:
          prefix_blackhole:
            node: r1
            prefix: 192.168.50.0/24

      - run: reachability_check

Target Forms

Grey failures support two target styles.

Interface target

fault:
  packet_loss:
    node: h1
    if: eth1
    loss: 5

Link target

Useful when you want to degrade both ends of a declared link.

fault:
  packet_loss:
    a: r1
    b: r2
    a_if: eth1
    b_if: eth1
    loss: 5

If multiple links exist between the same nodes, explicit interfaces are required.

Parameter Rules

packet_loss

loss or loss_percent
integer
valid range: 0..100

latency

latency_ms
integer
must be >= 0

bandwidth_cap

bandwidth_mbps
integer
must be >= 1

prefix_blackhole

node
prefix

Invalid values fail fast with exit code 2.
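
The parameter rules above can be sketched as a single check. Treating loss and loss_percent as mutually exclusive aliases is an assumption made for this sketch; the function names are hypothetical.

```python
def _bounded_int(value, low, high=None) -> bool:
    # bool is an int subclass in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return value >= low and (high is None or value <= high)


def grey_failure_is_valid(action: str, params: dict) -> bool:
    """Check grey-failure parameters against the documented ranges."""
    if action == "packet_loss":
        given = [k for k in ("loss", "loss_percent") if k in params]
        # Assumption: exactly one of the two spellings is supplied.
        return len(given) == 1 and _bounded_int(params[given[0]], 0, 100)
    if action == "latency":
        return _bounded_int(params.get("latency_ms"), 0)
    if action == "bandwidth_cap":
        return _bounded_int(params.get("bandwidth_mbps"), 1)
    if action == "prefix_blackhole":
        return "node" in params and "prefix" in params
    return False  # not one of the four supported actions
```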

How to Run

cassian test topologies/fixtures/grey_failure_direct_pass.yaml --scenario loss5_ping_still_passes

Replay deterministically:

cassian replay labs/clab-grey-failure-direct-pass --gate --verify-results

Artifact Evidence

Grey failures are recorded in results.json as scenario_fault events.

Example shape:

{
  "type": "scenario_fault",
  "scenario_id": "loss5_ping_still_passes",
  "step": 1,
  "meta": {
    "action": "packet_loss",
    "loss_percent": 5,
    "target": "h1:eth1"
  }
}

This provides deterministic evidence that the degradation was applied before the test step ran.
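
A sketch of pulling the fault evidence for one scenario out of a loaded results.json. It assumes fault events sit in results["events"] with the fields shown in the example shape; the helper name is hypothetical.

```python
def scenario_faults(results: dict, scenario_id: str) -> list:
    """Return scenario_fault events for one scenario, ordered by step."""
    faults = [
        event for event in results.get("events", [])
        if event.get("type") == "scenario_fault"
        and event.get("scenario_id") == scenario_id
    ]
    return sorted(faults, key=lambda event: event["step"])
```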

1️⃣1️⃣ Candidate Configuration (Gate Only)

Apply candidate changes during validation.

cassian test <topology.yaml> \
  --candidate-config <dir>

Directory layout:

<dir>/
  <node-name>/
    <config-files>

Currently proven supported examples:

<dir>/
  frr/<node>.conf
  nft/<node>.nft

Rules:

full replacement
no merge
atomic apply
failure aborts gate
candidate config is non-authoritative input only
verdicts still come only from tests / scenarios / invariants

Important current boundary for vendor NOS VM nodes:

sonic-vm and other vendor NOS VM nodes are not currently a supported candidate-config surface
unsupported or undefined NOS VM candidate-config input is rejected explicitly
such input is currently treated as misuse (invalid candidate-config surface) and fails with exit code 2

Example of current unsupported behavior:

cassian test topologies/vendor_nos_smoke.yaml \
  --candidate-config tests/fixtures/vendor-nos-cand-neg-unsupported

Expected outcome:

ERROR: Candidate config directory structure invalid: <dir>
exit code: 2

Meaning: this candidate-config surface is unsupported or malformed for the current command/topology.

Support boundary:

supported current surfaces: generated FRR and nft-fw candidate files only
unsupported current surfaces: vendor NOS / sonic-vm candidate-config input

Scope boundary:

candidate config support is currently proven only for the existing supported candidate-apply surfaces
this does not currently establish candidate-config support for sonic-vm or other vendor NOS VM node types
any future NOS VM candidate-config support requires an explicit contract surface and proof

1️⃣2️⃣ Status Command

Inspect running labs.

cassian status <lab>

Useful options:

--summary
--interfaces
--bgp
--bgp-verbose
--routes
--routes-verbose
--json
--strict

Example:

cassian status demo-lab --summary

1️⃣3️⃣ Cleanup & Lab Management

Destroy a running lab:

cassian down <lab>

Clean up abandoned labs:

cassian cleanup --all
cassian cleanup --all --yes

Dry-run occurs unless --yes is provided.

Example cleanup flow:

cassian down <lab>
cassian cleanup --all --yes

Meaning:

cassian down <lab> tears down the named lab
cassian cleanup --all --yes removes any remaining Cassian Gate-owned labs discovered by the cleanup plan
cleanup stays explicit because dry-run remains the default without --yes

1️⃣4️⃣ DevOps Integration

Generate adapter artifacts.

Terraform

cassian adapt terraform \
  --plan plan.json

Input:

terraform show -json

Ansible

cassian adapt ansible \
  --dir rendered_configs/

Adapters are advisory only.

1️⃣5️⃣ AI Assistance (Optional)

AI is assistive only.

It never affects:

execution
verdicts
exit codes

AI Advisory (Optional, Non-Authoritative)

cassian ai --lab <lab-name> "<question>"
cassian ai --artifacts <path> "<question>"
cassian ai "<question>"

Purpose:

Provides advisory explanations and guidance based on artifacts
Helps interpret failures, coverage gaps, missing tests/scenarios, and control-plane intent

Authority:

advisory only
does not affect verdicts, exit codes, execution, or artifacts

Input:

topology.resolved.yaml
results.json

Unified AI Assistance

Use the same conversational entrypoint for failure explanation, coverage review, topology review, scenario interpretation, invariant explanation, and blast-radius explanation.

Common human path

cassian ai "why did this fail"

Uses the most recent valid artifact context when available.

Explicit lab path

cassian ai --lab <lab> "why did this fail"

Uses the specified lab when it contains the required artifacts.

Explicit artifacts path

cassian ai --artifacts <dir> "why did this fail"

This is the most explicit override and is useful for proof/debug workflows.

Optional online-enriched rendering

Enable online-enriched advisory rendering explicitly:

cassian ai --online "why did this fail"
cassian ai --lab <lab> --online "why did this fail"
cassian ai --artifacts <dir> --online "why did this fail"

Rules:

online-enriched rendering is explicit opt-in only
local advisory rendering remains the baseline behavior
online rendering does not change authority, verdicts, or execution behavior
unavailable online rendering should be treated as a non-authoritative advisory-path failure, not a change in execution authority

Rendering modes

cassian ai may indicate whether local or online-enriched advisory rendering was used.

Both remain advisory-only.

Context selection

When possible, prefer explicit artifact or lab selection for clarity.

Required artifacts include:

topology.resolved.yaml
results.json

If the required artifacts are missing, the advisory path should not be treated as available.

Important boundary

cassian ai:

reads artifacts only
does not execute lifecycle actions
does not modify topology, tests, scenarios, or configs
does not affect verdicts
remains advisory-only

AI Output Structure

cassian ai is intended to present grounded, advisory explanations based on artifacts.

Typical output includes:

Summary
Grounded evidence
Advisory interpretation
Recommended next steps
Optional draft suggestions

Treat the exact wording and formatting as supporting guidance rather than release-surface authority.

Draft Format (Copy-Paste Ready)

Drafts are structured and labeled:

Draft 1 — <type>
-----
<content>
-----

Common draft types:

topology guidance
test block
scenario block
firewall-side fix
test-side fix

Example:

Draft 1 — test block
-----
tests:
  - name: h1_to_h2_ping_should_pass
    kind: ping
    src: h1
    dst: h2
    expect: pass
-----

Notes:

drafts are safe to copy/paste
drafts are non-authoritative
drafts require human review

Supported Question Styles (Flexible)

cassian ai supports multiple phrasings for the same intent.

Scenario Questions

"what scenario am I missing"
"what scenario should I add"
"how would you test failover here"

Invariant Questions

"what invariant would help here"
"what invariant should I add first"

Coverage / Validation

"what tests should I add next"
"give me a concrete validation plan"

Failure Analysis

"why did this fail"
"what should I change first"
"what should I prove first"

Topology Improvement

"how would you improve this topology"
"provide an improved topology"

Behavior:

different phrasings can still target the same advisory intent
AI remains advisory-only regardless of phrasing

Local vs Online Rendering

Local (default)

cassian ai --lab <lab> "<question>"

deterministic, built-in reasoning
no external dependency
always available

Online (optional)

cassian ai --lab <lab> --online "<question>"

Requirements depend on the configured online AI path.

Behavior:

online rendering is optional
it may provide richer explanations or phrasing
it does not change verdicts, artifacts, or execution

How to Use AI Effectively

Best practice flow:

Run deterministic gate:

cassian test <topology.yaml>

If failure:

cassian ai --lab <lab> "why did this fail"

Improve coverage:

cassian ai --lab <lab> "what should I prove first"

Expand validation:

cassian ai --lab <lab> "give me a concrete validation plan"

Key Insight

passing tests ≠ proven design
AI helps identify missing positive proofs, missing failure scenarios, and missing control-plane invariants

AI Guardrails

AI is never authoritative
AI cannot run commands, modify topology, change configs, or alter results
AI output must always be human-reviewed and explicitly applied

Verification Behavior

AI verification details belong to the implementation and verification surfaces. For operator use, keep the important boundary clear: AI remains optional and advisory-only.

Example: AI Identifies Missing Invariant

AI may suggest:

Draft 1 — test block
-----
tests:
  - name: fw1_advertises_192.168.2.0_24_to_r2
    kind: invariant
    type: route_advertised_to
    node: fw1
    peer: r2
    prefix: 192.168.2.0/24
    expect: pass
-----

Meaning:

you are not proving control-plane correctness yet
add route-level proof before expanding scenarios

1️⃣6️⃣ Artifacts

Artifacts are typically written under:

labs/clab-<lab-name>/

Interpret them using the authority boundary already established in the project:

topology.resolved.yaml is generated execution input
results.json is the authoritative verdict artifact
results.summary.txt is explanatory only

Key files:

topology.resolved.yaml
results.json
results.summary.txt
artifacts/
artifacts/blast-radius/blast_radius.json

results.json

results.json is the authoritative verdict artifact.

It explicitly records declared validation items that executed, and when materially relevant, declared validation items that were blocked after entering authoritative execution scope.

Important boundary:

omission does not mean success
a blocked declared item should appear explicitly in results.json
failed-invariant records carry a structured observed_state payload — see §9 "Failed-Invariant Observed State" and docs/topology-schema-v1.5.md §4 for the schema
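
Because omission does not mean success, a reviewer may want to confirm that every declared test name appears in results.json. A minimal sketch, assuming test records live in results["tests"] and carry a name field; the helper name is hypothetical.

```python
def unreported_items(declared_names, results: dict) -> list:
    """Return declared test names that have no record in results['tests']."""
    reported = {record.get("name") for record in results.get("tests", [])}
    return sorted(name for name in declared_names if name not in reported)
```

An empty list means every declared name has an explicit record; it says nothing about the verdicts themselves.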

topology.resolved.yaml

Contains the fully expanded deterministic model used for execution.

Includes:

resolved defaults
auto IP assignments
normalized topology
explicit invariant expansion from declared packs
additive EVPN-resolved fields when EVPN runtime substrate is used

Structured State Diff (Advisory Only)

Cassian Gate can produce a structured pre/post operational state diff when state capture is explicitly enabled for both phases.

This artifact is:

advisory only
non-authoritative
deterministic
generated only from the explicitly captured state

It does not:

change verdicts
change exit codes
replace results.json
score differences as good or bad

How it works

When enabled, Cassian Gate captures the declared command/profile state:

once before tests (pre)
once after tests (post)

It then compares those two captured state sets and writes a structured diff artifact.

This is a diff between:

pre-state captured command output
post-state captured command output

for the same run.

It is not a diff between:

two different runs
two different topologies
baseline vs candidate config directories
intended config vs actual config

Command Example

cassian test topologies/three-frr-two-hosts-fw-routed.yaml \
  --state-capture both \
  --state-profile linux-net-basic \
  --state-profile frr-interfaces-basic \
  --state-profile frr-routing-basic \
  --state-profile nft-ruleset-basic

Phase 1a expanded the built-in FRR profile set (now: frr-routing-basic, frr-bgp-basic, frr-ospf-basic, frr-interfaces-basic, frr-comprehensive) and switched FRR probes to JSON form (vtysh -c "show ... json") with Linux iproute2 primitives for the interfaces profile. See docs/cli-reference-v1.md for the full --state-capture / --state-profile flag reference and per-profile descriptions.

Artifact Path

labs/clab-<lab-name>/artifacts/state-diff/state_diff.json

What to inspect

Inspect the structured diff for the captured objects, changed elements, and supporting evidence relevant to your review.

Operator meaning

Use this artifact when you want to understand:

what operational state changed during the run
which captured command surfaces changed between pre and post
supporting evidence for review or explanation

Keep the authority boundary clear:

results.json = authoritative verdict surface
state_diff.json = supporting evidence only

Blast Radius (Advisory Only)

Cassian Gate can produce a blast radius artifact that shows:

what the executed tests/scenarios directly covered
what additional nodes/links are potentially affected based on deterministic topology connectivity

This artifact is:

advisory only
non-authoritative
deterministic
generated during Collect

It does not:

change verdicts
change exit codes
replace results.json
score severity or risk
infer live routing/runtime behavior

Artifact Path

labs/clab-<lab-name>/artifacts/blast-radius/blast_radius.json

Supporting `results.json` Surface

results.json may also include a clearly labeled non-authoritative supporting section:

blast_radius

This remains:

supporting evidence only
non-authoritative
not part of verdict logic

Keep the authority boundary clear:

results.json verdict fields = authoritative
results.json blast_radius section = supporting evidence only
artifacts/blast-radius/blast_radius.json = detailed advisory artifact

What it contains

Inspect the blast-radius artifact for the covered scope, potentially affected objects, and other supporting evidence relevant to your review.

Operator meaning

Use this artifact when you want to understand:

what your declared tests directly touched
what else is connected to that tested scope
where additional coverage may be useful

Example

cassian test topologies/blast_radius_ok.yaml

python -m json.tool \
  labs/clab-blast-radius-ok/artifacts/blast-radius/blast_radius.json

Important Boundary

Blast radius currently reflects:

resolved topology structure
declared coverage surfaces
deterministic conservative graph expansion

It does not currently prove:

live routing impact
actual traffic path usage
runtime failure propagation
business severity
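
To illustrate what a deterministic, conservative graph expansion over topology structure can look like, here is a one-hop sketch. It is not the engine's algorithm: the expansion depth, the function name, and the input shape (link endpoint lists in "node:iface" form, as in the links: examples) are all assumptions.

```python
def potentially_affected(links: list, covered: set) -> list:
    """One-hop expansion: nodes that share a link with a directly covered node."""
    affected = set()
    for endpoints in links:
        nodes = [endpoint.split(":", 1)[0] for endpoint in endpoints]
        if any(node in covered for node in nodes):
            affected.update(node for node in nodes if node not in covered)
    return sorted(affected)  # sorted output keeps the result deterministic
```

The sketch uses only declared links, which matches the stated boundary: it does not infer live routing or actual traffic paths.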

1️⃣7️⃣ Common Operator Tasks

Validate a topology:

cassian validate topology.yaml

Validate contrib content structurally:

cassian validate-contrib contrib/

Run validation gate:

cassian test topology.yaml

Note:

cassian test <topology.yaml> now requires at least one declared test or scenario
if you only want to prove deploy/provision smoke behavior, use exploration mode instead of gate mode

Validate invariant-pack compatibility:

cassian validate topologies/pack_local_compatibility_ok.yaml

Run invariant-pack gate proof:

cassian test topologies/pack_local_compatibility_ok.yaml

Validate invalid pack misuse handling:

cassian validate topologies/neg/pack_unknown_reference.yaml
cassian validate topologies/neg/pack_incompatible_contents.yaml

Replay a previous gate deterministically:

cassian replay labs/clab-<lab> --gate

Explore a lab interactively:

cassian run topology.yaml --keep
cassian status <lab>
cassian exec <lab> r1

Bring up EVPN runtime substrate:

`cassian up topologies/evpn_runtime_generation.yaml`

Run a routing attribute invariant proof:

cassian test topologies/bgp_localpref_equals.yaml

Run a route advertisement invariant proof:

cassian test topologies/route_advertised_to.yaml
cassian test topologies/route_not_advertised_to.yaml

Run an EVPN invariant proof:

cassian test topologies/evpn_mac_route_present.yaml

Replay an EVPN proof deterministically:

cassian replay labs/clab-evpn-mac-route-present --gate --verify-results

Clean up labs:

cassian cleanup --all --yes

Run scenario testing:

cassian test topology.yaml --all-scenarios

Run a specific scenario in exploration mode:

cassian run topology.yaml --scenario <scenario-id>

Run a scenario with an explicit wait step:

cassian test topologies/h2_wait_runtime_positive.yaml --scenario simple_wait_runtime
cassian run topologies/h2_wait_runtime_positive.yaml --scenario simple_wait_runtime

Run a grey-failure scenario:

cassian test topologies/fixtures/grey_failure_direct_pass.yaml --scenario loss5_ping_still_passes

Replay the same grey-failure scenario deterministically:

cassian replay labs/clab-grey-failure-direct-pass --gate --verify-results

Run a blast radius proof:

cassian test topologies/blast_radius_ok.yaml

Inspect blast radius output:

python -m json.tool \
  labs/clab-blast-radius-ok/artifacts/blast-radius/blast_radius.json

Inspect structured state diff output:

cassian test topologies/three-frr-two-hosts-fw-routed.yaml \
  --state-capture both \
  --state-profile frr-comprehensive \
  --state-profile linux-net-basic \
  --state-profile nft-ruleset-basic

python -m json.tool labs/clab-three-frr-two-hosts-fw-routed/artifacts/state-diff/state_diff.json

Inspect a blocked declared-item result:

cassian test topologies/neg/blocked_precheck_bgp_results.yaml
python -m json.tool labs/clab-blocked-precheck-bgp-results/results.json

Look for:

the declared test present in tests
observed: blocked
verdict: fail
summary counts reflecting the blocked item

Use AI to explain a failure from the most recent run:

cassian ai "why did this fail"

Use AI against a specific lab:

cassian ai --lab <lab> "what should I prove first"

Use AI to expand validation coverage:

cassian ai --lab <lab> "give me a concrete validation plan"

Use optional online-enriched AI rendering:

cassian ai --lab <lab> --online "why did this fail"

`cassian validate-contrib` — Structural validation for contrib content

Validate supported contrib content without running any lifecycle phases.

Command:

`cassian validate-contrib <path>`

Purpose:

checks contrib content structurally
rejects malformed or unsupported contrib layout
does not deploy anything
does not create lab artifacts
does not affect verdicts, replay, or authority

Important boundary:

validate-contrib is:

structural only
non-authoritative
explicit only

It does not:

run resolve → deploy → test lifecycle phases
produce PASS / FAIL validation verdicts
generate results.json
validate runtime behavior
score content quality
infer meaning or intent

Supported contrib surfaces are limited to the content types described in the current project documentation.

Typical behavior:

validates only the path you explicitly pass
checks for supported contrib layout and required structure
rejects unsupported or malformed contrib content

Examples:

cassian validate-contrib contrib/
cassian validate-contrib contrib/packs
cassian validate-contrib contrib/state-profiles
cassian validate-contrib contrib/topologies/first-run-proof

Typical exit semantics follow the standard structural-validation pattern:

accepted supported contrib content returns success
invalid or unsupported contrib content is rejected as a usage / contract error

1️⃣8️⃣ Exit Codes

Code	Meaning
0	PASS
1	Test failure
2	Usage / contract error

Examples:

invariant truth mismatch → 1
validation failure after declared proof ran → 1
unsupported EVPN topology shape → 2
invalid invariant declaration → 2
incompatible pack contents → 2
zero-assertion gate run (cassian test <topology.yaml> with no tests/scenarios) → 2
valid contrib validation (cassian validate-contrib contrib/) → 0
invalid contrib structure (cassian validate-contrib <broken-path>) → 2
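
For CI wrappers, the exit code table above can be captured in a small lookup. This is a hedged sketch: the three codes and their meanings come from the table, while the helper names `classify_exit` and `is_usage_error` are hypothetical and not part of the CLI.

```python
# Exit codes and meanings as listed in the table above.
EXIT_MEANINGS = {
    0: "PASS",
    1: "Test failure",
    2: "Usage / contract error",
}


def classify_exit(code: int) -> str:
    """Map a cassian exit code to its documented meaning."""
    # Codes outside the documented contract are reported, not guessed at.
    return EXIT_MEANINGS.get(code, f"Unexpected exit code {code}")


def is_usage_error(code: int) -> bool:
    """True when the command was rejected as misuse rather than failing a proof."""
    return code == 2
```

Separating "the proof failed" (1) from "the invocation was rejected" (2) lets a pipeline route the two cases to different owners.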

Misuse / usage / contract error example:

cassian test does-not-exist.yaml

Typical outcome:

the command is rejected before validation runs
the failure is treated as a usage / contract error

Meaning:

this is misuse / invalid invocation
validation did not run
exit code remains 2

Validation failure example:

`RESULT: FAIL (validation)`

Meaning:

the system ran validation correctly
the declared proof failed
exit code remains 1

A FAIL can also mean a declared validation item was blocked after authoritative execution began.

In that case, inspect results.json.

Typical blocked-result shape:

{
  "name": "<declared-check-name>",
  "observed": "blocked",
  "verdict": "fail",
  "error": "blocked before execution"
}
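
The blocked-result shape above can be located programmatically. The sketch below assumes that declared items appear in a top-level `tests` list in results.json and carry the `name`, `observed`, and `verdict` fields shown above; the helper names are hypothetical.

```python
import json


def blocked_items(results: dict) -> list:
    """Return declared items recorded as blocked with a fail verdict."""
    # Assumption: declared items live under a top-level "tests" list.
    return [
        item
        for item in results.get("tests", [])
        if item.get("observed") == "blocked" and item.get("verdict") == "fail"
    ]


def load_blocked_items(path: str) -> list:
    """Read a results.json file from disk and return its blocked items."""
    with open(path, encoding="utf-8") as handle:
        return blocked_items(json.load(handle))
```

An empty list for a failing run suggests that the FAIL came from a proof that actually executed rather than from a blocked item.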

These UX clarifications do not change:

lifecycle order
authority model
verdict semantics
artifact schema
replay authority
deterministic execution
exit code contract

1️⃣9️⃣ First 10 Minutes

Recommended onboarding workflow:

cassian doctor
cassian validate topology.yaml
cassian test topology.yaml

Note:

cassian test <topology.yaml> now requires at least one declared test or scenario
if you only want to prove deploy/provision smoke behavior, use exploration mode (cassian run) instead of gate mode
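
The onboarding workflow above can be scripted so that it stops at the first failing step. This is a minimal sketch under two assumptions: that each command signals failure with a non-zero exit code, and that `topology.yaml` is a placeholder for your own file. The `run_steps` helper and its `runner` parameter are hypothetical conveniences that allow the sequence to be exercised without the CLI installed.

```python
import subprocess

# The three onboarding commands, in the recommended order.
ONBOARDING_STEPS = [
    ["cassian", "doctor"],
    ["cassian", "validate", "topology.yaml"],
    ["cassian", "test", "topology.yaml"],
]


def run_steps(steps, runner=None):
    """Run steps in order; return (failed_step, exit_code) or (None, 0)."""
    if runner is None:
        # Default runner executes the real command and returns its exit code.
        def runner(argv):
            return subprocess.run(argv).returncode

    for argv in steps:
        code = runner(argv)
        if code != 0:
            # Stop at the first failure; later steps are not attempted.
            return argv, code
    return None, 0
```

Calling `run_steps(ONBOARDING_STEPS)` on a machine with the CLI installed runs the real commands; passing a fake `runner` is enough for a dry run.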

For exploration:

cassian run topology.yaml --keep
cassian status <lab>

For EVPN runtime + proof:

cassian validate topologies/evpn_runtime_generation.yaml
cassian test topologies/evpn_mac_route_present.yaml

For AI-assisted explanation after a gate run:

cassian ai "why did this fail"

End of Cassian Gate v79 Operator Cheat Sheet
