Modeling the network as a graph

In order to generalize the issue(s?) at play here, we will semi-formalize our model in a graph-theory oriented framework. This initial formulation will be revisited and revised in subsequent sections.

The reference topology

The two figures below (also present in the parent page) illustrate a network with four network security zones (0, 0.1, 0.2, 0.3, and 0.4), and four “sites” (0, 0.1, 0.2, and 0.3). Site “0” represents a wide-area-network, and sites 0.1-0.3 represent separate metropolitan areas. Zone “0” represents a “transit zone” for zones 0.1-0.3 (which represent separate workload-hosting zones).

This topology is depicted with isometric figures in order to help visualize the intersections of sites and security-zones . In the first figure, “horizontal planes” contain all the nodes with a shared “site-ID” value, and “vertical columns” contain all the nodes with a shared “security-zone” value. (Horizontal and vertical are reversed in the second figure)

These two figures illustrate visually that “site” and “security zone” function similarly in the topology we’re analyzing. We see similar visual patterns when “zones” and “sites” are inverted in the two graphs. The differences that we observe include:

Zone-0 differs from the other three zones
- No stateful nodes exist in zone-0
- Only stateful nodes connect to nodes in zone-0
Site-0 differs from the other three sites
- No stateful nodes exist in site-0
Zone-0 differs from Site-0
- Nodes in zone-0 form a tree with the zone-0/site-0 node at its root
- There are no connections between two nodes in site-0

Up to this point we have left “site” and “zone” very loosely defined. We will formalize them in subsequent sections, but for now, we will integrate them to the graph as properties of the edges and nodes.

Where are all the endpoints/workloads/hosts?

At this point, we’ll finally acknowledge a non-trivial simplification that has been implicit in all of the topologies previously discussed. We have not included any “workload” (aka “non-forwarding”; aka “end-point”, aka “host’ etc.. ) nodes in our topologies. The only nodes that we’ve depicted have been “forwarding” (aka “transport”) nodes. The motivation for this omission is a straight-forward desire to reduce visual clutter in the diagrams and conceptual clutter in the model itself.

In the “actual” networks that these topologies represent, those non-depicted (non-forwarding, “workload”) nodes on the network/graph participate in subnets that are originated into BGP by the nodes that are represented on the topology-diagram and graph. When we begin discussing BGP advertisements and propagation, we will use a unique label for each node as a stand-in for all networks that the BGP router that it represents would originate in an “actual” network.

Modeling the network topology as a graph

Graph theory deals with “things” (referred to as “nodes”) and the “relationships” (referred to as “edges” or “verticies”) between those things (the nodes/vertices). Graphs are most intuitively represented visually. Typically (and more importantly, in this paper) “nodes” are depicted as circles (any shape, or none is perfectly acceptably – but we’ll use circles) on a two-dimensional plane. “Edges” (aka vertices) are depicted by line-segments or arcs.

While establishing the graph to model our network topology, we will use an undirected graph. In an undirected graph, the edges are “directionless”. Put another way, an edge signifies a relationship between an unordered pair of edges. Put more explicitly, an edge in an undirected graph that terminates on nodes “a” and “b” connotates that:

A is related to B in exactly the same way that B is related to A.

Elements of our reference topology graph

Nodes

The routers/transport-nodes/BGP-speakers on our network are represented as “nodes” on the graph (circles, in a two-dimensional figure). Each node on our graph represents an IP router in the real-world/enterprise networks that we are modeling.

As dicussed in a previous section, these nodes are “shorthand” for both themselves and any additional nodes that the depicted node provides the only connectivity to. For example, if an L3 switch has 50 servers attached to it, on 5 different VLANs/subnets – and those 50 servers (and those five subnets) are only connected to that L3 switch – we only depict a single node in our graph (and the reference topology diagram from earlier.)

Edges

The edges on our graph (the line segments and arcs that terminate on nodes) represent a few things at once. At the highest level, they represent a “connection” between nodes. In the graph that we’re using, these connections are representative of:

An eBGGP session between two nodes
A usable data-plane path for messages between two nodes

The second item above (the data-plane path between two nodes) is particularly important with regards to the “site” concept, as discussed below.

Properties

We define multiple properties of both nodes and edges as part of modeling our network reference topology.

Node properties
- “Statefulness” is a boolean value that describes whether or not a node tracks/enforces connection state for bi-directional traffic flows between two nodes. Put more plainly: “whether the node will refuse to forward traffic if it does not “see” both flows in a bidirectional flow-pair between two other nodes.
  - This will be important later, when we start building paths through the graph
  - We depict statefulness in the visual graph by depicting the node with a dashed perimeter (non-stateful nodes have a solid perimeter)
- “Security-zone” is a label/tag used to identify nodes which may communicate with each other without passing through a “stateful” node.
  - That is to say, that any path between two nodes in different security zones (with different zone-ID values/colors) must include at least one stateful node
  - Conversely, there must be a path between nodes in the same security zone that does *not traverse any stateful nodes
  - We depict the security-zone of a node by coloring the node’s perimeter (each unique color is a distinct security zone)
Edge properties
- “Site” is a property of the edges on the graph expressed as a numerical value.
  - We will denote edges representing links between nodes in the same site with a label ending in “.0”
  - All edges between nodes in the same site will have will have identical labels
    - We will denote edges representing links between nodes in different sites with a label ending in “.n” (where n is a non-zero integer)

Graphs of the topology

The figure below is a translation of the figure at the top of this page to a graph, using the conventions described in the preceding section.

The two figures below show the same graph, with the elements’ visual orentation shifted, but no changes to the underlying structure of the graph itself.

The preceding figures illustrate the following observation of our network/graph’s structure:

For every “non-zero” site value “n”, a stateful site n node is connected to a stateless site 0 node, and a stateless site n node is connected to a statefu site 0 node.

Generalization/Abstraction

Let’s revisit how “site”, “zone”, and “statefulness” properties related to each other and see if they can be further generalized:

Site (a property of edges):
- Will influence preferred paths through the graph.
  - Preference of a path is inversely proportional to the number of different “sites” in the path
    - E.g., a path that consists exclusively of edges with the same site value (ending in “.0”) will be preferred to a path with any leading to different sites (ending in “.n”)
Zone and statefulness (properties of nodes)
- Will influence permitted paths through the graph
  - With paths traversing “stateful” nodes requiring bidirectional symmetry

Put plainly, our nodes have “stateful” and “zone” properties, and the edges that connect our nodes have a “site” property. The way we want these properties to interact with regards to path-selection can be summarized as follows:

Any stateful node in the selected path from node “a” to node “b” must be in the selected path from nobe “b” to node “a”

The path including the smallest number of edges should be the selected path.

Rules of Inference

We use the following rules of inference in building our graph (network topology):

A node may be instantiated under the following conditions
- Each node is characterized as either “stateful” (dashed line) or “stateless” (solid line)
- Each node is characterized with a single security zone (a unique color per security zone on the graph)
Edges may be instantiated under the following conditions
- A path that does not include a stateful node must exist between all stateless nodes with the same security-zone property
- An edge between two different security zones must connect to stateful nodes
  - Unless the node has the a value of “0” for its security-zone property, in which case the node with zone property of “0” may be stateless
Edges are labeled using the following conventions
- Connections between edges in the same “site” end in “.0”
- Connections between edges in different “sites” end in “.n” where “n” is a non-zero integer.