Architecture
Key Concepts
Network Services
A Network Service is a collection of Connectivity, Security, and Observability features, at L3 and above, applied to the traffic of the workloads that individually connect to it.
Examples of Network Services would include:
- A simple distributed vL3 that allows the workloads to communicate via IP, optionally with DNS service for that vL3
- A Traditional Service Mesh like Istio, Linkerd, Consul, or Kuma running over a vL3. This allows specific workloads to be admitted to that Service Mesh, independent of where they run. It also allows a single workload to connect to multiple Traditional Service Meshes. This can allow a workload to connect both to its company’s Service Mesh and to the Service Meshes of its partners simultaneously.
More sophisticated features (an IPS, etc.) can be composed into Network Services to add additional Security and Observability features.
Clients
A Client in Network Service Mesh, sometimes also called a Network Service Client (NSC), is a workload that asks to be connected to a Network Service by name. A Client is independently authenticated (currently by SPIFFE ID) and must be authorized to attach to a Network Service.
For each Network Service to which a Client wishes to be connected, in addition to the name of that Network Service and the identity of the Client, an optional set of ‘labels’ (key-value pairs) may be provided. These ‘labels’ may be used by the Network Service for Endpoint selection, or by Endpoints themselves to influence how the Endpoint provides service to the Client.
A Client may be a:
- Pod
- VM
- Physical Server
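In the reference Kubernetes implementation, a Pod typically becomes a Client by carrying the networkservicemesh.io annotation, which names the requested Network Service and can carry source labels. The sketch below assumes that annotation syntax (mechanism, service name, interface name, labels as query parameters); treat the exact format as illustrative, since it varies between releases.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-client
  annotations:
    # Request the 'service-mesh' Network Service over a kernel interface named 'nsm-1',
    # passing the source labels app=foo and version=v1.1 (illustrative syntax).
    networkservicemesh.io: kernel://service-mesh/nsm-1?app=foo&version=v1.1
spec:
  containers:
    - name: app
      image: alpine:3.19
      command: ["sleep", "infinity"]
```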
vWires
That which connects a Client to an Endpoint is a vWire, or Virtual Wire.
The contract of a vWire is:
- A packet ingressing the vWire at the Client will egress at the Endpoint
- A packet ingressing the vWire at the Endpoint will egress the vWire at the Client
- Only packets that ingressed the vWire at the Client will egress at the Endpoint
- Only packets that ingressed the vWire at the Endpoint will egress the vWire at the Client
- An Endpoint may have multiple incoming vWires.
- A Client may have multiple outgoing vWires.
- Each vWire carries traffic for exactly one Network Service.
In short, a vWire acts like a virtual wire between the Client and the Endpoint.
It should be noted that a Client may request the same Network Service multiple times, and thus have multiple vWires that happen to connect it to a particular Endpoint.
Endpoints
An Endpoint in Network Service Mesh, sometimes called a Network Service Endpoint or NSE, is the ‘thing’ that provides the Network Service to the Client.
Network Service Mesh constructs a vWire between the Client and the Endpoint.
An Endpoint may be:
- a Pod running in the same K8s cluster
- a Pod running in a different K8s cluster
- a VM
- an aspect of the physical network
- anything else to which packets can be delivered for processing
Network Service API
Request
A vWire between a Client and a Network Service is created by the Client sending a ‘Request’ gRPC call to NSM.
Close
A vWire between a Client and a Network Service is formally closed by sending a ‘Close’ gRPC call to NSM.
Monitor
A vWire between a Client and a Network Service always has a finite expiration time. The Client may (and usually does) send new ‘Request’ messages to ‘refresh’ the vWire. If a vWire exceeds its expiration time without being refreshed, NSM cleans up the vWire.
A Client may use a ‘MonitorConnection’ streaming gRPC call to NSM to get updates on the status of a vWire it has to a Network Service.
Registries
As with any other Mesh, Network Service Mesh has Network Service Registries (NSRs) in which Network Services and Network Service Endpoints are registered.
Network Service Endpoint
A Network Service Endpoint (NSE or Endpoint) provides one or more Network Services. It registers with the registry a list of Network Services (by name) that it provides, and the ‘destination labels’ it is advertising for each Network Service.
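Purely as an illustration of the kind of record such a registration produces (not the exact registry schema), an Endpoint’s registration might carry information along these lines:

```yaml
# Illustrative only: the shape of an Endpoint registration, not a real schema.
name: vl3-nse-1
networkServices:
  - service-mesh            # Network Services (by name) this Endpoint provides
labels:
  service-mesh:
    service: vl3            # 'destination labels' advertised for that Network Service
url: tcp://10.244.1.7:5003  # where NSM can reach this Endpoint
```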
Network Service
A Network Service is identified by name and carries a payload type (either IP or Ethernet).
By default, if not specified, the payload is presumed to be IP. Network Services are registered with the Network Service Registry.
Optionally, a Network Service may specify a list of ‘matches’. These matches allow matching the ‘source labels’ a Client sends with its Request to ‘destination labels’ advertised by the Endpoint when it registers as providing the Network Service.
For example:
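The sketch below shows how such a Network Service might be written as the Kubernetes NetworkService custom resource used by the NSM reference implementation; the apiVersion and exact field layout are assumptions, and the labels simply illustrate the matching described below.

```yaml
apiVersion: networkservicemesh.io/v1
kind: NetworkService
metadata:
  name: service-mesh
spec:
  payload: IP
  matches:
    # Requests carrying the source label service=envoy-proxy are matched to
    # Endpoints advertising the destination label service=vl3.
    - source_selector:
        service: envoy-proxy
      routes:
        - destination_selector:
            service: vl3
    # Catch-all match (no source_selector): all other Requests are matched to
    # Endpoints advertising the destination label service=envoy-proxy.
    - routes:
        - destination_selector:
            service: envoy-proxy
```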
If a Client provided no ‘source labels’ with its Request for the ‘service-mesh’ Network Service, it would not match the ‘service: envoy-proxy’ source_selector of the first match, and so would fall through to the final ‘catch all’ match with no source_selector, and be matched to an Endpoint that advertised the Network Service named ‘service-mesh’ with the ‘destination label’ ‘service: envoy-proxy’.
If a Client provided a ‘source label’ of ‘service: envoy-proxy’, it would match the first match and be matched to an Endpoint that advertised the Network Service named ‘service-mesh’ with the ‘destination label’ ‘service: vl3’.
Registry Domains
Network Service Mesh allows multiple independent mutually ignorant Registry Domains.
The Network Service Registry Domain of a Network Service is indicated by suffixing an ‘@domain’ to the Network Service Name. So, for example, a Network Service named ‘service-mesh’ in the ‘finance.example.com’ domain would be ‘service-mesh@finance.example.com’.
The reference implementation of Network Service Mesh locates the Registry Server for a Registry Domain by looking up an SRV record for the name of the domain. This is not the only permissible way to do it; it is one example that permits scaling to ‘internet scale’.
Inter-domain
A Client may request a Network Service from any Network Service Registry Domain independent of where it is running. Whether the lookup from the Registry for that Registry Domain is permitted, and whether the Client is permitted to connect to that Network Service is a matter of policy, not a matter of where the Client is running.
An Endpoint may register as providing a Network Service in any Network Service Registry Domain independent of where it is running. Whether the Endpoint is permitted to register in that Registry Domain is a matter of policy, not a matter of where the Endpoint is running.
Floating Inter-domain
A Network Service Registry Domain need not be associated directly with any Runtime Domain. It may be a purely logical Registry, with Clients and Endpoints running across many different Runtime Domains that the Registry Domain has no direct association with.
When run in this mode, it is referred to as a ‘floating’ Registry Domain.
Advanced Features
Network Service Mesh’s ‘match’ process for selecting candidate Endpoints to provide a Network Service can be used to implement a variety of advanced features:
- Composition
- Selective Composition
- Topologically Aware Endpoint Selection
- Topologically Aware Scale from Zero
Composition
Sometimes a Network Service is provided by a graph of Endpoints composed together to serve the Client. For example, when providing a Traditional Service Mesh as a Network Service, it is likely simplest to compose an Envoy Proxy (managed by an Istio or Kuma control plane) with a vL3 (providing a virtual L3 domain).
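A sketch of how that composition can be expressed through matches follows; this is essentially the ‘service-mesh’ definition sketched earlier, annotated to show how the chain forms (the labels and layout are illustrative):

```yaml
matches:
  # Second segment: the Envoy Proxy Endpoint, requesting onward with the
  # source label service=envoy-proxy, is matched to a vL3 Endpoint.
  - source_selector:
      service: envoy-proxy
    routes:
      - destination_selector:
          service: vl3
  # First segment: Clients with no matching source labels fall through to this
  # catch-all match and are connected to an Envoy Proxy Endpoint.
  - routes:
      - destination_selector:
          service: envoy-proxy
```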
Please note: there is nothing magic about the choice of labels such as ‘service: …’. As with all labels, the choice is arbitrary; it’s the matching that matters.
Selective Composition
Sometimes it is desirable to have different Clients receive a different composition of Endpoints to provide a Network Service.
For example, imagine that a Client is version v1.1 of an app foo. It is known that v1.1 of app foo has a security vulnerability. There is a plan to remediate by upgrading foo to version v1.2, which contains the fix, but that upgrade is six weeks out. App foo needs to stay in deployment in the interim.
An expensive IPS can provide protection from the vulnerability. By keying off of labels provided by the Clients when they Request the Network Service, NSM can selectively interpose an IPS between all instances of foo v1.1 and the vL3 for the Network Service.
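A sketch of how those matches might look, using the labels discussed below (app, version, provides, provided); all of these names and the overall shape are illustrative, not a prescribed schema:

```yaml
matches:
  # foo v1.1 Clients are steered to an Endpoint that provides the IPS.
  - source_selector:
      app: foo
      version: v1.1
    routes:
      - destination_selector:
          provides: ips
  # The IPS Endpoint, requesting onward with the source label provided=ips,
  # is steered to the vL3 Endpoint.
  - source_selector:
      provided: ips
    routes:
      - destination_selector:
          provides: vl3
  # Everyone else goes directly to the vL3 Endpoint, with no IPS in the path.
  - routes:
      - destination_selector:
          provides: vl3
```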
All other workloads using the Network Service continue normally without the IPS.
Please note: there is nothing magic about the choice of labels such as:
- provides
- provided
- app
- version
As with all labels, the choice is arbitrary; it’s the matching that matters.
Topologically Aware Endpoint Selection
Topology for both Clients and Endpoints can be expressed by source or destination labels. Examples:
- nodeName
- clusterName
- zone
- cloudProvider
etc.
Network Service Mesh supports dynamic specification of destination labels based on the source labels in the Request.
In a destination_selector, ‘{{ .labelName }}’ will substitute in the value of labelName from the source labels of the Request. For example, ‘{{ .nodeName }}’ is substituted with the value of nodeName from the source labels.
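For example, a Network Service ‘local-vl3’ might be defined with a single match whose destination_selector templates nodeName from the Client’s source labels. This is a sketch; the apiVersion and field layout are assumed from the reference implementation.

```yaml
apiVersion: networkservicemesh.io/v1
kind: NetworkService
metadata:
  name: local-vl3
spec:
  payload: IP
  matches:
    # The Client's nodeName source label is substituted into the destination_selector,
    # so only Endpoints registered on the same Node are candidates.
    - routes:
        - destination_selector:
            nodeName: "{{ .nodeName }}"
```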
A match like this would cause each Client to be matched to an Endpoint on the same Node when it Requests ‘local-vl3@marketing.example.com’.
Topologically Aware Scale from Zero
Topologically Aware Endpoint Selection is very useful, but it requires an Endpoint matching the topology constraints to be available for every Client that might request the Network Service. This is expensive: imagine running an Endpoint on every Node in a 5000 Node Cluster. Fortunately, the NSM match mechanism can be used to enable Topologically Aware Scale from Zero.
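One possible shape for such a Network Service is sketched below, assuming a fallthrough flag on the first match and a hypothetical ‘supplier’ destination label for the second; the name and the details of the mechanism in any given release may differ.

```yaml
apiVersion: networkservicemesh.io/v1
kind: NetworkService
metadata:
  name: autoscale-vl3
spec:
  payload: IP
  matches:
    # Prefer an Endpoint already running on the Client's Node; if none matches,
    # fall through to the next match instead of failing.
    - fallthrough: true
      routes:
        - destination_selector:
            nodeName: "{{ .nodeName }}"
    # Otherwise route the Request to a Supplier Endpoint, which starts a new
    # Endpoint on the desired Node and returns an error to trigger reselection.
    - routes:
        - destination_selector:
            supplier: "true"
```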
If there is already an Endpoint running to provide the Network Service on the same Node as the Client, the first match selects it. If there is not, the second match sends the Request to a ‘Supplier’, which starts the Endpoint on the desired Node and then returns an error. The error triggers an attempt to ‘reselect’. The reselect finds the newly created Endpoint and connects the Client to it.