This thesis analyzes existing work on, and proposes and evaluates novel designs for, the logically-centralized, physically-distributed Software Defined Networking (SDN) control plane in industrial settings. Deploying SDN in industrial scenarios requires addressing control plane dependability. The impact of distributed controller operation on the resulting network performance is, however, under-investigated in the existing literature. A provably robust, logically-centralized industrial control plane design is necessary for successful adoption in production settings. We therefore postulate the feasibility of using a highly-available and resilient SDN controller solution as an enabler of future softwarized industrial networks. To this end, we analyze the availability, reliability, and response time properties of existing consensus-based solutions. To achieve low response times, we propose multiple enhancements to the handling of flexibly-consistent control state updates at scale. We furthermore define mechanisms for tolerating semantic faults in replicated controller state, independent of the root cause (e.g., a software or hardware bug, a malicious takeover, or diverged controller state). The proposed designs are validated analytically and empirically. To simplify the deployment of the resulting control plane, we propose a novel automated bootstrapping approach that omits any data plane dependencies, so as to isolate the control and data plane responsibilities and thereby ease verification and analysis of the system's correctness. In summary, this thesis achieves four goals:
Assessment and hardening of existing distributed SDN control plane designs: We provide analytical guarantees for the availability and response time metrics of state-of-the-art distributed SDN control plane proposals. Steady-state and transient analyses based on Stochastic Activity Networks (SANs) are used for the dependability and performance evaluation. We furthermore assess corner cases that impact the correctness of existing control plane designs. In particular, scenarios of leader oscillation and unsuccessful leader election were reproduced with existing SDN controllers. To alleviate such issues, we propose decoupling the underlying failure detection procedure from controller state consensus.
Design of a scalable fault-tolerant distributed control plane: We propose multiple designs for realizing a multi-controller SDN control plane whose operation is both Fail-Stop-tolerant and scalable. To this end, we introduce the notion of adaptive consistency, a state replication model that autonomously adapts to provide a sufficient degree of consistency for the hosted SDN applications, taking into account their worst-case divergence requirements.
Design of mechanisms for supporting reliable distributed control plane operation: To ensure correct handling of faults rooted in Byzantine events, we propose novel control mechanisms that guarantee a transparent system transition from a faulty to a stable state, even if some controller replicas compute unreliable outputs due to internal faults. The proposed control plane extensions optionally leverage programmable forwarding elements to minimize the footprint of controller instance replication.
Automated bootstrapping of a highly-available and reliable distributed control plane: We propose two novel bootstrapping schemes for initializing a complex distributed system comprising an arbitrary number of controller replicas. The in-band control plane is thus bootstrapped with availability guarantees, i.e., it is automatically protected against individual data plane and controller failures.
The majority of the designs proposed in this thesis were evaluated under the assumption of industrial network KPIs, i.e., they assume the typical topologies and parameter configurations of industrial networks. Nevertheless, the advantages of the introduced designs also apply to other domains, e.g., data-center and campus SDNs.