- At the level of VA Machines (K8S-VA), fail safety is provided by the Kubernetes processing servers. When a Kubernetes server crashes, its containers are migrated to available machines.
- At the level of BE Server Nodes (K8S-BE), fail safety is provided by the Kubernetes processing servers, by triple database replication, and by the fail safety of the individual BE servers. RAID protects against local disk failure. Additionally, some BE Server Node machines are responsible for load balancing for the data storage subsystem and the data processing subsystem. Fail safety and data integrity of stored data are provided by the distributed data storage system based on Ceph.
To ensure efficient data routing between all components of the IREX software and hardware complex, the network is built on a two-level Leaf-Spine topology. The topology is composed of leaf switches (to which servers and storage connect) and spine switches (to which leaf switches connect).
The Leaf level consists of switches to which servers connect.
Spine switches form the core of the architecture. Every leaf switch connects to every spine switch in the network fabric. Traffic paths are balanced so that the network load is evenly distributed among the spine switches. The failure of one spine switch therefore only slightly degrades network performance in the cluster, because the remaining spine switches continue to carry the traffic.
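The effect of balancing traffic across spine switches, and of losing one of them, can be illustrated with a minimal sketch. It is not the balancing mechanism actually used in the fabric: the hash-based flow placement, the switch names, the flow identifiers, and the choice of four spines are all assumptions made for illustration, and uplinks are assumed to have equal capacity.

```python
import zlib
from collections import Counter


def pick_spine(flow_id: str, spines: list[str]) -> str:
    """Map a flow to one of the live spine switches by hashing its identifier.

    Hash-based placement is an assumption of this sketch, not a statement
    about how the real fabric balances traffic.
    """
    return spines[zlib.crc32(flow_id.encode()) % len(spines)]


def spine_load(flows: list[str], spines: list[str]) -> Counter:
    """Count how many flows land on each spine switch."""
    return Counter(pick_spine(flow, spines) for flow in flows)


def remaining_capacity(total_spines: int, failed: int) -> float:
    """Fraction of leaf-to-spine capacity left, assuming equal-capacity uplinks."""
    if not 0 <= failed <= total_spines:
        raise ValueError("failed must be between 0 and total_spines")
    return (total_spines - failed) / total_spines


# Hypothetical fabric with four spines and 1000 synthetic flows.
spines = ["spine-1", "spine-2", "spine-3", "spine-4"]
flows = [f"10.0.0.{i}:5000->10.0.1.{i}:6000" for i in range(1000)]

print(spine_load(flows, spines))      # per-spine flow counts with all spines up
print(spine_load(flows, spines[:3]))  # the same flows after spine-4 fails
print(remaining_capacity(4, 1))       # 0.75
```

In this sketch every flow still finds a path after a spine fails, and the lost capacity is one spine's share: the more spine switches the fabric has, the smaller the degradation from a single failure.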
No matter which leaf switch a server is connected to, its traffic crosses the same number of devices every time it connects to another server. (The only exception is when the other server is on the same leaf.) This approach is efficient because it keeps latency uniform and predictable and minimizes bottlenecks.
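The constant hop count can be checked on a small model of the fabric. The sketch below builds a two-level leaf-spine graph and uses breadth-first search to count the switches on a shortest path between every pair of servers. The fabric size (2 spines, 4 leaves, 3 servers per leaf) and the node names are illustrative assumptions, not the actual deployment.

```python
from collections import deque
from itertools import combinations


def build_fabric(n_spines: int, n_leaves: int, servers_per_leaf: int) -> dict:
    """Return an adjacency map for a two-level leaf-spine fabric."""
    adj: dict[str, set[str]] = {}

    def link(a: str, b: str) -> None:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    spines = [f"spine-{s}" for s in range(n_spines)]
    for l in range(n_leaves):
        leaf = f"leaf-{l}"
        for spine in spines:          # every leaf connects to every spine
            link(leaf, spine)
        for k in range(servers_per_leaf):
            link(leaf, f"server-{l}-{k}")
    return adj


def switches_crossed(adj: dict, src: str, dst: str) -> int:
    """Number of switches on a shortest path between two distinct servers."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return dist[node] - 1     # links on the path minus one
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    raise ValueError("no path between the given nodes")


fabric = build_fabric(n_spines=2, n_leaves=4, servers_per_leaf=3)
servers = sorted(n for n in fabric if n.startswith("server-"))


def leaf_of(server: str) -> str:
    """'server-2-1' -> 'server-2', which identifies the leaf the server is on."""
    return server.rsplit("-", 1)[0]


same_leaf = {switches_crossed(fabric, a, b)
             for a, b in combinations(servers, 2) if leaf_of(a) == leaf_of(b)}
cross_leaf = {switches_crossed(fabric, a, b)
              for a, b in combinations(servers, 2) if leaf_of(a) != leaf_of(b)}

print(same_leaf, cross_leaf)  # {1} {3}
```

Servers on different leaves always cross three switches (leaf, spine, leaf), while servers on the same leaf cross only that leaf switch, which is the exception noted above.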