Build Datacenters like Cloud for the new Hybrid IT environments

April 12, 2023 | Srikanth Krishnamohan

Over the last few years, the cloud operating model has driven a migration of workloads from on-premises or private Data Centers (DC) to the public cloud. However, organizations are increasingly looking for a reverse gear, i.e., the flexibility to bring workloads back to their private DC when required, so that they keep control of agility and cost while still benefiting from the wide range of services offered by the public cloud.

To get the best of both the public and the private cloud, more and more companies are moving towards hybrid cloud deployments. The private cloud offers excellent control and security, while the public cloud offers expansive computing power. With a hybrid cloud architecture, enterprises and network operators can keep and manage critical data and resources on secure private servers and move them to public cloud servers when processing requirements call for it.

Fig 1. Hybrid Cloud model

According to the Gartner report “How to Evolve Your Physical Data Center to a Modern Operating Model”:

Enterprise operational and deployment models are reaching beyond public cloud to include hybrid cloud-based platforms for on-premises as-a-service offerings.

  1. By 2025, 40% of newly procured premises-based compute and storage will be consumed as a service, up from less than 10% in 2021.
  2. By 2025, 70% of organizations will implement structured infrastructure automation to deliver flexibility and efficiency, up from 20% in 2021.

Let us look at the common characteristics of a cloud-as-a-service model and how Arrcus solutions can help in building the as-a-Service data center infrastructure.

The Cloud or Hyperscale Model

Over a decade ago, the large cloud vendors started adopting a DIY approach to modernizing their Data Centers by assembling white-label building blocks of compute, storage, and networks and automating them for operational efficiency. They have taken the lead and shown how to innovate rapidly while gaining control of their networks and keeping costs under control.

Here are some of the guiding principles to build a Cloud-like DC:

  1. Hyperscale economics
  2. Horizontal and vertical scale
  3. Simplified programmable control plane
  4. Replaceable building blocks (Break from vendor lock-in)
  5. Security
  6. As-a-Service - Programmatic APIs and automation at scale
  7. Telemetry and intelligent monitoring

Hyperscale economics

Hyperscale economics can be defined as the ability to decouple the rate at which capacity grows to keep up with demand (in this case, data) from the rate at which cost grows. Achieving hyperscale could mean that either your network grows exponentially on a linear or flat cost base, or your network grows linearly while costs fall exponentially. In theory, hyperscale economics can be pursued regardless of the size of the organization.
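To make the first case concrete, here is a minimal Python sketch of capacity that doubles every year while total cost rises by a fixed step. All of the figures (starting capacity, growth factor, and costs) are illustrative assumptions and are not taken from any real deployment.

```python
def project(years, capacity0, growth, cost0, cost_step):
    """Return (year, capacity, total_cost, cost_per_unit) for each year.

    Capacity grows exponentially (capacity0 * growth**year) while
    total cost grows linearly (cost0 + cost_step * year).
    """
    rows = []
    for year in range(years):
        capacity = capacity0 * growth ** year
        total_cost = cost0 + cost_step * year
        rows.append((year, capacity, total_cost, total_cost / capacity))
    return rows


if __name__ == "__main__":
    # Illustrative numbers only: 100 capacity units doubling yearly,
    # cost starting at 1000 and rising by 100 per year.
    for year, capacity, total_cost, unit_cost in project(4, 100.0, 2.0, 1000.0, 100.0):
        print(f"year {year}: capacity={capacity:.0f} cost={total_cost:.0f} "
              f"cost/unit={unit_cost:.3f}")
```

The point of the sketch is the last column: as long as capacity outpaces cost, the cost per unit of capacity keeps falling.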

Scale

  • Horizontally scalable from 100s to 1000s of compute and storage nodes interconnected by a scalable and programmable network fabric
  • Vertically scalable tiers based on speeds and number of redundant paths
  • Typical 2-stage or 3-stage Leaf-Spine architectures handle both: horizontal scale by adding more leaf or spine switches, and vertical scale by adding a super-spine layer. The CLOS architecture has built-in redundancy through multiple equal-cost paths (ECMP).
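The scaling behavior of a 2-stage fabric can be sketched with a little arithmetic. The Python below assumes a simplified model that the text does not spell out: every leaf has exactly one uplink to every spine, and servers are single-homed to one leaf. The port counts and speeds in the example are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeafSpine:
    """Simplified 2-stage leaf-spine fabric (one uplink per leaf per spine)."""
    spines: int            # number of spine switches
    spine_ports: int       # leaf-facing ports on each spine
    leaf_downlinks: int    # server-facing ports on each leaf
    downlink_gbps: int     # speed of each server-facing port
    uplink_gbps: int       # speed of each leaf-to-spine link

    @property
    def max_leaves(self) -> int:
        # Each leaf consumes one port on every spine.
        return self.spine_ports

    @property
    def max_servers(self) -> int:
        return self.max_leaves * self.leaf_downlinks

    @property
    def ecmp_paths(self) -> int:
        # Between two different leaves there is one equal-cost path per spine.
        return self.spines

    @property
    def oversubscription(self) -> float:
        # Server-facing bandwidth divided by spine-facing bandwidth per leaf.
        return (self.leaf_downlinks * self.downlink_gbps) / (self.spines * self.uplink_gbps)


if __name__ == "__main__":
    fabric = LeafSpine(spines=4, spine_ports=32, leaf_downlinks=48,
                       downlink_gbps=25, uplink_gbps=100)
    print(fabric.max_servers, fabric.ecmp_paths, fabric.oversubscription)
```

In this model, adding leaves raises the server count, while adding spines raises the number of equal-cost paths and lowers the oversubscription ratio.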

Fig 2: Leaf Spine CLOS Fabric

Simplified programmable control plane

  • The modern DC architecture has been standardized on just two protocols:
  • BGP for underlay fabric IP connectivity
  • EVPN-based control plane for overlay fabric
  • Multi-tenant Ready: Network virtualization can be achieved using overlay technologies such as VXLAN for seamless workload mobility and network segmentation
  • With just two protocols, it is easier to automate and troubleshoot the network
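One reason a two-protocol design is easy to automate is that the whole fabric plan can be generated from a short list of names. The Python sketch below is an illustration under stated assumptions, not a description of any product's configuration: it assumes an eBGP underlay in which all spines share one private ASN and each leaf gets its own, and an overlay with one VXLAN VNI per tenant. The device names, tenant names, and base numbers are hypothetical.

```python
PRIVATE_ASN_END = 65534   # last usable 2-byte private ASN
VNI_MAX = 2 ** 24 - 1     # VXLAN VNIs are 24-bit values


def underlay_plan(spines, leaves, spine_asn=64512):
    """Return one eBGP session tuple per leaf-spine pair.

    Assumption: spines share `spine_asn`; leaf i uses spine_asn + 1 + i.
    Each tuple is (leaf, leaf_asn, spine, spine_asn).
    """
    leaf_asns = {leaf: spine_asn + 1 + i for i, leaf in enumerate(leaves)}
    if leaf_asns and max(leaf_asns.values()) > PRIVATE_ASN_END:
        raise ValueError("too many leaves for the 2-byte private ASN range")
    return [(leaf, leaf_asns[leaf], spine, spine_asn)
            for leaf in leaves for spine in spines]


def overlay_plan(tenants, base_vni=10000):
    """Assign one VNI per tenant for network segmentation in the overlay."""
    vnis = {tenant: base_vni + i for i, tenant in enumerate(tenants)}
    if vnis and max(vnis.values()) > VNI_MAX:
        raise ValueError("VNI exceeds the 24-bit range")
    return vnis


if __name__ == "__main__":
    for session in underlay_plan(["spine1", "spine2"], ["leaf1", "leaf2", "leaf3"]):
        print(session)
    print(overlay_plan(["red", "blue"]))
```

Because both plans are pure functions of the inventory, the same input always yields the same sessions and VNIs, which is what makes the result easy to review and troubleshoot.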

Replaceable building blocks - Break from vendor lock-in

The fundamental shift in building a large cloud-scale network was to use repeatable, Lego-like building blocks for the switches/routers and compute servers. The advantage is two-fold: first, if a switch or router fails, it is easier to replace; second, to scale the system, you simply add more switches to the fabric. This is possible due to the availability of merchant silicon with well-defined SDKs, hardware built on that standard silicon by multiple ODMs, and finally Network Operating Systems (NOS) that can run on that white-box hardware. The benefit of this disaggregated approach is faster, parallel innovation on each of the individual components and freedom from lock-in to a single vendor’s roadmap.

Security

Data Centers are known to have several layers of security to secure the premises as well as the data inside. A hybrid operating model is not possible without securing all the different locations and the interconnections between them. The physical connections between the data centers, or from the colo to the cloud, can be secured using MACsec, while overlay WAN connections from on-prem to the multi-cloud can be protected using IPsec. IPsec works on IP packets, at layer 3, while MACsec operates at layer 2, on Ethernet frames. With both MACsec and IPsec, user applications do not need to be modified to take advantage of the security guarantees that these standards provide.
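The mapping described above can be written down as a small lookup, which is the kind of rule an automation tool might encode. This Python sketch only restates the paragraph; the type and function names are hypothetical and do not correspond to any real API.

```python
from enum import Enum
from typing import NamedTuple


class Protection(NamedTuple):
    standard: str    # "MACsec" or "IPsec"
    osi_layer: int   # layer at which the standard operates
    protects: str    # unit of traffic that is protected


class Connection(Enum):
    DC_TO_DC_PHYSICAL = "physical link between data centers"
    COLO_TO_CLOUD_PHYSICAL = "physical link from the colo to the cloud"
    ONPREM_TO_MULTICLOUD_OVERLAY = "overlay WAN connection from on-prem to multi-cloud"


def protection_for(connection: Connection) -> Protection:
    """Pick the protection named in the text for each connection type."""
    if connection is Connection.ONPREM_TO_MULTICLOUD_OVERLAY:
        return Protection("IPsec", 3, "IP packets")
    # Both physical connection types are secured at layer 2.
    return Protection("MACsec", 2, "Ethernet frames")


if __name__ == "__main__":
    for connection in Connection:
        print(connection.value, "->", protection_for(connection))
```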

As-a-Service - Programmatic APIs and automation at scale

Automation at scale is the secret to managing large distributed systems. To automate service provisioning, orchestrate the demand-capacity lifecycle, and create new services, the NOS needs to provide an open and programmatic API for the control and data plane.
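A common way to drive such an API is declaratively: compare the desired state with what is currently provisioned and issue only the calls needed to close the gap. The Python sketch below shows that diff step in isolation. It is a generic illustration; the tenant names and the `vni` field are hypothetical, and no real NOS API is called.

```python
def plan_changes(current, desired):
    """Return the operations that turn `current` into `desired`.

    Both arguments map a service name to its settings dict. Each
    operation is a tuple: (action, name, settings_or_None).
    """
    operations = []
    for name in sorted(desired.keys() - current.keys()):
        operations.append(("create", name, desired[name]))
    for name in sorted(desired.keys() & current.keys()):
        if desired[name] != current[name]:
            operations.append(("update", name, desired[name]))
    for name in sorted(current.keys() - desired.keys()):
        operations.append(("delete", name, None))
    return operations


if __name__ == "__main__":
    current = {"tenant-a": {"vni": 10000}, "tenant-b": {"vni": 10001}}
    desired = {"tenant-a": {"vni": 10000}, "tenant-b": {"vni": 10002},
               "tenant-c": {"vni": 10003}}
    for operation in plan_changes(current, desired):
        print(operation)
```

When the current state already matches the desired state, the plan is empty, so the same automation run can be repeated safely.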

Telemetry and Intelligence

The final piece is observability. Telemetry comprises the automated collection, correlation, and consumption of measurement data from remote devices to generate AI-enabled network insights that predict and help prevent network outages.
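As a rough illustration of turning streamed measurements into an early warning, the Python sketch below flags a sample that deviates sharply from a rolling window of recent values. It is a deliberately simple statistical stand-in for the AI-enabled analysis mentioned above, not that analysis itself; the window size, threshold, and sample values are assumptions.

```python
from collections import deque
from statistics import mean, pstdev


class RollingAnomalyDetector:
    """Flag a sample whose z-score against the recent window is too high."""

    def __init__(self, window=5, threshold=3.0):
        self.samples = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, value):
        """Return True if `value` is anomalous, then add it to the window."""
        flagged = False
        # Only judge a sample once the window is full.
        if len(self.samples) == self.samples.maxlen:
            mu = mean(self.samples)
            sigma = pstdev(self.samples)
            # A perfectly flat window (sigma == 0) is never flagged here.
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                flagged = True
        self.samples.append(value)
        return flagged


if __name__ == "__main__":
    detector = RollingAnomalyDetector(window=5, threshold=3.0)
    # Hypothetical per-interval error counts from one interface.
    for count in [10, 11, 9, 10, 10, 50, 10]:
        print(count, detector.observe(count))
```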

These principles apply to a DC of any scale or type, be it on-prem, colo, edge, or cloud.

Arrcus helps organizations build a cloud-like, as-a-service, multi-tenant Data Center. ArcOS supports the traditional three-layer design as well as the IP CLOS fabric with a BGP underlay and an EVPN-VXLAN overlay. The network can be virtualized for L2 and L3 services, and all of it is programmable through open APIs, with streaming telemetry for intelligent monitoring. With security built into the design, ArcOS helps in building a highly secure hybrid environment.

With the ACE platform and ArcIQ, customers have the option of ingesting telemetry data to observe the network and proactively fix issues for efficient operations at scale. The ArcEdge solution connects private DCs with the public cloud to build a truly hybrid IT infrastructure (refer to the ArcEdge blog).
