Core Concepts

Colocation is one of the core features of Katalyst. To achieve colocation, Katalyst introduces and implements a few core concepts. This document gives a high-level explanation of those concepts.

QoS

To extend the capabilities of Kubernetes' original QoS classes, Katalyst defines its own QoS classes with CPU as the dominant resource. Unlike memory, CPU is a divisible resource and is easier to isolate. For cloud-native workloads, CPU is also usually the dominant resource behind performance problems. Katalyst therefore names its QoS classes after CPU, and the handling of other resources implicitly follows the same class.

Definition

QoS level	Feature	Target workload	Mapped k8s QoS
dedicated_cores	Bound to a number of dedicated CPU cores that are not shared with any other pod.	Workloads that are very sensitive to latency, such as online advertising and recommendation.	Guaranteed
shared_cores	Shares a set of dedicated CPU cores with other shared_cores pods.	Workloads that can tolerate some CPU throttling or neighbor spikes, such as microservices for web services.	Guaranteed/Burstable
reclaimed_cores	Uses over-committed resources that are squeezed from dedicated_cores or shared_cores. Whenever dedicated_cores or shared_cores need to claim their resources back, reclaimed_cores pods are suppressed or evicted.	Workloads that care mainly about throughput rather than latency, such as big-data batch jobs and offline training.	BestEffort
system_cores	Reserved for core system agents to ensure their performance.	Core system agents.	Burstable
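
The "Mapped k8s QoS" column can be read as a compatibility check between a Katalyst QoS level and a pod's Kubernetes QoS class. The following Python sketch is for illustration only: the names QOS_TO_K8S and is_compatible are hypothetical and are not part of Katalyst; only the level names and the mappings are taken from the table above.

```python
# Illustrative only: mirrors the "Mapped k8s QoS" column of the table above.
# QOS_TO_K8S and is_compatible are hypothetical names, not Katalyst APIs.
QOS_TO_K8S = {
    "dedicated_cores": {"Guaranteed"},
    "shared_cores": {"Guaranteed", "Burstable"},
    "reclaimed_cores": {"BestEffort"},
    "system_cores": {"Burstable"},
}


def is_compatible(katalyst_qos: str, k8s_qos: str) -> bool:
    """Return True if the k8s QoS class is listed for the Katalyst QoS level."""
    if katalyst_qos not in QOS_TO_K8S:
        raise ValueError(f"unknown Katalyst QoS level: {katalyst_qos}")
    return k8s_qos in QOS_TO_K8S[katalyst_qos]
```

For example, is_compatible("shared_cores", "Burstable") holds, while is_compatible("reclaimed_cores", "Guaranteed") does not.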

Pool

As introduced above, Katalyst uses the term pool for a set of resources that a group of pods share with each other. For instance, pods with shared_cores may share a single shared pool, meaning that they share the same cpusets, memory limits, and so on. If the cpuset_pool enhancement is enabled, that single shared pool is split into several pools according to the configuration.
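
The effect of cpuset_pool on pool membership can be modeled roughly as follows. This is a simplified Python sketch, not Katalyst's implementation: the function assign_pools, the pod dictionary fields, and the default pool name "shared" are all assumptions made for illustration.

```python
from collections import defaultdict


def assign_pools(pods, cpuset_pool_enabled=False):
    """Group shared_cores pod names by pool name (hypothetical model).

    pods: list of dicts with "name", "qos_level", and an optional
    "cpuset_pool" entry naming the separate pool the pod asks for.
    """
    pools = defaultdict(list)
    for pod in pods:
        if pod["qos_level"] != "shared_cores":
            continue  # other QoS levels are out of scope for this sketch
        pool = "shared"  # assumed name for the single default shared pool
        if cpuset_pool_enabled and pod.get("cpuset_pool"):
            pool = pod["cpuset_pool"]  # split out into its own pool
        pools[pool].append(pod["name"])
    return dict(pools)
```

With the enhancement disabled, every shared_cores pod lands in the one shared pool; with it enabled, pods that name a cpuset_pool are placed in a pool of their own.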

Enhancement

Besides the core QoS levels, Katalyst also provides a mechanism to enhance the capabilities of the standard QoS levels. Enhancements serve as a flexible extension point, and more may be added over time.

Enhancement	Feature
numa_binding	Indicates that the pod should be bound to one or more NUMA nodes to gain further performance improvements. Only supported by dedicated_cores.
cpuset_pool	Allocates a separate cpuset within the shared_cores pool to isolate the scheduling domain for identical pods. Only supported by shared_cores.
...
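
The "Only supported by" restrictions in the table above amount to a simple validation rule. The Python sketch below is illustrative: SUPPORTED_ENHANCEMENTS and unsupported_enhancements are hypothetical names, and the mapping covers only the two enhancements listed here, not the elided ones.

```python
# Illustrative only: covers just the two enhancements listed in the table.
# Both names below are hypothetical and not part of Katalyst.
SUPPORTED_ENHANCEMENTS = {
    "dedicated_cores": {"numa_binding"},
    "shared_cores": {"cpuset_pool"},
}


def unsupported_enhancements(qos_level, enhancements):
    """Return the requested enhancements the QoS level does not support, sorted."""
    allowed = SUPPORTED_ENHANCEMENTS.get(qos_level, set())
    return sorted(set(enhancements) - allowed)
```

An empty result means every requested enhancement is valid for the given QoS level under this model.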

Configurations

To make configuration more flexible, Katalyst provides a mechanism to set configs on the fly, which supplements the static configs defined via command-line flags. In Katalyst, the implementation of this mechanism is called KatalystCustomConfig (KCC for short). It enables each daemon component to dynamically adjust its working status without restarting or redeploying. For more information about KCC, please refer to dynamic-configuration.
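
Conceptually, a dynamic config layered on top of static flags can be pictured as an overlay. The Python sketch below is a toy model, not how KCC is implemented: the function effective_config and the keys used in the example are hypothetical, and it assumes that a dynamic value takes precedence over the static value for the same key.

```python
def effective_config(static_flags: dict, dynamic_overrides: dict) -> dict:
    """Overlay dynamic settings on static ones without mutating either input.

    Assumption for this sketch: dynamic values win on conflicting keys.
    """
    merged = dict(static_flags)       # start from the command-line defaults
    merged.update(dynamic_overrides)  # apply settings changed at runtime
    return merged
```

For instance, with hypothetical static flags {"reclaim_enabled": False, "sync_period_seconds": 30} and a dynamic override {"reclaim_enabled": True}, the component would see reclaim enabled while keeping the original sync period, with no restart involved.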

Last modified October 31, 2024 : Merge pull request #11 from ozline/main (118e03d)