12 Feb 2026
Kubernetes Blog
Spotlight on SIG Architecture: API Governance
This is the fifth interview in the SIG Architecture Spotlight series covering its different subprojects; this time, we look at SIG Architecture: API Governance.
In this SIG Architecture spotlight we talked with Jordan Liggitt, lead of the API Governance subproject.
Introduction
FM: Hello Jordan, thank you for your availability. Tell us a bit about yourself, your role and how you got involved in Kubernetes.
JL: My name is Jordan Liggitt. I'm a Christian, husband, father of four, software engineer at Google by day, and amateur musician by stealth. I was born in Texas (and still like to claim it as my point of origin), but I've lived in North Carolina for most of my life.
I've been working on Kubernetes since 2014. At that time, I was working on authentication and authorization at Red Hat, and my very first pull request to Kubernetes attempted to add an OAuth server to the Kubernetes API server. It never exited work-in-progress status. I ended up going with a different approach that layered on top of the core Kubernetes API server in a different project (spoiler alert: this is foreshadowing), and I closed it without merging six months later.
Undeterred by that start, I stayed involved, helped build Kubernetes authentication and authorization capabilities, and got involved in the definition and evolution of the core Kubernetes APIs from early beta APIs, like v1beta3 to v1. I got tagged as an API reviewer in 2016 based on those contributions, and was added as an API approver in 2017.
Today, I help lead the API Governance and code organization subprojects for SIG Architecture, and I am a tech lead for SIG Auth.
FM: And when did you get specifically involved in the API Governance project?
JL: Around 2019.
Goals and scope of API Governance
FM: How would you describe the main goals and areas of intervention of the subproject?
JL: The surface area includes all the various APIs Kubernetes has, and there are APIs that people do not always realize are APIs: command-line flags, configuration files, how binaries are run, how they talk to back-end components like the container runtime, and how they persist data. People often think of "the API" as only the REST API... that is the biggest and most obvious one, and the one with the largest audience, but all of these other surfaces are also APIs. Their audiences are narrower, so there is more flexibility there, but they still require consideration.
The goals are to be stable while still enabling innovation. Stability is easy if you never change anything, but that contradicts the goal of evolution and growth. So we balance "be stable" with "allow change".
FM: Speaking of changes, in terms of ensuring consistency and quality (which is clearly one of the reasons this project exists), what are the specific quality gates in the lifecycle of a Kubernetes change? Does API Governance get involved during the release cycle, prior to it through guidelines, or somewhere in between? At what points do you ensure the intended role is fulfilled?
JL: We have guidelines and conventions, both for APIs in general and for how to change an API. These are living documents that we update as we encounter new scenarios. They are long and dense, so we also support them with involvement at either the design stage or the implementation stage.
Sometimes, due to bandwidth constraints, teams move ahead with design work without feedback from API Review. That's fine, but it means that when implementation begins, the API review will happen then, and there may be substantial feedback. So we get involved when a new API is created or an existing API is changed, either at design or implementation.
FM: Is this during the Kubernetes Enhancement Proposal (KEP) process? Since KEPs are mandatory for enhancements, I assume part of the work intersects with API Governance?
JL: It can. KEPs vary in how detailed they are. Some include literal API definitions. When they do, we can perform an API review at the design stage. Then implementation becomes a matter of checking fidelity to the design.
Getting involved early is ideal. But some KEPs are conceptual and leave details to the implementation. That's not wrong; it just means the implementation will be more exploratory. Then API Review gets involved later, possibly recommending structural changes.
There's a trade-off regardless: detailed design upfront versus iterative discovery during implementation. People and teams work differently, and we're flexible and happy to consult early or at implementation time.
FM: This reminds me of what Fred Brooks wrote in "The Mythical Man-Month" about conceptual integrity being central to product quality... No matter how you structure the process, there must be a point where someone looks at what is coming and ensures conceptual integrity. Kubernetes uses APIs everywhere -- externally and internally -- so API Governance is critical to maintaining that integrity. How is this captured?
JL: Yes, the conventions document captures patterns we've learned over time: what to do in various situations. We also have automated linters and checks to ensure correctness around patterns like spec/status semantics. These automated tools help catch issues even when humans miss them.
As new scenarios arise -- and they do constantly -- we think through how to approach them and fold the results back into our documentation and tools. Sometimes it takes a few attempts before we settle on an approach that works well.
FM: Exactly. Each new interaction improves the guidelines.
JL: Right. And sometimes the first approach turns out to be wrong. It may take two or three iterations before we land on something robust.
The impact of Custom Resource Definitions
FM: Is there any particular change, episode, or domain that stands out as especially noteworthy, complex, or interesting in your experience?
JL: The watershed moment was Custom Resources. Prior to that, every API was handcrafted by us and fully reviewed. There were inconsistencies, but we understood and controlled every type and field.
When Custom Resources arrived, anyone could define anything. The first version did not even require a schema. That made it extremely powerful -- it enabled change immediately -- but it left us playing catch-up on stability and consistency.
When Custom Resources graduated to General Availability (GA), schemas became required, but escape hatches still existed for backward compatibility. Since then, we've been working on giving CRD authors validation capabilities comparable to built-ins. Built-in validation rules for CRDs have only just reached GA in the last few releases.
So CRDs opened the "anything is possible" era. Built-in validation rules are the second major milestone: bringing consistency back.
The three major themes have been defining schemas, validating data, and handling pre-existing invalid data. With ratcheting validation (allowing data to improve without breaking existing objects), we can now guide CRD authors toward conventions without breaking the world.
API Governance in context
FM: How does API Governance relate to SIG Architecture and API Machinery?
JL: API Machinery provides the actual code and tools that people build APIs on. They don't review APIs for storage, networking, scheduling, etc.
SIG Architecture sets the overall system direction and works with API Machinery to ensure the system supports that direction. API Governance works with other SIGs building on that foundation to define conventions and patterns, ensuring consistent use of what API Machinery provides.
FM: Thank you. That clarifies the flow. Going back to release cycles: do release phases -- enhancements freeze, code freeze -- change your workload? Or is API Governance mostly continuous?
JL: We get involved in two places: design and implementation. Design involvement increases before enhancements freeze; implementation involvement increases before code freeze. However, many efforts span multiple releases, so there is always some design and implementation happening, even for work targeting future releases. Between those intense periods, we often have time for long-term design work.
An anti-pattern we see is teams thinking about a large feature for months and then presenting it three weeks before enhancements freeze, saying, "Here is the design, please review." For big changes with API impact, it's much better to involve API Governance early.
And there are good times in the cycle for this -- between freezes -- when people have bandwidth. That's when long-term review work fits best.
Getting involved
FM: Clearly. Now, regarding team dynamics and new contributors: how can someone get involved in API Governance? What should they focus on?
JL: It's usually best to follow a specific change rather than trying to learn everything at once. Pick a small API change, perhaps one someone else is making or one you want to make, and observe the full process: design, implementation, review.
High-bandwidth review -- live discussion over video -- is often very effective. If you're making or following a change, ask whether there's a time to go over the design or PR together. Observing those discussions is extremely instructive.
Start with a small change. Then move to a bigger one. Then maybe a new API. That builds understanding of conventions as they are applied in practice.
FM: Excellent. Any final comments, or anything we missed?
JL: Yes... the reason we care so much about compatibility and stability is for our users. It's easy for contributors to see those requirements as painful obstacles preventing cleanup or requiring tedious work... but users integrated with our system, and we made a promise to them: we want them to trust that we won't break that contract. So even when it requires more work, moves slower, or involves duplication, we choose stability.
We are not trying to be obstructive; we are trying to make life good for users.
A lot of our questions focus on the future: you want to do something now... how will you evolve it later without breaking it? We assume we will know more in the future, and we want the design to leave room for that.
We also assume we will make mistakes. The question then is: how do we leave ourselves avenues to improve while keeping compatibility promises?
FM: Exactly. Jordan, thank you, I think we've covered everything. This has been an insightful view into the API Governance project and its role in the wider Kubernetes project.
JL: Thank you.
12 Feb 2026 12:00am GMT
03 Feb 2026
Kubernetes Blog
Introducing Node Readiness Controller
In the standard Kubernetes model, a node's suitability for workloads hinges on a single binary "Ready" condition. However, in modern Kubernetes environments, nodes require complex infrastructure dependencies, such as network agents, storage drivers, GPU firmware, or custom health checks, to be fully operational before they can reliably host pods.
Today, on behalf of the Kubernetes project, I am announcing the Node Readiness Controller. This project introduces a declarative system for managing node taints, extending the readiness guardrails during node bootstrapping beyond the standard conditions. By dynamically managing taints based on custom health signals, the controller ensures that workloads are only placed on nodes that meet all infrastructure-specific requirements.
Why the Node Readiness Controller?
Core Kubernetes Node "Ready" status is often insufficient for clusters with sophisticated bootstrapping requirements. Operators frequently struggle to ensure that specific DaemonSets or local services are healthy before a node enters the scheduling pool.
The Node Readiness Controller fills this gap by allowing operators to define custom scheduling gates tailored to specific node groups. This enables you to enforce distinct readiness requirements across heterogeneous clusters, ensuring, for example, that GPU-equipped nodes only accept pods once specialized drivers are verified, while general-purpose nodes follow a standard path.
It provides three primary advantages:
- Custom Readiness Definitions: Define what "ready" means for your specific platform.
- Automated Taint Management: The controller automatically applies or removes node taints based on condition status, preventing pods from landing on unready infrastructure.
- Declarative Node Bootstrapping: Manage multi-step node initialization reliably, with clear observability into the bootstrapping process.
Core concepts and features
The controller centers around the NodeReadinessRule (NRR) API, which allows you to define declarative gates for your nodes.
Flexible enforcement modes
The controller supports two distinct operational modes:
- Continuous enforcement: Actively maintains the readiness guarantee throughout the node's entire lifecycle. If a critical dependency (like a device driver) fails later, the node is immediately tainted to prevent new scheduling.
- Bootstrap-only enforcement: Intended for one-time initialization steps, such as pre-pulling heavy images or hardware provisioning. Once conditions are met, the controller marks the bootstrap as complete and stops monitoring that specific rule for the node.
Condition reporting
The controller reacts to Node Conditions rather than performing health checks itself. This decoupled design allows it to integrate seamlessly with existing tools in the ecosystem as well as custom solutions:
- Node Problem Detector (NPD): Use existing NPD setups and custom scripts to report node health.
- Readiness Condition Reporter: A lightweight agent provided by the project that can be deployed to periodically check local HTTP endpoints and patch node conditions accordingly; a sketch of this condition-patching pattern follows below.
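Because the controller only watches Node Conditions, any component with permission to patch node status can act as a signal source. The following is a rough, hypothetical sketch (not the project's own agent) of such a reporter built on client-go; the local health endpoint on port 9099 and the NODE_NAME environment variable (injected via the downward API) are assumptions for illustration:

// conditionreporter.go: hypothetical sketch of a reporter that patches a
// custom node condition for the Node Readiness Controller to act on.
package main

import (
	"context"
	"encoding/json"
	"log"
	"net/http"
	"os"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}
	// NODE_NAME is assumed to be injected via the downward API.
	nodeName := os.Getenv("NODE_NAME")

	for {
		// Probe a hypothetical local health endpoint exposed by the CNI agent.
		status := corev1.ConditionFalse
		if resp, err := http.Get("http://127.0.0.1:9099/healthz"); err == nil {
			if resp.StatusCode == http.StatusOK {
				status = corev1.ConditionTrue
			}
			resp.Body.Close()
		}

		// Strategic merge patch against the Node status subresource; node
		// conditions are merged by their "type" key. A real reporter would
		// only update LastTransitionTime when the status actually changes.
		patch, _ := json.Marshal(map[string]interface{}{
			"status": map[string]interface{}{
				"conditions": []corev1.NodeCondition{{
					Type:               "cniplugin.example.net/NetworkReady",
					Status:             status,
					Reason:             "CNIHealthProbe",
					Message:            "result of the local CNI health probe",
					LastHeartbeatTime:  metav1.Now(),
					LastTransitionTime: metav1.Now(),
				}},
			},
		})
		if _, err := client.CoreV1().Nodes().Patch(context.TODO(), nodeName,
			types.StrategicMergePatchType, patch, metav1.PatchOptions{}, "status"); err != nil {
			log.Printf("failed to patch node condition: %v", err)
		}
		time.Sleep(30 * time.Second)
	}
}

In practice this would run as a DaemonSet with RBAC permission to patch the nodes/status subresource; the condition type matches the example NodeReadinessRule shown later in this post.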
Operational safety with dry run
Deploying new readiness rules across a fleet carries inherent risk. To mitigate this, dry run mode allows operators to first simulate impact on the cluster. In this mode, the controller logs intended actions and updates the rule's status to show affected nodes without applying actual taints, enabling safe validation before enforcement.
Example: CNI bootstrapping
The following NodeReadinessRule ensures a node remains unschedulable until its CNI agent is functional. The controller monitors a custom cniplugin.example.net/NetworkReady condition and only removes the readiness.k8s.io/acme.com/network-unavailable taint once the status is True.
apiVersion: readiness.node.x-k8s.io/v1alpha1
kind: NodeReadinessRule
metadata:
  name: network-readiness-rule
spec:
  conditions:
    - type: "cniplugin.example.net/NetworkReady"
      requiredStatus: "True"
  taint:
    key: "readiness.k8s.io/acme.com/network-unavailable"
    effect: "NoSchedule"
    value: "pending"
  enforcementMode: "bootstrap-only"
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker: ""
Getting involved
The Node Readiness Controller is just getting started: our initial releases are out, and we are seeking community feedback to refine the roadmap. Following our productive Unconference discussions at KubeCon NA 2025, we are excited to continue the conversation in person.
Join us at KubeCon + CloudNativeCon Europe 2026 for our maintainer track session: Addressing Non-Deterministic Scheduling: Introducing the Node Readiness Controller.
In the meantime, you can contribute or track our progress here:
- GitHub: https://sigs.k8s.io/node-readiness-controller
- Slack: Join the conversation in #sig-node-readiness-controller
- Documentation: Getting Started
03 Feb 2026 2:00am GMT
30 Jan 2026
Kubernetes Blog
New Conversion from cgroup v1 CPU Shares to v2 CPU Weight
I'm excited to announce the implementation of an improved conversion formula from cgroup v1 CPU shares to cgroup v2 CPU weight. This enhancement addresses critical issues with CPU priority allocation for Kubernetes workloads when running on systems with cgroup v2.
Background
Kubernetes was originally designed with cgroup v1 in mind, where a container's CPU shares were derived directly from its CPU request in millicpu form.
For example, a container requesting 1 CPU (1024m) would get cpu.shares = 1024.
After a while, cgroup v1 started being replaced by its successor, cgroup v2. In cgroup v2, the concept of CPU shares (which ranges from 2 to 262144, or from 2¹ to 2¹⁸) was replaced with CPU weight (which ranges from [1, 10000], or 10⁰ to 10⁴).
With the transition to cgroup v2, KEP-2254 introduced a conversion formula to map cgroup v1 CPU shares to cgroup v2 CPU weight. The conversion formula was defined as: cpu.weight = (1 + ((cpu.shares - 2) * 9999) / 262142)
This formula linearly maps values from [2¹, 2¹⁸] to [10⁰, 10⁴].
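Plugging in the default of 1024 shares, for instance, yields cpu.weight = 1 + ((1024 - 2) * 9999) / 262142 = 39 with integer arithmetic, a value that comes up again below.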

While this approach is simple, the linear mapping introduces a few significant problems that affect both CPU priority and configuration granularity.
Problems with previous conversion formula
The current conversion formula creates two major issues:
1. Reduced priority against non-Kubernetes workloads
In cgroup v1, the default value for CPU shares is 1024, meaning a container requesting 1 CPU has equal priority with system processes that live outside of Kubernetes' scope. In cgroup v2, however, the default CPU weight is 100, and the current formula converts 1 CPU (1024m) to a weight of only ≈39, less than 40% of the default.
Example:
- Container requesting 1 CPU (1024m)
- cgroup v1: cpu.shares = 1024 (equal to the default)
- cgroup v2 (current formula): cpu.weight = 39 (much lower than the default of 100)
This means that after moving to cgroup v2, Kubernetes (or OCI) workloads de facto lose CPU priority relative to non-Kubernetes processes. The problem can be severe for setups with many system daemons that run outside of Kubernetes' scope and expect Kubernetes workloads to have priority, especially under resource starvation.
2. Unmanageable granularity
The current formula produces very low values for small CPU requests, limiting the ability to create sub-cgroups within containers for fine-grained resource distribution (something that may become much easier going forward; see KEP #5474 for more info).
Example:
- Container requesting 100m CPU
- cgroup v1: cpu.shares = 102
- cgroup v2 (current formula): cpu.weight = 4 (too low for sub-cgroup configuration)
With cgroup v1, requesting 100m CPU, which led to 102 CPU shares, was manageable in the sense that sub-cgroups could be created inside the main container, assigning fine-grained CPU priorities to different groups of processes. With cgroup v2, however, a weight of 4 is very hard to distribute between sub-cgroups because it is not granular enough.
With plans to allow writable cgroups for unprivileged containers, this becomes even more relevant.
New conversion formula
Description
The new formula is more complicated, but it does a much better job of mapping cgroup v1 CPU shares to cgroup v2 CPU weight.
The idea is to use a quadratic function that passes through the following values:
- (2, 1): The minimum values for both ranges.
- (1024, 100): The default values for both ranges.
- (262144, 10000): The maximum values for both ranges.
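As a concrete illustration, here is a minimal Go sketch of a mapping with exactly these properties, with the quadratic sitting in the exponent as a function of log2 of the shares value. To the best of my knowledge this matches the adopted conversion, but treat the constants as an assumption and check the GitHub issue linked at the end of this post, as well as your runtime's source, for the authoritative definition:

package main

import (
	"fmt"
	"math"
)

// convertSharesToWeight sketches the new mapping from cgroup v1 CPU shares
// to cgroup v2 CPU weight. The exponent is a quadratic function of
// log2(shares), chosen so the curve passes through (2, 1), (1024, 100),
// and (262144, 10000).
func convertSharesToWeight(shares uint64) uint64 {
	if shares == 0 {
		return 0 // 0 is commonly treated as "unset"
	}
	if shares <= 2 {
		return 1
	}
	if shares >= 262144 {
		return 10000
	}
	l := math.Log2(float64(shares))
	exponent := (l*l+125*l)/612.0 - 7.0/34.0
	return uint64(math.Ceil(math.Pow(10, exponent)))
}

func main() {
	// The three anchor points, plus the 100m example (102 shares) from this post.
	for _, shares := range []uint64{2, 102, 1024, 262144} {
		// Prints weights 1, 17, 100, and 10000 for these inputs.
		fmt.Printf("shares=%6d -> weight=%d\n", shares, convertSharesToWeight(shares))
	}
}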
The new formula is "close to linear", yet it is carefully designed to map the ranges in a clever way so the three important points above would cross.
How it solves the problems
- Better priority alignment: A container requesting 1 CPU (1024m) will now get cpu.weight = 102. This value is close to cgroup v2's default of 100, restoring the intended priority relationship between Kubernetes workloads and system processes.
- Improved granularity: A container requesting 100m CPU will now get cpu.weight = 17, enabling better fine-grained resource distribution within containers.
Adoption and integration
This change was implemented at the OCI runtime layer rather than in Kubernetes itself; adoption of the new conversion formula therefore depends solely on the OCI runtime version in use.
For example:
- runc: The new formula is enabled from version 1.3.2.
- crun: The new formula is enabled from version 1.23.
Impact on existing deployments
Important: Some consumers may be affected if they assume the older linear conversion formula. Applications or monitoring tools that directly calculate expected CPU weight values based on the previous formula may need updates to account for the new quadratic conversion. This is particularly relevant for:
- Custom resource management tools that predict CPU weight values.
- Monitoring systems that validate or expect specific weight values.
- Applications that programmatically set or verify CPU weight values.
The Kubernetes project recommends testing the new conversion formula in non-production environments before upgrading OCI runtimes to ensure compatibility with existing tooling.
Where can I learn more?
For those interested in this enhancement:
- Kubernetes GitHub Issue #131216 - Detailed technical analysis and examples, including discussions and reasoning for choosing the above formula.
- KEP-2254: cgroup v2 - Original cgroup v2 implementation in Kubernetes.
- Kubernetes cgroup documentation - Current resource management guidance.
How do I get involved?
For those interested in getting involved with Kubernetes node-level features, join the Kubernetes Node Special Interest Group. We always welcome new contributors and diverse perspectives on resource management challenges.
30 Jan 2026 4:00pm GMT