The external interfaces to Kubernetes ingress

Defining a Kubernetes ingress seems simple enough: write a YAML manifest and you’re good. But that manifest only defines the interface from services inside the cluster to whatever external agent is connecting external client requests to cluster services. It says nothing about the interface from that external agent to the Kubernetes control and data planes. If you are building a cloud provider, how do you connect your external load balancer to the Kubernetes clusters inside your cloud? If you are configuring an entire on-prem cluster from scratch, how do you set up access to the Internet?

more ...

Three rings of abstraction in Kubernetes

A common analogy presents Kubernetes as “the operating system for microservices”. I don’t think this analogy is wrong, but I do think it’s underinformative, leaving the key question unanswered: What abstractions does this “operating system” provide? I suggest that, unlike the more focused set of abstractions provided by the classic Unix model, the abstractions of Kubernetes can be grouped into three concentric rings. This grouping highlights the different kinds of benefits Kubernetes affords and clarifies the decisions an organization must make when adopting it.

more ...

A manager cannot be a mentor—but should still give advice

Mentoring. Everyone thinks it’s a great idea and many claim to do it. Yet both formal studies and my personal observations suggest that such relationships are rarely sustained and often not valued in retrospect by the junior participants. Sustaining effective mentoring requires many elements, but I believe one essential requirement is this: The mentor cannot have authority over their protégé. The mentor’s suggestions must be suggestions, not orders. Without that basis, the whole relationship falters. Teaching and supervising are distinct roles from mentoring, however much they share elements with it. The advisory nature of mentoring affects every aspect of the relationship.

more ...

The basics of latency percentiles

Designers of distributed systems obsess over service latency. They forecast latencies of proposed designs, stress-test the latencies of systems under development, equalize latencies of replicated services by load balancing, monitor latencies of production systems using instrumentation, and commit to latency targets in service level objectives and agreements.

The most commonly used representation of latency in these scenarios is percentiles. Percentiles are deceptively simple, a function from values between 0 and 1 (or equivalently, 0% and 100%) to a response time. But the relation between percentiles and the distribution they describe is subtle: Formally, the percentile function is the inverse of the integral of the probability density of the distribution, that is, the inverse of the cumulative distribution function.
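That inverse relationship can be checked empirically in a few lines. The sketch below uses hypothetical log-normal samples and NumPy’s `quantile` function, not data from any real system:

```python
import numpy as np

# Hypothetical latency samples in milliseconds, drawn from a
# log-normal distribution, a common shape for service latency.
rng = np.random.default_rng(0)
samples = rng.lognormal(mean=3.0, sigma=0.5, size=100_000)

# The percentile function maps a fraction q in [0, 1] to a response
# time; np.quantile computes it from the empirical distribution.
p50 = np.quantile(samples, 0.50)
p99 = np.quantile(samples, 0.99)

# Inverse-of-the-CDF check: the fraction of samples at or below the
# 99th percentile is (approximately) 0.99, recovering the q we fed in.
frac_at_or_below = np.mean(samples <= p99)
```

Going from q to a response time uses the percentile function; counting the fraction of samples at or below that response time applies the CDF, and the round trip returns (approximately) the original q.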

The subtlety of this construction makes it easy to miss a step and become confused. Working through the basics in detail, gaining a clear understanding of how percentiles relate to other representations of latency, provides designers with a straightforward approach to reasoning about a fundamental concept in distributed systems design.

more ...

Extending "The Tail at Scale"

A companion post goes into more detail on constructing the percentile function from a density function.

In their classic article, “The Tail at Scale”, Dean and Barroso formulate an important principle of service design: Reduce the variance of latency in low-level services, because that variance has a large impact on the performance of their callers. The effect arises because a typical main service fans out to many subservices, making it likely that at least one of them responds extremely slowly and holds back the whole request. Dean and Barroso focus on the very largest latencies, but extending their argument to smaller but more frequent latencies provides even more insight into service design.
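The fan-out arithmetic behind this effect fits in a few lines. This sketch assumes, for illustration, that each subservice call independently exceeds its 99th-percentile latency with probability 1%:

```python
# If each of n independent subservice calls exceeds its 99th
# percentile with probability p, the chance that at least one call
# in the fan-out does so is 1 - (1 - p)**n.
def prob_any_slow(n: int, p: float = 0.01) -> float:
    return 1.0 - (1.0 - p) ** n

# One call rarely hits the tail, but a 100-way fan-out almost
# always does: roughly 63% of requests see at least one slow call.
single = prob_any_slow(1)    # ~0.01
fanout = prob_any_slow(100)  # ~0.63
```

A 1-in-100 tail event per call thus becomes a nearly 2-in-3 event per request at a 100-way fan-out, which is why reducing the variance of low-level services matters so much to their callers.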

more ...