Building a resilient deployment on Kubernetes, Part 1: Minimum Application Downtime

Dinusha Dissanayake
6 min read · Dec 29, 2020

In this article, I will discuss the Kubernetes features that help to build a highly available cluster with minimum downtime.

(The summary section is at the bottom of this article if you want a quick look at what is discussed throughout the article.)

Photo by Daniel McCullough on Unsplash

A Brief Note on High Availability

In an enterprise-grade deployment, high availability is a crucial factor in keeping the clients who engage with the application satisfied.

The basic idea of high availability is that clients can engage with an application successfully at any given time.

How do we ensure high availability?

  1. Have multiple healthy instances (2 or more) of an application in the environment to handle the requests.
  2. Identify whether application instances are healthy, and replace any unhealthy instance that is identified.
  3. Do not allow traffic to instances that are not healthy/ready.
  4. Expose a new instance to consumption only when it is ready.

High Availability on Kubernetes

High availability in Kubernetes means having multiple replicas of a given pod and making sure that traffic is not forwarded to an unhealthy pod. Kubernetes provides a few options to achieve high availability.

1. Replicas: Maintain multiple instances of the same pod (application)

To keep the cluster highly available, you need a minimum of 2 pods of the application running in the cluster.

You can achieve this using a ReplicaSet, Deployment, or StatefulSet in Kubernetes.

  • ReplicaSet ensures that a defined number of pods is running on the cluster at a given time. You can define the number of replicas you want to run on the cluster with the replicas parameter in the ReplicaSet definition.
  • If you are creating a Kubernetes Deployment resource, you can define the replicas property in the respective definition and it will create the ReplicaSet automatically.
  • If you are using the Kubernetes StatefulSet resource, you can still define the number of replicas you desire using the replicas property. However, note that a StatefulSet does not create a ReplicaSet.

You have to define the replicas property, as shown in the following snippet, under the spec section of any of the aforementioned Kubernetes resource definitions.

apiVersion: <api version of the resource>
kind: <ReplicaSet | StatefulSet | Deployment>
metadata:
  name: ha-app
spec:
  replicas: 2
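
For a concrete illustration, here is a minimal Deployment sketch that keeps two replicas of a hypothetical application running (the name, labels, and nginx image are illustrative assumptions):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ha-app
spec:
  replicas: 2              # keep two pods running at all times
  selector:
    matchLabels:
      app: ha-app
  template:
    metadata:
      labels:
        app: ha-app
    spec:
      containers:
      - name: ha-app
        image: nginx:1.21  # illustrative image
        ports:
        - containerPort: 80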

2. Liveness, Readiness, and Startup Probes: To ensure healthy pods and replace unhealthy ones

The purpose of these probes is to ensure that the desired pods (more precisely, the container(s) inside the pods) are always in a healthy state. Using these probes, the kubelet can detect unhealthy containers and restart them, or withhold traffic until the pods are healthy and ready.

LivenessProbe

  • Checks whether the container is functioning properly according to the given liveness probe. If the liveness probe fails, the kubelet restarts the container.

ReadinessProbe

  • The readiness probe helps to determine whether a pod is ready to accept traffic. A container in a pod might need some dependent applications, files, containers, etc. to function properly. Once the configured readiness probe conditions are satisfied for all of a pod's containers, the pod is considered ready to serve traffic.

StartupProbe

  • The startup probe ensures that the container has started properly. If a containerized application takes time to spin up, a startup probe gives it time to come up before the other probes take effect. Once you configure a startup probe, the other probes are ignored until it passes; if it keeps failing, the container is restarted. This is useful for slow-starting containers, since the kubelet will not restart the container based on a liveness probe that fires before the application is up. A sketch is shown below.
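
For instance, a startup probe for an application that may take up to five minutes to start might look like the following sketch (the /healthz endpoint and port 8080 are assumptions for illustration):

startupProbe:
  httpGet:
    path: /healthz        # assumed health-check endpoint
    port: 8080            # assumed application port
  failureThreshold: 30    # tolerate up to 30 failures...
  periodSeconds: 10       # ...10s apart, i.e. up to 300s to start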

Configuring these probes ensures that the pods are not considered ready to serve traffic unless they actually are, so clients are not exposed to faulty apps. Further, in the event of a failure, the pod recovers by itself, since the kubelet can detect and restart unhealthy containers.

Configuring Probes

The configuration of each probe is similar, and the probe check can be performed using one of the 3 approaches mentioned below. These probes are configured at the container level, for each container in the pod definition.

Using a command

  • This is used if you need to execute a command inside the container to check the health status of your application. If the command exits successfully (with status code 0), the container is considered healthy. For example:
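
A minimal sketch of a command-based liveness probe (the script path is a hypothetical placeholder):

livenessProbe:
  exec:
    command:
    - /bin/sh
    - -c
    - /opt/app/health-check.sh   # hypothetical script; exit code 0 means healthy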

Using an HTTP request

  • This is done by performing an HTTP GET on an endpoint of the application (most probably a health-check endpoint). Any response code greater than or equal to 200 and less than 400 is considered a success, and the container is treated as healthy. For example:
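
A minimal sketch, assuming the application exposes a /healthz endpoint on port 8080:

livenessProbe:
  httpGet:
    path: /healthz   # assumed health-check endpoint
    port: 8080       # assumed application port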

Using a TCP port

  • This is used to try to establish a TCP socket to the configured container port. If the connection can be established, the container is considered healthy. For example:
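
A minimal sketch, assuming the application listens on port 8080:

readinessProbe:
  tcpSocket:
    port: 8080   # healthy if a TCP connection to this port succeeds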

Each of these probes can be configured with when to start and at what time interval to continue, using the following properties.

initialDelaySeconds

  • The pod might need to wait a few seconds before performing the first health check. This is configured using the initialDelaySeconds property in the containers section.
  • Default value: 0 seconds
  • Minimum value: 0 seconds

periodSeconds

  • The health check should continue to happen periodically to make sure the application stays healthy. The time interval between two checks is configured using periodSeconds in the containers section.
  • Default value: 10 seconds
  • Minimum value: 1 second

successThreshold

  • The minimum number of consecutive times the probe should succeed, after a failure, for the container to be considered healthy again.
  • Default value: 1
  • Minimum value: 1

failureThreshold

  • The number of consecutive times the probe must fail for the container to be considered failed.
  • Default value: 3
  • Minimum value: 1

timeoutSeconds

  • The number of seconds after which each probe attempt times out.
  • Default value: 1 second
  • Minimum value: 1 second
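
Putting these together, a readiness probe that uses all five properties might look like the following sketch (the endpoint and port are assumptions; note that a successThreshold greater than 1 is only allowed for readiness probes):

readinessProbe:
  httpGet:
    path: /ready           # assumed readiness endpoint
    port: 8080             # assumed application port
  initialDelaySeconds: 10  # wait 10s before the first check
  periodSeconds: 5         # run the check every 5s
  timeoutSeconds: 2        # each check times out after 2s
  successThreshold: 2      # 2 consecutive passes to become ready again
  failureThreshold: 3      # 3 consecutive failures to be marked unready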

A sample configuration consisting of most of the settings we discussed above is shown in the Kubernetes definition below.

To get to know more about the different configs related to probes, please refer to the official Kubernetes guide.

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 7
    readinessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
      failureThreshold: 5

According to the above definition, the pod will start its initial liveness check 5 seconds after the container starts, by executing cat /tmp/healthy, and it will continue to check its liveness every 7 seconds. In the event of a failed liveness check, the container will be restarted.

Similarly, it will check whether it can establish a TCP connection to port 8080 every 10 seconds after the initial 5-second delay. If the check fails 5 consecutive times, the pod is marked as not ready and stops accepting traffic.

Summary

In this article, we described how to achieve minimum application downtime in Kubernetes. For that, we need the following.

Need 2 or more instances

  • Use the replicas property in a Deployment, ReplicaSet, or StatefulSet

Need to check that the pods are healthy

  • The readiness probe checks whether the pod is ready to accept traffic; if the probe fails, the pod is marked as not ready and is not allowed to serve traffic.
  • The liveness probe ensures the pod is running healthily, and restarts the container if it is unhealthy.
  • The startup probe ensures that slow-starting containers start properly. Until it succeeds, the other probes are ignored.
  • These probes are configured at the container level.
  • A probe check can be a command execution, an HTTP GET request, or a TCP port check.

The combination of these properties ensures that the application has multiple healthy instances running at any given time to serve the traffic.

Building a resilient deployment is not just about having replicas and healthy instances; a resilient Kubernetes deployment requires more than that. I will discuss the other factors in my next articles.

Thank you for reading. I hope it is helpful. I will see you in my next article.
