Tuesday, April 23, 2019

Kubernetes - Horizontal Pod Autoscaler

Running your services inside containers is good, but managing them during peak times is hard. What if the service running inside a pod comes under heavy load and consumes a lot of CPU? How can we make our pods scale when CPU consumption is high?

Container orchestration tools like Kubernetes provide a way to scale the number of pods when the service in a pod comes under heavy load. The HPA (Horizontal Pod Autoscaler) in Kubernetes autoscales pods based on their CPU load. It scales the number of pods in a replication controller, deployment or replica set based on observed CPU utilization. Using an HPA, we can configure our cluster to scale the pods up or down automatically.

Several factors affect the number of pods:
1. The minimum and maximum number of pods allowed to run, as defined by the user
2. Observed CPU/memory usage, as reported by resource metrics
3. Custom metrics provided by third-party metrics systems like Prometheus

The HPA improves your services by:
1. Releasing hardware resources that would otherwise be wasted by an excessive number of pods
2. Increasing or decreasing capacity as needed to meet service level agreements

1. Install the Kubernetes metrics-server: clone the metrics-server repository using,
git clone

Edit the file deploy/1.8+/metrics-server-deployment.yaml to add a command section, which does not exist by default. The command section looks like this,
      - name: metrics-server
        command:
          - /metrics-server
          - --kubelet-insecure-tls

This new section lets metrics-server communicate with the kubelets insecurely, that is, without verifying the certificates they present.

Once done, deploy the server using kubectl create -f deploy/1.8+/
The metrics-server will be deployed to the kube-system namespace. Once it is running, give it a couple of minutes to gather metrics.
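Rather than guessing how long those couple of minutes are, we can poll until metrics are actually available. The following is a minimal sketch: retry is a hypothetical helper name of our own, not something kubectl provides, and it simply reruns a command until it succeeds or the attempts run out.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits 0 or ATTEMPTS is exhausted.
# "retry" is a hypothetical helper for this post, not part of kubectl.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Discard the command's output; only its exit status matters here.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, retry 30 10 kubectl top node would try every 10 seconds for roughly five minutes and return as soon as kubectl top node succeeds.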

The Kubernetes metrics-server is useful for keeping an eye on the underlying infrastructure and the pods that are running. The project is not bundled with a Kubernetes cluster, so it requires manual installation. It has two parts: the metrics-server and the Metrics API. The metrics-server is a cluster-wide aggregator of resource usage data, which collects metrics from the Summary API exposed by the kubelet on each node.

The metrics-server keeps an eye on the underlying nodes and the pods running on them. To see their resource usage, run the kubectl top node and kubectl top pod commands,
jagadishm@$ kubectl top node
NAME                 CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
docker-for-desktop   650m         32%    1408Mi          74%

jagadishm@$ kubectl top pod
NAME                              CPU(cores)   MEMORY(bytes)
load-generator-7bbbb4fdd4-vfvrk   0m           0Mi
php-apache-5d75bdc65f-rtvtr       1m           16Mi

2. Create a pod which can take load: this is a simple PHP application which, when hit through the service, performs some CPU-intensive operations.

In the Dockerfile,
FROM php:5-apache
ADD index.php /var/www/html/index.php
RUN chmod a+rx index.php

In index.php,
<?php
  $x = 0.0001;
  for ($i = 0; $i <= 1000000; $i++) {
    $x += sqrt($x);
  }
  echo "OK!";
?>

Build an image named php-apache, upload it to Docker Hub, and run the container using (replace the --image placeholder with the image you pushed),
kubectl run php-apache --image=<your-dockerhub-user>/php-apache --requests=cpu=200m --expose --port=80

Once both the deployment and the service are created, let's create an HPA.

Create the HPA: the autoscaler can be created from a YAML file or with the kubectl autoscale command. Run the command below to create an HPA,
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
horizontalpodautoscaler.autoscaling/php-apache autoscaled

The autoscale command creates an HPA that maintains between 1 and 10 replicas of the pods managed by the php-apache deployment. The HPA will increase or decrease the number of php-apache replicas to keep the average CPU utilization across all pods at 50%. Since we have requested 200 millicores for each pod, this means an average CPU usage of 100 millicores per pod.
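For the YAML route, a manifest along the following lines should be roughly equivalent to the autoscale command. This is a sketch rather than the exact object kubectl generates: the file name hpa.yaml is an arbitrary choice, and the apps/v1 apiVersion in scaleTargetRef assumes the php-apache deployment is served under that API group in your cluster.

```shell
#!/bin/sh
# Write a manifest roughly equivalent to:
#   kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
# hpa.yaml is an arbitrary file name chosen for this sketch.
cat > hpa.yaml <<'EOF'
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    # Assumption: the deployment is available under apps/v1.
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
EOF
```

The file can then be applied with kubectl apply -f hpa.yaml instead of running kubectl autoscale.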

Check the HPA using kubectl get hpa.
In its output, the TARGETS column shows 0%/50%, which means we don't have any load.
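To see where the 0% comes from, recall that CPU utilization is measured against the pod's CPU request (200m here). The sketch below works through that arithmetic. The function name cpu_utilization_percent is our own, and dropping fractions of a percent is an assumption that happens to match the 0% shown for a pod using 1m.

```shell
#!/bin/sh
# cpu_utilization_percent USAGE_MILLICORES REQUEST_MILLICORES
# Prints usage as a whole-number percentage of the CPU request.
# Hypothetical helper for illustration; not a kubectl command.
cpu_utilization_percent() {
  usage_m=$1
  request_m=$2
  # Integer division: fractions of a percent are dropped (an assumption).
  echo $(( usage_m * 100 / request_m ))
}

cpu_utilization_percent 1 200    # the idle php-apache pod: prints 0
cpu_utilization_percent 100 200  # half of the request in use: prints 50
```

With several pods, the same calculation would use the summed usage and the summed requests of all pods, which gives the average utilization that is compared against the 50% target.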
Run a load generator: run a busybox container and, from it, continuously hit the php-apache service using a while loop. Start the container with the kubectl run command below and, once the command prompt appears, enter the while loop,

jagadishm@$ kubectl run -i --tty load-generator --image=busybox /bin/sh
If you don't see a command prompt, try pressing enter.
/ # while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done

After a few minutes, we can see the load in the HPA: the targets show increasing values. We can then check the pods to see that they have been replicated, as below,
jagadishm@$ kubectl get pods
NAME                              READY   STATUS    RESTARTS   AGE
load-generator-7bbbb4fdd4-vfvrk   1/1     Running   0          5m
php-apache-5d75bdc65f-9hfq8       1/1     Running   0          42s
php-apache-5d75bdc65f-bqb9s       1/1     Running   0          42s
php-apache-5d75bdc65f-rgfpg       1/1     Running   0          42s
php-apache-5d75bdc65f-rtvtr       1/1     Running   0          1h
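The jump from one php-apache pod to four is consistent with the scaling rule described in the Kubernetes documentation: the desired replica count is the current count multiplied by the ratio of current to target utilization, rounded up and kept within the min/max bounds. The sketch below is a simplification (the real controller also applies a tolerance and other safeguards), desired_replicas is our own name, and the 180% reading is hypothetical, picked only because it yields four replicas; the actual value was not recorded.

```shell
#!/bin/sh
# desired_replicas CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL MIN MAX
# Simplified version of the HPA scaling rule:
#   ceil(current * currentUtil / targetUtil), clamped to [MIN, MAX].
# Hypothetical helper for illustration; ignores the controller's tolerance.
desired_replicas() {
  current=$1
  util=$2
  target=$3
  min=$4
  max=$5
  # Ceiling division using integer arithmetic.
  want=$(( (current * util + target - 1) / target ))
  if [ "$want" -lt "$min" ]; then
    want=$min
  fi
  if [ "$want" -gt "$max" ]; then
    want=$max
  fi
  echo "$want"
}

desired_replicas 1 180 50 1 10   # hypothetical 180% on one pod: prints 4
desired_replicas 1 0 50 1 10     # no load: stays at the minimum, prints 1
desired_replicas 4 400 50 1 10   # very heavy load: capped at the maximum, prints 10
```

Once the load generator is stopped, the same rule works in the other direction and the replica count eventually drops back toward the minimum.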

We can see that when the target pods come under load, the php-apache pods are replicated according to the HPA we created. This way our service can scale out and back in as the load changes.
