Having your services inside containers is good, but managing those services during peak time is hard. What if the service we are running inside a pod comes under heavy load and consumes a lot of CPU? How can we make our pod scale when CPU consumption is high?
Container orchestration tools like Kubernetes provide a way to scale the number of pods when the service in a pod expects heavy load. HPA (Horizontal Pod Autoscaler) in Kubernetes provides a way of autoscaling pods based on the CPU load on the infrastructure. HPA scales the number of pods in a replication controller, deployment or replica set based on observed CPU utilization. Using HPA we can configure our cluster to automatically scale pods up or down.
A couple of factors affect the number of pods:
1. A minimum and maximum number of pods allowed to run defined by the user
2. Observed CPU/Memory usage, as reported by resource metrics
3. Custom metrics provided by third-party sources like Prometheus
HPA improves your services by:
1. Releasing hardware resources that would otherwise be wasted by an excessive number of pods
2. Increasing/decreasing performance as needed to meet service-level agreements.
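The scaling decision HPA makes between those min/max bounds follows a simple proportional rule: the desired replica count is the current count scaled by the ratio of observed metric to target metric, rounded up. A minimal sketch in Python (function and parameter names are illustrative, not part of Kubernetes itself):

```python
import math

def desired_replicas(current_replicas, current_cpu_percent, target_cpu_percent,
                     min_replicas=1, max_replicas=10):
    """Sketch of the HPA scaling rule: scale the replica count in
    proportion to how far the observed metric is from the target,
    then clamp to the user-defined min/max bounds."""
    desired = math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)
    return max(min_replicas, min(desired, max_replicas))

# e.g. 2 pods averaging 100% CPU against a 50% target -> scale to 4 pods
print(desired_replicas(2, 100, 50))  # -> 4
```

Note how the user-defined minimum and maximum always win: even if observed CPU is ten times the target, the pod count never exceeds the configured maximum.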
1. Install the Kubernetes Metrics Server: Clone the metrics-server using,
git clone https://github.com/kubernetes-incubator/metrics-server.git
Edit the file deploy/1.8+/metrics-server-deployment.yaml to add a command section that does not exist by default. The command section looks as,
containers:
- name: metrics-server
  image: k8s.gcr.io/metrics-server-amd64:v0.3.1
  command:
  - /metrics-server
  - --kubelet-insecure-tls
This new section lets metrics-server use insecure communication with the kubelet, i.e. skip verifying the certificates involved.
Once done, deploy the server using kubectl create -f deploy/1.8+/
The metrics-server will be deployed to the kube-system namespace. Once it is running, give it a couple of minutes to gather metrics.
The Kubernetes metrics-server is very useful for keeping an eye on the underlying infrastructure and the pods that are running. The project is not officially part of the Kubernetes cluster, so it requires manual installation. It has two parts: the metrics-server and the Metrics API. The metrics-server is a cluster-wide aggregator of resource usage data which collects metrics from the Summary API exposed by the kubelet on each node.
The metrics server keeps an eye on the underlying nodes and the pods running on them. Use the top node and top pod subcommands of kubectl to see the details of the nodes and pods,
jagadishm@192.168.31.177:/Volumes/Work/$ kubectl top node
NAME                 CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
docker-for-desktop   650m         32%    1408Mi          74%
jagadishm@192.168.31.177:/Volumes/Work/$ kubectl top pod
NAME                              CPU(cores)   MEMORY(bytes)
load-generator-7bbbb4fdd4-vfvrk   0m           0Mi
php-apache-5d75bdc65f-rtvtr       1m           16Mi
2. Create a Pod which can take load - this is a simple PHP application which performs some CPU-intensive operations when it is hit through the service.
In the Dockerfile
FROM php:5-apache
ADD index.php /var/www/html/index.php
RUN chmod a+rx index.php
index.php
<?php
$x = 0.0001;
for ($i = 0; $i <= 1000000; $i++) {
$x += sqrt($x);
}
echo "OK!";
?>
Build an image with the name php-apache, upload it to Docker Hub, and run the container using, kubectl run php-apache --image=docker.io/jagadesh1982/php-apache --requests=cpu=200m --expose --port=80
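The --requests=cpu=200m flag matters here: HPA's CPU percentage target is measured relative to the container's CPU request. A rough sketch of the container spec this command produces inside the deployment (field values are those from the command above; the exact generated manifest may differ slightly):

```yaml
# Relevant fragment of the Deployment's pod template; HPA computes
# CPU% relative to the 200m request declared here.
containers:
- name: php-apache
  image: docker.io/jagadesh1982/php-apache
  ports:
  - containerPort: 80
  resources:
    requests:
      cpu: 200m
```

Without a CPU request on the container, the HPA has no baseline to compute utilization against and will not scale.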
Once both the deployment and the service are created, let's create an HPA.
Create the HPA - The autoscaler can be created from a yaml file or with the kubectl autoscale command. Run the below command to create an HPA,
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
horizontalpodautoscaler.autoscaling/php-apache autoscaled
The autoscale command creates an HPA which maintains between 1 and 10 replicas of the pods managed by the php-apache deployment. HPA will increase or decrease the number of php-apache pod replicas to maintain an average CPU utilization of 50% across all pods; since we assigned 200 millicores to each pod, that corresponds to 100 millicores of average usage.
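As noted above, the same autoscaler can be created from a yaml file instead of the kubectl autoscale command. An equivalent manifest using the autoscaling/v1 API looks roughly like this (a sketch; apply it with kubectl create -f):

```yaml
# HPA manifest equivalent to:
# kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
```

Keeping the HPA as a manifest makes it easy to version-control alongside the deployment and service definitions.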
Check the HPA using kubectl get hpa,
We can see that the TARGETS column shows 0%/50%, which means there is no load yet.
Run a load generator: Run a busybox container from which we continuously hit the php-apache service in a while loop. Run kubectl run -i --tty load-generator --image=busybox /bin/sh, and once the command prompt comes up, enter while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done to continuously hit the php-apache service.
jagadishm@10.135.114.187:/Volumes/Work$ kubectl run -i --tty load-generator --image=busybox /bin/sh
If you don't see a command prompt, try pressing enter.
/ # while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done
OK!
OK!
After a few minutes, we can see load in the HPA: the TARGETS value keeps increasing. We can then check the pods to see that they have been replicated, as below,
jagadishm@192.168.31.177:/Volumes/Work$ kubectl get pods
NAME                              READY   STATUS    RESTARTS   AGE
load-generator-7bbbb4fdd4-vfvrk   1/1     Running   0          5m
php-apache-5d75bdc65f-9hfq8       1/1     Running   0          42s
php-apache-5d75bdc65f-bqb9s       1/1     Running   0          42s
php-apache-5d75bdc65f-rgfpg       1/1     Running   0          42s
php-apache-5d75bdc65f-rtvtr       1/1     Running   0          1h
We can see that when the target pods come under load, the php-apache pod is replicated according to the HPA we created. This way we can scale our service pods based on changes in load.