When we talk about high load on a system, we are talking about the load average
of that system. In this article we will see how the load average can be used to
identify issues. The load average can be obtained by running the uptime command:
omhq199e:dwls999-~
$ uptime
01:03:24 up 15 days, 14:30, 2 users, load average: 1.30, 1.32, 0.89
Or we can get the same details from the first line of the top command:
omhq199e:dwls999-~
$ top
top - 01:03:26 up 15 days, 14:30, 2 users, load average: 1.20, 1.30, 0.88
In both cases, the load average appears in the form
load average: 1.20, 1.30, 0.88
These numbers represent the system load average over the last 1, 5 and 15
minutes. They show how well the system is keeping up with the processes that
need CPU time: each value is the average number of processes that were either
using the CPU or waiting for it during that interval. (On Linux, processes in
uninterruptible sleep, typically waiting on disk I/O, are counted as well.)
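On Linux the same three figures can also be read from /proc/loadavg, which is convenient in scripts. The following is a minimal sketch: the helper name parse_loadavg is made up for illustration, and it assumes the usual layout in which the first three fields are the 1, 5 and 15 minute averages.

```shell
#!/bin/sh
# Print the 1, 5 and 15 minute load averages from a /proc/loadavg-style line.
# parse_loadavg is a hypothetical helper name, not a standard command.
parse_loadavg() {
    # $1 is a line such as: 1.20 1.30 0.88 1/123 4567
    # Only the first three fields are load averages; the rest is ignored.
    echo "$1" | {
        read -r one five fifteen rest
        echo "1min=$one 5min=$five 15min=$fifteen"
    }
}

# Show the live values when /proc/loadavg is readable (Linux only).
if [ -r /proc/loadavg ]; then
    parse_loadavg "$(cat /proc/loadavg)"
fi
```

Called with the values from the top output above, parse_loadavg '1.20 1.30 0.88 1/123 4567' prints 1min=1.20 5min=1.30 15min=0.88.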
If we have a load of 0, it means that the system is idle.
If we have a load of 1, it means that on average one process was using or
waiting for the CPU, which is enough to keep a single CPU fully busy.
If we have a load of 2, it means that on average two processes wanted the CPU
at the same time, so on a single CPU one of them had to wait.
More detail:
Suppose I am a bridge operator and I want cars to move smoothly across the
bridge. A load of 0 means that there are no cars on the bridge. A load between
0 and 1 is normal: cars cross without having to wait. A load of 1 means that
the bridge is exactly at capacity; cars are still moving smoothly, but if more
cars arrive they have to queue and the load increases.
A load of 2 means that there are two lanes' worth of cars for a single lane:
one lane's worth is crossing and another is waiting to cross.
The cars here are like processes in Linux. The load rises when processes have
to wait for CPU time, so on a single-CPU system the load should ideally stay
below 1.0.
So is ‘1.00’ the Ideal Load? What Load Should We Consider Serious?
How the load average should be read depends on the number of CPUs installed
and the number of cores the system has.
For a single-core system, a load average of up to 1.0 is considered normal.
Similarly, for a dual-core system a load average of up to 2.0 is normal. This
means that a quad-core system with a load average of 4.0 is fully used but
still working fine; if the load average is above 4.0, the system has more work
than it can handle and we need to find the reasons.
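The rule above amounts to dividing the load average by the number of cores and checking whether the result exceeds 1.0. A rough sketch of that check follows; load_status is a hypothetical helper name, and the live example assumes a Linux system with nproc (from GNU coreutils), which reports logical CPUs rather than physical cores.

```shell
#!/bin/sh
# Classify a load average against the number of cores.
# load_status is a hypothetical helper; a per-core load above 1.0 is
# treated as "high", following the rule of thumb in the text.
load_status() {
    # $1 = load average, $2 = number of cores
    awk -v load="$1" -v cores="$2" 'BEGIN {
        per_core = load / cores
        if (per_core > 1.0) print "high"; else print "normal"
    }'
}

# Live example (Linux): 1-minute load against the logical CPU count.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r one rest < /proc/loadavg
    echo "load $one on $(nproc) CPUs: $(load_status "$one" "$(nproc)")"
fi
```

For example, load_status 4.0 4 reports normal, while load_status 4.5 4 reports high.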
Which Average to Consider?
When we run the uptime command we see load averages for 1, 5 and 15 minutes.
Which one should we consider?
We need to concentrate on the 5 and 15 minute averages. A spike in the 1 minute
average is acceptable, but if the 5 or 15 minute average is high we need to
treat it as a server issue and find out the reasons.
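This distinction between a short spike and sustained load can be expressed as a small check. The sketch below is only illustrative: load_trend is a hypothetical helper name, and the limit of one unit of load per core is the rule of thumb from this article, not a fixed standard.

```shell
#!/bin/sh
# Decide whether high load is a short spike or a sustained problem.
# load_trend is a hypothetical helper; the limit is one unit of load per core.
load_trend() {
    # $1 $2 $3 = 1, 5 and 15 minute load averages, $4 = number of cores
    awk -v one="$1" -v five="$2" -v fifteen="$3" -v cores="$4" 'BEGIN {
        if (five > cores || fifteen > cores) print "sustained"
        else if (one > cores)                print "spike"
        else                                 print "normal"
    }'
}

# Values from the top output in this article, assuming a 2-core machine.
load_trend 1.20 1.30 0.88 2
```

A result of spike means only the 1 minute average is above the limit, while sustained means the 5 or 15 minute average is, which is the case that calls for investigation.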
How Do I Find Out How Many Cores Are Available?
[root@vx148d tmp]$ grep 'model name' /proc/cpuinfo | wc -l
2
[root@vx148d tmp]$ egrep -e "core id" -e ^physical /proc/cpuinfo|xargs -l2 echo|sort -u
physical id : 0 core id : 0
physical id : 1 core id : 0
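Note that counting 'model name' lines gives the number of logical processors, which can be higher than the number of physical cores when hyper-threading is enabled. The sketch below counts unique physical id / core id pairs instead. The helper name count_cores is made up for illustration, and it assumes /proc/cpuinfo contains those two fields, which is not the case on every architecture or virtual machine (it then prints 0). Where available, nproc and lscpu report similar information.

```shell
#!/bin/sh
# Count physical cores as unique (physical id, core id) pairs in
# /proc/cpuinfo-style input read from stdin.
# count_cores is a hypothetical helper name.
count_cores() {
    awk -F': *' '
        /^physical id/ { phys = $2 }              # remember the current socket
        /^core id/     { seen[phys "," $2] = 1 }  # record socket,core pair
        END { n = 0; for (k in seen) n++; print n }
    '
}

# Live example (Linux only).
if [ -r /proc/cpuinfo ]; then
    echo "physical cores: $(count_cores < /proc/cpuinfo)"
fi
```

For the machine shown above, with core 0 on physical id 0 and core 0 on physical id 1, this counts 2 cores.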
More to come.
Happy learning!