Sar is a Linux command that reports system activity data. It helps identify performance bottlenecks and supplies the data needed during performance troubleshooting.
Consider a system composed of three subsystems: CPU, memory, and disk. We need to find out which subsystem is causing the issues.
CPU
(! 1005)-> sar -u 1 1
Linux 2.6.18-348.el5xen (vx181d) 08/18/2013

11:56:55 PM   CPU   %user   %nice  %system  %iowait  %steal   %idle
11:56:56 PM   all    0.00    0.00     0.00     0.00    0.00  100.00
Average:      all    0.00    0.00     0.00     0.00    0.00  100.00
The %user and %system columns show the percentage of time the CPU spends in user mode and in system (kernel) mode. The %iowait and %idle columns are of most interest to us when doing performance analysis. The %iowait column shows the percentage of time the CPU was idle while waiting for outstanding I/O requests to complete. The %idle column shows the percentage of time the CPU was idle with no I/O pending, so a low value means the CPU is busy.
A %idle value near zero indicates a CPU bottleneck, while a high %iowait value indicates unsatisfactory disk performance.
Spending time in %user is expected behavior, as this is where all non-system tasks are accounted for. If a large share of cycles is spent in %system, much of the execution time is going to lower-level kernel code. If %iowait is high, processes are waiting on disk accesses, and the disk is a bottleneck on the system.
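The rules of thumb above can be sketched as a small Python helper. This is only an illustration: the function name `classify_cpu` and both cutoffs (`idle_floor`, `iowait_ceiling`) are assumptions chosen for the example, not values defined by sar.

```python
def classify_cpu(idle, iowait, idle_floor=10.0, iowait_ceiling=20.0):
    """Classify one `sar -u` sample from its %idle and %iowait values.

    Both thresholds are illustrative assumptions; tune them per system.
    """
    if iowait >= iowait_ceiling:
        return "io-bound"    # CPU is mostly waiting for disk I/O
    if idle <= idle_floor:
        return "cpu-bound"   # almost no idle time left
    return "ok"


# Values from the sample output above: 100.00 %idle, 0.00 %iowait
print(classify_cpu(idle=100.00, iowait=0.00))  # prints "ok"
```

%iowait is checked first because a disk-bound system can also show a low %idle.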
Load
(! 1005)-> sar -q 1 1
Linux 2.6.18-348.el5xen (vx181d) 08/18/2013

11:59:19 PM  runq-sz  plist-sz  ldavg-1  ldavg-5  ldavg-15
11:59:20 PM        3       482     2.13     1.04      0.03
11:59:21 PM        5       487     2.15     1.04      0.03
11:59:22 PM        6       489     2.16     1.04      0.03
Average:           0       482     0.13     0.04      0.03
"sar -q" displays the run queue length, the total number of processes, and the load averages for the last 1, 5, and 15 minutes.
The system seems to be a little busy: the run queue holds between 3 and 6 tasks, and the 1-minute load average is above 2.
If your ldavg-1 column stays consistently high, or continues to rise during this load check, it is an indication that something on the server is spiking its usage.
Typically a system's load should remain at or below 70% of the number of cores. If the load is consistently above this level there may be performance degradation, and if the load ever rises above the number of cores there will be a significant slowdown.
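As a rough sketch, the 70% guideline can be turned into a check like the one below. The function name `load_status` and the status labels are made up for this example, and the core count of 4 is a hypothetical value, since the sample output does not say how many cores the server has.

```python
def load_status(ldavg, cores, warn_ratio=0.70):
    """Compare a load average with the number of cores.

    warn_ratio=0.70 is the 70% rule of thumb from the text.
    """
    ratio = ldavg / cores
    if ratio > 1.0:
        return "overloaded"   # more runnable work than cores
    if ratio > warn_ratio:
        return "degraded"     # above the 70% guideline
    return "ok"


# ldavg-1 from the sample output, with an assumed (hypothetical) 4 cores
print(load_status(2.13, cores=4))  # prints "ok"
```

On a live system, Python's `os.getloadavg()` and `os.cpu_count()` could supply the two inputs.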
Memory
omhq19e9:dwls990-~ $ sar -r 1 2
Linux 2.6.18-348.4.1.el5 (omhq19e9) 08/19/2013

04:30:20 AM  kbmemfree  kbmemused  %memused  kbbuffers  kbcached  kbswpfree  kbswpused  %swpused  kbswpcad
04:30:01      62716272  201531148     76.27      15952   1556572    2048236         12      0.00         4
04:40:03        191904  264055516     99.93       2692     28908          0    2048248    100.00      8496
04:50:14        184100  264063320     99.93       1388     10600          0    2048248    100.00         0
Average:       4415719  259831701     98.33    1357749  20307185    1906978     141270      6.90       297
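Memory details sampled at 10-minute intervals tell us a lot. Linux uses otherwise free memory for buffers and cache, so a %memused value close to 99% is not a problem by itself. In the output above, however, %swpused jumps from 0% to 100% between 04:30 and 04:40, which shows that swap is being used extensively.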
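To separate cache usage from real memory pressure, the columns can be combined as in the sketch below. It assumes buffers and cache are reclaimable and therefore subtracts them from kbmemused; the function name `memory_pressure` and the 90% swap cutoff are illustrative assumptions.

```python
def memory_pressure(kbmemfree, kbmemused, kbbuffers, kbcached,
                    swpused_pct, swap_limit=90.0):
    """Return (percent of RAM used by applications, swap-nearly-full flag).

    swap_limit=90.0 is an arbitrary illustrative cutoff.
    """
    total = kbmemfree + kbmemused
    # Buffers and page cache can be reclaimed, so leave them out.
    app_used = kbmemused - kbbuffers - kbcached
    app_pct = round(100.0 * app_used / total, 2)
    return app_pct, swpused_pct >= swap_limit


# The 04:40:03 row from the sample output above
app_pct, swap_full = memory_pressure(191904, 264055516, 2692, 28908, 100.00)
print(app_pct, swap_full)
```

For the 04:40:03 row almost all memory is held by applications rather than cache, and swap is full, which points to genuine memory pressure.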
IO
04:30:00       tps    rtps    wtps   bread/s   bwrtn/s
04:30:01     67.95   51.84   16.12   4303.82   4664.14
04:40:03    564.60  227.07  337.52  34338.84  87719.02
04:50:14     51.05   40.25   10.80   1326.32    245.06
Average:     31.12   11.00   20.12   1383.64   3346.65
The number of disk reads and writes will vary based on the underlying hardware; however, we can establish what is 'normal' for this system by examining the data over a period of time and then looking for spikes. There is a large spike at 04:40, where the number of reads and writes increases dramatically. Shortly afterwards the values drop back down, indicating that this massive burst has passed.
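One simple way to look for such spikes is to compare each sample with the median of the period. The sketch below is an illustration only: the function name `find_spikes` and the factor of 3 are assumptions, and a real baseline would need far more than three samples.

```python
from statistics import median


def find_spikes(samples, factor=3.0):
    """Return timestamps whose tps exceeds `factor` times the median tps.

    `samples` is a list of (timestamp, tps) pairs; factor=3.0 is an
    arbitrary illustrative cutoff.
    """
    baseline = median(tps for _, tps in samples)
    return [ts for ts, tps in samples if tps > factor * baseline]


# tps values from the sample output above
samples = [("04:30:01", 67.95), ("04:40:03", 564.60), ("04:50:14", 51.05)]
print(find_spikes(samples))  # prints ['04:40:03']
```

The median is used instead of the mean so that the spike itself does not inflate the baseline.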
The troubleshooting documents will be updated continuously with more tips.
More to come. Happy learning :-)