High
Cpu Usage is one issue that could cause a big damage to the
production environment. Finding out the exact reason for the high CPU
usage is very important in production environment.
As
a Application Server admin, there are many cases where I need to find
the exact reason for the High CPU usage. There can be many reasons
for the High CPU Usage like
A
server process it self (tomcat , Jboss or weblogic )
User
Created Threads
Bad
Coding Practices
Third
Party Libraries we are using.
But
Finding the exact reason for the High Cpu Usage can be difficult
some times. This article explains a way of finding the reasons for a
High Cpu Usage.
In
Most production environments , alerts are configured to make sure
mails are sent when ever there is high memory usage or a high Cpu
usage.
Consider
if you got the alerts , the first thing we need to do is
1.Run
the “top” command to find the process which was taking
the high Cpu.
top
- 22:58:19 up 42 days, 3:33, 2 users, load average: 6.33, 6.56,
6.80
Tasks:
111 total, 1 running, 110 sleeping, 0 stopped, 0 zombie
Cpu(s):
71.4%us, 0.9%sy, 0.1%ni, 26.9%id, 0.3%wa, 0.0%hi, 0.2%si,
0.3%st
Mem:
4194304k total, 4033112k used, 161192k free, 806868k buffers
Swap:
4194296k total, 52k used, 4194244k free, 926320k cached
PID
USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
16361
djbs002 20 0 941m 244m 9.9m S 107.7 6.0
71:31.83 java
21386
djbs002 24 0 846m 58m 6168 S 49.0 1.4 7036:11
java
2930
djbs002 18 0 845m 44m 6108 S 19.6 1.1 561:22.14
java
30116
djbs002 18 0 459m 18m 6108 S 19.6 0.4 42:10.06
java
26176
djbs002 15 0 12764 1040 760 R 2.0 0.0 0:00.04
top
From
the above case we are sure that the process ID – 16361 is using
107.7% of the Cpu and this is the highest cpu using process. Now are
are aware of which process is utilizing the Cpu high.
The
above one was a Tomcat process.
2
.Not the next step is to find which threads in the java process are
causing the high Cpu.
Execute
“top -H -b -p 16361” to see the threads causing High Cpu
where 16361 is the PID.
The
above top command execute for every 2 seconds and shows the results
on the console. We can also send the output to a file for later
checking. Now when I execute the above command I see,
top
- 23:03:01 up 42 days, 3:37, 2 users, load average: 6.52, 6.62,
6.77
Tasks:
46 total, 3 running, 43 sleeping, 0 stopped, 0 zombie
Cpu(s):
98.7%us, 0.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.7%si,
0.3%st
Mem:
4194304k total, 4034004k used, 160300k free, 806884k buffers
Swap:
4194296k total, 52k used, 4194244k free, 926504k cached
PID
USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
17997
djbs002 25 0 941m 244m 9.9m R 33.2 6.0 24:20.73 java
17998
djbs002 25 0 941m 244m 9.9m R 33.2 6.0 24:31.29 java
18000
djbs002 25 0 941m 244m 9.9m R 32.2 6.0 24:30.46 java
The
above is a extract from the top command output. I observed the above
command for a couple of minute and found that the above 3 threads are
causing high Cpu usage. Notice that these belong to the process ID
16361.So these 3 threads in process 16361 are causing the process to
use high Cpu.
Since
we got the threads we need to find what these threads are.
We
can also use watch "ps -eLo pid,ppid,tid,pcpu,comm | grep
16361" to see the threads
3.Take
a Thread dumps for the above process 16361 using
jstack
-l 16361
now
once we got the thread dump , we need to find the thread and from
where it was started.
Since
there are 3 threads that are causing the high Cpu , let find them
4.Convert
the Thread ID to hex decimal.
In
order to find the exact thread that belong to thread ID , we need to
first convert them in hex decimal so that we can search them in the
thread dump.
A
little shell script will do this for us
write
a shell script “dec2hex.sh” with contents
printf
"dec -> hex: %d = %x \n" ${1} ${1}
save
it .
Now
when we run the shell script by passing the Thread ID like
./dec2hex.sh
17997
dec
-> hex: 17997 = 464d
The
result was 464d. Now search the Thread Dump for the nid value.
"Thread-55"
daemon prio=10 tid=0x0a2e3000 nid=0x464d runnable
[0xc1aad000]
java.lang.Thread.State: RUNNABLE
at java.lang.StrictMath.atan(Native Method)
at java.lang.Math.atan(Math.java:187)
at org.apache.jsp.HighCPU_jsp$1.run(HighCPU_jsp.java:73)
at java.lang.Thread.run(Thread.java:619)
java.lang.Thread.State: RUNNABLE
at java.lang.StrictMath.atan(Native Method)
at java.lang.Math.atan(Math.java:187)
at org.apache.jsp.HighCPU_jsp$1.run(HighCPU_jsp.java:73)
at java.lang.Thread.run(Thread.java:619)
The
TID is thead id and NID is: Native thread ID. This ID is highly
platform dependent. It's the NID in jstack thread dumps.
Since
the Thread Stack trace I got says that the thread that has been
started from HighCPU_jsp.java class was causing the thread to use
high Cpu.
Now
I would ask the Developer team to check the code for any possible
Issues
Happy
learning