Pages

Monday, September 15, 2014

Analyzing a Thread Dump : case Studies

In the first doc we have seen how to analyze a Thread dump, in this article we will see how we can analyze using some case Studies.

Dead Lock
One of the famous issues that we see with our code was a Dead lock. We will see how we can identify dead lock using thread dumps. For this execute the sample code and generate thread dump

import java.util.*;

public class SimpleDeadLock extends Thread {
   public static Object l1 = new Object();
   public static Object l2 = new Object();
   private int index;
   public static void main(String[] a) {
      Thread t1 = new Thread1();
      Thread t2 = new Thread2();
      t1.start();
      t2.start();
   }
   private static class Thread1 extends Thread {
      public void run() {
         synchronized (l1) {
            System.out.println("Thread 1: Holding lock 1...");
            try { Thread.sleep(10); }
            catch (InterruptedException e) {}
            System.out.println("Thread 1: Waiting for lock 2...");
            synchronized (l2) {
               System.out.println("Thread 2: Holding lock 1 & 2...");
            }
         }
      }
   }
   private static class Thread2 extends Thread {
      public void run() {
         synchronized (l2) {
            System.out.println("Thread 2: Holding lock 2...");
            try { Thread.sleep(10); }
            catch (InterruptedException e) {}
            System.out.println("Thread 2: Waiting for lock 1...");
            synchronized (l1) {
               System.out.println("Thread 2: Holding lock 2 & 1...");
            }
         }
      }
   }
}

So when we generate the thread dump we see

"Thread-1" prio=10 tid=0x092c8c00 nid=0xf33 waiting for monitor entry [0xaf5d2000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at SimpleDeadLock$Thread2.run(SimpleDeadLock.java:34)
        - waiting to lock <0xed112a78> (a java.lang.Object)
        - locked <0xed112a80> (a java.lang.Object)

"Thread-0" prio=10 tid=0x092c7400 nid=0xf32 waiting for monitor entry [0xaf623000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at SimpleDeadLock$Thread1.run(SimpleDeadLock.java:21)
        - waiting to lock <0xed112a80> (a java.lang.Object)
        - locked <0xed112a78> (a java.lang.Object)
(Removed Additional Stack trace )

Found one Java-level deadlock:
=============================
"Thread-1":
  waiting to lock monitor 0x0928e7c8 (object 0xed112a78, a java.lang.Object),
  which is held by "Thread-0"
"Thread-0":
  waiting to lock monitor 0x0928dc10 (object 0xed112a80, a java.lang.Object),
  which is held by "Thread-1"

Java stack information for the threads listed above:
===================================================
"Thread-1":
        at SimpleDeadLock$Thread2.run(SimpleDeadLock.java:34)
        - waiting to lock <0xed112a78> (a java.lang.Object)
        - locked <0xed112a80> (a java.lang.Object)
"Thread-0":
        at SimpleDeadLock$Thread1.run(SimpleDeadLock.java:21)
        - waiting to lock <0xed112a80> (a java.lang.Object)
        - locked <0xed112a78> (a java.lang.Object)

Found 1 deadlock.

If we check the above thread dump we can easily identify since it tells us that there is a deadlock Found

"Thread-1":
        at SimpleDeadLock$Thread2.run(SimpleDeadLock.java:34)
        - waiting to lock <0xed112a78> (a java.lang.Object)
        - locked <0xed112a80> (a java.lang.Object)
"Thread-0":
        at SimpleDeadLock$Thread1.run(SimpleDeadLock.java:21)
        - waiting to lock <0xed112a80> (a java.lang.Object)
        - locked <0xed112a78> (a java.lang.Object)

The Thread-1 is waiting for a Lock on object 0xed112a78 (waiting to lock <0xed112a78>) and the same lock is kept hold by Thread-0 (locked <0xed112a78>)

The Thread-0 is waiting for a Lock on object 0xed112a80 (waiting to lock <0xed112a80>) and the same lock is kept hold by Thread-0 (locked <0xed112a80>)

High CPU
Another case where we see A High Cpu Usage in the System making it unstable. A High Cpu usage can be caused due to a Long running thread. The High Cpu Usage can be identified from the Os level Commands like top in linux. In this case we will see how we can identify a thread causing high Cpu usage.

Execute the following java code

import java.lang.*;

class Struck {
    static boolean done;
    public static void main(String args[]) {
        done = false;
        new Thread() {
            public void run() {
            try {
                Thread.sleep(1000); // dummy work load
            } catch (Exception e) {
                done = true;
            }
            done = true;
            }
        }.start();
        while (!done);
        System.out.println("bye");
    }
}

Now check the process which are casing high Cpu usage using top command in linux

  PID  USER       PR  NI  VIRT    RES  SHR  S %CPU %MEM    TIME+  COMMAND
 4587 djbs002    15   0 1164m  11m 6452 S  99.3  0.3       1:04.76     java
 4640 djbs002    15   0 12764  1112  848  R  1.3    0.0       0:00.10     top

So from the above top output we can see that process ID 4587 is causing high Cpu usage of around 99.3% and the command running is java.

Now we got the Process ID which is generating High Cpu . Our next step would be to find out the Thread that is causing this High Cpu.

run the following command , ps -eLo pid,ppid,tid,pcpu,comm multiple times to see thread information . This command will show you the threads that are currently running. Now we have the process ID 4587 , extract specific thread that belong to this process using

Every 2.0s: ps -eLo pid,ppid,tid,pcpu,comm | grep 4587                                                           Thu Sep  4 01:14:03 2014

 4587  3666  4587  0.0 java
 4587  3666  4588 99.0 java
 4587  3666  4589  0.0 java
 4587  3666  4590  0.0 java

The First column is the Process ID and Third column is the Thread ID . If we see the second row , we see 4588 thread is causing around 99% of Cpu usage. Run the same command multiple times , in all times it will show 4588 thread is causing high Cpu.

Now we have the thread ID 4588 , now our next step is take the thread dump for the Pid 4587. Once the thread dump is taken we need to find the corresponding thread for the thread ID 4588. Now convert the decimal value 4588 to hex-decimal value. We get 0x11ec.

Generate the thread dump and search for 0x11ec where we see

 "main" prio=10 tid=0x08828800 nid=0x11ec runnable [0xf7432000]
   java.lang.Thread.State: RUNNABLE
        at Struck.main(Struck.java:17)

check the nid value where we see 0x11ec. So the thread ID 4588 that obtained from Os
commands represent the nid value 0x11ec in the thread dump. The nid value represent to the main Thread .

This says that the code started from line 17 from the Struck.java code. We need to check the code and see what went wrong with our code. By this we find out the reason for high Cpu usage using thread dumps.

Waiting
Some times the thread will maintain a WAIT status. These threads will be in waiting for some external resource to respond. Sockets waiting to read data ( Execute Threads in Weblogic) , Listeners waiting for Messages to come from JMS. Threads like

"IoWaitThread" prio=6 tid=0x000000000321134 nid=0x2b3c waiting on condition [0x000000000435f000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000007d5c45850> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)

We need to keep an eye on these threads as these stay in wait state for long time causing High Cpu or Performance degradation

Slow Performance

During some cases, we see application got slow . Thread dumps help at this point. Generate thread dump multiple times and see the thread that are either blocked or waiting for some locks. 

3 comments :

  1. guest test post
    [url=http://google.tn/]bbcode[/url]
    html
    http://google.tn/ simple

    ReplyDelete
  2. I just look into the online tools for analysing thread dumps http://fastthread.io/ It is very easier and much worth trying.

    ReplyDelete