Pages

Thursday, September 18, 2014

HPROF aka Heap profiler

HPROF (Heap profiler) are a feature available in JDK which has a dynamically linked library that interface with the JVM. This utility uses Java Virtual Machine Tools Interface (JVM TI) to interface.

HRPOF is actually a JVM native agent library which is dynamically loaded through the command line Options, at JVM startup and then becomes a part of the JVM process. Using the HPROF options at start up we can get various pieces of information from heap, thread and CPU. The data generated can be in textual or binary format, and can be used to track down and isolate performance problems involving memory usage and inefficient code.

The Tool can be invoked using,
java -agentlib:hprof[=options] ToBeProfiledClass
-OR-
java -Xrunhprof[:options] ToBeProfiledClass

The sub-options include
  • heap=sites: Tells hprof to generate stack traces indicating where memory was allocated
  • cpu=samples: Tells hprof to use statistical sampling to determine where the CPU spends its time
  • depth=10: Tells hprof to show stack traces 10 levels deep, at most
  • monitor=y: Tells hprof to generate information on contention monitors used to synchronize the work of multiple threads
  • thread=y: Tells hprof to identify threads in stack traces
  • doe=y: Tells hprof to produce a dump of profiling data upon exit
We can use this as
java -agentlib:hprof=heap=sites <Class To be Profiled>

A file by the name “java.hprof.txt” is generated which will give all the information about the memory, Cpu etc.

Analyzing the Output
As we said once we run any of the above commands we get a file named “java.hprof.txt” in the current location. It is important to understand the elements in the file.

It starts with a header that says about its format, which the header claims is subject to change without notice.

Thread and Trace: The output file contains a Thread and Trace Sections. The thread section provides the list of threads started and terminated during the running of the program. We can see all threads that are started ran and terminated during the life of the Program. The Trace Section gives us the Information about the stack Traces for the threads. These Stack Traces will be used as Cross-references from other parts of the same output file. These gives information about what a thread did with the stack.

Heap: Depending on the Options we use (heap and CPU), the corresponding profiling information is added to the Output file.

When profiled the class with the heap, we see
SITES BEGIN (ordered by live bytes) Wed Sep 17 23:02:14 2014
             percent              live             alloc'ed       stack class
 rank   self    accum     bytes   objs     bytes  objs    trace name
    1   0.76%  0.76%      1104    2      1104    2     300144 byte[]
    2   0.71%  1.48%      1032    4      1032    4     300051 char[]
    3   0.65%  2.12%       936     4       936     4     300063 byte[]

We can get a dump of all live objects in the Java heap (heap=dump) and/or a sorted list of allocation sites that identifies the most heavily allocated objects (heap=sites).

CPU: The CPU Sample and CPU Time section help us to understand the CPU utilization ( using cpu=sample or cpu=time) . CPU Time includes measurements of how many times a given method was called and how long each method took to execute.

The CPU samples look some thing like this,
CPU SAMPLES BEGIN (total = 3337) Wed Sep 17 23:02:14 2014
rank   self    accum   count     trace    method
   1  99.82% 99.82%    3331 300345 Struck.main
   2  0.12%   99.94%       4    300366 java.lang.Shutdown.halt0
   3  0.03%   99.97%       1    300186 java.lang.Object.<init>
   4  0.03%   100.00%      1   300343 java.lang.Thread.start0

Monitor: The Monitor Time and Monitor Dump sections help you understand how synchronization affects your program's performance. Monitor Time shows how much time your threads experience contention for locked resources. Monitor Dump is a snapshot of monitors currently in use. As you'll see, Monitor Dump is useful for finding deadlocks

Case Studies
We will see how we can use this to analyze the output.

Memory Usage: copy this memory leak program

import java.util.ArrayList;
import java.util.List;

public class Main {
 public static void main(final String... args) {
 System.out.println("start");
 List<Integer> numbers = new ArrayList<Integer>();
 for (int i = 0; i <= Integer.MAX_VALUE; i++) { // going to fail
 numbers.add(i);
 }
 System.out.println("stop");
 }
}

execute the program using,
javac -J-agentlib:hprof=heap=sites Main

We used the heap option to find the memory usage and we can see

SITES BEGIN (ordered by live bytes) Wed Sep 17 22:48:56 2014
           percent                live                     alloc'ed                    stack class
 rank   self     accum     bytes      objs         bytes          objs        trace name
    1  76.73% 76.73%  45226080 2826630  45226080 2826630 300188 java.lang.Integer
    2  23.03% 99.75%  13571688    1          40715144 31          300187 java.lang.Object[]
SITES END

We can see that 2826630 Objects of java.lang.Interger are created holding 45226080 bytes of data. We can see that 2826630 Objects in live section are same as 2826630 Objects in the allocated. It means that objects created are not all garbage collected causing Out of memory leaks.

The next step is to find where these objects are being created. If we check the trace under the stack Class section we see 300188 and 300187 trace ID. If we search for these traceID in the same output file we see

TRACE 300188:
    java.lang.Number.<init>(Number.java:32)
    java.lang.Integer.<init>(Integer.java:642)
    java.lang.Integer.valueOf(Integer.java:625)
    Main.main(Main.java:9)

TRACE 300187:
    java.util.Arrays.copyOf(Arrays.java:2760)
    java.util.Arrays.copyOf(Arrays.java:2734)
    java.util.ArrayList.ensureCapacity(ArrayList.java:167)
    java.util.ArrayList.add(ArrayList.java:351)

This clearly tells us the Main.main(Main.java:9) is creating Objects of java.lang.Integer.

Cpu Usage: In this series we will see how we can identify the CPU usage. Copy the same Struck.java code which will create struck threads.

import java.lang.*;

class Struck {
    static boolean done;
    public static void main(String args[]) {
        done = false;
        new Thread() {
            public void run() {
            try {
                Thread.sleep(1000); // dummy work load
            } catch (Exception e) {
                done = true;
            }
            done = true;
            }
        }.start();
        while (!done);
        System.out.println("bye");
    }
}
Execute the code using, javac -J-agentlib:hprof=cpu=sites.

We can extract many details from the output file generated. The sites section in the output file gives us

CPU SAMPLES BEGIN (total = 3337) Wed Sep 17 23:02:14 2014
rank   self     accum       count   trace    method
   1    99.82% 99.82%    3331   300345  Struck.main
   2    0.12%   99.94%       4     300366  java.lang.Shutdown.halt0
   3    0.03%   99.97%       1     300186  java.lang.Object.<init>
   4    0.03%  100.00%      1     300343  java.lang.Thread.start0

If we check this section, we can see method Struck.main is taking more time at around 99.82%. We can tell that code from Struck.java is causing the high CPU usage.  We can also use the traceID “300345” to find out the thread that is causing this. We can see the thread for the traceID  “300345” as

TRACE 300345: (thread=200001)
    Struck.main(Struck.java:15)

Dead Lock Detection: Copy following code to generate a dead lock,

import java.util.*;

public class SimpleDeadLock extends Thread {
   public static Object l1 = new Object();
   public static Object l2 = new Object();
   private int index;
   public static void main(String[] a) {
      Thread t1 = new Thread1();
      Thread t2 = new Thread2();
      t1.start();
      t2.start();
   }
   private static class Thread1 extends Thread {
      public void run() {
         synchronized (l1) {
            System.out.println("Thread 1: Holding lock 1...");
            try { Thread.sleep(10); }
            catch (InterruptedException e) {}
            System.out.println("Thread 1: Waiting for lock 2...");
            synchronized (l2) {
               System.out.println("Thread 2: Holding lock 1 & 2...");
            }
         }
      }
   }
   private static class Thread2 extends Thread {
      public void run() {
         synchronized (l2) {
            System.out.println("Thread 2: Holding lock 2...");
            try { Thread.sleep(10); }
            catch (InterruptedException e) {}
            System.out.println("Thread 2: Waiting for lock 1...");
            synchronized (l1) {
               System.out.println("Thread 2: Holding lock 2 & 1...");
            }
         }
      }
   }
}

Compile the code using
javac -J-agentlib:hprof= heap=sites,cpu=samples,depth=10,monitor=y,thread=y,doe=y SimpleDeadLock

We can see a Monitor DUMP section in the output generated as

MONITOR DUMP BEGIN
    THREAD 200001, trace 300000, status: ZO
    THREAD 200002, trace 300000, status: ZO
    THREAD 200003, trace 300000, status: R
    THREAD 200004, trace 300352, status: R
    THREAD 200005, trace 300366, status: MW
    THREAD 200006, trace 300367, status: MW
    THREAD 200007, trace 300389, status: R
    THREAD 200008, trace 300388, status: R
    MONITOR Ljava/lang/Object;
    owner: thread 200005, entry count: 1
    waiting to enter: thread 200006
    waiting to be notified:
    MONITOR Ljava/lang/Object;
    owner: thread 200006, entry count: 1
    waiting to enter: thread 200005
    waiting to be notified:
    MONITOR Ljava/lang/Shutdown$Lock;
    owner: thread 200008, entry count: 1
    waiting to enter:
    waiting to be notified:
    MONITOR Ljava/lang/Class;
    owner: thread 200008, entry count: 1
    waiting to enter:
    waiting to be notified:
MONITOR DUMP END

From the above output we can see that Thread 200005 and 200006 both are in Waiting State. The monitor information in the above output gives more information. In this case both threads 200006 and 200005 are in dead lock state.

owner: thread 200005, entry count: 1
    waiting to enter: thread 200006
    waiting to be notified:
    MONITOR Ljava/lang/Object;
owner: thread 200006, entry count: 1
    waiting to enter: thread 200005
    waiting to be notified:

The thread 200005 is waiting to enter to the thread 200006 and vice versa if you see the output.

Dump: We can dump the details using
hello $ javac -J-agentlib:hprof=heap=dump Struck.java
Dumping Java heap ... done.

By using this HPROF we can analyze various levels of information even before the issues while running.

No comments :

Post a Comment