Pages

Tuesday, September 23, 2014

Analyzing a JVM Crash

A crash can cause the JVM to terminate. A Crash or even a fatal error can cause a Crash. A bug in the API, a bug in third party library that we use or bug in a native api can cause these JVM crashes. JVM can crash due to the external factors like resource exhaustion etc.

In this article we will see how we can analyze a JVM crash. Whenever a JVM is crashed a fatal error log file is generated. The file is named as hs_err_pid<pid>.log (where <pid> is the process id of the process). Most cases this file is created in the same working directory and in some other cases the file may be created in the temporary location. The file is created in the /tmp location in Linux and in directory specified by TEMP environment variable in windows.

The format of the error log file will change accordingly.

Header
The header in the log file tells us about the problem. This is printed to standard output. An example log file looks as

#
# An unexpected error has been detected by Java Runtime Environment:
#
#  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x074639a3, pid=1232, tid=9240
#
# Java VM: Java HotSpot(TM) Client VM (11.0-b16 mixed mode, sharing windows-x86)
# Problematic frame:
# C [jp2iexp.dll+0x39a3]
#
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

The first line tells that the JVM crashed due to an Unexpected signal. The second line gives some more details like,
EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x074639a3, pid=1232, tid=9240

EXCEPTION_ACCESS_VIOLATION: This was crashed with an Access violation exception and this is the signal name.
(0xc0000005): Signal Number
0x074639a3: Instruction pointer
pid=1232: Process ID
tid=9240: Thread ID

The next line tells us about the JVM
# Java VM: Java HotSpot(TM) Client VM (11.0-b16 mixed mode, sharing windows-x86)
This says that the JVM is running in Client mode, mixed and sharing mode.

The Problematic frame is the function frame where the error occurred. The next line gives us like
C [jp2iexp.dll+0x39a3]
This is represented as a Library name and offset. It's possible to inspect the instructions that caused the crash without a debugger or core file.  Just use a disassembler to dump instructions near the offset.

In the above line, the “C” is called a Native C Frame. We do have some other frames as
C -Native C frame
J –Other frame types including compiled Java frames
j -Interpreted Java frames
V -VM frames
v –VM generated stub frame

Thread
The fatal error log will also give us the Thread details. The thread section in the log file gives us information about the thread that crashed. There may be cases where multiple threads crash, at this point only one thread is printed.

The first part in the thread section tell us
Current thread (0x0d0eb000):  JavaThread "main" [_thread_in_native, id=9240, stack (0x044e0000, 0x046e0000)]

Current thread (0x0d0eb000): The Thread pointer is a pointer to the internal representation to the JVM internal thread structure.
JavaThread "main": The Java Thread that is currently running. This can change accordingly.
_thread_in_native: The Thread State.

Some other thread states include,
_thread_uninitialized: Thread is not created. This should never happen unless there's
memory corruption
_thread_new: Thread has been created but it has not yet started.
_thread_in_native: Thread is running native code. Probably a bug in native code.
_thread_in_vm: Thread is running VM code.
_thread_in_Java: Thread is running (either interpreted or compiled) Java code.
_thread_blocked: Thread is blocked.
..._trans: If you see any of the above states but followed by "_trans", it means
the thread is changing to a different state.

id=9240: The thread ID is the native Thread ID also seen as nid in Thread dumps.

The next line is the signal information like
siginfo: ExceptionCode=0xc0000005, reading address 0x00000000

This indicates that the exception code is 0xc0000005 (ACCESS_VIOLATION), and it occurred when the thread attempted to read address 0x00000000.

The next line after this will be the register information like,
Registers:
EAX=0x00000000, EBX=0x28108828, ECX=0x004583a8, EDX=0x046dc46c
ESP=0x046dc448, EBP=0x046dc458, ESI=0x00000000, EDI=0x0d0eb000
EIP=0x074639a3, EFLAGS=0x00010246

These register gives us the context of the register at the time of the fatal error and these will change depending on the underlying architecture.

After the register values, you will see the top of stack and the instructions/opcodes near the crashing pc. The error log prints 32 bytes which can be decoded using a disassembler to see the instructions around the location of the crash. These can be seen as
Top of Stack: (sp=0x046dc448)
0x046dc448:   046dc46c 046dc454 28108828 00000000
0x046dc458:   046dc498 0e869d11 0d0eb114 046dc4a0
0x046dc468:   09eaabf8 00000000 0d0eb9ac 0d0eb9a4
0x046dc478:   046dc478 28108828 046dc4ac 28110a90
0x046dc488:   00000000 28108828 00000000 046dc4a8
0x046dc498:   046dc4d4 0e862f07 281104e0 0e8681a6
0x046dc4a8:   09eaabf8 00000000 2718dd98 046dc4b4
0x046dc4b8:   28108797 046dc4dc 28110a90 00000000

Instructions: (pc=0x074639a3)
0x07463993:   33 f6 85 c0 7c 3a 8b 45 fc 8d 55 14 52 89 75 14
0x074639a3:   8b 08 50 ff 91 a0 00 00 00 85 c0 7c 1a 8b 75 14

The next line of information will be the thread stack trace. This includes the addresses of the base and the top of the stack, the current stack pointer, and the amount of unused stack available to the thread. This is followed by the stack frames and up to 100 frames are printed. For C/C++ frames the library name may also be printed.

Stack: [0x044e0000, 0x046e0000], sp=0x046dc448, free space=2033k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [jp2iexp.dll+0x39a3]
j  sun.plugin2.main.server.IExplorerPlugin.getDocumentBase0(J)Ljava/lang/String;+0
j  sun.plugin2.main.server.IExplorerPlugin.getDocumentBase()Ljava/lang/String;+31
……..
……..

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  sun.plugin2.main.server.IExplorerPlugin.getDocumentBase0(J)Ljava/lang/String;+0
j  sun.plugin2.main.server.IExplorerPlugin.getDocumentBase()Ljava/lang/String;+31
…….
If we see we have 2 stack traces one was a Native Stack trace and other was a Java Stack Trace.

The Native Stack trace gives us the native thread information showing all function calls leaving the Java calls. The second stack is Java frames which prints the java frames including the in-lined methods. It skips the native frames. Sometimes according to the crash it will not print the native thread details but can print the java details.

Process
Some of the fatal error log file will generate the Process Information too. The process section gives us information about the process which includes the thread details and memory usage. We can see the information as,

Java Threads: ( => current thread )

Other Threads:
=>0x00007faa08008800 (exited) JavaThread "Unknown thread" [_thread_in_vm, id=12588, stack(0x00007faa0c9d5000,0x00007faa0cad6000)]

Java Threads: ( => current thread )
  0x11726c00 JavaThread "JRE 1.6.0.11 Worker Thread" [_thread_blocked, id=10764, stack(0x123d0000,0x124d0000)]
  0x11726800 JavaThread "JRE 1.6.0.11 Output Reader Thread" [_thread_in_native,
……

Other Threads:
  0x0d14b800 VMThread [stack: 0x109b0000,0x10ab0000] [id=9664]
  0x0d171c00 WatcherThread [stack: 0x114e0000,0x115e0000] [id=8004]

In the above line we can see, The thread list includes the threads that the VM is aware of. This includes all Java threads and some VM internal threads, but does not include any native threads created by the user application that have not attached to the VM

VM state:not at safepoint (normal execution)
The next line gives us the VM state. This normally tells us the state of the VM. Some normal VM states include
1.    not at a safepoint – normal execution
2.    at safepoint – indicates that all threads are blocked in the VM waiting for a special vm operation to complete
3.    synchronizing – indicates that a special vm operation is required and the VM is waiting for all threads in the vm to block.

VM Mutex/Monitor currently owned by a thread: None
The next line gives us the Mutex/Monitor Information. These are not java Object locks but JVM internal locks. For each lock it prints the name of the lock, its owner, and the addresses of a VM internal mutex structure and its OS lock. In this case the lock is owned by None.
Heap
 def new generation   total 960K, used 773K [0x27100000, 0x27200000, 0x27230000)
  eden space 896K,  86% used [0x27100000, 0x271c15f8, 0x271e0000)
  from space 64K,   0% used [0x271e0000, 0x271e0000, 0x271f0000)
  to   space 64K,   0% used [0x271f0000, 0x271f0000, 0x27200000)
 tenured generation   total 4096K, used 0K [0x27230000, 0x27630000, 0x28100000)
   the space 4096K,   0% used [0x27230000, 0x27230000, 0x27230200, 0x27630000)
 compacting perm gen  total 12288K, used 749K [0x28100000, 0x28d00000, 0x2c100000)
   the space 12288K,   6% used [0x28100000, 0x281bb598, 0x281bb600, 0x28d00000)
    ro space 8192K,  63% used [0x2c100000, 0x2c613ae8, 0x2c613c00, 0x2c900000)
    rw space 12288K,  53% used [0x2c900000, 0x2cf683f8, 0x2cf68400, 0x2d500000)

The next section prints about the heap details and they depend upon the GC and memory used.

After the heap details, the next section tells us about the memory regions at the time of crash. A Memory map is also generated which is use-full when debugging some crashes, as it can tell you what libraries are actually being used, their location in memory, as well as the location of heap, stack and guard pages. This memory map section will be different depending on the Operating Systems. On Linux the process memory map (/proc/<pid>/maps) is printed. On Windows, the base and end addresses of each library are printed.

Dynamic libraries:
0x01220000 - 0x012dc000   C:\Program Files (x86)\Internet Explorer\IEXPLORE.EXE
0x777a0000 - 0x77920000   C:\Windows\SysWOW64\ntdll.dll
0x76d70000 - 0x76e80000   C:\Windows\syswow64\kernel32.dll
……..

Or in some cases

Dynamic libraries:
00400000-00401000 r-xp 00000000 fd 04 263750    /software/bin/java
0238e000-023af000 rw-p 00000000 00:00 0                            [heap]
3fae00000-800000000 rw-p 00000000 00:00 0
3be8802000-3be8a02000 ---p 00002000 fd:00 321        /lib64/libdl-2.12.so
…..

The reading of memory map is as follows                      
00400000-00401000: memory region
r-xp: Permission on the region , we have Read,Write,eXecute,Private and Share.
Fd: File Off set number
04:Major ID and minor ID of the device from where the file is located or accessed.
263750: Inode number   
/software/bin/java: File Name

Every library has two virtual memory regions. One for code and one for data. The code
segment is marked with r-xp (readable, executable, private); data segment is rw-p
(readable, writable, private).

The Java heap details can be obtained with the map listed as heap. This gives us the the actual memory regions reserved for heap with attributes rwxp.

Thread stacks usually show up in the memory map as two back-to-back regions, one with
attribute ---p (guard page) and one with rwxp (actual stack space). Of course, knowing the
guard page size or stack size would make the job a bit easier. For example, in this memory
map, 3be8802000-3be8a02000 is stack.

After the memory map details, we can see VM arguments passed,

VM Arguments:
jvm_args: -XX:+AggressiveOpts -XX:+UseLargePages -Xmn10g -Xms26g -Xmx26g
java_command: <unknown>
Launcher Type: SUN_STANDARD

The next section tells us about the Environment variables passed
Environment Variables:
JAVA_HOME=/software/java64/current
PATH=/usr/local/bin:/bin:/usr/bin:/usr/local/.............
LD_LIBRARY_PATH=/software/jboss/ews/current/lib:
SHELL=/bin/bash

System
The last section is about the System which gives us the information about the system statistics when the crash occurred like

OS:Red Hat Enterprise Linux Server release 6.4 (Santiago)

uname:Linux 2.6.32-358.23.2.el6.x86_64 #1 SMP Sat Sep 14 05:32:37 EDT 2013 x86_64
libc:glibc 2.12 NPTL 2.12
rlimit: STACK 10240k, CORE 0k, NPROC 32768, NOFILE 8192, AS infinity
load average:0.00 0.01 0.00

/proc/meminfo:
MemTotal:        1922716 kB
MemFree:          177136 kB
Buffers:           67908 kB
Cached:           226672 kB
SwapCached:        11612 kB
Active:          1093288 kB
Inactive:         472848 kB
Active(anon):     950380 kB
Inactive(anon):   324624 kB
Active(file):     142908 kB
Inactive(file):   148224 kB
Unevictable:        4208 kB
Mlocked:            4208 kB
SwapTotal:       4194296 kB
SwapFree:        4131324 kB
Dirty:               112 kB
Writeback:             0 kB
AnonPages:       1267744 kB
Mapped:            21132 kB
Shmem:                 0 kB
Slab:             139492 kB
SReclaimable:     111720 kB
SUnreclaim:        27772 kB
KernelStack:        2768 kB
PageTables:         7384 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     5155652 kB
Committed_AS:    2018712 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       12060 kB
VmallocChunk:   34359712480 kB
HardwareCorrupted:     0 kB
AnonHugePages:     75776 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        8180 kB
DirectMap2M:     2088960 kB

CPU:total 1 (1 cores per cpu, 1 threads per core) family 6 model 13 stepping 3, cmov, cx8, fxsr, mmx, sse, sse2, sse3, lzcnt, tsc

/proc/cpuinfo:
processor      : 0
vendor_id      : AuthenticAMD
cpu family     : 6
model           : 13
model name   : QEMU Virtual CPU version (cpu64-rhel6)
stepping        : 3
cpu MHz                  : 2394.026
cache size      : 512 KB
fpu               : yes
fpu_exception : yes
cpuid level     : 4
wp               : yes
flags             : fpu de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm up unfair_spinlock pni cx16 hypervisor lahf_lm abm sse4a
bogomips       : 4788.05
TLB size        : 1024 4K pages
clflush size     : 64
cache_alignment       : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

Memory: 4k page, physical 1922716k(177136k free), swap 4194296k(4131324k free)

The output is operating system specific but in general includes the operating system version, CPU information, and summary information about the memory configuration.

In Linux or solaris , the the operating system information obtained by reading /etc/*release.
The uname In linux is called to get information about the kernel and the Thread library

uname:Linux 2.6.32-358.23.2.el6.x86_64 #1 SMP Sat Sep 14 05:32:37 EDT 2013 x86_64
libc:glibc 2.12 NPTL 2.12

On linux there are 3 types of threading libraries which areinuxthreads (fixed stack), linuxthreads (floating stack) and NPTL. In our case it’s the NTPL. Knowing this type is use-full when we see crashes that come due to the threading library.

The rlimit is actually a wrapper for the ulimit command in linux
rlimit: STACK 10240k, CORE 0k, NPROC 32768, NOFILE 8192, AS infinity
load average:0.00 0.01 0.00
The ulimit gives us the information about the limits that are set for various elements. In the above line
Stack: Stack Size (ulimit –s)
CORE: Core dump size(ulimit –c)
NPROC: Maximum user process (Ulimit –u)
NOFILE: Maximum Open files (ulimit –n)

CPU:total 1 (1 cores per cpu, 1 threads per core) family 6 model 13 stepping 3, cmov, cx8, fxsr, mmx, sse, sse2, sse3, lzcnt, tsc

The CPU information is also provided with the fatal error log and last the memory information is provided as
Memory: 4k page, physical 1922716k(177136k free), swap 4194296k(4131324k free)
4K: Page Size
1922716k: Physical memory
177136k: un-used Physical Memory
4194296k: Total Swap memory
4131324k: Un-used Swap Memory

If both physical memory and swap space are almost full, there's good reason to suspect the crash was due to running out of memory.

vm_info: Java HotSpot(TM) Client VM (11.0-b16) for windows-x86 JRE (1.6.0_11-b03), built on Nov 10 2008 02:15:12 by "java_re" with MS VC++ 7.1

The Final Section is the Vm_info.This is a version string embedded in libjvm.so/jvm.dll. Every JVM has its own unique vm_info string. When you are in doubt about whether the hs_err*.log was generated by a particular JVM, check the version string.

I will be updating this doc with more details on analyzing the error log file.


More to Come, Happy Learning J

14 comments :

  1. Very informative, thanks. Please post some details on how to deduce the garbage heap collection details.

    ReplyDelete
  2. Hi! I could have sworn I've visited this site before but after browsing through many
    of the posts I realized it's new to me. Nonetheless,
    I'm definitely happy I came across it and I'll be bookmarking
    it and checking back often!

    ReplyDelete
  3. great article..anyway to get the details of the thread that cause JVM crash ?

    ReplyDelete
  4. Hey I know this is off topic but I was wondering if you knew of any widgets I could
    add to my blog that automatically tweet my newest
    twitter updates. I've been looking for a plug-in like this for quite some time and was hoping
    maybe you would have some experience with something like this.
    Please let me know if you run into anything. I truly enjoy reading your blog and I look
    forward to your new updates.

    ReplyDelete
  5. Here you found its! Go ahead !! Love this game !

    ReplyDelete
  6. A dua
    hung
    chuôm
    hàng đầu
    Bạc
    đình công
    Bàn bạc
    tuổi teen
    bao bọc
    phải
    phong cách
    mách nhỏ
    ngày nay
    mẫu mã
    quanh đó
    không tính
    bí quyết
    xem nhẹ
    xếp đặt
    bởi thế
    bớt xén
    bự chảng
    bước ngoặt
    kinh doanh
    quý khách
    các
    cái
    cơ bản
    nguồn gốc
    cần thiết
    cẩn thận
    đớp
    khó khăn
    gắt
    chắc chắn
    loại vải
    chế tạo
    tầm giá
    chi tiết
    chính hãng
    chọn
    chống thấm
    quan tâm
    chú ý
    đi lại
    dùng cho
    xây dựng thương hiệu
    sở hữu
    công nghệ
    công ty
    cộng
    shop
    vô cùng
    cười
    cứu giúp
    rộng rãi
    đông đảo
    hầu hết
    đặc biệt
    lẩn thẩn
    mê say
    bộ dạng
    đập phá
    đau ốm
    thứ 1
    toàn bộ
    tiết kiệm
    đem lại
    tới
    trẻ trung
    đi ra
    thế mạnh
    điều tra
    đồ vật
    đơn thuần
    ngừng
    tiêu dùng
    dùng
    êm dịu
    kém chất lượng
    ràng buộc
    vừa rồi
    gan dạ
    quen thuộc
    gần gũi
    sắp
    gặp
    gấp rút
    ví như
    giải thích
    hạn chế
    hạn chế
    kết hợp
    hết lòng
    tiên tiến
    tuyệt vời
    tương đối
    thua sút
    kém
    rốt cục
    hậu quả
    khai mạc
    quan niệm
    tổng quát
    khi không
    lúc
    làm
    ko
    khuyến mãi
    khiếu nại
    lang bạt
    làng
    điều khiển
    rét mướt
    chếnh choáng
    lào xào
    thuận tiện
    thu hút
    lớn
    quái đản
    miêu tả
    sắm
    bắt
    tài năng
    nặng nề
    nấu nướng

    sửng sốt
    nhãn hàng
    thời gian nhanh
    nhốn nháo
    sôi động
    vượt trội
    nức danh
    oai nghiêm
    ổn định
    pha chế
    thấp
    phẳng phiu
    vươn lên là
    nguyên tắc
    phù hợp
    PR
    qua chuyện
    ưa chuộng
    rách rưới
    rậm rì
    rẻ mạt
    rõ rệt
    bao la
    tinh khiết
    thông minh
    sáng sủa
    lịch sự
    sâu sắc
    hoảng
    tái mét
    tai vạ
    tài xế
    tật nguyền
    tang chứng
    tàu bay
    thoải mái
    lương y
    thừa thãi
    nối tiếp
    Dù vậy
    mờ mịt
    đầm đìa
    va vấp
    cầu xin
    vắng ngắt
    lặt vặt
    xa cách
    lãng phí
    xâm chiếm
    xanh rờn
    xấu xa
    xin xỏ
    hoàn thành
    xu thế
    dự định
    ỷ lại
    trang phục
    y chang
    yên lặng
    yêu đương
    yếu

    ReplyDelete
  7. We have JVM running on windows, can you please tell me if the "Swap" memory is the same as the paging file for windows?

    ReplyDelete
  8. I'm extremely іmpressed together with your writing
    abilities as neatly as with tһe fokrmat in your
    weblog. Is that this a paid theme or did you mоdify it yourself?
    Anyѡay stay uup the excwllent hiyh quality writing, it's rare to
    see a nicee weblog like this one today..

    ReplyDelete
  9. Quality aгticles is the key to attract the people to
    ρay a visit the web page, that's what this ᴡeb site is providing.

    ReplyDelete
  10. Hey there, You've done an incredible job. I'll certainly digg it and
    personally suggest to my friends. I'm sure they will be benefited from
    this web site.

    ReplyDelete
  11. Verү good article. I absoⅼutely lovе this websіte. Keep writing!

    ReplyDelete
  12. You've made sоme goⲟd poіnts there. I checked on the internet to
    learn more aboᥙt the issuе and found most people will gо along with yoսr views
    on this web site.

    ReplyDelete
  13. Hеlⅼo to all, how is the whole thing, I think every one іs getting more from this web site, and your views are fɑstidiоus designed for new viewers.

    ReplyDelete
  14. For most recent news you have to go to see world wide web and on the
    web I found this web site as a finest site for hottest updates.

    ReplyDelete