
Thursday, September 25, 2014

Defunct Processes

Recently we faced an issue with one of our Tomcat servers. The server crashed with an Out of Memory error. We tried to bring it back up, but it would not start.

We then checked the process table, where we found the following:

hello:local-ews $ ps ux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME  COMMAND
root        6274  4.2    0.0         0        0    ?          Zl     Sep18  343:53 [java] <defunct>

Process 6274 has been in the process table since Sep 18. This is the process ID that the Tomcat instance had while it was running. The STAT column shows Zl: the Z means the process is a zombie (defunct), and the l means it is multi-threaded.

A defunct process, or zombie process, is a process that has completed execution but still has an entry in the process table. The process has finished its work and is waiting for its parent to collect its exit status. The entry will not go away until the parent process collects the child or the parent itself goes away. At this point our only option was to find the parent process and bounce it. I obtained the parent process ID (the PPID column) as follows:

hello:local-IEM-A2 $ ps -l 6274
F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY        TIME CMD
0 Z  7282  6274     1  4  80   0 -     0 exit   ?        344:57 [java] <defunct>

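The parent process ID was 1, which is the init process, the first process started at boot. A PPID of 1 means the defunct process was either started by init or adopted by init after its original parent exited.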
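
The same state and parent PID that ps reports can also be read from the /proc filesystem. The following is a minimal Python sketch, assuming a Linux system with /proc mounted; the function name state_and_ppid is only illustrative and not part of any tool used above.

```python
import os

def state_and_ppid(pid):
    """Return (state, ppid) for a PID by reading /proc/<pid>/status (Linux only)."""
    info = {}
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            # Each line looks like "Key:<tab>value".
            key, _, value = line.partition(":")
            info[key] = value.strip()
    # State reads like "Z (zombie)" or "S (sleeping)"; PPid is the parent PID.
    return info["State"], int(info["PPid"])

if __name__ == "__main__":
    # Inspect the current process as a harmless example.
    state, ppid = state_and_ppid(os.getpid())
    print(f"state={state} ppid={ppid}")
```

Calling state_and_ppid(6274) on the affected machine would be expected to report a state starting with Z and a parent PID of 1, matching the ps output above.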

We cannot kill a defunct process, because it is already dead; only its entry in the process table remains. The only way to clear it is to kill the parent process, but in this case the parent is init (PID 1), which cannot be killed, so we had to bounce the machine to clear the process.

A defunct process occupies very few resources: a slot in the process table, plus the exit status and resource usage (timing) information that the parent can ask for.

So what happens exactly?
UNIX and Linux maintain a parent-child relationship between processes. Whenever a child process dies, the parent receives a notification (the SIGCHLD signal), and it is then the parent's responsibility to acknowledge the child's death with the wait() system call. The return value of wait() is the PID of the child, and the child's exit status is stored in the status argument that the parent passes to wait(). Shells like bash use this status to decide how to process the following commands and to set the special $? variable. As long as the parent hasn't called wait(), the system needs to keep the dead child in the global process list, because that's the only place where its process ID and exit status are stored.
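
The wait() handshake described above can be sketched in a few lines of Python. This is only an illustration, assuming a UNIX-like system where os.fork() is available; the helper name fork_and_wait is made up for this example, and it uses waitpid() (the variant of wait() that targets one specific child).

```python
import os

def fork_and_wait(exit_code):
    """Fork a child that exits immediately, then collect it with waitpid()."""
    pid = os.fork()
    if pid == 0:
        # Child: terminate right away with the requested exit code.
        os._exit(exit_code)
    # Parent: waitpid() blocks until the child has exited, then returns
    # the child's PID together with its encoded exit status.
    reaped_pid, status = os.waitpid(pid, 0)
    return pid, reaped_pid, os.WEXITSTATUS(status)

if __name__ == "__main__":
    child, reaped, code = fork_and_wait(3)
    print(f"forked {child}, reaped {reaped}, exit code {code}")
```

The exit code recovered here is the same value a shell would place in $? after running a command.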

The purpose of a "zombie" is really just for the system to remember the process ID and exit status, so that it can report them to the parent process on request. If the parent "forgets" to collect its children, the zombie stays undead for as long as the parent keeps running.
This is the reason why we need to find the parent process and bounce it to clear these defunct processes.
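
To see a zombie appear and disappear, here is a small Python sketch, assuming Linux with /proc mounted. The names read_state and zombie_demo are hypothetical helpers written for this post. The parent deliberately delays its wait so that the child sits in the Z state, then collects it.

```python
import os
import time

def read_state(pid):
    """Return the one-letter state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name is in parentheses and may contain spaces,
    # so the state is the first field after the last ')'.
    return data[data.rfind(")") + 1:].split()[0]

def zombie_demo():
    """Create a short-lived zombie, then reap it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # Child exits at once; the parent has not waited yet.
    # Poll (for up to 5 seconds) until the kernel marks the child as a zombie.
    deadline = time.time() + 5
    state = read_state(pid)
    while state != "Z" and time.time() < deadline:
        time.sleep(0.01)
        state = read_state(pid)
    # Collecting the child removes its process table entry.
    os.waitpid(pid, 0)
    still_listed = os.path.exists(f"/proc/{pid}")
    return state, still_listed

if __name__ == "__main__":
    state_before, still_listed = zombie_demo()
    print(f"state before wait: {state_before}, still listed after wait: {still_listed}")
```

Until waitpid() runs, ps would list this child as defunct, just like the Tomcat process above.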


More to come. Happy learning :)

1 comment :

  1. what happens if the pid is 0? that means no PID exists. in this case what would be the next step?
