Wednesday, May 22, 2013

Solaris: Which process is associated with a socket connection?

Consider the following real-world scenario:

[...] one process is waiting for a very long time in a send system call. It is sending on a valid fd but the question we have is that, is there any way to find who is on the other end of that fd? We want to know to which process is that message being sent to. [...]

Here is how I go about finding the other end of the socket and the state of the socket connection, using Mozilla's Thunderbird mail client as the process at one end of the connection:

  1. Get the process id of the application
    % prstat
       PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP      
    22385 mandalik  180M   64M sleep   49    0   0:05:15 0.1% thunderbird-bin/5

  2. Run pfiles on the pid. It prints the list of files the process has open, including open sockets (pfiles is the Solaris-supported equivalent of the lsof utility).
    % pfiles 22385
    22385:  /usr/lib/thunderbird/thunderbird-bin -UILocale C -contentLocale C
      Current rlimit: 512 file descriptors
      ...
      ...
    
      33: S_IFSOCK mode:0666 dev:280,0 ino:31544 uid:0 gid:0 size:0
          O_RDWR|O_NONBLOCK
            SOCK_STREAM
            SO_SNDBUF(49152),SO_RCVBUF(49640)
            sockname: AF_INET 192.168.1.2  port: 60364
            peername: AF_INET 192.18.39.10  port: 993
            ...
            ...
  3. Locate the socket id and the corresponding sockname/port# and peername/port# in the output of pfiles <pid> (see step #2).

    My assumption here is that I already know the socket id (that is, the file descriptor number) I'm interested in. In the above output, 33 is the socket id. One end of the socket is bound to port 60364 on the local host 192.168.1.2, and the other end is bound to port 993 on the remote host 192.18.39.10.

  4. Run netstat -a | egrep "<local port>|<remote port>" (get the port numbers from step 3) and check the state of the socket connection. For a connection that is supposed to be in active use, anything other than ESTABLISHED indicates trouble.
    % netstat -a | egrep "60364|993"
    solaris-devx-iprb0.60364 mail-sfbay.sun.com.993 48460      0 49640      0 ESTABLISHED

    If you want to see IP addresses instead of host names, run netstat with the -n option.
    %  netstat -an | egrep "60364|993"
    192.168.1.2.60364    192.18.39.10.993     49559      0 49640      0 ESTABLISHED

    Now that we know both ends of the socket, we can easily get the state of the socket connection at the other end by running the same netstat -an | egrep "<local port>|<remote port>" command on the remote host.

    If the state of the socket connection is CLOSE_WAIT, have a look at the following diagnosis: CPU hog with connections in CLOSE_WAIT.
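
Steps 2 and 3 can be scripted when you don't feel like scrolling through pfiles output by hand. The following is only a minimal sketch: socket_ends is a helper name I made up for illustration, and it assumes the pfiles output looks like the listing in step 2 (an "<fd>:" line followed by indented sockname/peername lines). It reads pfiles output on stdin and prints the local and remote address/port for one file descriptor.

```shell
# socket_ends (hypothetical helper): print both ends of one socket.
#   $1    = file descriptor number to look for (33 in the example above)
#   stdin = output of "pfiles <pid>"
socket_ends() {
    awk -v fd="$1" '
        # A line whose first field is "<number>:" starts a new entry
        # (the pid header line or a file descriptor). Track whether
        # we are inside the entry for the requested descriptor.
        $1 ~ /^[0-9]+:$/ { in_fd = ($1 == (fd ":")) }

        # Inside that entry, report the address (3rd field) and the
        # port (last field) of the sockname and peername lines.
        in_fd && ($1 == "sockname:" || $1 == "peername:") {
            print $1, $3, $NF
        }
    '
}
```

With the Thunderbird example above, it would be used like this, and should print the two addresses found in step 3:

    % pfiles 22385 | socket_ends 33
    sockname: 192.168.1.2 60364
    peername: 192.18.39.10 993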

Finally, to answer the "which process is that message being sent to" part of the original question:

Follow the above steps to find the remote host (or IP address) and the remote port number. To find the process id on the remote machine that owns the other half of the socket, do the following:

  1. Log in as the root user on the remote host.
  2. cd /proc
  3. Run pfiles * | egrep "^[0-9]|sockname" > /tmp/pfiles.txt.
  4. Open /tmp/pfiles.txt in vi and search for the port number. If you scroll a few lines up, you can see the process ID and the name of the process along with its argument(s).
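
The search in step 4 can also be done without vi. Here is a rough sketch, again with a made-up helper name (port_owner). It assumes the file was produced exactly as in step 3, so it contains only process header lines (which start with the pid) and sockname lines. It remembers the most recent header line and prints it whenever a sockname line ends with the port you are looking for.

```shell
# port_owner (hypothetical helper): find processes bound to a port.
#   $1 = port number to look for
#   $2 = file produced in step 3 (for example /tmp/pfiles.txt)
port_owner() {
    awk -v port="$1" '
        # Header lines start with the pid; remember the latest one.
        /^[0-9]/ { proc = $0 }

        # A sockname line whose last field equals the port belongs to
        # the remembered process. Print each process only once.
        /sockname:/ && $NF == port && !seen[proc]++ { print proc }
    ' "$2"
}
```

For the example in this post, it would be run on the remote host as:

    % port_owner 993 /tmp/pfiles.txt

Because the port is compared against the last field as a whole, port 993 does not accidentally match 9930 or 1993, which a plain text search would. If several processes are bound to the same port, all of them are listed, and you still have to look at the peername lines in the full pfiles output to pick the exact connection.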