Pete's Super Systems by Peter Baer Galvin
SunWorld, April 1999

The power of /proc
Using proc tools to solve your system problems

------------------------------------------------------------------------
Abstract

Many sysadmins don't realize the wide variety and functionality of the tools that are native to Solaris. When they experience problems with applications or trouble at the system level, they can be left at a loss for how to debug the problem and resolve issues. This month, Peter takes a look at the proc tools and how they can help during difficult times. (3,000 words)
------------------------------------------------------------------------

Over the past several years, scientists at the Computing Science Research Center of Bell Laboratories have been working on a new, experimental operating system called Plan 9. This system has much in common with Unix, which isn't a coincidence, as many members of the team were also involved in the development of Unix at one level or another.

The key concept of Plan 9 is that almost everything on the system is treated as if it were a file. For instance, the state of the kernel is ascertained by viewing and manipulating files in a special directory. These "files" are really interfaces into the kernel. Treating everything as a file simplifies both the kernel and application programs. For example, on Plan 9 there is no special set of system calls that applications must use to deal with the state of the kernel. Instead, applications can be written using standard file I/O system calls.

One of the engineers from Bell Labs joined the Sun development team and created the /proc filesystem and the /usr/proc/bin tools. Before this interface was invented, all programs that used kernel state (for instance, ps and top) had to be recompiled with each OS release.
Those programs read kernel memory directly, and the memory locations of key variables changed with each release. With the advent of /proc, applications (including systems programs) have a uniform and stable interface into the kernel. The proc tools were actually written to test the /proc filesystem interface, but so many folks at Sun were using them that Sun decided to include them with Solaris.

Why, as a systems administrator, should you care? The tools that manipulate the proc interface can be very useful in determining system state and debugging application and systems problems. First, let's take a look at /proc and the tools, then we will look at some real-life uses for the tools.

Perhaps you have noticed an odd member of the output of df -k:

    $ df -k
    Filesystem            kbytes    used   avail capacity  Mounted on
    /proc                      0       0       0     0%    /proc

/proc is mounted by the /etc/rcS.d/S40standardmounts.sh startup script. The mount makes the /proc interface into the kernel available to the system and its applications. Once mounted, /proc reflects the state of all processes on the system.
Consider an abbreviated listing of /proc on Solaris 2.6 (Solaris 7 works the same way):

    $ ls -l /proc
    total 168
    dr-x--x--x   5 root     root         736 Jan 15 17:00 0
    dr-x--x--x   5 root     root         736 Jan 15 17:00 1
    dr-x--x--x   5 root     root         736 Feb 25 11:13 10258
    dr-x--x--x   5 root     root         736 Jan 15 17:00 11
    dr-x--x--x   5 jds      staff        736 Mar 17 08:03 11892
    dr-x--x--x   5 akane    staff        736 Mar 17 08:32 12032
    dr-x--x--x   5 cbertold staff        736 Mar 17 08:44 12098
    dr-x--x--x   5 jkelly   staff        736 Mar 17 08:56 12186
    dr-x--x--x   5 root     root         736 Mar  9 09:08 12522
    dr-x--x--x   5 jds      staff        736 Mar  9 09:08 12524
    dr-x--x--x   5 root     root         736 Mar  9 09:10 12540
    dr-x--x--x   5 jds      staff        736 Mar  9 09:10 12542
    dr-x--x--x   5 spd      staff        736 Mar 17 10:01 12547
    dr-x--x--x   5 cbertold staff        736 Mar 17 10:03 12555
    dr-x--x--x   5 root     root         736 Mar 17 10:09 12597
    dr-x--x--x   5 pbg      staff        736 Mar 17 10:09 12599
    dr-x--x--x   5 jds      staff        736 Mar 17 10:19 12660
    dr-x--x--x   5 pbg      staff        736 Mar 17 10:25 12670
    dr-x--x--x   5 root     root         736 Jan 15 17:00 2
    dr-x--x--x   5 root     root         736 Jan 15 17:01 239
    dr-x--x--x   5 root     root         736 Jan 15 17:01 241
    dr-x--x--x   5 root     root         736 Feb  9 13:50 24515
    dr-x--x--x   5 root     root         736 Jan 15 17:00 3
    dr-x--x--x   5 root     root         736 Jan 15 17:01 307

Each numerical directory entry in /proc represents the process with the matching process ID. The owner of the directory entry matches the UID of the process, and the group of the directory likewise matches the GID of the process. In this way, only the process owner (and root) has primary access to the process information.
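Because each /proc entry carries the owner of its process, a listing like the one above can be summarized per user. The following is a minimal sketch, not part of the proc tool set: the function name procs_per_owner is hypothetical, and it assumes the ls -l layout shown above, with the owner in the third field and the PID as the last field.

```shell
#!/bin/sh
# Count process directories per owner from "ls -l /proc" output on stdin.
# Assumption: owner is field 3 and the PID is the last field, as in the
# listing above. procs_per_owner is a hypothetical helper name.
procs_per_owner() {
    awk '$1 ~ /^d/ && $NF ~ /^[0-9]+$/ { count[$3]++ }
         END { for (user in count) print user, count[user] }' | sort
}

# Demonstration on a few lines copied from the listing above.
procs_per_owner <<'EOF'
total 168
dr-x--x--x   5 root     root         736 Jan 15 17:00 0
dr-x--x--x   5 jds      staff        736 Mar 17 08:03 11892
dr-x--x--x   5 pbg      staff        736 Mar 17 10:09 12599
dr-x--x--x   5 jds      staff        736 Mar 17 10:19 12660
EOF
```

On a live system the same filter would be fed directly, as in: ls -l /proc | procs_per_owner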
Looking within a directory, we see:

    $ ls -l /proc/12599
    total 3543
    -rw-------   1 pbg      staff    1794048 Mar 18 20:48 as
    -r--------   1 pbg      staff        152 Mar 18 20:48 auxv
    -r--------   1 pbg      staff         32 Mar 18 20:48 cred
    --w-------   1 pbg      staff          0 Mar 18 20:48 ctl
    lr-x------   1 pbg      staff          0 Mar 18 20:48 cwd ->
    dr-x------   2 pbg      staff       1056 Mar 18 20:48 fd
    -r--r--r--   1 pbg      staff        120 Mar 18 20:48 lpsinfo
    -r--------   1 pbg      staff        912 Mar 18 20:48 lstatus
    -r--r--r--   1 pbg      staff        536 Mar 18 20:48 lusage
    dr-xr-xr-x   3 pbg      staff         48 Mar 18 20:48 lwp
    -r--------   1 pbg      staff       1728 Mar 18 20:48 map
    dr-x------   2 pbg      staff        544 Mar 18 20:48 object
    -r--------   1 pbg      staff       2048 Mar 18 20:48 pagedata
    -r--r--r--   1 pbg      staff        336 Mar 18 20:48 psinfo
    -r--------   1 pbg      staff       1728 Mar 18 20:48 rmap
    lr-x------   1 pbg      staff          0 Mar 18 20:48 root ->
    -r--------   1 pbg      staff       1440 Mar 18 20:48 sigact
    -r--------   1 pbg      staff       1232 Mar 18 20:48 status
    -r--r--r--   1 pbg      staff        256 Mar 18 20:48 usage
    -r--------   1 pbg      staff          0 Mar 18 20:48 watch
    -r--------   1 pbg      staff       2736 Mar 18 20:48 xmap

Notice the varying permissions on each component of the process's structure. Some components are read-only, some are write-only, and some are both readable and writable. The mode of access to a component is dictated by its function. For instance, the file "as" is the address space (virtual memory) of the process, and is readable and writable. On the other hand, ctl allows manipulation of the process's state, and is therefore only writable. Details about each component and its role are included in the man page (man -s 4 proc).

Of most interest is as, because its size indicates the relative memory use of the process. It is a relative measure because it includes the memory used by all the shared libraries. Therefore, it is not, for example, an accurate reflection of the memory that would be freed if the process were killed.
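The access pattern described above can be tabulated with a short filter. This is only a sketch: proc_access is a hypothetical name, and it assumes the nine-field ls -l layout from the listing above, reading the owner's read and write bits from the mode string.

```shell
#!/bin/sh
# Classify each /proc/<pid> component by the owner's access bits.
# Reads "ls -l" output on stdin; assumes the name is field 9, as in the
# listing above. proc_access is a hypothetical helper name.
proc_access() {
    awk 'NF >= 9 {
        bits = substr($1, 2, 2)          # owner read/write bits, e.g. "rw"
        if (bits == "rw")      mode = "read-write"
        else if (bits == "r-") mode = "read-only"
        else if (bits == "-w") mode = "write-only"
        else                   mode = "no-access"
        print $9, mode
    }'
}

# Demonstration on three lines copied from the listing above.
proc_access <<'EOF'
total 3543
-rw-------   1 pbg      staff    1794048 Mar 18 20:48 as
--w-------   1 pbg      staff          0 Mar 18 20:48 ctl
-r--r--r--   1 pbg      staff        336 Mar 18 20:48 psinfo
EOF
```

Run against a real process directory, the input would come from ls -l /proc/12599 instead of the embedded sample.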
Using the /proc tool set

Fortunately, we do not need to deal with the intricacies of /proc directly. Rather, there is a set of tools available to do the dirty work for us. With each new release of Solaris, the proc tool set expands. Under Solaris 2.6, the list includes these tools (in /usr/proc/bin):

    pcred   pflags  pmap    psig    pstop   ptree   pwdx
    pfiles  pldd    prun    pstack  ptime   pwait

Let's look at several of the tools.

pcred

pcred prints the effective, real, and saved UID and GID of a process:

    $ /usr/proc/bin/pcred 12599
    12599:  e/r/suid=500  e/r/sgid=10

pfiles

pfiles lists all open files (file descriptors represent open files in Unix) associated with the process, as well as any per-process limits on open files:

    $ /usr/proc/bin/pfiles 12955
    12955:  vi
      Current rlimit: 64 file descriptors
       0: S_IFCHR mode:0620 dev:136,0 ino:88226 uid:500 gid:7 rdev:24,4
          O_RDWR
       1: S_IFCHR mode:0620 dev:136,0 ino:88226 uid:500 gid:7 rdev:24,4
          O_RDWR
       2: S_IFCHR mode:0620 dev:136,0 ino:88226 uid:500 gid:7 rdev:24,4
          O_RDWR
       3: S_IFCHR mode:0666 dev:136,0 ino:88109 uid:0 gid:3 rdev:13,12
          O_RDWR
       4: S_IFREG mode:0600 dev:136,0 ino:456959 uid:500 gid:10 size:24576
          O_RDWR

Descriptors 0, 1, and 2 are part of the standard I/O package (stdin, stdout, and stderr), so those inodes represent entries in /dev/pty. (You might want to practice the following technique on these files to prove to yourself that it works.)

Determining the file that descriptor 4 points to requires a little detective work. We could just search the entire system for inode number 456959. Unfortunately, inode numbers are unique only within a partition, so first we need to determine which partition the inode in question is on.
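Before starting the search, the device and inode numbers for a descriptor can be pulled out of the pfiles output mechanically. The filter below is a rough sketch under stated assumptions: fd_dev_ino is a hypothetical name, and it assumes each descriptor's first output line begins with the descriptor number followed by a colon and carries the dev: and ino: fields, as in the pfiles output above.

```shell
#!/bin/sh
# Print the major/minor device numbers and inode for one file descriptor.
# Reads pfiles output on stdin; $1 is the descriptor number.
# fd_dev_ino is a hypothetical helper name, not a Solaris tool.
fd_dev_ino() {
    awk -v fd="$1" '
        $1 == (fd ":") {
            major = minor = ino = "?"
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^dev:/) {          # "rdev:" does not match here
                    split(substr($i, 5), parts, ",")
                    major = parts[1]; minor = parts[2]
                } else if ($i ~ /^ino:/) {
                    ino = substr($i, 5)
                }
            }
            print "major=" major, "minor=" minor, "inode=" ino
        }'
}

# Demonstration on the descriptor 4 entry from the output above.
fd_dev_ino 4 <<'EOF'
   3: S_IFCHR mode:0666 dev:136,0 ino:88109 uid:0 gid:3 rdev:13,12
      O_RDWR
   4: S_IFREG mode:0600 dev:136,0 ino:456959 uid:500 gid:10 size:24576
      O_RDWR
EOF
```

For the sample above this prints major=136 minor=0 inode=456959, which are the numbers used in the steps that follow.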
We start by searching through the /devices entries to find one with the matching major and minor device numbers (in the case of descriptor 4, the major number is 136 and the minor number is 0):

    $ ls -lR /devices | grep 136
    brw-------   1 root     sys      136,  0 Mar  2 11:10 dad@0,0:a
    crw-------   1 root     sys      136,  0 Mar  2 11:10 dad@0,0:a,raw
    brw-------   1 root     sys      136,  1 Mar  2 11:10 dad@0,0:b
    crw-------   1 root     sys      136,  1 Mar  2 11:10 dad@0,0:b,raw
    . . .

Then, we determine the logical device name by grepping for the physical device name in the /dev tree, as in:

    $ ls -lR /dev | grep dad@0,0
    lrwxrwxrwx   1 root     root          46 Mar  2 11:10 c0t0d0s0 -> ../../devices/pci@1f,0/pci@1,1/ide@3/dad@0,0:a
    lrwxrwxrwx   1 root     root          46 Mar  2 11:10 c0t0d0s1 -> ../../devices/pci@1f,0/pci@1,1/ide@3/dad@0,0:b
    lrwxrwxrwx   1 root     root          46 Mar  2 11:10 c0t0d0s2 -> ../../devices/pci@1f,0/pci@1,1/ide@3/dad@0,0:c
    lrwxrwxrwx   1 root     root          46 Mar  2 11:10 c0t0d0s3 -> ../../devices/pci@1f,0/pci@1,1/ide@3/dad@0,0:d
    . . .

We know that the appropriate device is c0t0d0s0, because it links to dad@0,0:a, the /devices entry with the major and minor device numbers we're looking for. On systems with many disks, this process becomes more complex: the same device name (dad@0,0, for instance) can appear under several different device paths (pci@1f,4000, for instance). In those cases, the greps must be more specific and include the device path as well as the device name.

So now we have the correct device, but how do we determine which file is open?
First, we locate the mount point for the device in question:

    $ df -k
    Filesystem            kbytes    used   avail capacity  Mounted on
    /proc                      0       0       0     0%    /proc
    /dev/dsk/c0t0d0s0    8162157 1639360 6441176    21%    /
    fd                         0       0       0     0%    /dev/fd
    swap                  546392     296  546096     1%    /tmp

Then we use find, which has an option to locate files with a specific inode number:

    $ find / -inum 456959 -mount -print
    /var/tmp/Ex0000002849

The "-mount" option prevents find from searching beyond the starting mount point (so it will not search other partitions).

We have now found the file opened by vi. Unfortunately, it is not the file being edited: when vi edits a file, it first copies it to /var/tmp with a temporary name, and that copy is what the descriptor refers to. When vi writes changes back, it writes them to the original file and deletes the temporary copy (thus a system crash in the middle of a vi session would allow recovery of the original file). Now, on to the other /proc commands.

pflags

pflags reports the status of the process:

    $ /usr/proc/bin/pflags 12599
    12599:  -ksh
      /1:   flags = PR_PCINVAL|PR_ORPHAN|PR_ASLEEP [ waitid(0x7,0x0,0xeffff930,0x7) ]

The meanings of the flags can be found in the appropriate ".h" file: /usr/include/sys/procfs.h.
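Looking up each flag by hand gets tedious, so the lookup can be scripted. The two helpers below are a minimal sketch with hypothetical names (flag_names, describe_flags). They assume the "flags = A|B|C" format shown above and that the header defines each flag on a "#define NAME ..." line; the header path is passed in rather than hard-coded, so nothing here depends on the header's actual contents.

```shell
#!/bin/sh
# Split the "flags = A|B|C" part of pflags output into one name per line.
# flag_names and describe_flags are hypothetical helper names.
flag_names() {
    sed -n 's/.*flags = \([A-Z_|0-9]*\).*/\1/p' | tr '|' '\n'
}

# For each flag name on stdin, show its #define line from header file $1.
describe_flags() {
    header=$1
    while read -r flag; do
        grep -n "define[[:space:]]\{1,\}$flag[[:space:]]" "$header" ||
            echo "$flag: not found in $header"
    done
}

# Demonstration of the first step on the pflags line shown above.
echo '  /1:   flags = PR_PCINVAL|PR_ORPHAN|PR_ASLEEP [ waitid(0x7,0x0,0xeffff930,0x7) ]' |
    flag_names
```

On a Solaris system the two steps would be chained, for example: /usr/proc/bin/pflags 12599 | flag_names | describe_flags /usr/include/sys/procfs.h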
pldd

pldd lists all the dynamic libraries that are associated with the process:

    $ /usr/proc/bin/pldd 12599
    12599:  -ksh
    /usr/lib/libsocket.so.1
    /usr/lib/libnsl.so.1
    /usr/lib/libc.so.1
    /usr/lib/libdl.so.1
    /usr/lib/libmp.so.2
    /usr/platform/sun4u/lib/libc_psr.so.1

pmap

pmap lists the process's address space, including sizes of memory segments and the access allowed to each:

    $ /usr/proc/bin/pmap 12599
    12599:  -ksh
    00010000    184K read/exec         /usr/bin/ksh
    0004C000      8K read/write/exec   /usr/bin/ksh
    0004E000     32K read/write/exec   [ heap ]
    EF580000    592K read/exec         /usr/lib/libc.so.1
    EF622000     32K read/write/exec   /usr/lib/libc.so.1
    EF62A000      8K read/write/exec   [ anon ]
    EF680000    448K read/exec         /usr/lib/libnsl.so.1
    EF6FE000     40K read/write/exec   /usr/lib/libnsl.so.1
    EF708000     24K read/write/exec   [ anon ]
    EF750000     16K read/exec         /usr/platform/sun4u/lib/libc_psr.so.1
    EF760000     16K read/exec         /usr/lib/libmp.so.2
    EF772000      8K read/write/exec   /usr/lib/libmp.so.2
    EF790000     32K read/exec         /usr/lib/libsocket.so.1
    EF7A6000      8K read/write/exec   /usr/lib/libsocket.so.1
    EF7A8000      8K read/write/exec   [ anon ]
    EF7B0000      8K read/exec         /usr/lib/libdl.so.1
    EF7C0000      8K read/write/exec   [ anon ]
    EF7D0000    112K read/exec         /usr/lib/ld.so.1
    EF7FA000      8K read/write/exec   /usr/lib/ld.so.1
    EFFFC000     16K read/write/exec   [ stack ]
     total     1608K

pstack

pstack shows the stack trace for each thread (lightweight process or LWP) in a process. This information can help determine where a process is hung, why it is using up too much memory, and so on:

    $ /usr/proc/bin/pstack 12599
    12599:  -ksh
     ef5b915c waitid   (7, 0, effff930, 7)
     ef5d40d0 _libc_waitpid (ffffffff, effffa30, 4, 7, ef622e54, 2422c) + 54
     0002422c job_wait (4e000, 0, 52818, 4, ef622e54, 2fa54) + 184
     0002fd04 sh_exec  (31b2, 0, 0, 4e400, 4e000, 0) + c1c
     00027894 ???????? (5174c, 4cf38, 4cf38, 4e400, 4e400, 4d294)
     00027174 main     (4e400, efffff6c, 4e400, 4e400, 4e400, 4e400) + 844
     00015e88 _start   (0, 0, 0, 0, 0, 0) + dc

ptree

ptree prints a formatted listing of a process's lineage, with child processes indented beneath their parent. It can show you the whole system's process tree, or just the lineage (ancestors and descendants) of a given process:

    $ /usr/proc/bin/ptree 12599
    285   /usr/sbin/inetd -s
      12597 in.telnetd
        12599 -ksh
          12773 /usr/proc/bin/ptree 12599

In this case the ptree command was started by a ksh shell, which was started by telnetd in response to an incoming telnet connection. The telnetd was started by the internet services daemon, inetd.

pwdx

pwdx prints the current working directory of the process:

    $ /usr/proc/bin/pwdx 12599
    12599:  /export/home/pbg

ptime

ptime times the execution of a process using "microstate accounting" for more precision (and more reproducible results) than the time command:

    $ /usr/proc/bin/ptime ls
    (output from ls)

    real        0.013
    user        0.004
    sys         0.007

As of Solaris 7, a few new and useful proc tools were added. They live in /usr/bin because they are needed by the boot and shutdown scripts.

plimit

plimit gets and sets the per-process limits:

    $ /usr/bin/plimit 974
    974:    -ksh
       resource              current         maximum
      time(seconds)         unlimited       unlimited
      file(blocks)          unlimited       unlimited
      data(kbytes)          unlimited       unlimited
      stack(kbytes)         8192            unlimited
      coredump(blocks)      unlimited       unlimited
      nofiles(descriptors)  64              1024
      vmemory(kbytes)       unlimited       unlimited

pgrep

pgrep searches for processes matching given criteria. No more ps | grep pipes!

    $ /usr/bin/pgrep tcsh
    361
    414
    416
    554

pkill

Finally, pkill sends a user-specified signal to one or more processes, based on criteria such as process name or process owner. Not only is pkill useful, but it is responsible in good part for the boot and shutdown performance improvements of Solaris 7.

    $ /usr/bin/pkill bad-process

Most of the /proc-related commands have a few options, and most accept a list of processes.
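Since pgrep prints one PID per line, its output combines naturally with the other proc tools. The wrapper below is only a sketch: ptool_by_name is a hypothetical name, and it assumes pgrep (Solaris 7 or later) is on the PATH. It runs the tool once per PID, which also works for any tool that does not accept a list.

```shell
#!/bin/sh
# Run one proc tool against every process whose name matches a pattern.
# $1 = tool to run (for example /usr/proc/bin/pwdx), $2 = pgrep pattern.
# ptool_by_name is a hypothetical helper; it assumes pgrep is on PATH.
ptool_by_name() {
    tool=$1
    pattern=$2
    for pid in $(pgrep "$pattern"); do
        "$tool" "$pid"
    done
}
```

A typical call would be ptool_by_name /usr/proc/bin/pwdx tcsh, which prints the working directory of each tcsh process. If nothing matches, the loop body never runs and nothing is printed.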
Check out the man pages for a few more details.

There is also a /proc gotcha to be aware of. From the manual:

    These proc tools stop their target processes while inspecting them and reporting the results: pfiles, pldd, pmap, pstack, pwdx. A process can do nothing while it is stopped. Thus, for example, if the X server is inspected by one of these proc tools running in a window under the X server's control, the whole window system can become deadlocked because the proc tool would be attempting to print its results to a window that cannot be refreshed. Logging in from another system using rlogin(1) and killing the offending proc tool would clear up the deadlock in this case.

The following is a real-life example of the utility of the proc tools. A site was having a problem with a daemon. The daemon's job was to accept connections from client machines and allow them to process insurance claims. During testing, the daemon would run for a while and then crash. This type of behavior often indicates that a resource limit is being hit, but which resource? Using the various proc tools during the daemon's execution, we noticed that the number of open files kept climbing, and that the process failed once the count went over 60 or so. The culprit was the file-descriptor limit. Removing the limit removed the problem as well. The proc tools made this problem very easy to solve.

Useful books

A quick note about a useful book. It's actually a repackaging of a series of O'Reilly and Associates books. The result is a combination book and CD. The book is Unix in a Nutshell System V Edition, and the CD includes that work plus Unix Power Tools, Second Edition; Learning the Unix Operating System, Fourth Edition; Sed & Awk, Second Edition; Learning the vi Editor, Fifth Edition; and Learning the Korn Shell. The CD contents are indexed and in HTML format, making them very convenient to use. The combination is useful for those who haven't bought all the books, or who like to travel light (consultants and the like).
There are other book-CD combinations coming that also look to be very useful. Highly recommended. Details are available in Resources.

Letters

A couple of notes about the previous columns on patch theory and practice:

    Nice article, but Peter really does need to be reminded that other countries other than the USA exist. 1-800-USA-4-SUN? What??

Sorry about that! Any 800-numbers that I mention are, of course, for the US only.

    I found a patch that overwrote /dev/null. Sun's response was that all patches (regardless of install notes) should be installed in single-user mode. Not always an easy thing to do, but after getting burnt, this is now mandatory for even the simplest patch.

That makes a nice juxtaposition to another letter:

    I don't agree with "If no crash, no patch." If a patch exists it is there to solve a problem. My suggestion is install all recommended and security patches and make that as a routine.

Both of these letters reinforce a point made in the previous columns: patch policy should and will vary according to site needs. Some sites can afford no downtime at all, and therefore patch only when absolutely necessary. Others have planned maintenance windows, which afford a perfect opportunity to install the recommended, suggested, Y2K, and security patches on a schedule. Another important suggestion: always make a backup before making major system changes, make the changes during scheduled downtime (in single-user mode if possible), and reboot when done to verify that all changes take effect and that they have not caused problems with system functions.

Next month

Pete's Super Systems will cover the changes to system administration in Solaris 7.
------------------------------------------------------------------------
Resources

* Plan 9 home page
  http://plan9.bell-labs.com/plan9/index.html
* O'Reilly and Associates's catalog
  http://www.oreilly.com/catalog/unixcd/
* Full listing of previous Pete's Super Systems columns
  http://www.sunworld.com/sunworldonline/common/swol-backissues-columns.html#supersys
* Hal Stern's past SysAdmin columns
  http://www.sunworld.com/sunworldonline/common/swol-backissues-columns.html#sysadmin

Additional SunWorld resources

* Peter's Solaris Security FAQ (recently updated!)
  http://www.sunworld.com/sunworldonline/common/security-faq.html
* Peter's Unix Secure Programming FAQ
  http://www.sunworld.com/swol-08-1998/swol-08-security.html
* Check out SunWorld's Site Index -- a topical listing of our most popular stories
  http://www.sunworld.com/common/swol-siteindex.html
* Visit sunWHERE -- launchpad to hundreds of online resources for Sun users
  http://www.sunworld.com/sunwhere.html
* Explore SunWorld's back issues
  http://www.sunworld.com/common/swol-backissues.html
* IDG.net, your one-stop IT resource
  http://www.idg.net

------------------------------------------------------------------------
About the author

Peter Baer Galvin is the chief technologist for Corporate Technologies, a systems integrator and VAR. Before that, Peter was the systems manager for Brown University's Computer Science Department. He has written articles for Byte and other magazines and previously wrote Pete's Wicked World, the security column for SunWorld. Peter is co-author of the Operating System Concepts textbook. As a consultant and trainer, Peter has taught tutorials on security and system administration and given talks at many conferences and institutions.
(c) Copyright Web Publishing Inc., an IDG Communications company
URL: http://www.sunworld.com/swol-04-1999/swol-04-supersys.html
Last modified: Tuesday, April 13, 1999