Thomas Hood wrote:

>On Wed, 2004-07-07 at 12:09, Helge Hafting wrote:
>
>>My misunderstanding then.  I tried this, and put
>>"echo" statement between the others.  I found that
>>the initial "ifdown lo" hangs, so the rest does not happen.
>
>Please run ifdown with "-v" and send the output.
>--
>Thomas

Timing for a bad strace ifdown lo:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 67.44    0.017983        8992         2           waitpid
  5.00    0.001333           9       147        21 open
  4.29    0.001143         229         5           execve
  3.66    0.000976           8       118           read
  3.53    0.000942           5       174           old_mmap
  2.56    0.000683           3       208           brk
  1.80    0.000481           4       135           close
  1.64    0.000437           6        75        72 access
  1.53    0.000409           3       120           fstat64
  1.49    0.000397           7        54           munmap
  1.28    0.000341           7        51        33 stat64
  1.08    0.000288          26        11        11 connect
  0.90    0.000240          17        14           socket
  0.65    0.000173          58         3           clone
  0.62    0.000166           5        36           mmap2
  0.42    0.000111           3        33           rt_sigaction
  0.38    0.000100           4        28           fcntl64
  0.32    0.000084           4        20           rt_sigprocmask
  0.22    0.000058           4        13           uname
  0.17    0.000046           6         8           getdents64
  0.14    0.000036           4         9           getpid
  0.10    0.000026           5         5           time
  0.10    0.000026           5         5           _llseek
  0.09    0.000025          13         2           ioctl
  0.08    0.000021           4         6           set_thread_area
  0.08    0.000021          11         2           shutdown
  0.06    0.000017           3         6           geteuid32
  0.06    0.000017           6         3           setsockopt
  0.06    0.000015           5         3           gettimeofday
  0.06    0.000015           5         3           getcwd
  0.04    0.000010           3         3           getrlimit
  0.03    0.000009           3         3           getuid32
  0.03    0.000008           3         3           getppid
  0.03    0.000008           3         3           getgid32
  0.03    0.000008           3         3           getegid32
  0.02    0.000006           3         2           getpgrp
  0.02    0.000004           2         2           select
  0.01    0.000003           3         1           umask
------ ----------- ----------- --------- --------- ----------------
100.00    0.026666                  1319       137 total

Lots of time spent in waitpid?

Also attached is a strace -T -f (not the same run) of a slow-running
"ifdown lo".  It eventually completed, but running "ifdown lo" in an
xterm (not runlevel 1) is much faster; it completes in less than a
second.  Taking the machine down to runlevel 1 makes ifdown slow: it
takes half a minute to complete, or even runs for hours without
completing.

ifdown ran process 2299, which spent 29.99s in a select() that timed
out.  Which is why the entire process 2299 took 30s.
Process 2299 is run-parts /etc/network/if-post-d.
Here is the part with the timeout:

[pid 2299] socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 5 <0.000028>
[pid 2299] setsockopt(5, SOL_TCP, TCP_NODELAY, [1], 4) = 0 <0.000013>
[pid 2299] fcntl64(5, F_GETFL) = 0x2 (flags O_RDWR) <0.000011>
[pid 2299] fcntl64(5, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000010>
[pid 2299] connect(5, {sa_family=AF_INET, sin_port=htons(389), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) <0.000085>
[pid 2299] select(1024, NULL, [5], NULL, {30, 0}) = 0 (Timeout) <29.994926>
[pid 2299] shutdown(5, 2 /* send and receive */) = 0 <0.000016>
[pid 2299] close(5) = 0 <0.000020>

I see.  It tries to connect to port 389 at address 127.0.0.1.
Of course it times out, because "lo" is down at this time.
Port 389 is the LDAP server, which I use for experimental
user authentication.  LDAP shuts down before the network goes
down, though.

Now I wonder: does "run-parts" use PAM in any way, even
when the directory turns out to be empty?  What for?

Should I file a bug against run-parts instead?  Or PAM?
If these are "correct", then run-parts cannot be used
after "lo" goes down.  Or when LDAP isn't up.

Helge Hafting
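
For anyone who wants to reproduce the connect pattern from the trace outside of run-parts, here is a minimal Python sketch of the same sequence: a non-blocking connect() that returns EINPROGRESS, followed by a select() on the socket for writability with a timeout. This is only an illustration of the syscall pattern, not the code that run-parts or the LDAP client library actually runs. The function name connect_with_timeout and its return strings are made up for the example, and it assumes Linux semantics for connect_ex().

```python
import errno
import select
import socket


def connect_with_timeout(host, port, timeout):
    """Non-blocking TCP connect that waits at most `timeout` seconds.

    Hypothetical helper for illustration only.
    Returns "connected", "timeout", or "error: <errno name>".
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Same socket option the trace shows being set.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        # Equivalent of fcntl(fd, F_SETFL, O_RDWR|O_NONBLOCK).
        sock.setblocking(False)

        rc = sock.connect_ex((host, port))
        if rc == 0:
            return "connected"
        if rc != errno.EINPROGRESS:
            return "error: " + errno.errorcode.get(rc, str(rc))

        # Wait for the socket to become writable, or give up.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            return "timeout"

        # Writable does not mean success; check the pending error.
        err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        if err == 0:
            return "connected"
        return "error: " + errno.errorcode.get(err, str(err))
    finally:
        sock.close()
```

Calling connect_with_timeout("127.0.0.1", 389, 30) mirrors the arguments seen in the trace. If the SYN gets no answer at all, as appears to happen once "lo" is down, the call sits in select() for the full timeout, which would account for the 30 seconds.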