#607856 dnsmasq init script returns before dnsmasq fully starts

#607856#5
Date:
2010-06-17 14:00:10 UTC
From:
To:
I keep my Squeeze box up to date each day.  In the last week or so I noticed
openafs-client failing to start randomly at boot time.  My openafs-client start
script checks for connection to the afs server before starting.  I also have
dnsmasq running.  I notice that when openafs-client fails to start that dnsmasq
is started after openafs-client attempts to start.  When openafs-client
successfully starts, dnsmasq has been started before openafs-client tries to
start.  I don't know for sure if this is what causes openafs-client to fail or
not.  It is just what I noticed.

Here is a clip of the boot log when openafs-client starts successfully:

Thu Jun 17 09:36:37 2010: Setting parameters of disc: (none).
Thu Jun 17 09:36:37 2010: Setting preliminary keymap...done.
Thu Jun 17 09:36:37 2010: Activating swap...done.
Thu Jun 17 09:36:37 2010: Checking root file system...fsck from util-linux-ng
2.16.2
Thu Jun 17 09:36:37 2010: e2fsck 1.41.12 (17-May-2010)
Thu Jun 17 09:36:37 2010: /dev/sda2: clean, 18346/48288 files, 118420/192780
blocks
Thu Jun 17 09:36:37 2010: done.
Thu Jun 17 09:36:37 2010: Cleaning up ifupdown....
Thu Jun 17 09:36:37 2010: Loading kernel modules...done.
Thu Jun 17 09:36:37 2010: Activating lvm and md swap...done.
Thu Jun 17 09:36:37 2010: Checking file systems...fsck from util-linux-ng
2.16.2
Thu Jun 17 09:36:37 2010: e2fsck 1.41.12 (17-May-2010)
Thu Jun 17 09:36:37 2010: /dev/sda10: clean, 109603/4136960 files,
2787412/16540917 blocks
Thu Jun 17 09:36:37 2010: e2fsck 1.41.12 (17-May-2010)
Thu Jun 17 09:36:37 2010: /dev/sda8: clean, 43949/640848 files, 615738/2560351
blocks
Thu Jun 17 09:36:37 2010: e2fsck 1.41.12 (17-May-2010)
Thu Jun 17 09:36:37 2010: /dev/sda3: clean, 41/159680 files, 27366/638583
blocks (check after next mount)
Thu Jun 17 09:36:37 2010: e2fsck 1.41.12 (17-May-2010)
Thu Jun 17 09:36:37 2010: /dev/sda9: clean, 365597/2689904 files,
2503071/10753501 blocks
Thu Jun 17 09:36:37 2010: e2fsck 1.41.12 (17-May-2010)
Thu Jun 17 09:36:37 2010: /dev/sda5: clean, 22221/640848 files, 687109/2562359
blocks
Thu Jun 17 09:36:37 2010: e2fsck 1.41.12 (17-May-2010)
Thu Jun 17 09:36:37 2010: /dev/sda6: clean, 265/128520 files, 78986/514048
blocks
Thu Jun 17 09:36:37 2010: done.
Thu Jun 17 09:36:37 2010: Mounting local filesystems...done.
Thu Jun 17 09:36:37 2010: Activating swapfile swap...done.
Thu Jun 17 09:36:38 2010: Cleaning up temporary files....
Thu Jun 17 09:36:38 2010: Setting kernel variables ...error:
"net.ipv6.bindv6only" is an unknown key
Thu Jun 17 09:36:38 2010: ^[[31mfailed.^[[39;49m
Thu Jun 17 09:36:38 2010: Setting up resolvconf...done.
Thu Jun 17 09:36:39 2010: Setting up networking....
Thu Jun 17 09:36:39 2010: Configuring network interfaces...Stopping the
Firestarter firewall....
Thu Jun 17 09:36:40 2010: Starting the Firestarter firewall... failed!
Thu Jun 17 09:36:40 2010: invoke-rc.d: initscript firestarter, action "restart"
failed.
Thu Jun 17 09:36:40 2010: run-parts: /etc/network/if-up.d/50firestarter exited
with return code 2
Thu Jun 17 09:36:40 2010: Internet Systems Consortium DHCP Client V3.1.3
Thu Jun 17 09:36:40 2010: Copyright 2004-2009 Internet Systems Consortium.
Thu Jun 17 09:36:40 2010: All rights reserved.
Thu Jun 17 09:36:40 2010: For info, please visit
https://www.isc.org/software/dhcp/
Thu Jun 17 09:36:40 2010:
Thu Jun 17 09:36:40 2010: Listening on LPF/eth0/00:22:68:15:72:5e
Thu Jun 17 09:36:40 2010: Sending on   LPF/eth0/00:22:68:15:72:5e
Thu Jun 17 09:36:40 2010: Sending on   Socket/fallback
Thu Jun 17 09:36:41 2010: DHCPDISCOVER on eth0 to 255.255.255.255 port 67
interval 6
Thu Jun 17 09:36:47 2010: DHCPDISCOVER on eth0 to 255.255.255.255 port 67
interval 14
Thu Jun 17 09:37:01 2010: DHCPDISCOVER on eth0 to 255.255.255.255 port 67
interval 17
Thu Jun 17 09:37:01 2010: DHCPOFFER from 9.61.249.4
Thu Jun 17 09:37:01 2010: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Thu Jun 17 09:37:01 2010: DHCPACK from 9.61.249.3
Thu Jun 17 09:37:05 2010: bound to 9.61.249.68 -- renewal in 18275 seconds.
Thu Jun 17 09:37:05 2010: Stopping the Firestarter firewall....
Thu Jun 17 09:37:05 2010: Starting the Firestarter firewall....
Thu Jun 17 09:37:07 2010: done.
Thu Jun 17 09:37:07 2010: Starting portmap daemon....
Thu Jun 17 09:37:07 2010: Starting NFS common utilities: statd.
Thu Jun 17 09:37:07 2010: Cleaning up temporary files....
Thu Jun 17 09:37:07 2010: Setting up ALSA...done.
Thu Jun 17 09:37:07 2010: Setting console screen modes and fonts.
Thu Jun 17 09:37:07 2010: ^[]R^[[9;30]^[[14;30]Setting up console font and
keymap...done.
Thu Jun 17 09:37:09 2010: Setting sensors limits.
Thu Jun 17 09:37:09 2010: Running scripts in rcS.d/ took 14436 seconds.
Thu Jun 17 09:37:09 2010: INIT: Entering runlevel: 2
Thu Jun 17 09:37:09 2010: Using makefile-style concurrent boot in runlevel 2.
Thu Jun 17 09:37:09 2010: Starting portmap daemon...Already running..
Thu Jun 17 09:37:09 2010: udevd-work[1034]: kernel-provided name 'uinput' and
NAME= 'input/uinput' disagree, please use SYMLINK+= or change the kernel to
provide the proper name
Thu Jun 17 09:37:09 2010:
Thu Jun 17 09:37:09 2010: Starting NFS common utilities: statd.
Thu Jun 17 09:37:09 2010: Loading kvm module kvm_intel.
Thu Jun 17 09:37:09 2010: Starting acpi_fakekey daemon...done.
Thu Jun 17 09:37:09 2010: Starting enhanced syslogd: rsyslogd.
Thu Jun 17 09:37:09 2010: Thu Jun 17 09:37:09 2010: Starting hdapsd
Thu Jun 17 09:37:09 2010: Thu Jun 17 09:37:09 2010: Selected interface: HDAPS
Thu Jun 17 09:37:09 2010: Thu Jun 17 09:37:09 2010: Selected HDAPS input
device: /dev/input/event17
Thu Jun 17 09:37:10 2010: Starting system message bus: dbus.
Thu Jun 17 09:37:10 2010: Enabling additional executable binary formats:
binfmt-support.
Thu Jun 17 09:37:10 2010: Starting ACPI services....
Thu Jun 17 09:37:10 2010: Checking battery state...done.
Thu Jun 17 09:37:10 2010: Starting anac(h)ronistic cron: anacron.
Thu Jun 17 09:37:11 2010: Starting AGNS Log Daemon:
Thu Jun 17 09:37:11 2010: Starting network connection manager: NetworkManager.
Thu Jun 17 09:37:11 2010: Starting AGNS NetCilient Daemon:
Thu Jun 17 09:37:11 2010: Starting web server: apache2apache2: Could not
reliably determine the server's fully qualified domain name, using 127.0.0.1
for ServerName
Thu Jun 17 09:37:12 2010: .
Thu Jun 17 09:37:12 2010: Starting Hardware abstraction layer: hald.
Thu Jun 17 09:37:14 2010: Starting virtual private network daemon:.
Thu Jun 17 09:37:14 2010: Starting GNOME Display Manager: gdm.
Thu Jun 17 09:37:15 2010: Starting deferred execution scheduler: atd.
Thu Jun 17 09:37:15 2010: Starting atop system monitor: atop.
Thu Jun 17 09:37:15 2010: Starting automounter: loading autofs4 kernel module,
no automount maps defined.
Thu Jun 17 09:37:15 2010: Starting Avahi mDNS/DNS-SD Daemon: avahi-daemon.
Thu Jun 17 09:37:15 2010: Starting bluetooth: bluetoothd.
Thu Jun 17 09:37:16 2010: /etc/environment has been depreciated for locale
information; use /etc/default/locale for LANG=en_US instead ...
^[[33m(warning).^[[39;49m
Thu Jun 17 09:37:16 2010: /etc/environment has been depreciated for locale
information; use /etc/default/locale for LANGUAGE="en_US:en_GB:en" instead ...
^[[33m(warning).^[[39;49m
Thu Jun 17 09:37:16 2010: Starting periodic command scheduler: cron.
Thu Jun 17 09:37:16 2010: Starting Common Unix Printing System: cupsd.
Thu Jun 17 09:37:17 2010: Starting DNS forwarder and DHCP server: dnsmasq.
Thu Jun 17 09:37:17 2010: saned disabled; edit /etc/default/saned
Thu Jun 17 09:37:17 2010: SSL tunnels disabled, see /etc/default/stunnel4
Thu Jun 17 09:37:17 2010: Starting the Firestarter firewall....
Thu Jun 17 09:37:17 2010: Starting sensor daemon: sensord.
Thu Jun 17 09:37:18 2010: Starting NTP server: ntpd.
Thu Jun 17 09:37:18 2010: Loading cpufreq kernel modules...done (acpi-cpufreq).
Thu Jun 17 09:37:18 2010: Starting OpenBSD Secure Shell server: sshd.
Thu Jun 17 09:37:18 2010: CPUFreq Utilities: Setting ondemand CPUFreq
governor...CPU0...CPU1...done.
Thu Jun 17 09:37:19 2010: Starting tsm-client: scheduler.
Thu Jun 17 09:37:19 2010:
Thu Jun 17 09:37:19 2010: 1st attempt to contact IBM Intranet
Thu Jun 17 09:37:19 2010: Checking for IBM Intranet Activity - SUCCESS
Thu Jun 17 09:37:19 2010:
Thu Jun 17 09:37:19 2010: Checking 9.56.253.117 is reachable...
Thu Jun 17 09:37:19 2010: 9.56.253.117 is reachable, starting afs
Thu Jun 17 09:37:19 2010: Starting AFS services: openafs afsd.
Thu Jun 17 09:37:19 2010: afsd: All AFS daemons started.
Thu Jun 17 09:37:19 2010: Starting MTA:Starting up Cisco VPN daemon
Thu Jun 17 09:37:20 2010: Starting kerneloops:
Thu Jun 17 09:37:21 2010: NX> 100 NXSERVER - Version 3.2.0-74-SVN OS (GPL,
using backend: 3.3.0)
Thu Jun 17 09:37:21 2010: NX> 500 Error: No running sessions found.
Thu Jun 17 09:37:21 2010: NX> 999 Bye
Thu Jun 17 09:37:21 2010: NX> 100 NXSERVER - Version 3.2.0-74-SVN OS (GPL,
using backend: 3.3.0)
Thu Jun 17 09:37:21 2010: NX> 122 Service started
Thu Jun 17 09:37:21 2010: NX> 999 Bye
Thu Jun 17 09:37:25 2010:  exim4.


Here is a clip of the boot when openafs-client failed to start:

Thu Jun 17 08:02:43 2010: Setting parameters of disc: (none).
Thu Jun 17 08:02:43 2010: Setting preliminary keymap...done.
Thu Jun 17 08:02:43 2010: Activating swap...done.
Thu Jun 17 08:02:43 2010: Checking root file system...fsck from util-linux-ng
2.16.2
Thu Jun 17 08:02:43 2010: e2fsck 1.41.12 (17-May-2010)
Thu Jun 17 08:02:43 2010: /dev/sda2: clean, 18344/48288 files, 118417/192780
blocks
Thu Jun 17 08:02:43 2010: done.
Thu Jun 17 08:02:43 2010: Cleaning up ifupdown....
Thu Jun 17 08:02:43 2010: Loading kernel modules...done.
Thu Jun 17 08:02:43 2010: Activating lvm and md swap...done.
Thu Jun 17 08:02:43 2010: Checking file systems...fsck from util-linux-ng
2.16.2
Thu Jun 17 08:02:43 2010: e2fsck 1.41.12 (17-May-2010)
Thu Jun 17 08:02:43 2010: /dev/sda10: clean, 109410/4136960 files,
2786729/16540917 blocks
Thu Jun 17 08:02:43 2010: e2fsck 1.41.12 (17-May-2010)
Thu Jun 17 08:02:43 2010: /dev/sda8: clean, 43949/640848 files, 615738/2560351
blocks
Thu Jun 17 08:02:43 2010: e2fsck 1.41.12 (17-May-2010)
Thu Jun 17 08:02:43 2010: /dev/sda3: clean, 42/159680 files, 27366/638583
blocks (check in 2 mounts)
Thu Jun 17 08:02:43 2010: e2fsck 1.41.12 (17-May-2010)
Thu Jun 17 08:02:43 2010: /dev/sda9: clean, 365495/2689904 files,
2501854/10753501 blocks
Thu Jun 17 08:02:43 2010: e2fsck 1.41.12 (17-May-2010)
Thu Jun 17 08:02:43 2010: /dev/sda5: clean, 22161/640848 files, 677545/2562359
blocks
Thu Jun 17 08:02:43 2010: e2fsck 1.41.12 (17-May-2010)
Thu Jun 17 08:02:43 2010: /dev/sda6: clean, 265/128520 files, 79471/514048
blocks
Thu Jun 17 08:02:43 2010: done.
Thu Jun 17 08:02:43 2010: Mounting local filesystems...done.
Thu Jun 17 08:02:44 2010: Activating swapfile swap...done.
Thu Jun 17 08:02:44 2010: Cleaning up temporary files....
Thu Jun 17 08:02:44 2010: Setting kernel variables ...error:
"net.ipv6.bindv6only" is an unknown key
Thu Jun 17 08:02:44 2010: ^[[31mfailed.^[[39;49m
Thu Jun 17 08:02:44 2010: Setting up resolvconf...done.
Thu Jun 17 08:02:45 2010: Setting up networking....
Thu Jun 17 08:02:45 2010: Configuring network interfaces...Stopping the
Firestarter firewall....
Thu Jun 17 08:02:46 2010: Starting the Firestarter firewall... failed!
Thu Jun 17 08:02:46 2010: invoke-rc.d: initscript firestarter, action "restart"
failed.
Thu Jun 17 08:02:46 2010: run-parts: /etc/network/if-up.d/50firestarter exited
with return code 2
Thu Jun 17 08:02:46 2010: Internet Systems Consortium DHCP Client V3.1.3
Thu Jun 17 08:02:46 2010: Copyright 2004-2009 Internet Systems Consortium.
Thu Jun 17 08:02:46 2010: All rights reserved.
Thu Jun 17 08:02:46 2010: For info, please visit
https://www.isc.org/software/dhcp/
Thu Jun 17 08:02:46 2010:
Thu Jun 17 08:02:46 2010: Listening on LPF/eth0/00:22:68:15:72:5e
Thu Jun 17 08:02:46 2010: Sending on   LPF/eth0/00:22:68:15:72:5e
Thu Jun 17 08:02:46 2010: Sending on   Socket/fallback
Thu Jun 17 08:02:48 2010: DHCPDISCOVER on eth0 to 255.255.255.255 port 67
interval 4
Thu Jun 17 08:02:52 2010: DHCPDISCOVER on eth0 to 255.255.255.255 port 67
interval 5
Thu Jun 17 08:02:57 2010: DHCPDISCOVER on eth0 to 255.255.255.255 port 67
interval 9
Thu Jun 17 08:02:57 2010: DHCPOFFER from 9.61.249.3
Thu Jun 17 08:02:57 2010: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Thu Jun 17 08:03:03 2010: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Thu Jun 17 08:03:03 2010: DHCPACK from 9.61.249.3
Thu Jun 17 08:03:05 2010: bound to 9.61.249.68 -- renewal in 18329 seconds.
Thu Jun 17 08:03:05 2010: Stopping the Firestarter firewall....
Thu Jun 17 08:03:05 2010: Starting the Firestarter firewall....
Thu Jun 17 08:03:05 2010: done.
Thu Jun 17 08:03:05 2010: Starting portmap daemon....
Thu Jun 17 08:03:05 2010: Starting NFS common utilities: statd.
Thu Jun 17 08:03:05 2010: Cleaning up temporary files....
Thu Jun 17 08:03:05 2010: Setting up ALSA...done.
Thu Jun 17 08:03:06 2010: Setting console screen modes and fonts.
Thu Jun 17 08:03:06 2010: ^[]R^[[9;30]^[[14;30]Setting up console font and
keymap...done.
Thu Jun 17 08:03:07 2010: Setting sensors limits.
Thu Jun 17 08:03:07 2010: Running scripts in rcS.d/ took 14428 seconds.
Thu Jun 17 08:03:07 2010: INIT: Entering runlevel: 2
Thu Jun 17 08:03:07 2010: Using makefile-style concurrent boot in runlevel 2.
Thu Jun 17 08:03:07 2010: udevd-work[1008]: kernel-provided name 'uinput' and
NAME= 'input/uinput' disagree, please use SYMLINK+= or change the kernel to
provide the proper name
Thu Jun 17 08:03:07 2010:
Thu Jun 17 08:03:07 2010: Starting portmap daemon...Already running..
Thu Jun 17 08:03:07 2010: Starting NFS common utilities: statd.
Thu Jun 17 08:03:07 2010: Starting acpi_fakekey daemon...done.
Thu Jun 17 08:03:07 2010: Loading kvm module kvm_intel.
Thu Jun 17 08:03:08 2010: Starting enhanced syslogd: rsyslogd.
Thu Jun 17 08:03:08 2010: Thu Jun 17 08:03:07 2010: Starting hdapsd
Thu Jun 17 08:03:08 2010: Thu Jun 17 08:03:08 2010: Selected interface: HDAPS
Thu Jun 17 08:03:08 2010: Thu Jun 17 08:03:08 2010: Selected HDAPS input
device: /dev/input/event17
Thu Jun 17 08:03:08 2010: Enabling additional executable binary formats:
binfmt-support.
Thu Jun 17 08:03:08 2010: Starting system message bus: dbus.
Thu Jun 17 08:03:08 2010: Starting ACPI services....
Thu Jun 17 08:03:09 2010: Starting network connection manager: NetworkManager.
Thu Jun 17 08:03:09 2010: Starting Hardware abstraction layer: hald.
Thu Jun 17 08:03:10 2010: Starting virtual private network daemon:.
Thu Jun 17 08:03:10 2010: Starting GNOME Display Manager: gdm.
Thu Jun 17 08:03:11 2010: Checking battery state...done.
Thu Jun 17 08:03:11 2010: Starting AGNS Log Daemon:
Thu Jun 17 08:03:12 2010: Starting AGNS NetCilient Daemon:
Thu Jun 17 08:03:13 2010: Starting anac(h)ronistic cron: anacron.
Thu Jun 17 08:03:13 2010: Starting web server: apache2apache2: Could not
reliably determine the server's fully qualified domain name, using 127.0.0.1
for ServerName
Thu Jun 17 08:03:13 2010: .
Thu Jun 17 08:03:14 2010: Starting deferred execution scheduler: atd.
Thu Jun 17 08:03:14 2010: Starting atop system monitor: atop.
Thu Jun 17 08:03:14 2010: Starting automounter: loading autofs4 kernel module,
no automount maps defined.
Thu Jun 17 08:03:14 2010: Starting Avahi mDNS/DNS-SD Daemon: avahi-daemon.
Thu Jun 17 08:03:14 2010: Starting bluetooth: bluetoothd.
Thu Jun 17 08:03:15 2010: /etc/environment has been depreciated for locale
information; use /etc/default/locale for LANG=en_US instead ...
^[[33m(warning).^[[39;49m
Thu Jun 17 08:03:15 2010: /etc/environment has been depreciated for locale
information; use /etc/default/locale for LANGUAGE="en_US:en_GB:en" instead ...
^[[33m(warning).^[[39;49m
Thu Jun 17 08:03:15 2010: Starting periodic command scheduler: cron.
Thu Jun 17 08:03:15 2010: Starting Common Unix Printing System: cupsdStarting
the Firestarter firewall....
Thu Jun 17 08:03:16 2010: Starting NTP server: ntpd.
Thu Jun 17 08:03:16 2010:
Thu Jun 17 08:03:16 2010: 1st attempt to contact IBM Intranet
Thu Jun 17 08:03:16 2010: Checking for IBM Intranet Activity - FAILED.
Thu Jun 17 08:03:16 2010:
Thu Jun 17 08:03:16 2010: 2nd attempt to contact IBM Intranet
Thu Jun 17 08:03:16 2010: Checking for IBM Intranet Activity - FAILED.
Thu Jun 17 08:03:16 2010:
Thu Jun 17 08:03:16 2010: 3rd attempt to contact IBM Intranet
Thu Jun 17 08:03:16 2010: Checking for IBM Intranet Activity - FAILED.
Thu Jun 17 08:03:16 2010: quitting
Thu Jun 17 08:03:16 2010: AFS will not start
Thu Jun 17 08:03:16 2010: Loading cpufreq kernel modules...done (acpi-cpufreq).
Thu Jun 17 08:03:16 2010: .
Thu Jun 17 08:03:16 2010: CPUFreq Utilities: Setting ondemand CPUFreq
governor...CPU0...CPU1...done.
Thu Jun 17 08:03:16 2010: saned disabled; edit /etc/default/saned
Thu Jun 17 08:03:16 2010: Starting sensor daemon: sensord.
Thu Jun 17 08:03:17 2010: Starting DNS forwarder and DHCP server: dnsmasq.
Thu Jun 17 08:03:17 2010: Starting OpenBSD Secure Shell server: sshd.
Thu Jun 17 08:03:17 2010: Starting MTA:SSL tunnels disabled, see
/etc/default/stunnel4
Thu Jun 17 08:03:18 2010: Starting kerneloops:
Thu Jun 17 08:03:19 2010:  exim4.
Thu Jun 17 08:03:19 2010: Starting up Cisco VPN daemon
Thu Jun 17 08:03:19 2010: Starting tsm-client: scheduler.
Thu Jun 17 08:03:20 2010: NX> 100 NXSERVER - Version 3.2.0-74-SVN OS (GPL,
using backend: 3.3.0)
Thu Jun 17 08:03:20 2010: NX> 500 Error: No running sessions found.
Thu Jun 17 08:03:20 2010: NX> 999 Bye
Thu Jun 17 08:03:20 2010: NX> 100 NXSERVER - Version 3.2.0-74-SVN OS (GPL,
using backend: 3.3.0)
Thu Jun 17 08:03:20 2010: NX> 122 Service started
Thu Jun 17 08:03:20 2010: NX> 999 Bye

I'm not sure how all of the startup scripts work exactly but here is my
/etc/rc2.d directory:

#607856#10
Date:
2010-06-17 17:34:52 UTC
From:
To:
"Brent S. Elmer" <webe3vt@aim.com> writes:

The second time when OpenAFS didn't start, there's no output from the
OpenAFS init script at all, which to me points to this not being a problem
with the OpenAFS package itself and instead is a problem with something
else on your system that's causing your init scripts to not always be run.
The only time the init script would run and not produce at least some
output is if your /etc/openafs/afs.conf.client file says to not start the
client (which it doesn't appear to) or if /sbin/afsd doesn't exist in your
system.

Shouldn't make any difference which order they're started in.

When OpenAFS doesn't start at boot, does running
/etc/init.d/openafs-client start as root after the system starts work?

Something is very odd with the output of reportbug here.

#607856#15
Date:
2010-06-17 18:39:22 UTC
From:
To:
I looked into the openafs-client init script.  This is on my work
laptop.  The init script is modified.  Our AFS servers are on the
intranet and therefore openafs with my cells will only work if the
laptop is on the intranet.  Since trying to connect to afs servers on
the intranet when not on the intranet can take a very long time before
it times out, the openafs-client init script has been modified to first
check to see if on the intranet and if the cells are reachable and only
start the client if so.  A binary called quickresolv, which I can't
look into, checks whether the laptop is connected to the intranet.  It
probably does a dig, nslookup, or ping against a known intranet host (I
think it is w3.ibm.com).  If the check fails, openafs is not started;
otherwise it is.  It is important that dnsmasq starts first because
w3.ibm.com will not resolve otherwise.
That is why in the log where dnsmasq doesn't start first you see this:

Thu Jun 17 08:03:16 2010: 1st attempt to contact IBM Intranet
Thu Jun 17 08:03:16 2010: Checking for IBM Intranet Activity - FAILED.
Thu Jun 17 08:03:16 2010:
Thu Jun 17 08:03:16 2010: 2nd attempt to contact IBM Intranet
Thu Jun 17 08:03:16 2010: Checking for IBM Intranet Activity - FAILED.
Thu Jun 17 08:03:16 2010:
Thu Jun 17 08:03:16 2010: 3rd attempt to contact IBM Intranet
Thu Jun 17 08:03:16 2010: Checking for IBM Intranet Activity - FAILED.
Thu Jun 17 08:03:16 2010: quitting
Thu Jun 17 08:03:16 2010: AFS will not start

When dnsmasq is started first you see this:

Thu Jun 17 09:37:19 2010: 1st attempt to contact IBM Intranet
Thu Jun 17 09:37:19 2010: Checking for IBM Intranet Activity - SUCCESS
Thu Jun 17 09:37:19 2010:
Thu Jun 17 09:37:19 2010: Checking 9.56.253.117 is reachable...
Thu Jun 17 09:37:19 2010: 9.56.253.117 is reachable, starting afs
Thu Jun 17 09:37:19 2010: Starting AFS services: openafs afsd.
Thu Jun 17 09:37:19 2010: afsd: All AFS daemons started.

So, it doesn't look like an openafs problem since the init script has
been modified.  It would be nice though if I could figure out how to
make dnsmasq start first.  If I could figure it out, would I have to
change it every time openafs-client or dnsmasq is updated?

To answer your question, in the case when openafs-client doesn't start
during boot, it does start manually because by then dnsmasq is started.

I don't know about the reportbug oddness.  I build my own kernels and
modules if that makes reportbug look different.

#607856#20
Date:
2010-06-17 18:50:59 UTC
From:
To:
"Brent S. Elmer Ph.D." <webe3vt@aim.com> writes:

Oh, okay.  That would explain it.

How to make dnsmasq start before openafs-client depends on whether you're
using dependency-based boot ordering.  Given the start numbers I don't
think you are.  So, all you should have to do is, in /etc/rc[2345].d, move
S21openafs-client to S22openafs-client (or whatever numbers you need; I
think I recall those were the ones).

That change should be preserved through any upgrades of either software
package.
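
As a rough sketch of the renaming described above, and assuming the sequence numbers 21 and 22 recalled in the message (the real numbers on a given system may differ), the start links could be renumbered like this.  The function name renumber_start_links and its arguments are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: renumber the S<NN><name> start links in rc2.d..rc5.d.
# $1 = directory holding the rcN.d directories (normally /etc)
# $2 = init script name, $3 = old sequence number, $4 = new sequence number
renumber_start_links() {
    rcroot=$1 name=$2 old=$3 new=$4
    for level in 2 3 4 5; do
        link="$rcroot/rc$level.d/S$old$name"
        # -L also matches a dangling symlink, which -e alone would miss
        if [ -L "$link" ] || [ -e "$link" ]; then
            mv "$link" "$rcroot/rc$level.d/S$new$name"
        fi
    done
}

# Example (as root; the numbers are an assumption):
#   renumber_start_links /etc openafs-client 21 22
```

Runlevels that have no matching link are simply skipped, so running it twice does no harm.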

With the squeeze release, Debian will switch to dependency-based boot
ordering, and you'll need to modify /etc/init.d/openafs-client.  At the
top, you'll want to add dnsmasq to the Required-Start line, which forces
it to be started first.  This change will be preserved on upgrades, but if
the openafs-client init script changes, you'll have to merge this change
with the new changes.

I think there's also some way to use a separate override file in
/etc/insserv/overrides so that you don't have to modify the script, but I
don't know exactly how it works.
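
For the override approach, the insserv manual page describes /etc/insserv/overrides/ as a place for LSB headers that replace the ones inside the scripts.  A minimal sketch, assuming the override file is named after the init script and holds a complete INIT INFO block, copies the existing header and appends dnsmasq to Required-Start.  The helper name make_override is hypothetical, and the result has not been checked against the real openafs-client package.

```shell
#!/bin/sh
# Hypothetical helper: build an insserv override from a script's own header.
# $1 = init script (e.g. /etc/init.d/openafs-client)
# $2 = overrides directory (assumed to be /etc/insserv/overrides)
make_override() {
    script=$1
    overrides=$2
    name=$(basename "$script")
    # Keep only the LSB header, then append dnsmasq to Required-Start.
    sed -n '/^### BEGIN INIT INFO/,/^### END INIT INFO/p' "$script" |
        sed '/^# Required-Start:/s/[[:space:]]*$/ dnsmasq/' \
        > "$overrides/$name"
}

# Example (as root):
#   make_override /etc/init.d/openafs-client /etc/insserv/overrides
```

After writing the override, the boot order would presumably have to be regenerated (for example by re-running insserv), which is also an assumption here.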

You won't have to worry about that until squeeze, though.

#607856#25
Date:
2010-06-17 21:13:28 UTC
From:
To:
user    initscripts-ng-devel@lists.alioth.debian.org
usertag incorrect-dependency
thanks

[Brent S. Elmer]

To make openafs-client start after dnsmasq, you can add dnsmasq as an
init.d script dependency of openafs-client.  A more generic solution
is to make init.d/openafs-client depend on $named, and make sure
dnsmasq implements the virtual facility $named.  A quick check verifies
that dnsmasq currently implements $named (see /etc/insserv.conf), so I
believe adding $named to the openafs-client script would work.
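
The quick check mentioned above can be approximated with a few lines of shell.  This is only a sketch: it assumes /etc/insserv.conf puts each virtual facility at the start of a line, followed by the scripts that provide it (optionally prefixed with "+").  The function name provides_facility is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: does <service> appear on the <facility> line?
# $1 = facility without the leading '$' (e.g. named)
# $2 = service name (e.g. dnsmasq)
# $3 = config file, defaulting to /etc/insserv.conf
provides_facility() {
    facility=$1
    service=$2
    conf=${3:-/etc/insserv.conf}
    # Assumed line format: "$named  +named +dnsmasq +bind9"
    awk -v f="\$$facility" -v s="$service" '
        $1 == f {
            for (i = 2; i <= NF; i++) {
                t = $i
                sub(/^\+/, "", t)      # drop the optional "+" prefix
                if (t == s) found = 1
            }
        }
        END { exit !found }' "$conf"
}

# Example:
#   provides_facility named dnsmasq && echo "dnsmasq provides \$named"
```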

Try changing this line

  # Should-Start:         $syslog

to look like this

  # Should-Start:         $syslog $named

I believe that will solve your problem.

Happy hacking,

#607856#30
Date:
2010-06-17 21:17:52 UTC
From:
To:
Petter Reinholdtsen <pere@hungry.com> writes:

Hi Petter,

Are you sure you wanted to add these tags to this bug?  The openafs-client
init script (and the facilities that it starts) normally does not have a
dependency on dnsmasq or on DNS at all.  It only does in the context of a
modified script that's being run at Brent's site.  I'm not sure if you
therefore want to track this bug as part of the work that you're using
that user tag for, since I don't believe any changes are needed in the
openafs package in Debian.

#607856#35
Date:
2010-06-17 21:25:34 UTC
From:
To:
[Russ Allbery]

Hi. :)

Nope, but thought it best to put it on the boot system radar, to track
the status.  If it isn't related, I'll drop the tags again.

I assumed openafs-client would look up its server name using DNS also
in the common case, and thus would want to start after a local DNS
server is operational when such local DNS server is present.  If that
is not true, the dependency on $named would be wrong.  I do not know
if openafs-client will look up server names in DNS, but from its name
I assumed it was a client and they commonly look up servers using DNS
names. :)

If DNS is only needed for the modified script, and never in the common
case, I agree with you.

Happy hacking,

#607856#40
Date:
2010-06-17 22:51:33 UTC
From:
To:
Petter Reinholdtsen <pere@hungry.com> writes:

OpenAFS is a little weird in that historically it has always known about
its VLDB servers by IP address.  So in a traditional configuration, the
CellServDB file that's created as part of the package configuration will
include the IP addresses of the VLDB servers and no DNS is required at
boot.

Increasingly, people are moving towards using what OpenAFS calls dynroot
with AFSDB (or now DNS SRV) records, which doesn't contact any AFS cells
at boot but instead locates them by DNS queries.  However, in that case
the DNS query is not done until someone attempts to access a file in AFS;
there is no DNS service required to start the client, only once someone
attempts to access files under /afs.

So either way, I don't think the openafs-client init script should declare
a dependency on DNS.

I should mention that the reason I try to avoid adding init dependencies
is that eventually it should be possible to put all of /usr in AFS, so AFS
should be able to start before nearly everything on the system.  This
currently isn't possible due to some configuration which is applied using
a binary in /usr after the initial boot, but I'm (slowly) working with
upstream to find a way to avoid having to do that through a separate
binary.

#607856#45
Date:
2010-06-22 14:08:49 UTC
From:
To:
I added $named to openafs-client and that took care of the problem.  I'm
not sure why it would be in Should-Start instead of Required-Start but I
put it in Should-Start as you suggested.

Thanks,

Brent

#607856#50
Date:
2010-12-14 19:30:07 UTC
From:
To:
The suggestion to do the following in /etc/init.d/openafs-client had
been working great until some update in squeeze a few weeks ago.

I'm not sure what update broke it, but now openafs-client attempts to
start before dnsmasq is fully started.  In the script I put in a ps to
see if dnsmasq is started, and the ps shows that dnsmasq is started (at
least partially).  I put in a ping to a URL and it was unresolvable,
which tells me that dnsmasq is not fully started.

Next I put in a sleep in a loop with the ping, and eventually the ping
resolves, presumably once dnsmasq fully starts.  So, is the problem
openafs not honoring the Should-Start, or dnsmasq reporting that it is
started before it really is fully started?

Brent

#607856#55
Date:
2010-12-14 22:30:45 UTC
From:
To:
"Brent S. Elmer Ph.D." <webe3vt@aim.com> writes:

The latter, unfortunately, which makes this hard to fix.  If the PIDs are
in that sequence, the implication I think is that the dnsmasq init script
returns as soon as it's forked off the process and doesn't wait for it to
be fully running before passing control back to the init system saying
that it had started a named.

There may not be a good fix for this other than to add a sleep statement
somewhere in the startup with a ping similar to what you describe, unless
there's some way to get dnsmasq to report when it's actually done.

#607856#60
Date:
2010-12-15 15:52:19 UTC
From:
To:
I suppose this is a result of the effort to speed up boot times.  What I
ended up doing is looping a few times with a sleep and dig +short to see
if my work intranet URLs resolve yet.  The dig +short returns nothing if
the name doesn't resolve, so it seemed better than the ping.  Would this be
considered a bug in dnsmasq the way it is working?  It seems dnsmasq
shouldn't report starting a named until it actually has.  Gnome actually
starts before the dig loop is done and openafs is started so startup
doesn't actually seem any slower.  It beats having to manually start
openafs as root each time I boot when on my work network.

Thanks,
Brent
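
The retry loop described in the message above could look roughly like the following.  This is a sketch, not the reporter's actual script: the function name wait_for_dns, the RESOLVE_CMD hook, and the default attempt count and delay are all assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: wait until a name resolves, or give up.
# $1 = host name to look up, $2 = attempts (default 10), $3 = delay in seconds
# RESOLVE_CMD may override the lookup command; it defaults to "dig +short".
wait_for_dns() {
    host=$1
    tries=${2:-10}
    delay=${3:-1}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Keep answer lines only; dig can print ";;" diagnostics on timeouts.
        answer=$(${RESOLVE_CMD:-dig +short} "$host" 2>/dev/null | grep -v '^;' || true)
        if [ -n "$answer" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Possible use inside a locally modified init script:
#   wait_for_dns w3.ibm.com 10 1 || { echo "AFS will not start"; exit 0; }
```

Testing for a non-empty answer rather than the exit status follows the observation that dig +short prints nothing when the name does not resolve.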

#607856#65
Date:
2010-12-23 01:30:29 UTC
From:
To:
clone 586226 -1
severity -1 normal
retitle -1 dnsmasq init script returns before dnsmasq fully starts
reassign -1 dnsmasq
thanks

"Brent S. Elmer Ph.D." <webe3vt@aim.com> writes:

I think it is a bug in dnsmasq, although it may be a rather hard one to
fix.  I looked at the init script and it just starts the daemon via the
normal way, so apparently the daemon startup completes before the service
is entirely ready (which is not atypical for daemons).  I'm going to clone
this bug to dnsmasq (keeping the original to add the Should-Start
parameter for the OpenAFS init script) and let the dnsmasq maintainer take
a look and see if there's a way to address this there.