Installation of texlive-fonts-extra 2018.20190227-1 hangs.
Here's px information:
# px 5553
/usr/bin/dpkg
--status-fd
80
--no-triggers
--unpack
--auto-deconfigure
--recursive
/tmp/apt-dpkg-install-mos7uq
kernel(0) root
init(1) root
lightdm(915) root
lightdm(1525) root
zsh(1547) vinc17
fvwm2(1629) vinc17
sh(1745) vinc17
xterm(1746) vinc17
zsh(1750) vinc17
su(1839) root
bash(1841) root
aptitude(2012) root
----------------------> dpkg(5553) root
17m01s ago dpkg was started, at 2019-02-28T00:48:45+01:00.
1.5% has been its average CPU usage since then, or 15s/17m01s
Other processes started close to dpkg(5553):
[kworker/0:1-events](5351) was started 2m57s before dpkg(5553)
[kworker/u17:0-kcryptd](5540) was started 1.0s before dpkg(5553)
[kworker/4:2](5771) was started 2m19s after dpkg(5553)
[kworker/u16:1-events_unbound](5928) was started 4m21s after dpkg(5553)
[kworker/5:2-mm_percpu_wq](6068) was started 5m06s after dpkg(5553)
Users logged in when dpkg(5553) started:
vinc17 from 155.133.131.76
vinc17 from :0
2019-02-28T01:05:46.585799: Now invoking lsof, this can take over a minute on a big system...
2019-02-28T01:05:47.098503: lsof done, proceeding.
File descriptors:
stdin : [CHR] /dev/pts/16
stdout: [CHR] /dev/pts/16
stderr: [CHR] /dev/pts/16
Network connections:
Inter Process Communication:
aptitude(2012): [FIFO] pipe
For a list of all open files, do "sudo lsof -p 5553", or "sudo watch lsof -p 5553" for a live view.
Control: severity -1 important Control: retitle -1 texlive-fonts-extra installation hangs for 19 minutes [...] It eventually resumed. In the /var/log/dpkg.log file: [...] 2019-02-28 00:49:41 install texlive-bibtex-extra:all 2018.20190207-1 2018.20190227-1 2019-02-28 00:49:41 status half-installed texlive-bibtex-extra:all 2018.20190207-1 2019-02-28 00:50:27 status unpacked texlive-bibtex-extra:all 2018.20190227-1 2019-02-28 00:50:27 install texlive-extra-utils:all 2018.20190207-1 2018.20190227-1 2019-02-28 00:50:27 status half-installed texlive-extra-utils:all 2018.20190207-1 2019-02-28 00:50:55 status unpacked texlive-extra-utils:all 2018.20190227-1 2019-02-28 00:50:55 install texlive-font-utils:all 2018.20190207-1 2018.20190227-1 2019-02-28 00:50:55 status half-installed texlive-font-utils:all 2018.20190207-1 2019-02-28 00:51:00 status unpacked texlive-font-utils:all 2018.20190227-1 2019-02-28 00:51:01 install texlive-fonts-extra:all 2018.20190207-1 2018.20190227-1 2019-02-28 00:51:01 status half-installed texlive-fonts-extra:all 2018.20190207-1 2019-02-28 01:10:50 status unpacked texlive-fonts-extra:all 2018.20190227-1 2019-02-28 01:10:50 install texlive-fonts-recommended:all 2018.20190207-1 2018.20190227-1 2019-02-28 01:10:50 status half-installed texlive-fonts-recommended:all 2018.20190207-1 2019-02-28 01:11:49 status unpacked texlive-fonts-recommended:all 2018.20190227-1 [...] So that's about 19 minutes with almost no CPU/disk/network activity.
reassign 923423 dpkg thanks Nothing I can do about it, please ask dpkg maintainers what is going on. Norbert
In short: [...] 2019-02-28 00:51:01 status half-installed texlive-fonts-extra:all 2018.20190207-1 2019-02-28 01:10:50 status unpacked texlive-fonts-extra:all 2018.20190227-1 [...] And ditto in the past: [...] 2019-02-02 16:21:55 status half-installed texlive-fonts-extra:all 2018.20190126-1 2019-02-02 16:41:32 status unpacked texlive-fonts-extra:all 2018.20190131-1 [...] [...] 2019-02-26 01:35:08 status half-installed texlive-fonts-extra:all 2018.20190131-1 2019-02-26 01:55:12 status unpacked texlive-fonts-extra:all 2018.20190131-2 [...] [...] 2019-02-27 02:33:14 status half-installed texlive-fonts-extra:all 2018.20190131-2 2019-02-27 02:53:18 status unpacked texlive-fonts-extra:all 2018.20190207-1 [...]
Hi! Thanks for all the info. This does look like a hardware or kernel problem to me, though. Did this start happening due to the new kernel version? What filesystem are you using (say btrfs or similar?), etc, can you reproduce on another system? I certainly can't. :) I guess attaching strace to the running dpkg might also give some more info on what's going on, or where it's stalling. Also if you can simply reproduce via «dpkg -i», then enabling all dpkg debug flags might also help. I see the kernel is tainted. Perhaps try to unload those and see? :) Thanks, Guillem
In March 2018, this was already slow, but much less:
[...]
2018-03-08 00:55:55 status unpacked texlive-fonts-extra:all 2017.20180225-1
2018-03-08 00:55:55 status half-installed texlive-fonts-extra:all 2017.20180225-1
2018-03-08 01:02:55 status half-installed texlive-fonts-extra:all 2017.20180225-1
2018-03-08 01:02:56 status unpacked texlive-fonts-extra:all 2017.20180305-2
2018-03-08 01:02:56 status unpacked texlive-fonts-extra:all 2017.20180305-2
[...]
I don't have older information.
mount information:
/dev/mapper/zira--vg-root on / type ext4 (rw,relatime,errors=remount-ro)
No, on two of my other machines, it takes around 1 minute
(both for old and new versions). I assume that this is what
is expected.
With
dpkg -i /var/cache/apt/archives/texlive-fonts-extra_2018.20190227-1_all.deb
GKrellM shows full disk activity for a bit less than 2 minutes, then
no activity, but dpkg is still running. Attaching strace against it
during this time shows lots of rename's:
[...]
rename("/usr/share/texlive/texmf-dist/fonts/tfm/arkandis/berenisadf/ybdr2cijw8y.tfm.dpkg-new", "/usr/share/texlive/texmf-dist/fonts/tfm/arkandis/berenisadf/ybdr2cijw8y.tfm") = 0
openat(AT_FDCWD, "/usr/share/texlive/texmf-dist/fonts/tfm/arkandis/berenisadf/ybdr2ciw8y.tfm.dpkg-new", O_WRONLY) = 11
fsync(11) = 0
close(11) = 0
rename("/usr/share/texlive/texmf-dist/fonts/tfm/arkandis/berenisadf/ybdr2ciw8y.tfm.dpkg-new", "/usr/share/texlive/texmf-dist/fonts/tfm/arkandis/berenisadf/ybdr2ciw8y.tfm") = 0
openat(AT_FDCWD, "/usr/share/texlive/texmf-dist/fonts/tfm/arkandis/berenisadf/ybdr2cj8c.tfm.dpkg-new", O_WRONLY) = 11
fsync(11) = 0
close(11) = 0
rename("/usr/share/texlive/texmf-dist/fonts/tfm/arkandis/berenisadf/ybdr2cj8c.tfm.dpkg-new", "/usr/share/texlive/texmf-dist/fonts/tfm/arkandis/berenisadf/ybdr2cj8c.tfm") = 0
openat(AT_FDCWD, "/usr/share/texlive/texmf-dist/fonts/tfm/arkandis/berenisadf/ybdr2cj8t.tfm.dpkg-new", O_WRONLY) = 11
fsync(11) = 0
close(11) = 0
[...]
That's the NVIDIA drivers (because nouveau is too broken).
Hi! Yes. This does really look like a kernel or hardware issue TBH. Could you try downgrading the kernel to earlier versions to see whether that improves, that'd give a definitive answer. I'd say try also older dpkg versions, but I seriously doubt that'd change much, as I don't remember any major changes in this area recently. Thanks. Yes this would be the code doing the delayed fsync(3 ) with atomic renames(3), which has been the solution to earlier problems with ext4. Thanks, Guillem
Same issue (also around 20 minutes) with the kernel from stable: Linux zira 4.9.0-8-amd64 #1 SMP Debian 4.9.144-3.1 (2019-02-19) x86_64 GNU/Linux If it's a hardware issue, I'm wondering why I don't have messages in journalctl output.
I've done a test with "iozone -a -e", and with this -e option
(to include fsync), the operations are much slower than on the
other machine.
zira:~> iozone -a -e
[...]
random random bkwd record stride
kB reclen write rewrite read reread read write read rewrite read fwrite frewrite fread freread
64 4 1821 1987 1922 3703 3657 1770 1621 1628 1697 1707 1773 1879725 2892445
64 8 1606 1684 1713 3493 3476 1700 1765 1730 1812 2135 1852 2278628 3406295
64 16 1807 1823 1986 3898 3798 1977 1813 1988 1871 2049 3320 2561267 4018152
64 32 1702 1714 1755 3494 3683 1741 1841 1796 1737 1633 1674 1879725 4018152
64 64 1808 1790 1632 3314 3387 2254 1995 1843 1984 1789 1820 2801873 3203069
[...]
I meant, except fread and freread.
Hi! Ok, can we conclude this is not a problem in dpkg then? :) It seems to me either hardware, filesystem or kernel related as mentioned before. Could you reassign where you think it would be more appropriate? Thanks, Guillem
Hi, I'm not sure. This may just be a slow disk after all, and in practice (I mean except benchmarking), the problem seems specific to dpkg. I assume that programs normally don't do a fsync at every fraction of microsecond! I don't know what dpkg is trying to achieve with such a frequency of fsync, but this doesn't seem to make sense to me. You had said "which has been the solution to earlier problems with ext4", but the real solution would be to fix ext4 (does "earlier" mean this has eventually been fixed?). I've seen https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1785020 "fsync is slow on later kernels with ext4 filesystms" but the comments in this bug page and 4.19.28-1 kernel logs show that this has been fixed. FYI, I get very similar timings on the same machine, but on /boot, which is: /dev/sda1 on /boot type ext2 (rw,relatime,block_validity,barrier,user_xattr,acl,stripe=4) The fact that the main, ext4 partition is encrypted but not this one makes another difference, but the timings are similar, so that this doesn't seem to be related to the FS system itself and appears to be at a lower level. Note also that my machine is a laptop, and I couldn't do a comparison with other laptops, just in case. I think that this should still be regarded as a dpkg bug, based on my first paragraph above (the fsync's occur much too often in dpkg compared to other programs, and this doesn't seem to be useful).
To confirm the issue: Unpacking texlive-fonts-extra (2020.20210202-3) over (2020.20210202-1) ... is incredibly slow. I'm running Debian/bullseye; the SSD is ext4 and far from full. What is the status of this issue? What are the conclusions after all the above correspondence? Is there any hope? - Joachim