#976907 golang-github-boltdb-bolt: FTBFS on ppc64el (arch:all-only src pkg): dh_auto_test: error: cd obj-powerpc64le-linux-gnu && go test -vet=off -v -p 160 -short github.com/boltdb/bolt github.com/boltdb/bolt/cmd/bolt returned exit code 1
- Package: src:golang-github-boltdb-bolt
- Source: golang-github-boltdb-bolt
- Submitter: Lucas Nussbaum
- Date: 2021-02-01 19:45:05 UTC
- Severity: normal
- Tags:
Hi,

During a rebuild of all packages in sid, your package failed to build on ppc64el. At the same time, it did not fail on amd64.

I'm marking this bug as severity:serious since your package has only Architecture:all binary packages, and should thus, in theory, build everywhere. Failure to build on ppc64el might indicate a serious issue in this package or in another package. But feel free to downgrade or close if you believe that this is only a build-time issue. (I would personally prefer a severity:minor bug just to track that the package can only be built on specific architectures.)

Relevant part (hopefully): http://qa-logs.debian.net/2020/12/09/golang-github-boltdb-bolt_1.3.1-7_unstable.log

A list of current common problems and possible solutions is available at http://wiki.debian.org/qa.debian.org/FTBFS . You're welcome to contribute!

If you reassign this bug to another package, please mark it as 'affects'-ing this package. See https://www.debian.org/Bugs/server-control#affects

If you fail to reproduce this, please provide a build log and diff it with mine so that we can identify if something relevant changed in the meantime.

About the archive rebuild: The rebuild was done on a Power8 cluster that is part of the Grid'5000 testbed. Hardware specs: https://www.grid5000.fr/w/Grenoble:Hardware#drac
Hello all,

1 down, 1 to go... info below.

[...]
^--- I've not looked into this one yet.

[...]
[...]
^-- this one is solved by adding `tx.Rollback()` as the last statement in the TestTx_Commit_ErrTxNotWritable function (in tx_test.go:65). In other words, this is a test-suite bug (not a bug in the actual product code).

The reasoning goes that tx.Commit() is expected to return the error bolt.ErrTxNotWritable, which it does, but this means the transaction is still holding a reader lock on db.mmaplock. After the test function finishes, the deferred db.MustClose() runs and calls into code that tries to take a read-write lock on db.mmaplock, which times out. The added tx.Rollback() on a read-only tx basically only removes the transaction and releases db.mmaplock.

I have no idea why this would not also trigger on any other arch.

Regards,
Andreas Henriksson
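The locking interaction described above can be modelled in isolation. The following is a minimal sketch, not bolt's code: `mmaplock` is a plain `sync.RWMutex` standing in for db.mmaplock, and `tryWriteLock` and `closeWouldSucceed` are hypothetical helpers invented for the illustration. It assumes Go 1.18 or later, where `sync.RWMutex.TryLock` is available.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tryWriteLock reports whether the write lock on mu can be taken within d.
// Hypothetical helper for this sketch; it is not part of bolt.
func tryWriteLock(mu *sync.RWMutex, d time.Duration) bool {
	deadline := time.Now().Add(d)
	for time.Now().Before(deadline) {
		if mu.TryLock() {
			mu.Unlock()
			return true
		}
		time.Sleep(time.Millisecond)
	}
	return false
}

// closeWouldSucceed models a read-only transaction followed by a close.
// The transaction takes the read lock when it begins; only a rollback
// releases it. The close path then needs the write lock.
func closeWouldSucceed(rollback bool) bool {
	var mmaplock sync.RWMutex // stand-in for db.mmaplock (assumption)

	mmaplock.RLock() // read-only transaction begins

	// A failed commit on a read-only transaction leaves the read lock held.

	if rollback {
		mmaplock.RUnlock() // what the added tx.Rollback() achieves
	}

	return tryWriteLock(&mmaplock, 50*time.Millisecond)
}

func main() {
	fmt.Println("close succeeds without rollback:", closeWouldSucceed(false)) // false
	fmt.Println("close succeeds with rollback:", closeWouldSucceed(true))     // true
}
```

The model only shows why a lingering reader blocks the writer; it does not explain why the timeout shows up on ppc64el and not on amd64.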
Hello again,

[...]

Now also quickly looked into this one. It seems the test-suite makes assumptions tied to calculations that involve os.Getpagesize() (which gives 4096 on amd64 and 65536 on ppc64el, i.e. 16 times larger). Changing the number 500 to 8000 (16 times larger) in TestBucket_Stats(...) (in bucket_test.go:1143) gives the expected BranchPageN == 1 (however, with this modification the test then fails with "unexpected LeafPageN: 6").

Anyway, this makes me lose interest in pursuing this further. In my opinion it's pretty clear that these are test-suite only issues and not issues in the actual product. Unless someone else wants to pursue fixing up the test-suite for ppc64le needs, my offer to "fix" this will be to simply disable it on !amd64 architectures (unless we agree on simply downgrading this issue to non-RC).

Regards,
Andreas Henriksson
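The scaling arithmetic above can be written down as a small sketch. `basePageSize` and `scaledCount` are hypothetical names introduced here, and the assumption that the test constants were tuned for 4096-byte pages is inferred from the numbers in this message, not taken from the bolt sources.

```go
package main

import (
	"fmt"
	"os"
)

// basePageSize is the page size the hard-coded test constants appear to
// assume (an assumption based on the amd64 value quoted above).
const basePageSize = 4096

// scaledCount scales an item count chosen for basePageSize to another
// page size. Hypothetical helper, not part of bolt.
func scaledCount(base, pageSize int) int {
	return base * pageSize / basePageSize
}

func main() {
	fmt.Println(scaledCount(500, 4096))  // 500  (amd64)
	fmt.Println(scaledCount(500, 65536)) // 8000 (ppc64el)

	// Value for whatever machine this runs on.
	fmt.Println(scaledCount(500, os.Getpagesize()))
}
```

As noted in the message, scaling this one constant fixes BranchPageN but still leaves an unexpected LeafPageN, so a linear scale factor alone is not a complete fix for the test.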
Hello again,

So after wasting my time here I finally realized that apparently boltdb is archived upstream. It will not receive any fixes. Apparently golang-github-coreos-bbolt is a maintained, feature-extended fork. We should likely encourage moving to that and get boltdb removed from Debian.

The timeout waiting for db.mmaplock that occurred in boltdb is apparently already fixed in bbolt, see: https://github.com/etcd-io/bbolt/commit/e06ec0a754bc30c2e17ad871962e71635bf94d45

The pagesize issue seems to plague them both still though. See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=976926 for the bug report against bbolt.

Regards,
Andreas Henriksson
Hi Andreas,

Thanks a lot for investigating!

The problem with removing this 'right now' is that this package has a few (important) reverse-dependencies and reverse-build-deps.

$ reverse-depends golang-github-boltdb-bolt-dev
Reverse-Depends
* golang-github-blevesearch-bleve-dev
* golang-github-hashicorp-nomad-dev
* golang-github-hashicorp-raft-boltdb-dev
* golang-github-influxdb-influxdb-dev

$ reverse-depends golang-github-boltdb-bolt-dev -b
Reverse-Testsuite-Triggers
* snapd
Reverse-Build-Depends
* docker-libkv
* etcd
* go-dep
* golang-github-blevesearch-bleve
* golang-github-hashicorp-raft-boltdb
* influxdb
* nomad
* snapd
* vuls

Can simply replacing the dependency in all of them with bbolt work? This may also need upstream patching in the future.

Please let me know.

Kind Regards,
Nilesh
example, see: https://github.com/hashicorp/raft-boltdb/pull/19#issuecomment-703732437 In short: hashicorp-raft-boltdb wants to make sure there's no issue before making the change. This change would impact reverse build deps of hashicorp-raft-boltdb, like nomad or consul. I think it's better to downgrade the severity here (as was done in coreos-bbolt, see https://bugs.debian.org/976926).