#520927 Some files corrupted from remote dump/restore reproducibly

Package:
dump
Source:
dump
Description:
backup and restore for ext2/3/4 filesystems
Submitter:
Jenny Barna
Date:
2010-05-17 19:57:11 UTC
Severity:
important
#520927#5
Date:
2009-03-23 17:14:34 UTC
From:
To:
I set up dump with an SLT24 tape system and tested local and remote dump
and restore; restoring single files worked. Later, when I restored a user's
lost file, I discovered I had got back text from another (unrelated?) file.

I therefore restored the whole of /etc (critical) locally from a typical dump,
on the system where the tape drive is, and cannot see a problem. But when I
dump/restore from this system to the remote tape drive (same OS) I
reproducibly find some files corrupted (using /etc for all tests). When I run
file * in the restored etc I find two files listed as Vim swap files. These
seem related in some way, e.g. hosts.allow is full of the wrong contents and
one of my old copies (I dated them .0902xx etc.) is reported as a Vim file.
This behaviour happens with the default block size, with a block size of 64,
and with one of 256.

The files in question are not open when I do the test dump, though it is true
the file system is mounted. I have not proved that no files are corrupted in
a local dump. I have read that dump may not be as reliable for a mounted
file system under Linux as under Solaris, which I was using before, but if
this is a feature and not a bug then I am unsure how to avoid it. hosts.allow
has definitely been edited with vi, but another file that seems corrupted is
mailcap, and I do not think I have edited that.

The key thing is that the same corrupted files are seen each time I do this
test, i.e. from /etc on this system to the remote tape drive on the same OS.
dump and restore give no errors.
Thanks.
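
For anyone trying to reproduce this, running file * only catches files whose
type changes. A byte-level comparison of the original tree against the
restored one would list every damaged file. The following is a minimal sketch
of that check, not something taken from the report; the function names are
hypothetical and it only compares the contents of regular files (not
ownership, modes, or symlinks).

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(original_root, restored_root):
    """Compare regular files under original_root with their restored copies.

    Returns (missing, different): relative paths absent from the restored
    tree, and relative paths whose contents differ.
    """
    missing, different = [], []
    for dirpath, _dirnames, filenames in os.walk(original_root):
        for name in filenames:
            source = os.path.join(dirpath, name)
            # Only file contents are compared, so skip symlinks and
            # anything that is not a regular file.
            if os.path.islink(source) or not os.path.isfile(source):
                continue
            relative = os.path.relpath(source, original_root)
            restored = os.path.join(restored_root, relative)
            if not os.path.isfile(restored):
                missing.append(relative)
            elif file_digest(source) != file_digest(restored):
                different.append(relative)
    return sorted(missing), sorted(different)
```

Calling compare_trees("/etc", "/tmp/restore/etc") (the second path is only an
example location for the restored copy) would return the two lists. Files
edited between the dump and the comparison will also show up as different,
so the check is most meaningful right after the dump.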

#520927#10
Date:
2009-06-18 19:54:31 UTC
From:
To:
Which rmt are you using?  On Debian systems, there are several packages
that can provide rmt, so the 'alternatives' mechanism is used to allow
you to choose which one you want.  By default, you probably get the one
provided by the 'tar' package.  It would be interesting to know whether
you're configured to use the rmt that comes with dump, with tar, or some
other version.

Bdale
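
One way to answer the question above is to follow the rmt symlink chain on
the machine that has the tape drive, since that is where rmt runs. This is a
small sketch under assumptions: the helper name resolve_command is
hypothetical, and the directories searched in the usage note are a guess at
where rmt is installed.

```python
import os
import shutil


def resolve_command(name, search_path=None):
    """Locate `name` and follow its symlinks.

    Returns (found, real): the path where the command was found, and the
    file it finally resolves to. Returns (None, None) if it is not found.
    """
    found = shutil.which(name, path=search_path)
    if found is None:
        return None, None
    return found, os.path.realpath(found)
```

For example, resolve_command("rmt", "/usr/sbin:/sbin:/usr/bin") would be
expected to end at a name such as rmt-dump or rmt-tar if the alternatives
mechanism is in use. The update-alternatives tool can display the same
information for the rmt group.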

#520927#15
Date:
2009-06-19 08:24:27 UTC
From:
To:
Since I wrote that in March I switched to using tar for my two remote
machines whereas I stayed with dump for the machine that has the tape
system on it.

Jenny Barna            | Email 	               jcjb@cam.ac.uk
SBS Computing Facility | Web computing.bio.cam.ac.uk
Dept of Biochemistry   | Telephone (Direct)   +44 1223 333644
Tennis Court Road      | Switchboard          +44 1223 333600
Cambridge, CB2 1QW, UK | Fax:                 +44 1223 333345

#520927#20
Date:
2009-06-19 15:20:31 UTC
From:
To:
That's interesting.  So are you using rmt with tar, or some other
transport mechanism?  If you're using rmt-dump in all cases, that would
certainly help point the finger squarely at dump itself!

I personally use amanda, so while I run both dump and tar for different
system needs, I don't ever actually use the built-in remote support...
remote transport is handled by amanda.

Bdale

#520927#25
Date:
2009-06-19 15:41:42 UTC
From:
To:
I guess so.

After I reported this I found, to my horror, that my remote dumps were
useless, with multiple corrupted files (e.g. in /etc), whereas remote tar
seemed OK.

#520927#28
Date:
2009-08-31 23:15:37 UTC
From:
To:
Hi!

One of the users of my Debian package of dump reported repeatable corruption
when using dump over rmt to a remote tape drive.  See the item in our bug
tracking system at http://bugs.debian.org/520927 for more information.

I've poked around a bit, but it won't be easy for me to reproduce the problem.
Does this ring any bells with you?

Bdale

#520927#29
Date:
2009-09-06 16:20:19 UTC
From:
To:
This sounds a lot like the kind of problems that can be encountered
when dumping a live filesystem.

In order to completely rule out any network-related corruption, I
would suggest dumping locally (into a file in /tmp, for
example), and then restoring (or comparing) this dump.

As for the live filesystem problem, the only way to confirm that
this is the cause is, well, to umount (or mount R/O) the filesystem
before dumping.

While you're at it, try to force a fsck on this filesystem, just in
case the filesystem is really corrupt...

Stelian.
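
The local test suggested above could be scripted roughly as follows. This is
only a sketch under assumptions: the function names are hypothetical, it uses
the standard dump options -0 (full dump) and -f (output file) and the restore
options -C (compare against the disk) and -f, and it has to run as root on
the machine being dumped.

```python
import subprocess


def local_check_commands(source_dir, dump_file):
    """Build the argv lists for a level-0 dump to a file and a compare pass."""
    dump_cmd = ["dump", "-0", "-f", dump_file, source_dir]
    compare_cmd = ["restore", "-C", "-f", dump_file]
    return dump_cmd, compare_cmd


def run_local_check(source_dir, dump_file):
    """Run the dump, then the compare, and return restore's exit status.

    A non-zero status is treated here as "something differs or failed";
    see restore(8) for the exact meaning of each code.
    """
    dump_cmd, compare_cmd = local_check_commands(source_dir, dump_file)
    # Stop immediately if the dump itself fails.
    subprocess.run(dump_cmd, check=True)
    return subprocess.run(compare_cmd).returncode
```

For instance, run_local_check("/etc", "/tmp/etc.dump") writes the dump to a
file in /tmp, with no tape or network involved, and then compares it with the
files on disk. Repeating the run with the filesystem mounted read-only would
separate live-filesystem effects from everything else.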

#520927#34
Date:
2010-03-16 07:28:57 UTC
From:
To:
tags 520927 moreinfo
thanks

Hello Jenny,

Back in March 2009, you reported a bug about file corruption when
dumping filesystems.

Bdale Garbee, who maintains the relevant package, asked the upstream
author (Stelian Pop) whether it rang any bells for him.

Stelian answered the following, which was apparently never forwarded
to you (or at least not in a visible way). It sounds like some tests
could be done on your side. Hopefully, you're still in a position to do
them...

Stelian Pop's answer:

This sounds a lot like the kind of problems that can be encountered
when dumping a live filesystem.

In order to completely rule out any network-related corruption, I
would suggest dumping locally (into a file in /tmp, for
example), and then restoring (or comparing) this dump.

As for the live filesystem problem, the only way to confirm that
this is the cause is, well, to umount (or mount R/O) the filesystem
before dumping.

While you're at it, try to force a fsck on this filesystem, just in
case the filesystem is really corrupt...

Stelian.

#520927#44
Date:
2010-03-17 12:00:42 UTC
From:
To:
No, I have no record of this.

I had tested local dumps before I reported the problem. Neither a local
dump to tape, nor a dump piped into restore to a disk, nor a dump to a file
on disk has given any problem I can locate, even though they are from a live
system. And remote tar also seems OK. So, as reported then, the problem was
confined to remote dump.

It was not the problem. See above.

What I have not done is try a remote dump to tape again in recent months,
so if any patch may have fixed it I could try one again.
