#129330 sitecopy: inadequate locking causes data corruption/loss

Package:
sitecopy
Source:
sitecopy
Description:
program for managing a WWW site via FTP, SFTP, DAV or HTTP
Submitter:
Michael Gurski
Date:
2015-03-19 01:42:01 UTC
Severity:
wishlist
#129330#5
Date:
2002-01-15 08:21:30 UTC
From:
To:
When sitecopy is used for periodic (N minute) updates to remote web
spaces, any delay transmitting that's greater than the delay between
invocations causes sitecopy to corrupt its database for that site.
This is manifested as what sitecopy believes is "Extra content at the
end of the document" (caused by the later invocation writing to the
file at the same time as an earlier one).  The error message from
sitecopy looks like:

=-=-=-=
sitecopy: Error: Corrupt site storage file for `foo':
sitecopy: XML parse error at line 14: Extra content at the end of the document
.
sitecopy: Skipping site `foo'.
=-=-=-=

This happens repeatedly on servers which have questionable reliability
(@home (...)), and when the error is fixed by either editing the XML
source file, or re-initializing, it causes several "File exists"
warnings.

Recently, going through this same process (yet again), sitecopy
decided on a new tactic, deleting several files on the remote site
which existed locally (sitecopy -k --update foo), which, according to
the manpage, should NOT have occurred.

Output of that session:

=-=-=-=
sitecopy: Updating site `foo' (on remote.webspace.machine in /userarea/)
Deleting index.html: done.
Deleting stsn.html: done.
Deleting webcam.html: done.
Deleting srcs.html: done.
Deleting associations.html: done.
Deleting go_away.html: done.
Deleting cam_ico.html: done.
=-=-=-=

I consider this to be a severe defect in sitecopy.  At the very least,
it should lock its own databases so that only one instance at a time
is able to access them.

#129330#9
Date:
2002-01-18 00:16:55 UTC
From:
To:
(Masayuki, thanks for forwarding the bug report: will the
129330-forwarded@bugs address automatically copy this to the reporter or
must I do that myself?)

This is a known flaw, though I could argue "don't do that then"! It is
very easy to update your cron script to be:

	lockfile -r0 $HOME/.sitecopy/mysite.lock && sitecopy -u mysite

I'll update the FAQ with this; I don't know when I'll get round to
adding locking support.
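
One caveat with the one-liner above, as far as I can tell: nothing removes the lock file afterwards, so later cron runs would be skipped until it is deleted by hand. Below is a minimal sketch of a wrapper that releases the lock once the command finishes. It uses an atomic mkdir in place of lockfile, which is a swapped-in technique rather than what was proposed here; the names run_locked and mysite.lock.d are hypothetical.

```shell
#!/bin/bash
# Sketch: run a command under an exclusive lock and skip the run if another
# instance already holds it. mkdir either creates the directory or fails,
# atomically, so the directory itself serves as the lock.
# run_locked is a hypothetical helper, not part of sitecopy.

run_locked() {
    local lockdir="$1"
    shift
    # mkdir fails if the directory already exists: another run holds the lock.
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "lock $lockdir is held, skipping this run" >&2
        return 75   # EX_TEMPFAIL from sysexits.h
    fi
    # Run the command, then release the lock whether or not it succeeded.
    "$@"
    local rc=$?
    rmdir "$lockdir"
    return $rc
}

# Example cron usage (site name "mysite" as in the message above):
#   run_locked "$HOME/.sitecopy/mysite.lock.d" sitecopy -u mysite
```

If the wrapper is killed before rmdir runs, the lock directory stays behind and has to be removed manually; a trap handler could cover that case.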

Regards,

joe

#129330#10
Date:
2002-01-25 06:31:56 UTC
From:
To:
everything using the script instead of calling sitecopy directly,
sitecopy is *still* corrupting its xml files.  It seems to happen at
about the time sitecopy is unable to get to a down server or possibly
when it starts to transfer files and then the ftp server decides to
error out (I know, I know...as soon as I can get DSL, @Home's remnants
are going the way of the dodo).  The next time sitecopy's able to
connect, the xml file is corrupted.  I think sitecopy might need a
little more logic on when/how to write out its files in the case of
errors.
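
To illustrate the kind of logic meant here: a common way to avoid leaving a half-written file behind is to write the new contents to a temporary file and rename it over the old one only once the write has succeeded. The sketch below shows that general pattern in shell; it is not how sitecopy stores its data, and write_atomically is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: replace a state file only after the new contents are fully written.
# write_atomically is a hypothetical helper, not a sitecopy function.

write_atomically() {
    local target="$1" tmp
    # Create the temporary file next to the target so that mv is a rename
    # within one filesystem rather than a copy.
    tmp=$(mktemp "$target.XXXXXX") || return 1
    # Copy stdin to the temporary file; on failure, leave the target untouched.
    if cat > "$tmp"; then
        mv -f "$tmp" "$target"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Example usage (the file name is illustrative only):
#   generate_state | write_atomically "$HOME/.sitecopy/foo"
```

With this pattern a reader of the file sees either the old contents or the new ones, never a mixture, provided both paths are on the same filesystem.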

Output from cron emails:

(run 1):

sitecopy: Error: Could not authorise user on server.

(run 2):

Failed to update images:
550 /xxxx/images: File exists
Failed to update keys:
550 /xxxx/keys: File exists
...
550 /xxxx/images/rants/lockheed_moron: File exists
Failed to update images/jpgs/mike_20000710.jpg:
Could not read response line: connection timed out.
Failed to update keys/card.ps:
530 Sorry, maximum number clients (5) from your host already connected.
Failed to update keys/revocations.txt:
530 Sorry, maximum number clients (5) from your host already connected.
...
Failed to update uprecord.txt:
530 Sorry, maximum number clients (5) from your host already connected.
sitecopy: Errors occurred while updating the remote site.


(run 3):

sitecopy: Error: Corrupt site storage file for `xxxx':
sitecopy: XML parse error at line 54: Extra content at the end of the document
.
sitecopy: Skipping site `xxxx'.
sitecopy: No valid sites specified.
Try `sitecopy --help' for more information.



The script in question, which I tested by hand and which otherwise
works well (i.e. only a single instance ever runs, and it doesn't queue
up five million instances if one is taking forever):

#!/bin/bash

LOCKFILE=$HOME/.sitecopy/sc.lock

lockfile -! -r 10 $LOCKFILE

if [ $?  ];
then
    sitecopy $*
    rm -f $LOCKFILE
fi
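
A note on the wrapper above: `[ $? ]` only tests whether the string is non-empty, so it is always true, and `-!` inverts lockfile's exit status. As written, sitecopy therefore appears to run whether or not the lock was obtained, and the `rm -f` can remove a lock held by another run. Whether that explains the corruption reported here is not established. A minimal corrected sketch follows, assuming procmail's lockfile(1) is installed; the function name sitecopy_locked is hypothetical.

```shell
#!/bin/bash
# Sketch of the wrapper with the exit-status check fixed.
# Assumes procmail's lockfile(1); sitecopy_locked is a hypothetical name.

LOCKFILE="$HOME/.sitecopy/sc.lock"

sitecopy_locked() {
    # Without -!, lockfile exits 0 only if it created the lock file.
    if lockfile -r 10 "$LOCKFILE"; then
        # "$@" keeps arguments containing spaces intact, unlike $*.
        sitecopy "$@"
        local rc=$?
        # Remove the lock only on this path, where this run owns it.
        rm -f "$LOCKFILE"
        return $rc
    fi
    echo "could not acquire $LOCKFILE, not running sitecopy" >&2
    return 1
}

# In the cron script itself, the last line would be:
#   sitecopy_locked "$@"
```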

#129330#19
Date:
2008-02-06 11:07:37 UTC
From:
To:
forcemerge 182894 129330 141178
thanks

The following three bugs should all be merged together, since they have
the same root cause, as upstream suggested.

#182894: sitecopy: needs reenforcing against 2nd instance of itself
#129330: sitecopy: inadequate locking causes data corruption/loss
#141178: sitecopy: Running multiple instances screws up xml configuration file

#129330#26
Date:
2008-02-26 13:13:30 UTC
From:
To:
forwarded 129330 Joe Orton <joe@manyfish.co.uk>
forwarded 141178 Joe Orton <joe@manyfish.co.uk>
forwarded 182894 Joe Orton <joe@manyfish.co.uk>
stop

Already forwarded to upstream. Adding appropriate action on it!
Thanks!
