- Package:
- zlib1g-dev
- Source:
- zlib
- Description:
- compression library - development
- Submitter:
- Jonathan Nieder
- Date:
- 2010-02-08 23:01:21 UTC
- Severity:
- wishlist
Hi,
zlibâs gzFile interface provides no way I can see to recover from
EINTR.
Context:
When library code makes syscalls, I live in constant danger that the
application may have installed a signal handler with SA_RESTART
disabled. If the system call is interrupted by a signal, depending
on the application, one of three behaviors would be appropriate:
a. if the goal is to walk the stack after a SIGINT and clean up
before exiting, my function should return quickly to the
application with an error;
b. if the interruption is just a side effect of something else (maybe
this is a System V-based system and the application writer forgot
to use SA_RESTART, or maybe the interruption was intended for some
other thread), my function should restart the call and return with
success once the operation succeeds.
c. if the interruption (perhaps from SIGIO) represents some
high-priority job that needs to be done soon, my function should
let that job finish and then resume.
The usual way to handle this is to either pick one or leave it to the
application by returning with enough information to resume the
operation later.
In my case, the library was libdpkg, and the chosen behavior was
option b. Simple, right?
char buffer[BUFSZ];
int actualread;
gzFile gzfile = gzdopen(fd_in, "r");
for (;;) {
int actualread = gzread(gzfile, &buffer[0], sizeof(buffer));
if (actualread < 0) {
int err;
const char *errmsg = gzerror(gzfile, &err);
if (err == Z_ERRNO && errno == EINTR)
continue;
errx(_("read error: %s"), errmsg);
}
if (actualread == 0) /* EOF! */
break;
write(fd_out, &buffer[0], actualread);
}
int err = gzclose(gzfile);
// handle error or return
Unfortunately, when zlib hits its first EINTR, two bad things happen:
- the error indicator for the underlying stdio stream is set;
- the error number for the gzFile is set
and there is no way to undo either. So in fact this is an infinite
loop.
Thoughts about fix:
I would love it if zlib just cleared the error indicator before each
potentially resumable operation (gzread(), gzwrite(), gzseek()). But
probably someone out there is relying on zlib _not_ to do that, as
part of his implementation of error recovery plan a above, so such a
change would be dangerous.
Better would be to add a new gzclearerr() function to allow the two
error indicators to be explicitly cleared.
void Z_EXPORT gzclearerr (file)
gzFile file;
{
gz_stream *s = (gz_stream*)file;
if (s == NULL) return;
s->z_err = Z_OK;
clearerr(s->file);
}
What do you think?
Jonathan
The enclosed request was submitted against the Debian zlib package - it seems like a sensible request, though it might be hassle for portability.
Hi! There's already a gzclearerr provided by zlib, which clears both the EOF and error flags. And I think it's perfectly acceptable for code which might want to handle the interrupted case to call gzclearerr, and resume the previous zlib call. The actual problem though, as you initially pointed out in the debian-dpkg thread, is that not all zlib IO functions preserve enough state to always recover from a partial read/write. Because zlib uses now read(2)/write(2) functions for IO, on all write functions using gz_comp, the write might return a short count, but it does not store it anywhere or retries like its counterpart gz_load, which makes it impossible for the caller to restart the call even if having used gzclearerr. I had skimmed previously over the gzio.c (stdio based) code and there were several other problematic functions, but doing so now over the new gzlib.c based code it seems to be in a way better shape, although there might still be some other problems, I've not checked thoroughly. The EINTR, EAGAIN, EWOULDBLOCK errors seem to be safe to be handled now with gzclearerr, as there's no possibility of a partial fread/fwrite returning with those anymore. regards, guillem
Note that I've not uploaded the most recent versions of zlib to experimental yet due to some build system changes I need to refresh the packaging for. There are still some incompatibility issues with gzio users that are being worked through so it's not gone to unstable quite yet.
Ah! Thanks for pointing this out. I'm not sure upstream might see much
point in fixing the stdio based code, but as I skimmed over the code
anyway, it might be useful, so here's some of the problems I found:
putLong (Does not check for errors from fputc)
getLong (Does not handle restartable IO as it unconditionally calls
get_byte)
check_header (Migth fail in with Z_DATA_ERROR due to interrupted fread
from get_byte)
gz_open (Does not check return code from fprintf, might fail due to
interrupted check_header)
gzread (Migth fail in interrupted getLong or check_header)
gzwrite (Does not store the partial written item count from fwrite)
do_flush (Likewise)
regards,
guillem
Oh, I doubt anything in gzio.c will ever be touched again if it can possibly be avoided on the basis that it's been working for people for years and every attempt to change things causes some kind of interoperability issue.
Hi,
Guillem Jover wrote:
Thanks for the reminder. (Since I had to look it up, the thread in
question is at
<http://lists.debian.org/debian-dpkg/2009/10/msg00091.html>.)
I think gzwrite is the worst one. gzwrite, gzprintf, gzputs, gzgets,
gzflush, and gzclose are not resumable, but I suspect it is only worth
fixing gzwrite (hence gzputs), gzflush, and gzclose.
Considering each function in gzio.c from zlib 1.2.3.5 in turn:
- gzread() returns the number of bytes read, so in theory it should
be fine. In practice, it is pretty good, too. Just one issue:
In transparent mode, on short reads, the fread() call sets
the error flag on the file handle but not the zlib error
number. The caller has to do another one-byte gzread() to
learn the nature of the error.
- gzwrite() forgets the number of bytes actually written and returns
0 on error (-1 in gzlib.c). A resumable function would return the
short count.
- gzprintf(): it is not worth making this resumable. Applications that
need a resumable gzprintf should write a sprintf coroutine (or use
snprintf() in one fell swoop, if the string to be written is short
enough) and use gzwrite(). In the simplest case, the semantics
of SA_RESTART might be implementable and good enough, though.
- gzputs() is implemented in terms of gzwrite() and shares its problems.
- gzgets() is not resumable, but neither is fgets(). A resumable
gzgets() would need another output parameter to tell how many bytes
were read.
- gzputc() and gzputc() are resumable.
- gzungetc() does not do I/O.
- do_flush() forgets the number of bytes actually written on error, so
gzflush() is not resumable.
- if gzseek() or gzrewind() fails, the error indicator is not set, so
one canât detect if it is a zlib internal failure or fseek() failure.
gzseek() and gzrewind() are resumable. They might screw up the file
offset if not called again after an error.
- gztell() does not do I/O.
- gzoffset() is not implemented.
- gzeof() returns 1 for I/O errors as well as EOF. This is fine, but
it should be documented.
- gzdirect() does not do I/O.
- gzclose() is not resumable. Neither is fclose(), unless one calls
fflush() first. Unfortunately, a gzflush() is not enough to write
out everything that gzclose() is going to write to the file. Maybe
a new resumable function to do everything except close the file
would help?
If a close() call is interrupted, I am not sure what one is
supposed to do. Friendly operating systems will prevent that, I
guess.
gzclose() does not report I/O errors when writing its last few
bytes.
- gzerror() looks safe.
- gzclearerr() does exactly what one would want it to. Sorry for the
nonsense!
I havenât read this over yet. Glad to hear there seems to have been
some progress.
write() still can, no?
From gz_comp(), it seems the length of partial writes are still
sometimes forgotten:
if (have && write(state->fd, state->next, have) != have) {
gz_error(state, Z_ERRNO, zstrerror());
return -1;
}
Hope that helps,
Jonathan
Guillem Jover wrote:
Sorry I missed this before. Looks like you caught a few problems I missed.
For completeness, I should mention:
gzread (Does not set the error flag on short reads in transparent mode)
gzprintf, gzputs (Uses gzwrite)
gzgets (Does not return the length of a short read)
gzflush (Uses do_flush)
destroy (Does not allow retrying after a failed fclose())
gzclose (Uses do_flush, putLong, and destroy)
It might be good to document that these operations are not resumable.
Hopefully, gzlib will deal better.
Regards,
Jonathan
Like I say, gzio.c is irrelevant now - it's deprecated in favour of the new implementation. Looking at it is pretty much a waste of time, it's not built in the Debian package any more.
Mark Brown wrote: Thatâs good. ;-) Iâll try to find time to look over gzlib tonight. Jonathan
When interrupted or on non-blocking mode write(2) can either return > 0 (possibly a partial write) or -1 with any of those error codes, but it should never return with a partial write and errno set to any of those, which is possible with fwrite (as internally it might be doing several write(2) calls and the last one might fail with any of those errors). So in the gz_comp case as long as the write returns -1 and never does a partial write it's fine to restart it. OTOH if it's a partial write errno will be bogus. regards, guillem
Guillem Jover wrote: Ah, okay, so with "fast" devices (such as disks), itâs fine. For "slow" devices such as pipes, it would be nice to report the partial writes. If this causes the calling operation to fail, I am not sure what the right way to report this is. If this happens in the middle of a block, fwrite() just tries to resume; but that is not always very good behavior. Maybe this should be configurable via a new flag in the 'mode' string (e.g., 'B' for non-blocking)? Jonathan
Jonathan Nieder wrote: Agh, that would be dangerously backwards-incompatible. With gzio, the cleaned-up mode string was passed on to fopen. Currently, a perfectly legitimate short write (for example, to a pipe) can cause gzwrite to fail. Fixing this properly would require adding a flag and a function to set it, to decide whether to retry on short reads and writes. If we were starting from scratch, the best default would be _to_ retry. But the backwards-compatible choice is to default to not retrying. Jonathan
Mark, There already is a gzclearerr() which does what was requested. Mark
Hi Mark,
First of all, sorry for the nonsensical original report.
Mark Brown wrote:
implementation. Some of these things may be worth fixing, some not:
gz_open (can segfault if it runs out of memory for path)
gz_comp (reports partial writes with "slow" devices as errors.
This is wrong in two ways: it discards data, and errno
contains random nonsense.)
gz_load (blocks in non-blocking situations with "slow" devices)
gzprintf (uses gz_zero and gz_comp, inherently nonresumable)
gzclose_r (does not report I/O errors!)
gzclose_w (uses gz_zero, gz_comp; inherently nonresumable)
gzclose (uses gzclose_r, gzclose_w)
gz{_zero,setparams,write,put[sc],flush} (uses gz_comp)
NEXT gz{_{avail,head,next4,decomp,make,skip},read,gets,{un,}getc} (uses gz_load)
Of these issues, the most important to me is the gz_comp one. I
would like not to discard data. For my uses, it would be fine to
restart write() to finish up any partial write.
Doing that would break applications that expect compression not to
block for so long, either because they used gzdopen on a file
descriptor with O_NONBLOCK or because the write to a "slow" device was
interrupted by a signal. Luckily, it is possible to make both these
people and me happy without introducing new API or ABI.
The most flexible thing is to report enough information to allow
resuming after a partial write. This means:
- A field is added to gz_state to represent whether a flush is pending.
gz_comp deals with any pending flush before starting other I/O.
- gzwrite returns the number of input bytes consumed. If not all bytes
reached the file yet, thatâs fine, so there is no need to distinguish
successful writes from partial writes that consumed all input.
- This would make gzputc and gzputs automatically behave
appropriately.
- A short write in gzflush needs to not go undetected. gzflush should set
errno = EINTR in this case and return E_ERRNO so the application
knows it can be resumed.
A fix for the gz_open bug mentioned above follows (warning: untested).
---
gzlib.c | 12 ++++++++----
1 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/gzlib.c b/gzlib.c
index bcef6c2..b74cb6b 100644
--- a/gzlib.c
+++ b/gzlib.c
@@ -153,6 +153,14 @@ local gzFile gz_open(path, fd, mode, use64)
return NULL;
}
+ /* save the path name for error messages */
+ state->path = malloc(strlen(path) + 1);
+ if (state->path == NULL) {
+ free(state);
+ return NULL;
+ }
+ strcpy(state->path, path);
+
/* open the file with the appropriate mode (or just use fd) */
state->fd = fd != -1 ? fd :
open(path,
@@ -176,10 +184,6 @@ local gzFile gz_open(path, fd, mode, use64)
if (state->mode == GZ_APPEND)
state->mode = GZ_WRITE; /* simplify later checks */
- /* save the path name for error messages */
- state->path = malloc(strlen(path) + 1);
- strcpy(state->path, path);
-
/* save the current position for rewinding (only if reading) */
if (state->mode == GZ_READ) {
state->start = LSEEK(state->fd, 0, SEEK_CUR);
---
Jonathan Nieder wrote:
Hereâs half of the fix to that.
gzread.c => gzread.c.new | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/gzread.c b/gzread.c.new
index 836c57c..5603eca 100644
--- a/gzread.c
+++ b/gzread.c.new
@@ -624,8 +624,10 @@ int ZEXPORT gzclose_r(file)
free(state->in);
}
gz_error(state, Z_OK, NULL);
- close(state->fd);
- free(state);
+ if (close(state->fd)) {
+ free(state);
+ return -1;
+ }
return Z_OK;
}
Here is the other half (also untested).
gzread.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/gzread.c b/gzread.c
index 5603eca..7ba3268 100644
--- a/gzread.c
+++ b/gzread.c
@@ -622,10 +622,12 @@ int ZEXPORT gzclose_r(file)
inflateEnd(&(state->strm));
free(state->out);
free(state->in);
+ state->size = 0;
}
gz_error(state, Z_OK, NULL);
if (close(state->fd)) {
- free(state);
+ if (errno != EINTR)
+ free(state);
return -1;
}
return Z_OK;
There's some other issues in there surrounding the handling of errors from the underlying file I/O calls - I'll forward on some patches this weekend when I've got a bit more time, unless Jonathan does so first (I'm fairly sure some of the issues were fixed already).
Mark Brown wrote: Yes, sorry about that. From the changelog, it looks like two of the three issues I wrote patches for were fixed in 1.2.3.6 or 1.2.3.7. So Iâd rather find 1.2.3.7 before wasting more time with patches. Problems that probably remain: gz_comp (reports partial writes with "slow" devices as errors. This is wrong in two ways: it discards data, and errno contains random nonsense. does not store enough state to resume an interrupted flush) gz_load (blocks in non-blocking situations with "slow" devices) gzprintf (uses gz_zero and gz_comp, inherently nonresumable) gzclose_r (Should not free state if close fails with EINTR) gzclose_w, gzwrite, gzputc, gzflush, gzsetparams (does not allow recovery from an interrupted seek) Is the latest source available anywhere? In my few minutes of searching, I have not been able to find it. Jonathan
Hi again,
Mark Brown wrote:
the error handling problems I can find. As before, the use I have in
mind is libraries that want to behave sanely when a signal handler is
installed without SA_RESTART.
Patch 1 is a follow-up to the patch to catch allocation failures in
gz_open from 1.2.3.7. It seems that that patch introduced a small
file descriptor leak; this fixes it.
Patches 2 and 7 address a small problem with gzclose. If a system
call made by gzclose was interrupted by a signal, there was no way
to clean up after that; with patches 2-7 applied, one can call
gzclearerr() and repeat the gzclose() to recover.
Patches 3-6 and 8 allow recovery from partial or interrupted writes.
Some (broken) applications might have been relying on zlib to quickly
return control in these situations, so for backward compatibility,
zlib cannot take care of recovering on its own. So patch 3 teaches it
to ask the caller for help, by reporting EINTR.
Patch 9 teaches zlib to recover from an interrupted read. Existing
zlib versions already resume reading after a partial read, so the
only consistent thing to do is notice and handle the interruption.
Patches 2-8 leave me a bit uncomfortable: should zlib really force the
caller to retry calls to avoid breaking programs with broken
assumptions? By broken assumptions, I mean this: a program may want
to rely on System V-style signal semantics to quickly bail out after a
interruption. Great: code to do so looks like this:
static sig_atomic_t user_aborted;
void handle_sigint(int signum)
{
user_aborted = 1;
}
void do_something(void)
{
some_system_call();
if (user_aborted) clean_up_and_exit();
another_system_call();
if (user_aborted) clean_up_and_exit();
etc
}
By checking the value of user_aborted after each system call, one can
be sure to handle the interruption before blocking in another call.
If such a program uses zlib, this strategy does not (and should not)
work any more. For example, in gz_comp, zlib calls write several
times in a loop; a write can be interrupted by a signal handler
"after" writing the appropriate number of bytes and return success,
allowing zlib to block in the next write call.
Still, I understand backward compatibility is important for zlib, so I
am still afraid to break such code.
Thank you for writing zlib. I hope you enjoy the patches.
Kind regards,
Jonathan Nieder (9):
gzopen: avoid a file descriptor leak after malloc() failure
gzclose_r: allow recovery from EINTR
gz_comp: report short writes as EINTR
gz_comp: consume part of output buffer for partial writes
gz_zero: make return value a number of zeros
gz_zero: report partial successes
gzclose_w: allow recovery from EINTR
gzwrite: report partial successes
gz_load: handle EINTR
gzlib.c | 17 +++++-----
gzread.c | 12 +++++--
gzwrite.c | 105 +++++++++++++++++++++++++++++++++++++++++--------------------
3 files changed, 89 insertions(+), 45 deletions(-)
If there is not enough address space to copy the path passed to
gz_open, the file descriptor opened soon before is never closed.
---
gzlib.c | 17 +++++++++--------
1 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/gzlib.c b/gzlib.c
index e637e83..66c4eac 100644
--- a/gzlib.c
+++ b/gzlib.c
@@ -149,6 +149,14 @@ local gzFile gz_open(path, fd, mode)
return NULL;
}
+ /* save the path name for error messages */
+ state->path = malloc(strlen(path) + 1);
+ if (state->path == NULL) {
+ free(state);
+ return NULL;
+ }
+ strcpy(state->path, path);
+
/* open the file with the appropriate mode (or just use fd) */
state->fd = fd != -1 ? fd :
open(path,
@@ -167,19 +175,12 @@ local gzFile gz_open(path, fd, mode)
0666);
if (state->fd == -1) {
free(state);
+ free(state->path);
return NULL;
}
if (state->mode == GZ_APPEND)
state->mode = GZ_WRITE; /* simplify later checks */
- /* save the path name for error messages */
- state->path = malloc(strlen(path) + 1);
- if (state->path == NULL) {
- free(state);
- return NULL;
- }
- strcpy(state->path, path);
-
/* save the current position for rewinding (only if reading) */
if (state->mode == GZ_READ) {
state->start = LSEEK(state->fd, 0, SEEK_CUR);
On some operating systems, close() can be interrupted by a signal.
Thus to close a file and report an error on failure, portable
applications have to do something like this:
while (close(fd) < 0)
if (errno != EINTR) {
perror("close");
exit(1);
}
Allow applications to do the same with zlib:
for (;;) {
int err;
const char *errmsg;
err = gzclose(gzfile);
if (err == Z_ERRNO && errno == EINTR) continue;
if (!err)
break;
errmsg = (err == Z_ERRNO) ? strerror(errno) : zError(err);
fprintf(stderr, "gzclose: %s\n", errmsg);
exit(1);
}
---
Alternatively, this could say
while (close(state->fd)) {
if (errno != EINTR) {
free(state);
return E_ERRNO;
}
}
allowing applications to say
int err = gzclose(gzfile);
if (err) {
const char *errmsg = (err == Z_ERRNO) ?
strerror(errno) : zError(err);
fprintf(stderr, "gzclose: %s\n", errmsg);
exit(1);
}
That would be simpler, but it might break programs that play weird
games with signal handling.
gzread.c | 10 +++++++---
1 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/gzread.c b/gzread.c
index 74322ca..0062a5e 100644
--- a/gzread.c
+++ b/gzread.c
@@ -631,7 +631,6 @@ int ZEXPORT gzdirect(file)
int ZEXPORT gzclose_r(file)
gzFile file;
{
- int ret;
gz_statep state;
/* get internal structure */
@@ -648,11 +647,16 @@ int ZEXPORT gzclose_r(file)
inflateEnd(&(state->strm));
free(state->out);
free(state->in);
+ state->size = 0;
}
gz_error(state, Z_OK, NULL);
- ret = close(state->fd);
+ if (close(state->fd)) {
+ if (errno != EINTR)
+ free(state);
+ return Z_ERRNO;
+ }
free(state);
- return ret ? Z_ERRNO : Z_OK;
+ return Z_OK;
}
#endif /* !OLD_GZIO */
A short write is perfectly normal when writing to a "slow" device
such as a pipe. But most zlib routines cannot handle this sanely
yet, so report it as an error.
Note that EINTR here just means the write was interrupted or the
device temporarily unavailable, not that the relevant zlib call
can be resumed by trying again.
In many situations (such as non-blocking I/O), this makes zlib
report EINTR when some other error value such as EAGAIN would be
more appropriate. Unfortunately, I canât see a better option,
since errno is not set by write() in this case and historical
behavior prevents quietly trying again.
---
gzwrite.c | 15 ++++++++++++---
1 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/gzwrite.c b/gzwrite.c
index f4a0a80..5b53cec 100644
--- a/gzwrite.c
+++ b/gzwrite.c
@@ -78,9 +78,18 @@ local int gz_comp(state, flush)
if (strm->avail_out == 0 || (flush != Z_NO_FLUSH &&
(flush != Z_FINISH || ret == Z_STREAM_END))) {
have = strm->next_out - state->next;
- if (have && write(state->fd, state->next, have) != have) {
- gz_error(state, Z_ERRNO, zstrerror());
- return -1;
+ if (have) {
+ ssize_t n = write(state->fd, state->next, have);
+
+ if (n < 0) {
+ gz_error(state, Z_ERRNO, zstrerror());
+ return -1;
+ }
+ if (n != have) { /* short write */
+ errno = EINTR;
+ gz_error(state, Z_ERRNO, zstrerror());
+ return -1;
+ }
}
if (strm->avail_out == 0) {
strm->avail_out = state->size;
Callers can recover by trying again after a partial write.
---
gzwrite.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/gzwrite.c b/gzwrite.c
index 5b53cec..8ec70ef 100644
--- a/gzwrite.c
+++ b/gzwrite.c
@@ -88,6 +88,7 @@ local int gz_comp(state, flush)
if (n != have) { /* short write */
errno = EINTR;
gz_error(state, Z_ERRNO, zstrerror());
+ state->next += n;
return -1;
}
}
No change in behavior intended. This internal API change will make it
easier for gz_zero to report partial successes, so callers can ask
gz_zero to finish the job when appropriate.
---
gzwrite.c | 46 ++++++++++++++++++++++++----------------------
1 files changed, 24 insertions(+), 22 deletions(-)
diff --git a/gzwrite.c b/gzwrite.c
index 8ec70ef..6d7f32c 100644
--- a/gzwrite.c
+++ b/gzwrite.c
@@ -10,7 +10,7 @@
/* Local functions */
local int gz_init OF((gz_statep));
local int gz_comp OF((gz_statep, int));
-local int gz_zero OF((gz_statep, z_off_t));
+local z_off_t gz_zero OF((gz_statep, z_off_t));
/* Initialize state for writing a gzip file. Mark initialization by setting
state->size to non-zero. Return -1 on failure or 0 on success. */
@@ -118,13 +118,14 @@ local int gz_comp(state, flush)
return 0;
}
-/* Compress len zeros to output. Return -1 on error, 0 on success. */
-local int gz_zero(state, len)
+/* Compress len zeros to output. Return 0 on error, len on success. */
+local z_off_t gz_zero(state, len)
gz_statep state;
z_off_t len;
{
int first;
unsigned n;
+ z_off_t rest = len;
z_streamp strm = &(state->strm);
/* consume whatever's left in the input buffer */
@@ -133,8 +134,8 @@ local int gz_zero(state, len)
/* compress len zeros */
first = 1;
- while (len) {
- n = len < state->size ? (unsigned)len : state->size;
+ while (rest) {
+ n = rest < state->size ? (unsigned)rest : state->size;
if (first) {
memset(state->in, 0, n);
first = 0;
@@ -143,10 +144,10 @@ local int gz_zero(state, len)
strm->next_in = state->in;
state->pos += n;
if (gz_comp(state, Z_NO_FLUSH) == -1)
- return -1;
- len -= n;
+ return 0;
+ rest -= n;
}
- return 0;
+ return len;
}
/* -- see zlib.h -- */
@@ -187,8 +188,8 @@ int ZEXPORT gzwrite(file, buf, len)
/* check for seek request */
if (state->seek) {
- state->seek = 0;
- if (gz_zero(state, state->skip) == -1)
+ state->seek -= gz_zero(state, state->skip);
+ if (state->seek)
return 0;
}
@@ -248,8 +249,8 @@ int ZEXPORT gzputc(file, c)
/* check for seek request */
if (state->seek) {
- state->seek = 0;
- if (gz_zero(state, state->skip) == -1)
+ state->seek -= gz_zero(state, state->skip);
+ if (state->seek)
return -1;
}
@@ -311,8 +312,8 @@ int ZEXPORTVA gzprintf (gzFile file, const char *format, ...)
/* check for seek request */
if (state->seek) {
- state->seek = 0;
- if (gz_zero(state, state->skip) == -1)
+ state->seek -= gz_zero(state, state->skip);
+ if (state->seek)
return 0;
}
@@ -386,8 +387,8 @@ int ZEXPORTVA gzprintf (file, format, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10,
/* check for seek request */
if (state->seek) {
- state->seek = 0;
- if (gz_zero(state, state->skip) == -1)
+ state->seek -= gz_zero(state, state->skip);
+ if (state->seek)
return 0;
}
@@ -453,8 +454,8 @@ int ZEXPORT gzflush(file, flush)
/* check for seek request */
if (state->seek) {
- state->seek = 0;
- if (gz_zero(state, state->skip) == -1)
+ state->seek -= gz_zero(state, state->skip);
+ if (state->seek)
return -1;
}
@@ -488,8 +489,8 @@ int ZEXPORT gzsetparams(file, level, strategy)
/* check for seek request */
if (state->seek) {
- state->seek = 0;
- if (gz_zero(state, state->skip) == -1)
+ state->seek -= gz_zero(state, state->skip);
+ if (state->seek)
return -1;
}
@@ -523,8 +524,9 @@ int ZEXPORT gzclose_w(file)
/* check for seek request */
if (state->seek) {
- state->seek = 0;
- ret += gz_zero(state, state->skip);
+ state->seek -= gz_zero(state, state->skip);
+ if (state->seek)
+ ret--;
}
/* flush, free memory, and close file */
Callers can finish up after an interrupted write by writing the
remaining zeros.
---
gzwrite.c | 7 ++++---
1 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/gzwrite.c b/gzwrite.c
index 6d7f32c..ccb7dd9 100644
--- a/gzwrite.c
+++ b/gzwrite.c
@@ -118,7 +118,8 @@ local int gz_comp(state, flush)
return 0;
}
-/* Compress len zeros to output. Return 0 on error, len on success. */
+/* Compress len zeros to output. Return number of zeros successfully
+ written or buffered. */
local z_off_t gz_zero(state, len)
gz_statep state;
z_off_t len;
@@ -143,9 +144,9 @@ local z_off_t gz_zero(state, len)
strm->avail_in = n;
strm->next_in = state->in;
state->pos += n;
- if (gz_comp(state, Z_NO_FLUSH) == -1)
- return 0;
rest -= n;
+ if (gz_comp(state, Z_NO_FLUSH) == -1)
+ return len - rest;
}
return len;
}
Allow gzclose to be resumed if it returns EINTR.
Also tweak the error reporting so errno and the return value come
from the first error that occured.
---
gzwrite.c | 41 +++++++++++++++++++++++++++++++++--------
1 files changed, 33 insertions(+), 8 deletions(-)
diff --git a/gzwrite.c b/gzwrite.c
index ccb7dd9..cc51e35 100644
--- a/gzwrite.c
+++ b/gzwrite.c
@@ -512,6 +512,7 @@ int ZEXPORT gzclose_w(file)
gzFile file;
{
int ret = 0;
+ int saved_errno = 0;
gz_statep state;
/* get internal structure */
@@ -526,19 +527,43 @@ int ZEXPORT gzclose_w(file)
/* check for seek request */
if (state->seek) {
state->seek -= gz_zero(state, state->skip);
- if (state->seek)
- ret--;
+ if (state->seek) {
+ ret = state->err;
+ if (ret == Z_ERRNO && errno == EINTR)
+ return ret;
+ saved_errno = errno;
+ }
}
/* flush, free memory, and close file */
- ret += gz_comp(state, Z_FINISH);
- (void)deflateEnd(&(state->strm));
- free(state->out);
- free(state->in);
- ret += close(state->fd);
+ if (ret) {
+ (void) gz_comp(state, Z_FINISH);
+ } else {
+ ret = gz_comp(state, Z_FINISH);
+ if (ret == Z_ERRNO && errno == EINTR)
+ return ret;
+ saved_errno = errno;
+ }
+ if (state->strm.state)
+ (void) deflateEnd(&(state->strm));
+ if (state->size) {
+ free(state->out);
+ free(state->in);
+ state->size = 0;
+ }
+ if (ret) {
+ (void) close(state->fd);
+ } else {
+ ret = close(state->fd);
+ if (ret == -1 && errno == EINTR)
+ return ret;
+ saved_errno = errno;
+ }
gz_error(state, Z_OK, NULL);
free(state);
- return ret ? Z_ERRNO : Z_OK;
+ if (saved_errno)
+ errno = saved_errno;
+ return ret;
}
#endif /* !OLD_GZIO */
Callers can finish up after an interrupted write by writing the
remaining bytes:
for (p = buf; p != bufend; p += gzwrite(gzfile, p, bufend - p)) {
int err;
const char *errmsg = gzerror(gzfile, &err);
if (err && !(err == Z_ERRNO && errno == EINTR)) {
fprintf(stderr, "gzwrite: %s\n", errmsg);
exit(1);
}
}
for (;;) {
int err;
const char *errmsg;
err = gzclose(gzfile);
if (err == Z_ERRNO && errno == EINTR) continue;
if (err) {
errmsg = (err == Z_ERRNO) ? strerror(errno) : zError(err);
fprintf(stderr, "gzclose: %s\n", errmsg);
exit(1);
}
break;
}
exit(0);
---
gzwrite.c | 5 ++---
1 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/gzwrite.c b/gzwrite.c
index cc51e35..e0f5c08 100644
--- a/gzwrite.c
+++ b/gzwrite.c
@@ -209,7 +209,7 @@ int ZEXPORT gzwrite(file, buf, len)
buf = (char *)buf + n;
len -= n;
if (len && gz_comp(state, Z_NO_FLUSH) == -1)
- return 0;
+ return put - len;
} while (len);
}
else {
@@ -221,8 +221,7 @@ int ZEXPORT gzwrite(file, buf, len)
strm->avail_in = len;
strm->next_in = (voidp)buf;
state->pos += len;
- if (gz_comp(state, Z_NO_FLUSH) == -1)
- return 0;
+ gz_comp(state, Z_NO_FLUSH);
}
/* input was all buffered or compressed (put will fit in int) */
gzread() and related functions cannot handle situations when read has
no data to return, such as non-blocking I/O or interruption by a
signal handler. The latter is easy to fix: just declare that
input-related functions will continue after interruption by a signal
handler and explicitly continue the read. So do so.
In the very special case of partial reads due to interrupted reading
of "slow" files, this is what gz_load already does.
Resumable non-blocking I/O would be harder. Fortunately, nobody has
needed it yet.
---
Thatâs the last patch. Thanks for reading. Please let me know if you
have any questions or suggestions.
gzread.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/gzread.c b/gzread.c
index 0062a5e..fb056e6 100644
--- a/gzread.c
+++ b/gzread.c
@@ -31,6 +31,8 @@ local int gz_load(state, buf, len, have)
*have = 0;
do {
ret = read(state->fd, buf + *have, len - *have);
+ if (ret < 0 && errno == EINTR)
+ continue;
if (ret <= 0)
break;
*have += ret;
Jonathan Nieder wrote: Scratch that --- if a program _relies_ on signals interrupting the C library wrappers for system calls in order to quickly bail out, it is already broken. A signal can always arrive right before a system call starts and there will be no way for the application code to detect that. Thus, the System V-style semantics are basically useless, except as an unreliable way to occasionally avoid a little extra copying of data. Here is an interesting piece of history. It seems the System V-style semantics for the system calls were novel to some people. I wish the problem had been smoothed over for everyone in libc. http://www.jwz.org/doc/worse-is-better.html What this means for zlib I still donât know. Jonathan