The new iceweasel fails to start on sparc64. It crashes before
it gets anywhere, so removing ~/.mozilla has no effect.
Here's some debugging output:
(gdb) run
Starting program: /usr/lib/iceweasel/firefox-bin
[Thread debugging using libthread_db enabled]
Program received signal SIGBUS, Bus error.
0xf7d57718 in _IO_default_setbuf (fp=0xf7e57114, p=0x0, len=0) at genops.c:575
575 genops.c: No such file or directory.
in genops.c
(gdb) bt
#0 0xf7d57718 in _IO_default_setbuf (fp=0xf7e57114, p=0x0, len=0)
at genops.c:575
#1 0xf7e161f4 in _IO_old_file_setbuf (fp=0xf7e57114, p=0x0, len=0)
at oldfileops.c:265
#2 0xf7d4ba68 in _IO_setbuffer (fp=0xf7e57114, buf=0x0,
size=<value optimized out>) at iosetbuffer.c:44
#3 0xf67d8e34 in XRE_main (argc=1, argv=0xffffda44, aAppData=0xf79347c0)
at ../../../toolkit/xre/nsAppRunner.cpp:2780
#4 0x000118bc in ?? ()
#5 0x000118bc in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) print fp
$1 = (_IO_FILE *) 0xf7e57114
(gdb) print *fp
$2 = {_flags = -72540025, _IO_read_ptr = 0x0, _IO_read_end = 0x0,
_IO_read_base = 0x0, _IO_write_base = 0x0, _IO_write_ptr = 0x0,
_IO_write_end = 0x0, _IO_buf_base = 0xf7e5715b "",
_IO_buf_end = 0xf7e5715c "\367\345z\334\367\345=\b\373\255 \206",
_IO_save_base = 0x0, _IO_backup_base = 0x0, _IO_save_end = 0x0,
_markers = 0x0, _chain = 0xf7e570c4, _fileno = 1, _flags2 = 0,
_old_offset = -1, _cur_column = 0, _vtable_offset = -76 '\264',
_shortbuf = "", _lock = 0xf7e57adc, _offset = -311557039320989696,
_codecvt = 0x0, _wide_data = 0x0, _freeres_list = 0x0, _freeres_buf = 0x0,
_freeres_size = 0, _mode = 0,
_unused2 = '\000' <repeats 20 times>"\367, \345q\024\000\000\000\002\000\000\000\000\377\377\377\377\000\000\264"}
(gdb)
reassign 634261 libc6
thanks

Is that the sparc64 build or the sparc build?

SetupErrorHandling(argv[0]);

which actually does:

setbuf(stdout, 0);

So the top frames are in the libc. That suggests a serious problem with the libc.

Mike
The problem is caused by the following code (genops.c:575):
fp->_IO_write_base = fp->_IO_write_ptr = fp->_IO_write_end = 0;
Translated by the compiler into:
0xf7d57714 <+148>: clr [ %i0 + 0x18 ]
0xf7d57718 <+152>: clrx [ %i0 + 0x10 ]
In other words, by a 32-bit access and a 64-bit access. The compiler is
allowed to do this on sparc, as malloc is guaranteed to return 8-byte-aligned memory.
The thing I still don't understand here, is why fp = stdout = 0xf7e57114
is not aligned. fopen() is using malloc() internally, so the resulting
pointer should be aligned. Does iceweasel play with the alignment in a
bad way there?
I don't expect it to, especially with stdout, and especially during startup (the crash is in the very startup, not a lot of iceweasel is initialized). And stdout is a symbol exported from libc.so.6. Mike
FYI, I found that it is triggered by the _IO_stdin_used symbol not being exported from the binary, which happened because of a version script coupled with -rdynamic. I still think there is something fishy going on on the libc6 side, but not as bad as originally thought.

Mike
Mike, can you clarify a bit how glibc is failing to meet your expectations here? I don't mind trying to work on this bug, but with the available information I don't quite understand the problem. Is it expected that glibc should work correctly even if _IO_stdin_used symbol is not exported? If you could provide a simple test case demonstrating the issue, it would be great too. Best regards,
What is fishy is that only sparc is affected. So whatever it's doing on
sparc, it's doing it differently from other architectures.
As for a small test case:
$ cat > foo.c <<EOF
#include <stdio.h>
int main() {
setbuf(stdout, 0);
return 0;
}
EOF
$ cat > ver <<EOF
{
local: *;
};
EOF
$ gcc -o foo foo.c -Wl,--version-script,ver
$ ./foo
Bus error
As a matter of fact, despite the version script, the stdout symbol is
still exported. I guess the real problem is that _IO_stdin_used is not
defined the same way.
Mike
I'm not sure how _IO_stdin_used comes into play here, but the failure
with this test case actually happens because stdout itself is not
8-byte aligned, as it should be. It looks like for the
normally-linked binary stdout is just set to the address of
_IO_2_1_stdout_, as one would expect from looking at libio/stdio.c in
libc source code, which contains:
_IO_FILE *stdin = (FILE *) &_IO_2_1_stdin_;
_IO_FILE *stdout = (FILE *) &_IO_2_1_stdout_;
_IO_FILE *stderr = (FILE *) &_IO_2_1_stderr_;
Demo:
jurij@debian:~/libc/eglibc-2.13/tmp$ cat foo.c
#include <stdio.h>
#include <stdlib.h>
int main() {
printf("stdout=%p &_IO_2_1_stdout_=%p\n", stdout, &_IO_2_1_stdout_);
setbuf(stdout, 0);
return 0;
}
jurij@debian:~/libc/eglibc-2.13/tmp$ gcc -o foo foo.c
jurij@debian:~/libc/eglibc-2.13/tmp$ ./foo
stdout=0x207e0 &_IO_2_1_stdout_=0x207e0
However, when using the version script, stdout is altered to point to
an unaligned location:
jurij@debian:~/libc/eglibc-2.13/tmp$ gcc -o foo foo.c -Wl,--version-script,ver
jurij@debian:~/libc/eglibc-2.13/tmp$ ./foo
stdout=0xf7d97114 &_IO_2_1_stdout_=0x207c0
Bus error
The value is modified by the dynamic linker somewhere between the
_init and _start:
jurij@debian:~/libc/eglibc-2.13/tmp$ gdb foo
GNU gdb (GDB) 7.3-debian
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show
copying"
and "show warranty" for details.
This GDB was configured as "sparc-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/jurij/libc/eglibc-2.13/tmp/foo...(no
debugging symbols found)...done.
(gdb) break _init
Breakpoint 1 at 0x1032c
(gdb) break _start
Breakpoint 2 at 0x10380
(gdb) run
Starting program: /home/jurij/libc/eglibc-2.13/tmp/foo
Breakpoint 1, _init (argc=-134233040, argv=0x1, envp=0xffffd814) at
../sysdeps/unix/sysv/linux/init-first.c:52
52 {
(gdb) print stdout
$1 = (struct _IO_FILE *) 0x207c0
(gdb) print &_IO_2_1_stdout_
$2 = (struct _IO_FILE_plus *) 0xf7fc2d40
(gdb) c
Continuing.
Breakpoint 2, 0x00010380 in _start ()
(gdb) print stdout
$3 = (struct _IO_FILE *) 0xf7fc3114
(gdb) print &_IO_2_1_stdout_
$4 = (struct _IO_FILE_plus *) 0xf7fc2d40
(gdb)
On amd64 stdout is set to the address of _IO_2_1_stdout_ even with the
version script:
jurij@paddy:~/tmp$ gcc -o foo foo.c -Wl,--version-script,ver
jurij@paddy:~/tmp$ ./foo
stdout=0x600a40 &_IO_2_1_stdout_=0x600a40
Best regards
Some more debugging information:
In the failing case stdout gets flipped to an unaligned value in the
_IO_check_libio function defined in libio/oldstdfiles.c, which
contains the following code:
static void
_IO_check_libio ()
{
if (&_IO_stdin_used == NULL)
{
/* We are using the old one. */
_IO_stdin = stdin = (_IO_FILE *) &_IO_stdin_;
_IO_stdout = stdout = (_IO_FILE *) &_IO_stdout_;
_IO_stderr = stderr = (_IO_FILE *) &_IO_stderr_;
[...]
Why we are taking the 'if' branch is a bit of a mystery to me, because
_IO_stdin_used appears to be defined, as this bit of gdb session
illustrates:
(gdb) break _IO_check_libio
Function "_IO_check_libio" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (_IO_check_libio) pending.
(gdb) run
Starting program: /home/jurij/libc/tmp/foo
Breakpoint 1, _IO_check_libio () at oldstdfiles.c:79
warning: Source file is more recent than executable.
79 if (&_IO_stdin_used == NULL)
(gdb) print _IO_stdin_used
$1 = 131073
(gdb) print &_IO_stdin_used
$2 = (const int *) 0x10638
(gdb) next
78 {
(gdb) next
79 if (&_IO_stdin_used == NULL)
(gdb) next
82 _IO_stdin = stdin = (_IO_FILE *) &_IO_stdin_;
(gdb) next
83 _IO_stdout = stdout = (_IO_FILE *) &_IO_stdout_;
(gdb) print stdout
$3 = (FILE *) 0x207c0
(gdb) print &_IO_stdout_
$4 = (struct _IO_FILE_plus *) 0xf7fc3114
After this line is executed, stdout points to the new
unaligned location, eventually leading to the bus error. An important
observation is that the symbol is unaligned even in libc, which obviously
should not be happening:
jurij@debian:~/libc/tmp$ nm /usr/lib/debug/lib/sparc-linux-gnu/libc-2.13.so | grep _IO_stdout_
0017f114 D _IO_stdout_
To answer why we are hitting the &_IO_stdin_used == NULL check, I've
looked at the assembly code; the relevant parts look like this:
.LLADDPC0:
jmp %o7+8
add %o7, %l7, %l7
#NO_APP
.align 4
.align 32
.type _IO_check_libio, #function
.proc 020
_IO_check_libio:
.LLFB71:
.file 1 "oldstdfiles.c"
.loc 1 78 0
.cfi_startproc
save %sp, -96, %sp
.LLCFI0:
.cfi_def_cfa_register 30
.cfi_window_save
.loc 1 79 0
sethi %hi(_IO_stdin_used), %g1
.loc 1 78 0
sethi %hi(_GLOBAL_OFFSET_TABLE_-4), %l7
call .LLADDPC0
add %l7, %lo(_GLOBAL_OFFSET_TABLE_+4), %l7
.cfi_register 15, 31
.loc 1 79 0
or %g1, %lo(_IO_stdin_used), %g1
ld [%l7+%g1], %g1
cmp %g1, 0
[...]
So it's not simply using _IO_stdin_used address, but doing some
resolution of it, which, indeed, returns a NULL.
I don't think I can make any further progress on this bug without
investing a significant amount of time into it, but we have enough
debugging information for a good upstream bug report, and I would be
glad to provide additional info if needed. One of the main questions
we should try to answer is why the _IO_std{in,out,err}_ symbols end up
not being 8-byte aligned in libc, even though it looks like they
should be.
Best regards,
Hello, For the record, I do not consider this bug to be RC. As far as I know, it only manifested itself for iceweasel and only because iceweasel does really funky things with its symbols. The bug now contains enough information for a useful upstream report, however I don't intend to file one. Best regards,
tags 634261 + upstream
quit

Hi Mike,

FYI:

Jurij Smakov wrote:

Thanks, both.
Jonathan
But... what is the operator precedence here? Is it
if (&(_IO_stdin_used == NULL))
or
if ((&_IO_stdin_used) == NULL)?
IMHO it should be the latter... Can you do a
p (&_IO_stdin_used == NULL)
and a
p ((&_IO_stdin_used) == NULL)
in your gdb session?

Bye, JK
Could there be a #define _IO_stdin_used somewhere further up? I am not sure if gdb sees those, so it may output a different symbol from what the program sees.
* Jurij Smakov <jurij@wooyd.org> [120912 12:32]:
Independent of whether it is an RC bug, there seems to be at least some
general bug hiding there:
A __alignof(struct _IO_FILE) in libio/genops.c is 8 while a
__alignof(struct _IO_FILE) in libio/oldstdfiles.c is 4 [1].
This means that gcc does not really consider them to be the same struct
in those two files, which could possibly cause havoc even on
architectures other than sparc and with other programs.
Bernhard R. Link
[1] Tested by inserting a int foobar(void) {return __alignof(struct
_IO_FILE);} in both and looking with objdump -d at the generated
files.
Note that taking the address of a symbol can never be NULL
according to C99, so the compiler may probably optimise
*all* of “if (&_IO_stdin_used == NULL) { … }” away. (That’s
because of the definition of NULL and object pointers.)
Maybe that’s what happens.
I agree something fishy is being done by eglibc here.
From /usr/include/libio.h this appears to be compatibility
code for glibc 2.0/2.1…
Reading “gcc foo.c -Wl,-t -static” with glibc is, compared
to other libcs, absolutely disgusting… but here we go.
/usr/lib/x86_64-linux-gnu/libc.a/uar://stdio.o (in new-mc
syntax) defines stdin and _IO_stdin in terms of _IO_2_1_stdin_
but doesn’t define _IO_stdin_used.
In fact, libc.a has no definition of _IO_stdin_used *at all*,
and an nm on foo.o doesn’t show it either. (So where does it
come from?) IMHO, if glibc insists on this hack, which probably
is “Undefined Behaviour” anyway, _IO_stdin_used should be
pulled in from stdio.o when linking on modernish glibcs.
bye,
//mirabilos, via Plänet Debian
Hi,

I am not really an expert on libc internals, but a friend of mine with some more experience did some debugging yesterday and we figured out it might not be a bug but expected behavior. I'll make my points by answering some of the above statements.

This seems to be a known and more or less documented behavior of libc to determine which ABI to use for an application, see [1]. What eventually happens is an unaligned access due to the ABI mismatch.

Checking the export list of the current xulrunner binary of Iceweasel 10, this behaviour seems to have been fixed in Firefox. So I expect Firefox to work on SPARC again.

That's because SPARC CPUs trigger an exception on unaligned access [2]. I would expect the same to happen on Alpha [3], but Alpha is no longer a supported architecture for the current Debian release.

It would be nice if the original bug reporter could try to reproduce the bug with Iceweasel 10 in Debian Wheezy. If it no longer crashes, this bug should no longer be tagged as RC for Wheezy.

Cheers,
Adrian
(culling cc list)

Hi Adrian,

John Paul Adrian Glaubitz wrote:

Please keep in mind that these appear as emails in a crowded inbox, so the subject line can be a good place to put valuable context.

a case of ABI misuse, with poor error reporting? Can you describe what iceweasel was doing wrong? Is this documented so future coders know not to make the same mistake?

Is the version in squeeze affected? How about the version in wheezy?

Thanks and hope that helps,
Jonathan
I actually thought the subject was quite reasonable ;). Anyway.

As far as I understand the problem, the Mozilla developers provide a version script to the linker to control which symbols get exported. This helps speed up the load process of the binary and reduces the memory footprint.

What the Mozilla developers didn't seem to take into account is that if you prevent the symbol _IO_stdin_used from being exported from your binary, parts of the ABI of the standard C library will change and it will behave like an older version, which causes the unaligned access, which in turn results in a CPU trap.

It seems to have been fixed in Firefox 10 which is part of Wheezy:

But, as I said, I am not an expert on the internals of the C library, so I am just speculating from the knowledge I gained from Michael (I put him into CC again). It might be worthwhile to check whether Mozilla made upstream changes in this regard or whether there was an upstream bug report.

Adrian
On Sunday, 23.12.2012 at 00:13 +0100, John Paul Adrian Glaubitz wrote:

One could phrase it this way.

This is correct.

And this is mostly correct.

libc puts a lot of effort into providing a stable ABI. A big change in libc was the introduction of libio 2.1. It introduced support for wide-character streams and 64-bit offsets. These changes required an incompatible change to the FILE structure. Because of this, the FILE APIs exist in two variants in glibc [1] if backward compatibility is enabled. The new variant is tagged with the version GLIBC_2.1, while the old one is tagged GLIBC_2.0. For the three standard streams, there are two differently *named*, not just differently *versioned*, objects, namely _IO_stdin_ for the old version and _IO_2_1_stdin_ for the new version, while the pointer stdin itself is not version dependent. This might be to make sure that "stdin" itself has the same value regardless of the version of libc that is imported.

If a program is compiled against the glibc 2.1 (or newer) development files, it will automatically refer to the new functions (i.e. link to the GLIBC_2.1 version of _IO_file_setbuf and so on), while programs and libraries compiled with old glibc 2.0 development files will refer to the GLIBC_2.0 version of these functions.

The tricky part is the std* pointers: if a source file is compiled with new development headers and refers to stdin, stdout or stderr, some magic makes the compiler or linker emit a definition of the symbol "_IO_stdin_used" in that module. glibc itself declares it as a *weak* external symbol. The consequence is that if the symbol is not defined anywhere, it just resolves to address 0, but if it is defined in one or more modules, it resolves to a valid address in one of these modules. The resolution of external global variables in ELF systems is internally performed by a GOT lookup at runtime (which is the strange code for &_IO_stdin_used observed in the disassembly).
The logic in glibc is that if the new libio functions are used with stdin, there will be a reference to _IO_stdin_used. But if there are no references to _IO_stdin_used, the compatibility layer will kick in and make the stdin/stdout/stderr pointers point to the compatibility objects. As it happens, the compatibility objects do not contain any 64-bit field and require 4-byte alignment on sparc, while the modern objects (which are in fact the compatibility objects with some extra fields appended) have a 64-bit field containing the current file offset. This makes gcc on sparc require 8-byte alignment.

gcc compiles functions that work on the new FILE structure with the internal assumption that these objects are aligned as they should be, so it expects 8-byte alignment. The old functions, on the other hand, work fine with the new, more strictly aligned structures, unless the code tries to access the vtable pointer, which is at a different location in the old and new object and is most likely the reason for having both versions.

It might have been the intention of the libio developers that (unless vtable accesses happen) the old objects can be processed by the new functions, and in that case glibc is buggy, as it relies on undefined behaviour. Assuming that intention, it expects that a pointer to the short file structure can be used as a pointer to the long file structure, which is not something the C standard grants you.

There seems to be no official documentation on it, but hiding the _IO_stdin_used symbol (it is still there, but not visible for dynamic loading) violates internal glibc assumptions and breaks on sparc.

Regards,
Michael Karcher

[1] This is why Bernhard R. Link observed the two different alignof values. You choose between the two variants of FILE/_IO_FILE by defining or not defining _IO_USE_OLD_IO_FILE. In oldstdfiles.c, the symbol is defined, while in genops.c, it is not defined.
A summary of the current situation:

There are historic programs and libraries relying on the old FILE structure. These programs and libraries assume that any 4-byte aligned FILE structure is acceptable, because that was the case with libio 2.0.

Most programs and libraries are compiled with libio 2.1. At least on sparc, this means that FILE objects now need to be aligned on an 8-byte boundary. The compiler assumes this alignment when generating code using libio 2.1 (or newer) headers.

The alignment difference is the reason that, even though the glibc maintainers put a lot of effort into making the old and the new structure compatible, there can be unexpected problems on sparc: it seems the alignment issue was not considered at that time.

There is no silver bullet - either we have an 8-byte alignment requirement, which keeps compatibility with all libio 2.1 code, or we relax the requirement to 4-byte alignment (by splitting the 64-bit offset field and implementing the 64-bit arithmetic by hand), which will re-enable full compatibility with libio 2.0 code. Especially as libio 2.0 and libio 2.1 libraries can be dynamically mixed, and all code expects that FILE* pointers of the old and new variant are interchangeable for the narrow-character interface, there is some mess outside of glibc already that can never be fixed.

A way to minimize the symptoms of this bug is to notice that most, if not all, FILE structures are allocated inside libc. As long as libc code controls the address of FILE objects, it can ensure 8-byte alignment, even for the legacy structures.

So my recommendation is:

1. Keep not forcing old FILE* to be 8-byte aligned. Forcing it would make the legacy functions (versioned GLIBC_2.0) unable to cope with (unaligned) objects they currently can cope with.
2. Do force 8-byte alignment on the legacy stdin/stdout/stderr. This will make the legacy standard streams compatible with current code.
3. Do force 8-byte alignment on functions generating FILE structures (like fopen), even in the legacy interface. This will make all stream objects allocated in libc functions (which probably will be all stream objects you encounter) compatible with current code.

I am completely aware (as pointed out above) that this is not a perfect fix, but it should be a good enough fix that no observable problems occur, even if you mix libio 2.0 and libio 2.1 code. And finally:

4. Implement a lintian check that shared objects (programs and libraries) importing any GLIBC_2.1 or later versioned symbol also contain an *unversioned* export of _IO_stdin_used on those architectures where libio 2.0 compatibility is enabled (for example i386, but not amd64).

In my opinion, this bug is definitely not release critical, because the only thing that is broken is a compatibility layer for programs more than 10 years old. This would count as "a particular option (or menu item)" of libc6, furthermore not affecting the "core functionality of the package". So the priority should be normal or minor.

A build process that mangles the export of _IO_stdin_used is (as defined by the libc ABI, even if not explicitly written down) broken. The bug in this case was caused by such a broken build process in the Mozilla applications, which has already been fixed. We are still waiting for a real manifestation of the original bug, which should occur with libio 2.0 programs only.

Regards,
Michael Karcher