unbound-1.19.0 alloc_reg_obtain() core dumps

Sami Kerola kerolasa at iki.fi
Fri Feb 2 11:43:33 UTC 2024


On Fri, 2 Feb 2024 at 09:06, Yorgos Thessalonikefs via Unbound-users
<unbound-users at lists.nlnetlabs.nl> wrote:
> I'll have a look but probably next week.

Thank you.

There are more cores. Most are repeats of the same crash as before, but
one is something I have not seen before.

Program terminated with signal SIGSEGV, Segmentation fault.

(gdb) info threads
  Id   Target Id                           Frame
* 1    Thread 0x7fc0b79aba40 (LWP 2878537) 0x0000000000000000 in ?? ()
  2    Thread 0x7fc0b67546c0 (LWP 2878538) warning: Section
`.reg-xstate/2878538' in core file too small.
0x00007fc0b6b27e26 in splice (fd_in=150, off_in=0x7fc0b0000ef0,
fd_out=32, off_out=0x7fc0b6b27e26 <splice+38>, len=0,
flags=3061135592) at ../sysdeps/unix/sysv/linux/splice.c:25
  3    Thread 0x7fc08ffff6c0 (LWP 2878549) warning: Section
`.reg-xstate/2878549' in core file too small.
0x00007fc0b6b27e26 in splice (fd_in=186, off_in=0x7fc074000ef0,
fd_out=32, off_out=0x7fc0b6b27e26 <splice+38>, len=0,
flags=2415913192) at ../sysdeps/unix/sysv/linux/splice.c:25
[snip: all the remaining threads are likewise busy in splice()]
  23   Thread 0x7fc0ad7fa6c0 (LWP 2878547) warning: Section
`.reg-xstate/2878547' in core file too small.
0x00007fc0b6b27e26 in splice (fd_in=180, off_in=0x7fc07c000ef0,
fd_out=32, off_out=0x7fc0b6b27e26 <splice+38>, len=0,
flags=2910820584) at ../sysdeps/unix/sysv/linux/splice.c:25
  24   Thread 0x7fc0797fa6c0 (LWP 2878561) warning: Section
`.reg-xstate/2878561' in core file too small.
0x00007fc0b6b27e26 in splice (fd_in=224, off_in=0x7fc04c000e70,
fd_out=32, off_out=0x7fc0b6b27e26 <splice+38>, len=0, flags=0) at
../sysdeps/unix/sysv/linux/splice.c:25
  25   Thread 0x7fc08effd6c0 (LWP 2878551) warning: Section
`.reg-xstate/2878551' in core file too small.
0x00007fc0b6b27e26 in splice (fd_in=193, off_in=0x7fc064000ef0,
fd_out=32, off_out=0x7fc0b6b27e26 <splice+38>, len=0,
flags=2399127784) at ../sysdeps/unix/sysv/linux/splice.c:25

(gdb) bt -full
#0  0x0000000000000000 in ?? ()
No symbol table info available.
#1  0x0000000000000000 in ?? ()
No symbol table info available.

Looking at 'layout asm', I get the impression that 0x7fc0b79aba40 is an
invalid stack pointer.

And why were the threads in the middle of running splice()? Was the
service in the middle of a systemd upgrade? That could explain how the
next entry in reg_list is null.

Meanwhile I need to check whether there is a reason in the operating
environment why systemctl restarts happen (when I do not expect them).

> I am a little confused by your previous wording:
> "... have a problem with unbound-1.19.0.  Estimated time in between
> crashes is around 450 days on a single server."
> Did these start specifically with 1.19.0?

Sorry, I should have been clearer. A couple of weeks ago I was informed
that unbound is crashing; the version at the time was 1.17.1.
Unfortunately I did not have access to core files at the time, so I
could not look into backtraces. But I thought: that's an old version,
let me just upgrade and not think about this too hard.

After the upgrade the cores kept coming. That is when I arranged access
to debugging facilities.

> How often do you see those crashes (per single server)?

I have not seen a crash repeat on the same server (yet). For some reason
cores are coming from servers in India more than anywhere else. The
450 day interval on a single server is calculated by taking the rate of
crashes and comparing it with the number of servers running unbound.
What I was trying to get at is: this looks like a pretty rare condition,
but with enough servers rare things happen often.
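
As a rough sketch of that calculation: the fleet size, crash count and
observation window below are made-up numbers chosen only so that the
result comes out at 450 days. They are not the real figures, and the
function names are just for illustration. The estimate also assumes
that crashes are spread evenly across all servers.

```python
def days_between_crashes_per_server(servers: int, crashes: int,
                                    window_days: float) -> float:
    """Mean days between crashes on one server.

    Assumes every server is equally likely to crash, so the fleet-wide
    crash rate is simply divided across all servers.
    """
    if crashes <= 0:
        raise ValueError("need at least one observed crash")
    return servers * window_days / crashes


def fleet_crashes_per_day(servers: int, per_server_interval_days: float) -> float:
    """Expected crashes per day across the whole fleet."""
    return servers / per_server_interval_days


# Hypothetical inputs, NOT the real fleet: 900 servers, 28 cores in 14 days.
interval = days_between_crashes_per_server(servers=900, crashes=28,
                                           window_days=14)
print(f"per-server interval: {interval:.0f} days")
print(f"fleet-wide rate: {fleet_crashes_per_day(900, interval):.1f} crashes/day")
```

So a condition that hits any one server about once in 450 days still
produces a steady stream of cores when there are enough servers.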

Have a nice weekend, Sami

-- 
Sami Kerola
https://kerolasa.iki.fi/

