auth-zones and DNS NOTIFY

W.C.A. Wijngaards wouter at nlnetlabs.nl
Mon Jun 4 09:01:24 UTC 2018


Hi Harry,

On 02/06/18 19:24, Harry Schmalzbauer wrote:
> On 02.06.2018 at 16:44, Harry Schmalzbauer via Unbound-users wrote:
>> On 17.04.2018 at 15:26, W.C.A. Wijngaards via Unbound-users wrote:
>>> Hi Harry,
>>>
>>> Yes, DNS NOTIFY is implemented in the current code repo version.  You
>>> can specify additional sources with allow-notify.
>>
>> Great, thanks a lot!
>> Found time to update some production systems, but unfortunately zone
>> transfers seem to work only initially; then I see these messages logged:

Thank you very much for the detailed report.  I found the deadlock
problem and fixed it for the upcoming release.

There is a patch as well in case that is useful for you.  The routine
simply forgot to unlock in one of the cases for an incoming NOTIFY
message.  This explains why the other report did not encounter the problem.

Index: services/authzone.c
===================================================================
--- services/authzone.c	(revision 4703)
+++ services/authzone.c	(working copy)
@@ -3425,8 +3425,10 @@
 {
 	/* if the serial of notify is older than we have, don't fetch
 	 * a zone, we already have it */
-	if(has_serial && !xfr_serial_means_update(xfr, serial))
+	if(has_serial && !xfr_serial_means_update(xfr, serial)) {
+		lock_basic_unlock(&xfr->lock);
 		return;
+	}
 	/* start new probe with this addr src, or note serial */
 	if(!xfr_start_probe(xfr, env, fromhost)) {
 		/* not started because already in progress, note the serial */


Best regards, Wouter

>> unbound: [14927:0] error: ./services/authzone.c at 6102 could not
>> pthread_mutex_lock(&xfr->lock): Resource deadlock avoided
>> unbound: [14927:0] error: ./services/authzone.c at 3454 could not
>> pthread_mutex_lock(&xfr->lock): Resource deadlock avoided
>>
>> Increasing the log level to 3 doesn't show anything more useful.
>>
>> After the error occurs, unbound returns "error response SERVFAIL" for
>> all queries which match stub-zones:, and all queries matching
>> auth-zones: get the old records (no xfer any more).
>>
>> Any idea where the problem could come from?
>> Will try to make all stub-zones auth-zones and see if that changes
>> anything....
> 
> Couldn't find out more, sorry, no config change I made had any effect.
> 
> I'm running 1.7.1 on FreeBSD inside a jail and use "allow-notify:",
> since the transfer takes a different route (via tunnel) than the notify
> source.
> The incoming notify triggers the error(-log) and the stall for stub-zones.
> 
> I had to remove auth-zones: for now to get my setup back into working
> condition.
> 
> My intention was to serve auth-zones without using a zonefile, but it
> doesn't make any difference whether I use one or not.
> There seems to be a locking problem when an xfer starts after a notify
> was received.  Unfortunately it is nothing I can easily track down,
> since I'm not used to debuggers and don't even have a system at hand
> where I could install one.
> 
> I hope someone can take care of that issue.
> The deadlock error quoted above corresponds to auth_xfer_timer() for line
> 6102:
>         struct auth_xfer* xfr = (struct auth_xfer*)arg;
>         struct module_env* env;
>         log_assert(xfr->task_nextprobe);
>         lock_basic_lock(&xfr->lock);
>         env = xfr->task_nextprobe->env;
>         if(env->outnet->want_to_quit) {
>                 lock_basic_unlock(&xfr->lock);
>                 return; /* stop on quit */
>         }
> 
>         /* see if zone has expired, and if so, also set auth_zone expired */
> 
> and auth_zones_notify() for line 3454:
>         /* see which zone this is */
>         lock_rw_rdlock(&az->lock);
>         xfr = auth_xfer_find(az, nm, nmlen, dclass);
>         if(!xfr) {
>                 lock_rw_unlock(&az->lock);
>                 /* no such zone, refuse the notify */
>                 *refused = 1;
>                 return 0;
>         }
>         lock_basic_lock(&xfr->lock);
>         lock_rw_unlock(&az->lock);
> 
>         /* check access list for notifies */
> 
> But no way for me to get any further, sorry.
> 
> -harry
> 

