On 2017-06-01 08:23, Cathy Almond via RT wrote:
> On Wed May 31 15:01:18 2017, ca-isc@nodns4.us wrote:
>
>> Software Version: BIND 9.11.1
>> OS: Linux (Slackware 14.2)
>> Subject: rndc reconfig wipes out catalog zone slaves
>>
>>
>> Bug Detail
>> ===========
>> Also described at
>> https://lists.isc.org/pipermail/bind-users/2017-May/098643.html
>> et seq.
>>
>> When a change is needed in a catalog zone slave server's named.conf,
>> "rndc reconfig" or "rndc reload" removes all catalog zone member
>> slaves. The only fix I have is to remove the catalog-zones option
>> from the slave server's options{}, stop and restart named.
>>
>> This is very troublesome when these slave servers have been published
>> as NS for the zones in question; clients receive REFUSED replies to
>> queries, and these quickly fill up logs.
>>
>> Discovered on BIND 9.11.1 on Slackware 13.37, replicated on
>> 9.11.1/Slackware 14.2.
>
> Hi Chuck,
>
> We have a theory about what's going on here.
>
> In your bind-users post you said that you're using a combination of
> rndc reconfig and nsupdate to convert the VMs (VM1 in the first
> instance) from being a master for a number of zones, to being a slave,
> provisioned using CATZ.
>
> There may be some sequencing issues here therefore. If VM1 still has
> the old zones at the point that they were being provisioned via CATZ,
> then this is going to fail. If you're trying to convert VM1 from
> master to CATZ-provisioned in one fell swoop (i.e. with a single rndc
> reconfig) and not in two steps, then it could be that this is why it's
> breaking for you.
Not the case. I first prepared a file to feed to nsupdate on Master,
which would add a group of zones to the catalog. Those zones were
master zones on VM1 at the time. I then edited the VM1 config to
remove those zones, ran "rndc reconfig" on VM1, and then ran nsupdate
on Master. At that time I had three slave zone instances for each
listed zone. (Later on I converted the slave statements to masters on
Master.) The first time went perfectly, but at that time there were no
zones listed in the catalog.
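
To make the nsupdate step concrete, here is a rough sketch of how such
a file could be generated. It is not the script I actually used. It
assumes the convention from the ARM's catalog zone example, where each
member is a PTR record under "zones.<catalog>" whose label is the SHA-1
of the member zone name in wire format; the helper names, the TTL and
the zone list are only illustrative.

```python
import hashlib


def wire_format(name: str) -> bytes:
    """Encode a domain name in uncompressed DNS wire format (lowercased)."""
    labels = [part for part in name.lower().rstrip(".").split(".") if part]
    out = b""
    for label in labels:
        raw = label.encode("ascii")
        if len(raw) > 63:
            raise ValueError("label too long: %r" % label)
        out += bytes([len(raw)]) + raw
    return out + b"\x00"  # the root label terminates the name


def member_label(zone: str) -> str:
    """Hex SHA-1 of the wire-format name, used as a unique member label."""
    return hashlib.sha1(wire_format(zone)).hexdigest()


def nsupdate_script(catalog: str, zones, ttl: int = 3600) -> str:
    """Build nsupdate input that adds each zone as a catalog member."""
    lines = ["zone %s" % catalog]
    for zone in zones:
        fqdn = zone.rstrip(".") + "."
        lines.append(
            "update add %s.zones.%s. %d IN PTR %s"
            % (member_label(zone), catalog, ttl, fqdn)
        )
    lines.append("send")
    return "\n".join(lines) + "\n"
```

Something like nsupdate_script("catalog.example", ["walk-on.us",
"slackbuilds.org"]) would then be written to a file and fed to nsupdate
on Master, after the matching master zone statements are gone from VM1.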
> Could you be (just) a little bit more detailed on the migration steps?
>
> What was actually being altered when you did the rndc reconfig?
On VM1, removal of master zone statements to make way for the catalog
addition.
On VM2 I added an acl I had forgotten and a rate-limit option.
In each case, ALL existing catalog zone members were removed and
queries for them were refused.
> Another theory is around how many zones you are actually serving and
> an integration problem we've uncovered with using LMDB (liblmdb) as
> the backend new zone database (NZD).
Oh, this one has merit; I think we are on to something. Apparently
liblmdb is not a firm BIND build requirement, but without it, do we
save our NZD at all? That is, is there an alternative to LMDB, which
has not [yet] been made a part of Slackware?
Was the need for LMDB documented somewhere?
> How many zones are involved here,
16 at the point where I stopped migrating. There are a small number
of zones yet to migrate.
> and did you build BIND 9.11 in an
> environment with liblmdb installed?
Looks like no, on VM1 ("Shibboleet"):
rob0@Shibboleet:~$ cat /etc/slackware-version ; uname -a
Slackware 13.37.0
Linux Shibboleet 3.2.79-smp #13 SMP Wed Apr 13 23:26:38 UTC 2016 i686 Intel(R) Xeon(R) CPU X5670 @ 2.93GHz GenuineIntel GNU/Linux
rob0@Shibboleet:~$ ldd /usr/sbin/named
linux-gate.so.1 => (0xffffe000)
liblwres.so.160 => /usr/lib/liblwres.so.160 (0xb76d6000)
libdns.so.168 => /usr/lib/libdns.so.168 (0xb74c6000)
libbind9.so.160 => /usr/lib/libbind9.so.160 (0xb74b7000)
libisccfg.so.160 => /usr/lib/libisccfg.so.160 (0xb749a000)
libisccc.so.160 => /usr/lib/libisccc.so.160 (0xb7491000)
libisc.so.166 => /usr/lib/libisc.so.166 (0xb7424000)
libcrypto.so.0 => /lib/libcrypto.so.0 (0xb72dc000)
libcap.so.2 => /lib/libcap.so.2 (0xb72d8000)
librt.so.1 => /lib/librt.so.1 (0xb72cf000)
libpthread.so.0 => /lib/libpthread.so.0 (0xb72b6000)
libxml2.so.2 => /usr/lib/libxml2.so.2 (0xb718e000)
libdl.so.2 => /lib/libdl.so.2 (0xb718a000)
libz.so.1 => /usr/lib/libz.so.1 (0xb7176000)
libm.so.6 => /lib/libm.so.6 (0xb7150000)
libc.so.6 => /lib/libc.so.6 (0xb6fe9000)
libattr.so.1 => /lib/libattr.so.1 (0xb6fe3000)
/lib/ld-linux.so.2 (0xb76fd000)
same on VM2 ("dance"):
rob0@dance:~$ ldd /usr/sbin/named
linux-vdso.so.1 (0x00007ffee676f000)
liblwres.so.160 => /usr/lib64/liblwres.so.160 (0x00007f2646e27000)
libdns.so.168 => /usr/lib64/libdns.so.168 (0x00007f2646a11000)
libbind9.so.160 => /usr/lib64/libbind9.so.160 (0x00007f2646800000)
libisccfg.so.160 => /usr/lib64/libisccfg.so.160 (0x00007f26465d4000)
libisccc.so.160 => /usr/lib64/libisccc.so.160 (0x00007f26463ca000)
libisc.so.166 => /usr/lib64/libisc.so.166 (0x00007f2646152000)
libcrypto.so.1 => /lib64/libcrypto.so.1 (0x00007f2645d01000)
libcap.so.2 => /lib64/libcap.so.2 (0x00007f2645afc000)
libjson-c.so.2 => /usr/lib64/libjson-c.so.2 (0x00007f26458f1000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f26456d4000)
libxml2.so.2 => /usr/lib64/libxml2.so.2 (0x00007f264536d000)
libz.so.1 => /lib64/libz.so.1 (0x00007f2645157000)
liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f2644f32000)
libm.so.6 => /lib64/libm.so.6 (0x00007f2644c29000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f2644a24000)
libc.so.6 => /lib64/libc.so.6 (0x00007f264465b000)
libattr.so.1 => /lib64/libattr.so.1 (0x00007f2644456000)
/lib64/ld-linux-x86-64.so.2 (0x000055afa6072000)
rob0@dance:~$ cat /etc/slackware-version ; uname -a
Slackware 14.2
Linux dance 4.4.69 #4 SMP Sat May 20 18:57:07 CDT 2017 x86_64 Westmere E56xx/L56xx/X56xx (Nehalem-C) GenuineIntel GNU/Linux
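
For anyone wanting to repeat the check without eyeballing the ldd
listing, a small sketch follows. It only looks for a library whose name
starts with "liblmdb" in ldd-style output; the function names are mine,
and a statically linked LMDB would not show up this way, so treat a
negative result as a hint rather than proof.

```python
import subprocess


def linked_libraries(ldd_output: str) -> set:
    """Return the library file names listed in ldd-style output."""
    libs = set()
    for line in ldd_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The first field is the library name (or a path for the loader).
        libs.add(fields[0].rsplit("/", 1)[-1])
    return libs


def has_lmdb(ldd_output: str) -> bool:
    """True if some linked library name starts with 'liblmdb'."""
    return any(lib.startswith("liblmdb") for lib in linked_libraries(ldd_output))


def ldd_of(path: str) -> str:
    """Run ldd on a binary and return its output (needs ldd installed)."""
    result = subprocess.run(
        ["ldd", path], stdout=subprocess.PIPE, universal_newlines=True, check=True
    )
    return result.stdout
```

Here has_lmdb(ldd_of("/usr/sbin/named")) would be the call to make on
each VM. "named -V" also prints the configure arguments, which is
another place to look.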
I don't think I have an NZD, now that you mention it. VM1:
26-May-2017 13:57:45.189 config: error: open: _default.nzf: file not found
... a bit further on after all the empty zones loaded:
26-May-2017 13:57:45.196 general: info: zone 56.78.158.98.in-addr.arpa/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone 57.78.158.98.in-addr.arpa/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone 58.78.158.98.in-addr.arpa/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone 59.78.158.98.in-addr.arpa/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone 60.78.158.98.in-addr.arpa/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone 61.78.158.98.in-addr.arpa/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone 62.78.158.98.in-addr.arpa/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone 63.78.158.98.in-addr.arpa/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone stormtracklinux.com/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone lizella.net/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone slackbook.org/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone slackbuilds.org/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone walk-on.us/IN: (slave) removed
(That was the whole catalog at that point.)
26-May-2017 13:57:45.196 general: info: zone nodns4.us/IN: reconfiguring zone keys
26-May-2017 13:57:45.197 general: info: reloading configuration succeeded
26-May-2017 13:57:45.197 general: info: zone nodns4.us/IN: next key event: 26-May-2017 14:57:45.196
26-May-2017 13:57:45.197 general: info: scheduled loading new zones
26-May-2017 13:57:45.197 general: info: catz: updating catalog zone 'catalog.example' with serial 2017052407
26-May-2017 13:57:45.211 general: info: any newly configured zones are now loaded
26-May-2017 13:57:45.211 general: info: zone dynamic.nodns4.us/IN: reconfiguring zone keys
26-May-2017 13:57:45.211 general: notice: running
26-May-2017 13:57:45.212 general: info: zone dynamic.nodns4.us/IN: next key event: 26-May-2017 14:57:45.211
26-May-2017 13:57:53.280 security: info: client @0x849fa80 95.97.142.106#60189 (ns1.slackbuilds.org): query (cache) 'ns1.slackbuilds.org/AAAA/IN' denied
And then a whole slew of denied queries for published zones (catz
members which had been removed).
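
In case it helps to compare servers, here is a rough sketch for pulling
the list of removed zones out of a log like the one above. It keys on
the "(slave) removed" wording exactly as it appears in my 9.11.1 logs
(other versions may word this differently), and it collapses whitespace
first so that mail-wrapped lines still match. The function name is only
illustrative.

```python
import re

# Matches e.g. "zone walk-on.us/IN: (slave) removed" and captures the name.
REMOVED_RE = re.compile(r"zone (\S+)/IN: \(slave\) removed")


def removed_slave_zones(log_text: str) -> list:
    """Return zone names reported as '(slave) removed', in log order."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flattened = " ".join(log_text.split())
    return REMOVED_RE.findall(flattened)
```

Running this over the reconfig portion of named.log and comparing the
result with the catalog's member list should show whether every member
was dropped or only some of them.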
So what I think we have here might be a Slackware packaging bug, not
providing something we need to write this <view>.nzf file? I can get
this Slackware bug fixed, but I am not sure about the software
requirements.
Let's go back to VM2, since it's newer and "clean", and look at the
working directory:
rob0@dance:~$ ls -lR /var/named/
/var/named/:
total 16
drwxr-xr-x 2 root root 4096 May 28 21:27 catzones/
drwxr-xr-x 2 root root 4096 May 27 12:46 logs/
-rw-r--r-- 1 root root 821 May 31 12:50 primary.mkeys
-rw-r--r-- 1 root root 512 May 31 12:50 primary.mkeys.jnl
/var/named/catzones:
total 72
-rw-r--r-- 1 root root 283 Jun 1 05:49 __catz__primary_catalog.example_56.78.158.98.in-addr.arpa.db
-rw-r--r-- 1 root root 286 Jun 1 09:47 __catz__primary_catalog.example_57.78.158.98.in-addr.arpa.db
-rw-r--r-- 1 root root 291 Jun 1 07:31 __catz__primary_catalog.example_58.78.158.98.in-addr.arpa.db
-rw-r--r-- 1 root root 278 Jun 1 04:21 __catz__primary_catalog.example_59.78.158.98.in-addr.arpa.db
-rw-r--r-- 1 root root 961 Jun 1 05:18 __catz__primary_catalog.example_60.78.158.98.in-addr.arpa.db
-rw-r--r-- 1 root root 282 Jun 1 08:09 __catz__primary_catalog.example_61.78.158.98.in-addr.arpa.db
-rw-r--r-- 1 root root 729 Jun 1 09:00 __catz__primary_catalog.example_62.78.158.98.in-addr.arpa.db
-rw-r--r-- 1 root root 278 Jun 1 06:49 __catz__primary_catalog.example_63.78.158.98.in-addr.arpa.db
-rw-r--r-- 1 root root 1243 Jun 1 10:01 __catz__primary_catalog.example_rlworkman.net.db
-rw-r--r-- 1 root root 1221 Jun 1 10:01 __catz__primary_catalog.example_rlworkman.net.db.jnl
-rw-r--r-- 1 root root 1561 Jun 1 08:35 __catz__primary_catalog.example_room101.us.eu.org.db
-rw-r--r-- 1 root root 837 Jun 1 08:56 __catz__primary_catalog.example_slackbook.org.db
-rw-r--r-- 1 root root 3193 Jun 1 09:27 __catz__primary_catalog.example_slackbuilds.org.db
-rw-r--r-- 1 root root 765 Jun 1 09:27 __catz__primary_catalog.example_slackbuilds.org.db.jnl
-rw-r--r-- 1 root root 621 Jun 1 08:52 __catz__primary_catalog.example_slackpkg.org.db
-rw-r--r-- 1 root root 1086 Jun 1 08:50 __catz__primary_catalog.example_stormtracklinux.com.db
-rw-r--r-- 1 root root 880 Jun 1 08:22 __catz__primary_catalog.example_tuxaloosa.org.db
-rw-r--r-- 1 root root 1176 Jun 1 09:28 __catz__primary_catalog.example_walk-on.us.db
/var/named/logs:
total 44
-rw-r--r-- 1 root root 37305 Jun 1 09:37 named.log
-rw-r--r-- 1 root root 0 May 27 12:46 query.log
So it seems that there should be a "/var/named/primary.nzf" file
containing all the catalog zone members (along with any zones from
addzone if appropriate)?
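
Until that is answered, the only on-disk record of the members that I
can see is the file names in catzones/. A rough sketch for recovering
the member list from them follows. It assumes the
"__catz__<view>_<catalog>_<zone>.db" naming seen in the listing above,
and the function name and defaults are just for illustration.

```python
def member_zones(filenames, view="primary", catalog="catalog.example"):
    """Extract member zone names from catalog zone file names.

    Assumes names of the form __catz__<view>_<catalog>_<zone>.db,
    optionally followed by .jnl for journal files.
    """
    prefix = "__catz__%s_%s_" % (view, catalog)
    zones = set()
    for name in filenames:
        if not name.startswith(prefix):
            continue  # not a member file of this view and catalog
        rest = name[len(prefix):]
        if rest.endswith(".db.jnl"):
            rest = rest[: -len(".db.jnl")]
        elif rest.endswith(".db"):
            rest = rest[: -len(".db")]
        else:
            continue
        zones.add(rest)
    return sorted(zones)
```

Feeding it os.listdir("/var/named/catzones") on VM2 should give one
name per member zone, with the .jnl files folded into their zones.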
For the record I'll also include "named-checkconf -p" ("Master" is
"harrier"):
acl "shibboleet" {
	98.158.78.58/32;
};
acl "harrier" {
	207.223.116.211/32;
};
acl "sbo" {
	"harrier";
	"shibboleet";
};
acl "spikenet" {
	208.94.237.144/28;
};
acl "trusted" {
	"localhost";
	"sbo";
	"spikenet";
};
acl "recursion" {
	"trusted";
};
acl "transfer" {
	"trusted";
};
logging {
	channel "default_log" {
		file "logs/named.log" versions unlimited size 4194304;
		severity dynamic;
		print-time yes;
		print-severity yes;
		print-category yes;
	};
	channel "query_log" {
		file "logs/query.log" versions 10 size 2097152;
		severity dynamic;
		print-time yes;
	};
	category "default" {
		"default_log";
	};
	category "queries" {
		"query_log";
	};
};
masters "shibboleet" {
	98.158.78.58;
};
masters "harrier" {
	207.223.116.211;
};
masters "sbo" {
	"harrier";
};
options {
	directory "/var/named";
	querylog no;
	allow-recursion {
		"recursion";
	};
	catalog-zones {
		zone "catalog.example" default-masters {
			"sbo";
		} zone-directory "catzones" in-memory no min-update-interval 10;
	};
	dnssec-validation auto;
	rate-limit {
		exempt-clients {
			"trusted";
		};
		log-only no;
		responses-per-second 9;
	};
	allow-transfer {
		"transfer";
	};
	zone-statistics yes;
};
view "primary" {
	zone "catalog.example" {
		type slave;
		masters {
			"sbo";
		};
		allow-query {
			"trusted";
		};
	};
};