On 2017-06-01 08:23, Cathy Almond via RT wrote:
> On Wed May 31 15:01:18 2017, ca-isc@nodns4.us wrote:
>
>> Software Version: BIND 9.11.1
>> OS: Linux (Slackware 14.2)
>> Subject:rndc reconfig wipes out catalog zone slaves
>>
>>
>> Bug Detail
>> ===========
>> Also described at
>> https://lists.isc.org/pipermail/bind-users/2017-May/098643.html
>> et seq.
>>
>> When a change is needed in a catalog zone slave server's named.conf,
>> "rndc reconfig" or "rndc reload" removes all catalog zone member
>> slaves. The only fix I have is to remove the catalog-zones option
>> from the slave server's options{}, stop and restart named.
>>
>> This is very troublesome when these slave servers have been published
>> as NS for the zones in question; clients receive REFUSED replies to
>> queries, and these quickly fill up logs.
>>
>> Discovered on BIND 9.11.1 on Slackware 13.37, replicated on
>> 9.11.1/Slackware 14.2.
>
> Hi Chuck,
>
> We have a theory about what's going on here.
>
> In your bind-users post you said that you're using a combination of
> rndc reconfig and nsupdate to convert the VMs (VM1 in the first
> instance) from being a master for a number of zones, to being a slave,
> provisioned using CATZ.
>
> There may be some sequencing issues here therefore. If VM1 still has
> the old zones at the point that they were being provisioned via CATZ,
> then this is going to fail. If you're trying to convert VM1 from
> master to CATZ-provisioned in one fell swoop (i.e. with a single rndc
> reconfig) and not in two steps, then it could be that this is why it's
> breaking for you.

Not the case. I first prepared a file to feed to nsupdate on Master
which would add a group of zones to the catalog. Those zones were
master zones on VM1 at the time. I first edited the VM1 config to
remove those zones, then reconfig on VM1, and then nsupdate on Master.
At that time I had three slave zone instances for each listed zone.
(Later on I converted the slave statements to masters on Master.)

The first time went perfectly, but at that time there were no zones
listed in the catalog.

> Could you be (just) a little bit more detailed on the migration steps?
>
> What was actually being altered when you did the rndc reconfig?

On VM1, removal of master zone statements to make way for the catalog
addition. On VM2 I added an acl list I had forgotten and a rate-limit
option. In each case, ALL existing catalog zone members were removed
and the queries refused.

> Another theory is around how many zones you are actually serving and
> an integration problem we've uncovered with using LMDB (liblmdb) as
> the backend new zone databased (NZD).

Oh, this one has merit, I think we are on to something. Apparently
liblmdb is not a firm BIND build requirement, but without it, do we
save our NZD at all? That is, is there an alternative to LMDB, which
has not [yet] been made a part of Slackware? Was the need for LMDB
documented somewhere?

> How many zones are involved here,

16 at the point where I stopped migrating. There are a small number
of zones yet to migrate.

> and did you build BIND 9.11 in an
> environment with liblmdb installed?

Looks like no, on VM1 ("Shibboleet"):

rob0@Shibboleet:~$ cat /etc/slackware-version ; uname -a
Slackware 13.37.0
Linux Shibboleet 3.2.79-smp #13 SMP Wed Apr 13 23:26:38 UTC 2016 i686 Intel(R) Xeon(R) CPU X5670 @ 2.93GHz GenuineIntel GNU/Linux
rob0@Shibboleet:~$ ldd /usr/sbin/named
        linux-gate.so.1 => (0xffffe000)
        liblwres.so.160 => /usr/lib/liblwres.so.160 (0xb76d6000)
        libdns.so.168 => /usr/lib/libdns.so.168 (0xb74c6000)
        libbind9.so.160 => /usr/lib/libbind9.so.160 (0xb74b7000)
        libisccfg.so.160 => /usr/lib/libisccfg.so.160 (0xb749a000)
        libisccc.so.160 => /usr/lib/libisccc.so.160 (0xb7491000)
        libisc.so.166 => /usr/lib/libisc.so.166 (0xb7424000)
        libcrypto.so.0 => /lib/libcrypto.so.0 (0xb72dc000)
        libcap.so.2 => /lib/libcap.so.2 (0xb72d8000)
        librt.so.1 => /lib/librt.so.1 (0xb72cf000)
        libpthread.so.0 => /lib/libpthread.so.0 (0xb72b6000)
        libxml2.so.2 => /usr/lib/libxml2.so.2 (0xb718e000)
        libdl.so.2 => /lib/libdl.so.2 (0xb718a000)
        libz.so.1 => /usr/lib/libz.so.1 (0xb7176000)
        libm.so.6 => /lib/libm.so.6 (0xb7150000)
        libc.so.6 => /lib/libc.so.6 (0xb6fe9000)
        libattr.so.1 => /lib/libattr.so.1 (0xb6fe3000)
        /lib/ld-linux.so.2 (0xb76fd000)

same on VM2 ("dance"):

rob0@dance:~$ ldd /usr/sbin/named
        linux-vdso.so.1 (0x00007ffee676f000)
        liblwres.so.160 => /usr/lib64/liblwres.so.160 (0x00007f2646e27000)
        libdns.so.168 => /usr/lib64/libdns.so.168 (0x00007f2646a11000)
        libbind9.so.160 => /usr/lib64/libbind9.so.160 (0x00007f2646800000)
        libisccfg.so.160 => /usr/lib64/libisccfg.so.160 (0x00007f26465d4000)
        libisccc.so.160 => /usr/lib64/libisccc.so.160 (0x00007f26463ca000)
        libisc.so.166 => /usr/lib64/libisc.so.166 (0x00007f2646152000)
        libcrypto.so.1 => /lib64/libcrypto.so.1 (0x00007f2645d01000)
        libcap.so.2 => /lib64/libcap.so.2 (0x00007f2645afc000)
        libjson-c.so.2 => /usr/lib64/libjson-c.so.2 (0x00007f26458f1000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f26456d4000)
        libxml2.so.2 => /usr/lib64/libxml2.so.2 (0x00007f264536d000)
        libz.so.1 => /lib64/libz.so.1 (0x00007f2645157000)
        liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f2644f32000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f2644c29000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f2644a24000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f264465b000)
        libattr.so.1 => /lib64/libattr.so.1 (0x00007f2644456000)
        /lib64/ld-linux-x86-64.so.2 (0x000055afa6072000)
rob0@dance:~$ cat /etc/slackware-version ; uname -a
Slackware 14.2
Linux dance 4.4.69 #4 SMP Sat May 20 18:57:07 CDT 2017 x86_64 Westmere E56xx/L56xx/X56xx (Nehalem-C) GenuineIntel GNU/Linux

I think I don't have a NZD, now that you mention it. VM1:

26-May-2017 13:57:45.189 config: error: open: _default.nzf: file not found

... a bit further on after all the empty zones loaded:

26-May-2017 13:57:45.196 general: info: zone 56.78.158.98.in-addr.arpa/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone 57.78.158.98.in-addr.arpa/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone 58.78.158.98.in-addr.arpa/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone 59.78.158.98.in-addr.arpa/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone 60.78.158.98.in-addr.arpa/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone 61.78.158.98.in-addr.arpa/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone 62.78.158.98.in-addr.arpa/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone 63.78.158.98.in-addr.arpa/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone stormtracklinux.com/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone lizella.net/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone slackbook.org/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone slackbuilds.org/IN: (slave) removed
26-May-2017 13:57:45.196 general: info: zone walk-on.us/IN: (slave) removed

(That was the whole catalog at that point.)

26-May-2017 13:57:45.196 general: info: zone nodns4.us/IN: reconfiguring zone keys
26-May-2017 13:57:45.197 general: info: reloading configuration succeeded
26-May-2017 13:57:45.197 general: info: zone nodns4.us/IN: next key event: 26-May-2017 14:57:45.196
26-May-2017 13:57:45.197 general: info: scheduled loading new zones
26-May-2017 13:57:45.197 general: info: catz: updating catalog zone 'catalog.example' with serial 2017052407
26-May-2017 13:57:45.211 general: info: any newly configured zones are now loaded
26-May-2017 13:57:45.211 general: info: zone dynamic.nodns4.us/IN: reconfiguring zone keys
26-May-2017 13:57:45.211 general: notice: running
26-May-2017 13:57:45.212 general: info: zone dynamic.nodns4.us/IN: next key event: 26-May-2017 14:57:45.211
26-May-2017 13:57:53.280 security: info: client @0x849fa80 95.97.142.106#60189 (ns1.slackbuilds.org): query (cache) 'ns1.slackbuilds.org/AAAA/IN' denied

And then a whole slew of denied queries for published zones (catz
members which had been removed.)

So what I think we have here might be a Slackware packaging bug, not
providing something we need to write this .nzf file? I can get this
Slackware bug fixed, but I am not sure about the software requirements.

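For a quicker answer to the liblmdb question than reading the ldd
listings by eye, something like the following might do. This is only a
sketch: the function name links_lmdb is made up for illustration, and
it assumes the library shows up in ldd output under the usual
"liblmdb.so" name.

```shell
# Hypothetical helper: succeeds (exit 0) if ldd-style output on stdin
# mentions liblmdb, fails (non-zero) otherwise.
links_lmdb() {
    grep -q 'liblmdb\.so'
}

# Usage (assumes named is installed at /usr/sbin/named, as on these VMs):
#   ldd /usr/sbin/named | links_lmdb && echo "LMDB linked" || echo "no LMDB"
```

On both VMs above this should report "no LMDB", since neither ldd
listing contains a liblmdb entry.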
Let's go back to VM2, since it's newer and "clean", and look at the
working directory:

rob0@dance:~$ ls -lR /var/named/
/var/named/:
total 16
drwxr-xr-x 2 root root 4096 May 28 21:27 catzones/
drwxr-xr-x 2 root root 4096 May 27 12:46 logs/
-rw-r--r-- 1 root root  821 May 31 12:50 primary.mkeys
-rw-r--r-- 1 root root  512 May 31 12:50 primary.mkeys.jnl

/var/named/catzones:
total 72
-rw-r--r-- 1 root root  283 Jun  1 05:49 __catz__primary_catalog.example_56.78.158.98.in-addr.arpa.db
-rw-r--r-- 1 root root  286 Jun  1 09:47 __catz__primary_catalog.example_57.78.158.98.in-addr.arpa.db
-rw-r--r-- 1 root root  291 Jun  1 07:31 __catz__primary_catalog.example_58.78.158.98.in-addr.arpa.db
-rw-r--r-- 1 root root  278 Jun  1 04:21 __catz__primary_catalog.example_59.78.158.98.in-addr.arpa.db
-rw-r--r-- 1 root root  961 Jun  1 05:18 __catz__primary_catalog.example_60.78.158.98.in-addr.arpa.db
-rw-r--r-- 1 root root  282 Jun  1 08:09 __catz__primary_catalog.example_61.78.158.98.in-addr.arpa.db
-rw-r--r-- 1 root root  729 Jun  1 09:00 __catz__primary_catalog.example_62.78.158.98.in-addr.arpa.db
-rw-r--r-- 1 root root  278 Jun  1 06:49 __catz__primary_catalog.example_63.78.158.98.in-addr.arpa.db
-rw-r--r-- 1 root root 1243 Jun  1 10:01 __catz__primary_catalog.example_rlworkman.net.db
-rw-r--r-- 1 root root 1221 Jun  1 10:01 __catz__primary_catalog.example_rlworkman.net.db.jnl
-rw-r--r-- 1 root root 1561 Jun  1 08:35 __catz__primary_catalog.example_room101.us.eu.org.db
-rw-r--r-- 1 root root  837 Jun  1 08:56 __catz__primary_catalog.example_slackbook.org.db
-rw-r--r-- 1 root root 3193 Jun  1 09:27 __catz__primary_catalog.example_slackbuilds.org.db
-rw-r--r-- 1 root root  765 Jun  1 09:27 __catz__primary_catalog.example_slackbuilds.org.db.jnl
-rw-r--r-- 1 root root  621 Jun  1 08:52 __catz__primary_catalog.example_slackpkg.org.db
-rw-r--r-- 1 root root 1086 Jun  1 08:50 __catz__primary_catalog.example_stormtracklinux.com.db
-rw-r--r-- 1 root root  880 Jun  1 08:22 __catz__primary_catalog.example_tuxaloosa.org.db
-rw-r--r-- 1 root root 1176 Jun  1 09:28 __catz__primary_catalog.example_walk-on.us.db

/var/named/logs:
total 44
-rw-r--r-- 1 root root 37305 Jun  1 09:37 named.log
-rw-r--r-- 1 root root     0 May 27 12:46 query.log

So it seems that there should be a "/var/named/primary.nzf" file
containing all the catalog zone members (along with any zones from
addzone if appropriate)?

For the record I'll also include "named-checkconf -p" ("Master" is
"harrier"):

acl "shibboleet" {
	98.158.78.58/32;
};
acl "harrier" {
	207.223.116.211/32;
};
acl "sbo" {
	"harrier";
	"shibboleet";
};
acl "spikenet" {
	208.94.237.144/28;
};
acl "trusted" {
	"localhost";
	"sbo";
	"spikenet";
};
acl "recursion" {
	"trusted";
};
acl "transfer" {
	"trusted";
};
logging {
	channel "default_log" {
		file "logs/named.log" versions unlimited size 4194304;
		severity dynamic;
		print-time yes;
		print-severity yes;
		print-category yes;
	};
	channel "query_log" {
		file "logs/query.log" versions 10 size 2097152;
		severity dynamic;
		print-time yes;
	};
	category "default" {
		"default_log";
	};
	category "queries" {
		"query_log";
	};
};
masters "shibboleet" {
	98.158.78.58;
};
masters "harrier" {
	207.223.116.211;
};
masters "sbo" {
	"harrier";
};
options {
	directory "/var/named";
	querylog no;
	allow-recursion {
		"recursion";
	};
	catalog-zones {
		zone "catalog.example" default-masters {
			"sbo";
		} zone-directory "catzones" in-memory no min-update-interval 10;
	};
	dnssec-validation auto;
	rate-limit {
		exempt-clients {
			"trusted";
		};
		log-only no;
		responses-per-second 9;
	};
	allow-transfer {
		"transfer";
	};
	zone-statistics yes;
};
view "primary" {
	zone "catalog.example" {
		type slave;
		masters {
			"sbo";
		};
		allow-query {
			"trusted";
		};
	};
};
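
To confirm whether any new-zone file exists in the working directory
at all, a check along these lines could be used. It is a rough sketch:
list_nz_files is a made-up name, and it assumes (as far as I
understand it) that such files end in ".nzf" for the plain-text form
or ".nzd" for the LMDB form and sit directly under the configured
directory.

```shell
# Hypothetical helper: list new-zone files (*.nzf or *.nzd) located
# directly in the directory given as $1, one path per line, sorted.
list_nz_files() {
    find "$1" -maxdepth 1 -type f \( -name '*.nzf' -o -name '*.nzd' \) | sort
}

# Usage (assumes the working directory from the options{} block above):
#   list_nz_files /var/named
```

Given the ls -lR output above, this would be expected to print nothing
for /var/named on VM2.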