Report information
The Basics
Id: 45380
Status: resolved
Priority: Medium/Medium
Queue: BugTracker
Version Fixed: 9.9.12, 9.10.7, 9.11.3, 9.12.0
Version Found: (no value)
Versions Affected: (no value)
Versions Planned: (no value)
Priority: P3 Low
Severity: S2 Normal
CVSS Score: (no value)
CVE ID: (no value)
Component: BIND Server
Area: bug

Dates
Created: Wed, 14 Jun 2017 09:06:45 -0400
Updated: Thu, 17 Aug 2017 06:28:12 -0400
Closed: Mon, 14 Aug 2017 09:26:07 -0400




Subject: nsupdate: master address failover works incorrectly when GSSAPI is used
When GSSAPI is used and TKEY retrieval fails for a given master address, nsupdate resends the TKEY query to the next master address, if available. However, recvgss() uses sendrequest() instead of send_gssrequest() for sending the retried TKEY query to the next master address, which causes recvsoa() to be called instead of recvgss() when the retried TKEY query is completed. AFAICT, this bug has been present in the code ever since GSS-TSIG was first implemented in 289ae548d5. The issue is pretty obscure, here is the simplest (sic!) environment in which I was able to trigger it: - a dual-stacked host, - a local named instance running with the following configuration: ---------------------------------------------------------------- options { listen-on { 127.0.0.1; }; listen-on-v6 { }; tkey-gssapi-keytab "keytab"; }; zone "example." { type master; file "example.db"; notify no; update-policy { grant user@REALM zonesub ANY; }; }; ---------------------------------------------------------------- - example.db contains: ---------------------------------------------------------------- $TTL 300 ; 5 minutes example IN SOA localhost. michal.isc.org. ( 1 ; serial 86400 ; refresh (1 day) 3600 ; retry (1 hour) 3600000 ; expire (5 weeks 6 days 16 hours) 300 ; minimum (5 minutes) ) NS ns1.example. NS ns2.example. $ORIGIN example. ns1 A 127.0.0.1 ns2 A 127.0.0.1 ---------------------------------------------------------------- - nsupdate.txt contains: ---------------------------------------------------------------- update add ns3.example. 300 A 127.0.0.1 send ---------------------------------------------------------------- - /etc/resolv.conf contains "nameserver 127.0.0.1". Kerberos setup is irrelevant, as long as it is working properly. When run in the above environment, "nsupdate -g < nsupdate.txt" will look for the master server to send UPDATE messages to by issuing a SOA query to the resolver configured in /etc/resolv.conf (i.e. 
127.0.0.1) and checking the MNAME field in the response. As the MNAME is set to "localhost.", nsupdate will then grab a list of master server addresses using bind9_getaddresses(). Due to the host being dual-stacked, "localhost" will be resolved to a list of two addresses: ::1 and 127.0.0.1. As the local named instance does not listen on ::1, recvgss() will fail over to the next master address, i.e. 127.0.0.1.

Here is a debug log generated using nsupdate from current master:

--------------------------------------------------------------------
$ nsupdate -g -D < nsupdate.txt
setup_system()
reset_system()
user_interaction()
do_next_command()
evaluate_update()
update_addordelete()
do_next_command()
start_update()
recvsoa()
About to create rcvmsg
show_message()
Reply from SOA query:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 61922
;; flags: qr aa rd ra; QUESTION: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
;; QUESTION SECTION:
;ns3.example. IN SOA

;; AUTHORITY SECTION:
example. 0 IN SOA localhost. michal.isc.org. 1 86400 3600 3600000 300

Found zone name: example
The master is: localhost
start_gssrequest
Found realm from ticket: REALM
send_gssrequest
show_message()
Outgoing update query:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 58268
;; flags:; QUESTION: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; QUESTION SECTION:
;519555431.sig-localhost. ANY TKEY

;; ADDITIONAL SECTION:
519555431.sig-localhost. 0 ANY TKEY gss-tsig.
1497444295 1497444295 3 NOERROR 647 YIICgwYGKwYBBQUCoIICdzCCAnOgDTALBgkqhkiG9xIBAgKiggJgBIIC XGCCAlgGCSqGSIb3EgECAgEAboICRzCCAkOgAwIBBaEDAgEOogcDBQAg AAAAo4IBV2GCAVMwggFPoAMCAQWhDxsNSFEuS0VNUE5JVS5QTKIbMBmg AwIBAaESMBAbA0ROUxsJbG9jYWxob3N0o4IBGDCCARSgAwIBEqEDAgEC ooIBBgSCAQJdNb6tzrm3W/YubKnziJWlzVG6YcBRZh2vRgxa+fOItHVR YlvanW+KoWmpbp4UmEK/6RybBEZzfZ2Guwz2MilFJD0XtdamDvnT1z+j Qn6jGwtS4q3M+raoIlcu3DPecCqgdzv+bRPaTQMpUmp5RILD8Y+6qa18 9j7TZiSW2TJTFSdOSIfLNRbfQ7l5XiPzTx9Wsc7y9cUegStpWa/DnysK r9MhjNuuUfA69ZZB57ODYoMmD6gL+N9GepqzJPYuSeb7XqJCBJtnyG6V UXDje4G+6w9JEIh/i8DSHFBPIfTnRRJXcBwSpR5AA6w8PwynyLpBFY1+ MbvbMhMgG1DJqZcc9GqkgdIwgc+gAwIBEqKBxwSBxPlrNBoqYvw4xFx3 DvqWViKA1/lfBqQ9odRb4xVN5c0X7hM3JTE6lUsRr76KZ1arzuyNLeF/ 4TxtSGKPCBz8h86ziV4hY1bCHQ0s7hsc/IjwYSIFikh9adFH/vcZJ5Eh vyB6Q2eFu2yEW3YIcDx+nppuPZjwIqCabZ85KDqRbB1ryvFzamoBlpHr LhbFAxh5qhPJWwz9EzMq9E5mRm+EYzmHYNFeQ4BoHrY+kM8mFNkmzIYy 6Qzb9Tyh+vbptvdtSEwyV1I= 0

Out of recvsoa
recvgss()
; Communication with ::1#53 failed: operation canceled
recvgss: trying next server
Destroying request [0x7f2526fef010]
recvsoa()
About to create rcvmsg
show_message()
Reply from SOA query:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 39068
;; flags: qr ra; QUESTION: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;519555431.sig-localhost. ANY TKEY

;; ANSWER SECTION:
519555431.sig-localhost. 0 ANY TKEY gss-tsig. 1497444295 1497447895 3 NOERROR 186 oYG3MIG0oAMKAQChCwYJKoZIhvcSAQICooGfBIGcYIGZBgkqhkiG9xIB AgICAG+BiTCBhqADAgEFoQMCAQ+iejB4oAMCARKicQRvJv80Y9UPwyJd MSxmo/yIgR9LgpY+BjqvuW2yPqc0hm+qxjdpdPFZVUVE7Xl43flI4VM6 vwSvD9Gi3S4x5UXAI2LOQl60p9CzQnrsMP8+NLEblNSRSSxtIsT6b8uK EjZ3OxCNyqdIva95wFl2l6AV 0

Out of recvsoa
recvsoa()
About to create rcvmsg
show_message()
Reply from SOA query:
;; ->>HEADER<<- opcode: QUERY, status: FORMERR, id: 54329
;; flags: qr ra; QUESTION: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;sig-localhost.
ANY TKEY

response to SOA query was unsuccessful
--------------------------------------------------------------------

A PCAP captured during the above nsupdate session is also attached.

What happens here is that recvgss() resends the TKEY query to 127.0.0.1 using sendrequest(), which causes recvsoa() to be called when the resent TKEY query is answered. recvsoa() is understandably baffled as it is unable to find a SOA RR in either the ANSWER section or the AUTHORITY section of the received answer, which causes it to evaluate the code beneath the "droplabel" label, which in turn causes the TKEY query to be resent once again, this time with one label cut off, eliciting a FORMERR response.
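
For readers who do not have nsupdate.c open, the callback mix-up described above can be illustrated with a toy Python model. This is a sketch, not BIND code: only the function names sendrequest(), send_gssrequest(), recvsoa() and recvgss() come from the report; the ToyUpdater class, its fields, the "reachable" set and the "fixed" switch are all made up for illustration, and real requests are replaced by a simple queue.

```python
from collections import deque


class ToyUpdater:
    """Toy model of the request/callback pairing (hypothetical, not BIND code)."""

    def __init__(self, masters, reachable, fixed):
        self.masters = list(masters)      # candidate master addresses, in order
        self.reachable = set(reachable)   # addresses that actually answer
        self.fixed = fixed                # True models the proposed fix
        self.index = 0                    # which master is currently being tried
        self.pending = deque()            # queued (success, callback) pairs
        self.trace = []                   # names of the callbacks that ran

    def _send(self, callback):
        # A request "completes" successfully only if the target is reachable.
        addr = self.masters[self.index]
        self.pending.append((addr in self.reachable, callback))

    def sendrequest(self):
        # Plain request: completion is handled by recvsoa().
        self._send(self.recvsoa)

    def send_gssrequest(self):
        # TKEY request: completion is handled by recvgss().
        self._send(self.recvgss)

    def recvsoa(self, ok):
        self.trace.append("recvsoa")

    def recvgss(self, ok):
        self.trace.append("recvgss")
        if not ok and self.index + 1 < len(self.masters):
            self.index += 1               # fail over to the next master address
            if self.fixed:
                self.send_gssrequest()    # retried TKEY query returns to recvgss()
            else:
                self.sendrequest()        # the bug: retry is wired to recvsoa()

    def run(self):
        self.send_gssrequest()
        while self.pending:
            ok, callback = self.pending.popleft()
            callback(ok)
        return self.trace


if __name__ == "__main__":
    masters = ["::1", "127.0.0.1"]
    print(ToyUpdater(masters, {"127.0.0.1"}, fixed=False).run())
    print(ToyUpdater(masters, {"127.0.0.1"}, fixed=True).run())
```

With ::1 unreachable and 127.0.0.1 reachable, the unfixed model ends up in recvsoa() for the retried TKEY query, while the fixed one stays in recvgss(), which mirrors the difference between the two debug logs in this ticket.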
Subject: nsupdate-master-failover-error.pcap


Suggested fix is in the rt45380 branch. Please review.

Here is a debug log generated using modified nsupdate:

--------------------------------------------------------------------
$ nsupdate -g -D < nsupdate.txt
setup_system()
reset_system()
user_interaction()
do_next_command()
evaluate_update()
update_addordelete()
do_next_command()
start_update()
recvsoa()
About to create rcvmsg
show_message()
Reply from SOA query:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 1790
;; flags: qr aa rd ra; QUESTION: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
;; QUESTION SECTION:
;ns3.example. IN SOA

;; AUTHORITY SECTION:
example. 0 IN SOA localhost. michal.isc.org. 1 86400 3600 3600000 300

Found zone name: example
The master is: localhost
start_gssrequest
Found realm from ticket: REALM
send_gssrequest
show_message()
Outgoing update query:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 59541
;; flags:; QUESTION: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; QUESTION SECTION:
;1260325310.sig-localhost. ANY TKEY

;; ADDITIONAL SECTION:
1260325310.sig-localhost. 0 ANY TKEY gss-tsig.
1497444491 1497444491 3 NOERROR 647 YIICgwYGKwYBBQUCoIICdzCCAnOgDTALBgkqhkiG9xIBAgKiggJgBIIC XGCCAlgGCSqGSIb3EgECAgEAboICRzCCAkOgAwIBBaEDAgEOogcDBQAg AAAAo4IBV2GCAVMwggFPoAMCAQWhDxsNSFEuS0VNUE5JVS5QTKIbMBmg AwIBAaESMBAbA0ROUxsJbG9jYWxob3N0o4IBGDCCARSgAwIBEqEDAgEC ooIBBgSCAQJdNb6tzrm3W/YubKnziJWlzVG6YcBRZh2vRgxa+fOItHVR YlvanW+KoWmpbp4UmEK/6RybBEZzfZ2Guwz2MilFJD0XtdamDvnT1z+j Qn6jGwtS4q3M+raoIlcu3DPecCqgdzv+bRPaTQMpUmp5RILD8Y+6qa18 9j7TZiSW2TJTFSdOSIfLNRbfQ7l5XiPzTx9Wsc7y9cUegStpWa/DnysK r9MhjNuuUfA69ZZB57ODYoMmD6gL+N9GepqzJPYuSeb7XqJCBJtnyG6V UXDje4G+6w9JEIh/i8DSHFBPIfTnRRJXcBwSpR5AA6w8PwynyLpBFY1+ MbvbMhMgG1DJqZcc9GqkgdIwgc+gAwIBEqKBxwSBxJb07TctNIsDAT5b qLch2YWNCCpGt6b9GoQ1AKXS8O9dMQDXWaneKJcnJbip3pOxyK7pJU8Z 6MZOvq8ik+JCZFM3N85IHeKYJLuE9ipHrbmOqBa0wYcrTv6rnVoPX7VK ZKMyTEu5rRtzYhcoq0eC5fiSAjPu+rGD++1QfmlV37SuYlOxs/RiypmK slfpVYjQudnYJDV+9kiwHYvxb1d7R29rkRdho1mIE6dQRfePc6EHmBZh +k5lLjhR9kh+Xfd3xrvZ/ls= 0

Out of recvsoa
recvgss()
; Communication with ::1#53 failed: operation canceled
recvgss: trying next server
Destroying request [0x7f3d7a2da010]
send_gssrequest
show_message()
Outgoing update query:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7354
;; flags:; QUESTION: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; QUESTION SECTION:
;1260325310.sig-localhost. ANY TKEY

;; ADDITIONAL SECTION:
1260325310.sig-localhost. 0 ANY TKEY gss-tsig.
1497444491 1497444491 3 NOERROR 647 YIICgwYGKwYBBQUCoIICdzCCAnOgDTALBgkqhkiG9xIBAgKiggJgBIIC XGCCAlgGCSqGSIb3EgECAgEAboICRzCCAkOgAwIBBaEDAgEOogcDBQAg AAAAo4IBV2GCAVMwggFPoAMCAQWhDxsNSFEuS0VNUE5JVS5QTKIbMBmg AwIBAaESMBAbA0ROUxsJbG9jYWxob3N0o4IBGDCCARSgAwIBEqEDAgEC ooIBBgSCAQJdNb6tzrm3W/YubKnziJWlzVG6YcBRZh2vRgxa+fOItHVR YlvanW+KoWmpbp4UmEK/6RybBEZzfZ2Guwz2MilFJD0XtdamDvnT1z+j Qn6jGwtS4q3M+raoIlcu3DPecCqgdzv+bRPaTQMpUmp5RILD8Y+6qa18 9j7TZiSW2TJTFSdOSIfLNRbfQ7l5XiPzTx9Wsc7y9cUegStpWa/DnysK r9MhjNuuUfA69ZZB57ODYoMmD6gL+N9GepqzJPYuSeb7XqJCBJtnyG6V UXDje4G+6w9JEIh/i8DSHFBPIfTnRRJXcBwSpR5AA6w8PwynyLpBFY1+ MbvbMhMgG1DJqZcc9GqkgdIwgc+gAwIBEqKBxwSBxJb07TctNIsDAT5b qLch2YWNCCpGt6b9GoQ1AKXS8O9dMQDXWaneKJcnJbip3pOxyK7pJU8Z 6MZOvq8ik+JCZFM3N85IHeKYJLuE9ipHrbmOqBa0wYcrTv6rnVoPX7VK ZKMyTEu5rRtzYhcoq0eC5fiSAjPu+rGD++1QfmlV37SuYlOxs/RiypmK slfpVYjQudnYJDV+9kiwHYvxb1d7R29rkRdho1mIE6dQRfePc6EHmBZh +k5lLjhR9kh+Xfd3xrvZ/ls= 0

recvgss()
recvgss creating rcvmsg
show_message()
recvmsg reply from GSS-TSIG query
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7354
;; flags: qr ra; QUESTION: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;1260325310.sig-localhost. ANY TKEY

;; ANSWER SECTION:
1260325310.sig-localhost. 0 ANY TKEY gss-tsig. 1497444491 1497448091 3 NOERROR 186 oYG3MIG0oAMKAQChCwYJKoZIhvcSAQICooGfBIGcYIGZBgkqhkiG9xIB AgICAG+BiTCBhqADAgEFoQMCAQ+iejB4oAMCARKicQRvuvPWfLf7Whno 1TkY9ykZtTHKOWaywfNzvLrUXzUW4qNjvGZUjDdeYKzaw/U174MzOUxS uIXZLWAdXJLKmZc9Ug+/Gl6IoYxw+kiXvYyCqfLon1mIs8gCK6RHCVID doUpPJqSbfr5r4n4jqRZAZzG 0

send_update()
Sending update to 127.0.0.1#53
show_message()
Outgoing update query:
;; ->>HEADER<<- opcode: UPDATE, status: NOERROR, id: 23210
;; flags:; ZONE: 1, PREREQ: 0, UPDATE: 1, ADDITIONAL: 1
;; UPDATE SECTION:
ns3.example. 300 IN A 127.0.0.1

;; TSIG PSEUDOSECTION:
1260325310.sig-localhost. 0 ANY TSIG gss-tsig.
1497444491 300 28 BAQE//////8AAAAAPQVooIZy8N2gYMTWx4ahaA== 23210 NOERROR 0

Out of recvgss
update_completed()
tsig verification successful
show_message()
Reply from update query:
;; ->>HEADER<<- opcode: UPDATE, status: NOERROR, id: 23210
;; flags: qr; ZONE: 1, PREREQ: 0, UPDATE: 0, ADDITIONAL: 1
;; ZONE SECTION:
;example. IN SOA

;; TSIG PSEUDOSECTION:
1260325310.sig-localhost. 0 ANY TSIG gss-tsig. 1497444491 300 28 BAQF//////8AAAAAFNIHoOqztmsVwLMcZCrIPw== 23210 NOERROR 0

done_update()
reset_system()
user_interaction()
cleanup()
Shutting down task manager
shutdown_program()
Shutting down request manager
Destroy DST lib
Destroying request manager
Freeing the dispatchers
Shutting down dispatch manager
Destroying event
Shutting down socket manager
Shutting down timer manager
Destroying hash context
Destroying name state
Removing log context
Destroying memory context
--------------------------------------------------------------------

A PCAP captured during the above nsupdate session is also attached.

I do not think we can prepare a reliable system test for this issue using our current test infrastructure, but I would love to be proven wrong.
Subject: nsupdate-master-failover-fixed.pcap


I am afraid it is not that simple. There are two prerequisites for triggering this bug:

  1. The master_servers array has to contain more than one address.

  2. The SOA query has to be responded to properly, while sending the TKEY query has to elicit a dispatch error.

Your comment prompted me to spend most of yesterday trying to find a way to satisfy both of these prerequisites within the limitations of our system test infrastructure, i.e. without making any assumptions about the capabilities and configuration of the host running the tests. The executive summary is that I failed. However, my attempts led me to two other issues:

  - the return value of the next_master() call in the same code branch as the original problem is ignored,

  - when running in local-only mode ("-l") or when /etc/resolv.conf contains no name server addresses, nsupdate primes the master server address list with localhost addresses; however, 127.0.0.1 is tried before ::1 (perhaps this one deserves a separate ticket to avoid confusion?).

I pushed fixes for both of the above issues to the rt45380 branch, please review.

Following is a discussion of why I was unable to prepare a system test.

IIUC, the first requirement can only be satisfied in three cases:

  a) No "server" command is provided and SOA MNAME resolves to multiple addresses.

  b) A "server" command is provided and the given server name resolves to multiple addresses.

  c) Local-only mode ("-l") is used and the host is dual-stacked.

Problems with each of these cases:

  a) In order to direct SOA queries to localhost, we would either need to control /etc/resolv.conf or _assume_ the latter contains either only ::1 or only 127.0.0.1.

  b) We could use "server localhost" and _hope_ the host is dual-stacked. Though even if it was, there would still be no easy way to satisfy the second requirement:

       - If the local server listened on both ::1 and 127.0.0.1, it would have to correctly respond to the SOA query and then immediately close the socket used to receive that query, so that sending the TKEY query elicits a dispatch error.

       - If the local server listened on just ::1 or just 127.0.0.1, the dispatch error would already have been triggered by the time recvsoa() is called, so recvgss() would not get a chance to fail over to another address.

  c) Local-only mode not only requires a session key to be installed beforehand in the location defined by the SESSION_KEYFILE constant, but is also subject to the same limitations as case b) _and_ assumes the host is dual-stacked.

My original reproduction scenario is based on case a), but it _requires_ /etc/resolv.conf to contain only 127.0.0.1 and _requires_ the host to be dual-stacked. Combined with a test server listening only on 127.0.0.1, this results in the SOA query being sent to 127.0.0.1 and successfully responded to with MNAME set to "localhost.", followed by sending a TKEY query to ::1, which causes recvgss() to fail over to 127.0.0.1. I could not find any way to achieve the same effect using a system test.

All checks performed in the tsiggss system test use "server 10.53.0.1", which prevents any failover from occurring.
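
As an illustration of the first of the two extra issues (the ignored next_master() return value), here is a small Python sketch of a failover loop that does honor such a return value. It is only a model under stated assumptions: the real next_master() lives in nsupdate.c and has a different signature; the state dictionary, the "reachable" set and the failover_tkey() helper are hypothetical names introduced for this example.

```python
def next_master(state):
    """Advance to the next master address.

    Returns False when the list is exhausted, which is the signal the
    caller must not ignore (hypothetical stand-in for the C function).
    """
    if state["index"] + 1 >= len(state["masters"]):
        return False
    state["index"] += 1
    return True


def failover_tkey(state, reachable):
    """Try the TKEY exchange against each master address in turn.

    Returns (address_that_answered_or_None, list_of_addresses_tried).
    """
    attempts = []
    while True:
        addr = state["masters"][state["index"]]
        attempts.append(addr)
        if addr in reachable:
            return addr, attempts
        # Checking the return value is what stops the loop cleanly once
        # every address has been tried; ignoring it would mean retrying
        # as if another address were still available.
        if not next_master(state):
            return None, attempts


if __name__ == "__main__":
    state = {"masters": ["::1", "127.0.0.1"], "index": 0}
    print(failover_tkey(state, reachable={"127.0.0.1"}))
```

In the reproduction scenario from this ticket (::1 not listening, 127.0.0.1 listening) the sketch tries ::1, fails over, and succeeds on 127.0.0.1; with no reachable address it gives up after the last entry instead of pretending there is another server to try.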
Just to be sure, does your sign-off also cover the other two fixes I pushed to rt45380 (i.e. proper handling of next_master() return value and preferring IPv6 over IPv4 in local-only mode)?
4680. [bug] Fix failing over to another master server address when nsupdate is used with GSS-API. [RT #45380]

9.9.12, 9.10.7, 9.11.3, 9.12.0

The merged fix does not include preferring IPv6 over IPv4 in local-only mode as this breaks some nsupdate system tests (due to the tested named instances listening on 127.0.0.1, but not on ::1). I might fix that in another ticket.
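
For context on the change that was left out, the following Python sketch shows what "preferring IPv6 over IPv4" amounts to for a primed localhost address list, and one plausible reading of why it interferes with tests whose named instances listen only on 127.0.0.1. The helper names prefer_ipv6() and first_attempt_succeeds() are hypothetical, and the assumption that the tests depend on the first address answering is mine, not something stated in this ticket.

```python
import ipaddress


def prefer_ipv6(addresses):
    """Return the addresses with IPv6 entries first.

    sorted() is stable, so the relative order within each address
    family is preserved.
    """
    return sorted(addresses, key=lambda a: ipaddress.ip_address(a).version != 6)


def first_attempt_succeeds(addresses, listening):
    """True if the first address that would be tried is being listened on."""
    return bool(addresses) and addresses[0] in listening


if __name__ == "__main__":
    primed = ["127.0.0.1", "::1"]   # order described in this ticket
    listening = {"127.0.0.1"}       # test server without an IPv6 listener
    print(first_attempt_succeeds(primed, listening))
    print(first_attempt_succeeds(prefer_ipv6(primed), listening))
```

With the IPv6-first ordering, the very first request in such a test setup goes to ::1 and has to fail over before reaching the listening address.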