Report information

The Basics
Id: 41908
Status: open
Priority: Medium/Medium
Queue:

People
Owner: Nobody in particular
Cc:
AdminCc:

BugTracker
Version Fixed: (no value)
Version Found: (no value)
Versions Affected: (no value)
Versions Planned: (no value)
Priority: P2 Normal
Severity: S2 Normal
CVSS Score: (no value)
CVE ID: (no value)
Component: BIND Server
Area: bug

Dates
Created: Thu, 10 Mar 2016 07:55:03 -0500
Updated: Thu, 29 Jun 2017 20:07:18 -0400
Closed: Not set




CC: "Tony Finch" <dot@dotat.at>
Subject: Running out of ephemeral TCP ports
Date: Thu, 10 Mar 2016 12:54:59 +0000
To: bind9-bugs@isc.org
From: "Tony Finch" <dot@dotat.at>
I have a basic health check script on my recursive servers:

    #!/bin/sh
    digarg="+time=1 +tries=1 +short cam.ac.uk in loc"
    digout='52 12 19.000 N 0 7 5.000 E 18.00m 10000m 100m 100m'
    for host in 127.0.0.1 ::1
    do
        for proto in +ignore +tcp
        do
            case $(dig @$host $proto $digarg) in
            ($digout) : ok ;;
            (*) exit 1 ;;
            esac
        done
    done
    exit 0

I am running a load test using adns-masterfile as described at
http://fanf.livejournal.com/141030.html

This load test involves one client using one UDP socket and one TCP
socket. The client is running on a different machine connecting over a
LAN.

The server is running

    Linux 3.13.0-77-generic #121-Ubuntu

    BIND 9.10.3-P4+0-large <id:03b54c5>
    built by make with '--enable-threads' '--enable-getifaddrs'
    '--with-ecdsa=yes' '--with-geoip=no' '--with-gost=no'
    '--with-gssapi=no' '--with-idn=no' '--with-iconv=no'
    '--with-libjson=yes' '--with-libxml2=yes' '--with-openssl=yes'
    '--with-pkcs11=no' '--with-python=yes' '--with-readline=yes'
    '--with-tuning=large' '--prefix=/home/named/BIND/9.10.3-P4+0'
    '--mandir=/home/named/BIND/9.10.3-P4+0/man'
    '--localstatedir=/home/named/var' '--sysconfdir=/home/named/etc'

The server accumulates a lot of completed TCP connections in TIME_WAIT.
When `netstat -an | grep -c TIME_WAIT` gets over 28,000 then the health
check script starts to fail, because `dig` cannot open a TCP connection -
it fails with

    dig: isc_socket_bind: address in use

I think this means that `dig` needs to use ISC_SOCKET_REUSEADDRESS and I
suspect that `named` might need some attention in this area as well.

Tony.
-- 
f.anthony.n.finch <dot@dotat.at> http://dotat.at/
Viking, North Utsire: Southerly 5 to 7. Moderate or rough. Mainly fair.
Good, occasionally poor.
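The reported threshold of just over 28,000 is close to the size of the default Linux ephemeral port range on kernels of that era (32768 to 61000). That the server used the default range is an assumption; the report does not say. A minimal sketch of the arithmetic, together with a rough check that compares a TIME_WAIT count against the range size, might look like this. The function names are hypothetical, and the comparison is a heuristic rather than a diagnosis:

```shell
#!/bin/sh
# Number of ports in the inclusive range LOW..HIGH.
port_range_size() {
    echo $(( $2 - $1 + 1 ))
}

# Count lines mentioning TIME_WAIT in `netstat -an` style output on stdin.
# grep -c exits non-zero when the count is 0, so mask that status.
count_time_wait() {
    grep -c TIME_WAIT || true
}

# Succeed when the TIME_WAIT count has reached the size of the range.
# Usage: netstat -an | ports_exhausted LOW HIGH
ports_exhausted() {
    limit=$(port_range_size "$1" "$2")
    used=$(count_time_wait)
    [ "$used" -ge "$limit" ]
}
```

For example, `netstat -an | ports_exhausted 32768 61000` would succeed once the count reaches 28233, under the assumed default range.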
There are a lot of articles about TCP sockets waiting in TIME_WAIT. I believe I read the first one 30 years ago... The problem is not in applications but in the kernel. BTW, as you use Linux, there are some specific tunings which solve it.

As for adding REUSEADDR, it has a bad side effect: some traffic not meant for dig or named can be caught by accident, because REUSEADDR explicitly allows port collisions... And we got complaints about this issue when named used this socket option without care.
CC: "Tony Finch" <dot@dotat.at>
Subject: Re: [ISC-Bugs #41908] Running out of ephemeral TCP ports
Date: Thu, 10 Mar 2016 13:58:33 +0000
To: "BIND9 Bugs via RT" <bind9-bugs@isc.org>
From: "Tony Finch" <dot@dotat.at>
I have changed the health check script so it uses

    dig -b localhost @localhost

which makes it set SO_REUSEADDR, and this allows its connections to
succeed when they fail without the -b. That sort-of confirms my belief
that this is a bug.

There might be tuning I can do elsewhere to improve the situation...

Tony.
-- 
f.anthony.n.finch <dot@dotat.at> http://dotat.at/
Southeast Iceland: Southerly veering southwesterly 6 to gale 8,
occasionally severe gale 9 in west. Rough or very rough, becoming very
rough or high. Rain or snow, then snow showers. Moderate or good,
occasionally very poor.
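The revised script itself is not included in the ticket. A minimal sketch of how the original loop might be adapted follows, assuming the source address given to `-b` is simply the same loopback address being queried. `DIG` and `health_check` are hypothetical names; `DIG` exists only so the loop can be exercised without a live resolver.

```shell
#!/bin/sh
# Sketch of the health check with an explicit source address (dig -b).
# DIG may be overridden for testing; by default it is the real dig.
: "${DIG:=dig}"

digarg="+time=1 +tries=1 +short cam.ac.uk in loc"
digout='52 12 19.000 N 0 7 5.000 E 18.00m 10000m 100m 100m'

health_check() {
    for host in 127.0.0.1 ::1
    do
        for proto in +ignore +tcp
        do
            # -b sets the source address of the query to $host.
            case $($DIG -b "$host" "@$host" $proto $digarg) in
            ("$digout") : ok ;;
            (*) return 1 ;;
            esac
        done
    done
    return 0
}
```

Returning a status from a function, rather than calling `exit` as the original script does, keeps the check reusable from a wrapper such as `health_check || exit 1`.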
CC: "Tony Finch" <dot@dotat.at>
Subject: Re: [ISC-Bugs #41908] Running out of ephemeral TCP ports
Date: Thu, 10 Mar 2016 15:20:05 +0000
To: "Francis Dupont via RT" <bind9-bugs@isc.org>
From: "Tony Finch" <dot@dotat.at>
Francis Dupont via RT <bind9-bugs@isc.org> wrote:

> BTW as you use Linux there are some specific tunings
> which solve it.

Hmm. I have set net.ipv4.tcp_tw_reuse=1 which is supposed to help, but
`dig` still fails as before. I have also increased the ephemeral port
range which just postpones the problem a few seconds.

> About to add a REUSEADDR it has a bad side effect as some traffic not
> for dig or named can be caught by accident: REUSEADDR explicitly allows
> port collision... And we got complains about this issue when named used
> this socket option without care.

Surely it shouldn't catch unwanted traffic if it is an outgoing TCP
connection? (as opposed to UDP)

Tony.
-- 
f.anthony.n.finch <dot@dotat.at> http://dotat.at/
Southeast Iceland: Southerly veering southwesterly 6 to gale 8,
occasionally severe gale 9 in west. Rough or very rough, becoming very
rough or high. Rain or snow, then snow showers. Moderate or good,
occasionally very poor.
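To confirm which values of the two settings mentioned above are actually in effect, they can be read back from `/proc/sys` on Linux. This is a minimal sketch; `show_tcp_tuning` and the `PROC_SYS` override are hypothetical names, the latter added only so the function can be pointed at a test directory.

```shell
#!/bin/sh
# Print tcp_tw_reuse and the ephemeral port range with its size.
# PROC_SYS defaults to /proc/sys; override it to read from elsewhere.
show_tcp_tuning() {
    base=${PROC_SYS:-/proc/sys}/net/ipv4
    # ip_local_port_range holds two whitespace-separated numbers.
    read -r low high < "$base/ip_local_port_range"
    reuse=$(cat "$base/tcp_tw_reuse")
    echo "tcp_tw_reuse=$reuse ports=$low-$high count=$(( $high - $low + 1 ))"
}
```

Running `show_tcp_tuning` before and after a change made with `sysctl -w` shows whether the new values took effect.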