Bug #45637 for bind9-public: Backport performance work to 9.11, 9.10, 9.9 branches

Fri, 28 Jul 2017 09:23:42 -0400 Mukund Sivaraman <muks@isc.org> - Ticket created

To:	bind9-confidential@isc.org
From:	"Mukund Sivaraman" <muks@isc.org>
Subject:	Backport performance work to 9.11, 9.10, 9.9 branches
Date:	Fri, 28 Jul 2017 18:53:13 +0530

This is a ticket to backport of bits of change 4605 to release branches: 4605. [performance] Improve performance for delegation heavy answers and also general query performance. Removes the acache feature that didn't significantly improve performance. Adds a glue cache. Removes additional-from-cache and additional-from-auth features. Enables minimal-responses by default. Improves performance of compression code, owner case restoration, hash function, etc. Uses inline buffer implementation by default. Many other performance changes and fixes. [RT #44029] The bits to be backported will be decided when preparing the branch, with the requirement that there must be no behavioral changes in existing releaes branches. The backport branch(es) should be put to review to confirm this. Mukund

Fri, 28 Jul 2017 11:46:44 -0400 Vicky Risk <vicky@isc.org> - Queue changed from #6 to bind9-public

Fri, 28 Jul 2017 11:46:44 -0400 Vicky Risk <vicky@isc.org> - Severity S3 Low added

Fri, 28 Jul 2017 11:46:45 -0400 Vicky Risk <vicky@isc.org> - Area bug changed to Other

Fri, 28 Jul 2017 11:46:45 -0400 Vicky Risk <vicky@isc.org> - Priority P1 High added

Wed, 02 Aug 2017 00:06:22 -0400 Mukund Sivaraman <muks@isc.org> - Queue changed from bind9-public to #3

Mon, 18 Sep 2017 16:13:23 -0400 Vicky Risk <vicky@isc.org> - Versions Planned 9.10.6-S added

Mon, 18 Sep 2017 16:16:38 -0400 Vicky Risk <vicky@isc.org> - Versions Planned 9.10.6-S changed to 9.10.6-S, 9.9.12, 9.10.7, 9.11.3

Tue, 24 Oct 2017 15:23:45 -0400 Mukund Sivaraman <muks@isc.org> - Correspondence added

Subject:	Re: [ISC-Bugs #45637] Backport performance work to 9.11, 9.10, 9.9 branches
To:	"Mukund Sivaraman via RT" <bind9-confidential@isc.org>
Date:	Wed, 25 Oct 2017 00:53:34 +0530
From:	"Mukund Sivaraman" <muks@isc.org>

Please review rt45637. This branch has the following bits from #44029: * Set fillcount and freemax on msg->namepool and msg->rdspool commit 88fb5e9fc53e16a100bc1181ad5e7a64d69dab9e A bug was fixed in setting freemax on m->rdspool. fillcount() is also set to avoid lock contention in isc___mempool_get(), and the default sizes have been bumped a bit to help the COM delegation case that has 29 RRs. * avoid locking mctx when tracklines isn't in use commit 5bb7e77365117e54ecd55a29d6856aa0de04f3a3 Avoids unnecessary MCTXLOCK * Fixup rwlock tracing when using atomics commit b315f40288649016017c2c176d084823cf0c6583 isc_rwlock tracing code has been fixed for the common case where atomic instructions are used in the rwlock implementation. * Reduce operations in inline buffer functions commit 8591709fa02411f5682d7f4401fab020d0b79152 This removes various redundant boolean AND operations that are not necessary as the value is truncated when written into the shorter variable. It also fixes a compiler warning in dns_ssu_external_match() when the inline buffer macros are used. * Use inline buffer implementation commit df48125be8b901b3532abf6261b09fa509f1a057 This turns on use of inline buffer functions. These inline buffer functions do less checking, and because they're inlined they're a lot faster. I have an argument for adding some checks because we've had some CVEs found due to assertion failure in buffer.c recently. But these functions (especially the putuint16) are called a *LOT* and need to be fast to get named to perform well. Every additional check here lowers performance. * Reduce work in hash function commit 291305b70dece58735e11d07e68c31d1dfa6959c The hash function is called a very very large number of times. The pthread_once calls have a performance penalty, so they're skipped unless necessary. The commit also removes various redundant boolean AND operations that are not necessary as the value is truncated when written into the shorter variable. * Improve the performance of rdataset_getownercase() in RBTDB commit 5c8623f1257d6597e9fd45f47d90fbcd34650814 Owner case recording was added in 9.11. As delegations have large numbers of RRs in the reply, getownercase() is called a huge number of times. The implementation in 9.11 was very slow (and caused a performance dip from 9.10 to 9.11). There's no way to avoid this, so its performance was improved and a shortcut fastpath was also added for the common case where the owner case is entirely lowercase. * Improve the performance of dns_name_getlabelsequence() commit 17bf42955e8575c07e399ea2bba4dd0fbd4ca83f dns_name_getlabelsequence() is called a very large number of times, but mainly via the name compression codepath. The name compression codepath calls dns_name_getlabelsequence() in a specific way (it always attempts a label sequence of the rest of a name starting from some offset), so performance of dns_name_getlabelsequence() was tweaked to be fast for this case. * Improve performance of compression code commit 3bc6a92c08abead11d7304e90631b9d7f88d48f5 dns_compress_findglobal() is *very* high on the list of functions that are called often. It needs to be fast. This commit makes several changes to the compression code to make it run faster. Mainly: + it doesn't perform a wasteful hash the whole name being checked any more, but looks up the hash of only the first octet of the first label of the name from a lookup table. + it inlines calls to dns_name_equal() and dns_name_caseequal() to avoid re-running some code it needs to run for calculating the hash. Inlining this code is also necessary for improving the speed of dns_compress_findglobal(). * Clean up tracklines code commit b7b18a1124e7b27b59d572850614e4b616c13899 Just a cleanup * Iterate one more time in hand-unrolled loops commit 4ff39ae7d8e059d982cf48be6b5427a046dcc7bf Tiny fix to make it iterate one more time. (These hand-unrolled loops need to go, but we need to start using GCC attributes (per-function optimization flags). Adding -funroll-loops globally reduces performance.) Mukund

Tue, 24 Oct 2017 15:23:45 -0400 The RT System itself - Status changed from 'new' to 'open'

Tue, 24 Oct 2017 15:26:38 -0400 Mukund Sivaraman <muks@isc.org> - Status changed from 'open' to 'review'

Tue, 24 Oct 2017 15:26:39 -0400 Mukund Sivaraman <muks@isc.org> - Queue changed from #3 to bind9-public

Wed, 25 Oct 2017 16:01:03 -0400 Evan Hunt <Evan_Hunt@isc.org> - Correspondence added

All the changes look good to me. I think I saw some discussion over whether we want to backport performance changes or leave things as they are so that people have an incentive to upgrade to 9.12. I don't know if that was ever resolved. (In my opinion we shouldn't withhold fixes from any branch that don't affect compatibility or stability, and these should definitely be backported. I'm just mentioning it because I'm not sure there was consensus on the point.)

Fri, 27 Oct 2017 00:40:10 -0400 Evan Hunt <Evan_Hunt@isc.org> - Given to Mukund Sivaraman <muks@isc.org>

Mon, 27 Nov 2017 00:26:14 -0500 Mukund Sivaraman <muks@isc.org> - Given to Evan Hunt <Evan_Hunt@isc.org>

Mon, 27 Nov 2017 01:49:54 -0500 Evan Hunt <Evan_Hunt@isc.org> - Correspondence added

On Mon Nov 27 05:25:28 2017, muks wrote: > I've reverted the compression changes in the branch, to avoid any chance > of unexpected problems with broken devices. > > Please check if rt45637 is ready to be merged into 9.11. As near as I can tell there's no remaining trace of the compression change, and I already thought the other changes should be merged. Please go ahead.

Mon, 27 Nov 2017 01:49:55 -0500 Evan Hunt <Evan_Hunt@isc.org> - Given to Mukund Sivaraman <muks@isc.org>

Wed, 06 Dec 2017 01:16:23 -0500 Mukund Sivaraman <muks@isc.org> - Correspondence added

From:	"Mukund Sivaraman" <muks@isc.org>
To:	"Evan Hunt via RT" <bind9-public@isc.org>
Subject:	Re: [ISC-Bugs #45637] Backport performance work to 9.11, 9.10, 9.9 branches
Date:	Wed, 6 Dec 2017 11:46:14 +0530

Merged to v9_11 branch: 4605. [performance] (partial backport) Improve general query performance. Improves performance of owner case restoration, hash function, etc. Uses inline buffer implementation by default. [RT #45637] Mukund

Wed, 06 Dec 2017 01:17:01 -0500 Mukund Sivaraman <muks@isc.org> - Status changed from 'review' to 'resolved'

Wed, 06 Dec 2017 01:17:01 -0500 Mukund Sivaraman <muks@isc.org> - Untaken

Wed, 06 Dec 2017 01:17:01 -0500 Mukund Sivaraman <muks@isc.org> - Versions Planned 9.10.6-S, 9.9.12, 9.10.7, 9.11.3 changed to 9.11.3

Created:	Fri, 28 Jul 2017 09:23:42 -0400
Updated:	Wed, 06 Dec 2017 01:17:02 -0500
Closed:	Wed, 06 Dec 2017 01:17:01 -0500

Bug #45637 for bind9-public: Backport performance work to 9.11, 9.10, 9.9 branches

This bug tracker is no longer active.