Subject: | Fwd: mdig and memory checking |
Date: | Tue, 24 Jan 2017 15:12:56 +1100 |
To: | bind9-bugs@isc.org |
From: | "Mark Andrews" <marka@isc.org> |
CC: | dot@dotat.at |
Subject: | support for large batch jobs with mdig |
Date: | Mon, 23 Jan 2017 16:32:25 +0000 |
To: | bind-workers@isc.org |
From: | "Tony Finch" <dot@dotat.at> |
Marco Davids suggested on Twitter that I should try using mdig for bulk
lookups. I have previously used adns for this (it can handily manage
20kqps single threaded), and I have briefly tried out zdns (which isn't as
efficient).
I tried feeding it the Alexa top 1 million list:
$ sed '1d;s|^[0-9]*,||;s|/.*||' top-1m.csv |
time mdig +noall +answer -f - A @::1
and it failed:
mdig: dns_request_createvia4 failed with quota reached
Looking at the code it tries to send all the queries at once, which is
obviously not going to work for big jobs! (In fact mdig sets a dispatch
quota of 100 which restricts the maximum job size.)
I hacked around a bit to try adding a concurrency limit, along the lines I
have previously done for adns - see the patch below. This seems to work
reasonably OK, e.g.
$ sed '1d;s|^[0-9]*,||;s|/.*||' ~/work/dns/adns/top-1m.csv |
head -100000 |
time ./mdig +continue +noall -C 1000 -f - A @::1
9.74user 0.95system 0:14.41elapsed 74%CPU
A couple of problems, though:
mdig sometimes goes into a spin if I send it a SIGINT and/or fails with
the seemingly spurious error "dns_name_fromtext failed with label too
long". Dunno why - signal handling in BIND is still mysterious to me :-)
It slurps in the entire query list before starting to talk to the DNS
server, which can use a lot of RAM. (about 1kb per query)
I suspect there may be a better way to do this concurrency limit stuff,
making more effective use of the dispatch quota - at the moment the code
is enforcing the concurrency limit in two places, in mdig and in the dns
library.
If I give mdig more than a few hundred thousand queries it seems to hang
at the end of the job. On further investigation, there is something
accidentally quadratic in the cleanup code!
100k 0:11.45
200k 0:24.01
300k 0:40.44
400k 1:01.51
500k 1:27.97
If I add an _exit() after isc_app_run() returns, before the query list is
free()d, I get:
100k 0:10.23
200k 0:15.87
300k 0:20.08
400k 0:26.08
500k 0:29.89
1M 0:59.96 (adns can resolve this list in about 46s)
If I _exit() just after the query list is free()d, the times are similar
to the first list. Looks like something slooow in the memory manager.
---
bin/tools/mdig.c | 44 +++++++++++++++++++++++++++++++++++++-------
1 file changed, 37 insertions(+), 7 deletions(-)
diff --git a/bin/tools/mdig.c b/bin/tools/mdig.c
index eaf477c..034bf91 100644
--- a/bin/tools/mdig.c
+++ b/bin/tools/mdig.c
@@ -51,6 +51,12 @@
#include <bind9/getaddresses.h>
+#if 0
+#define TRACE(...) printf(__VA_ARGS__)
+#else
+#define TRACE(...)
+#endif
+
#define CHECK(str, x) { \
if ((x) != ISC_R_SUCCESS) { \
fprintf(stderr, "mdig: %s failed with %s\n", \
@@ -110,6 +116,7 @@ static in_port_t port = 53;
static isc_dscp_t dscp = -1;
static unsigned char cookie_secret[33];
static int onfly = 0;
+static int concurrency = 100;
static char hexcookie[81];
struct query {
@@ -140,6 +147,9 @@ struct query {
};
static struct query default_query;
static ISC_LIST(struct query) queries;
+static struct query *pending;
+
+static void sendqueries(isc_task_t *task, isc_event_t *event);
#define EDNSOPTS 100U
/*% opcode text */
@@ -477,8 +487,18 @@ cleanup:
dns_request_destroy(&reqev->request);
isc_event_free(&event);
- if (--onfly == 0)
+ --onfly;
+ if (pending != NULL && (onfly < concurrency * 9/10 || onfly < 10)) {
+ /* XXX is this the right event type? copied from app.c ... */
+ event = isc_event_allocate(mctx, task, ISC_APPEVENT_SHUTDOWN,
+ sendqueries, pending,
+ sizeof(*event));
+ if (event == NULL)
+ CHECK("isc_event_allocate", ISC_R_NOMEMORY);
+ isc_task_send(task, &event);
+ } else if (onfly == 0)
isc_app_shutdown();
+
return;
}
@@ -695,19 +715,23 @@ sendquery(struct query *query, isc_task_t *task)
static void
sendqueries(isc_task_t *task, isc_event_t *event)
{
- struct query *query = (struct query *)event->ev_arg;
+ struct query *query = pending;
isc_event_free(&event);
- while (query != NULL) {
+ while (query != NULL && onfly < concurrency) {
struct query *next = ISC_LIST_NEXT(query, link);
sendquery(query, task);
query = next;
}
+ TRACE("onfly = %d pending = %d\n", onfly, query != NULL);
- if (onfly == 0)
+ if (query == NULL && onfly == 0)
isc_app_shutdown();
+ else
+ pending = query;
+
return;
}
@@ -740,6 +764,7 @@ help(void) {
" global opt is one of:\n"
" -4 (use IPv4 query transport only)\n"
" -6 (use IPv6 query transport only)\n"
+" -C count (concurrency limit, default 100)\n"
" -b address[#port] (bind to source address/port)\n"
" -p port (specify port number)\n"
" +[no]dscp[=###] (Set the DSCP value to ### [0..63])\n"
@@ -1624,6 +1649,11 @@ dash_option(const char *option, char *next, struct query *query,
*hash = '#';
have_src = ISC_TRUE;
return (value_from_next);
+ case 'C':
+ concurrency = atoi(value);
+ if (concurrency < 1)
+ fatal("concurrency must be greater than zero");
+ return (value_from_next);
case 'c':
tr.base = value;
tr.length = strlen(value);
@@ -1931,7 +1961,7 @@ isc_mem_debugging = ISC_MEM_DEBUGRECORD;
dispatchvx = NULL;
RUNCHECK(dns_dispatch_getudp(dispatchmgr, socketmgr, taskmgr,
have_src ? &srcaddr : &bind_any,
- 4096, 100, 100, 17, 19,
+ 4096, concurrency, concurrency, 17, 19,
attrs, attrmask, &dispatchvx));
requestmgr = NULL;
RUNCHECK(dns_requestmgr_create(mctx, timermgr, socketmgr,
@@ -1943,8 +1973,8 @@ isc_mem_debugging = ISC_MEM_DEBUGRECORD;
view = NULL;
RUNCHECK(dns_view_create(mctx, 0, "_test", &view));
- query = ISC_LIST_HEAD(queries);
- RUNCHECK(isc_app_onrun(mctx, task, sendqueries, query));
+ pending = ISC_LIST_HEAD(queries);
+ RUNCHECK(isc_app_onrun(mctx, task, sendqueries, pending));
(void)isc_app_run();
--
2.10.1.445.g3cdd5d1
_______________________________________________
bind-workers mailing list
bind-workers@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-workers
Subject: | ISC_MEM_TRACKLINES is accidentally quadratic, was Re: support for large batch jobs with mdig |
Date: | Mon, 23 Jan 2017 19:34:13 +0000 |
To: | bind-workers@isc.org |
From: | "Tony Finch" <dot@dotat.at> |
Tony Finch <dot@dotat.at> wrote:
>
> If I give mdig more than a few hundred thousand queries it seems to hang
> at the end of the job. On further investigation, there is something
> accidentally quadratic in the cleanup code!
This accidentally quadratic performance is to do with ISC_MEM_TRACKLINES -
turn that off and freeing is instant! But it needs a fix...
---
lib/dns/openssl_link.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/lib/dns/openssl_link.c b/lib/dns/openssl_link.c
index 29facf3..4e0976e 100644
--- a/lib/dns/openssl_link.c
+++ b/lib/dns/openssl_link.c
@@ -122,10 +122,17 @@ id_callback(void) {
#endif
#if OPENSSL_VERSION_NUMBER < 0x10100000L || defined(LIBRESSL_VERSION_NUMBER)
-#define FLARG_PASS , __FILE__, __LINE__
+
#define FLARG
#define FILELINE
+#if ISC_MEM_TRACKLINES
+#define FLARG_PASS , __FILE__, __LINE__
#else
+#define FLARG_PASS
+#endif
+
+#else
+
#define FLARG , const char *file, int line
#define FILELINE , __FILE__, __LINE__
#if ISC_MEM_TRACKLINES
--
2.10.1.445.g3cdd5d1
_______________________________________________
bind-workers mailing list
bind-workers@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-workers
CC: | bind-workers@isc.org |
Subject: | Re: ISC_MEM_TRACKLINES is accidentally quadratic, was Re: support for large batch jobs with mdig |
To: | "Tony Finch" <dot@dotat.at> |
Date: | Tue, 24 Jan 2017 06:55:16 +1100 |
From: | "Mark Andrews" <marka@isc.org> |
In message <alpine.DEB.2.11.1701231932080.23062@grey.csi.cam.ac.uk>, Tony Finch writes:
> Tony Finch <dot@dotat.at> wrote:
> >
> > If I give mdig more than a few hundred thousand queries it seems to hang
> > at the end of the job. On further investigation, there is something
> > accidentally quadratic in the cleanup code!
Remove "isc_mem_debugging = ISC_MEM_DEBUGRECORD;" from the start
of main in mdig.c. It got left in from developement. The flag is very
good at reporting what memory is leaking but it isn't the fastest
code.
mdig needs dig's preparse args to set isc_mem_debugging.
Mark
> This accidentally quadratic performance is to do with ISC_MEM_TRACKLINES -
> turn that off and freeing is instant! But it needs a fix...
>
> ---
> lib/dns/openssl_link.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/lib/dns/openssl_link.c b/lib/dns/openssl_link.c
> index 29facf3..4e0976e 100644
> --- a/lib/dns/openssl_link.c
> +++ b/lib/dns/openssl_link.c
> @@ -122,10 +122,17 @@ id_callback(void) {
> #endif
>
> #if OPENSSL_VERSION_NUMBER < 0x10100000L || defined(LIBRESSL_VERSION_NUMBER)
> -#define FLARG_PASS , __FILE__, __LINE__
> +
> #define FLARG
> #define FILELINE
> +#if ISC_MEM_TRACKLINES
> +#define FLARG_PASS , __FILE__, __LINE__
> #else
> +#define FLARG_PASS
> +#endif
> +
> +#else
> +
> #define FLARG , const char *file, int line
> #define FILELINE , __FILE__, __LINE__
> #if ISC_MEM_TRACKLINES
> --
> 2.10.1.445.g3cdd5d1
>
> _______________________________________________
> bind-workers mailing list
> bind-workers@lists.isc.org
> https://lists.isc.org/mailman/listinfo/bind-workers
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: marka@isc.org
CC: | bind-workers@isc.org |
Subject: | Re: ISC_MEM_TRACKLINES is accidentally quadratic, was Re: support for large batch jobs with mdig |
Date: | Tue, 24 Jan 2017 06:55:16 +1100 |
To: | "Tony Finch" <dot@dotat.at> |
From: | "Mark Andrews" <marka@isc.org> |
In message <alpine.DEB.2.11.1701231932080.23062@grey.csi.cam.ac.uk>, Tony Finch writes:
> Tony Finch <dot@dotat.at> wrote:
> >
> > If I give mdig more than a few hundred thousand queries it seems to hang
> > at the end of the job. On further investigation, there is something
> > accidentally quadratic in the cleanup code!
Remove "isc_mem_debugging = ISC_MEM_DEBUGRECORD;" from the start
of main in mdig.c. It got left in from developement. The flag is very
good at reporting what memory is leaking but it isn't the fastest
code.
mdig needs dig's preparse args to set isc_mem_debugging.
Mark
> This accidentally quadratic performance is to do with ISC_MEM_TRACKLINES -
> turn that off and freeing is instant! But it needs a fix...
>
> ---
> lib/dns/openssl_link.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/lib/dns/openssl_link.c b/lib/dns/openssl_link.c
> index 29facf3..4e0976e 100644
> --- a/lib/dns/openssl_link.c
> +++ b/lib/dns/openssl_link.c
> @@ -122,10 +122,17 @@ id_callback(void) {
> #endif
>
> #if OPENSSL_VERSION_NUMBER < 0x10100000L || defined(LIBRESSL_VERSION_NUMBER)
> -#define FLARG_PASS , __FILE__, __LINE__
> +
> #define FLARG
> #define FILELINE
> +#if ISC_MEM_TRACKLINES
> +#define FLARG_PASS , __FILE__, __LINE__
> #else
> +#define FLARG_PASS
> +#endif
> +
> +#else
> +
> #define FLARG , const char *file, int line
> #define FILELINE , __FILE__, __LINE__
> #if ISC_MEM_TRACKLINES
> --
> 2.10.1.445.g3cdd5d1
>
> _______________________________________________
> bind-workers mailing list
> bind-workers@lists.isc.org
> https://lists.isc.org/mailman/listinfo/bind-workers
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: marka@isc.org
_______________________________________________
bind-workers mailing list
bind-workers@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-workers
Subject: | Re: ISC_MEM_TRACKLINES is accidentally quadratic, was Re: support for large batch jobs with mdig |
Date: | Mon, 23 Jan 2017 19:57:44 +0000 |
To: | bind-workers@isc.org |
From: | "Tony Finch" <dot@dotat.at> |
Tony Finch <dot@dotat.at> wrote:
>
> This accidentally quadratic performance is to do with ISC_MEM_TRACKLINES -
> turn that off and freeing is instant! But it needs a fix...
The other fix is to remove what looks (from the weird indentation) like
some stray debugging code that got committed by accident.
And with mdig unhobbled, it manages to process the whole top-1m list in
less than 34s, nearly 30kqps. Not bad!
---
bin/tools/mdig.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/bin/tools/mdig.c b/bin/tools/mdig.c
index 034bf91..0839845 100644
--- a/bin/tools/mdig.c
+++ b/bin/tools/mdig.c
@@ -1901,7 +1901,6 @@ main(int argc, char *argv[]) {
fatal("could not find either IPv4 or IPv6");
mctx = NULL;
-isc_mem_debugging = ISC_MEM_DEBUGRECORD;
RUNCHECK(isc_mem_create(0, 0, &mctx));
lctx = NULL;
--
2.10.1.445.g3cdd5d1
_______________________________________________
bind-workers mailing list
bind-workers@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-workers
CC: | bind-workers@isc.org |
Subject: | Re: ISC_MEM_TRACKLINES is accidentally quadratic, was Re: support for large batch jobs with mdig |
Date: | Mon, 23 Jan 2017 20:02:06 +0000 |
To: | "Mark Andrews" <marka@isc.org> |
From: | "Tony Finch" <dot@dotat.at> |
Mark Andrews <marka@isc.org> wrote:
>
> Remove "isc_mem_debugging = ISC_MEM_DEBUGRECORD;" from the start
> of main in mdig.c.
Thanks :-) I worked it out eventually!
Tony.
--
f.anthony.n.finch <dot@dotat.at> http://dotat.at/ - I xn--zr8h punycode
Lundy, Fastnet, Irish Sea: South or southwest 4 or 5, increasing 6 at times.
Slight or moderate, becoming rough except in Irish Sea, occasionally very
rough in Fastnet. Occasional rain. Good, occasionally moderate.