Continue answering queries with expired cached answers if the authority is not answering, after first querying the authority. The goal is to allow resolvers to keep responding even when the authority for the data is temporarily unresponsive, for example due to an attack.

We want a (configurable) timer for how long to wait for the authority to respond before serving stale data, and a (configurable) timer for how long after its expiry a record may still be served.

The expectation is that this would be enabled system-wide. Not sure whether this should default to on, or what the default settings for the timers should be.

The pending IETF draft "Serving Stale Data to Improve DNS Resiliency" (draft-tale-dnsop-serve-stale-00) is a start at describing how this should work.

The attached contributed patch from Akamai may or may not be useful. The email that accompanied it is excerpted below, and the current draft of the IETF paper is attached.

"Here it is, the long awaited patch! There are two attachments, the first being the patch and the second the current version of the Internet-Draft that I'm waiting to submit after Warren is done with the editing pen.

Some things to note about the patch:

* Per the comment in the draft about not evicting CNAMEs in the cache when other data arrives, this can result in unexpected behaviour once everything goes stale and the CNAME comes back into play after new authoritative data had changed the zone. We had an incident related to this. This has not been addressed in the patch; if I had, I was leaning in the direction of checking for an existing CNAME conflict when adding new data and evicting the old data.

* This does not handle using stale glue really, which is a shame. I believe it should, but I just didn't get into messing around with the adb. Personally I think if you have a stale delegation for example.com you should still be able to use it to resolve names.

* There's some work in there related to reloading the dump file, which I realize was meant only for testing and not a production feature even before this came along. We had a thought that this would also improve generalized resiliency to preserve data across restarts, but since the dump load doesn't have a provision for loading negative answers I didn't finish that. The timestamps in the dump file are still written to reflect stale age though, which could be really surprising for someone looking at it and seeing much longer TTLs than they expect.

Sorry again that this took so long, but I hope that it is useful for you."
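
To make the request concrete, here is a rough sketch of how the two timers described at the top of this request might interact. It is not taken from the contributed patch or from the draft. The names `choose_answer`, `client_timeout`, and `max_stale`, and the default values, are hypothetical and only illustrate the intended decision order: fresh cache first, then the authority, then stale data once the wait timer has run out, and failure once the record is too old.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CacheEntry:
    data: str
    expires_at: float  # absolute time, in seconds, at which the TTL runs out


def choose_answer(
    entry: Optional[CacheEntry],
    now: float,
    authority_answer: Optional[str],
    waited: float,
    client_timeout: float = 2.0,   # hypothetical default: wait before going stale
    max_stale: float = 3600.0,     # hypothetical default: how long past expiry is OK
) -> Tuple[str, Optional[str]]:
    """Return (source, data) for one query, given what is known so far."""
    # An unexpired cache entry is served directly.
    if entry is not None and now < entry.expires_at:
        return ("fresh", entry.data)

    # The authority is always asked first; a real answer wins over stale data.
    if authority_answer is not None:
        return ("authoritative", authority_answer)

    # No answer yet: keep waiting until the first timer runs out.
    if waited < client_timeout:
        return ("wait", None)

    # The authority is not answering: fall back to stale data, but only
    # while the record is within the second timer.
    if entry is not None and now - entry.expires_at <= max_stale:
        return ("stale", entry.data)

    return ("fail", None)


if __name__ == "__main__":
    entry = CacheEntry(data="192.0.2.1", expires_at=100.0)
    # Expired 50 seconds ago, authority silent for the full wait period.
    print(choose_answer(entry, now=150.0, authority_answer=None, waited=2.0))
```

In this sketch the stale record is only ever a fallback, so a responsive authority is never bypassed. Whether a refresh should keep running in the background after a stale answer is sent is left open here.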