Report information
The Basics
Id: 45540
Status: open
Priority: 50/50
Queue: dhcp-public

People
Owner: Nobody in particular
Requestors: Vladimir Kunschikov <kunschikov@gmail.com>
Cc:
AdminCc:

Bug Information
Version Fixed: (no value)
Version Found: (no value)
Versions Affected: (no value)
Versions Planned: (no value)
Priority: (no value)
Severity: (no value)
CVSS Score: (no value)
CVE ID: (no value)
Component: (no value)
Area: (no value)

Dates
Created: Tue, 11 Jul 2017 04:59:50 -0400
Updated: Thu, 21 Dec 2017 06:42:24 -0500
Closed: Not set



This bug tracker is no longer active.

Please go to our Gitlab to submit issues (both feature requests and bug reports) for active projects maintained by Internet Systems Consortium (ISC).

Due to security and confidentiality requirements, full access is limited to the primary maintainers.

Date: Tue, 11 Jul 2017 11:59:31 +0300
From: "Vladimir Kunschikov" <kunschikov@gmail.com>
To: dhcp-bugs@isc.org
Subject: dhclient doesn't RENEW at proper moment when system time shifts back
Hello,
  we have run into the following network setup problem.
Shifting the system time into the past makes dhclient sleep for far too long.
Step-by-step reproduction:

1. obtain an address via dhclient at timestamp t1;
2. shift the system time into the past, to timestamp t2;
3. bug: dhclient misses the renew operation. It sleeps in 'select' for "lease time"/2 + (t1 - t2) seconds.

The time shift can be large, for example when local time is set up on guest Linux OSes under some hypervisors. As a consequence, the interface loses its address.

--
Best regards,
Vladimir Kunschikov
Lead software developer
InfoTeCS JSC
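
The sleep duration from step 3 of the report can be checked with a little arithmetic. The following is a minimal Python sketch, not dhclient code: it assumes, as the report describes, that the renew time is kept as an absolute wall-clock timestamp and that the timeout handed to select() is that timestamp minus the current wall-clock time. The function and variable names are made up for illustration.

```python
def select_timeout(wake_at: float, now: float) -> float:
    """Relative timeout for select(): absolute wake-up time minus current wall-clock time."""
    return max(0.0, wake_at - now)


lease_time = 60.0          # seconds, hypothetical short lease
t1 = 1_000_000.0           # wall-clock time when the lease was obtained
t2 = t1 - 3600.0           # wall clock after being stepped back one hour

renew_at = t1 + lease_time / 2   # renew scheduled at half the lease time

# Without a clock step, the sleep lasts lease_time / 2.
print(select_timeout(renew_at, now=t1))   # 30.0

# After the step back, the same absolute deadline is much further away:
# lease_time / 2 + (t1 - t2) seconds.
print(select_timeout(renew_at, now=t2))   # 3630.0
```

With a 60-second lease, the lease has long expired by the time the 3630-second sleep ends.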

Date: Tue, 11 Jul 2017 12:30:06 +0300
From: "Vladimir Kunschikov" <kunschikov@gmail.com>
To: dhcp-confidential@isc.org
Subject: Re: [ISC-Bugs #45540] dhclient doesn't RENEW at proper moment when system time shifts back
We are using CentOS as our base system and can't switch to BSD. I am not sure which signal can be sent to dhclient to make it recalculate its sleep/wake timestamps. select() can be interrupted by a signal, but after that dhclient goes back to sleep until the next wake-up date, which is calculated not from the current time but from the scheduled one.
I am working around this problem by replacing gettimeofday() with clock_gettime(CLOCK_MONOTONIC_RAW, ...) for dhclient. An LD_PRELOAD library that replaces gettimeofday() does the trick. A proper solution will require modifying the dhclient scheduler.
Thanks for the quick reply.
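
The principle behind this workaround, measuring the wait against a clock that cannot be stepped, can be sketched in Python. This is only an illustration of the idea, not the LD_PRELOAD library and not dhclient's scheduler; `RenewTimer` and `FakeClock` are hypothetical names. Python's `time.monotonic()` stands in for the role that `clock_gettime(CLOCK_MONOTONIC_RAW, ...)` plays in the C workaround (it is a monotonic clock, though not necessarily the RAW variant).

```python
import time
from typing import Callable


class RenewTimer:
    """Deadline kept on a caller-supplied clock (monotonic by default)."""

    def __init__(self, delay: float, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._deadline = clock() + delay

    def remaining(self) -> float:
        """Seconds left until the deadline, usable as a select() timeout."""
        return max(0.0, self._deadline - self._clock())


class FakeClock:
    """Test clock whose value can be moved by hand to simulate a time step."""

    def __init__(self, start: float):
        self.now = start

    def __call__(self) -> float:
        return self.now


# Simulate a wall clock and a monotonic clock side by side.
wall = FakeClock(1_000_000.0)
mono = FakeClock(500.0)

wall_timer = RenewTimer(30.0, clock=wall)   # deadline tied to wall-clock time
mono_timer = RenewTimer(30.0, clock=mono)   # deadline tied to monotonic time

# Ten seconds pass, then the wall clock is stepped back by one hour.
mono.now += 10.0
wall.now += 10.0 - 3600.0

print(wall_timer.remaining())   # 3620.0: the renew is pushed an hour into the future
print(mono_timer.remaining())   # 20.0: unaffected by the step
```

Passing the clock in as a parameter keeps the sketch testable without touching the real system time.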

2017-07-11 12:09 GMT+03:00 Francis Dupont via RT <dhcp-confidential@isc.org>:
On Tue Jul 11 08:59:50 2017, kunschikov@gmail.com wrote:
> Hello,
>   we have met the following network setup problem.
> A shift of the system time to the past leads to the incorrect long sleep of
> the dhclient.
> Step by step reproduction:
>
> 1. obtain address via dhclient at timestamp t1;
> 2. shift system time  to the past to timestamp t2;
> 3.  bug: dhclient misses renew operation. it is sleeping in 'select' for
> "lease time"/2 + (t1 - t2) seconds.
>
> Time shift can be great, for example in setting up local time on guest
> linux OSes in some hypervisors.  It consequently leads to the losing
> address of the interface.

=> if you use ISC DHCP 4.1 you can try 4.3, which has a different
timer library. You can also try to send a signal to the server in
order to make it break out of and restart the select loop.
Note I don't know about Linux, but BSDs provide an option to shift
system time without stepping it. Of course it does not work
for big shifts.
To finish, I am not surprised that a protocol relying on timing has some
problems with a non-increasing clock...


Subject: Re: dhclient doesn't RENEW at proper moment when system time shifts back [ISC-Bugs #45540]
To: dhcp-confidential@isc.org
Date: Tue, 11 Jul 2017 18:06:11 +0300
From: "Vladimir Kunschikov" <kunschikov@gmail.com>
Thanks again for the quick reply, but the crucial point is here:

> select takes a relative time so if you are lucky it is recomputed
> with the "new" current time.

This recomputation goes wrong.
As can be seen from the output of 'strace -p `pidof dhclient`', select() sleeps for the difference between the current time and the next scheduled time!

It can easily be reproduced by setting the "lease time" on the server to 60 seconds and shifting the time on the client to some point in the past.

More detailed reproduction:
0. dhclient receives its settings
1. it sleeps in select() for lease time/2; strace -p `pidof dhclient` shows a reasonable 15-30 seconds
2. the system time shifts back by an hour
3. dhclient wakes up, discovers the shift, and goes back to sleep for the length of that shift! So if you step the clock back by 80 days, it will sleep for 80 days. Again, this can easily be checked with `strace`.


2017-07-11 16:34 GMT+03:00 Francis Dupont via RT <dhcp-confidential@isc.org>:
On Tue Jul 11 09:30:12 2017, kunschikov@gmail.com wrote:
>       We are using CentOS as base system and can't switch to BSD.

=> I said BSDs because it was invented for 4.3 BSD in 1993
(and I used this system at that time). Note that adjtime() is supported
by Linux too (cf. http://man7.org/linux/man-pages/man3/adjtime.3.html)

> I am not
> sure which signal can be sent to the dhclient in order to recalculate
> sleep/wake timestamps. Select can be canceled on signal reception but after
> that dhclient is going to sleep again till the next date, which is
> calculated not from current time, but from the scheduled one.

=> you can send a signal to the process (but avoid killing it).
select takes a relative time, so if you are lucky it is recomputed
with the "new" current time.

> I am solving this problem by replacing gettimeofday() with
> clock_gettime(CLOCK_MONOTONIC_RAW,..) for dhclient. LD_PRELOAD with
> gettimofday()# replacement library does the trick.  #Proper solution will
> require modification of the dhclient scheduler.

=> your solution deals with the problem at its root (a non-increasing
gettimeofday). I am not sure it is correct to modify the scheduler:
the assumption that the current time is increasing is not so crazy...

> Thanks for quick reply.

=> can we close the ticket?


Subject: Re: dhclient doesn't RENEW at proper moment when system time shifts back [ISC-Bugs #45540]
To: dhcp-confidential@isc.org
Date: Wed, 12 Jul 2017 11:37:41 +0300
From: "Vladimir Kunschikov" <kunschikov@gmail.com>
This bug was reported in our internal bug tracker and assigned to me.
At first I disagreed that it was a bug, but surprisingly there are users who run into this problem. We build images for use in VMware, and we don't know beforehand which time zone the client installing the image uses, so we assume that the BIOS time is in UTC. But the VMware hypervisor sets the BIOS time to local time! Judging by search results and several discussions on the subject, this can't be changed. The operating system starts up in Moscow (UTC+3), then the interfaces are brought up using dhclient, then NTP synchronization runs, and the time goes back by three hours. With short leases, this brings the interfaces down.

I've made two solutions that replace gettimeofday(): one using LD_PRELOAD and another using direct compile-in replacements. Both were quickly approved by our corporate code review. I've even put the somewhat simpler LD_PRELOAD hack on public GitHub. So it is not a problem for us right now, but it would be good to know that it will be properly fixed in some future version.

Now I am going to gently blame our support QA by rephrasing your words: since 1993 there hasn't been anyone except them who bumped into this problem.

2017-07-11 18:49 GMT+03:00 Francis Dupont via RT <dhcp-confidential@isc.org>:
On Tue Jul 11 15:06:25 2017, kunschikov@gmail.com wrote:
> Again can be easily checked by usage of `strace`.

=> hum, it seems your solution (change the gettimeofday
clock for a monotonic one) is the only one which does not
require some redesign/recode...
Two notes:
 - I have a tool for DNS which increases exponentially
  the clock speed. It is used to check re-signing & co.
  It is based on redirection of the gettimeofday system call
  in a very similar way...
 - unfortunately your solution won't work between reboots
  so you have to remove the client lease file to reset the
  state (this file is used at startup as an optimization and
  is not critical at all, BTW many clients have no stable
  storage so should not depend on it).


From: "Nick Phillips" <nick.phillips@otago.ac.nz>
To: "dhcp-bugs@isc.org" <dhcp-bugs@isc.org>
Subject: Re: [ISC-Bugs #45540] dhclient doesn't RENEW at proper moment when system time shifts back
Date: Tue, 26 Sep 2017 20:16:12 +0000
Hi...

Just wanted to let you know that we have been hit by this as well. We're booting linux on systems which usually run Windows. We bring up the network and then call ntpdate to correct time. If ntpdate steps the time backwards (which it occasionally does), then dhclient does not renew its lease when it should, and continued use of the IP address is prevented by the switch we're connected to.

I suspect this has been happening for years, but we've only noticed now that our network is "smarter" and drops packets from machines which don't have a valid lease.

It's not clear why ntpdate sometimes needs to step time backwards (timezone issues? flaky ntp server?), but the point is that we must be able to get from a state where the system clock is unknown to a running state. Since we can't run ntpdate before starting dhclient, we either need dhclient to handle backward time steps, or we need to hack our way around it (LD_PRELOAD trick, set system time to epoch on boot, or detect negative time step and kill dhclient). It would be far nicer if dhclient could handle this itself.

Cheers,

Nick

--
Nick Phillips / nick.phillips@otago.ac.nz / 03 479 4195
# These statements are mine, not those of the University of Otago
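
One of the workarounds listed in the message above, detecting a negative time step, can be sketched by comparing how far the wall clock and a monotonic clock have advanced between two samples. This is a hedged Python illustration, not part of dhclient or ntpdate; `ClockSample`, `backward_step` and the one-second tolerance are assumptions made for the example. What to do on detection (for instance restarting dhclient) is left to the caller.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class ClockSample:
    wall: float   # seconds since the epoch, can be stepped
    mono: float   # monotonic seconds, cannot go backwards


def take_sample() -> ClockSample:
    """Read both clocks as close together as possible."""
    return ClockSample(wall=time.time(), mono=time.monotonic())


def backward_step(prev: ClockSample, cur: ClockSample, tolerance: float = 1.0) -> float:
    """Return how many seconds the wall clock was stepped back, or 0.0 if it was not.

    The wall clock should advance by about as much as the monotonic clock;
    a shortfall larger than `tolerance` is treated as a backward step.
    """
    shortfall = (cur.mono - prev.mono) - (cur.wall - prev.wall)
    return shortfall if shortfall > tolerance else 0.0


# Simulated samples: 5 s of real time pass while the wall clock is stepped back 1 h.
before = ClockSample(wall=1_000_000.0, mono=100.0)
after = ClockSample(wall=1_000_000.0 + 5.0 - 3600.0, mono=105.0)
print(backward_step(before, after))   # 3600.0
```

A watchdog built on this idea would call `take_sample()` periodically and compare consecutive samples.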
BTW, draft-aanchal-time-implementation-guidance-00.txt (presented at IEPG during the IETF 100 meeting now) discusses a similar issue in DNS (vs. DHCP) and also recommends that POSIX systems use a monotonic clock (instead of gettimeofday()), which cannot be attacked off-path.
From: "Ferry van Steen" <Ferry.van.Steen@citrus.nl>
Subject: Re: [ISC-Bugs #45540] dhclient doesn't RENEW at proper moment when system time shifts back
To: "dhcp-bugs@isc.org" <dhcp-bugs@isc.org>
Date: Thu, 21 Dec 2017 10:53:00 +0000

Hi,


Hope this will get added to bug 45540, don't have an account on the bugtracker and can't comment with guest access.


For us this is a pretty severe issue, and we are quite surprised that you are surprised this hits anyone.


For obvious reasons NTP won't run until the network is activated and there's several cases where the clock will be stepped by it:

* You multiboot and the hardware clock is stored by the other OS in a different timezone

* Your hypervisor handles it differently than expected

* OS stores clock in localtime, you multiboot, summer-/wintertime occurs and the OSes aren't aware that the other OS has already made the switch

* System clock is off by a lot


And probably more.


This issue has been open on the RedHat bugtracker for quite some time (years). Not sure if it wasn't filed at ISC earlier as the bugtracker apparently wasn't open to the public at all some time ago.


You can find it here: https://bugzilla.redhat.com/show_bug.cgi?id=1093803


The issue seems to be:

Hardware clock is stored in localtime. We're in UTC+1 (wintertime, UTC+2 in summer). So say it's 14:00.

System boots, linux assumes hardware clock in UTC, adds 1 (or 2 depending on summer/winter) hours to hardware clock and sets system clock to 15:00

dhclient fires up, obtains a lease with a lease time of 15 minutes. dhclient sets a trigger/timer on 15:15 to renew (well, actually 15:07:30 but let's ignore that for ease)

NTP starts, steps system clock back to 14:00

14:15 lease expires, something removes it (presumably the kernel - doesn't get logged). At this point your network is dead. It doesn't come back to life again either, as:

At 15:15 dhclient tries to renew, but lease has expired. So DHCP won't renew. It apparently doesn't try to obtain a new lease at that point either as the network remains dead.


At many places this isn't an issue as leasetime > clock shift. But more and more networks are switching to low lease times due to BYOD and a lot of roaming phones and laptops and guest devices using up leases.
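
The timeline above can be worked through numerically. The following Python sketch is an illustration under the assumptions stated in the message (renew attempted at half the lease time, timers kept as wall-clock times); the calendar date and the helper name `renew_delay` are invented for the example. Under this simplified model it also shows why the problem only surfaces when the clock shift is large relative to the lease time.

```python
from datetime import datetime, timedelta


def renew_delay(lease: timedelta, step_back: timedelta) -> timedelta:
    """Real time that passes before a wall-clock renew timer fires after a backward step.

    Simplification: the step is assumed to happen right after the lease is
    obtained, and the renew is scheduled at half the lease time.
    """
    return lease / 2 + step_back


lease = timedelta(minutes=15)
step_back = timedelta(hours=1)

boot_clock = datetime(2017, 12, 21, 15, 0)        # wrong system clock at boot
renew_timer = boot_clock + lease / 2              # 15:07:30 on the wall clock
true_clock = boot_clock - step_back               # NTP steps back to 14:00
lease_expiry = true_clock + lease                 # lease really ends at 14:15

print(renew_timer.time())                         # 15:07:30
print(lease_expiry.time())                        # 14:15:00
print(renew_timer > lease_expiry)                 # True: the renew comes too late

# With a long lease the same one-hour step is harmless.
print(renew_delay(timedelta(hours=8), step_back) < timedelta(hours=8))   # True
print(renew_delay(lease, step_back) < lease)                             # False
```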


Haven't tested what happens if the clock is shifted the other way around, which is normal for half the world being west of UTC, but it's at best a pretty nasty issue for anyone east of UTC.


Kind regards,


To: dhcp-public@isc.org
Date: Thu, 21 Dec 2017 14:42:17 +0300
From: "Vladimir Kunschikov" <kunschikov@gmail.com>
Subject: Re: [ISC-Bugs #45540] dhclient doesn't RENEW at proper moment when system time shifts back
If you came here looking for a quick fix, you can use LD_PRELOAD with a gettimeofday() replacement.
You can freely grab the solution from https://github.com/kunschikov/ld_preload_gettimeofday
Just export LD_PRELOAD from some network configuration script that is sourced by the other start/stop scripts.

On Red Hat/CentOS this can be done by adding the line
    export LD_PRELOAD=/usr/lib64/libgettimeofday.so
to the end of
   /etc/sysconfig/network

It fixes this behavior for sure.

A more solid (and proper) fix will require replacing the time calculation in dhclient itself. Of course, that will require recompiling dhclient.
