Discussion:
[ntp:questions] NTP shows all servers in condition "reject"
Ronny Egner
2008-06-09 09:09:05 UTC
Dear List,

I'm having slight problems getting ntp to synchronize.

The problem I am facing occurs on all six of my new servers,
which are all configured identically (Red Hat AS 4 U4, 64-bit).

The associations to all ntp servers are in state "reject".


The ntp.conf configuration file:

restrict default nomodify notrap noquery
restrict 127.0.0.1
server solaris-server
server windows-dc-serverA
server windows-dc-serverB
driftfile /var/lib/ntp/drift
broadcastdelay 0.008
keys /etc/ntp/keys
logconfig =syncstatus +allevents +allinfo +allstatus
logfile /var/log/ntpd


ntpq> peers
remote              refid     st t when poll reach  delay   offset  jitter
==============================================================================
solaris-server      10.x.y.2   5 u   18   64   377  0.259  892.342  30.508
windows-dc-serverA  10.x.v.1   4 u   17   64   377  0.214  847.816  50.335
windows-dc-serverB  10.x.v.1   4 u   22   64   377  0.272  923.808  45.667


ntpq> as

ind assID status conf reach auth condition last_event cnt
===========================================================
1 39492 9014 yes yes none reject reachable 1
2 39493 9014 yes yes none reject reachable 1
3 39494 9014 yes yes none reject reachable 1


ntpq> rv 39492
assID=39492 status=9014 reach, conf, 1 event, event_reach,
srcadr=solaris-server, srcport=123, dstadr=10.x.y.z, dstport=123, leap=00,
stratum=5, precision=-18, rootdelay=31.601, rootdispersion=10316.483,
refid=10.x.y.z, reach=377, unreach=0, hmode=3, pmode=4, hpoll=6,
ppoll=6, flash=00 ok, keyid=0, ttl=0, offset=892.342, delay=0.259,
dispersion=3.273, jitter=26.161,
reftime=cbf7700d.bc611000 Mon, Jun 9 2008 12:02:05.735,
org=cbf770b3.f30d9000 Mon, Jun 9 2008 12:04:51.949,
rec=cbf770b3.068b7e4d Mon, Jun 9 2008 12:04:51.025,
xmt=cbf770b3.066a0125 Mon, Jun 9 2008 12:04:51.025,
filtdelay= 0.35 0.33 0.27 0.26 0.38 0.32 0.35 0.27,
filtoffset= 924.03 913.42 902.79 892.34 881.81 871.49 861.37 850.77,
filtdisp= 0.00 0.99 1.98 2.96 3.93 4.89 5.84 6.83


ntpq> rv 39493
assID=39493 status=9014 reach, conf, 1 event, event_reach,
srcadr=windows-dc-serverA, srcport=123, dstadr=10.x.y.z,
dstport=123, leap=00, stratum=4, precision=-6, rootdelay=31.250,
rootdispersion=10285.095, refid=10.x.v.1, reach=377, unreach=0,
hmode=3, pmode=4, hpoll=6, ppoll=6, flash=00 ok, keyid=0, ttl=0,
offset=895.522, delay=0.253, dispersion=18.466, jitter=24.392,
reftime=cbf77070.dbf356a4 Mon, Jun 9 2008 12:03:44.859,
org=cbf770b3.f7dac32f Mon, Jun 9 2008 12:04:51.968,
rec=cbf770b3.067b80a9 Mon, Jun 9 2008 12:04:51.025,
xmt=cbf770b3.06681a9b Mon, Jun 9 2008 12:04:51.025,
filtdelay= 0.30 0.35 0.53 0.32 0.25 0.30 0.47 0.40,
filtoffset= 943.01 919.72 911.99 903.67 895.52 887.86 878.58 870.20,
filtdisp= 15.63 16.60 17.58 18.54 19.48 20.46 21.40 22.36



Any help ??
--
Kind regards

Ronny Egner
Diplom-Ingenieur (BA)
Systeme & Service
Oracle DBA

Telefon: +49 381 2524-422
Telefax: +49 381 2524-399


SIV.AG - Service für Informationsverarbeitung AG
Hauptsitz: Konrad-Zuse-Str. 1, 18184 Roggentin
Handelsregister: Amtsgericht Rostock, HRB 8677, Ust.-IdNr.: DE 137477226
Vorstand: Jörg Sinnig (Vorsitzender), Andreas Lehmann, Arno Weichbrodt
Aufsichtsratsvorsitzender: Thomas Huth

*************************************************************************
The information contained in this email is not legally binding.
At your request, we will provide you with a legally binding confirmation
in written form. This message is intended solely for the addressee,
entity to which the email is addressed or the authorised agent.

*************************************************************************
Steve Kostecke
2008-06-10 13:49:40 UTC
Post by Ronny Egner
Dear List,
I'm having slight problems getting ntp to synchronize.
The problem I am facing occurs on all six of my new servers,
which are all configured identically (Red Hat AS 4 U4, 64-bit).
Are they VMs?
None of my comments about your configuration file are germane to the
lack of sync issue, but ...
Post by Ronny Egner
restrict default nomodify notrap noquery
restrict 127.0.0.1
These restrictions are OK. They do not block time service.
Post by Ronny Egner
server solaris-server
server windows-dc-serverA
server windows-dc-serverB
You can reduce the initial sync time from ~ 5 minutes to ~15-20 seconds
by appending 'iburst' to your server lines.
Post by Ronny Egner
broadcastdelay 0.008
The broadcast delay line is meaningless, but harmless, unless this ntpd
is operating as a broadcastclient.
Post by Ronny Egner
keys /etc/ntp/keys
The keys line is meaningless, but harmless, because the symmetric key
configuration is incomplete.
Post by Ronny Egner
ntpq> peers
remote refid st t when poll reach delay offset jitter
======================================================================
solaris-server 10.x.y.2 5 u 18 64 377 0.259 892.342 30.508
windows-dc-serverA 10.x.v.1 4 u 17 64 377 0.214 847.816 50.335
windows-dc-serverB 10.x.v.1 4 u 22 64 377 0.272 923.808 45.667
There is a significant difference in offsets between those remote time
servers. And the jitter is quite high. Are all of these servers on the
same LAN? Are any at remote sites or reached over a VPN?
Post by Ronny Egner
ntpq> rv 39492
assID=39492 status=9014 reach, conf, 1 event, event_reach,
srcadr=solaris-server, srcport=123, dstadr=10.x.y.z, dstport=123, leap=00,
stratum=5, precision=-18, rootdelay=31.601, rootdispersion=10316.483,
The root dispersion suggests that this peer has not been synced to a
real time source in quite a while. This needs to be fixed first.

Could you please post the solaris-server 'ntpq -p' billboard (in a
condensed format, as shown above)?
Post by Ronny Egner
ntpq> rv 39493
assID=39493 status=9014 reach, conf, 1 event, event_reach,
srcadr=windows-dc-serverA, srcport=123, dstadr=10.x.y.z,
dstport=123, leap=00, stratum=4, precision=-6, rootdelay=31.250,
rootdispersion=10285.095, refid=10.x.v.1, reach=377, unreach=0,
Same problem here, too.
--
Steve Kostecke <kostecke at ntp.org>
NTP Public Services Project - http://support.ntp.org/
David Woolley
2008-06-10 20:59:31 UTC
Post by Steve Kostecke
Post by Ronny Egner
ntpq> peers
remote refid st t when poll reach delay offset jitter
======================================================================
solaris-server 10.x.y.2 5 u 18 64 377 0.259 892.342 30.508
windows-dc-serverA 10.x.v.1 4 u 17 64 377 0.214 847.816 50.335
windows-dc-serverB 10.x.v.1 4 u 22 64 377 0.272 923.808 45.667
There is a significant difference in offsets between those remote time
servers. And the jitter is quite high. Are all of these servers on the
same LAN? Are any at remote sites or reached over a VPN?
The delay is too low for a VPN.
Post by Steve Kostecke
Post by Ronny Egner
ntpq> rv 39492
assID=39492 status=9014 reach, conf, 1 event, event_reach,
srcadr=solaris-server, srcport=123, dstadr=10.x.y.z, dstport=123, leap=00,
stratum=5, precision=-18, rootdelay=31.601, rootdispersion=10316.483,
The root dispersion suggests that this peer has not been synced to a
real time source in quite a while. This needs to be fixed first.
The root dispersion is an order of magnitude too high for the server to
be acceptable. I thought W32Time was the only implementation that
didn't go to stratum 16 when the root distance exceeded 1 second, so I'm
not quite sure how a Solaris system could be reporting a valid stratum
but such a high root dispersion. However the precision is too good to
be W32Time. Could it be using some other alternative NTP implementation?
Post by Steve Kostecke
Could you please post the solaris-server 'ntpq -p' billboard (in a
condensed format, as shown above)?
I think you will find that its servers are the two Windows domain
controllers. However, if it is running an alternative NTP
implementation, it is unlikely to respond to ntpq.
Post by Steve Kostecke
Post by Ronny Egner
ntpq> rv 39493
assID=39493 status=9014 reach, conf, 1 event, event_reach,
srcadr=windows-dc-serverA, srcport=123, dstadr=10.x.y.z,
dstport=123, leap=00, stratum=4, precision=-6, rootdelay=31.250,
rootdispersion=10285.095, refid=10.x.v.1, reach=377, unreach=0,
Same problem here, too.
But this could be W32Time, assuming that the recent versions aren't
still stuck at stratum 2.

The original article appears to have failed to make it to the newsgroup,
it's not on groups.google as well as not in my feed, so I don't know if
there was an rv 0 and what the reference times were, but I'm not sure if
root dispersion is only the remote component, or whether it could be
generated locally with a very old reference time.
Steve Kostecke
2008-06-10 21:50:11 UTC
Post by David Woolley
The original article appears to have failed to make it to the newsgroup,
it's not on groups.google as well as not in my feed,
Usenet is a maze of twisty passages. And articles are occasionally
delayed by a grue.
Post by David Woolley
so I don't know if there was an rv 0
There was. But it was not important to my reply so I trimmed it. Just be
patient; I'm sure you'll see the original article soon.
--
Steve Kostecke <kostecke at ntp.org>
NTP Public Services Project - http://support.ntp.org/
David Woolley
2008-06-11 06:32:51 UTC
Post by Steve Kostecke
There was. But it was not important to my reply so I trimmed it. Just be
patient; I'm sure you'll see the original article soon.
Still not made it here or to Google.
Ronny Egner
2008-06-11 05:22:46 UTC
Post by Steve Kostecke
Post by Ronny Egner
The problem I am facing occurs on all six of my new servers,
which are all configured identically (Red Hat AS 4 U4, 64-bit).
Are they VMs?
No, physical machines.
Post by Steve Kostecke
Post by Ronny Egner
server solaris-server
server windows-dc-serverA
server windows-dc-serverB
You can reduce the initial sync time from ~ 5 minutes to ~15-20 seconds
by appending 'iburst' to your server lines.
I will try that.
Post by Steve Kostecke
Post by Ronny Egner
ntpq> peers
remote refid st t when poll reach delay offset jitter
======================================================================
solaris-server 10.x.y.2 5 u 18 64 377 0.259 892.342 30.508
windows-dc-serverA 10.x.v.1 4 u 17 64 377 0.214 847.816 50.335
windows-dc-serverB 10.x.v.1 4 u 22 64 377 0.272 923.808 45.667
There is a significant difference in offsets between those remote time
servers. And the jitter is quite high. Are all of these servers on the
same LAN? Are any at remote sites or reached over a VPN?
No, they are not reachable over a VPN. They are routed through the network
(quite complicated, but there is no WAN part in between).

I don't know much about the Windows servers. The customer told me I can
use them for time synchronization, so I did, because the network is
highly protected from the outside LAN.
Post by Steve Kostecke
Post by Ronny Egner
ntpq> rv 39492
assID=39492 status=9014 reach, conf, 1 event, event_reach,
srcadr=solaris-server, srcport=123, dstadr=10.x.y.z, dstport=123, leap=00,
stratum=5, precision=-18, rootdelay=31.601, rootdispersion=10316.483,
The root dispersion suggests that this peer has not been synced to a
real time source in quite a while. This needs to be fixed first.
Could you please post the solaris-server 'ntpq -p' billboard (in a
condensed format, as shown above)?
This is from the solaris box:

ntpq> version
ntpq 3-5.93e Mon Sep 20 15:45:42 PDT 1999 (1)

(it is a Solaris 10)


ntpq> peer
remote               refid     st t when poll reach  delay  offset  disp
==============================================================================
#windows-dc-serverA  10.x.v.1   4 u  215 1024   377   0.31   0.888  3.42
+windows-dc-serverB  10.x.v.1   4 u  509 1024   377   0.44  -1.445  2.61


Both Windows servers seem to get their NTP data from the host 10.x.v.1.


ntpq> as
ind assID status conf reach auth condition last_event cnt
===========================================================
1 29028 9514 yes yes none dist.peer reachable 1
2 29029 9414 yes yes none synchr. reachable 1


status=9514 reach, conf, sel_sys.peer, hi_dist, 1 event, event_reach
srcadr=windows-dc-serverA, srcport=123, dstadr=10.x.y.4,
dstport=123, keyid=0, stratum=4, precision=-6, rootdelay=31.25,
rootdispersion=10120.09, refid=10.x.v.1, delay=0.31, offset=0.89,
dispersion=3.42, reach=377, valid=8, hmode=3, pmode=4, hpoll=10,
ppoll=10, leap=00, flash=0x0<OK>,

status=9414 reach, conf, sel_sync, 1 event, event_reach
srcadr=windows-dc-serverB, srcport=123, dstadr=10.x.y.4,
dstport=123, keyid=0, stratum=4, precision=-6, rootdelay=31.25,
rootdispersion=10113.86, refid=10.x.v.1, delay=0.44, offset=-1.45,
dispersion=2.61, reach=377, valid=8, hmode=3, pmode=4, hpoll=10,
ppoll=10, leap=00, flash=0x0<OK>,




Thanks for your help.
David Woolley
2008-06-11 21:43:50 UTC
Ronny Egner wrote:

This posting is broken because it has this header:

Content-Transfer-Encoding: quoted-printable

but doesn't actually quoted-printable encode the body. That means that
things go badly wrong whenever there is an "=" sign. I'll therefore
quote the message source here, which means I'll have to put the quote
marks in by hand. For the benefit of other people with MIME capable
newsreaders, I won't trim the quoting in the way I normally would. (More
at foot - theory about gateway bug.)
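The failure described above can be reproduced with a short sketch. It assumes Python's standard `quopri` module as a stand-in for a MIME-capable newsreader; the helper name `decode_as_quoted_printable` is hypothetical. Feeding it plain, unencoded text that contains an "=" shows how the two characters after the "=" are consumed as a hex escape.

```python
import quopri


def decode_as_quoted_printable(raw: bytes) -> bytes:
    """Decode bytes the way a reader would if the header claims quoted-printable."""
    return quopri.decodestring(raw)


# Plain, unencoded text taken from the ntpq output in this thread.
plain = b"rootdelay=31.601"

# "=31" is interpreted as the escape for the byte 0x31, which is the character "1",
# so the "=" and one digit silently disappear from the displayed text.
mangled = decode_as_quoted_printable(plain)
print(plain, "->", mangled)
```

Text without an "=" passes through unchanged, which is why only the ntpq variable lists and the "====" rulers were affected.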
Post by Ronny Egner
Post by Steve Kostecke
Post by Ronny Egner
The problem I am facing occurs on all six of my new servers,
which are all configured identically (Red Hat AS 4 U4, 64-bit).
Are they VMs?
No, physical machines.
Post by Steve Kostecke
Post by Ronny Egner
server solaris-server
server windows-dc-serverA
server windows-dc-serverB
You can reduce the initial sync time from ~ 5 minutes to ~15-20 seconds
by appending 'iburst' to your server lines.
I will try that.
Post by Steve Kostecke
Post by Ronny Egner
ntpq> peers
remote              refid     st t when poll reach  delay   offset  jitter
==============================================================================
solaris-server      10.x.y.2   5 u   18   64   377  0.259  892.342  30.508
windows-dc-serverA  10.x.v.1   4 u   17   64   377  0.214  847.816  50.335
windows-dc-serverB  10.x.v.1   4 u   22   64   377  0.272  923.808  45.667
There is a significant difference in offsets between those remote time
servers. And the jitter is quite high. Are all of these servers on the
same LAN? Are any at remote sites or reached over a VPN?
No, they are not reachable over a VPN. They are routed through the network
(quite complicated, but there is no WAN part in between).
I don't know much about the Windows servers. The customer told me I can
use them for time synchronization, so I did, because the network is
highly protected from the outside LAN.
Post by Steve Kostecke
Post by Ronny Egner
ntpq> rv 39492
assID=39492 status=9014 reach, conf, 1 event, event_reach,
srcadr=solaris-server, srcport=123, dstadr=10.x.y.z, dstport=123, leap=00,
stratum=5, precision=-18, rootdelay=31.601, rootdispersion=10316.483,
Post by Steve Kostecke
The root dispersion suggests that this peer has not been synced to a
real time source in quite a while. This needs to be fixed first.
Could you please post the solaris-server 'ntpq -p' billboard (in a
condensed format, as shown above)?
ntpq> version
ntpq 3-5.93e Mon Sep 20 15:45:42 PDT 1999 (1)
(it is a Solaris 10)
ntpq> peer
remote               refid     st t when poll reach  delay  offset  disp
==============================================================================
#windows-dc-serverA  10.x.v.1   4 u  215 1024   377   0.31   0.888  3.42
+windows-dc-serverB  10.x.v.1   4 u  509 1024   377   0.44  -1.445  2.61
Both Windows servers seem to get their NTP data from the host 10.x.v.1.
ntpq> as
ind assID status conf reach auth condition last_event cnt
===========================================================
Post by Ronny Egner
1 29028 9514 yes yes none dist.peer reachable 1
Association 1 is being rejected because the root dispersion is too high.
Post by Ronny Egner
2 29029 9414 yes yes none synchr. reachable 1
status=9514 reach, conf, sel_sys.peer, hi_dist, 1 event, event_reach
Note the hi_dist flag. That's telling you that the rootdispersion is
too high, and the server is therefore unacceptable.
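The status words in the billboard above (9014, 9514, 9414) can be unpacked with a small sketch. The bit layout is an assumption taken from the ntpq description of the peer status word (flag bits, a 3-bit selection code, a 4-bit event count, a 4-bit event code); the function name `decode_peer_status` is hypothetical, and `SELECT_NAMES` lists only the three labels that actually appear in this thread.

```python
def decode_peer_status(status_hex: str) -> dict:
    """Split a 16-bit ntpq peer status word into its fields."""
    word = int(status_hex, 16)
    return {
        "configured": bool(word & 0x8000),   # "conf"
        "reachable": bool(word & 0x1000),    # "reach"
        "select": (word >> 8) & 0x7,         # selection code shown in the "condition" column
        "event_count": (word >> 4) & 0xF,    # "1 event"
        "event_code": word & 0xF,            # 4 is printed as "event_reach" in this thread
    }


# Only the labels seen in the billboards quoted in this thread; other codes exist.
SELECT_NAMES = {0: "reject", 4: "synchr.", 5: "dist.peer"}

for status in ("9014", "9514", "9414"):
    fields = decode_peer_status(status)
    print(status, SELECT_NAMES[fields["select"]], fields)
```

The three words differ only in the selection field, which matches the "reject", "dist.peer" and "synchr." conditions shown by the two ntpq versions.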
Post by Ronny Egner
srcadr=windows-dc-serverA, srcport=123, dstadr=10.x.y.4,
dstport=123, keyid=0, stratum=4, precision=-6, rootdelay=31.25,
rootdispersion=10120.09, refid=10.x.v.1, delay=0.31, offset=0.89,
rootdispersion, and therefore root distance, is greater than the maximum
allowed 1 second.
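As a rough check of that statement, here is a minimal sketch under a simplifying assumption: root distance is approximated as half the root delay plus the root dispersion, ignoring the per-association delay, dispersion and jitter terms that ntpd also adds. The names `root_distance_s` and `MAX_DISTANCE_S` are illustrative, and the 1 second limit is the one quoted in this thread.

```python
MAX_DISTANCE_S = 1.0  # limit quoted in the thread; treated here as an assumption


def root_distance_s(rootdelay_ms: float, rootdispersion_ms: float) -> float:
    """Simplified root distance in seconds: rootdelay / 2 + rootdispersion."""
    return (rootdelay_ms / 2.0 + rootdispersion_ms) / 1000.0


def is_acceptable(rootdelay_ms: float, rootdispersion_ms: float) -> bool:
    """True if the simplified root distance is below the limit."""
    return root_distance_s(rootdelay_ms, rootdispersion_ms) < MAX_DISTANCE_S


# Values reported by the Solaris box for windows-dc-serverA.
distance = root_distance_s(31.25, 10120.09)
print(f"root distance ~ {distance:.3f} s, acceptable: {is_acceptable(31.25, 10120.09)}")
```

Even with the extra terms left out, the dispersion alone puts the distance roughly ten times over the limit.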
Post by Ronny Egner
dispersion=3.42, reach=377, valid=8, hmode=3, pmode=4, hpoll=10,
ppoll=10, leap=00, flash=0x0<OK>,
status=9414 reach, conf, sel_sync, 1 event, event_reach
srcadr=windows-dc-serverB, srcport=123, dstadr=10.x.y.4,
dstport=123, keyid=0, stratum=4, precision=-6, rootdelay=31.25,
rootdispersion=10113.86, refid=10.x.v.1, delay=0.44, offset=-1.45,

This root dispersion is also more than 9 seconds greater than permitted.
Post by Ronny Egner
dispersion=2.61, reach=377, valid=8, hmode=3, pmode=4, hpoll=10,
ppoll=10, leap=00, flash=0x0<OK>,
Basically, you should never use W32Time to serve time to a real NTP
client, although I still don't understand why Solaris isn't reporting
stratum 16. Did you do anything to force the use of that server?

In this context, W32Time reports a true root dispersion when
unsynchronised, including being "synchronised" to its local clock,
whereas ntpd should go into alarm in the first case, and gives an
extremely optimistic distance in the second case.
Post by Ronny Egner
Thanks for your help.
The information contained in this email is not legally binding.
At your request, we will provide you with a legally binding confirmation
in written form. This message is intended solely for the addressee,
entity to which the email is addressed or the authorised agent.
The entity to which this is addressed, is, of course, the whole universe.
Post by Ronny Egner
*************************************************************************
(The mail problem may well be with the gateway. This list isn't really
a mailing list; it is really a newsgroup, and the gateway provides a
convenience for accessing the newsgroup where firewalls on unenlightened
ISPs make accessing the group difficult. On USENET discussion groups,
binary attachments are strictly forbidden (they are abused to transmit
large files, which often violate copyright), so, although cryptographic
signatures are actually tolerated, the gateway has stripped the
signature. It is possible that, in doing so, it has failed to copy the
rest of the message in a fully transparent manner.

I wonder if the reason that the original didn't make it onto USENET is
related.)
Steve Kostecke
2008-06-12 03:43:55 UTC
so, although cryptographic signatures are actually tolerated, the
gateway has stripped the signature. It is possible that, in doing so,
it has failed to copy the rest of the message in a fully transparent
manner.
I wonder if the reason that the original didn't make it onto USENET is
related.)
Sorry to disappoint you, but I did some checking and the mail to news
gateway _did_ function correctly and the original message was
successfully passed to our primary upstream news server.

I've compared the copy of the message in the questions archive and a
decoded copy of the message that was delivered to our primary upstream
server. Both are identical.

So you'll need to cast aspersions elsewhere.
--
Steve Kostecke <kostecke at ntp.org>
NTP Public Services Project - http://support.ntp.org/
Steve Kostecke
2008-06-12 03:55:08 UTC
Post by Steve Kostecke
Post by David Woolley
I wonder if the reason that the original didn't make it onto USENET is
related.)
I've compared the copy of the message in the questions archive and a
decoded copy of the message that was delivered to our primary upstream
server. Both are identical.
I neglected to mention that the body of the original message, and
gatewayed article, was BASE64 encoded.

The subsequent messages from the OP were not BASE64 encoded.
--
Steve Kostecke <kostecke at ntp.org>
NTP Public Services Project - http://support.ntp.org/
David Woolley
2008-06-12 07:01:28 UTC
Post by Steve Kostecke
I've compared the copy of the message in the questions archive and a
decoded copy of the message that was delivered to our primary upstream
server. Both are identical.
Was it already broken then? You seem to have quoted from it as though
it wasn't broken, which suggests not; the nature of the damage is such
that any MIME capable tool would be unable to compensate for the error.
Post by Steve Kostecke
So you'll need to cast aspersions elsewhere.
If it wasn't broken before reaching the gateway, it got broken on the
common path between the gateway and both me and Google. So, if it
wasn't already broken, and the gateway is not MIME decoding the body, or
adding a bogus header, and given the importance of the Google archive,
the gateway's USENET injection point needs to be moved so that there are
no broken USENET relays between it and Google.

This is the path to Google:

Path:
g2news1.google.com!news4.google.com!news1.google.com!newsfeed.stanford.edu!news.isc.org!psp2.ntp.org!lists.ntp.org

and this is the path to me:
Path:
news.aaisp.net.uk!news-peer-lilac.gradwell.net!news.glorb.com!news.isc.org!psp2.ntp.org!lists.ntp.org

Assuming that:
1) there aren't two different servers breaking the article in the same way;
2) it wasn't broken on input to isc.org:

it looks like the only possible sources of damage are:

lists.ntp.org (the gateway?)
psp2.ntp.org, and
news.isc.org

all of which seem to me to be owned by ISC.

I seem to remember a lot of confusion being caused, in the past, by
another article that was damaged in a similar way, but can't remember
whether the cause was ever diagnosed.

(You can see Google's broken copy at:

http://groups.google.com/group/comp.protocols.time.ntp/msg/6f1ead8cdd0b0b4b?dmode=source

Follow the "view parsed" link to see what it looks like in a MIME
capable reader.)
Steve Kostecke
2008-06-12 13:15:30 UTC
Post by Steve Kostecke
I've compared the copy of the message in the questions archive and
a decoded copy of the message that was delivered to our primary
upstream server. Both are identical.
Was it already broken then? You seem to have quoted from it as though
it wasn't broken, which suggests not; the nature of the damage is
such that any MIME capable tool would be unable to compensate for the
error.
I was discussing the original article in this thread (the article that
appears to have not fully propagated) not the OP's second article (the
one you replied to).

The original message was received by the questions list, distributed
to the subscribers, and archived. The body of this message was
BASE64 encoded. I have compared the message in the Mailman archive
(pre-gateway) with the article in the gateway news spool and the article
delivered to both our primary and back-up upstream news servers. All of
the copies of this original message/article (both pre and post gateway)
are identical AFAICT.

I received, and replied to, the original article via the back-up
news-server (which I operate). But I have not as yet been able to
determine why the article did not propagate beyond our primary or
back-up systems. Nor have I had an opportunity to trace the processing
of the OP's second article.
Post by Steve Kostecke
So you'll need to cast aspersions elsewhere.
If it wasn't broken before reaching the gateway,
Which "it" are you talking about here? The OP's original article or his
second article.
it got broken on the common path between the gateway and both me and
Google.
If you truly understood Usenet you would know that there are no
absolutes.
So, if it wasn't already broken,
You are obsessed with that word, aren't you.
and the gateway is not MIME decoding the body, or adding a bogus
header,
You moan when articles containing HTML and other MIME cruft show up in
the news-group.

You moan when articles are not perfectly formed after the offending bits
are stripped out.

You simply can not have it both ways.

You really ought to shut off the grindstone, put down the axe, and
concentrate on making constructive contributions. This entire issue
pales in comparison to the abysmal quoting practices, rampant thread
drift, and general cluelessness prevalent across the board in Usenet.

Don't bother complaining about having to reformat quoted material in
replies. Anyone who cares a whit about their readers reformats as
necessary.
and given the importance of the Google archive,
The only important thing about the Google archive is the advertising
revenue it generates.
the gateway's USENET injection point needs to be moved so that there
are no broken USENET relays between it and Google.
We use the assets available to us. And they generally work without a
problem.
g2news1.google.com!news4.google.com!news1.google.com!\
newsfeed.stanford.edu!news.isc.org!psp2.ntp.org!lists.ntp.org
news.aaisp.net.uk!news-peer-lilac.gradwell.net!news.glorb.com!\
news.isc.org!psp2.ntp.org!lists.ntp.org
That is _one_ set of paths from our primary upstream news-server. But it
is not the only path that the articles may take.
1) there aren't two different servers breaking the article in the same way;
There's that word again.
lists.ntp.org (the gateway?)
psp2.ntp.org, and
news.isc.org
all of which seem to me to be owned by ISC.
You're a bit confused here. The only one of those three systems which is
owned and operated by ISC is news.isc.org. Further, that list contains a
system which is not involved in the gateway path and it incorrectly
identifies lists.ntp.org as the gateway.
I seem to remember a lot of confusion being caused, in the past,
Another case of unilateral confusion.
by another article that was damaged in a similar way, but can't
remember whether the cause was ever diagnosed.
http://groups.google.com/group/comp.protocols.time.ntp/msg/\
6f1ead8cdd0b0b4b?dmode=source
Follow the "view parsed" link to see what it looks like in a MIME
capable reader.)
It's a little mis-formatted. 'Tis but a drop in the bucket.
--
Steve Kostecke <kostecke at ntp.org>
NTP Public Services Project - http://support.ntp.org/
Martin Burnicki
2008-06-12 08:57:35 UTC
Post by Ronny Egner
Post by Steve Kostecke
Post by Ronny Egner
The problem I am facing occurs on all six of my new servers,
which are all configured identically (Red Hat AS 4 U4, 64-bit).
Are they VMs?
No, physical machines.
Post by Steve Kostecke
Post by Ronny Egner
server solaris-server
server windows-dc-serverA
server windows-dc-serverB
You can reduce the initial sync time from ~ 5 minutes to ~15-20 seconds
by appending 'iburst' to your server lines.
I will try that.
Post by Steve Kostecke
Post by Ronny Egner
ntpq> peers
remote refid st t when poll reach delay offset jitter
Post by Ronny Egner
solaris-server 10.x.y.2 5 u 18 64 377 0.259 892.342 30.508
windows-dc-serverA 10.x.v.1 4 u 17 64 377 0.214 847.816 50.335
windows-dc-serverB 10.x.v.1 4 u 22 64 377 0.272 923.808 45.667
There is a significant difference in offsets between those remote time
servers. And the jitter is quite high. Are all of these servers on the
same LAN? Are any at remote sites or reached over a VPN?
No, they are not reachable over a VPN. They are routed through the network
(quite complicated, but there is no WAN part in between).
Assuming the Solaris machine runs real NTP software, the jitter for that
server is pretty high. Do you also observe such jitter for normal ping
requests to the solaris machine?

Concerning the Windows servers - are they running NTP, or w32time? If they
run w32time they may not be good time sources for "real" NTP nodes running
on your Linux machines. Anyway, the fact that the jitter is also high for
these servers lets me assume your network connection is not very good.
Post by Ronny Egner
I don't know much about the Windows servers. The customer told me I can
use them for time synchronization, so I did, because the network is
highly protected from the outside LAN.
The question is your customer's understanding of "time synchronization". In
common Windows terms, time is synchronized if the system times differ by less
than a couple of seconds, since this is sufficient for Kerberos
authentication.

In NTP terms, a few seconds are a huge offset. In fact, ntpd already steps
the system time if the offset exceeds 128 milliseconds (!)

So, in Windows terms, all 3 upstream servers could be called "synchronized"
whereas for NTP the times from those 3 servers differ so much that ntpd is
unable to select the server with the "right" time.
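To put the 128 millisecond figure next to the offsets in this thread, here is a minimal sketch. It only compares an offset with the threshold; it does not model how long ntpd waits before stepping or any other part of the clock discipline. The names `STEP_THRESHOLD_MS` and `correction_mode` are hypothetical.

```python
STEP_THRESHOLD_MS = 128.0  # step threshold mentioned above


def correction_mode(offset_ms: float) -> str:
    """Return "step" if the offset exceeds the threshold, otherwise "slew"."""
    return "step" if abs(offset_ms) > STEP_THRESHOLD_MS else "slew"


# Offsets seen by the Linux client (first three) and by the Solaris box (last two).
for name, offset in [
    ("solaris-server", 892.342),
    ("windows-dc-serverA", 847.816),
    ("windows-dc-serverB", 923.808),
    ("solaris -> serverA", 0.888),
    ("solaris -> serverB", -1.445),
]:
    print(f"{name:20s} {offset:9.3f} ms -> {correction_mode(offset)}")
```

All three offsets seen from the Linux client are several times the threshold, while the Solaris box is within a couple of milliseconds of the Windows servers.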


Martin
--
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany
David Woolley
2008-06-13 07:40:48 UTC
Post by Martin Burnicki
Concerning the Windows servers - are they running NTP, or w32time? If they
run w32time they may not be good time sources for "real" NTP nodes running
It is possible to use W32Time as a source for NTP, although, before
Windows 2003, it violates the NTP specification to do so. It is almost
never a good thing to do. The W32Time instance must be synchronised to
some proper source of time for this to work - a W32Time root with no
upstream sources won't work.
Post by Martin Burnicki
on your Linux machines. Anyway, the fact that the jitter is also high for
these servers lets me assume your network connection is not very good.
The jitter is a secondary statistic. The root dispersion is the key.
That indicates that the Windows systems haven't been synchronised to a
real source of time for several days (7.7 days, if it uses the same
worst case drift assumption as does ntpd). Normal NTP servers would
alarm under those circumstances (which is why I find the Solaris system
confusing: it is alarming on one upstream, but is accepting and
propagating the huge root dispersion on the other). I wonder if someone
has changed MAXDISTANCE on that system, e.g. to 10 seconds.

If one had the reference time for the Windows servers, I think one would
find it was quite ancient.
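The 7.7 day figure can be reproduced with a short calculation. This is a sketch under the assumption that dispersion grows linearly at ntpd's worst-case frequency tolerance of 15 ppm; whether W32Time uses the same rate is exactly the uncertainty stated above. The names `PHI` and `days_since_sync` are illustrative.

```python
PHI = 15e-6  # assumed dispersion growth rate: 15 microseconds per second (15 ppm)
SECONDS_PER_DAY = 86400


def days_since_sync(root_dispersion_s: float) -> float:
    """Time needed for dispersion to grow from zero to the given value, in days."""
    return root_dispersion_s / PHI / SECONDS_PER_DAY


# About 10 seconds of root dispersion, as reported for the Windows servers.
print(f"{days_since_sync(10.0):.1f} days")
```

Ten seconds of dispersion at 15 ppm corresponds to roughly 666,667 seconds, which is where the estimate of about 7.7 days comes from.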
Post by Martin Burnicki
Post by Ronny Egner
I don't know much about the Windows servers. The customer told me I can
use them for time synchronization, so I did, because the network is
highly protected from the outside LAN.
That almost certainly means that they are not being synchronised by
anything, and are operating like ntpd using a local clock, but with the
difference that ntpd attributes a root dispersion based on the fiction
that it successfully synchronised on every read of the local clock. I
think W32Time is more honest in this respect. With the reference
implementation the server is responsible for pretending all is well, but
for a pure W32Time network, it is the client that is responsible for
ignoring the high root dispersion, and believing all is well.

The ntpd strategy makes sense if you are operating an isolated time
island with a single source of local clock time, or (although this isn't
common) you are synchronising the local clock by means other than NTP,
but is not so good when the local clock is used as a fall back for lost
external connectivity.
Post by Martin Burnicki
So, in Windows terms, all 3 upstream servers could be called "synchronized"
whereas for NTP the times from those 3 servers differ so much that ntpd is
unable to select the server with the "right" time.
In this case it has no trouble deciding between them. Root dispersion
is over 10 seconds, but the offset discrepancies are a couple of orders
of magnitude less than this; any offset difference up to 10 seconds would
pass the truechimer test. ntpd is actually rejecting before that test
because the excessive root dispersion means that it can't trust the
offsets to better than 10 seconds.
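The point about the truechimer test can be illustrated with a simplified sketch. It treats each server as an interval of offset plus or minus root distance and checks whether the intervals share a common region; this is only the basic idea, not ntpd's actual selection algorithm. The function names are hypothetical, and root distance is again approximated as half the root delay plus the root dispersion.

```python
def correctness_interval(offset_ms: float, root_distance_ms: float) -> tuple:
    """Interval in which the true time offset is assumed to lie."""
    return (offset_ms - root_distance_ms, offset_ms + root_distance_ms)


def common_overlap(intervals: list):
    """Return the region shared by all intervals, or None if there is none."""
    low = max(lo for lo, _ in intervals)
    high = min(hi for _, hi in intervals)
    return (low, high) if low <= high else None


# offset, rootdelay and rootdispersion from the two "rv" outputs in the first post.
solaris = correctness_interval(892.342, 31.601 / 2 + 10316.483)
server_a = correctness_interval(895.522, 31.250 / 2 + 10285.095)

# With ~10 s of root distance the intervals overlap almost completely.
print(common_overlap([solaris, server_a]))

# With a hypothetical 1 ms root distance the same offsets would not overlap.
print(common_overlap([correctness_interval(892.342, 1.0),
                      correctness_interval(895.522, 1.0)]))
```

The offsets agree easily within such wide intervals, which is consistent with the rejection happening earlier, on the distance check, rather than on disagreement between servers.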
Steve Kostecke
2008-06-12 14:33:41 UTC
Now that we've all enjoyed some light entertainment I'd like to look
into what happened to Ronny Egner's messages in the 'NTP shows all
servers in condition "reject"' thread.

I need a complete copy of both messages (484F6126.2050100 at siv.de, and
484CF331.6030204 at siv.de) from one of the mailing list subscribers. It
also might help if Ronny could resend these messages directly to me.

Please make sure that you forward the entire message including _all_ of
the headers. Messages with incomplete or missing headers won't help me.

Please send a copy of the complete message to kostecke at ntp.org and
retain the original in case I need to request another copy.

Thanks!
--
Steve Kostecke <kostecke at ntp.org>
NTP Public Services Project - http://support.ntp.org/
Danny Mayer
2008-06-17 18:04:56 UTC
Post by Steve Kostecke
Post by Ronny Egner
Dear List,
I'm having slight problems getting ntp to synchronize.
The problem I am facing occurs on all six of my new servers,
which are all configured identically (Red Hat AS 4 U4, 64-bit).
Are they VMs?
None of my comments about your configuration file are germane to the
lack of sync issue, but ...
Post by Ronny Egner
restrict default nomodify notrap noquery
restrict 127.0.0.1
These restrictions are OK. They do not block time service.
Post by Ronny Egner
server solaris-server
server windows-dc-serverA
server windows-dc-serverB
Windows servers are known to be extremely bad NTP servers
and cannot be relied upon. Drop them and use a regular NTP server and
you will probably get good synchronization.

Danny
Maarten Wiltink
2008-06-18 13:06:44 UTC
"Danny Mayer" <mayer at ntp.isc.org> wrote in message
news:4857FCC8.9060203 at ntp.isc.org...
[...]
Post by Danny Mayer
Windows servers are known to be extremely bad NTP
servers and cannot be relied upon.
Even if they are running NTP?

Groetjes,
Maarten Wiltink
