A Stream-based Time Synchronization Technique For Networked Computer Games

Zachary Booth Simpson
1 March 2000

(c)2002 ZBS. http://www.mine-control.com/zack
Please sign my guestbo0k if you find this work useful.

Abstract

Many networked games use a form of dead-reckoning to avoid extraneous synchronization and minimize apparent latency. This technique may be improved by synchronizing the clocks on all machines within acceptable limits. Clock synchronization is non-trivial when the effects of variable latency are introduced such as on the global Internet. Extremely robust but slow-to-converge synchronization techniques such as NTP [MILLS92] have been proposed and developed by Mills, et al. Simpler alternatives are also in use such as SNTP [MILLS96]. Both of the protocols have deployment problems for computer games such as the fact that they are both data-gram based (UDP) [POSTEL80-1] as well as are either complex (NTP) or too-inaccurate (SNTP). This paper proposes an extremely simple alternative to Millsí NTP latency integration approach which, while substantially less accurate than NTP, is suitable for many network games and can be used on top of a stream-oriented protocol such as TCP [POSTEL80-2].

Dead-Reckoning

Many network games utilize a technique called "dead-reckoning" to minimize apparent latency and avoid extraneous synchronization transmissions. The dead-reckoning algorithm is to simply estimate an objectís position based on previous positional and speed information. (It derives its name from ancient ship navigation terminology; sailors would throw objects overboard making them "dead in the water" and then time their trip from bow to stern in order to derive speed. The term is now commonly used in all forms of navigation to mean piloting by speed estimation with known starting points.)

Most computer games or other real-time networked simulation environments implement the dead-reckoning algorithm by a broadcasting the position, speed, and possibly higher-order differential coefficients for each moving object. (A review of these techniques is presented in [GREER99]. ) Upon receipt of the movement packet, the client interpolates from the current position. While this technique can be implemented without time synchronization, it is less accurate to do so. Without time synchronization, the client assumes that a newly arrived packet was sent with zero latency. This causes the interpolation to have at least as much error as the last packet latency and possibly more. This latency error, which is caused by both the effects of Internet router and modem latency as well as re-transmissions from dropped packets in a connection-oriented protocol such as TCP, is highly significant and can range from 100-3000ms. Simplistic studies conducted by the author suggest that humans are sensitive to action latencies (time between request and visual confirmation of action) of anything over 150ms.

Existing Clock Synchronization Protocols

Clock synchronization is a topic of major importance to the world. Many computer systems from space exploration to financial markets rely on time synchronization technology to keep all of their computers in sync. The most simplistic technique of synchronization is incorporated in the Simple Network Time Protocol (SNTP) developed by Mills. In this protocol, the client machine to be synchronized sends a datagram (UDP) packet to the server which is then immediately resent by the receiver with the time as it is known to the server. This technique is trivial to implement and appropriate for environments where the latency between the client and server are minimal and constant as is roughly the case in many local area networks (LANs) or when the desired synchronization is only needed within a few hundreds of milliseconds.

The SNTP technique is not viable when accuracy is critical and latency is variable as is the case on the world-wide Internet. Algorithms to measure, quantify, and correct for the error induced by variable latency were introduced by David Mills at University of Delaware. Mills presents an extremely comprehensive treatment of the subject and proposes a solution which became the Network Time Protocol (NTP). (specified in RFC-1305) [MILLS92] This protocol is used extensively throughout the Internet. Unfortunately, NTP is very complicated and, more importantly, slow to converge on the accurate time delta. This makes NTP less than ideal for network game play where the player expects a game to start immediately and is unwilling to allow for synchronization time.

Further complicating matters, NTP and SNTP are datagram protocols relying on UDP which is unfortunately firewalled (i.e. blocked) by many Internet Service Providers, especially by corporate WANs. Also, anecdotal evidence from network programmers suggests that the UDP implementation under Win32 is unstable and overly-unreliable. The reason that SNTP and NTP use datagram protocols is simple. Connection latency is measured, and therefore extracted from the time request, by assuming that the transmit and receive times are symmetric and dividing the measured latency by two. In a stream-based protocol such as TCP, the underlying protocol may retransmit a lost or unordered packet causing anomalous and asymmetric latency. These protocols have no API for informing high-level code that the retransmission occurred. Therefore, the only truly safe and accurate way to conduct the latency measurement is to use a datagram protocol as just mentioned to avoid this problem. (Note however, that still can not assure a symmetric connection. For example, satellite based ISPs use modem up-links and high-bandwidth, variable latency satellite down-links).

A Simple Alternative

A simple clock synchronization technique is required for games. Ideally, it should have the following properties: reasonably accurate (150ms or better), quick to converge, simple to implement, able to run on stream-based protocols such as TCP.

A simple algorithm with these properties is as follows:

  1. Client stamps current local time on a "time request" packet and sends to server
  2. Upon receipt by server, server stamps server-time and returns
  3. Upon receipt by client, client subtracts current time from sent time and divides by two to compute latency. It subtracts current time from server time to determine client-server time delta and adds in the half-latency to get the correct clock delta.
    (So far this algothim is very similar to SNTP)
  4. The first result should immediately be used to update the clock since it will get the local clock into at least the right ballpark (at least the right timezone!)
  5. The client repeats steps 1 through 3 five or more times, pausing a few seconds each time. Other traffic may be allowed in the interim, but should be minimized for best results
  6. The results of the packet receipts are accumulated and sorted in lowest-latency to highest-latency order. The median latency is determined by picking the mid-point sample from this ordered list.
  7. All samples above approximately 1 standard-deviation from the median are discarded and the remaining samples are averaged using an arithmetic mean.

The only subtlety of this algorithm is that packets above one standard deviation above the median are discarded. The purpose of this is to eliminate packets that were retransmitted by TCP. To visualize this, imagine that a sample of five packets was sent over TCP and there happened to be no retransmission. In this case, the latency histogram will have a single mode (cluster) centered around the median latency. Now imagine that in another trial, a single packet of the five is retransmitted. The retransmission will cause this one sample to fall far to the right on the latency histogram, on average twice as far away as the median of the primary mode. By simply cutting out all samples that fall more than one standard deviation away from the median, these stray modes are easily eliminated assuming that they do not comprise the bulk of the statistics.

This algorithm was thoroughly tested in NetStorm, Islands At War, a real-time Internet strategy game co-implemented by the author at Titanic Entertainment in 1997. The results were completely satisfactory and usually resulted in synchronizations less than 100ms. Anecdotal evidence in large scale trials suggested that bad synchronizations due to retransmission were infrequent and when they did occurred were frequently symptomatic of an unusually bad Internet connections that typically caused more catastrophic errors (such as dropped connections) rendering the failure to time-sync mute.

References

[GREER99] Greer, James F.; Real-time Strategy Games Without The Lagggg;

Computer Game Developers Conference, San Jose CA 1999

http://jamesfgreer.home.mindspring.com/gdc99.ppt

[MILLS92] Mills, David; Network Time Protocol (Version 3) Specification, Implementation and Analysis.; University of Delaware, March 1992. RFC-1305

http://www.eecis.udel.edu/~mills/ntp.htm

[MILLS96] Mills, David; Simple Network Time Protocol (Version 4); University of Delaware, October 1996. RFC-2030

http://www.eecis.udel.edu/~mills/ntp.htm

[POSTEL80-1] Postel, J.; User Datagram Protocol; STD 6, USC/Information Sciences Institute, August 1980; RFC-768

[POSTEL80-2] Postel, J.; Transmission Control Protocol; STD 6, USC/Information Sciences Institute, August 1980; RFC-761