Want To Go On A (java.util.)Date?

10-Sep-2013

Date handling in Java and other languages continues to cause problems, even in new systems whose platform is otherwise capable of rich, precise, and lossless operations on dates. The problem is rooted in three major issues:

  1. An instantaneous point in time is not the same thing as a calendar date.
  2. The datetime of an event is not the same thing as the "system processing date" (which, by the way, is also not the same thing as the system clock date).
  3. Shortcuts in externalization (to- and from-string) end up being lossy or make it difficult and/or ambiguous to use the externalized form.
The good news is these issues are very easy to address. It simply takes some discipline and erring on the side of producing too much information instead of too little.

Time vs. Calendar

To begin, it is important to understand that a point in time is absolute. It is "the same" in every time zone. In Java, a point in time is carried in the java.util.Date object and is represented as the number of milliseconds since exactly midnight GMT Jan 1, 1970, a date commonly referred to in the Unix world as the epoch date. Consider a global system with two instances of the same software running in Tokyo and New York that processes 4 pieces of activity in the following order, where the absolute activity time is captured using a simple call to new Date():

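| # | Absolute Time | Which System | Local Time |
|---|---------------|--------------|------------|
| 1 | 1378746511036 | TK | 2013-09-10 02:08.441 |
| 2 | 1378746511037 | NY | 2013-09-09 13:08.442 |
| 3 | 1378750111170 | NY | 2013-09-09 14:08.651 |
| 4 | 1378753711055 | TK | 2013-09-10 04:08.085 |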
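
To see how one absolute time turns into different local times, here is a minimal sketch that renders the absolute time of item #1 in both cities. The class and method names are made up for illustration, and the zone IDs "Asia/Tokyo" and "America/New_York" are assumed to be the zones the two systems run in.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Illustrative only: the class and method names are not from the original article.
class InstantInTwoZones {

    // Render one absolute instant (epoch millis) as wall-clock time in the given zone.
    static String render(long epochMillis, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        return fmt.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        long t = 1378746511036L; // absolute time of item #1

        // The Date itself never changes; only the rendering does.
        System.out.println("Tokyo:    " + render(t, "Asia/Tokyo"));       // 2013-09-10 02:08:31.036
        System.out.println("New York: " + render(t, "America/New_York")); // 2013-09-09 13:08:31.036
    }
}
```

The point of the sketch is that the time zone lives in the formatter, not in the Date: the same long value yields September 10 in Tokyo and September 9 in New York.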

The example above is a simple demonstration of how our human-imposed context of calendar and the earth's rotation confounds an otherwise straightforward sequence of 4 events. Judged by local time alone, items #2 and #3 sure look like they happen before item #1. Clearly, if we only ever used absolute time, all sorts of filtering, sorting, and other operations would be easy. It would be a great thing if our minds could easily grok globally absolute time and we could simply do:


    select * from table where tradeDate > 1378750111170

and "just know" that meant 2013-09-09 14:08 in New York.
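
Since most of us cannot do that in our heads, the literal for such a query has to be computed. A minimal sketch, assuming the local time is given to the minute and that "America/New_York" is the intended zone; the class and method names are hypothetical:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

// Illustrative only: the names here are not from the original article.
class LocalToAbsolute {

    // Interpret a wall-clock time in the given zone and return epoch milliseconds.
    static long toEpochMillis(String localDateTime, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        fmt.setLenient(false); // reject impossible values such as month 13
        try {
            return fmt.parse(localDateTime).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Bad datetime: " + localDateTime, e);
        }
    }

    public static void main(String[] args) {
        long cutoff = toEpochMillis("2013-09-09 14:08", "America/New_York");
        // Prints the start of that minute: 1378750080000
        System.out.println("select * from table where tradeDate > " + cutoff);
    }
}
```

The result is the start of the 14:08 minute, about 31 seconds before the 1378750111170 used in the query above.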

It's About Dayframe, not Timezone

But we and our systems lead lives structured around a day. More specifically, it is "activity occurring within some number of hours, 24 or less." The subtle point here is that a system processing day may cross over into the next calendar day before the processing day is incremented. A day's worth of processing is not necessarily anchored to a calendar day, but by convention we almost always associate it with a date. This duration of time (herein called the "dayframe") is the important organizational element of a system, not timezones. Dayframes are high-level and business-oriented; timezones are low-level and system-oriented. The challenges are: A design that makes one or more of the following mistakes will make it difficult to work with dayframe data on a practical basis:

The Solution

A practical solution involves the following for each record that is created:

Coming back to our 4 pieces of activity:

| # | Business Dayframe | Dayframe date | Local Time | UTC Time |
|---|-------------------|---------------|------------|----------|
| 1 | ASIA | 2013-09-09 | 2013-09-10 02:08.441 | 2013-09-09 17:08.441 |
| 2 | NORTHAM | 2013-09-09 | 2013-09-09 13:08.442 | 2013-09-09 17:08.442 |
| 3 | NORTHAM | 2013-09-09 | 2013-09-09 14:08.653 | 2013-09-09 18:08.653 |
| 4 | ASIA | 2013-09-09 | 2013-09-10 04:08.085 | 2013-09-09 19:08.085 |
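
One way to carry these four pieces of information on each record is sketched below. The class, its field names, and the choice to store the rendered times as strings are all assumptions made for illustration, not a prescribed design. Note that the dayframe and dayframe date are handed in by the caller; they are deliberately not derived from the clock.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical record type: the class and field names are illustrative only.
final class ActivityRecord {
    final String dayframe;     // business dayframe, e.g. "ASIA" or "NORTHAM"
    final String dayframeDate; // processing date assigned by the system, not read from the clock
    final String localTime;    // wall-clock time where the event occurred
    final String utcTime;      // the same instant rendered in UTC

    ActivityRecord(String dayframe, String dayframeDate, Date eventTime, String zoneId) {
        this.dayframe = dayframe;
        this.dayframeDate = dayframeDate;
        this.localTime = render(eventTime, zoneId);
        this.utcTime = render(eventTime, "UTC");
    }

    // Format an absolute instant as wall-clock time in the given zone.
    private static String render(Date d, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        return fmt.format(d);
    }

    @Override
    public String toString() {
        return dayframe + " " + dayframeDate + " local=" + localTime + " utc=" + utcTime;
    }

    public static void main(String[] args) {
        // Item #1: captured in Tokyo on the ASIA dayframe dated 2013-09-09.
        ActivityRecord r = new ActivityRecord(
                "ASIA", "2013-09-09", new Date(1378746511036L), "Asia/Tokyo");
        System.out.println(r);
    }
}
```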

The value of the datetimes and how they are principally used is now unambiguous because we have four pieces of information (business dayframe, dayframe date, local event time, and UTC time) instead of just two (local time and UTC offset). Notes:

If we want to get very crisp about dayframe handling, then instead of using the date and dayframe identifier as a composite key, we would indirectly address the date and other attributes through a bespoke key. A simple implementation might be D followed by an incrementing integer, e.g.

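| # | Dayframe | Local Time | UTC Time |
|---|----------|------------|----------|
| 1 | D771 | 2013-09-10 02:08.441 | 2013-09-09 17:08.441 |
| 2 | D772 | 2013-09-09 13:08.442 | 2013-09-09 17:08.442 |
| 3 | D772 | 2013-09-09 14:08.653 | 2013-09-09 18:08.653 |
| 4 | D771 | 2013-09-10 04:08.085 | 2013-09-09 19:08.085 |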

To find out what happened on a particular dayframe, one would first consult the dayframe resource and query to arrive at the proper dayframe key, which would then be supplied to a query against the transaction resource. This permits more flexibility in determining and capturing attributes that make a dayframe, such as the UTC time that it was activated, by which actor in the system, state, etc.
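
The two-step lookup can be sketched with in-memory lists standing in for the dayframe and transaction resources. Everything here is hypothetical (the class names, the fields, and the sample data layout); the only things taken from the tables above are the keys D771 and D772 and the four absolute times.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory stand-ins for the dayframe and transaction resources.
class DayframeLookup {

    static final class Dayframe {
        final String key;  // bespoke key, e.g. "D771"
        final String name; // business dayframe, e.g. "ASIA"
        final String date; // dayframe date, e.g. "2013-09-09"
        Dayframe(String key, String name, String date) {
            this.key = key; this.name = name; this.date = date;
        }
    }

    static final class Txn {
        final int id;
        final String dayframeKey;
        final long utcMillis;
        Txn(int id, String dayframeKey, long utcMillis) {
            this.id = id; this.dayframeKey = dayframeKey; this.utcMillis = utcMillis;
        }
    }

    static List<Dayframe> sampleDayframes() {
        return Arrays.asList(
                new Dayframe("D771", "ASIA", "2013-09-09"),
                new Dayframe("D772", "NORTHAM", "2013-09-09"));
    }

    static List<Txn> sampleTxns() {
        return Arrays.asList(
                new Txn(1, "D771", 1378746511036L),
                new Txn(2, "D772", 1378746511037L),
                new Txn(3, "D772", 1378750111170L),
                new Txn(4, "D771", 1378753711055L));
    }

    // Step 1: consult the dayframe resource to arrive at the key (null if none matches).
    static String findKey(List<Dayframe> dayframes, String name, String date) {
        for (Dayframe d : dayframes) {
            if (d.name.equals(name) && d.date.equals(date)) {
                return d.key;
            }
        }
        return null;
    }

    // Step 2: supply the key to a query against the transaction resource.
    static List<Integer> txnIdsFor(List<Txn> txns, String dayframeKey) {
        List<Integer> ids = new ArrayList<Integer>();
        for (Txn t : txns) {
            if (t.dayframeKey.equals(dayframeKey)) {
                ids.add(t.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        String key = findKey(sampleDayframes(), "ASIA", "2013-09-09");
        System.out.println(key + " -> " + txnIdsFor(sampleTxns(), key)); // D771 -> [1, 4]
    }
}
```

Because the transactions only hold the key, further dayframe attributes (activation time, actor, state) could be added to the Dayframe class without touching the transaction records.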

In summary, don't use shortcuts to try to capture a local time, a UTC time, and a dayframe (a.k.a. system processing date) in fewer than the three separate pieces of information that they are.



Site copyright © 2013-2017 Buzz Moschetti. All rights reserved