triggering timezone-dependent actions

Suppose you have a database full of events that are occurring at various times in various timezones.

Suppose also that you want to trigger an action (like an email campaign) to occur at a specific local time (in each event’s timezone) a certain number of days prior to the event.

At any given point in time, how do you determine which actions are ready to trigger?  For simplicity’s sake, let’s say we want to send an email at 8am 5 days prior to each event.

This is not as simple as asking which events are occurring within a given amount of time like this:

Event.where("start_datetime < ?", Time.now - 5.days)

Two events occurring at the exact same UTC time could require that the email be sent at drastically different times.  An event at 11pm June 9th Pacific Time and an event at 2am June 10th Eastern Time occur at exactly the same time (i.e. their UTC timestamps are identical), but the triggered email for the former should be sent at 8am Pacific June 4th, while the email for the latter should be sent at 11am Pacific June 5th, 27 hours after the first email.

As a first step, suppose I want to write a method that returns all events that are occurring in exactly 5 days (within each event’s respective timezone)

There are many ways to do this, but the challenge is to find a solution that can perform well even if there are a huge number of events. Note: each event has a property `start_datetime` which is the UTC date time of the event and a property `timezone` which is the name of its timezone (corresponding to an `ActiveSupport::TimeZone`)

One approach (which is not performant) is to check every event directly:

def events_in_five_days
  Event.all.select do |event|
    tz = ActiveSupport::TimeZone.new(event.timezone)
    days_away = event.start_datetime.in_time_zone(tz).to_date - tz.now.to_date
    days_away == 5
  end
end

If you have a lot of events, that’s a large database query (even if you only select the relevant columns) and a lot of timezone calculations.

Another approach is to loop through each timezone and construct a separate query for each one.  We construct a 24 hour UTC window during which the event can occur for each timezone, and then query for the events in that timezone that are within the window:

def events_in_five_days
  ActiveSupport::TimeZone.all.map { |timezone|
    five_days_from_now = timezone.now + 5.days
    midnight = five_days_from_now.midnight    
    Event.where(["timezone = ? and start_datetime >= ? and start_datetime <= ?", timezone.name, midnight, midnight + 1.day])
  }.flatten
end

Depending on how many events you have, this might significantly reduce the amount of processing, but if you have events in a lot of timezones and a large database, this is still a lot of queries and might be fairly slow.

I wanted a way to get all of the events with just one simple query, and I didn’t want to depend on any database-specific timezone calculations.  One solution is to pre-calculate the UTC time of the midnight before each event (within their respective timezones) and store this in a new column in the database.  Each time the `start_datetime` or `timezone` of the event changes, the `midnight_before_start` column is recalculated.   This enables us to exercise a single query regardless of timezone.  It might not be obvious at first, but with a little thought, it’s clear that we want the events whose `midnight_before_start` occurs between 4 and 5 days from now:

def events_in_five_days
  five_days_from_now = Time.now + 5.days
  four_days_from_now = five_days_from_now - 1.day
  Event.where("midnight_before_start >= ? and midnight_before_start < ?", four_days_from_now, five_days_from_now)
end

This is the solution I ended up using in one project.

However, for another project, I decided setting up the extra column was too much baggage.  So, I compromised by using a modified version of the very first solution above, except restricting the direct calculations only to those events which could reasonably be expected to be a match.  This works like a funnel — first the database query gets all of the events that are occurring within 24 hours in either direction of the target time, then the Ruby checks each of these “potential” matches individually with specific timezone calculations like we did in the first code snippet:

def events_in_five_days
  target = Time.now.utc + 5.days

  # Give a 1 day buffer on each side, to accommodate various timezones
  window_start = target - 1.day
  window_end = target + 1.day

  # First gather events "near" the target date (we're casting a larger net here)
  events = Event.where("start_datetime >= ? AND start_datetime <= ?", window_start, window_end).to_a

  # Now double-check timezone
  events.select do |event|
    tz = ActiveSupport::TimeZone.new(event.timezone)
    days_away = event.start_datetime.in_time_zone(tz).to_date - tz.now.to_date
    days_away == 5
  end 
end

Depending on the number of events and the number of timezones, I think any of these four solutions could be appropriate.  Of course, there may be better solutions, but I stopped thinking about it once I found solutions that I felt would be adequately performant for my needs.

Recall: Finding events occurring 5 days from now is just the first step — once I have them, I also want to check that the target time of 8am is passed in the event’s timezone, but that step is identical regardless of how you first collect the set of events occurring 5 days out.

Also Note: My actual production code is a bit cleaner than this — I’ve kept it this way on the blog because I’m primarily illustrating the overall algorithmic logic behind each approach, and I wanted it to be clear what calculations were being made and how they were being made.

How would you code this?