Wednesday, June 08, 2005

Solving the Phone Synchronization Problem End-to-End

(or, The Design of Exchange Direct Push in Exchange 2003 SP2)

Background

I bought my first cell phone in the summer of 2000 – the venerable Nokia 5160.  With the attendant giddy excitement of a new consumer electronics purchase, I started adding names and numbers to the address book on the phone.  The bubble burst when I arrived at work and realized that the contacts and appointments I’d spent the last few years entering into Outlook were isolated from those on the phone.  There may have been custom solutions for keeping the two synchronized, but a cursory search on Nokia’s site did not yield anything obvious – certainly, nothing out-of-the-box existed at that time for my setup.

Fine, I thought.  For now, I’ll resign myself to manually entering each contact twice – once on the phone, and once in Outlook.  But this sucks.

And so it was until the Nokia 8390 came out.  I bought it on literally the first day that Seattle’s AT&T Wireless store stocked them, as this was the first phone offered by Nokia that met my specifications:

  • Being as I was unschooled in the ways of SIM unlocking, the phone had to be offered by my service provider.
  • The phone had to sport PC connectivity, be that over infrared or USB, and synchronization software for Outlook.
  • The phone had to look cool – the Nokia 9000 series phones were just too big and corporate-looking for me.

Note that between the 5160 and the 8390, I also owned the 8260 (yes, the Charlie’s Angels phone).  Consider what this means[1]:  Upon buying the 8260, I had to manually copy every contact from the 5160 to the 8260 – if the store personnel had the capacity to use custom tools to do this for me, they didn’t offer.  And this process had to be repeated when I moved from the 8260 to the 8390.  Finally, all the while, I ran the risk of losing my current phone and, with it, all of my carefully-entered contacts.  Best case, this meant an incredibly laborious, error-prone, eye-crossing two hours spent manually entering contacts into my replacement phone – and only if I happened to be diligent about doubly entering every contact into both Outlook and the phone.  Worst case, this meant some lost contacts and the all-too-familiar emails whose contents read something like, "I’ve lost my phone and your phone numbers; please send me your contact information!"

If the double-entry of contacts in Outlook and the phone sucked, this really sucked.

With the 8390 in hand, I was ready to trounce several of these issues:

  • With my contacts being synchronized with Outlook (and subsequently replicated up to my Exchange mailbox), I was insulated against my phone getting lost or destroyed or (as wound up happening) inexplicably frying itself while on a business trip.
  • I needed to enter contacts and calendar information only once, whether in Outlook or on the phone, and synchronization would take care of reconciling the two.
  • Buying a new phone no longer implied the aforementioned back-breaking two hours’ labor.

I installed the synchronization software onto my laptop and connected the 8390 using infrared.  Problems surfaced immediately upon completing the first synchronization, however:

  • The contacts for which I had postal address information in Outlook were mapped, uselessly, to "United States of America."  Thanks.
  • General "low fidelity" synchronization of contact and calendar items – i.e., missing/incorrectly mapped fields.  No synchronization of email.
  • Exceptions to recurring appointments (e.g., "We meet at 1pm every week except this week, when we’ll meet at 2pm.") were unsupported.
  • The truncation of text fields for appointments seemed overly aggressive: I often knew that I had a meeting but was not entirely sure where it was or what it concerned.
  • The desktop synchronization software would occasionally crash with the dialog "Pure virtual method called."  What?

Ok; so, annoying, but not the end of the world.  Some of this can be attributed to ambiguities in mapping from one schema for contact and calendar items to another, to limited storage on the device, and perhaps to early versions of the software (though I was using version 4).  I lived with this for about a year, burned through two 8390s, upgraded to the Nokia 7210, and lived with that and a similar synchronization experience for about another year.

During that time, things were manageable: the phone upgrades I mentioned were fairly painless because my Exchange account contained the authoritative copies of the data, and, with each new phone, I just pulled down all the data by way of Outlook at the first synchronization.  However, it was still less than ideal: upon modifying a contact or an appointment on my calendar, I had to remember to align the infrared ports of my phone and laptop (I tried repeatedly and unsuccessfully to find a USB cable for connecting the two), kick off the synchronization software, and wait for it to complete.  Making this a part of my daily routine was just tedious.

Enter the Motorola MPx200.  This was the first phone offered by my service provider that ran a Windows Mobile operating system and was of an appropriately small form factor[2].  The significance of the phone running a Windows Mobile operating system is that the phone would be running ActiveSync, which would provide high-fidelity synchronization of the email, calendar appointments, and contacts in my Exchange mailbox[3].  This would be supported both over-the-air with a GPRS connection and via USB/infrared using the desktop ActiveSync software that was provided along with the phone.

A further nicety of the MPx200 was the cradle and the single power/data USB connector it used – this meant that I could cradle the phone on my desk upon arriving at work, let it charge and sync all day, and pluck it from the cradle upon heading home, fully charged and synchronized[4].

The Design of AUTD

So, things are looking pretty good now (our story began in the summer of 2000; it is now the fall of 2003).  What’s the problem, then?  Well, the device synchronizes itself on a schedule, the most frequent setting of which is every five minutes.  I’ve always interpreted the setting "How often would you like to sync?" as "By how much would you like to be out of date?"  This meant that I might well pluck the device from its cradle before the next scheduled sync had occurred and miss some updates.  Further, scheduled syncs over the air are fairly costly for what they deliver: most of the folders they check don’t contain changes.  Finally, these unnecessary syncs cost power and shorten the battery life of the device.

Yes, we offered an always up-to-date (AUTD) solution based on text messaging at that time (AUTDv1), but I wasn’t happy with what it required in terms of provisioning, or with the latency that the server enforced so as to mitigate the impact of AUTDv1 on server performance.

What to do.

Around this time, we had begun looking into what it would take to offer an up-to-date mobile email solution ("AUTD," from here on) that competed with the likes of RIM, Good, et al.  I liked the up-to-date nature of their solutions but had not personally adopted them for reasons of device choice (again with the form factor), setup costs (in terms of money, deployment overhead, and operational overhead), or both.  Being on the Exchange team, we’ve always got two sets of customers: the administrative staff and end users, and we wanted to build a solution that worked well for both.  By enumerating our requirements and constraints, we essentially painted ourselves into a corner (happily, this corner contained the solution):

  • The deployment of AUTD must be turn-key for the administrative staff.  Just install Exchange, check a checkbox or two, and you’re off and running.
  • The deployment of AUTD must not require a business relationship between any two of Microsoft, the enterprise deploying AUTD, and the mobile operator.
  • The solution must not require a network operations center (NOC).
  • Since, by and large, mobile devices are not internet-routable without a NOC and without having first contacted an internet-resident peer, the means by which AUTD works must be initiated by the device.
  • Enterprise administrators will laugh at us if we ask them to open inbound ports on their networks other than 80 (HTTP) and 443 (HTTPS).  Some of them laugh at us, anyway.
  • There must be no notion of “dropped” notifications.
  • The device side of the solution must not require any provisioning beyond what the user must already do in order to set up ActiveSync.

Within this definition of the problem, we came up with the following solution:

  • The device issues an HTTP request to Exchange, which asks Exchange to report any changes that occur in the mailbox of the requesting user within a specified time limit.  The URL of this HTTP request is the same as that of other AirSync commands ("/Microsoft-Server-ActiveSync") with some differing query string parameters.  The body of the HTTP request allows the client to specify those folders that Exchange should monitor for changes.  Typically, these will be the Inbox, Calendar, Contacts, and Tasks folders.
  • Upon receiving this request, Exchange will monitor the specified folders until either the time limit expires or a change (such as the arrival of a piece of email) occurs in one of those folders, whichever comes first.  Exchange will then issue a response to this request that notes in which folders the changes occurred.  Of course, this will be empty if the time limit elapsed before any changes occurred.
  • Upon receiving an empty response, the device simply re-issues the request.  This loop of issuing a request for change notifications, receiving an empty response, and re-issuing the request for change notifications is called "the heartbeat."
  • Upon receiving a non-empty response, the device issues a synchronization request against each folder in the response.  When those complete, it re-issues the request for change notifications.
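The loop in the four steps above can be sketched in a few lines of Python.  This is an illustration only, not the actual device code: `run_heartbeat`, `scripted_server`, and the callback signatures are hypothetical names, and the HTTP request to Exchange is replaced by a scripted stand-in so that the example runs on its own.

```python
from typing import Callable, List

FOLDERS = ["Inbox", "Calendar", "Contacts", "Tasks"]


def run_heartbeat(
    wait_for_changes: Callable[[List[str], int], List[str]],
    sync_folder: Callable[[str], None],
    folders: List[str],
    time_limit_s: int,
    max_requests: int,
) -> int:
    """Issue up to max_requests change-notification requests.

    Returns the number of empty responses, i.e. heartbeats.
    """
    heartbeats = 0
    for _ in range(max_requests):
        # Stand-in for the HTTP request; blocks until a change or the limit.
        changed = wait_for_changes(folders, time_limit_s)
        if not changed:
            heartbeats += 1  # empty response: simply re-issue the request
            continue
        for folder in changed:
            sync_folder(folder)  # sync only the folders that were reported
    return heartbeats


def scripted_server(responses: List[List[str]]) -> Callable[[List[str], int], List[str]]:
    """Hypothetical fake server that replays a fixed list of responses."""
    replies = iter(responses)

    def wait_for_changes(folders: List[str], time_limit_s: int) -> List[str]:
        # Report only folders that the device asked to have monitored.
        return [f for f in next(replies) if f in folders]

    return wait_for_changes


synced: List[str] = []
server = scripted_server([[], ["Inbox"], [], ["Calendar", "Contacts"]])
heartbeats = run_heartbeat(server, synced.append, FOLDERS, 20 * 60, max_requests=4)
```

In this scripted run, two of the four requests come back empty, and only the Inbox, Calendar, and Contacts folders are ever synchronized; Tasks is monitored but never touched.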

I’ve omitted some details here, but that is what is going on under the covers when you check the "Enable up-to-date notifications via HTTP" checkbox in Exchange System Manager in Exchange 2003 SP2, and it has the benefit of working on any mobile operator network that supports internet connectivity.  Since the hopes of increased revenues of most mobile operators appear to be pinned on the possibility of selling users on data-enabled applications, this seemed like a safe enough bet.

Further, by using HTTP, we do not require enterprises to open any inbound ports beyond what they’ve already had to open in order to support Outlook Web Access (OWA), Outlook’s RPC-over-HTTP feature, and ActiveSync itself.  Finally, the client-initiated nature of HTTP makes the device ultimately responsible for connectivity with Exchange.  Upon receiving the request for change notifications from the device, Exchange will return a response immediately if any changes have occurred since the last synchronization, so a change that lands while the device is disconnected is reported on the very next request.  This is how we prevent "dropped" notifications.  If the device ever drifts out of coverage, it will enter a re-try loop and connect as soon as it is able.  The network resilience logic of the device can also be triggered when the time limit elapses without a response having been received from the server.
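The server side of that behavior can be sketched as follows.  This is a toy model under stated assumptions, not how Exchange is implemented: `MailboxMonitor` and its methods are hypothetical names, and an in-memory set stands in for whatever Exchange actually tracks per mailbox.  The point it illustrates is that changes are remembered until they are synchronized, so a request that arrives late still hears about them right away.

```python
import threading
from typing import Iterable, List, Set


class MailboxMonitor:
    """Toy model: remembers which folders changed since their last sync."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._changed: Set[str] = set()

    def record_change(self, folder: str) -> None:
        """Called when, say, a piece of email arrives in a folder."""
        with self._cond:
            self._changed.add(folder)
            self._cond.notify_all()  # wake any outstanding request

    def mark_synced(self, folder: str) -> None:
        """Called once the device has synchronized the folder."""
        with self._cond:
            self._changed.discard(folder)

    def wait_for_changes(self, folders: Iterable[str], time_limit_s: float) -> List[str]:
        """Return changed folders at once if any exist, else wait up to the limit."""
        watched = set(folders)
        with self._cond:
            # wait_for checks the predicate before blocking, so changes that
            # happened before this request arrived are reported immediately.
            self._cond.wait_for(
                lambda: bool(self._changed & watched), timeout=time_limit_s
            )
            return sorted(self._changed & watched)


monitor = MailboxMonitor()
monitor.record_change("Inbox")  # mail arrives while the device is out of coverage
first = monitor.wait_for_changes(["Inbox", "Calendar"], time_limit_s=0.05)
monitor.mark_synced("Inbox")
second = monitor.wait_for_changes(["Inbox", "Calendar"], time_limit_s=0.05)
```

Here `first` reports the Inbox without waiting, even though the change predates the request, and `second` comes back empty once the time limit elapses.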

So there we have it: an up-to-date mobile email solution that is friendly for administrators and users alike.  Changes trickle into the phone in the same way that they do into Outlook on the desktop.  In fact, updates appear on the phone before they do in Outlook and OWA!

Now then, if you’ve been paying attention, you’ve probably noticed that AUTD requires a persistent data connection twixt the device and Exchange, and you’ve got a few issues with this:

  • Won’t the always-on data connection hose the battery of the device?  If we were constantly sending and receiving packets, yes.  However, note that for much of the lifetime of a request for change notifications, we are just waiting for a response.  GPRS radios consume very little power unless they are actively transmitting.  Further, the lifetime of a request for change notifications is chosen independently by each device, and, in practice, these requests tend to live for upwards of twenty minutes in the no-email case.  The means by which the device chooses this lifetime is tuned to minimize bytes over the wire and maximize battery life.  Five-minute scheduled sync behaves worse in this regard.
  • Won’t the always-on data connection result in massive data charges for users?  Not really – the synchronization operations that are performed in AUTD are targeted at only those folders that contain changes, so you’re never issuing lots of empty syncs as you are with a scheduled or manual sync.  Five-minute scheduled sync behaves worse in this regard, too.
  • How much data traffic does AUTD require?  We get this question a lot.  The best answer is that we have no idea.  How much email do you get in a day?  That’s about how much traffic AUTD requires.  Unhappy with that number?  Consider sending less email or ending certain personal and professional relationships.

What the previous three points add up to is that AUTD is actually better for mobile operator networks and device battery life than the solution based on scheduled sync that is used by devices that mobile operators sell today.  We’ve had a bit of difficulty in getting this point across to some mobile operators.
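For a rough sense of scale in the no-email case, the arithmetic can be sketched as below.  The inputs are assumptions chosen for illustration: four monitored folders (the typical set listed earlier), one sync request per folder per scheduled cycle, and a twenty-minute lifetime for each change-notification request.  The function names are hypothetical, and request counts say nothing about the number of bytes per request.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def scheduled_syncs_per_day(interval_s: int, folder_count: int) -> int:
    """Folder syncs issued per day when every folder is synced on a schedule."""
    return (SECONDS_PER_DAY // interval_s) * folder_count


def heartbeat_requests_per_day(request_lifetime_s: int) -> int:
    """Change-notification requests issued per day when no changes occur."""
    return SECONDS_PER_DAY // request_lifetime_s


# Assumed inputs: 5-minute schedule, 4 folders, 20-minute request lifetime.
scheduled = scheduled_syncs_per_day(interval_s=5 * 60, folder_count=4)
heartbeat = heartbeat_requests_per_day(request_lifetime_s=20 * 60)
```

Under those assumptions, an idle day costs 1,152 folder syncs on the five-minute schedule versus 72 change-notification requests.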

  • Will the increased connection load bring down Exchange front-end machines?  Servicing OWA and RPC-over-HTTP already results in thousands of outstanding connections to the front-end machines in our own deployment of Exchange ("we use it before you do").  The additional connection load imposed by AUTD is a drop in the bucket, relatively speaking.  Further, before AUTD could be deployed to service our corporate mailboxes here, we had to get past a security review (well, three, actually) with various corporate IT and security folks.  Meaning, we’re running it here and with no additional hardware.
  • By eliminating the NOC, isn’t this solution less secure?  This is among my favorite questions, and it’s usually followed up with some hand-waving about the connection to the enterprise "somehow" getting "hijacked."  The answer is, it is exactly as secure as the last online purchase you made with your credit card, exactly as secure as the last time you checked your email with OWA, and exactly as secure as the last time you used Outlook with RPC-over-HTTP.  That is, we use SSL (which itself negotiates over-the-wire encryption using RC4 or 3DES) to communicate between the device and the server.  I suppose that you could run this with SSL disabled, but you also risk a concussion if you run top-speed into a brick wall.  Just a little fyi.
  • What do the mobile operators think about all this?  Good question.  An end-to-end prototype of this solution was built in early 2004, and the next year was spent in trials with mobile operators all over the world, taking their feedback and addressing their concerns.  At the end of that process, I feel pretty good about what we’ve got.

As you might guess, we’ve been running early versions of this in Exchange for a few months now.  One of the more satisfying testaments to the utility of our AUTD solution is watching upper management bump into each other in the halls as they consult their devices for the email that just arrived or for the location of the meeting for which they’re already ten minutes late.

Conclusion

To let out a little secret, I’m not actually all that interested in having up-to-date email[5] on my phone, though that aspect of it is a big favorite for our upper-management types around here.  For me, having updates that I make to my calendar and contacts “just appear” on the phone without any special, conscious action on my part was the motivating idea behind all of this.

- Sami Khoury
 

[1] Ok, consider what this means besides the apparent fact that I like Nokia phones.
[2] AT&T Wireless may have carried PDAs running the PocketPC operating system around that time, but, on social grounds, I refuse to carry around one of those things.
[3] The ActiveSync protocol is proprietary, but Microsoft has begun licensing it to third-party vendors like Motorola, Nokia, PalmOne, and Symbian.  Given that, the choice of devices that allow for high-fidelity synchronization with Exchange is no longer limited to those running Windows Mobile operating systems.
[4] Motorola, if you’re listening, this is one of the omissions from the MPx220 that is keeping me from buying one.
[5] Truth be told, I am actively uninterested in having email from work constantly appearing on my phone, but my boss is probably reading this.
