(and how we broke Gmail)
A simple assumption: an action is performed at a given time, so that time can serve as a unique value for sorting by date. Right?
The Gmail developers seem to have thought that way, and it seems they were not entirely correct.
On one of the cold nights of autumn 2016…
…my future fiancée and I each sent the other an e-mail using the Gmail Android application. It was supposed to be a “sleep well” e-mail. We both waited a few minutes for a reply and, when none came, went to sleep assuming the other person was already asleep.
After writing and sending new e-mails the next morning, we figured out that neither of us had received the “goodnight” e-mail! We checked the dates of the messages: they had been sent at the same time.
That might have been a coincidental glitch on the server at that moment, but let’s imagine it was a design flaw.
Every second is different
For a single e-mail client, every message sent will have a different timestamp, as long as the timestamp’s resolution is finer than the time the fastest software needs to send one e-mail, whether that is milliseconds or nanoseconds.
Under that assumption, every e-mail can be ordered by timestamp.
In conversation mode, the Gmail application displays e-mails one under another, sorted by creation (or arrival) date.
That night, the application logic was forced to make a decision that is impossible for a computer: which e-mail came first when both were sent at the same time?
Less sophisticated software would probably just order them arbitrarily (and perhaps display them in a different order on each view), but Gmail perhaps had to fix their order, on the server or even in the application cache, and that was not possible.
The design must have rested on an assumption that no test tool had ever broken by sending both e-mails at the same time, or the two-way nature of the communication had been ignored.
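To make the ordering problem concrete, here is a minimal Python sketch of a conversation sort that stays deterministic even when two timestamps are equal. It is not how Gmail works; the `Message` class, its fields, and the tie-breaking rule (sender, then message ID) are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sent_at_ms: int   # send time in milliseconds since the epoch
    sender: str       # hypothetical sender address
    message_id: str   # hypothetical ID, assumed unique per message
    body: str


def conversation_order(messages):
    """Sort by send time, breaking ties by sender and then by message ID."""
    return sorted(messages, key=lambda m: (m.sent_at_ms, m.sender, m.message_id))


# Two messages sent in the very same millisecond, one in each direction.
a = Message(1477950000000, "alice@example.com", "a-1", "Sleep well!")
b = Message(1477950000000, "bob@example.com", "b-1", "Good night!")

# Whatever order they arrive in, the displayed order is the same
# and both messages are kept.
assert conversation_order([a, b]) == conversation_order([b, a]) == [a, b]
```

The tie-breaker does not say which e-mail was "really" first; it only guarantees that the answer is the same every time and that neither message has to be dropped.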
Moral of the story
If timestamps are going to be your IDs, make sure their precision is finer than the shortest possible interval between two actions.
If uniqueness can be violated (make sure the DB column is declared unique), handle that error explicitly: force an additional millisecond, add randomness to the timestamp, or use a composite key of timestamp and client ID.
If it is a real-time system, ask the business side how the case should be treated and handle it accordingly.
Just do not assume it will never happen.
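Two of the options above, the composite key and the forced extra millisecond, could be combined roughly as in the following Python sketch. The `TimestampKeyAllocator` class and the client names are hypothetical, and the in-memory set only stands in for a unique constraint in a real database.

```python
class TimestampKeyAllocator:
    """Hands out unique (timestamp_ms, client_id) keys."""

    def __init__(self):
        # Stand-in for a unique constraint in a database.
        self._used = set()

    def allocate(self, timestamp_ms, client_id):
        key = (timestamp_ms, client_id)
        # Same client and same millisecond: push the timestamp forward
        # one millisecond at a time until the key is free.
        while key in self._used:
            key = (key[0] + 1, client_id)
        self._used.add(key)
        return key


alloc = TimestampKeyAllocator()
k1 = alloc.allocate(1477950000000, "alice-phone")
k2 = alloc.allocate(1477950000000, "bob-phone")    # other client, same instant
k3 = alloc.allocate(1477950000000, "alice-phone")  # same client, same instant

assert k1 != k2                               # the client ID tells them apart
assert k3 == (1477950000001, "alice-phone")   # bumped by one millisecond
```

The bumped timestamp is no longer the exact send time, so whether that distortion is acceptable is the kind of question to settle with the business side.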