Why WhatsApp made its web app use your phone as a server
Ever since WhatsApp announced their web app, I have seen various people complain about having to keep their phone on to send messages. But in all of these posts people seem to be overlooking two key points about how WhatsApp works which either necessitate this design or at least help WhatsApp keep their service lean and fast.
In case you didn’t hear, WhatsApp (supposedly) does end-to-end encryption. That means WhatsApp can’t read what people say in their messages. What this also means, though, is that if someone sends you a message and you have both your phone and the web app, there is no way for each device to read that message independently without sharing the decryption key between them, which is not good security practice. You could have separate keys for your phone and web browser, but then you would need to share keys between your phone, your browser, and all of your contacts so they could receive messages from any of your devices.
But if you treat your phone as your personal WhatsApp server, then your phone can continue to hold the master keys for your account and you only need to manage keys between your phone and your other devices. That makes it a hub-and-spoke system where your phone is the hub and your other devices are the spokes. This keeps WhatsApp out of the key management business and allows you to easily revoke keys for your other devices without WhatsApp having to store anything for you. So routing everything through your phone keeps everything encrypted and limits key sharing between users to just their phones, which keeps security simple, and simple is how you want it to be.
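To make the hub-and-spoke idea concrete, here is a minimal sketch in Python. It is not WhatsApp's actual protocol: the `PhoneHub` class, its method names, and the use of HMAC-SHA256 to authenticate requests from a paired device are all assumptions I made up for illustration. The point is only that pairing and revoking a browser touches nothing but the phone.

```python
import hashlib
import hmac
import secrets


class PhoneHub:
    """Hypothetical hub: the phone keeps the account key plus one link key per spoke."""

    def __init__(self):
        # Account key that the rest of the world deals with; it never leaves the phone.
        self.master_key = secrets.token_bytes(32)
        # Device name -> link key known only to the phone and that one device.
        self.link_keys = {}

    def pair_device(self, name):
        """Create a fresh link key for a new spoke, such as a browser session."""
        key = secrets.token_bytes(32)
        self.link_keys[name] = key
        return key

    def revoke_device(self, name):
        """Forget a spoke. Contacts and the service never need to hear about it."""
        self.link_keys.pop(name, None)

    def accept_from_device(self, name, message, tag):
        """Return True only if a currently paired device authenticated the request."""
        key = self.link_keys.get(name)
        if key is None:
            return False
        expected = hmac.new(key, message, hashlib.sha256).digest()
        return hmac.compare_digest(expected, tag)


def sign(link_key, message):
    """What a spoke would attach to each request it sends to the phone."""
    return hmac.new(link_key, message, hashlib.sha256).digest()


if __name__ == "__main__":
    hub = PhoneHub()
    browser_key = hub.pair_device("browser")
    request = b"send: hello"
    print(hub.accept_from_device("browser", request, sign(browser_key, request)))
    hub.revoke_device("browser")
    print(hub.accept_from_device("browser", request, sign(browser_key, request)))
```

The browser's request is accepted while it is paired and rejected once it is revoked, and at no point does `master_key` or any contact's key have to change.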
Someone pointed out to me that WhatsApp is not in the storage business, and when you stop and look at how the service was structured before the web app came along you will realize that WhatsApp tried to minimize what data it had to store from its inception. For instance, if you get a new phone you actually have to migrate your messages over yourself. Since WhatsApp doesn’t keep messages on their servers, your phone ends up being the keeper of truth when it comes to what messages you sent and received.
But the key thing about WhatsApp not storing messages on their servers is how much it simplifies their service. Consider their last publicly stated user count of 500,000,000. Since WhatsApp doesn’t store messages for you, they really only need to store messages that have yet to be delivered to a user’s phone and your account’s configuration data. So let’s assume every user suddenly sent a bunch of photos that came to a total of 1 MB of pending messages (remember that WhatsApp is only going to show you a small version of a photo and so they can compress them such that they don’t take up much space; 1 MB should go a long way). That’s 1 MB * 500,000,000 = 500 TB of storage. OK, not a puny number for most services.
But let’s look at this from a cloud perspective. Let’s say you wanted an extremely fast service, so you would want to use local SSD, which Google Compute Engine offers. As I write this, a local SSD on GCE is 375 GB and you can have up to 4 per instance. At 375 GB each, that means you would need 1,334 SSDs to store 500 TB of data (it would also require at minimum 334 instances, which is probably enough for the service to run on, but I’m not pricing out computation or bandwidth costs to keep this simple). Now according to GCE’s pricing, a local SSD costs USD 81.75/month. That means it would cost you USD 81.75 * 1,334 SSDs = USD 109,054.50/month or USD 1,308,654/year. Now let’s be totally extravagant and say you want the data replicated near the sender, near the receiver, and on one other continent for safekeeping until delivery (when the sender and receiver are on the same continent the data could go to two separate clusters). So we are talking about 3N data replication. That works out to USD 3,925,962/year in local SSD storage costs if you used Google Compute Engine at full retail and maintained this level of storage constantly. In the current startup climate, USD 4 million/year is a pittance (and unnecessary, as I bet WhatsApp could get away with 2N data replication, if that, since this is only for buffering purposes). And for a company like Facebook who have their own clusters? Storing 1.5 PB of data would not be difficult at all, so WhatsApp could go as far as backing up the data on every populated continent if they wanted to, for 3 PB or less than USD 8 million/year.
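If you want to check the arithmetic or plug in your own numbers, here is a small Python sketch of the estimate. The constants are the figures quoted above (1 MB pending per user is my assumption, and the SSD size and price are what GCE listed as I wrote this); the function names are just for illustration. Prices are kept in cents so the totals stay exact, and SSDs are rounded up per copy of the data since each replica lives in its own cluster.

```python
USERS = 500_000_000
PENDING_BYTES_PER_USER = 1_000_000      # assumed 1 MB of undelivered messages each
SSD_BYTES = 375 * 10**9                 # one GCE local SSD, 375 GB
SSDS_PER_INSTANCE = 4                   # maximum local SSDs per instance
SSD_CENTS_PER_MONTH = 8175              # USD 81.75, kept in cents


def ceil_div(a, b):
    """Integer division that rounds up."""
    return (a + b - 1) // b


def ssds_per_copy():
    """Local SSDs needed to hold one full copy of the pending messages."""
    return ceil_div(USERS * PENDING_BYTES_PER_USER, SSD_BYTES)


def instances_per_copy():
    """Minimum instances needed just to attach that many SSDs."""
    return ceil_div(ssds_per_copy(), SSDS_PER_INSTANCE)


def yearly_cost_usd(replicas=1):
    """Storage-only cost per year for the given number of full copies."""
    cents = ssds_per_copy() * replicas * SSD_CENTS_PER_MONTH * 12
    return cents / 100


if __name__ == "__main__":
    print("SSDs per copy:", ssds_per_copy())
    print("Instances per copy:", instances_per_copy())
    for replicas in (1, 2, 3, 6):
        print(f"{replicas}N replication: USD {yearly_cost_usd(replicas):,.2f}/year")
```

With the constants above this gives 1,334 SSDs and 334 instances per copy, USD 1,308,654/year for one copy, USD 3,925,962/year for 3N, and USD 7,851,924/year for a copy on each of six continents, which is where the "less than USD 8 million" figure comes from.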
In other words, by relying on your phone as the storage mechanism for messages and not having to keep anything on their own servers, WhatsApp can run very cheaply and efficiently (and charge you only USD 1/year as a user). This makes WhatsApp basically nothing more than a fast routing service with some buffering between phones, much like the telcos (which might be what inspired WhatsApp to use Erlang). It’s rather ingenious and probably why the service seems so fast and has such great uptime.
And I bet users don’t care about the potential loss of messages either; when was the last time you scrolled back into the history on your phone to a point that preceded you getting that phone or having catastrophic data loss on the device? Plus you have to consider how many users of WhatsApp are really going to use the web app; I bet a large portion of their users don’t even have their own computers beyond their phones so this web app service probably isn’t critical to WhatsApp’s success.