Today Adam Wolff, a technical fellow at Electric Capital, tweeted a story about debugging FB Chat. Here it is.

Before the major effort to redo the UI, FB Chat was super broken and we had no idea why.

We didn’t know what was wrong, but we knew the code was a mess. We set about rewriting both the front-end and the back-end in an effort to fix it.

The front-end rewrite pulled in a whole team of amazing engineers and became one of the big threads that led to reactjs.

During the time we were working on the Chat rewrite, we were also replacing the original Erlang backend with one written in C++. This was probably a good move, but the problem wasn’t with Erlang either.

When we finally gained insight into our deliverability data, we were able to cut it by region. We noticed Chat was really popular in India. This was before WhatsApp, at a time when SMS wasn’t reliable.
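As an aside, "cutting by region" just means grouping delivery outcomes by where the user is and comparing the rates. Here is a minimal sketch of that idea; the event format, the region names, and the `delivery_rate_by_region` helper are all made up for illustration and say nothing about how Facebook's data actually looked.

```python
from collections import defaultdict


def delivery_rate_by_region(events):
    """Return {region: fraction of messages delivered}.

    `events` is an iterable of (region, delivered) pairs, a hypothetical
    log format assumed for this sketch.
    """
    sent = defaultdict(int)
    delivered = defaultdict(int)
    for region, ok in events:
        sent[region] += 1
        if ok:
            delivered[region] += 1
    return {region: delivered[region] / sent[region] for region in sent}


# Made-up sample data: one region delivers far worse than the other.
sample_events = [
    ("region-a", True), ("region-a", True), ("region-a", True), ("region-a", False),
    ("region-b", True), ("region-b", False), ("region-b", False), ("region-b", False),
]

rates = delivery_rate_by_region(sample_events)
for region, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{region}: {rate:.0%} delivered")
```

A global average would hide the bad region; the per-region breakdown is what makes an outlier stand out.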

Eventually we pinpointed a region in India where one specific DNS provider was giving out the wrong IP addresses for our Chat servers. So when people went to use Chat, they would sometimes get a notification that they had a message, and then it would disappear. Or they’d send a message and it would get lost. All because they were connecting to the wrong IP address.
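To make the failure mode concrete: a check for this kind of problem could compare the addresses each DNS resolver returns against the addresses the service really uses. The sketch below is only an illustration under assumptions. The resolver names, the `find_bad_resolvers` helper, and the IP addresses (taken from the reserved documentation ranges) are hypothetical, and the answers are passed in as plain data rather than fetched over the network.

```python
import ipaddress


def find_bad_resolvers(answers, expected_ips):
    """Return {resolver: [unexpected IPs]} for resolvers giving wrong answers.

    `answers` maps a resolver name to the list of IP strings it returned
    for the chat hostname; `expected_ips` lists the real server addresses.
    Both inputs are assumed shapes for this sketch.
    """
    expected = {ipaddress.ip_address(ip) for ip in expected_ips}
    bad = {}
    for resolver, ips in answers.items():
        unexpected = {ipaddress.ip_address(ip) for ip in ips} - expected
        if unexpected:
            bad[resolver] = sorted(str(ip) for ip in unexpected)
    return bad


# Hypothetical data: resolver-2 hands out an address that is not ours.
expected_chat_ips = ["192.0.2.10", "192.0.2.11"]
resolver_answers = {
    "resolver-1": ["192.0.2.10"],
    "resolver-2": ["203.0.113.7"],
    "resolver-3": ["192.0.2.11", "192.0.2.10"],
}

bad_resolvers = find_bad_resolvers(resolver_answers, expected_chat_ips)
print(bad_resolvers)
```

Nothing in the client or server code would show up in a check like this, which is the point of the story: the bug lived outside the code being rewritten.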

That was it! None of the sexy new tech we were working on was going to solve that problem. Ever.