Highly complex services, such as those here at Facebook, have large source code bases in order to deliver a wide range of features and functionality. Even after the machine code for one of these services is compiled, it can range from 10s to 100s of megabytes in size, which is often too large to fit in any modern CPU instruction cache. As a result, the hardware spends a considerable amount of processing time — nearly 30 percent, in many cases — getting an instruction stream from memory to the CPU.
This isn’t a problem everyone runs into, but I’m always fascinated by the problems big companies discover at their scale, even though I’ll probably never encounter these kinds of problems myself (at least not in the near future).
I’m very thankful that Facebook publishes posts like this, where they explain the problems they face and describe in detail how they improved things. On top of that, they released the code as open source, so anyone can use it and contribute to it, without asking for anything in return.
This is a quite technical post (the kind I really like), and if you don’t have a grasp of how a CPU works at a low level, it’s not easy to follow or to understand exactly what problem they are solving and how they solved part of it. From the article, it looks like there are other things that can be improved in the future, so hopefully they will keep developing this tool (they are already using it in production, so I assume they will), both in their own interest and possibly with a community around it.
Another interesting side note: at the end of the article they thank the LLVM community, since they used some of the LLVM libraries to simplify their development. LLVM is a great tool in its own right, very well engineered, and a great resource for everyone working with compilers.