With MLGO, Google has released a framework that optimizes the LLVM compiler infrastructure using machine learning (ML) methods. Among other things, it replaces heuristic approaches with reinforcement learning (RL) for inlining decisions. The “GO” in the name has nothing to do with Google’s programming language Go; it stands for “Guided Optimizations”.
Well-observed strategy
Reinforcement learning works with an ML agent that observes its environment and learns from a reward signal how well its actions accomplish the task. It follows a trial-and-error approach and can, among other things, efficiently develop strategies for board and video games. AlphaGo, for example, used RL to assess the impact of its moves on the outcome of a game.
The paper “MLGO: a Machine Learning Guided Compiler Optimizations Framework”, available on arXiv, sees two advantages of RL over the hand-written heuristics that LLVM currently uses for optimization: on the one hand, RL needs no optimal examples to learn from, and on the other hand, it can try out different strategies and feed the results back into the model’s training.
Inline instead of function call
At launch, MLGO targets two areas of compiler optimization: inline substitution (inlining) and register allocation (regalloc). The former replaces a function call with the function body, copying its code directly into the caller. This eliminates the overhead of the call. Depending on the code’s structure, inlining can improve performance and reduce the size of the build, but it can also make the compiled output significantly larger by copying the same lines of code to several places.
Nested function calls complicate the trade-off between costs and benefits. This is where heuristics have come in so far, and where reinforcement learning takes over with MLGO, to recognize whether inline substitution improves or degrades the compiled code. The Google blog post includes an example that is tailor-made for inlining.
When processing the call graph, which represents the sequence of function calls, the compiler asks the RL model whether it considers inlining useful for a specific call. The model is trained with policy-gradient and evolution-strategies algorithms. During training, the compiler keeps a log of the decisions made and the result of the optimization. After compilation, the trainer uses this log to update the model.
Register allocation attempts to improve performance by optimizing how local automatic variables are assigned to the processor’s limited number of registers. Here, too, MLGO replaces existing heuristic approaches with reinforcement learning.
Google trained the regalloc model on the code of its in-house software. According to the blog post on MLGO, the results generalize: across different software in the internet giant’s data centers, the model achieved improvements of 0.3 to 1.5 percent in queries per second (QPS). For inlining, Google’s open-source operating system Fuchsia served as a test candidate, and according to the blog post, the MLGO optimizations reduced the size of the C++ translation units by 6.3 percent.
Open to everything
MLGO is available as an open-source project on GitHub. According to Google, it is open to any customization, be it better RL algorithms or compiler optimizations beyond inlining and register allocation. The GitHub repository includes a demo that uses policy-gradient optimization for inline substitution.