MAME is quite a big project. It has 5538 C source files and 3304 header files (counting all third-party libraries), for a total of 4,416,185 lines of code.
The source files have the .c extension, but the project is actually written in C++. The transition from C to C++ happened a few years ago, mainly led by Aaron Giles. C++ is known[citation needed] to take longer than C to compile.
The result of all this is that MAME takes a very long time to build. With my Intel Core 2 Quad Q6600 running Ubuntu 14.04, it takes almost 2 hours to build MAME. In this post I will give a few tips on how to get faster MAME build times. Some tips are only available for a few operating systems.
1. Use clang instead of gcc
GCC is awesome. It’s the compiler that powers the open-source world. It’s been doing this for over 27 years. But some 11 years ago LLVM came along, and some 7 years ago came clang, now a serious contender in the compilers world.
You can specify the compiler you want to use while building MAME with the CC=<compiler> option.
$ time make CC=clang
[...]
real 72m44.722s
user 65m47.202s
sys 4m57.559s
If you don’t want make to print out all compilation commands, just put the @ sign before the compiler name, like this: make CC=@clang
How much of a speedup do we get by using clang instead of gcc? Let’s find out.
$ time make CC=gcc
[...]
real 111m37.837s
user 102m32.533s
sys 8m31.289s
A full build of MAME takes about 35% less time with clang than with gcc (72m45s instead of 111m38s, roughly 1.5 times faster).
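If you want to check that figure against your own runs, here is a small sketch that turns two "real" values from time into a percentage of time saved. The helper names to_seconds and percent_saved are made up for this post, and it assumes durations in the XmY.Zs form shown above.

```shell
# Convert a `time`-style duration such as 111m37.837s into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f", $1 * 60 + $2 }'
}

# Percentage of wall-clock time saved going from $1 (before) to $2 (after).
percent_saved() {
    before=$(to_seconds "$1")
    after=$(to_seconds "$2")
    awk -v b="$before" -v a="$after" 'BEGIN { printf "%.1f", (b - a) / b * 100 }'
}

# gcc vs. clang "real" times from this post.
percent_saved 111m37.837s 72m44.722s   # prints 34.8
echo
```

The same helper works for any pair of timings in this post, as long as both come from the same machine and the same kind of build.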
Clang is now the default compiler on Mac OS X, so there’s no need to specify it on the command line. Unfortunately, clang still does not support Windows.
2. Use multiple cores
With most processors now having multiple cores, it’s quite straightforward that we should be using all those cores for compilation. A single compiler process doesn’t spread its work over multiple cores, but since we have a bunch of files to compile and they’re independent of each other, we can compile them in parallel.
You can specify the number of jobs to run in parallel with the -j <number of jobs> option.
With GCC and two cores, the compilation time is cut roughly in half:
$ time make -j2
[...]
real 54m45.346s
user 98m30.421s
sys 8m37.893s
Use as many cores as you can for compilation. It may even be worth running more jobs in parallel than the actual number of cores. A good rule of thumb is to use <number of cores + 1> jobs.
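Rather than hard-coding the number, you can let the shell work it out. This is only a sketch: suggested_jobs is a made-up helper name, and it assumes getconf _NPROCESSORS_ONLN is available to report the number of online cores (it is on Linux and Mac OS X as far as I know). It only prints the command instead of running it.

```shell
# Suggested -j value for a given core count: cores + 1 (the rule of thumb above).
suggested_jobs() {
    echo $(( $1 + 1 ))
}

# Ask the system for the number of online cores; fall back to 1 if
# getconf is missing or does not know the variable.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Print the resulting command line.
echo "make -j$(suggested_jobs "$cores")"
```

On a quad-core machine like my Q6600 this should print make -j5.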
3. Disable GNU make builtin rules
Up to now we’ve covered speedups for building the whole project. But what if you’re hacking away in MAME, and you have to compile your code over and over with small changes in between each run? The 30 seconds it takes to compile that one change and link MAME seem like FOREVER. Every second that can be scraped off is welcome.
Every time you run GNU make, it will check for all the files that have to be recompiled. If you have changed only one file, make will still check for all other files to see if they have to be recompiled. This is normally quite quick, but GNU make has a thing called implicit rules. It will check for a bunch of other files that you never asked for in the first place. I don’t know when this is really useful, but most modern Makefiles don’t need to use any implicit rules. MAME doesn’t.
You can disable the built-in implicit rules with the -r option (long form: --no-builtin-rules).
To show the benefits of disabling implicit rules, I’ll run make in a fully built directory. Everything has already been built, so there’s nothing for make to do, except for checking if it needs to make any more rules, implicit or explicit.
On a system with no files cached (cache cleared between runs):
$ time make
make: Nothing to be done for `default'.
real 0m2.136s
user 0m0.577s
sys 0m0.120s
$ time make -r
make: Nothing to be done for `default'.
real 0m0.736s
user 0m0.119s
sys 0m0.122s
When all files are already in the cache:
$ time make
make: Nothing to be done for `default'.
real 0m0.550s
user 0m0.504s
sys 0m0.046s
$ time make -r
make: Nothing to be done for `default'.
real 0m0.128s
user 0m0.086s
sys 0m0.042s
The gains are very small on Linux and Mac OS X, less than 2 seconds even in the worst case. But let’s try this on Windows now:
> timer make
make: Nothing to be done for `default'.
time taken: 4164 ms
> timer make -r
make: Nothing to be done for `default'.
time taken: 472 ms
Did you see that? Instead of taking 4.164 seconds, make now takes only 472 milliseconds, almost 9 times faster. The gains are HUGE on Windows systems, where file system operations take an awkwardly long time.
4. Use gold
Suppose we’re still hacking MAME and making small changes in one source file only. Even if we have to compile only one file, we still have to link the MAME executable in its entirety. This means walking through all compiled object files to make one final executable. This step can take considerably longer than compiling any source files that have changed.
Since 2008, GNU binutils has included a new linker optimized for ELF files and big C++ projects. This linker is called gold. It does a very good job with MAME.
You can specify the linker you want to use while building MAME with the LD=<linker> option.
You don’t pass LD=gold directly; instead, you tell g++ to use gold while linking. The option thus becomes: LD="g++ -fuse-ld=gold"
Let’s see how much speedup we can get with gold instead of the default linker:
$ rm -f mame64 && time make -r
Linking mame64...
real 0m20.442s
user 0m16.478s
sys 0m2.757s
$ rm -f mame64 && time make LD="@g++ -fuse-ld=gold" -r
Linking mame64...
real 0m5.012s
user 0m4.185s
sys 0m0.781s
The linking step takes about 75% less time with gold (5 seconds instead of 20, roughly 4 times faster).
Unfortunately this linker only works for generating ELF files, which means it only works for Linux builds. Mac OS X and Windows can’t use this linker.
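Since gold isn’t available everywhere, it can be handy to check for it before passing LD to make. The sketch below assumes gold is installed under its usual binutils name, ld.gold, and that it can be found on PATH; has_tool and gold_ld_arg are made-up helper names, not something MAME provides.

```shell
# Succeed if the named program can be found on PATH.
has_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Print the LD= argument from this section only when gold looks usable;
# print nothing otherwise, so make keeps its default linker.
gold_ld_arg() {
    if has_tool ld.gold && has_tool g++; then
        echo 'LD=g++ -fuse-ld=gold'
    fi
}

gold_ld_arg
```

To use it without handing make an empty argument when gold is missing, something like this should do:
$ arg=$(gold_ld_arg); make -r ${arg:+"$arg"}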
Putting it all together
So, different operating systems have different tricks to speed up MAME build times.
For Linux, use clang and gold:
$ make -r CC=@clang LD="@g++ -fuse-ld=gold" -j4
For Mac OS X, clang is used by default, and you can’t use gold:
$ make -r -j4
For Windows, you can’t use clang or gold, but at least you can use multiple cores and shave a few seconds off by disabling implicit rules:
> make -r -j4
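If you build on more than one machine, the three commands above can be folded into one small helper. This is just a sketch: mame_make_cmd is a made-up name, it takes the platform as reported by uname -s plus a job count, and it only prints the command so you can inspect it before running it.

```shell
# Print the make command this post suggests for a platform and job count.
# $1: platform name as reported by `uname -s` (Linux, Darwin, anything else)
# $2: number of parallel jobs
mame_make_cmd() {
    case "$1" in
        # Linux: clang for compiling, gold for linking.
        Linux) echo "make -r CC=@clang LD=\"@g++ -fuse-ld=gold\" -j$2" ;;
        # Mac OS X (Darwin) and Windows: only -r and -j apply.
        *)     echo "make -r -j$2" ;;
    esac
}

mame_make_cmd "$(uname -s)" 4
```

Mac OS X and Windows end up with the same command here, since neither can use gold and neither needs CC set.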