Faster MAME build times

MAME is quite a big project. It has 5538 C source files and 3304 header files (counting all 3rd party libraries). There are a total of 4416185 lines of code.

The source files’ extension is c, but the project is actually written in C++. The transition from C to C++ happened a few years ago, mainly led by Aaron Gilles. C++ is known[citation needed] to take longer than C to compile.

The result of all this is that MAME takes a very long time to build. With my Intel Core 2 Quad Q6600 running Ubuntu 14.04, it takes almost 2 hours to build MAME. In this post I will give a few tips on how to get faster MAME build times. Some tips are only available for a few operating systems.

1. Use clang instead of gcc

GCC is awesome. It’s the compiler that powers the open-source world. It’s been doing this for over 27 years. But some 11 years ago LLVM came along, and some 7 years ago came clang, now a serious contender in the compilers world.

You can specify the compiler you want to use while building MAME with the CC=<compiler> option.

$ time make CC=clang
[...]
real    72m44.722s
user    65m47.202s
sys 4m57.559s

If you don’t want make to print out all compilation commands, just use the @ sign before the compiler name, like this: make CC=@clang

How much of a speedup do we get by using clang instead of gcc? Let’s find out.

$ time make CC=gcc
[...]
real    111m37.837s
user    102m32.533s
sys 8m31.289s

The speedup for a full build of MAME is 35% when using clang instead of gcc.

Clang is now the default compiler in Mac OS X, so there’s no need to specify it in the command line. Unfortunately, clang still does not support Windows.

2. Use multiple cores

With most processors now having multiple cores it’s quite straight-forward that we should be using all those cores for compilation. The compiler itself is not optimized for multiple cores, but since we have a bunch of files to compile and they’re independent of each other, we can compile them in parallel.

You can specify the amount of jobs to run in parallel with the -j <number of cores> option.

With GCC and two cores we get the compilation time cut almost by half:

$ time make -j2
[...]
real    54m45.346s
user    98m30.421s
sys 8m37.893s

Use as many cores as you can for compilation. It may even be worth using more jobs in parallel than actual number of cores. A good rule is to use <number of cores + 1> jobs.

3. Disable GNU make builtin rules

Up to now we’ve covered speedups for building the whole project. But what if you’re hacking away in MAME, and you have to compile your code over and over with small changes in between each run? The 30 seconds it takes to compile that one change and link MAME seem like FOREVER. Every second that can be scraped off is welcome.

Every time you run GNU make, it will check for all the files that have to be recompiled. If you have changed only one file, make will still check for all other files to see if they have to be recompiled. This is normally quite quick, but GNU make has a thing called implicit rules. It will check for a bunch of other files that you never asked for in the first place. I don’t know when this is really useful, but most modern Makefiles don’t need to use any implicit rules. MAME doesn’t.

You can disable this feature with the -r option.

To show the benefits of disabling implicit rules, I’ll run make in a fully built directory. Everything has already been built, so there’s nothing for make to do, except for checking if it needs to make any more rules, implicit or explicit.

On a system with no files cached (cache cleared between runs):

$ time make
make: Nothing to be done for `default'.

real    0m2.136s
user    0m0.577s
sys 0m0.120s

$ time make -r
make: Nothing to be done for `default'.

real    0m0.736s
user    0m0.119s
sys 0m0.122s

When all files already in cache:

$ time make
make: Nothing to be done for `default'.

real    0m0.550s
user    0m0.504s
sys 0m0.046s

$ time make -r
make: Nothing to be done for `default'.

real    0m0.128s
user    0m0.086s
sys 0m0.042s

The gains are very small on most systems (Linux and Mac OS X), being less than 2 seconds in the worst-case scenario. But let’s try this on Windows now:

> timer make
make: Nothing to be done for `default'.

time taken: 4164 ms

> timer make -r
make: Nothing to be done for `default'.

time taken: 472 ms

Did you see that? Instead of taking 4.164 seconds, make now takes only 472 milliseconds. The gains are HUGE in Windows systems, where file system operations take an awkwardly long amount of time.

4. Use gold

Suppose we’re still hacking MAME and making small changes in one source file only. Even if we have to compile only one file, we still have to link the MAME executable in its entirety. This means walking through all compiled object files to make one final executable. This step can take considerably longer than compiling any source files that have changed.

GNU binutils has a new linker optimized for ELF files and big C++ projects since 2008. This linker is called gold. It does a very good job with MAME.

You can specify the linker you want to use while building MAME with the LD=<linker> option.

You don’t use LD=gold directly, but specify that you want g++ to use gold while linking. The command thus becomes: LD="g++ -fuse-ld=gold"

Let’s see how much speedup we can get with gold instead of the default linker:

$ rm -f mame64 && time make -r
Linking mame64...

real    0m20.442s
user    0m16.478s
sys 0m2.757s

$ rm -f mame64 && time make LD="@g++ -fuse-ld=gold" -r
Linking mame64...

real    0m5.012s
user    0m4.185s
sys 0m0.781s

The linking step is 75% faster when using gold.

Unfortunately this linker only works for generating ELF files, which means it only works for Linux builds. Mac OS X and Windows can’t use this linker.

Putting it all together

So, different Operating Systems have different tricks to speedup MAME build times.

For Linux, use clang and gold:
$ make -r CC=@clang LD="@g++ -fuse-ld=gold" -j4

For Mac OS X, clang is used by default, and you can’t use gold:
$ make -r -j4

For Windows, you can’t use clang or gold, but at least you can use multiple cores and shave a few seconds off by disabling implicit rules:
> make -r -j4

 

Leave a Reply

Your email address will not be published. Required fields are marked *