Monday, September 14, 2009

Spillage!

Hi,

One of the things I was determined to implement in BlitzMax right from the start was a decent 'register allocator'.

Register allocation is the process of mapping 'local' program variables (i.e. those variables declared 'Local' in BlitzMax) to CPU registers (i.e. the finite set of 'variables' inside the actual x86 CPU).

Blitz3D and BlitzPlus largely ignored this. Instead, all local variables were simply stored in memory (on the stack), and loaded into CPU registers as required. For example, Blitz3D code like this…

Local x,y,sum
x=1
y=2
sum=x+y

…would involve approximately the following steps:

1) Write the constant '1' into the memory location for variable 'x'.
2) Write the constant '2' into the memory location for variable 'y'.
3) Read the variables 'x' and 'y' from memory and add them together.
4) Store the result into the memory location for variable 'sum'.

That's 2 reads and 3 writes from/to system memory, and system memory is generally slow (even with fancy caches) when compared with the CPU (but I guess it'll be catching up soon…).

BlitzMax on the other hand will try to represent the variables x,y and sum using CPU registers, so equivalent BlitzMax code might involve something like…

1) Write the constant '1' to CPU register EAX.
2) Write the constant '2' to CPU register ECX.
3) Write the sum of EAX and ECX to CPU register EDX.

…none of which 'hits' memory so is theoretically much faster.

And, happily, this time at least, theory matches practice - BlitzMax code runs significantly faster than equivalent Blitz3D code in general, and this is almost completely due to BlitzMax's register allocator.

There is, however, one fairly major drawback to writing a register allocator - it's *@#!$^& hard! ESPECIALLY on the crappy Intel X86 which has only about 8 general purpose registers.

In particular, the trickiest thing to deal with is 'spillage' - handling the situation where there just aren't enough CPU registers to go round and some local variables MUST be 'spilled' to memory.

In this case, a spilled variable ends up behaving much like a Blitz3D/BlitzPlus variable above does - it must be constantly loaded/stored from/to memory. Therefore, the most important job of the register allocator becomes minimizing the number of spilt variables.

This is tricky because it can be hard to know precisely what effect spilling a variable will have. Spilling a variable may even cause a 'cascade' of more spills, because even though a spilled variable is stored in memory, it still needs to be moved into a CPU register at SOME point to be used for anything useful (e.g. arithmetic, memory addressing etc)!

And every now and then, someone will come up with a little piece of code in BlitzMax that causes this effect. The code will usually be small, involve about 8 or so local variables and will cause the compiler to lock up completely as the register allocator gets caught in an endless allocate/spill/allocate/spill loop...

In fact, it happened again just yesterday - hence this post!

I *think* I have it sorted now, but to be honest I seriously doubt I'll ever be able to say with 100% confidence that I know it works!

Peace out!
Mark

12 comments:

  1. This is good to know. At least the problem get caught early.( i.e. failed compilation) :)

    Whats a safe number of locals to use to prevent spillage?

    ReplyDelete
  2. Hi,

    > Whats a safe number of locals to use to prevent spillage?

    Don't worry about spillage - there are several things that can 'trigger' it and there is no sane way to really predict when it's gonna happen.

    Besides, locals are better than the alternatives (globals, fields) because they at least have the potential to be placed into CPU registers (and they generally keep the program cleaner) - with globals and fields are always stored in memory.

    That said, it can help to keep functions shortish - in my opinion, a 'long' function is about a pages worth. At that point I'd start thinking about splitting it up.

    This both keeps your program a little saner, and can actually help out register allocation a bit.

    Mark

    ReplyDelete
  3. Hey Mark,
    finally a new blog entry. I enjoy all those background information about BlitzMax, it really helps to decide whether to get myself a license or not. I'm gonna take a look at my account today when I'm home from work. If there's not enough money left then end of this month. Hope this will also help to keep you blogging ;)

    ReplyDelete
  4. Hi Mark,

    Since fields are always stored in local memory would it be beneficial to transfer fields to local variables for calculation intensive stuff then?

    ReplyDelete
  5. Hey Mark,

    Interesting read. A quick question: in Strict/SuperStrict mode, does declaring a loop iterator inside the for...next declaration improve the chances of BlitzMax allocating it to a register or is it exactly equivalent to coding, say:

    Local i:Int
    For i = 0 Until blah

    Also, does declaring all the local variables at the top of the function have any impact on optimisations? (I know in C++, some compilers can optimize accordingly if variables are only declared at the point at which they are needed).

    ReplyDelete
  6. Hi,

    > Local i:Int
    > For i = 0 Until blah

    No difference really - 'For Local blah' is just a scoping convenience.

    > Also, does declaring all the local variables at the top of the
    > function have any impact on optimisations?

    Actually, it can do.

    Worse case is when a variable is defined right at the top of a function, and eventually used right at the end.

    This means the variable stays 'live' for the entire function and ends up interfering with all other variables.

    Ultimately, this may make it a good candidate for spilling, so it wont have that much impact, but if it were instead defined nearer to where it was used, it may not need to be spilled at all.

    More sophisticated compilers probably handle this better by moving code around (Max only does a little of this).

    ReplyDelete
  7. For local object variables, should I declare local variables or use global variables that get reset to Null before the function exits? Is there a cost to declaring a local variable, even if a new instance of the type is not created?

    ReplyDelete
  8. > should I declare local variables or use global variables that get reset to Null before the function exits?

    Use locals! At least they have a chance of being register-ized.

    > Is there a cost to declaring a local variable, even if a new instance of the type is not created?

    Nope.

    (ps: The above post is more just a 'what goes on in the background' sort of thing - I don't really expect people to take register allocation into account when writing real code. You're likely to end up expending a hell of a lot of effort for little, if any, real gain!)

    ReplyDelete
  9. Thanks for the tips.

    For math functions or anything that uses a temporary object, I usually do this:

    Function Thing()
    Global t:TVec3=New TVec3
    t.x=0; t.y=0; t.z=0
    EndFunction

    Rather than this:
    Function Thing()
    Local t:TVec3=New TVec3
    EndFunction

    ReplyDelete
  10. Thanks Mark, I'm sure many like me get a kick out of hearing about the Bmax internals.

    ReplyDelete
  11. Thank you, Mark. I enjoy hearing about the internals of the compiler, and there are some real nuggets here.

    ReplyDelete
  12. Second that. It's just great to know the tool we are all working with better and better.

    ReplyDelete

Note: Only a member of this blog may post a comment.