We need to switch to a new register allocator.
The current one is split into a global and a local register allocator.
The global one can assign only callee-saved registers and runs
on the tree-based internal representation: it assigns local variables
to hardware registers.
The local one runs on the linear representation on a per-basic-block
basis and assigns hardware registers to virtual registers (which
hold temporary values during expression evaluation); it also deals
with the platform-specific issues (fixed registers, calling conventions).

Moving to a different register allocator will help solve some of the
performance issues introduced by the above split, make the register
allocator more easily portable and solve some of the issues generated by
dealing with trees.

The general design ideas are below.

The new allocator should have a global view of the whole method, so that it
can also assign variables to some of the volatile registers where possible,
even across basic blocks (this would improve performance).

The allocator would be driven by per-arch declarative data, so porting
should be easier: an architecture needs to specify register classes,
calling convention and instruction requirements (similar to the gcc code).

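As an illustration of what such per-arch declarative data could look like,
here is a minimal sketch in C. Every name in it (RegClassDesc, ArchDesc,
toy_arch, the register masks) is hypothetical and describes a made-up
machine with 8 integer and 4 fp registers, not any existing port;
instruction requirements are left out for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register classes; a real port may need more. */
typedef enum { REG_CLASS_INT, REG_CLASS_FP, REG_CLASS_COUNT } RegClass;

/* One bit per hardware register: bit i describes register i of the class. */
typedef struct {
    const char *name;
    uint32_t all_regs;      /* registers that belong to the class */
    uint32_t callee_saved;  /* preserved across calls */
    uint32_t fixed;         /* never allocatable (stack pointer, ...) */
} RegClassDesc;

typedef struct {
    const char *arch_name;
    RegClassDesc classes[REG_CLASS_COUNT];
    int int_arg_regs[4];    /* calling convention: integer argument registers */
    int n_int_arg_regs;
    int int_return_reg;     /* calling convention: integer return register */
} ArchDesc;

/* Made-up architecture: r0-r3 volatile, r4-r6 callee-saved, r7 fixed. */
static const ArchDesc toy_arch = {
    .arch_name = "toy",
    .classes = {
        [REG_CLASS_INT] = { "int", 0xFFu, 0x70u, 0x80u },
        [REG_CLASS_FP]  = { "fp",  0x0Fu, 0x0Cu, 0x00u },
    },
    .int_arg_regs = { 0, 1 },
    .n_int_arg_regs = 2,
    .int_return_reg = 0,
};

/* Registers the allocator may hand out at all. */
static uint32_t arch_allocatable_regs(const ArchDesc *a, RegClass c)
{
    assert((int)c < REG_CLASS_COUNT);
    return a->classes[c].all_regs & ~a->classes[c].fixed;
}

/* Allocatable registers that a call may clobber. */
static uint32_t arch_volatile_regs(const ArchDesc *a, RegClass c)
{
    assert((int)c < REG_CLASS_COUNT);
    return arch_allocatable_regs(a, c) & ~a->classes[c].callee_saved;
}
```

With tables like this the allocator itself contains no per-arch code: it only
asks the description which registers of a class are allocatable or volatile.
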
The allocator should operate on the linear representation: this way it's
easier and faster to track usages correctly. We need to assign virtual
registers on a per-method basis instead of per basic block. We can assign
virtual registers to variables, too. Note that since we fix the stack offset
of local vars only after this step (which happens after the burg rules are run),
some of the burg rules that try to optimize the code won't apply anymore:
the peephole code may need to be enhanced to do the optimizations instead.

We need to handle floating point registers in the global allocator, too.

The new allocator also needs to keep precise track of which registers
contain references or managed pointers to allow us to move to a precise GC.

It may be worth using a single increasing set of integers for the virtual
registers, with the class of the register stored separately (unlike the
current local allocator, which keeps integer and fp registers separate).

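A minimal sketch of that numbering scheme, in C: one counter hands out
virtual register numbers and a side table records the class of each one.
The names (VRegTable, vreg_alloc, vreg_class) are hypothetical and the
growth policy is an arbitrary choice made for the example.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { VREG_INT, VREG_FP } VRegClass;

typedef struct {
    VRegClass *klass;  /* klass[v] is the class of virtual register v */
    int count;         /* next free virtual register number */
    int capacity;      /* allocated entries in klass */
} VRegTable;

static void vreg_table_init(VRegTable *t)
{
    t->klass = NULL;
    t->count = 0;
    t->capacity = 0;
}

/* Returns the new virtual register number, or -1 if out of memory. */
static int vreg_alloc(VRegTable *t, VRegClass c)
{
    if (t->count == t->capacity) {
        int new_cap = t->capacity ? t->capacity * 2 : 16;
        VRegClass *p = realloc(t->klass, (size_t)new_cap * sizeof *p);
        if (!p)
            return -1;
        t->klass = p;
        t->capacity = new_cap;
    }
    t->klass[t->count] = c;
    return t->count++;
}

static VRegClass vreg_class(const VRegTable *t, int v)
{
    assert(v >= 0 && v < t->count);
    return t->klass[v];
}

static void vreg_table_free(VRegTable *t)
{
    free(t->klass);
    vreg_table_init(t);
}
```

Integer and fp virtual registers share one number space here, so any
per-vreg data (liveness, assignment) can be a single array indexed by the
register number.
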
Since this is a large task, we need to do it in steps as much as possible.
The first step is to run the register allocator _after_ the burg rules: this
requires a rewrite of the liveness code, too, to use linear indexes instead
of basic-block/tree number combinations. This can be done by:
*) allocating virtual regs to all the locals that can be register allocated
*) running the burg rules (some may require adjustments): the local virtual
registers are assigned starting from global-virt-regs+1, instead of the current
hardware-regs+1, so we can tell apart global and local virt regs
*) running the liveness/whatever code is needed to allocate the global registers
*) allocating the rest of the local variables to stack slots
*) continuing with the current local allocator

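The numbering used in the steps above, and a liveness record keyed by linear
index, can be sketched as follows in C. All names are hypothetical. The
sketch assumes that hardware registers occupy the lowest numbers and that
global virtual registers follow them directly. The single first/last range
per register is a simplification: it ignores control flow, so a real
implementation would have to extend ranges across loops.

```c
#include <assert.h>

typedef struct {
    int num_hw_regs;       /* numbers 0..num_hw_regs-1 are hardware registers */
    int last_global_vreg;  /* highest number given to a register-allocatable local */
    int next_local_vreg;   /* next temporary number; -1 until locals are started */
} VRegNumbering;

static void numbering_init(VRegNumbering *n, int num_hw_regs)
{
    n->num_hw_regs = num_hw_regs;
    n->last_global_vreg = num_hw_regs - 1;  /* no global vregs yet */
    n->next_local_vreg = -1;
}

/* Step 1: one virtual register per local that can live in a register. */
static int alloc_global_vreg(VRegNumbering *n)
{
    assert(n->next_local_vreg < 0);  /* globals must all come first */
    return ++n->last_global_vreg;
}

/* Step 2: temporaries created by the burg rules start at global-virt-regs+1. */
static void start_local_vregs(VRegNumbering *n)
{
    n->next_local_vreg = n->last_global_vreg + 1;
}

static int alloc_local_vreg(VRegNumbering *n)
{
    assert(n->next_local_vreg >= 0);
    return n->next_local_vreg++;
}

static int is_global_vreg(const VRegNumbering *n, int r)
{
    return r >= n->num_hw_regs && r <= n->last_global_vreg;
}

static int is_local_vreg(const VRegNumbering *n, int r)
{
    return r > n->last_global_vreg && r < n->next_local_vreg;
}

/* Step 3: liveness keyed by linear instruction index (inclusive bounds). */
typedef struct { int first; int last; } LiveRange;

static void live_range_init(LiveRange *r) { r->first = r->last = -1; }

/* Record a definition or use at the given linear index. */
static void live_range_note(LiveRange *r, int linear_index)
{
    if (r->first < 0 || linear_index < r->first)
        r->first = linear_index;
    if (linear_index > r->last)
        r->last = linear_index;
}

/* Two registers interfere if their ranges share at least one index. */
static int live_ranges_overlap(const LiveRange *a, const LiveRange *b)
{
    if (a->first < 0 || b->first < 0)
        return 0;
    return a->first <= b->last && b->first <= a->last;
}
```

A single comparison against last_global_vreg is enough to tell a local
variable's register from an expression temporary, which is the point of
starting the temporaries at global-virt-regs+1.
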
This work could take 2-3 weeks.

The next step is to define the kind of declarative data an architecture needs,
to assign virtual regs to all the registers and to make the allocator
assign from the volatile registers, too.
Note that some of the code that is currently emitted in the arch-specific
code will need to be emitted as instructions that the reg allocator
can inspect. Think of a method that returns the first argument, which is
received in a register: the current code copies it to either a local slot or
to a global reg in the prolog and copies it back to the return register
in the basic block, but since neither the regallocator nor the peephole code
knows about the prolog code, the first store cannot be optimized away.
The gcc code has some examples of how to specify register classes in a
declarative way.

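To make the return-the-first-argument case concrete, here is a small C
sketch under stated assumptions: the prolog copy is an ordinary move
instruction, the allocator has already chosen a hardware register for each
virtual register, and a trivial pass then drops moves whose source and
destination ended up identical. The instruction format, the opcode names and
NUM_HW_REGS are all hypothetical.

```c
#include <assert.h>

enum { NUM_HW_REGS = 8 };  /* hypothetical: numbers 0..7 are hardware registers */

typedef enum { OP_MOVE, OP_RET } Opcode;

typedef struct {
    Opcode op;
    int dest;  /* register number, or -1 if unused */
    int src;   /* register number, or -1 if unused */
} Ins;

/* Map a register number to a hardware register; assignment[] is indexed by
 * (virtual register number - NUM_HW_REGS). */
static int to_hw_reg(int reg, const int *assignment)
{
    assert(reg >= -1);
    if (reg < NUM_HW_REGS)
        return reg;  /* unused operand or already a hardware register */
    return assignment[reg - NUM_HW_REGS];
}

/* Rewrite operands to hardware registers and drop moves that became no-ops.
 * Works in place and returns the new instruction count. */
static int assign_and_drop_self_moves(Ins *code, int n, const int *assignment)
{
    int out = 0;
    for (int i = 0; i < n; i++) {
        Ins ins = code[i];
        ins.dest = to_hw_reg(ins.dest, assignment);
        ins.src = to_hw_reg(ins.src, assignment);
        if (ins.op == OP_MOVE && ins.dest == ins.src)
            continue;  /* e.g. "move r0 <- r0" */
        code[out++] = ins;
    }
    return out;
}
```

For the sequence "move v8 <- r0; move r0 <- v8; ret" with the argument and
the return value both in r0, assigning v8 to r0 turns both moves into
no-ops and only the return is left. This is possible only because the prolog
copy is visible as an instruction.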