# Performance notes
## Dumping assembly
```
$ cargo install cargo-show-asm
$ cargo asm --wasm -p win32 mov_r32
```
## Profiling on Mac
```
$ brew install cargo-instruments
$ cargo instruments --release -t time -p retrowin32 -- exe/zip/zip.exe 200
```
## Registers struct
Registers are known named slots, e.g. eax, ebx. It's natural to represent them
as a struct like:
```
struct Registers {
    eax: u32,
    ebx: u32,
    ...
}
```
But most instructions refer to registers indirectly, as an integer. So to look
up a register you might write code like:
```
enum Reg { EAX, EBX, ... }
fn get_reg(regs: &Registers, reg: Reg) -> u32 {
    match reg {
        Reg::EAX => regs.eax,
        ...
    }
}
```
Unfortunately, even if the values of the `Reg` enum are integers that cleanly
map to "the nth u32 in the registers struct", LLVM appears to compile the above
`get_reg` function to a switch table rather than index arithmetic. (The
behavior is the same in C++ and Rust, so it seems to be an LLVM thing; it
generates more efficient code when `regs` is a global, but it's still not ideal.)
If you instead do something that has the same layout in memory but is more
clearly integer-indexed:
```
struct Registers {
    r32: [u32; 8],
}
```
then the code generated is ideal. But then accessing those registers in Rust
code ends up pretty miserable relative to the named registers.
So instead we just use the first struct with `#[repr(C)]` and do some casting to
get the efficient codegen of the latter.
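As a rough sketch of what that casting could look like (not necessarily the code retrowin32 actually uses): the field order below follows the x86 register encoding, and the method names `get32`/`set32`, the `usize` index, and the `& 7` mask are all assumptions made for illustration.

```rust
/// Named registers, declared in x86 encoding order (assumed here).
/// `#[repr(C)]` keeps the fields in declaration order, so the struct has
/// the same memory layout as `[u32; 8]`.
#[repr(C)]
#[derive(Default)]
pub struct Registers {
    pub eax: u32,
    pub ecx: u32,
    pub edx: u32,
    pub ebx: u32,
    pub esp: u32,
    pub ebp: u32,
    pub esi: u32,
    pub edi: u32,
}

// Compile-time check that the struct really is eight packed u32s.
const _: () = assert!(std::mem::size_of::<Registers>() == 8 * 4);

impl Registers {
    /// Reads the register at `index` (0 = eax, ..., 7 = edi).
    /// Hypothetical helper; the index is masked to stay in bounds.
    pub fn get32(&self, index: usize) -> u32 {
        // SAFETY: `Registers` is `#[repr(C)]` with exactly eight `u32`
        // fields, so it has the same size and alignment as `[u32; 8]`.
        let regs: &[u32; 8] = unsafe { &*(self as *const Registers as *const [u32; 8]) };
        regs[index & 7]
    }

    /// Writes the register at `index` (0 = eax, ..., 7 = edi).
    /// Hypothetical helper; the index is masked to stay in bounds.
    pub fn set32(&mut self, index: usize, value: u32) {
        // SAFETY: same layout argument as in `get32`.
        let regs: &mut [u32; 8] = unsafe { &mut *(self as *mut Registers as *mut [u32; 8]) };
        regs[index & 7] = value;
    }
}
```

With something like this, instruction handlers can call `regs.get32(n)` with the decoded register number, while the rest of the code keeps writing `regs.eax` directly.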