# Performance notes

## Dumping assembly

```
$ cargo install cargo-show-asm
$ cargo asm --wasm -p win32 mov_r32
```

## Profiling on Mac

```
$ brew install cargo-instruments
$ cargo instruments --release -t time -p retrowin32 -- exe/zip/zip.exe 200
```

## Registers struct

Registers are a fixed set of named slots, e.g. `eax`, `ebx`. It's natural to
represent them as a struct with one field per register:

```
struct Registers {
    eax: u32,
    ebx: u32,
    ...
}
```

But most instructions refer to registers indirectly, by an integer index. So to
look up a register you might write code like:

```
enum Reg { EAX, EBX, ... }
fn get_reg(regs: &Registers, reg: Reg) -> u32 {
    match reg {
        Reg::EAX => regs.eax,
        ...
    }
}
```

Unfortunately it seems that, even if the values of the `Reg` enum are integers
that map cleanly to "the nth `u32` in the registers struct", LLVM compiles the
above `get_reg` function to a switch table rather than simple index arithmetic.
(The behavior appears to be the same in C++ and Rust, which suggests it is an
LLVM thing; LLVM generates more efficient code when `regs` is a global, but it's
still not ideal.)

If you instead use a representation that has the same layout in memory but is
more clearly integer-indexed:

```
struct Registers {
    r32: [u32; 8],
}
```

then the generated code is ideal. But accessing individual registers from Rust
code then ends up pretty miserable compared to using named fields.

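
To make the trade-off concrete, here is a minimal sketch of the array-backed
variant. The `get`/`set` helpers are hypothetical, and numbering the `Reg`
variants in x86 encoding order (`eax` = 0, `ecx` = 1, ...) is an assumption for
illustration, not necessarily how this project defines them:

```rust
// Hypothetical names for illustration; not the project's actual definitions.
#[allow(dead_code)]
#[derive(Clone, Copy)]
#[repr(usize)]
enum Reg { EAX = 0, ECX, EDX, EBX, ESP, EBP, ESI, EDI }

struct Registers {
    r32: [u32; 8],
}

impl Registers {
    // Lookup by register number is plain array indexing.
    fn get(&self, reg: Reg) -> u32 {
        self.r32[reg as usize]
    }

    fn set(&mut self, reg: Reg, value: u32) {
        self.r32[reg as usize] = value;
    }
}
```

Lookup by `Reg` is trivial here, but any code that wants one specific register
has to write `regs.r32[Reg::EBX as usize]` (or go through a helper) rather than
`regs.ebx`.
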
So instead we just use the first struct, marked `#[repr(C)]` so that its field
layout is fixed, and do some casting to get the efficient codegen of the latter.
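
One way such a cast could look is sketched below. This is an illustration under
stated assumptions rather than the project's actual code: the `get`/`set`
helpers, the field order, and the `Reg` numbering (x86 encoding order) are all
hypothetical.

```rust
// Hypothetical sketch; names and field order are assumptions.
#[allow(dead_code)]
#[derive(Clone, Copy)]
#[repr(usize)]
enum Reg { EAX = 0, ECX, EDX, EBX, ESP, EBP, ESI, EDI }

// #[repr(C)] lays fields out in declaration order. Eight u32 fields need no
// padding, so the struct has the same size and layout as [u32; 8].
#[derive(Default)]
#[repr(C)]
struct Registers {
    eax: u32,
    ecx: u32,
    edx: u32,
    ebx: u32,
    esp: u32,
    ebp: u32,
    esi: u32,
    edi: u32,
}

// Compile-time check that the size assumption holds.
const _: () = assert!(std::mem::size_of::<Registers>() == std::mem::size_of::<[u32; 8]>());

impl Registers {
    fn get(&self, reg: Reg) -> u32 {
        // SAFETY: Registers is #[repr(C)] with exactly eight u32 fields, so
        // it has the same size, alignment, and layout as [u32; 8].
        let r32 = unsafe { &*(self as *const Registers as *const [u32; 8]) };
        r32[reg as usize]
    }

    fn set(&mut self, reg: Reg, value: u32) {
        // SAFETY: same layout argument as in `get`.
        let r32 = unsafe { &mut *(self as *mut Registers as *mut [u32; 8]) };
        r32[reg as usize] = value;
    }
}
```

With this shape, instruction handlers can index by register number through
`get`/`set`, while the rest of the code keeps writing `regs.eax`.
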