You've already forked linux-packaging-mono
							
							
		
			
				
	
	
		
			140 lines
		
	
	
		
			6.4 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
			140 lines
		
	
	
		
			6.4 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
| 
 | |
| 		IMT-based interface invocation support
 | |
| 
 | |
| The mono JIT can use an IMT-style invocation system to call interface methods.
 | |
| This considerably reduces the runtime memory usage when many interface types
 | |
| are loaded, because the old system required an array in MonoVTable indexed
 | |
| by the interface id, which grows linearly as more interfaces are loaded.
 | |
| In some cases there are also speedups, since an interface call can reduce to
 | |
| a virtual call automatically.
 | |
| 
 | |
| IMT instead uses a fixed-size table and hashes each method in the implemented
 | |
| interfaces to a slot in the IMT table. To be able to resolve collisions, at each
 | |
| callsite we store the interface MonoMethod to be called in a well-known register and
 | |
| the IMT table will contain a snippet of code that uses it to jump to the
 | |
| proper vtable slot. The interface invocation sequence becomes (in pseudo-code):
 | |
| 
 | |
| 	mov magic_reg, interface_monomethod
 | |
| 	call vtable [imt_slot]
 | |
| 
 | |
| The IMT table is stored at negative addresses in the vtable, like the old
 | |
| interface array used to be.
 | |
| 
 | |
| A small note on the choice of magic_reg for different JIT backends: the IMT
 | |
| method identifier doesn't necessarily need to be stored in a register, though
 | |
| doing so is fast and the JIT code has already the infrastructure to handle this
 | |
| case in an arch-independent way. A JIT porter just needs to #define
 | |
| MONO_ARCH_IMT_REG to the chosen register. Note that this register should be
 | |
| part of the MONO_ARCH_CALLEE_REGS set as it will be handled by the local register
 | |
| allocator (see mini/inssel.brg) and it must not be part of the registers used for
 | |
| argument passing as you'd overwrite an argument in that case.
 | |
| Also note that the method-specific trampoline code should make sure to preserve
 | |
| this register (but it should already if it's in MONO_ARCH_CALLEE_REGS as
 | |
| it could have been used for a vtable indirect call).
 | |
| 
 | |
| Note that in the case of a nono-colliding IMT slot, the interface call
 | |
| instruction sequence becomes equivalent to a virtual call, as the IMT slot
 | |
| will contain the direct trampoline for the method and the magic trampoline will
 | |
| set the slot to the method's native code address once it is compiled.
 | |
| 
 | |
| In case of collisions in the IMT slot, the JIT performs a linear search if
 | |
| the colliding methods are few or a binary search otherwise.
 | |
| To make this easier for each JIT port, a sort of internal representation
 | |
| of the code is created: this is an array of MonoIMTCheckItem structures
 | |
| built in a way to allow easy generation of a bsearch, when the list of colliding
 | |
| methods becomes large.
 | |
| 
 | |
| Each item in the array represents either a direct check for a method to be invoked
 | |
| or a bisection check in the bsearch algorithm.
 | |
| 
 | |
| struct _MonoIMTCheckItem {
 | |
| 	MonoMethod       *method;
 | |
| 	int               check_target_idx;
 | |
| 	int               vtable_slot;
 | |
| 	guint8           *jmp_code;
 | |
| 	guint8           *code_target;
 | |
| 	guint8            is_equals;
 | |
| 	guint8            compare_done;
 | |
| 	guint8            chunk_size;
 | |
| 	guint8            short_branch;
 | |
| };
 | |
| 
 | |
| For a direct check, the is_equals value is non-zero and the emitted code
 | |
| should be equivalent to:
 | |
| 	if (magic_reg != item->method)
 | |
| 		jump_to_item (array [item->check_target_idx]);
 | |
| 	jump_to_vtable (item->vtable_slot);
 | |
| 
 | |
| Note that if item->check_target_idx is 0, the jump should be omitted
 | |
| since this is the end of a linear sequence (you might want to insert a jump to
 | |
| a breakpoint, though, for debugging) and this would mean that we have an error:
 | |
| the IMT slot was asked to execute an interface method that the type doesn't implement.
 | |
| In the future we might want to handle this case not with a breakpoint or assert, but
 | |
| by either throwing an InvalidCast exception or by going into the runtime and
 | |
| adding support for the interface automagically to the type/vtable: this could be used
 | |
| both for transparent proxies and for the implicit interfaces that vectors in 2.0
 | |
| provide.
 | |
| 
 | |
| For a bisect check the code is even simpler:
 | |
| 
 | |
| 	if (magic_reg >= item->method)
 | |
| 		jump_to_item (array [item->check_target_idx]);
 | |
| 
 | |
| In this case item->check_target_idx is always non-zero.
 | |
| Note that in both cases item->method becomes an immediate constant in the
 | |
| jitted code.
 | |
| 
 | |
| The other fields in the structure are there to provide to the backend
 | |
| common storage for data needed during emission.
 | |
| As each item's code is emitted, the start of it is stored in the code_target
 | |
| field. At the same time when a conditional branch is inserted, its address
 | |
| is stored in jmp_code: this way with a single forward pass on the array at
 | |
| the end of the emission phase the branches can be patched to point to the
 | |
| proper target item's code (this process would patch the jump_to_item pseudo
 | |
| instructions described above).
 | |
| 
 | |
| chunk_size can be used to store the size of the code generated for the item: this
 | |
| can be used to optimize the short/long branch instructions, together with
 | |
| info stored in short_branch. It is also used to calculate the size of the
 | |
| code to allocate for the whole IMT thunk.
 | |
| 
 | |
| The compare_done field can be used to avoid doing an additional compare
 | |
| in a is_equals item for the same MonoMethod that was just compared in a
 | |
| bisecting item. Suppose we have 4 methods colliding in a slot, A, B, C and D.
 | |
| The arch-independent code already took care of sorting them, so that:
 | |
| 	A < B < C < D
 | |
| 
 | |
| The generated code will look like (M is the method to call):
 | |
| 
 | |
| 	compare (C, M)
 | |
| 	goto upper_sequence if bigger_equals
 | |
| 	/* linear sequence */
 | |
| 	compare (M, A)
 | |
| 	goto B_found if not_equals
 | |
| 	jump to A's slot
 | |
| B_found:
 | |
| 	jump to B's slot
 | |
| 
 | |
| upper_sequence:
 | |
| 	/* we just did a compare against C, no need to compare again */
 | |
| 	goto D_found if not_equals
 | |
| 	jump to C's slot
 | |
| D_found:
 | |
| 	jump to D's slot
 | |
| 
 | |
| This optimization is of course valid for architectures with flags registers.
 | |
| 
 | |
| As a further optimization to reduce memory usage, the Mono runtime sets the
 | |
| IMT slots initially to a single-instance magic trampoline so there is actually no
 | |
| memory used up by the thunks in the case of collisions. When an interface method is
 | |
| called the magic trampoline will fill-in the IMT slot with the proper thunk or
 | |
| trampoline, so later calls will use the fast path.
 | |
| This single-instance trampoline will use MONO_FAKE_IMT_METHOD as the method
 | |
| it's asking to be compiled and executed: the trampoline code does recognize
 | |
| this special value and retrieves the interface method to call from the usual
 | |
| MONO_ARCH_IMT_REG saved by the trampoline code.
 | |
| Given that only the IMT slots that are actually used will be initialized, this saves
 | |
| quite a bit of memory, as it's unlikely that all the interface methods are called on
 | |
| all the different types.
 | |
| 
 |