This was originally the third patch in a series, introduced by ec4e719cd6. The
first patch and the series itself was written by Mark Jansen and focused on an
incorrect handling of NULL in QueryActCtx(). It's not clear what those were
actually for, but they're upstream now.
My guess is that Sebastian noticed this "bug" while writing tests and wrote the
patch to fix it, as it's an easy one-liner. My guess is that he never submitted
it upstream either because it wasn't clearly correct (e.g. he wanted to actually
implement ACTIVATION_CONTEXT_SECTION_ASSEMBLY_INFORMATION) or because of
disagreements with upstream over project direction. As it happens there is a way
to get ACTIVATION_CONTEXT_SECTION_ASSEMBLY_INFORMATION to return valid data—pass
a NULL or empty string as the section name.
Anyway, the patch is likely to have never actually helped anything, and even if
it did, it's a one-liner with a FIXME message already attached, and so won't be
hard to debug again if necessary.
These prefixes are redundant. We're restoring the context from 32-bit code, so
%ss = %ds already, and if we're on this side of the code then our %ss = the
target %ss.
Still, why was the patch written?
Before 44fbc018ed, there was a single path to restore contexts, and it looked
like this:
output( "2:\tpushl 0x94(%%ecx)\n"); /* SegEs */
output( "\tpopl %%es\n" );
output( "\tpushl 0x90(%%ecx)\n"); /* SegFs */
output( "\tpopl %%fs\n" );
output( "\tpushl 0x8c(%%ecx)\n"); /* SegGs */
output( "\tpopl %%gs\n" );
output( "\tmovl 0x9c(%%ecx),%%edi\n"); /* Edi */
output( "\tmovl 0xa0(%%ecx),%%esi\n"); /* Esi */
output( "\tmovl 0xa8(%%ecx),%%edx\n"); /* Edx */
output( "\tmovl 0xa4(%%ecx),%%ebx\n"); /* Ebx */
output( "\tmovl 0xb0(%%ecx),%%eax\n"); /* Eax */
output( "\tmovl 0xb4(%%ecx),%%ebp\n"); /* Ebp */
output( "\tpushl 0xc8(%%ecx)\n"); /* SegSs */
output( "\tpopl %%ss\n" );
output( "\tmovl 0xc4(%%ecx),%%esp\n"); /* Esp */
output( "\tpushl 0xc0(%%ecx)\n"); /* EFlags */
output( "\tpushl 0xbc(%%ecx)\n"); /* SegCs */
output( "\tpushl 0xb8(%%ecx)\n"); /* Eip */
output( "\tpushl 0x98(%%ecx)\n"); /* SegDs */
output( "\tmovl 0xac(%%ecx),%%ecx\n"); /* Ecx */
output( "\tpopl %%ds\n" );
output( "\tiret\n" );
Very simple: we restore most registers (but not %ds), then switch stacks, then
push control registers and %ds to the current (target) stack, then pop %ds and
iret.
This was vulnerable to signal races:
+ /* As soon as we have switched stacks the context structure could
+ * be invalid (when signal handlers are executed for example). Copy
+ * values on the target stack before changing ESP. */
so 44fbc018ed changed the path to a different one:
output( "\tpushl 0xc8(%%ecx)\n" ); /* SegSs */
output( "\tpopl %%es\n" );
output( "\tmovl 0xc4(%%ecx),%%eax\n" ); /* Esp */
output( "\tleal -4*4(%%eax),%%eax\n" );
output( "\tmovl 0xc0(%%ecx),%%edx\n" ); /* EFlags */
output( "\t.byte 0x26\n\tmovl %%edx,3*4(%%eax)\n" );
output( "\tmovl 0xbc(%%ecx),%%edx\n" ); /* SegCs */
output( "\t.byte 0x26\n\tmovl %%edx,2*4(%%eax)\n" );
output( "\tmovl 0xb8(%%ecx),%%edx\n" ); /* Eip */
output( "\t.byte 0x26\n\tmovl %%edx,1*4(%%eax)\n" );
output( "\tmovl 0xb0(%%ecx),%%edx\n" ); /* Eax */
output( "\t.byte 0x26\n\tmovl %%edx,0*4(%%eax)\n" );
output( "\tpushl %%es\n" );
output( "\tpushl 0x98(%%ecx)\n" ); /* SegDs */
output(" \tpushl 0x94(%%ecx)\n" ); /* SegEs */
output( "\tpopl %%es\n" );
output( "\tpushl 0x90(%%ecx)\n"); /* SegFs */
output( "\tpopl %%fs\n" );
output( "\tpushl 0x8c(%%ecx)\n"); /* SegGs */
output( "\tpopl %%gs\n" );
output( "\tmovl 0x9c(%%ecx),%%edi\n" ); /* Edi */
output( "\tmovl 0xa0(%%ecx),%%esi\n" ); /* Esi */
output( "\tmovl 0xa4(%%ecx),%%ebx\n" ); /* Ebx */
output( "\tmovl 0xa8(%%ecx),%%edx\n" ); /* Edx */
output( "\tmovl 0xb4(%%ecx),%%ebp\n" ); /* Ebp */
output( "\tmovl 0xac(%%ecx),%%ecx\n" ); /* Ecx */
output( "\tpopl %%ds\n" );
output( "\tpopl %%ss\n" );
output( "\tmovl %%eax,%%esp\n" );
output( "\tpopl %%eax\n" );
output( "\tiret\n" );
That is, we set %es to the target %ss, write control regs onto the target stack
using %es, switch to the target stack, then pop + iret. In this case the %es
overrides make perfect sense: the target stack might be different from ours
(i.e. we are returning to 16-bit code).
Evidently this was invalid:
+ /* Restore the context when the stack segment changes. We can't use
+ * the same code as above because we do not know if the stack segment
+ * is 16 or 32 bit, and 'movl' will throw an exception when we try to
+ * access memory above the limit. */
In 4c8b3f63be1 slackner introduced two different paths. If we need to switch
stacks, we use the original path. (I think it's still vulnerable to signal
races, but we have no way of preventing those.) Meanwhile the other path can be
simplified a bit, since we're already on the target stack:
- output( "\tpushl 0xc8(%%ecx)\n" ); /* SegSs */
- output( "\tpopl %%es\n" );
output( "\tmovl 0xc4(%%ecx),%%eax\n" ); /* Esp */
output( "\tleal -4*4(%%eax),%%eax\n" );
output( "\tmovl 0xc0(%%ecx),%%edx\n" ); /* EFlags */
- output( "\t.byte 0x26\n\tmovl %%edx,3*4(%%eax)\n" );
+ output( "\t.byte 0x36\n\tmovl %%edx,3*4(%%eax)\n" );
output( "\tmovl 0xbc(%%ecx),%%edx\n" ); /* SegCs */
- output( "\t.byte 0x26\n\tmovl %%edx,2*4(%%eax)\n" );
+ output( "\t.byte 0x36\n\tmovl %%edx,2*4(%%eax)\n" );
output( "\tmovl 0xb8(%%ecx),%%edx\n" ); /* Eip */
- output( "\t.byte 0x26\n\tmovl %%edx,1*4(%%eax)\n" );
+ output( "\t.byte 0x36\n\tmovl %%edx,1*4(%%eax)\n" );
output( "\tmovl 0xb0(%%ecx),%%edx\n" ); /* Eax */
- output( "\t.byte 0x26\n\tmovl %%edx,0*4(%%eax)\n" );
+ output( "\t.byte 0x36\n\tmovl %%edx,0*4(%%eax)\n" );
- output( "\tpushl %%es\n" );
output( "\tpushl 0x98(%%ecx)\n" ); /* SegDs */
@@ -890,11 +889,37 @@ static void build_call_from_regs_x86(void)
output( "\tmovl 0xac(%%ecx),%%ecx\n" ); /* Ecx */
output( "\tpopl %%ds\n" );
- output( "\tpopl %%ss\n" );
output( "\tmovl %%eax,%%esp\n" );
output( "\tpopl %%eax\n" );
output( "\tiret\n" );
Sebastian got rid of the setting of %es, and replaced the %es prefixes with %ss
prefixes. What I think happened is that he made a subtle mistake—or, well, not a
mistake, but a redundancy. %es: was changed to %ss: by analogy, but it's
actually not necessary: we're operating on the source stack, and we know the
source stack is 32-bit, and we haven't set %ds yet, so %ds == %ss already, and
we can use the %implicit %ds prefix.
Alexandre presumably saw this in bab6ece63, and silently removed them. My guess
is that Sebastian saw that, wasn't sure, but (in the best case) didn't want to
submit his fix upstream until he had checked whether it was actually correct,
and never got around to checking. (Alternatively, he thought that the %ss should
have been retained for clarity, and decided not to try to submit that upstream.)