when an initializer is variable (I handled the constant case in a previous
patch). This has three pieces:
1. Enhance AggValueSlot to have a 'isZeroed' bit to tell CGExprAgg that
the memory being stored into has previously been memset to zero.
2. Teach CGExprAgg to not emit stores of zero to isZeroed memory.
3. Teach CodeGenFunction::EmitAggExpr to scan initializers to determine
whether they are profitable to emit a memset + inividual stores vs
stores for everything.
The heuristic used is that a global has to be more than 16 bytes and
has to be 3/4 zero to be candidate for this xform. The two testcases
are illustrative of the scenarios this catches. We now codegen test9 into:
call void @llvm.memset.p0i8.i64(i8* %0, i8 0, i64 400, i32 4, i1 false)
%.array = getelementptr inbounds [100 x i32]* %Arr, i32 0, i32 0
%tmp = load i32* %X.addr, align 4
store i32 %tmp, i32* %.array
and test10 into:
call void @llvm.memset.p0i8.i64(i8* %0, i8 0, i64 392, i32 8, i1 false)
%tmp = getelementptr inbounds %struct.b* %S, i32 0, i32 0
%tmp1 = getelementptr inbounds %struct.a* %tmp, i32 0, i32 0
%tmp2 = load i32* %X.addr, align 4
store i32 %tmp2, i32* %tmp1, align 4
%tmp5 = getelementptr inbounds %struct.b* %S, i32 0, i32 3
%tmp10 = getelementptr inbounds %struct.a* %tmp5, i32 0, i32 4
%tmp11 = load i32* %X.addr, align 4
store i32 %tmp11, i32* %tmp10, align 4
Previously we produced 99 stores of zero for test9 and also tons for test10.
This xforms should substantially speed up -O0 builds when it kicks in as well
as reducing code size and optimizer heartburn on insane cases. This resolves
PR279.
llvm-svn: 120692
slot. The easiest way to do that was to bundle up the information
we care about for aggregate slots into a new structure which demands
that its creators at least consider the question.
I could probably be convinced that the ObjC 'needs GC' bit should
be rolled into this structure.
Implement generalized copy elision. The main obstacle here is that
IR-generation must be much more careful about making sure that exactly
llvm-svn: 113962
but not in C++, so don't emit aggregate loads of volatile references
in null context in C++. Happens to have been caught by an assertion.
We do not get the scalar case right. Volatiles are really broken.
llvm-svn: 112019
implicitly-defined default constructor, zero-initialize the memory
before calling the default constructor. Previously, we would only
zero-initialize in the case of a trivial default constructor.
Also, simplify the hideous logic that determines when we have a
trivial default constructor and, therefore, don't need to emit any
call at all.
llvm-svn: 111779
pointers. I find the resulting code to be substantially cleaner, and it
makes it very easy to use the same APIs for data member pointers (which I have
conscientiously avoided here), and it avoids a plethora of potential
inefficiencies due to excessive memory copying, but we'll have to see if it
actually works.
llvm-svn: 111776
duplication between the constant and non-constant paths in all of this.
Implement ARM ABI semantics for member pointer constants and conversion.
llvm-svn: 111772