Summary:
Because of the existence branches out of GNU statement expressions, it
is possible that emitting cleanups for a full expression may cause the
new insertion point to not be dominated by the result of the inner
expression. Consider this example:
struct Foo { Foo(); ~Foo(); int x; };
int g(Foo, int);
int f(bool cond) {
int n = g(Foo(), ({ if (cond) return 0; 42; }));
return n;
}
Before this change, result of the call to 'g' did not dominate its use
in the store to 'n'. The early return exit from the statement expression
branches to a shared cleanup block, which ends in a switch between the
fallthrough destination (the assignment to 'n') or the function exit
block.
This change solves the problem by spilling and reloading expression
evaluation results when any of the active cleanups have branches.
I audited the other call sites of enterFullExpression, and they don't
appear to keep and Values live across the site of the cleanup, except in
ARC code. I wasn't able to create a test case for ARC that exhibits this
problem, though.
Reviewers: rjmccall, rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D30590
llvm-svn: 297084
This reverts commit r290171. It triggers a bunch of warnings, because
the new enumerator isn't handled in all switches. We want a warning-free
build.
Replied on the commit with more details.
llvm-svn: 290173
Summary: Enabling the compression of CLK_NULL_QUEUE to variable of type queue_t.
Reviewers: Anastasia
Subscribers: cfe-commits, yaxunl, bader
Differential Revision: https://reviews.llvm.org/D27569
llvm-svn: 290171
abstract information about the callee. NFC.
The goal here is to make it easier to recognize indirect calls and
trigger additional logic in certain cases. That logic will come in
a later patch; in the meantime, I felt that this was a significant
improvement to the code.
llvm-svn: 285258
Currently Clang use int32 to represent sampler_t, which have been a source of issue for some backends, because in some backends sampler_t cannot be represented by int32. They have to depend on kernel argument metadata and use IPA to find the sampler arguments and global variables and transform them to target specific sampler type.
This patch uses opaque pointer type opencl.sampler_t* for sampler_t. For each use of file-scope sampler variable, it generates a function call of __translate_sampler_initializer. For each initialization of function-scope sampler variable, it generates a function call of __translate_sampler_initializer.
Each builtin library can implement its own __translate_sampler_initializer(). Since the real sampler type tends to be architecture dependent, allowing it to be initialized by a library function simplifies backend design. A typical implementation of __translate_sampler_initializer could be a table lookup of real sampler literal values. Since its argument is always a literal, the returned pointer is known at compile time and easily optimized to finally become some literal values directly put into image read instructions.
This patch is partially based on Alexey Sotkin's work in Khronos Clang (https://github.com/KhronosGroup/SPIR/commit/3d4eec61623502fc306e8c67c9868be2b136e42b).
Differential Revision: https://reviews.llvm.org/D21567
llvm-svn: 277024
In {CG,}ExprConstant.cpp, we weren't treating vector splats properly.
This patch makes us treat splats more properly.
Additionally, this patch adds a new cast kind which allows a bool->int
cast to result in -1 or 0, instead of 1 or 0 (for true and false,
respectively), so we can sanely model OpenCL bool->int casts in the AST.
Differential Revision: http://reviews.llvm.org/D14877
llvm-svn: 257559
This patch changes the generation of CGFunctionInfo to contain
the FunctionProtoType if it is available. This enables the code
generation for call instructions to look into this type for
exception information and therefore generate better quality
IR - it will not create invoke instructions for functions that
are know not to throw.
llvm-svn: 253926
Currently debug info for types used in explicit cast only is not emitted. It happened after a patch for better alignment handling. This patch fixes this bug.
Differential Revision: http://reviews.llvm.org/D13582
llvm-svn: 250795
Summary:
This change adds support for `__builtin_ms_va_list`, a GCC extension for
variadic `ms_abi` functions. The existing `__builtin_va_list` support is
inadequate for this because `va_list` is defined differently in the Win64
ABI vs. the System V/AMD64 ABI.
Depends on D1622.
Reviewers: rsmith, rnk, rjmccall
CC: cfe-commits
Differential Revision: http://reviews.llvm.org/D1623
llvm-svn: 247941
Introduce an Address type to bundle a pointer value with an
alignment. Introduce APIs on CGBuilderTy to work with Address
values. Change core APIs on CGF/CGM to traffic in Address where
appropriate. Require alignments to be non-zero. Update a ton
of code to compute and propagate alignment information.
As part of this, I've promoted CGBuiltin's EmitPointerWithAlignment
helper function to CGF and made use of it in a number of places in
the expression emitter.
The end result is that we should now be significantly more correct
when performing operations on objects that are locally known to
be under-aligned. Since alignment is not reliably tracked in the
type system, there are inherent limits to this, but at least we
are no longer confused by standard operations like derived-to-base
conversions and array-to-pointer decay. I've also fixed a large
number of bugs where we were applying the complete-object alignment
to a pointer instead of the non-virtual alignment, although most of
these were hidden by the very conservative approach we took with
member alignment.
Also, because IRGen now reliably asserts on zero alignments, we
should no longer be subject to an absurd but frustrating recurring
bug where an incomplete type would report a zero alignment and then
we'd naively do a alignmentAtOffset on it and emit code using an
alignment equal to the largest power-of-two factor of the offset.
We should also now be emitting much more aggressive alignment
attributes in the presence of over-alignment. In particular,
field access now uses alignmentAtOffset instead of min.
Several times in this patch, I had to change the existing
code-generation pattern in order to more effectively use
the Address APIs. For the most part, this seems to be a strict
improvement, like doing pointer arithmetic with GEPs instead of
ptrtoint. That said, I've tried very hard to not change semantics,
but it is likely that I've failed in a few places, for which I
apologize.
ABIArgInfo now always carries the assumed alignment of indirect and
indirect byval arguments. In order to cut down on what was already
a dauntingly large patch, I changed the code to never set align
attributes in the IR on non-byval indirect arguments. That is,
we still generate code which assumes that indirect arguments have
the given alignment, but we don't express this information to the
backend except where it's semantically required (i.e. on byvals).
This is likely a minor regression for those targets that did provide
this information, but it'll be trivial to add it back in a later
patch.
I partially punted on applying this work to CGBuiltin. Please
do not add more uses of the CreateDefaultAligned{Load,Store}
APIs; they will be going away eventually.
llvm-svn: 246985
Summary:
float_cast_overflow is the only UBSan check without a source location attached.
This patch propagates SourceLocations where necessary to get them to the
EmitCheck() call.
Reviewers: rsmith, ABataev, rjmccall
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D11757
llvm-svn: 244568
The RegionCounter type does a lot of legwork, but most of it is only
meaningful within the implementation of CodeGenPGO. The uses elsewhere
in CodeGen generally just want to increment or read counters, so do
that directly.
llvm-svn: 235664
The /volatile:ms semantics turn volatile loads and stores into atomic
acquire and release operations. This distinction is important because
volatile memory operations do not form a happens-before relationship
with non-atomic memory. This means that a volatile store is not
sufficient for implementing a mutex unlock routine.
Differential Revision: http://reviews.llvm.org/D7580
llvm-svn: 229082
This causes things like assignment to refer to the '=' rather than the
LHS when attributing the store instruction, for example.
There were essentially 3 options for this:
* The beginning of an expression (this was the behavior prior to this
commit). This meant that stepping through subexpressions would bounce
around from subexpressions back to the start of the outer expression,
etc. (eg: x + y + z would go x, y, x, z, x (the repeated 'x's would be
where the actual addition occurred)).
* The end of an expression. This seems to be what GCC does /mostly/, and
certainly this for function calls. This has the advantage that
progress is always 'forwards' (never jumping backwards - except for
independent subexpressions if they're evaluated in interesting orders,
etc). "x + y + z" would go "x y z" with the additions occurring at y
and z after the respective loads.
The problem with this is that the user would still have to think
fairly hard about precedence to realize which subexpression is being
evaluated or which operator overload is being called in, say, an asan
backtrace.
* The preferred location or 'exprloc'. In this case you get sort of what
you'd expect, though it's a bit confusing in its own way due to going
'backwards'. In this case the locations would be: "x y + z +" in
lovely postfix arithmetic order. But this does mean that if the op+
were an operator overload, say, and in a backtrace, the backtrace will
point to the exact '+' that's being called, not to the end of one of
its operands.
(actually the operator overload case doesn't work yet for other reasons,
but that's being fixed - but this at least gets scalar/complex
assignments and other plain operators right)
llvm-svn: 227027