Commit Graph

99 Commits

Author SHA1 Message Date
George Burgess IV 1ced44b92a [Analysis] Centralize objectsize lowering logic.
We're currently doing nearly the same thing for @llvm.objectsize in
three different places: two of them are missing checks for overflow,
and one of them could subtly break if InstCombine gets much smarter
about removing alloc sites. Seems like a good idea to not do that.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290214 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-20 23:46:36 +00:00
Jun Bum Lim 4d9c93dc3f [CodeGenPrep] Skip merging empty case blocks
This is recommit of r287553 after fixing the invalid loop info after eliminating an empty block and unit test failures in AVR and WebAssembly :

Summary: Merging an empty case block into the header block of switch could cause ISel to add COPY instructions in the header of switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targetting.

Reviewers: t.p.northover, mcrosier, manmanren, wmi, joerg, davidxl

Subscribers: joerg, qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D22696

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289988 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-16 20:38:39 +00:00
Sanjoy Das 83c8485899 Fix CodeGenPrepare::stripInvariantGroupMetadata
`dropUnknownNonDebugMetadata` takes a list of "known" metadata IDs.  The
only reason it worked at all is that `getMetadataID` returns something
unrelated -- it returns the subclass ID of the receiver (which is used
in `dyn_cast` etc.).  That does not numerically match
`LLVMContext::MD_invariant_group` and ends up dropping `invariant_group`
along with every other metadata that does not numerically match
`LLVMContext::MD_invariant_group`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289973 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-16 18:52:33 +00:00
Jun Bum Lim 51900e49f9 Revert "[CodeGenPrep] Skip merging empty case blocks"
This reverts commit r289951.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289960 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-16 17:06:14 +00:00
Jun Bum Lim 6db0eaf697 [CodeGenPrep] Skip merging empty case blocks
This is recommit of r287553 after fixing the invalid loop info after eliminating an empty block:

Summary: Merging an empty case block into the header block of switch could cause ISel to add COPY instructions in the header of switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targetting.

Reviewers: t.p.northover, mcrosier, manmanren, wmi, joerg, davidxl

Subscribers: joerg, qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D22696

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289951 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-16 16:03:31 +00:00
Matt Arsenault 76a17e03e0 AMDGPU: Implement isCheapAddrSpaceCast
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288523 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-02 18:12:53 +00:00
Joerg Sonnenberger 46cc79217b Revert r287553: [CodeGenPrep] Skip merging empty case blocks
It results in assertions in lib/Analysis/BlockFrequencyInfoImpl.cpp line
670 ("Expected irreducible CFG").


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288052 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-28 18:56:54 +00:00
Justin Lebar 09220c80d3 [CodeGenPrepare] Don't sink non-cheap addrspacecasts.
Summary:
Previously, CGP would unconditionally sink addrspacecast instructions,
even going so far as to sink them into a loop.

Now we check that the cast is "cheap", as defined by TLI.

We introduce a new "is-cheap" function to TLI rather than using
isNopAddrSpaceCast because some GPU platforms want the ability to ask
for non-nop casts to be sunk.

Reviewers: arsenm, tra

Subscribers: jholewinski, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D26923

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287591 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-21 22:49:15 +00:00
Jun Bum Lim b68036c70c [CodeGenPrep] Skip merging empty case blocks
Summary: Merging an empty case block into the header block of switch could cause
ISel to add COPY instructions in the header of switch, instead of the case
block, if the case block is used as an incoming block of a PHI. This could
potentially increase dynamic instructions, especially when the switch is in a
loop. I added a test case which was reduced from the benchmark I was targetting.

Reviewers: t.p.northover, mcrosier, manmanren, wmi, davidxl

Subscribers: qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D22696

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287553 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-21 16:47:28 +00:00
Justin Lebar eb77a4a537 [BypassSlowDivision] Handle division by constant numerators better.
Summary:
We don't do BypassSlowDivision when the denominator is a constant, but
we do do it when the numerator is a constant.

This patch makes two related changes to BypassSlowDivision when the
numerator is a constant:

 * If the numerator is too large to fit into the bypass width, don't
   bypass slow division (because we'll never run the smaller-width
   code).

 * If we bypass slow division where the numerator is a constant, don't
   OR together the numerator and denominator when determining whether
   both operands fit within the bypass width.  We need to check only the
   denominator.

Reviewers: tra

Subscribers: llvm-commits, jholewinski

Differential Revision: https://reviews.llvm.org/D26699

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287062 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-16 00:44:47 +00:00
Justin Lebar 27d02ea698 Add missing lit.local.cfg to llvm/test/Transforms/CodeGenPrepare/NVPTX.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285464 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-28 21:56:07 +00:00
Justin Lebar f644e7b00f Don't leave unused divs/rems sitting around in BypassSlowDivision.
Summary:
This "pass" eagerly creates div and rem instructions even when only one
is needed -- it relies on a later pass (machine DCE?) to clean them up.

This is problematic not just from a cleanliness perspective (this pass
is running during CodeGenPrepare, so should leave the IR in a better
state), but it also creates a problem for instruction selection.  If we
always have a div+rem, isel will always select a divrem instruction (if
possible), even when a single div or rem would do.

Specifically, in NVPTX, we want to compute rem from the output of div,
if available.  But if a div is not available, we want to leave the rem
alone.  This transformation is overeager if div is always available.

Because this code runs as part of CodeGenPrepare, it's nontrivial to
write a test for this change.  But this will effectively be tested by
a later patch which adds the aforementioned change to NVPTX isel.

Reviewers: tra

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26088

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285460 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-28 21:43:54 +00:00
Justin Lebar 9488f1f527 Don't claim the udiv created in BypassSlowDivision is exact.
Summary:
In BypassSlowDivision's short-dividend path, we would create e.g.

  udiv exact i32 %a, %b

"exact" here means that we are asserting that %a is a multiple of %b.
But we have no reason to believe this must be true -- this is just a
bug, as far as I can tell.

Reviewers: tra

Subscribers: jholewinski, llvm-commits

Differential Revision: https://reviews.llvm.org/D26097

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285459 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-28 21:43:51 +00:00
Dehao Chen 2e4381ef79 Update the section.ll to fix non-x86 failure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284566 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-19 03:53:41 +00:00
Dehao Chen 625e9e7e61 Revert r284545 again as the regression in ppc still exists. There is bug in MBPI exposed by th patch.
Also update the section.ll to fix non-x86 failure.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284563 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-19 01:18:25 +00:00
Dehao Chen 8c8a9767af Add target for test to fix regression introduced by r284533.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284538 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-18 21:13:31 +00:00
Dehao Chen 977fc82cac Use profile info to set function section prefix to group hot/cold functions.
Summary:
The original implementation is in r261607, which was reverted in r269726 to accomendate the ProfileSummaryInfo analysis pass. The new implementation:
1. add a new metadata for function section prefix
2. query against ProfileSummaryInfo in CGP to set the correct section prefix for each function
3. output the section prefix set by CGP

Reviewers: davidxl, eraman

Subscribers: vsk, llvm-commits

Differential Revision: https://reviews.llvm.org/D24989

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284533 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-18 20:42:47 +00:00
David Majnemer 11dea5d5dd [CodeGenPrepare] Don't sink a cast past its user
The sink cast machinery is supposed to sink casts as close to their user
as possible.  However, an EH pad is the first instruction in it's basic
block.  Don't sink if the user is an EH pad.

This fixes PR27536.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267767 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-27 19:36:38 +00:00
Sanjay Patel e59120290f [CodeGenPrepare] don't convert an unpredictable select into control flow
Suggested in the review of D19488:
http://reviews.llvm.org/D19488



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267504 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-26 00:47:39 +00:00
Adrian Prantl 4eeaa0da04 [PR27284] Reverse the ownership between DICompileUnit and DISubprogram.
Currently each Function points to a DISubprogram and DISubprogram has a
scope field. For member functions the scope is a DICompositeType. DIScopes
point to the DICompileUnit to facilitate type uniquing.

Distinct DISubprograms (with isDefinition: true) are not part of the type
hierarchy and cannot be uniqued. This change removes the subprograms
list from DICompileUnit and instead adds a pointer to the owning compile
unit to distinct DISubprograms. This would make it easy for ThinLTO to
strip unneeded DISubprograms and their transitively referenced debug info.

Motivation
----------

Materializing DISubprograms is currently the most expensive operation when
doing a ThinLTO build of clang.

We want the DISubprogram to be stored in a separate Bitcode block (or the
same block as the function body) so we can avoid having to expensively
deserialize all DISubprograms together with the global metadata. If a
function has been inlined into another subprogram we need to store a
reference the block containing the inlined subprogram.

Attached to https://llvm.org/bugs/show_bug.cgi?id=27284 is a python script
that updates LLVM IR testcases to the new format.

http://reviews.llvm.org/D19034
<rdar://problem/25256815>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266446 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-15 15:57:41 +00:00
Petar Jovanovic 396d592ab0 Calculate __builtin_object_size when pointer depends on a condition
This patch fixes calculating of builtin_object_size if it depends on a
condition. Before this patch compiler did not know how to calculate the
object size when it finds a condition that cannot be eliminated.
This patch enables calculating of builtin_object_size even in case when
condition cannot be eliminated by choosing minimum or maximum value as a
result from condition. Choosing minimum or maximum value from condition
is based on the second argument of __builtin_object_size function.

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D18438


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266193 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-13 12:25:25 +00:00
Peter Zotov 6a4952db2d [CodeGenPrepare] Avoid sinking soft-FP comparisons
Sinking comparisons in CGP can undo the job of hoisting them done
earlier by LICM, and soft-FP makes this an expensive mistake.

A common pattern that produces floating point comparisons uniform
over a loop is an explicit check for division by zero. If the divisor
is hoisted out of the loop, the comparison can also be, but hoisting
the function that unwinds is never legal, since it may cause side
effects in the loop body prior to the unwinding to not be executed.

Differential Revision: http://reviews.llvm.org/D18744

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265264 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-03 16:36:17 +00:00
Adrian Prantl 7876f64bc3 testcase gardening: update the emissionKind enum to the new syntax. (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265081 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-01 00:16:49 +00:00
George Burgess IV 58cad7988f Keep CodeGenPrepare from preserving the domtree.
CGP modifies the domtree in some cases, so saying that it preserves the
domtree is a lie. We'll be able to selectively preserve it with the new
pass manager.

Differential Revision: http://reviews.llvm.org/D16893


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264099 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-22 21:25:08 +00:00
Philip Reames 3239eb1016 [CGP] Duplicate addressing computation in cold paths if required to sink addressing mode
This patch teaches CGP to duplicate addressing mode computations into cold paths (detected via explicit cold attribute on calls) if required to let addressing mode be safely sunk into the basic block containing each load and store.

In general, duplicating code into cold blocks may result in code growth, but should not effect performance. In this case, it's better to duplicate some code than to put extra pressure on the register allocator by making it keep the address through the entirely of the fast path.

This patch only handles addressing computations, but in principal, we could implement a more general cold cold scheduling heuristic which tries to reduce register pressure in the fast path by duplicating code into the cold path. Getting the profitability of the general case right seemed likely to be challenging, so I stuck to the existing case (addressing computation) we already had.

Differential Revision: http://reviews.llvm.org/D17652



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263074 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-09 23:13:12 +00:00