Commit Graph

63 Commits

Author SHA1 Message Date
Nicola Zaghen d34e60ca85 Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240
2018-05-14 12:53:11 +00:00
Sanjay Patel 0d7df36c66 [TargetSchedule] shrink interface for init(); NFCI
The TargetSchedModel is always initialized using the TargetSubtargetInfo's 
MCSchedModel and TargetInstrInfo, so we don't need to extract those and 
pass 3 parameters to init().

Differential Revision: https://reviews.llvm.org/D44789

llvm-svn: 329540
2018-04-08 19:56:04 +00:00
Reid Kleckner 2aeb930a9f Revert r327721 "This patch fixes the invalid usage of OptSize in Machine Combiner."
It causes asserts when compiling Chromium on Win32 with optimizations.
We compile many things with -Os.

llvm-svn: 327733
2018-03-16 20:11:55 +00:00
Andrew V. Tischenko a0cd09d4a2 This patch fixes the invalid usage of OptSize in Machine Combiner.
Differential Revision: https://reviews.llvm.org/D43813

llvm-svn: 327721
2018-03-16 16:06:24 +00:00
Andrew V. Tischenko 083891925b The final step to close D41278 [MachineCombiner] Improve debug output (NFC).
Differential Revision: https://reviews.llvm.org/D41278

llvm-svn: 326074
2018-02-26 09:43:21 +00:00
Andrew V. Tischenko b65b078d4d (NFC)[MachineCombiner] Improve debug output.
llvm-svn: 325217
2018-02-15 07:55:02 +00:00
Alexander Ivchenko 6805004cb1 Fix unused variable warning in release mode. NFC.
llvm-svn: 324330
2018-02-06 09:53:02 +00:00
Florian Hahn c68428b5dc [MachineCombiner] Add check for optimal pattern order.
In D41587, @mssimpso discovered that the order of some patterns for
AArch64 was sub-optimal. I thought a bit about how we could avoid that
case in the future. I do not think there is a need for evaluating all
patterns for now. But this patch adds an extra (expensive) check, that
evaluates the latencies of all patterns, and ensures that the latency
saved decreases for subsequent patterns.

This catches the sub-optimal order fixed in D41587, but I am not
entirely happy with the check, as it only applies to sub-optimal
patterns seen while building with EXPENSIVE_CHECKS on. It did not
discover any other sub-optimal pattern ordering.

Reviewers: Gerolf, spatel, mssimpso

Reviewed By: Gerolf, mssimpso

Differential Revision: https://reviews.llvm.org/D41766

llvm-svn: 323873
2018-01-31 13:54:30 +00:00
Matthias Braun f1caa2833f MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.

llvm-svn: 320884
2017-12-15 22:22:58 +00:00
Michael Zolotukhin c468b648fd Remove redundant includes from lib/CodeGen.
llvm-svn: 320619
2017-12-13 21:30:47 +00:00
Florian Hahn 001c3dd202 [MachineCombiner] Add up latencies of all instructions in new pattern.
Summary:
When calculating the RootLatency, we add up all the latencies of the
deleted instructions. But for NewRootLatency we only add the latency of
the new root instructions, ignoring the latencies of the other
instructions inserted. This leads the combiner to underestimate the cost
of patterns which add multiple instructions. This patch fixes that by
summing up the latencies of all new instructions. For NewRootNode, the
more complex getLatency function is used.

Note that we may be slightly more precise than just summing up
all latencies. For example, consider a pattern like

    r1 = INS1 ..
    r2 = INS2 ..
    r3 = INS3 r1, r2

I think in some other places, the total latency of the pattern would be
estimated as lat(INS3) + max(lat(INS1), lat(INS2)). If you consider
that worth changing, I think it would be best to do in a follow-up
patch.

Reviewers: Gerolf, sebpop, spop, fhahn

Reviewed By: fhahn

Subscribers: evandro, llvm-commits

Differential Revision: https://reviews.llvm.org/D40307

llvm-svn: 319951
2017-12-06 20:27:33 +00:00
David Blaikie b3bde2ea50 Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490
2017-11-17 01:07:10 +00:00
David Blaikie 3f833edc7c Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering
This header includes CodeGen headers, and is not, itself, included by
any Target headers, so move it into CodeGen to match the layering of its
implementation.

llvm-svn: 317647
2017-11-08 01:01:31 +00:00
Simon Pilgrim 194693e996 [MC] Split out register def/use idx calls to make debugging simpler. NFCI.
llvm-svn: 316927
2017-10-30 17:24:40 +00:00
Florian Hahn e52abba277 [MachineCombiner] Fix initialisation of LastUpdate for incremental update.
Summary:
Fixes a bogus iterator resulting from the removal of a block's first instruction at the point that incremental update is enabled.

Patch by Paul Walker.

Reviewers: fhahn, Gerolf, efriedma, MatzeB

Reviewed By: fhahn

Subscribers: aemerson, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D38734

llvm-svn: 315502
2017-10-11 20:25:58 +00:00
Florian Hahn ceb4494786 Recommit [MachineCombiner] Update instruction depths incrementally for large BBs.
This version of the patch fixes an off-by-one error causing PR34596. We
do not need to use std::next(BlockIter) when calling updateDepths, as
BlockIter already points to the next element.

Original commit message:
> For large basic blocks with lots of combinable instructions, the
> MachineTraceMetrics computations in MachineCombiner can dominate the compile
> time, as computing the trace information is quadratic in the number of
> instructions in a BB and it's relevant successors/predecessors.

> In most cases, knowing the instruction depth should be enough to make
> combination decisions. As we already iterate over all instructions in a basic
> block, the instruction depth can be computed incrementally. This reduces the
> cost of machine-combine drastically in cases where lots of instructions
> are combined. The major drawback is that AFAIK, computing the critical path
> length cannot be done incrementally. Therefore we only compute
> instruction depths incrementally, for basic blocks with more
> instructions than inc_threshold. The -machine-combiner-inc-threshold
> option can be used to set the threshold and allows for easier
> experimenting and checking if using incremental updates for all basic
> blocks has any impact on the performance.
>
> Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
>
> Reviewed By: fhahn
>
> Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 313751
2017-09-20 11:54:37 +00:00
Hans Wennborg 06e2a384c2 Revert r312719 "[MachineCombiner] Update instruction depths incrementally for large BBs."
This caused PR34596.

> [MachineCombiner] Update instruction depths incrementally for large BBs.
>
> Summary:
> For large basic blocks with lots of combinable instructions, the
> MachineTraceMetrics computations in MachineCombiner can dominate the compile
> time, as computing the trace information is quadratic in the number of
> instructions in a BB and it's relevant successors/predecessors.
>
> In most cases, knowing the instruction depth should be enough to make
> combination decisions. As we already iterate over all instructions in a basic
> block, the instruction depth can be computed incrementally. This reduces the
> cost of machine-combine drastically in cases where lots of instructions
> are combined. The major drawback is that AFAIK, computing the critical path
> length cannot be done incrementally. Therefore we only compute
> instruction depths incrementally, for basic blocks with more
> instructions than inc_threshold. The -machine-combiner-inc-threshold
> option can be used to set the threshold and allows for easier
> experimenting and checking if using incremental updates for all basic
> blocks has any impact on the performance.
>
> Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
>
> Reviewed By: fhahn
>
> Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 313213
2017-09-13 23:23:09 +00:00
Florian Hahn d39b8a3533 [MachineCombiner] Update instruction depths incrementally for large BBs.
Summary:
For large basic blocks with lots of combinable instructions, the
MachineTraceMetrics computations in MachineCombiner can dominate the compile
time, as computing the trace information is quadratic in the number of
instructions in a BB and it's relevant successors/predecessors.

In most cases, knowing the instruction depth should be enough to make
combination decisions. As we already iterate over all instructions in a basic
block, the instruction depth can be computed incrementally. This reduces the
cost of machine-combine drastically in cases where lots of instructions
are combined. The major drawback is that AFAIK, computing the critical path
length cannot be done incrementally. Therefore we only compute
instruction depths incrementally, for basic blocks with more
instructions than inc_threshold. The -machine-combiner-inc-threshold
option can be used to set the threshold and allows for easier
experimenting and checking if using incremental updates for all basic
blocks has any impact on the performance.

Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn

Reviewed By: fhahn

Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 312719
2017-09-07 12:49:39 +00:00
Jakub Kuderski 1d2dc681b1 [NFC] Move DEBUG_TYPE macro below includes...
in MachineCombiner.cpp.

llvm-svn: 307940
2017-07-13 19:30:52 +00:00
Matthias Braun 1527baab0c CodeGen: Rename DEBUG_TYPE to match passnames
Rename the DEBUG_TYPE to match the names of corresponding passes where
it makes sense. Also establish the pattern of simply referencing
DEBUG_TYPE instead of repeating the passname where possible.

llvm-svn: 303921
2017-05-25 21:26:32 +00:00
Eric Christopher 17ce8a2f5e Fix up grammar in a comment.
llvm-svn: 297898
2017-03-15 21:50:46 +00:00
Andrew V. Tischenko 8da96914f9 Compile time decreasing in the case we're dealing with Machine Combiner.
Before this patch compile time was about 21s (see below). After this patch
we have less than 2s (see bellow).

  Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz

    DAGCombiner - trunk
        time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
        real  0m1.685s

    DAGCombiner + Speed patch
        time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
        real  0m1.655s

    MachineCombiner w/o Speed patch
        time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
        real  0m21.614s

    MachineCombiner + Speed patch
        time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
        real  0m1.593s

The test spill_fdiv.ll  is attached to D29627
D29627 should be closed.

llvm-svn: 294936
2017-02-13 09:43:37 +00:00
Matthias Braun a4976c6166 MachineInstr: Remove parameter from dump()
The primary use of the dump() functions in LLVM is for use in a
debugger. Unfortunately lldb does not seem to handle default arguments
so using `p SomeMI.dump()` fails and you have to type the longer `p
SomeMI.dump(nullptr)`. Remove the paramter to make the most common use
easy. (You can always construct something like `p
SomeMI.print(dbgs(),MyTII)` if you need more features).

Differential Revision: https://reviews.llvm.org/D29241

llvm-svn: 293440
2017-01-29 18:20:42 +00:00
Sebastian Pop 7779484313 machine combiner: fix pretty printer
we used to print UNKNOWN instructions when the instruction to be printer was not
yet inserted in any BB: in that case the pretty printer would not be able to
compute a TII as the instruction does not belong to any BB or function yet.
This patch explicitly passes the TII to the pretty-printer.

Differential Revision: https://reviews.llvm.org/D27645

llvm-svn: 290228
2016-12-21 01:41:12 +00:00
Sebastian Pop e08d9c7c87 instr-combiner: sum up all latencies of the transformed instructions
We have found that -- when the selected subarchitecture has a scheduling model
and we are not optimizing for size -- the machine-instruction combiner uses a
too-simple algorithm to compute the cost of one of the two alternatives [before
and after running a combining pass on a section of code], and therefor it throws
away the combination results too often.

This fix has the potential to help any ISA with the potential to combine
instructions and for which at least one subarchitecture has a scheduling model.
As of now, this is only known to definitely affect AArch64 subarchitectures with
a scheduling model.

Regression tested on AMD64/GNU-Linux, new test case tested to fail on an
unpatched compiler and pass on a patched compiler.

Patch by Abe Skolnik and Sebastian Pop.

llvm-svn: 289399
2016-12-11 19:39:32 +00:00