Commit Graph

32 Commits

Author SHA1 Message Date
Mehdi Amini f9dc2b7079 Separate the Registration from Loading dialects in the Context
This changes the behavior of constructing MLIRContext to no longer load globally
registered dialects on construction. Instead Dialects are only loaded explicitly
on demand:
- the Parser is lazily loading Dialects in the context as it encounters them
during parsing. This is the only purpose for registering dialects and not load
them in the context.
- Passes are expected to declare the dialects they will create entity from
(Operations, Attributes, or Types), and the PassManager is loading Dialects into
the Context when starting a pipeline.

This changes simplifies the configuration of the registration: a compiler only
need to load the dialect for the IR it will emit, and the optimizer is
self-contained and load the required Dialects. For example in the Toy tutorial,
the compiler only needs to load the Toy dialect in the Context, all the others
(linalg, affine, std, LLVM, ...) are automatically loaded depending on the
optimization pipeline enabled.

To adjust to this change, stop using the existing dialect registration: the
global registry will be removed soon.

1) For passes, you need to override the method:

virtual void getDependentDialects(DialectRegistry &registry) const {}

and registery on the provided registry any dialect that this pass can produce.
Passes defined in TableGen can provide this list in the dependentDialects list
field.

2) For dialects, on construction you can register dependent dialects using the
provided MLIRContext: `context.getOrLoadDialect<DialectName>()`
This is useful if a dialect may canonicalize or have interfaces involving
another dialect.

3) For loading IR, dialect that can be in the input file must be explicitly
registered with the context. `MlirOptMain()` is taking an explicit registry for
this purpose. See how the standalone-opt.cpp example is setup:

  mlir::DialectRegistry registry;
  registry.insert<mlir::standalone::StandaloneDialect>();
  registry.insert<mlir::StandardOpsDialect>();

Only operations from these two dialects can be in the input file. To include all
of the dialects in MLIR Core, you can populate the registry this way:

  mlir::registerAllDialects(registry);

4) For `mlir-translate` callback, as well as frontend, Dialects can be loaded in
the context before emitting the IR: context.getOrLoadDialect<ToyDialect>()

Differential Revision: https://reviews.llvm.org/D85622
2020-08-19 01:19:03 +00:00
Mehdi Amini e75bc5c791 Revert "Separate the Registration from Loading dialects in the Context"
This reverts commit d14cf45735.
The build is broken with GCC-5.
2020-08-19 01:19:03 +00:00
Mehdi Amini d14cf45735 Separate the Registration from Loading dialects in the Context
This changes the behavior of constructing MLIRContext to no longer load globally
registered dialects on construction. Instead Dialects are only loaded explicitly
on demand:
- the Parser is lazily loading Dialects in the context as it encounters them
during parsing. This is the only purpose for registering dialects and not load
them in the context.
- Passes are expected to declare the dialects they will create entity from
(Operations, Attributes, or Types), and the PassManager is loading Dialects into
the Context when starting a pipeline.

This changes simplifies the configuration of the registration: a compiler only
need to load the dialect for the IR it will emit, and the optimizer is
self-contained and load the required Dialects. For example in the Toy tutorial,
the compiler only needs to load the Toy dialect in the Context, all the others
(linalg, affine, std, LLVM, ...) are automatically loaded depending on the
optimization pipeline enabled.

To adjust to this change, stop using the existing dialect registration: the
global registry will be removed soon.

1) For passes, you need to override the method:

virtual void getDependentDialects(DialectRegistry &registry) const {}

and registery on the provided registry any dialect that this pass can produce.
Passes defined in TableGen can provide this list in the dependentDialects list
field.

2) For dialects, on construction you can register dependent dialects using the
provided MLIRContext: `context.getOrLoadDialect<DialectName>()`
This is useful if a dialect may canonicalize or have interfaces involving
another dialect.

3) For loading IR, dialect that can be in the input file must be explicitly
registered with the context. `MlirOptMain()` is taking an explicit registry for
this purpose. See how the standalone-opt.cpp example is setup:

  mlir::DialectRegistry registry;
  registry.insert<mlir::standalone::StandaloneDialect>();
  registry.insert<mlir::StandardOpsDialect>();

Only operations from these two dialects can be in the input file. To include all
of the dialects in MLIR Core, you can populate the registry this way:

  mlir::registerAllDialects(registry);

4) For `mlir-translate` callback, as well as frontend, Dialects can be loaded in
the context before emitting the IR: context.getOrLoadDialect<ToyDialect>()

Differential Revision: https://reviews.llvm.org/D85622
2020-08-18 23:23:56 +00:00
Mehdi Amini d84fe55e0d Revert "Separate the Registration from Loading dialects in the Context"
This reverts commit e1de2b7550.
Broke a build bot.
2020-08-18 22:16:34 +00:00
Mehdi Amini e1de2b7550 Separate the Registration from Loading dialects in the Context
This changes the behavior of constructing MLIRContext to no longer load globally
registered dialects on construction. Instead Dialects are only loaded explicitly
on demand:
- the Parser is lazily loading Dialects in the context as it encounters them
during parsing. This is the only purpose for registering dialects and not load
them in the context.
- Passes are expected to declare the dialects they will create entity from
(Operations, Attributes, or Types), and the PassManager is loading Dialects into
the Context when starting a pipeline.

This changes simplifies the configuration of the registration: a compiler only
need to load the dialect for the IR it will emit, and the optimizer is
self-contained and load the required Dialects. For example in the Toy tutorial,
the compiler only needs to load the Toy dialect in the Context, all the others
(linalg, affine, std, LLVM, ...) are automatically loaded depending on the
optimization pipeline enabled.

To adjust to this change, stop using the existing dialect registration: the
global registry will be removed soon.

1) For passes, you need to override the method:

virtual void getDependentDialects(DialectRegistry &registry) const {}

and registery on the provided registry any dialect that this pass can produce.
Passes defined in TableGen can provide this list in the dependentDialects list
field.

2) For dialects, on construction you can register dependent dialects using the
provided MLIRContext: `context.getOrLoadDialect<DialectName>()`
This is useful if a dialect may canonicalize or have interfaces involving
another dialect.

3) For loading IR, dialect that can be in the input file must be explicitly
registered with the context. `MlirOptMain()` is taking an explicit registry for
this purpose. See how the standalone-opt.cpp example is setup:

  mlir::DialectRegistry registry;
  mlir::registerDialect<mlir::standalone::StandaloneDialect>();
  mlir::registerDialect<mlir::StandardOpsDialect>();

Only operations from these two dialects can be in the input file. To include all
of the dialects in MLIR Core, you can populate the registry this way:

  mlir::registerAllDialects(registry);

4) For `mlir-translate` callback, as well as frontend, Dialects can be loaded in
the context before emitting the IR: context.getOrLoadDialect<ToyDialect>()
2020-08-18 21:14:39 +00:00
Mehdi Amini 25ee851746 Revert "Separate the Registration from Loading dialects in the Context"
This reverts commit 2056393387.

Build is broken on a few bots
2020-08-15 09:21:47 +00:00
Mehdi Amini 2056393387 Separate the Registration from Loading dialects in the Context
This changes the behavior of constructing MLIRContext to no longer load globally registered dialects on construction. Instead Dialects are only loaded explicitly on demand:
- the Parser is lazily loading Dialects in the context as it encounters them during parsing. This is the only purpose for registering dialects and not load them in the context.
- Passes are expected to declare the dialects they will create entity from (Operations, Attributes, or Types), and the PassManager is loading Dialects into the Context when starting a pipeline.

This changes simplifies the configuration of the registration: a compiler only need to load the dialect for the IR it will emit, and the optimizer is self-contained and load the required Dialects. For example in the Toy tutorial, the compiler only needs to load the Toy dialect in the Context, all the others (linalg, affine, std, LLVM, ...) are automatically loaded depending on the optimization pipeline enabled.

Differential Revision: https://reviews.llvm.org/D85622
2020-08-15 08:07:31 +00:00
Mehdi Amini ba92dadf05 Revert "Separate the Registration from Loading dialects in the Context"
This was landed by accident, will reland with the right comments
addressed from the reviews.
Also revert dependent build fixes.
2020-08-15 07:35:10 +00:00
Mehdi Amini ebf521e784 Separate the Registration from Loading dialects in the Context
This changes the behavior of constructing MLIRContext to no longer load globally registered dialects on construction. Instead Dialects are only loaded explicitly on demand:
- the Parser is lazily loading Dialects in the context as it encounters them during parsing. This is the only purpose for registering dialects and not load them in the context.
- Passes are expected to declare the dialects they will create entity from (Operations, Attributes, or Types), and the PassManager is loading Dialects into the Context when starting a pipeline.

This changes simplifies the configuration of the registration: a compiler only need to load the dialect for the IR it will emit, and the optimizer is self-contained and load the required Dialects. For example in the Toy tutorial, the compiler only needs to load the Toy dialect in the Context, all the others (linalg, affine, std, LLVM, ...) are automatically loaded depending on the optimization pipeline enabled.
2020-08-14 09:40:27 +00:00
Nicolas Vasilache 1a4263d394 [mlir][Vector] Add linalg.copy-based pattern for splitting vector.transfer_read into full and partial copies.
This revision adds a transformation and a pattern that rewrites a "maybe masked" `vector.transfer_read %view[...], %pad `into a pattern resembling:

```
   %1:3 = scf.if (%inBounds) {
      scf.yield %view : memref<A...>, index, index
    } else {
      %2 = linalg.fill(%extra_alloc, %pad)
      %3 = subview %view [...][...][...]
      linalg.copy(%3, %alloc)
      memref_cast %extra_alloc: memref<B...> to memref<A...>
      scf.yield %4 : memref<A...>, index, index
   }
   %res= vector.transfer_read %1#0[%1#1, %1#2] {masked = [false ... false]}
```
where `extra_alloc` is a top of the function alloca'ed buffer of one vector.

This rewrite makes it possible to realize the "always full tile" abstraction where vector.transfer_read operations are guaranteed to read from a padded full buffer.
The extra work only occurs on the boundary tiles.
2020-08-04 08:46:08 -04:00
Nicolas Vasilache d313e9c12e [mlir][Vector] Add transformation + pattern to split vector.transfer_read into full and partial copies.
This revision adds a transformation and a pattern that rewrites a "maybe masked" `vector.transfer_read %view[...], %pad `into a pattern resembling:

```
   %1:3 = scf.if (%inBounds) {
      scf.yield %view : memref<A...>, index, index
    } else {
      %2 = vector.transfer_read %view[...], %pad : memref<A...>, vector<...>
      %3 = vector.type_cast %extra_alloc : memref<...> to
      memref<vector<...>> store %2, %3[] : memref<vector<...>> %4 =
      memref_cast %extra_alloc: memref<B...> to memref<A...> scf.yield %4 :
      memref<A...>, index, index
   }
   %res= vector.transfer_read %1#0[%1#1, %1#2] {masked = [false ... false]}
```
where `extra_alloc` is a top of the function alloca'ed buffer of one vector.

This rewrite makes it possible to realize the "always full tile" abstraction where vector.transfer_read operations are guaranteed to read from a padded full buffer.
The extra work only occurs on the boundary tiles.

Differential Revision: https://reviews.llvm.org/D84631
2020-08-03 12:58:18 -04:00
Mehdi Amini 7ba82a7320 Revert "[mlir][Vector] Add transformation + pattern to split vector.transfer_read into full and partial copies."
This reverts commit 35b65be041.

Build is broken with -DBUILD_SHARED_LIBS=ON with some undefined
references like:

VectorTransforms.cpp:(.text._ZN4llvm12function_refIFvllEE11callback_fnIZL24createScopedInBoundsCondN4mlir25VectorTransferOpInterfaceEE3$_8EEvlll+0xa5): undefined reference to `mlir::edsc::op::operator+(mlir::Value, mlir::Value)'
2020-08-03 16:16:47 +00:00
Nicolas Vasilache 35b65be041 [mlir][Vector] Add transformation + pattern to split vector.transfer_read into full and partial copies.
This revision adds a transformation and a pattern that rewrites a "maybe masked" `vector.transfer_read %view[...], %pad `into a pattern resembling:

```
   %1:3 = scf.if (%inBounds) {
      scf.yield %view : memref<A...>, index, index
    } else {
      %2 = vector.transfer_read %view[...], %pad : memref<A...>, vector<...>
      %3 = vector.type_cast %extra_alloc : memref<...> to
      memref<vector<...>> store %2, %3[] : memref<vector<...>> %4 =
      memref_cast %extra_alloc: memref<B...> to memref<A...> scf.yield %4 :
      memref<A...>, index, index
   }
   %res= vector.transfer_read %1#0[%1#1, %1#2] {masked = [false ... false]}
```
where `extra_alloc` is a top of the function alloca'ed buffer of one vector.

This rewrite makes it possible to realize the "always full tile" abstraction where vector.transfer_read operations are guaranteed to read from a padded full buffer.
The extra work only occurs on the boundary tiles.

Differential Revision: https://reviews.llvm.org/D84631
2020-08-03 04:53:43 -04:00
Nicolas Vasilache 64cdd5b3da [mlir][Vector] Drop declarative transforms
For the purpose of vector transforms, the Tablegen-based infra is subsumed by simple C++ pattern application. Deprecate declarative transforms whose complexity does not pay for itself.

Differential Revision: https://reviews.llvm.org/D84753
2020-07-28 13:11:16 -04:00
Pierre Oechsel ec62e37c86 [mlir] [vector] Add an optional filter to vector contract lowering patterns.
Summary: Vector contract patterns were only parameterized by a `vectorTransformsOptions`. As a result, even if an mlir file was containing several occurrences of `vector.contract`, all of them would be lowered in the same way. More granularity might be required . This Diff adds a `constraint` argument to each of these patterns which allows the user to specify with more precision on which `vector.contract` should each of the lowering apply.

Differential Revision: https://reviews.llvm.org/D83960
2020-07-17 12:03:13 -04:00
aartbik 365434a584 [mlir] [VectorOps] Merge OUTER/AXPY vector.contract lowering into single case
We temporarily had separate OUTER lowering (for matmat flavors) and
AXPY lowering (for matvec flavors). With the new generalized
"vector.outerproduct" semantics, these cases can be merged into
a single lowering method. This refactoring will simplify future
decisions on cost models and lowering heuristics.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D83585
2020-07-10 13:11:54 -07:00
Nicolas Vasilache 05c65dc0fe [mlir][Vector] Add a VectorUnrollInterface and expose UnrollVectorPattern.
The UnrollVectorPattern is can be used in a programmable fashion by:
```
OwningRewritePatternList patterns;
    patterns.insert<UnrollVectorPattern<AddFOp>>(ArrayRef<int64_t>{2, 2}, ctx);
    patterns.insert<UnrollVectorPattern<vector::ContractionOp>>(
        ArrayRef<int64_t>{2, 2, 2}, ctx);
    ...
    applyPatternsAndFoldGreedily(getFunction(), patterns);
```

Differential revision: https://reviews.llvm.org/D83064
2020-07-06 08:09:06 -04:00
aartbik ee01c7a740 [mlir] [VectorOps] Add choice between dot and axpy lowering of vector.contract
Default vector.contract lowering essentially yields a series of sdot/ddot
operations. However, for some layouts a series of saxpy/daxpy operations,
chained through fma are more efficient. This CL introduces a choice between
the two lowering paths. A default heuristic is to follow.

Some preliminary avx2 performance numbers for matrix-times-vector.
Here, dot performs best for 64x64 A x b and saxpy for 64x64 A^T x b.

```
------------------------------------------------------------
            A x b                          A^T x b
------------------------------------------------------------
GFLOPS    sdot (reassoc)    saxpy    sdot (reassoc)    saxpy
------------------------------------------------------------
1x1        0.6               0.9       0.6             0.9
2x2        2.5               3.2       2.4             3.5
4x4        6.4               8.4       4.9             11.8
8x8       11.7               6.1       5.0             29.6
16x16     20.7              10.8       7.3             43.3
32x32     29.3               7.9       6.4             51.8
64x64     38.9                                         79.3
128x128   32.4                                         40.7
------------------------------------------------------------
```

Reviewed By: nicolasvasilache, ftynse

Differential Revision: https://reviews.llvm.org/D83012
2020-07-02 13:21:17 -07:00
aartbik 6391da98f4 [mlir] [VectorOps] Use 'vector.flat_transpose' for 2-D 'vector.tranpose'
Summary:
Progressive lowering of vector.transpose into an operation that
is closer to an intrinsic, and thus the hardware ISA. Currently
under the common vector transform testing flag, as we prepare
deploying this transformation in the LLVM lowering pipeline.

Reviewers: nicolasvasilache, reidtatge, andydavis1, ftynse

Reviewed By: nicolasvasilache, ftynse

Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, llvm-commits

Tags: #llvm, #mlir

Differential Revision: https://reviews.llvm.org/D80772
2020-06-03 14:55:50 -07:00
Nicolas Vasilache 9578a54f50 [mlir][Vector] Add vector contraction to outerproduct lowering
This revision adds the additional lowering and exposes the patterns at a finer granularity for better programmatic reuse. The unit test makes use of the finer grained pattern for simpler checks.

As the ContractionOpLowering is exposed programmatically, cleanup opportunities appear and static class methods are turned into free functions with static visibility.

Differential Revision: https://reviews.llvm.org/D80375
2020-05-26 09:31:26 -04:00
Uday Bondhugula a5b9316b24 [MLIR][NFC] applyPatternsGreedily -> applyPatternsAndFoldGreedily
Rename mlir::applyPatternsGreedily -> applyPatternsAndFoldGreedily. The
new name is a more accurate description of the method - it performs
both, application of the specified patterns and folding of all ops in
the op's region irrespective of whether any patterns have been supplied.

Differential Revision: https://reviews.llvm.org/D77478
2020-04-10 12:55:21 +05:30
River Riddle 80aca1eaf7 [mlir][Pass] Remove the use of CRTP from the Pass classes
This revision removes all of the CRTP from the pass hierarchy in preparation for using the tablegen backend instead. This creates a much cleaner interface in the C++ code, and naturally fits with the rest of the infrastructure. A new utility class, PassWrapper, is added to replicate the existing behavior for passes not suitable for using the tablegen backend.

Differential Revision: https://reviews.llvm.org/D77350
2020-04-07 14:08:52 -07:00
Nicolas Vasilache 2fae7878d5 [mlir][Vector] Mostly-NFC - Restructure options for lowering to LLVM Matrix Intrinsics
Summary:
This revision restructures the calling of vector transforms to make it more flexible to ask for lowering through LLVM matrix intrinsics.
This also makes sure we bail out in degenerate cases (i.e. 1) in which LLVM complains about not being able to scalarize.

Differential Revision: https://reviews.llvm.org/D76266
2020-03-17 22:58:02 -04:00
Rob Suderman 4d60f47b08 [mlir][NFC] Renamed VectorOps to Vector
Summary: Renamed VectorOps to Vector to avoid the redundant Ops suffix.

Differential Revision: https://reviews.llvm.org/D76317
2020-03-17 15:28:08 -07:00
Rob Suderman 69d757c0e8 Move StandardOps/Ops.h to StandardOps/IR/Ops.h
Summary:
NFC - Moved StandardOps/Ops.h to a StandardOps/IR dir to better match surrounding
directories. This is to match other dialects, and prepare for moving StandardOps
related transforms in out for Transforms and into StandardOps/Transforms.

Differential Revision: https://reviews.llvm.org/D74940
2020-02-21 11:58:47 -08:00