llvm-project

mirror of https://github.com/encounter/llvm-project.git synced 2026-03-30 11:27:19 -07:00

Author	SHA1	Message	Date
Nicolas Vasilache	3f906c54a2	[mlir][Vector] Add 2-D vector contract lowering to ReduceOp This new pattern mixes vector.transpose and direct lowering to vector.reduce. This allows more progressive lowering than immediately going to insert/extract and composes more nicely with other canonicalizations. This has 2 use cases: 1. for very wide vectors the generated IR may be much smaller 2. when we have a custom lowering for transpose ops we can target it directly rather than rely on LLVM. Differential Revision: https://reviews.llvm.org/D85428	2020-08-07 06:17:48 -04:00
Nicolas Vasilache	1353cbc257	[mlir][Vector] NFC - Use matchAndRewrite in ContractionOp lowering patterns Replace the use of separate match and rewrite which unnecessarily duplicates logic. Differential Revision: https://reviews.llvm.org/D85421	2020-08-06 09:02:25 -04:00
Nicolas Vasilache	2d0b05969b	[mlir][Vector] Relax condition for `splitFullAndPartialTransferPrecondition` The `splitFullAndPartialTransferPrecondition` has a restrictive condition to prevent the pattern from being applied recursively if it is nested under an scf.IfOp. Relaxing the condition so that only the immediate parent op must not be an scf.IfOp lets the pattern be applied more generally while still preventing recursion. Differential Revision: https://reviews.llvm.org/D85209	2020-08-04 10:06:21 -04:00
Nicolas Vasilache	1a4263d394	[mlir][Vector] Add linalg.copy-based pattern for splitting vector.transfer_read into full and partial copies. This revision adds a transformation and a pattern that rewrites a "maybe masked" `vector.transfer_read %view[...], %pad` into a pattern resembling: ``` %1:3 = scf.if (%inBounds) { scf.yield %view : memref<A...>, index, index } else { %2 = linalg.fill(%extra_alloc, %pad) %3 = subview %view [...][...][...] linalg.copy(%3, %alloc) memref_cast %extra_alloc: memref<B...> to memref<A...> scf.yield %4 : memref<A...>, index, index } %res= vector.transfer_read %1#0[%1#1, %1#2] {masked = [false ... false]} ``` where `extra_alloc` is a top of the function alloca'ed buffer of one vector. This rewrite makes it possible to realize the "always full tile" abstraction where vector.transfer_read operations are guaranteed to read from a padded full buffer. The extra work only occurs on the boundary tiles.	2020-08-04 08:46:08 -04:00
Nicolas Vasilache	d313e9c12e	[mlir][Vector] Add transformation + pattern to split vector.transfer_read into full and partial copies. This revision adds a transformation and a pattern that rewrites a "maybe masked" `vector.transfer_read %view[...], %pad` into a pattern resembling: ``` %1:3 = scf.if (%inBounds) { scf.yield %view : memref<A...>, index, index } else { %2 = vector.transfer_read %view[...], %pad : memref<A...>, vector<...> %3 = vector.type_cast %extra_alloc : memref<...> to memref<vector<...>> store %2, %3[] : memref<vector<...>> %4 = memref_cast %extra_alloc: memref<B...> to memref<A...> scf.yield %4 : memref<A...>, index, index } %res= vector.transfer_read %1#0[%1#1, %1#2] {masked = [false ... false]} ``` where `extra_alloc` is a top of the function alloca'ed buffer of one vector. This rewrite makes it possible to realize the "always full tile" abstraction where vector.transfer_read operations are guaranteed to read from a padded full buffer. The extra work only occurs on the boundary tiles. Differential Revision: https://reviews.llvm.org/D84631	2020-08-03 12:58:18 -04:00
Mehdi Amini	7ba82a7320	Revert "[mlir][Vector] Add transformation + pattern to split vector.transfer_read into full and partial copies." This reverts commit `35b65be041`. Build is broken with -DBUILD_SHARED_LIBS=ON with some undefined references like: VectorTransforms.cpp:(.text._ZN4llvm12function_refIFvllEE11callback_fnIZL24createScopedInBoundsCondN4mlir25VectorTransferOpInterfaceEE3$_8EEvlll+0xa5): undefined reference to `mlir::edsc::op::operator+(mlir::Value, mlir::Value)'	2020-08-03 16:16:47 +00:00
Nicolas Vasilache	35b65be041	[mlir][Vector] Add transformation + pattern to split vector.transfer_read into full and partial copies. This revision adds a transformation and a pattern that rewrites a "maybe masked" `vector.transfer_read %view[...], %pad` into a pattern resembling: ``` %1:3 = scf.if (%inBounds) { scf.yield %view : memref<A...>, index, index } else { %2 = vector.transfer_read %view[...], %pad : memref<A...>, vector<...> %3 = vector.type_cast %extra_alloc : memref<...> to memref<vector<...>> store %2, %3[] : memref<vector<...>> %4 = memref_cast %extra_alloc: memref<B...> to memref<A...> scf.yield %4 : memref<A...>, index, index } %res= vector.transfer_read %1#0[%1#1, %1#2] {masked = [false ... false]} ``` where `extra_alloc` is a top of the function alloca'ed buffer of one vector. This rewrite makes it possible to realize the "always full tile" abstraction where vector.transfer_read operations are guaranteed to read from a padded full buffer. The extra work only occurs on the boundary tiles. Differential Revision: https://reviews.llvm.org/D84631	2020-08-03 04:53:43 -04:00
Benjamin Kramer	eb41f9edde	[mlir][Vector] Simplify code a bit. NFCI.	2020-08-01 14:49:19 +02:00
Nicolas Vasilache	47cbd9f922	[mlir][Vector] NFC - Improve VectorInterfaces This revision improves and makes better use of OpInterfaces for the Vector dialect. Differential Revision: https://reviews.llvm.org/D84053	2020-07-20 08:24:22 -04:00
Pierre Oechsel	ec62e37c86	[mlir] [vector] Add an optional filter to vector contract lowering patterns. Summary: Vector contract patterns were only parameterized by a `vectorTransformsOptions`. As a result, even if an mlir file contained several occurrences of `vector.contract`, all of them would be lowered in the same way. More granularity might be required. This Diff adds a `constraint` argument to each of these patterns, which allows the user to specify more precisely which `vector.contract` ops each lowering should apply to. Differential Revision: https://reviews.llvm.org/D83960	2020-07-17 12:03:13 -04:00
aartbik	365434a584	[mlir] [VectorOps] Merge OUTER/AXPY vector.contract lowering into single case We temporarily had separate OUTER lowering (for matmat flavors) and AXPY lowering (for matvec flavors). With the new generalized "vector.outerproduct" semantics, these cases can be merged into a single lowering method. This refactoring will simplify future decisions on cost models and lowering heuristics. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D83585	2020-07-10 13:11:54 -07:00
aartbik	9bf6354301	[mlir] [VectorOps] Allow AXPY to be expressed as special case of OUTERPRODUCT This specialization allows sharing more code where an AXPY follows naturally in cases where an OUTERPRODUCT on a scalar would be generated. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D83453	2020-07-10 12:23:24 -07:00
Benjamin Kramer	cca4ac523e	[mlir][VectorOps] Lower vector.outerproduct of int vectors vector.fma and mulf don't work on integers. Use a muli/addi pair or plain muli instead. Differential Revision: https://reviews.llvm.org/D83292	2020-07-07 14:40:07 +02:00
River Riddle	9db53a1827	[mlir][NFC] Remove usernames and google bug numbers from TODO comments. These were largely leftover from when MLIR was a google project, and don't really follow LLVM guidelines.	2020-07-07 01:40:52 -07:00
Nicolas Vasilache	05c65dc0fe	[mlir][Vector] Add a VectorUnrollInterface and expose UnrollVectorPattern. The UnrollVectorPattern can be used in a programmable fashion by: ``` OwningRewritePatternList patterns; patterns.insert<UnrollVectorPattern<AddFOp>>(ArrayRef<int64_t>{2, 2}, ctx); patterns.insert<UnrollVectorPattern<vector::ContractionOp>>( ArrayRef<int64_t>{2, 2, 2}, ctx); ... applyPatternsAndFoldGreedily(getFunction(), patterns); ``` Differential revision: https://reviews.llvm.org/D83064	2020-07-06 08:09:06 -04:00
aartbik	ee01c7a740	[mlir] [VectorOps] Add choice between dot and axpy lowering of vector.contract Default vector.contract lowering essentially yields a series of sdot/ddot operations. However, for some layouts a series of saxpy/daxpy operations, chained through fma are more efficient. This CL introduces a choice between the two lowering paths. A default heuristic is to follow. Some preliminary avx2 performance numbers for matrix-times-vector. Here, dot performs best for 64x64 A x b and saxpy for 64x64 A^T x b. ``` ------------------------------------------------------------ A x b A^T x b ------------------------------------------------------------ GFLOPS sdot (reassoc) saxpy sdot (reassoc) saxpy ------------------------------------------------------------ 1x1 0.6 0.9 0.6 0.9 2x2 2.5 3.2 2.4 3.5 4x4 6.4 8.4 4.9 11.8 8x8 11.7 6.1 5.0 29.6 16x16 20.7 10.8 7.3 43.3 32x32 29.3 7.9 6.4 51.8 64x64 38.9 79.3 128x128 32.4 40.7 ------------------------------------------------------------ ``` Reviewed By: nicolasvasilache, ftynse Differential Revision: https://reviews.llvm.org/D83012	2020-07-02 13:21:17 -07:00
aartbik	63b3933d0c	[mlir] [VectorOps] Replace zero fma with mult for vector.contract More efficient implementation of the multiply-reduce pair, no need to add in a zero vector. Microbenchmarking on AVX2 yields the following difference in vector.contract speedup (over strict-order scalar reduction). SPEEDUP SIMD-fma SIMD-mul 4x4 1.45 2.00 8x8 1.40 1.90 32x32 5.32 5.80 Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D82833	2020-06-30 09:04:20 -07:00
aartbik	55d09dfc7b	[mlir] [VectorOps] Improve vector.create_mask lowering Use vector compares for the 1-D case. This approach scales much better than generating insertion operations, and exposes SIMD directly to backend. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D82402	2020-06-23 14:33:41 -07:00
Thomas Raoux	e4bc08f012	[mlir] Allow vector.contract to have mixed types operands Allow lhs and rhs to have different type than accumulator/destination. Some hardware like GPUs support natively operations like uint8xuint8xuint32. Differential Revision: https://reviews.llvm.org/D82069	2020-06-19 17:08:57 -07:00
aartbik	0d82ab7885	[mlir] [VectorOps] Improve vector.constant_mask lowering Use direct vector constants for the 1-D case. This approach scales much better than generating elaborate insertion operations that are eventually folded into a constant. We could of course generalize the 1-D case to higher ranks, but this simplification already helps in scaling some microbenchmarks that would formerly crash on the intermediate IR length. Reviewed By: reidtatge Differential Revision: https://reviews.llvm.org/D82144	2020-06-19 10:40:08 -07:00
aartbik	1e45b55dcc	[mlir] [VectorOps] Handle 'vector.shape_cast' lowering for all cases Summary: Even though this operation is intended for 1d/2d conversions currently, leaving a semantic hole in the lowering prohibits proper testing of this operation. This CL adds a straightforward reference implementation for the missing cases. Reviewers: nicolasvasilache, mehdi_amini, ftynse, reidtatge Reviewed By: reidtatge Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, msifontes Tags: #mlir Differential Revision: https://reviews.llvm.org/D81503	2020-06-09 16:08:45 -07:00
aartbik	c19fae507e	[mlir] [VectorOps] Add missing comments to CreateMaskOp lowering Summary: Add missing comment to CreateMask. Fixed typo in ConstantMask comment. Reviewers: nicolasvasilache, rriddle, reidtatge, ftynse Reviewed By: ftynse Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul Tags: #mlir Differential Revision: https://reviews.llvm.org/D81125	2020-06-04 12:50:47 -07:00
aartbik	6391da98f4	[mlir] [VectorOps] Use 'vector.flat_transpose' for 2-D 'vector.transpose' Summary: Progressive lowering of vector.transpose into an operation that is closer to an intrinsic, and thus the hardware ISA. Currently under the common vector transform testing flag, as we prepare to deploy this transformation in the LLVM lowering pipeline. Reviewers: nicolasvasilache, reidtatge, andydavis1, ftynse Reviewed By: nicolasvasilache, ftynse Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, llvm-commits Tags: #llvm, #mlir Differential Revision: https://reviews.llvm.org/D80772	2020-06-03 14:55:50 -07:00
Nicolas Vasilache	ba10daa820	[mlir][Vector] Add more vector.contract -> outerproduct lowerings and fix vector.contract type inference. This revision expands the types of vector contractions that can be lowered to vector.outerproduct. All 8 permutation cases are supported. The idiomatic manipulation of AffineMap written declaratively makes this straightforward. In the process a bug with the vector.contract verifier was uncovered. The vector shape verification part of the contract op is rewritten to use AffineMap composition. One bug in the vector `ops.mlir` test is fixed and a new case not yet captured is added to the vector `invalid.mlir` test. Differential Revision: https://reviews.llvm.org/D80393	2020-05-26 15:40:55 -04:00
Nicolas Vasilache	9578a54f50	[mlir][Vector] Add vector contraction to outerproduct lowering This revision adds the additional lowering and exposes the patterns at a finer granularity for better programmatic reuse. The unit test makes use of the finer grained pattern for simpler checks. As the ContractionOpLowering is exposed programmatically, cleanup opportunities appear and static class methods are turned into free functions with static visibility. Differential Revision: https://reviews.llvm.org/D80375	2020-05-26 09:31:26 -04:00

46 Commits
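
The dot versus axpy choice described in commit ee01c7a740 can be illustrated with plain scalar loops. This is only a sketch of the two evaluation orders for a matrix-times-vector product, not the MLIR lowering itself; `Matrix`, `matvecDot`, and `matvecAxpy` are hypothetical names introduced for illustration and do not appear in the commits.

```cpp
#include <cstddef>
#include <vector>

// Row-major dense matrix; a hypothetical helper type for this sketch only.
struct Matrix {
  std::size_t rows, cols;
  std::vector<float> data;  // rows * cols elements
  float at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// "dot" flavor: each output element is one reduction over a row of A.
std::vector<float> matvecDot(const Matrix &a, const std::vector<float> &b) {
  std::vector<float> out(a.rows, 0.0f);
  for (std::size_t i = 0; i < a.rows; ++i) {
    float acc = 0.0f;
    for (std::size_t j = 0; j < a.cols; ++j)
      acc += a.at(i, j) * b[j];
    out[i] = acc;
  }
  return out;
}

// "axpy" flavor: the whole output vector is updated once per column of A,
// i.e. out += b[j] * A[:, j], which maps naturally onto chained fma ops.
std::vector<float> matvecAxpy(const Matrix &a, const std::vector<float> &b) {
  std::vector<float> out(a.rows, 0.0f);
  for (std::size_t j = 0; j < a.cols; ++j)
    for (std::size_t i = 0; i < a.rows; ++i)
      out[i] += b[j] * a.at(i, j);
  return out;
}
```

Both functions compute the same product; they differ only in loop order and therefore in which memory accesses are contiguous, which is one plausible reason the commit reports different winners for `A x b` and `A^T x b`.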