Summary:
We rely on these flags in the runtime and to print the contents of
binaries correctly. CUDA updated their ABI encoding recently and we
didn't handle that. It's a new ABI entirely, so we just select on it
when it shows up.
Fixes: https://github.com/llvm/llvm-project/issues/148703
[LLVM] Fix offload and update CUDA ABI for all SM values (#159354)
Summary:
Turns out the new CUDA ABI now applies retroactively to all the other
SMs if you upgrade to CUDA 13.0. This patch changes the scheme, keeping
all the SM flags consistent but using an offset.
Fixes: https://github.com/llvm/llvm-project/issues/159088
This is a hotfix for #148615 - it fixes the issue for me locally.
I think a broader issue is that in the test environment we're calling
olShutDown from a global destructor in the test binaries. We should do
something more controlled: either calling olInit/olShutDown in every
test, or moving those to a GTest global environment (see the sketch
below). I didn't do that originally because it looked like it needed
changes to LLVM's GTest wrapper.
Add `OffloadDeviceTest::getPlatformBackend()` and use it to skip event
tests which currently fail on AMDGPU due to:
```
OL_ERRC_UNIMPLEMENTED: synchronize event not implemented
```
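For illustration, the skip looks roughly like this in a test body; the fixture name and the `OL_PLATFORM_BACKEND_AMDGPU` enum value are assumptions here, only `getPlatformBackend()` comes from this change:
```cpp
#include <gtest/gtest.h>

// Sketch of an event test bailing out early on the unimplemented backend.
TEST_F(olSyncEventTest, SuccessSynchronize) {
  if (getPlatformBackend() == OL_PLATFORM_BACKEND_AMDGPU)
    GTEST_SKIP() << "synchronize event not implemented on AMDGPU";
  // ... exercise event synchronization here ...
}
```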
`olGetKernel` has been replaced by `olGetSymbol` which accepts a
`Kind` parameter. As well as loading information about kernels, it
can now also load information about global variables.
In the future, we want `ol_symbol_handle_t` to represent both kernels
and global variables. The first step in this process is a rename and a
promotion to a "typed handle".
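A rough usage sketch of the replacement entry point; the header name, parameter order, and kind enumerator spellings are assumptions based on the description above:
```cpp
#include <OffloadAPI.h> // assumed header for the liboffload C API

// Look up a kernel and a global variable from an already-created program handle.
void lookupSymbols(ol_program_handle_t Program) {
  ol_symbol_handle_t Kernel = nullptr;
  olGetSymbol(Program, "foo", OL_SYMBOL_KIND_KERNEL, &Kernel);

  ol_symbol_handle_t Global = nullptr;
  olGetSymbol(Program, "global_counter", OL_SYMBOL_KIND_GLOBAL_VARIABLE, &Global);
}
```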
The `GlobalTy` helper has been extended to make both the Size and Ptr
optional. Now `getGlobalMetadataFromDevice`/`Image` can write the size
of the global to the struct instead of just verifying it.
* Add spec generation to offload-tblgen tool
* This patch adds generation of Sphinx-compatible reStructuredText,
utilizing the C domain, to document the Offload API directly from the
spec definition `.td` files.
* Add Sphinx HTML documentation target
* Introduces the `docs-offload-html` target when CMake is configured
with `LLVM_ENABLE_SPHINX=ON` and `SPHINX_OUTPUT_HTML=ON`. It uses
`offload-tblgen -gen-spec` to generate the Offload API specification docs.
Add info queries for queues and events.
`olGetQueueInfo` only supports getting the associated device. We were
already tracking this, so we can implement it for free. We will likely
add other queries in the future (whether the queue is empty, what
flags it was created with, etc.).
`olGetEventInfo` only supports getting the associated queue. This is
another thing we were already storing in the handle. We'll be able to
add other queries in the future (the event type, status, etc.).
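A sketch of the two new queries, assuming the usual liboffload getInfo convention (property enum plus a sized output buffer); the header name and exact signatures are assumptions:
```cpp
#include <OffloadAPI.h> // assumed header for the liboffload C API

// Query the device a queue was created on, then the queue an event belongs to.
void inspect(ol_queue_handle_t Queue, ol_event_handle_t Event) {
  ol_device_handle_t Device = nullptr;
  olGetQueueInfo(Queue, OL_QUEUE_INFO_DEVICE, sizeof(Device), &Device);

  ol_queue_handle_t OwningQueue = nullptr;
  olGetEventInfo(Event, OL_EVENT_INFO_QUEUE, sizeof(OwningQueue), &OwningQueue);
}
```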
Adds two "launch kernel" tests for lib offload, one testing that
global memory works and persists between different kernels, and one
verifying that `[[gnu::constructor]]` works correctly.
Since we now have tests that contain multiple kernels in the same
binary, the test framework has been updated a bit.
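Roughly, the device side of the new tests looks like the following; the names and the `KERNEL` annotation are hypothetical stand-ins for whatever kernel attribute the test framework actually uses:
```cpp
// Sketch of the device code the two tests cover: a global persists across
// launches, and a [[gnu::constructor]] runs before any kernel does.
static int Counter;

[[gnu::constructor]] static void initCounter() { Counter = 100; }

KERNEL void increment() { ++Counter; }                  // first kernel mutates the global
KERNEL void readCounter(int *Out) { *Out = Counter; }   // second kernel observes the value
```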
The Offload and Flang-RT runtimes had the ability to compile GTest
themselves. But in bootstrapping builds, LLVM_LIBRARY_OUTPUT_INTDIR
points to the same location as the stage1 build. If both are building
GTest, they overwrite each other's `libllvm_gtest.a` and
`libllvm_gtest_main.a`, which causes #143134.
This PR removes the ability for the Offload/Flang-RT runtimes to build
their own GTest and instead relies on the stage1 build of GTest. This
was already the case with LLVM_INSTALL_GTEST=ON configurations. For
LLVM_INSTALL_GTEST=OFF configurations, we now also export gtest into the
build tree configuration. Ultimately, this reduces the combinatorial
explosion of configurations in which the unittests could be built
(LLVM_INSTALL_GTEST=ON, GTest built by Offload, GTest built by Flang-RT,
GTest built by Offload and also used by Flang-RT).
GTest and therefore Offload/Runtime unittests will not be available if
the runtimes are configured against an LLVM install tree. Since llvm-lit
isn't available in the install tree either, it doesn't matter.
Note that compiler-rt and libc also use GTest in non-default
configurations. libc also depends on LLVM's GTest build (and would
error out if unavailable), but compiler-rt builds it completely
differently.
Fixes #143134
This is a generated file which contains a macro for all Device Info
keys. This is visible to the plugin interface so that it can use the
definitions in a future patch.
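As an illustration of the pattern (not the actual generated contents), such a header typically exposes an X-macro that the plugin interface can expand; the macro and key names below are hypothetical:
```cpp
// Hypothetical stand-in for the generated header: one entry per Device Info key.
#define OFFLOAD_DEVICE_INFO_KEYS(X)                                            \
  X(NAME)                                                                      \
  X(VENDOR)                                                                    \
  X(DRIVER_VERSION)

// A plugin can expand the list however it likes, e.g. to build an enum of keys...
enum class DeviceInfoKey {
#define MAKE_ENUM(Key) Key,
  OFFLOAD_DEVICE_INFO_KEYS(MAKE_ENUM)
#undef MAKE_ENUM
};

// ...or to count the keys at compile time.
#define COUNT_KEY(Key) +1
constexpr int NumDeviceInfoKeys = 0 OFFLOAD_DEVICE_INFO_KEYS(COUNT_KEY);
#undef COUNT_KEY
```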
The `unloadBinaryImpl` method on the host plugin is now implemented
properly (rather than just being a stub). When an image is unloaded,
it is deallocated and the library associated with it is closed.
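A rough sketch of what the host-plugin unload path now has to do; this is just the shape of it, not the actual plugin code:
```cpp
#include <cstdlib>
#include <dlfcn.h>

// Minimal model of a loaded host image: the buffer allocated for it and the
// dlopen() handle created when it was loaded.
struct HostImage {
  void *Buffer = nullptr;
  void *DLHandle = nullptr;
};

static void unloadImage(HostImage &Img) {
  if (Img.DLHandle)
    dlclose(Img.DLHandle); // close the library associated with the image
  std::free(Img.Buffer);   // deallocate the image itself
  Img = {};
}
```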
The output of the compile-and-run tests is incorrect. These will be used
for reference in future commits that resolve the issues.
Also updated the existing clang LIT test,
target_map_both_pointer_pointee_codegen.cpp, with more constructs and
fewer CHECKs (through more update_cc_test_checks filters).
After #146345 the device info implementation requires a value for every
query, rather than silently returning an empty string. This broke the
test for `OL_DEVICE_INFO_VENDOR` on CUDA.
Add a value to the CUDA plugin. We can quite safely hard code this one.
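Conceptually the fix amounts to something like this (sketch, not the exact plugin code; the function name and returned string are assumptions):
```cpp
#include <string>

// CUDA devices only ever have one vendor, so a literal answer is safe.
std::string getDeviceVendor() { return "NVIDIA"; }
```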
Previously, the user was not able to use more than 48 KB of dynamic
shared memory on NVIDIA GPUs. Doing so requires setting the function
attribute `CU_FUNC_ATTRIBUTE_MAX_DYNAMIC_SHARED_SIZE_BYTES`, which was
not present in the code base. With this commit, we add the ability to
set this attribute, allowing the user to utilize the full power of
their GPU.
In order to avoid resetting the function attribute for each launch of
the same kernel, we keep track of the maximum memory limit (in the
variable `MaxDynCGroupMemLimit`) and only set the attribute if our
desired amount exceeds the limit. By default, this limit is set to 48
KB.
Feedback is greatly appreciated, especially around making the new
variable mutable. I did this because the `launchImpl` method is const
and I am not able to modify the variable otherwise.
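A minimal sketch of the caching logic, using the CUDA driver API; the surrounding struct and function are hypothetical, while `cuFuncSetAttribute` and the attribute name are real driver entry points:
```cpp
#include <cuda.h>

struct KernelState {
  // Mutable so the cached limit can be updated from the const launchImpl path.
  mutable int MaxDynCGroupMemLimit = 48 * 1024; // driver default: 48 KB
};

static CUresult ensureDynSharedMem(const KernelState &State, CUfunction Func,
                                   int RequestedBytes) {
  // Only touch the attribute when the request exceeds the cached limit.
  if (RequestedBytes <= State.MaxDynCGroupMemLimit)
    return CUDA_SUCCESS;

  CUresult Res = cuFuncSetAttribute(
      Func, CU_FUNC_ATTRIBUTE_MAX_DYNAMIC_SHARED_SIZE_BYTES, RequestedBytes);
  if (Res == CUDA_SUCCESS)
    State.MaxDynCGroupMemLimit = RequestedBytes; // remember the raised limit
  return Res;
}
```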
---------
Co-authored-by: Giorgi Gvalia <ggvalia@login33.chn.perlmutter.nersc.gov>
Co-authored-by: Giorgi Gvalia <ggvalia@login07.chn.perlmutter.nersc.gov>