***********************************
Prerequisites and environment setup
***********************************

Before installing this software, make sure you have the prerequisites corresponding to your build environment installed. The CUDA Toolkit and a suitable GNAT Ada compiler are required for building the CUDA bindings.

.. note::

   In a cross-compilation workflow, the version of the CUDA Toolkit on the development host must be the same as the one on the target.

.. _DEFAULT_INSTALL:

CUDA Toolkit installation on a workstation with NVIDIA GPU
**********************************************************

If the development machine has a CUDA-capable NVIDIA GPU, the toolkit can be installed by following the standard setup instructions from NVIDIA. Start by downloading the CUDA Toolkit for your development host from https://developer.nvidia.com/cuda-downloads.

You need to have the CUDA Toolkit in your ``PATH``, and in particular ``ptxas``. You can check this by running:

.. code-block:: shell

   which ptxas

If the command doesn't return anything, CUDA may not be properly installed or may need to be added to your ``PATH``, e.g.:

.. code-block:: shell

   export PATH=/usr/local/cuda/bin:$PATH

.. _CUSTOM_INSTALL:

CUDA Toolkit installation on a workstation without a suitable GPU
*****************************************************************

If the development host doesn't have a CUDA-capable GPU, if the available GPU doesn't match the one on the target, or if the development environment needs to be installed without root permissions, the toolkit can be installed without the video card drivers.

Download the CUDA Toolkit in **runfile format** for your development host from https://developer.nvidia.com/cuda-downloads.

Decide where the toolkit shall be installed and expose its location with the environment variable ``CUDA_ROOT``.

.. warning::

   ``CUDA_ROOT`` cannot point to a folder that contains a ``gcc`` or ``gnat`` installation in any of its subdirectories. By default, ``gcc`` is installed in :file:`/usr`. Avoid installing a custom CUDA Toolkit in the same folder.

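One way to check this constraint is to search the prospective location for ``gcc`` or ``gnat`` executables before installing. The snippet below is only a sketch and assumes ``CUDA_ROOT`` has already been exported:

```shell
# Sketch: warn if a gcc or gnat executable exists anywhere under the
# prospective CUDA_ROOT (assumes CUDA_ROOT is already exported).
if find "${CUDA_ROOT:-.}" -type f \( -name gcc -o -name gnat \) 2>/dev/null | grep -q .; then
    echo "CUDA_ROOT contains a gcc or gnat installation: pick another folder"
else
    echo "CUDA_ROOT looks safe"
fi
```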
.. code-block:: shell

   mkdir cuda-toolkit
   export CUDA_ROOT=`pwd`/cuda-toolkit

Install the toolkit with the runfile downloaded from the NVIDIA website, using the options listed below:

.. code-block:: shell

   sh cuda_<cuda version>_linux.run --silent --toolkit --toolkitpath=$CUDA_ROOT --override --defaultroot=$CUDA_ROOT/root

Expose the CUDA libraries to the linker and the binaries to the setup script:

.. code-block:: shell

   export LD_LIBRARY_PATH=$CUDA_ROOT/targets/<architecture>/lib:$LD_LIBRARY_PATH
   export PATH=$CUDA_ROOT/bin:$PATH

``<architecture>`` above is the name of the architecture of the target platform.

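To see which values of ``<architecture>`` your installation provides, you can list the ``targets`` directory. This is a sketch; it assumes ``CUDA_ROOT`` is set as above and falls back to the default install location otherwise:

```shell
# List the per-architecture directories shipped with the toolkit; each
# entry (e.g. x86_64-linux, aarch64-linux) is a valid <architecture>.
ls "${CUDA_ROOT:-/usr/local/cuda}/targets" 2>/dev/null \
    || echo "no targets directory under ${CUDA_ROOT:-/usr/local/cuda}"
```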
Native compiler on x86_64 Linux
*******************************

If both the development host and the target are running x86_64 Linux, the following tools are required:

- An x86_64 Linux environment with CUDA drivers (see above)
- An installation of GNAT Pro, version 24.0w (20230413) or later

Cross compilation for aarch64 Linux
***********************************

If the development host is running x86_64 Linux and the target is running aarch64 Linux, the following tools are required:

- An aarch64 Linux environment with CUDA drivers on the target
- An installation of the GNAT Pro cross toolchain for aarch64-linux, version 24.0w (20230413) or later, on the development host

Obtain a copy of the system libraries according to the instructions in the cross toolchain documentation and place them in a directory of your choice. **NB!** If you are going to copy the folders from the target to the development host, first make sure that all of the required libraries are installed on the target.

As an example, the files can be copied from the target board as follows:

.. code-block:: shell

   $ mkdir ./sysroot
   $ mkdir ./sysroot/usr
   $ scp -rp <my-aarch64-linux-target>:/usr/include ./sysroot/usr/
   $ scp -rp <my-aarch64-linux-target>:/usr/lib ./sysroot/usr/
   $ scp -rp <my-aarch64-linux-target>:/usr/lib64 ./sysroot/usr/
   $ scp -rp <my-aarch64-linux-target>:/lib ./sysroot/
   $ scp -rp <my-aarch64-linux-target>:/lib64 ./sysroot/

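After copying, it can be worth confirming that the transfer succeeded. The loop below is a sketch that checks for the directories created by the ``scp`` commands above:

```shell
# Sketch: verify that the copied sysroot contains the expected
# directories (paths as created by the scp commands above).
for d in usr/include usr/lib lib; do
    if [ -d "./sysroot/$d" ]; then echo "ok: $d"; else echo "missing: $d"; fi
done
```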
Obtain a copy of the CUDA libraries from the target board and place it in the ``targets`` folder of your CUDA setup:

.. code-block:: shell

   $ scp -rp <my-aarch64-linux-target>:/usr/local/cuda/targets/aarch64-linux ./
   $ sudo mv aarch64-linux <CUDA_TOOLBOX_ROOT>/targets

where ``<CUDA_TOOLBOX_ROOT>`` is the location of the CUDA toolbox:

* ``$CUDA_ROOT`` in case the toolbox was installed according to the instructions in :ref:`CUSTOM_INSTALL`

* ``/usr/local/cuda`` in case of :ref:`DEFAULT_INSTALL`

Make the sysroot location visible to GNAT via the ``ENV_PREFIX`` environment variable:

.. code-block:: shell

   $ export ENV_PREFIX=`pwd`/sysroot

Let the toolchain know that the intended compilation target is aarch64-linux:

.. code-block:: shell

   $ export CUDA_HOST=aarch64-linux

***************
GNAT-CUDA setup
***************

After setting up the environment, you can extract the gnat-cuda package:

.. code-block:: shell

   tar -xzf gnat-cuda-[version]-x86_64-linux-bin.tar.gz

Now you need to know which GPU architecture you're targeting. This is typically an ``sm_`` prefix followed by a number; for example, ``sm_89`` is the Ada Lovelace architecture. You can find details in the `GPU architecture mapping article <https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/>`_. You pass this parameter to the setup script below.

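If you want to determine the value on a machine with an NVIDIA driver installed, recent drivers can report the compute capability directly. The helper below is a sketch that converts such a value into the ``-mcpu`` spelling; the ``nvidia-smi`` query mentioned in the comment is the assumed source of the input:

```shell
# Convert a compute capability such as "8.9" into the matching -mcpu
# value, "sm_89". On a machine with a recent NVIDIA driver, the input
# could come from: nvidia-smi --query-gpu=compute_cap --format=csv,noheader
cap_to_sm() {
    echo "sm_$(echo "$1" | tr -d '.')"
}

cap_to_sm 8.9   # prints sm_89
```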
In the extracted directory, generate the tool suite setup for your current installation:

.. code-block:: shell

   cd gnat-cuda-[version]-x86_64-linux-bin/cuda
   ./setup.sh [-mcpu sm_<GPU architecture>] [-clean]

If the ``-mcpu`` argument is not provided, the setup attempts to determine the compute capability automatically using the utilities in the CUDA toolbox.

The ``-clean`` argument can optionally be used to remove the temporary object files in case the environment changes and the change cannot be detected automatically by the binding generation process. This can happen, for instance, when the compiler is upgraded, or when the same gnat-cuda source tree is used for multiple targets (e.g. for a native x86_64-linux build and aarch64-linux cross compilation) and you switch from one target to another by changing the value of the ``$CUDA_HOST`` variable.

In the same directory, execute:

.. code-block:: shell

   source ./env.sh

You need to perform the above step every time you want to compile a CUDA application.

To check that everything is correctly installed, you can try an example:

.. code-block:: shell

   cd cuda/examples/0_Introduction/vectorAdd
   make
   ./main

.. note::

   In a cross-compilation workflow you have to copy ``main`` to the target before executing it.

.. note::

   If you are switching between targets by changing the ``$CUDA_HOST`` variable, or have upgraded the compiler, remove the old object files by calling ``make clean`` before a new build.

After executing the code you should see:

.. code-block:: shell

   CUDA kernel launch with 16 blocks of 256 threads
   Copy output data from the CUDA device to the host memory
   Test PASSED
   Done