====== PyTorch ======

PyTorch is available on Comet as the module ''pytorch-env''. Loading this module provides a Python installation with PyTorch already included:

<code bash>
$ module load pytorch-env
$ which python
/opt/software/easybuild/software/Python/3.10.4-GCCcore-11.3.0/virtual_enviroments/pytorch-env/bin/python
$ python
Python 3.10.4 (main, May 23 2025, 14:30:59) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>>
</code>

If you need PyTorch built against a different version of Python, the following section shows how to install a custom version.

===== Installing / Testing Newer Versions =====

You can install newer versions of PyTorch by creating a Python virtual environment and using pip to install PyTorch into it. The Slurm sbatch script below launches a job which sets up a new Python virtual environment (named ''pytorch-test'', in your home directory), installs PyTorch (the CUDA 12.8 build) into that environment, and then runs a simple test script which enumerates the available CUDA devices and reports their memory:

<code bash>
#!/bin/bash
#SBATCH -p gpu-s_paid
#SBATCH --nodelist=gpu001
#SBATCH --gres=gpu:L40:8
#SBATCH --account=comet_training

module load Python/3.13.1-GCCcore-14.2.0
module load CUDA

# Install PyTorch
# ===============
cd $HOME
python3 -m venv pytorch-test
cd pytorch-test
source bin/activate
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128 >/dev/null

# Run PyTorch test script
cd $HOME/pytorch-test
source bin/activate
echo "
import torch

print(f'Current CUDA version: {torch.version.cuda}')
print(f'Current CUDA device ID: {torch.cuda.current_device()}')
print(f'Number of CUDA devices: {torch.cuda.device_count()}')

if torch.cuda.device_count() > 0:
    print(f'Device # / Device Name / Total RAM / Allocated RAM / Cached RAM')
    for i in range(0, torch.cuda.device_count()):
        # CUDA device info
        device_name = torch.cuda.get_device_name(i)
        device_total_memory = int(torch.cuda.get_device_properties(i).total_memory / 1024 / 1024)
        device_allocated_memory = int(torch.cuda.memory_allocated(i) / 1024 / 1024)
        # memory_reserved replaces the deprecated torch.cuda.memory_cached
        device_cached_memory = int(torch.cuda.memory_reserved(i) / 1024 / 1024)
        print(f'{i}: {device_name} : {device_total_memory} MB : {device_allocated_memory} MB : {device_cached_memory} MB')

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'PyTorch using: {device}')
" > test.py
python3 test.py
</code>

----

[[:advanced:software|Back to Advanced Software page]]
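As a rough usage sketch (the filename ''pytorch-install.sh'' is only an example, not a file provided on Comet), the sbatch script above could be submitted and the environment it creates reused in later sessions without reinstalling:

<code bash>
# Submit the install/test job to Slurm; the test script's output
# will appear in the job's output file (slurm-<jobid>.out by default)
sbatch pytorch-install.sh

# In later jobs or interactive sessions, reuse the existing
# pytorch-test environment instead of reinstalling:
module load Python/3.13.1-GCCcore-14.2.0
module load CUDA
source $HOME/pytorch-test/bin/activate
python3 -c "import torch; print(torch.__version__)"
</code>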