
Run docker-based workload on HPC with GPU

1. Prepare a container image by converting a Docker image from Docker Hub. For further details, please check ref. We recommend using a container that has CUDA installed. We provide several container images with CUDA already installed and configured, such as nvhpc.22.9-devel-cuda_multi-ubuntu20.04.sif. You can find all the provided images under the /pfss/containers/ directory.

singularity pull julia.1.8.2.sif docker://julia:alpine3.16

# move it to the containers folder, then we can run it in the web portal
mkdir -p ~/containers
mv julia.1.8.2.sif ~/containers
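
The pull-and-move step can be wrapped so that an image that is already in place is not downloaded again. The following is only a sketch: pull_if_missing is a hypothetical helper name, and it assumes that singularity is on the PATH and that ~/containers is the destination folder used above.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of Singularity or of this cluster):
# pull a Docker image into a containers folder only if the .sif file is missing.
#   $1 = target .sif file name, $2 = docker:// reference,
#   $3 = destination folder (optional, defaults to ~/containers)
pull_if_missing() {
    local sif_name="$1"
    local docker_ref="$2"
    local dest_dir="${3:-$HOME/containers}"

    mkdir -p "$dest_dir"

    # Nothing to do when the image is already there.
    if [ -f "$dest_dir/$sif_name" ]; then
        echo "exists: $dest_dir/$sif_name"
        return 0
    fi

    # Fail clearly when singularity is not available on this node.
    if ! command -v singularity >/dev/null 2>&1; then
        echo "singularity not found" >&2
        return 1
    fi

    # Same pull as in the step above, written directly to the destination.
    singularity pull "$dest_dir/$sif_name" "$docker_ref"
}

# Usage:
#   pull_if_missing julia.1.8.2.sif docker://julia:alpine3.16
```

Writing directly into the destination folder avoids the separate mv, but the two-command form shown earlier works just as well.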

2. Prepare sbatch arguments for gpu usage

    3.1 First, open the partitions page, find the gpu partition, and then open its nodes tab.

Partitions-OAsis-HPC-Center.png

      3.2 In this popup, we can find all the GPUs in this cluster and their availability. For example, there are 4 GPUs with 1 GPU core and 10 GB memory each, 1 GPU with 3 GPU cores and 40 GB memory, and 1 GPU with the entire A100 capability.

Partitions-OAsis-HPC-Center (1).png
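
The GPU type found in the popup is what goes into the --gpus argument. As a rough sketch, and assuming the standard Slurm command-line tools are also available on the login node, the same information can usually be listed with sinfo, whose %G field prints entries of the form gpu:<type>:<count>. The helper name gpus_flag_from_gres below is hypothetical, and the type names on this cluster other than a100 are not known here.

```shell
#!/usr/bin/env bash

# List nodes of the gpu partition with their generic resources (GRES).
# Assumption: Slurm CLI tools are installed where this is run.
#   sinfo -p gpu -N -o "%N %G"

# Hypothetical helper: turn a typed GRES entry such as "gpu:a100:1"
# into a --gpus argument for srun or sbatch.
#   $1 = GRES entry, $2 = number of GPUs to request (optional, default 1)
# It assumes the entry has the form gpu:<type>:<total>; untyped entries
# such as "gpu:2" are not handled.
gpus_flag_from_gres() {
    gres="$1"
    count="${2:-1}"

    # Drop any trailing "(...)" detail, for example a socket binding.
    gres="${gres%%"("*}"

    # Keep only the <type> field.
    gpu_type=$(printf '%s\n' "$gres" | cut -d: -f2)

    printf -- '--gpus %s:%s\n' "$gpu_type" "$count"
}

# Usage:
#   gpus_flag_from_gres "gpu:a100:1"
```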

4. Execute a command in the container, either directly with srun or through an sbatch script file. In this example, we use 1 A100 GPU to run a GPU query tool (nvidia-smi with srun, nvaccelinfo with sbatch). The output should show the GPU information.

     4.1 With srun cmd

srun -p gpu --gpus a100:1 singularity exec --nv /pfss/containers/nvhpc.22.9-devel-cuda_multi-ubuntu20.04.sif /bin/sh -c nvidia-smi

     4.2 With sbatch script

#!/usr/bin/env bash

#SBATCH -p gpu
#SBATCH --gpus a100:1

singularity exec --nv /pfss/containers/nvhpc.22.9-devel-cuda_multi-ubuntu20.04.sif /bin/sh -c nvaccelinfo
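
Once the script is saved, it can be submitted and its output inspected. The lines below are a minimal sketch: gpu_job.sh is a hypothetical file name for the script above, the two helper functions are illustrative only, and the log name relies on Slurm's default of writing slurm-<jobid>.out in the submission directory when no output option is set.

```shell
#!/usr/bin/env bash

# Hypothetical helper: extract the job id from the line that sbatch
# prints on success, "Submitted batch job <id>".
job_id_from_sbatch_output() {
    printf '%s\n' "$1" | awk '/^Submitted batch job/ {print $4}'
}

# Default Slurm log file name for a given job id.
default_log_for_job() {
    printf 'slurm-%s.out\n' "$1"
}

# Usage (assumes the script above was saved as gpu_job.sh):
#   out=$(sbatch gpu_job.sh)
#   id=$(job_id_from_sbatch_output "$out")
#   squeue -j "$id"                      # check whether the job is still queued or running
#   cat "$(default_log_for_job "$id")"   # nvaccelinfo output once the job has finished
```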