GPU utilization

Hi, I have a question about GPU utilization.

When I run my program locally, it runs at about 1000 timesteps/sec.
However, when I submit it, its speed decreases to about 40 timesteps/sec.
I suspect that it isn't using the GPU correctly.

Resources requested: 11/16 CPUs, 1/1 GPUs, 0.0/55.96 GiB heap, 0.0/19.24 GiB objects

How can I make it use the GPU? What parameters do I need to set?
I would appreciate it if you could answer my question.

Hello @shogoakiyama

Can you try setting the following?

num_workers: 6 # Number of rollout workers to run
num_envs_per_worker: 20 # Number of envs to run per rollout worker
num_gpus: 0.6 # Fraction of GPU used by trainer
num_gpus_per_worker: 0.05 # Fraction of GPU used by rollout worker

Please make sure that num_gpus + num_gpus_per_worker * num_workers <= 1. With the values above, that is 0.6 + 0.05 * 6 = 0.9. Note that these are not memory limits: setting num_gpus to 0.5 doesn't mean that half of the GPU memory is available to the trainer process. RLlib doesn't allocate GPUs; it only schedules the workers based on these values. They exist to make it easier to scale the training process to multiple GPUs. Since we use a single GPU during the evaluation, setting these values to any non-zero value should suffice.
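If it helps, here is a minimal sketch of that budget check in plain Python. It does not call RLlib at all. The helper names (requested_gpu_fraction, fits_on_gpus) and the dict layout are made up for illustration, and the only thing it encodes is the inequality above.

```python
# Hypothetical helpers (not part of RLlib): they only evaluate the
# inequality num_gpus + num_gpus_per_worker * num_workers <= available GPUs.

def requested_gpu_fraction(config):
    """Total GPU fraction requested by the trainer plus all rollout workers."""
    return config["num_gpus"] + config["num_gpus_per_worker"] * config["num_workers"]


def fits_on_gpus(config, gpus_available=1.0, tol=1e-9):
    """True if the requested fraction fits; tol absorbs float rounding error."""
    return requested_gpu_fraction(config) <= gpus_available + tol


# The values suggested in this thread.
config = {
    "num_workers": 6,             # Number of rollout workers to run
    "num_envs_per_worker": 20,    # Number of envs to run per rollout worker
    "num_gpus": 0.6,              # Fraction of GPU used by trainer
    "num_gpus_per_worker": 0.05,  # Fraction of GPU used by each rollout worker
}

print(f"requested GPU fraction: {requested_gpu_fraction(config):.2f}")
print(f"fits on one GPU: {fits_on_gpus(config)}")
```

If the check fails, lower num_gpus_per_worker or num_workers until the total fits on the single GPU.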

Thank you for your reply.
I will try that :)