If I understood correctly, the RLlib trainer process waits while the worker processes collect the rollouts, whereas if we set num_workers to 0, the trainer process collects the rollouts itself. Is it possible for the trainer process to collect rollouts alongside the workers? I’m guessing that would increase throughput somewhat.
This file may contain what you’re looking for.
I was wondering the same thing. The trainer process would be idle during sampling. It would be nice if the trainer could take part in the sampling as well.
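
To make the idea concrete, here is a toy sketch in plain Python. It is not RLlib code and does not use any RLlib API: `collect_rollout` and `sample_batch` are hypothetical names, and a thread pool stands in for the worker processes. It only illustrates the pattern being asked about, where the driver hands out sampling tasks and then collects a rollout of its own instead of waiting idle.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def collect_rollout(seed, steps=5):
    """Hypothetical stand-in for sampling one rollout; returns a list of rewards."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(steps)]


def sample_batch(num_workers, driver_also_samples):
    """Gather one rollout per worker, plus one from the driver if requested.

    Assumes num_workers >= 1 (ThreadPoolExecutor rejects max_workers=0).
    """
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Hand one sampling task to each worker; submit() returns immediately.
        futures = [pool.submit(collect_rollout, seed) for seed in range(num_workers)]
        rollouts = []
        if driver_also_samples:
            # Instead of blocking right away, the driver samples too.
            rollouts.append(collect_rollout(seed=num_workers))
        # Now wait for the workers and merge their rollouts.
        rollouts.extend(f.result() for f in futures)
    return rollouts


if __name__ == "__main__":
    print(len(sample_batch(num_workers=2, driver_also_samples=False)))
    print(len(sample_batch(num_workers=2, driver_also_samples=True)))
```

Whether this actually helps throughput in a real setup would depend on how much of the trainer's time is spent waiting versus learning, so treat it as an illustration of the question rather than an answer.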