If I understand correctly, the RLlib trainer process waits while the worker processes collect rollouts, whereas with num_workers set to 0 the trainer process collects the rollouts itself. Is it possible to have the trainer process collect rollouts alongside the workers? I'm guessing that would increase throughput somewhat.
I was wondering the same thing. As far as I can tell, the trainer process sits idle while the workers are sampling. It would be nice if the trainer could take part in the sampling as well.
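
To make the idea concrete, here is a toy sketch of the control flow being asked about. It does not use any RLlib API: `collect_rollout` and `sample_with_driver` are hypothetical names, the "rollout" is just a list of random numbers, and threads stand in for worker processes (so this shows the ordering of work, not a real speedup). The driver submits the worker rollouts, collects one of its own instead of waiting, and only then gathers the results.

```python
from concurrent.futures import ThreadPoolExecutor
import random


def collect_rollout(worker_id, num_steps=5, seed=0):
    """Hypothetical stand-in for sampling one rollout from an environment."""
    rng = random.Random(seed + worker_id)
    return {
        "worker_id": worker_id,
        "rewards": [rng.random() for _ in range(num_steps)],
    }


def sample_with_driver(num_workers):
    """Start rollouts on the workers, then let the driver sample one too."""
    # max(..., 1) keeps the pool valid when num_workers is 0.
    with ThreadPoolExecutor(max_workers=max(num_workers, 1)) as pool:
        futures = [
            pool.submit(collect_rollout, i + 1) for i in range(num_workers)
        ]
        # Instead of blocking immediately, the driver (id 0) samples as well.
        driver_batch = collect_rollout(0)
        # Only now wait for the workers to finish.
        worker_batches = [f.result() for f in futures]
    return [driver_batch] + worker_batches


batches = sample_with_driver(num_workers=2)
print(len(batches))  # 3: one driver batch plus two worker batches
```

With `num_workers=0` the same function degenerates to the driver-only case, returning a single batch. Whether this pays off in practice would presumably depend on how long the driver's own rollout takes relative to the workers, since a slow driver rollout would delay the training step.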