Hi, thanks for the update! It was running quite smoothly until a few hours ago. Now I cannot connect to the Docker environment at all.
This is the error output from the Docker container. I would appreciate any help:
AgentsVideoEditor._mp4_queue['0'] is empty. Retrying...
AgentsVideoEditor._mp4_queue['0'] is empty. Retrying...
AgentsVideoEditor._mp4_queue['0'] is empty. Retrying...
=================== Gym Client Ready! ===================
## Created agent: agent
AgentsVideoEditor._mp4_queue['0'] is empty. Retrying...
## Stop physics after creating graph
## Creating session
2021-10-03 22:45:47.197613: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Reset agent
Reset agent finished
{"simapp_exception": {"date": "2021-10-03 22:45:49.716887", "function": "deepracer_racetrack_env.py::_update_state::186", "message": "Unclassified exception: list indices must be integers or slices, not list", "exceptionType": "simulation_worker.exceptions", "eventType": "system_error", "errorCode": "500"}}
ERROR: FAULT_CODE: 0
simapp_exit_gracefully: simapp_exit--1
Terminating simapp simulation...
simapp_exit_gracefully - callstack trace=Traceback (callstack)
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/opt/amazon/install/sagemaker_rl_agent/lib/python3.6/site-packages/markov/rollout_worker.py", line 558, in <module>
rollout_entry()
File "/opt/amazon/install/sagemaker_rl_agent/lib/python3.6/site-packages/markov/rollout_worker.py", line 538, in rollout_entry
main()
File "/opt/amazon/install/sagemaker_rl_agent/lib/python3.6/site-packages/markov/rollout_worker.py", line 532, in main
unpause_physics=unpause_physics
File "/opt/amazon/install/sagemaker_rl_agent/lib/python3.6/site-packages/markov/rollout_worker.py", line 200, in rollout_worker
graph_manager.act(act_steps, wait_for_full_episodes=graph_manager.agent_params.algorithm.act_for_full_episodes)
File "/opt/amazon/install/sagemaker_rl_agent/lib/python3.6/site-packages/markov/multi_agent_coach/multi_agent_graph_manager.py", line 438, in act
done = self.top_level_manager.step(None)
File "/opt/amazon/install/sagemaker_rl_agent/lib/python3.6/site-packages/markov/multi_agent_coach/multi_agent_level_manager.py", line 232, in step
for action_info in action_infos])
File "/opt/amazon/install/sagemaker_rl_agent/lib/python3.6/site-packages/markov/multi_agent_coach/multi_agent_environment.py", line 185, in step
self._update_state()
File "/opt/amazon/install/sagemaker_rl_agent/lib/python3.6/site-packages/markov/environments/deepracer_racetrack_env.py", line 186, in _update_state
SIMAPP_EVENT_ERROR_CODE_500)
File "/opt/amazon/install/sagemaker_rl_agent/lib/python3.6/site-packages/markov/log_handler/exception_handler.py", line 74, in log_and_exit
s3_crash_status_file_name=s3_crash_status_file_name)
File "/opt/amazon/install/sagemaker_rl_agent/lib/python3.6/site-packages/markov/log_handler/exception_handler.py", line 179, in simapp_exit_gracefully
callstack_trace = ''.join(traceback.format_stack())
simapp_exit_gracefully - exception trace=Traceback (most recent call last):
File "/opt/amazon/install/sagemaker_rl_agent/lib/python3.6/site-packages/markov/environments/deepracer_racetrack_env.py", line 140, in _update_state
self.action_list)]
File "/opt/amazon/install/sagemaker_rl_agent/lib/python3.6/site-packages/markov/environments/deepracer_racetrack_env.py", line 139, in <listcomp>
[self._agents_info_map.update(agent.update_agent(action)) for agent, action in zip(self.agent_list,
File "/opt/amazon/install/sagemaker_rl_agent/lib/python3.6/site-packages/markov/agents/agent.py", line 79, in update_agent
return self._ctrl_.update_agent(action)
File "/opt/amazon/install/sagemaker_rl_agent/lib/python3.6/site-packages/markov/agent_ctrl/rollout_agent_ctrl.py", line 564, in update_agent
self._data_dict_, action, self._model_metadata_.get_action_dict(action),
File "/opt/amazon/install/sagemaker_rl_agent/lib/python3.6/site-packages/markov/boto/s3/files/model_metadata.py", line 100, in get_action_dict
return self._model_metadata[ModelMetadataKeys.ACTION_SPACE.value][action]
TypeError: list indices must be integers or slices, not list
simapp_exit_gracefully - skipping s3 upload.
simapp_exit_gracefully - Job type is SageOnly. Killing SimApp and Training jobs by PID
simapp_exit_gracefully - Waiting for simapp and training job to come up.
AgentsVideoEditor._mp4_queue['0'] is empty. Retrying...
AgentsVideoEditor._mp4_queue['0'] is empty. Retrying...
AgentsVideoEditor._mp4_queue['0'] is empty. Retrying...
simapp_exit_gracefully - Waiting for simapp and training job to come up.
AgentsVideoEditor._mp4_queue['0'] is empty. Retrying...
AgentsVideoEditor._mp4_queue['0'] is empty. Retrying...
AgentsVideoEditor._mp4_queue['0'] is empty. Retrying...
simapp_exit_gracefully - Waiting for simapp and training job to come up.
AgentsVideoEditor._mp4_queue['0'] is empty. Retrying...
AgentsVideoEditor._mp4_queue['0'] is empty. Retrying...
AgentsVideoEditor._mp4_queue['0'] is empty. Retrying...
simapp_exit_gracefully - Waiting for simapp and training job to come up.
AgentsVideoEditor._mp4_queue['0'] is empty. Retrying...
AgentsVideoEditor._mp4_queue['0'] is empty. Retrying...
AgentsVideoEditor._mp4_queue['0'] is empty. Retrying...
simapp_exit_gracefully - Stopped waiting. SimApp Pid Exists=True, Training Pid Exists=False.
+ exit
XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
after 12268 requests (789 known processed) with 0 events remaining.
This is the error message I get in the terminal:
File "/home/anton/deepracers/agents/ppo_agent.py", line 96, in reset
observations = self.env.reset()
File "/home/anton/deepracers/deepracer-gym/deepracer_gym/envs/deepracer_gym_env.py", line 11, in reset
observation = self.deepracer_helper.env_reset()
File "/home/anton/deepracers/deepracer-gym/deepracer_gym/zmq_client.py", line 49, in env_reset
self.obs = self.zmq_client.recieve_response()
File "/home/anton/deepracers/deepracer-gym/deepracer_gym/zmq_client.py", line 25, in recieve_response
packed_response = self.socket.recv()
File "zmq/backend/cython/socket.pyx", line 781, in zmq.backend.cython.socket.Socket.recv
File "zmq/backend/cython/socket.pyx", line 817, in zmq.backend.cython.socket.Socket.recv
File "zmq/backend/cython/socket.pyx", line 191, in zmq.backend.cython.socket._recv_copy
File "zmq/backend/cython/socket.pyx", line 186, in zmq.backend.cython.socket._recv_copy
File "zmq/backend/cython/checkrc.pxd", line 22, in zmq.backend.cython.checkrc._check_rc
zmq.error.Again: Resource temporarily unavailable
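
Looking at the container trace, the crash comes from the indexing in get_action_dict (model_metadata.py, line 100). My guess, which I have not verified, is that the action arrives there as a list (for example [1]) instead of a plain integer. Below is a minimal sketch that reproduces the same message outside the container. The action_space contents and the reproduce helper are made up for illustration; only the indexing pattern is taken from the trace.

```python
# Hypothetical stand-in for the action space from the model metadata;
# the real entries do not appear in the log.
action_space = [
    {"steering_angle": -30.0, "speed": 1.0},
    {"steering_angle": 0.0, "speed": 1.0},
    {"steering_angle": 30.0, "speed": 1.0},
]


def get_action_dict(action):
    # Same indexing pattern as the failing line in the trace.
    return action_space[action]


def reproduce(action):
    """Return the TypeError message raised for this action, or None if indexing works."""
    try:
        get_action_dict(action)
    except TypeError as exc:
        return str(exc)
    return None


print(reproduce(1))    # integer index: no error
print(reproduce([1]))  # list index: raises the TypeError seen in the container log
```

If this is indeed the cause, sending a plain integer index from my agent should avoid the crash. The zmq.error.Again in my terminal would then simply be the client timing out because the container had already exited.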