I’m trying to set up an AWS instance with Round 1 available hardware:
| GPU | 16 GB Tesla V100 |
It seems that p3.2xlarge has 8 vCPUs, the correct GPU, and 61 GiB of memory, so about 65 GB.
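The GiB-to-GB conversion can be checked with a quick calculation. This is only a sketch of the arithmetic; the helper name `gib_to_gb` is made up for illustration.

```python
GIB = 2**30  # bytes in one gibibyte
GB = 10**9   # bytes in one gigabyte


def gib_to_gb(gib):
    """Convert a size in GiB to GB."""
    return gib * GIB / GB


# 61 GiB expressed in GB
print(f"{gib_to_gb(61):.1f} GB")  # prints "65.5 GB"
```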
When I run the default run.sh script, however, I face two issues:
There are only 2 CPUs available (not 8).
I get something like “failed to put object 4e8e6bbb00a431564d81fd5d010000c801000000 in object store because it is full. Object size is 302591732 bytes. Waiting 16000ms for space to free up…” until it crashes (full trace)
I’ve been playing a bit with RAY_STORE_MEMORY, setting it to 55M, 80M, and some very large values, but I still get the same problem. My current guess is that the “Memory” here does not correspond to RAM but to something else?
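For a sense of scale, the object size from the error message can be compared with the two smaller values mentioned above. This is a rough sketch that assumes "55M" and "80M" mean 55 and 80 million bytes, which may not be how RAY_STORE_MEMORY is actually interpreted. The helper `fits_in_store` is hypothetical and only does arithmetic; it does not call Ray.

```python
OBJECT_SIZE = 302_591_732  # bytes, taken from the error message


def fits_in_store(object_size, store_size):
    """Return True if a single object is no larger than the store."""
    return object_size <= store_size


# Assumption: "55M" and "80M" are read as millions of bytes.
for label, store_size in [("55M", 55 * 10**6), ("80M", 80 * 10**6)]:
    print(label, fits_in_store(OBJECT_SIZE, store_size))

# Object size in MiB, for comparison with RAM-style units
print(f"object size: {OBJECT_SIZE / 2**20:.1f} MiB")
```

Under that assumption the object (about 289 MiB) is larger than both values, although that alone would not explain why very large values fail as well.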
Really curious if someone has encountered similar problems.