Strange REALRobot-v0 environment properties

As https://github.com/AIcrowd/real_robots/blob/master/environment.md says,

the action vector consists of 9 components, each component (supposedly) responsible for its own joint.

In reality, a single action component affects several joints.

To be more precise:

– changes in components 0:7 impact finger_0 position;

– action component 8 impacts the finger_0 position, just as action component 7 does;

– it’s impossible to move finger_1 with component 8 at all.

Was the behavior described above intended?

If so, does the actual KUKA manipulator work like this, or was this ‘entanglement’ an artificial complication

introduced for the purposes of the competition?

Dear VPavlov,
each of the first 7 components acts on a single joint of the arm.
The remaining two components act on the two-finger gripper and they move it so that the fingers keep a symmetrical position.
The 8th component moves the bottom part of both fingers at once from 0 to 90 degrees.
The 9th component moves the upper part of both fingers at once - it can go from 0 to half of what the 8th component is.

This somewhat simplifies gripping for an agent that starts learning with random movements, since the gripper will always behave in a coherent fashion that usually permits grasping objects.
On the other hand, some degrees of freedom are taken away since fingers cannot be moved independently any more.
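
To make the coupling concrete, here is a minimal sketch of the rule described above. It is not the environment's actual code: the function name effective_gripper is hypothetical, and it assumes the limit is applied to the commanded values (the environment may instead clip against the current joint positions).

```python
import math


def effective_gripper(a8, a9):
    """Hypothetical model of the gripper coupling; not taken from real_robots."""
    # 8th component: bottom part of both fingers, limited to 0..90 degrees.
    lower = max(0.0, min(a8, math.pi / 2))
    # 9th component: upper part of both fingers, limited to 0..half of the 8th.
    upper = max(0.0, min(a9, lower / 2))
    return lower, upper


# Both components commanded to pi/2: the upper part is capped at pi/4.
print(effective_gripper(math.pi / 2, math.pi / 2))
# 8th component at zero: the upper part is forced to zero as well.
print(effective_gripper(0.0, math.pi / 2))
```

Under this assumption, commanding the 8th component back to zero closes both parts of the fingers, whatever value the 9th component requests.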

Have a look at this example, which visualizes in a GUI the effects of the 8th and 9th components:

import numpy as np
import real_robots
import gym

env = gym.make('REALRobot-v0')
env.render('human')
env.reset()

# Using the 4th component (index 3) to put the arm in better view
for _ in range(100):
    obs, _, _, _ = env.step(np.array([0, 0, 0, -1.5, 0, 0, 0, 0, 0]))

# The 8th component moves the bottom part of both fingers at once 
# from 0 to 90 degrees.
# See the gripper opening keeping its fingers symmetrical.
for _ in range(100):
    obs, _, _, _ = env.step(np.array([0, 0, 0, -1.5, 0, 0, 0, np.pi/2, 0]))


# The 9th component moves the upper part of both fingers at once.
# See the gripper widening by moving the upper parts of the fingers.
for _ in range(100):
    obs, _, _, _ = env.step(np.array([0, 0, 0, -1.5, 0, 0, 0, np.pi/2, np.pi/2]))


# The 9th component can go from 0 to half of what the 8th component is.
# Now see that while the gripper closes (8th component to zero), both
# the upper and the bottom part of the fingers close.
# This is because the 9th component is automatically reduced to its maximum
# value, i.e. half of the current 8th component.
for _ in range(100):
    obs, _, _, _ = env.step(np.array([0, 0, 0, -1.5, 0, 0, 0, 0, np.pi/2]))