Is it possible that Mario’s position (X coordinate) is sometimes reported incorrectly?
In the image below, Mario is clearly further ahead on the screen, but the simulator returns x = 122.
Because of this mismatch, the agent made a wrong decision and Mario died in the actual game.
In addition, in a few cases the Mario simulation requests a jump action while Mario is already in mid-air, so the jump cannot be performed.
The images above were extracted from the observation (obs) variables.
Guys, we found out why the positions of objects fluctuate from time to time:
it's because the on-screen coordinates of objects are measured relative to Mario's position, with an offset.
Here is an explanation of how we need to adjust the coordinates extracted by Orak's context extractor, super_mario_env.py.
🧭 Coordinate Frames (the only way this makes sense)
We must name frames explicitly:
World frame – absolute level coordinates
Camera frame – what’s currently visible
Mario-centric frame – what the agent should reason in
🎮 Scenario Setup (fixed world)
Assume:
Goomba world x = 420 px
Mario world x = 300 px
Goomba is 120 px ahead of Mario in the level.
This never changes.
🎥 Camera hysteresis parameters
Screen width = 256 px
Mario dead zone center ≈ 128 px
Camera logic (simplified):
camera_x = mario_world_x - 128
but camera_x is only updated when Mario leaves the dead zone.
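The simplified camera rule above can be sketched in Python as follows. This is only an illustration of the model described in this post, not the emulator's actual scrolling code. The post does not give the dead-zone width, so `DEAD_ZONE_HALF_WIDTH = 32` is an assumed value, chosen so that the numbers in the walkthrough below work out; `update_camera` is a hypothetical helper name.

```python
SCREEN_CENTER_X = 128      # dead zone center, as stated in the post
DEAD_ZONE_HALF_WIDTH = 32  # ASSUMPTION: the post does not state the width


def update_camera(camera_x: int, mario_world_x: int) -> int:
    """Return the new camera origin under the simplified hysteresis model."""
    mario_screen_x = mario_world_x - camera_x
    if abs(mario_screen_x - SCREEN_CENTER_X) > DEAD_ZONE_HALF_WIDTH:
        # Mario left the dead zone: re-center the camera on him.
        return mario_world_x - SCREEN_CENTER_X
    # Mario is still inside the dead zone: the camera stays where it is.
    return camera_x


camera_x = 300 - SCREEN_CENTER_X             # 172
camera_small_move = update_camera(camera_x, 310)  # 172, camera holds
camera_big_move = update_camera(camera_x, 340)    # 212, camera snaps
```

With these assumed values, a 10 px move leaves the camera origin untouched, while a 40 px move re-centers it.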
🧮 STEP 1 — World coordinates (truth)
| Entity | World X |
|--------|---------|
| Mario  | 300     |
| Goomba | 420     |
Relative distance (true):
Goomba − Mario = +120 px
This is the only physically meaningful fact.
🧮 STEP 2 — Camera frame (what emulator renders)
Camera position
camera_x = 300 − 128 = 172
Screen positions
mario_screen_x = mario_world_x − camera_x
goomba_screen_x = goomba_world_x − camera_x
So:
Mario: 300 − 172 = 128
Goomba: 420 − 172 = 248
➡️ Goomba is near the right edge of the screen
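As a quick check of the Step 2 arithmetic, here is a minimal sketch of the world-to-camera conversion. The helper names `world_to_screen` and `is_on_screen` are made up for illustration and are not part of super_mario_env.py.

```python
SCREEN_WIDTH = 256  # screen width in pixels, from the setup above


def world_to_screen(world_x: int, camera_x: int) -> int:
    """Convert an absolute level x coordinate into a camera-frame x."""
    return world_x - camera_x


def is_on_screen(screen_x: int) -> bool:
    """True if the camera-frame x falls inside the visible screen."""
    return 0 <= screen_x < SCREEN_WIDTH


camera_x = 300 - 128                               # 172
mario_screen_x = world_to_screen(300, camera_x)    # 128
goomba_screen_x = world_to_screen(420, camera_x)   # 248
goomba_visible = is_on_screen(goomba_screen_x)     # True, 8 px from the right edge
```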
🧮 STEP 3 — What YOUR CODE reports
Goomba (template matching)
Template matching finds Goomba in screen space:
goomba_detected_x = 248
Mario (fake screen mapping)
Your code:
x_pos = min(128, info['x_pos']) - 6
Since info['x_pos'] = 300:
mario_reported_x = 128 − 6 = 122
🧮 STEP 4 — Relative position (agent reasoning)
Agent computes (implicitly):
goomba_x_relative = 248 − 122 = 126 px
This is close to the true 120 px but off by 6 px, so it is already an approximation.
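Steps 3 and 4 can be reproduced with a few lines of Python. The expression inside `reported_mario_x` is the one quoted above; the function names themselves are hypothetical wrappers added for this sketch.

```python
def reported_mario_x(info_x_pos: int) -> int:
    """Mario's x as the extractor reports it (expression quoted in the post)."""
    return min(128, info_x_pos) - 6


def agent_relative_x(object_screen_x: int, info_x_pos: int) -> int:
    """Offset the agent implicitly reasons with: screen x minus reported Mario x."""
    return object_screen_x - reported_mario_x(info_x_pos)


true_relative = 420 - 300                   # 120, from world coordinates
seen_relative = agent_relative_x(248, 300)  # 126
error = seen_relative - true_relative       # 6
```

Because of the `min(128, ...)` clip, the reported value is 122 for every `info['x_pos']` of 128 or more, no matter where Mario is actually drawn on screen.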
🎥 NOW — Camera hysteresis kicks in
Mario moves right within dead zone:
Mario world x = 310
Goomba world x = 420
Camera does not move yet.
Recompute
Camera still:
camera_x = 172
Screen positions
Mario screen = 310 − 172 = 138
Goomba screen = 420 − 172 = 248
Reported Mario position
min(128, 310) − 6 = 122 ← unchanged
Agent sees
Goomba relative = 248 − 122 = 126
➡️ Everything looks stable to the agent, even though the true distance has shrunk from 120 px to 110 px.
💥 Camera snap moment (illusion)
Mario exits dead zone:
Mario world x = 340
Camera updates:
camera_x = 340 − 128 = 212
Recompute after snap
Mario screen = 340 − 212 = 128
Goomba screen = 420 − 212 = 208
Mario reported x:
min(128, 340) − 6 = 122
🧮 Agent now sees
Goomba relative = 208 − 122 = 86 px
🚨 Sudden jump:
126 → 86
But…
Truth check
Goomba − Mario = 420 − 340 = 80 px
The agent's estimate (86 px) is now closer to the truth (80 px) than it was just before the snap (126 px against a true 110 px), but the change looked abrupt.
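The whole walkthrough can be replayed in one loop. This sketch takes Mario's world x and the camera origin for each moment directly from the numbers above, so it makes no assumption about how the camera decides to move; the variable names are illustrative only.

```python
GOOMBA_WORLD_X = 420

# (mario_world_x, camera_x) for the three moments in the walkthrough
frames = [(300, 172), (310, 172), (340, 212)]

rows = []
for mario_world_x, camera_x in frames:
    goomba_screen_x = GOOMBA_WORLD_X - camera_x   # what template matching finds
    mario_reported_x = min(128, mario_world_x) - 6  # what the extractor reports
    agent_sees = goomba_screen_x - mario_reported_x
    truth = GOOMBA_WORLD_X - mario_world_x
    rows.append((agent_sees, truth))
    print(f"mario={mario_world_x} camera={camera_x} "
          f"agent_sees={agent_sees} truth={truth} error={agent_sees - truth}")

# rows == [(126, 120), (126, 110), (86, 80)]
```

The error goes 6, 16, 6: it grows while Mario drifts inside the dead zone and collapses again when the camera re-centers, which is what shows up as the sudden 126 → 86 jump.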
🧠 Why this feels wrong (but isn’t)
| Thing            | Changed? |
|------------------|----------|
| World geometry   | ❌ No    |
| Goomba location  | ❌ No    |
| Mario location   | ❌ No    |
| Camera origin    | ✅ Yes   |
| Coordinate frame | ✅ Yes   |
The agent confused:
coordinate change
with
world change
✅ The correct invariant computation
If you had world coordinates:
goomba_rel = goomba_world_x − mario_world_x
This would be stable:
+120 → +110 → +80
Smooth. Predictable. Physics-aligned.
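If Mario's true screen x is available (for example from detecting Mario's sprite the same way the Goomba is detected, which is an assumption about what the extractor could do), the camera origin cancels out and the world-frame offset can be recovered from screen coordinates alone. A minimal sketch with hypothetical helper names:

```python
def object_world_x(object_screen_x: int, mario_world_x: int,
                   mario_screen_x: int) -> int:
    """Recover an object's world x from its screen x.

    camera_x = mario_world_x - mario_screen_x, so
    object_world_x = object_screen_x + camera_x.
    """
    camera_x = mario_world_x - mario_screen_x
    return object_screen_x + camera_x


def relative_x(object_screen_x: int, mario_screen_x: int) -> int:
    """World-frame offset from Mario; the camera origin cancels out."""
    return object_screen_x - mario_screen_x


# (mario_world_x, mario_screen_x, goomba_screen_x) from the walkthrough
frames = [(300, 128, 248), (310, 138, 248), (340, 128, 208)]

goomba_world = [object_world_x(g, mw, ms) for mw, ms, g in frames]
offsets = [relative_x(g, ms) for _, ms, g in frames]
# goomba_world == [420, 420, 420]
# offsets == [120, 110, 80]
```

The recovered Goomba position stays at 420 across the camera snap, and the offsets match the true sequence +120 → +110 → +80.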
🔑 One-line formula to remember
(relative position) = (object_screen_x − mario_screen_x)
NOT
object_screen_x − clipped_info_x
Your code approximates Mario's screen x; that's the distortion source.