Possible Mario simulation bug?

Is it possible that Mario’s position (X coordinate) is sometimes reported incorrectly?

In the image below, Mario is clearly further ahead on the screen, but the simulator returns x = 122.
Because of this mismatch, the agent made a wrong decision and Mario died in the actual game.

In addition, the simulator occasionally requests a jump action while Mario is already in mid-air, so the jump cannot actually be performed.


The images above were extracted from the observation (obs) variables.

guys, we found why the positions of objects fluctuate from time to time:
on-screen object coordinates are measured relative to Mario's position, with an offset.
Here is an explanation of how to adjust the coordinates extracted by Orak's context extractor, super_mario_env.py.

🧭 Coordinate Frames (the only way this makes sense)

We must name frames explicitly:

World frame – absolute level coordinates

Camera frame – what’s currently visible

Mario-centric frame – what the agent should reason in
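To make the frame distinction concrete, here is a minimal sketch of the conversions between the three frames (the helper names are mine, not part of super_mario_env.py):

```python
def world_to_screen(world_x: int, camera_x: int) -> int:
    # world frame -> camera (screen) frame
    return world_x - camera_x

def screen_to_world(screen_x: int, camera_x: int) -> int:
    # camera (screen) frame -> world frame
    return camera_x + screen_x

def world_to_mario(world_x: int, mario_world_x: int) -> int:
    # world frame -> Mario-centric frame (positive = ahead of Mario)
    return world_x - mario_world_x
```

With the numbers used below (camera at 172, Mario at world 300, Goomba at world 420), `world_to_mario(420, 300)` gives the invariant +120.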

🎮 Scenario Setup (fixed world)

Assume:

Goomba world x = 420 px
Mario world x  = 300 px


Goomba is 120 px ahead of Mario in the level.

This never changes.

🎥 Camera hysteresis parameters
Screen width = 256 px
Mario dead zone center ≈ 128 px


Camera logic (simplified):

camera_x = mario_world_x - 128


but it only updates when Mario leaves the dead zone.
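As a sketch, the dead-zone rule might look like this (the 24 px half-width is an assumption for illustration; the real emulator logic differs in detail):

```python
CENTER = 128          # Mario's resting screen x
DEAD_ZONE_HALF = 24   # assumed half-width of the dead zone

def update_camera(camera_x: int, mario_world_x: int) -> int:
    """Freeze the camera while Mario stays inside the dead zone;
    recenter it (camera_x = mario_world_x - CENTER) once he leaves."""
    mario_screen_x = mario_world_x - camera_x
    if abs(mario_screen_x - CENTER) <= DEAD_ZONE_HALF:
        return camera_x                # inside dead zone: camera frozen
    return mario_world_x - CENTER      # outside: snap / recenter on Mario
```

With the walkthrough values: `update_camera(172, 310)` stays at 172, while `update_camera(172, 340)` snaps to 212.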

🧮 STEP 1 — World coordinates (truth)
| Entity | World X |
|--------|---------|
| Mario  | 300     |
| Goomba | 420     |

Relative distance (true):

Goomba − Mario = +120 px


This is the only physically meaningful fact.

🧮 STEP 2 — Camera frame (what emulator renders)
Camera position
camera_x = 300 − 128 = 172

Screen positions
mario_screen_x   = mario_world_x   − camera_x
goomba_screen_x  = goomba_world_x  − camera_x


So:

Mario:   300 − 172 = 128
Goomba:  420 − 172 = 248


➡️ Goomba is near the right edge of the screen

🧮 STEP 3 — What YOUR CODE reports
Goomba (template matching)

Template matching finds Goomba in screen space:

goomba_detected_x = 248

Mario (fake screen mapping)

Your code:

x_pos = min(128, info['x_pos']) - 6


Since info['x_pos'] = 300:

mario_reported_x = 128 − 6 = 122

🧮 STEP 4 — Relative position (agent reasoning)

Agent computes (implicitly):

goomba_x_relative = 248 − 122 = 126 px


Which is close to the true 120 px, but not exact.

This is already an approximation.
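Steps 1–4 in one runnable snippet (the `min(128, x) - 6` line mirrors the extractor's clipping described above):

```python
mario_world_x, goomba_world_x = 300, 420

# STEP 2: camera frame
camera_x = mario_world_x - 128                  # 172
goomba_screen_x = goomba_world_x - camera_x     # 248

# STEP 3: what the extractor reports for Mario (clip + 6 px offset)
mario_reported_x = min(128, mario_world_x) - 6  # 122

# STEP 4: the relative distance the agent reasons with
goomba_relative = goomba_screen_x - mario_reported_x  # 126, true value is 120
```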

🎥 NOW — Camera hysteresis kicks in

Mario moves right within dead zone:

Mario world x = 310
Goomba world x = 420


Camera does not move yet.

Recompute
Camera still:
camera_x = 172

Screen positions
Mario screen = 310 − 172 = 138
Goomba screen = 420 − 172 = 248

Reported Mario position
min(128, 310) − 6 = 122   ← unchanged

Agent sees
Goomba relative = 248 − 122 = 126


➡️ Everything looks stable.

💥 Camera snap moment (illusion)

Mario exits dead zone:

Mario world x = 340


Camera updates:

camera_x = 340 − 128 = 212

Recompute after snap
Mario screen   = 340 − 212 = 128
Goomba screen  = 420 − 212 = 208


Mario reported x:

min(128, 340) − 6 = 122

🧮 Agent now sees
Goomba relative = 208 − 122 = 86 px


🚨 Sudden jump:

126 → 86


But…

Truth check
Goomba − Mario = 420 − 340 = 80 px


The agent is now closer to the truth, but the change looked abrupt.
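The whole snap can be reproduced in a few lines; note that only `camera_x` changes between the two calls:

```python
def reported_relative(mario_world_x: int, goomba_world_x: int, camera_x: int) -> int:
    # Distance the agent perceives: Goomba's screen x minus the clipped Mario x.
    goomba_screen_x = goomba_world_x - camera_x
    mario_reported_x = min(128, mario_world_x) - 6
    return goomba_screen_x - mario_reported_x

before = reported_relative(310, 420, camera_x=172)  # camera frozen  -> 126
after  = reported_relative(340, 420, camera_x=212)  # camera snapped -> 86
```

The perceived distance drops from 126 to 86 in a single frame, while the true gap shrank smoothly from 110 to 80.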

🧠 Why this feels wrong (but isn’t)
| Thing            | Changed? |
|------------------|----------|
| World geometry   | ❌ No    |
| Goomba location  | ❌ No    |
| Mario location   | ❌ No    |
| Camera origin    | ✅ Yes   |
| Coordinate frame | ✅ Yes   |

The agent confused:

coordinate change
with
world change

✅ The correct invariant computation

If you had world coordinates:

goomba_rel = goomba_world_x − mario_world_x


This would be stable:

+120 → +110 → +80


Smooth. Predictable. Physics-aligned.

🔑 One-line formula to remember
(relative position) = (object_screen_x − mario_screen_x)


NOT

object_screen_x − clipped_info_x


Your code approximates Mario's screen x; that is the source of the distortion.
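A quick check that the screen-space formula really is invariant under the snap (numbers taken from the walkthrough above):

```python
def relative_screen(object_screen_x: int, mario_screen_x: int) -> int:
    # Both positions are in the same (camera) frame, so camera_x cancels out.
    return object_screen_x - mario_screen_x

# Before the snap: camera_x = 172, Mario world 310 -> screen 138, Goomba screen 248
before = relative_screen(248, 138)   # 110, matching the true gap 420 - 310
# After the snap: camera_x = 212, Mario world 340 -> screen 128, Goomba screen 208
after = relative_screen(208, 128)    # 80, matching the true gap 420 - 340
```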

The code below enforces the world-alignment identity
world_x = camera_x + screen_x

Once this holds, camera hysteresis becomes irrelevant.

import re
from typing import Dict


class CoordinateProcessor:
    def __init__(self):
        self.mario_screen_x_constant = 122
        self.prev_camera_x = None

    def parse_observation_text(self, obs_text: str) -> Dict:
        parsed = {'mario_screen_x': self.mario_screen_x_constant, 'objects': []}
        mario_match = re.search(r"Position of Mario:\s*\((\d+),\s*(\d+)\)", obs_text)
        if mario_match:
            parsed['mario_screen_x'] = int(mario_match.group(1))
            parsed['mario_screen_y'] = int(mario_match.group(2))

        patterns = {
            'bricks': r"Bricks:\s*([^}\n]+)",
            'question_blocks': r"Question Blocks:\s*([^}\n]+)",
            'goombas': r"Monster Goomba:\s*([^}\n]+)",
            'koopas': r"Monster Koopas:\s*([^}\n]+)",
            'pipes': r"Warp Pipe:\s*([^}\n]+)",
            'pits': r"Pit:\s*([^}\n]+)",
            'flag': r"Flag:\s*([^}\n]+)"
        }
        for typ, pat in patterns.items():
            m = re.search(pat, obs_text)
            if m:
                coords = re.findall(r'\((\d+),(\d+)(?:,(\d+))?(?:,(\d+))?\)', m.group(1))
                for c in coords:
                    x, y = int(c[0]), int(c[1])
                    w = int(c[2]) if len(c) > 2 and c[2] else 16
                    h = int(c[3]) if len(c) > 3 and c[3] else 16
                    parsed['objects'].append({'type': typ, 'screen_x': x, 'screen_y': y, 'w': w, 'h': h})
        return parsed

    def align_and_advise(self, parsed: Dict, world_x: float) -> Dict:
        camera_x = world_x - parsed['mario_screen_x']
        aligned = {
            'camera_x': camera_x,
            'mario_y': parsed.get('mario_screen_y', 45),
            'hysteresis': 'snapped' if self.prev_camera_x is not None and abs(camera_x - self.prev_camera_x) > 10 else 'stable',
            'threats': [],
            'pits': []
        }
        self.prev_camera_x = camera_x

        for obj in parsed['objects']:
            world_obj_x = camera_x + obj['screen_x']
            dx = world_obj_x - world_x
            
            if obj['type'] in ['goombas', 'koopas'] and dx > -20 and dx < 200:
                aligned['threats'].append({
                    'type': obj['type'], 
                    'dx': round(dx),
                    'world_x': round(world_obj_x),
                    'screen_y': obj['screen_y']
                })
            
            if obj['type'] == 'pipes' and dx > -20 and dx < 200:
                aligned['threats'].append({
                    'type': 'pipes',
                    'dx': round(dx),
                    'world_x': round(world_obj_x),
                    'height': obj['h']
                })
                
            if obj['type'] == 'pits':
                pit_start_x = world_obj_x
                aligned['pits'].append({
                    'start_x': round(pit_start_x),
                    'dx_to_start': round(dx),
                    'screen_y': obj['screen_y']
                })

        aligned['threats'].sort(key=lambda t: t['dx'])
        aligned['pits'].sort(key=lambda p: p['dx_to_start'])
        
        return aligned

if somebody had this error:

OverflowError: Python integer 1024 out of bounds for uint8

it can be fixed by downgrading NumPy:
numpy==1.26.4

Please check this thread @howon_lee :slightly_smiling_face:


Thank you for bringing this up, we are looking into it!


This server file has an issue:
evaluation_utils/mcp_game_servers/super_mario/game/super_mario_env.py

The issue is that get_game_info() needs to return the actual game state, but it needs access to the latest info dict from the gym environment.

Here’s the proper fix:

File: evaluation_utils/mcp_game_servers/super_mario/game/super_mario_env.py

Location 1: In the configure() method, add initialization (around line 237):

self.jump_level = 0
self.mario_loc_history = []
self.latest_info = {}  # ADD THIS LINE

Location 2: In the step() method, store the info dict before returning (around line 405-407):


obs = SuperMarioObs(
    state={"image": state},
    image = self.to_pil_image(state),
    info=info,
    reward={"distance": info['x_pos'], "done": done}
)
self.mario_loc_history.append((info['x_pos'], info['y_pos']-34))
self.latest_info = info  # ADD THIS LINE

return obs, info['x_pos'], done, trunc, info

Location 3: In the _start_new_episode() method, store initial info (around line 298-299):


self.env.render()
self.jump_level = 0
self.mario_loc_history = []
self.latest_info = last_info  # ADD THIS LINE

return SuperMarioObs(

Location 4: Replace the get_game_info() method (lines 308-313):

Replace:


def get_game_info(self) -> dict:
    
    return {
        "past_mario_action": f"Jump Level {self.jump_level}",
        "mario_loc_history": f"{self.mario_loc_history[-3:]}"
    }

With:


def get_game_info(self) -> dict:
    # Return the actual game state from gym environment
    # This includes: coins, flag_get, life, score, stage, status, time, world, x_pos, y_pos
    return self.latest_info if hasattr(self, 'latest_info') else {}

This way, the agent receives the actual x_pos, y_pos, and the rest of the game state through the game_info dict.


Without this fix, the only way to track Mario's world position is to keep your own state variable and accumulate the per-step deltas:

tracked_world_x += x_pos_now - x_pos_previous
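A minimal sketch of that kind of delta bookkeeping (it assumes you can feed it the raw per-step info['x_pos'], not the clipped screen value, which stalls at 122):

```python
class MarioTracker:
    """Accumulates per-step x deltas into a world-position estimate."""

    def __init__(self):
        self.tracked_world_x = 0
        self.prev_x = None

    def update(self, x_pos: int) -> int:
        # Only the frame-to-frame delta is used, so a constant offset in
        # x_pos does not matter; a clipped/saturated x_pos, however, does.
        if self.prev_x is not None:
            self.tracked_world_x += x_pos - self.prev_x
        self.prev_x = x_pos
        return self.tracked_world_x
```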