Possible Mario simulation bug?

Is it possible that Mario's position (X coordinate) is sometimes reported incorrectly?

In the image below, Mario is clearly further ahead on the screen, but the simulator returns x = 122.
Because of this mismatch, the agent made a wrong decision and Mario died in the actual game.

In addition, in rarer cases, the Mario simulation requests a jump action while Mario is already in mid-air, so the jump cannot be performed.

The images above were extracted from the observation (obs) variables.

Guys, we found out why the positions of objects fluctuate from time to time:
it's because the coordinates of objects on screen are measured relative to Mario's position, with an offset.
Below is an explanation of how we need to adjust the coordinates extracted by Orak's context extractor, super_mario_env.py.

🧭 Coordinate Frames (the only way this makes sense)

We must name frames explicitly:

World frame: absolute level coordinates

Camera frame: what's currently visible

Mario-centric frame: what the agent should reason in

🎮 Scenario Setup (fixed world)

Assume:

Goomba world x = 420 px
Mario world x  = 300 px


Goomba is 120 px ahead of Mario in the level.

This never changes.

🎥 Camera hysteresis parameters
Screen width = 256 px
Mario dead zone center ≈ 128 px


Camera logic (simplified):

camera_x = mario_world_x - 128


but the camera only updates when Mario leaves the dead zone.
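
A minimal sketch of the dead-zone camera described above. The half-width of the dead zone (`DEAD_ZONE_HALF_WIDTH = 32`) is an assumption: the post does not state it, and the value was picked only so the sketch reproduces the numbers used in this walkthrough. The real game's scroll logic may differ.

```python
SCREEN_CENTER = 128          # dead-zone center, from the post
DEAD_ZONE_HALF_WIDTH = 32    # assumption: not stated in the post


def update_camera(camera_x: int, mario_world_x: int) -> int:
    """Return the new camera origin; it moves only when Mario leaves the dead zone."""
    mario_screen_x = mario_world_x - camera_x
    if abs(mario_screen_x - SCREEN_CENTER) <= DEAD_ZONE_HALF_WIDTH:
        return camera_x                      # inside the dead zone: camera holds
    return mario_world_x - SCREEN_CENTER     # outside: re-center on Mario


camera = 172
for mario_x in (300, 310, 340):
    camera = update_camera(camera, mario_x)
    print(mario_x, camera)
```

With these assumed parameters the camera stays at 172 for Mario at 300 and 310, and moves to 212 once Mario reaches 340.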

🧮 STEP 1: World coordinates (truth)
Entity	World X
Mario	300
Goomba	420

Relative distance (true):

Goomba - Mario = +120 px


This is the only physically meaningful fact.

🧮 STEP 2: Camera frame (what the emulator renders)
Camera position
camera_x = 300 - 128 = 172

Screen positions
mario_screen_x   = mario_world_x   - camera_x
goomba_screen_x  = goomba_world_x  - camera_x


So:

Mario:   300 - 172 = 128
Goomba:  420 - 172 = 248


➡️ Goomba is near the right edge of the screen
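
The conversion in this step can be sketched as two small helpers. The function names are hypothetical and the visibility check is an added illustration; only the subtraction and the 256 px screen width come from the post.

```python
SCREEN_WIDTH = 256  # screen width, from the post


def world_to_screen(world_x: int, camera_x: int) -> int:
    """Convert a world x coordinate into a camera-frame (screen) x coordinate."""
    return world_x - camera_x


def is_visible(screen_x: int) -> bool:
    """True when the coordinate falls inside the visible window."""
    return 0 <= screen_x < SCREEN_WIDTH


camera_x = 300 - 128
goomba_screen_x = world_to_screen(420, camera_x)
# Prints the screen x, the distance to the right edge, and the visibility flag.
print(goomba_screen_x, SCREEN_WIDTH - goomba_screen_x, is_visible(goomba_screen_x))
```

The Goomba lands at screen x = 248, only 8 px from the right edge, which matches the note above.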

🧮 STEP 3: What YOUR CODE reports
Goomba (template matching)

Template matching finds Goomba in screen space:

goomba_detected_x = 248

Mario (fake screen mapping)

Your code:

x_pos = min(128, info['x_pos']) - 6


Since info['x_pos'] = 300:

mario_reported_x = 128 - 6 = 122

🧮 STEP 4: Relative position (agent reasoning)

Agent computes (implicitly):

goomba_x_relative = 248 - 122 = 126 px


Which is close to the true 120 px, but not exact.

This is already an approximation.

🎥 NOW: camera hysteresis kicks in

Mario moves right within dead zone:

Mario world x = 310
Goomba world x = 420


Camera does not move yet.

Recompute
Camera still:
camera_x = 172

Screen positions
Mario screen = 310 - 172 = 138
Goomba screen = 420 - 172 = 248

Reported Mario position
min(128, 310) - 6 = 122   ← unchanged

Agent sees
Goomba relative = 248 - 122 = 126


➡️ Everything looks stable, even though the true distance has already shrunk to 420 - 310 = 110 px.

💥 Camera snap moment (illusion)

Mario exits dead zone:

Mario world x = 340


Camera updates:

camera_x = 340 - 128 = 212

Recompute after snap
Mario screen   = 340 - 212 = 128
Goomba screen  = 420 - 212 = 208


Mario reported x:

min(128, 340) - 6 = 122

🧮 Agent now sees
Goomba relative = 208 - 122 = 86 px


🚨 Sudden jump:

126 → 86


But…

Truth check
Goomba - Mario = 420 - 340 = 80 px


The agent is now closer to the truth, but the change looked abrupt.
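
The three moments above can be replayed in a few lines to compare what the agent computes with the true distance. This is only a sketch: the camera values are copied from the walkthrough rather than simulated, and the variable names are made up for illustration.

```python
GOOMBA_WORLD_X = 420

# (mario_world_x, camera_x) for the three moments walked through above
moments = [(300, 172), (310, 172), (340, 212)]

rows = []
for mario_world_x, camera_x in moments:
    goomba_screen_x = GOOMBA_WORLD_X - camera_x
    reported_x = min(128, mario_world_x) - 6      # expression quoted in the post
    agent_rel = goomba_screen_x - reported_x      # what the agent reasons with
    true_rel = GOOMBA_WORLD_X - mario_world_x     # ground truth
    rows.append((agent_rel, true_rel, agent_rel - true_rel))

for row in rows:
    print(row)
```

The error column goes 6, 16, 6: it grows while Mario drifts inside the dead zone and collapses when the camera re-centers, which is why the reported distance appears to jump.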

🧠 Why this feels wrong (but isn't)
Thing	Changed abruptly?
World geometry	❌ No
Goomba location	❌ No
Mario location	❌ No
Camera origin	✅ Yes
Coordinate frame	✅ Yes

The agent confused:

coordinate change
with
world change

✅ The correct invariant computation

If you had world coordinates:

goomba_rel = goomba_world_x - mario_world_x


This would be stable:

+120 → +110 → +80


Smooth. Predictable. Physics-aligned.
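
A short sketch of how the world-frame distance could be recovered from screen coordinates. It assumes Mario's true screen x is known for each frame (for example from detection), which is an assumption of this example; `relative_world_x` is a hypothetical helper name.

```python
def relative_world_x(mario_world_x: int, mario_screen_x: int, object_screen_x: int) -> int:
    """Distance from Mario to an object, computed in the world frame."""
    camera_x = mario_world_x - mario_screen_x       # camera origin implied by Mario
    object_world_x = camera_x + object_screen_x     # world_x = camera_x + screen_x
    return object_world_x - mario_world_x


# (mario_world_x, mario_screen_x, goomba_screen_x) for the three moments in the example
frames = [(300, 128, 248), (310, 138, 248), (340, 128, 208)]
distances = [relative_world_x(*frame) for frame in frames]
print(distances)
```

The result is 120, 110, 80 regardless of where the camera sits, because the camera origin cancels out.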

🔑 One-line formula to remember
(relative position) = (object_screen_x - mario_screen_x)


NOT

object_screen_x - clipped_info_x


Your code approximates Mario's screen x; that's the distortion source.

The CoordinateProcessor class below aligns everything to world coordinates using
world_x = camera_x + screen_x

Once this holds, camera hysteresis becomes irrelevant.

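import re
from typing import Dict


class CoordinateProcessor:
    def __init__(self):
        # Fallback screen x for Mario when the observation text has no position
        self.mario_screen_x_constant = 122
        # Camera origin from the previous call, used to detect camera snaps
        self.prev_camera_x = None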

    def parse_observation_text(self, obs_text: str) -> Dict:
        parsed = {'mario_screen_x': self.mario_screen_x_constant, 'objects': []}
        mario_match = re.search(r"Position of Mario:\s*\((\d+),\s*(\d+)\)", obs_text)
        if mario_match:
            parsed['mario_screen_x'] = int(mario_match.group(1))
            parsed['mario_screen_y'] = int(mario_match.group(2))

        patterns = {
            'bricks': r"Bricks:\s*([^}\n]+)",
            'question_blocks': r"Question Blocks:\s*([^}\n]+)",
            'goombas': r"Monster Goomba:\s*([^}\n]+)",
            'koopas': r"Monster Koopas:\s*([^}\n]+)",
            'pipes': r"Warp Pipe:\s*([^}\n]+)",
            'pits': r"Pit:\s*([^}\n]+)",
            'flag': r"Flag:\s*([^}\n]+)"
        }
        for typ, pat in patterns.items():
            m = re.search(pat, obs_text)
            if m:
                coords = re.findall(r'\((\d+),(\d+)(?:,(\d+))?(?:,(\d+))?\)', m.group(1))
                for c in coords:
                    x, y = int(c[0]), int(c[1])
                    w = int(c[2]) if len(c) > 2 and c[2] else 16
                    h = int(c[3]) if len(c) > 3 and c[3] else 16
                    parsed['objects'].append({'type': typ, 'screen_x': x, 'screen_y': y, 'w': w, 'h': h})
        return parsed

    def align_and_advise(self, parsed: Dict, world_x: float) -> Dict:
        camera_x = world_x - parsed['mario_screen_x']
        aligned = {
            'camera_x': camera_x,
            'mario_y': parsed.get('mario_screen_y', 45),
            # Compare against None explicitly so a previous camera_x of 0 is not skipped
            'hysteresis': 'snapped' if self.prev_camera_x is not None and abs(camera_x - self.prev_camera_x) > 10 else 'stable',
            'threats': [],
            'pits': []
        }
        self.prev_camera_x = camera_x

        for obj in parsed['objects']:
            world_obj_x = camera_x + obj['screen_x']
            dx = world_obj_x - world_x
            
            if obj['type'] in ['goombas', 'koopas'] and dx > -20 and dx < 200:
                aligned['threats'].append({
                    'type': obj['type'], 
                    'dx': round(dx),
                    'world_x': round(world_obj_x),
                    'screen_y': obj['screen_y']
                })
            
            if obj['type'] == 'pipes' and dx > -20 and dx < 200:
                aligned['threats'].append({
                    'type': 'pipes',
                    'dx': round(dx),
                    'world_x': round(world_obj_x),
                    'height': obj['h']
                })
                
            if obj['type'] == 'pits':
                pit_start_x = world_obj_x
                aligned['pits'].append({
                    'start_x': round(pit_start_x),
                    'dx_to_start': round(dx),
                    'screen_y': obj['screen_y']
                })

        aligned['threats'].sort(key=lambda t: t['dx'])
        aligned['pits'].sort(key=lambda p: p['dx_to_start'])
        
        return aligned
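
To illustrate the input that parse_observation_text expects, the snippet below runs the same regular expressions on a made-up observation string. The sample text is an assumption inferred from the patterns; the real output of super_mario_env.py may be formatted differently.

```python
import re

# Hypothetical observation text, shaped to fit the patterns used above.
sample_obs = (
    "Position of Mario: (122, 45)\n"
    "Monster Goomba: (248,45)\n"
    "Warp Pipe: (180,60,32,48)\n"
)

mario = re.search(r"Position of Mario:\s*\((\d+),\s*(\d+)\)", sample_obs)
mario_screen = (int(mario.group(1)), int(mario.group(2)))

coord_pattern = r"\((\d+),(\d+)(?:,(\d+))?(?:,(\d+))?\)"
goomba_line = re.search(r"Monster Goomba:\s*([^}\n]+)", sample_obs).group(1)
pipe_line = re.search(r"Warp Pipe:\s*([^}\n]+)", sample_obs).group(1)

# Missing width/height groups come back as empty strings, hence the 16 px default.
goombas = re.findall(coord_pattern, goomba_line)
pipes = re.findall(coord_pattern, pipe_line)
print(mario_screen, goombas, pipes)

# Caveat: the object pattern does not allow a space after the comma.
print(re.findall(coord_pattern, "(248, 45)"))
```

Note the caveat in the last line: unlike the Mario pattern, the object pattern has no `\s*` after the comma, so objects printed as "(248, 45)" would be silently skipped.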

If anybody hits this error:

OverflowError: Python integer 1024 out of bounds for uint8
it can be fixed by downgrading numpy to
numpy==1.26.4

Please check this thread @howon_lee :slightly_smiling_face:

Thank you for bringing this up, we are looking into it!

This server file has an issue:
evaluation_utils/mcp_game_servers/super_mario/game/super_mario_env.py

The issue is that get_game_info() needs to return the actual game state, but it needs access to the latest info dict from the gym environment.

Here's the proper fix:

File: evaluation_utils/mcp_game_servers/super_mario/game/super_mario_env.py

Location 1: In the configure() method, add initialization (around line 237):

self.jump_level = 0
self.mario_loc_history = []
self.latest_info = {}  # ADD THIS LINE

Location 2: In the step() method, store the info dict before returning (around line 405-407):


obs = SuperMarioObs(
    state={"image": state},
    image = self.to_pil_image(state),
    info=info,
    reward={"distance": info['x_pos'], "done": done}
)
self.mario_loc_history.append((info['x_pos'], info['y_pos']-34))
self.latest_info = info  # ADD THIS LINE

return obs, info['x_pos'], done, trunc, info

Location 3: In the _start_new_episode() method, store initial info (around line 298-299):


self.env.render()
self.jump_level = 0
self.mario_loc_history = []
self.latest_info = last_info  # ADD THIS LINE

return SuperMarioObs(

Location 4: Replace the get_game_info() method (lines 308-313):

Replace:


def get_game_info(self) -> dict:
    
    return {
        "past_mario_action": f"Jump Level {self.jump_level}",
        "mario_loc_history": f"{self.mario_loc_history[-3:]}"
    }

With:


def get_game_info(self) -> dict:
    # Return the actual game state from gym environment
    # This includes: coins, flag_get, life, score, stage, status, time, world, x_pos, y_pos
    return self.latest_info if hasattr(self, 'latest_info') else {}

This way, the agent will receive the actual x_pos, y_pos, and other game state info through the game_info dict.
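
The caching pattern from the four locations above, reduced to a stand-alone sketch. `StubMarioEnv` is a hypothetical stand-in, not the real environment class: its `step` takes the info dict directly instead of calling the gym environment, and the 128 px screen position is an assumed value.

```python
class StubMarioEnv:
    """Hypothetical stand-in for the real environment, showing only the caching pattern."""

    def __init__(self):
        self.latest_info = {}          # mirrors the line added in configure()

    def step(self, info: dict) -> dict:
        # In the real env, `info` comes from the gym step; here it is passed in directly.
        self.latest_info = info
        return info

    def get_game_info(self) -> dict:
        return self.latest_info


env = StubMarioEnv()
env.step({"x_pos": 340, "y_pos": 79})
game_info = env.get_game_info()

mario_screen_x = 128                   # assumed known, e.g. from the observation
camera_x = game_info.get("x_pos", 0) - mario_screen_x
print(camera_x)
```

Once x_pos is exposed this way, the agent can derive the camera origin itself instead of relying on the clipped screen position.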

Without this fix, you can only track Mario's position with a state variable:
tracked_world_x = tracked_world_x_now ± tracked_world_x_previous