Why did Submission #288794 fail?

rank0: Traceback (most recent call last):

rank0: File “<private_file>”, line 214, in

rank0: raise exc

rank0: File “<private_file>”, line 199, in

rank0: File “<private_file>”, line 192, in main

rank0: File “<private_file>”, line 179, in serve

rank0: File “<private_file>”, line 269, in run_agent

rank0: File “<private_file>”, line 323, in _agent_executor

rank0: result = self.execute(target_attribute, *args, **kwargs)

rank0: File “<private_file>”, line 173, in execute

rank0: return method(*args, **kwargs)

rank0: File “<private_file>”, line 94, in init_agent

rank0: self.agent = run_with_timeout(

rank0: File “<private_file>”, line 162, in run_with_timeout

rank0: return fn(*args, **kwargs)

rank0: File “<private_file>”, line 47, in init

rank0: File “<private_file>”, line 55, in initialize_models

rank0: self.llm = <private_test_file>(

rank0: File “<private_file>”, line 1161, in inner

rank0: return fn(*args, **kwargs)

rank0: File “<private_test_file>”, line 247, in init

rank0: self.llm_engine = <private_test_file>.from_engine_args(

rank0: File “<private_test_file>”, line 510, in from_engine_args

rank0: return engine_cls.from_vllm_config(

rank0: File “<private_test_file>”, line 486, in from_vllm_config

rank0: return cls(

rank0: File “<private_test_file>”, line 278, in init

rank0: File “<private_test_file>”, line 435, in _initialize_kv_caches

rank0: self.model_executor.initialize_cache(num_gpu_blocks, num_cpu_blocks)

rank0: File “<private_test_file>”, line 123, in initialize_cache

rank0: self.collective_rpc(“initialize_cache”,

rank0: File “<private_test_file>”, line 56, in collective_rpc

rank0: answer = run_method(self.driver_worker, method, args, kwargs)

rank0: File “<private_test_file>”, line 2456, in run_method

rank0: return func(*args, **kwargs)

rank0: File “<private_file>”, line 311, in initialize_cache

rank0: raise_if_cache_size_invalid(

rank0: File “<private_file>”, line 565, in raise_if_cache_size_invalid

rank0: raise ValueError(

ValueError: <sensitive_data> Try increasing <environment_variable> or decreasing <sensitive_data> when initializing the engine.

2025-06-16 00:58:06.918
ValueError: The model’s max seq len (8192) is larger than the maximum number of tokens that can be stored in KV cache (8016). Try increasing gpu_memory_utilization or decreasing max_model_len when initializing the engine.
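
Reading the last message literally, the engine refuses to start because the configured maximum sequence length (8192) exceeds what the KV cache can hold (8016 tokens), a shortfall of 176 tokens. The sketch below only reproduces that comparison with the two numbers from the log and shows one way to cap `max_model_len`. The helper names are hypothetical, and the commented vLLM call is an assumption about how the engine is created in the submission, since the actual code is redacted.

```python
# Numbers copied from the error message in the log.
MAX_MODEL_LEN = 8192     # the model's max seq len
KV_CACHE_TOKENS = 8016   # tokens that can be stored in the KV cache


def fits_in_kv_cache(max_model_len: int, kv_cache_tokens: int) -> bool:
    """Same comparison the error describes: the longest sequence must fit."""
    return max_model_len <= kv_cache_tokens


def capped_engine_kwargs(requested_len: int, kv_cache_tokens: int) -> dict:
    """Hypothetical helper: cap max_model_len at the reported cache capacity."""
    return {"max_model_len": min(requested_len, kv_cache_tokens)}


print(fits_in_kv_cache(MAX_MODEL_LEN, KV_CACHE_TOKENS))
kwargs = capped_engine_kwargs(MAX_MODEL_LEN, KV_CACHE_TOKENS)
print(kwargs)

# Assumed usage (not executed here; the model name is a placeholder):
#   from vllm import LLM
#   llm = LLM(model="<your-model>", **kwargs)
# The other option named in the error is to keep 8192 and give the engine
# more GPU memory, if the evaluation hardware has headroom, for example:
#   llm = LLM(model="<your-model>", max_model_len=8192,
#             gpu_memory_utilization=0.95)
```

Lowering `max_model_len` is probably the safer of the two options when GPU memory on the evaluation machine is fixed. The cache capacity can differ between runs and hardware, so a value with some margin below 8016 may be more robust than the exact figure.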