(openvla-env) root@ubuntu:~/ksim-gym# python -m train
INFO 2025-05-20 13:08:38 [xax.task.mixins.compile] Setting JAX logging level to INFO
INFO 2025-05-20 13:08:38 [xax.task.mixins.compile] Setting JAX compilation cache directory to /root/.cache/jax/jaxcache
INFO 2025-05-20 13:08:38 [xax.task.mixins.compile] Configuring JAX compilation cache parameters
INFO:2025-05-20 13:08:38,816:jax._src.xla_bridge:867: Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory
INFO 2025-05-20 13:08:38 [jax._src.xla_bridge] Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory
2025-05-20 13:08:38.897413: I external/xla/xla/service/service.cc:152] XLA service 0xb17d160 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2025-05-20 13:08:38.897471: I external/xla/xla/service/service.cc:160] StreamExecutor device (0): Orin, Compute Capability 8.7
2025-05-20 13:08:38.906223: I external/xla/xla/pjrt/pjrt_c_api_client.cc:130] PjRtCApiClient created.
STATUS 2025-05-20 13:08:50 [xax.task.mixins.artifacts] /root/ksim-gym/humanoid_walking_task/run_1
STATUS 2025-05-20 13:08:50 [xax.task.mixins.train] /root/ksim-gym/train.py
STATUS 2025-05-20 13:08:50 [xax.task.mixins.train] humanoid_walking_task
STATUS 2025-05-20 13:08:50 [xax.task.mixins.train] JAX devices: [CudaDevice(id=0)]
INFO 2025-05-20 13:08:51 [xax.task.mixins.train] Starting a new training run
PING 2025-05-20 13:08:53 [ksim.task.rl] Model size: 1,090,861 parameters
PING 2025-05-20 13:08:53 [ksim.task.rl] Optimizer size: 2,181,722 parameters
Status
✦ JAX devices: [CudaDevice(id=0)]
✦ humanoid_walking_task
✦ /root/ksim-gym/train.py
✦ /root/ksim-gym/humanoid_walking_task/run_1
Pings
✦ Optimizer size: 2,181,722 parameters
✦ Model size: 1,090,861 parameters
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/root/ksim-gym/train.py", line 658, in <module>
HumanoidWalkingTask.launch(
File "/root/openvla-env/lib/python3.12/site-packages/xax/task/mixins/runnable.py", line 51, in launch
launcher.launch(cls, *cfgs, use_cli=use_cli)
File "/root/openvla-env/lib/python3.12/site-packages/xax/task/launchers/cli.py", line 40, in launch
SingleProcessLauncher().launch(task, *cfgs, use_cli=use_cli_next)
File "/root/openvla-env/lib/python3.12/site-packages/xax/task/launchers/single_process.py", line 30, in launch
run_single_process_training(task, *cfgs, use_cli=use_cli)
File "/root/openvla-env/lib/python3.12/site-packages/xax/task/launchers/single_process.py", line 20, in run_single_process_training
task_obj.run()
File "/root/openvla-env/lib/python3.12/site-packages/ksim/task/rl.py", line 1009, in run
self.run_training()
File "/root/openvla-env/lib/python3.12/site-packages/ksim/task/rl.py", line 2042, in run_training
constants, carry, state = self.initialize_rl_training(mj_model, rng)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/openvla-env/lib/python3.12/site-packages/ksim/task/rl.py", line 1990, in initialize_rl_training
env_states=self._get_env_state(
^^^^^^^^^^^^^^^^^^^^
File "/root/openvla-env/lib/python3.12/site-packages/ksim/task/rl.py", line 1801, in _get_env_state
randomization_dict, physics_state = randomization_fn(
^^^^^^^^^^^^^^^^^
File "/root/openvla-env/lib/python3.12/site-packages/ksim/task/rl.py", line 322, in apply_randomizations
physics_state = engine.reset(physics_model, curriculum_level, reset_rng)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/openvla-env/lib/python3.12/site-packages/xax/utils/jax.py", line 139, in wrapped
res = jitted_fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
jaxlib.xla_extension.XlaRuntimeError: INTERNAL: cuSolver internal error
For simplicity, JAX has removed its internal frames from the traceback of the following exception. Set JAX_TRACEBACK_FILTERING=off to include these.
(openvla-env) root@ubuntu:~/ksim-gym# python -m train
INFO 2025-05-20 13:14:09 [xax.task.mixins.compile] Setting JAX logging level to INFO
INFO 2025-05-20 13:14:09 [xax.task.mixins.compile] Setting JAX compilation cache directory to /root/.cache/jax/jaxcache
INFO 2025-05-20 13:14:09 [xax.task.mixins.compile] Configuring JAX compilation cache parameters
INFO:2025-05-20 13:14:09,150:jax._src.xla_bridge:867: Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory
INFO 2025-05-20 13:14:09 [jax._src.xla_bridge] Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory
2025-05-20 13:14:09.219131: I external/xla/xla/service/service.cc:152] XLA service 0x118922f0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2025-05-20 13:14:09.219184: I external/xla/xla/service/service.cc:160] StreamExecutor device (0): Orin, Compute Capability 8.7
2025-05-20 13:14:09.228180: I external/xla/xla/pjrt/pjrt_c_api_client.cc:130] PjRtCApiClient created.
STATUS 2025-05-20 13:14:20 [xax.task.mixins.artifacts] /root/ksim-gym/humanoid_walking_task/run_2
STATUS 2025-05-20 13:14:20 [xax.task.mixins.train] /root/ksim-gym/train.py
STATUS 2025-05-20 13:14:20 [xax.task.mixins.train] humanoid_walking_task
STATUS 2025-05-20 13:14:20 [xax.task.mixins.train] JAX devices: [CudaDevice(id=0)]
INFO 2025-05-20 13:14:21 [xax.task.mixins.train] Starting a new training run
PING 2025-05-20 13:14:23 [ksim.task.rl] Model size: 1,090,861 parameters
PING 2025-05-20 13:14:23 [ksim.task.rl] Optimizer size: 2,181,722 parameters
Status
✦ JAX devices: [CudaDevice(id=0)]
✦ humanoid_walking_task
✦ /root/ksim-gym/train.py
✦ /root/ksim-gym/humanoid_walking_task/run_2
Pings
✦ Optimizer size: 2,181,722 parameters
✦ Model size: 1,090,861 parameters
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/root/ksim-gym/train.py", line 658, in <module>
HumanoidWalkingTask.launch(
File "/root/openvla-env/lib/python3.12/site-packages/xax/task/mixins/runnable.py", line 51, in launch
launcher.launch(cls, *cfgs, use_cli=use_cli)
File "/root/openvla-env/lib/python3.12/site-packages/xax/task/launchers/cli.py", line 40, in launch
SingleProcessLauncher().launch(task, *cfgs, use_cli=use_cli_next)
File "/root/openvla-env/lib/python3.12/site-packages/xax/task/launchers/single_process.py", line 30, in launch
run_single_process_training(task, *cfgs, use_cli=use_cli)
File "/root/openvla-env/lib/python3.12/site-packages/xax/task/launchers/single_process.py", line 20, in run_single_process_training
task_obj.run()
File "/root/openvla-env/lib/python3.12/site-packages/ksim/task/rl.py", line 1009, in run
self.run_training()
File "/root/openvla-env/lib/python3.12/site-packages/ksim/task/rl.py", line 2042, in run_training
constants, carry, state = self.initialize_rl_training(mj_model, rng)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/openvla-env/lib/python3.12/site-packages/ksim/task/rl.py", line 1990, in initialize_rl_training
env_states=self._get_env_state(
^^^^^^^^^^^^^^^^^^^^
File "/root/openvla-env/lib/python3.12/site-packages/ksim/task/rl.py", line 1801, in _get_env_state
randomization_dict, physics_state = randomization_fn(
^^^^^^^^^^^^^^^^^
File "/root/openvla-env/lib/python3.12/site-packages/ksim/task/rl.py", line 322, in apply_randomizations
physics_state = engine.reset(physics_model, curriculum_level, reset_rng)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/openvla-env/lib/python3.12/site-packages/xax/utils/jax.py", line 139, in wrapped
res = jitted_fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
jaxlib.xla_extension.XlaRuntimeError: INTERNAL: cuSolver internal error
For simplicity, JAX has removed its internal frames from the traceback of the following exception. Set JAX_TRACEBACK_FILTERING=off to include these.
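
Both runs fail at the same place: the jitted `engine.reset` call raises `XlaRuntimeError: INTERNAL: cuSolver internal error` before training starts. On GPU, JAX sends dense linear-algebra routines such as Cholesky and LU-based solves to cuSolver, so one way to narrow this down is to run such a routine outside ksim. The script below is a minimal sketch under that assumption, not a confirmed reproduction of what `engine.reset` does internally. The function name `check_cusolver_path` and the test matrix are hypothetical and chosen only for illustration.

```python
import jax
import jax.numpy as jnp


def check_cusolver_path(n: int = 4):
    """Run two small dense linalg ops and return (backend name, max residual).

    Assumption: on a CUDA backend these ops go through cuSolver, so a failure
    here would point at the JAX/CUDA install rather than at ksim itself.
    """
    # Small symmetric positive-definite matrix, so Cholesky is well defined.
    a = 2.0 * jnp.eye(n) + 0.1 * jnp.ones((n, n))
    b = jnp.ones((n,))

    chol = jnp.linalg.cholesky(a)   # Cholesky factorization
    x = jnp.linalg.solve(a, b)      # LU-based linear solve

    # Force execution so any runtime error surfaces inside this function.
    jax.block_until_ready((chol, x))

    residual = float(jnp.max(jnp.abs(a @ x - b)))
    return jax.default_backend(), residual


if __name__ == "__main__":
    backend, residual = check_cusolver_path()
    print(f"backend={backend} max_residual={residual:.3e}")
```

If this script raises the same `cuSolver internal error` on the Orin, the problem is likely in the JAX/CUDA library setup and not in the training code. If it passes, the failing op is probably specific to what `engine.reset` computes, and rerunning with `JAX_TRACEBACK_FILTERING=off`, as the log suggests, should show the internal frames.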