Opened 4 days ago

Last modified 3 hours ago

#20295 assigned defect

Make Boltz and OpenFold installs work with Nvidia Blackwell GPUs

Reported by: Tom Goddard
Owned by: Tom Goddard
Priority: moderate
Milestone:
Component: Structure Prediction
Version:
Keywords:
Cc:
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: ChimeraX

Description

When installing Boltz and OpenFold, ChimeraX installs PyTorch 2.7.1 built for CUDA 12.6. That build does not work on newer Nvidia Blackwell GPUs such as the RTX 5090, which need a newer PyTorch and CUDA. The Limitations section of the ChimeraX Boltz web page has instructions for how the user can fix this

https://www.rbvi.ucsf.edu/chimerax/data/boltz-apr2025/boltz_help.html#limitations

but no user will find that.
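The failure mode is simple to state: a PyTorch wheel only ships GPU kernels for the compute capabilities it was compiled for, and the cu126 wheel's list stops at sm_90, while Blackwell consumer cards report sm_120. A minimal sketch of that membership check (the supported list is copied from the error message below; no GPU or torch install is needed to run it):

```python
# Supported compute capabilities printed by the PyTorch 2.7.1 / CUDA 12.6
# wheel in the error message below.  sm_120 is the capability reported by
# Blackwell consumer GPUs such as the RTX 5090 / RTX 5080.
SUPPORTED_CU126 = ["sm_50", "sm_60", "sm_61", "sm_70", "sm_75",
                   "sm_80", "sm_86", "sm_90"]

def gpu_supported(device_capability: str, supported: list[str]) -> bool:
    """Return True if the wheel ships kernels for this compute capability.

    This simplified check is just list membership, which is what the
    "no kernel image is available" failure reduces to for these wheels.
    """
    return device_capability in supported

print(gpu_supported("sm_120", SUPPORTED_CU126))  # False -> Blackwell fails
print(gpu_supported("sm_90", SUPPORTED_CU126))   # True  -> Hopper works
```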

Change History (2)

comment:1 by Tom Goddard, 4 days ago

This problem was reported by Trevor Sewell on the ChimeraX mailing list:

From: trevor SEWELL via ChimeraX-users 
Subject: [chimerax-users] Error using Boltz in Windows 11 with ChimeraX 1.11.1
Date: May 8, 2026 at 2:46:14 PM PDT
To: "chimerax-users@cgl.ucsf.edu" 
Reply-To: trevor SEWELL 

How do I remedy the following error  with Boltz predictions on a Windows 11 system with ChimeraX 1.11.1
I have tried Cuda 13.2 and Cuda 12.8

I am  not sufficiently skilful to obey the following instructions: 
NVIDIA GeForce RTX 5080 Laptop GPU with CUDA capability sm_120 is not compatible with the current PyTorch installation. 
The current PyTorch install supports CUDA capabilities sm_50 sm_60 sm_61 sm_70 sm_75 sm_80 sm_86 sm_90. 
If you want to use the NVIDIA GeForce RTX 5080 Laptop GPU GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/ 


Many thanks for your kind help!

All the best

Trevor

Running boltz prediction failed with exit code 1:
command:
C:\Users\sewel/boltz22\Scripts\boltz.exe predict C:\Users\sewel/Desktop/boltz_omega_amidase\omega_amidase.yaml --accelerator gpu --no_kernels
stdout:
Boltz version 2.2.0 
Checking input data. 
Processing 1 inputs with 1 threads. 
Running structure prediction for 1 input. 

stderr:
0%|          | 0/1 [00:00<?, ?it/s]
Using bfloat16 Automatic Mixed Precision (AMP) 
GPU available: True (cuda), used: True 
TPU available: False, using: 0 TPU cores 
HPU available: False, using: 0 HPUs 
C:\Users\sewel\boltz22\Lib\site-packages\pytorch_lightning\trainer\connectors\logger_connector\logger_connector.py:76: Starting from v1.9.0, `tensorboardX` has been removed as a dependency of the `pytorch_lightning` package, due to potential conflicts with other packages in the ML ecosystem. For this reason, `logger=True` will use `CSVLogger` as the default logger, unless the `tensorboard` or `tensorboardX` packages are found. Please `pip install lightning[extra]` or one of them to enable TensorBoard support by default 
Fri May 8 10:27:30 2026: Loading Boltz structure prediction weights 
C:\Users\sewel\boltz22\Lib\site-packages\pytorch_lightning\utilities\migration\utils.py:56: The loaded checkpoint was produced with Lightning v2.5.0.post0, which is newer than your current Lightning version: v2.5.0 
Fri May 8 10:28:00 2026: Finished loading Boltz structure prediction weights 
Fri May 8 10:28:00 2026: Starting structure inference 
C:\Users\sewel\boltz22\Lib\site-packages\torch\cuda\__init__.py:287: UserWarning: 
NVIDIA GeForce RTX 5080 Laptop GPU with CUDA capability sm_120 is not compatible with the current PyTorch installation. 
The current PyTorch install supports CUDA capabilities sm_50 sm_60 sm_61 sm_70 sm_75 sm_80 sm_86 sm_90. 
If you want to use the NVIDIA GeForce RTX 5080 Laptop GPU GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/ 

warnings.warn( 
You are using a CUDA device ('NVIDIA GeForce RTX 5080 Laptop GPU') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision 
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0] 
C:\Users\sewel\boltz22\Lib\site-packages\pytorch_lightning\trainer\connectors\data_connector.py:420: Consider setting `persistent_workers=True` in 'predict_dataloader' to speed up the dataloader worker initialization. 
Traceback (most recent call last): 
File "C:\Users\sewel\boltz22\Lib\site-packages\pytorch_lightning\trainer\call.py", line 47, in _call_and_handle_interrupt 
return trainer_fn(*args, **kwargs) 
^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
File "C:\Users\sewel\boltz22\Lib\site-packages\pytorch_lightning\trainer\trainer.py", line 898, in _predict_impl 
results = self._run(model, ckpt_path=ckpt_path) 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
File "C:\Users\sewel\boltz22\Lib\site-packages\pytorch_lightning\trainer\trainer.py", line 982, in _run 
results = self._run_stage() 
^^^^^^^^^^^^^^^^^ 
File "C:\Users\sewel\boltz22\Lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1021, in _run_stage 
return self.predict_loop.run() 
^^^^^^^^^^^^^^^^^^^^^^^ 
File "C:\Users\sewel\boltz22\Lib\site-packages\pytorch_lightning\loops\utilities.py", line 179, in _decorator 
return loop_run(self, *args, **kwargs) 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
File "C:\Users\sewel\boltz22\Lib\site-packages\pytorch_lightning\loops\prediction_loop.py", line 105, in run 
self.setup_data() 
File "C:\Users\sewel\boltz22\Lib\site-packages\pytorch_lightning\loops\prediction_loop.py", line 162, in setup_data 
length = len(dl) if has_len_all_ranks(dl, trainer.strategy, allow_zero_length) else float("inf") 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
File "C:\Users\sewel\boltz22\Lib\site-packages\pytorch_lightning\utilities\data.py", line 105, in has_len_all_ranks 
if total_length == 0: 
^^^^^^^^^^^^^^^^^ 
RuntimeError: CUDA error: no kernel image is available for execution on the device 
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. 
For debugging consider passing CUDA_LAUNCH_BLOCKING=1 
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions. 


During handling of the above exception, another exception occurred: 

Traceback (most recent call last): 
File "<frozen runpy>", line 198, in _run_module_as_main 
File "<frozen runpy>", line 88, in _run_code 
File "C:\Users\sewel\boltz22\Scripts\boltz.exe\__main__.py", line 7, in <module> 
File "C:\Users\sewel\boltz22\Lib\site-packages\click\core.py", line 1157, in __call__ 
return self.main(*args, **kwargs) 
^^^^^^^^^^^^^^^^^^^^^^^^^^ 
File "C:\Users\sewel\boltz22\Lib\site-packages\click\core.py", line 1078, in main 
rv = self.invoke(ctx) 
^^^^^^^^^^^^^^^^ 
File "C:\Users\sewel\boltz22\Lib\site-packages\click\core.py", line 1688, in invoke 
return _process_result(sub_ctx.command.invoke(sub_ctx)) 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
File "C:\Users\sewel\boltz22\Lib\site-packages\click\core.py", line 1434, in invoke 
return ctx.invoke(self.callback, **ctx.params) 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
File "C:\Users\sewel\boltz22\Lib\site-packages\click\core.py", line 783, in invoke 
return __callback(*args, **kwargs) 
^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
File "C:\Users\sewel\boltz22\Lib\site-packages\boltz\main.py", line 1355, in predict 
trainer.predict( 
File "C:\Users\sewel\boltz22\Lib\site-packages\pytorch_lightning\trainer\trainer.py", line 859, in predict 
return call._call_and_handle_interrupt( 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
File "C:\Users\sewel\boltz22\Lib\site-packages\pytorch_lightning\trainer\call.py", line 68, in _call_and_handle_interrupt 
trainer._teardown() 
File "C:\Users\sewel\boltz22\Lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1005, in _teardown 
self.strategy.teardown() 
File "C:\Users\sewel\boltz22\Lib\site-packages\pytorch_lightning\strategies\strategy.py", line 536, in teardown 
self.lightning_module.cpu() 
File "C:\Users\sewel\boltz22\Lib\site-packages\lightning_fabric\utilities\device_dtype_mixin.py", line 82, in cpu 
return super().cpu() 
^^^^^^^^^^^^^ 
File "C:\Users\sewel\boltz22\Lib\site-packages\torch\nn\modules\module.py", line 1133, in cpu 
return self._apply(lambda t: t.cpu()) 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
File "C:\Users\sewel\boltz22\Lib\site-packages\torch\nn\modules\module.py", line 915, in _apply 
module._apply(fn) 
File "C:\Users\sewel\boltz22\Lib\site-packages\torchmetrics\metric.py", line 907, in _apply 
_dummy_tensor = fn(torch.zeros(1, device=self.device)) 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
RuntimeError: CUDA error: no kernel image is available for execution on the device 
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. 
For debugging consider passing CUDA_LAUNCH_BLOCKING=1 
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

comment:2 by Tom Goddard, 3 hours ago

Nvidia Blackwell architecture GPUs require CUDA 12.8. The ChimeraX Boltz and OpenFold installs currently use PyTorch 2.7.1 built for CUDA 12.6. Online sources suggest that PyTorch 2.7.1 built for CUDA 12.8 can use Blackwell GPUs. If that is correct, I may want to update the ChimeraX Boltz and OpenFold installs to use PyTorch 2.7.1 for CUDA 12.8. CUDA 12.8 was released in January 2025 and CUDA 13 is the current version, so it may be reasonable to require CUDA 12.8.
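One way to implement this without forcing CUDA 12.8 on everyone would be to pick the PyTorch wheel index by GPU generation at install time. A sketch of that selection logic, assuming the standard cu126/cu128 PyTorch download indexes and that compute capability 12.x marks Blackwell (this is the assumption the ticket still needs to verify, and it is not what ChimeraX currently does):

```python
# Hypothetical install-time selection: choose the PyTorch wheel index
# based on the GPU's compute capability, as reported e.g. by
# torch.cuda.get_device_capability() or nvidia-smi.
CU126_INDEX = "https://download.pytorch.org/whl/cu126"
CU128_INDEX = "https://download.pytorch.org/whl/cu128"

def torch_index_for_capability(major: int, minor: int) -> str:
    """Return the wheel index an installer might pass to pip's --index-url.

    Blackwell GPUs report compute capability 12.x (sm_120 and up) and need
    a CUDA 12.8 build; older architectures keep the current cu126 wheels.
    """
    if (major, minor) >= (12, 0):
        return CU128_INDEX
    return CU126_INDEX

print(torch_index_for_capability(12, 0))  # RTX 5090 / 5080 -> cu128 index
print(torch_index_for_capability(8, 9))   # RTX 4090 (Ada)  -> cu126 index
```

If testing shows the cu128 wheels work everywhere the cu126 wheels do, the simpler fix is to drop the branch and always install from the cu128 index.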

First I need to test whether Boltz and OpenFold can run on Blackwell with PyTorch 2.7.1 / CUDA 12.8. I don't have a Blackwell GPU but can try one on RunPod.

The current PyTorch release is 2.11, so I may want to update from 2.7.1. However, Boltz inference with PyTorch 2.8 and 2.9 was much slower on Mac (1.5x longer runtime), which is why I kept the version at 2.7.1. It might be worth testing Mac speed with PyTorch 2.11.

Note: See TracTickets for help on using tickets.