Making OpenFold Structure Predictions in ChimeraX

Tom Goddard
February 11, 2026

OpenFold in ChimeraX
OpenFold installation
Predicting a structure with OpenFold
Results, coloring and predicted aligned error
Structure size and prediction speed limitations
Run times and size limits
Batch predictions with sets of ligands
Options
Server predictions
Listing past predictions
ChimeraX openfold command
Limitations
Change log

OpenFold in ChimeraX

ChimeraX can run the OpenFold 3 structure prediction method to compute atomic structures of proteins and nucleic acids, including modified residues, ligands and ions on your laptop or desktop computer. A ChimeraX graphical user interface (menu Tools / Structure Prediction / OpenFold) and ChimeraX command (openfold) are provided to make predictions. ChimeraX daily builds dated February 13, 2026 and newer use OpenFold 3 preview. OpenFold 3 is fully open source under the Apache 2.0 license.

OpenFold installation

openfold gui

When you first start the OpenFold tool within ChimeraX (menu Tools / Structure Prediction / OpenFold) it will show a button Install OpenFold. OpenFold is a large software package that takes about 1 Gbyte of disk space and uses the Torch machine learning package. It also requires the neural network weights (2 Gbytes) to make predictions. Downloading and installing can take 10 minutes or more depending on network speed. OpenFold will be installed in your home directory in ~/openfold3 and the network weights in ~/.openfold3.

The ChimeraX openfold install command can also be used to do this one-time installation.

Predicting a structure with OpenFold

openfold gui 9gui Menu of molecular components
openfold components menu

To predict a structure made up of proteins, nucleic acids and small molecules you first specify all the molecular components. Choose entries from the Add menu and press the Add button to add them to your assembly specification in the table below. You can specify component molecules in several ways.

NTCA transcription factor
NTCA transcription factor bound to DNA
Prediction PAE heatmap
Predicted aligned error

Components can be added multiple times to have more instances of that molecule in the assembly. Once every component has been added and the assembly is complete, press the Predict button to start the prediction. While the prediction runs a Stop button is shown. Pressing it terminates the prediction and discards the partial computation so you can start another prediction.

Results

The results are put on your desktop in a new folder openfold/<prediction-name> where the prediction name can be specified at the top of the ChimeraX OpenFold panel. Using the Options described below you can change where the result files are placed.

Predictions for small assemblies, for example 500 residues and ligand atoms, take one to several minutes depending on the computer (e.g. Nvidia GPU vs CPU only). Predictions run in the background (a separate process) so you can continue to use ChimeraX while the calculation runs. The predicted structure will be opened in ChimeraX when the calculation completes. If the assembly specification involved proteins or nucleic acids specified using chains of open models, the predicted structure will be aligned (using matchmaker) to the open model for the first such component.

Coloring

The prediction will be colored using the standard AlphaFold pLDDT type of coloring where blue indicates high confidence, yellow and red moderate to low confidence.

Predicted aligned error

Per-residue-pair estimates of prediction confidence can be displayed by pressing the Error Plot button to show the predicted aligned error.

Structure size and prediction speed limitations

OpenFold structure predictions use a lot of memory and compute resources on your computer, which limits the size of the structure that can be predicted. See the run times section below for example run times and size limits.

Mac. It works well on Mac M-series (M1, M2, M3, M4) laptop and desktop computers, predicting small 100-residue structures in 1 minute and up to 1200 residues in about 15 minutes with 32 GB of memory. With 16 GB of memory it can only predict about 350 amino acids, taking about 5 minutes, with larger predictions running out of memory.

Nvidia GPUs on Windows and Linux. Nvidia GPUs on Windows and Linux computers also provide good performance. On Linux with an Nvidia GPU with 24 GB of graphics memory (e.g. Nvidia RTX 3090 or 4090) it can predict about 1300 residues in about 4 minutes; larger sizes run out of memory. In testing on Windows with less GPU memory, 12 GB (e.g. Nvidia RTX 4070) allows predictions of up to 1000 residues and 8 GB (e.g. Nvidia RTX 3070) allows predictions of about 700 residues. Large predictions on Nvidia GPUs on Linux run out of memory and fail. On Windows the prediction will fall back to using CPU memory, allowing larger structure predictions but with much longer run times (10-30 times longer).

Intel CPU. Predictions only utilizing an Intel CPU are very slow, for example 1.5 hours for 900 residues. Expected size limits are about 350 residues with 16 GB, 1000 residues with 32 GB, and about 1600 residues with 64 GB. Run time is expected to increase as the square of the number of residues.
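As a rough illustration of the quadratic scaling described above, the sketch below extrapolates CPU run time from the single reference point quoted in this section (1.5 hours for 900 residues). The function name and the use of a single reference point are assumptions made for illustration only; actual times depend on the machine and on the assembly.

```python
def estimate_cpu_hours(residues, ref_residues=900, ref_hours=1.5):
    """Extrapolate run time assuming it grows as the square of the residue count.

    The defaults are the one CPU data point quoted in the text;
    this is an illustrative estimate, not a measured model.
    """
    return ref_hours * (residues / ref_residues) ** 2


for n in (350, 900, 1600):
    print(f"{n} residues: about {estimate_cpu_hours(n):.1f} hours")
```

Doubling the number of residues quadruples the estimate, which is why CPU-only predictions quickly become impractical as structures grow.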

Run times and size limits

Here are run times for a few desktop and laptop computers for predicting molecular assemblies of various sizes from the Protein Data Bank using the ChimeraX OpenFold 3 fork from February 11, 2026.

OpenFold prediction times in minutes, using cached MSAs. Tokens is the number of polymer residues plus ligand atoms.

PDB code | Tokens | Mac M1 16 GB | Mac M1 Max 32 GB | Mac M2 Ultra 64 GB | Linux i9 CPU 64 GB | Linux Nvidia 4090 | Windows i7 CPU 64 GB | Windows Nvidia 3070 | Number of residues and atoms and prediction error
8rf4 129 1.5 1.0 0.8 1.4 0.7 1.5 118 amino acids, 11 ligand atoms, 1.5A RMSD 118 residues
9gui 526 18 4.3 2.5 16 1.0 Protein dimer, DNA, 2 ligands, 1.4A RMSD for protein
9moj 660 30 6.6 3.7 25 1.1 30 3.3 660 amino acids, heterotetramer, 0.9A RMSD 125 residues
9h1k 671 6.9 4.0 1.1 560 amino acids, 59 RNA bases, 52 ligand atoms, 1.2A RMSD for 261 residues, RNA wrong
9b3h 911 38 7.6 1.5 911 amino acids, heterodimer, 1A RMSD 503 residues
9fz5 1025 11 1.7 1025 amino acids, heterotrimer, 2.2A RMSD 740 residues
9mcw 1154 16 2.3 1154 rna bases, homodimer, wrong dimer and monomer conformations
8sa0 1371 30 2.6 1274 amino acids, 97 ligand atoms, 2.1A RMSD 1151 residues
9gh4 1467 2.9 Protein homotrimer, monomer 489 residues, 1.5A RMSD for 250 residues
9enr 1794 4.1 Protein monomer, 4 ligands, 2.9A RMSD for 1551 residues
1dpp 2028 5.2 Homotetramer, 0.47A RMSD for 507 residues
9hma 2218 failed Protein monomer
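The token counts in the table follow from the definition given in the caption (polymer residues plus ligand atoms). The short check below reproduces three of them from the numbers in the description column. The function name is hypothetical and used only for this illustration.

```python
def token_count(polymer_residues, ligand_atoms=0):
    """Tokens = polymer residues (protein, DNA, RNA) plus ligand atoms."""
    return polymer_residues + ligand_atoms


# Counts taken from the description column of the table above.
examples = {
    "8rf4": token_count(118, 11),       # 118 amino acids, 11 ligand atoms
    "9h1k": token_count(560 + 59, 52),  # 560 amino acids, 59 RNA bases, 52 ligand atoms
    "8sa0": token_count(1274, 97),      # 1274 amino acids, 97 ligand atoms
}

for pdb_code, tokens in examples.items():
    print(pdb_code, tokens)
```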

Performance notes

Mac GPU acceleration. The reported Mac performance is for Mac M1/M2/M3/M4 series GPUs. OpenFold uses the Torch machine learning package, which provides GPU acceleration on these Mac M-series GPUs through Metal Performance Shaders (MPS). Predictions run 2-5x slower than on an Nvidia 4090, but the Mac can handle larger molecular systems because the GPU uses the unified computer memory (e.g. 32 or 64 GB). With 16 GB, prediction size is limited to 350 residues. Older Intel Macs do not have GPU acceleration in Torch and run at speeds similar to Windows Intel CPU-only predictions.

Windows Nvidia GPU performance. The above table shows a significant slow-down in predictions beyond about 600 residues on Windows with Nvidia 3070 (8 GB) and 4070 (12 GB) graphics. This is probably because the GPU memory is insufficient for larger structures and the machine learning toolkit falls back to a mix of CPU and GPU calculation. Notice that the 4070 GPU took more time than the 3070 GPU for large structures probably because the CPU on the 4070 machine (i5-6700K) is significantly slower than the CPU on the 3070 machine (i7-12900K).

Linux Nvidia GPU out of memory. On Linux with an Nvidia 4090 with 24 GB of GPU memory the maximum prediction size appears to be about 2000 residues plus ligand atoms before an "out of memory" error occurs. This contrasts with Windows, where Torch appears to fall back to using the CPU and does not run out of GPU memory.

CUDA use of bfloat16. On Nvidia CUDA the ChimeraX OpenFold uses 16-bit floating point (bfloat16, the bf16-mixed Torch Lightning precision), which allows larger predictions than on non-CUDA systems where 32-bit float is used. Torch only supports hardware-accelerated bfloat16 on CUDA.
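To make the memory and precision trade-off concrete, here is a minimal sketch of what bfloat16 storage does to a number. A bfloat16 value keeps the sign bit, the 8 exponent bits and the top 7 mantissa bits of a 32-bit float, so it takes half the memory but carries fewer significant digits. The helper name is hypothetical, and the sketch truncates the low bits for simplicity, whereas hardware typically rounds to nearest. It does not use Torch and is not how OpenFold performs the conversion.

```python
import struct


def to_bfloat16(x):
    """Return x as it would read back after truncation to bfloat16.

    Simplification: the low 16 bits of the float32 pattern are dropped
    rather than rounded.
    """
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    bits &= 0xFFFF0000  # keep sign, 8 exponent bits, top 7 mantissa bits
    return struct.unpack(">f", struct.pack(">I", bits))[0]


print(to_bfloat16(1.0))      # exactly representable, unchanged
print(to_bfloat16(3.14159))  # low-order mantissa bits are lost
```

The range of representable magnitudes is the same as for 32-bit float because the exponent is unchanged; only the number of significant digits drops.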

Batch predictions with sets of ligands

Several predictions can be run each with a different ligand to see how it binds to the rest of the molecular assembly. Use the Add molecule "each ligand SMILES string" menu entry. Paste in a set of SMILES strings with ligand names. Each line should have a ligand name, comma, followed by a SMILES string. The ligand names will be used as the predicted structure file names. Alternatively you can just enter one SMILES string per line without a name and the ligands will be named "ligand1", "ligand2", .... Press the Add button after pasting in the set of ligands, then the Predict button to run the series of predictions.
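The pasted ligand format described above can be sketched as a small parser. This is only an illustration of the text format, not the ChimeraX implementation: the function name is hypothetical, and numbering unnamed ligands by their position in the list is an assumption.

```python
def parse_ligand_lines(text):
    """Return a list of (name, smiles) pairs from pasted ligand text.

    Each line is either "name, SMILES" or just "SMILES".
    """
    ligands = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if "," in line:
            name, smiles = (part.strip() for part in line.split(",", 1))
        else:
            # No name given: fall back to "ligandN" naming (numbering is an assumption).
            name, smiles = f"ligand{len(ligands) + 1}", line
        ligands.append((name, smiles))
    return ligands


example = """aspirin, CC(=O)OC1=CC=CC=C1C(=O)O
CN1C=NC2=C1C(=O)N(C(=O)N2C)C
"""
print(parse_ligand_lines(example))
```

Splitting on the first comma is safe here because SMILES strings do not contain commas.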

Example input for 52 anti-viral ligands run against HIV protease.
Predictions completed in 10 minutes on an Nvidia 4090 GPU.
Table of prediction ipTM confidence for ligands.

Progress messages will appear in the OpenFold panel as the structures are predicted for each ligand. The ipTM score is a crude prediction of binding confidence. When the predictions complete, a table of results giving ligand name, ipTM and SMILES string will appear. The table can be sorted by clicking on the column headers. Selecting rows of the table and pressing the Open button will open the selected structure predictions. The table is written as comma-separated values to the directory where OpenFold was run, for example hiv_protease/hiv_protease.oflig in the case shown. This file will appear in the file history thumbnails when you start ChimeraX in future sessions, and clicking on the thumbnail will show the table again.

Options

openfold options panel

Pressing the Options button shows additional settings for openfold predictions.

Additional advanced options are available by using the ChimeraX openfold command.

Listing past predictions

ChimeraX OpenFold user interface panel

ChimeraX menu entry Tools / Structure Prediction / OpenFold History lists a table of past OpenFold predictions. Selecting rows of the table and pressing the Open structures button will open those structures. The table is filled by scanning the OpenFold prediction directories located in the directory where ChimeraX puts new OpenFold predictions (by default ~/Desktop/openfold). If a prediction computed multiple structures they will all be aligned to the first structure of that prediction when opened (using the ChimeraX matchmaker command). The directory containing the prediction directories and the choice to align can be specified in the panel shown by pressing the Options button.

Directories are identified as containing OpenFold predictions if they contain a file named "command". When ChimeraX runs a prediction it saves the openfold command, including all its arguments, in this file.
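The directory test described above is simple enough to reproduce outside ChimeraX, for example to list past predictions from a script. The sketch below is not the ChimeraX code. The function name is hypothetical, and scanning only the immediate subdirectories of the results directory is an assumption.

```python
from pathlib import Path


def find_prediction_dirs(root):
    """Return subdirectories of root that contain a file named 'command'."""
    root = Path(root).expanduser()
    if not root.is_dir():
        return []
    return sorted(
        d for d in root.iterdir()
        if d.is_dir() and (d / "command").is_file()
    )


# Default location where ChimeraX puts new OpenFold predictions.
for d in find_prediction_dirs("~/Desktop/openfold"):
    print(d.name, "->", (d / "command").read_text().strip())
```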

Fetching server predictions

If there are server predictions that did not complete before quitting ChimeraX, then when ChimeraX is run again the OpenFold History panel will show a button Fetch from server next to the Open structures button. This button fetches any server predictions that have completed and opens them. Predictions that have not yet completed are noted in the Log after pressing this button.

ChimeraX openfold command

The ChimeraX OpenFold graphical interface runs a prediction by running the ChimeraX openfold command. That command is recorded in the ChimeraX Log panel, and looking at that command can help you understand the command options.

      openfold predict [sequences] [protein sequences] [dna sequences] [rna sequences]
         [ligands residue-spec] [excludeLigands ccd-codes]
         [ligandCcd ccd-codes] [ligandSmiles smiles-string]
         [forEachSmilesLigand name,smiles-string,name,smiles-string...]
         [name prediction-name] [resultsDirectory directory] [samples n] [seed n]
         [device default|cpu|gpu] [precision 32-true | bf16-mixed | 16-true | bf16-true]
         [useServer true | false] [serverHost hostname] [serverPort port]
         [useMsaCache true|false] [msaOnly true|false]
         [open true|false] [installLocation directory] [wait true|false]

Options descriptions

Installation command

       openfold install [directory] [downloadModelWeights true | false] [branch name]

The openfold install command creates a Python virtual environment and installs the ChimeraX OpenFold fork from GitHub into it. If no directory is specified then ~/openfold3 in the user's home directory is used. The directory will be created, or if it already exists it must be empty. The command then downloads the OpenFold network parameters to ~/.openfold3.

The install uses a fork of the OpenFold repository https://github.com/RBVI/openfold-3. It uses git branch chimerax_openfold of this fork unless the branch option is specified in which case it installs the specified branch.

The install process uses the ChimeraX Python executable to create the virtual environment. OpenFold will no longer work if ChimeraX is moved or deleted and will need to be reinstalled in that case. It will also no longer work if the openfold directory itself is moved, since the openfold executable refers to the install location to find python.

The install process executes the commands below to create the virtual environment, install openfold and download the openfold weights. On Windows, if Nvidia graphics is detected, it installs a version of torch with CUDA 12.6 support before installing openfold.

      python -m venv directory
      directory/bin/python -m pip install torch --index-url https://download.pytorch.org/whl/cu126  # On Windows with Nvidia GPU only.
      directory/bin/python -m pip install openfold
      directory/bin/python chimerax/site-packages/openfold/download_weights.py
    

Ligand table command

       openfold ligandtable runDirectory [ alignTo  atomic-model ]

The openfold ligandtable command displays a table for batch ligand predictions. Normally this command is not needed because ChimeraX batch ligand predictions are tabulated in the directory where the OpenFold run is made in a comma-separated value file with suffix ".oflig" and you can open this file to show the table of results. But in cases where this file is missing you can recreate the table with this command which searches the prediction results extracting the scores, and searches the ".json" input file to find the SMILES strings for the ligands.

Limitations

Change log

OpenFold modifications

ChimeraX uses a fork of the OpenFold repository with several improvements summarized below. The exact changes are seen in openfold_diffs.patch which compares the March 25, 2026 ChimeraX OpenFold to the primary OpenFold repository main branch using command git diff unmodified chimerax_openfold_preview2.

  1. Compute and write PAE. By default OpenFold computed predicted aligned error (PAE) but did not have any option to write it to a file. I added it to the confidence output since it is the most heavily used metric for detailed analysis of predictions. Also I changed the default confidence file output from json format (.json) to numpy format (.npz) which is many times smaller and many times faster to read.
  2. Don't remove MSA and template files. OpenFold puts MSA and template files in a temporary directory and deletes them. These files are valuable for understanding the confidence of the predicted structures so ChimeraX writes them to the prediction output directory and does not delete them using standard OpenFold settings. But the OpenFold code always deletes the raw MSA (.a3m file) results from Colabfold and I changed it to preserve those as well.
  3. MSA only option. Sometimes it is useful to only compute MSAs in one run, and then predict structures in another run, for instance if the same MSA will be used for hundreds of protein complexes with the same proteins but different drugs. It is also needed for runs on compute clusters that don't allow internet access, where the MSA calculation needs to be run on one computer and the inference on another. I added a --msa_and_templates_only OpenFold command-line option to handle this.
  4. Compute device option. I added a --device option (values "gpu", "cpu", "tpu") to control what device a prediction uses. The default is GPU, but for consumer GPUs with little memory (e.g. 8 or 12 GB) larger structure predictions will run out of memory and being able to run the computation more slowly on CPU allows it to complete successfully.
  5. Precision option. This specifies the floating point precision to use, bf16-mixed or 32-true. Standard OpenFold always uses 32-bit precision, which uses twice the memory of 16-bit precision. Boltz and other prediction software saw no decrease in accuracy with 16-bit precision for inference. In the current Torch machine learning package only CUDA has hardware-supported 16-bit. ChimeraX OpenFold uses bf16-mixed, which is a Torch Lightning precision that uses bfloat16 for most operations but 32-bit for some that are known to require higher precision. This allows running significantly larger predictions on Nvidia GPUs, for example, 2000 residues vs 1300 residues max size on an Nvidia RTX 4090 with 24 GB of memory. By default ChimeraX OpenFold uses bf16-mixed on CUDA (Linux or Windows) and 32-bit float (32-true) on Mac GPU or Windows CPU. I added a --precision option to allow specifying the precision, in case testing of future Torch versions or of accuracy degradation is needed.
  6. Download OpenFold parameters without user input. OpenFold installation requires the user to make some choices about whether and where to download OpenFold model parameters. I modified the setup_openfold.py script so it can do the setup non-interactively. ChimeraX installs OpenFold non-interactively.
  7. Operating system specific packages. Several packages used by OpenFold are not available on Mac or Windows and are also not required, but they cause the standard OpenFold install to fail. I modified the pyproject.toml dependency file to not attempt to install mkl, aria, nvidia-cutlass and cuda-python from PyPI on Mac since they are not available. I also made it not require deepspeed or kalign-python on Windows. Extra code was added to use a kalign executable on Windows (and Mac). The other packages are for optimization or downloads and are not necessary for predictions.
  8. Include kalign binaries. In order for ChimeraX to install the kalign sequence alignment program used by OpenFold to align template structures to query sequences, I put kalign executables for Mac, Windows, and Linux in the ChimeraX OpenFold repository under kalign. The standard OpenFold uses conda to install kalign, which is only available for Linux and Mac. OpenFold preview 2 switched to using the kalign-python PyPI package. ChimeraX uses that on Linux but uses the kalign executables on Windows because the PyPI package is not available for Windows, and also on Mac because the kalign-python package fails on Mac due to duplicate libomp.dylib libraries (one included in the kalign package and another in OpenFold).
  9. DeepSpeed only on Linux. OpenFold uses DeepSpeed to accelerate CUDA GPU computations and has it enabled by default. DeepSpeed is not available on Mac or Windows, so I made the default setting not use it on those operating systems. DeepSpeed is also disabled for Linux CPU (without CUDA) predictions.
  10. Mac multiprocessing fix. OpenFold uses the multiprocessing module with pickling to call a data processing function. On Linux this works with the function defined within another function, but on Mac the function must be at global scope. I moved the function to global scope.
  11. Remove poor performance multi-processing. OpenFold uses multi-processing presumably to speed-up processing training batches (num_workers = 10, num_workers_validation = 4 in validator.py). But this slows down inference where the batch size is one query and spinning up the subprocesses takes several seconds. I changed it to not use multiprocessing making predictions several seconds faster.
  12. Remove developer debugging log messages. OpenFold outputs code debugging messages in several places using logger.info() or print() instead of logger.debug(), producing clutter in the output that is meaningful only to developers. I changed those to logger.debug(), which is not output unless debug mode is enabled. This makes it easier to see real problems in the log.
  13. Progress messages. To allow ChimeraX to show sensible progress messages as a prediction is made I added log output that says when each stage that takes significant time starts and stops (e.g. executable startup, loading weights, colabfold search, template processing, feature processing, inference).
  14. Turn off progress bars. The tqdm progress bars create hundreds of lines of output in log files and are only useful for output to a terminal. I turned those off by default.
  15. Log time stamps. To identify why some predictions take a long time it is useful to include time stamps on log output lines. For instance this allows ChimeraX to report that OpenFold took several minutes waiting for a Colabfold MSA calculation to complete. It also helps identify bottlenecks (e.g. most of the time for predicting small structures is not spent in inference).
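The Mac multiprocessing fix in item 10 comes down to a general Python rule: a function can only be pickled if it can be found by name at module level. The spawn start method for multiprocessing (the default on macOS since Python 3.8) has to pickle the function to send it to worker processes, so a nested function fails there. The sketch below shows the rule with the pickle module alone. The function names are hypothetical and are not the ones used in OpenFold.

```python
import pickle


def process_item(x):
    """Module-level function: pickled by reference to its qualified name."""
    return x * 2


def make_nested():
    def process_item_nested(x):  # defined inside another function
        return x * 2
    return process_item_nested


def is_picklable(func):
    """Return True if func can be pickled, as spawn-based workers require."""
    try:
        pickle.dumps(func)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


print("module-level:", is_picklable(process_item))
print("nested:", is_picklable(make_nested()))
```

Moving the worker function to global scope, as described in item 10, is the usual remedy and behaves the same on every operating system.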