OpenFold is an artificial-intelligence method for predicting biomolecular structures consisting of proteins, RNA, DNA, and other molecules such as ligands, cofactors, and drugs [OpenFold3 documentation]. Based on AlphaFold 3, OpenFold 3 is fully open-source and freely available for both academic and commercial use under the Apache 2.0 license. Any work that cites OpenFold 3 should also cite AlphaFold 3:
Accurate structure prediction of biomolecular interactions with AlphaFold 3. Abramson J, Adler J, et al. Nature. 2024 Jun;630(8016):493-500.
The ChimeraX OpenFold tool installs and runs OpenFold (current version OpenFold3-preview) on the local machine. Predictions run on Mac, Linux, and Windows without requiring an Nvidia GPU and typically take on the order of minutes. They run in the background so that ChimeraX can be used for other tasks.
The ChimeraX OpenFold tool can be opened from the Structure Prediction section of the Tools menu and manipulated like other panels (more...). It is also implemented as the openfold command.
The predicted structures vary in confidence levels (see coloring) and should be interpreted with caution. Residue-residue alignment errors for the modeled structures are shown in the Error Plot. See OpenFold structure prediction in ChimeraX. See also: AlphaFold, Boltz, ESMFold, Modeller Comparative, Model Loops, computational screening for protein-protein interactions
OpenFold Installation
Defining and Running a Prediction
Error Plot
OpenFold History
Limitations
OpenFold installation only needs to be done once per computer, as long as the ChimeraX installation is not moved or deleted. Clicking the Install OpenFold button on the tool dialog creates a Python virtual environment and installs the ChimeraX fork of OpenFold from GitHub into it. OpenFold uses Torch and other packages. The total installation, including OpenFold, the trained neural network weights, and the PDB Chemical Component Dictionary for defining residue types, is about 3 GB and may take 10 minutes or more to download and install, depending on network speed. OpenFold (current version OpenFold3-preview) is installed in ~/openfold3 in the user's home directory. This directory will be created; if it already exists, it must be empty. The OpenFold network parameters and Chemical Component Dictionary are downloaded to ~/.openfold3. Installation can also be done with the command openfold install.
The specified Prediction name will be used in naming the output folder and files, as detailed in the options. The structure to predict is defined by Adding one or more molecular components. For assemblies containing multiple copies of the same chain, that component should be added multiple times. Components can be defined by:
Batch prediction results will be shown in a table listing the ligand names, ipTM confidence scores, and SMILES strings. Clicking a column header sorts the table by the values in that column, and selecting one or more rows and clicking Open opens the corresponding structure prediction(s). The results are written as a comma-separated values file (.bzlig), and this file can be accessed later from the File History to reshow the table. Another way to reshow the table is with the command openfold ligandtable, by specifying the directory in which OpenFold was run. The table's context menu includes an option to write the results as a .bzlig file.
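As a rough illustration of working with such a results file outside ChimeraX, the following Python sketch parses comma-separated text and ranks ligands by ipTM. The column names (ligand, iptm, smiles) and the sample values are assumptions invented for this example; the actual header and layout of the file written by ChimeraX may differ.

```python
import csv
import io

# Hypothetical sample data; the real file's column names and values may differ.
SAMPLE = """ligand,iptm,smiles
aspirin,0.62,CC(=O)OC1=CC=CC=C1C(=O)O
caffeine,0.81,CN1C=NC2=C1C(=O)N(C(=O)N2C)C
ethanol,0.35,CCO
"""

def rank_by_iptm(csv_text):
    """Return (ligand, ipTM) pairs sorted from highest to lowest ipTM."""
    rows = csv.DictReader(io.StringIO(csv_text))
    scored = [(row["ligand"], float(row["iptm"])) for row in rows]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

ranking = rank_by_iptm(SAMPLE)
for name, score in ranking:
    print(f"{name}\t{score:.2f}")
```

Sorting from highest to lowest ipTM mirrors clicking the ipTM column header in the table to bring the most confident predictions to the top.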
The current set of components to model is listed in a table, with the polymer residue count tallied underneath to help assess the size of the calculation (see OpenFold structure prediction in ChimeraX for guidelines on run times based on size and resources). The Clear button clears the table contents to start over, and Delete selected rows removes just the row(s) currently highlighted in the table.
The Options button shows/hides additional options:
Clicking Save default options saves the current option settings as user preferences. More options are available as part of the openfold command.
Clicking Predict launches the calculation (see OpenFold structure prediction in ChimeraX for run times on various systems). The prediction is run in the background so that ChimeraX can be used for other tasks. Clicking Stop halts a calculation in progress. When the prediction finishes, the resulting structure(s) are opened automatically. Other buttons:
When first opened, the predicted structures are colored by the pLDDT confidence measure (same as for AlphaFold models) in the B-factor field:
...in other words, using
color bfactor palette alphafold
The Color Key graphical interface or a command can be used to draw a corresponding color key, for example:
key red:low orange: yellow: cornflowerblue: blue:high [other-key-options]
A prediction with at least one component specified by structure chain will be superimposed on the pre-existing chain with matchmaker. If more than one chain in the predicted assembly was specified by an existing chain ID, only the first one is used for superposition.
Error plot shows a plot of the predicted aligned error (PAE), in which color gradations show (for each pairwise combination of residues) the expected error in position of one residue when the true and predicted structures are aligned based on the other residue.
Besides the per-residue pLDDT confidence measure, OpenFold gives for each pair of residues (X,Y) the expected position error at residue X if the predicted and true structures were aligned on residue Y. These residue-residue “predicted aligned error” (PAE) values can be shown in a plot by clicking the Error plot button on the OpenFold dialog.
When the mouse cursor is over the plot, the residue pair and PAE value at its current position are reported in the bottom right corner of the window.
Clicking Color PAE Domains clusters the residues into coherent domains (sets of residues with relatively low PAE values) and uses randomly chosen colors to distinguish these domains in the structure (details...). Clicking Color pLDDT returns the structure to the default confidence coloring.
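To make the idea of "sets of residues with relatively low PAE values" concrete, here is a minimal Python sketch. It is not the clustering procedure ChimeraX uses (see the linked details for that); it swaps in a simpler stand-in technique, connected components under a PAE cutoff. The function name pae_domains, the 5 Å cutoff, and the small example matrix are all assumptions made for illustration.

```python
def pae_domains(pae, cutoff=5.0):
    """Group residue indices into domains by linking low-PAE pairs.

    pae is a square list of lists; pae[i][j] is the expected error (in
    angstroms) at residue i when aligned on residue j. Two residues are
    linked when the PAE is below cutoff in both directions, and each
    connected set of linked residues is reported as one domain.
    """
    unvisited = set(range(len(pae)))
    domains = []
    while unvisited:
        start = min(unvisited)
        unvisited.remove(start)
        stack, members = [start], [start]
        while stack:
            i = stack.pop()
            # Collect neighbors first, then update the set of unvisited residues.
            linked = [j for j in unvisited
                      if pae[i][j] < cutoff and pae[j][i] < cutoff]
            for j in linked:
                unvisited.remove(j)
                members.append(j)
                stack.append(j)
        domains.append(sorted(members))
    return domains

# Illustrative 4-residue matrix: residues 0-1 and 2-3 form two rigid groups.
example = [
    [0.5, 1.0, 20.0, 22.0],
    [1.2, 0.5, 21.0, 19.0],
    [18.0, 20.0, 0.5, 2.0],
    [23.0, 21.0, 1.5, 0.5],
]
domains = pae_domains(example)
print(domains)
```

Low PAE within a group means the relative positions of those residues are predicted confidently, while high PAE between groups means their relative placement is uncertain.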
The plot's context menu includes:
[PAE color scales for the pae and paegreen palettes, with ticks at 0, 5, 10, 15, 20, 25, and 30 Å]
The Color Key graphical interface or a command can be used to draw (in the main graphics window) a color key for the PAE plot. For example, to make a color key that matches the pae or paegreen scheme, respectively:
key pae :0 : : :15 : : :30 showTool true
key paegreen :0 : : :15 : : :30 showTool true
A title for the color key (e.g., “Predicted Aligned Error (Å)”) would need to be created separately with 2dlabels.
The OpenFold History can be opened by clicking the History button on the OpenFold dialog, or by choosing OpenFold History from the Structure Prediction section of the Tools menu. Results of previous predictions can be opened by choosing row(s) in the table and clicking Open structures. If any predictions were run on a server and not yet opened on the local machine, there will be a Fetch from server button, and clicking it will retrieve and open them.
Clicking Options shows/hides the additional options:
Structure size. OpenFold uses a lot of memory, and the amount of available memory limits the size of structures that can be predicted. Size is measured in "tokens": polymer residues plus ligand atoms. For a computer with 32 GB of memory, the limit is roughly 1000 tokens. Consumer Nvidia GPUs with 8 or 12 GB of memory (e.g., RTX 3070) handle only 300-500 residues; beyond that, on Windows the prediction spills into CPU memory, which slows it 10-20-fold, whereas on Linux it will not use CPU memory. Consumer Nvidia GPUs with 24 GB (RTX 3090 and RTX 4090) are able to predict 1300 tokens. Prediction size limits are perhaps the most important shortcoming of OpenFold compared to AlphaFold 3, which handles memory more efficiently and is able to predict 5000 tokens with 80 GB of GPU memory, about twice the size that OpenFold can predict. A drawback of AlphaFold 3 is that it requires Linux and an Nvidia GPU, in addition to various licensing restrictions. We hope that in the future OpenFold will optimize memory use to predict larger structures.
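The size guideline above can be turned into a quick check before launching a prediction. The Python sketch below counts tokens as polymer residues plus ligand atoms and compares the total with the approximate limits quoted in this section. The function names and dictionary keys are invented for this example, and the limits are rough figures, not guarantees.

```python
# Approximate token limits quoted in the text above (rough figures only).
APPROX_TOKEN_LIMITS = {
    "32 GB system memory": 1000,
    "24 GB Nvidia GPU": 1300,
}

def count_tokens(polymer_residues, ligand_atoms):
    """Tokens are counted as polymer residues plus ligand atoms."""
    return polymer_residues + ligand_atoms

def fits(polymer_residues, ligand_atoms, hardware):
    """Return True if the token count is within the approximate limit."""
    tokens = count_tokens(polymer_residues, ligand_atoms)
    return tokens <= APPROX_TOKEN_LIMITS[hardware]

# Example: a 1200-residue complex with a 40-atom ligand.
print(fits(1200, 40, "32 GB system memory"))
print(fits(1200, 40, "24 GB Nvidia GPU"))
```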
Run time. The computation time increases as the square of the number of tokens, so a prediction with three times as many residues and ligand atoms will take approximately nine times longer to run.
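The quadratic scaling can be used to extrapolate from one timed run to a larger prediction. The Python sketch below assumes the run time is exactly proportional to the square of the token count, which is only an approximation. The function names and the 2-minute reference timing are hypothetical.

```python
def relative_run_time(tokens, reference_tokens):
    """Run-time ratio under the assumption time ~ tokens squared."""
    return (tokens / reference_tokens) ** 2

def estimate_minutes(tokens, reference_tokens, reference_minutes):
    """Extrapolate a run time from one measured reference prediction."""
    return reference_minutes * relative_run_time(tokens, reference_tokens)

# Hypothetical reference: 300 tokens took 2 minutes on this machine.
print(relative_run_time(900, 300))
print(estimate_minutes(900, 300, 2.0))
```

Such estimates hold only while the prediction fits in memory; once GPU memory is exhausted, run times can grow much faster than this rule suggests.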
Nvidia GPU support on Windows. Installing OpenFold will get a CUDA-enabled version of the torch machine-learning package if it detects Nvidia graphics. It detects Nvidia graphics by checking whether the file C:/Windows/System32/nvidia-smi.exe exists. Otherwise it gets a CPU-only version of torch. If you install an Nvidia graphics driver after installing OpenFold, you will have to reinstall OpenFold to get the CUDA version. The installed torch is for CUDA 12.6 or newer. If your computer has a version of CUDA older than 12.6 but newer than 11.8, you can run the following commands in a Windows Command Prompt to install a CUDA 11.8 version of torch. For other CUDA versions, refer to the Torch installation page for the correct pip install command.
> cd C:\Users\username\openfold\Scripts
> pip.exe uninstall torch
> pip.exe install torch --index-url https://download.pytorch.org/whl/cu118
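The Windows detection rule described above (CUDA-enabled torch if nvidia-smi.exe exists, CPU-only torch otherwise) can be sketched in Python as follows. This mirrors the described behavior only; it is not the installer's actual code, and the function name torch_variant is made up for this example.

```python
from pathlib import Path

# Path named in the text above as the file the installer looks for.
NVIDIA_SMI = Path("C:/Windows/System32/nvidia-smi.exe")

def torch_variant(nvidia_smi_path=NVIDIA_SMI):
    """Return "cuda" if the nvidia-smi executable exists, else "cpu"."""
    return "cuda" if Path(nvidia_smi_path).is_file() else "cpu"

print(torch_variant())
```

Because the check happens only at installation time, installing an Nvidia driver later does not change which torch build is present, which is why a reinstall is needed.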
Nvidia GPU support on Linux. On Linux, the installed OpenFold will work with CUDA 12.6 or newer if you have Nvidia graphics. If you have an older system CUDA version, it may still work. Otherwise, refer to the Torch installation page for the correct pip install command and replace torch using shell commands like the following (shown here for CUDA 11.8):
$ cd ~/openfold3/bin
$ ./pip uninstall torch
$ ./pip install torch --index-url https://download.pytorch.org/whl/cu118
No chain identifier assignment. It can be helpful to assign chain identifiers (A, B, C...) to the different molecular components to match existing structures. OpenFold is capable of this, but the ChimeraX user interface does not currently allow it.
MSA sequence alignments. OpenFold uses the ColabFold MSA server (https://api.colabfold.com) for computing deep sequence alignments. This requires internet connectivity, and predictions will fail if that server is down. The sequence alignments are cached in ~/Downloads/ChimeraX/OpenFoldMSA so that subsequent predictions with the same set of protein sequences can reuse them.
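One way to picture reuse of cached alignments for "the same set of protein sequences" is a cache key that depends only on the set of sequences, not on their order. The Python sketch below illustrates that idea. The hashing scheme, the function name msa_cache_path, and the resulting file naming are assumptions for illustration and do not describe how ChimeraX actually organizes its cache directory.

```python
import hashlib
from pathlib import Path

# Cache directory named in the text above.
CACHE_DIR = Path.home() / "Downloads" / "ChimeraX" / "OpenFoldMSA"

def msa_cache_path(sequences, cache_dir=CACHE_DIR):
    """Hypothetical cache location derived from a set of protein sequences.

    Sequences are uppercased, de-duplicated, and sorted so that the same
    set always maps to the same key regardless of input order.
    """
    unique = sorted(set(seq.upper() for seq in sequences))
    digest = hashlib.sha256("\n".join(unique).encode("ascii")).hexdigest()
    return Path(cache_dir) / digest[:16]

first = msa_cache_path(["MKTAYIAK", "GAVLIPFW"])
second = msa_cache_path(["GAVLIPFW", "MKTAYIAK"])
print(first == second)
```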