AlphaFold Performance: molecule size, speed, memory, and GPU

Tom Goddard
January 12, 2022

AlphaFold version 2.1.1 is limited in the size of protein or complex it can predict, mostly by available GPU memory. Predictions of large structures can also take more than 10 hours. The examples here help gauge what is possible.

AlphaFold 2.1.2 runs OpenMM energy minimization on the GPU, while AlphaFold 2.1.1 ran it on the CPU, so 2.1.2 runs are faster. Times below are for 2.1.1 unless stated otherwise.

To do: try larger molecules to find the size at which out-of-memory failures occur.

Maximum Structure Size

How many amino acids can a protein or complex contain before AlphaFold fails? Here are examples of runs on large structures.

| PDB | Proteins / unique | Total amino acids | GPU memory (GB) | Nvidia GPU | Time (hours) | Computer | Notes |
|---|---|---|---|---|---|---|---|
| 6IG9 | 1 / 1 | 3744 | 48 | A40 | 35 | UCSF cluster | Completed, correct fold, used PDB templates. Image. |
| 6YEJ | 1 / 1 | 3199 | 48 | A40 | 28 | UCSF cluster [1] | Each of 2 domains predicted correctly, but relative position wrong. Image. |
| 7JGD | 1 / 1 | 2660 | 48 | A40 | 13 | UCSF cluster | Completed. Misaligned domains. Image. |
| 7LHW | 1 / 1 | 2527 | 48 | A40 | 17 | UCSF cluster | Completed, roughly correct with many shifted domain positions. Image. |
| 6UM1 | 1 / 1 | 2499 | 24 | 3090 | 1 | Desktop PC | Failed in jax CUDA_ERROR_ILLEGAL_ADDRESS, probably out of GPU memory. Log. |
| 7M1P | 1 / 1 | 2273 | 48 | A40 | 7.5 | UCSF cluster | AlphaFold 2.1.2 and reduced databases, which do not use hhblits. Mostly correct prediction, 1 Å RMSD for 1900 atoms, N-terminal domains shifted 10 Å. Image. |
| 7M1P | 1 / 1 | 2273 | 48 | A40 | 3.7 | UCSF cluster | Failed making sequence alignment in hhblits "ERROR: did not find 569 match states in sequence 1 of tr\|A0A1H9CGJ3\|A0A1H9CGJ3_9FIRM". AlphaFold 2.1.1. Log. |
| 7M1P | 1 / 1 | 2273 | 24 | 3090 | 4.5 | Desktop PC | Failed making sequence alignment. Log. |
| 7ALP | 1 / 1 | 2084 | 24 | 3090 | 8 | Desktop PC [2] | Completed. Misaligned domains. Image. |
| 7BAN | 1 / 1 | 1932 | 24 | 3090 | 1.3 | Desktop PC | Failed making sequence alignment in hhblits. Log. |
| 7P3T | 6 / 1 | 1812 | 48 | A40 | 12.5 | UCSF cluster | Correct prediction, mostly 1 Å RMSD. Image. |
| 6X6U | 4 / 2 | 1592 | 24 | 3090 | 3.5 | Desktop PC | Correct prediction, mostly 1 Å RMSD. Image. |
| 7S62 | 1 / 1 | 1441 | 16 | P100 | 2 | Google Colab | Failed, out of GPU memory allocating 19 GB. Colab resource monitor shows GPU (not CPU) memory is exhausted. 5000 aligned sequences. Log. |
| 7Q5Z | 1 / 1 | 1436 | 16 | P100 | 1 | ColabFold | Failed after predicting two models. CUDA_ERROR_ILLEGAL_ADDRESS. Log. |
| 7E5N | 4 / 1 | 1292 | 48 | A40 | 4 | UCSF cluster | Completed with AlphaFold 2.1.2. Wrong packing. Image. |
| 7E5N | 4 / 1 | 1292 | 48 | A40 | 11 | UCSF cluster | Completed with AlphaFold 2.1.1. Wrong packing. Image. |
| 7E5N | 4 / 1 | 1292 | 24 | 3090 | 2.5 | Desktop PC | Completed with AlphaFold 2.1.2. Wrong packing. Image. |
| 7E5N | 4 / 1 | 1292 | 24 | 3090 | 3.5 | Desktop PC | Failed predicting model 2 with OpenMM error writing PDB "The number of positions must match the number of atoms". AlphaFold 2.1.1. Identical error on second try. Log. |
| 7QFP | 1 / 1 | 1269 | 16 | P100 | 2 | ColabFold | Completed with some wrong domain positions. Image. |
| 6XMP | 1 / 1 | 1239 | 16 | P100 | 5 | Google Colab | Failed. Only 1 of 5 models completed, others gave out of memory. One completed model ran out of memory in energy minimization. 6000 aligned sequences. |
| 7KTT | 1 / 1 | 1142 | 16 | P100 | 3 | Google Colab | Failed. Only 2 of 5 models completed, others gave out of memory allocating 13 GB. One completed model ran out of memory in energy minimization. 1500 aligned sequences. Log. Image. |
| 6Z03 | 2 / 1 | 1078 | 16 | P100 | 5 | Google Colab [3] | Completed with wrong complex, proteins on top of each other. Image. |
| 5N5F | 10 / 1 | 980 | 16 | P100 | 2.5 | Google Colab | Correct prediction, 0.4 Å RMSD. Image. |
| 6Z1J | 3 / 3 | 824 | 24 | 3090 | 1.5 | Desktop PC | Correct prediction, mostly 0.5 Å RMSD. Image. |
| 6Z1J | 3 / 3 | 824 | 48 | A40 | 2.5 | UCSF cluster | Correct prediction, mostly 0.5 Å RMSD. |
| 6Z1J | 3 / 3 | 824 | 16 | P100 | 5 | Google Colab | Correct prediction, mostly 0.5 Å RMSD. |

  1. UCSF cluster: Wynton cluster, using Nvidia A40 GPUs with 48 GB video memory, CPU information not available, AlphaFold databases on a parallel BeeGFS file system.
  2. Desktop PC: Intel Core i9-10850K CPU @ 3.60GHz, 64 GB memory, Nvidia RTX 3090 with 24 GB video memory, AlphaFold databases on a 4 TB Samsung 870 QVO SATA 3 SSD drive.
  3. Google Colab: Run from the ChimeraX (1.4 daily build, December 2021) menu Tools / Structure Prediction / AlphaFold. Uses the paid Colab Pro service, which assigns an Nvidia P100, K80, or T4 GPU with 16 GB of memory; the GPU was identified with the nvidia-smi command from the Colab shell. Small AlphaFold databases streamed from the web, no templates.
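
One way to read a size limit out of the table above is to compare, for each GPU memory size, the largest run that completed with the smallest run that failed with a GPU error. The following is a minimal sketch of that bookkeeping, with the rows copied by hand from the table. The function name `size_limits` and the choice to leave out failures in sequence alignment or PDB writing (which are not memory related) are assumptions made for illustration.

```python
# Rows copied from the table: (PDB, total amino acids, GPU memory in GB, completed?).
# Runs that failed in hhblits alignment or in OpenMM PDB writing are omitted,
# since those failures are not attributed to GPU memory.
runs = [
    ("6IG9", 3744, 48, True),
    ("6YEJ", 3199, 48, True),
    ("7JGD", 2660, 48, True),
    ("7LHW", 2527, 48, True),
    ("6UM1", 2499, 24, False),
    ("7M1P", 2273, 48, True),
    ("7ALP", 2084, 24, True),
    ("7P3T", 1812, 48, True),
    ("6X6U", 1592, 24, True),
    ("7S62", 1441, 16, False),
    ("7Q5Z", 1436, 16, False),
    ("7E5N", 1292, 48, True),
    ("7E5N", 1292, 24, True),
    ("7QFP", 1269, 16, True),
    ("6XMP", 1239, 16, False),
    ("7KTT", 1142, 16, False),
    ("6Z03", 1078, 16, True),
    ("5N5F", 980, 16, True),
    ("6Z1J", 824, 24, True),
    ("6Z1J", 824, 48, True),
    ("6Z1J", 824, 16, True),
]


def size_limits(rows):
    """Return, per GPU memory size, the largest completed and smallest failed run."""
    limits = {}
    for _pdb, residues, gpu_gb, completed in rows:
        entry = limits.setdefault(
            gpu_gb, {"largest_completed": None, "smallest_failed": None}
        )
        if completed:
            best = entry["largest_completed"]
            if best is None or residues > best:
                entry["largest_completed"] = residues
        else:
            worst = entry["smallest_failed"]
            if worst is None or residues < worst:
                entry["smallest_failed"] = residues
    return limits


for gpu_gb, entry in sorted(size_limits(runs).items()):
    print(gpu_gb, "GB:", entry)
```

With these rows, the 24 GB card completed 2084 residues and failed at 2499, and no failure was seen on the 48 GB cards. On the 16 GB cards failures begin at 1142 residues even though a 1269 residue run completed, so the limit is not a sharp cutoff.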

Speed with GPU vs without GPU

AlphaFold can be run without a GPU using the AlphaFold Docker --use_gpu=False option or the run_alphafold.py script --gpu_devices=-1 option. Here are some tests with small proteins comparing runs with and without a GPU.
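
As an illustration of the Docker option, here is a minimal sketch that assembles a CPU-only command line without running it. Only `--use_gpu=False` comes from the text above. The launcher path `docker/run_docker.py`, the other flags, and all file names and dates are assumptions based on the standard AlphaFold Docker launcher and should be checked against the installed version.

```python
import shlex


def cpu_only_docker_command(fasta_path, data_dir, max_template_date):
    """Build (but do not run) an AlphaFold Docker launcher command with the GPU disabled.

    All arguments are hypothetical example values supplied by the caller.
    """
    return [
        "python3",
        "docker/run_docker.py",                      # assumed launcher location
        f"--fasta_paths={fasta_path}",               # assumed flag name
        f"--data_dir={data_dir}",                    # assumed flag name
        f"--max_template_date={max_template_date}",  # assumed flag name
        "--use_gpu=False",                           # the option named in the text
    ]


command = cpu_only_docker_command(
    fasta_path="7b76.fasta",          # hypothetical input file
    data_dir="/data/alphafold_dbs",   # hypothetical database directory
    max_template_date="2022-01-01",   # hypothetical cutoff date
)
print(shlex.join(command))
```

Building the command as a list keeps each argument separate, so it can be passed directly to `subprocess.run` without shell quoting problems.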

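| PDB | Proteins / unique | Total amino acids | GPU memory (GB) | Nvidia GPU | Time (hours) | Computer | Notes |
|---|---|---|---|---|---|---|---|
| 6Z1J | 3 / 3 | 824 |  | No GPU | 17 | UCSF cluster | Correct prediction. |
| 6Z1J | 3 / 3 | 824 | 48 | A40 | 2.5 | UCSF cluster | Correct prediction, mostly 0.5 Å RMSD. |
| 7B76 | 1 / 1 | 125 |  | No GPU | 2.7 | UCSF cluster | Wrong fold. Image. |
| 7B76 | 1 / 1 | 125 | 48 | A40 | 0.5 | UCSF cluster | Wrong fold. |
| 7B76 | 1 / 1 | 125 |  | No GPU | 1 | Desktop PC | Wrong fold. |
| 7B76 | 1 / 1 | 125 | 24 | 3090 | 0.3 | Desktop PC | Wrong fold. |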
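
The speedup from using a GPU follows from dividing each run time without a GPU by the matching run time with a GPU. The sketch below does this arithmetic for the three pairs of runs in the table above. The variable and function names are chosen for illustration only.

```python
# (PDB, computer, hours without GPU, hours with GPU), copied from the table.
timings = [
    ("6Z1J", "UCSF cluster", 17.0, 2.5),
    ("7B76", "UCSF cluster", 2.7, 0.5),
    ("7B76", "Desktop PC", 1.0, 0.3),
]


def gpu_speedup(cpu_hours, gpu_hours):
    """Ratio of run time without a GPU to run time with a GPU."""
    return cpu_hours / gpu_hours


speedups = {
    (pdb, computer): gpu_speedup(cpu_hours, gpu_hours)
    for pdb, computer, cpu_hours, gpu_hours in timings
}

for (pdb, computer), factor in speedups.items():
    print(f"{pdb} on {computer}: {factor:.1f}x faster with GPU")
```

For these runs the GPU gives roughly a 6.8x, 5.4x, and 3.3x speedup respectively. The run times are rounded and come from single runs, so these ratios are only rough guides.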