NIAID progress report for group meeting, covering July 14 (last report) to August 24, 2022.

Tom Goddard
August 24, 2022

NIH 3D Pipeline

DICOM

AlphaFold Database version 3

Fast Searching of 200 Million AlphaFold Models

DeepFoldRNA

De Novo RNA Tertiary Structure Prediction at Atomic Resolution Using Geometric Potentials from Deep Learning
Robin Pearce, Gilbert S. Omenn, Yang Zhang
Preprint at bioRxiv from May 15, 2022.

X-ray structure 7MLX, with 5 DeepFoldRNA models in gray

ESMFold

Machine learning protein structure prediction without using a deep sequence alignment.

Language models of protein sequences at the scale of evolution enable accurate structure prediction
Zeming Lin, Halil Akin, Roshan Rao, Brian Hie, Zhongkai Zhu, Wenting Lu, Allan dos Santos Costa, Maryam Fazel-Zarandi, Tom Sercu, Sal Candido, Alexander Rives
Preprint at bioRxiv, July 21 2022.


UMAP of high confidence predictions from 1 million sequences

FoldSeek

Foldseek: fast and accurate protein structure search
Michel van Kempen, Stephanie S. Kim, Charlotte Tumescheit, Milot Mirdita, Johannes Söding, Martin Steinegger
Preprint at bioRxiv, February 9, 2022.

"To increase speed, a crucial idea is to describe the amino acid backbone of proteins as sequences over a structural alphabet and compare structures using sequence alignments [15]. Structural alphabets thus reduce structure comparisons to much faster sequence alignments. For Foldseek, we developed a novel type of structural alphabet that does not describe the backbone but rather tertiary interactions. The 20 states of the 3D-interactions (3Di) alphabet describe for each residue i the geometric conformation with its spatially closest residue j."
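
The first step implied by the quote, finding for each residue i its spatially closest residue j, can be illustrated with a minimal sketch. This is not Foldseek's code: the function name closest_partner, the use of C-alpha coordinates, the min_separation parameter that skips sequence neighbors, and the toy coordinates are all assumptions made for illustration. The sketch stops at pairing residues and does not attempt to assign the 20 3Di states.

```python
import math


def closest_partner(ca_coords, min_separation=2):
    """For each residue i, return the index j of the spatially closest
    residue with |i - j| >= min_separation (None if there is none).

    Hypothetical helper for illustration; uses C-alpha positions and a
    sequence-separation cutoff, both of which are assumptions.
    """
    partners = []
    for i, ci in enumerate(ca_coords):
        best_j, best_d = None, math.inf
        for j, cj in enumerate(ca_coords):
            if abs(i - j) < min_separation:
                continue  # skip residue i itself and its sequence neighbors
            d = math.dist(ci, cj)  # Euclidean distance (Python 3.8+)
            if d < best_d:
                best_j, best_d = j, d
        partners.append(best_j)
    return partners


# Toy hairpin-like trace: two antiparallel strands 3.8 Angstroms apart.
coords = [
    (0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0),
]
partners = closest_partner(coords)
print(partners)
```

In the toy trace each residue pairs with a residue on the opposite strand, which is the kind of tertiary interaction the quote says the 3Di alphabet describes. Once each residue has a state letter, comparing two structures reduces to aligning two strings.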

Native Mac M1 ChimeraX - Funded by CZI grant