The SKEMPI version of 1KBH has been "flattened" in a way that breaks the PDB specification. Here's what we see when we compare the two files:

No header or metadata

The original 1kbh.pdb begins with HEADER, TITLE, COMPND, SOURCE, CRYST1, etc. SKEMPI's SKEMPI_v2.0_1KBH.pdb starts immediately with ATOM lines and has no HEADER, TITLE, COMPND, SOURCE, or CRYST1 records.

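One way to confirm the missing records is to look at the record name in columns 1-6 of every line. The sketch below is only an illustration: `missing_records` and `REQUIRED` are hypothetical names, and the two short line lists are made-up stand-ins, not real 1KBH content.

```python
# Record names the original file is said to start with.
REQUIRED = ("HEADER", "TITLE", "COMPND", "SOURCE", "CRYST1")


def record_name(line):
    """Return the PDB record name, which occupies columns 1-6."""
    return line[:6].strip()


def missing_records(lines, required=REQUIRED):
    """List the required record names that never appear in `lines`."""
    present = {record_name(line) for line in lines}
    return [name for name in required if name not in present]


# Made-up stand-ins for the two files (assumption: not real 1KBH lines).
with_header = [
    "HEADER    PLACEHOLDER",
    "TITLE     PLACEHOLDER",
    "ATOM      1  N   ALA A   1",
]
flattened = [
    "ATOM      1  N   ALA A   1",
    "ATOM      2  CA  ALA A   1",
]

print(missing_records(with_header))
print(missing_records(flattened))
```

To run it on the real files, pass an open file object in place of the lists, for example `missing_records(open("SKEMPI_v2.0_1KBH.pdb"))`.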
All 20 NMR models concatenated into one

1KBH is an NMR structure with 20 models; in the official PDB file you'll find 20 MODEL … ENDMDL blocks (40 delimiter lines in total). In the SKEMPI file, those 20 blocks have been stripped of their MODEL/ENDMDL delimiters, so you get 33,060 ATOM records (20 × 1,653 atoms) all in one continuous "model."

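The effect of dropping the delimiters can be illustrated by counting atoms per model. This is a minimal sketch, assuming that records outside any MODEL block are treated as one implicit model; `atoms_per_model` is a hypothetical helper and the sample lines are invented.

```python
def atoms_per_model(lines):
    """Count ATOM/HETATM records per model.

    Atoms that are not enclosed in MODEL/ENDMDL are counted as a
    single implicit model (an assumption of this sketch).
    """
    counts = []
    current = None  # None while we are between models
    for line in lines:
        rec = line[:6].strip()
        if rec == "MODEL":
            current = 0
        elif rec in ("ATOM", "HETATM"):
            current = (current or 0) + 1
        elif rec == "ENDMDL":
            counts.append(current or 0)
            current = None
    if current is not None:
        counts.append(current)  # trailing atoms with no ENDMDL
    return counts


atom = "ATOM      1  N   ALA A   1"  # invented sample record

multi_model = ["MODEL        1", atom, atom, "ENDMDL",
               "MODEL        2", atom, atom, "ENDMDL"]
flattened = [atom, atom, atom, atom]

print(atoms_per_model(multi_model))  # two models of two atoms each
print(atoms_per_model(flattened))    # one "model" holding everything

# The totals quoted above are consistent: 20 models of 1,653 atoms.
print(20 * 1653)
```

On the real pair of files, the original would be expected to yield 20 equal counts and the SKEMPI file a single count of 33,060, if the description above holds.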
Duplicate atom and residue numbers

Because each original model's ATOM records (and their atom serial numbers) have simply been appended back-to-back, the file contains multiple atoms with the same serial numbers (1, 2, … 1653, then 1, 2, … 1653 again, etc.) and the same residue identifiers in each "block." Most parsers will either reject these duplicates or be confused by them.

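The duplication can be checked directly from the atom serial field (columns 7-11). The following sketch uses hypothetical helper names (`atom_line`, `atom_serials`, `duplicated_serials`, `count_serial_resets`) and a tiny invented example of two three-atom blocks.

```python
from collections import Counter


def atom_line(serial, name="CA", res="ALA", chain="A", resseq=1):
    """Build a fixed-column ATOM record for illustration only."""
    return (f"ATOM  {serial:5d} {name:^4s} {res:3s} {chain}{resseq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{0.0:6.2f}")


def atom_serials(lines):
    """Extract atom serial numbers (columns 7-11) from ATOM/HETATM records."""
    return [int(line[6:11]) for line in lines
            if line[:6].strip() in ("ATOM", "HETATM")]


def duplicated_serials(serials):
    """Return the serial numbers that occur more than once, sorted."""
    counts = Counter(serials)
    return sorted(s for s, n in counts.items() if n > 1)


def count_serial_resets(serials):
    """Count positions where the serial fails to increase.

    Each such position marks the seam between two appended blocks.
    """
    return sum(1 for prev, cur in zip(serials, serials[1:]) if cur <= prev)


# Two invented three-atom "models" appended back-to-back.
flattened = [atom_line(s) for s in (1, 2, 3, 1, 2, 3)]
serials = atom_serials(flattened)

print(duplicated_serials(serials))
print(count_serial_resets(serials))
```

For a file built from 20 appended models, one would expect 19 resets, although that figure is inferred from the description here rather than measured.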
Missing TERs and no END

The original has two TER records per model (40 TERs in total); SKEMPI leaves just two TERs (one per chain) at the end of the entire file. There is also no final END record.

Because of these violations of the PDB format and of its multi-model conventions (and because atom serials are no longer unique), many molecular viewers and parsers simply give up on SKEMPI_v2.0_1KBH.pdb.
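
If the atom serials really do restart at each appended block, the multi-model layout can be partly recovered by splitting wherever the serial number drops. The sketch below is one possible approach, not a tested repair for this particular file: `rebuild_models` and `atom_line` are hypothetical helpers, the input is invented, and the sketch neither restores the header records nor the per-model TER records.

```python
def atom_line(serial, name="CA", res="ALA", chain="A", resseq=1):
    """Build a fixed-column ATOM record for illustration only."""
    return (f"ATOM  {serial:5d} {name:^4s} {res:3s} {chain}{resseq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{0.0:6.2f}")


def rebuild_models(lines):
    """Wrap each run of increasing atom serials in MODEL/ENDMDL, then add END.

    Assumes a new block starts whenever the serial fails to increase.
    Records other than ATOM, HETATM and TER are dropped in this sketch.
    """
    blocks, current, last_serial = [], [], None
    for line in lines:
        rec = line[:6].strip()
        if rec in ("ATOM", "HETATM"):
            serial = int(line[6:11])
            if last_serial is not None and serial <= last_serial:
                blocks.append(current)  # serial reset: close the block
                current = []
            last_serial = serial
            current.append(line.rstrip("\n"))
        elif rec == "TER":
            current.append(line.rstrip("\n"))
    if current:
        blocks.append(current)

    out = []
    for number, block in enumerate(blocks, start=1):
        out.append(f"MODEL     {number:4d}")  # model serial in columns 11-14
        out.extend(block)
        out.append("ENDMDL")
    out.append("END")
    return out


# Invented input: two three-atom blocks followed by a single TER.
flattened = [atom_line(s) for s in (1, 2, 3, 1, 2, 3)] + ["TER"]
rebuilt = rebuild_models(flattened)
print("\n".join(rebuilt))
```

Splitting on serial resets is a heuristic. It would mis-split a file whose serials are legitimately non-monotonic, so the resulting model count should be compared against the expected 20 before the output is trusted.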