Dear PDB users,
Starting May 3, 2022, the PDB archive will distribute assembly files in
PDBx/mmCIF format, allowing direct access and visualization of the
curated assemblies for all PDB entries.
Currently, PDBx/mmCIF formatted assembly files are provided for
structures that are non-PDB compliant; however, the coordinates use model
numbers to differentiate alternate symmetry copies of PDB chain IDs.
This method is not ideal, nor necessary, for the current archive
PDBx/mmCIF format and has led to limited use of these files in
community software tools. In response to this issue and recommendations
by the wwPDB advisory committee, we are implementing updated,
standardized practices for generation of assembly files for all PDB entries.
These updated PDBx/mmCIF format assembly files will have improved
organization of assembly data to support usage by the community. These
files will include all symmetry generated copies of each chain within a
single model, with distinct chain IDs (_atom_site.auth_asym_id and
_atom_site.label_asym_id) assigned to each. Generation of distinct chain
IDs in assembly files is based upon the following rules:
1. Chain IDs of the original chains from the atomic coordinate file will be
   retained (e.g., A).
2. A unique chain ID (_atom_site.label_asym_id and
   _atom_site.auth_asym_id) will be assigned to each symmetry copy within
   a single model. Rules of chain ID assignment:
   - The applied index of the symmetry operator
     (_pdbx_struct_oper_list.id) will be appended to the original chain ID,
     separated by a dash (e.g., A-2, A-3, etc.)
   - If more than one type of symmetry operator is applied to
     generate a symmetry copy, a dash will be used between the two
     operators (e.g., A-12-60, A-60-88, etc.)
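For software developers, the naming rules above can be sketched in a few lines of Python. This is only an illustration, not wwPDB code: the function names are hypothetical, and it assumes that the original chain is represented by an empty operator list and that original chain IDs contain no dash.

```python
from typing import Sequence, Tuple


def assembly_chain_id(original_id: str, operator_ids: Sequence[str]) -> str:
    """Build the chain ID for one symmetry copy.

    operator_ids holds the operator indices applied to the copy, in order.
    An empty sequence (an assumption of this sketch) stands for the
    original chain, which keeps its ID unchanged.
    """
    return "-".join([original_id, *operator_ids])


def split_assembly_chain_id(chain_id: str) -> Tuple[str, Tuple[str, ...]]:
    """Split an assembly chain ID into (original chain ID, operator indices).

    Assumes the original chain ID itself contains no dash.
    """
    original_id, *operator_ids = chain_id.split("-")
    return original_id, tuple(operator_ids)


print(assembly_chain_id("A", []))            # A
print(assembly_chain_id("A", ["2"]))         # A-2
print(assembly_chain_id("A", ["12", "60"]))  # A-12-60
print(split_assembly_chain_id("A-60-88"))    # ('A', ('60', '88'))
```

Code that currently treats chain IDs as single characters, or that splits on a dash for other purposes, is the kind of limitation worth checking against the sample files.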
In addition, entity ID and chain ID mapping categories will be provided:
_pdbx_entity_remapping and _pdbx_chain_remapping.
A new directory (ftp.wwpdb.org/pub/pdb/data/assemblies/mmCIF/) will be
created for the distribution of these updated assembly files. The
directory containing the existing assembly mmCIF files for large entries
will be removed (https://ftp.wwpdb.org/pub/pdb/data/biounit/mmCIF/).
wwPDB asks all PDB users and software developers to review code and
address any limitations related to PDB assemblies. Sample files are made
available for testing purposes and to support community adoption at
GitHub.com/wwpdb/assembly-mmcif-examples
(https://github.com/wwpdb/assembly-mmcif-examples).
If you plan to use these assembly files for graphical viewing, check whether
your visualization software (e.g., PyMOL, ChimeraX, etc.) supports
instantiation of assemblies directly from atomic coordinate files
(_struct_assembly related categories). If it does, we recommend that
approach for improved efficiency.
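As a rough illustration of what instantiating an assembly from the atomic coordinate file involves, the Python sketch below applies one symmetry operator (a 3x3 rotation matrix plus a translation vector, as recorded per operator in the coordinate file) to a list of coordinates. The function name and the example operator are hypothetical, and reading the operators from the file is left out.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, float, float]


def apply_operator(coords: Sequence[Vector],
                   rotation: Sequence[Sequence[float]],
                   translation: Sequence[float]) -> List[Vector]:
    """Return each coordinate transformed as rotation * xyz + translation."""
    moved = []
    for x, y, z in coords:
        moved.append(tuple(
            rotation[i][0] * x + rotation[i][1] * y + rotation[i][2] * z
            + translation[i]
            for i in range(3)
        ))
    return moved


# Hypothetical operator: 180-degree rotation about z, then a shift of 5 along z.
rotation = [[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]]
translation = [0.0, 0.0, 5.0]
print(apply_operator([(1.0, 2.0, 3.0)], rotation, translation))
# [(-1.0, -2.0, 8.0)]
```

Viewers that do this internally only need the original coordinates and the operator list, which is why this route avoids loading a separate, larger assembly file.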
For any further information please email info@wwpdb.org.
--
Regards,
Jasmine
---==========================
Jasmine Young, Ph.D.
Biocuration Team Lead
RCSB Protein Data Bank
Research Professor
Institute for Quantitative Biomedicine
Rutgers, The State University of New Jersey
174 Frelinghuysen Rd
Piscataway, NJ 08854-8087
Email:jasmine@rcsb.rutgers.edu
Phone: (848)445-0103 ext 4920
Fax: (732)445-4320
---==========================