Future Planning: Entries with extended PDB and CCD ID codes will be distributed in PDBx/mmCIF format only

Jasmine Young
Wed, Apr 21, 2021 5:08 PM

http://www.wwpdb.org/news/news?year=2021#607760112786e73a79c76f9d

wwPDB, in collaboration with the PDBx/mmCIF Working Group
<http://www.wwpdb.org/task/mmcif>, plans to extend the length of ID
codes for PDB entries and Chemical Component Dictionary (CCD) entries.
Entries issued with these extended IDs will not be supported by the
legacy PDB file format.

CCD entries are currently identified by unique three-character
alphanumeric codes. At current growth rates, we anticipate running out
of available new codes in the next three to four years. At that point,
the wwPDB will issue four-character alphanumeric codes for CCD IDs in
the OneDep system. Due to constraints of the legacy PDB file format,
entries containing these new four-character ID codes will be
distributed only in PDBx/mmCIF format. The wwPDB will begin
implementation of extended CCD ID codes in 2022.
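
To illustrate the format constraint mentioned above: in legacy PDB
ATOM/HETATM records the residue name occupies a fixed three-column field
(columns 18-20), so a four-character CCD ID cannot be written there. The
following is a minimal sketch only; the function name and the sample ID
"ABCD" are hypothetical and not part of any wwPDB software.

```python
# Width of the residue-name field in legacy PDB ATOM/HETATM records
# (columns 18-20). Treat this as an illustration, not a format validator.
RESNAME_WIDTH = 3


def fits_legacy_resname(ccd_id: str) -> bool:
    """Return True if a CCD ID fits the legacy fixed-width residue-name field."""
    return 1 <= len(ccd_id) <= RESNAME_WIDTH


# "ATP" is an existing three-character code; "ABCD" is a made-up
# four-character code used only to show the overflow.
for code in ("ATP", "ABCD"):
    print(code, "fits legacy field:", fits_legacy_resname(code))
```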

In addition, wwPDB plans to extend PDB IDs to eight characters
prefixed by ‘pdb_’, e.g., pdb_00001abc. Each PDB ID has a corresponding
Digital Object Identifier (DOI), which is often required for manuscript
submission to journals and cited in publications by the structure
authors. Both extended PDB IDs and corresponding PDB DOIs, along with
existing four-character PDB IDs, will be included in the PDBx/mmCIF
formatted files for all new entries by Fall 2021.

For example, PDB entry 1ABC will also have the extended PDB ID
(pdb_00001abc) and the corresponding PDB DOI (10.2210/pdb1abc/pdb)
listed in the _database_2 PDBx/mmCIF category.

loop_
_database_2.database_id
_database_2.database_code
_database_2.pdbx_database_accession
_database_2.pdbx_DOI
PDB   1abc pdb_00001abc 10.2210/pdb1abc/pdb
WWPDB D_1xxxxxxxxx  ?    ?
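
As a rough illustration of how software might read the new items, the
sketch below parses the example loop above with plain string splitting.
It assumes unquoted, single-line, whitespace-delimited values, which
holds for this example but not for PDBx/mmCIF in general; production
code should use a real mmCIF parser. The function name
read_simple_loop is hypothetical.

```python
EXAMPLE = """\
loop_
_database_2.database_id
_database_2.database_code
_database_2.pdbx_database_accession
_database_2.pdbx_DOI
PDB   1abc pdb_00001abc 10.2210/pdb1abc/pdb
WWPDB D_1xxxxxxxxx  ?    ?
"""


def read_simple_loop(text: str) -> list[dict[str, str]]:
    """Parse one simple mmCIF loop into a list of {tag: value} rows.

    Assumption: no quoted strings, no multi-line values, and no data
    value that begins with an underscore.
    """
    tags: list[str] = []
    rows: list[dict[str, str]] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "loop_":
            continue
        if line.startswith("_"):
            tags.append(line)  # item name, e.g. _database_2.pdbx_DOI
        else:
            rows.append(dict(zip(tags, line.split())))
    return rows


rows = read_simple_loop(EXAMPLE)
pdb_row = next(r for r in rows if r["_database_2.database_id"] == "PDB")
print(pdb_row["_database_2.database_code"])            # four-character ID
print(pdb_row["_database_2.pdbx_database_accession"])  # extended ID
print(pdb_row["_database_2.pdbx_DOI"])                 # DOI
```

A "?" value, as in the WWPDB row, marks an item whose value is unknown,
so callers should check for it before using the accession or DOI.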

Once all four-character PDB IDs have been consumed, newly deposited PDB
entries will be issued only extended PDB ID codes, and those entries
will be distributed only in PDBx/mmCIF format.

wwPDB is asking PDB users and related software developers to review
their code and begin removing any limitations that assume
four-character PDB IDs, three-character CCD IDs, or the legacy PDB file
format.
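
One way code could accept both ID forms is sketched below. The patterns
are assumptions inferred from the single example in this announcement
(1abc becomes pdb_00001abc): a four-character legacy ID starting with a
digit, and an extended ID made of "pdb_" followed by eight alphanumeric
characters, left-padded with zeros. The function name
to_extended_pdb_id is hypothetical, and the authoritative rules should
be taken from wwPDB documentation.

```python
import re

# Assumed patterns, inferred from the example in this announcement.
LEGACY_PDB_ID = re.compile(r"^[0-9][A-Za-z0-9]{3}$")
EXTENDED_PDB_ID = re.compile(r"^pdb_[A-Za-z0-9]{8}$", re.IGNORECASE)


def to_extended_pdb_id(pdb_id: str) -> str:
    """Normalise a legacy or extended PDB ID to the lower-case extended form."""
    pdb_id = pdb_id.strip()
    if EXTENDED_PDB_ID.match(pdb_id):
        return pdb_id.lower()
    if LEGACY_PDB_ID.match(pdb_id):
        # Zero-pad the legacy code to eight characters, as in pdb_00001abc.
        return "pdb_" + pdb_id.lower().rjust(8, "0")
    raise ValueError(f"not a recognised PDB ID: {pdb_id!r}")


print(to_extended_pdb_id("1ABC"))
print(to_extended_pdb_id("pdb_00001abc"))
```

Storing the extended form internally and avoiding fixed-width ID
columns is one possible way to remove the length assumption.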

--
Regards,

Jasmine

---==========================
Jasmine Young, Ph.D.
Biocuration Team Lead
RCSB Protein Data Bank
Research Professor
Institute for Quantitative Biomedicine
Rutgers, The State University of New Jersey
174 Frelinghuysen Rd
Piscataway, NJ 08854-8087

Email: jasmine@rcsb.rutgers.edu
Phone: (848)445-0103 ext 4920
Fax: (732)445-4320

---==========================
