New PDB Beta Archive Available for Testing

Irina Persikova
Wed, Feb 11, 2026 4:07 PM

By 2028, 4-character PDB IDs (e.g. 1abc) will be fully allocated.
After that, all new entries will be assigned only extended PDB IDs.

The new extended PDB ID format
<https://www.wwpdb.org/documentation/pdb-id-extension-faq> will be 12
characters long: the prefix pdb_ followed by 8 alphanumeric characters,
e.g. pdb_1000axyz. This new ID format will enable text-mining detection
of PDB entries in the published literature and allow for more
informative and transparent delivery of revised data files. *When
submitting extended PDB IDs to journals and citing extended PDB IDs in
manuscripts, all 12 characters, including the prefix pdb_, should be
provided.*
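
As a rough illustration of the 12-character format, the sketch below checks an ID and converts a 4-character ID to its extended form. The function name to_extended_id is hypothetical. Two things here are assumptions rather than statements from this announcement: that legacy IDs are left-padded with zeros (inferred from file names such as pdb_0000116d), and that IDs are compared case-insensitively and written in lowercase.

```python
import re

# Assumption: "8 alphanumeric characters" means digits and ASCII letters,
# matched case-insensitively and normalized to lowercase.
EXTENDED_ID = re.compile(r"pdb_[0-9a-z]{8}", re.IGNORECASE)
# Assumption: a legacy ID is a digit followed by three alphanumeric characters.
LEGACY_ID = re.compile(r"[0-9][0-9a-z]{3}", re.IGNORECASE)


def to_extended_id(pdb_id: str) -> str:
    """Return the 12-character extended form of a PDB ID (hypothetical helper)."""
    candidate = pdb_id.strip().lower()
    if EXTENDED_ID.fullmatch(candidate):
        return candidate
    if LEGACY_ID.fullmatch(candidate):
        # Assumption: legacy IDs are zero-padded on the left to 8 characters.
        return "pdb_" + candidate.zfill(8)
    raise ValueError(f"not a recognizable PDB ID: {pdb_id!r}")


print(to_extended_id("116d"))          # pdb_0000116d
print(to_extended_id("PDB_1000AXYZ"))  # pdb_1000axyz
```

The wwPDB FAQ linked above remains the authority on the exact character set and padding rules.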

A PDB Beta Archive <https://www.wwpdb.org/ftp/pdb-beta-ftp-sites> is now
available to help the community adopt extended PDB IDs and the
PDBx/mmCIF format during the transition phase. All files in this archive
are reorganized by extended PDB ID (including file naming and
directories) at the entry level, mirroring the data organization of the
PDB Versioned Archive <http://files-versioned.wwpdb.org/>.

All data files for a particular entry are stored in a single directory,
which sits under a two-character hash directory taken from the PDB ID,
i.e.,
https://files-beta.wwpdb.org/pub/wwpdb/pdb/data/entries/<two-letter-hash>/<pdb_accession_code>/<entry_data_File_names>.
The two-character hash consists of the second and third characters
counting back from the end of the ID, that is, the two characters that
precede the last one. For example, PDB entry pdb_1abc5678 will be
under /67/. This maintains consistency with the current PDB
archive: PDB entry 1abc is under /ab/.
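
The directory rule can be sketched in a few lines of Python. The function names hash_directory and entry_directory_url are hypothetical, and the trailing slash on the entry directory is an assumption. The base URL and both examples come from the text above.

```python
BETA_ENTRIES_URL = "https://files-beta.wwpdb.org/pub/wwpdb/pdb/data/entries"


def hash_directory(pdb_id: str) -> str:
    """Return the two characters that precede the last character of the ID."""
    if len(pdb_id) < 3:
        raise ValueError(f"ID too short to hash: {pdb_id!r}")
    return pdb_id[-3:-1]


def entry_directory_url(extended_id: str) -> str:
    """Build the entry directory URL in the Beta Archive (hypothetical helper)."""
    return f"{BETA_ENTRIES_URL}/{hash_directory(extended_id)}/{extended_id}/"


print(hash_directory("pdb_1abc5678"))  # 67
print(hash_directory("1abc"))          # ab
print(entry_directory_url("pdb_1abc5678"))
```

Because the same slice works for 4-character and 12-character IDs, existing code that shards by the middle two characters of a legacy ID should need only this one change.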

File naming is standardized such that the file type is used for the
extension. For example, file naming is changed from r116dsf.ent.gz to
pdb_0000116d-sf.cif.gz for the structure factor file and from
pdb318d.ent.gz to pdb_0000318d.pdb.gz for the legacy PDB-formatted
coordinate file.
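
For scripts that still hold legacy file names, a mapping along the following lines may help. It is only a sketch: the function name beta_file_name is hypothetical, it covers just the two file types shown in the examples above, and it assumes that a legacy ID becomes an extended ID by zero-padding, as those examples suggest.

```python
import re

# Legacy name pattern -> new name template. Only the two file types
# named in the announcement are covered; others are deliberately left out.
LEGACY_PATTERNS = [
    (re.compile(r"r([0-9a-z]{4})sf\.ent\.gz"), "{id}-sf.cif.gz"),  # structure factors
    (re.compile(r"pdb([0-9a-z]{4})\.ent\.gz"), "{id}.pdb.gz"),     # legacy PDB coordinates
]


def beta_file_name(legacy_name: str) -> str:
    """Translate a legacy archive file name to its Beta Archive name."""
    for pattern, template in LEGACY_PATTERNS:
        match = pattern.fullmatch(legacy_name)
        if match:
            # Assumption: legacy IDs are zero-padded on the left to 8 characters.
            extended_id = "pdb_" + match.group(1).zfill(8)
            return template.format(id=extended_id)
    raise ValueError(f"unrecognized legacy file name: {legacy_name!r}")


print(beta_file_name("r116dsf.ent.gz"))  # pdb_0000116d-sf.cif.gz
print(beta_file_name("pdb318d.ent.gz"))  # pdb_0000318d.pdb.gz
```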

When four-character PDB IDs are about to be exhausted, this PDB Beta
Archive will replace the current PDB Archive (expected around
mid-2027). Entries issued with extended PDB IDs are not compatible with
the legacy PDB format. wwPDB encourages scientific journals, the PDB
community, and users to transition to the PDBx/mmCIF format and adopt
the new PDB ID format as early as possible.

For any further information please contact us at info@wwpdb.org.

Please read the full news at:
https://www.wwpdb.org/news/news?year=2026#698a36067e4af405aeeb5b24

On behalf of the wwPDB,

--
IRINA PERSIKOVA, Ph.D.
Deputy Biocuration Lead, RCSB Protein Data Bank
Research Associate, Institute for Quantitative Biomedicine
Rutgers, The State University of New Jersey
174 Frelinghuysen Road, Piscataway NJ 08854
P: 848.445.4938 | E: irina.persikova@rcsb.org
rcsb.org <www.rcsb.org> | iqb.rutgers.edu <http://iqb.rutgers.edu> |
facebook <https://www.facebook.com/RCSBPDB> |
twitter <https://twitter.com/buildmodels>
