Prototype of PDB NextGen Archive now available

Ezra Peisach
Tue, Feb 7, 2023 12:19 PM

A prototype of a next-generation archive repository for the PDB is now
available. The archive, called “NextGen”, hosts structural model files
in PDBx/mmCIF and PDBML formats at https://files-nextgen.wwpdb.org.
This enriched PDB archive adds annotation from external database
resources to the metadata, in addition to the content provided in the
structure model files in the PDB main archive at https://files.wwpdb.org.

This prototype provides sequence annotation from external resources such
as UniProt, SCOP2 and Pfam at atom, residue, and chain levels. This
mapping information is derived from the Structure Integration with
Function, Taxonomy and Sequence (SIFTS) project
(https://www.ebi.ac.uk/pdbe/docs/sifts/), a service developed and
maintained by the PDBe and UniProt teams at EMBL-EBI. Sequence mappings
are provided in the _pdbx_sifts_unp_segments and
_pdbx_sifts_xref_db_segments categories for each segment, in
_pdbx_sifts_xref_db at residue level, and in _atom_site at atom level.
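To check whether a downloaded NextGen file actually carries the SIFTS
categories named above, something like the following sketch could be
used. It is a crude line scan, not a real mmCIF parser: it assumes each
data item name starts its own line (the usual mmCIF layout) and it does
not handle multi-line text fields. The function names are made up for
illustration. _atom_site is left out because the announcement does not
name the SIFTS items added there.

```python
import gzip
from collections import Counter

# Category names taken from the announcement text.
SIFTS_CATEGORIES = (
    "_pdbx_sifts_unp_segments",
    "_pdbx_sifts_xref_db_segments",
    "_pdbx_sifts_xref_db",
)


def count_sifts_items(lines):
    """Count data item names per SIFTS category in an iterable of mmCIF lines."""
    counts = Counter()
    for line in lines:
        tokens = line.strip().split(maxsplit=1)
        if not tokens or not tokens[0].startswith("_"):
            continue  # not a data item name (e.g. "loop_", data rows, blanks)
        # Item names look like "_category.item"; names are case-insensitive.
        category, _, item = tokens[0].lower().partition(".")
        if item and category in SIFTS_CATEGORIES:
            counts[category] += 1
    return counts


def count_sifts_items_in_file(path):
    """Same as count_sifts_items, reading a gzip-compressed mmCIF file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return count_sifts_items(handle)
```

A category missing from the returned counter would suggest that the
entry has no mapping of that kind, or that the file is not from the
NextGen archive.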

The PDB NextGen Repository is currently updated monthly, on the first
Wednesday of the month at 00:00 UTC; this schedule is subject to change
in the future. You can access these NextGen files at the following
locations:

  * wwPDB: https://files-nextgen.wwpdb.org,
    rsync://rsync-nextgen.wwpdb.org
  * RCSB PDB (USA): https://files-nextgen.rcsb.org,
    rsync://rsync-nextgen.rcsb.org
  * PDBe (UK): https://ftp.ebi.ac.uk/pub/databases/pdb_nextgen
  * PDBj (Japan): https://ftp-nextgen.pdbj.org
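For anyone scheduling a mirror job around the update time stated above
(first Wednesday of the month, 00:00 UTC), the date can be computed
along these lines. This is only a sketch of the announced schedule,
which may change; the function names are illustrative, and `now` is
assumed to be a timezone-aware datetime.

```python
from datetime import date, datetime, timedelta, timezone

WEDNESDAY = 2  # date.weekday() counts Monday as 0


def first_wednesday(year, month):
    """Return the date of the first Wednesday in the given month."""
    first = date(year, month, 1)
    offset = (WEDNESDAY - first.weekday()) % 7
    return first + timedelta(days=offset)


def _update_time(year, month):
    d = first_wednesday(year, month)
    return datetime(d.year, d.month, d.day, tzinfo=timezone.utc)


def next_update(now):
    """Return the first scheduled update at or after `now` (timezone-aware)."""
    now = now.astimezone(timezone.utc)
    candidate = _update_time(now.year, now.month)
    if candidate < now:
        # This month's update has passed; move on to next month.
        if now.month == 12:
            candidate = _update_time(now.year + 1, 1)
        else:
            candidate = _update_time(now.year, now.month + 1)
    return candidate
```

Called with the posting time of this message (7 February 2023), this
gives 1 March 2023 at 00:00 UTC.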

Data are organized by entry ID under a two-letter hash code, formed from
the third-from-last and second-from-last characters of the ID. This hash
code will remain consistent once PDB ID codes are extended beyond four
characters with the pdb_ prefix.
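The hash rule can be sketched as below. The directory layout is taken
from the pdb_00008aly example URL in this message. Two things are
assumptions rather than statements from the announcement: that a classic
four-character ID maps to the extended form by zero-padding to eight
characters after pdb_, and that IDs are lower case in paths. The
function names are illustrative only.

```python
def extended_pdb_id(pdb_id):
    """Return the pdb_-prefixed, lower-case form of a PDB ID."""
    pdb_id = pdb_id.strip().lower()
    if pdb_id.startswith("pdb_"):
        return pdb_id
    # Assumption: classic IDs are left-padded with zeros to eight characters.
    return "pdb_" + pdb_id.zfill(8)


def hash_code(pdb_id):
    """Two-letter hash: third-from-last and second-from-last characters."""
    return extended_pdb_id(pdb_id)[-3:-1]


def entry_directory(pdb_id):
    """Relative path of an entry directory, following the example URL layout."""
    ext_id = extended_pdb_id(pdb_id)
    return f"pdb_nextgen/data/entries/divided/{hash_code(ext_id)}/{ext_id}/"
```

For instance, `hash_code("8aly")` and `hash_code("pdb_00008aly")` both
return "al", so the entry lands in the same directory whichever form of
the ID is used.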

Some examples are shown below:

  * Access entry pdb_00008aly at
    https://files-nextgen.wwpdb.org/pdb_nextgen/data/entries/divided/al/pdb_00008aly/
  * Both PDBx/mmCIF and PDBML are provided at this location. For entry
    pdb_00008aly
    <https://files-nextgen.wwpdb.org/pub/pdb/data/structures/divided/al/pdb_00008aly/>:
      o pdb_00008aly_xyz-enrich.cif.gz
      o pdb_00008aly_xyz-no-atom-enrich.xml.gz
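Given an entry directory URL like the one above, the two file URLs can
be put together as in this sketch. It assumes that other entries follow
the same naming pattern as pdb_00008aly, which the announcement shows
for only this one entry; the helper name is made up for illustration.

```python
from urllib.parse import urljoin

# Suffixes as they appear in the pdb_00008aly example.
SUFFIXES = (
    "_xyz-enrich.cif.gz",          # PDBx/mmCIF
    "_xyz-no-atom-enrich.xml.gz",  # PDBML
)


def entry_file_urls(entry_dir_url, entry_id):
    """Return the mmCIF and PDBML URLs inside an entry directory."""
    if not entry_dir_url.endswith("/"):
        entry_dir_url += "/"  # urljoin would otherwise drop the last path part
    return [urljoin(entry_dir_url, entry_id + suffix) for suffix in SUFFIXES]


urls = entry_file_urls(
    "https://files-nextgen.wwpdb.org/pdb_nextgen/data/entries/divided/al/pdb_00008aly/",
    "pdb_00008aly",
)
```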

Please contact info@wwpdb.org with any questions.
