Removal of ls-lR index file from the PDB archive

Jasmine Young
Tue, Apr 4, 2023 5:02 PM

Dear PDB-l,

With the continuing growth of the PDB archive, the size of the file that
lists all directory contents (currently
https://files.wwpdb.org/pub/pdb/ls-lR) will become a challenge for
long-term maintenance. wwPDB plans to remove this file from the PDB
archive at 00:00 UTC on July 12, 2023. We strongly encourage users to
switch to the previously announced inventory files that contain the same
data (https://files.wwpdb.org/pub/pdb/holdings/).

These inventory data files offer a quick overview of data in the
archive. These files are in the extensible JSON format, and can be found
under the new /pdb/holdings/ archive tree.

The inventory lists provided include:

  • all_removed_entries.json.gz: a list of obsoleted PDB entries
    including information for entry authors, entry title, release date,
    obsolete date, and superseding PDB ID, if any.
  • current_file_holdings.json.gz: a list of released PDB entries and
    the file types present for each in the PDB Core Archive (e.g.
    coordinate data, experimental data, validation report).
  • obsolete_structures_last_modified_dates.json.gz: a list of obsoleted
    PDB entries with information about the most recent modification date
    of the PDBx/mmCIF file.
  • refdata_id_list.json.gz: a list of released chemical reference
    entries, their content types (e.g., Chemical Component, BIRD), and
    the most recent modification date of the reference file.
  • released_structures_last_modified_dates.json.gz: a list of released
    PDB entries with the most recent modification date of the PDBx/mmCIF
    file.
  • unreleased_entries.json.gz: a list of on-hold PDB entries, their
    entry status, deposition date, and pre-release sequence information,
    where available.

Users are encouraged to utilize these inventory files. For example,
checking for updates to the PDB archive can be performed using
current_file_holdings.json.gz
(https://s3.rcsb.org/pub/pdb/holdings/current_file_holdings.json.gz) or
released_structures_last_modified_dates.json.gz
(https://s3.rcsb.org/pub/pdb/holdings/released_structures_last_modified_dates.json.gz),
both in /pub/pdb/holdings/.
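
As an illustration only (not an official wwPDB tool), the update check
described above could be sketched in Python roughly as follows. The URL
is the one given in this message. The function names are made up for
this sketch, and the code ASSUMES that
released_structures_last_modified_dates.json.gz decodes to a single JSON
object mapping each PDB ID to a modification date string; please inspect
the actual file before relying on that layout.

```python
import gzip
import json
import urllib.request

# URL as given in the announcement.
HOLDINGS_URL = (
    "https://s3.rcsb.org/pub/pdb/holdings/"
    "released_structures_last_modified_dates.json.gz"
)


def decode_holdings(raw: bytes):
    """Decompress a .json.gz payload and parse it as JSON."""
    return json.loads(gzip.decompress(raw).decode("utf-8"))


def fetch_holdings(url: str = HOLDINGS_URL):
    """Download one holdings file and return the parsed JSON."""
    with urllib.request.urlopen(url, timeout=60) as response:
        return decode_holdings(response.read())


def diff_snapshots(old: dict, new: dict) -> dict:
    """Compare two snapshots of the inventory.

    ASSUMPTION: each snapshot is a flat mapping of
    entry ID -> last-modified date string.
    """
    shared = old.keys() & new.keys()
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": sorted(k for k in shared if old[k] != new[k]),
    }
```

A local mirror could keep the parsed result of the previous run (for
example, saved to disk with json.dump), call fetch_holdings() on the
next run, and pass both to diff_snapshots() to decide which entries need
to be downloaded again, instead of parsing the ls-lR listing.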

Please contact info@wwpdb.org with any questions.

--
Regards,

Jasmine

---==========================
Jasmine Young, Ph.D.
Biocuration Team Lead
RCSB Protein Data Bank
Research Professor
Institute for Quantitative Biomedicine
Rutgers, The State University of New Jersey
174 Frelinghuysen Rd
Piscataway, NJ 08854-8087

Email: jasmine@rcsb.rutgers.edu
Phone: (848)445-0103 ext 4920
Fax: (732)445-4320

---==========================
