ls-lR index file to be removed from PDB archive on July 12, 2023

Jasmine Young
Wed, May 17, 2023 5:06 PM

Dear PDB-l,

With continuing growth of the PDB archive, the size of the file that
lists all directory contents (currently
https://files.wwpdb.org/pub/pdb/ls-lR) will become a challenge for
long-term maintenance. wwPDB will remove this file from the PDB archive
at 00:00 UTC on July 12, 2023. We strongly encourage users to utilize
the previously announced inventory files
(https://www.wwpdb.org/news/news?year=2021#613b93b3ef055f03d1f222cf),
which contain the same data (https://files.wwpdb.org/pub/pdb/holdings/).

These inventory data files offer a quick overview of data in the
archive. Two new inventory files for experimental data have been added.
These files are in the extensible JSON format and can be found under the
new /pdb/holdings/ archive tree.

The inventory lists provided include:

  • current_file_holdings.json.gz: a list of released PDB entries and
    the file types present for each in the PDB Core Archive (e.g.
    coordinate data, experimental data, validation report).
  • refdata_id_list.json.gz: a list of released chemical reference
    entries, their content types (e.g., Chemical Component, BIRD), and
    the most recent modification date of the reference file.
  • released_structures_last_modified_dates.json.gz: a list of released
    PDB entries with the most recent modification date of the PDBx/mmCIF
    file.
  • released_experimental_data_last_modified_dates.json.gz: a list of
    released experimental data files with the most recent modification
    date.
  • obsolete_structures_last_modified_dates.json.gz: a list of obsoleted
    PDB entries with the most recent modification date of the PDBx/mmCIF
    file.
  • obsolete_experimental_data_last_modified_dates.json.gz: a list of
    obsoleted experimental data files with the most recent modification
    date.
  • all_removed_entries.json.gz: a list of obsoleted PDB entries
    including information for entry authors, entry title, release date,
    obsolete date, and superseding PDB ID, if any.
  • unreleased_entries.json.gz: a list of on-hold PDB entries, their
    entry status, deposition date, and pre-release sequence information,
    where available.
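
As an illustration only, the inventory files listed above could be read
with a few lines of Python. This is a minimal sketch, not an official
wwPDB tool: the helper names parse_inventory and fetch_inventory are
hypothetical, and the only assumption made about the files is that each
one is gzip-compressed JSON, as the .json.gz extension suggests. The
structure of the parsed JSON is not assumed here.

```python
import gzip
import json
from urllib.request import urlopen

# Base URL of the holdings directory, taken from the announcement.
HOLDINGS_BASE = "https://files.wwpdb.org/pub/pdb/holdings/"


def parse_inventory(raw: bytes):
    """Decompress gzip bytes and parse the JSON payload (hypothetical helper)."""
    return json.loads(gzip.decompress(raw).decode("utf-8"))


def fetch_inventory(name: str, base: str = HOLDINGS_BASE):
    """Download one inventory file by name and parse it (hypothetical helper).

    Example name: "unreleased_entries.json.gz".
    """
    with urlopen(base + name) as response:
        return parse_inventory(response.read())
```

For example, fetch_inventory("all_removed_entries.json.gz") would return
the parsed contents of that file; inspect the result before relying on
any particular key names.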

Users are encouraged to utilize these inventory files. For example, you
can check for updates to the PDB archive using
current_file_holdings.json.gz
(https://s3.rcsb.org/pub/pdb/holdings/current_file_holdings.json.gz) or
released_structures_last_modified_dates.json.gz
(https://s3.rcsb.org/pub/pdb/holdings/released_structures_last_modified_dates.json.gz),
both in /pub/pdb/holdings/.
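
The update check described above could be sketched as a comparison of
two snapshots of released_structures_last_modified_dates.json.gz taken
at different times. This is a hedged example, not a wwPDB-provided
method: the function name diff_snapshots is hypothetical, and it assumes
each snapshot has already been parsed into a mapping from entry ID to a
modification-date value. Please verify the actual file layout before
using it.

```python
def diff_snapshots(old: dict, new: dict):
    """Compare two {entry_id: last_modified} mappings (assumed layout).

    Returns three sorted lists: IDs only in `new`, IDs whose value
    changed between snapshots, and IDs only in `old`.
    """
    added = sorted(k for k in new if k not in old)
    modified = sorted(k for k in new if k in old and new[k] != old[k])
    removed = sorted(k for k in old if k not in new)
    return added, modified, removed


# Usage with made-up IDs and dates, for illustration only.
previous = {"1abc": "2023-01-01", "2xyz": "2023-02-01"}
current = {"1abc": "2023-01-01", "2xyz": "2023-05-10", "4new": "2023-05-17"}
added, modified, removed = diff_snapshots(previous, current)
print("new:", added, "updated:", modified, "gone:", removed)
```

Only the entries reported as new or updated would then need to be
downloaded again, which avoids walking the full directory listing.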

Please contact info@wwpdb.org with any questions.

--
Regards,

Jasmine

===========================================================
Jasmine Young, Ph.D.
Biocuration Team Lead
RCSB Protein Data Bank
Research Professor
Institute for Quantitative Biomedicine
Rutgers, The State University of New Jersey
174 Frelinghuysen Rd
Piscataway, NJ 08854-8087

Email: jasmine@rcsb.rutgers.edu
Phone: (848)445-0103 ext 4920
Fax: (732)445-4320

===========================================================
