More efficient delivery of sequence cluster files

Jose Duarte
Mon, Feb 14, 2022 6:26 PM

The sequence cluster files on RCSB PDB's CDN server (see
https://www.rcsb.org/docs/programmatic-access/file-download-services#sequence-clusters-data)
are now keyed by PDB polymer entity identifiers rather than chain
identifiers, which removes much redundancy and produces smaller files.
The previous chain-based cluster files will be updated only until
April 12, 2022. If you use the chain-based files, please migrate to the
entity-based files as soon as possible.
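
For anyone adapting a script to the entity-based files, here is a minimal parsing sketch. It assumes each line of a cluster file is one cluster, written as whitespace-separated polymer entity identifiers of the form ENTRY_ENTITY (for example 4HHB_1); please check that against the documentation linked above. The sample text and its groupings are made up for illustration and are not real clustering results, and the function names are hypothetical.

```python
# Sketch only: assumes one cluster per line, members separated by whitespace.
# The groupings in SAMPLE are invented for illustration.
SAMPLE = """\
4HHB_1 2HHB_1 1A3N_1
4HHB_2 2HHB_2
1CRN_1
"""


def parse_clusters(text):
    """Return a list of clusters; each cluster is a list of entity IDs."""
    clusters = []
    for line in text.splitlines():
        ids = line.split()
        if ids:  # skip blank lines
            clusters.append(ids)
    return clusters


def index_by_entity(clusters):
    """Map each entity ID to the 0-based index of the cluster containing it."""
    return {
        entity_id: i
        for i, ids in enumerate(clusters)
        for entity_id in ids
    }


def cluster_mates(entity_id, clusters, index):
    """Return the other members of the cluster that contains entity_id."""
    return [e for e in clusters[index[entity_id]] if e != entity_id]


clusters = parse_clusters(SAMPLE)
index = index_by_entity(clusters)
print(cluster_mates("4HHB_1", clusters, index))
```

To run this on a downloaded file, read it with open(path).read() and pass the text to parse_clusters.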

Please also note that more integrated access to the same data is available
via RCSB PDB's Data and Search APIs. See:

https://data.rcsb.org/#gql-example-3
https://search.rcsb.org/#dealing-with-redundancy
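
As a rough sketch of the Data API route, the snippet below builds a GraphQL request asking for the cluster membership of a single polymer entity. The endpoint URL, the argument types, and the field names (rcsb_cluster_membership, cluster_id, identity) are assumptions based on my reading of the public schema, so please verify them against the first link above. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint; verify against the Data API documentation.
GRAPHQL_URL = "https://data.rcsb.org/graphql"

# Assumed schema fields; verify against the GraphQL example linked above.
QUERY = """
query ($entry_id: String!, $entity_id: String!) {
  polymer_entity(entry_id: $entry_id, entity_id: $entity_id) {
    rcsb_cluster_membership {
      cluster_id
      identity
    }
  }
}
"""


def build_payload(entity):
    """Split an ID such as '4HHB_1' into entry and entity parts."""
    entry_id, entity_id = entity.rsplit("_", 1)
    return {
        "query": QUERY,
        "variables": {"entry_id": entry_id, "entity_id": entity_id},
    }


def fetch_membership(entity, url=GRAPHQL_URL):
    """POST the query and return the decoded JSON response (needs network)."""
    data = json.dumps(build_payload(entity)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling fetch_membership("4HHB_1") would send the request; build_payload can be inspected offline first.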

Best wishes

Jose


Jose Duarte
RCSB Protein Data Bank
San Diego Supercomputing Center
UC San Diego
