Dear all,
RCSB PDB has released a new AI-powered 3D Structure Similarity search that enables faster, more scalable structural comparisons across experimentally determined structures and Computed Structure Models (CSMs): https://www.rcsb.org/docs/general-help/computed-structure-models-and-rcsborg.
This update streamlines protein structure similarity searches through the RCSB Search API. The service identifies proteins with similar three-dimensional shapes using a machine learning–based method that represents macromolecular structures as embeddings in a high-dimensional vector space. Stored in a vector database, these embeddings enable efficient large-scale comparison of 3D structures and improve sensitivity for detecting structural similarity.
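To make the embedding idea concrete, here is a minimal sketch of similarity search over vectors. It is illustrative only and is not the RCSB implementation. The structure names, the 3-dimensional toy vectors, and the helper functions are hypothetical, and cosine similarity is an assumed metric, since the announcement does not state which one the service uses. A production vector database would typically replace the brute-force scan below with an approximate nearest-neighbor index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_similar(query, index, k=2):
    """Brute-force search: score every stored vector, return the k best."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical toy "embeddings"; real ones would have many more dimensions
# and would be produced by a trained model, not written by hand.
index = {
    "structure_A": [0.9, 0.1, 0.0],
    "structure_B": [0.8, 0.2, 0.1],
    "structure_C": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]

results = top_k_similar(query, index, k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

Because each structure is reduced to a fixed-length vector ahead of time, a query only requires vector comparisons rather than pairwise structural alignment, which is what makes this style of search scale to large collections.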
For methodological details, see the associated publication describing the embedding-based approach: doi.org/10.1093/bioinformatics/btag058 https://doi.org/10.1371/journal.pcbi.1007970.
What’s New
Protein-focused searches
The service now focuses exclusively on protein chains and assemblies. Searches for nucleic acids are no longer supported. For assemblies containing both proteins and nucleic acids, only the protein chains are considered during similarity comparison.
New API parameters
Additional API parameters are now available to control method-specific behavior. Please review the API reference https://search.rcsb.org/redoc/index.html#tag/Search-Service/operation/runJsonQueriesPost to ensure the parameters are configured appropriately for your workflow.
Updated client library
A new version of the rcsb-api library has been released with support for the updated 3D similarity search functionality. See the library documentation https://rcsbapi.readthedocs.io/en/latest/search_api/query_construction.html#structure-similarity-search for usage examples.
Known limitations
Known limitations are described in the corresponding RCSB documentation page https://www.rcsb.org/docs/search-and-browse/advanced-search/3d-similarity-search#limitations-of-3d-similarity-search.
Full documentation and example queries are available in the RCSB API documentation https://search.rcsb.org/#search-api.
Regards,
Yana Rose
——————————————————————
Yana Rose, Ph.D.
Scientific Software Developer & Project Manager
RCSB Protein Data Bank
University of California San Diego
10100 Hopkins Dr, La Jolla, CA 92093