Enriched chemical component files at PDBe

David Armstrong
Thu, Jan 19, 2023 3:10 PM

For users looking to access data on the small molecules in the PDB
archive, the wwPDB chemical component dictionary (CCD) provides detailed
information about the geometry and linkage of these ligands. To offer
even more detail about these molecules, the PDBe team has created a
process that generates enriched, updated versions of these CCD files
containing additional data.

These files are available through the PDBe FTP area at the following URL:
http://ftp.ebi.ac.uk/pub/databases/msd/pdbechem_v2/
with the folders constructed based upon the first character of the CCD
ID, followed by the full length CCD ID, for example:
http://ftp.ebi.ac.uk/pub/databases/msd/pdbechem_v2/S/STI/STI.cif
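
The folder layout described above can be sketched as a small Python
helper. This is only an illustration: the function name is hypothetical,
and converting the ID to upper case is an assumption based on the STI
example rather than a documented rule.

```python
# Base of the PDBe FTP area for enriched CCD files (taken from the URL above).
BASE_URL = "http://ftp.ebi.ac.uk/pub/databases/msd/pdbechem_v2"


def enriched_ccd_url(ccd_id: str) -> str:
    """Build the URL of the enriched CCD file for a CCD ID (hypothetical helper)."""
    # Assumption: folder and file names use the upper-case CCD ID, as in STI.
    ccd_id = ccd_id.strip().upper()
    if not ccd_id:
        raise ValueError("CCD ID must not be empty")
    # Layout: <first character>/<full CCD ID>/<full CCD ID>.cif
    return f"{BASE_URL}/{ccd_id[0]}/{ccd_id}/{ccd_id}.cif"


print(enriched_ccd_url("STI"))
```

The resulting URL can then be passed to any HTTP client, for example
urllib.request.urlretrieve from the Python standard library.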

The additional data in the PDBe updated CCD files includes links to
external databases generated through the UniChem service, including
identifiers for ChEMBL, ChEBI, BindingDB, KEGG and more. This process
also generates more extensive synonym information from these related
databases. There is also a mapping to DrugBank IDs, providing
information on each drug's classification and its protein targets.

Data is also provided on chemical structure, with information about
Murcko scaffolds and fragments for the molecule. We also provide a
number of physicochemical properties through these files, based on
analysis with the RDKit (https://www.rdkit.org/) software. These
properties include the number of rotatable bonds, the number of
hydrogen bond donors and acceptors, and many more.

Finally, we make available the 2D atom coordinates and bond orders
required to create standardised images of the molecule. This information
is used to display ligands in the 2D ligand interactions component
on the PDBe website (e.g. the ibuprofen binding site in PDB entry 3p6h:
https://www.ebi.ac.uk/pdbe/entry/pdb/3p6h/bound/IBP). The ligand
interactions component displays the ligand's chemical structure,
highlighting interactions with other components in the structure. Users
can also find idealised 3D conformers generated using RDKit in these
updated CCD files.
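
To see which data categories a downloaded file contains, one option is
to list the mmCIF category names it uses. The following is a minimal
sketch using only the Python standard library; the function name is
hypothetical, the sample text is a truncated illustration rather than
real file content, and a dedicated mmCIF parser would be more robust
(for instance, this sketch does not handle multi-line text values).

```python
import re


def list_categories(cif_text: str) -> list[str]:
    """Return mmCIF category names in order of first appearance."""
    seen = []
    for line in cif_text.splitlines():
        # Data item names look like "_category.item"; capture the category.
        match = re.match(r"_([A-Za-z0-9_\[\]]+)\.", line.strip())
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Illustrative, heavily truncated sample; a real file has many more items.
SAMPLE = """\
data_STI
_chem_comp.id STI
_chem_comp.type non-polymer
loop_
_chem_comp_atom.comp_id
_chem_comp_atom.atom_id
STI C1
STI C2
"""

print(list_categories(SAMPLE))
```

Running the same function on the text of an enriched file should list
the additional categories alongside the standard CCD ones.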

Kind Regards,
David Armstrong

--
David Armstrong
Outreach and Training Lead
PDBe
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD UK
