PDB Reaches a New Milestone: 200,000+ Entries

Jasmine Young
Wed, Jan 11, 2023 2:55 PM

Dear PDB-l,

Depositors: Download the image at
https://www.wwpdb.org/news/news?year=2023#639b9e337f8444f313d20414,
write the number of structures deposited, and tag us in your photos.

With this week's update, the PDB archive contains a record 200,069
entries. The archive passed 150,000 structures in 2019
<https://www.wwpdb.org/news/news?year=2019#5c8c2db1ea7d0653b99c8774> and
100,000 structures in 2014
<https://www.wwpdb.org/news/news?year=2014#5764490799cccf749a90cdd6>.

Established in 1971, this central, public archive has reached this
critical milestone thanks to the efforts of structural biologists
throughout the world who contribute their experimentally determined
protein and nucleic acid structure data.

wwPDB data centers support online access to three-dimensional structures
of biological macromolecules that help researchers understand many
facets of biomedicine, agriculture, and ecology, from protein synthesis
to health and disease to biological energy. Many milestones have been
reached since the archive released the 100,000th structure in 2014. PDB
data have been seminal in understanding SARS-CoV-2 and have provided the
foundation for the development of AI/ML techniques for predicting
protein structure. The 50th anniversary of the PDB was celebrated
throughout 2021 <https://www.wwpdb.org/pdb50>.

Today, the archive contains more than 3,000,000 files related to these
PDB entries, which together require more than 1,086 Gbytes of storage.
PDB structures contain more than 1.8 billion non-hydrogen atoms.

    Function follows form

In the 1950s, scientists had their first direct look at the structures
of proteins and DNA at the atomic level. Determination of these early
three-dimensional structures by X-ray crystallography ushered in a new
era in biology, one driven by the intimate link between form and
biological function. As the value of archiving and sharing these data
was quickly recognized by the scientific community, the Protein Data
Bank (PDB) was established in 1971 by an international collaboration as
the first open-access digital resource in all of biology, with data
centers located in the US and the UK.

Among the first structures deposited in the PDB were those of myoglobin
and hemoglobin, two oxygen-binding molecules whose structures were
elucidated by Chemistry Nobel Laureates John Kendrew and Max Perutz.
With this week's regular update, the PDB welcomes 266 new structures
into the archive. These structures join others vital to drug discovery,
bioinformatics, and education.

The PDB is growing rapidly, increasing in size ~13% since 2011. In 2022,
an average of 275 new structures were released to the scientific
community each week. The resource is accessed hundreds of millions of
times annually by researchers, students, and educators intent on
exploring how different proteins are related to one another, clarifying
fundamental biological mechanisms, and discovering new medicines.

    Twenty Years of Collaboration

Since its inception, the PDB has been a community-driven enterprise,
evolving into a mission-critical international resource for biological
research. The wwPDB partnership was established in July 2003 with PDBe,
PDBj, and RCSB PDB. Today, the collaboration includes partners BMRB
(joined in 2006) and EMDB (2021).

The wwPDB ensures that these valuable PDB data are securely stored,
expertly managed, and made freely available for the benefit of
scientists and educators around the globe. wwPDB data centers work
closely with community experts to define deposition and annotation
policies, resolve data representation issues, and implement community
validation standards. In addition, the wwPDB works to raise the profile
of structural biology with increasingly broad audiences.

Each structure submitted to the archive is carefully curated by wwPDB
staff before release. New depositions are checked and enhanced with
value-added annotations and linked with other important biological data
to ensure that PDB structures are discoverable and interpretable by
users with a wide range of backgrounds and interests.

wwPDB eagerly awaits the next 100,000 structures and the invaluable
knowledge these new data will bring.

--
Regards,

Jasmine

===========================================================
Jasmine Young, Ph.D.
Biocuration Team Lead
RCSB Protein Data Bank
Research Professor
Institute for Quantitative Biomedicine
Rutgers, The State University of New Jersey
174 Frelinghuysen Rd
Piscataway, NJ 08854-8087

Email: jasmine@rcsb.rutgers.edu
Phone: (848)445-0103 ext 4920
Fax: (732)445-4320

===========================================================
