Re: pdb-l: Errors in "author determined" bio assy?

Dunbrack, Roland
Fri, Jun 14, 2013 9:20 PM

This is a really common problem -- authors saying one thing in their paper and depositing a different biological assembly in the PDB. Usually they just deposit the asymmetric unit as their biological assembly in these cases. For example, 3pc2 is a dimer in the paper but a monomer in both the asymmetric unit and the authors' biological assembly. I have also seen cases where the title of the PDB entry says the protein is a tetramer but the author biological assembly is a dimer.

I have brought this up at the RCSB's Advisory Board meetings, and we have had much discussion. It's my understanding that if there is an "author" biological assembly, then the authors actually had to send a file at some point during the process. It cannot be that they forgot and the PDB put in the asymmetric unit by default. If the authors had provided nothing (at least for entries in the last few years), the PDB would have used the PISA biological assembly instead. In a few cases that I have asked specifically about, where the author BA is the same as the asymmetric unit but the paper says something different, the PDB checked their records and found an email exchange in which they asked the authors "Is the biological assembly the same as the asymmetric unit?" and the authors wrote back "yes."

In our analysis of available annotations (author and PISA), the biological assembly is different from the asymmetric unit MORE THAN 50% of the time (in about half of these cases it is bigger than the asymmetric unit, and in about half it is smaller). The idea that there is any biological meaning to the asymmetric unit per se is unwarranted. The biological assembly is one of the very few pieces of information deposited with a structure that is not the direct result of the crystallographic experiment. It is the first image you see when you go to a PDB entry page, and it is very important for many purposes. It is critical to get it right, and these kinds of mistakes are unnecessary and problematic.
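To make the comparison concrete for a single entry, here is a minimal Python sketch, not the method used in the analysis above. It assumes a legacy PDB-format file whose REMARK 350 block follows the usual layout (BIOMOLECULE, APPLY THE FOLLOWING TO CHAINS, BIOMT1-3 rows). The function names are hypothetical, and the comparison only counts chain copies, so two different assemblies of the same size are not told apart.

```python
def _chain_ids(body):
    """Extract chain IDs from 'APPLY THE FOLLOWING TO CHAINS: A, B' style text."""
    return [c.strip() for c in body.split(":", 1)[1].split(",") if c.strip()]


def assembly_chain_copies(pdb_text, biomolecule=1):
    """Count chain copies in one REMARK 350 biomolecule (chains x operators)."""
    blocks = []     # each block: [chain IDs, set of BIOMT operator serials]
    current = None  # biomolecule number currently being read
    for line in pdb_text.splitlines():
        if not line.startswith("REMARK 350"):
            continue
        body = line[10:].strip()
        if body.startswith("BIOMOLECULE:"):
            # Assumption: a single integer follows the colon.
            current = int(body.split(":", 1)[1])
        elif current != biomolecule:
            continue
        elif body.startswith("APPLY THE FOLLOWING TO CHAINS:"):
            blocks.append([_chain_ids(body), set()])
        elif body.startswith("AND CHAINS:") and blocks:
            blocks[-1][0].extend(_chain_ids(body))
        elif body.startswith("BIOMT1") and blocks:
            # One BIOMT1 row per operator; its second field is the serial.
            blocks[-1][1].add(int(body.split()[1]))
    return sum(len(chains) * len(ops) for chains, ops in blocks)


def asu_chain_count(pdb_text):
    """Count distinct chain IDs (column 22) among ATOM records."""
    return len({line[21] for line in pdb_text.splitlines()
                if line.startswith("ATOM") and len(line) > 21})


def compare_assembly_to_asu(pdb_text, biomolecule=1):
    """Report whether the assembly has more, fewer, or as many chains as the ASU."""
    n_assembly = assembly_chain_copies(pdb_text, biomolecule)
    n_asu = asu_chain_count(pdb_text)
    if n_assembly > n_asu:
        return "larger"
    if n_assembly < n_asu:
        return "smaller"
    return "same size"
```

One would call compare_assembly_to_asu() on the text of a downloaded entry file. A "same size" result for an entry whose paper describes a larger oligomer is the kind of case discussed in this thread.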

The problem is one of communication between the authors and the PDB. One recent improvement, which I think is already in operation, is that the authors are shown the PISA biological assembly choices and can pick one of those (PISA often has more than one possible assembly).

I had one idea that the PDB has not yet adopted and I would like to know what crystallographers think of it: I want the PDB to collect a statement from the authors on why they think the biological assembly that they deposit is the correct assembly.

Authors often do have experimental data on the size of the assembly (e.g. analytical ultracentrifugation, AUC) or, even better, mutational data or other experiments on the interface(s) that are present in the biological assembly in solution. They may have multiple crystal forms of the protein and see the same assembly in more than one crystal form (we and others have used this as evidence in favor of the biological relevance of interfaces in crystals). Or they may be aware of other proteins in the same family that have the same assembly in their crystal structures in the PDB; by inference, if the same assembly is present in the new structure, it is also likely to be biological. In some cases, the assembly may be simply hypothetical, and that would be fine. It would be nice to know if the deposited assembly is only a guess.

I think this annotation is especially important when no paper is published, but even when one is, it might deter these kinds of mistakes.

What do you think?

Roland


Roland Dunbrack
Professor, Institute for Cancer Research
Fox Chase Cancer Center
Philadelphia PA 19111
http://dunbrack.fccc.edu
http://dunbrack.org

On 06/13/2013 10:39 PM, Eric Martz wrote:

It is my impression that when the "author determined" biological
assembly (REMARK 350) specifies the same structure as the asymmetric
unit, sometimes the authors simply forgot to specify a known assembly
during PDB deposition. Does anyone know of PDB entries where this has occurred?

Thanks, Eric

/* - - - - - - - - - - - - - - - - - - - - - - - - - - -
Eric Martz, Professor Emeritus, Dept Microbiology
U Mass, Amherst -- http://Martz.MolviZ.Org

Top Five 3D MolVis Technologies http://Top5.MolviZ.Org
FirstGlance: 3D Views in Nature Structure - http://firstglance.jmol.org
3D Wiki with Scene-Authoring Tools http://Proteopedia.Org
Biochem 3D Education Resources http://MolviZ.org
ConSurf - Find Conserved Patches in Proteins: http://consurf.tau.ac.il
Atlas of Macromolecules: http://atlas.molviz.org
Interactive Molecules in Public Spaces http://MolecularPlayground.Org
Workshops: http://workshops.molviz.org

- - - - - - - - - - - - - - - - - - - - - - - - - - - */

