PISA: Protein Interfaces, Surfaces and Assemblies
http://www.ebi.ac.uk/msd-srv/prot_int/pistart.html
Eugene Krissinel, keb@ebi.ac.uk
CCP4 & EBI-MSD
PISA is a tool for the assessment of macromolecular interactions using data provided by protein crystallography.
Scope of tasks addressed by PISA:
• identification and prediction of multimeric states
• analysis of structure-function relationship
• analysis and prediction of macromolecular interactions
• analysis of macromolecular complexation and crystallisation
• properties of macromolecular interfaces
• search for interface/structure/assembly homologues
• active site recognition and analysis
• macromolecular surface analysis
• other
Project started in 2004, supported by BBSRC research grant 721/B19544
PISA today
Web service hosted by EBI-MSD at http://www.ebi.ac.uk/msd-srv/prot_int/pistart.html
• provides PISA analysis for all PDB entries and database searches
• allows upload of PDB and mmCIF files for interactive PISA analysis
• provides XML download of multimer data, which is used in server applications (BALBES) for molecular replacement
• works on amino acid, nucleic acid and ligand structures
• more than 140,000 external queries served since the release
• more than 1700 users
• has a command-prompt stand-alone version
PISA basics
PISA is based on chemical thermodynamics: ΔGdiss > 0 for stable structures in the standard state.
ΔGdiss cannot be calculated exactly. PISA uses semiempirical models with parameters calibrated to available experimental data on multimeric states.
• Precision of free energy estimates in PISA: ±5 kcal/mol
• Success rate of PQS prediction: 80-90%
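The quoted precision matters when a free energy estimate is read as a stability call. The sketch below is only an illustration, not PISA's code. It assumes the criterion is a positive dissociation free energy, and it treats any estimate within ±5 kcal/mol of zero as undecided. The function name and the three labels are hypothetical.

```python
# Precision of the free energy estimates quoted on the slide (kcal/mol).
DG_PRECISION = 5.0


def classify_stability(dg_diss: float, precision: float = DG_PRECISION) -> str:
    """Label an assembly from its estimated dissociation free energy.

    Assumption: an assembly is stable when dg_diss > 0. Estimates closer
    to zero than the stated precision are reported as "borderline"
    because the sign of the true value is then uncertain.
    """
    if dg_diss > precision:
        return "stable"
    if dg_diss < -precision:
        return "unstable"
    return "borderline"
```

With this reading, an estimate of +3 kcal/mol is not a confident prediction of a stable assembly, even though its sign is positive.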
Last year activity – nucleic acids and ligands
• Extension to include protein-DNA/RNA and ligand interactions
  - Derivation and calibration of interaction parameters
  - Database of ligand interactions (~6000 entries parameterized on atomic level)
  - Tools for database update and semi-automatic calculation of protein-ligand interactions
• Core algorithm completely rewritten in order to:
  - implement changes needed to adopt protein-DNA/RNA and ligand interactions
  - optimize and speed up the calculations
Last year activity – ligand control
• Control over ligand processing:
  - Possibility to exclude certain ligands from processing
  - Choice of ligand processing modes:
    - Automatic
    - Fix all ligands
    - Free all ligands
Last year activity – adaptation for MSD & PDB
• Interface and presentation improvements at request of PDB/MSD curation teams:
  - Consistent identification of symops in PISA pages
  - Adoption of PDB@RCSB symop nomenclature
  - Automatic generation of REMARK 350
  - Optimization of final assembly positions
  - Reporting on redundant assemblies (especially when ASU contains a fractional number of assemblies >1)
• PISA is now employed by both MSD and PDB@RCSB as a mandatory processing tool for all depositions
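In a PDB file, REMARK 350 carries the biological assembly as a set of BIOMT transformation records. The sketch below shows what writing one such operator might look like. It is not PISA's generator. The column layout follows the usual wwPDB template as understood here and should be checked against the format specification. The function name is hypothetical.

```python
def biomt_lines(serial, rotation, translation):
    """Format one symmetry operator as three REMARK 350 BIOMT records.

    serial      -- operator number within the assembly (1, 2, ...)
    rotation    -- 3x3 rotation matrix as nested sequences
    translation -- 3-vector in angstroms
    The field widths are an assumption based on the common template.
    """
    lines = []
    for row in range(3):
        r = rotation[row]
        lines.append(
            "REMARK 350   BIOMT%d%4d%10.6f%10.6f%10.6f%15.5f"
            % (row + 1, serial, r[0], r[1], r[2], translation[row])
        )
    return lines


# Identity operator: the assembly contains the deposited chains as they are.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
records = biomt_lines(1, identity, [0.0, 0.0, 0.0])
```

A full REMARK 350 block also carries header lines (assembly number, chains the operators apply to), which are omitted here.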
Last year activity - PISA database
• PISA database searches by
  - Multimeric state
  - Symmetry number
  - Space group
  - Homomeric type
  - Salt bridges
  - Disulphide bonds
  - List of ligands
  - List of keywords
  - Dissociation energy
  - Assembly ASA
  - Assembly BSA
  - Percent BSA
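A search over these fields amounts to filtering assembly records by a combination of criteria. The sketch below illustrates the idea on an in-memory list. It does not reflect the actual PISA database schema or query interface. The record type, field names, entry identifiers and values are all made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AssemblyRecord:
    """Hypothetical record covering a few of the searchable fields."""
    entry_id: str
    multimeric_state: int        # number of monomeric units
    space_group: str
    dissociation_energy: float   # kcal/mol


def search(records: List[AssemblyRecord],
           multimeric_state: Optional[int] = None,
           space_group: Optional[str] = None,
           min_dissociation_energy: Optional[float] = None) -> List[AssemblyRecord]:
    """Return the records matching every criterion that is not None."""
    hits = []
    for rec in records:
        if multimeric_state is not None and rec.multimeric_state != multimeric_state:
            continue
        if space_group is not None and rec.space_group != space_group:
            continue
        if (min_dissociation_energy is not None
                and rec.dissociation_energy < min_dissociation_energy):
            continue
        hits.append(rec)
    return hits


# Illustrative placeholder data, not taken from any real entry.
demo = [
    AssemblyRecord("ent1", 2, "P 21 21 21", 12.0),
    AssemblyRecord("ent2", 4, "P 21 21 21", 30.5),
    AssemblyRecord("ent3", 2, "C 2", 3.1),
]
dimers = search(demo, multimeric_state=2, min_dissociation_energy=5.0)
```

Criteria left at `None` are ignored, so the same function serves single-field and combined searches.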
Last year activity - standalone PISA
• Command-prompt, stand-alone PISA for inclusion into CCP4
  - Contains only data-processing part of “big” PISA, i.e. no database
  - For technical reasons, there are code differences from “big” PISA
  - Functionally identical to the corresponding parts of “big” PISA
  - Mimics web-page output of “big” PISA in plain text
  - Provides same XML output as “big” PISA
  - Works as a local server:
    - Maintains sessions
    - Data processing separated from data retrieval
  - Visualization using Rasmol
Standalone PISA example
Last year activity – tune-up and polishing
• The last percent of improvement takes 99% of all effort
• Literally hundreds of small problems solved on a daily basis. Examples:
  - Inference of correct orthogonalisation codes
  - Choice of margins for identification of parallel monomeric units
  - Symmetry number calculations: superposition margins and unit enumeration order
  - Identification and proper treatment of overlapping symmetry mates with fractional occupancy
  - Unique labels for the download/visualisation data and wait pages to avoid caching on remote servers
  - Catching up with EBI systems updates
• PISA is roughly 60,000 C++ statements, and small bugs are most probably still there
• 6 releases over the year
Future plans
• PISA is systematically underperforming on FABs. Possible reasons:
  - Neglect of electrostatic interactions
  - Neglect of entropy absorbance in flexible complexes
• Both are very difficult problems to address
• The last percent of improvement takes 99% of the effort
Future plans
• Analysis of “custom” assemblies
  - Allow for input without crystallographic data
  - Effectively, this includes NMR entries as well
• Detection of “custom” assemblies
  - Allow for report on specific assemblies otherwise missed as unstable
• Automatic prediction of macromolecular interactions and assemblies by homologue search in PISA database
• Assessment of crystal “quality”
  - Identification of fake PDB entries and depositions
Fake PDBs
Entries: 2i07, 2icc, 2ice, 2icf, 2hr0
BSA: 20%, 51%, 24%, 10%
Interfaces per chain: 9.56, 7, 6.12, 8.33, 3.5
Min. interfaces / chain: 8, 7, 4, 4, 2
Connected crystal: Yes, Yes, Yes
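The packing indicators named in this table (interfaces per chain, minimum interfaces per chain, crystal connectivity) can be illustrated on a simplified contact graph. The sketch below is only an illustration, not PISA's procedure. It assumes every chain copy is listed explicitly and every interface is a pair of chain labels, which ignores lattice periodicity. All names are hypothetical and no decision thresholds are implied.

```python
from collections import deque


def packing_stats(chains, interfaces):
    """Return (mean interfaces per chain, minimum per chain, connected?).

    chains     -- list of chain labels (each copy listed explicitly)
    interfaces -- list of (chain_a, chain_b) pairs, one per interface
    """
    counts = {c: 0 for c in chains}
    neighbours = {c: set() for c in chains}
    for a, b in interfaces:
        counts[a] += 1
        counts[b] += 1
        neighbours[a].add(b)
        neighbours[b].add(a)

    mean = sum(counts.values()) / len(counts)
    minimum = min(counts.values())

    # Breadth-first search: the packing is connected if every chain can be
    # reached from the first one by walking across interfaces.
    seen = {chains[0]}
    queue = deque([chains[0]])
    while queue:
        current = queue.popleft()
        for nxt in neighbours[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    connected = len(seen) == len(chains)
    return mean, minimum, connected


# Toy example: chain D touches nothing, so the packing is not connected.
loose = packing_stats(["A", "B", "C", "D"], [("A", "B"), ("B", "C")])
tight = packing_stats(["A", "B", "C", "D"], [("A", "B"), ("B", "C"), ("C", "D")])
```

A chain with no interfaces, or a packing that splits into disconnected pieces, cannot hold a crystal together, which is why such indicators are of interest when screening depositions.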