Crystallographic methods and software development in a high-throughput environment

Project: Research project (funded)

Project Details


The plans are as follows:

Task 1: Molecular Replacement (MR).
1. Implement communication tools using XML tags and test them as a proof of principle for automated MR.

2. Improve analysis of the crystal contents to determine whether several molecules are present and, if so, use the self-rotation function to decide whether multimers are present. Then use this information in the decision-making procedure during MR.

3. Analyse the PDB for intensity curves, find the optimal curve to describe the current experiment and use it to derive normalisation coefficients.
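
Item 2 above concerns estimating how many molecules the crystal contains. One standard first estimate, not necessarily the method used in this project, is the Matthews coefficient V_M = V_cell / (Z x n x MW), where Z is the number of symmetry operators and n the number of copies in the asymmetric unit. The sketch below lists the copy numbers that give a plausible V_M. The function names, the accepted V_M window (1.7 to 3.5 cubic Angstrom per Dalton) and the example numbers are assumptions chosen for illustration.

```python
# Illustrative sketch only: function names and thresholds are assumptions,
# not taken from MOLREP or BALBES.

def matthews_coefficient(cell_volume, n_sym_ops, n_copies, mol_weight):
    """V_M in cubic Angstrom per Dalton for n_copies molecules per asymmetric unit."""
    return cell_volume / (n_sym_ops * n_copies * mol_weight)


def solvent_fraction(vm):
    """Common approximation for protein crystals: 1 - 1.23 / V_M."""
    return 1.0 - 1.23 / vm


def plausible_copy_numbers(cell_volume, n_sym_ops, mol_weight,
                           vm_min=1.7, vm_max=3.5, max_copies=20):
    """Return (copies, V_M, solvent fraction) for every copy number whose
    V_M falls inside the assumed plausible window [vm_min, vm_max]."""
    results = []
    for n in range(1, max_copies + 1):
        vm = matthews_coefficient(cell_volume, n_sym_ops, n, mol_weight)
        if vm_min <= vm <= vm_max:
            results.append((n, vm, solvent_fraction(vm)))
    return results


# Hypothetical crystal: 400000 cubic Angstrom cell, 4 symmetry operators,
# 20 kDa protein.
candidates = plausible_copy_numbers(400000.0, 4, 20000.0)
for copies, vm, solvent in candidates:
    print(f"{copies} copies: V_M = {vm:.2f}, solvent = {solvent:.0%}")
```

When more than one copy number passes this test, the estimate alone cannot decide between them, which is where additional evidence such as the self-rotation function becomes useful.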

Task 2: COOT:
1. Implement fully automated model building using both fast methods and intensive methods. Fast methods will be used from within the COOT graphical model building software, and intensive methods within automated procedures.

2. Improve automated rebuilding tools in COOT for MR and binding study problems, and implement stand-alone versions. Initial tools for automated rebuilding in MR should have been released by the start of the grant; however, automated loop rebuilding and iteration with phase improvement will require further work.

3. Fill in the remaining gaps in the functionality of the COOT graphical model building software to allow it to become a full, free and modern replacement for existing software. This includes various manual and assisted model building tools and user interface improvements.

Key findings

Almost all of the grant objectives have been completed. The outcome of each objective of the original proposal is addressed below:

Task 1: Molecular replacement
3 objectives fully met, 1 partially met, 1 in progress (of 6).
Communication between programs: Inter-program communication has been implemented and demonstrated in BALBES, a highly automated molecular replacement pipeline. The system takes only the sequence(s) and experimental data and solves a crystal structure automatically using a hierarchically organized database of protein structures, their domains and multimers. MOLREP, SFCHECK and REFMAC have been modified to report the state of their processes in XML format. SFCHECK reports vital information about the crystal, such as pseudo-translation, twinning, data completeness and anisotropy. This information is used for decision-making during molecular replacement.
Utilisation of the contents of crystal: If the sequence or the search model is given, MOLREP analyses the contents of the asymmetric unit of the crystal under study and suggests the number of molecules. If the BALBES database contains multimers homologous to the given sequence, they are tried as search models with higher priority. Thus, if the asymmetric unit contains a multimer, the likelihood of finding it increases, which in turn increases the chance of solving the structure. Work is under way on analysing the self-rotation function and using it to generate multimers without the help of the database.
Weighting of X-ray terms: Analysis of intensity curves for domains, subunits, multimers and whole crystals is under way; these will be classified and used in weighting the X-ray terms.
Location and use of pseudo-symmetry: The use of pseudo-symmetry in molecular replacement has been implemented in the software ZANUDA, which is available on the webserver and deals with the problems that arise when pseudosymmetry and pseudorotation are present at the same time.
Use of information about crystal imperfection: Use of twinning so far has not improved the results of molecular replacement but work is continuing.
Fast communication with the refinement program – REFMAC: This has also been implemented as part of the BALBES pipeline, described above. Refmac is now used as a test of the quality of molecular replacement solutions from within the molecular replacement calculation.
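
The XML status reports described under "Communication between programs" above can be illustrated with a minimal sketch. The actual XML schema written by SFCHECK, MOLREP and REFMAC is not given here, so the element names, attributes, thresholds and action labels below are all hypothetical; the sketch only shows how a pipeline might read such a report and turn it into decisions.

```python
import xml.etree.ElementTree as ET

# Hypothetical report: the tag and attribute names are invented for
# illustration and do not reproduce the real SFCHECK output.
EXAMPLE_REPORT = """
<sfcheck_report>
  <pseudo_translation detected="true"/>
  <twinning detected="false"/>
  <completeness percent="97.5"/>
  <anisotropy ratio="1.2"/>
</sfcheck_report>
"""


def parse_report(xml_text):
    """Extract the crystal properties from the (hypothetical) XML report."""
    root = ET.fromstring(xml_text)
    return {
        "pseudo_translation": root.find("pseudo_translation").get("detected") == "true",
        "twinning": root.find("twinning").get("detected") == "true",
        "completeness": float(root.find("completeness").get("percent")),
        "anisotropy": float(root.find("anisotropy").get("ratio")),
    }


def plan_mr(report, min_completeness=90.0, max_anisotropy=2.0):
    """Map report values to pipeline actions. The thresholds are assumed."""
    actions = []
    if report["pseudo_translation"]:
        actions.append("use_pseudo_translation")
    if report["twinning"]:
        actions.append("flag_twinning")
    if report["completeness"] < min_completeness:
        actions.append("warn_low_completeness")
    if report["anisotropy"] > max_anisotropy:
        actions.append("apply_anisotropy_correction")
    if not actions:
        actions.append("default_search")
    return actions


report = parse_report(EXAMPLE_REPORT)
print(plan_mr(report))
```

Passing structured reports rather than free-text logs means that each downstream step can test named values directly instead of parsing program output.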

Task 2: COOT
6 objectives fully met, 1 partially met, 1 met by an external project (of 8).
Further stabilisation and portability work. Recent versions (0.3.x and 0.4.0 pre-release) are substantially more stable. The code is regularly tested at user workshops, and problems reported by users on the mailing list are dealt with quickly. Emsley has developed an extensive automated testing suite for new releases.
Filling in gaps in the current functionality. Tools to fit helices and strands both at a specified position and by fast search of the whole map have been implemented. A tool for sequence assignment has been implemented, using a faster version of code developed for the 'buccaneer' model building software. Development continues in this area.
Providing interaction with new packages. Shelx reflection files and CNS map files are now handled automatically (a CNS reflection file interchange format has also been developed). The Python scripting interface has developed dramatically and is now used by the PHENIX project for interaction with Coot. The Refmac interface has been improved, and Refmac is also launched to calculate phases if none are present in the input data.
Improving the automatic rebuilding tool for MR problems. 'Fit protein' and 'Stepped refine' tools have been implemented and are available in the 'extensions' menu; these allow automated real space refinement and rebuilding of an MR model into a difference map. Two different loop building tools have been implemented, one using internal code and the other making use of the external ARP/wARP software.
Extending the automatic rebuilding tool to perform automatic building of experimentally phased maps. A tool for automatically building multiple residues onto an existing chain has been implemented. Fully automated building into an empty map is currently handled by the external 'buccaneer' software.
Improve the manual building tools for difficult cases, in particular low resolution maps. The behaviour of the add-terminal-residue and side chain fitting tools has been substantially improved at low resolution by introducing a real-space refinement step; this will become the default behaviour in future versions. Planarity restraints on peptide groups have also been implemented. Torsion angle and chiral volume restraints are in development.
Extending the existing validation methods to provide automatic and semi-automatic validation procedures. The range of validation tools has been extended to include a new B-factor analysis, peptide bond geometry analysis, checks on modelled water molecules, NCS analysis, chiral volume warnings, and quality indicators from Molprobity. Warnings from Refmac are also available through the user interface.
User interface. Bernhard Lohkamp has made major improvements to the user interface. He has also done a lot of work on portability and the Python scripting interface. He has developed a 'preferences' window to clarify the many configuration options for the user, and has implemented docking of some of the dialog and tool windows generated by the program.
Coot has been cited more than 3000 times since 2004 (Emsley & Cowtan, 2004, Acta Cryst. D60, 2126-2132), and is currently downloaded about 60 times per day.
Effective start/end date: 1/12/05 to 14/04/09