Free Energy of Interaction with Solinprot


This document describes how to calculate the free energy of interaction between the QM and the MM part of a system optimized with ComQum. I will describe how to prepare the needed input files, and how to run the program. For more details on solinprot, such as file formats and available flags, as well as information about the other MEAD programs, please check the README file.

Here is a list of the involved steps:

  1. Calculate esp charges for the QM part.
  2. Create a pdb file with charges (qm_chg.pdb), of the QM part.
  3. Create a pdb file of the whole system (qmmm.pdb) including all explicit water molecules.
  4. Identify water molecules that "interact" with the QM part, and remove all other water molecules (qmmm.pdb => qmmm_nowat.pdb).
  5. Translocate the system to the cap center, and find the cap radius (qmmm_nowat.pdb => qmmm_trans.pdb).
  6. Add radii and create a pqr file (qmmm_trans.pdb => qmmm.pqr).
  7. Remove the QM part, and add the correct charges (qmmm.pqr => mm.pqr).
  8. Translocate the QM pdb file, add radii, and create a pqr file (qm_chg.pdb => qm.pqr).
  9. Create an ogm file and run solinprot.

1) Calculate esp charges

We begin by obtaining charges for the QM part. Use the following G98 template for this. Ulf has a program that creates such G98 input from Turbomole files, but you are also welcome to use this little awk utility (coord2xyz.awk), which transforms a Turbomole coordinate file to an xyz file that you in turn can read into the G98 template. Note that the keyword NoSymm in the template is absolutely mandatory - without it you will get quite peculiar charges!

2) Create a pdb file for the QM part

Now you must create a pdb file of the QM part and insert the esp charges in this file. I have also made an awk utility for this purpose (get_chg.awk). This script extracts the esp charges from the G98 output and puts them in a format intended for pasting with a pdb file with one title line. The pdb file (without charges) can be created with Ulf's program mimic, which converts a Turbomole coordinate file to a pdb file, or with this awk script (xyz2pdb.awk), which converts an xyz file to a pdb file. The pdb file with charges can then simply be created with paste -d" " pdb_nochg chargefile > pdb_withchg. Ulf's program changepdb might be able to accomplish the same as the combination of get_chg.awk and paste, so check with him if you are looking for an alternative option.

3 and 4) Create a pdb file for the whole system, and identify potentially important explicit waters

We now need a pdb file of the whole system. Use Ulf's program changeparm to obtain a pdb file from the amber topology file for the whole system. Use the resulting pdb file to identify water molecules that interact with the QM part. That is, identify water molecules that have one atom within for example 2 Å of the QM system. For this purpose you will need a visualization program with reasonable intelligent selection commands. I have used Pymol, which has very flexible selection options, but I presume that Rasmol is equally well suited for this task.

If you have relaxed some of the MM part in the ComQum calculations, you will need to investigate each of the systems that you want to compare. The systems will most likely not be identical with respect to potentially important waters, so the trick is to find a common number of water molecules to keep. I am not sure whether it is completely necessary to keep the number of waters constant among the systems, but I think that this procedure minimizes the risk of introducing a bias, when removing waters.

As you may have guessed by now, I have of course made a simple awk script (remwat.awk), which removes unwanted waters from a pdb file. It is not really an ideal solution, as you have to edit the script for each new pdb file, but it is still much simpler, faster, and safer than editing the pdb file itself. Ulf's program changepdb might be able to do the same, so check that if you don't like this solution.

5, 6, and 7) Translocate the system and create a pqr file

Use Ulf's program changepdb to translocate all the coordinates, so that the Cartesian origin is the cap center. Use command "m" and subcommand "p". Write down the translocation vector, as you will need to translocate the QM part with this vector too. Now use command "me" to add radii and create a pqr file.

The pqr file that you have created contains the QM part, which you will now need to remove. Yes, I have also made an awk script for this purpose (remqm.awk). This script is perhaps not the most elegant solution, as you are forced to write the array of QM atoms into the script. On the other hand, you only have to do this once for every series of systems you are studying. In case you were wondering, I remove the QM part after I have made the pqr file, because in this way changepdb assigns all radii correctly.

Now we need to adjust the charges of the QM-MM intersecting residues. This is most easily achieved by using the pointcharges file from the ComQum calculation, because this file contains exactly the charges we want. First we strip the charges from the pointcharges file and put them in a format suited for pasting with the pqr file:

awk 'NR==1 {printf ("\n")} NR>1 && NR<=nr_atm+1 {printf ("%7.4f\n",$4)}' pointcharges > mm.chg
nr_atm+1 is the number of atoms in the pqr file plus 1 to account for the keyword line in the pointcharges file. Adjust the first print statement to reflect the number of title lines in the pqr file. Here is a combination of cut and paste commands that gives the final pqr file:
cut -c 64- mm.pqr > radii
cut -c -54 mm.pqr > mm_coord.pqr
paste -d" " mm_coord.pqr mm.chg radii > protein.pqr

My run script for solinprot expects to find a file called protein.pqr, which contains the MM part! Note, you will only have to create mm.chg and radii once for each series of systems that you want to compare, as only the coordinates of the MM part change among the systems.

8) Create a pqr file for the QM part

Use Ulf's program changepdb to translocate all coordinates for the QM part with the vector you identified earlier. Add radii and create the pqr file. There is a little quirk with solinprot. No atoms within a given residue number may have identical atom labels, so I have made a script (prep_pqr.awk), which creates unique labels for all atoms.

9) Create an ogm file and start the job

The ogm file contains specifications of the grid center, the grid dimension, and the grid spacing. If you have translocated the coordinates of the system as described above, then the grid center must be (0, 0, 0). The grid dimension is an odd integer, which specifies the number of grid points along the edge of the grid, and the grid spacing is the distance in Å between the grid points. You can use a series of grids with increasing "resolution" in the calculation. As I understand it, the grids should become increasingly focused on the system, that is they should extend less and less into empty space. However, the finest grid must still extend beyond the system. I have tried to make the finest grid extend ca. 5 Å beyond the system. Here is the ogm file I used for a system with a cap radius of 47 Å:
(0,0,0) 201 1
(0,0,0) 241 0.6
(0,0,0) 351 0.3
351 grid points is the highest number of points that I have been able to make our 1GB RAM machines handle!

solinprot expects that the ogm file and the pqr file for the QM part have an identical prefix, and my run script expects that the pqr file for the MM part is called protein.pqr. The run script displaces the grid in positive and negative direction along each Cartesian axis with a floating point offset, which must be given as the second argument to the run script. I have normally used half of the finest grid spacing as the offset (i.e. 0.15 in the example above). Here is then finally the syntax for starting a job:

run_solinprot ogm_file offset
The run script in /home/bio/Bin uses the following default values (see the README file for a description of the various flags):
-epsin1 1.0
-epsin2 4.0
I have made a script (get_sol.awk) that extracts the name of the job, the finest grid level, and all the energies from the output file.

Happy computing! :-)


Created by Torben Rasmussen, Nov-2003