Corrections for FEP calculations that change the net charge
i.e. corrections for the periodic MD simulations with Ewald summation

This document describes how to correct free energy calculations performed when the net charge of the perturbated system changes, using periodic boundary conditions in the MD simulations.

The description is based on the article
Rocklin et al., "Calculating the binding free energies of charged species based on explicit-solvent simulations employing lattice-sum methods: An accurate correction scheme for electrostatic finite-size effects" J. Chem. Phys. 139, 2013.

The implementation for Amber is described in:
M. A. Olsson, Alfonso T. García-Sousa, U. Ryde (2017) "Binding affinities of the farnesoid X receptor in the D3R Grand Challenge 2 estimated by free-energy perturbation and docking" J. Comput.-Aided Mol. Des., submitted 1/6-17.

There are some sample files in /temp4/bio/Fep_misc/Ch_corr/Alf and template input files are in /temp4/bio/Fep_misc/Ch_corr.

Alfonso T. Garcia-Sosa, Dec. 2016; Updated by UR and Martin Olsson, Feb. 2017.




Procedure

  1. Run the FEP simulations with a changed charge with a cubic (not rectangular) periodic box and constant volume.
    This is obtained by tleap, using the command:
    solvateBox x TIP3PBOX 10 iso
    Use the TIP3P water model.

  2. Copy the needed apbs template files:
    cp /temp4/bio/Fep_misc/Ch_corr/* .

  3. Start with a structure of the protein-ligand complex, preferably the prmtop file and the final mdrest file. The coordinate file should be wrapped.
    Build a pqr file for this complex with changeparm
    changeparm <<EOF
    prot.prm
    pqr
    m
    prot-1.00-mdrest5-wrap
    p
    0
    cc.pqr
    q
    EOF
    This example, defined PARSE radii, without any added solvent probe.

  4. Find the size of the simulated box with changepdb:
    changepdb <<EOF
    cc.pqr
    xyz
    q
    EOF
    Values after "Diff = " are the box extent in the x, y and z directions.

  5. Also find the charge of the protein and the ligand, as well as the number of water molecules, e.g. by changepdb or changeparm.

  6. Strip water molecules from the cc.pqr file:
    grep -v WAT cc.pqr >c.pqr

  7. Divide the file into one with the protein and one with the ligand, either by hand or by proper grep commands, e.g.
    grep L12 c.pqr >l.pqr
    grep -v L12 c.pqr | grep -v L41 >p.pqr

  8. Construct pqr files with zeroed charges of the protein or the ligand with changepdb:
    changepdb <<EOF
    l.pqr
    pqr
    l.pqr
    z
    wp
    l0.pqr
    q
    EOF
    changepdb <<EOF
    p.pqr
    pqr
    p.pqr
    z
    wp
    p0.pqr
    q
    EOF

  9. Now we can construct the final pqr files with simple cat commands:
    cat l.pqr >lig_only.pqr
    cat p0.pqr l.pqr >lig_in_prot.pqr
    cat p.pqr l0.pqr >prot_only.pqr

  10. Check that the glen values are large enough to enclose the entire protein in the two apbs_*.in files.
    Then, run apbs twice with the commands (has to be done locally; they take one or a few minutes each):
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/temp4/bio/APBS/APBS-1.5/lib
    apbs apbs_hom.in >apbs_hom.out
    apbs apbs_het.in >apbs_het.out

  11. Run four python calculations to get the RIP files (change the values in red to the proper charges of the protein and the ligand)
    python calculate_RIP_het.py protein_RIP_het.dx -10 >protein_RIP_het.out
    python calculate_RIP_het.py ligand_RIP_het.dx -1 >ligand_RIP_het.out
    python calculate_RIP_hom.py protein_RIP_hom.dx -10 >protein_RIP_hom.out
    python calculate_RIP_hom.py ligand_RIP_hom.dx -1 >ligand_RIP_hom.out

  12. Extract the result from these files:
    grep RIP *RIP*out | grep kJ

  13. Insert all results (protein and ligand charges, box dimensions, #water, and the four energies from the previous command) into the template file in
    /temp4/bio/Fep_misc/Ch_corr/charge-corr-fep-template.xlsx

  14. Repeat the procedure for the simulation of the ligand free in water, also in a cubic box (in a separate directory):

  15. Copy the needed files
    cp /temp4/bio/Fep_misc/Ch_corr/apbs* /temp4/bio/Fep_misc/Ch_corr/cal* .

  16. Build a pqr file for this simulation with changeparm
    changeparm <<EOF
    wat.prm
    pqr
    m
    wat-1.00-mdrest5-wrap
    p
    0
    cc.pqr
    q
    EOF
    This example, defined PARSE radii, without any added solvent probe.

  17. Find the size of the simulated box with changepdb, xyz command
    Also the number of water molecules.

  18. Strip water molecules from the cc.pqr file:
    grep -v WAT cc.pqr >c.pqr

  19. Remove the other ligand, e.g.
    grep L12 c.pqr >l.pqr

  20. Construct a pqr file with zeroed charges of the ligand with changepdb:
    changepdb <<EOF
    l.pqr
    pqr
    l.pqr
    z
    wp
    l0.pqr
    q
    EOF

  21. Now we can construct the final pqr files with simple cat commands:
    \cp l.pqr lig_only.pqr
    \cp l.pqr lig_in_prot.pqr
    \cp l0.pqr prot_only.pqr
    This and the following command is not ideal for the ligand-only calculation (i.e. some calculations are made in vain), but rather to avoid any changes in the apbs input files.

  22. Run apbs twice with the commands):
    apbs apbs_hom.in >apbs_hom.out
    apbs apbs_het.in >apbs_het.out

  23. Run two python calculations to get the RIP files (change the ligand charge in red)
    python calculate_RIP_het.py ligand_RIP_het.dx -1 >ligand_RIP_het.out
    python calculate_RIP_hom.py ligand_RIP_hom.dx -1 >ligand_RIP_hom.out

  24. Extract the result from these files:
    grep RIP *RIP*out | grep kJ

  25. Insert the results box dimensions, #water, and the two energies into the template file, which now gives you the final results.


Running APBS

Below are sample apbs input files.
In the /temp4/bio/Fep_misc/Ch_corr directory, there are two such files, which only differ in the output file name and in the line
sdie 97.0
which is instead 1.0 in the homogenous case.


#
#############################################################################
Heterogenous PB calculations to calculate the residual integrated
# potentials I_P and I_L.
#
# Input files:
# prot_only.pqr - The protein-ligand complex, but with only the protein
# atoms charged.
# lig_in_prot.pqr - The protein-ligand complex, but with only the ligand
# atoms charged.
# lig_only.pqr - The ligand only, in the same coordinates as the above.
# Used for centering the box only.
#
# Output files:
# protein_RIP.dx - The potential grid based on prot_only.pqr
# ligand_RIP.dx - The potential grid based on lig_in_prot.pqr
#
#############################################################################
# READ IN MOLECULES
read
mol pqr prot_only.pqr
mol pqr lig_in_prot.pqr
mol pqr lig_only.pqr
end

# CALCULATE POTENTIAL WITH ONLY THE PROTEIN CHARGES
elec name prot_only
mg-manual
dime 257 257 257
glen 120 120 120
gcent mol 3
mol 1
lpbe
bcfl mdh
pdie 1.0
sdie 97.0
chgm spl4
srfm smol
srad 1.4
swin 0.3
sdens 40.0
temp 300.0
calcenergy no
calcforce no
write pot dx protein_RIP
end

# CALCULATE POTENTIAL WITH ONLY THE LIGAND CHARGES
elec name lig_only
mg-manual
dime 257 257 257
glen 120 120 120
gcent mol 3
mol 2
lpbe
bcfl mdh
pdie 1.0
sdie 97.0
chgm spl4
srfm smol
srad 1.4
swin 0.3
sdens 40.0
temp 300.0
calcenergy no
calcforce no
write pot dx ligand_RIP
end

# SO LONG
quit
##########################################################

Substitute both glen lines in the apbs.in file to the length of the box in Angtroms.

Apbs is run by the command:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/temp4/bio/APBS/APBS-1.5/lib
/temp4/bio/APBS/APBS-1.5/bin/apbs apbs.in

If all is well, this will create files protein_RIP.dx and ligand_RIP.dx which are potential grids.

If the program does not complete and an error is produced about malloc (memory allocation), then either run the calculations on a different machine (preferred-option), or modify the apbs.in file line for number of grid points per processor to dime 161 (less-preferred option as lower resolution or accuracy may result, see http://www.poissonboltzmann.org/docs/apbs-overview.


Formulae

These calculations correct the obtained change in free energy in periodic box conditions ("raw" ΔΔGMD,PBC) with the correction terms ΔΔGANA and ΔΔGDSC, to give ΔΔGMD,NBC, which is independent of box size and corrects for several effects arising from net charges in periodic boxes:

ΔΔGMD,NBC = ΔΔGMD,PBC + ΔΔGNET + ΔΔGUSV + ΔΔGRIP + ΔΔGEMP + ΔΔGDSC

where
ΔΔGNET is a correction for periodicity-induced net-charge interactions,
ΔΔGUSV is a correction for periodicity-induced net-charge undersolvation,
ΔΔGRIP is a correction for residual integrated potential effects,
ΔΔGEMP is an empirical term to reproduce the exact analytical result in the special case of a single point charge at the center of a spherical cavity, and
ΔΔGDSC is a correction for discrete solvent effects.

IL,SLV = IL - IL_hom                                                                (The solvent contribution to IL)
RL = {IL,SLV/[(1/(8πε0)) * (4π/3) * (1 - 1/εs)|QL|]}1/2          (The effective solvation radius RL)
ΔΔGRIP = [(IP + IL)(QP + QL) - IPQP]/(L3)
ΔΔGEMP = - (1/(8πε0)) * (16π2/45) * (1 - 1/εs) * [(QP + QL)2 - Q2P] * RL5/L6

ΔΔGNET + ΔΔGUSV = ΔΔGNETs
ΔΔGNET = - (ξLS/8πε0) * [(QP + QL)2 - Q2P]/L
ΔΔGDSC = - ((γsQL)/6ε0) * (Ns/L3)

Here,
QP is the net charge of the protein
QL is the net charge of the protein
L is the length of the box in nm
Ns is the number of solvent molecules in the box

From the protein_RIP_het.out file, use the reported RIP as potential IP.
From the ligand_RIP_het.out file, use the reported RIP as potential IL.
From the ligand_RIP_hom.out file, use the reported RIP as potential IL_hom.

Constants:
ε0 is the permittivity of vacuum (the term 1/(4πε0) is 138.93545585 kJ nm e-2 mol-1).

ξLS Cubic lattice-sum (Wiegner) integration constant. Unitless. ξLS = -2.837297.

εs is the static relative dieletric permittivity of the medium, for TIP3P = 97
γs is the quadrupole-moment trace of the solvent model relative to its single van der Waals interaction site. For TIP3P, it is 0.00764 e nm2; note that the article gave a 10 times too large value.


Prerequisites

You need to have:

The equations used as in Rocklin et al. JCP 2013 are derived for a cubic box of water molecules using TIP3P, that has been run at constant volume.
If a different water model is used, then the quadrupole-moment trace, gamma, must be changed in the xlsx file. It is easily calculated from (Eqn. 29 in the article):
gamma_s = Sum(q_i r_i^2)


Making .pqr files

APBS requires the input file .in, in addition to structure files of the complex in .pqr format (related to .pdb format, with charges and radii as last columns instead of occupancies and B-factors). Thus, the .pqr format is of the form: Field_name Atom_number Atom_name Residue_name Chain_ID Residue_number X Y Z Charge Radius. The format is free so the fields must be separated by at least one space.

Our changepdb and changeparm software can write out (and read) pqr files from pdb, prmtop, mdrest, and mdcrd files, constructing various sets of atomic radii.

Three pqr files are needed: lig_in_prot.pqr, prot_only.pqr, and lig_only.pqr.
The should contain all the same coordinates and radii (from the protein-ligand complex).
None of them contain any water molecules.

lig_only.pqr contains only the ligand atoms and with full charges.
lig_in_prot.pqr contains both protein and ligand atoms, but with zeroed protein charges.
prot_only.pqr also has the both protein and atoms, but now the ligand charges are zeroed.

Create a file that is called complex.pdb with protein and ligand atoms from the charged state of the trajectory, without water molecules

Either:

Run changepdb to include charges and radii and output a .pqr file

OR

Upload the .pdb to the PDQ2PQR server at http://nbcr-222.ucsd.edu/pdb2pqr_2.1.1