Charge-corrections for FEP

Corrections for FEP calculations that change the net charge
i.e. corrections for the periodic MD simulations with Ewald summation

This document describes how to correct free energy calculations performed when the net charge of the perturbated system changes, using periodic boundary conditions in the MD simulations.

The description is based on the article
Rocklin et al., "Calculating the binding free energies of charged species based on explicit-solvent simulations employing lattice-sum methods: An accurate correction scheme for electrostatic finite-size effects" J. Chem. Phys. 139, 2013.

The implementation for Amber is described in:
M. A. Olsson, Alfonso T. García-Sousa, U. Ryde (2017) "Binding affinities of the farnesoid X receptor in the D3R Grand Challenge 2 estimated by free-energy perturbation and docking" J. Comput.-Aided Mol. Des., submitted 1/6-17.

There are some sample files in /temp4/bio/Fep_misc/Ch_corr/Alf and template input files are in /temp4/bio/Fep_misc/Ch_corr.

Alfonso T. Garcia-Sosa, Dec. 2016; Updated by UR and Martin Olsson, Feb. 2017.

Procedure

Run the FEP simulations with a changed charge with a cubic (not rectangular) periodic box and constant volume.
This is obtained by tleap, using the command:
solvateBox x TIP3PBOX 10 iso
Use the TIP3P water model.
Copy the needed apbs template files:
cp /temp4/bio/Fep_misc/Ch_corr/* .
Start with a structure of the protein-ligand complex, preferably the prmtop file and the final mdrest file. The coordinate file should be wrapped.
Build a pqr file for this complex with changeparm
changeparm <<EOFprot.prmpqrmprot-1.00-mdrest5-wrapp0cc.pqrqEOF
This example, defined PARSE radii, without any added solvent probe.
Find the size of the simulated box with changepdb:
changepdb <<EOFcc.pqrxyzqEOF
Values after "Diff = " are the box extent in the x, y and z directions.
Also find the charge of the protein and the ligand, as well as the number of water molecules, e.g. by changepdb or changeparm.
Strip water molecules from the cc.pqr file:
grep -v WAT cc.pqr >c.pqr
Divide the file into one with the protein and one with the ligand, either by hand or by proper grep commands, e.g.
grep L12 c.pqr >l.pqrgrep -v L12 c.pqr | grep -v L41 >p.pqr
Construct pqr files with zeroed charges of the protein or the ligand with changepdb:
changepdb <<EOFl.pqrpqrl.pqrzwpl0.pqrqEOFchangepdb <<EOFp.pqrpqrp.pqrzwpp0.pqrqEOF
Now we can construct the final pqr files with simple cat commands:
cat l.pqr >lig_only.pqrcat p0.pqr l.pqr >lig_in_prot.pqrcat p.pqr l0.pqr >prot_only.pqr
Check that the glen values are large enough to enclose the entire protein in the two apbs_*.in files.
Then, run apbs twice with the commands (has to be done locally; they take one or a few minutes each):
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/temp4/bio/APBS/APBS-1.5/libapbs apbs_hom.in >apbs_hom.outapbs apbs_het.in >apbs_het.out
Run four python calculations to get the RIP files (change the values in red to the proper charges of the protein and the ligand)
python calculate_RIP_het.py protein_RIP_het.dx -10>protein_RIP_het.outpython calculate_RIP_het.py ligand_RIP_het.dx -1>ligand_RIP_het.outpython calculate_RIP_hom.py protein_RIP_hom.dx -10>protein_RIP_hom.outpython calculate_RIP_hom.py ligand_RIP_hom.dx -1>ligand_RIP_hom.out
Extract the result from these files:
grep RIP *RIP*out | grep kJ
Insert all results (protein and ligand charges, box dimensions, #water, and the four energies from the previous command) into the template file in
/temp4/bio/Fep_misc/Ch_corr/charge-corr-fep-template.xlsx
Repeat the procedure for the simulation of the ligand free in water, also in a cubic box (in a separate directory):
Copy the needed files
cp /temp4/bio/Fep_misc/Ch_corr/apbs* /temp4/bio/Fep_misc/Ch_corr/cal* .
Build a pqr file for this simulation with changeparm
changeparm <<EOF wat.prm pqr m wat-1.00-mdrest5-wrap p 0 cc.pqr q EOF
This example, defined PARSE radii, without any added solvent probe.
Find the size of the simulated box with changepdb, xyz command
Also the number of water molecules.
Strip water molecules from the cc.pqr file:
grep -v WAT cc.pqr >c.pqr
Remove the other ligand, e.g.
grep L12 c.pqr >l.pqr
Construct a pqr file with zeroed charges of the ligand with changepdb:
changepdb <<EOF l.pqr pqr l.pqr z wp l0.pqr q EOF
Now we can construct the final pqr files with simple cat commands:
\cp l.pqr lig_only.pqr\cp l.pqr lig_in_prot.pqr\cp l0.pqr prot_only.pqr
This and the following command is not ideal for the ligand-only calculation (i.e. some calculations are made in vain), but rather to avoid any changes in the apbs input files.
Run apbs twice with the commands):
apbs apbs_hom.in >apbs_hom.out apbs apbs_het.in >apbs_het.out
Run two python calculations to get the RIP files (change the ligand charge in red)
python calculate_RIP_het.py ligand_RIP_het.dx -1>ligand_RIP_het.out python calculate_RIP_hom.py ligand_RIP_hom.dx -1>ligand_RIP_hom.out
Extract the result from these files:
grep RIP *RIP*out | grep kJ
Insert the results box dimensions, #water, and the two energies into the template file, which now gives you the final results.

Running APBS

Below are sample apbs input files.
In the /temp4/bio/Fep_misc/Ch_corr directory, there are two such files, which only differ in the output file name and in the line

sdie 97.0 

      which is instead 1.0 in the homogenous case.

#############################################################################

        

       Heterogenous PB calculations to calculate the residual
      integrated 

      # potentials I_P and I_L. 

      # 

      # Input files: 

      # prot_only.pqr - The protein-ligand complex, but with only the
      protein 

      # atoms charged. 

      # lig_in_prot.pqr - The protein-ligand complex, but with only the
      ligand 

      # atoms charged. 

      # lig_only.pqr - The ligand only, in the same coordinates as the
      above. 

      # Used for centering the box only. 

      # 

      # Output files: 

      # protein_RIP.dx - The potential grid based on prot_only.pqr 

      # ligand_RIP.dx - The potential grid based on lig_in_prot.pqr 

      # 

      #############################################################################

      

      # READ IN MOLECULES 

      read 

      mol pqr prot_only.pqr 

      mol pqr lig_in_prot.pqr 

      mol pqr lig_only.pqr 

      end 

      

      # CALCULATE POTENTIAL WITH ONLY THE PROTEIN CHARGES 

      elec name prot_only 

      mg-manual 

      dime 257 257 257 

      glen 120 120 120 

      gcent mol 3 

      mol 1 

      lpbe 

      bcfl mdh 

      pdie 1.0 

      sdie 97.0 

      chgm spl4 

      srfm smol 

      srad 1.4 

      swin 0.3 

      sdens 40.0 

      temp 300.0 

      calcenergy no 

      calcforce no 

      write pot dx protein_RIP 

      end 

      

      # CALCULATE POTENTIAL WITH ONLY THE LIGAND CHARGES 

      elec name lig_only 

      mg-manual 

      dime 257 257 257 

      glen 120 120 120 

         gcent mol 3 

      mol 2 

      lpbe 

      bcfl mdh 

      pdie 1.0 

      sdie 97.0 

      chgm spl4 

      srfm smol 

      srad 1.4 

      swin 0.3 

      sdens 40.0 

      temp 300.0 

      calcenergy no 

      calcforce no 

      write pot dx ligand_RIP 

      end 

      

      # SO LONG 

      quit

##########################################################

Substitute both glen lines in the apbs.in file to the length of the box in Angtroms.

Apbs is run by the command:

      export
      LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/temp4/bio/APBS/APBS-1.5/lib 

      /temp4/bio/APBS/APBS-1.5/bin/apbs apbs.in

If all is well, this will create files protein_RIP.dx and ligand_RIP.dx which are potential grids.

If the program does not complete and an error is produced about malloc (memory allocation), then either run the calculations on a different machine (preferred-option), or modify the apbs.in file line for number of grid points per processor to dime 161 (less-preferred option as lower resolution or accuracy may result, see http://www.poissonboltzmann.org/docs/apbs-overview.

Formulae

These calculations correct the obtained change in free energy in periodic box conditions ("raw" ΔΔG_MD,PBC) with the correction terms ΔΔG_ANA and ΔΔG_DSC, to give ΔΔG_MD,NBC, which is independent of box size and corrects for several effects arising from net charges in periodic boxes:

ΔΔG_MD,NBC = ΔΔG_MD,PBC + ΔΔG_NET + ΔΔG_USV + ΔΔG_RIP + ΔΔG_EMP + ΔΔG_DSC

where
ΔΔG_NET is a correction for periodicity-induced net-charge interactions,
ΔΔG_USV is a correction for periodicity-induced net-charge undersolvation,
ΔΔG_RIP is a correction for residual integrated potential effects,
ΔΔG_EMP is an empirical term to reproduce the exact analytical result in the special case of a single point charge at the center of a spherical cavity, and
ΔΔG_DSC is a correction for discrete solvent effects.

I_L,SLV = I_L - I_{L_hom} (The solvent contribution to I_L)
R_L = {I_L,SLV/[(1/(8πε₀)) * (4π/3) * (1 - 1/ε_s)|Q_L|]}^1/2 (The effective solvation radius R_L)
ΔΔG_RIP = [(I_P + I_L)(Q_P + Q_L) - I_PQ_P]/(L³)
ΔΔG_EMP = - (1/(8πε₀)) * (16π²/45) * (1 - 1/ε_s) * [(Q_P + Q_L)² - Q²_P] * R_L⁵/_L⁶

ΔΔG_NET + ΔΔG_USV = ΔΔG_NET/ε_s
ΔΔG_NET = - (ξ_LS/8πε₀) * [(Q_P + Q_L)² - Q²_P]/L
ΔΔG_DSC = - ((γ_sQ_L)/6ε₀) * (N_s/L³)

Here,
Q_P is the net charge of the protein
Q_L is the net charge of the protein
L is the length of the box in nm
N_s is the number of solvent molecules in the box

From the protein_RIP_het.out file, use the reported RIP as potential I_P.
From the ligand_RIP_het.out file, use the reported RIP as potential I_L.
From the ligand_RIP_hom.out file, use the reported RIP as potential I_{L_hom}.

Constants:
ε₀ is the permittivity of vacuum (the term 1/(4πε₀) is 138.93545585 kJ nm e^-2 mol^-1).

ξ_LS Cubic lattice-sum (Wiegner) integration constant. Unitless. ξ_LS = -2.837297.

ε_s is the static relative dieletric permittivity of the medium, for TIP3P = 97
γ_s is the quadrupole-moment trace of the solvent model relative to its single van der Waals interaction site. For TIP3P, it is 0.00764 e nm²; note that the article gave a 10 times too large value.

Prerequisites

You need to have:

an APBS input file with termination .in, e.g. apbs.in. The sample input file is provided below.
a pdb-file of the complex with hydrogen atoms and without water molecules.

The equations used as in Rocklin et al. JCP 2013 are derived for a cubic box of water molecules using TIP3P, that has been run at constant volume.
If a different water model is used, then the quadrupole-moment trace, gamma, must be changed in the xlsx file. It is easily calculated from (Eqn. 29 in the article):
gamma_s = Sum(q_i r_i^2)

Making .pqr files

APBS requires the input file .in, in addition to structure files of the complex in .pqr format (related to .pdb format, with charges and radii as last columns instead of occupancies and B-factors). Thus, the .pqr format is of the form: Field_name Atom_number Atom_name Residue_name Chain_ID Residue_number X Y Z Charge Radius. The format is free so the fields must be separated by at least one space.

Our changepdb and changeparm software can write out (and read) pqr files from pdb, prmtop, mdrest, and mdcrd files, constructing various sets of atomic radii.

Three pqr files are needed: lig_in_prot.pqr, prot_only.pqr, and lig_only.pqr.
The should contain all the same coordinates and radii (from the protein-ligand complex).
None of them contain any water molecules.

lig_only.pqr contains only the ligand atoms and with full charges.
lig_in_prot.pqr contains both protein and ligand atoms, but with zeroed protein charges.
prot_only.pqr also has the both protein and atoms, but now the ligand charges are zeroed.

Create a file that is called complex.pdb with protein and ligand atoms from the charged state of the trajectory, without water molecules

Either:

Run changepdb to include charges and radii and output a .pqr file

OR

Upload the .pdb to the PDQ2PQR server at http://nbcr-222.ucsd.edu/pdb2pqr_2.1.1

Corrections for FEP calculations that change the net charge i.e. corrections for the periodic MD simulations with Ewald summation

Formulae

Corrections for FEP calculations that change the net charge
i.e. corrections for the periodic MD simulations with Ewald summation