GIST analysis of MD simulations with restrained solute

Majda 28/6-19

This document describes how to analyze the structure and thermodynamics of water around a protein–ligand binding site (or a region of interest around any solute), using grid inhomogeneous solvation theory (GIST), implemented in the cpptraj module of the Amber software. More about the method can be found in this paper:

Nguyen, Kurtzman Young, & Gilson, 2012

It is assumed that the protein is setup following the instructions found here .

To perform a GIST analysis you will need a MD generated trajectory with explicit water in which the solute was kept restrained during the simulation, whereas the water molecules were allowed to move. Usually, 10 independent 10 ns long MD simulations are performed for a single conformation of the solute (100 ns in total). The conformation can be taken directly from the crystal structure or obtained by clustering unrestrained MD trajectories and performing the restrained MD simulations on the selected clusters. The clustering can be done with cpptraj using the following commands:

cpptraj < cluster.ing

cluster.ing
parm prmtop
trajin unrest_mdcrd
cluster C1 hieragglo epsilon 3 rms :LIG out cluster_numb_vs_frame summary summaryfile info infofile clusterout cluster
go
For other options and clustering algorithms check the Amber manual.

Each studied conformation is then solvated in an octahedral box of water molecules extending at least 10 Å from the solute using the tleap module:

tleap -f leap.in

leap.in
source leaprc.ff14SB
source leaprc.gaff2
loadAmberPrep lig.in
loadAmberParams gal3.dat
loadAmberParams frcmod.tip4pew
loadOff TIP4PEWBOX
WAT = T4E
HOH = T4E
x=loadpdb conf1.pdb
solvateOct x TIP4PEWBOX 10
saveamberparm x prmtop prmcrd
savepdb x prot.pdb
quit

And 10 independent 10 ns restrained MD simulations are run using the following input for sander:
sander.in1
Restrained minimization
  &cntrl
   irest=0,ntx=1,
   imin=1,maxcyc=1000,drms=0.0001,ntmin=2,
   ntc=2,ntf=2,
   cut=8.0,
   ntpr=100,ntwx=0,ntwv=0,ntwe=0,
   ipol=0,igb=0,
   ntr=1,restraint_wt=1000,
   restraintmask="!:WA= & !@H="
  &end

sander.in2
Restrained NVT equilibration
  &cntrl
   irest=0,ntx=1,ig=-1,
   nstlim=10000,dt=0.002,
   temp0=300.0,ntt=3,gamma_ln=2.0,
   ntc=2,ntf=2,
   cut=8.0,
   ntpr=500,ntwx=0,ntwv=0,ntwe=0,
   ntb=1,
   ipol=0,igb=0,
   ntr=1,restraint_wt=1000,
   restraintmask="!:WA= & !@H="
  &end

sander.in3
Restrained NPT equilibration
  &cntrl
   irest=1,ntx=5,
   nstlim=10000,dt=0.002,
   temp0=300.0,ntt=3,gamma_ln=2.0,
   ntc=2,ntf=2,
   cut=8.0,
   ntpr=500,ntwx=0,ntwv=0,ntwe=0,
   ntb=2,ntp=1,pres0=1.0,taup=1.0,
   ipol=0,igb=0,
   ntr=1,restraint_wt=1000,
   restraintmask="!:WA= & !@H="
  &end

sander.in4
Restrained equilibration 1 ns
  &cntrl
   irest=1,ntx=5,
   nstlim=500000,dt=0.002,
   temp0=300.0,ntt=3,gamma_ln=2.0,
   ntc=2,ntf=2,
   cut=8.0,
   ntpr=500,ntwx=0,ntwv=0,ntwe=0,
   ntb=2,ntp=1,pres0=1.0,taup=1.0,
   ipol=0,igb=0,
   ntr=1,restraint_wt=10,
   restraintmask="!:WA= & !@H="
  &end

sander.in5
Restrained production 10 ns                                                               
  &cntrl                                                                       
   irest=1,ntx=5,                                                              
   nstlim=5000000,dt=0.002,
   temp0=300.0,ntt=3,gamma_ln=2.0,                                             
   ntc=2,ntf=2,                                                                
   cut=8.0,                                                                    
   ntpr=500,ntwx=5000,ntwv=0,ntwe=0,
   ntb=2,ntp=1,pres0=1.0,taup=1.0,                                             
   ipol=0,igb=0,                                                               
   ntr=1,restraint_wt=10,
   restraintmask="!:WA= & !@H="
  &end  

To run the simulations on Aurora, use:
q-md
#!/bin/bash
#SBATCH -N 1
#SBATCH -n 20
#SBATCH -A snic2018-2-24
#SBATCH -t 168:00:00

module purge
module add ifort/2017.4.196-GCC-6.4.0-2.28
module add OpenMPI/2.1.1
module add Amber/16.12-AT-17.08

cd $SLURM_SUBMIT_DIR
mpirun -bind-to core -np 20 pmemd.MPI -O -i sander.in1 -o sander.out1 -p prmtop -c prmcrd  -r mdrest1 -ref prmcrd
mpirun -bind-to core -np 20 pmemd.MPI -O -i sander.in2 -o sander.out2 -p prmtop -c mdrest1 -r mdrest2 -ref prmcrd
mpirun -bind-to core -np 20 pmemd.MPI -O -i sander.in3 -o sander.out3 -p prmtop -c mdrest2 -r mdrest3 -ref prmcrd
mpirun -bind-to core -np 20 pmemd.MPI -O -i sander.in4 -o sander.out4 -p prmtop -c mdrest3 -r mdrest4 -ref prmcrd
mpirun -bind-to core -np 20 pmemd.MPI -O -i sander.in5 -o sander.out5 -p prmtop -c mdrest4 -r mdrest5 -ref prmcrd -x mdcrd5


GIST analysis

GIST is implemented in the cpptraj module of the Amber software. In order to perform GIST analysis, the restrained MD trajectories should first be imaged and aligned to a reference structure, based on which the coordinates of the selected region of interest were determined. To run gist, use:

cpptraj < gist.in

gist.in
parm prmtop
trajin mdcrd5
autoimage
reference reference.pdb
rms !:WAT,LIG reference
gist refdens 0.0332 gridcntr -21.014 11.282 0.088 griddim 60 42 42 out gist.out
go

Reference density (refdens) depends on the water model used. The values for common water models can be found in the Amber manual, together with other options for GIST analysis. The grid region of interest to be analyzed by GIST is determined by the grid center (gridcntr x y z) and grid dimensions (griddim a b c). To obtain these values, one can use a Python script FindCentroid.py . To use the script, you need a pdb file containing the region of interest (region.pdb), which is used by the script to determine the values for a grid that is roughly 5 times the size of the ligand. Usually, a grid that is centered on the ligand and extends at least 3 Å on each side of the ligand is used.

python2.7 FindCentroid.py –i region.pdb
 
Example output:
Centroid is:
(-20.746784615384612, 8.256723076923077, 0.5398923076923079)
Ranges are:
('x:  ', -30.327, 0.0)
('y:  ', 4.232, 10.818)
('z:  ', -2.639, 6.156)
Recommended gist input command given input file (presumed ligand or binding cavity .pdb file)
gist gridspacn 0.5 gridcntr -20.747  8.257  0.540 griddim 152 33 44

Finally, to run gist, use:
q-gist
#!/bin/bash
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -A snic2018-2-24
#SBATCH -t 50:00:00

module purge
module add ifort/2017.4.196-GCC-6.4.0-2.28
module add OpenMPI/2.1.1
module add Ambertools/18.10

cpptraj < gist.in

If you encounter memory issues, add “#SBATCH -c 2” to the q-gist file, to get more than 4500 MB for one task.