GIST analysis of MD simulations with restrained solute
Majda 28/6-19
This document describes how to analyze the structure and
thermodynamics of water around a protein–ligand binding site (or a
region of interest around any solute), using grid inhomogeneous
solvation theory (GIST), implemented in the cpptraj module of the
Amber software. More about the method can be found in this paper:
Nguyen, Kurtzman Young, & Gilson, 2012
It is assumed that the protein is setup following the instructions
found
here
.
To perform a GIST analysis you will need a MD generated trajectory
with explicit water in which the solute was kept restrained during
the simulation, whereas the water molecules were allowed to move.
Usually, 10 independent 10 ns long MD simulations are performed for
a single conformation of the solute (100 ns in total). The
conformation can be taken directly from the crystal structure or
obtained by clustering unrestrained MD trajectories and performing
the restrained MD simulations on the selected clusters. The
clustering can be done with cpptraj using the following commands:
cpptraj < cluster.ing
cluster.ing
parm prmtop
trajin unrest_mdcrd
cluster C1 hieragglo epsilon 3 rms :LIG
out cluster_numb_vs_frame summary summaryfile info infofile
clusterout cluster
go
For other options and clustering algorithms check the Amber manual.
Each studied conformation is then solvated in an octahedral box of
water molecules extending at least 10 Å from the solute using the
tleap module:
tleap -f leap.in
leap.in
source leaprc.ff14SB
source leaprc.gaff2
loadAmberPrep lig.in
loadAmberParams gal3.dat
loadAmberParams frcmod.tip4pew
loadOff TIP4PEWBOX
WAT = T4E
HOH = T4E
x=loadpdb conf1.pdb
solvateOct x TIP4PEWBOX 10
saveamberparm x prmtop prmcrd
savepdb x prot.pdb
quit
And 10 independent 10 ns restrained MD simulations are run using the
following input for sander:
sander.in1
Restrained minimization
&cntrl
irest=0,ntx=1,
imin=1,maxcyc=1000,drms=0.0001,ntmin=2,
ntc=2,ntf=2,
cut=8.0,
ntpr=100,ntwx=0,ntwv=0,ntwe=0,
ipol=0,igb=0,
ntr=1,restraint_wt=1000,
restraintmask="!:WA= & !@H="
&end
sander.in2
Restrained NVT equilibration
&cntrl
irest=0,ntx=1,ig=-1,
nstlim=10000,dt=0.002,
temp0=300.0,ntt=3,gamma_ln=2.0,
ntc=2,ntf=2,
cut=8.0,
ntpr=500,ntwx=0,ntwv=0,ntwe=0,
ntb=1,
ipol=0,igb=0,
ntr=1,restraint_wt=1000,
restraintmask="!:WA= & !@H="
&end
sander.in3
Restrained NPT equilibration
&cntrl
irest=1,ntx=5,
nstlim=10000,dt=0.002,
temp0=300.0,ntt=3,gamma_ln=2.0,
ntc=2,ntf=2,
cut=8.0,
ntpr=500,ntwx=0,ntwv=0,ntwe=0,
ntb=2,ntp=1,pres0=1.0,taup=1.0,
ipol=0,igb=0,
ntr=1,restraint_wt=1000,
restraintmask="!:WA= & !@H="
&end
sander.in4
Restrained equilibration 1 ns
&cntrl
irest=1,ntx=5,
nstlim=500000,dt=0.002,
temp0=300.0,ntt=3,gamma_ln=2.0,
ntc=2,ntf=2,
cut=8.0,
ntpr=500,ntwx=0,ntwv=0,ntwe=0,
ntb=2,ntp=1,pres0=1.0,taup=1.0,
ipol=0,igb=0,
ntr=1,restraint_wt=10,
restraintmask="!:WA= & !@H="
&end
sander.in5
Restrained production 10
ns
&cntrl
irest=1,ntx=5,
nstlim=5000000,dt=0.002,
temp0=300.0,ntt=3,gamma_ln=2.0,
ntc=2,ntf=2,
cut=8.0,
ntpr=500,ntwx=5000,ntwv=0,ntwe=0,
ntb=2,ntp=1,pres0=1.0,taup=1.0,
ipol=0,igb=0,
ntr=1,restraint_wt=10,
restraintmask="!:WA= & !@H="
&end
To run the simulations on Aurora, use:
q-md
#!/bin/bash
#SBATCH -N 1
#SBATCH -n 20
#SBATCH -A snic2018-2-24
#SBATCH -t 168:00:00
module purge
module add ifort/2017.4.196-GCC-6.4.0-2.28
module add OpenMPI/2.1.1
module add Amber/16.12-AT-17.08
cd $SLURM_SUBMIT_DIR
mpirun -bind-to core -np 20 pmemd.MPI -O -i sander.in1 -o
sander.out1 -p prmtop -c prmcrd -r mdrest1 -ref prmcrd
mpirun -bind-to core -np 20 pmemd.MPI -O -i sander.in2 -o
sander.out2 -p prmtop -c mdrest1 -r mdrest2 -ref prmcrd
mpirun -bind-to core -np 20 pmemd.MPI -O -i sander.in3 -o
sander.out3 -p prmtop -c mdrest2 -r mdrest3 -ref prmcrd
mpirun -bind-to core -np 20 pmemd.MPI -O -i sander.in4 -o
sander.out4 -p prmtop -c mdrest3 -r mdrest4 -ref prmcrd
mpirun -bind-to core -np 20 pmemd.MPI -O -i sander.in5 -o
sander.out5 -p prmtop -c mdrest4 -r mdrest5 -ref prmcrd -x mdcrd5
GIST analysis
GIST is implemented in the cpptraj module of the Amber software. In
order to perform GIST analysis, the restrained MD trajectories
should first be imaged and aligned to a reference structure, based
on which the coordinates of the selected region of interest were
determined. To run gist, use:
cpptraj < gist.in
gist.in
parm prmtop
trajin mdcrd5
autoimage
reference reference.pdb
rms !:WAT,LIG reference
gist refdens 0.0332 gridcntr -21.014 11.282 0.088 griddim 60 42 42 out gist.out
go
Reference density (refdens) depends on the water model used. The
values for common water models can be found in the Amber manual,
together with other options for GIST analysis. The grid region of
interest to be analyzed by GIST is determined by the grid center (gridcntr
x y z) and grid dimensions (griddim a b c). To obtain these
values, one can use a Python script
FindCentroid.py
. To use the script, you need a pdb
file containing the region of interest (region.pdb), which is used
by the script to determine the values for a grid that is roughly 5
times the size of the ligand. Usually, a grid that is centered on
the ligand and extends at least 3 Å on each side of the ligand is
used.
python2.7 FindCentroid.py –i region.pdb
Example output:
Centroid is:
(-20.746784615384612, 8.256723076923077, 0.5398923076923079)
Ranges are:
('x: ', -30.327, 0.0)
('y: ', 4.232, 10.818)
('z: ', -2.639, 6.156)
Recommended gist input command given input file (presumed ligand or
binding cavity .pdb file)
gist gridspacn 0.5 gridcntr -20.747 8.257 0.540 griddim
152 33 44
Finally, to run gist, use:
q-gist
#!/bin/bash
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -A snic2018-2-24
#SBATCH -t 50:00:00
module purge
module add ifort/2017.4.196-GCC-6.4.0-2.28
module add OpenMPI/2.1.1
module add Ambertools/18.10
cpptraj < gist.in
If you encounter memory issues, add “#SBATCH -c 2” to the q-gist
file, to get more than 4500 MB for one task.