Standard free energy of binding with Amber11

This document describes how one can calculate the standard (absolute) binding free energy of a protein-ligand complex using Amber 11 or later.

Note: at the moment of writing this, the simulations will not work on Alarik, due to Amber-related problems with the gfortran compilers. When I checked in the Amber mailing list last, they were working on the problem.

Samuel Genheden, 2012

Prerequisite

You need to have:

a leap prep-file with file ending .in, e.g. hux.in. The residue name should be the same as the filename, but in uppercase.
a pdb-file of the complex with hydrogen atoms.

We will assume that the protein is described by Amber99SB and GAFF force field for the protein and ligand, respectively. Water molecules will be described using TIP3P.
If the system requires additional parameters, commands to load these should be added to a file called leap_lines if the automatic setup scripts will be used

Making prmtop and prmcrd-files

Create a file that is called leapcom_ti and looks similar to this
addpath /away/bio/Data/Amber/Tip3p_boxes/ source leaprc.ff99SB loadamberparams gaff.dat loadamberprep hux.in loadoff TIP3PBOX10 x=loadpdb ace-hux.pdb bond x.37.SG x.64.SG bond x.329.SG x.429.SG solvateoct x SBOX 10 saveamberparm x ace-hux10.prm ace-hux10.rst savepdb x prot_temp.pdb x=copy HUX solvateoct x SBOX 10 saveamberparm x hux10.prm hux10.rst savepdb x lig_temp.pdb quit

Change ace and hux to the name of your protein and ligand, respectively. Change the cystein-bridges as well.

Execute tleap

tleap -f leapcom_ti

Delete the ligand from prot_temp.pdb and lig_temp.pdb.

sed -i "/HUX/d" prot_temp.pdb sed -i "/HUX/d" lig_temp.pdb

Create a new file called leapcom_ti2 and it should look similar to this
source leaprc.ff99SB loadamberparams gaff.dat loadamberprep hux.in x=loadpdb prot_temp.pdb bond x.37.SG x.64.SG bond x.329.SG x.429.SG solvateoct x TIP3PBOX 10 1000 saveamberparm x ace10.prm ace10.rst x=loadpdb lig_temp.pdb solvateoct x TIP3PBOX 10 1000 saveamberparm x wat10.prm wat10.rst quit

Execute tleap

tleap -f leapcom_ti2

Now you have two sets of prmtop and prmcrd-files, one for the starting state (V0), i.e. with ligand, and one for the end state (V1), i.e. without the ligand. However, the coordinates in V0 and V1 for the non-unique atoms is not the same. This must be fixed.

Execute:

/away/bio/Dg_tools/rm_coords.py ace-hux10.rst 8350 8390 > ace10.rst /away/bio/Dg_tools/rm_coords.py hux10.rst 1 41 > wat10.rst

/away/bio/Dg_tools/rm_coords.py is a script that removes a range of coordinates from a restart file. The first argument is the restart file to manipulate, and the second and third arguments are the first atom and the last atom to remove, respectively. The new file is written to standard output.

There is a script /away/bio/Dg_tools/make_parm_std.bash that takes two parameters, the protein name and the ligand name in lowercase. It will setup up 10 independent prmtop and prmcrd-files according to the commands above. If the system requires additional parameters, these should be added to a file called leap_lines, e.g.

cat << EOF > leap_lines loadamberprep ca.in loadamberparams ca.dat EOF

The automatic script will write out the first and last ligand atom, as well as the length of the ligand.

Making restraint file

Now you need to find three protein atoms and three ligand atoms that will define the restraints of the ligand. The first protein atom should be close to the center of mass of the protein. This atom can be found with changepdb and the command CEN.

Make a folder called Rest and copy /away/bio/Dg_tools/*templ to it.
Goto the Rest folder and execute

/away/bio/Dg_tools/find_rest.pl 2329 1797 6282 6996 6980 6989 ../ace-hux1.prm ../ace-hux1.rst ace

The first three numbers are the protein atom numbers and the next three are the ligand atom numbers. The last string is just for book-keeping.

In this folder there is a filed called ti_rest that defines the restraints and will be used in the simulations. Check carefully this file so that all values are filled out, i.e., so there are no r0=,k0= strings.

The /away/bio/Dg_tools/find_rest.pl script assumes that ptraj is in $PATH

Setting up simulation input-files

Make a folder Ti_sim at an appropriate place. Move the prmtop, prmcrd, ti_rest files to this folder.

Now we should make rather many input files for sander. To our help we have the script make_tiinput_std.py

The script assumes that there is a file called amb_masks. The first line should contain the Amber mask of the ligand, and the second line should contain an empty string. E.g.
cat << EOF > amb_masks ": HUX" "" EOF

Next, copy the file /away/bio/Dg_tools/cluster_amblib to the current folder. This contains information about how to execute sander on different clusters.

Now, execute:

/away/bio/Dg_tools/make_tiinp_std.py

and answer the questions, appropriately.
Copy the prmtop, prmcrd, ti_rest, the group-files, the input-files, the queue-scripts, and sub.bash to a cluster of choice. Execute sub.bash on the cluster.

Cluster library

The last task above, will create queue script that is appropriate for a specific cluster. The script that generates these scripts (make_tiinp_std.py) uses a file called cluster_amblib to know cluster-specific commands. One can edit this file manually, to support additional clusters. An example is shown below for the Alarik cluster at Lunarc.

!/bin/bash #SBATCH -N 1 #SBATCH -n 8 #SBATCH -t WWW:00:00 module add gcc/4.6.2 module add openmpi/1.4.5/gcc/4.6.2 export AMBERHOME=/lunarc/nobackup/projects/bio/Amber/Amber11/ export PATH=$AMBERHOME/exe:$PATH sbatch mpirun -np NN sander.MPI 8 %end

The first line should start with a percentage sign (%) followed by the name of the cluster.

The cluster specification should end with the line %end.

The last three lines of the specifications are special. First the command used on the cluster to add jobs to the queue are specified (e.g. sbatch, second the command used to execute the parallel version of sander, and third the number of processors to use.

All other lines in the specification are appended to the start of the queue script and should contains queue-specific lines (e.g. starting with #SBATCH), modules that need to be loaded and paths that need to be set. The AMBERHOME path must be set. The WWW mask is special and will be replaced by an estimated walltime (in hours).