Relative binding affinities using Amber

The procedure has become much simpler with Amber 13, as is described below.
For the old procedure in Amber 11, see below.
ccpbiosim

The sign of the results is that of the reaction from lambda=0 to lambda=1 (i.e. a negative sign means that lambda=1 is more stable).
The lambda=0 state is that defined in timask1.
You should calculate dG(tot) = dG(protein) - dG(water).

Amber 13-

Starting from Amber 13, TI can be performed with pmemd, using a single prmtop file, containing both ligands as separate residues.
There is no need to reorder atoms for the soft core.

Ensure that you have *.in files for the two ligands and a pdb file with the protein and the two ligands (prot.pdb).
Atoms that should be common for the two ligands need to be perfectly overlapping (these atoms are identified by the coordinates, not by atom names, which may be different).
Set up the protein as described in the equilibration home page.
Run tleap with the pdb file with protein+ligand1+ligand2.
Save the files as prot.prm and prot.rst
Remove the protein from the pdb file (wat.pdb) and run tleap again with this file.
Save the files as wat.prm and wat.rst
A template leap.in file is given below.
Run mkfep-pmemd to set up all the needed files for the FEP calculation.
Answer the following questions (or better put the answers in a fep.in file)
- Length of the equilibration simulation (0.5 ns)
- Length of the equilibration simulation (1.0 ns)
- Number of lambda values (13;
     predefined sequences are
     11: 0, 0.1, ..., 0.9, 1.0
     13: 0, 0.05, 0.1, 0.2, ..., 0.8, 0.9, 0.95, 1.0
     21: 0, 0.05, 0.1, 0.15, ..., 0.85, 0.9, 0.95, 1.0
     if you enter another number, you need to specify the values also
- Name of ligand 1 (3 letters)
- Name of ligand 2 (3 letters)
- Atoms in the soft core of ligand 1 (string of atom names)
- Atoms in the soft core of ligand 2 (string of atom names)
- Residue number of the last crystal-water molecule
- Computer on which the calculations will be run (platon, alarik, or abisko
- Whether you want random velocities or not (useful if you want to generate independent simulations; then, use also different solvent boxes, cf. SIIT)
- Wall-clock-time of prot and wat simulation (on one line, two integers)
Note that the soft-core mask also defines which atoms will have different coordinates for the two ligands.

mkfep-pmemd sets ifmbar=1, bar_intervall=1000,bar_l_min=0.00,bar_l_max=1.00,bar_l_incr=0.05
and
snapshots saved every 1 ps
If you want anything else, you may easily change the code:
cp /temp4/bio/Amber/Gfortran/mkfep-pmemd.f .
gfortran -o mkfep-pmemd mkfep-pmemd.f
Move the files to the remote computer and start all simulations by
. sub.bash

If you have a metal site that requires restraints (e.g. Ca with several water molecules), prepare a file with the restraints and call it rst. mkfep-pmemd will see this file and set up the needed lines in the sander.in files.

Sample rst file:

&rst
   iat=24132,26210,
   r1=0.0,r2=2.4,r3=2.4,r4=100.,
   rk2=50.0,rk3=50.0
&end

&rst iat=0 &end

Amber (owing to some bug) sometimes moves atoms outside the soft-core mask:
     Error : Atom    3325 does not have match in V1 !
This can be fixed by running modrest, command ff on the reOstart file, the prmtop file, and the sander.in file (with soft-core information).

When jobs are finished, calculate free energies with pymbar.
Also use overlap to check that the number of lambda values is proper.

You should always check the RMSD of the ligand compared to the starting structure.
This can be done with the following script:

for x in 0.00 0.05 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 0.95 1.00 ; do
echo prot Lambda = $x
# Wrap file
echo trajin prot-$x.mdcrd4 >ptraj.in
echo trajout prot-$x.mdcrd4-wrap trajectory >>ptraj.in
echo autoimage :138 >>ptraj.in
echo go >>ptraj.in
cpptraj prot.prm ptraj.in
# Calculate RMSD
echo trajin prot-$x.mdcrd4-wrap >rmsd.in
echo reference prot.rst >>rmsd.in
echo rmsd :1-137 reference perres perresout rmsd-$x range 138-139 >>rmsd.in
echo go >>rmsd.in
cpptraj prot.prm rmsd.in
done
done

It does not seem to work properly.
Use changepdb, command rf instead (I will update this program to make it automatic).
/temp3/Ulf/Backup/Spa/Hf/Red/Std/Cm

Energy decomposition with Amber is done by setting idecomp=1 in the input file and directly after &end (residue numbers):
Residues considered for decomp
RRES 1 6228
END
Residues to print
RES 1 6228
END
END

Possibly also:
Residues considered as LIG
LRES 9 14

No:         | ERROR:   PMEMD 14 does not support idecomp != 0!

Energy components can be calculated with changeparm, command lim(2), see below.

Sample leap.in file

source leaprc.ff14SBsource leaprc.gaffaddPath ../LeaploadAmberPrep gu1.inloadAmberPrep gu2.in
loadAmberParams gu.dat x = loadpdb prot.start solvateOct x TIP3PBOX 10 saveamberparm x prot.prm prot.rstsavepdb x prot.pdb
x = loadpdb wat.start solvateOct x TIP3PBOX 10 saveamberparm x wat.prm wat.rstsavepdb x wat.pdb
quit

You can rotate the molecule before the solvation with (before solvateOct): transform x { {1 0 0 }{00.342020 -0.939693}
{0 0.939693 0.342020}}
This is a rotation of 70 degrees. This worked well for the protein, but not for the free ligand; use SIIT instead.

Sample sander.in4 file

TI 4 NPT production, lambda=0.00
&cntrl
irest=1,ntx=5,
nstlim= 500000,dt=0.002,
temp0=300.0,ntt=3,gamma_ln=2.0,
ntc=2,ntf=1,
cut=8.0,
ntb=2,ntp=1,pres0=1.0,taup=1.0,
ntpr=5000,ntwx=5000,ntwv=0,ntwe=0,
icfe=1,clambda=0.00,ifsc=1,
timask1=":GU1",scmask1="GU1@C8,H81,H82,H83",
timask2=":GU2,"scmask2="GU2@H8"
ifmbar=1, bar_intervall=5000,bar_l_min=0.05,bar_l_max=0.95,bar_l_incr=0.05,
&end

Sample fep.in file

2
4
13
GU2
GU1
C8,H81,H82,H83
H5
b

ccpbiosim

A general program has been developed as a cooperational project in UK to set up TI and MM/PBSA calculations in Amber (among other programs).
It is described in

http://ccpforge.cse.rl.ac.uk/gf/project/ccpbiosim/wiki/

CalcLIM

Relative free energies between two ligand can be calculated "by hand" as a postprocessing of Amber mdcrd files with the program changeparm, commands
LIM (new FEP format with a single set of prmtop and mdcrd files)
LIM2 (old FEP format with two sets of prmtop and mdcrd files)

Unfortunately, in a test of these programs, the free energies differed appreciably between when calculated by Amber (with periodic boundary) and when calculated by changeparm (without periodicity), up to 12 kJ/mol.

These programs also give energy components and can optionally read QM data.

Input (for LIM):
changeparm <<EOF
prot.prm
lim
c
prot-$x.mdcrd4-wrap
n
prot-$x-sander.in4
13
$x
mbar-new-prot.$xx
comp-prot.$xx
q
EOF

Input for LIM2
prmtop_0
prmtop_1
sander.in4
#lambda
lambda
mdcrd
mbar
comp

You first need to wrap the files and finally to run the program components to add all the components for the various lambda values

This is a script to do all calculations (with lim):

for a in prot wat ; do
for x in 0.00 0.05 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 0.95 1.00 ; do
xx=$(echo $x | cut -c 1,3-4 )
echo $a Lambda = $x

echo trajin $a-$x.mdcrd4 >ptraj.in
echo trajout $a-$x.mdcrd4-wrap trajectory >>ptraj.in
echo autoimage :138 >>ptraj.in
echo go >>ptraj.in
cpptraj $a.prm ptraj.in

changeparm <<EOF
$a.prm
lim
c
$a-$x.mdcrd4-wrap
n
sander-$x.in4
13
$x
mbar-new-$a.$xx
comp-$a.$xx
q
EOF

done
done

components <<EOF >comp.out
13
comp-prot
prot.pdb
prot-comp.pdb
wat.pdb
wat-comp.pdb
lig-comp.pdb
EOF

overlap <<EOF >olap-new
13
mbar-new-prot
EOF

Single topology atoms

This section shows how to set up single-topology relative free-energy calculations with Amber, using dummy atoms and sander software (in the end to use QM/MM calculations).

Set up Amber prep (topology) files for the two ligands of interest (called v0.in and v1.in in the following). If the number of atoms differ, use dummy atoms. Ensure that the corresponding atoms come in the same order in the two files.
Set the ligand names
v0name=GU4
v1name=GU1
Ensure that you have pdb files of the two ligands in the protein. They should have the same coordinates of all atoms and only differ in the atom names of the differing atoms. They are called prot-v0.pin and prot-v1.pin.
sed s/$v0name/$v1name/ prot-v0.pin >prot-v1.pin
Construct the corresponding ligand-free structures by grep out the other ligand from these files:
grep $v0name prot-v0.pin >v0.pin
grep $v1name prot-v1.pin >v1.pin
Run leap with the following leap.in file
\rm leap.log; tleap -f leap.in

source leaprc.ff14SBsource leaprc.gaffAddPath ../LeaploadAmberPrep gu1-resp.inloadAmberPrep gu4-resp.inloadAmberPrep oa8-resp.in#loadAmberParams oa8.datx = loadpdb prot-v0.pinsolvateOct x TIP3PBOX 10saveamberparm x prot-v0.prm prot-v0.rstx = loadpdb prot-v1.pinsolvateOct x TIP3PBOX 10saveamberparm x prot-v1.prm prot-v1.rstx = loadpdb v0.pinsolvateOct x TIP3PBOX 10saveamberparm x wat-v0.prm wat-v0.rstx = loadpdb v1.pinsolvateOct x TIP3PBOX 10saveamberparm x wat-v1.prm wat-v1.rstquit
Build the corresponding pdb files
for p in prot wat ; do for v in v0 v1 ; do echo $p-$v.prm > chprm.in echo p >> chprm.in echo $p-$v.pdb >> chprm.in echo m >> chprm.in echo $p-$v.rst >> chprm.in echo q >> chprm.in changeparm <chprm.in donedone\rm chprm.in
Special action is typically needed for wat-v1.pdb.
For some strange reason, it is not solvated the same way as v0, even if the coordinates are identical.
\cp wat-v0.pdb v1.pin
sed -i s/$v0name/$v1name/ v1.pin
Change the name of the perturbed atom in v1.pin by hand.

Run leap with the following leap.in file
\rm leap.log; tleap -f leap2.in

source leaprc.ff14SBsource leaprc.gaffAddPath ../LeaploadAmberPrep gu1-resp.inx = loadpdb v1.pin setbox x vdw saveamberparm x wat-v1.prm wat-v1.rst quit

changeparm <<EOF
wat-v1.prm pwat-v1.pdb mwat-v1.rst q EOF
Set up the fep.in file and then run
mkfep <fep.in

2
0.5
1.0

":GU1 | :GU5"
-1
":GU5@Cl6 | :GU1@H6"
13
36 20
a
Start the FEP calculations.

Amber 11-12

This guide will show how to set up alchemical perturbation calculation with Amber 11-12 and two separate prmtop files. The perturbation calculations are used to compute the relative free energy between to ligand molecules, typically drug molecules. Perturbation calculations will be done when the ligand is bound to a protein and when it is free in solution.

Samuel Genheden, 2012; Modified by UR 8/5-13

Prerequisite

You need to have:

two leap prep-files with file ending .in, e.g. l01.in and l02.in. It is important the ligand atoms have the same name in both .in-files, apart from the atoms that will be perturbed.
a pdb-file with hydrogen atoms and with the first ligand (l01) bound. It should be named prot-l01.pdb, where prot is the protein name and l01 is the name of the first ligand.

Note!: The direction off the perturbation is important in the setup, due to technical reasons. The first ligand (l01) should have more or equal number of atoms than the second ligand (l02)

Note 2!: The naming of the input files are important if the black-box script should be used

Manual setup

Making perturbations and off-files

Move all prerequisite files to the folder where you will do the calculations (or set up a separate directory with the library files and use addPath below; if so, you only need the input pdb file of the complex with the larger ligandcp ).

First, we will create a number of files that will be used in latter stages of the setup. Create a file called leap_lig.inp that looks similar to this:

source leaprc.gaff loadamberprep l01.in loadamberprep l02.in saveoff L01 a.off saveoff L02 b.off saveamberparm L01 a.prm a.rst saveamberparm L02 b.prm b.rst quit

Change l01 and l02 to the name of your ligands. You might have to load additional parameters and insert an
addPath command.

Execute tleap

tleap -f leap_lig.inp

This execution will create off-files as alternatives to in-files that are easier to manipulate. It will also create prmtop-files for the two ligand that now will be used to extract perturbation information

Execute the following commands:

changeparm << EOF a.prm qfep L01 q EOF

and then for the other ligand

changeparm << EOF b.prm qfep L02 q EOF

The changeparm execution will create *.charges and *.vdws files that will be used to set up perturbation files. The perturbation files will be created with the following command

xxx This step can be improved

mv l01.charges a.chargesmv l02.charges b.chargesmv l01.vdws a.vdwsmv l02.vdws b.vdws /temp4/bio/Dg_tools/make_pertoffs_all.py a b

Answer the questions about corresponding atoms. If an atom does not have a corresponding atom, just press enter.

The script will create an amb_masks file that contains soft-core masks for Amber and a bash-script rename.bash that will be used in a moment. Most importantly, the script will reorder the atoms in the *.off-files such that the unique atoms are last in the ligand, an Amber requirement.
Edit the amb_masks file to change A and B to the correct name of the two residues.

Making prmtop and prmcrd-files

The prmtop and prmcrd-files will be created in two steps to ensure that the dual-topology files are identical apart from the unique atoms. Therefore, start with creating a file called leap_templ.inp that look something like this

source leaprc.ff99SB source leaprc.gaff loadoff a.off x = loadpdb prot-l01.pdb solvateOct x TIP3PBOX 10 saveamberparm x templ-prot.prm templ-prot.rst savepdb x prot-a-solv.pdb x = copy L01 solvateOct x TIP3PBOX 10 saveamberparm x templ-lig.prm templ-lig.rst savepdb x a-solv.pdb quit

Change protein and ligand name, add additional parameters, cys-cys bridges, addPath, and possibly protein *.in and *dat files.

Execute tleap

tleap -f leap_templ.inp

This execution will solvate the complex with the first ligand and the first ligand free in solution.

Next, we copy the solvated pdb-files to pdb-files for the second ligand.

cp prot-a-solv.pdb prot-b-solv.pdb cp a-solv.pdb b-solv.pdb

Then we change the residue name of the ligand in the new pdb-files

sed -i "s/L01/L02/" prot-b-solv.pdb b-solv.pdb

Finally, we will remove atoms that are in the first ligand but not in the second and rename corresponding atoms. This will be done with the bash-script that was created above

bash rename.bash prot-b-solv.pdb bash rename.bash b-solv.pdb This script do not always work properly, especially no for host-guest systems - check the results or better, do it by hand. Then, you can also change the bond length of perturbed atoms.

Now, we can read in all the solvated pdb-files in tleap and create the final prmtop and prmcrd-files that will be used with sander. Create a file called leap_final.inp that looks similar to this.

source leaprc.ff99SBsource leaprc.gaffloadoff a.off loadoff b.off x = loadpdb prot-a-solv.pdb setbox x vdw saveamberparm x prot-a-solv.prm prot-a-solv.rst x = loadpdb a-solv.pdb setbox x vdw saveamberparm x a-solv.prm a-solv.rst x = loadpdb prot-b-solv.pdb setbox x vdw saveamberparm x prot-b-solv.prm prot-b-solv.rst x = loadpdb b-solv.pdb setbox x vdw saveamberparm x b-solv.prm b-solv.rst quit

If needed, add additional parameters, Cys-Cys bridges, addPath, and possibly protein *.in and *dat files.

Execute tleap

tleap -f leap_final.inp

The lines setbox x vdw will set the box of the prmtop-files but it can only recognize rectangular geometry. Therefore, we need to edit the box information.

/temp4/bio/Dg_tools/edit_boxinfo_ti.py templ-prot.prm prot-a-solv.prm prot-a-solv.rst /temp4/bio/Dg_tools/edit_boxinfo_ti.py templ-prot.prm prot-b-solv.prm prot-b-solv.rst /temp4/bio/Dg_tools/edit_boxinfo_ti.py templ-lig.prm a-solv.prm a-solv.rst /temp4/bio/Dg_tools/edit_boxinfo_ti.py templ-lig.prm b-solv.prm b-solv.rst

Setting up simulation input-files

Now we should create sander input files.

cp /temp4/bio/Dg_tools/cluster_amblib . This file contains information about how to execute sander on different clusters.
Check the file that is updated with the proper module information.

Execute

python /temp4/bio/Dg_tools/make_tiinp_rel.py prot a-solv b-solv

and answer the questions about simulation length and lambda-values.
Typical values are 500 ps, 1000 ps, and 0.05 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 0.95.

Copy the prmtop, prmcrd, the group-files, the input-files, the queue-scripts, and sub.bash to a cluster of choice. Execute sub.bash on the cluster.

A black-box script

There exists a script that can do all of the things above in an automated fashion. It will create prmtop files that use the Amber99SB and GAFF force field for the protein and ligand, respectively. Water molecules will be described using TIP3P.

If the system requires additional parameters, these should be added to a file called leap_lines, e.g.

cat << EOF > leap_lines loadamberprep ca.in loadamberparams ca.dat EOF

Make sure that you have the prerequisite files and that they are named appropriately.

Make sure to change all HETATM cards in the pdb-file to ATOM cards, as I have noticed that the ter-command in changepdb works better then.

In an appropriate folder where all the perquisite files and optionally additional parameters and leap_lines are located, execute the following command

/temp4/bio/Dg_tools/prep_tiamb.bash L01 L02 prot-l01.pdb

and substitute the ligand residue names and the complex pdb-file.

Independent simulations

There is also an alternative script


      /temp4/bio/Dg_tools/prep_tiamb_indep.bash

that takes as an extra argument a number of independent simulations and set up that number of prmtop and prmcrd-files.

Additional features

Restrained atoms

One can add information on atoms to be restrained in the input file, by creating a special file that is read by make_tiinp_rel.py

This file should be called buffer and contains Amber belly information. This means that all files starts with ATOM will be assumed to contain two numbers specifying the first and last atom to be restrained, as well as RES that will be assumed to be corresponding residue specifications. E.g.

cat << EOF > bufffer ATOM 1 48 ATOM 147 200 EOF

cat << EOF > buffer RES 1 2 RES 15 18 EOF

Information will be put in the input files for the two last complex simulations (unrestrained equilibration and production). Only heavy atoms will be restrained.

Energy decomposition

One can add informations that will make Amber write out energy decomposition information on a residue-wise level, by creating a special file that is read by make_tiinp_rel.py

This file should be called decomp and should have two lines. The first line should contain two columns, specifying the last residue to be decomposed on in the complex and free-ligand simulation, respectively. The second line should contain the same information. The difference between line 1 and line 2, is that the first line is used for V0 and the second line is used for V1. Use 0 to prevent decomposition. E.g.

cat << EOF > decomp 6567 0 6567 0 EOF

Cluster library

The script that generates the input scripts (make_tiinp_rel.py) also create queue-scripts. It uses a file called cluster_amblib to know cluster-specific commands. One can edit this file manually, to support additional clusters. An example is shown below for the Alarik cluster at Lunarc.

!/bin/bash #SBATCH -N 1 #SBATCH -n 8 #SBATCH -t WWW:00:00 module add gcc/4.6.2 module add openmpi/1.4.5/gcc/4.6.2 export AMBERHOME=/lunarc/nobackup/projects/bio/Amber/Amber11/ export PATH=$AMBERHOME/exe:$PATH sbatch mpirun -np NN sander.MPI 8 %end

The first line should start with a percentage sign (%) followed by the name of the cluster.

The cluster specification should end with the line %end.

The last three lines of the specifications are special. First the command used on the cluster to add jobs to the queue are specified (e.g. sbatch, second the command used to execute the parallel version of sander, and third the number of processors to use.

All other lines in the specification are appended to the start of the queue script and should contains queue-specific lines (e.g. starting with #SBATCH), modules that need to be loaded and paths that need to be set. The AMBERHOME path must be set. The WWW mask is special and will be replaced by an estimated walltime (in hours).