Computing relative free energies with FEP using Q

This guide will show how to set up free energy perturbation (FEP) calculations in Q. The FEP calculations are used to compute the relative free energy between to ligand molecules, typically drug molecules. FEP calculations will be done when the ligand is bound to a protein and when it is free in solution.

This guide will show how to setup the calculations more or less black-box. A detailed explanation of the script can be found here.

A description of the program pre-requisite can be found here.

An additional guide, that will show how to set up these calculations for a number of ligands can be found here.

Samuel Genheden, 2011-2012


  • Prerequisite

    You need to have:

    Note!: The direction off the perturbation is important in the setup, due to technical reasons. The first ligand (l01) should have more or equal number of atoms than the second ligand (l02).
    Note 2!:The naming of the input files are important. For instance the protein and ligand name in the pdb-file should be separated by a dash (-), not an underscore.

    The bash script will create leap files that use the Amber99SB and GAFF force field for the protein and ligand, respectively. Water molecules will be described using TIP3P.
    If the system requires additional parameters, these should be added to a file called leap_lines

    Before we start, make sure that the $QPREP path is set. Locally it should be,

    export QPREP=/away/bio/Q/Q/Utils

  • Creating preparatory files

    If the complex contains ions or co-factors, or if the ligand itself needs additional parameters, the following steps need to be executed.

    First, we will create a file called leap_lines that tells leap to load the additional information. This will be added to the leap input file that the script is created. Here is an example for a Ca2+ ion.

    cat << EOF > leap_lines
    loadamberprep ca.in
    loadamberparams ca.dat
    EOF

    Next, we need to create a qlib file for the ion, which is the corresponding file in Q.

    python $QPREP/in2qlib.py ca.in > ca.qlib

    And then we need to create a qprep_lines file that tells Qprep5 to read this qlib-file

    cat << EOF > qprep_lines
    readlib ca.qlib
    EOF

    Now, we are ready to execute the main script.

  • Executing prep_fepq.bash

    Type:
    bash $QPREP/prep_fepq.bash L01 L02 fxa-l01.pdb 20

    L01 and L02 are the residue names of the two ligands. L01 will be perturbed to L02. fxa-l01.pdb is the name of the protein-ligand pdb-file. The protein name will be taken as the characters before the dash (-). Finally, 20 is the desired size of the water droplet and thereby the simulated system size.

    This script will, in order do:

  • Center the complex on the ligand

  • Solvate the complex using Qprep5

       When this is finnished, there will be text telling if a pdb-file was made successfully. If not, some error occured with QPrep5.

  • Solvate the free ligand using Qprep5

       When this is finnished, there will be text telling if a pdb-file was made successfully. If not, some error occured with QPrep5.

  • Create Amber prmtop and prmcrd files

       When this is finnished, the script will list the prmtop files (one for the complex and one for the ligand). If an error occurred, they will not exists or have zero-size

  • Minimize the complex and the free ligand

  • Create Q prmtop files

       changeparm has a mysterious bug, so sometimes there will be some error messages. Usually, it is nothing to worry about.
       When this is finnished, the script will list the qparams files (one for the complex and one for the ligand). If an error occurred, they will not exists.

  • Create perturbation files

       You will be asked to enter corresponding atoms for atoms that differ between the two ligands. If a corresponding atom do not exists, i.e., it should be perturbed to a dummy atom, just enter blank.

  • Create Q input-files and queue scripts

       You will be asked to enter the length of the simulations, the number of transformations (1 or 2), the lambda-values, and the cluster where the calculations should be executed.

    When this script is finnished, the preparatory files will be in Fep_prep and the files necessary for the simulations in Fep_sim. Copy the files in Fep_sim to a cluster and execute the queue scripts. Usually you only need to execute a bash script that was created in the very last step above.

  • Cluster library

    The last task in the list above, will create queue script that is appropriate for a specific cluster. The script that generates these scripts (make_fepinp.py) uses a file called cluster_lib to know cluster-specific commands. One can edit this file manually, to support additional clusters. An example is shown below for the Akka cluster at HPC2N.

    %Akka
    #!/bin/bash
    #PBS -l nodes=1:ppn=8
    #PBS -l walltime=WWW:00:00
    #PBS -A SNIC014-10-24

    cd $PBS_O_WORKDIR
    module add intel-mpi/3.1.026
    module add openmpi/intel
    export QHOME=/home/g/genheden/pfs/Q/bin/
    export PATH=$QHOME:$PATH
    export PATH=$PBS_O_PATH:$PATH

    qsub
    Qdyn5
    mpirun -np 8 Qdyn5p
    %end

    The first line should start with a percentage sign (%) followed by the name of the cluster.

    The cluster specification should end with the line %end.

    The last three lines of the specifications are special. First the command used on the cluster to add jobs to the queue are specified (e.g. qsub), second the command used to execute the serial version of Qdyn5, and third the command used to execute the parallel version of the same program.

    All other lines in the specification are appended to the start of the queue script and should contains queue-specific lines (e.g. starting with #PBS), modules that need to be loaded and paths that need to be set. The QHOME path must be set. Also, the number of nodes need to be set manually. At the moment, the make_fepinp.py assumes that at least 8 cores are available for execution of the calculations. The WWW mask is special and will be replaced by an estimated walltime (in hours).

  • Calculating the free energies

    This is done with the calc_dg.py script


    Detailed guide to the prep_fepq.bash script

    To perform this guide, one need to complete the steps in the guide above, before the execution of prep_fepq.bash

  • Before we begin preparation, let us create two folders Fep_prep and Fep_sim. Then move all the interesting files to this folder and finally, change to this folder.

    mkdir Fep_prep
    mkdir Fep_sim

    mv *in *dat *lib *lines *frcmod fxa-l01.pdb Fep_prep/
    cd Fep_prep

  • First, the protein--ligand complex is centered on the centre of geometry of the ligand using changepdb. It is also of interest to remove far-away water molecules. The default is to remove any water that is more than R-2 Å, where R is the simulation radius. The pdb filename and the ligand residue name is taken from the command line arguments

    changepdb << EOF > center.log
    fxa-l01.pdb
    m
    l
    L01
    row
    18

    w


    EOF

  • Next, changepdb will be used again to renumber the atoms from 1, and creating a cyx.dat file that can be used in leap to create cys-cys linkages

    changepdb << EOF > cyx.log
    fxa-l01.pdb
    REN
    1
    CYX
    w


    EOF

  • Now, when all the preparatory steps have been done, we will use Qprep5 to solvate the complex and the free ligand

    First, we need to create a qlib file, which is a file, similar to an in-file, that describes the ligand

    python/away/bio/Q/Q/Utils/in2qlib.py l01.in > l01.qlib

    Second, the Amber-formatted pdb-file need to be converted to a Q-formatted pdb file

    python /away/bio/Q/Q/Utils/amb2q.py fxa-l01.pdb > fxa-l01q.pdb

    Then we need to create the input file to Qprep5 from several files. The qprep_lines file is provided by the user.

    cat << EOF > qprep_prot.inp
    readlib /away/bio/Q/Q/Data/qamber95.lib
    readlib /away/bio/Q/Q/Data/qamber95_neu.lib
    readlib l01.qlib
    readprm /away/bio/Q/Q/Data/qamber95.prm
    readprm /away/bio/Q/Q/Data/qgaff.prm
    EOF

    cat qprep_lines >> qprep_prot.inp

    cat << EOF > >> qprep_prot.inp
    readpdb fxa-l01q.pdb
    boundary
    1
    0 0 0
    20
    solvate
    boundary
    20
    1
    HOH
    maketop
    qtop
    writepdb fxa-l01solvq.pdb
    y
    quit
    EOF

    Qprep5 < qprep_prot.inp > qprep_prot.log

    Now, we need to convert the Q-formatted file to Amber format, and we also need to be sure to add TER records.

    python /away/bio/Q/Q/Utils/q2amb.py fxa-l01solvq.pdb > fxa-l01solv.pdb

    changepdb << EOF > ter.log
    fxa-l01solv.pdb
    ter
    1
    w


    EOF

    And now the same for the free ligand:

    grep L01 fxa-l01.pdb > l01.pdb

    python /away/bio/Q/Q/Utils/amb2q.py l01.pdb > l01q.pdb

    cat << EOF > qprep_lig.inp
    readlib /away/bio/Q/Q/Data/qamber95.lib
    readlib /away/bio/Q/Q/Data/qamber95_neu.lib
    readlib l01.qlib
    EOF

    cat qprep_lines >> qprep_lig.inp

    cat << EOF >> qprep_lig.inp
    readprm /away/bio/Q/Q/Data/qamber95.prm
    readpdb l01q.pdb
    boundary
    1
    0 0 0
    20
    solvate
    boundary
    20
    1
    HOH
    maketop
    qtop
    writepdb l01solvq.pdb
    y
    quit
    EOF

    Qprep5 < qprep_lig.inp > qprep_lig.log

    python /away/bio/Q/Q/Utils/q2amb.py l01solvq.pdb > l01solv.pdb


  • The next step is to create the Amber prmtop and prmcrd files. The script is a little bit cumbersome, because we need information from several external files:

    First we create a leapcom file by merging information from several files. leap_lines is provided by the user, and cyx.dat was created above

    cat << EOF > leap_solv.inp
    source leaprc.ff99SB
    loadamberparams gaff.dat
    loadamberprep l01.in
    EOF

    cat leap_lines >> leap_solv.inp

    cat << EOF >> leap_solv.inp
    x = loadpdb fxa-l01solv.pdb
    EOF

    cat cyx.dat >> leap_solv.inp

    cat << EOF >> leap_solv.inp
    solvateCap x TIP3PBOX {0 0 0} 20 1000
    saveamberparm x fxa-l01.prm fxa-l01.rst

    x = loadpdb l01solv.pdb
    solvateCap x TIP3PBOX {0 0 0} 20 1000
    saveamberparm x l01.prm l01.rst
    quit
    EOF

    Finally, we can execute tleap

    tleap -f leap_solv.inp >& leap_solv.log

  • Next, we will perform a short constrained minimization of the water molecules and hydrogen atoms. This can be done with sander

    cat << EOF > sander.in
    Restrained minimization
      &cntrl
         irest=0,ntx=1,
         imin=1,maxcyc=100,drms=0.0001,ntmin=2,
         ntc=2,ntf=2,
         cut=20.0,
         ntpr=100,ntwx=0,ntwv=0,ntwe=0,
         ipol=0,igb=0,ntb=0,
         ntr=1,restraint_wt=100,
         restraintmask="!:WA= & !@H="
       &end
    EOF

    sander -O -i sander.in -o sander.out_prot -p fxa-l01.prm -c fxa-l01.rst -ref fxa-l01.rst -r fxa-l01.mdrest0

    sander -O -i sander.in -o sander.out_wat -p l01.prm -c l01.rst -ref l01.rst -r l01.mdrest0

    Alternatively, the minimization can be done with the nab program nabmin which is a good cost-free alternative:

    Start with creating a pdb-file of the system

    changeparm << EOF > min.log
    fxa-l01.prm
    p
    unmin.pdb
    m
    fxa-l01.rst
    q
    EOF

    then execute the program:

    /away/bio/Q/Q/Utils/nabmin fxa-l01.prm unmin.pdb min.pdb >> min.log

    and finally, convert the pdb-file to an mdrest-file

    changepdb << EOF >> min.log
    min.pdb
    wm
    fxa-l01.mdrest0
    q
    EOF

    And then for the free-ligand

    changeparm << EOF >> min.log
    l01.prm
    p
    unmin.pdb
    m
    l01.rst
    q
    EOF

    /away/bio/Q/Q/Utils/nabmin l01.prm unmin.pdb min.pdb >> min.log

    changepdb << EOF >> min.log
    min.pdb
    wm
    l01.mdrest0
    q
    EOF

  • Now, when we have the minimized systems, we can create the Q prmtop file. First for the complex and then for the free ligand. Unfortunately, there is some mysterious bug in changeparm, so the commands might have to be repeated until the qparm-files exists.

    changeparm << EOF >& create_qparm1.log
    fxa-l01.prm
    qparm
    m
    fxa-l01.mdrest0
    20
    20
    fxa-l01.qparm
    q
    EOF

    changeparm << EOF >& create_qparm2.log
    l01.prm
    qparm
    m
    l01.mdrest0
    20
    20
    l01.qparm
    q
    EOF

  • Now it is time to introduce the second ligand, and create a perturbation file that describes the difference between the two ligands.

    First, we create a dummy prmtop file for the two ligands

    cat << EOF > leap_lig.inp
    source leaprc.gaff
    loadamberprep l01.in
    loadamberprep l02.in
    EOF

    cat leap_lines >> leap_lig.inp

    cat << EOF >> leap_lig.inp
    saveamberparm L01 a.prm a.rst
    saveamberparm L02 b.prm b.rst
    quit
    EOF

    leap -f leap_lig.inp >& leap_lig.out

    Then, we use changeparm to get charge and van der Waals parameters for our two ligands. This will create a .charges and a .vdws for each of the ligands.

    changeparm << EOF > info1.log
    a.prm
    qfep
    L01
    q
    EOF

    changeparm << EOF > info2.log
    b.prm
    qfep
    L02
    q
    EOF

    The .charges and .vdws files will then be used by a Python script to create the Q perturbation files. This script will ask for atoms in L01 that do not have any corresponding atom in L02. If such atoms should be perturbed to dummy atoms, just leave the corresponding atom blank.

    python /away/bio/Q/Q/Utils/make_pert.py l01 l02

    The script will create three perturbation files to be used with the complex. One for only electrostatic perturbation, one for only van der Waals perturbation and finally, one if these should be done at the same time. We will create the corresponding files for use with the free ligand by removing the offset-line:

    sed "/OFF/d" ele_prot.fep > ele_wat.fep
    sed "/OFF/d" vdw_prot.fep > vdw_wat.fep
    sed "/OFF/d" elevdw_prot.fep > elevdw_wat.fep

    Then, we will substitue the OFF mask with the real offset, i.e. the first ligand atom minus 1

    temp=`grep L01 fxa-l01.pdb | head -n 1 | awk '{print substr($0,6,6)}'`
    nprot=`echo " ${temp} - 1" | bc`

    sed -i "s/OFF/$nprot/" ele_prot.fep
    sed -i "s/OFF/$nprot/" vdw_prot.fep
    sed -i "s/OFF/$nprot/" elevdw_prot.fep

  • Before proceeding, we will clean up a little bit

    tar -cjf sav.tar.bz2 a.* b.* *charges *vdws *solv.pdb *q.pdb *log *dat *inp mdinfo sander.out_*
    rm -f a.* b.* *charges *vdws *solv.pdb *q.pdb *log *dat *inp mdinfo sander.out_*

  • Finally, we will move to the Fep_sim folder and create input files for Q and queue-scripts. The python script will ask about equilibration and production length, number of transformations (1 or 2), the lambda-values and the cluster that the calculation should be run on.

    cp -p *qparm *fep /away/bio/Q/Q/Utils/cluster_lib ../Fep_sim
    cd ../Fep_sim
    python /away/bio/Q/Q/Utils/make_fepinp.py fxa l01 20

    cluster_lib contains information about the clusters, the way Q should be executed and modules that need to be loaded.



    Program pre-requisite

    The following programs needs to exists on your machine:

    Compiling Q

    The code downloaded from Aqvist homepage do not work properly with MPI on the clusters. G. Wallin sent me an updated code that made it work.

    Thereafter, I have made changes to the code to support soft-core Coloumb potentials. Therefore, it exsist three versions of md.f90, the original one, md_scc.f90 which implements the soft-core Coloumb but retains the original soft-core Lennard-Jones potential, and md_sc2.f90 which implements a lambda-dependent soft-core Lennard-Jones potential. I recommend to use md_sc2.f90 instead of md_scc.f90. Consequently there is three different versions of Qdyn5 and Qdyn5p (the parallel version of the former), with similar endings. Qdyn5p_sc2 will be compiled upon parallel installation.

    I have not been able to compile to code with any other compiler than the Intel ifort compiler.

    Install locally:
    Locally, ifort is installed only on the gefion machine. Type the following commands:
    export PATH=$PATH:/opt/intel/bin/
    make ifort_unopt
    mv Q* ../bin/

    Install on Abisko (similar on the other clusters):
    module add intel-fortran
    module add openmpi/intel
    make ifc
    make ifort_unopt
    mv Q* ../bin/

    Compiling nabmin

    export AMBERHOME=/away/bio/AMBER/Amber11/
    $AMBERHOME/exe/nab nabmin.nab -o nabmin