Quantum mechanical thermodynamic
cycle perturbation.
This method was described in
T. H. Rod & U. Ryde (2005) "A method for calculating high-level QM/MM free energies for enzyIf you have Fe ions that may be swapped, run the program mkswaplist with $reddir/pdb3 as the first file and $oxdir/pdb3 as the second file and put the results in a file swap in the Qtcp directory.matic reactions: methyl transfer catalyzed by catechol O-methyltransferase", Phys. Rev. Lett., 94, 138302.
T. H. Rod & U. Ryde (2005) "Free energy barriers at the density functional theory level: methyl transfer catalyzed by catechol O-methyltransferase", J. Chem. Theor. Calc., in press.
QTCP input
file
QTCP output and comments
Link to CHARMM version
Start from
a set of QM/MM structures (with protein fixed or free) along a
reaction pathway.
See the ComQum page for details.
The final data is in prmtop3 and prmcrd3, as well as in the
Turbomole files.
For each structure, calculate
Turbomole >5.8 ESP chages. (see Turbomole
page).
If you have run qmmmpbsa-ptch, this is already done in
the Ptch directory, file logm.
Otherwise:
mkdir Esp
cp control coord basis
energy mos alpha beta auxbasis comqum.dat pdb3 prmtop3 Esp
cd Esp
kdg esp_fit
kdg vdw_radii
kdg pointcharges
mv control c0
cat >control
<<EOF
\$mvd
\$moments
\$pop
\$esp_fit
shell 75000 -0.57
shell 75000 -0.38
shell 75000 -0.19
shell 75000 0.00
shell 75000 0.19
shell 75000 0.38
shell 75000 0.57
shell 75000 2.46
shell 75000 4.35
shell 75000 11.91
\$vdw_radii
h 3.85504
c 4.81880
n 4.81880
o 4.49755
s 5.62193
mn 6.42507
fe 6.42507
co 6.42507
ni 6.42507
cu 6.42507
zn 6.42507
EOF
cat c0 >>control
and then run
ridft -proper >logm
Obsolete version with
CHELP-BOW charges
For a single structure (the
central one, e.g. the transition state) do the following
points 4-12 (it also has to be done if the number of atoms are
different in the other structures, e.g. for pKa calculations):
(Jump to the other
structures)
mkdir Qtcp
cd Qtcp
Insert TER in this file between
different subunits and non-standard residues.
cp ../pdb3 pdb3_cq
changepdb <<EOF
pdb3_cq
ter
1
w
pdb3_cq
q
EOF
Run leap to fully solvate the
protein with solvent molecules reaching at least 3 Å (they are
typically already solvated in a sphere) outside the protein.
(if this is not already the case). You cannot use a belly with
spherical systems and reaction field (igb=10), therefore, we
prefer to run these calculations with constant pressure and
PME in truncated ocahedron.
solvateOct x TIP3PBOX 3
Sample of leap commants
(for more information see the equilibration
page).
Copy needed *.in and *.dat files from /away/bio/Data/Amber.
Do not forget to include extra bonds.
This should give you the files prmtop and prmcrd.
Build
a pdb file from the amber files with changeparm
This should
also be performed for the other structures.
changeparm <<EOF
prmtop
p
pdbout
m
prmcrd
q
EOF
Run changepdb on this pdbout file
to insert the QM charges from step 3.
changepdb <<EOF
pdbout
qcc
t
../Ptch/logm
comqum.dat
w
pdbnew
q
EOF
Check thoroughly that the total
charge of the two files pdbout and pdbnew are the same
(changepdb, command cc).
changepdb >newtotch
<<EOF
pdbnew
cc
q
EOF
cat newtotch
qmmmpbsa-checkch
If not, insert into comqum.dat the residual charge of each of
the quantum residues (the first number is the number of
residues, then the residue number and residual charge of each
residue is given). Then rerun the previous point.
$residue_charge_corr
1
652 -1
Check that the total charge is an integer. If not add
the residual charge (owing to rounding-off errors) to one or
several atoms.
Change the charges in prmtop:
changeparm <<EOF
prmtop
m
pdbnew
w
prmtop
q
EOF
Other
structures charge correction ends here
Start the sander calculations
(see template files
below):
You can automatically get the belly group by changepdb, command be (from $correspondance_list in
comqum.dat).
100 steps of steepest decent
MM minimisation with all heavy atoms restrained to the
original QM/MM structure.
sander -i sander.in0 -o sander.out0
-r mdrest0 -c prmcrd -ref prmcrd
20 ps MD simulation at
constant volume, still with heavy atoms restrained.
sander -i sander.in1 -o sander.out1
-r mdrest1 -c mdrest0 -ref prmcrd
20 ps MD simulation at
constant pressure, still with heavy atoms restrained.
sander -i sander.in2 -o sander.out2
-r mdrest2 -c mdrest1 -ref prmcrd
50 ps MD equilibration with
only the quantum system (or only heavy-atoms in the QM
system) restrained, constant pressure
sander -i sander.in3 -o sander.out3
-r mdrest3 -c mdrest2 -ref
mdrest2
Correct the quantum system and make mdrest files for all structures (the QM system cannot be fixed in the constant-pressure calculations):
Construct the file pdb3, from
mdrest3 and prmtop in the MD directory:
changeparm<<EOF
prmtop
p
pdb3
m
mdrest3
q
EOF
Construct a pdb file with the
original QM/MM coordinates (identical to those in the QM
file coord).
This is the pdb3_cq file from above.
Make a rms fit of pdb
_cq towards pdb3, using changepdb, fitting only the heavy
atoms in the QM system (which were restrained in the
sander.in3 calculation).
changepdb <<EOF
pdb3_cq
rf
pdb3
f
b
sander.in3
n
y
m
c
y
transf
w
moved.pdb
n
EOF
If you did not define any belly in sander.in3, you
can instead use the comqum.dat file and include or ignore
H atoms, depending on what you restrained.
Construct a mdrest file with
the quantum system exactly correct:
cp mdrest3 mdrest3_old
modrest <<EOF
mdrest3
cq
../coord
p
transf
0
w
mdrest3
q
EOF
Typical movements are 0.25 Å average and 0.5 Å
maximum
For
all the other structures:
mkdir Qtcp
cd Qtcp
cp ../comqum.dat .
cp q transf prmtop
mdrest3 sander.in4 sander.in5 from the original
directory
Then run modrest in the same
way (not needed for charge perturbation)
modrest
mdrest3
cq
../coord
p
transf
0 or the number of the mismatching atom (if atoms are
added or delted)
w
mdrest3
q
Check that the junction atoms do not move more than
0.01 Å
Fix the charges in the prmtop
file as in points 8-12
above (in a charge perturbation you instead need to
include the proper prmtop_lambda file).
If the number of atoms or anything else has changed more,
you need to build a new prmtop file and ensure that the
number of atoms corresponds exactly to that in the mdrest3
file (do points 4-6,
above, also (e.g. for pKa calculations).
Run sander.in4 and sander.in5 for all
systems
200 ps MD equilibration with the quantum system fixed;
constant volume
sander -i sander.in4 -o sander.out4 -r mdrest4 -c
mdrest3
200 ps data collection with the quantum system fixed;
constant volume
sander -i sander.in5 -o sander.out5 -r mdrest5 -c
mdrest4 -e mden5 -x
mdcrd5
Wrap all water molecules to
the primary periodix box, centering it around the protein.
This is done with ptraj and the following script (note
that the protein is not moved).
ptraj prmtop ptraj.in
This is the file ptraj.in:
trajin mdcrd5
trajout mdcrd5_wrap trajectory
image origin center familiar com :411
go
Old
(do not use):
trajin mdcrd5
trajout mdcrd5_wrap
trajectory
image center :WAT byres
familiar com :1-217
go
For a small molecule in water and with
calcqtcpfreee2, this may be better (which first centres
the molecule to the origin and then wrap all water around
it):
trajin mdcrd5
trajout mdcrd5_wrap trajectory
center :1 origin
image origin center :WAT byres
familiar com :1
go
ptraj contains bugs and often gives wrong results. Always
check one of the pdb5 files.
Sometimes it helps to remove origin, sometimes it helps to
change the ptraj version.
When all the simulations are run,
free energies should be calculated.
These calculations depend somewhat on the system.
For a standard reaction-path calculation, a program is
present, calcqtcp, which
calculates the needed energies by calling Turbomole a great
number of times.
For other perturbations, you may need to modify the code
slightly.
The program calcqtrcpfreee does
the same thing, using sander (so that you can use boundary
conditions, etc.).
For the calcqtcp calculation,
you need to set up a number of files before starting the
program:
Check that you have the
comqum.dat file in the current directory (this is probably
already the case)
cp ../comqum.dat .
If you will run also vertical
(QM) energies (they are only needed for the start and end
points), you need to check that you have all the standard
Turbomole files in the current directory:
alpha
auxbasis
basis
beta
coord
control
mos
cp ../alpha ../auxbasis ../basis ../beta ../coord
../control ../mos .
or
cp alpha auxbasis basis beta coord control mos Qtcp
Possibly, you may want to change the method or the
basis sets to a cheeper one, before running QTCP.
You should also copy the coord file to c0 (the coord file
will be overwritten)
cp coord c0
Finally, you need a file with the original QM charges of
the QM system (i.e. with unmodified charges on the
junction atoms, so that the sum of the charges of the QM
system is an integer).
Currently, this file can be in free format with one charge
on each line (no title) or a charmm streamfile format.
It can be constructed from the reamul file by simply
cutting out the column Fitted
For calcqtcpfreee (but not calcqtcp) you also need a sander.in file (see here).
Add QTCP keywords to the
comqum.dat file (see below).
Consequently,
you need the following files:
auxbasis
basis
c0
comqum.dat
control
coord
mdcrd_wrap
mos
prmtop
qm_charges
qtcp.in
If you will scale down
solvent-accessible charges (as you normally should), run
program changeparm for one typical structure.
changeparm
sa
comqum.dat
0
mdcrd5_wrap
15
solvacc
q
Then, check the buried charges in this file thoroughly, so
see that they really are buried (there often some
problems). If not change them.
Run changepdb, command bc on the pdb5 file to get a second
opinion. Then check all unclear cases with a pdb viewer.
Nothing directly connected to the QM system should be of
type Charged.
Save the file with the name solvacc
Now, you are ready to run the
program calcqtcp
There are two versions of this program:
The first (calcqtcpfreee)
uses sander for the MM energy calculations.
This means that you can employ boundary conditions, PME,
etc. On the other hand, it may be too big for the computer
memory.
The input to this program is:
300
1
mdrest_qtcp1
0
y
mdcrd7_wrap1
The otherversion is calcqtcp.
This is the preferred version normally.
It calculates all MM energies with internal code (from
changeparm):
This means that it always uses a non-periodic calculation
with an infinite cut-off.
It also calculates the components of the MM energies (i.e.
the contribution to the free energy from each QM atom and
form each MM residue.
The input file is comqum.dat, as
discussed above.
The output of these two
programs are in the stdout file, as well as in the files:
comp_qtcp, mmen_qtcp1, mmen_qtcp2, and qmen_qtcp.
comp_qtcp contains the components of each energy
difference in each snapshot (bonds, angles, dihedrals, van
der Waals, and electrostatics).
mmen_qtcp1 and mmen_qtcp2 contain the energy difference of
the perturbation in each step, in the forward (1) and
backward (2) direction.
qmen_qtcp contains the MM->QM energy difference in each
snapshot.
In stdout, the free energy is given for the three
perturbations.
However, better statistics and energies can be obtained
with the programs helmholtz.py and bennets.py, which
directly reads the files mmen_qtcp1, mmen_qtcp2, and
qmen_qtcp.
For example:
helmholtz.py de=mmen_qtcp1
kt=2.4930
You can also use postqtcp
for this.
It is useful to check that
you have made the proper links by checking the rows:
Perturbation 1, average difference
from 0: 0.0255 A; for
junctions 0.0009 A
Perturbation 2, average difference
from 0: 0.0364 A; for
junctions 0.0016 A
and
Average absolute
change:
0.0061
0.0105
They should give the same values (within 0.0001, except
for junctions) for the forward and backward
perturbations.
Compact commands
We run the calculations in two separate steps: the charge and the
coordinate perturbation. We will assume that we go from state R
(reduced) to state O (oxidised). We also assume that the two
states have exactly the same atoms. If the reduction is coupled to
a proton transfer, follow instead the pKa calculations.
We will denote the state by a vector of two variables: (charges,
coords). Note that the first concerns the prmtop file and the
second one the mdrest/prmcrd/mdcrd (coordinate) file.
Thus the initial and final states are RR and OO and we will run
the simulations:
RR -> OR -> OO
Compact commands
I have implemented a statistical
analysis using cumulants (cf. Kästner et al J. Chem. Theory Comput
2 (2006)452).
This means that we can get statistical estimates of both the FEP
and TI estimates.
Moreover, we get a more stable FEP estimate.
So, the output now looks like this:
Confidence intervals for 0.95 level: 1.97
Perturbations
10
20 QM
Helmholz free energy chang (kJ/mole)
=
7.3222 -7.1274
-4524392.0121
Cumulant FEP estimate: (kJ/mole)
=
7.3271 -7.1542
-4524392.1796
Cumulant
s:
(kJ/mole) =
0.0856
0.1030
0.3222
Lower confidence interval (kJ/mole)
=
7.1588 -7.3567
-4524392.8131
Upper confidence interval (kJ/mole)
=
7.4953 -6.9518
-4524391.5461
Average
(TI)
(kJ/mole) =
7.8017 -6.5135
-4524389.2865
TI
s
(kJ/mole) =
0.0769
0.0894
0.1900
Lower confidence interval (kJ/mole)
=
7.6504 -6.6892
-4524389.6599
Upper confidence interval (kJ/mole)
=
7.9530 -6.3377
-4524388.9131
TR Standard
error
(kJ/mole) =
0.0886
0.0941
0.2525
Standard
deviation
(kJ/mole) =
1.5388
1.7879
3.7990
Maximum
value
(kJ/mole) =
12.6181 -0.0448
-4524379.2000
Minimum
value
(kJ/mole) =
2.5184 -10.8723
-4524399.3000
Range
(kJ/mole) =
10.0997
10.8275
20.1000
Weigth of minimum value (kJ/mole)
=
0.0172
0.0112
0.0464
where
Helmholz free energy change is the old FEP estimate
Cumulant FEP estimate: is the new FEP estimate, and should be the preferred
one.
Cumulant s: is the
standard deviation of this.
Lower and Upper confidence interval give a 95% confidence interval for the FEP
estimate.
Average (TI) is
the old average value and it is the TI estimate if the prmtop
files are for lambda=0 and 1.
TI s is the
standard deviation of that estimate
Lower and Upper confidence interval give a 95% confidence interval for the
average (and TI estimate).
TR Standard error is
the old standard error
Standard deviation is
the
raw standard deviation of the data.
The other four
values are the same as before.
I
want to warn you to use indiscriminatingly the "Final
results" of calcqtcp after coarse-graining. It is very hard
to automatise this procedure of selecting appropriate values
for the coarse graining. What you normally should do is some
experimenting with postqtcp to see that you get reasonable
and stable results. In particular you need to check that the
equilibration analysis (second part of the coarse graining)
gives good results. Quite often it happen that it points out
a single point not to be equilibrated, and I think that such
points can safely be ignored (it is only 90% confidence, so
every 10th test should be wrong). You should have several
consecutive structures in a row to be non-equilibrated to
ignore them.
Moreover, you should note that the program often has to
increase the coarse-graining (nsamp) to get a maximum of 50
points in the equilibration analysis. This is a technical
problem and does not mean that you should use this higher
coarse graining in the final analysis (I have corrected this
in the current version of calcqtcp).
Finally, and most importantly, I think that if you are doing
FEP, you need to do the analysis by hand with postqtcp and
using averaging over exponential energies (command e1 or
e2). Otherwise, you get pure averages, which are strictly
not correct for FEP.
Second, I have implemented the method of reverse cumulative
averaging and coarse-graining by Karplus et al. (J. Chem. Phys.
120 (2004) 2618). It is somewhat involved, so it is good if you
read that article. In principle, you first try to get an estimate
of how correlated consecutive data points are and make a
coarse-graining to get independent data points. This means that
you take the average of nsamp consecutive points and treat them as
new data (with fewer sampled points). Next, you try to estimate at
what point the data is equilibrated, by making a test if the
points are normal distributed. The first point from the end that
is not normal distributed is considered as the last point in the
equilibration.
The QTCP program tries to automatically select the ideal nsamp and
first equilibrated points, using the TI estimate (average) of
perturbation 1 and 2 and also prints out an analysis of the QM
perturbation, if present. Then it writes out a new, updated
analysis of the results at the very end of the output file.
However, you can continue to do more analysis and try other values
with the postqtcp program, which is interactive and more
versatile. In particular, you should try to remove the
unequilibrated points by setting the number of structures to skip
and redo the coarse-graining. This will typically give you a lower
number of nsamp and therefore a smaller confidence interval.
Third, I have changed the input section of calcqtcp so that it is
no longer interactive, but it reads keywords in the file
comqum.dat.
Thereby, all data is forced to be in a file and it is easier for
me to change the program and add more options.
Fourth, I have added the possibility
to scale down the interaction between the QM system and
solvent-accessible surface charges. As most of you knows, we have
found that these groups often add a large and probably dubious
contribution to the free energy.
Our suggestion is to delete all interactions between the QM system
and solvent-accessible charges.
However, see the warning at
the end of this section.
This can now be done by adding the
keywords
$solvskip 4
$solveps infinite
$solvacc solvacc
to the comqum.dat file.
$solvskip takes an integer argument, which can be:
0 - no residues are scaled down
4 - solvent accessible charges are scaled (default)
3 - all solvent-accessible groups are scaled
2 - 3+buried charges are scaled
1 - all residues are scaled
5 - all charges are scaled, but no neutral groups
$solveps is the scaling constant (the effective dielectric
constant). It can be any number or the text infinite (which is
default).
$solvacc is the name of a file with a list of the solvent
accessible residues.
It can be generated from the prmtop and mdcrd files with
changeparm, command sa. Note that it goes through all coordinates
in the mdcrd file, so it typically takes several hours.
Note that this calculation should only be done ONCE for each
protein and the same file should be used for all QTCP
calculations.
The output looks like this:
# Residue
Status %SolvAcc
Charge Distance
1 THR 1
Charged
1.000 1.000 13.450
2 LYS 2
Charged
1.000 1.000 9.919
3 GLU 3
Charged
1.000 -1.000 8.969
4 GLN 4
Exposed
1.000 0.000 13.779
5 ARG 5
Charged
1.000 1.000 15.502
6 ILE
6
0.000 0.000 10.427
7 LEU 7
Exposed
0.955 0.000 13.332
8 ARG 8
Charged
1.000 1.000 16.671
61 GLU 61 Buried
charge 0.000 -1.000 5.163
Here, is a list of all residues in the protein (not
water)
%SolvAcc is the fraction of snapshots in which the residue is
solvent accessible.
Charge is the total charge of the residue in the prmtop file.
Distance is the closest distance between that residue and the QM
system (read from the comqum.dat file). It is an average over all
snapshots.
Finally, status is the suggested status of that residue: Charged =
solvent-accessible charge, Exposed = solvent-exposed neutral
group; Buried charge, or nothing, meaning a buried neutral group.
The status is currently based on abs(charge)>0.3 and
solvacc<0.4, but it is free for you to manually change this
choice by changing the status - calcqtcp only reads this status
field (the other fields are also read and echoed to the output,
but whether the residue is scaled or not is determined entirely by
the status field and $solvskip.
The solvent accessibility is determined by dividing the water
molecules into five groups: bulk = those not in contact (3 Å) with
any protein atom, surface = in contact with protein and bulk water
(2.5 Å), deep surface = in contact with surface, but not bulk,
channel = in contact with deep surface, and cavity = remaining
waters (such an algorithm was used to find cavities in cytochrome
P450 [Rydberg et al. J. Phys. Chem. B 111 (2007) 5445-5457]. In
this case, we define solvent-exposed residues as those in contact
(3 Å) with surface, deep-surface, or channel waters. It seems to
work quite well.
changepdb bc can do the same thing on a single structure, but
these results are much less accurate.
If $solvskip>0, the output is change so that both the results
with scaled charges and unscaled electrostatics are given, e.g.
Confidence intervals for 0.95 level: 4.30
Perturbations
10
20
QM SA
10 SA 20
Helmholz free energy chang (kJ/mol)
=
0.7753
0.7492
0.0000
7.6232 -6.3224
Cumulant FEP estimate: (kJ/mol)
=
0.6517
0.4496
0.0000
7.4985 -6.6882
Cumulant
s:
(kJ/mol)
=
inf
nan
nan
inf inf
Lower confidence interval (kJ/mol)
=
-inf
nan
nan
-inf -inf
Upper confidence interval (kJ/mol)
=
inf
nan
nan
inf inf
Average
(TI)
(kJ/mol) =
1.0877
1.2208
0.0000
7.8435 -5.8855
TI
s
(kJ/mol) =
0.8515
1.1324
0.0000
0.7573 1.1553
Lower confidence interval (kJ/mol)
=
-2.5738
-3.6487
0.0000 4.5869
-10.8535
Upper confidence interval (kJ/mol)
=
4.7493
6.0903
0.0000 11.1001
-0.9175
TR Standard
error
(kJ/mol) =
0.9287
0.9982
0.0000
0.7107 0.8329
Standard
deviation
(kJ/mol) =
1.4749
1.9614
0.0000
1.3118 2.0011
Maximum
value
(kJ/mol) =
2.0264
3.2770
0.0000
9.2052 -3.5751
Minimum
value
(kJ/mol) =
-0.6122
-0.6297
0.0000
6.5882 -7.0696
Range
(kJ/mol) =
2.6387
3.9067
0.0000
2.6170 3.4945
Weigth of minimum value (kJ/mol)
=
0.5814
0.5794
0.3333
0.5048 0.4497
Here, the first results (10 and 20) are with some
electrostatics scaled away, whereas the last two columns (SA10 and
SA20) are the corresponding results without any scaling (i.e. the
old results). Therefore, there is no reason not to use this option
(likewise, you should always turn on the Born option).
UR 10/6-10: Unfortunately, this turned out to be a quite inaccurate approach: If solvent-accessible charges are deleted only after the MD simulation, water molecules will have adapted themselves to screen the surface charges. If the latter are removed, the water molecules will remain in geometries for screening and the compensating effects will be grossly overestimated (e.g. by 700 kJ/mol in a test system with a net charge of -17). On the other hand, the protein contribution was quite well reproduced (within 4 kJ/mol).
It is probably better to delete the
solvent-accessible charges already before the MD simulation.
This can done with changepdb, command ns, using the solvacc file.
You should also avoid charged groups at the termini of the
protein:
A typical input file is (in addition
to the standard data in the comqum.dat file, read by comqum):
#
#QTCP data
#***|****1****|****2****|****3****|****4****|****5****|****6****|****7**
$qtitle
This is a title, 10/3-09
$temp 300
$first 280
$charmm
$coord1 /temp1/ulf/Backup/Comt/Comqum/Fix/Prod/Qtcp/mdrest5
$coord2 /temp1/ulf/Backup/Comt/Comqum/Fix/180/Qtcp/mdrest5
$mdcrd /temp1/ulf/Backup/Comt/Comqum/Fix/165/Qtcp/mdcrd5_wrap
$prmtop /temp1/ulf/Backup/Comt/Comqum/Fix/165/Qtcp/prmtop
$prmtop1 /temp1/ulf/Backup/Comt/Comqum/Fix/Prod/Qtcp/prmtop
$prmtop2 /temp1/ulf/Backup/Comt/Comqum/Fix/180/Qtcp/prmtop
$qmcharge /temp1/ulf/Backup/Comt/Comqum/Fix/165/Qtcp/qm_charges
$compfile comp_qtcp1
$thres 0.1 0.1 0.1 0.5 0.5
$resthres 0.0
$watthres 5.0
$born cap
$solvskip 4
$solveps infinite
$solvacc solvacc
#***|****1****|****2****|****3****|****4****|****5****|****6****|****7**
$end
The keywords (starting with $) are compulsory
Numerical data is read in free format, but it must be in the same
line as the keyword
Currently, the following are the allowed keywords and default
values:
Note that you cannot turn off a logical keyword by setting it
to .false. Instead you should NOT include it. If it is present,
it is set to true.
Allowed keywords and default
values
$qtitle '
' (one title line)
$temp
300.0 (temperature)
$test
(this keyword gives only input testing)
$first
1 (the first snapshot to consider)
$mdcrd
mdcrd5_wrap (name of the mdcrd file)
$prmtop0
prmtop (name of the unperturbed prmtop file)
$prmtop1
prmtop_qtcp1 (name
of
the first perturbed prmtop file)
$prmtop2
prmtop_qtcp2 (name of the second
perturbed prmtop file)
$thres
1000 1000 1000 1000 1000 (thresholds for write out of
large terms for bonds, angles, dihedrals, 1,4 and other non-bonded
interactions; old: 0.1 0.1
0.1 5.0 5.0)
$resthres 0.0 (threshold
to write out residue interactions; use 0 for all)
$watthres 5.0 (threshold
to write out water interactions; use 0 for all)
$compfile comp_qtcp
(name of the components output file)
$coord1 mdrest_qtcp1 (name of
the file with the first perturbed QM coordinates; not used if
$natom is read)
$coord2
mdrest_qtcp2
(name of the
file with the second perturbed QM coordinates; indicates two
perturbations also; not used if $natom is read)
$natom
0 (total number of atoms in the prmtop and mdcrd files;
only used if QM coordinates change; i.e. if QM system if free to
move; do NOT include this keyword, unless you know what you are
doing)
$born
cap
(or
box
or
prmtop
or
input
or
test
or
pb
or delphi or no) (method for Born correction)
$bornradius
0 (only read if $born input)
$borncentre 0,0,0 (only
read
if $born input)
$solvskip 4
(0=no,1=all,4=SA ch,3=all SA,5=all ch,2=also bur ch) (method
to treat solvent-exposed charged residues)
$solveps infinity (if
solvskip>0; scaling constant for solv acc residues)
$solvacc
solvacc (if solvskip>0; file with
solvent-accessibility data)
$qmfep
(indicates that you will do vertical = QM perturbation)
$nori
(only qmfep; this keyword indicate that you want to use
dscf instead of ridft)
$qmskip
1 (only qmfep; QM perturbations are done every qmskip
step)
$coord
c0 (only qmfep; name of the QM atom name file, i.e. a
copy of coord)
$qmcharge
qm_charges (only qmfep; name of the file with the
charges on the QM atoms with H junctions)
There
are
some
additional options (aimed primarily for QM free):
doit:
Calculate
new
QM charges every step
me: Do a mechanical embedding
calculation (i.e. without QM charges - it will give the same
results every step)
nocorr: Skip the QTCP correction for the
junction atoms
$charmm
(only qmfep; indicates that you have charmm format of
QM-charges file)
$qmenfile
qmen_qtcp (qmfep only; name of the QM energies output
file)
Lines not starting with $ are ignored
(except the title line after $qmtitle)
Lines after $end are ignored
Born corrections from a sander PB
or GB calculation
It is now possible to run QTCP with an explicit generalised Born (GB) or Poisson-Boltzmann (PB) correction for long-range solvation effects, using the specific geometry of each snapshot.
This is done by running sander twice or three times with the current coordinates and charges in each step (i.e. rather time consuming). It should give a significant effect only for perturbations in which the total charge is changed and when the radius of the simulated system is rather small.
To run these calculations, you set in
the comqum.dat file.
$born pb
You also have to set up a sander input file, called pbsa.in with the desired PB or GB calculation and options. For time being, I recommend a GBn calculation (because PB is sensitive to the origin and is slower), i.e. setting igb=7. Sample input files are found below both for GB and PB.
Moreover,
you
have
to
set
up
modified
parmtop
files
for
all
three
states, called pt0, pt1, and pt2.
For PB, you simply copy the prmtop, prmtop1, and prmtop2 files.
For all GB methods, you
need to remove the periodic box (this is done with changeparm,
command bo). In fact, this does not seem to be necessary - it
does not affect the energies.
For Gbn (igb=7) you should also
replace the default radii with Bondi radii, and for GB(OBC)
(igb=2 or 5), you should replace the radii with the mbondi2
radii. For GB(HCT) (igb=1), the default mbondi radii are OK.
This is done with changeparm for each of the three files, e.g.
For bondi (igb=7)
changeparm <<EOF
parmtop
bo
ra
b
w
pt0
q
EOF
For mbondi2:
changeparm <<EOF
parmtop
bo
ra
2
m
mdrest5
w
pt0
q
EOF
Then,
calcqtcp
is run as normal.
Sample GB input file (pbsa.in):
GBSA calculation, UR 24/4-08
&cntrl
ntf = 1, ntb = 0,
igb = 7, dielc = 1.0,
cut = 999.0, nsnb = 99999,
scnb = 2.0, scee = 1.2,
imin = 1, maxcyc = 0, ntmin
= 2,
intdiel= 1.0, extdiel=80.0,
rgbmax = 999.0, gbsa = 0,
saltcon = 0.0
&end
Sample PB input file (pbsa.in):
PBSA calculation, UR 24/4-08
&cntrl
ntf = 1, ntb = 0,
igb = 10, dielc = 1.0,
cut = 999.0, nsnb = 99999,
scnb = 2.0, scee = 1.2,
imin = 1, maxcyc = 0, ntmin
= 2,
&end
&pb
epsin = 1.0, epsout = 80.0,
istrng = 0, radiopt = 0,
npopt=0, sprob =
1.6,
space = 0.5,
fillratio=2
maxitn = 500, npbverb= 1, cutres = 12
&end
Sander command within calcqtcp:
sander -O -i pbsa.in -o
pbsa.out -r
mdrest_dum -c
mdrest_pbsa -p pt0
Alternatively, you can also run these calculations with Delphi (only PB).
Then, you set
$born delphi
and construct a file delphi.in with
the delphi control variables, e.g.:
indi=1.0
exdi=80.0
scale=2.0
perfil=90.0
prbrad=1.4
bndcon=4
in(crg,file="delphi.crg")
in(siz,file="delphi.siz")
in(pdb,file="delphi.pdb")
out(modpdb,file='delphi.pqr")
energy(s,g)
1000 steps of MM minimisation with
all heavy atoms restrained to the original QM/MM structure and QM
system fixed.
---
Title
&cntrl
irest=0,ntx=1,
imin=1,maxcyc=1000,drms=0.0001,
ntc=2,ntf=2,
cut=8.0,
ntpr=100,ntwx=0,ntwv=0,ntwe=0,
ipol=0,igb=0,
scnb=2.0,scee=1.2,
ibelly=1,
ntr=1,restraint_wt=100,
restraintmask="!:WAT & !@H="
&end
GRP1
ATOM 2 4
ATOM 6 6
END
END
Note that the moved atoms are
given in the belly, i.e. those NOT in the QM system!
(calculations with ntb=2,ntp=1,pres0=1.0,taup=1.0
give exactly the same result).
20 ps MD simulation at constant
volume (to get correct temperature), still with heavy atoms
restrained.
---
Title
&cntrl
irest=0,ntx=1,
nstlim=10000,dt=0.002,
temp0=300.0,ntt=1,tautp=1.0,
ntc=2,ntf=2,
cut=8.0,
ntpr=1000,ntwx=0,ntwv=0,ntwe=0,
ntb=1,
ipol=0,igb=0,
scnb=2.0,scee=1.2,
ibelly=0,
ntr=1,restraint_wt=500,
restraintmask="!:WAT & !@H="
&end
20 ps MD simulation at constant
pressure, still with heavy atoms restrained.
---
Title
&cntrl
irest=0,ntx=5,
nstlim=10000,dt=0.002,
temp0=300.0,ntt=1,tautp=1.0,
ntc=2,ntf=2,
cut=8.0,
ntpr=1000,ntwx=0,ntwv=0,ntwe=0,
ntb=2,ntp=1,pres0=1.0,taup=1.0,
ipol=0,igb=0,
scnb=2.0,scee=1.2,
ibelly=0,
ntr=1,restraint_wt=100,
restraintmask="!:WAT & !@H="
&end
50 ps MD equilibration with only the
non-hydrogen quantum system atoms restrained (to the mdrest2 input
file), constant pressure
If also QM H-atoms are restrained, you often get SHAKE problems.
---
Title
&cntrl
irest=1,ntx=5,
nstlim=25000,dt=0.002,
temp0=300.0,ntt=1,tautp=1.0,ntt=1,tautp=1.0,
ntc=2,ntf=2,
cut=8.0,
ntpr=1000,ntwx=0,ntwv=0,ntwe=0,
ntb=2,ntp=1,pres0=1.0,taup=1.0,
ipol=0,igb=0,
scnb=2.0,scee=1.2,
ibelly=0,
ntr=1,restraint_wt=100,
restraintmask='(@234-244,10075-10097,10108-10131,10142-10158,10169-10194|:653)'
&end
If the restraint list is too long, you can use the old
format:
---
Title
&cntrl
irest=1,ntx=5,
nstlim=25000,dt=0.002,
temp0=300.0,ntt=1,tautp=1.0,
ntc=2,ntf=2,
cut=8.0,
ntpr=1000,ntwx=0,ntwv=0,ntwe=0,
ntb=2,ntp=1,pres0=1.0,taup=1.0,
ipol=0,igb=0,
scnb=2.0,scee=1.2,
ibelly=0,
ntr=1
&end
GRP1
100.0
ATOM 2167
ATOM 2170 2172
ATOM 2639
ATOM 2642 2644
ATOM 2651
ATOM 2654 2658
ATOM 3366
ATOM 3376
ATOM 3379 3390
ATOM 3417 3432
END
END
200 ps MD equilibration with constant
volume; QM system fixed
Note that the moved atoms are given in the belly, i.e. those NOT
in the QM system!
This is the same as in sander.in0 above.
---
Title
&cntrl
irest=1,ntx=5,
nstlim=100000,dt=0.002,
temp0=300.0,ntt=1,tautp=1.0,
ntc=2,ntf=2,
cut=8.0,
ntpr=500,ntwx=0,ntwv=0,ntwe=0,
ntb=1,
ipol=0,igb=0,
scnb=2.0,scee=1.2,
ibelly=1,
ntr=0
&end
GRP1
ATOM 2 4
ATOM 6 6
END
END
200 ps data collection at constant
volume; QM system fixed
---
Title
&cntrl
irest=1,ntx=5,
nstlim=100000,dt=0.002,
temp0=300.0,ntt=1,tautp=1.0,
ntc=2,ntf=2,
cut=8.0,
ntpr=500,ntwx=500,ntwv=0,ntwe=500,iwrap=1,
ntb=1,
ipol=0,igb=0,
scnb=2.0,scee=1.2,
ibelly=1,
ntr=0
&end
GRP1
ATOM 2 4
ATOM 6 6
END
END
Note that the theoretically better Langevine dynamics
(ntt=3,gamma_ln=3) does not work, because it seems to be
inconsisten with the belly (it is not keep fixed).
Input that can be copied and pasted
directly into chargefit
-------
fit.out
t
moloch.out
y
n
0
b
-------
source
leaprc.ff99
addPath /platon/nobackup/users/ulf/Mnsod/Leap
loadAmberPrep wam.in
loadAmberPrep oh.in
loadAmberPrep ohd.in
loadAmberParams wam.dat
loadAmberParams oh.dat
loadAmberParams ohd.dat
x=loadpdb pdb3-cq
deleteBond x.634.C x.635.N
solvateOct x TIP3PBOX 3
saveamberparm x prmtop prmcrd
quit
A sander_qtcp.in file for
single-point energy with PBC (constant volume), Ewald, and the MM
system (i.e. the complement of the QM system) in the belly:
cp /away/bio/Data/Qtcp/sander_qtcp.in .
Title
&cntrl
irest=0,ntx=1,
nstlim=1,dt=0.0,
temp0=300.0,ntt=1,tautp=1.0,
ntc=1,ntf=1,
cut=8.0,
ntpr=1,ntwx=0,ntwv=0,ntwe=1,
ntb=1,
ipol=0,igb=0,
scnb=2.0,scee=1.2,
ibelly=1
&end
GRP1
ATOM 1 233
ATOM 235 236
ATOM 245 10074
ATOM 10079 10081
ATOM 10084 10086
ATOM 10088 10094
ATOM 10098 10107
ATOM 10111 10113
ATOM 10118 10120
ATOM 10122 10128
ATOM 10132 10141
ATOM 10149 10151
ATOM 10153 10155
ATOM 10159 10168
ATOM 10172 10174
ATOM 10182 10188
ATOM 10191 10193
ATOM 10195 10253
ATOM 10284 46421
END
END
Parameters
for WAM (water molecule with vdW parameters also on Hydrogen
(H), UR, 19/5-03; Dummy parameters for atom H2, DD, UR 10/3-09
MASS
DD
1.008
0.161
H
bonded
to
nitrogen atoms
BOND
OH-DD
553.0
0.9572 ! TIP3P water
ANGL
H -OH-H
100. 104.52
from Hw-OW-HW
H -OH-DD
100. 104.52
from Hw-OW-HW
NONB
DD
0.0000
0.0157
! From H
For all structures for which we want free energies (that will be protonated and deprotonated) do the following:
Solvate protein using octahedral box (we will use a spherical box in Amber 8, but see also below)
Add restraints to protein backbone (in some cases we might want to restrain hetero groups as well)
Minimize (500 steps of steepest descent)
200 ps MD with gradual release of restraint (possibly longer for very large structures)
Use last snapshot for ComQum calculation but make sure that all solvent molecules are translated back to the primary unit cell.
Minimize in ComQum
Take coordinates and charges from ComQum (i.e. use same prmtop and prmcrd files for MD)
Manipulate charges if wanted (see below)
In the following Amber calculations the QM region should be fixed (belly command)
Minimize using 100 steps of steepest descent (we might skip this)
Run MD (400 ps to begin with but less can probably do it)
Use snapshots for subsequent QM/MM calculations.
The QM/MM calculations can be made
with ComQum or as independent Turbomole and Amber calculations. If
ComQum is used, a link atom fix should be implemented (I think Ulf
would implement it)
Constant pressure versus constant volume:
We are calculating Helmholz free energy, which is interesting for
constant volume processes. If this is what we want, we should
therefore ensure that we are using the same volume, meaning the
same box. The best way to do that is to make a constant pressure
simulation of a system intermediate of the two end points, and
then use the box size and solvent configuration from the last
snapshot for all subsequent simulations. This intermediate
simulation can be made as the very first simulation or immediately
after the ComQum calculation. I think the latter is the best
option. If we follow this procedure, we need to figure out where
information about the box size is saved so we can change that.
Alternatively, we can run all simulations as constant pressure
simulations, which is more 'real life' like. Then it is Gibbs free
energy, which governs the process, and not Helmholtz, but we can
probably convert to Gibbs free energies if we have information
about the unit cell volume of each snapshot or otherwise just
stick to the Helmholtz free energies. I am not very religious
here, I think the constant volume is the cleanest approach. On the
other hand, the constant pressure is the simplest, since we don't
have to make a simulation in order to determine the best volume
for all other simulations. I think the best thing to do is to run
constant pressure and keep an eye on volume changes. We should
probably discuss this, particularly since we want to run Amber 8
using a spherical solvent box with a reaction field. I am not
really sure if this kind of simulation is a constant volume or
pressure or neither.
Anyway, I will expand my AmberTopology Python module, so we can
manipulate charges (and everything else) easily. Note, however,
that using the above procedure does not demand manipulation of
prmtop files (as Pär noted), unless we want to scale charges (and
I want to do that).
Cannot avoid Shake in quantum system
Cannot fix quantum system in constant pressure
For small molecules in water
solution, I think we should use a spherical system and Amber's
reaction field correction, instead of PBC and PME. To make this
work, you need to ensure that the waters come first and the fixed
molecule last among the coordinates. This is somewhat involved.
You first run tleap the normal way, but with
solvateCap x TIP3PBOX {0 0 0} 20 (or something similar)
Then, you build a pdb file from this, moves the solute to the end
of the file,change the name of all water molecules to WAU and
insert TER between all waters (replace all HETATM with ATOM and
then run changepdb command ter).
Then, you run tleap again, reding in /away/bio/Data/Amber/wau.in
Once you get the prmtop file, you have to copy the cap rows in the
first prmtop file (from the first tleap run) into the new prmtop
file:
First you set the second last number in the FLAG POINTERS
entry (almost first in the file) to 1
%FLAG POINTERS
%FORMAT(10I8)
977
6
975
1
12
0
9
0
0 0
1321
324
1
0
0
5
4
1
6 1
0
0
0
0
0
0
0
0
8 1
0
%
and then you copy these lines (or something similar)
%FLAG CAP_INFO
%FORMAT(10I8)
1
%FLAG CAP_INFO2
%FORMAT(5E16.8)
1.40000000E+01 0.00000000E+00
0.00000000E+00 0.00000000E+00
just before the line
%FLAG RADII
%FORMAT(5E16.8)
I think you can do this also with changeparm, command c.
Then, you can run the simulations with an input file similar to
this:
Title
&cntrl
irest=0,ntx=1,
nstlim=200000,dt=0.002,
temp0=300.0,ntt=1,tautp=1.0,
ntpr=2500,ntwx=0,ntwv=0,ntwe=0,
ntc=2,ntf=2,
igb=10,ntb=0,cut=0.0,fcap=10.0,
ibelly=1
&end
&pb
npbgrid=1000,nsnbr=25,nsnba=5,
space=0.7,smoothopt=1,dbfopt=0,npopt=0,
cutres=12,cutnb=9,cutfd=9
&end
GRP1
ATOM 1 969
END
END
Note that you use igb=10, which
turns on the reaction field and you need to have fcap=10 to ensure
that all molecules stay inside the cap. You also need all the
parameters in &pb, because Amber is set up for single-point PB
calculations by default.
You should change the belly to fit your system.
You run one equilibration and then a production run, e.g.
with
Title
&cntrl
irest=1,ntx=5
nstlim=500000,dt=0.002,
temp0=300.0,ntt=1,tautp=1.0,
ntpr=1000,ntwx=1000,ntwv=0,ntwe=0,
ntc=2,ntf=2,
igb=10,ntb=0,cut=0.0,fcap=10.0,
ibelly=1
&end
&pb
npbgrid=1000,nsnbr=25,nsnba=5,
space=0.7,smoothopt=1,dbfopt=0,npopt=0,
cutres=12,cutnb=9,cutfd=9
&end
GRP1
ATOM 1 969
END
END
Thus, the simulations are simpler than with PBE.
Finally, you run calcqtcpfreee2 with Born corrections.
In this case, you can also run calcqtcpfreee
with a sander_qtcp.in file like
this:
Title
&cntrl
irest=0,ntx=1,
imin=1,ntmin=2,maxcyc=0,
temp0=300.0,ntt=1,tautp=1.0,
ntpr=1,ntwx=0,ntwv=0,ntwe=0,
ntc=1,ntf=1,
igb=10,ntb=0,cut=0.0,fcap=0.0,
ibelly=1
&end
&pb
npbverb=1,
npbgrid=1,nsnbr=1,nsnba=5,
space=0.5,smoothopt=0,dbfopt=1,npopt=2,
cutres=12,cutnb=0,cutfd=9
&end
GRP1
ATOM 1
969
END
END
(Same as for AMBER)
Start from a set of QM/MM structures (with protein fixed or
free) along a reaction pathway.
See the ComQum
page for details.
The final data is in prmtop3 and prmcrd3, as well as in the
Turbomole files.
(Same as for AMBER)
For each structure, calculate CHELP-BOW charges for the
quantum system:
Start from the QM wavefunction, obtained in QM/MM with point charges.
If it is an UHF calculation,
you need to have natural orbitals. If you do not:
Insert into the control file the following keywords and
rerun ridft (dscf):
$natural orbitals
file=natural
$natural orbital occupation
file=natural
Run makepoints
to get the points where you will calculate the
electrostatic potential (ESP).
Select Turbomole (t),
Random (r), No seed, 10000,
8 Å, and Turbomole (non-default values highlighted).
This gives the file esppoints.
Note that moloch only writes out at the most 10 000 ESP
points. If you want more than that, you need to construct
several files and concatenate the results.
Insert the suggested rows into the control file (in the beginning to avoid already existing keywords).
Run moloch (RHF) or moloch2
(UHF) to get the potentials and also the Mulliken charges.
moloch2 is stupid and does not copy the file esppoints to
the new directory. Therefore, you have to do:
moloch2
cd moloch_n_dir
cp ../esppoints .
moloch2
The calculation takes ~10 minutes and the result is
in moloch_n_dir/moloch.out.
Run chargefit to get the
CHELP-BOW charges. Do the default choices, except
Enter the type of input file (Gaussian, t=Turbomole) t
Enter the name of the file with the potential (log)
moloch.out
Constrain to quadrupole moment? (n) y
Constrain charges of hydrogens bonded to the same atom?
(y) n
Restrain to octapole moment with which weight? (
1.00 , 0=No) 0
Restrain to potential points with which weight?(0) b
Sample
input file.
(Same as for AMBER)
Build a pdb file from the QM/MM calculations using changeparm
prmtop3
p
pdb3
m
prmcrd3
q
Divide the file into one for the
protein and one for water (e.g. with emacs).
The files should have the same basis name and end with .pdb,
e.g. proj_prot.pdb
and proj_wat.pdb
For each file, convert the file to Charmm format and ensure
that the residues in two files are consecutively numbered,
starting from 1 in the first file and the numbers in the
second file follows those in the first.
Preferably, also insert a segid identifer; otherwise, charmm
will use the file name and if it is longer than 4 characters,
there will be a mismatch between the psf and pdb files
generated:
changepdb
<file name>
rr
1 in the first file, n+1 in the second, if n is
the number of residues in the first file
seg
<give a max 4-character segid>
cn
y
<{new}file name>
q
Set up the psf file for the
system:
First copy all *top and *par files for the non-standard
residues files to your own directories (to make the results
reproducible - files in cp /away/bio/charmm/input may be
changed/corrected afterwards).
cp /away/bio/charmm/input/*.top .
cp /away/bio/charmm/input/*.par .
Then check the generatepsf.charmm script:
cp /away/bio/Charmm/scripts/generatepsf.charmm .
and edit it by a word processor (see the sample file)
Sample CHARMM generate
psf file
Finally, run this script:
charmm_c30b1_gnu <generatepsf.charmm
>generatepsf.out pdb=proj
This gives the files proj.pdb and proj.psf
Set up a rtfparam.stream
file for the project.
This file reads in all the topology files needed for the
project and it is used by all the standard charmm scripts.
Sample rtfparam.stream
file
Ensure that the whole protein is
fully solvated with solvent molecules reaching at least 9 Å
outside the protein.
If this is not the case, add more water molecules using
charmm: We will solvate it in an octahedral waterbox, using
the script solvate_octa.inp
cp
/away/bio/charmm/scripts/solvate_octa.inp .
You normally do not need to change anything in this script.
Then, run the script
charmm_c30b1_gnu <solvate_octa.inp
>solvate_octa.out version=0 pdbfile=proj
psf=true shape=octahedral cutoff=1
solvate=true seqname=segid
Strangely, cutoff=1
typically gives the same number of waters as an Amber solvateOct x TIP3PBOX 9 command.
charmm 31b2 (the current version) works very slowly for
solvation calculations. Therefore, it is normally better to
use the old 30b1 version (charmm_c30b1_gnu)
instead. However, then you need to rerun the generate
section also with this program (as indicated above).
solvate_octa.inp generates four files:
proj_dimens.stream
proj_solv-octa.pdb
proj_solv-octa.crd
img_octnnnnn.crd
They are used by the equilibrate scripts.
Now, we will change the charges
in the proj.psf file.
Copy the file comqum.dat from the ComQum calculation.
Then, run changepdb on the pdb3 file
pdb3
Fhfm8hg.
qcc
f
fit.out
comqum.dat
w
pdbnew
q
Check thoroughly that the total charge of the two files pdbout
and pdbnew are the same (changepdb, command cc). If not,
insert into comqum.dat the residual charge of each of the
quantum residues (the first number is the number of residues,
then the residue number and residual charge of each residue is
given). Then rerun the previous point.
$residue_charge_corr
1
652 -1
Check that the total charge is an integer. If not add
the residual charge (owing to rounding-off errors) to one or
several atoms.
Change the charges in the *psf
file changepsf:
*.psf
m
pdbnew
w
q
We will use calculations with
constant pressure and PME in truncated ocahedron.
Start the sander calculations (see template
files below):
You can automatically get the belly group by changepdb, command be (from $correspondance_list in
comqum.dat).
100 steps of steepest decent
MM minimisation with all heavy atoms restrained to the
original QM/MM structure.
sander -i sander.in1 -o sander.out1
-r mdrest1 -c prmcrd -ref prmcrd
20 ps MD simulation at
constant pressure, still with heavy atoms restrained.
sander -i sander.in2 -o sander.out2
-r mdrest2 -c mdrest1 -ref prmcrd
50 ps MD equilibration with
only the quantum system restrained, constant pressure
sander -i sander.in3 -o sander.out3
-r mdrest3 -c mdrest2 -ref
prmcrd
Move the quantum system back
to the coordinates of mdrest1 (constant pressure
calculations violate the belly):
modrest
mdrest3
c
mdrest1
w
mdrest3_corr
Copy the PBC box size from the mdrest3 file (last row) to the mdrest_corr file.
180 ps MD equilibration with
constant volume + 200 ps data collection;
sander -i sander.in4 -o sander.out4
-r mdrest4 -e mden4 -x mdcrd4 -c mdrest3_corr
This is not fully tested yet. We want to get the same box size for all systems by running 1-3 only for one system. Then 4 is run for all and a short minimisation should preceed 6.
Check that all jobs have the same
number of atoms. If not, delete the most distant water
molecules (the two most distant are found by changepdb,
command cap) by:
remove x x.residue_number
When all the simulations are run,
free energies should be calculated.
These calculations depend somewhat on the system.
For a standard reaction-path calculation, a program is
present, calcqtcpfreee, which
calculates the needed energies by calling sander and Turbomole
a great number of times.
For other perturbations, you may need to modify the code
slightly.
For the calcqtcpfreee
calculation, you need to set up a number of files before
starting the program (note that you must use exactly the
suggested file names):
A sander_qtcp.in file for
single-point energy with PBC (constant volume), Ewald, and
the quantum system in the belly:
cp
/away/bio/Data/Qtcp/sander_qtcp.in .
Title
&cntrl
irest=0,ntx=1,
nstlim=1,dt=0.0,
temp0=300.0,ntt=1,tautp=1.0,
ntc=1,ntf=1,
cut=8.0,
ntpr=1,ntwx=0,ntwv=0,ntwe=1,
ntb=1,
ipol=0,igb=0,
scnb=2.0,scee=1.2,
ibelly=1
&end
GRP1
ATOM 1 233
ATOM 235 236
ATOM 245 10074
ATOM 10079 10081
ATOM 10084 10086
ATOM 10088 10094
ATOM 10098 10107
ATOM 10111 10113
ATOM 10118 10120
ATOM 10122 10128
ATOM 10132 10141
ATOM 10149 10151
ATOM 10153 10155
ATOM 10159 10168
ATOM 10172 10174
ATOM 10182 10188
ATOM 10191 10193
ATOM 10195 10253
ATOM 10284 46421
END
END
Make links to the mdrest and
prmtop files of the next and previous step in the reaction
pathway.
The prmtop files should be called prmtop_qtcp1
and (optionally) prmtop_qtcp2.
The mdrest files should be called mdrest_qtcp1 and
mdrest_qtcp2
ln ../../220/coord coord1
ln ../../220/Md/prmtop prmtop_qtcp1
Ensure that you have the
comqum.dat file in the current directory
cp ../comqum.dat .
You also need to wrap all
molecules into the primary periodic box (if you did not
use iwrap=1 in the sander simulation).
This is done for the whole mdcrd file using ptraj (the
name of the trajectory file and the residues of the
quantum system may be changed):
new2oldparm <prmtop
>prmtop.old
ptraj prmtop.old ptraj.in
This is the file ptraj.in:
trajin mdcrd3
trajout mdcrd3_wrap
trajectory
image center :WAT byres
familiar com "!:WAT"
go
Finally, you need to check
that you have all the standard Turbomole files in the
current directory:
alpha
auxbasis
basis
beta
coord
control
mos
cp ../alpha ../auxbasis ../basis ../beta ../coord
../control ../mos .
or
cp alpha auxbasis basis beta coord control mos Qtcp
Now, you are ready to run the
program calcqtcpfreee:
300
1
mdrest_qtcp1
0
y
mdcrd7_wrap1
Changepdb can correct this, command cn (remember to write out the file)
Different atom names (e.g. HB3
vs. HB1; O vs OH2 in WAT;
in particular ILE CD, HD123 and SER/CYS HG1 in CHARMM )
Format 1HG2 is not allowed.
Residue numbers starts in one position later in CHARMM, at least if there are >1000 residues.
?HETATM is not allowed?
?Residue number must start on 1?
Note that the stupid CHARMM request that all directories and file names do not include any capital letters!
Entries that may be changed are
highlighted.
* General script to generate a PSF from a
PDB file. Undefined heavy atoms are built from force field
* parameters and hydrogen atoms from hbuild. Atoms that are not
defined in the pdb file are listed
* in standard output so please check.
*
! Thomas H. Rod, Lund University, August 2005
! Modified by UR, 23/5-05
! USAGE:
! charmm < generatepsf.charmm pdb=pdb
!
! Where:
! pdb: is the firstname of the pdb file, which
defines the sequence. Be sure that the naming corresponds
! to CHARMM
format.
!
! OUTPUT: psf file named after pdb, i.e. @pdb.psf.
! coordinate
file:
@pdb+solv.pdb
! Read the Amber 94 topology files
set INPUT /away/bio/Charmm/input/non_charmm
open unit 1 read form name @INPUT/top_amber_cornell.inp
read rtf card unit 1
close unit 1
!stream amber94.stream
! Read non-standard topology files (note the append keyword)
!set DIR /away/bio/charmm/input_bio
open unit 1 read form name aae.top
read rtf card unit 1 append
close unit 1
open unit 1 read form name patch.top
read rtf card unit 1 append
close unit 1
! Read Amber 94 parameter file
open unit 1 read form name @INPUT/par_amber_cornell.inp
read param card unit 1
close unit 1
! Read non-standard parameter files (note the append keyword)
open unit 1 read form name <b12.par
read param card unit 1 append
close unit 1
! Read the sequence and generate the topology; protein
open unit 1 read form name @pdb_prot.pdb
read sequ pdb unit 1
close unit 1
generate @pdb first nmet last none setup angles dihedrals
patch nmet @pdb
138
patch nasp @pdb
621
patch cglu @pdb
137
patch cglu @pdb
620
patch cgln @pdb
651
! Define bonds
between Co and HIK-16 and 5AD
patch cobonds
@pdb 652 @pdb 16 @pdb 653
! Delete bonds
in the protein
patch del1 @pdb
137 @pdb 138
patch del1 @pdb
620 @pdb 621
patch del1 @pdb
634 @pdb 635
!This could probably have been better done with
!delete conn select resid 137 end select resid 138 end
! Regenerate the bonds and angles so that new are defined around
Co
! and angles around deleted bonds are removed
autogen angle dihed
! Water
open unit 1 read form name @pdb_wat.pdb
read sequ pdb unit 1
close unit 1
generate water first none last none setup noangles
nodihedrals
join @pdb water renumber
! Read the coordinates
open unit 1 read form name @pdb_prot.pdb
read coor pdb unit 1
close unit 1
open unit 1 read form name @pdb_wat.pdb
read coor pdb unit 1
close unit 1
! Build missing heavy atoms
ic fill
ic param
ic build
define missing select .not. hydrogen .and. .not. init show end
hbuild select hydrogen .and. .not. init end
define missing select .not. init show end
if ?nsel .gt. 0 stop ! stop if all atoms are not defined at this
point
! Write out coordinates
open unit 1 write form name @pdb.pdb
write coor pdb unit 1
* Coordinates for @PDB to be used for CHARMM
*
close unit 1
! Write out the psf file
open unit 1 write form name @pdb.psf
write psf card unit 1
* PSF for @PDB
*
close unit 1
stop
* Special patches for GluMut
* UR 26/8-05
*
! Define bonds between Co and HIK-16 and 5AD
PRES Cobonds
BOND 1CO 2NE2
BOND 1CO 3C5'
pres Del1
DELETE BOND 1C 2N
pres Del2
DELETE BOND 1C 2CO
END
This file reads in all the
topology and parameter files for the current project.
* File rftparam.stream
* Stream file to read in parameters and topology for project GluMut
*
! Read the Amber 94 topology files
set INPUT /away/bio/Charmm/input/non_charmm
open unit 1 read form name @INPUT/top_amber_cornell.inp
read rtf card unit 1
close unit 1
!stream amber94.stream
! Read non-standard topology files (note the append keyword)
open unit 1 read form name aae.top
read rtf card unit 1 append
close unit 1
! Read Amber 94 parameter file
open unit 1 read form name @INPUT/par_amber_cornell.inp
read param card unit 1
close unit 1
! Read non-standard parameter files (note the append keyword)
open unit 1 read form name b12.par
read param card unit 1 append
close unit 1
return
stop
* file to be
streamed. Contain definition of qmatoms
*
define qmatoms select -
resid 138 .and. ( type CB .or.
type CG .or. type OD1 -
.or. type OD2 ) .or. -
resid 166 .and. ( type CB .or.
type CG .or. type OD1 -
.or. type OD2 ) .or. -
resid 167 .and. ( type CB .or.
type CG .or. type OD1 -
.or. type ND2 .or. type HD21 .or. type HD22 )
.or. -
resid 214 .and. ( type MG ) .or.
-
resid 215 .and. ( type CB .or.
type CG .or. type HG3 -
.or. type HG2 .or. type SD .or.
type CE -
.or. type HE1 .or. type HE2 .or. type
HE3 -
.or. type C5' .or. type 1H5' .or. type 2H5' -
.or. type C4' ) .or. -
resid 216 .and. ( type C1 .or.
type O1 .or. type H1 -
.or. type C2 .or. type O2
.or. type C3 -
.or. type H3 .or. type C4
.or. type H4 -
.or. type C5 .or. type H5
.or. type C6 -
.or. type H6 ) .or. -
resid 217 .and. ( type O
.or. type H1 .or. type H2 ) -
end ! select
define qmatoms select qmatoms .and. .not. resn tip3 end
Unfortunately, solvate_octa.inp does
not build up a psf file for the full system, so you need to do
this in a seperate run:
* Build up the final psf file.
* Parameters:
* old = name of original psf
file
* nwat = number of water residues
* new = name of new psf file
* segid = segid for the protein in the
old psf file
* Usage:
* charmm <build_full_psf.charmm
>buld_full_psf.out old=old new=new nwat=3000 segid=COMT
* UR 13/9-06
*
! Set default values
if @?old .eq. 0 then set old old
if @?new .eq. 0 then set new new
if @?old .eq. 0 then set nwat 0
if @?segid .eq. 0 then set segid " "
! Read in the parameter and topology files
stream rtfparam.stream
! Read in the protein psf file
open unit 1 read form name @old.psf
read psf card unit 1
close unit 1
! Generate the water residues and join to the previous psf file
read sequ tip3 @nwat
generate wat setup noangl nodihe
join @segid wat
! Write out the psf file
open unit 1 write form name @new.psf
write psf card unit 1
* PSF for @NEW
*
close unit 1
Move the quantum system back to
the coordinates of mdrest1 (constant pressure calculations
violate the belly):
modrest
mdrest3
c
mdrest1
w
mdrest3_corr
Copy the PBC box size from the mdrest3 file (last row) to the mdrest_corr file.
180 ps MD equilibration with
constant volume + 200 ps data collection;
sander -i sander.in4 -o sander.out4 -r
mdrest4 -e mden4 -x mdcrd4 -c mdrest3_corr
Jimmy's suggestions (19/5-06):
wrap mdrest3 --> mdrest3_wrap via ptraj prmtop ptraj.in
ptraj.in: trajin mdrest3
trajout mdrest3_wrap restart
image center byres :WAT familiar com :1-217
go
Make ref.pdb from mdrest3_wrap with changeparm
Make move.pdb from desired qm-structure i.e apropate mdrest file.
Rms fit and move with vmd, use the following commands in vmd
console (posible to paste).
mol new ref.pdb
mol new move.pdb
set ref_sel [atomselect 0 "(mass > 1.2) and (not waters)
and ((resname
CAT SAM WAM MG) or (resid 138 166 167))"]
set move_sel [atomselect 1 "(mass > 1.2) and (not waters)
and ((resname
CAT SAM WAM MG) or (resid 138 166 167))"]
set trans_mat [measure fit $move_sel $ref_sel]
set move_sel [atomselect 1 "all"]
$move_sel move $trans_mat
$move_sel writepdb moved.pdb
quit
Make mdrest file from moved.pdb (obtained from rms fit with vmd)
Use modrest to insert qm coordinates, according to
ref.crd
c
moved.crd
w
mdrest3_newqm
Then finally put box info from mdrest3_wrap into mdrest3_newqm.
Compute the Helmholtz free energy
change by means of free energy perturbation, i.e.
Delta A = -kT*ln sum exp(-(de)/kT), where de = e1 - e0, e1 being
the energy of the perturbed state, and e0 the energy of the
non-perturbed state. The standard errors calculated by progression
of error and by bootstrapping are reported together with the 1st
and 2nd order perturbation terms.
USAGE:
>> helmholtz.py e0=<file0>
e1=<file1> kt=<kt> [offset=<offset>
bootstrap=<ncycles>]
OR
>> helmholtz.py de=<file>
kt=<kt> [offset=<offset> bootstrap=<ncycles>]
WHERE
file0: contains a list of the energies of the
non-perturbed state
file1: contains corresponding energies for the
perturbed state
file: (if used) contains a list of the
perturbation, i.e. e1-e0
kt: Temperature in energy
units (e.g. 2.4943533 for kJ/mole)
ncycles: The number of cycles in the bootstrap procedure
(the bootstrap standard error should converge with an increasing
number of ncycles. If bootstrap is not given at the command line,
a bootstrap calculation is not performed, i.e. the default is
zero.
offset: The number of data points, which should be
omitted from the beginning of the data sets. Default: zero.
NOTE:
e0, e1, de, and kt should all be given in the same energy
units. The computed free energies and errors are reported in the
same unit.
---
Thomas H. Rod, Lund University, August 2005
helmholtz.py de=mmen_qtcp1 kt=2.4943533
offset=200
Thomas Rod wrote 29/8-05:
I have added the two scripts 'helmholtz.py' and
'helmholtz_var.py' to '/away/bio/Python/bin
Note that Helmholtz is now spelled correctly and that the scripts
are documented. Look at the beginning of the file or type the
commands without arguments. Also, note that e1 and e2 have been
changed to e0 and e1, and temp to kt (see documentation). I have
also removed the
default parameter for kt (previous temp).
The script helmholtz_var.py computes the free energy change
between two states from a simulation of each of the states (see
documentation). I
suggest we use this script to report the final free energy changes.
/thomas
BENNET'S ACCEPTANCE RATIO METHOD.
Compute the Helmholtz free energy change based on simulations of
both of the two end states by means of Bennet's acceptance ratio
method. The free change computed by free energy perturbation for
forward and reverse process as well as by the overlap method
proposed by Kophke and coworkers are reported as well.
USAGE:
>> bennets.py de1=<file1> de2=<file2>
kt=<kt> [offset=<offset>]
WHERE
file1: contains a list of the de1 energies
file2: contains a list of the de2 energies
kt: Temperature in energy
units
offset: The number of data points, which should be
omitted from the beginning of the data sets. Default: 0
NOTE:
de1, de1, and kt should all be given in the same energy
units. The computed free energies are reported in the same unit.
dynamics leap @status timestep 0.002 nstep
10000 nprint 1000 iprfrq 1000 -
firstt
@T finalt @T twindl -5.0 twindh 5.0 iseed @seed -
ichecw
1 teminc 30.0 ihtfrq 20 ieqfrq 200 -
iasors
1 iasvel 1 iscvel 0 isvfrq 1000 -
nsavc
50 iuncrd 10 nsavv 0 iunvel 0 -
iunwri
3 iunread 4 kunit 20 - !{* Nonbond options *}
inbfrq
-1 imgfrq -1 ilbfrq 0 bycb - !use bycube image nonbond list
generator
eps
1.0 cutnb @cutoff cutim @cutoff ctofnb @ctofnb ctonnb @ctonnb
vswi -
Ewald
kappa 0.440 pmEwald order 6 fftx @fft ffty @fft fftz @fft ntrfq
200 -
qcor
0!PME
bycb
cpt - constant temperature and pressure dynamics (see pressure.doc)
ctofnb
ctonnb
cutim
cutnb
eps is the value of the dielectric constant (1.0)
Ewald kappa
fftx
ffty
fftz
iasors - option for scaling (0) or assigning temperatures
iasvel - method for assignment of velocities (if iasors<>0) (1=gaussian distr)
ichecw - option for checking the temperature (1 = on)
ieqfreq - frequency of scaling velocities for constant temperature
ihtfrq - frequency for heating the molecule in increments of TEMINC degrees
ilbfrq
imgfrq - frequency for the image update (50)
inbfrq - frequency of regenerating the non-bonded list
iprfrq - the frequency to calculate averages
iscvel - method for scaling velocities (0=single scale factor)
isvfrq - the frequency to save a restart file
iundrd
iunrea - fortran unit from which the restart file should be read
iunvel - fortran unit to which velocities are written (-1 = do not save)
iunwri
kunit - fortran unit to which energies are written
leap - the recommended leapfrog Verlet algorithm
nprint - frequency for writing energies
nsavc - frequnecy for writing coordinates
nsavv - frequency for writing velocities
nstep - the number of steps
ntrfq - frequency for stopping rotation and translation (0)
pconst - constant pressure is run
pmEwald order
qcor
start/restart - the status of the calculation
tconst - constant pressure is run
teminc - the heating step (cf. ihtfrq)
timestep - the time step in ps
vswi
dynamics cpt leap @status timestep 0.002
nstep 10000 nprint 1000 iprfrq 1000 -
firstt @T finalt @T
twindl -5.0 twindh 5.0 iseed 823 -
ichecw 1 teminc 30.0
ihtfrq 100 ieqfrq 200 echeck 10000 -
iasors 1 iasvel 1
iscvel 0 isvfrq 1000 -
iunwri 3 nsavc 50 nsavv
0 iunvel 0 -
iunread 4 iuncrd 10
kunit 20 - !{* Nonbond options *}
inbfrq -1 imgfrq -1
ilbfrq 0 bycb - !use bycube image nonbond list generator
eps 1.0 cutnb @cutoff
cutim @cutoff ctofnb @ctofnb ctonnb @ctonnb vswi -
Ewald kappa 0.320
pmEwald order 4 fftx @fft ffty @fft fftz @fft ntrfq 200 - ! PME
pconstant pmass 100
pref 1 pgamma 10 ! Constant pressure
Start from the QM wavefunction, obtained in QM/MM with point charges.
If it is an UHF
calculation, you need to have natural orbitals. If you do
not:
Insert into the control file the following keywords and
rerun ridft (dscf):
$natural orbitals file=natural
$natural orbital occupation
file=natural
Run makepoints to get the points where you will calculate the
electrostatic potential (ESP).
Select Turbomole (t), Random (r), No seed, 10000, 8 Å, and Turbomole (non-default values
highlighted).
This gives the file esppoints.
Note that moloch only writes out at the most 10 000 ESP
points. If you want more than that, you need to construct
several files and concatenate the results.
Insert the suggested rows into the control file (in the beginning to avoid already existing keywords).
Run moloch (RHF)
or moloch2 (UHF) to get the potentials and also the Mulliken
charges.
moloch2 is stupid and does not copy the file esppoints to
the new directory. Therefore, you have to do:
moloch2
cd moloch_n_dir
cp ../esppoints .
moloch2
The calculation takes ~10 minutes and the result is
in moloch_n_dir/moloch.out.
Run chargefit to
get the CHELP-BOW charges. Do the default choices, except
Enter the type of input file (Gaussian, t=TurboFomole) t
Enter the name of the file with the potential (log)
moloch.out
Constrain to quadrupole moment? (n) y
Constrain charges of hydrogens bonded to the same atom? (y)
n
Restrain to octapole moment with which weight? (
1.00 , 0=No) 0
Restrain to potential points with which weight?(0) b
Sample
input file.
TI from ox to red
&cntrl
irest=1,ntx=5,
nstlim=200000,dt=0.002,
temp0=300.0, ntt=3, gamma_ln=5.0,
ntc=2,ntf=2,
cut=10.0,ips=1,
ntpr=1000,ntwx=1000,ntwe=1000
ntb=1,
ipol=0,igb=0,
scnb=2.0,scee=1.2,
ntr=1,
icfe=1, clambda=0.0
&end
CqRestr
100.0
ATOM 531 531
ATOM 534 536
ATOM 538 538
ATOM 540 540
ATOM 1229 1229
ATOM 1231 1231
ATOM 1234 1234
ATOM 1266 1266
ATOM 1269 1271
ATOM 1273 1273
ATOM 1275 1275
ATOM 1324 1324
ATOM 1327 1327
ATOM 1330 1331
ATOM 1443 1443
END
END
Usuful help with running standard TI
Ewald calculations (for pKa):
22/9-06: Changed to Langevin Temperature regulation (no, it does not work with fixed atoms; removed again) and added initial extra constant volume simulation.