Q2MM (PON) Parameterisation
An ideal way of doing MM force-field parameterisation, based on QM
data.
Can treat also transition states.
"Automated Molecular Mechanics
Parameterization with Simultaneous Utilization of Experimental and
Quantum Chemical Data", Per-Ola Norrby and Tommy Liljefors,
J.Comput.Chem. 1998, 19, 1146-1166.
Review: "Hansen, E.; Rosales, A. R.; Tutkowski, B.; Norrby, P.-O.;
Wiest, O. Prediction of Stereochemistry using Q2MM. Acc. Chem. Res.
2016, 49, 996-1005."
Implementation for Amber, Dec. 2006: "P. Rydberg, L. Olsen, P.-O. Norrby & U. Ryde (2007)
"A general transition-state force field for cytochrome P450
hydroxylation", J. Chem. Theory Comput., 3, 1765-1773"
All files and scripts needed are in /away/bio/Pon_param, as well as some sample
input files.
Some info is found in the README
file
And more info about the programs ParEval, ParMod and ParUtil can be
found in a pdf file.
Note that the period and phase of dihedrals will not be optimised
(neither will the non-bonded parameters or charges).
All instances of X in the text below denotes a number
series. Preferably 1,2,3.. etc. This is to enable all scripts to
function with multiple structures (for example an enzyme active site
with different substrates)
- Optimise the desired structure(s) with QM (it is wise to put
the atoms in a sensible order, viz. the same order as in Amber).
Calculate RESP
charges and the Hessian (by Gaussian).
Alternatively, Turbomole can now
be used for the Hessian calculation (but still with RESP
charges).
Construct a pdb file from each of the optimised structure
(gausstopdb).
- Build an amber topology file (a *.in file) for every structure
from the data. This can be done automatically by describe (or alternatively by changepdb,
command p).
Be very careful NOT to renumbering any atom (can be cured, see
below).
Set appropriate atom types and insert the RESP charges.
Check the tree symbols (M, S, B, 3, E).
Note that with Gaussian, you need to use the coordinates from
the archive entry, not from the standard orientation or input
coordinates, otherwise the correlation of the Hessian elements
will be poor (~0.5, rather than >0.95). This should now be
automatically done by describe.
- Construct a template Amber parameter file for all missing
parameters (if you used describe in the previous point, this is
already done). This file must be named frcmod.hemts.
You should set reasonablee start values (but not zero values)
for those parameters you want to parameterise (e.g. analogous
values from the amber force field).
Good starting points can be taken from parameters generated by
describe (it even makes Amber template files).
The program MkPar assumes the following strict format:
Force constants for bonds are given in format f8.3, starting
from column 6
Equilibrium bond lengths in format f10.5, starting from column
16
Force constants and equilibrium values for angles in format
f8.3, starting from columns 10 and 22.
Force constants for dihedrals (and impropers) in format f8.3,
starting from column 19
See the sample frcmod.hemts
file below.
- Set up a leap command file for each structure, leapcom_X, which
sucessfully construct the prmtop and prmcrd file for the
calculation (check that no atoms are renumbered).
It should read the three files (pdb, *.in, and frcmod.dat)
constructed above.
You run it by
tleap -f leapcom_X
- Carefully check the values in the frcmod.hemts file,
especially the periodicity and phase of the dihedrals (which are
not changed in the parameterisation).
This can be done by changeparm, commands e (the bond and angle
energies should be close to zero) and dc (there should be no
dihedrals in the erroneous phase or large angles lists; also
check all dihedrals with suboptimal values).
Dihedrals with large angles should be removed by setting the
force constants to zero.
- Construct a nmodemm.in
file for MM minimisation (optionally, you can use a sander.in file instead).
Construct a nmode.in file
for Hessian calculation.
Use the modified version of nmode in $AMBERHOME/src/nmode/nmode
(/away/bio/Bin/Linux/nmodeh), that writes out the Hessian. See below for the
modifications needed.
Test the programs with
nmode -O -i nmodemm.in -o
nmodemm.out -p prmtop -c prmcrd -r mdrest
and then
nmodeh -O -i
nmode.in -o nmode.out -p prmtop -c mdrest -r temp
Check that a file hessian
is created.
- Run
MkPar > paropt
It takes the *dat file and writes a list of all parameters in
the right format to stdout.
Note that the titles of the parts of the frcmod file must be:
BOND
ANGLE
DIHEDRAL
IMPROPER
Otherwise, the file will not be properly read.
Note that MkPar does not work with atom types with only one
character!
mkpar probably works better:
mkpar
Note that paropt is tab-delimited, so be careful not to
delete the tabs.
Check the file paropt and
remove parameters you do not want to fit.
If you want to force two parameters to be the same, add the
number in first column (the line number in forcmod) of the
second parameter as a sixth column (i.e. a new final column) to
the first parameter and delete the row with the second parameter
from paropt (the program assumes that the parameters come in the
same column in the two rows).
Example:
15
6
49.0190
4.7009 %8.3f
16
(now the parameter in column 6 on line 16 will be the same as
the one in column 6 on line 15).
- Create a tethering file if needed (to tether some values to
the QM values).
Create a file called parteth.
This should be a copy of paropt, but with only bond distances
and angles in degrees. The fourth column should be changed to
the weighting of the tethering (10 for bonds and 1 for angles is
a good start guess). The third column should be the equilibrium
data from QC calculations. Run the script AveTeth to get average
data over all your structures for all bonds and angles in your
frcmod file.
- With Gaussian
frequencies:
Run the script
GetInvHess gauss.out
(if you have a transition state)
or the script
GetHess gauss.out
(if you have a ground state)
on the Gaussian frequency output file (note that the prmtop file
also needs to be in the same directory).
NOTE: gauss.out should only contain ONE frequency calculation.
If you have several, then just delete all except the one you
want to use in the output file.
This is the mass-weighted and reformed Hessian (ts eigenvalue
made positive).
If you get only the error message in the file
Unknown atomic number, 12,
exiting...
you have to first run
GetScale
to get the file scaling
and then
ParUtil -l gauss.out
-w12,24.305 -i D -f1185.8211,2240.877 -s1,.01 -2 -4 scaling
> gauss.out.cal
or for GetInvHess
ParUtil -l $1 -w12,24.305 -e > D.eigen
awk 'i==0 {print} i==1
{i=0;x=0;n=0;for(k=1;k<=NF;k++) if($k<x) {x=$k;n=k}\
for(k=1;k<=NF;k++) {y=k==n?1.:$k;printf "\t%16.8E",y}
printf "\n"}\
/Eigenvalues/ {i=1}' D.eigen > D.eigmod
ParUtil -er D.eigmod -i D
-w12,24.305 -f1185.8211,2240.877 -s1,.01
-2 -4 scaling > $1.cal
i.e., you should give the missing atomic weight after the -w
command to ParUtil.
For Turbomole use
GetHess_t force.log
or
GetInvHess_t force.log
with the output from aoforce. The file with the hessian needs to be named hessian and be in
the current directory.
Note that you need to take a backup copy of the file hessian, because this name is later used also for the MM hessian.
cp hessian hessian-qm
If you want to run ParUtil manually the command is the same as for a
gaussian log file, but with -l replaced by -m (t was allready taken):
ParUtil -m force.log -w12,24.305 -i D -f1185.8211,2240.877 -s1,.01 -2 -4 scaling > force.log.cal
- Only needed if the atom
numbers in Gaussian and Amber do not coincide
Correct the atom numbers in the hessian D.cal files:
For each structure do the following
- Run "tleap -f leapcom_X"
- Run "changeparm" and create a pdb file from the prmcrd
file
- Open the pdbfile with Rasmol (or equivalent) and label by
atom number
- Open the gaussian outputfile (used to create the DX.cal
file)
with
Molden
and
label
by
atom number
- If the atom numbers are not the same then create a new
file with two columns, the first column are the atom numbers
from the gaussian file (as shown in Molden) and the second
column are the amber atom numbers (as shown in Rasmol). (if
the atom numbers are exactly the same you can go to PonInit
directly)
- Run the script dcalfix.py
this creates a D.test file which should have the same two
first columns as DX.cal and other values in the third
column for atom numbers which are not the same in the file
you just created.
- Replace the DX.cal file with D.test
- Run the script PonInit
This creates files named dihlist_X with list of dihedrals
from changeparm.
You may remove from this file dihedrals that you don't want to
be included in the penalty function.
Dihedrals that contain angles that are larger than 150 degrees
are automatically removed to avoid that small changes causes
large problems.
- Currently buggy, skip
this point.
Run script LinSearch to provide good starting
values for all parameters.
Output in stdout.
Files paropt and frcmod.hemts are modified.
Note that the program may remove parameters. Check this
thoroughly.
If the LinSearch script goes into an infinite loop for a
variable, edit the paropt file manually and modify this
variable slightly to get it out of the loop
- Run the script CalcLoop
This is the main optimisation program. It runs iterative
Newton-Raphson (NR) optimisation of the parameters and Simplex
optimisation of some parameters that are poorly behaving in NR.
It calculates numerical derivatives by changing one parameter in
the frcmod file at a time (both up and down, giving
2*#parameters modified frcmod files, called frcmod.nnn).
The output is in loop.log
It modifies the paropt
and frcmod.hemts
files.
Other files are described below.
Options:
Skip Simplex optimization by creating the file NoSimplex.
Use Sander MM minimisation instead of Nmode by creating the file
CalcOneSander
The program can be softly stopped by
touch Loop.stop
There are some bugs when a parameter goes towards 0,
which usually makes the program crash.
If this happen, remove the parameter or change the period or
phase of a dihedral.
- During the run of CalcLoop, you should check:
- The convergence of the penalty function:
The script poncheck.sh
can be run during an optimization to follow the convergence
of the penalty function, it reads the loop.log file and
takes out the interesting data.
- That no force constant becomes negative (nr.ini, after
Best)
- That no force constant are zeroed (frcmod.hemts).
- After convergence, you should check the following
(all this is automatically done with the script ponresult):
- The penalty function (poncheck.sh):
A good final penalty function should be ~#data_points*0.25
for a ground state, but for transition states, it is usually
more than #data_points. (#data_points is the number of lines
in the file par.tot; it is also given by poncheck.sh)
- Check the nr.ini file after the line starting with Best
All 2nd derivatives should be negative:
Parameters with a positive 2nd derivatives are marked with a
#.
awk
'/^Parameter/,/radius/ {if($3 < 0) print
$1,$2,$3,$4,$5,$6,$7 }' nr.ini
There should be no negative signs in the last column (Jtg
diag) when the calculation is converged.
awk
'/^Parameter/,/radius/ {if($6 < 0) print
$1,$2,$3,$4,$5,$6,$7 }' nr.ini
The 2:nd derivative should be larger than the 1:s derivative
at convergence (i.e. all entries in the 1D-NR-step column
should be (absolute) less than 1).
awk
'/^Parameter/,/radius/ {if(($4 < -1) || ($4 > 1))
print $1,$2,$3,$4,$5,$6,$7 }' nr.ini
but the min and max values (at the end of this section) give
the same information, i.e. they should be absolute less than
1 at convergence (these are the min and max values of
1D-NR-step).
grep Min nr.ini
- Plot the QM and MM Hessian elements:
grep f1D par.tot | cut
-f1-4 >hess
sort -n -k3 hess
>hess_sort
gnuplot
plot "hess" using
3:4,x,0
It should show elements close to the diagonal and possibly a
few outliers (and often a tendency to element around y=0).
Check the outliers by finding their values by pointing at
them in the gnuplot plot and then finding the corresponding
values in hess
sort -n -k3 hess
>hess_sort
Diagonal elements of the Hessian (e.g. f1D17y17y)
cannot easily be improved, but other terms can often be
understood.
If you get a correlation coefficient of <0.6, something
is most likely wrong. Typically, you have used wrong
coordinates in Gaussian (not archieve entry for Hessian) or
your nmodeh does not write out the mass-weighted Hessian
(check that you use iulf and not iulf1)
- Build the final structure:
tleap -f leapcom_1
nmode -O -i nmodemm.in
-c prmcrd -r mdrest
changeparm <<EOL
prmtop
p
pdb0
m
prmcrd
q
EOL
changeparm <<EOL
prmtop
p
pdb
m
mdrest
q
EOL
changepdb, command rf can perform a rms fit of the
two structures.
Alternatively, this can be done with vms
Check it and compare it to the QM structure (pdb0).
- Compare the structure with the reference structure using
changeparm, command co
- Check that all force constants and bond or angles makes
physical sense.
Also check for zeroed force constants.
grep
'1 0.000' frcmod.hemts
| grep -v Large | grep -v Zero
grep ' 1 '
frcmod.hemts | grep ' 0.0' | grep -v Zero | grep -v Large
- Check that there are no # in error_results (see next
point).
grep '\#'
error_results
Also check the error limits of all parameters.
Force constants should have uncertainties less than 2.
Bonds less than 0.01.
Angles less than 0.1
- Check error_par.anal for the terms that give the largest
errors in the penalty function (see next point)
- An error estimate can be found by checking how large changes
in one parameter gives 1% worse penalty function.
This is done by the script pon_errorcheck.py
It outputs two files error_results
and error_par.anal.
error_results
contains all force constants and uncertainties (error estimates)
for these.
The first column is line number in paropt (or # for a poor
parameter).
The second is the line number in frcmod file.
The third is equilibrium value from paropt.
The fourth is the error estimate.
The fifth is an alternative error estimate (from the
smallest solution of the second degree equation).
grep '\#' error_results
error_par.anal
contains the label, the line number, the weight, the reference
value, the error, the quadratic errors (= their influence
on the penalty function), sorted by their quadratic error.
This shows which terms gives the largest errors in the penalty
function
If your optimization crashes you can also get the latter file
by:
cut -f1-4 par.tot >
par.anal
ParEval -z par.anal | sort
-n -k6 -r > error_par.anal
- If you want to change a parameter and rerun CalcLoop, you
should change it in both paropt and frcmod.hemts.
You can use the old files and make changes only in frcmod.hemts
and then run the program
ponmkpar
It will copy the old paropt file to paropt.old and update it
with the data from frcmod.hemts, asking for appropriate delta
values.
- When the parametrisation has converged, you can optimise the
parameters a bit further.
To do this manually, decrease the step lengths in paropt to 10.0
for bonds and lower for angles and dihedrals. Then start
CalcLoopHC (this script has tighter convergence than CalcLoop
and does not update step lengths in paropt).
At convergence decrease the step lengths even more.
Continue until you can't get improvements with CalcLoopHC by
lowering the step lengths.
- When all jobs are finished, you can run the script purpon to remove
unnessesary files.
If you want to save additional space, you could also remove D1.cal
and par.tot.
Finally, gzip all files.
gzip *&
Tips & Tricks
- If you have several structures, parametrise first a single
structure and then all structures with the first results as the
starting point.
- When running simplex in the first few iterations, sometimes
negative force constants can show up (it can actually happen at
any time, but the probability is higher during the first few
iterations).
grep '-' simp.[0-9]
shows if you have negative parameters.
If so, edit the simp.[0-9] files with minus signs and just
remove the minus signs (this is not really according to
protocol, but it works fine during early stages of
parametrisation).
- If a parameter suddenly has become 0, although this was not
expected (it should physically not be 0), then stop the
optimisation, locate the last frcmod file that was OK (not zero)
in the nrCycle folders, use it as your new frcmod file and also
take the par.xx (xx is a number) from the same folder as your
new paropt file, and then restart the optimisation.
This most likely happen because simplex has made the parameter
negative, which after an iteration implies that it is zeroed.
- If an angle parameter gets a (near) zero force constant, you
typically need to reduce the force constants of the other angles
with the same central atom.
Files from CalcLoop
Files with useful information
- loop.log
The log-file of the calculation.
Long, but contains the values of the penalty function in each
iteration.
Use poncheck.sh to
extract these data.
- nr.ini
This file summarises the latest step of Newton-Raphson
optimisation.
First, there is a section with all the tested force fields
(frcmod files, i.e. 2*#parameters) and their results in terms of
the penalty function, the maximum deviation, which row that
deviates, and the change in the penalty function.
Next, there is a section (search for Best) that describes the
best force field (marked by >>> in the previous
section). Here all the parameters are listed, together with
their estimated 1st and 2nd derivatives, the quotient of the 1st
and 2nd derivatives (1D-NR-step) and two more entires (Jtg 1st
and JtJ diag).
At convergence:
1st derivative should be small
2nd derivative should be positive (parameters with
a negative second derivative are marked with #)
1D-NR-step should be larger than 1 (absolute)
JtJ diag should be positive
This section ends with a 1D NR step.
First, the radius, max and min are given, then the step length
for each parameter
At convergence, min and max should be absolute less than 1.
If the calculation is poor (the radius is too large, no such
step is given "Radius too large (1541.22), 1D-NR not printed"
- par.ref
The reference for the penalty function
The first column contains a label, describing the structure (f1,
f2, ...), the type of term (b=bond, a=angle, d=dihedral,
D=Hessian element), and a description of the term, e.g. 1_4 for
the bond between atoms 1 and 4, or 35z31y for the Hessian
element for the z-coordinate of atom 35 and the y coordinate of
atom 31.
The second column is the weight factor for this term.
The third and last column is the reference value for this term.
Thus, wc par.ref gives the number of terms in the penalty
function.
- par.tot contains label - weight - ref_value and then 1
column per force field that is tested (a very big file)
ParEval -f par.tot calculates the quadratic sum of the
current model (like in file nr.ini)
- simplex.log
Log file for the simplex optimisation
- Less interesting files
- dihlist
The list of dihedrals to include in the penalty function
- frcmod.*
These are modified frcmod files, two for each parameter in
paropt, with a single parameter modified (either increased or
decreased), according to the steps in paropt.
- hessian
The MM hessian (as well as coordinates and forces) from the
latest nmode calculation.
- mdrest
The restart file for the nmode calculations
- nmode.out
The output file of the nmode Hessian calculation
- nmodemm.out
The output file of the nmode minimisation.
- nr.dmp
Similar to nr.ini.
The first section is identical.
The second section is missing
The third section, gives another four step with differrent
dampings (10, 1, 0.1, and 0.01). None of the steps are the same
as in nr.ini
- nr.eval
is a short file similar to the initical section in nr.ini, but
it contains only a few lines.
- nr.lin
Is a file with the results from a linear search, in the form of
several nr.ini files after each other.
- nr.sol
Contains only a few summary lines (from nr.lin and nr.dmp) with
steps with radius, max, and min. Then (on the same line), one
value for each parameter are given.
- nr.svd
Contains only a copy of the first section of nr.ini
- par.best
A paropt file with the currently best values
- par.diff
Another paropt file
- par.hess
The Hessian in some format.
- par.mult
A paropt file with only some values
- par.one
The output of the last CalcOne calculation, i.e. the values of
each term of the penalty function (should be subtracted from the
corresponding value in par.ref, multiplied with the weight,
squared and then summed to get the value of the penalty
function).
- par.tmp
Some temporary par.tot file
- simp.[0-9]*
Paropt-like input files for the simplex optimisation: Only the
problematic parameters that are varied are listed, with
different values of the parameter, but otherwise identical.
- simp.test
Similar to the simp.1, etc files.
- simp.top
Similar to the par.tot file for the simplex run.
- simpinit
Similar to the simp.1, etc files.
- simpcent
A par.one-like file for simplex optimisation.
- simpmat
A matrix of all the values for the simplex optimisation, but
without the header columns.
- simpmod
Similar to the simp.1, etc files, but with the actual value in
the first column.
- simpopt
Similar to the simp.1, etc files. With optimum values (?).
- simptemp
Similar to par.ref, but with an extra column with values.
- simptmp
Similar to simpcent. Temporary.
Sample leap_com file
source leaprc.ff03
loadAmberPrep heme_ts.in
loadAmberParams frcmod.hemts
x=loadpdb hemets.pdb
check x
saveamberparm x prmtop prmcrd
quit
To avoid errors like in tleap:
+Currently only Sp3-Sp3/Sp3-Sp2/Sp2-Sp2 are supported
+---Tried to superimpose torsions for: *-C1-C6-*
+--- With Sp0 - Sp0
+--- Sp0 probably means a new atom type is involved
+--- which needs to be added via addAtomTypes
Issue the following command for all new parameters in tleap:
addAtomTypes {
{"c1" "C" "sp2"}
{"c2" "C" "sp2"} }
For transition states, you should define both the breaking and
the forming bond.
You do that with
bond x.1.C3 x.2.HN1
Often, you then get the error message:
Bond: Maximum coordination exceeded on .R<TSP
2>.A<HN1 2>
-- setting atoms pert=true
overrides default limits
which you solve by issuing the command:
set x.2.HN1 pert true
bond x.1.C3 x.2.HN1
Sample nmodemm.in file
MM minimisation with nmode
&data
ntrun = 4,
nprint = 100,
nsave = 1000,
idiel = 1,
scnb = 2.0,
scee = 1.2,
cut = 999.0
&end
Sample sander.in file
Title
&cntrl
irest=0,ntx=1,
imin=1,maxcyc=1000,drms=0.001,
ntc=1,ntf=1,
nsnb=40,cut=999.0,dielc=1.0,
ntpr=100,ntwx=0,ntwv=0,ntwe=0,
ntb=0,ntp=0,
ipol=0,igb=0,
scnb=2.0,scee=1.2,
ncyc=10,ntmin=1,dx0=0.002
&end
Sample nmode.in file
Title
&data
ntx = 1,
ntrun = 1,
nvect = 0,
drms = 1,
dielc = 1.0, idiel = 1,
scnb = 2.0,
scee = 1.2,
cut = 999.0,
iulf=1
&end
END
Sample paropt file
The first column is the line in the *dat file, the second the
column, the third is the current value, the fourth is a step factor
and the last the format to rewrite it to the file.
15
6
49.0190
4.7009 %8.3f
15
16
2.0194
0.0088 %10.5f
16
6
50.6960
3.2380 %8.3f
16
16
2.0116
0.0039 %10.5f
17
6
286.9810
7.0154 %8.3f
17
16
1.3676
0.0032 %10.5f
18
6
316.9810
7.6348 %8.3f
18
16
1.3660
0.0060 %10.5f
19
6
272.0190
3.7641 %8.3f
19
16
1.4427
0.0060 %10.5f
20
6
390.0190
3.2305 %8.3f
20
16
1.3835
0.0060 %10.5f
21
6
417.0190
5.1356 %8.3f
21
16
1.3623
0.0058 %10.5f
22
6
337.0190
2.9041 %8.3f
22
16
1.0882
0.0064 %10.5f
23
6
337.0190
4.2527 %8.3f
23
16
1.0834
0.0063 %10.5f
24
6
200.9810
6.2876 %8.3f
24
16
1.4079
0.0064 %10.5f
25
6
200.9810
7.4684 %8.3f
25
16
1.1511
0.0040 %10.5f
26
6
199.0190
3.2894 %8.3f
Sample parteth file
The first column is the line in the *dat file, the second the
column, the third is the average QC value, the fourth is the weight
factor and the last the format to rewrite it to the file.
15
16
2.0194
10.0 %10.5f
16
16
2.0116
10.0 %10.5f
17
16
1.3676
10.0 %10.5f
18
16
1.3660
10.0 %10.5f
19
16
1.4427
10.0 %10.5f
20
16
1.3835
10.0 %10.5f
21
16
1.3623
10.0 %10.5f
22
16
1.0882
10.0 %10.5f
23
16
1.0834
10.0 %10.5f
24
16
1.4079
10.0 %10.5f
25
16
1.1511
10.0 %10.5f
Sample frcmod.hemts file
Title - P. Rydberg 11/11-05
MASS
NP 14.01
NO 14.01
BOND
FE-NP
87.661 2.01280
FE-NO
49.891 2.03930
ANGLE
NP-FE-NP
32.753 173.285
NO-FE-NO
38.190 179.739
DIHEDRAL
X-NO-FE-X
1
1.000
180.000 2.000
X-NP-FE-X
1
2.000
180.000 2.000
IMPROPER
c1-c1-NP-FE
0
1.041
180.000 2.000
c1-c1-NO-FE
0
0.842
180.000 2.000
NONBON
FE
1.20000 0.05000 0.00000
HQ
1.00 0.020
0.00000
Modifications of
Amber nmode
Note that the Hessian must be mass-weighted.
Three files are modified: inpdat.h, rdinp.f, and nmode.f
diff inpdat.h~ inpdat.h
10c10
<
ipol,i3bod,ismem
---
>
ipol,i3bod,ismem,iulf
diff rdinp.f~ rdinp.f
38c38
<
ipol,i3bod,ismem
---
>
ipol,i3bod,ismem,iulf
91a92
> iulf=0
diff nmode.f~
nmode.f (Amber9)
186d185
< if
((iulf.eq.0).and.(iulf1.eq.0)) then
188d186
< endif
191,210d188
<
! ----- Dump coordinates force
and Hessian
<
! ----- Added by Ulf Ryde,
20/9-06
< if (iulf1.gt.0) then
<
write(6,*)' '
<
write(6,*)mx,mf,mh
<
open(73,file='hessian',status='unknown',form='formatted')
<
write(73,'(i9)')natom
<
write(73,'(a)')'Coordinates'
<
write(73,'(3f16.8)')(x(mx+i),i=0,3*natom-1)
<
write(73,'(a)')'Forces (not mass weighted)'
<
write(73,'(3e16.8)')(x(mf+i),i=0,3*natom-1)
<
write(73,'(a)')'Hessian (not mass weighted)'
< k=0
< do
i=1,natom*3
<
write(73,'(6e16.8)')(x(mh+k+j),j=0,i-1)
<
k=k+i
< enddo
<
close(73)
< endif
<
215,233d192
<
<
! ----- Dump coordinates force
and Hessian
<
! ----- Added by Ulf Ryde,
28/11-05
< if (iulf.gt.0) then
<
write(6,*)mx,mf,mh
<
open(73,file='hessian',status='unknown',form='formatted')
<
write(73,'(i9)')natom
<
write(73,'(a)')'Coordinates'
<
write(73,'(3f16.8)')(x(mx+i),i=0,3*natom-1)
<
write(73,'(a)')'Forces (mass weighted)'
<
write(73,'(3e16.8)')(x(mf+i),i=0,3*natom-1)
<
write(73,'(a)')'Hessian (mass weighted)'
< k=0
< do
i=1,natom*3
<
write(73,'(6e16.8)')(x(mh+k+j),j=0,i-1)
<
k=k+i
< enddo
<
close(73)
< endif
300,308d258
< ! Code added by Jacob Kongsted 2007 for new
MM/PBSA entropy
< ! JK
< nvecs
= 3*natsys - 6
< call
thermo(natsys,nvecs,ilevel,x(mx),x(mamass),x(mcval), &
<
x(mh),x(mh+ns3),x(mh+2*ns3),
&
<
x(mh+3*ns3),t,patm)
< ! End of new code
<
<
Compiling the programs
Most of the programs are unix shell scripts.
However, there are also a few c programs:
The needed files are found in the Source directory:
ParEval.c ParMod.c ParUtil.c ParUtil_t.c
jacobi.c nrutil.c
ParUtil is compiled
by:
gcc ParUtil.c jacobi.c nrutil.c -o ParUtil -lm
ParUtil_t is compiled by:
gcc ParUtil_t.c jacobi.c nrutil.c -o ParUtil_t -lm
However, I failed to compile the remaining two program, owing to
missing *.c files:
gcc ParMod.c nrutil.c -lm
ParMod.c:(.text+0x71b): undefined reference to `indexx'
file indexx.c is missing
gcc ParEval.c nrutil.c -lm
ParEval.c:(.text+0x2675): undefined reference to `dsvdcmp'
ParEval.c:(.text+0x2b17): undefined reference to `dsvbksb'
ParEval.c:(.text+0x2bf9): undefined reference to `choldc'
ParEval.c:(.text+0x2c23): undefined reference to `cholsl'
collect2: error: ld returned 1 exit status
Four *.c files are missing.
The README file
Automated parameterization example
Copyright (C) 1997 Per-Ola Norrby
See: "Automated Molecular Mechanics Parameterization with
Simultaneous Utilization of Experimental and Quantum
Chemical Data", Per-Ola Norrby, Tommy Liljefors,
J.Comput.Chem. 1998, 19, 1146-1166.
Please don't give this away. I like to keep track of
who's using it, refer interested people to me,
pon .at. kemi.dtu.dk or http://organisk.kemi.dtu.dk/PON/
Per-Ola Norrby
In this example, some parameters in MM3* are refined.
The reference data in this example are structures, energies,
and cartesian mass-weighted Hessians from B3LYP/6-31G*
calculations. The structures have been exported to
files X.mae, D.mae, and E.mae.
ESP charges were calculated from the wavefunction,
and can be found in the D.mae-file. X.mae contains
reference structures, D.mae have all structures for which
Hessians have been calculated (together with reference charges),
and E.mae have structures for calculation of energy
differences, with the QM energy for each structure.
The mass-weighted Hessians have been
extracted into file D.cal in lower triangular form. For
creating D.cal, see GetHess.
The current force field comparison data is calculated by
the following MacroModel batch files:
Mass-weighted Hessian: D.com
Minimized structures : X.com
Relative energies : E.com
There is also the command file XXR.com, which uses X.mae as input
and evaluates all structural elements (bonds, angles, torsions)
for the file of reference data. It also minimizes all structures
AFTER outputting structure data. Thus, by reading the file XXR-out.mae
into MacroModel 2 structures at a time, you can compare reference
and minimized structures on the screen (not necessary, but nice :-).
For structural data, note that only structure elements with the string
OPT in the comment portion of the force field will be extracted and
used in comparisons. To use non-default scale factors for bonds,
angles, and torsions, respectively, the second word of the structure
name must be "Scaling", and the third through fifth should be the new
scale factors. Note that scaling applies to ALL subsequent structures,
unless a new set is given.
The files mm3.fld and atom.typ should be modified so that all
structures can be calculated (and preferably be recognizable)
before starting the parameter refinement. All parameters to be
refined should be defined in file paropt. The first two colums
correspond to row and position in the force field (position values
1-3 are interpreted as default MacroModel offsets, otherwise as the
number of characters before the parameter. Negative position values
allow the parameter to get a negative value). The third column is
the current parameter value, the fourth (optional) is the numerical
differentiation step, and the fifth (optional) is a C format for
writing the parameter into the force field file (default: %9.4f).
The Unix script MkPar can be modified and used to create an
initial paropt. You may want to modify paropt.
Parameters to be tethered are defined together with
the harmonic tethering constant (column 4) in file parteth.
Three executables are included, ParEval, ParMod, and ParUtil (together
with C-source: Numerical Recipes routines are not included). These
are not described in detail, but the main functions are:
ParEval: Evaluate force field results, suggest new solutions.
ParMod : Modify the force field and parameter definition files
(mm3.fld and paropt).
ParUtil: Whatever needs doing, including extraction and manipulating
QM Hessians, and a Unix paste for
very large, tab-separated column files.
Several Unix scripts are included. These are all files that start
with the letters C, M, N, or S. Only CalcRef and CalcOne will actually
start MM calculations. These should be modified for each application.
MkPar was mentioned above, the others are called by the two top-level
commands CalcLoop and NR_Loop, as follows:
CalcLoop ; Uses both NR and Simplex, first does a
LinSearch ; Line Search (once only, see below), then loops until
; no improvement can be obtained.
NR_Jacob ; Calculates the Jacobian by numerical differentiation,
; suggests several new parameter sets
NR_CalcAll ; Calculates all reference values and all data points for
; every force field, puts the complete matrix > par.tot
CalcRef ; Scaling factors and reference values > par.ref
CalcOne ; Calculates values from one force field > par.one
NR_Eval ; Evaluates all suggested force fields, selects the best
NR_CalcAll
CalcOne
Simplex ; Performs a Simplex optimization based on parameter
; definition file simpopt
SimpAll ; Similar to NR_CalcAll, output > simp.tot
CalcRef
CalcOne
LinSearch ; Makes a Line Search with parabolic interpolation
; for each parameter in paropt; use only initially
NR_CalcAll
CalcOne
CalcRef
NR_Loop ; Loops like CalcLoop, but using only NR (not updated)
NR_Jacob
NR_CalcAll
CalcRef
CalcOne
NR_Eval
NR_CalcAll
CalcOne
Looping can be controlled by "touch"-ing specific control files.
For example, CalcLoop can be stopped gracefully by "touch Loop.stop".
The numerical differentiation steps should be "balanced", modify
them so that all derivatives (eg., in nr.lin after NR) are "similar".
What is similar? Say, within two orders of magnitude, if possible.
Manuals for the C-commands are found in the files Par*.txt
Good luck!
Send comments to Per-Ola Norrby, pon .at. chem.gu.se