ponparm.html

Q2MM (PON) Parameterisation

An ideal way of doing MM force-field parameterisation, based on QM data.
Can treat also transition states.
"Automated Molecular Mechanics Parameterization with Simultaneous Utilization of Experimental and Quantum Chemical Data", Per-Ola Norrby and Tommy Liljefors, J.Comput.Chem. 1998, 19, 1146-1166.
Review: "Hansen, E.; Rosales, A. R.; Tutkowski, B.; Norrby, P.-O.; Wiest, O. Prediction of Stereochemistry using Q2MM. Acc. Chem. Res. 2016, 49, 996-1005."

Implementation for Amber, Dec. 2006: "P. Rydberg, L. Olsen, P.-O. Norrby & U. Ryde (2007) "A general transition-state force field for cytochrome P450 hydroxylation", J. Chem. Theory Comput., 3, 1765-1773"

All files and scripts needed are in /away/bio/Pon_param, as well as some sample input files.
Some info is found in the README file
And more info about the programs ParEval, ParMod and ParUtil can be found in a pdf file.

Note that the period and phase of dihedrals will not be optimised (neither will the non-bonded parameters or charges).

All instances of X in the text below denotes a number series. Preferably 1,2,3.. etc. This is to enable all scripts to function with multiple structures (for example an enzyme active site with different substrates)

Optimise the desired structure(s) with QM (it is wise to put the atoms in a sensible order, viz. the same order as in Amber).
Calculate RESP charges and the Hessian (by Gaussian).
Alternatively, Turbomole can now be used for the Hessian calculation (but still with RESP charges).
Construct a pdb file from each of the optimised structure (gausstopdb).
Build an amber topology file (a *.in file) for every structure from the data. This can be done automatically by describe (or alternatively by changepdb, command p).
Be very careful NOT to renumbering any atom (can be cured, see below).
Set appropriate atom types and insert the RESP charges.
Check the tree symbols (M, S, B, 3, E).

Note that with Gaussian, you need to use the coordinates from the archive entry, not from the standard orientation or input coordinates, otherwise the correlation of the Hessian elements will be poor (~0.5, rather than >0.95). This should now be automatically done by describe.
Construct a template Amber parameter file for all missing parameters (if you used describe in the previous point, this is already done). This file must be named frcmod.hemts.
You should set reasonablee start values (but not zero values) for those parameters you want to parameterise (e.g. analogous values from the amber force field).
Good starting points can be taken from parameters generated by describe (it even makes Amber template files).

The program MkPar assumes the following strict format:
Force constants for bonds are given in format f8.3, starting from column 6
Equilibrium bond lengths in format f10.5, starting from column 16
Force constants and equilibrium values for angles in format f8.3, starting from columns 10 and 22.
Force constants for dihedrals (and impropers) in format f8.3, starting from column 19

See the sample frcmod.hemts file below.
Set up a leap command file for each structure, leapcom_X, which sucessfully construct the prmtop and prmcrd file for the calculation (check that no atoms are renumbered).
It should read the three files (pdb, *.in, and frcmod.dat) constructed above.

You run it by
tleap -f leapcom_X
Carefully check the values in the frcmod.hemts file, especially the periodicity and phase of the dihedrals (which are not changed in the parameterisation).

This can be done by changeparm, commands e (the bond and angle energies should be close to zero) and dc (there should be no dihedrals in the erroneous phase or large angles lists; also check all dihedrals with suboptimal values).

Dihedrals with large angles should be removed by setting the force constants to zero.
Construct a nmodemm.in file for MM minimisation (optionally, you can use a sander.in file instead).
Construct a nmode.in file for Hessian calculation.
Use the modified version of nmode in $AMBERHOME/src/nmode/nmode (/away/bio/Bin/Linux/nmodeh), that writes out the Hessian. See below for the modifications needed.
Test the programs with
nmode -O -i nmodemm.in -o nmodemm.out -p prmtop -c prmcrd -r mdrest
and then
nmodeh -O -i nmode.in -o nmode.out -p prmtop -c mdrest -r temp
Check that a file hessian is created.
Run
MkPar > paropt
It takes the *dat file and writes a list of all parameters in the right format to stdout.
Note that the titles of the parts of the frcmod file must be:
BOND
ANGLE
DIHEDRAL
IMPROPER
Otherwise, the file will not be properly read.

Note that MkPar does not work with atom types with only one character!
mkpar probably works better:
mkpar

Note that paropt is tab-delimited, so be careful not to delete the tabs.

Check the file paropt and remove parameters you do not want to fit.

If you want to force two parameters to be the same, add the number in first column (the line number in forcmod) of the second parameter as a sixth column (i.e. a new final column) to the first parameter and delete the row with the second parameter from paropt (the program assumes that the parameters come in the same column in the two rows).
Example:
15 6 49.0190 4.7009 %8.3f 16
(now the parameter in column 6 on line 16 will be the same as the one in column 6 on line 15).
Create a tethering file if needed (to tether some values to the QM values).
Create a file called parteth. This should be a copy of paropt, but with only bond distances and angles in degrees. The fourth column should be changed to the weighting of the tethering (10 for bonds and 1 for angles is a good start guess). The third column should be the equilibrium data from QC calculations. Run the script AveTeth to get average data over all your structures for all bonds and angles in your frcmod file.
With Gaussian frequencies:
Run the script
GetInvHess gauss.out (if you have a transition state)
or the script
GetHess gauss.out (if you have a ground state)
on the Gaussian frequency output file (note that the prmtop file also needs to be in the same directory).

NOTE: gauss.out should only contain ONE frequency calculation. If you have several, then just delete all except the one you want to use in the output file.

This is the mass-weighted and reformed Hessian (ts eigenvalue made positive).

If you get only the error message in the file
Unknown atomic number, 12, exiting...
you have to first run
GetScale
to get the file scaling
and then
ParUtil -l gauss.out -w12,24.305 -i D -f1185.8211,2240.877 -s1,.01 -2 -4 scaling > gauss.out.cal
or for GetInvHess
ParUtil -l $1 -w12,24.305 -e > D.eigen
awk 'i==0 {print} i==1 {i=0;x=0;n=0;for(k=1;k<=NF;k++) if($k<x) {x=$k;n=k}\
for(k=1;k<=NF;k++) {y=k==n?1.:$k;printf "\t%16.8E",y} printf "\n"}\
/Eigenvalues/ {i=1}' D.eigen > D.eigmod
ParUtil -er D.eigmod -i D -w12,24.305 -f1185.8211,2240.877 -s1,.01 -2 -4 scaling > $1.cal
i.e., you should give the missing atomic weight after the -w command to ParUtil.
```
For Turbomole use 
GetHess_t force.log
or
GetInvHess_t force.log
with the output from aoforce. The file with the hessian needs to be named hessian and be in 
the current directory. 
Note that you need to take a backup copy of the file hessian, because this name is later used also for the MM hessian.
cp hessian hessian-qm

If you want to run ParUtil manually the command is the same as for a 
gaussian log file, but with -l replaced by -m (t was allready taken):

ParUtil -m force.log -w12,24.305 -i D -f1185.8211,2240.877 -s1,.01 -2 -4 scaling > force.log.cal
```
Only needed if the atom numbers in Gaussian and Amber do not coincide
Correct the atom numbers in the hessian D.cal files:
For each structure do the following
- Run "tleap -f leapcom_X"
- Run "changeparm" and create a pdb file from the prmcrd file
- Open the pdbfile with Rasmol (or equivalent) and label by atom number
- Open the gaussian outputfile (used to create the DX.cal file) with Molden and label by atom number
- If the atom numbers are not the same then create a new file with two columns, the first column are the atom numbers from the gaussian file (as shown in Molden) and the second column are the amber atom numbers (as shown in Rasmol). (if the atom numbers are exactly the same you can go to PonInit directly)
- Run the script dcalfix.py
  this creates a D.test file which should have the same two first columns as DX.cal and other values in the third column for atom numbers which are not the same in the file you just created.
- Replace the DX.cal file with D.test
Run the script PonInit
This creates files named dihlist_X with list of dihedrals from changeparm.
You may remove from this file dihedrals that you don't want to be included in the penalty function.
Dihedrals that contain angles that are larger than 150 degrees are automatically removed to avoid that small changes causes large problems.
Currently buggy, skip this point.
Run script LinSearch to provide good starting values for all parameters.
Output in stdout.
Files paropt and frcmod.hemts are modified.
Note that the program may remove parameters. Check this thoroughly.
If the LinSearch script goes into an infinite loop for a variable, edit the paropt file manually and modify this variable slightly to get it out of the loop
Run the script CalcLoop
This is the main optimisation program. It runs iterative Newton-Raphson (NR) optimisation of the parameters and Simplex optimisation of some parameters that are poorly behaving in NR.
It calculates numerical derivatives by changing one parameter in the frcmod file at a time (both up and down, giving 2*#parameters modified frcmod files, called frcmod.nnn).

The output is in loop.log
It modifies the paropt and frcmod.hemts files.
Other files are described below.

Options:
Skip Simplex optimization by creating the file NoSimplex.
Use Sander MM minimisation instead of Nmode by creating the file CalcOneSander
The program can be softly stopped by
touch Loop.stop

There are some bugs when a parameter goes towards 0, which usually makes the program crash.
If this happen, remove the parameter or change the period or phase of a dihedral.
During the run of CalcLoop, you should check:
- The convergence of the penalty function:
  The script poncheck.sh can be run during an optimization to follow the convergence of the penalty function, it reads the loop.log file and takes out the interesting data.
- That no force constant becomes negative (nr.ini, after Best)
- That no force constant are zeroed (frcmod.hemts).
After convergence, you should check the following
(all this is automatically done with the script ponresult):
- The penalty function (poncheck.sh):
  A good final penalty function should be ~#data_points*0.25 for a ground state, but for transition states, it is usually more than #data_points. (#data_points is the number of lines in the file par.tot; it is also given by poncheck.sh)
- Check the nr.ini file after the line starting with Best
  All 2nd derivatives should be negative:
  Parameters with a positive 2nd derivatives are marked with a #.
  awk '/^Parameter/,/radius/ {if($3 < 0) print $1,$2,$3,$4,$5,$6,$7 }' nr.ini
  
  There should be no negative signs in the last column (Jtg diag) when the calculation is converged.
  awk '/^Parameter/,/radius/ {if($6 < 0) print $1,$2,$3,$4,$5,$6,$7 }' nr.ini
  
  The 2:nd derivative should be larger than the 1:s derivative at convergence (i.e. all entries in the 1D-NR-step column should be (absolute) less than 1).
  awk '/^Parameter/,/radius/ {if(($4 < -1) || ($4 > 1)) print $1,$2,$3,$4,$5,$6,$7 }' nr.ini
  but the min and max values (at the end of this section) give the same information, i.e. they should be absolute less than 1 at convergence (these are the min and max values of 1D-NR-step).
  grep Min nr.ini
- Plot the QM and MM Hessian elements:
  grep f1D par.tot | cut -f1-4 >hess
  sort -n -k3 hess >hess_sort
  gnuplot
  plot "hess" using 3:4,x,0
  It should show elements close to the diagonal and possibly a few outliers (and often a tendency to element around y=0).
  
  Check the outliers by finding their values by pointing at them in the gnuplot plot and then finding the corresponding values in hess
  sort -n -k3 hess >hess_sort
  
  Diagonal elements of the Hessian (e.g. f1D17y17y) cannot easily be improved, but other terms can often be understood.
  
  If you get a correlation coefficient of <0.6, something is most likely wrong. Typically, you have used wrong coordinates in Gaussian (not archieve entry for Hessian) or your nmodeh does not write out the mass-weighted Hessian (check that you use iulf and not iulf1)
- Build the final structure:
  tleap -f leapcom_1
  nmode -O -i nmodemm.in -c prmcrd -r mdrest
  changeparm <<EOL
  prmtop
  p
  pdb0
  m
  prmcrd
  q
  EOL
  changeparm <<EOL
  prmtop
  p
  pdb
  m
  mdrest
  q
  EOL
  
  changepdb, command rf can perform a rms fit of the two structures.
  Alternatively, this can be done with vms
  
  Check it and compare it to the QM structure (pdb0).
- Compare the structure with the reference structure using changeparm, command co
- Check that all force constants and bond or angles makes physical sense.
  Also check for zeroed force constants.
  grep '1 0.000' frcmod.hemts | grep -v Large | grep -v Zero
  
  grep ' 1 ' frcmod.hemts | grep ' 0.0' | grep -v Zero | grep -v Large
- Check that there are no # in error_results (see next point).
  grep '\#' error_results
  Also check the error limits of all parameters.
  Force constants should have uncertainties less than 2.
  Bonds less than 0.01.
  Angles less than 0.1
- Check error_par.anal for the terms that give the largest errors in the penalty function (see next point)
An error estimate can be found by checking how large changes in one parameter gives 1% worse penalty function.
This is done by the script pon_errorcheck.py

It outputs two files error_results and error_par.anal.
error_results contains all force constants and uncertainties (error estimates) for these.
The first column is line number in paropt (or # for a poor parameter).
The second is the line number in frcmod file.
The third is equilibrium value from paropt.
The fourth is the error estimate.
The fifth is an alternative error estimate (from the smallest solution of the second degree equation).
grep '\#' error_results

error_par.anal contains the label, the line number, the weight, the reference value, the error, the quadratic errors (= their influence on the penalty function), sorted by their quadratic error.
This shows which terms gives the largest errors in the penalty function

If your optimization crashes you can also get the latter file by:
cut -f1-4 par.tot > par.anal
ParEval -z par.anal | sort -n -k6 -r > error_par.anal
If you want to change a parameter and rerun CalcLoop, you should change it in both paropt and frcmod.hemts.
You can use the old files and make changes only in frcmod.hemts and then run the program
ponmkpar
It will copy the old paropt file to paropt.old and update it with the data from frcmod.hemts, asking for appropriate delta values.
When the parametrisation has converged, you can optimise the parameters a bit further.
To do this manually, decrease the step lengths in paropt to 10.0 for bonds and lower for angles and dihedrals. Then start CalcLoopHC (this script has tighter convergence than CalcLoop and does not update step lengths in paropt).
At convergence decrease the step lengths even more.
Continue until you can't get improvements with CalcLoopHC by lowering the step lengths.
When all jobs are finished, you can run the script purpon to remove unnessesary files.

gzip *&

Tips & Tricks

If you have several structures, parametrise first a single structure and then all structures with the first results as the starting point.
When running simplex in the first few iterations, sometimes negative force constants can show up (it can actually happen at any time, but the probability is higher during the first few iterations).
grep '-' simp.[0-9]
shows if you have negative parameters.
If so, edit the simp.[0-9] files with minus signs and just remove the minus signs (this is not really according to protocol, but it works fine during early stages of parametrisation).
If a parameter suddenly has become 0, although this was not expected (it should physically not be 0), then stop the optimisation, locate the last frcmod file that was OK (not zero) in the nrCycle folders, use it as your new frcmod file and also take the par.xx (xx is a number) from the same folder as your new paropt file, and then restart the optimisation.
This most likely happen because simplex has made the parameter negative, which after an iteration implies that it is zeroed.
If an angle parameter gets a (near) zero force constant, you typically need to reduce the force constants of the other angles with the same central atom.

Files from CalcLoop

Files with useful information

loop.log
The log-file of the calculation.
Long, but contains the values of the penalty function in each iteration.
Use poncheck.sh to extract these data.
nr.ini
This file summarises the latest step of Newton-Raphson optimisation.
First, there is a section with all the tested force fields (frcmod files, i.e. 2*#parameters) and their results in terms of the penalty function, the maximum deviation, which row that deviates, and the change in the penalty function.

Next, there is a section (search for Best) that describes the best force field (marked by >>> in the previous section). Here all the parameters are listed, together with their estimated 1st and 2nd derivatives, the quotient of the 1st and 2nd derivatives (1D-NR-step) and two more entires (Jtg 1st and JtJ diag).
At convergence:
   1st derivative should be small
   2nd derivative should be positive (parameters with a negative second derivative are marked with #)
   1D-NR-step should be larger than 1 (absolute)
   JtJ diag should be positive

This section ends with a 1D NR step.
First, the radius, max and min are given, then the step length for each parameter
At convergence, min and max should be absolute less than 1.
If the calculation is poor (the radius is too large, no such step is given "Radius too large (1541.22), 1D-NR not printed"
par.ref
The reference for the penalty function
The first column contains a label, describing the structure (f1, f2, ...), the type of term (b=bond, a=angle, d=dihedral, D=Hessian element), and a description of the term, e.g. 1_4 for the bond between atoms 1 and 4, or 35z31y for the Hessian element for the z-coordinate of atom 35 and the y coordinate of atom 31.
The second column is the weight factor for this term.
The third and last column is the reference value for this term.
Thus, wc par.ref gives the number of terms in the penalty function.
par.tot contains label - weight - ref_value and then 1 column per force field that is tested (a very big file)

ParEval -f par.tot calculates the quadratic sum of the current model (like in file nr.ini)
simplex.log
Log file for the simplex optimisation
Less interesting files
dihlist
The list of dihedrals to include in the penalty function
frcmod.*
These are modified frcmod files, two for each parameter in paropt, with a single parameter modified (either increased or decreased), according to the steps in paropt.
hessian
The MM hessian (as well as coordinates and forces) from the latest nmode calculation.
mdrest
The restart file for the nmode calculations
nmode.out
The output file of the nmode Hessian calculation
nmodemm.out
The output file of the nmode minimisation.
nr.dmp
Similar to nr.ini.
The first section is identical.
The second section is missing
The third section, gives another four step with differrent dampings (10, 1, 0.1, and 0.01). None of the steps are the same as in nr.ini
nr.eval
is a short file similar to the initical section in nr.ini, but it contains only a few lines.
nr.lin
Is a file with the results from a linear search, in the form of several nr.ini files after each other.
nr.sol
Contains only a few summary lines (from nr.lin and nr.dmp) with steps with radius, max, and min. Then (on the same line), one value for each parameter are given.
nr.svd
Contains only a copy of the first section of nr.ini
par.best
A paropt file with the currently best values
par.diff
Another paropt file
par.hess
The Hessian in some format.
par.mult
A paropt file with only some values
par.one
The output of the last CalcOne calculation, i.e. the values of each term of the penalty function (should be subtracted from the corresponding value in par.ref, multiplied with the weight, squared and then summed to get the value of the penalty function).
par.tmp
Some temporary par.tot file
simp.[0-9]*
Paropt-like input files for the simplex optimisation: Only the problematic parameters that are varied are listed, with different values of the parameter, but otherwise identical.
simp.test
Similar to the simp.1, etc files.
simp.top
Similar to the par.tot file for the simplex run.
simpinit
Similar to the simp.1, etc files.
simpcent
A par.one-like file for simplex optimisation.
simpmat
A matrix of all the values for the simplex optimisation, but without the header columns.
simpmod
Similar to the simp.1, etc files, but with the actual value in the first column.
simpopt
Similar to the simp.1, etc files. With optimum values (?).
simptemp
Similar to par.ref, but with an extra column with values.
simptmp
Similar to simpcent. Temporary.

Sample leap_com file

source leaprc.ff03
loadAmberPrep heme_ts.in
loadAmberParams frcmod.hemts
x=loadpdb hemets.pdb
check x
saveamberparm x prmtop prmcrd
quit

To avoid errors like in tleap: +Currently only Sp3-Sp3/Sp3-Sp2/Sp2-Sp2 are supported +---Tried to superimpose torsions for: *-C1-C6-* +--- With Sp0 - Sp0 +--- Sp0 probably means a new atom type is involved +--- which needs to be added via addAtomTypes

Issue the following command for all new parameters in tleap: addAtomTypes { {"c1" "C" "sp2"} {"c2" "C" "sp2"} }

For transition states, you should define both the breaking and the forming bond.
You do that with
bond x.1.C3 x.2.HN1

Often, you then get the error message:
Bond: Maximum coordination exceeded on .R<TSP 2>.A<HN1 2> -- setting atoms pert=true overrides default limitswhich you solve by issuing the command:
set x.2.HN1 pert true bond x.1.C3 x.2.HN1

Sample nmodemm.in file

MM minimisation with nmode
&data
ntrun = 4,
nprint = 100,
nsave = 1000,
idiel = 1,
scnb = 2.0,
scee = 1.2,
cut = 999.0
&end

Sample sander.in file

Title
&cntrl
irest=0,ntx=1,
imin=1,maxcyc=1000,drms=0.001,
ntc=1,ntf=1,
nsnb=40,cut=999.0,dielc=1.0,
ntpr=100,ntwx=0,ntwv=0,ntwe=0,
ntb=0,ntp=0,
ipol=0,igb=0,
scnb=2.0,scee=1.2,
ncyc=10,ntmin=1,dx0=0.002
&end

Sample nmode.in file

Title
&data
ntx    = 1,
ntrun = 1,       nvect = 0,
drms   = 1,
dielc = 1.0,     idiel = 1,
scnb   = 2.0,     scee   = 1.2,
cut    = 999.0,
iulf=1
&end
END

Sample paropt file

The first column is the line in the *dat file, the second the column, the third is the current value, the fourth is a step factor and the last the format to rewrite it to the file.

15      6         49.0190          4.7009       %8.3f
15      16         2.0194          0.0088       %10.5f
16      6         50.6960          3.2380       %8.3f
16      16         2.0116          0.0039       %10.5f
17      6        286.9810          7.0154       %8.3f
17      16         1.3676          0.0032       %10.5f
18      6        316.9810          7.6348       %8.3f
18      16         1.3660          0.0060       %10.5f
19      6        272.0190          3.7641       %8.3f
19      16         1.4427          0.0060       %10.5f
20      6        390.0190          3.2305       %8.3f
20      16         1.3835          0.0060       %10.5f
21      6        417.0190          5.1356       %8.3f
21      16         1.3623          0.0058       %10.5f
22      6        337.0190          2.9041       %8.3f
22      16         1.0882          0.0064       %10.5f
23      6        337.0190          4.2527       %8.3f
23      16         1.0834          0.0063       %10.5f
24      6        200.9810          6.2876       %8.3f
24      16         1.4079          0.0064       %10.5f
25      6        200.9810          7.4684       %8.3f
25      16         1.1511          0.0040       %10.5f
26      6        199.0190          3.2894       %8.3f

Sample parteth file

The first column is the line in the *dat file, the second the column, the third is the average QC value, the fourth is the weight factor and the last the format to rewrite it to the file.

15      16         2.0194          10.0       %10.5f
16      16         2.0116          10.0       %10.5f
17      16         1.3676          10.0       %10.5f
18      16         1.3660          10.0       %10.5f
19      16         1.4427          10.0       %10.5f
20      16         1.3835          10.0       %10.5f
21      16         1.3623          10.0       %10.5f
22      16         1.0882          10.0       %10.5f
23      16         1.0834          10.0       %10.5f
24      16         1.4079          10.0       %10.5f
25      16         1.1511          10.0       %10.5f

Sample frcmod.hemts file

Title - P. Rydberg 11/11-05

MASS
NP 14.01
NO 14.01

BOND
FE-NP   87.661     2.01280
FE-NO   49.891     2.03930

ANGLE
NP-FE-NP    32.753     173.285
NO-FE-NO    38.190     179.739

DIHEDRAL
X-NO-FE-X     1       1.000     180.000       2.000
X-NP-FE-X     1       2.000     180.000       2.000

IMPROPER
c1-c1-NP-FE   0       1.041     180.000       2.000
c1-c1-NO-FE   0       0.842     180.000       2.000

NONBON
FE         1.20000   0.05000   0.00000
HQ         1.00      0.020     0.00000

Modifications of Amber nmode

Note that the Hessian must be mass-weighted.
Three files are modified: inpdat.h, rdinp.f, and nmode.f

diff inpdat.h~ inpdat.h
10c10
<       ipol,i3bod,ismem
---
>   ipol,i3bod,ismem,iulf

diff rdinp.f~ rdinp.f
38c38
<          ipol,i3bod,ismem
---
>          ipol,i3bod,ismem,iulf
91a92
>    iulf=0

diff nmode.f~   nmode.f (Amber9)
186d185
<          if ((iulf.eq.0).and.(iulf1.eq.0)) then
188d186
<          endif
191,210d188
<       !       ----- Dump coordinates force and Hessian
<       !       ----- Added by Ulf Ryde, 20/9-06
<       if (iulf1.gt.0) then
<          write(6,*)' '
<          write(6,*)mx,mf,mh
<          open(73,file='hessian',status='unknown',form='formatted')
<          write(73,'(i9)')natom
<          write(73,'(a)')'Coordinates'
<          write(73,'(3f16.8)')(x(mx+i),i=0,3*natom-1)
<          write(73,'(a)')'Forces (not mass weighted)'
<          write(73,'(3e16.8)')(x(mf+i),i=0,3*natom-1)
<          write(73,'(a)')'Hessian (not mass weighted)'
<          k=0
<          do i=1,natom*3
<             write(73,'(6e16.8)')(x(mh+k+j),j=0,i-1)
<             k=k+i
<          enddo
<          close(73)
<       endif
<
215,233d192
<
<       !       ----- Dump coordinates force and Hessian
<       !       ----- Added by Ulf Ryde, 28/11-05
<       if (iulf.gt.0) then
<          write(6,*)mx,mf,mh
<          open(73,file='hessian',status='unknown',form='formatted')
<          write(73,'(i9)')natom
<          write(73,'(a)')'Coordinates'
<          write(73,'(3f16.8)')(x(mx+i),i=0,3*natom-1)
<          write(73,'(a)')'Forces (mass weighted)'
<          write(73,'(3e16.8)')(x(mf+i),i=0,3*natom-1)
<          write(73,'(a)')'Hessian (mass weighted)'
<          k=0
<          do i=1,natom*3
<             write(73,'(6e16.8)')(x(mh+k+j),j=0,i-1)
<             k=k+i
<          enddo
<          close(73)
<       endif
300,308d258
< !    Code added by Jacob Kongsted 2007 for new MM/PBSA entropy
< ! JK
<          nvecs = 3*natsys - 6
<          call thermo(natsys,nvecs,ilevel,x(mx),x(mamass),x(mcval), &
<                x(mh),x(mh+ns3),x(mh+2*ns3), &
<                x(mh+3*ns3),t,patm)
< !    End of new code
<
<

Compiling the programs
Most of the programs are unix shell scripts.
However, there are also a few c programs:
The needed files are found in the Source directory:
ParEval.c ParMod.c ParUtil.c ParUtil_t.c jacobi.c nrutil.c

ParUtil is compiled by:
gcc ParUtil.c jacobi.c nrutil.c -o ParUtil -lm

ParUtil_t is compiled by:
gcc ParUtil_t.c jacobi.c nrutil.c -o ParUtil_t -lm

However, I failed to compile the remaining two program, owing to missing *.c files:

gcc ParMod.c nrutil.c -lm
ParMod.c:(.text+0x71b): undefined reference to `indexx'
file indexx.c is missing

gcc ParEval.c nrutil.c -lm
ParEval.c:(.text+0x2675): undefined reference to `dsvdcmp'
ParEval.c:(.text+0x2b17): undefined reference to `dsvbksb'
ParEval.c:(.text+0x2bf9): undefined reference to `choldc'
ParEval.c:(.text+0x2c23): undefined reference to `cholsl'
collect2: error: ld returned 1 exit status
Four *.c files are missing.

The README file

Automated parameterization example

Copyright (C) 1997 Per-Ola Norrby

See: "Automated Molecular Mechanics Parameterization with
      Simultaneous Utilization of Experimental and Quantum
      Chemical Data", Per-Ola Norrby, Tommy Liljefors,
      J.Comput.Chem. 1998, 19, 1146-1166.

Please don't give this away.  I like to keep track of
who's using it, refer interested people to me,
pon .at. kemi.dtu.dk or http://organisk.kemi.dtu.dk/PON/

				Per-Ola Norrby


In this example, some parameters in MM3* are refined.
The reference data in this example are structures, energies,
and cartesian mass-weighted Hessians from B3LYP/6-31G* 
calculations.  The structures have been exported to
files X.mae, D.mae, and E.mae.
ESP charges were calculated from the wavefunction,
and can be found in the D.mae-file.  X.mae contains
reference structures, D.mae have all structures for which
Hessians have been calculated (together with reference charges),
and E.mae have structures for calculation of energy
differences, with the QM energy for each structure.  
The mass-weighted Hessians have been
extracted into file D.cal in lower triangular form.  For 
creating D.cal, see GetHess.

The current force field comparison data is calculated by 
the following MacroModel batch files:
Mass-weighted Hessian: D.com
Minimized structures : X.com
Relative energies    : E.com
There is also the command file XXR.com, which uses X.mae as input
and evaluates all structural elements (bonds, angles, torsions)
for the file of reference data.  It also minimizes all structures
AFTER outputting structure data.  Thus, by reading the file XXR-out.mae
into MacroModel 2 structures at a time, you can compare reference
and minimized structures on the screen (not necessary, but nice :-).

For structural data, note that only structure elements with the string
OPT in the comment portion of the force field will be extracted and
used in comparisons.  To use non-default scale factors for bonds,
angles, and torsions, respectively, the second word of the structure
name must be "Scaling", and the third through fifth should be the new
scale factors.  Note that scaling applies to ALL subsequent structures,
unless a new set is given.

The files mm3.fld and atom.typ should be modified so that all
structures can be calculated (and preferably be recognizable)
before starting the parameter refinement.  All parameters to be
refined should be defined in file paropt.  The first two colums
correspond to row and position in the force field (position values
1-3 are interpreted as default MacroModel offsets, otherwise as the
number of characters before the parameter.  Negative position values
allow the parameter to get a negative value).  The third column is
the current parameter value, the fourth (optional) is the numerical
differentiation step, and the fifth (optional) is a C format for
writing the parameter into the force field file (default: %9.4f).
The Unix script MkPar can be modified and used to create an
initial paropt.  You may want to modify paropt.

Parameters to be tethered are defined together with
the harmonic tethering constant (column 4) in file parteth.

Three executables are included, ParEval, ParMod, and ParUtil (together
with C-source: Numerical Recipes routines are not included).  These
are not described in detail, but the main functions are:

ParEval: Evaluate force field results, suggest new solutions.
ParMod : Modify the force field and parameter definition files
         (mm3.fld and paropt).
ParUtil: Whatever needs doing, including extraction and manipulating
         QM Hessians, and a Unix paste for
	 very large, tab-separated column files.

Several Unix scripts are included.  These are all files that start
with the letters C, M, N, or S.  Only CalcRef and CalcOne will actually
start MM calculations.  These should be modified for each application.
MkPar was mentioned above, the others are called by the two top-level
commands CalcLoop and NR_Loop, as follows:

CalcLoop		; Uses both NR and Simplex, first does a
    LinSearch           ; Line Search (once only, see below), then loops until
			; no improvement can be obtained.
    NR_Jacob		; Calculates the Jacobian by numerical differentiation,
			; suggests several new parameter sets
	NR_CalcAll	; Calculates all reference values and all data points for
			; every force field, puts the complete matrix > par.tot
	    CalcRef	; Scaling factors and reference values > par.ref
	    CalcOne	; Calculates values from one force field > par.one
    NR_Eval		; Evaluates all suggested force fields, selects the best
	NR_CalcAll
	    CalcOne
    Simplex		; Performs a Simplex optimization based on parameter
			; definition file simpopt
	SimpAll		; Similar to NR_CalcAll, output > simp.tot
	    CalcRef
	    CalcOne

LinSearch               ; Makes a Line Search with parabolic interpolation
                        ; for each parameter in paropt; use only initially
        NR_CalcAll
        CalcOne
        CalcRef

NR_Loop			; Loops like CalcLoop, but using only NR (not updated)
    NR_Jacob
	NR_CalcAll
	    CalcRef
	    CalcOne
    NR_Eval
	NR_CalcAll
	    CalcOne

Looping can be controlled by "touch"-ing specific control files.
For example, CalcLoop can be stopped gracefully by "touch Loop.stop".
The numerical differentiation steps should be "balanced", modify
them so that all derivatives (eg., in nr.lin after NR) are "similar".
What is similar?  Say, within two orders of magnitude, if possible.

Manuals for the C-commands are found in the files Par*.txt

Good luck!

Send comments to Per-Ola Norrby, pon .at. chem.gu.se