This link does not work any longer.
However, these are instructions from the pmispexample file (in
/temp4/bio/Pmisp/Pmispexample), file runscript:
##########################################
#
#
#
PMISP/MM
example
#
#
#
##########################################
# This script calculates the interaction energy between a
protein
# and a ligand with the PMISP/MM method
# It has to be modified for each specific need
# The example included is the avidin-biotin interaction
# and the QM level is MP2/aug-cc-pVTZ.
# This is a very heavy calculation and will take several
weeks
# if the QM calculations are not run in parallell at a
cluster.
#Files needed:
# complex.pdb : a PDB file with both protein
and ligand
# prmtop : an
AMBER parmtop file for the complex
# user.py : here you
set the QM level
# lig.inp : which
atoms belong to the ligand (7740-7770)
# restop_full : specifying protein segments
and special residue charges
# acon
: cuttable bond list for amino acids (add
non-standard residues)
#Generate the splitfile for the model with cutoff 5
Angstrom
############################################################
echo "complex.pdb
sp
c
7740
7770
5
acon
auto
model.inp
q" | changepdb
#Generate the splitfile for the full protein
############################################
#Note: A cutoff of 10000 is used to select all atoms
echo "complex.pdb
sp
c
7740
7770
10000
acon
restop_full
full.inp
q" | changepdb
#Generate an XYZ file
#####################
echo "complex.pdb
x
complex.xyz
q" | changepdb
#Create directories
###################
#We are doing two fragmentations, one for the model and
one for the
#protein. So let us use separate directories not to mess
up the files.
mkdir Model
mkdir Full
cp complex.xyz model.inp lig.inp prmtop Model
cp complex.xyz full.inp lig.inp prmtop Full
#Generate MOLCAS input files
############################
#Note: For the full protein we delete the
#supermolecular calculations (c*.input) because
#we want to apply the PMISP/MM approximation
cd Model
pmisp 1 complex.xyz model.inp - lig.inp
cd ..
cd Full
pmisp 1 complex.xyz full.inp - lig.inp
rm c*.input
cd ..
#Run MOLCAS (preferably on a computer cluster)
##############################################
#for a in */*.input; do molcas $a >${a%.*}.out; done
#Collect supermolecular results
###############################
#cd Model
#grep 'Total energy ' c*.out >resmp2
#cd ..
#OR (in this example) use the old results
#########################################
tar xvf results_Model.tar -C Model
tar xvf results_Full.tar -C Full
#Convert to XF files with multipole level 2
###########################################
cd Model
allxf 2
cd ..
cd Full
allxf 2
cd ..
#Assemble properties for the protein and calculate
classical energy
###################################################################
#Note: For large systems, this may take several hours
#The code is not optimized, because the classical
calculations
#are anyway faster than the QM calculations.
cd Full
pmisp 3 complex.xyz full.inp m
nemoint tot.xf g1.xf
cd ..
#Calculate non-classical energy
###############################
cd Model
pmisp 2 complex.xyz model.inp - lig.inp resmp2
cd ..
#Calculate van der waals energies
#################################
cd Full
calcvdw
cd ..
cd Model
calcvdw
cd ..
#Collect the energies and calculate the total interaction
energy
################################################################
#CLASSICAL: -1411.044
#NON-CLASS: 22.357
#VDW(Full): -143.185
#VDW(Model): -115.977 (to be
subtracted)
#TOTAL: -1415.895 kJ/mol
The
instructions below are outdated.
Use instead the pmisp program (in /temp4/bio/Pmisp).
Unfortunately, the documentation seems to be missing.
Instructions for calculating multipoles for a
large molecule (e.g. a protein)
- Start with a PDB file where the residue numbering
indicates where the cutting should be performed (each residue
number should appear in only one place of the PDB file). Each
atom label must start with one of the following strings:
'H','C','N','O','F','P','S','CL' (case-sensitive), indicating
the element.
- Prepare a file named restop describing the topology of
the molecule. For proteins, use the template below. Normally,
only the last part (after the three lines containing CB) has to
be edited, because the first part describes the three most
common junctions:
- Addition of COCH3 to the N-terminal side of the residue
(type 1)
- Addition of NHCH3 to the C-terminal side of the residue
(type 2)
- Addition of SCH3 to the side-chain in case of sulphur bridge
(type 3)
The last part has two sections.
First, a junction part containing the number of "junction sets"
and a line with numbers R1,R2, I, T1, and T2 for each set. Each
junction set introduces either one pair of junctions (if I=0) :
type T1 modeling R1 in the R2 fragment and type T2 modeling R2 in
the R1 fragment, or several junction pairs (if I=1): type T1
modeling R1 in R1+1, R1+1 in R1+2, etc. and type T2 modeling R1+1
in R1, R1+2 in R1+1, etc., i.e. one junction is put between
every consecutive pair of residues between R1 and R2. Note that
residue numbers are given as in the PDB file, not the "order
number", but the fragments will be renumbered in consecutive
order.
Example (for a protein with 348 residues and a Cys-Cys bond
between residues 110 and 187):
2
1 348
1
1
2
110 187 0 3 3
The second section describes non-standard charges:
The first line gives the number (N) of them.
Then N lines with two integers R and C on each, specifying that
the charge of residue R (after adding junctions) is C. The default
is to sum the nuclear charges of the fragment. If that sum is
even, the charge will be set to zero, if it is odd, the charge
will be set to +1 unless the residue name is GLU or ASP, in which
case it will be set to -1.
Example:
3
348 -1
1296 1
1299 -1
- Use changepdb with
command SP, to
create the split.inp
file.
This file may also be created manually for small systems. It
contains a line with two integers P and N, the number of
"positive" and "negative" fragments (in the creation of the
total multipole file, the multipoles for positive fragments
are added, whereas the multipoles for the negative fragments
are subtracted).
Then follows a line with P+N integers giving the charge of
each fragment.
Then follows P+N lines each describing one fragment (first the
P positive ones). Each line contains a list of integers giving
the atom number (1,2,3.. etc. from beginning of the xyz file described
below). A minus sign in front of the number indicates that a
junction should be created replacing that atom with a hydrogen
atom connected to the (non-negative) atom closest before in
the list.
- Create an xyz file with exactly the
same order of the atoms as the pdb file used to create the split.inp file. This can
for example be done with changepdb, command x; or
alternatively
pdb2xyz:
pdb2xyz FILE.pdb
which takes FILE.pdb and creates FILE.xyz. It now handles CL
but no other two-letter names (easy to add of course). The
second line in the xyz file (the title line in xyz format)
must give the total molecular charge (if nonzero). NOTE! This
has to be added manually!
- Use the program split.py
(has to be done locally). The general syntax is:
split.py pass xyzfile
split.inp [m]
where pass=1 for splitting and 2 for joining after the LoProp
calculations, and the m is only relevant for pass 2. So use
split.py 1 FILE.xyz
split.inp
to create all the P+N fragment xyz-files,named
f1.xyz, f2.xyz ....
You might encounter problems with total charge:
Traceback (most recent call last):
File "./split.py", line 43, in ?
raise Exception, "Inconsistency in
charges."
Exception: Inconsistency in charges.
Then
cp /home/bio/Bin/split.py .
and change on line 32:
charge=0
to the actual charge.
The fragments can be viewed by writing for example:
molden f352.xyz
- Make a copy of the xyz2molcas.py script and edit it so that
the correct basis set, DFT method etc. is used. NOTE: Always
use an ANO-type basis set (e.g. ANO-L, A-6-31GP,
A-AUG-CC-PVTZ, or A-AUG-CC-PVDZ), otherwise the LoProp method
is not clearly defined. Make sure that a LOPROP section is
included in the input and that the resulting $Project.MpProp
is copied back from $WorkDir.
- Use the xyz2molcas.py script to create a Molcas input file
for each fragment. This can be conveniently done by writing
(in case of 700 fragments)
for((i=1;i<=700;i++)) do ./xyz2molcas.py f$i.xyz
f${i%.*}.input ; done
- Run Molcas on all the files. On milleotto, this can be done
by submitting them to the automatic pool of calculations:
for((i=1;i<=700;i++))
do
a.automill label f${i%.*} ; done
which creates "internal jobs" named labelf1, labelf2
etc. To start "real jobs" that pick calculations from the
private pool, go to your /disk/global/lottie/Auto
directory and write
subauto 168 somename
to submit them for one week. Other useful commands (have to be
invoked from the Auto directory) are
autocheck.py (checks
which
internal jobs are started but not finished) and
autorestart.py
(restarts internal jobs)
Check the sourcecode if you use them. Note that even if a job
is counted as finished, something might have gone wrong. So
the best way of checking how many are really finished is to
stay in the directory with the input files and just look for
the number of MpProp files:
ls -l *.MpProp | wc
- When all calculations are finished and MpProp-files
collected, convert them to xf files, for example by the
command
for((i=1;i<=700;i++))
do mpproptoxf a <f$i.MpProp >f$i.xf ; done
where a
stands for anisotropic polarizabilities (without 'a', no
polarizabilities, with 'i' instead isotropic = scalar
polarizabilities)
- Finally, use split.py again (has to be done locally):
split.py 2 FILE.xyz
split.inp m
where m
denotes that a molnr list should be created inside the
resulting xf file, describing which multipoles that should
polarize which polarizabilities. FILE.xyz is the xyz file
constructed in point 4 above. The xf
file format is described in the Molcas
manual under XFIE.
- The resulting file tot.xf
is used within Molcas by including something like this into
Seward input:
XFIE
@$HomeDir/tot.xf
RF-input
REAC
80.0 85.04
15
LANG
5.0 5.0 5.0
GITTER
0
NODAMP
AMBER
CONV
1.0E-15
DRES
end of RF-input
where DRES denotes that the induced dipoles should be reset to
zero in every iteration (default is now to keep them, but the
convergence threshold must probably be set to something like
1.0E-15 (it is not an energy!) or lower before you gain
anything by that. You can play around with it.
The keyword AMBER is crucial, it tells Molcas to use the molnr
list not only for excluding the static field from some
multipoles but also for excluding the intramolecular coupling
of polarizabilities. If all coupling is desired, one must not
only remove AMBER keyword but also NODAMP, so that the closest
interactions are damped. However, the damping seems to work
poorly for a LoProp description including bond centers (it was
developed for a description without bonds).
REAC adds a reaction field around the full protein with a
dielectric constant of 80 and using multipoles up to the 15th
order. The second parameter (85.04 a.u. = 45 Å above) should
be the radius of the protein with respect to the origin. It
should be chosen so that no multipole is outside this radius
(with a margin of ~ 1Å). In our tests, this reaction field
gives an effect of ~1 kJ/mole, and the radius is not so
important.
The format of the xf file is the following:
The first row gives four numbers, nlines, nmult, npol, ngroup,
where nlines=the number of multipoles, nmult=the order of the
multipoles (0=only charges, 1= also dipoles, 2= also quadrupoles),
npol=the type of polarisabilities (0=no, 1=isotropic,
2=anisotropic), and ngroup=the number of polarisability groups.
The following nlines lines (one for each multiples) gives ngroup
integers (mostly zeroes), the xyz coordinates, and the charge, the
three dipole moment components, the 6 quadrupole moment
components, and the 1 or 6 polarisability components (of course
depending on nmult and npol)
11012 2 2 21
1 2 3 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 35.96904249
-26.06309420 38.91890615
-0.32627445 -0.09188406
0.07178359 -0.06800498
-3.49323812 -0.00186702
-0.03713634
-3.45357719
0.01717702
-3.37173779
0.71192897
-0.28306137
0.08671128
0.62045706
-0.05815451 0.81053710
2 1 3 4 5 6 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 33.94703474
-24.82721282 37.48838290
-0.05487998
0.15113481
0.06667714
0.07783401 -3.11257023
-0.05914829
0.09139883 -3.41107483
-0.03523675
-3.36723902
0.11640074 -0.33326000
-0.08837108
0.36239895
0.13097959 0.74927055
The xf file can be modified by the program changexf
(it reads the molcas input file).
You can delete polarisabilities and multipoles, change to
isotropic (scalar) polarisabilities, reduce the
multipolemoment level and (via a pdb file) do the same
only for residues outside a certain distance from the
quantum system.
Nu har jag lagt in så att man
lätt kan vrida molekyler i MCFEP. Jag skapade
också ett nytt script för detta:
nemointmove f1.xf f1org.xyz f1new.xyz f2.xf f2org.xyz f2new.xyz
som räknar ut interaktionsenergin mellan f1.xf och f2.xf när de är i
sina nya geometrier f1new.xyz och f2new.xyz. f1org.xyz ska vara
samma geometri som f1.xf men ska innehålla atomerna i samma ordning
som f1new.xyz. Xyz-filerna ska vara i det vanliga formatet, d.v.s.
koordinater i Å. Den ignorerar de två titelraderna och använder sen
de tre första atomerna för transformationen. Den klarar
småstörningar i molekylerna (några pikometer hit eller dit) men är
inte stabil för större ändringar.
Pär 13/2-12
A sample restop file
3
3 3
C
0.0
O
0.0
CA
0.0
CA
CA
CA
2 4
N
0.0
CA
0.0
N
CA
CA
CA
2 3
SG
0.0
CB
0.0
CB
CB
CB
2
1 348
1 1 2
110 187
0 3 3
3
348 -1
1296 1
1299 -1
How to recontract basis sets for use in LoProp calculations.
1. Start with a MOLCAS basis set file (available in the
basis_library of a Molcas distribution).
2. For each element you are interested in, cut out the relevant part
and take away comment lines (starting with *) and options etc. See
example below. Save this separately for each element as X.basis,
where X is the
element name (first letter capital).
3. Run the script makelopropbasis (located in /home/bio/Bin) whose
arguments are the element names for which you have prepared basis
files. For example:
makelopropbasis C H O N S F Cl
if you want all elements for which there are currently input files
available. If you need another element, you have to prepare an input
file for the atomic and anionic species, following the examples in
/home/bio/Lopropbasis. The script assumes that you have in your path
a script named m.auto for running standard MOLCAS calculations (an
example is in /home/bio/Lopropbasis).
4. Check for non-zero return codes etc., otherwise you should have
got,
for each element, the file X.recont containing the recontracted basis set.
5. Plug in this content in the basis set file, rename the basis set
labels by adding the prefix A-, and change the file name similarly.
6. Put a copy of the basis set file in /home/bio/Lopropbasis to
avoid double work.
Example of an O.basis file (nuclear charge 8, max ang. momentum 2,
10s5p2d contracted to 4s3p2d)
8. 2
10 4
11720.00000000
1759.00000000
400.80000000
113.70000000
37.03000000
13.27000000
5.02500000
1.01300000
0.30230000
0.07896000
0.00071000 -0.00016000 0.00000000 0.00000000
0.00547000 -0.00126300 0.00000000 0.00000000
0.02783700 -0.00626700 0.00000000 0.00000000
0.10480000 -0.02571600 0.00000000 0.00000000
0.28306200 -0.07092400 0.00000000 0.00000000
0.44871900 -0.16541100 0.00000000 0.00000000
0.27095200 -0.11695500 0.00000000 0.00000000
0.01545800 0.55736800 0.00000000 0.00000000
-0.00258500 0.57275900 1.00000000 0.00000000
0.00000000 0.00000000 0.00000000 1.00000000
5 3
17.70000000
3.85400000
1.04600000
0.27530000
0.06856000
0.04301800 0.00000000 0.00000000
0.22891300 0.00000000 0.00000000
0.50872800 0.00000000 0.00000000
0.46053100 1.00000000 0.00000000
0.00000000 0.00000000 1.00000000
2 2
1.18500000
0.33200000
1.00000000 0.00000000
0.00000000 1.00000000
From Pär 28/8-09
I updated changepdb so that it can create the restop file automatically in
simple cases, especially for a "chemical" model. In principle it works
also for a whole protein, but if there are any special residues you have
to include all of them in the acon file for it to work, so it may be
easier to do the restop file manually.
I also created an example which I think can be useful for anyone using
PMISP/MM. It is here:
http://www.teokem.lu.se/~pmispexample.tar.gz
It simply contains the files needed to reproduce one of the interaction
energies in the JPCB paper and a file runscript which in principle runs
the full calculation (you may ignore the warnings issued), in practice
you would have to run the molcas calculations in parallell on the
cluster, but the rest of the steps are fairly quick.