Instructions for PMISP (also for a protein)

This link does not work any longer.

However, these are instructions from the pmispexample file (in /temp4/bio/Pmisp/Pmispexample), file runscript:

########################################### ## PMISP/MM example ## ############################################ This script calculates the interaction energy between a protein# and a ligand with the PMISP/MM method# It has to be modified for each specific need# The example included is the avidin-biotin interaction# and the QM level is MP2/aug-cc-pVTZ.# This is a very heavy calculation and will take several weeks# if the QM calculations are not run in parallell at a cluster.#Files needed:# complex.pdb : a PDB file with both protein and ligand# prmtop : an AMBER parmtop file for the complex# user.py : here you set the QM level# lig.inp : which atoms belong to the ligand (7740-7770)# restop_full : specifying protein segments and special residue charges# acon : cuttable bond list for amino acids (add non-standard residues)#Generate the splitfile for the model with cutoff 5 Angstrom############################################################echo "complex.pdbspc774077705aconautomodel.inpq" | changepdb#Generate the splitfile for the full protein#############################################Note: A cutoff of 10000 is used to select all atomsecho "complex.pdbspc7740777010000aconrestop_fullfull.inpq" | changepdb#Generate an XYZ file#####################echo "complex.pdbxcomplex.xyzq" | changepdb#Create directories####################We are doing two fragmentations, one for the model and one for the#protein. So let us use separate directories not to mess up the files.mkdir Modelmkdir Fullcp complex.xyz model.inp lig.inp prmtop Modelcp complex.xyz full.inp lig.inp prmtop Full#Generate MOLCAS input files#############################Note: For the full protein we delete the#supermolecular calculations (c*.input) because#we want to apply the PMISP/MM approximationcd Modelpmisp 1 complex.xyz model.inp - lig.inpcd ..cd Fullpmisp 1 complex.xyz full.inp - lig.inprm c*.inputcd ..#Run MOLCAS (preferably on a computer cluster)###############################################for a in */*.input; do molcas $a >${a%.*}.out; done#Collect supermolecular results################################cd Model#grep 'Total energy ' c*.out >resmp2#cd ..#OR (in this example) use the old results#########################################tar xvf results_Model.tar -C Modeltar xvf results_Full.tar -C Full#Convert to XF files with multipole level 2###########################################cd Modelallxf 2cd ..cd Fullallxf 2cd ..#Assemble properties for the protein and calculate classical energy####################################################################Note: For large systems, this may take several hours#The code is not optimized, because the classical calculations#are anyway faster than the QM calculations.cd Fullpmisp 3 complex.xyz full.inp mnemoint tot.xf g1.xfcd ..#Calculate non-classical energy###############################cd Modelpmisp 2 complex.xyz model.inp - lig.inp resmp2cd ..#Calculate van der waals energies#################################cd Fullcalcvdwcd ..cd Modelcalcvdwcd ..#Collect the energies and calculate the total interaction energy#################################################################CLASSICAL: -1411.044#NON-CLASS: 22.357#VDW(Full): -143.185#VDW(Model): -115.977 (to be subtracted)#TOTAL: -1415.895 kJ/mol

The instructions below are outdated.
Use instead the pmisp program (in /temp4/bio/Pmisp).
Unfortunately, the documentation seems to be missing.

Instructions for calculating multipoles for a large molecule (e.g. a protein)

Start with a PDB file where the residue numbering indicates where the cutting should be performed (each residue number should appear in only one place of the PDB file). Each atom label must start with one of the following strings: 'H','C','N','O','F','P','S','CL' (case-sensitive), indicating the element.
Prepare a file named restop describing the topology of the molecule. For proteins, use the template below. Normally, only the last part (after the three lines containing CB) has to be edited, because the first part describes the three most common junctions:

Addition of COCH3 to the N-terminal side of the residue (type 1)
Addition of NHCH3 to the C-terminal side of the residue (type 2)
Addition of SCH3 to the side-chain in case of sulphur bridge (type 3)

1 348 1 1 2

110 187 0 3 3

348 -1

1296 1

1299 -1

Use changepdb with command SP, to create the split.inp file.
This file may also be created manually for small systems. It contains a line with two integers P and N, the number of "positive" and "negative" fragments (in the creation of the total multipole file, the multipoles for positive fragments are added, whereas the multipoles for the negative fragments are subtracted).
Then follows a line with P+N integers giving the charge of each fragment.
Then follows P+N lines each describing one fragment (first the P positive ones). Each line contains a list of integers giving the atom number (1,2,3.. etc. from beginning of the xyz file described below). A minus sign in front of the number indicates that a junction should be created replacing that atom with a hydrogen atom connected to the (non-negative) atom closest before in the list.

Create an xyz file with exactly the same order of the atoms as the pdb file used to create the split.inp file. This can for example be done with changepdb, command x; or alternatively
pdb2xyz:

pdb2xyz FILE.pdb

which takes FILE.pdb and creates FILE.xyz. It now handles CL but no other two-letter names (easy to add of course). The second line in the xyz file (the title line in xyz format) must give the total molecular charge (if nonzero). NOTE! This has to be added manually!

Use the program split.py (has to be done locally). The general syntax is:

split.py pass xyzfile split.inp [m]

where pass=1 for splitting and 2 for joining after the LoProp calculations, and the m is only relevant for pass 2. So use

split.py 1 FILE.xyz split.inp

to create all the P+N fragment xyz-files,named f1.xyz, f2.xyz ....
You might encounter problems with total charge:
Traceback (most recent call last):
File "./split.py", line 43, in ?
raise Exception, "Inconsistency in charges."
Exception: Inconsistency in charges.
Then
cp /home/bio/Bin/split.py .
and change on line 32:
charge=0
to the actual charge.

The fragments can be viewed by writing for example:

molden f352.xyz

Make a copy of the xyz2molcas.py script and edit it so that the correct basis set, DFT method etc. is used. NOTE: Always use an ANO-type basis set (e.g. ANO-L, A-6-31GP, A-AUG-CC-PVTZ, or A-AUG-CC-PVDZ), otherwise the LoProp method is not clearly defined. Make sure that a LOPROP section is included in the input and that the resulting $Project.MpProp is copied back from $WorkDir.

Use the xyz2molcas.py script to create a Molcas input file for each fragment. This can be conveniently done by writing (in case of 700 fragments)

for((i=1;i<=700;i++)) do ./xyz2molcas.py f$i.xyz f${i%.*}.input ; done

Run Molcas on all the files. On milleotto, this can be done by submitting them to the automatic pool of calculations:

for((i=1;i<=700;i++)) do a.automill label f${i%.*} ; done

which creates "internal jobs" named labelf1, labelf2 etc. To start "real jobs" that pick calculations from the private pool, go to your /disk/global/lottie/Auto directory and write

subauto 168 somename

to submit them for one week. Other useful commands (have to be invoked from the Auto directory) are

autocheck.py (checks which internal jobs are started but not finished) and
autorestart.py (restarts internal jobs)

Check the sourcecode if you use them. Note that even if a job is counted as finished, something might have gone wrong. So the best way of checking how many are really finished is to stay in the directory with the input files and just look for the number of MpProp files:

ls -l *.MpProp | wc

When all calculations are finished and MpProp-files collected, convert them to xf files, for example by the command

for((i=1;i<=700;i++)) do mpproptoxf a <f$i.MpProp >f$i.xf ; done

where a stands for anisotropic polarizabilities (without 'a', no polarizabilities, with 'i' instead isotropic = scalar polarizabilities)

Finally, use split.py again (has to be done locally):

split.py 2 FILE.xyz split.inp m

where m denotes that a molnr list should be created inside the resulting xf file, describing which multipoles that should polarize which polarizabilities. FILE.xyz is the xyz file constructed in point 4 above. The xf file format is described in the Molcas manual under XFIE.

The resulting file tot.xf is used within Molcas by including something like this into Seward input:

XFIE
@$HomeDir/tot.xf
RF-input
REAC
80.0 85.04 15
LANG
5.0 5.0 5.0
GITTER
0
NODAMP
AMBER
CONV
1.0E-15
DRES
end of RF-input

where DRES denotes that the induced dipoles should be reset to zero in every iteration (default is now to keep them, but the convergence threshold must probably be set to something like 1.0E-15 (it is not an energy!) or lower before you gain anything by that. You can play around with it.

The keyword AMBER is crucial, it tells Molcas to use the molnr list not only for excluding the static field from some multipoles but also for excluding the intramolecular coupling of polarizabilities. If all coupling is desired, one must not only remove AMBER keyword but also NODAMP, so that the closest interactions are damped. However, the damping seems to work poorly for a LoProp description including bond centers (it was developed for a description without bonds).

REAC adds a reaction field around the full protein with a dielectric constant of 80 and using multipoles up to the 15th order. The second parameter (85.04 a.u. = 45 Å above) should be the radius of the protein with respect to the origin. It should be chosen so that no multipole is outside this radius (with a margin of ~ 1Å). In our tests, this reaction field gives an effect of ~1 kJ/mole, and the radius is not so important.

11012 2 2 21
1 2 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 35.96904249 -26.06309420 38.91890615 -0.32627445 -0.09188406 0.07178359 -0.06800498 -3.49323812 -0.00186702 -0.03713634 -3.45357719 0.01717702 -3.37173779 0.71192897 -0.28306137 0.08671128 0.62045706 -0.05815451 0.81053710
2 1 3 4 5 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 33.94703474 -24.82721282 37.48838290 -0.05487998 0.15113481 0.06667714 0.07783401 -3.11257023 -0.05914829 0.09139883 -3.41107483 -0.03523675 -3.36723902 0.11640074 -0.33326000 -0.08837108 0.36239895 0.13097959 0.74927055

The xf file can be modified by the program changexf (it reads the molcas input file).
You can delete polarisabilities and multipoles, change to isotropic (scalar) polarisabilities, reduce the multipolemoment level and (via a pdb file) do the same only for residues outside a certain distance from the quantum system.

Nu har jag lagt in så att man lätt kan vrida molekyler i MCFEP. Jag skapade också ett nytt script för detta:

nemointmove f1.xf f1org.xyz f1new.xyz f2.xf f2org.xyz f2new.xyz

som räknar ut interaktionsenergin mellan f1.xf och f2.xf när de är i sina nya geometrier f1new.xyz och f2new.xyz. f1org.xyz ska vara samma geometri som f1.xf men ska innehålla atomerna i samma ordning som f1new.xyz. Xyz-filerna ska vara i det vanliga formatet, d.v.s. koordinater i Å. Den ignorerar de två titelraderna och använder sen de tre första atomerna för transformationen. Den klarar småstörningar i molekylerna (några pikometer hit eller dit) men är inte stabil för större ändringar.

Pär 13/2-12

A sample restop file

3
3 3
C
0.0
O
0.0
CA
0.0
CA
CA
CA
2 4
N
0.0
CA
0.0
N
CA
CA
CA
2 3
SG
0.0
CB
0.0
CB
CB
CB
2
1 348 1 1 2
110 187 0 3 3
3
348 -1
1296 1
1299 -1

How to recontract basis sets for use in LoProp calculations.

1. Start with a MOLCAS basis set file (available in the basis_library of a Molcas distribution).

2. For each element you are interested in, cut out the relevant part and take away comment lines (starting with *) and options etc. See example below. Save this separately for each element as X.basis, where X is the element name (first letter capital).

3. Run the script makelopropbasis (located in /home/bio/Bin) whose arguments are the element names for which you have prepared basis files. For example:

makelopropbasis C H O N S F Cl

if you want all elements for which there are currently input files available. If you need another element, you have to prepare an input file for the atomic and anionic species, following the examples in /home/bio/Lopropbasis. The script assumes that you have in your path a script named m.auto for running standard MOLCAS calculations (an example is in /home/bio/Lopropbasis).

4. Check for non-zero return codes etc., otherwise you should have got, for each element, the file X.recont containing the recontracted basis set.

5. Plug in this content in the basis set file, rename the basis set labels by adding the prefix A-, and change the file name similarly.

6. Put a copy of the basis set file in /home/bio/Lopropbasis to avoid double work.

Example of an O.basis file (nuclear charge 8, max ang. momentum 2, 10s5p2d contracted to 4s3p2d)

         8.   2
  10    4
      11720.00000000
       1759.00000000
        400.80000000
        113.70000000
         37.03000000
         13.27000000
          5.02500000
          1.01300000
          0.30230000
          0.07896000
0.00071000 -0.00016000  0.00000000  0.00000000
0.00547000 -0.00126300  0.00000000  0.00000000
0.02783700 -0.00626700  0.00000000  0.00000000
0.10480000 -0.02571600  0.00000000  0.00000000
0.28306200 -0.07092400  0.00000000  0.00000000
0.44871900 -0.16541100  0.00000000  0.00000000
0.27095200 -0.11695500  0.00000000  0.00000000
0.01545800  0.55736800  0.00000000  0.00000000
-0.00258500  0.57275900  1.00000000  0.00000000
0.00000000  0.00000000  0.00000000  1.00000000
   5    3
         17.70000000
          3.85400000
          1.04600000
          0.27530000
          0.06856000
0.04301800  0.00000000  0.00000000
0.22891300  0.00000000  0.00000000
0.50872800  0.00000000  0.00000000
0.46053100  1.00000000  0.00000000
0.00000000  0.00000000  1.00000000
   2    2
          1.18500000
          0.33200000
1.00000000  0.00000000
0.00000000  1.00000000

From Pär 28/8-09

I updated changepdb so that it can create the restop file automatically in 
simple cases, especially for a "chemical" model. In principle it works 
also for a whole protein, but if there are any special residues you have 
to include all of them in the acon file for it to work, so it may be 
easier to do the restop file manually.

I also created an example which I think can be useful for anyone using 
PMISP/MM. It is here:
http://www.teokem.lu.se/~pmispexample.tar.gz
It simply contains the files needed to reproduce one of the interaction 
energies in the JPCB paper and a file runscript which in principle runs 
the full calculation (you may ignore the warnings issued), in practice 
you would have to run the molcas calculations in parallell on the 
cluster, but the rest of the steps are fairly quick.