Instructions for PMISP (also for a protein)

This link does not work any longer.

However, these are instructions from the pmispexample file (in /temp4/bio/Pmisp/Pmispexample), file runscript:

##########################################
#                                        #
#          PMISP/MM example              #
#                                        #
##########################################


# This script calculates the interaction energy between a protein
# and a ligand with the PMISP/MM method
# It has to be modified for each specific need


# The example included is the avidin-biotin interaction
# and the QM level is MP2/aug-cc-pVTZ.
# This is a very heavy calculation and will take several weeks
# if the QM calculations are not run in parallell at a cluster.



#Files needed:

# complex.pdb :   a PDB file with both protein and ligand
# prmtop      :   an AMBER parmtop file for the complex
# user.py     :   here you set the QM level
# lig.inp     :   which atoms belong to the ligand (7740-7770)
# restop_full :   specifying protein segments and special residue charges
# acon        :   cuttable bond list for amino acids (add non-standard residues)




#Generate the splitfile for the model with cutoff 5 Angstrom
############################################################

echo "complex.pdb
sp
c
7740
7770
5
acon

auto

model.inp
q" | changepdb

#Generate the splitfile for the full protein
############################################
#Note: A cutoff of 10000 is used to select all atoms

echo "complex.pdb
sp
c
7740
7770
10000
acon

restop_full
full.inp
q" | changepdb


#Generate an XYZ file
#####################

echo "complex.pdb
x
complex.xyz
q" | changepdb


#Create directories
###################

#We are doing two fragmentations, one for the model and one for the
#protein. So let us use separate directories not to mess up the files.
mkdir Model
mkdir Full
cp complex.xyz model.inp lig.inp prmtop Model
cp complex.xyz full.inp lig.inp prmtop Full



#Generate MOLCAS input files
############################
#Note: For the full protein we delete the
#supermolecular calculations (c*.input) because
#we want to apply the PMISP/MM approximation

cd Model
pmisp 1 complex.xyz model.inp - lig.inp
cd ..
cd Full
pmisp 1 complex.xyz full.inp - lig.inp
rm c*.input
cd ..


#Run MOLCAS (preferably on a computer cluster)
##############################################

#for a in */*.input; do molcas $a >${a%.*}.out; done

#Collect supermolecular results
###############################

#cd Model
#grep 'Total energy ' c*.out >resmp2
#cd ..


#OR (in this example) use the old results
#########################################

tar xvf results_Model.tar -C Model
tar xvf results_Full.tar -C Full


#Convert to XF files with multipole level 2
###########################################

cd Model
allxf 2
cd ..
cd Full
allxf 2
cd ..

#Assemble properties for the protein and calculate classical energy
###################################################################
#Note: For large systems, this may take several hours
#The code is not optimized, because the classical calculations
#are anyway faster than the QM calculations.

cd Full
pmisp 3 complex.xyz full.inp m
nemoint tot.xf g1.xf
cd ..

#Calculate non-classical energy
###############################

cd Model
pmisp 2 complex.xyz model.inp - lig.inp resmp2
cd ..

#Calculate van der waals energies
#################################
cd Full
calcvdw
cd ..
cd Model
calcvdw
cd ..


#Collect the energies and calculate the total interaction energy
################################################################

#CLASSICAL: -1411.044
#NON-CLASS:    22.357
#VDW(Full):  -143.185
#VDW(Model): -115.977    (to be subtracted)

#TOTAL:     -1415.895 kJ/mol



The instructions below are outdated.
Use instead the pmisp program (in /temp4/bio/Pmisp).
Unfortunately, the documentation seems to be missing.

Instructions for calculating multipoles for a large molecule (e.g. a protein)

  1. Start with a PDB file where the residue numbering indicates where the cutting should be performed (each residue number should appear in only one place of the PDB file). Each atom label must start with one of the following strings: 'H','C','N','O','F','P','S','CL' (case-sensitive), indicating the element.

  2. Prepare a file named restop describing the topology of the molecule. For proteins, use the template below. Normally, only the last part (after the three lines containing CB) has to be edited, because the first part describes the three most common junctions:

  3. The last part has two sections.
    First, a junction part containing the number of "junction sets" and a line with numbers R1,R2, I, T1, and T2 for each set. Each junction set introduces either one pair of junctions (if I=0) : type T1 modeling R1 in the R2 fragment and type T2 modeling R2 in the R1 fragment, or several junction pairs (if I=1): type T1 modeling R1 in R1+1, R1+1 in R1+2, etc. and type T2 modeling R1+1 in R1, R1+2 in R1+1,  etc., i.e. one junction is put between every consecutive pair of residues between R1 and R2. Note that residue numbers are given as in the PDB file, not the "order number", but the fragments will be renumbered in consecutive order.
    Example (for a protein with 348 residues and a Cys-Cys bond between residues 110 and 187):
    2
    1 348    1    1 2
    110 187  0    3 3

    The second section describes non-standard charges:
    The first line gives the number (N) of them.
    Then N lines with two integers R and C on each, specifying that the charge of residue R (after adding junctions) is C. The default is to sum the nuclear charges of the fragment. If that sum is even, the charge will be set to zero, if it is odd, the charge will be set to +1 unless the residue name is GLU or ASP, in which case it will be set to -1.
    Example:
    3
    348  -1
    1296  1
    1299 -1

  4. Use changepdb with command SP, to create the split.inp file.
    This file may also be created manually for small systems. It contains a line with two integers P and N, the number of "positive" and "negative" fragments (in the creation of the total multipole file, the multipoles for positive fragments are added, whereas the multipoles for the negative fragments are subtracted).
    Then follows a line with P+N integers giving the charge of each fragment.
    Then follows P+N lines each describing one fragment (first the P positive ones). Each line contains a list of integers giving the atom number (1,2,3.. etc. from beginning of the xyz file described below). A minus sign in front of the number indicates that a junction should be created replacing that atom with a hydrogen atom connected to the (non-negative) atom closest before in the list.

  5. Create an xyz file with exactly the same order of the atoms as the pdb file used to create the split.inp file. This can for example be done with changepdb, command x; or alternatively
    pdb2xyz:

    pdb2xyz FILE.pdb

    which takes FILE.pdb and creates FILE.xyz. It now handles CL but no other two-letter names (easy to add of course). The second line in the xyz file (the title line in xyz format) must give the total molecular charge (if nonzero). NOTE! This has to be added manually!

  6. Use the program split.py (has to be done locally). The general syntax is:

    split.py pass xyzfile split.inp [m]

    where pass=1 for splitting and 2 for joining after the LoProp calculations, and the m is only relevant for pass 2. So use

    split.py 1 FILE.xyz split.inp

    to create all the P+N  fragment xyz-files,named f1.xyz, f2.xyz ....
    You might encounter problems with total charge:
    Traceback (most recent call last):
      File "./split.py", line 43, in ?
        raise Exception, "Inconsistency in charges."
    Exception: Inconsistency in charges.
    Then
    cp /home/bio/Bin/split.py .
    and change on line 32:
        charge=0 
    to the actual charge.

    The fragments can be viewed by writing for example:

    molden f352.xyz

  7. Make a copy of the xyz2molcas.py script and edit it so that the correct basis set, DFT method etc. is used. NOTE: Always use an ANO-type basis set (e.g. ANO-L, A-6-31GP, A-AUG-CC-PVTZ, or A-AUG-CC-PVDZ), otherwise the LoProp method is not clearly defined. Make sure that a LOPROP section is included in the input and that the resulting $Project.MpProp is copied back from $WorkDir.

  8. Use the xyz2molcas.py script to create a Molcas input file for each fragment. This can be conveniently done by writing (in case of 700 fragments)

    for((i=1;i<=700;i++)) do ./xyz2molcas.py f$i.xyz f${i%.*}.input ; done

  9. Run Molcas on all the files. On milleotto, this can be done by submitting them to the automatic pool of calculations:

    for((i=1;i<=700;i++)) do a.automill label f${i%.*} ; done

    which creates "internal jobs" named labelf1, labelf2 etc. To start "real jobs" that pick calculations from the private pool, go to your  /disk/global/lottie/Auto directory and write

    subauto 168 somename

    to submit them for one week. Other useful commands (have to be invoked from the Auto directory) are

    autocheck.py (checks which internal jobs are started but not finished) and
    autorestart.py (restarts internal jobs)

    Check the sourcecode if you use them. Note that even if a job is counted as finished, something might have gone wrong. So the best way of checking how many are really finished is to stay in the directory with the input files and just look for the number of MpProp files:

    ls -l *.MpProp | wc

  10. When all calculations are finished and MpProp-files collected, convert them to xf files, for example by the command

    for((i=1;i<=700;i++)) do mpproptoxf a <f$i.MpProp >f$i.xf ; done


    where a stands for anisotropic polarizabilities (without 'a', no polarizabilities, with 'i' instead isotropic = scalar polarizabilities)

  11. Finally, use split.py again (has to be done locally):

    split.py 2 FILE.xyz split.inp m

    where m denotes that a molnr list should be created inside the resulting xf file, describing which multipoles that should polarize which polarizabilities. FILE.xyz is the xyz file constructed in point 4 above. The xf file format is described in the Molcas manual under XFIE.

  12. The resulting file tot.xf is used within Molcas by including something like this into Seward input:

    XFIE
    @$HomeDir/tot.xf
    RF-input
    REAC
    80.0 85.04 15
    LANG
    5.0 5.0 5.0
    GITTER
    0
    NODAMP
    AMBER
    CONV
    1.0E-15
    DRES
    end of RF-input

    where DRES denotes that the induced dipoles should be reset to zero in every iteration (default is now to keep them, but the convergence threshold must probably be set to something like 1.0E-15 (it is not an energy!) or lower before you gain anything by that. You can play around with it.

    The keyword AMBER is crucial, it tells Molcas to use the molnr list not only for excluding the static field from some multipoles but also for excluding the intramolecular coupling of polarizabilities. If all coupling is desired, one must not only remove AMBER keyword but also NODAMP, so that the closest interactions are damped. However, the damping seems to work poorly for a LoProp description including bond centers (it was developed for a description without bonds).

    REAC adds a reaction field around the full protein with a dielectric constant of 80 and using multipoles up to the 15th order. The second parameter (85.04 a.u. = 45 Å above) should be the radius of the protein with respect to the origin. It should be chosen so that no multipole is outside this radius (with a margin of ~ 1Å). In our tests, this reaction field gives an effect of ~1 kJ/mole, and the radius is not so important.

  13. The format of the xf file is the following:
    The first row gives four numbers, nlines, nmult, npol, ngroup, where nlines=the number of multipoles, nmult=the order of the multipoles (0=only charges, 1= also dipoles, 2= also quadrupoles), npol=the type of polarisabilities (0=no, 1=isotropic, 2=anisotropic), and ngroup=the number of polarisability groups.
    The following nlines lines (one for each multiples) gives ngroup integers (mostly zeroes), the xyz coordinates, and the charge, the three dipole moment components, the 6 quadrupole moment components, and the 1 or 6 polarisability components (of course depending on nmult and npol)

    11012 2 2 21
    1 2 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0   35.96904249   -26.06309420    38.91890615    -0.32627445   -0.09188406     0.07178359    -0.06800498    -3.49323812    -0.00186702    -0.03713634    -3.45357719     0.01717702    -3.37173779     0.71192897    -0.28306137     0.08671128     0.62045706    -0.05815451     0.81053710
    2 1 3 4 5 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0   33.94703474   -24.82721282    37.48838290    -0.05487998    0.15113481     0.06667714     0.07783401    -3.11257023    -0.05914829     0.09139883    -3.41107483    -0.03523675    -3.36723902     0.11640074    -0.33326000    -0.08837108     0.36239895     0.13097959     0.74927055

    The xf file can be modified by the program changexf (it reads the molcas input file).
    You can delete polarisabilities and multipoles, change to isotropic (scalar) polarisabilities, reduce the multipolemoment level and (via a pdb file) do the same only for residues outside a certain distance from the quantum system.



Nu har jag lagt in så att man lätt kan vrida molekyler i MCFEP. Jag skapade också ett nytt script för detta:

nemointmove f1.xf f1org.xyz f1new.xyz    f2.xf f2org.xyz f2new.xyz

som räknar ut interaktionsenergin mellan f1.xf och f2.xf när de är i sina nya geometrier f1new.xyz och f2new.xyz. f1org.xyz ska vara samma geometri som f1.xf men ska innehålla atomerna i samma ordning som f1new.xyz. Xyz-filerna ska vara i det vanliga formatet, d.v.s. koordinater i Å. Den ignorerar de två titelraderna och använder sen de tre första atomerna för transformationen. Den klarar småstörningar i molekylerna (några pikometer hit eller dit) men är inte stabil för större ändringar.
Pär 13/2-12

A sample restop file

3
3 3
C
0.0
O
0.0
CA
0.0
CA
CA
CA
2 4
N
0.0
CA
0.0
N
CA
CA
CA
2 3
SG
0.0
CB
0.0
CB
CB
CB
2
1 348    1    1 2
110 187  0    3 3
3
348 -1
1296 1
1299 -1


How to recontract basis sets for use in LoProp calculations.

1. Start with a MOLCAS basis set file (available in the basis_library of a Molcas distribution).

2. For each element you are interested in, cut out the relevant part and take away comment lines (starting with *) and options etc. See example below. Save this separately for each element as X.basis, where X is the element name (first letter capital).

3. Run the script makelopropbasis (located in /home/bio/Bin) whose arguments are the element names for which you have prepared basis files. For example:

makelopropbasis C H O N S F Cl

if you want all elements for which there are currently input files available. If you need another element, you have to prepare an input file for the atomic and anionic species, following the examples in /home/bio/Lopropbasis. The script assumes that you have in your path a script named m.auto for running standard MOLCAS calculations (an example is in /home/bio/Lopropbasis).

4. Check for non-zero return codes etc., otherwise you should have got, for each element, the file X.recont containing the recontracted basis set.

5. Plug in this content in the basis set file, rename the basis set labels by adding the prefix A-, and change the file name similarly.

6. Put a copy of the basis set file in /home/bio/Lopropbasis to avoid double work.


Example of an O.basis file (nuclear charge 8, max ang. momentum 2, 10s5p2d contracted to 4s3p2d)

         8.   2
  10    4
      11720.00000000
       1759.00000000
        400.80000000
        113.70000000
         37.03000000
         13.27000000
          5.02500000
          1.01300000
          0.30230000
          0.07896000
 0.00071000 -0.00016000  0.00000000  0.00000000
 0.00547000 -0.00126300  0.00000000  0.00000000
 0.02783700 -0.00626700  0.00000000  0.00000000
 0.10480000 -0.02571600  0.00000000  0.00000000
 0.28306200 -0.07092400  0.00000000  0.00000000
 0.44871900 -0.16541100  0.00000000  0.00000000
 0.27095200 -0.11695500  0.00000000  0.00000000
 0.01545800  0.55736800  0.00000000  0.00000000
-0.00258500  0.57275900  1.00000000  0.00000000
 0.00000000  0.00000000  0.00000000  1.00000000
   5    3
         17.70000000
          3.85400000
          1.04600000
          0.27530000
          0.06856000
 0.04301800  0.00000000  0.00000000
 0.22891300  0.00000000  0.00000000
 0.50872800  0.00000000  0.00000000
 0.46053100  1.00000000  0.00000000
 0.00000000  0.00000000  1.00000000
   2    2
          1.18500000
          0.33200000
 1.00000000  0.00000000
 0.00000000  1.00000000


From Pär 28/8-09
I updated changepdb so that it can create the restop file automatically in 
simple cases, especially for a "chemical" model. In principle it works
also for a whole protein, but if there are any special residues you have
to include all of them in the acon file for it to work, so it may be
easier to do the restop file manually.

I also created an example which I think can be useful for anyone using
PMISP/MM. It is here:
http://www.teokem.lu.se/~pmispexample.tar.gz
It simply contains the files needed to reproduce one of the interaction
energies in the JPCB paper and a file runscript which in principle runs
the full calculation (you may ignore the warnings issued), in practice
you would have to run the molcas calculations in parallell on the
cluster, but the rest of the steps are fairly quick.