ComQum-N
ComQum-N is a method to combine QM/MM calculations and NMR structure refinement to obtain an optimum compromise between NMR and quantum chemistry.

The method is described in

Y.-W. Hsiao, T. Drakenberg & U. Ryde (2005) "NMR structure determination of proteins supplemented by quantum chemical calculations: Detailed structure of the Ca2+ sites in the EGF34 fragment of protein S", J. Biomol. NMR, 31, 97-114.

How to run a ComQum-N job
In this version, we consider QC as a complement to MM in the NMR investigation.
Form to fill in
  1. Start with a refined NMR structure. You should either take the best structure of the ensemble, or run several of the structures in the ensemble.
  2.  
  3. Find out from the pdb file the following data

  4. 1. Parameter and topology files (at least if CNS was used for refinement)

    REMARK   3  PARAMETER FILE  1  : PROTEIN_REP.PA
    REMARK   3  PARAMETER FILE  2  : WATER_REP.PARA
    REMARK   3  TOPOLOGY FILE  1   : PROTEIN.TOP
    REMARK   3  TOPOLOGY FILE  2   : WATER.TOP

    2.  All NMR restraint files

    3. All energy terms  and violations of the NMR restraints  

  5. If necessary, construct topology and parameter files for unusual ligands, etc.
  6.    
  7. Modify atom names in the pdb file
    1. Change the name of ions to the proper CNS names (e.g. residue=ZN2, atom=ZN+2).
    2. Note that if there are four letters name in the pdb file, they must start so that there is a blank before the residue. Use changepdb to change the format to -6.
    3. Moroeover, cns does not read the subunit key, so you may need to renumber the residues by changepdb, command rr.
    4. Note that changepdb removes TER from the file. These may have to be added manually; otherwise you can get PATCH-ERR in next step.
    5. Remove segid from the pdb file if present.

  8. Go to CNS home page (Input files - General) and set up a generate file for the whole protein. You can base it on either generate.inp or generate_easy.inp. In the latter case, you should rename the file to generate.inp. You should typically change:
    1. coordinate file
    2. set hydrogen flag = true
    3. build only unknown hydrogens (none should need to be built)
    4. output structure file = mm3.mtf
    5. output coordinate file = generate.pdb
    6. protein topology file = protein-allhdg.top
    7. protein parameter file = protein-allhdg.param
    8. Insert possible extra topology and parameter files as prostethic group and ligand

  9. Then run cns with this file and check the output
    cns <generate.inp >generate.out
    grep ERR generate.out
    grep WRN generate.out
    grep INFO generate.out
    grep error generate.out
    grep warning generate.out

    This gives the files generate.pdb and mm3.mtf, which should be used in the following.

    Check in the pdb file that no atoms have been added and that disulphide likages have been addred properly.

  10. Run changemtf on generate.mtf and construct a pdb file, comqum.pdb, using coordinates in generate.pdb. This is necessary to get the proper charges into comqum.pdb:

    changemtf
    mm3.mtf
    p
    p
    generate.pdb
    comqum.pdb
    q

  11. Run pdbtocomqum on comqum.pdb to set up the various systems and junctions.
    Note that when selecting the quantum system, if there is a hydrogen bond from the back-bone NH or CO groups to the quantum system, this group should be included in the quantum system .
    1. Select the name of the logfile (default is OK)
    2. Enter the name of the pdb file (comqum.pdb)
    3. Do not search for short contatcts
    4. Enter a new title (it is better done in a file, see next point).
    5. Select the quantum system, including the junction atoms that will be changed to hydrogens (you first give the number of the amino acid, then the number of the atoms in this amino acid; one <CR> let you select a new amino acid; another <CR> ends the selection).
    6. This can preferably be done by entering the name of a file with the atom numbers. This file should also contain the new title as the first line (sample file).
    7. Select the radius for system 2 (MM system that is optimised). This should be the whole protein (1000 Å).
    8. Do not include or remove residues from this system (hit return twice).
    9. Select the radius for system 3 (MM system that is fixed). This should be 0 Å (the whole protein has already been included).
    10. Select a nmr type of calculation, by entering n(mr).
    11. Select the basis set (i.e. the type of junction, typically 7).
    12. Enter the junction atoms (those that will become hydrogen atoms in the quantum system)
    13. Ensure that they are not unknown (i.e. that the junction type is not set to -1).
    14. Press return until the program is finished.

    This gives the files:
    comqum.dat   comqum.mmout   comqum.qcout    logfil  

  1. Run comqumtoturbo to get the Turbomole files:

  2. control  coord    pointcharges
     
  3. Check the name of metal atoms in file coord (they are normally not correct) and then run
    define
    to define the basis sets and get starting orbitals.
    Normally, you should also change the DFT functional to b-p (last menu, dft; func b-p)+ ri (last menu, ri; on; m 120 or more; check that no auxilary basis set is missing).
    It is also wise to construct an UFF hessian in the geometry menu:
    Go into the molecular geometry menu and select ff
    change the charge of the molecule (if <>0) by c +2
    and then start the calculation with return.
    You have now run a single-point Hessian (frequency) calculation with UFF, and the result is in the file uffhessian0-0.
    After you have finished define, check in the control file
    $forceinit on
     carthess
    (instead of default; there is often multiple entries of $forceinit; remove all except the last one)
    and (re)insert the line
    $grad   file=gradient
    Thereby, you will read the Hessian matrix and use it as an initialisation of the forceapprox matrix.
  4.  
  5. Construct parameter and topology files for the quantum system (without H-atoms).
    Most files are present in /home/bio/Data/Nmr. Copy these to the working directory. If they are many, you can use cat to concatenate them all into two files comqum.top and comqum.par, which are included by default in generate1.inp and mmrun1 (next step).
  6. If some files are missing, they can automatically be constructed with the program mkjunct. Ensure that you have links to the correct CNS topology and parameter files (typically protein-allhdg.top/.param) in the directory from which you run the program.
    Check the results carefully.
    In particular, there normally is some duplicate angles in the *.par file.

    The new topology and parameter files should involve exactly the same atom types and parameters as in the full protein, except at the junctions.
    Note that the bond parameters of the junction hydrogen atoms are fixed by the corresponding parameters for the junction carbon atoms and junctfact (which is found in the comqum.dat file).
    Thus the force constant of the junction hydrogen atom must be the force constant of the junction carbon atom * juncfact^2, whereas the equilibrium parameter should be the carbon equilibrium parameter / juncfact (i.e. it should be the ideal bond length of that bond with that basis set, as listed in /home/bio/Data/junctfactor_cns. Otherwise a spurious extra force will be introduced:
    FMM1=k(x-x0)^2
    FMM2=k/jf^2(x*jf-x0*jf)^2 = k/jf^2*jf^2(x-x0)^2 = k(x-x0)^2 = FMM1
    Example (MMP):
    $junction_atoms
        14    11    1.38765000
    bond CPB  CH3E   572.053     1.5028
    bond CPB  HA    1101.53      1.083
    1101.53=572.053*1.38765*1.38765
    1.083=1.5028/1.38765
     
  7. Run comqumtocns to get the CNS files:

  8. Enter the name of the system1 topology and parameter files (besides protein-allhdg, water, ion, and comqum). If they are many, you can use cat to concatenate them all into two files comqum.top and comqum.par, which are included by default in generate1.inp and mmrun1.

    This gives the following files:
    fix.pdb    generate1.inp    mmrun1    pdb1    pdb3  

  9. If there are any bonds between residues in the quantum system, you have to arrange a patch that constructs such a bond.  This is done in the file generate1.inp, before the row:
    {Write coordinates and structure}
    For a normal amino acid, you can use the standard protein patch:
    patch PEPT reference=-=(resnam COV) reference=+=(resnam NHD) end
    where you only change the two residue names.
    For proline, a special patch is needed:
    patch PEPU reference=-=(resnam COI) reference=+=(resnam NHP) end
    And also read it into generate.inp (after the other topology statements):
    topology    @/home/bio/Data/Nmr/pepu.top  end



  10. Construct the mmrun2 file from the original NMR refinement file.
    This is described in detail below.
    Copy all the restraints files to the working directory.
    A sample mmrun2 file is here.

  11. If you have molecule for anisotropy (origin, x, y, and z), these coordinates must be added by hand to mm3.pdb and the lines were these coordinates are read in have to be removed from mmrun2 (i.e. from the input coordinates file). You also have to put in the corresponding rows in the file fix.pdb (with coordinates 1.0, 2.0, 0.0, 1.0, 0.0 for all four "atoms").
    You also have to add four point charges (with charge = 0.0) in the file pointcharges. You can add them with any coordinates (copy the last line and zero the charge). Then run
    fixcoord2_cnsin
    fixcoord2_turboin
    fixcoord2
    fixcoord2_turboout
    to get the correct coordinates.

  12. Construct the mmrun3 file from mmrun2
    This is also detailed below (mmrun3).

  13. Test all programs by running

    cqnprep


    No ERR are allowed
    Check carefully all warnings.

    Check carefully that mmrun1 and mmrun2 have the same MM energy terms (bond, angle, dihe, and vdw; mmrun2 should in addition have all the NMR energy terms) and non-bonded parameters. This is not trivial, because these parameters are changed during the run of mmrun2. However, the used parameters are written  out in both files, just before the energy calculation.

    Check that if you mv minimize3.pdb and minimize3.pdb1 to mm3.pdb and mm3.pdb1 and run mmrun2, you get exactly the same energy (harm may differ by ~10¯5) as in mmrun3.out. Do this in a separate directory if you want to run a calculation with fixed protein first:
    mkdir T
    cd T
    cp ../* .
    cns <mmrun3 >mmrun3.out
    /bin/mv minimize3.pdb mm3.pdb
    /bin/mv minimize3.pdb1 mm3.pdb1
    cns <mmrun2 >mmrun2.out
    xdiff minimize_1.pdb refine_1.pdb

    Check that the last sani and dani forces are unity (in mmrun2.out):
    Setting force const for class S1       to    1.000
    Likewise, check carefully that you get the same energy in mmrun2 for the same final coordinates of mmrun3.

    Also check the fixcoord1 output carefully - it will identify problems with the correspondance list.
    Look in the files mmrun[123].out  for violations.

  14. You can add restraints on a distance between two atoms using (note the mixed units):
    $restraints
      atom1(1)  atom2(1)  desired_distance_in_Å   force_constant_in_au

    $restraints
    1 3  1.00 10.0


    A typical force constant is 10 a.u. (which typically gives a 0.05 pm deviation at convergence)

  15. Remove unneccesary files and make a backup by running
    cqnback

    The following 21 (20 for RHF) files are needed to run comqumn:
    auxbasis    comqum.par  fix.pdb  mm1.pdb1  mm3.pdb   mmen2   mmrun2  pointcharges
    basis       control     mm1.mtf  mm2.mtf   mm3.pdb1  mmen3   mmrun3  uffhessian0-0
    comqum.dat  coord       mm1.pdb  mm3.mtf   mmen1     mmrun1  mos

    In addition, you need the NMR restraint files and the parameter files for all heterocompounds. Check carefully that these are left in the directory (test to run
    cns<mmrun2>mmrun2.out
    grep ERR mmrun2.out
    rm force2 mmen2_1 mmrun2.out refine_1.pdb

    Cqnback constructs a directory Gz in which all files are saved as a backup. This directory is useful if you want to see what you did during the setup or if you want to start a new but similar calculation.  
     
  16. Start comqumx (the option -ri is necessary for RI calculations; the -c option gives the maximum number of geometry steps):

    comqumx -ri -c 200
    .

  17. When the calculation is finished, it is often useful to calculate the energy of the quantum system without any point charges (to be used for strain energies). You do this in the following way:
    mkdir Str
    cp control coord basis energy mos alpha beta auxbasis Str
    cd Str
    dscf >logd
    mul or mul2
    mv mul ../mul_str
    mv mos ../mos_str
    mv alpha ../alpha_str
    mv beta ../beta_str
    mv energy ..
    /bin/rm *
    cd ..
    rmdir Str


How to construct the mmrun2 file Start from the original NMR refine files (or construct new ones using the CNS home page, e.g. using the anneal.inp template).
    1. Add new parameter files (e.g. ion.param and water.param)
    2. Change the name of the structure file to mm2.mtf
    3. Change the name of the coordinate file to mm3.pdb
    4. Change md.type.hot="cartesian"
    5. Change md.type.cool ="cartesian"
    6. Set the number of trial structure pdb.end.count=1
    7. Set the number of steps in the high-temperature annealing md.hot.step=0
    8. Change md.cool.temp=1000
    9. Set the number of steps in the first slow-cool annealing md.cool.step=0
    10. Change md.cool.vdw=4.0
    11. Change md.cool.ss=0.005
    12. Change md.cool.tmpstp=25
    13. Set the number of steps in the second slow-cool annealing md.cart.step=0
    14. Set the number of steps in the final minimization md.pow.step=0
    15. Set the number of cycles in the final minimization md.pow.cycle=0
    16. Change nmr.sani.force.finl.1=1.0
    17. Set the base name for output coordinate files pdb.out.name="refine"
         
* Insert after the row end if after the row (about row 530):
coor @@&pdb.in.file.3

{ComQum: Read the 4th to 8th decimals and include them in x, y, z}
do (xcomp=0.0) (all)
do (ycomp=0.0) (all)
do (zcomp=0.0) (all)
coordinates disp=comp @mm3.pdb1

do ( x = x + xcomp/100000.0 ) (all)
do ( y = y + ycomp/100000.0 ) (all)
do ( z = z + zcomp/100000.0 ) (all)

* Comment out all dynamics in the high-temperature dynamics, i.e. insert
{Cqn
before dyna(mics) on lines 766, 788, and 806; and insert
Cqn}
after end on lines 776, 798, and 816

* Comment out the first slow cool dynamics by inserting
{Cqn
before
(line 891)
      if ( &md.type.cool = "torsion" ) then
and
Cqn}
after (line 911)
     end if


* Insert before the row (about line 1068)
 @CNS_NMRMODULE:printaccept ( ave=$ave; :

set echo=on message=on end
{ComQum Set flags}
flags ? end
parameter  nbonds ? end end

{ComQum Calculate energy and gradient}
energy end

{ComQum Print energy to file mmen2}
set display=mmen2 end
display ComQum Energy MM2
display $ener[e20.14]
display ComQum MM Energy
evaluate($cqmm=$bond+$angl+$impr+$dihe+$vdw)
display $cqmm[e20.14]
display ComQum NMR Energy
evaluate ($cqnmr=$noe+$coup+$oneb+$carb+$prot+$dani+$sani+$cdih)
display $cqnmr[e20.14]
display ComQum Other Energy
evaluate ($cqother=$harm+$ncs)
display $cqother[e20.14]
display ComQum NOE Energy
display $noe[e20.14]
display ComQum ANI Energy
evaluate ($cqani=$sani+$dani)
display $cqani[e20.14]
display ComQum CDih Energy
display $cdih[e20.14]
display ComQum NOE weight
display &md.pow.noe
display ComQum CDih weight
display &md.pow.cdih

{ComQum Calculate the number of atoms}
show sum (1) (all)
evaluate ($natom=$result)

{ComQum Print gradient to file force2}
set echo=off message=off end
set display=force2 end
display ComQum Gradients MM2
display $natom[i6]
for $4 in id (all) loop grad
   show (dx) (id $4)
   eval ($1=$result)
   show (dy) (id $4)
   eval ($2=$result)
   show (dz) (id $4)
   eval ($3=$result)
   display $1[E20.14] $2[E20.14] $3[E20.14]
end loop grad
set display=OUTPUT message=normal end

Back


How to construct the mmrun3 file
  1. cp mmrun2 mmrun3
  2. Change the name of the structure file to mm3.mtf.
  3. Set the number of steps in the final minimization md.pow.step=200
  4. Set the number of cycles in the final minimization md.pow.cycle=1
  5. Change the base name of the output coordinate files pdb.out.name="minimize" (from refine);
* Insert after section: {ComQum: Read the 4th to 8th decimals and include them in x, y, z}  i.e. before (line 533)
parameter

{ComQum: Read in fixed atoms <=> xcomp=0.000 and fix them}
do (xcomp=1.0) (all)
do (ycomp=1.0) (all)
do (zcomp=1.0) (all)
coordinate disposition=comp @fix.pdb
fix select=(attribute xcomp < 0.5) end

* Replace the ComQum section (starting with set echo=on, ending with set deisplay=OUTPUT) before the row (about line 1128)
 @CNS_NMRMODULE:printaccept ( ave=$ave;
with the following
(or Change mmen2 to mmen3 in two places and change MM2 to MM3 in the next row; and
remove  the sections {ComQum Calculate the number of atoms} and {ComQum Print gradient to file force2} (20 lines in total):

set echo=on message=on end
{ComQum Set flags}
flags ? end

parameter  nbonds ? end end


{ComQum Calculate energy and gradient}
energy end

{ComQum Print energy to file mmen3}
set display=mmen3 end
display ComQum Energy MM3
display $ener[e20.14]
display ComQum MM Energy
evaluate($cqmm=$bond+$angl+$impr+$dihe+$vdw)
display $cqmm[e20.14]
display ComQum NMR Energy
evaluate ($cqnmr=$noe+$coup+$oneb+$carb+$prot+$dani+$sani+$cdih)
display $cqnmr[e20.14]
display ComQum Other Energy
evaluate ($cqother=$harm+$ncs)
display $cqother[e20.14]
display ComQum NOE Energy
display $noe[e20.14]
display ComQum ANI Energy
evaluate ($cqani=$sani+$dani)
display $cqani[e20.14]
display ComQum CDih Energy
display $cdih[e20.14]
display ComQum NOE weight
display &md.pow.noe
display ComQum CDih weight
display &md.pow.cdih
set display=OUTPUT message=normal end
set echo=off message=off end

* Insert after end loop main (line 1124):

{ComQum: Construct the set of 4th to 8th decimals,
         write them to file minimize3.pdb1,
         and write truncated coordinates to minimize3.pdb}
do ( xcomp = int( x * 1000.0) / 1000.0 ) (all)
do ( ycomp = int( y * 1000.0) / 1000.0 ) (all)
do ( zcomp = int( z * 1000.0) / 1000.0 ) (all)
do ( refx = (x - xcomp)*100000.0 ) (all)
do ( refy = (y - ycomp)*100000.0 ) (all)
do ( refz = (z - zcomp)*100000.0 ) (all)
do ( x = refx ) (all)
do ( y = refy ) (all)
do ( z = refz ) (all)
write coord output=minimize3.pdb1 end
do ( x = xcomp ) (all)
do ( y = ycomp ) (all)
do ( z = zcomp ) (all)
write coord output=minimize3.pdb end

 

Sample file

Back



To run a pure CNS ComQum calculation
(i.e. ComQum with the CNS MM force field, but no crystal data):

Make the following two changes in mmrun2

  1. Line 238:

  2. {===>} wa=0;
    (Instead of:  wa=-1;)
  3. Line 511:

  4. exclude elec pvdw xref
    (Instead of: exclude elec include pvdw xref)


To run a ComQum-X calculation without any MM force field (No MM)
(i.e. only crystal data; the quantum system is not corrected for the change in junction atoms).

Make the following two changes in mmrun2

  1. Line 238:

  2. {===>} wa=1.0;
    (Instead of:  wa=-1;)
     
  3. Line 511:

  4. exclude elec bond angl dihe impr vdw pvdw include xref
    (Instead of: exclude elec include pvdw xref)
Also make the following change in mmrun1
  1. Line 8:

  2. flags exclude elec pele bond angl dihe impr vdw ? end
    (Instead of: flags exclude elec pele ? end)

This is /home/bio/Bin/cqxprep :

# ComQum-X Prep
# Ulf Ryde, 6/3-01
#
cp /home/bio/Data/comqum.par .

echo CNS generate 1
cns <generate1.inp >generate1.out
grep ERR     generate1.out
grep WRN     generate1.out
grep INFO    generate1.out
grep error   generate1.out
grep warning generate1.out

echo Construct dummy pdb files for 8-decimal precision
echo Enter mm1.pdb, format, 1, 3, 0, dummy, return, q
changepdb

echo
echo Enter mm3.pdb, format, 1, 3, 0, dummy, return, q

changepdb

echo CNS mmrun 1
cns <mmrun1   >mmrun1.out
grep ERR       mmrun1.out
grep WRN       mmrun1.out
grep INFO      mmrun1.out
grep violation mmrun1.out
#page mmrun1.out

echo CNS mmrun 2
cns <mmrun2   >mmrun2.out
grep ERR       mmrun2.out
grep WRN       mmrun2.out
grep INFO      mmrun2.out
grep error     mmrun2.out
grep warning   mmrun2.out
grep violation mmrun2.out

echo CNS mmrun 3
cns <mmrun3   >mmrun3.out
grep ERR       mmrun3.out
grep WRN       mmrun3.out
grep INFO      mmrun3.out
grep error     mmrun3.out
grep warning   mmrun3.out
grep violation mmrun3.out

echo fixcoord1

fixcoord1_cnsin
fixcoord1_turboin
fixcoord1
fixcoord1_cnsout

# Make the files minimize3.pdb1
cp mm3.pdb1 minimize3.pdb1


This is a schematic overview of  /home/bio/comqumn

# Start with a scf step then loop through the following four steps

# Gradient step (gradient calculation)
rm *_[123]
grad
offsetf
cp gradient grad1
fixforce_cnsin
fixforce_turboin
fixforce
fixforce_turboout
mv gradient g1
mv grad1 gradient

# Relaxation step
relax >>output
fixcoord1_cnsin
fixcoord1_turboin
fixcoord1
fixcoord1_cnsout

# Molmech step
# if  protfree then
   cp mm3.pdb mm3.old
   cns <mmrun3 
   cp refine_1.pdb mm3.pdb

   mv minimize.pdb1 mm3.pdb1
#endif

# Scf step (energy calculation)
cns <mmrun1
cns <mmrun2
dscf
grep 'Self energy' output > selfenergy
fixenergy_cnsin
fixenergy_turboin
fixenergy
fixenergy_turboout
checkconv
 

key about pdb files:
mmrun1:  mm1.pdb (no output file)
mmrun2: mm3.pdb -> minimize.pdb (not used)
mmrun2: mm3.pdb -> minimize3.pdb (used by bindividual.inp)
bindividual.inp: minimize3.pdb -> bindividual.pdb (moved to mm3.pdb)



These are the programs you need (apart from Turbomole and CNS):
fixforce_cnsin
fixforce_turboin
fixforce
fixforce_turboout
fixcoord1_cnsin
fixcoord1_turboin
fixcoord1
fixcoord1_cnsout
fixenergy_cnsin
fixenergy_turboin
fixenergy
fixenergy_turboout
comqumx

In addition, you may need some of our accessory programs:
pdbtocomqum
comqumtoturbo
comqumtocns
cqxprep
cqxback
changepdb



This is an example of a file with the quantum system (syst1)
New title
531
534-541
1229
1231-1234
1266
1269-1276
1324
1327-1334
1443

Go back