ComQum-X
ComQum-X or quantum refinement is a method to combine X-ray crystallographic refinement with QM/MM calculations to obtain an optimum compromise between crystallography and quantum chemistry.
It requires the structure factors from a crystallographic investigation.

The proper reference of this method is:
U. Ryde, L. Olsen & K. Nilsson (2002) "Quantum chemical geometry optimisations in proteins using crystallographic raw data", J. Comp. Chem., 23, 1058-1070.

How to run a ComQum-X job
New version where the cns files are constructed by sedscript
In this version, we consider QC as a complement to MM in the X-ray investigation.
Thus we do not inclued electrostatics and have no hydrogen atoms.
  Form to fill in
  1. Start with a refined crystal structure, including structure factors. The structure factors can be found from the summary entry in the 3DBrowser (i.e. look first up the pdb file).

  2. cp /temp4/bio/Data/Comqumx_lib/* .

  3. Edit the file sedfile by inserting (the values should appear between the second and last /; e.g. s/cqx_crystal_a/43.1/) the data in the following points:
    1. Space group (http://stock.cabm.rutgers.edu/xtal/NOTES/sg_table.html gives a proper translation table; note that the last number in the CRYST1 record is the number of the space group)
    2. Crystal data (a, b, c, alpha, beta, gamma)
    3. File names: coordinates, parameter, topology (do not include the CNS files), and reflections.
    4. Percentage in the test set (if not already set)
    5. Low and high resolution limits
    6. Residues in the quantum system
     
  4. Find out from the pdb file the following data

  5. 1. Unit cell and symmetry
    CRYST1   49.970   58.680   98.050  90.00  90.00  90.00 P 21 21 21    4
    (this pdb row show a, b, c, alpha, beta, gamma, and the symmetry)

    2. Resolution range
    REMARK   3  DATA USED IN REFINEMENT.
    REMARK   3   RESOLUTION RANGE HIGH (ANGSTROMS) : 1.90
    REMARK   3   RESOLUTION RANGE LOW  (ANGSTROMS) : 20.00

    3. Sigma-cutoff value
    REMARK   3   DATA CUTOFF            (SIGMA(F)) : 0.000

    4. R-factor
    REMARK   3   R VALUE            (WORKING SET) : 0.181
    REMARK   3   FREE R VALUE                     : 0.231

    5. Test set size
    REMARK   3   FREE R VALUE TEST SET SIZE   (%) : 4.900
    REMARK   3   FREE R VALUE TEST SET COUNT      : 1042

    6. Parameter and topology files (at least if CNS was used for refinement)
    REMARK   3  PARAMETER FILE  1  : PROTEIN_REP.PA
    REMARK   3  PARAMETER FILE  2  : WATER_REP.PARA
    REMARK   3  TOPOLOGY FILE  1   : PROTEIN.TOP
    REMARK   3  TOPOLOGY FILE  2   : WATER.TOP

    7. Alternative configurations
    REMARK   3  OTHER REFINEMENT REMARKS: ALTERNATE CONFORMATIONS FOR
    REMARK   3  RESIDUES 33, 120, 121 AND 122
     

  6. If there are alternative configurations in the file (you find occupation numbers <>1 by changepdb, command occ) you can either remove them (take the major one and check that Rfree is not significantly changed) or you can include them (this is better but more tedious), provided that they are not in the quantum system (se below how).
  7.  
  8. Insert END at the end of all files.

  9. It is often wise to copy the subunit column to the segid column (pos 73-76; if not used).
    changepdb, command seg followed by return and w does this for you.
    HETATMiiiiixaaaaarrrxaiiiiaxxxffffffffggggggggffffffffggggggffffffxxxxxxsegiel-1
    ****|****1****|****2****|****3****|****4****|****5****|****6****|****7****|****8
  10.  
  11. If necessary, construct topology and parameter files for unusual ligands, etc.
    They can be obtained either by describe (Hess2FF; from QM calculations) or from the HIC-UP web server.
    Collect all top and par files in the files all.top and all.par.
  12.  
  13. Change the name of ions to the proper CNS names (e.g. residue=ZN2, atom=ZN+2; it is recommend also to change the charge in the last 2 columns (79 and 80) of the pdb file).
    Note that if there are four letters name in the pdb file, they must start so that there is a blank before the residue. Use changepdb to change the format to -6.
    Moreover, cns does not read the subunit key, so you may need to renumber the residues by changepdb, command rr.
    Note that changepdb removes TER from the file. These may have to be added manually; otherwise you can get PATCH-ERR when you run generate.

  14. Run cqx_sedscript
    This will construct the proper cns files:
    bindividual.inp  bindividual1.inp  generate.inp  make_cv.inp  mmrun2  mmrun3
    1.  
  15. Run generate for system 3:

    cns <generate.inp >generate.out

    grep ERR     generate.out
    grep WRN     generate.out
    grep INFO    generate.out
    grep error   generate.out
    grep warning generate.out

    These warnings and info are acceptable:

      %READC-WRN: still   1361 missing coordinates (in selected subset)
     CNSsolve>           display  %INFO: There are no coordinates missing for non-hydrogen atoms

    This gives the file mm3.pdb, which should be used as the pdb file in the following.
    Check the start (REMARK section) of this file for added Cys-Cys links and missing atoms.

    If you want to avoid added atoms (you almost always want that), set (this is now default):
    {===>} atom_delete=(not known);


    If you want to add or delete bonds, you need to include patches:
     {* any special protein patches can be applied here *}
    {===>}
    topology
       @fs2.patch
    end

    patch FS2P
       reference=1=(resid 632) reference=2=(resid 188) reference=3=(resid 260) reference=4=(resid  97) reference=5=(resid 128)
    end

    Remark Patch to build the Fe-Cys bonds for a Fe2S2 cluster
    presidue FS2P

      add bond 1FE1 2SG
      add bond 1FE1 3NH1
      add bond 1FE2 4SG
      add bond 1FE2 5SG
      add angle 2SG  1FE1  1S1
      add angle 2SG  1FE1  1S2
      add angle 3NH1 1FE1  1S1
      add angle 3NH1 1FE1  1S2
      add angle 4SG  1FE2  1S1
      add angle 4SG  1FE2  1S2
      add angle 5SG  1FE2  1S1
      add angle 5SG  1FE2  1S2
      add angle 2CB  2SG  1FE1
      add angle 3CZ  3NH1 1FE1
      add angle 4CB  4SG  1FE2
      add angle 5CB  5SG  1FE2
    end

  1. If you want to include alternative configurations, insert the atoms involved in the atom_select keyword in the file alternate.inp, using the format "atom segid residue_number atom_name", e.g.  atom S 234 NE2 or atom L 33 CG (this list can be set up with changepdb, command occ).
    Also check that you have only two configurations of each residue.
    Then run
    cns <alternate.inp >alternate.out
    Atoms with alternate configurations will get new segids AC1, AC2, but they will have identical coordinates (that of conformation B!). This can be corrected by running changepdb, command al on the file alternate.pdb. Write the file out as mm3.pdb

    Finally
    \mv alternate.mtf mm3.mtf

    The script alternate.inp does not work with metal ion with names like "Fe+2".
    We managed to solve the problem by changing the atom name to "Fe  " in alternate.inp, mm3.mtf (only 1st field = atom name) and mm3.pdb. Then you have to change them back to FE+2 (now in both AC1 and AC2) in alternate.mtf and alternate.pdb after running cns.
    This is now done automatically with changepdb command fe and changemtf command f.

  2. cp mm3.pdb comqum.pdb
    Then change the name of the histidine residues in the quantum system to HID or HIE (to get the right protonation and to get correct junctions). Cysteinate metal ligands may remain called CYS.

  3. Protonate the quantum system in this file (comqum.pdb).

  4. One way is to use tleap (see Amber equilibration page).
    Another way to do this is to run protonate on the protein:
    /temp4/bio/AMBER/Amber7/exe/protonate -k -d /temp4/bio/Data/Amber/PROTON_INFO.bio <comqum.pdb >ppdb
    This will protonate all amino acids. To protonate water molecules, use changepdb, command addwat and then command w(rite). This will give you crude proton coordinates, which you might want to improve. Alternatively, you can add them manually with Spartan. To automatically improve the proton coordinates, follow the procedure in the Amber equilibration setup , points 9-16.
    Note that you should add protons only to the quantum system and not to the whole protein. However, protonate works only for full amino acids. Therefore, it is easiest to protonate the whole protein first and then copy only exactly those hydrogen atoms you need into the pdb file.
    B factors and occupation numbers need to be defined. Therefore, set occupation numbers of the hydrogen atoms to 1.00 and the B factors to the same value as the mother heavy atom.
    Note that rasmol does not understand the Amber format of four-character hydrogen atoms. If you want to visualise the result, run changepdb command f(ormat), chose format 3, then write out a new file with command w(rite).

    If you have an old comqum.pdb file with the H atoms already added, this can be done automatically with changepdb command cph.
     
  5. Run pdbtocomqum
    1. Select the name of the logfile (default is OK)
    2. Enter the name of the pdb file (comqum.pdb)
    3. Do not search for short contatcts
    4. Enter a new title (can be done in a file, see next point).
    5. Select the quantum system, including the junction atoms that will be changed to hydrogens (you first give the number of the amino acid, then the number of the atoms in this amino acid; one <CR> let you select a new amino acid; another <CR> ends the selection).
    6. This can preferably be done by entering the name of a file with the atom numbers (see below). This file should also contain the new title as the first line.
    7. 0 water molecules to include
    8. Select the radius for system 2 (MM system that is optimised). Typically the whole protein (1000 Å).
    9. There is no need to include or remove residues from this system.
    10. Select the radius for system 3 (MM system that is fixed). Typically 0 Å (the whole protein has already been included).
    11. Select a cns type of calculation, by entering c(ns).
    12. Select the basis set (i.e. the type of junction, typically 9).
    13. Enter the junction atoms (those that will become hydrogen atoms in the quantum system)
    14. Ensure that they are not unknown.
    15. Press return until the program is finished.

    This gives the files:
    comqum.dat   comqum.mmout   comqum.qcout    logfil  

  6. If the quantum system is built up of symmetry related atoms, insert into the file comqum.dat the relevant tranformation in the format:

    $transformation
     26-40,41
     0.0d0  1.0d0  0.0d0  0.0d0
     0.0d0  0.0d0  1.0d0  0.0d0
     1.0d0  0.0d0  0.0d0  0.0d0
     0.0d0  0.0d0  1.0d0  0.0d0
     1.0d0  0.0d0  0.0d0  0.0d0
     0.0d0  1.0d0  0.0d0  0.0d0

    where the first line gives the atoms that should be transformed
    and the following three lines gives the relevant transformation (rotation matrix followed by translation vector).
    and the last three lines gives the relevant back-transformation

  7. Run comqumtoturbo to get the Turbomole files:

  8. control  coord
     
  9. Check the name of metal atoms in file coord and then run
    define
    1. construct an UFF hessian in the geometry menu
    2. define the basis sets
    3. get starting orbitals
    4. possibly change the DFT functional
    5. turn on the ri calculations
    6. turn on DFT-D3 dispersion (bj)
  10.  
  11. No longer needed
    Construct parameter and topology files for the quantum system (without H-atoms), if necessary. This is done automatically by the program mkjunct (run it in /temp4/bio/Data/Cns/Mkjunct, where there is a modified version of protein_rep.param)

    Copy the *.top and *.par files for all QM junctions from /temp4/bio/Data/Cns.

  12.  
  13. Run comqumtocns to get the CNS files:

  14. Enter first x and then mm1.par and possibly more (e.g. all.par) as the name of the parameter file.

    comqumtocns <<EOF
    x
    mm1.par
    all.par



    EOF

    This gives the following files:
    fix.pdb    generate1.inp    mmrun1    pdb1    pdb3

    but only mmrun1 and fix.pdb are actually used.
    Check that there are no HIE, HID, or HIP in fix.pdb.

    With alternate conformations, this does not work properly (mmrun3 and binidvidual1 complains about atoms not found in molecular structure). Run:
    changepdb <<EOF
    mm3.pdb
    cqf
    comqum.dat

    w
    fix.pdb
    q
    EOF
     

  15. Convert the pdb structure factor file to CNS format using /temp4/bio/Cns/pdbtocnsrefl.f.

  16. You probably have to change the read reflections line in the source code (line ~158) to fit your reflection file. Check carefully that you get the right number of unique reflections.
    The program is compiled by typing make pdbtocnsrefl in the directory /temp4/bio/Cns/Linux.

    A better and more general tool is sf-convert:
    http://sw-tools.pdb.org/apps/SF_CONVERT (more info at the end of this file)
    or to do it by Phenix (https://www.phenix-online.org/documentation/reference/reflection_file_tools.html):
    phenix.relection_file_converter xxx.cif --cns=cns.cv (a numer of flags are normally needed)
    phenix.reflection_file_converter 6cdk-v.cif --label='_refln.F_meas_au,_refln.F_meas_sigma_au' --r-free-label='_refln.pdbx_r_free_flag' --r-free-test-flag-value=0 --cns .)

  1. Construct a test set if not already available in the structure factor file (normally not needed).

    cns <make_cv.inp >make_cv.out
    grep ERR make_cv.out
    grep WAR make_cv.out
    grep INFO make_cv.out


  2. If you have alternative configurations, you have to go into the file mmrun2 and mmrun3 and change "none" to "segid AC1" or AC2  (or perhaps something like LAC1 and LAC2) in the following (consecutive) lines (around line 195):

    {* select atoms in alternate conformation 1 *}
    {===>} conf_1=(segid AC1);

    {* select atoms in alternate conformation 2 *}
    {===>} conf_2=(segid AC2);

  3. Copy the relevant parameter file of the protein.
    For time being, you need a special version :
    cp /temp4/bio/Data/Cns/protein_rep.param protein.param
    (or protein-allhdg5-4.param or dna-rna_rep.param)

  4. Test all programs by running
    cqxprep

    Look in the files mmrun1.out and mmrun2.out for violations.

    The following warnings or informations are OK (but not any ERR):
     DEFINE>{* files must contain unique array names otherwise errors will occur *}
     %READC-WRN: multiple coordinates for   5948 atoms
      CNSsolve>     display REMARK warning: B-correction gave atomic B-values less than zero

    Check that fixcoord1 worked properly:
    ********** Fix Coord 1 ended normally **********

    If you have special parameter files, not called all.par, they have to be included in mmrun1.

  5. You should always run with a fixed value of the wa factor (even if you want the default one). This is done in the file mmrun2 (search for wa=). Run fixenergy_cnsin to get the default value of wa (take the first in the output of this program - there are two).

  6. Remove unneccesary files and make a backup by running
    cqxback

    The following 19 (20 for UHF) files are needed to run comqumx:
    basis            control          grad1            mm1.pdb1         mm3.pdb1         mmrun3           comqum.par
    bindividual.inp  coord            mm1.mtf          mm3.mtf          mmrun1           mos
    comqum.dat       fix.pdb          mm1.pdb          mm3.pdb          mmrun2           bindividual1.inp
    In addition, you need the structure factor file and the parameter files for all heterocompounds. 
     
  7. You can add restraints on a distance between two atoms using (note the unusual units):
    $restraints
      atom1(1)  atom2(1)  desired_distance_in_A   force_constant_in_au
    $restraints
    1 3  1.00 1.0

    A typical force constant is 1 a.u. (which typically gives less than 0.01 Å deviation at convergence)

  8. Start comqumx:
    comqumx -c 200
    .

    The results are in files energy and cns.dat.
    The final coordinates are in bindividual1.pdb and in coord

  9. You do not have to calculate strain energies; these can be directly read from the energy file, entry three.

How to make and look at maps

We use phenix.maps to make the maps and Coot to look at the maps:

  1. You should have the file map.parms in your directory (otherwise, copy it from/temp4/bio/Data/Comqumx_lib or  /lunarc/nobackup/projects/bio/Data/Comqumx_lib)
    You normally do not have to modify it.

  2. ln -fs bindividual1.pdb pdb

  3. phenix.maps maps.params >maps.log
    for large proteins, it needs to be run in batch queue
    This gives you four files:
    pdb_2mFo-DFc_map.ccp4 pdb_mFo-DFc_map.ccp4
    pdb_fmodel.mtz pdb_map_coeffs.mtz
    The first two are the density and difference-density maps for Pymol
    The other two are for phenix, Coot and edstats

  4. Start Coot
    open the coordinate file: pdb
    auto open pdb_map_coeffs.mtz


How to calculate RSZD scores

  1. You need to calculate the maps first (see previous point)

  2. You should have a file edstat_ph in your directory (otherwise, copy it from /temp4/bio/Data/Comqumx_lib or /lunarc/nobackup/projects/bio/Data/Comqumx_lib)
    You normally do not have to modify it.

  3. Run it
    ./edstats_ph

  4. This gives the file edstats.out
    It contains results from all residues.
    You normally want to grep out only the interesting results, e.g.
    grep "CYS C   62" edstats.out
    grep "GLY C   87" edstats.out
    ...

    There are three sections in the output: one for the mainchain (m), one for the sidechain (s) and one for all atoms in the residue (a).q
    The interesting results are in the ZD-, ZD+ and ZD columns (maximum value for the negative or positive difference density, as well as the maximum absolute value of these two.

    If you run edstats.py without any options, the program will tell you the content of all columns.


This is the file edstats_ph
cad hklin1 pdb_fmodel.mtz hklin2 pdb_map_coeffs.mtz hklout hkl_ed.mtz<<EOF
LABIN FILE 1 E1=FOBS E2=SIGFOBS E3=FCALC E4=PHIFCALC E5=FOM E6=R_FREE_FLAGS
LABOUT FILE 1 E1=FOBS E2=SIGFOBS E3=FC E4=PHIC E5=FOM E6=FREE
LABIN FILE 2 ALL
EOF
\cp pdb pdb.pdb
edstats.pl -xyzin pdb -hklin hkl_ed.mtz -flabel FOBS

cad is a program in the CCP4 suite to combine the data in the pdb_fmodel.mtz and pdb_map_coeffs.mtz files into a single file hkl_ed.mtz,  with slightly changed labels.

 

How to run Phenix after ComQum-X

  1. You need the bindividual1.pdb and *.cv files

  2. To run in batch (the GUI is easier to use), you need the param.def file (can be copied from /temp4/bio/Data/Comqumx_lib or /lunarc/nobackup/projects/bio/Data/Comqumx_lib).
    Change in this file: the unit cell and space group, the input pdb file, the cv file (and possible the various lables), possibly *cif files for all heterogroups (can be calculated by phenix), prefix for the output files.
    Set the refinement strategy (typical: *individual_adp *occupancies

  3. Phenix uses another format for the alternative conformations (closer to pdb format).
    This awk script changes it to this format:
    {
    if(substr($0,74,3)=="AC1"){
    printf("%sB%s%s%s\n",substr($0,1,16),substr($0,18,4),substr($0,73,1),substr($0,23,56));
    }
    else if(substr($0,74,3)=="AC2"){
    printf("%sA%s%s%s\n",substr($0,1,16),substr($0,18,4),substr($0,73,1),substr($0,23,56));
    }
    else
    print $0
    }
    It also copies SEGID back to subunit

    These two actions are now done by changepdb, commands ss and alc

  4. It may be necessary to change the format of metal sites back to that in PDB.

  5. Run
    phenix.refine param.def --unused_ok >& refine.log

How to run a ComQum-X job
Old version where the files are constructed with CNS
In this version, we consider QC as a complement to MM in the X-ray investigation.
Thus we do not inclued electrostatics and have no hydrogen atoms.
  Form to fill in
  1. Start with a refined crystal structure, including structure factors. The structure factors can be found from the summary entry in the 3DBrowser (i.e. look first up the pdb file).

  2.  
  3. Find out from the pdb file the following data

  4. 1. Unit cell and symmetry
    CRYST1   49.970   58.680   98.050  90.00  90.00  90.00 P 21 21 21    4
    (this pdb row show a, b, c, alpha, beta, gamma, and the symmetry)

    2. Resolution range
    REMARK   3  DATA USED IN REFINEMENT.
    REMARK   3   RESOLUTION RANGE HIGH (ANGSTROMS) : 1.90
    REMARK   3   RESOLUTION RANGE LOW  (ANGSTROMS) : 20.00

    3. Sigma-cutoff value
    REMARK   3   DATA CUTOFF            (SIGMA(F)) : 0.000

    4. R-factor
    REMARK   3   R VALUE            (WORKING SET) : 0.181
    REMARK   3   FREE R VALUE                     : 0.231

    5. Test set size
    REMARK   3   FREE R VALUE TEST SET SIZE   (%) : 4.900
    REMARK   3   FREE R VALUE TEST SET COUNT      : 1042

    6. Parameter and topology files (at least if CNS was used for refinement)
    REMARK   3  PARAMETER FILE  1  : PROTEIN_REP.PA
    REMARK   3  PARAMETER FILE  2  : WATER_REP.PARA
    REMARK   3  TOPOLOGY FILE  1   : PROTEIN.TOP
    REMARK   3  TOPOLOGY FILE  2   : WATER.TOP

    7. Alternative configurations
    REMARK   3  OTHER REFINEMENT REMARKS: ALTERNATE CONFORMATIONS FOR
    REMARK   3  RESIDUES 33, 120, 121 AND 122
     

  5. Remove alternative configurations (take the major one). You find occupation numbers <>1 by changepdb, command occ. If you want to include alternative configurations, you have to run one ComQum job for each configuration.

  6. Also set all occupancies = 1.00 (can also be done in generate.inp).
     
  7. If you like, you may split the pdb file into parts consisting of the protein, water, and ligands (metals).

  8. Insert END at the end of all files. However, you can very well keep all the coordinates in one single file (the original pdb file).
     
  9. If necessary, construct topology and parameter files for unusual ligands, etc.

  10.  
  11. Change the name of ions to the proper CNS names (e.g. residue=ZN2, atom=ZN+2).

  12.  
  13. Construct a CNS default file (Input files; Defaults; Set) with the crystallographic data.

  14. Save it and the chose Start editing files
     
  15. Set up a CNS generate file (Input files; General).

  16. You should typically change:
    1. pdb file names, both the protein and the other pdb files (water, ligands, etc.)
    2. name of the output structure file mm3.mtf (instead of generate.mtf)
    3. name of the output coordinate file mm3.pdb (instead of generate.pdb)
    4. names of the topology and parameter files for the ligands (the defaults of the protein are good).
    5. optionally set the occupancy set flag to true (if alternate configurations have been removed)
    6. Then save the file as generate.inp

    7.  
  17. Run generate for system 3:

  18. cns <generate.inp >generate.out
    grep ERR generate.out
    grep WRN generate.out
    grep INFO generate.out
    grep error generate.out
    grep warning generate.out

    This gives the file mm3.pdb, which should be used as the pdb file in the following.
     

  19. Copy mm3.pdb to comqum.pdb (cp mm3.pdb comqum.pdb) and then change the name of the histidine residues in the quantum system to HID or HIE (to get the right protonation and to get correct junctions). Cysteinate metal ligands may remain called CYS.

  20.  
  21. Protonate the quantum system and construct a pdb file with these hydrogen atoms (comqum.pdb).

  22. B factors and occupation numbers need to be included. One way to do this is to run protonate on the protein:
    protonate -k -d $OML_AMBER/dat/Mumod/PROTON_INFO.Ulf <comqum.pdb >ppdb
    This will protonate all amino acids. To protonate water molecules, use changepdb, command addwat and then command w(rite). This will give you crude proton coordinates, which you might want to improve. Alternatively, you can add them manually with Spartan. To automatically improve the proton coordinates, follow the procedure in the Amber equilibration setup , points 9-16.
    Note that you should add protons only to the quantum system and not to the whole protein. However, protonate works only for full amino acids. Therefore, it is easiest to protonate the whole protein first and then copy only exactly those hydrogen atoms you need into the pdb file. Set occupation numbers of the hydrogen atoms to 1.00 and the B factors to the same value as the mother heavy atom.
    Note that rasmol does not understand the Amber format of four-character hydrogen atoms. If you want to visualise the result, run changepdb command f(ormat), chose format 3, then write out a new file with command w(rite).
     
  23. Run pdbtocomqum
    1. Select the name of the logfile (default is OK)
    2. Enter the name of the pdb file (comqum.pdb)
    3. Do not search for short contatcts
    4. Enter a new title (can be done in a file, see next point).
    5. Select the quantum system, including the junction atoms that will be changed to hydrogens (you first give the number of the amino acid, then the number of the atoms in this amino acid; one <CR> let you select a new amino acid; another <CR> ends the selection).

    6. This can preferably be done by entering the name of a file with the atom numbers (see below). This file should also contain the new title as the first line.
    7. Select the radius for system 2 (MM system that is optimised). Typically the whole protein (1000 Å).

    8. If necessary, include or remove residues from this system.
    9. Select the radius for system 3 (MM system that is fixed). Typically 0 Å (the whole protein has already been included).
    10. Select a cns type of calculation, by entering c(ns).
    11. Select the basis set (i.e. the type of junction, typically 6 = default).
    12. Enter the junction atoms (those that will become hydrogen atoms in the quantum system)

    13. Ensure that they are not unknown.
    14. Press return until the program is finished.


    This gives the files:
    comqum.dat   comqum.mmout   comqum.qcout    logfil
     

  24. Run comqumtoturbo to get the Turbomole files:

  25. control  coord
     
  26. Run define to define the basis sets and get starting orbitals. Optionally also change the DFT functional.

  27.  
  28. Construct parameter and topology files for the quantum system (without H-atoms).

  29. These should involve exactly the same atom types and parameters as in the full protein, except at the junctions.
    Note that the bond parameters of the junction hydrogen atoms are fixed by the corresponding parameters for the junction carbon atoms and junctfact (which is found in the comqum.dat file).
    Thus the force constant of the junction hydrogen atom must be the force constant of the junction carbon atom * juncfact^2, whereas the equilibrium parameter should be the carbon equilibrium parameter / juncfact (i.e. it should be the ideal bond length of that bond with that basis set, as listed in /temp4/bio/Data/junctfactor_cns. Otherwise a spurious extra force will be introduced:
    FMM1=k(x-x0)^2
    FMM2=k/jf^2(x*jf-x0*jf)^2 = k/jf^2*jf^2(x-x0)^2 = k(x-x0)^2 = FMM1
    Example (MMP):
    $junction_atoms
        14    11    1.38765000
    bond CPB  CH3E   572.053     1.5028
    bond CPB  HA    1101.53      1.083
    1101.53=572.053*1.38765*1.38765
    1.083=1.5028/1.38765
     
  30. Run comqumtocns to get the CNS files:

  31. Enter the name of the system1 topology and parameter files.

    This gives the following files:
    fix.pdb    generate1.inp    mmrun1    pdb1    pdb3
     

  32. Convert the pdb structure factor file to CNS format using /temp4/bio/Cns/pdbtocnsrefl.

  33. You probably have to change the read relections line in the source code to fit your reflection file. Check carefylly that you get the right number of unique relections. The program is compiled by typing make pdbtocnsrefl in the directory /temp4/bio/Cns.
     
  34. Construct a test set if not already available in the structure factor file.

  35. This is done by the CNS file make_cv.inp (under X-ray, Utilities).
    You typically have to change:
    1. The name of the reflection file.
    2. The percentage of reflections to be used.
    3. The resolution range
    4. The output reflection file name.
    Then run cns:
    cns <make_cv.inp >make_cv.out
    grep ERR make_cv.out
    grep WAR make_cv.out
    grep INFO make_cv.out
     
  36. Run CNS on the web again to construct the mmrun2 file, using refinement:minimize.inp sample file.

  37. 1. Select Input files
    2. Select Set (Default)
    3. Read in the project.def file and select Start editing files
    4. Typically you shall change (for the other fields, default values are normally OK):
    1. molecular topology file name =mm3.mtf
    2. parameter file names, if any (do not forget water.param and ion.param)
    3. coordinate file name =mm3.pdb
    4. reflection file name
    5. resolution range
    6. number of minimisation steps = 0

    7.  
  38. Continue with the same file on the web

  39. 1. Write in the "select fixed atoms" box (instead of none):
       attribute xcomp < 0.5
    2. Set the number of minimisation steps = 1
    3. Change the output pdb file name to minimize3.pdb.
    Then save it as mmrun3.
     
  40. Open mmrun2 in a text editor.
* Insert after the row " coordinates @&coordinate_infile " (about row 272):

{ComQum: Read the 4th to 8th decimals and include them in x, y, z}
coordinates disp=comp @mm3.pdb1
do ( x = x + xcomp/100000.0 ) (all)
do ( y = y + ycomp/100000.0 ) (all)
do ( z = z + zcomp/100000.0 ) (all)

* Insert before the row (line 643)
 print threshold=20.0 bond
 (these lines are similar to those found in file mmrun1):

{ComQum Set flags}
set echo=off message=off end
flags ? end

{ComQum Calculate energy and gradient}
energy end

{ComQum Print energy to file mmen2}
set display=mmen2 end
display ComQum Energy MM2
display $ener[e20.14]

{ComQum Print wa = X-ray weight factor}
xray wa=? end
evaluate ($wa_print=$result)
display ComQum wa
display $wa_print

{ComQum Print rfree and r value}
display ComQum rfree r
display $full_test_r[f8.6] $full_r[f8.6]

{ComQum Calculate the number of atoms}
show sum (1) (all)
evaluate ($natom=$result)

{ComQum Print gradient to file force2}
set display=force2 end
display ComQum Gradients MM2
display $natom[i6]
for $4 in id (all) loop grad
   show (dx) (id $4)
   eval ($1=$result)
   show (dy) (id $4)
   eval ($2=$result)
   show (dz) (id $4)
   eval ($3=$result)
   display $1[E20.14] $2[E20.14] $3[E20.14]
end loop grad
set display=OUTPUT message=normal end

print threshold=0.10 bond
print threshold=10.0 angle
print threshold=60.0 dihedral
print threshold=20.0 improper


  1. Open file mmrun3 with a text editor and
* Insert after the row " coordinates @&coordinate_infile " (around line 272):

{ComQum: Read the 4th to 8th decimals and include them in x, y, z}
coordinates disp=comp @mm3.pdb1
do ( x = x + xcomp/100000.0 ) (all)
do ( y = y + ycomp/100000.0 ) (all)
do ( z = z + zcomp/100000.0 ) (all)

* Insert before the row (around line 495):
{- check isolated atoms and atoms at special positions and add to

{ComQum: Read in fixed atoms <=> xcomp=0.000}
  coordinate disposition=comp @fix.pdb

* Insert before the row (around line 646)
 print threshold=20.0 bond

{ComQum Print energy to file mmen3}
set display=mmen3 end
energy end
display ComQum Energy MM3
display $ener[e20.14]

{ComQum Print wa = X-ray weight factor}
xray wa=? end
evaluate ($wa_print=$result)
display ComQum wa
display $wa_print

{ComQum Print rfree and r value}
display ComQum rfree r
display $full_test_r[f8.6] $full_r[f8.6]
set display=OUTPUT end
 

* Insert before the row (at the end of the script)
" @CNS_XTALMODULE:write_pdb (pdb_o_format=&pdb_o_format;"

{ComQum: Construct the set of 4th to 8th decimals in xyzcomp}
do ( xcomp = int( x * 1000.0) / 1000.0 ) (all)
do ( ycomp = int( y * 1000.0) / 1000.0 ) (all)
do ( zcomp = int( z * 1000.0) / 1000.0 ) (all)
do ( refx = (x - xcomp)*100000.0 ) (all)
do ( refy = (y - ycomp)*100000.0 ) (all)
do ( refz = (z - zcomp)*100000.0 ) (all)
do ( x = xcomp ) (all)
do ( y = ycomp ) (all)
do ( z = zcomp ) (all)

* and before the final stop

{ComQum: Write file minimize.pdb1 only if Rfree is improved}
if ( $start_test_r > $full_test_r ) then
   do ( x = refx ) (all)
   do ( y = refy ) (all)
   do ( z = refz ) (all)
    @CNS_XTALMODULE:write_pdb (pdb_o_format=&pdb_o_format;
                               coordinate_outfile=minimize3.pdb1;
                               sgparam=$sgparam;)
end if


  1. Run CNS on the web again to construct the bindividual.inp file (refinement).

  2. Read in the default file again if necessary (as above).
    Typically you shall change (for the other fields, default values are normally OK):
    1. molecular topology file name (=mm3.mtf)
    2. parameter file names, if any (do not forget water.param and ion.param)
    3. coordinate file name (=mm3.pdb)
    4. reflection file name
    5. resolution range
    6. NO: Select atoms whose B-factors will remain fixed:

    7. attribute xcomp < 0.5
    8. number of steps of restrained individual B-factor minimization = 1

    9.  
  3. Change in this file:
*  Insert after the line:
coordinates @&coordinate_infile (around line 264)

{ComQum: Read the 4th to 8th decimals and include them in x, y, z}
coordinates disp=comp @mm3.pdb1
do ( x = x + xcomp/100000.0 ) (all)
do ( y = y + ycomp/100000.0 ) (all)
do ( z = z + zcomp/100000.0 ) (all)

* NO: Insert before the row (line 440):
fix selection=(not(&atom_select) or &atom_fixed) end

{ComQum: Read in fixed atoms <=> xcomp=0.000}
  coordinate disposition=comp @fix.pdb

* Insert before the line (line 574)
set display=&coordinate_outfile end

{ComQum Print energy to file mmen4}
set display=mmen4 end
energy end
display ComQum Energy MM4
display $ener[e20.14]

{ComQum Print rfree and r value}
display ComQum rfree r
display $full_test_r[f8.6] $full_r[f8.6]
set display=OUTPUT end

* Insert before the line (at the end of the script) (line 690)
" @CNS_XTALMODULE:write_pdb (pdb_o_format=&pdb_o_format;"

{ComQum: Construct the set of 4th to 8th decimals in xyzcomp}
do ( xcomp = int( x * 1000.0) / 1000.0 ) (all)
do ( ycomp = int( y * 1000.0) / 1000.0 ) (all)
do ( zcomp = int( z * 1000.0) / 1000.0 ) (all)
do ( refx = (x - xcomp)*100000.0 ) (all)
do ( refy = (y - ycomp)*100000.0 ) (all)
do ( refz = (z - zcomp)*100000.0 ) (all)
do ( x = xcomp ) (all)
do ( y = ycomp ) (all)
do ( z = zcomp ) (all)

* and before the final stop

{ComQum: Write file bindividual.pdb1 only if Rfree is improved}
if ( $start_test_r > $full_test_r ) then
   do ( x = refx ) (all)
   do ( y = refy ) (all)
   do ( z = refz ) (all)
    @CNS_XTALMODULE:write_pdb (pdb_o_format=&pdb_o_format;
                               coordinate_outfile=bindividual.pdb1;
                               sgparam=$sgparam;)
end if

     
     
  1. cp bindividual.inp bindividual1.inp

  2. Change the following lines in bindividual1.inp
    1. {===>} atom_fixed=(attribute ycomp > 1.5);   (not none)
    2. {===>} bfactor_nstep=30;  (instead of 1)
    3. coordinate_outfile="bindividual1.pdb"; (not bindividual.pdb)
    4. {ComQum Print energy to file mmen5} (not mmen4)
    5. set display=mmen5 end (not mmen4)
    6. {ComQum: Write file bindividual1.pdb1 only if Rfree is improved} (not bindividual.pdb1)
    7. coordinate_outfile=bindividuali1.pdb1; (not bindividual.pdb1)
    8. Add the following two lines before the line (~439)
     fix selection=(not(&atom_select) or &atom_fixed) end:

    {ComQum: Read in fixed atoms <=> ycomp>=2.000 (system 2)}
      coordinate disposition=comp @fix.pdb

     
  3. Test all programs by running

  4. cqxprep
    Look in the files mmrun1.out and mmrun2.out for violations
     
  5. Remove unneccesary files and make a backup by running

  6. cqxback

    The following 17 (18 for UHF) files are needed to run comqumx:
    basis            control          grad1            mm1.pdb1         mm3.pdb1         mmrun3
    bindividual.inp  coord            mm1.mtf          mm3.mtf          mmrun1           mos
    comqum.dat       fix.pdb          mm1.pdb          mm3.pdb          mmrun2
    In addition, you need the structure factor file and all parameter files for all heterocompounds.
     
     

  7. You can add restraints on a distance between two atoms using (note the unusual units):

  8. $restraints
      atom1(1)  atom2(1)  desired_distance_in_A   force_constant_in_au
    $restraints
     1 3  1.00 10.0
    A typical force constant is 10 a.u. (which typically gives a 0.05 pm deviation at convergence)
     
  9. Start comqumx:

  10. /temp4/bio/Bin/comqumx -c 200.
     
  11. You do not have to calculate strain energies; these can be directly read from the energy file, entry three.

  12.  
  13. When the calculation is converged, run

  14. calcwa
    to get a better value of wa.
    This value is inserted into the file mmrun2, in the row (row 238):
    {===>} wa=-1;
    Replace -1 by the desired value and rerun comqumx
    /temp4/bio/Bin/comqumx -c 200.
    However, it is better to run calculations with different values of wa, and select the one with the lowest Rfree.


To run a pure CNS ComQum calculation
(i.e. ComQum with the CNS MM force field, but no crystal data):

Make the following two changes in mmrun2

  1. Line 238:

  2. {===>} wa=0;
    (Instead of:  wa=-1;)
  3. Line 511:

  4. exclude elec pvdw xref
    (Instead of: exclude elec include pvdw xref)


To run a ComQum-X calculation without any MM force field (No MM)
(i.e. only crystal data; the quantum system is not corrected for the change in junction atoms).

Make the following two changes in mmrun2

  1. Line 238:

  2. {===>} wa=1.0;
    (Instead of:  wa=-1;)
     
  3. Line 511:

  4. exclude elec bond angl dihe impr vdw pvdw include xref
    (Instead of: exclude elec include pvdw xref)
Also make the following change in mmrun1
  1. Line 8:

  2. flags exclude elec pele bond angl dihe impr vdw ? end
    (Instead of: flags exclude elec pele ? end)

This is cqxprep :

# ComQum-X Prep
# Ulf Ryde, 6/3-01
# UR 1/11-16: New version without generate1, but changemtf instead
#

#cp /temp4/bio/Data/Cns/comqum.par .
#cp /temp4/bio/Data/comqum.par .

#echo CNS generate 1
#cns <generate1.inp >generate1.out
#grep ERR     generate1.out
#grep WRN     generate1.out
#grep INFO    generate1.out
#grep error   generate1.out
#grep warning generate1.out

#echo CNS generate 3
#cns <generate.inp >generate.out
#grep ERR     generate.out
#grep WRN     generate.out
#grep INFO    generate.out
#grep error   generate.out
#grep warning generate.out

#echo Run comqumtocns
#comqumtocns <<EOF
#x
#mm1.par
#all.par
#
#EOF

echo Run changemtf to set up the mm1.mtf, mm1.par, and mm1.pdb files
echo from mm3.mtf, mm3.pdb, and protein_rep.param
changemtf <<EOF
mm3.mtf
t
y
mm1.mtf
protein.param
mm1.par
y
mm3.pdb
mm1.pdb
q
EOF

echo Construct dummy pdb files for 8-decimal precision
echo Enter mm1.pdb, format, 1, 3, 0, dummy, return, q
changepdb <<EOF
mm1.pdb
f
1
3
0
dum

q
EOF

echo
echo Enter mm3.pdb, format, 1, 3, 0, dummy, return, q

changepdb <<EOF
mm3.pdb
f
1
3
0
dum

q
EOF

echo CNS mmrun 1
cns <mmrun1   >mmrun1.out
grep ERR       mmrun1.out
grep WRN       mmrun1.out
grep INFO      mmrun1.out
grep violation mmrun1.out
#page mmrun1.out

echo CNS mmrun 2
cns <mmrun2   >mmrun2.out
grep ERR       mmrun2.out
grep WRN       mmrun2.out
grep INFO      mmrun2.out
grep error     mmrun2.out
grep warning   mmrun2.out
grep '>>'      mmrun2.out
grep violation mmrun2.out

echo CNS mmrun 3
cns <mmrun3   >mmrun3.out
grep ERR       mmrun3.out
grep WRN       mmrun3.out
grep INFO      mmrun3.out
grep error     mmrun3.out
grep warning   mmrun3.out
grep violation mmrun3.out
grep '>>'      mmrun3.out

echo CNS bindividual
cns <bindividual.inp >bindividual.out
grep ERR       bindividual.out
grep WRN       bindividual.out
grep INFO      bindividual.out
grep error     bindividual.out
grep warning   bindividual.out
grep violation bindividual.out
grep '>>'      bindividual.out

echo CNS bindividual1
cns <bindividual1.inp >bindividual1.out
grep ERR       bindividual1.out
grep WRN       bindividual1.out
grep INFO      bindividual1.out
grep error     bindividual1.out
grep warning   bindividual1.out
grep violation bindividual1.out
grep '>>'      bindividual1.out

echo fixcoord1
fixcoord1_cnsin
fixcoord1_turboin
fixcoord1
fixcoord1_cnsout

# Make the files bindividual.pdb, bindividual.pdb1, and minimize3.pdb1
cp mm3.pdb1 bindividual.pdb1
cp mm3.pdb1 minimize3.pdb1
cp mm3.pdb bindividual.pdb



This is a schematic overview of  comqumx

# Start with a scf step then loop through the following four steps

# Gradient step (gradient calculation)
rm *_[123]
grad
offsetf
cp gradient grad1
fixforce_cnsin
fixforce_turboin
fixforce
fixforce_turboout
mv gradient g1
mv grad1 gradient

# Relaxation step
relax >>output
fixcoord1_cnsin
fixcoord1_turboin
fixcoord1
fixcoord1_cnsout

# Molmech step
# if  protfree then
   cp mm3.pdb mm3.old
   cns <mmrun3
   cns <bindividual.inp
   mv bindividual.pdb mm3.pdb
#endif

# Scf step (energy calculation)
cns <mmrun1
cns <mmrun2
dscf
grep 'Self energy' output > selfenergy
fixenergy_cnsin
fixenergy_turboin
fixenergy
fixenergy_turboout
checkconv
 

key about pdb files:
mmrun1:  mm1.pdb (no output file)
mmrun2: mm3.pdb -> minimize.pdb (not used)
mmrun2: mm3.pdb -> minimize3.pdb (used by bindividual.inp)
bindividual.inp: minimize3.pdb -> bindividual.pdb (moved to mm3.pdb)



These are the programs you need (apart from Turbomole and CNS):
fixforce_cnsin
fixforce_turboin
fixforce
fixforce_turboout
fixcoord1_cnsin
fixcoord1_turboin
fixcoord1
fixcoord1_cnsout
fixenergy_cnsin
fixenergy_turboin
fixenergy
fixenergy_turboout
comqumx

In addition, you may need some of our accessory programs:
pdbtocomqum
comqumtoturbo
comqumtocns
cqxprep
cqxback
changepdb



This is an example of a file with the quantum system (plastocyanin)
New title
531
534-541
1229
1231-1234
1266
1269-1276
1324
1327-1334
1443



These changes are appropriate when charges and hydrogen atoms should be included
  1. Protonate the quantum system. It can be done by protonate

  2. protonate <infile >outfile
    You may have to construct a protonate entry for it.
    If you have used protonate, you have to change the format of the pdb file with changepdb, command format (select 1; then write it out with command write).
  3. Run /temp4/bio/Bin/changemtf, option pdb, to construct a pdb file (comqum.pdb) with charges for pdbtocomqum.


Old instructions

Old:
These should involve exactly the same atom types and parameters as in the full protein, except at the junctions.
Note that the bond parameters of the junction hydrogen atoms are fixed by the corresponding parameters for the junction carbon atoms and junctfact (which is found in the comqum.dat file).
Thus the force constant of the junction hydrogen atom must be the force constant of the junction carbon atom * juncfact^2, whereas the equilibrium parameter should be the carbon equilibrium parameter / juncfact (i.e. it should be the ideal bond length of that bond with that basis set, as listed in /temp4/bio/Data/junctfactor_cns. Otherwise a spurious extra force will be introduced:
FMM1=k(x-x0)^2
FMM2=k/jf^2(x*jf-x0*jf)^2 = k/jf^2*jf^2(x-x0)^2 = k(x-x0)^2 = FMM1
Example (MMP):
$junction_atoms
    14    11    1.38765000
bond CPB  CH3E   572.053     1.5028
bond CPB  HA    1101.53      1.083
1101.53=572.053*1.38765*1.38765
1.083=1.5028/1.38765


It is often wise to run a frequency calculation at a lower level of theory to get a better forceapprox matrix.
If you did not do it in point 17 above, you can do it now with UFF.

Alternatively, if you prefer a higher level of theory (e.g. HF/Sto-3g):
mkdir Freq
cp control coord basis energy mos alpha beta auxbasis Freq
cd Freq

change basis set to sto-3g hondo and remove dft and ri (with define)

insert at the end of the control file:
$dipgrad  file=hessian
$hessian  file=hessian
$vibrational normal modes  file=hessian
$forcepath <path to the scratch files>
$maxcor <core memory in MB (70% of available memory is recommended >

dscf >logd
aoforce >logf  (or if it is an open-shell system: uhfforce >logf)

hesstofoa
cp hessfoa ..
cd ..

change in the control file
$forceinit off
$forceapprox    file=hessfoa


Alternative configurations
If you want to include alternative configurations, go to CNS home page and set up the file alternate.inp. Set the input files mm3.mtf and mm3.pdb and the default output files alternate.mtf and alternate.pdb. Finally, insert the number of conformations (typically 2) and the atoms involved using the format "atom segid residue_number atom_name", e.g.  atom S 234 NE2 or atom L 33 CG (this list can be set up with changepdb, command occ).
Set scale_occupancy=false in alternate.inp
Then run
cns <alternate.inp >alternate.out
Atoms with alternate configurations will get new segids AC1, AC2, but they will have identical coordinates (that of conformation B!). Change the coordinates of AC2 to the desired ones and change the occupancies back to the original values.
Finally
cp alternate.mtf mm3.mtf
cp alternate.pdb mm3.pdb

Script set up by Octav to do this:
#write altloc A from original pdb file
awk '{if(substr($0,17,1)=="A") print $0}' 4u9h > altlocA.pdb
#write altloc A from alternate.pdb (AC2 segid)
awk '{if(substr($0,74,3)=="AC2") print $0}' alternate.pdb > altlocA_cns.pdb
#reorder altloc A from original as altloc A from alternate.pdb
changepdb<<EOF
altlocA.pdb
ror
altlocA_cns.pdb
w


EOF
#substitute coord, bf and occup in altloc CNS
awk 'NR==FNR{a[NR]=substr($0,27,40);next}{printf("%s%s%s\n",substr($0,1,26),a[FNR],substr($0,67,14));}' altlocA.pdb altlocA_cns.pdb > altlocA_cns_right.pdb
#delete all AC2 lines from alternate.pdb and substitute them with altlocA_cns_right.pdb (manually)
#delete missing coordinates from alternate.pdb
awk '{if(substr($0,63,4)!="0.00") print $0}' alternate.pdb >t && mv -f t alternate.pdb



  1. When the calculation is converged, run (No, not recommended any longer)
    calcwa
    to get a better value of wa.
    This value is inserted into the file mmrun2, in the row (row 238):
    {===>} wa=-1;
    Replace -1 by the desired value and rerun comqumx
    /temp4/bio/Bin/comqumx -c 200.
    It is better to run calculations with different values of wa, and select the one with the lowest Rfree.





sf-convert

I found the following tool for refinement.

sf-convert
http://sw-tools.pdb.org/apps/SF-CONVERT/index.html

It seems to be able to do what we need, i.e., it converts
a set of reflections stored in PDB (1b7v-sf.cif) to the CNS format.
It is done by using the following command.
sf-convert-v1.200-prod-src/bin/sf_convert -i mmCIF -o CNS -sf 1b7v-sf.cif

This program took the first wavelength. I am not sure if there is a way
to choose which wavelength to take.

The generated file (1b7v-sf.cif.CNS) has one strange peculiarity.
The number of digits after the decimal point is 2, whereas CNS usually has 3.
However, CNS can read this file directly, as I verified.

This program can also select reflections for the test set:

sf-convert-v1.200-prod-src/bin/sf_convert -i mmCIF -o CNS -sf 1b7v-sf.cif -flag 5
Dmitri, 27/8-12

History

12/8-05
Reinserted the reading in of fix.pdb in bindividual1.inp

12/6-14
Set up files for version 1.3 of CNS.
Then, I also changed so that bindividual.inp and bindividual1.inp writes out the pdb file only if R is decreased (no longer R_free).
I also followed the original files in that I no longer use the _rep topology files for protein and water.

1/11-16
Use changemtf to set up mm1.mtf, mm1.par, and mm1.pdb from mm3.mtf, mm3.pdb, and protein_rep.param.
Updated cqxprep
generate1.inp is retired.