ComQum-UX
ComQum-UX or joint quantum neutron and X-ray refinement is a method to combine joint neutron and X-ray crystallographic refinement with QM/MM calculations to obtain an optimum compromise between neutron crystallography, X-ray crystallography and quantum chemistry.
It requires the structure factors from both a neutron and an X-ray crystallographic investigation.

The proper reference of this method is:
O. Caldararu, F. Manzoni, E. Oksanen, D. T. Logan & U. Ryde (2019) "Refinement of protein structures using a combination of quantum mechanical calculations with neutron and X-ray crystallographic data" Acta Cryst. D, 75, 368-380.

How to run a ComQum-UX job
In this version, we consider QM as a complement to MM in the X-ray and neutron investigation.
For time being, we do not include electrostatics and have no hydrogen atoms.
  1. Start with a refined crystal structure, including structure factors. The structure factors can be found from the summary entry in the 3DBrowser (i.e. look first up the pdb file). Note that you need both neutron and X-ray structure factors.

  2. cp /temp4/bio/Data/Comqumux_lib/* .

  3. Edit the file sedfile by inserting (the values should appear between the second and last /; e.g. s/cqx_crystal_a/43.1/) the data in the following points:
    1. Space group
    2. Crystal data (a, b, c, alpha, beta, gamma)
    3. File names: coordinates, parameter, topology (do not include the CNS files), and reflections.
    4. Percentage in the test set (if not already set)
    5. Low and high resolution limits
    6. [Residues in the quantum system (no longer needed)]
     
  4. The data can be found from the pdb file:

  5. 1. Unit cell and symmetry
    CRYST1   49.970   58.680   98.050  90.00  90.00  90.00 P 21 21 21    4
    (this pdb row show a, b, c, alpha, beta, gamma, and the symmetry)

    2. Resolution range
    REMARK   3  DATA USED IN REFINEMENT.
    REMARK   3   RESOLUTION RANGE HIGH (ANGSTROMS) : 1.90
    REMARK   3   RESOLUTION RANGE LOW  (ANGSTROMS) : 20.00

    3. Sigma-cutoff value
    REMARK   3   DATA CUTOFF            (SIGMA(F)) : 0.000

    4. R-factor
    REMARK   3   R VALUE            (WORKING SET) : 0.181
    REMARK   3   FREE R VALUE                     : 0.231

    5. Test set size
    REMARK   3   FREE R VALUE TEST SET SIZE   (%) : 4.900
    REMARK   3   FREE R VALUE TEST SET COUNT      : 1042

    6. Parameter and topology files (at least if CNS was used for refinement)
    REMARK   3  PARAMETER FILE  1  : PROTEIN_REP.PA
    REMARK   3  PARAMETER FILE  2  : WATER_REP.PARA
    REMARK   3  TOPOLOGY FILE  1   : PROTEIN.TOP
    REMARK   3  TOPOLOGY FILE  2   : WATER.TOP

    7. Alternative configurations
    REMARK   3  OTHER REFINEMENT REMARKS: ALTERNATE CONFORMATIONS FOR
    REMARK   3  RESIDUES 33, 120, 121 AND 122
     

  6. If there are alternative configurations in the file (you find occupation numbers <>1 by changepdb, command occ) you can either remove them (take the major one and check that Rfree is not significantly changed) or you can include them (this is better but more tedious), provided that they are not in the quantum system (see below how).
  7.  
  8. Check that there is a END at the end of all pdb files.

  9. Copy the subunit column to the segid column (pos 73-76; if not used).
    changepdb, command seg followed by return and w does this for you.
    HETATMiiiiixaaaaarrrxaiiiiaxxxffffffffggggggggffffffffggggggffffffxxxxxxsegiel-1
    ****|****1****|****2****|****3****|****4****|****5****|****6****|****7****|****8
  10.  
  11. If necessary, construct topology and parameter files for unusual ligands, etc.

  12. Remove ANISOU if present in the pdb file
    grep -v ANISOU pdb >npdb
     
  13. Change the name of ions to the proper CNS names (e.g. residue=ZN2, atom=ZN+2).

    Note that if there are four letters name in the pdb file, they must start so that there is a blank before the residue.
    Use changepdb to change the format to -6.

    Moreover, cns does not read the subunit key, so you may need to renumber the residues by changepdb, command rr.

    Note that changepdb may TER from the file.
    You can add them again with the ter command; otherwise you can get PATCH-ERR when you run generate.

  14. Change atom names to CNS convention by changepdb, command cnn.
    Do not forget to write out the modified file.

  15. Run cqxu_sedscript
    This will construct the proper cns files:
    bindividual.inp  bindividual1.inp  generate.inp  make_cv.inp  mmrun2  mmrun3
    1.  
  16. Run generate for system 3:

  17. ncns <generate.inp >generate.out
    grep ERR generate.out
    grep WRN generate.out
    grep INFO generate.out
    grep error generate.out
    grep warning generate.out

    The following warnings or infos are OK, but no ERR or error:
     %READC-WRN: still   1361 missing coordinates (in selected subset)
      %READC-WRN: atom A    113  PRO  HA   has been renamed: DA 
     CNSsolve>           display  %INFO: There are no coordinates missing for non-hydrogen atoms


    This gives the file mm3.pdb, which should be used as the pdb file in the following.
    Check the start (REMARK section) of this file for added Cys-Cys links and missing atoms.

    If you want to avoid added atoms, set:
    {===>} atom_delete=(not(known));

    If you want to add or delete bonds, you need to include patches:
     {* any special protein patches can be applied here *}
    {===>}
    topology
       @fs2.patch
    end

    patch FS2P
       reference=1=(resid 632) reference=2=(resid 188) reference=3=(resid 260) reference=4=(resid  97) reference=5=(resid 128)
    end

    Remark Patch to build the Fe-Cys bonds for a Fe2S2 cluster
    presidue FS2P

      add bond 1FE1 2SG
      add bond 1FE1 3NH1
      add bond 1FE2 4SG
      add bond 1FE2 5SG
      add angle 2SG  1FE1  1S1
      add angle 2SG  1FE1  1S2
      add angle 3NH1 1FE1  1S1
      add angle 3NH1 1FE1  1S2
      add angle 4SG  1FE2  1S1
      add angle 4SG  1FE2  1S2
      add angle 5SG  1FE2  1S1
      add angle 5SG  1FE2  1S2
      add angle 2CB  2SG  1FE1
      add angle 3CZ  3NH1 1FE1
      add angle 4CB  4SG  1FE2
      add angle 5CB  5SG  1FE2
    end

  18. If you want to include alternative configurations, go to CNS home page and set up the file alternate.inp. Set the input files mm3.mtf and mm3.pdb and the default output files alternate.mtf and alternate.pdb. Finally, insert the number of conformations (typically 2) and the atoms involved in the format "atom segid residue_number atom_name", e.g.  atom S 234 NE2 or atom L 33 CG (this can be done with changepdb, command occ).
    Then run
    cns <alternate.inp >alternate.out
    Atoms with alternate configurations will get new segids AC1, AC2, but they will have identical coordinates (that of conformation B!) and a halfed occupancy. This can be corrected by running changepdb, command al on the file alternate.pdb, use the input pdb file of generate.inp as correct file. Write the file out as mm3.pdb

    Finally
    cp alternate.mtf mm3.mtf

  19. cp mm3.pdb comqum.pdb

    If you intend to use a point-charge model in the QM calculations, you must instead build a new comqum.pdb file with changemtf, on mm3.mtf, command p, using coordinates from mm3.pdb (this file contains point charges from the CNS force field).
  20.  
  21. Run pdbtocomqum
    1. Select the name of the logfile (default is OK)
    2. Enter the name of the pdb file (comqum.pdb)
    3. Do not search for short contatcts
    4. Enter a new title (can be done in a file, see next point).
    5. Select the quantum system, including the junction atoms that will be changed to hydrogens.
    6. This should be done by entering the name of a file with the atom numbers (see below; call it syst1). This file should also contain the new title as the first line.
    7. Do not add any water molecules
    8. Select the radius for system 2 (MM system that is optimised). Typically the whole protein (1000 Å).
    9. There is no need to include or remove residues from this system.
    10. Select the radius for system 3 (MM system that is fixed). Typically 0 Å (the whole protein has already been included).
    11. Select a neutron type of calculation, by entering u.
    12. Select the basis set (i.e. the type of junction, typically 12).
    13. Enter the junction atoms (those that will become hydrogen atoms in the quantum system; this is done automatically by the syst1 file)
    14. Ensure that they are not unknown (type -1).
    15. Use the new format.
    16. Do not remove the charge of the neighbours (n).
    17. Do not redistribute the deleted charges (n).

    This gives the files:
    comqum.dat   comqum.mmout   comqum.qcout    logfil  

  22. If the quantum system is built up of symmetry related atoms, insert into the file comqum.dat the relevant tranformation in the format:

    $transformation
     26-40,41
     0.0d0  1.0d0  0.0d0  0.0d0
     0.0d0  0.0d0  1.0d0  0.0d0
     1.0d0  0.0d0  0.0d0  0.0d0
     0.0d0  0.0d0  1.0d0  0.0d0
     1.0d0  0.0d0  0.0d0  0.0d0
     0.0d0  1.0d0  0.0d0  0.0d0

    where the first line gives the atoms that should be transformed
    and the following three lines gives the relevant transformation (rotation matrix followed by translation vector).
    and the last three lines gives the relevant back-transformation

  23. Run comqumtoturbo to get the Turbomole files:
  24. control  coord
     
  25. Check the name of metal atoms in file coord and then run
    define
    Construct an UFF hessian in the geometry menu (ff in geometry menu)
    Define the basis sets
    Get starting orbitals.
    Change the DFT functional from B3LYP, e.g. to TPSS
    Turn on the ri calculations
    Turn on DFT-D3 (dsp)

  26. For time being we run without electrostatics, so
    rm pointcharges
    and remove the line
    $point_charges file=pointcharges
    from the file control

  27. No longer needed:
    Construct parameter and topology files for the quantum system (without H-atoms), if necessary. This is done automatically by the program mkjunct (run it in /temp4/bio/Data/Cns/Mkjunct, where there is a modified version of protein_rep.param)
    Copy the *.top and *.par files for all QM junctions from /temp4/bio/Data/Cns.

  28.  
  29. Run comqumtocns to get the CNS files:

  30. Enter first u and the nmm1.par and possibly more (e.g. all.par) parameter files.

    This gives the following files:
    fix.pdb    generate1.inp    mmrun1    pdb1    pdb3

    but only mmrun1 and fix.pdb are actually used.

    With alternate conformations, this does not work properly (mmrun3 and binidvidual1 complains about atoms not found in molecular structure). Run:
    changepdb <<EOF
    mm3.pdb
    cqf
    comqum.dat

    w
    fix.pdb
    q
    EOF
     

  31. Convert the pdb structure factor file to CNS format using /temp4/bio/Cns/pdbtocnsrefl.f.

  32. You probably have to change the read reflections line in the source code (line ~158) to fit your reflection file. Check carefylly that you get the right number of unique relections.
    The program is compiled by typing make pdbtocnsrefl in the directory /temp4/bio/Cns/Linux.

    A better and more general tool is sf-convert:
    http://sw-tools.pdb.org/apps/SF_CONVERT (more info at the end of this file)

     
  33. Typically not needed
    Construct a test set if not already available in the structure factor file.

    cns <make_cv.inp >make_cv.out
    grep ERR make_cv.out
    grep WAR make_cv.out
    grep INFO make_cv.out


  34. If you have alternative configurations, you have to go into the file mmrun2 and mmrun3 and change "none" to "segid AC1" or AC2 in the following (consecutive) lines (around line 195):

    {* select atoms in alternate conformation 1 *}
    {===>} conf_1=(segid AC1);

    {* select atoms in alternate conformation 2 *}
    {===>} conf_2=(segid AC2);

  35. cp /temp4/bio/Data/Cns/protein-allhdg_d.param protein_rep.param

    Test all programs by running
    cquprep

    Look in the files mmrun1.out and mmrun2.out for violations.

    The following warnings or informations are OK (but not any ERR):
      %-WRN: Could not convert to element: HARG
     DEFINE>{* files must contain unique array names otherwise errors will occur *}

     %READC-WRN: multiple coordinates for   5948 atoms
      CNSsolve>     display REMARK warning: B-correction gave atomic B-values less than zero

    Check that fixcoord1 worked properly:
    ********** Fix Coord 1 ended normally **********

  36. You should always run with a fixed value of the wa factor (even if you want the default one). This is done in the file mmrun2 (search for wa=). Run fixenergy_cnsin to get the default value of wa (take the first in the output of this program - there are two).

  37. Remove unneccesary files and make a backup by running
    cqxback

    The following 19 (20 for UHF) files are needed to run comqumx:
    basis            control          grad1            mm1.pdb1         mm3.pdb1         mmrun3           comqum.par
    bindividual.inp  coord            mm1.mtf          mm3.mtf          mmrun1           mos
    comqum.dat       fix.pdb          mm1.pdb          mm3.pdb          mmrun2           bindividual1.inp
    In addition, you need the structure factor file and the parameter files for all heterocompounds. 
     
  38. You can add restraints on a distance between two atoms using (note the unusual units):
    $restraints
      atom1(1)  atom2(1)  desired_distance_in_A   force_constant_in_au
    $restraints
    1 3  1.00 10.0

    A typical force constant is 10 a.u. (which typically gives a 0.05 pm deviation at convergence)

  39. Start comqumx:
    comqumu -c 200
    .

  40. You do not have to calculate strain energies; these can be directly read from the energy file, entry three.



To run a pure CNS ComQum calculation
(i.e. ComQum with the CNS MM force field, but no crystal data):

Make the following two changes in mmrun2

  1. Line 238:

  2. {===>} wa=0;
    (Instead of:  wa=-1;)
  3. Line 511:

  4. exclude elec pvdw xref
    (Instead of: exclude elec include pvdw xref)


To run a ComQum-X calculation without any MM force field (No MM)
(i.e. only crystal data; the quantum system is not corrected for the change in junction atoms).

Make the following two changes in mmrun2

  1. Line 238:

  2. {===>} wa=1.0;
    (Instead of:  wa=-1;)
     
  3. Line 511:

  4. exclude elec bond angl dihe impr vdw pvdw include xref
    (Instead of: exclude elec include pvdw xref)
Also make the following change in mmrun1
  1. Line 8:

  2. flags exclude elec pele bond angl dihe impr vdw ? end
    (Instead of: flags exclude elec pele ? end)

This is cquprep :

# ComQum-X Prep
# Ulf Ryde, 6/3-01
#

echo Run changemtf to set up the mm1.mtf, mm1.par, and mm1.pdb files
echo from mm3.mtf, mm3.pdb, and protein_rep.param
changemtf <<EOF
mm3.mtf
t
y
mm1.mtf
protein-allhdg_d.param 
mm1.par
y
mm3.pdb
mm1.pdb
q
EOF

echo Construct dummy pdb files for 8-decimal precision

echo Enter mm1.pdb, format, 1, 3, 0, dummy, return, q
changepdb <<EOF
mm1.pdb
f
1
3
0
dum

q
EOF

echo
echo Enter mm3.pdb, format, 1, 3, 0, dummy, return, q

changepdb <<EOF
mm3.pdb
f
1
3
0
dum

q
EOF

echo CNS mmrun 1
ncns <mmrun1   >mmrun1.out
grep ERR       mmrun1.out
grep WRN       mmrun1.out
grep INFO      mmrun1.out
grep violation mmrun1.out
#page mmrun1.out

echo CNS mmrun 2
ncns <mmrun2   >mmrun2.out
grep ERR       mmrun2.out
grep WRN       mmrun2.out
grep INFO      mmrun2.out
grep error     mmrun2.out
grep warning   mmrun2.out
grep '>>'      mmrun2.out | grep - v XRAY
grep violation mmrun2.out

echo CNS mmrun 3
ncns <mmrun3   >mmrun3.out
grep ERR       mmrun3.out
grep WRN       mmrun3.out
grep INFO      mmrun3.out
grep error     mmrun3.out
grep warning   mmrun3.out
grep violation mmrun3.out | grep - v XRAY
grep '>>'      mmrun3.out 

echo CNS bindividual
ncns <bindividual.inp >bindividual.out
grep ERR       bindividual.out
grep WRN       bindividual.out
grep INFO      bindividual.out
grep error     bindividual.out
grep warning   bindividual.out
grep violation bindividual.out
grep '>>'      bindividual.out

echo CNS bindividual1
ncns <bindividual1.inp >bindividual1.out
grep ERR       bindividual1.out
grep WRN       bindividual1.out
grep INFO      bindividual1.out
grep error     bindividual1.out
grep warning   bindividual1.out
grep violation bindividual1.out
grep '>>'      bindividual1.out

echo fixcoord1
fixcoord1_cnsin
fixcoord1_turboin
fixcoord1
fixcoord1_cnsout

# Make the files bindividual.pdb, bindividual.pdb1, and minimize3.pdb1
\cp mm3.pdb1 bindividual.pdb1
\cp mm3.pdb1 minimize3.pdb1
\cp mm3.pdb bindividual.pdb



This is a schematic overview of  comqumx

# Start with a scf step then loop through the following four steps

# Gradient step (gradient calculation)
rm *_[123]
grad
offsetf
cp gradient grad1
fixforce_cnsin
fixforce_turboin
fixforce
fixforce_turboout
mv gradient g1
mv grad1 gradient

# Relaxation step
relax >>output
fixcoord1_cnsin
fixcoord1_turboin
fixcoord1
fixcoord1_cnsout

# Molmech step
# if  protfree then
   cp mm3.pdb mm3.old
   cns <mmrun3
   cns <bindividual.inp
   mv bindividual.pdb mm3.pdb
#endif

# Scf step (energy calculation)
cns <mmrun1
cns <mmrun2
dscf
grep 'Self energy' output > selfenergy
fixenergy_cnsin
fixenergy_turboin
fixenergy
fixenergy_turboout
checkconv
 

key about pdb files:
mmrun1:  mm1.pdb (no output file)
mmrun2: mm3.pdb -> minimize.pdb (not used)
mmrun2: mm3.pdb -> minimize3.pdb (used by bindividual.inp)
bindividual.inp: minimize3.pdb -> bindividual.pdb (moved to mm3.pdb)



These are the programs you need (apart from Turbomole and CNS):
fixforce_cnsin
fixforce_turboin
fixforce
fixforce_turboout
fixcoord1_cnsin
fixcoord1_turboin
fixcoord1
fixcoord1_cnsout
fixenergy_cnsin
fixenergy_turboin
fixenergy
fixenergy_turboout
comqumx

In addition, you may need some of our accessory programs:
pdbtocomqum
comqumtoturbo
comqumtocns
cqxprep
cqxback
changepdb



This is an example of a file with the quantum system (plastocyanin)
New title
531
534-541
1229
1231-1234
1266
1269-1276
1324
1327-1334
1443



These changes are appropriate when charges and hydrogen atoms should be included
  1. Protonate the quantum system. It can be done by protonate

  2. protonate <infile >outfile
    You may have to construct a protonate entry for it.
    If you have used protonate, you have to change the format of the pdb file with changepdb, command format (select 1; then write it out with command write).
  3. Run /temp4/bio/Bin/changemtf, option pdb, to construct a pdb file (comqum.pdb) with charges for pdbtocomqum.


sf-convert

I found the following tool for refinement.

sf-convert
http://sw-tools.pdb.org/apps/SF-CONVERT/index.html

It seems to be able to do what we need, i.e., it converts
a set of reflections stored in PDB (1b7v-sf.cif) to the CNS format.
It is done by using the following command.
sf-convert-v1.200-prod-src/bin/sf_convert -i mmCIF -o CNS -sf 1b7v-sf.cif

This program took the first wavelength. I am not sure if there is a way
to choose which wavelength to take.

The generated file (1b7v-sf.cif.CNS) has one strange peculiarity.
The number of digits after the decimal point is 2, whereas CNS usually has 3.
However, CNS can read this file directly, as I verified.

This program can also select reflections for the test set:

sf-convert-v1.200-prod-src/bin/sf_convert -i mmCIF -o CNS -sf 1b7v-sf.cif -flag 5
Dmitri, 27/8-12

History

16/10-15: Method set up from ComQum-U