ComQum-UX
ComQum-UX or joint
quantum neutron and X-ray refinement is a method to combine
joint neutron and X-ray crystallographic refinement with QM/MM
calculations to obtain an optimum compromise between neutron
crystallography, X-ray crystallography and quantum chemistry.
It requires the structure factors from both a neutron and an
X-ray crystallographic investigation.
The proper reference of this method is:
O. Caldararu, F. Manzoni, E. Oksanen, D.
T. Logan & U. Ryde (2019) "Refinement of protein structures
using a combination of quantum mechanical calculations with neutron
and X-ray crystallographic data" Acta Cryst. D, 75, 368-380.
How to run
a ComQum-UX job
In this version, we consider QM as a complement to MM in the X-ray
and neutron investigation.
For time being, we do not include electrostatics and have no
hydrogen atoms.
- Start with a refined crystal structure, including structure
factors. The structure factors can be found from the summary
entry in the 3DBrowser (i.e. look first up the pdb file). Note
that you need both neutron and X-ray structure factors.
- cp /temp4/bio/Data/Comqumux_lib/* .
Edit the file sedfile by inserting (the values should appear
between the second and last /; e.g. s/cqx_crystal_a/43.1/) the
data in the following points:
1. Space group
2. Crystal data (a, b, c, alpha, beta, gamma)
3. File names: coordinates, parameter, topology (do not include
the CNS files), and reflections.
4. Percentage in the test set (if not already set)
5. Low and high resolution limits
6. [Residues in the quantum system (no longer needed)]
- The data can be found from the pdb file:
1. Unit cell and symmetry
CRYST1 49.970 58.680
98.050 90.00 90.00 90.00 P 21 21
21 4
(this pdb row show a, b, c, alpha, beta, gamma, and the symmetry)
2. Resolution range
REMARK 3 DATA USED IN REFINEMENT.
REMARK 3 RESOLUTION RANGE HIGH
(ANGSTROMS) : 1.90
REMARK 3 RESOLUTION RANGE LOW
(ANGSTROMS) : 20.00
3. Sigma-cutoff value
REMARK 3 DATA
CUTOFF
(SIGMA(F)) : 0.000
4. R-factor
REMARK 3 R
VALUE
(WORKING SET) : 0.181
REMARK 3 FREE R
VALUE
: 0.231
5. Test set size
REMARK 3 FREE R VALUE TEST SET
SIZE (%) : 4.900
REMARK 3 FREE R VALUE TEST SET
COUNT : 1042
6. Parameter and topology files (at least if CNS was used for
refinement)
REMARK 3 PARAMETER FILE 1 :
PROTEIN_REP.PA
REMARK 3 PARAMETER FILE 2 :
WATER_REP.PARA
REMARK 3 TOPOLOGY FILE 1
: PROTEIN.TOP
REMARK 3 TOPOLOGY FILE 2
: WATER.TOP
7. Alternative configurations
REMARK 3 OTHER REFINEMENT REMARKS:
ALTERNATE CONFORMATIONS FOR
REMARK 3 RESIDUES 33, 120, 121 AND 122
- If there are alternative configurations in the file (you find
occupation numbers <>1 by changepdb, command occ) you can
either remove them (take the major one and check that Rfree is
not significantly changed) or you can include them (this is
better but more tedious), provided that they are not in the
quantum system (see below how).
- Check that there is a END at the end of all pdb files.
- Copy the subunit column to the segid column (pos 73-76; if not
used).
changepdb, command seg followed by return and w does this for
you.
HETATMiiiiixaaaaarrrxaiiiiaxxxffffffffggggggggffffffffggggggffffffxxxxxxsegiel-1
****|****1****|****2****|****3****|****4****|****5****|****6****|****7****|****8
- If necessary, construct topology and parameter files for
unusual ligands, etc.
- Remove ANISOU if present in the pdb file
grep -v ANISOU pdb >npdb
- Change the name of ions to the proper CNS names (e.g.
residue=ZN2, atom=ZN+2).
Note that if there are four letters name in the pdb file, they
must start so that there is a blank before the residue.
Use changepdb to change the format to -6.
Moreover, cns does not read the subunit key, so you may need to
renumber the residues by changepdb, command rr.
Note that changepdb may TER from the file.
You can add them again with the ter command; otherwise you can
get PATCH-ERR when you run generate.
- Change atom names to CNS convention by changepdb, command cnn.
Do not forget to write out the modified file.
- Run cqxu_sedscript
This will construct the proper cns files:
bindividual.inp bindividual1.inp
generate.inp make_cv.inp mmrun2 mmrun3
- Run generate for system 3:
ncns <generate.inp >generate.out
grep ERR generate.out
grep WRN generate.out
grep INFO generate.out
grep error generate.out
grep warning generate.out
The following warnings or infos are OK, but no ERR or error:
%READC-WRN:
still 1361 missing coordinates (in selected
subset)
%READC-WRN: atom
A 113 PRO HA has
been renamed: DA
CNSsolve>
display
%INFO:
There
are
no
coordinates missing for non-hydrogen atoms
This gives the file mm3.pdb, which should be used as the pdb
file in the following.
Check the start (REMARK section) of this file for added Cys-Cys
links and missing atoms.
If you want to avoid added atoms, set:
{===>}
atom_delete=(not(known));
If you want to add or delete bonds, you need to include patches:
{* any special
protein patches can be applied here *}
{===>}
topology
@fs2.patch
end
patch FS2P
reference=1=(resid 632) reference=2=(resid 188)
reference=3=(resid 260) reference=4=(resid 97)
reference=5=(resid 128)
end
Remark Patch to build the
Fe-Cys bonds for a Fe2S2 cluster
presidue FS2P
add bond 1FE1 2SG
add bond 1FE1 3NH1
add bond 1FE2 4SG
add bond 1FE2 5SG
add angle 2SG
1FE1 1S1
add angle 2SG
1FE1 1S2
add angle 3NH1
1FE1 1S1
add angle 3NH1
1FE1 1S2
add angle 4SG
1FE2 1S1
add angle 4SG
1FE2 1S2
add angle 5SG
1FE2 1S1
add angle 5SG
1FE2 1S2
add angle 2CB
2SG 1FE1
add angle 3CZ
3NH1 1FE1
add angle 4CB
4SG 1FE2
add angle 5CB
5SG 1FE2
end
- If you want to include alternative
configurations, go to CNS
home
page and set up the file alternate.inp. Set the input
files mm3.mtf and mm3.pdb and the default output files
alternate.mtf and alternate.pdb. Finally, insert the number of
conformations (typically 2) and the atoms involved in the format
"atom segid residue_number atom_name", e.g. atom S 234 NE2
or atom L 33 CG (this can be done with changepdb, command
occ).
Then run
cns <alternate.inp
>alternate.out
Atoms with alternate configurations will get new segids AC1,
AC2, but they will have identical coordinates (that of
conformation B!) and a halfed occupancy. This can be corrected by running changepdb,
command al on the file alternate.pdb, use the input pdb file of generate.inp as correct file. Write the file out as
mm3.pdb
Finally
cp alternate.mtf mm3.mtf
- cp mm3.pdb comqum.pdb
If you intend to use a point-charge model in the QM calculations,
you must instead build a new comqum.pdb file with changemtf, on mm3.mtf,
command p, using coordinates from mm3.pdb (this file contains point
charges from the CNS force field).
- Run pdbtocomqum
- Select the name of the logfile (default is OK)
- Enter the name of the pdb file (comqum.pdb)
- Do not search for short contatcts
- Enter a new title (can be done in a file, see next point).
- Select the quantum system, including the junction atoms
that will be changed to hydrogens.
This should be done by entering the name of a file with the atom
numbers (see below; call it syst1). This file should also
contain the new title as the first line.
- Do not add any water molecules
- Select the radius for system 2 (MM system that is
optimised). Typically the whole protein (1000 Å).
There is no need to include or remove residues from this system.
- Select the radius for system 3 (MM system that is fixed).
Typically 0 Å (the whole protein has already been
included).
- Select a neutron type of calculation, by entering u.
- Select the basis set (i.e. the type of junction, typically
12).
- Enter the junction atoms (those that will become hydrogen
atoms in the quantum system; this is done automatically by the
syst1 file)
Ensure that they are not unknown (type -1).
- Use the new format.
- Do not remove the charge of the neighbours (n).
- Do not redistribute the deleted charges (n).
This gives the files:
comqum.dat comqum.mmout
comqum.qcout logfil
- If the quantum system is built up of symmetry related atoms,
insert into the file comqum.dat the relevant tranformation in
the format:
$transformation
26-40,41
0.0d0 1.0d0 0.0d0 0.0d0
0.0d0 0.0d0 1.0d0 0.0d0
1.0d0 0.0d0 0.0d0 0.0d0
0.0d0 0.0d0 1.0d0 0.0d0
1.0d0 0.0d0 0.0d0 0.0d0
0.0d0 1.0d0 0.0d0 0.0d0
where the first line gives the atoms that should be transformed
and the following three lines gives the relevant transformation
(rotation matrix followed by translation vector).
and the last three lines gives the relevant back-transformation
- Run comqumtoturbo to get the Turbomole files:
control coord
- Check the name of metal atoms in file coord and then run
define
Construct an UFF hessian in the geometry menu (ff in
geometry menu)
Define the basis sets
Get starting orbitals.
Change the DFT functional from B3LYP, e.g. to TPSS
Turn on the ri calculations
Turn on DFT-D3 (dsp)
- For time being we run without electrostatics, so
rm pointcharges
and remove the line
$point_charges file=pointcharges
from the file control
- No longer needed:
Construct parameter and topology files for the quantum system
(without H-atoms), if necessary. This is done automatically by
the program mkjunct (run it in
/temp4/bio/Data/Cns/Mkjunct, where there is a modified version
of protein_rep.param)
Copy the *.top and *.par files for all QM junctions from
/temp4/bio/Data/Cns.
- Run comqumtocns to get the CNS files:
Enter first u and the nmm1.par and possibly more (e.g. all.par)
parameter files.
This gives the following files:
fix.pdb generate1.inp
mmrun1 pdb1 pdb3
but only mmrun1 and fix.pdb are actually used.
With alternate conformations, this does not work properly
(mmrun3 and binidvidual1 complains about atoms not found in
molecular structure). Run:
changepdb <<EOF
mm3.pdb
cqf
comqum.dat
w
fix.pdb
q
EOF
- Convert the pdb structure factor file to CNS format using
/temp4/bio/Cns/pdbtocnsrefl.f.
You probably have to change the read reflections line in the
source code (line ~158) to fit your reflection file. Check
carefylly that you get the right number of unique relections.
The program is compiled by typing make pdbtocnsrefl in
the directory /temp4/bio/Cns/Linux.
A better and more general tool is sf-convert:
http://sw-tools.pdb.org/apps/SF_CONVERT (more info at the end of
this file)
- Typically not needed
Construct a test set if not already available in the structure
factor file.
cns <make_cv.inp >make_cv.out
grep ERR make_cv.out
grep WAR make_cv.out
grep INFO make_cv.out
- If you have alternative configurations, you have to go into
the file mmrun2 and mmrun3 and change "none" to "segid AC1" or
AC2 in the following (consecutive) lines (around line 195):
{* select atoms in
alternate conformation 1 *}
{===>} conf_1=(segid AC1);
{* select atoms in
alternate conformation 2 *}
{===>} conf_2=(segid AC2);
- cp /temp4/bio/Data/Cns/protein-allhdg_d.param
protein_rep.param
Test all programs by running
cquprep
Look in the files mmrun1.out and mmrun2.out for violations.
The following warnings or informations are OK (but not any ERR):
%-WRN: Could not
convert to element: HARG
DEFINE>{* files must contain unique array names
otherwise errors will occur *}
%READC-WRN: multiple
coordinates for 5948 atoms
CNSsolve> display REMARK warning:
B-correction gave atomic B-values less than zero
Check that fixcoord1 worked properly:
********** Fix Coord 1
ended normally **********
- You should always run with a fixed value of the wa factor
(even if you want the default one). This is done in the file
mmrun2 (search for wa=). Run fixenergy_cnsin to get the default value of
wa (take the first in the output of this program - there are
two).
- Remove unneccesary files and make a backup by running
cqxback
The following 19 (20 for UHF) files are needed to run comqumx:
basis
control
grad1
mm1.pdb1
mm3.pdb1
mmrun3
comqum.par
bindividual.inp
coord
mm1.mtf
mm3.mtf
mmrun1
mos
comqum.dat
fix.pdb
mm1.pdb
mm3.pdb
mmrun2 bindividual1.inp
In addition, you need the structure factor file and the
parameter files for all heterocompounds.
- You can add restraints on a distance between two atoms using
(note the unusual units):
$restraints
atom1(1) atom2(1)
desired_distance_in_A force_constant_in_au
$restraints
1 3 1.00 10.0
A typical force constant is 10 a.u. (which typically gives a
0.05 pm deviation at convergence)
- Start comqumx:
comqumu -c 200.
- You do not have to calculate strain energies; these can be
directly read from the energy file, entry three.
To run a pure CNS ComQum calculation
(i.e. ComQum with the CNS MM force field, but no crystal data):
Make the following two changes in mmrun2
- Line 238:
{===>} wa=0;
(Instead of: wa=-1;)
- Line 511:
exclude elec pvdw xref
(Instead of: exclude elec include pvdw xref)
To run a ComQum-X calculation without any MM force field (No MM)
(i.e. only crystal data; the quantum system is not corrected for the
change in junction atoms).
Make the following two changes in mmrun2
- Line 238:
{===>} wa=1.0;
(Instead of: wa=-1;)
- Line 511:
exclude elec bond angl dihe impr vdw pvdw include xref
(Instead of: exclude elec include pvdw xref)
Also make the following change in mmrun1
- Line 8:
flags exclude elec pele bond angl dihe impr vdw ? end
(Instead of: flags exclude elec pele ? end)
This is
cquprep :
# ComQum-X Prep
# Ulf Ryde, 6/3-01
#
echo Run changemtf to set up the mm1.mtf, mm1.par, and mm1.pdb files
echo from mm3.mtf, mm3.pdb, and protein_rep.param
changemtf <<EOF
mm3.mtf
t
y
mm1.mtf
protein-allhdg_d.param
mm1.par
y
mm3.pdb
mm1.pdb
q
EOF
echo Construct dummy pdb files for 8-decimal precision
echo Enter mm1.pdb, format, 1, 3, 0, dummy, return, q
changepdb <<EOF
mm1.pdb
f
1
3
0
dum
q
EOF
echo
echo Enter mm3.pdb, format, 1, 3, 0, dummy, return, q
changepdb <<EOF
mm3.pdb
f
1
3
0
dum
q
EOF
echo CNS mmrun 1
ncns <mmrun1 >mmrun1.out
grep ERR mmrun1.out
grep WRN mmrun1.out
grep INFO mmrun1.out
grep violation mmrun1.out
#page mmrun1.out
echo CNS mmrun 2
ncns <mmrun2 >mmrun2.out
grep ERR mmrun2.out
grep WRN mmrun2.out
grep INFO mmrun2.out
grep error mmrun2.out
grep warning mmrun2.out
grep '>>' mmrun2.out | grep - v XRAY
grep violation mmrun2.out
echo CNS mmrun 3
ncns <mmrun3 >mmrun3.out
grep ERR mmrun3.out
grep WRN mmrun3.out
grep INFO mmrun3.out
grep error mmrun3.out
grep warning mmrun3.out
grep violation mmrun3.out | grep - v XRAY
grep '>>' mmrun3.out
echo CNS bindividual
ncns <bindividual.inp >bindividual.out
grep ERR bindividual.out
grep WRN bindividual.out
grep INFO bindividual.out
grep error bindividual.out
grep warning bindividual.out
grep violation bindividual.out
grep '>>' bindividual.out
echo CNS bindividual1
ncns <bindividual1.inp >bindividual1.out
grep ERR bindividual1.out
grep WRN bindividual1.out
grep INFO bindividual1.out
grep error bindividual1.out
grep warning bindividual1.out
grep violation bindividual1.out
grep '>>' bindividual1.out
echo fixcoord1
fixcoord1_cnsin
fixcoord1_turboin
fixcoord1
fixcoord1_cnsout
# Make the files bindividual.pdb, bindividual.pdb1, and
minimize3.pdb1
\cp mm3.pdb1 bindividual.pdb1
\cp mm3.pdb1 minimize3.pdb1
\cp mm3.pdb bindividual.pdb
This is a schematic overview of comqumx
# Start with a scf step then loop through the following four
steps
# Gradient step (gradient calculation)
rm *_[123]
grad
offsetf
cp gradient grad1
fixforce_cnsin
fixforce_turboin
fixforce
fixforce_turboout
mv gradient g1
mv grad1 gradient
# Relaxation step
relax >>output
fixcoord1_cnsin
fixcoord1_turboin
fixcoord1
fixcoord1_cnsout
# Molmech step
# if protfree then
cp mm3.pdb mm3.old
cns <mmrun3
cns <bindividual.inp
mv bindividual.pdb mm3.pdb
#endif
# Scf step (energy calculation)
cns <mmrun1
cns <mmrun2
dscf
grep 'Self energy' output > selfenergy
fixenergy_cnsin
fixenergy_turboin
fixenergy
fixenergy_turboout
checkconv
key about pdb files:
mmrun1: mm1.pdb (no output file)
mmrun2: mm3.pdb -> minimize.pdb (not used)
mmrun2: mm3.pdb -> minimize3.pdb (used by bindividual.inp)
bindividual.inp: minimize3.pdb -> bindividual.pdb (moved to
mm3.pdb)
These are the programs you need (apart from Turbomole and CNS):
fixforce_cnsin
fixforce_turboin
fixforce
fixforce_turboout
fixcoord1_cnsin
fixcoord1_turboin
fixcoord1
fixcoord1_cnsout
fixenergy_cnsin
fixenergy_turboin
fixenergy
fixenergy_turboout
comqumx
In addition, you may need some of our accessory programs:
pdbtocomqum
comqumtoturbo
comqumtocns
cqxprep
cqxback
changepdb
This is an example of a file with the quantum system
(plastocyanin)
New title
531
534-541
1229
1231-1234
1266
1269-1276
1324
1327-1334
1443
These changes are appropriate when charges and hydrogen atoms should
be included
- Protonate the quantum system. It can be done by protonate
protonate <infile >outfile
You may have to construct a protonate
entry for it.
If you have used protonate, you have to change the format of the
pdb file with changepdb, command format (select 1; then write it
out with command write).
- Run /temp4/bio/Bin/changemtf, option pdb, to construct a pdb
file (comqum.pdb) with charges for pdbtocomqum.
sf-convert
I found the following tool for refinement.
sf-convert
http://sw-tools.pdb.org/apps/SF-CONVERT/index.html
It seems to be able to do what we need, i.e., it converts
a set of reflections stored in PDB (1b7v-sf.cif) to the CNS format.
It is done by using the following command.
sf-convert-v1.200-prod-src/bin/sf_convert -i mmCIF -o CNS -sf 1b7v-sf.cif
This program took the first wavelength. I am not sure if there is a way
to choose which wavelength to take.
The generated file (1b7v-sf.cif.CNS) has one strange peculiarity.
The number of digits after the decimal point is 2, whereas CNS usually has 3.
However, CNS can read this file directly, as I verified.
This program can also select reflections for the test set:
sf-convert-v1.200-prod-src/bin/sf_convert -i mmCIF -o CNS -sf 1b7v-sf.cif -flag 5
Dmitri, 27/8-12
History
16/10-15: Method set up from ComQum-U