ComQum-X
ComQum-X or quantum
refinement is a method to combine X-ray crystallographic
refinement with QM/MM calculations to obtain an optimum
compromise between crystallography and quantum chemistry.
It requires the structure factors from a crystallographic
investigation.
The proper reference of this method is:
U. Ryde, L. Olsen & K. Nilsson (2002)
"Quantum chemical geometry optimisations in proteins using
crystallographic raw data", J. Comp. Chem., 23, 1058-1070.
How to run
a ComQum-X job
New version where the cns files are constructed
by sedscript
In this version, we consider QC as a complement to MM in the X-ray
investigation.
Thus we do not inclued electrostatics and have no hydrogen atoms.
Form to fill in
- Start with a refined crystal structure, including structure
factors. The structure factors can be found from the summary
entry in the 3DBrowser (i.e. look first up the pdb file).
- cp /temp4/bio/Data/Comqumx_lib/* .
Edit the file sedfile by inserting (the values should appear
between the second and last /; e.g. s/cqx_crystal_a/43.1/) the
data in the following points:
1. Space group
(http://stock.cabm.rutgers.edu/xtal/NOTES/sg_table.html gives a
proper translation table; note that the last number in the CRYST1
record is the number of the space group)
2. Crystal data (a, b, c, alpha, beta, gamma)
3. File names: coordinates, parameter, topology (do not include
the CNS files), and reflections.
4. Percentage in the test set (if not already set)
5. Low and high resolution limits
6. Residues in the quantum system
- Find out from the pdb file the following data
1. Unit cell and symmetry
CRYST1 49.970 58.680
98.050 90.00 90.00 90.00 P 21 21
21 4
(this pdb row show a, b, c, alpha, beta, gamma, and the symmetry)
2. Resolution range
REMARK 3 DATA USED IN REFINEMENT.
REMARK 3 RESOLUTION RANGE HIGH
(ANGSTROMS) : 1.90
REMARK 3 RESOLUTION RANGE LOW
(ANGSTROMS) : 20.00
3. Sigma-cutoff value
REMARK 3 DATA
CUTOFF
(SIGMA(F)) : 0.000
4. R-factor
REMARK 3 R
VALUE
(WORKING SET) : 0.181
REMARK 3 FREE R
VALUE
: 0.231
5. Test set size
REMARK 3 FREE R VALUE TEST SET
SIZE (%) : 4.900
REMARK 3 FREE R VALUE TEST SET
COUNT : 1042
6. Parameter and topology files (at least if CNS was used for
refinement)
REMARK 3 PARAMETER FILE 1 :
PROTEIN_REP.PA
REMARK 3 PARAMETER FILE 2 :
WATER_REP.PARA
REMARK 3 TOPOLOGY FILE 1
: PROTEIN.TOP
REMARK 3 TOPOLOGY FILE 2
: WATER.TOP
7. Alternative configurations
REMARK 3 OTHER REFINEMENT REMARKS:
ALTERNATE CONFORMATIONS FOR
REMARK 3 RESIDUES 33, 120, 121 AND 122
- If there are alternative configurations in the file (you find
occupation numbers <>1 by changepdb, command occ) you can
either remove them (take the major one and check that Rfree is
not significantly changed) or you can include them (this is
better but more tedious), provided that they are not in the
quantum system (se below how).
- Insert END at the end of all files.
- It is often wise to copy the subunit column to the segid
column (pos 73-76; if not used).
changepdb, command seg followed by return and w does this for
you.
HETATMiiiiixaaaaarrrxaiiiiaxxxffffffffggggggggffffffffggggggffffffxxxxxxsegiel-1
****|****1****|****2****|****3****|****4****|****5****|****6****|****7****|****8
- If necessary, construct topology and parameter files for
unusual ligands, etc.
They can be obtained either by describe
(Hess2FF; from QM calculations) or from the HIC-UP web server.
Collect all top and par files in the files all.top and all.par.
- Change the name of ions to the proper CNS names (e.g.
residue=ZN2, atom=ZN+2; it is recommend also to change the
charge in the last 2 columns (79 and 80) of the pdb file).
Note that if there are four letters name in the pdb file, they
must start so that there is a blank before the residue. Use
changepdb to change the format to -6.
Moreover, cns does not read the subunit key, so you may need to
renumber the residues by changepdb, command rr.
Note that changepdb removes TER from the file. These may have to
be added manually; otherwise you can get PATCH-ERR when you run
generate.
- Run cqx_sedscript
This will construct the proper cns files:
bindividual.inp bindividual1.inp
generate.inp make_cv.inp mmrun2 mmrun3
- Run generate for system 3:
cns <generate.inp >generate.out
grep ERR generate.out
grep WRN generate.out
grep INFO generate.out
grep error generate.out
grep warning generate.out
These warnings and info are acceptable:
%READC-WRN:
still 1361 missing coordinates (in selected
subset)
CNSsolve>
display
%INFO:
There
are
no
coordinates missing for non-hydrogen atoms
This gives the file mm3.pdb, which should be used as the pdb
file in the following.
Check the start (REMARK section) of this file for added Cys-Cys
links and missing atoms.
If you want to avoid added atoms (you
almost always want that), set (this is now default):
{===>}
atom_delete=(not known);
If you want to add or delete bonds, you need to include patches:
{* any special
protein patches can be applied here *}
{===>}
topology
@fs2.patch
end
patch FS2P
reference=1=(resid 632) reference=2=(resid 188)
reference=3=(resid 260) reference=4=(resid 97)
reference=5=(resid 128)
end
Remark Patch to build the
Fe-Cys bonds for a Fe2S2 cluster
presidue FS2P
add bond 1FE1 2SG
add bond 1FE1 3NH1
add bond 1FE2 4SG
add bond 1FE2 5SG
add angle 2SG
1FE1 1S1
add angle 2SG
1FE1 1S2
add angle 3NH1
1FE1 1S1
add angle 3NH1
1FE1 1S2
add angle 4SG
1FE2 1S1
add angle 4SG
1FE2 1S2
add angle 5SG
1FE2 1S1
add angle 5SG
1FE2 1S2
add angle 2CB
2SG 1FE1
add angle 3CZ
3NH1 1FE1
add angle 4CB
4SG 1FE2
add angle 5CB
5SG 1FE2
end
- If you want to include alternative
configurations, insert the atoms involved in the
atom_select keyword in the file alternate.inp, using the format
"atom segid residue_number atom_name", e.g. atom S 234 NE2
or atom L 33 CG (this list can be set up with changepdb, command
occ).
Also check that you have only two configurations of each
residue.
Then run
cns <alternate.inp
>alternate.out
Atoms with alternate configurations will get new segids AC1,
AC2, but they will have identical coordinates (that of
conformation B!). This can be corrected by running changepdb,
command al on the file alternate.pdb. Write the file out as
mm3.pdb
Finally
\mv alternate.mtf mm3.mtf
The script alternate.inp does not work with metal ion with names
like "Fe+2".
We managed to solve the problem by changing the atom name to
"Fe " in alternate.inp, mm3.mtf (only 1st field = atom
name) and mm3.pdb. Then you have to change them back to FE+2
(now in both AC1 and AC2) in alternate.mtf and alternate.pdb
after running cns.
This is now done automatically with changepdb command fe and changemtf command f.
- cp mm3.pdb comqum.pdb
Then change the name of the histidine residues in the
quantum system to HID or HIE (to get the right protonation and
to get correct junctions). Cysteinate metal ligands may remain
called CYS.
- Protonate the quantum system in this file (comqum.pdb).
One way is to use tleap (see Amber equilibration page).
Another way to do this is to run protonate on the protein:
/temp4/bio/AMBER/Amber7/exe/protonate -k -d
/temp4/bio/Data/Amber/PROTON_INFO.bio <comqum.pdb >ppdb
This will protonate all amino acids. To protonate water molecules,
use changepdb, command addwat and then command w(rite). This will
give you crude proton coordinates, which you might want to
improve. Alternatively, you can add them manually with Spartan. To
automatically improve the proton coordinates, follow the procedure
in the Amber equilibration setup ,
points 9-16.
Note that you should add protons only to the quantum system and
not to the whole protein. However, protonate works only for full
amino acids. Therefore, it is easiest to protonate the whole
protein first and then copy only exactly those hydrogen atoms you
need into the pdb file.
B factors and occupation numbers need to be defined. Therefore,
set occupation numbers of the hydrogen atoms to 1.00 and the B
factors to the same value as the mother heavy atom.
Note that rasmol does not understand the Amber format of
four-character hydrogen atoms. If you want to visualise the
result, run changepdb command f(ormat), chose format 3, then write
out a new file with command w(rite).
If you have an old comqum.pdb file with the H atoms already added, this can be done automatically with changepdb command cph.
- Run pdbtocomqum
- Select the name of the logfile (default is OK)
- Enter the name of the pdb file (comqum.pdb)
- Do not search for short contatcts
- Enter a new title (can be done in a file, see next point).
- Select the quantum system, including the junction atoms
that will be changed to hydrogens (you first give the number
of the amino acid, then the number of the atoms in this amino
acid; one <CR> let you select a new amino acid; another
<CR> ends the selection).
This can preferably be done by entering the name of a file with
the atom numbers (see below). This file should also contain the
new title as the first line.
- 0 water molecules to include
- Select the radius for system 2 (MM system that is
optimised). Typically the whole protein (1000 Å).
There is no need to include or remove residues from this system.
- Select the radius for system 3 (MM system that is fixed).
Typically 0 Å (the whole protein has already been included).
- Select a cns type of calculation, by entering c(ns).
- Select the basis set (i.e. the type of junction, typically
9).
- Enter the junction atoms (those that will become hydrogen
atoms in the quantum system)
Ensure that they are not unknown.
- Press return until the program is finished.
This gives the files:
comqum.dat comqum.mmout
comqum.qcout logfil
- If the quantum system is built up of symmetry related atoms,
insert into the file comqum.dat the relevant tranformation in
the format:
$transformation
26-40,41
0.0d0 1.0d0 0.0d0 0.0d0
0.0d0 0.0d0 1.0d0 0.0d0
1.0d0 0.0d0 0.0d0 0.0d0
0.0d0 0.0d0 1.0d0 0.0d0
1.0d0 0.0d0 0.0d0 0.0d0
0.0d0 1.0d0 0.0d0 0.0d0
where the first line gives the atoms that should be transformed
and the following three lines gives the relevant transformation
(rotation matrix followed by translation vector).
and the last three lines gives the relevant back-transformation
- Run comqumtoturbo to get the Turbomole files:
control coord
- Check the name of metal atoms in file coord and then run
define
- construct an UFF hessian in the geometry menu
- define the basis sets
- get starting orbitals
- possibly change the DFT functional
- turn on the ri calculations
- turn on DFT-D3 dispersion (bj)
- No longer needed
Construct parameter and topology files for the quantum
system (without H-atoms), if necessary. This is done
automatically by the program mkjunct
(run it in /temp4/bio/Data/Cns/Mkjunct, where there is a
modified version of protein_rep.param)
Copy the *.top and *.par files for all QM junctions from
/temp4/bio/Data/Cns.
- Run comqumtocns to get the CNS files:
Enter first x and then mm1.par and possibly more (e.g. all.par) as
the name of the parameter file.
comqumtocns <<EOF
x
mm1.par
all.par
EOF
This gives the following files:
fix.pdb generate1.inp
mmrun1 pdb1 pdb3
but only mmrun1 and fix.pdb are actually used.
Check that there are no HIE, HID, or HIP in fix.pdb.
With alternate conformations, this does not work properly (mmrun3 and
binidvidual1 complains about atoms not found in molecular structure).
Run:
changepdb <<EOF
mm3.pdb
cqf
comqum.dat
w
fix.pdb
q
EOF
- Convert the pdb structure factor file to CNS format using
/temp4/bio/Cns/pdbtocnsrefl.f.
You probably have to change the read reflections line in the
source code (line ~158) to fit your reflection file. Check
carefully that you get the right number of unique reflections.
The program is compiled by typing make pdbtocnsrefl in
the directory /temp4/bio/Cns/Linux.
A better and more general tool is sf-convert:
http://sw-tools.pdb.org/apps/SF_CONVERT (more info at the end of
this file)
or to do it by Phenix
(https://www.phenix-online.org/documentation/reference/reflection_file_tools.html):
phenix.relection_file_converter xxx.cif --cns=cns.cv (a numer of
flags are normally needed)
phenix.reflection_file_converter 6cdk-v.cif
--label='_refln.F_meas_au,_refln.F_meas_sigma_au'
--r-free-label='_refln.pdbx_r_free_flag'
--r-free-test-flag-value=0 --cns .)
- Construct a test set if not already available in the
structure factor file (normally not needed).
cns <make_cv.inp >make_cv.out
grep ERR make_cv.out
grep WAR make_cv.out
grep INFO make_cv.out
- If you have alternative configurations, you have to go into
the file mmrun2 and mmrun3 and change "none" to "segid AC1" or
AC2 (or perhaps something like LAC1 and LAC2) in
the following (consecutive) lines (around line 195):
{* select atoms in
alternate conformation 1 *}
{===>} conf_1=(segid AC1);
{* select atoms in
alternate conformation 2 *}
{===>} conf_2=(segid AC2);
- Copy the relevant parameter file of the protein.
For time being, you need a special version :
cp /temp4/bio/Data/Cns/protein_rep.param protein.param
(or protein-allhdg5-4.param or dna-rna_rep.param)
- Test all programs by running
cqxprep
Look in the files mmrun1.out and mmrun2.out for violations.
The following warnings or informations are OK (but not any ERR):
DEFINE>{* files
must contain unique array names otherwise errors will occur *}
%READC-WRN: multiple
coordinates for 5948 atoms
CNSsolve> display REMARK warning:
B-correction gave atomic B-values less than zero
Check that fixcoord1 worked properly:
********** Fix Coord 1
ended normally **********
If you have special parameter files, not called all.par, they
have to be included in mmrun1.
- You should always run with a fixed value of the wa factor
(even if you want the default one). This is done in the file
mmrun2 (search for wa=). Run fixenergy_cnsin to get the default value of
wa (take the first in the output of this program - there are
two).
- Remove unneccesary files and make a backup by running
cqxback
The following 19 (20 for UHF) files are needed to run comqumx:
basis
control
grad1
mm1.pdb1
mm3.pdb1
mmrun3
comqum.par
bindividual.inp
coord
mm1.mtf
mm3.mtf
mmrun1
mos
comqum.dat
fix.pdb
mm1.pdb
mm3.pdb
mmrun2 bindividual1.inp
In addition, you need the structure factor file and the
parameter files for all heterocompounds.
- You can add restraints on a distance between two atoms using
(note the unusual units):
$restraints
atom1(1) atom2(1)
desired_distance_in_A force_constant_in_au
$restraints
1 3 1.00 1.0
A typical force constant is 1 a.u. (which typically gives less
than 0.01 Å deviation at convergence)
- Start comqumx:
comqumx -c 200.
The results are in files energy and cns.dat.
The final coordinates are in bindividual1.pdb and in coord
- You do not have to calculate strain energies; these can be
directly read from the energy file, entry three.
How to make and look at maps
We use phenix.maps to make the maps and Coot to look at the maps:
- You should have the file map.parms in your directory
(otherwise, copy it from/temp4/bio/Data/Comqumx_lib or
/lunarc/nobackup/projects/bio/Data/Comqumx_lib)
You normally do not have to modify it.
- ln -fs bindividual1.pdb pdb
- phenix.maps maps.params >maps.log
for large proteins, it needs to be run in batch queue
This gives you four files:
pdb_2mFo-DFc_map.ccp4 pdb_mFo-DFc_map.ccp4
pdb_fmodel.mtz pdb_map_coeffs.mtz
The first two are the density and difference-density maps for
Pymol
The other two are for phenix, Coot and edstats
- Start Coot
open the coordinate file: pdb
auto open pdb_map_coeffs.mtz
How to calculate RSZD scores
- You need to calculate the maps first (see previous point)
- You should have a file edstat_ph in your directory (otherwise,
copy it from /temp4/bio/Data/Comqumx_lib or
/lunarc/nobackup/projects/bio/Data/Comqumx_lib)
You normally do not have to modify it.
- Run it
./edstats_ph
- This gives the file edstats.out
It contains results from all residues.
You normally want to grep out only the interesting results, e.g.
grep "CYS C 62" edstats.out
grep "GLY C 87" edstats.out
...
There are three sections in the output: one for the mainchain
(m), one for the sidechain (s) and one for all atoms in the
residue (a).q
The interesting results are in the ZD-, ZD+ and ZD columns
(maximum value for the negative or positive difference density,
as well as the maximum absolute value of these two.
If you run edstats.py without any options, the program will tell
you the content of all columns.
This is the file edstats_ph
cad hklin1 pdb_fmodel.mtz hklin2 pdb_map_coeffs.mtz hklout hkl_ed.mtz<<EOF
LABIN FILE 1 E1=FOBS E2=SIGFOBS E3=FCALC E4=PHIFCALC E5=FOM E6=R_FREE_FLAGS
LABOUT FILE 1 E1=FOBS E2=SIGFOBS E3=FC E4=PHIC E5=FOM E6=FREE
LABIN FILE 2 ALL
EOF
\cp pdb pdb.pdb
edstats.pl -xyzin pdb -hklin hkl_ed.mtz -flabel FOBS
cad is a program in the CCP4 suite to combine the data in the pdb_fmodel.mtz and pdb_map_coeffs.mtz files into a single file hkl_ed.mtz, with slightly changed labels.
How to run Phenix after ComQum-X
- You need the bindividual1.pdb and *.cv files
- To run in batch (the GUI is easier to use), you need the
param.def file (can be copied from /temp4/bio/Data/Comqumx_lib
or /lunarc/nobackup/projects/bio/Data/Comqumx_lib).
Change in this file: the unit cell and space group, the input
pdb file, the cv file (and possible the various lables),
possibly *cif files for all heterogroups (can be calculated by
phenix), prefix for the output files.
Set the refinement strategy (typical: *individual_adp
*occupancies
- Phenix uses another format for the alternative conformations
(closer to pdb format).
This awk script changes it to this format:
{
if(substr($0,74,3)=="AC1"){
printf("%sB%s%s%s\n",substr($0,1,16),substr($0,18,4),substr($0,73,1),substr($0,23,56));
}
else if(substr($0,74,3)=="AC2"){
printf("%sA%s%s%s\n",substr($0,1,16),substr($0,18,4),substr($0,73,1),substr($0,23,56));
}
else
print $0
}
It also copies SEGID back to subunit
These two actions are now done by changepdb, commands ss and alc
- It may be necessary to change the format of metal sites back
to that in PDB.
- Run
phenix.refine param.def --unused_ok >& refine.log
How to run a ComQum-X
job
Old version where the files are constructed with
CNS
In this version, we consider QC as a complement to MM in the X-ray
investigation.
Thus we do not inclued electrostatics and have no hydrogen atoms.
Form to fill in
- Start with a refined crystal structure, including structure
factors. The structure factors can be found from the summary
entry in the 3DBrowser (i.e. look first up the pdb file).
- Find out from the pdb file the following data
1. Unit cell and symmetry
CRYST1 49.970 58.680
98.050 90.00 90.00 90.00 P 21 21
21 4
(this pdb row show a, b, c, alpha, beta, gamma, and the symmetry)
2. Resolution range
REMARK 3 DATA USED IN REFINEMENT.
REMARK 3 RESOLUTION RANGE HIGH
(ANGSTROMS) : 1.90
REMARK 3 RESOLUTION RANGE LOW
(ANGSTROMS) : 20.00
3. Sigma-cutoff value
REMARK 3 DATA
CUTOFF
(SIGMA(F)) : 0.000
4. R-factor
REMARK 3 R
VALUE
(WORKING SET) : 0.181
REMARK 3 FREE R
VALUE
: 0.231
5. Test set size
REMARK 3 FREE R VALUE TEST SET
SIZE (%) : 4.900
REMARK 3 FREE R VALUE TEST SET
COUNT : 1042
6. Parameter and topology files (at least if CNS was used for
refinement)
REMARK 3 PARAMETER FILE 1 :
PROTEIN_REP.PA
REMARK 3 PARAMETER FILE 2 :
WATER_REP.PARA
REMARK 3 TOPOLOGY FILE 1
: PROTEIN.TOP
REMARK 3 TOPOLOGY FILE 2
: WATER.TOP
7. Alternative configurations
REMARK 3 OTHER REFINEMENT REMARKS:
ALTERNATE CONFORMATIONS FOR
REMARK 3 RESIDUES 33, 120, 121 AND 122
- Remove alternative configurations (take the major one). You
find occupation numbers <>1 by changepdb, command occ. If
you want to include alternative configurations, you have to run
one ComQum job for each configuration.
Also set all occupancies = 1.00 (can also be done in
generate.inp).
- If you like, you may split the pdb file into parts consisting
of the protein, water, and ligands (metals).
Insert END at the end of all files. However, you can very well
keep all the coordinates in one single file (the original pdb
file).
- If necessary, construct topology and parameter files for
unusual ligands, etc.
- Change the name of ions to the proper CNS names (e.g.
residue=ZN2, atom=ZN+2).
- Construct a CNS
default file (Input files; Defaults; Set) with the
crystallographic data.
Save it and the chose Start editing files
- Set up a CNS
generate file (Input files; General).
You should typically change:
- pdb file names, both the protein and the other pdb files
(water, ligands, etc.)
- name of the output structure file mm3.mtf (instead of
generate.mtf)
- name of the output coordinate file mm3.pdb (instead of
generate.pdb)
- names of the topology and parameter files for the ligands
(the defaults of the protein are good).
- optionally set the occupancy set flag to true (if alternate
configurations have been removed)
- Then save the file as generate.inp
- Run generate for system 3:
cns <generate.inp >generate.out
grep ERR generate.out
grep WRN generate.out
grep INFO generate.out
grep error generate.out
grep warning generate.out
This gives the file mm3.pdb, which should be used as the pdb
file in the following.
- Copy mm3.pdb to comqum.pdb (cp mm3.pdb comqum.pdb)
and then change the name of the histidine residues in the
quantum system to HID or HIE (to get the right protonation and
to get correct junctions). Cysteinate metal ligands may remain
called CYS.
- Protonate the quantum system and construct a pdb file with
these hydrogen atoms (comqum.pdb).
B factors and occupation numbers need to be included. One way to
do this is to run protonate on the protein:
protonate -k -d
$OML_AMBER/dat/Mumod/PROTON_INFO.Ulf <comqum.pdb >ppdb
This will protonate all amino acids. To protonate water molecules,
use changepdb, command addwat and then command w(rite). This will
give you crude proton coordinates, which you might want to
improve. Alternatively, you can add them manually with Spartan. To
automatically improve the proton coordinates, follow the procedure
in the Amber equilibration setup ,
points 9-16.
Note that you should add protons only to the quantum system and
not to the whole protein. However, protonate works only for full
amino acids. Therefore, it is easiest to protonate the whole
protein first and then copy only exactly those hydrogen atoms you
need into the pdb file. Set occupation numbers of the hydrogen
atoms to 1.00 and the B factors to the same value as the mother
heavy atom.
Note that rasmol does not understand the Amber format of
four-character hydrogen atoms. If you want to visualise the
result, run changepdb command f(ormat), chose format 3, then write
out a new file with command w(rite).
- Run pdbtocomqum
- Select the name of the logfile (default is OK)
- Enter the name of the pdb file (comqum.pdb)
- Do not search for short contatcts
- Enter a new title (can be done in a file, see next point).
- Select the quantum system, including the junction atoms
that will be changed to hydrogens (you first give the number
of the amino acid, then the number of the atoms in this amino
acid; one <CR> let you select a new amino acid; another
<CR> ends the selection).
This can preferably be done by entering the name of a file with
the atom numbers (see below). This file should also contain the
new title as the first line.
- Select the radius for system 2 (MM system that is
optimised). Typically the whole protein (1000 Å).
If necessary, include or remove residues from this system.
- Select the radius for system 3 (MM system that is fixed).
Typically 0 Å (the whole protein has already been included).
- Select a cns type of calculation, by entering c(ns).
- Select the basis set (i.e. the type of junction, typically
6 = default).
- Enter the junction atoms (those that will become hydrogen
atoms in the quantum system)
Ensure that they are not unknown.
- Press return until the program is finished.
This gives the files:
comqum.dat comqum.mmout
comqum.qcout logfil
- Run comqumtoturbo to get the Turbomole files:
control coord
- Run define to define the basis sets and get starting
orbitals. Optionally also change the DFT functional.
- Construct parameter and topology files for the quantum system
(without H-atoms).
These should involve exactly the same atom types and parameters as
in the full protein, except at the junctions.
Note that the bond parameters of the junction hydrogen
atoms are fixed by the corresponding parameters for the
junction carbon atoms and junctfact (which is found in the
comqum.dat file).
Thus the force constant of the junction hydrogen atom must
be the force constant of the junction carbon atom * juncfact^2,
whereas the equilibrium parameter should be the carbon equilibrium
parameter / juncfact (i.e. it should be the ideal bond length of
that bond with that basis set, as listed in
/temp4/bio/Data/junctfactor_cns. Otherwise a spurious extra force
will be introduced:
FMM1=k(x-x0)^2
FMM2=k/jf^2(x*jf-x0*jf)^2 = k/jf^2*jf^2(x-x0)^2 =
k(x-x0)^2 = FMM1
Example (MMP):
$junction_atoms
14 11
1.38765000
bond CPB CH3E
572.053 1.5028
bond CPB HA
1101.53 1.083
1101.53=572.053*1.38765*1.38765
1.083=1.5028/1.38765
- Run comqumtocns to get the CNS files:
Enter the name of the system1 topology and parameter files.
This gives the following files:
fix.pdb generate1.inp
mmrun1 pdb1 pdb3
- Convert the pdb structure factor file to CNS format using
/temp4/bio/Cns/pdbtocnsrefl.
You probably have to change the read relections line in the source
code to fit your reflection file. Check carefylly that you get the
right number of unique relections. The program is compiled by
typing make pdbtocnsrefl in the directory /temp4/bio/Cns.
- Construct a test set if not already available in the
structure factor file.
This is done by the CNS file make_cv.inp (under X-ray, Utilities).
You typically have to change:
1. The name of the reflection file.
2. The percentage of reflections to be used.
3. The resolution range
4. The output reflection file name.
Then run cns:
cns <make_cv.inp >make_cv.out
grep ERR make_cv.out
grep WAR make_cv.out
grep INFO make_cv.out
- Run CNS
on the web again to construct the mmrun2 file,
using refinement:minimize.inp sample file.
1. Select Input files
2. Select Set (Default)
3. Read in the project.def file and select Start editing files
4. Typically you shall change (for the other fields, default
values are normally OK):
- molecular topology file name =mm3.mtf
- parameter file names, if any (do not forget water.param and
ion.param)
- coordinate file name =mm3.pdb
- reflection file name
- resolution range
- number of minimisation steps = 0
- Continue with the same file on the web
1. Write in the "select fixed atoms" box (instead of none):
attribute xcomp < 0.5
2. Set the number of minimisation steps = 1
3. Change the output pdb file name to minimize3.pdb.
Then save it as mmrun3.
- Open mmrun2 in a text editor.
* Insert after the row " coordinates
@&coordinate_infile " (about row 272):
{ComQum: Read the 4th to 8th decimals and include them in x,
y, z}
coordinates disp=comp @mm3.pdb1
do ( x = x + xcomp/100000.0 ) (all)
do ( y = y + ycomp/100000.0 ) (all)
do ( z = z + zcomp/100000.0 ) (all)
* Insert before the row (line 643)
print threshold=20.0 bond
(these lines are similar to those found in file mmrun1):
{ComQum Set flags}
set echo=off message=off end
flags ? end
{ComQum Calculate energy and gradient}
energy end
{ComQum Print energy to file mmen2}
set display=mmen2 end
display ComQum Energy MM2
display $ener[e20.14]
{ComQum Print wa = X-ray weight factor}
xray wa=? end
evaluate ($wa_print=$result)
display ComQum wa
display $wa_print
{ComQum Print rfree and r value}
display ComQum rfree r
display $full_test_r[f8.6] $full_r[f8.6]
{ComQum Calculate the number of atoms}
show sum (1) (all)
evaluate ($natom=$result)
{ComQum Print gradient to file force2}
set display=force2 end
display ComQum Gradients MM2
display $natom[i6]
for $4 in id (all) loop grad
show (dx) (id $4)
eval ($1=$result)
show (dy) (id $4)
eval ($2=$result)
show (dz) (id $4)
eval ($3=$result)
display $1[E20.14] $2[E20.14] $3[E20.14]
end loop grad
set display=OUTPUT message=normal end
print threshold=0.10 bond
print threshold=10.0 angle
print threshold=60.0 dihedral
print threshold=20.0 improper
- Open file mmrun3 with a text editor and
* Insert after the row " coordinates
@&coordinate_infile " (around line 272):
{ComQum: Read the 4th to 8th decimals and include them in x,
y, z}
coordinates disp=comp @mm3.pdb1
do ( x = x + xcomp/100000.0 ) (all)
do ( y = y + ycomp/100000.0 ) (all)
do ( z = z + zcomp/100000.0 ) (all)
* Insert before the row (around line 495):
{- check isolated atoms and atoms at special positions and add to
{ComQum: Read in fixed atoms <=> xcomp=0.000}
coordinate disposition=comp @fix.pdb
* Insert before the row (around line 646)
print threshold=20.0 bond
{ComQum Print energy to file mmen3}
set display=mmen3 end
energy end
display ComQum Energy MM3
display $ener[e20.14]
{ComQum Print wa = X-ray weight factor}
xray wa=? end
evaluate ($wa_print=$result)
display ComQum wa
display $wa_print
{ComQum Print rfree and r value}
display ComQum rfree r
display $full_test_r[f8.6] $full_r[f8.6]
set display=OUTPUT end
* Insert before the row (at the end of the script)
" @CNS_XTALMODULE:write_pdb (pdb_o_format=&pdb_o_format;"
{ComQum: Construct the set of 4th to 8th decimals in xyzcomp}
do ( xcomp = int( x * 1000.0) / 1000.0 ) (all)
do ( ycomp = int( y * 1000.0) / 1000.0 ) (all)
do ( zcomp = int( z * 1000.0) / 1000.0 ) (all)
do ( refx = (x - xcomp)*100000.0 ) (all)
do ( refy = (y - ycomp)*100000.0 ) (all)
do ( refz = (z - zcomp)*100000.0 ) (all)
do ( x = xcomp ) (all)
do ( y = ycomp ) (all)
do ( z = zcomp ) (all)
* and before the final stop
{ComQum: Write file minimize.pdb1 only if Rfree is improved}
if ( $start_test_r > $full_test_r ) then
do ( x = refx ) (all)
do ( y = refy ) (all)
do ( z = refz ) (all)
@CNS_XTALMODULE:write_pdb
(pdb_o_format=&pdb_o_format;
coordinate_outfile=minimize3.pdb1;
sgparam=$sgparam;)
end if
- Run CNS
on the web again to construct the bindividual.inp
file (refinement).
Read in the default file again if necessary (as above).
Typically you shall change (for the other fields, default values
are normally OK):
- molecular topology file name (=mm3.mtf)
- parameter file names, if any (do not forget water.param and
ion.param)
- coordinate file name (=mm3.pdb)
- reflection file name
- resolution range
- NO: Select atoms whose B-factors will
remain fixed:
attribute xcomp < 0.5
- number of steps of restrained individual B-factor
minimization = 1
- Change in this file:
* Insert after the line:
coordinates @&coordinate_infile (around line 264)
{ComQum: Read the 4th to 8th decimals and include them in x,
y, z}
coordinates disp=comp @mm3.pdb1
do ( x = x + xcomp/100000.0 ) (all)
do ( y = y + ycomp/100000.0 ) (all)
do ( z = z + zcomp/100000.0 ) (all)
* NO: Insert before the row (line 440):
fix selection=(not(&atom_select) or
&atom_fixed) end
{ComQum: Read in fixed atoms <=>
xcomp=0.000}
coordinate disposition=comp @fix.pdb
* Insert before the line (line 574)
set display=&coordinate_outfile end
{ComQum Print energy to file mmen4}
set display=mmen4 end
energy end
display ComQum Energy MM4
display $ener[e20.14]
{ComQum Print rfree and r value}
display ComQum rfree r
display $full_test_r[f8.6] $full_r[f8.6]
set display=OUTPUT end
* Insert before the line (at the end of the script) (line
690)
" @CNS_XTALMODULE:write_pdb (pdb_o_format=&pdb_o_format;"
{ComQum: Construct the set of 4th to 8th decimals in xyzcomp}
do ( xcomp = int( x * 1000.0) / 1000.0 ) (all)
do ( ycomp = int( y * 1000.0) / 1000.0 ) (all)
do ( zcomp = int( z * 1000.0) / 1000.0 ) (all)
do ( refx = (x - xcomp)*100000.0 ) (all)
do ( refy = (y - ycomp)*100000.0 ) (all)
do ( refz = (z - zcomp)*100000.0 ) (all)
do ( x = xcomp ) (all)
do ( y = ycomp ) (all)
do ( z = zcomp ) (all)
* and before the final stop
{ComQum: Write file bindividual.pdb1 only if Rfree is improved}
if ( $start_test_r > $full_test_r ) then
do ( x = refx ) (all)
do ( y = refy ) (all)
do ( z = refz ) (all)
@CNS_XTALMODULE:write_pdb
(pdb_o_format=&pdb_o_format;
coordinate_outfile=bindividual.pdb1;
sgparam=$sgparam;)
end if
- cp bindividual.inp bindividual1.inp
Change the following lines in bindividual1.inp
1. {===>} atom_fixed=(attribute ycomp > 1.5);
(not none)
2. {===>} bfactor_nstep=30; (instead of 1)
3. coordinate_outfile="bindividual1.pdb"; (not
bindividual.pdb)
4. {ComQum Print energy to file mmen5} (not mmen4)
5. set display=mmen5 end (not mmen4)
6. {ComQum: Write file bindividual1.pdb1 only if Rfree is
improved} (not bindividual.pdb1)
7. coordinate_outfile=bindividuali1.pdb1; (not
bindividual.pdb1)
8. Add the following two lines before the line (~439)
fix selection=(not(&atom_select) or &atom_fixed)
end:
{ComQum: Read in fixed atoms <=> ycomp>=2.000 (system 2)}
coordinate disposition=comp @fix.pdb
- Test all programs by running
cqxprep
Look in the files mmrun1.out and mmrun2.out for violations
- Remove unneccesary files and make a backup by running
cqxback
The following 17 (18 for UHF) files are needed to run comqumx:
basis
control
grad1
mm1.pdb1
mm3.pdb1
mmrun3
bindividual.inp
coord
mm1.mtf
mm3.mtf
mmrun1
mos
comqum.dat
fix.pdb
mm1.pdb
mm3.pdb
mmrun2
In addition, you need the structure factor file and all
parameter files for all heterocompounds.
- You can add restraints on a distance between two atoms using
(note the unusual units):
$restraints
atom1(1) atom2(1)
desired_distance_in_A force_constant_in_au
$restraints
1 3 1.00 10.0
A typical force constant is 10 a.u. (which typically gives a 0.05
pm deviation at convergence)
- Start comqumx:
/temp4/bio/Bin/comqumx -c 200.
- You do not have to calculate strain energies; these can be
directly read from the energy file, entry three.
- When the calculation is converged, run
calcwa
to get a better value of wa.
This value is inserted into the file mmrun2, in the row (row 238):
{===>} wa=-1;
Replace -1 by the desired value and rerun comqumx
/temp4/bio/Bin/comqumx -c 200.
However, it is better to run calculations with different values of
wa, and select the one with the lowest Rfree.
To run a pure CNS ComQum calculation
(i.e. ComQum with the CNS MM force field, but no crystal data):
Make the following two changes in mmrun2
- Line 238:
{===>} wa=0;
(Instead of: wa=-1;)
- Line 511:
exclude elec pvdw xref
(Instead of: exclude elec include pvdw xref)
To run a ComQum-X calculation without any MM force field (No MM)
(i.e. only crystal data; the quantum system is not corrected for the
change in junction atoms).
Make the following two changes in mmrun2
- Line 238:
{===>} wa=1.0;
(Instead of: wa=-1;)
- Line 511:
exclude elec bond angl dihe impr vdw pvdw include xref
(Instead of: exclude elec include pvdw xref)
Also make the following change in mmrun1
- Line 8:
flags exclude elec pele bond angl dihe impr vdw ? end
(Instead of: flags exclude elec pele ? end)
This is
cqxprep :
# ComQum-X Prep
# Ulf Ryde, 6/3-01
# UR 1/11-16: New version without generate1, but changemtf
instead
#
#cp /temp4/bio/Data/Cns/comqum.par .
#cp /temp4/bio/Data/comqum.par .
#echo CNS generate 1
#cns <generate1.inp >generate1.out
#grep ERR generate1.out
#grep WRN generate1.out
#grep INFO generate1.out
#grep error generate1.out
#grep warning generate1.out
#echo CNS generate 3
#cns <generate.inp >generate.out
#grep ERR generate.out
#grep WRN generate.out
#grep INFO generate.out
#grep error generate.out
#grep warning generate.out
#echo Run comqumtocns
#comqumtocns <<EOF
#x
#mm1.par
#all.par
#
#EOF
echo Run changemtf to set up the mm1.mtf, mm1.par, and mm1.pdb
files
echo from mm3.mtf, mm3.pdb, and protein_rep.param
changemtf <<EOF
mm3.mtf
t
y
mm1.mtf
protein.param
mm1.par
y
mm3.pdb
mm1.pdb
q
EOF
echo Construct dummy pdb files for 8-decimal precision
echo Enter mm1.pdb, format, 1, 3, 0, dummy, return, q
changepdb <<EOF
mm1.pdb
f
1
3
0
dum
q
EOF
echo
echo Enter mm3.pdb, format, 1, 3, 0, dummy, return, q
changepdb <<EOF
mm3.pdb
f
1
3
0
dum
q
EOF
echo CNS mmrun 1
cns <mmrun1 >mmrun1.out
grep ERR mmrun1.out
grep WRN mmrun1.out
grep INFO mmrun1.out
grep violation mmrun1.out
#page mmrun1.out
echo CNS mmrun 2
cns <mmrun2 >mmrun2.out
grep ERR mmrun2.out
grep WRN mmrun2.out
grep INFO mmrun2.out
grep error mmrun2.out
grep warning mmrun2.out
grep '>>' mmrun2.out
grep violation mmrun2.out
echo CNS mmrun 3
cns <mmrun3 >mmrun3.out
grep ERR mmrun3.out
grep WRN mmrun3.out
grep INFO mmrun3.out
grep error mmrun3.out
grep warning mmrun3.out
grep violation mmrun3.out
grep '>>' mmrun3.out
echo CNS bindividual
cns <bindividual.inp >bindividual.out
grep ERR bindividual.out
grep WRN bindividual.out
grep INFO bindividual.out
grep error bindividual.out
grep warning bindividual.out
grep violation bindividual.out
grep '>>' bindividual.out
echo CNS bindividual1
cns <bindividual1.inp >bindividual1.out
grep ERR bindividual1.out
grep WRN bindividual1.out
grep INFO bindividual1.out
grep error bindividual1.out
grep warning bindividual1.out
grep violation bindividual1.out
grep '>>' bindividual1.out
echo fixcoord1
fixcoord1_cnsin
fixcoord1_turboin
fixcoord1
fixcoord1_cnsout
# Make the files bindividual.pdb, bindividual.pdb1, and
minimize3.pdb1
cp mm3.pdb1 bindividual.pdb1
cp mm3.pdb1 minimize3.pdb1
cp mm3.pdb bindividual.pdb
This is a schematic overview of comqumx
# Start with a scf step then loop through the following four
steps
# Gradient step (gradient calculation)
rm *_[123]
grad
offsetf
cp gradient grad1
fixforce_cnsin
fixforce_turboin
fixforce
fixforce_turboout
mv gradient g1
mv grad1 gradient
# Relaxation step
relax >>output
fixcoord1_cnsin
fixcoord1_turboin
fixcoord1
fixcoord1_cnsout
# Molmech step
# if protfree then
cp mm3.pdb mm3.old
cns <mmrun3
cns <bindividual.inp
mv bindividual.pdb mm3.pdb
#endif
# Scf step (energy calculation)
cns <mmrun1
cns <mmrun2
dscf
grep 'Self energy' output > selfenergy
fixenergy_cnsin
fixenergy_turboin
fixenergy
fixenergy_turboout
checkconv
key about pdb files:
mmrun1: mm1.pdb (no output file)
mmrun2: mm3.pdb -> minimize.pdb (not used)
mmrun2: mm3.pdb -> minimize3.pdb (used by bindividual.inp)
bindividual.inp: minimize3.pdb -> bindividual.pdb (moved to
mm3.pdb)
These are the programs you need (apart from Turbomole and CNS):
fixforce_cnsin
fixforce_turboin
fixforce
fixforce_turboout
fixcoord1_cnsin
fixcoord1_turboin
fixcoord1
fixcoord1_cnsout
fixenergy_cnsin
fixenergy_turboin
fixenergy
fixenergy_turboout
comqumx
In addition, you may need some of our accessory programs:
pdbtocomqum
comqumtoturbo
comqumtocns
cqxprep
cqxback
changepdb
This is an example of a file with the quantum system
(plastocyanin)
New title
531
534-541
1229
1231-1234
1266
1269-1276
1324
1327-1334
1443
These changes are appropriate when charges and hydrogen atoms should
be included
- Protonate the quantum system. It can be done by protonate
protonate <infile >outfile
You may have to construct a protonate
entry for it.
If you have used protonate, you have to change the format of the
pdb file with changepdb, command format (select 1; then write it
out with command write).
- Run /temp4/bio/Bin/changemtf, option pdb, to construct a pdb
file (comqum.pdb) with charges for pdbtocomqum.
Old instructions
Old:
These should involve exactly the same atom types and parameters as
in the full protein, except at the junctions.
Note that the bond parameters of the junction hydrogen
atoms are fixed by the corresponding parameters for the
junction carbon atoms and junctfact (which is found in the
comqum.dat file).
Thus the force constant of the junction hydrogen atom must
be the force constant of the junction carbon atom * juncfact^2,
whereas the equilibrium parameter should be the carbon equilibrium
parameter / juncfact (i.e. it should be the ideal bond length of
that bond with that basis set, as listed in
/temp4/bio/Data/junctfactor_cns. Otherwise a spurious extra force
will be introduced:
FMM1=k(x-x0)^2
FMM2=k/jf^2(x*jf-x0*jf)^2 = k/jf^2*jf^2(x-x0)^2 =
k(x-x0)^2 = FMM1
Example (MMP):
$junction_atoms
14 11
1.38765000
bond CPB CH3E
572.053 1.5028
bond CPB HA
1101.53 1.083
1101.53=572.053*1.38765*1.38765
1.083=1.5028/1.38765
It is often wise to run a frequency calculation at a lower
level of theory to get a better forceapprox matrix.
If you did not do it in point 17 above, you can do it now with UFF.
Alternatively, if you prefer a higher level of theory (e.g.
HF/Sto-3g):
mkdir Freq
cp control coord basis energy mos alpha beta auxbasis Freq
cd Freq
change basis set to sto-3g hondo and remove dft and ri (with define)
insert at the end of the control file:
$dipgrad file=hessian
$hessian file=hessian
$vibrational normal modes file=hessian
$forcepath <path to the scratch files>
$maxcor <core memory in MB (70% of available memory is
recommended >
dscf >logd
aoforce >logf (or if it is an open-shell system: uhfforce
>logf)
hesstofoa
cp hessfoa ..
cd ..
change in the control file
$forceinit off
$forceapprox file=hessfoa
Alternative configurations
If you want to include alternative
configurations, go to CNS
home
page and set up the file alternate.inp. Set the input files
mm3.mtf and mm3.pdb and the default output files alternate.mtf and
alternate.pdb. Finally, insert the number of conformations
(typically 2) and the atoms involved using the format "atom segid
residue_number atom_name", e.g. atom S 234 NE2 or atom L 33 CG
(this list can be set up with changepdb,
command occ).
Set scale_occupancy=false in alternate.inp
Then run
cns <alternate.inp
>alternate.out
Atoms with alternate configurations will get new segids AC1, AC2,
but they will have identical coordinates (that of conformation B!).
Change the coordinates of AC2 to the desired ones and change the
occupancies back to the original values.
Finally
cp alternate.mtf mm3.mtf
cp alternate.pdb mm3.pdb
Script set up by Octav to do this:
#write altloc A from original pdb file
awk '{if(substr($0,17,1)=="A") print $0}' 4u9h > altlocA.pdb
#write altloc A from alternate.pdb (AC2 segid)
awk '{if(substr($0,74,3)=="AC2") print $0}' alternate.pdb >
altlocA_cns.pdb
#reorder altloc A from original as altloc A from alternate.pdb
changepdb<<EOF
altlocA.pdb
ror
altlocA_cns.pdb
w
EOF
#substitute coord, bf and occup in altloc CNS
awk
'NR==FNR{a[NR]=substr($0,27,40);next}{printf("%s%s%s\n",substr($0,1,26),a[FNR],substr($0,67,14));}'
altlocA.pdb altlocA_cns.pdb > altlocA_cns_right.pdb
#delete all AC2 lines from alternate.pdb and substitute them with
altlocA_cns_right.pdb (manually)
#delete missing coordinates from alternate.pdb
awk '{if(substr($0,63,4)!="0.00") print $0}' alternate.pdb >t
&& mv -f t alternate.pdb
- When the calculation is converged,
run (No, not recommended any longer)
calcwa
to get a better value of wa.
This value is inserted into the file mmrun2, in the row (row
238):
{===>} wa=-1;
Replace -1 by the desired value and rerun comqumx
/temp4/bio/Bin/comqumx -c 200.
It is better to run calculations with different values of
wa, and select the one with the lowest Rfree.
sf-convert
I found the following tool for refinement.
sf-convert
http://sw-tools.pdb.org/apps/SF-CONVERT/index.html
It seems to be able to do what we need, i.e., it converts
a set of reflections stored in PDB (1b7v-sf.cif) to the CNS format.
It is done by using the following command.
sf-convert-v1.200-prod-src/bin/sf_convert -i mmCIF -o CNS -sf 1b7v-sf.cif
This program took the first wavelength. I am not sure if there is a way
to choose which wavelength to take.
The generated file (1b7v-sf.cif.CNS) has one strange peculiarity.
The number of digits after the decimal point is 2, whereas CNS usually has 3.
However, CNS can read this file directly, as I verified.
This program can also select reflections for the test set:
sf-convert-v1.200-prod-src/bin/sf_convert -i mmCIF -o CNS -sf 1b7v-sf.cif -flag 5
Dmitri, 27/8-12
History
12/8-05
Reinserted the reading in of fix.pdb in bindividual1.inp
12/6-14
Set up files for version 1.3 of CNS.
Then, I also changed so that bindividual.inp and bindividual1.inp
writes out the pdb file only if R is decreased (no longer R_free).
I also followed the original files in that I no longer use the _rep
topology files for protein and water.
1/11-16
Use changemtf to set up mm1.mtf, mm1.par, and mm1.pdb from mm3.mtf,
mm3.pdb, and protein_rep.param.
Updated cqxprep
generate1.inp is retired.