The program uses a clustering algorithm to locate water molecules in the binding site. The algorithm is explained in Abel, R., Young, T., Farid, R., Berne, B. J., Friesner, R. A. (2008) J. Am. Chem. Soc., 130, 2817-2831. DOI: 10.1021/ja0771033. The program superposes each snapshot from one or several MD simulations onto a reference pdb-file and then saves the coordinates of the water molecules within a specified radius of the ligand in the reference pdb-file. After all snapshots have been processed, the program clusters the binding site water molecules.
The usage of the program is as follows:
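The water selection step can be illustrated with a minimal Python sketch. This is not the Fortran code of thermowat; the function name waters_in_site is hypothetical, each water is assumed to be represented by a single position (for example its oxygen atom), the coordinates are assumed to be already superposed onto the reference, and the use of an inclusive cutoff is also an assumption.

```python
import math


def waters_in_site(water_positions, ligand_atoms, siterad):
    """Return the indices of waters within siterad of any ligand atom.

    Both lists hold (x, y, z) tuples in angstrom. The coordinates are
    assumed to be superposed onto the reference frame already, and each
    water is assumed to be described by one position (hypothetical choice).
    """
    selected = []
    for index, water in enumerate(water_positions):
        # One ligand atom within the cutoff is enough to select the water.
        if any(math.dist(water, atom) <= siterad for atom in ligand_atoms):
            selected.append(index)
    return selected


# Made-up coordinates, for illustration only.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
waters = [(0.0, 3.0, 0.0), (5.0, 0.0, 0.0), (9.0, 9.0, 9.0)]
print(waters_in_site(waters, ligand, 4.0))
```

The second water is more than 4 A from the first ligand atom but 3.5 A from the second, so it is still selected; the third water is too far from both.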
thermowat thermowat.in
where thermowat.in is a text file controlling the behaviour of the program.
The code is written in Fortran95 and located locally in /away/bio/Amber/Thermowat
Samuel Genheden, 2012
The control file contains the values of the variables that determine the execution of thermowat. Each variable name starts with a dollar sign ($), and the rows that follow the variable name set the variable and any associated values. Currently, the following variables can be set:
$refpdb
- the reference pdb-file used for superposing the MD snapshots. Must be set, and the pdb-file needs to have the same atom order as the snapshots, at least for the protein.
$ligand
- determines which atoms in the reference pdb-file are considered to be the reference ligand. Two integers must be given, the first and the last ligand atom, separated by a space.
$prmtop
- an Amber prmtop-file that was used to generate the MD snapshots.
$crds
- the MD trajectories. The first line after the variable name should be an integer, n, specifying the number of trajectories to process. Each of the next n lines contains two, three, or four columns. In all cases the first column specifies the name of the trajectory-file. If two columns are given, the second column specifies the last snapshot to process, and the first snapshot and the frequency are both set to 1. If three columns are given, the second and third columns specify the first and last snapshot to process, respectively, and the frequency is set to 1. If four columns are given, the second and third columns are read as in the three-column case and the fourth column is the frequency of processing.
$siterad
- the radius that determines the extent of the binding site. Any water molecule within this radius of any reference-ligand atom is considered to be a binding site water molecule.
$fitrad
- a radius that determines which residues are used to superpose the snapshots onto the reference pdb-file. All residues with at least one atom within this radius of any reference-ligand atom are used in the superposition. Does not have to be set, default = 10 A.
$outprefix
- a string that is prepended to the names of all output-files from the program. Default = pre
$R
- the gas constant. Default = 8.314 J/mol/K.
$T
- the absolute temperature. Default = 300 K.
$maxwat
- the maximum number of water molecules one expects to find in the binding site. Default = 200.
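The column rules for the trajectory lines under $crds can be sketched in Python as follows. This only illustrates the rules as described above and is not the parser used by thermowat; the function name parse_crds_entry is hypothetical, and reading the second and third columns of a four-column line as first and last snapshot is an assumption.

```python
def parse_crds_entry(line):
    """Parse one trajectory line into (filename, first, last, frequency)."""
    fields = line.split()
    if not 2 <= len(fields) <= 4:
        raise ValueError(f"expected 2 to 4 columns, got {len(fields)}: {line!r}")
    filename = fields[0]
    numbers = [int(value) for value in fields[1:]]
    if len(numbers) == 1:
        # Two columns: only the last snapshot is given.
        first, last, frequency = 1, numbers[0], 1
    elif len(numbers) == 2:
        # Three columns: first and last snapshot are given.
        first, last, frequency = numbers[0], numbers[1], 1
    else:
        # Four columns: assumed to be first, last, and frequency.
        first, last, frequency = numbers
    return filename, first, last, frequency


print(parse_crds_entry("r1_mdcrd4 40"))
print(parse_crds_entry("r1_mdcrd4 1 40"))
print(parse_crds_entry("r1_mdcrd4 1 40 2"))
```

Returning the same four-item tuple for every form keeps the defaults in one place, so later code does not need to know how many columns were given.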
$refpdb
ref.pdb
$ligand
5433 5463
$prmtop
ferr-l01.prm
$crds
1
r1_mdcrd4 1 40
$siterad
4.0
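To check a control file such as the one above before running thermowat, the layout can be read with a short Python sketch. It is a hypothetical helper, not part of thermowat: the function name parse_control is made up, and it only assumes that every line starting with $ opens a variable and that the following lines belong to it.

```python
def parse_control(text):
    """Group the lines of a control file under the preceding $variable."""
    settings = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines are skipped (an assumption)
        if line.startswith("$"):
            current = line[1:]
            settings[current] = []
        elif current is not None:
            settings[current].append(line)
        else:
            raise ValueError(f"value before any variable name: {line!r}")
    return settings


example = """$refpdb
ref.pdb
$ligand
5433 5463
$prmtop
ferr-l01.prm
$crds
1
r1_mdcrd4 1 40
$siterad
4.0
"""

settings = parse_control(example)
first_atom, last_atom = (int(value) for value in settings["ligand"][0].split())
print(first_atom, last_atom, float(settings["siterad"][0]))
```

Variables that are not present in the file, such as $fitrad here, are simply missing from the dictionary, which is where the documented defaults would apply.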
The program writes out the number of water molecules found in the binding site for each processed snapshot. At the end, it writes out a summary: the average and maximum number of water molecules found in the binding site, the number of water sites found, and the average occupancy of each site.
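The first part of this summary is plain arithmetic over the per-snapshot counts, sketched here in Python. The function name summarize_counts and the counts themselves are made up for illustration; they are not output from thermowat.

```python
def summarize_counts(counts):
    """Return (average, maximum) of the per-snapshot water counts."""
    if not counts:
        raise ValueError("no snapshots were processed")
    return sum(counts) / len(counts), max(counts)


# Hypothetical counts for four snapshots.
average, maximum = summarize_counts([12, 15, 9, 14])
print(average, maximum)
```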
The program produces three output-files. These are explained below; in all cases it is assumed that $outprefix was set to pre.