In this tutorial we will calculate the solvation free energy of an organic molecule, 2-chlorosyringaldehyde, in water. 2-chlorosyringaldehyde is a substituted benzene molecule with multiple functional groups:
You could imagine extending this process to other, drug-like small molecules or ligands of interest.
l11.pdb- the structure of the 2-chlorosyringaldehyde molecule ('l11' for short) in PDB format. Download it here.
python $PROTOMSHOME/protoms.py -s dualtopology -l l11.pdb --absolute
this sets up a dual topology simulation of the molecule in a box of TIP4P water, perturbing between the full interaction of the molecule with its surroundings and the zero interaction of a dummy atom with its surroundings.
The simulation will run 5 m equilibration steps and 40 m production steps for each of the 16 λ-values. Output will be printed to files every 100 k moves and Hamiltonian replica-exchanges between neighbouring λ-values will be attempted every 200 k moves.
Why dual topology?
For these sorts of absolute free energy calculations (i.e. 'perturbing to nothing'), dual topology simulations will generally provide smoother free energy curves and more precise free energy estimates with smaller errors, particularly at the endpoints of the perturbation. They also have the added advantage that no gas-phase simulations are needed to complete the thermodynamic cycle (see the ProtoMS manual for theory details). Although in theory these calculations can also using a single topology approach, by default protoms.py will suggest you use dual topology. Single topology simulations can be set up using the individual scripts in the tools folder if you prefer.Why
-s dualtopology argument tells the script to set up a dual topology free energy simulation and the
--absolute argument tells the script that no second ligand exists and that the ligand specified by
-l should be perturbed to a dummy particle (nothing).
--absolute, the script will expect you to input a second ligand!
Protoms.py will print out a series of information messages if successful, and error messages if there are any problems.
For more detailed information there is also a log file (called
protoms_py.log by default), which includes debug messages and defines the values of any command line flags you give to protoms.py, or their default values if you don't set any.
However, the safest way to check is to read over the created files and visualise the created system. You can read more about the files that the setup script creates further down on this page, and you can visualise the system that will be simulated with (for instance) VMD. Note the highlighted dummy atom position in green:
vmd -m l11.pdb l11_dummy.pdb l11_box.pdb
mpirun -np 16 $PROTOMSHOME/protoms3 run_free.cmdThis is most conveniently done on a computer cluster. The calculations take approximately 7 h to complete using the Iridis4 system at the University of Southampton.
out_freefolder. Let's start by evaluating whether the system is well equilibrated. We should first visualise our simulation to check nothing has gone wrong (open
out_free/lam-0.000/all.pdbin VMD for example), but we can also use the calc_series.py script to plot the time-dependence of some obvious properties of interest. Type:
python $PROTOMSHOME/tools/calc_series.py -f out_free/lam-0.000/resultsThis will bring up a wizard to look at a choice of various time series of results from the 0.000 λ-window (you can also run
calc_series.py -hto see how to define these choices from the command line). First of all, let's see how the total energy of the system changes with time. Type
totalfollowed by enter twice. This will plot the total energy as a function of simulation snapshot. It should look something like this:
calc_series.py -f out_free/lam-0.000/resultsagain and type:
total inter/solvent-solvent/sum inter/l11-solvent/sum
And then choose option 5 for displaying the graphs on the same plot, normalised to the energy of the final snapshot. You should see something like this:
Clearly the change in the total energy is dominated by the solvent-solvent interactions, while the ligand-solvent interactions remain fairly constant throughout. But what effect does this have on the forward & backward free energy differences, and hence the gradients for our TI calculations? Let's take a look, but this time we'll evaluate the trends as running averages and use the command line flags for calc_series.py:
python $PROTOMSHOME/tools/calc_series.py -f out_free/lam-0.000/results --average -s forwfe gradient
This time use option 2 for displaying the graphs as sub-plots, as they have very different scales:
So we can see that the free energy gradients seem fairly constant after around 100 snapshots. Hence we'll discard the first 100 snapshots in our free energy calculations. At this stage we're ready to begin the free energy analysis. We'll do this by Thermodynamic Integration (TI), Bennett's Acceptance Ratio (BAR) and multistate-BAR (MBAR), so we can compare whether the results are sensitive to the choice of estimator. Type:
python $PROTOMSHOME/tools/calc_dg.py -d out_free/ -l 0.25
-l flag says we should skip the first 25% of the simulation (100 snapshots) as we decided above). You'll see a table of the dV/dλ gradients followed by the total free energies for the perturbation calculated by TI, BAR and MBAR.
Remember that the free energies given are for the forwards perturbation to nothing. The hydration free energy is the transfer free energy from vacuum to solvent - to opposite of what we've just simulated.
So all we need to do is reverse the sign!
We obtained a value of around -9.3 kcal/mol with TI, -8.4 kcal/mol with BAR and -8.4 kcal/mol with MBAR. The experimental hydration free energy for 2-chlorosyringaldehyde is -7.8 kcal/mol, so although we are fairly close it seems that our computed free energy is slightly dependent on what estimator we choose...but to say for sure we would first have to run more repeats, or try out different simulation conditions.
python $PROTOMSHOME/protoms.py --fullhelp...you can see that there are many different options you can define manually to explore your simulation further! A few examples are given below.
--nequil- this controls the number of equilibration steps
--nprod- this controls the number of production steps
python $PROTOMSHOME/protoms.py -s dualtopology -l l11.pdb --absolute --nequil 10E6 --nprod 50E6you will run 10 m equilibration steps and 50 m production steps (instead of the 5 m and 40 m that is default)
python $PROTOMSHOME/tools/calc_replicapath.py -f out_free/lam-*/results -p 1.000 0.800 0.533 0.200 0.000
You'll see a plot something like this created as
See how the replicas don't traverse the whole of λ-space? Perhaps we could run for longer, or perhaps we could add more windows, and see if the exchange was improved and whether this improved the agreement of our estimators.
The argument that controls the number of λ-values is called
By typing (for instance):
python $PROTOMSHOME/protoms.py -s dualtopology -l l11.pdb --absolute --lambdas 24you will initiate 24 λ-values rather than default 16. You can also give individual λ-values to the argument. For instance:
python $PROTOMSHOME/protoms.py -s dualtopology -l l11.pdb --absolute --lambdas 0.000 0.033 0.067 0.133 0.200 0.267 0.333 0.400 0.467 0.533 0.600 0.667 0.733 0.800 0.867 0.933 0.967 1.000
will add two new λ-values at 0.033 and 0.967 to the 16 created by default.
by typing for instance:
python $PROTOMSHOME/protoms.py -s dualtopology -l l11.pdb --absolute -r 5you will create 5 input files that will create 5 output folders. But remember, you also need to execute ProtoMS 5 times with the different input files. The output folders will be named e.g.
By default, protoms.py solvates systems using the TIP4P water model. Simulations can also be performed with the TIP3P model by using the
python $PROTOMSHOME/protoms.py -s dualtopology -l l11.pdb --absolute --watmodel tip3pAlternatively you may wish to solvate the solute using a different pre-equilibrated water box, or create a box of different dimensions. This can be done using the individual solvate script in the tools section.
l11.prepi= the z-matrix and atom types of l11 in Amber format (created by Amber)
l11.frcmod= additional parameters not in GAFF (created by Amber)
l11.zmat= the z-matrix of l11 used to sample it in the MC simulation
l11.tem= the complete template (force field) file for the ligand in ProtoMS format
l11_box.pdb= the box of water solvating l11 in the simulation
l11_dummy.pdb= the dummy particle that l11 will be perturbed into
l11-dummy.tem= the combined template file for l11 and the dummy, used only in this simulation
run_free.cmd= the ProtoMS input file for the simulation
dualtopology1 1 2 synctrans syncrot softcore1 solute 1 softcore2 solute 2 softcoreparams coul 1 delta 0.2 deltacoul 2.0 power 6 soft66
lambdare 100000 0.000 0.067 0.133 0.200 0.267 0.333 0.400 0.467 0.533 0.600 0.667 0.733 0.800 0.867 0.933 1.000
l11.pdbcontains a directive, telling ProtoMS the name of the solute. The line should read
HEADER L11and can be added using whichever editor you feel most comfortable with. For example:
sed -i "1iHEADER L11" l11.pdb
python $PROTOMSHOME/tools/ambertools.py -f l11.pdb -n L11
This will execute the AmberTools programs
parmchk, creating the files
l11.frcmod, respectively. If the ligand was charged, we could specify that here using the
python $PROTOMSHOME/tools/build_template.py -p l11.prepi -f l11.frcmod -o l11.tem -n L11
This will create the files
l11.tem (the ProtoMS template file) and
l11.zmat (the ProtoMS z-matrix). It is a good idea to check the latter to see if the script has defined the molecule properly.
python $PROTOMSHOME/tools/solvate.py -b $PROTOMSHOME/data/wbox_tip4p.pdb -s l11.pdb -o l11_box.pdb
As standard this will create a box of mimimum 10 A distance between the solute and the edge of the box. The padding of the box from the solute can be increased using the
Likewise a different pre-equilibrated box can be used by specifying the location using the
-b flag. The output is written to the file
python $PROTOMSHOME/tools/make_dummy.py -f l11.pdb -o l11_dummy.pdb
l11_dummy.pdb that just contains the new particle.
python $PROTOMSHOME/tools/merge_templates.py -f l11.tem $PROTOMSHOME/data/dummy.tem -o l11-dummy.tem
l11-dummy.tem. The template file of the dummy particle is located in
Now we have all the files except the ProtoMS input file itself. As you may have noticed, this step-by-step procedure create a few files that
protoms.py does not generate, but these often contain logs of the exact procedures performed, so they may be useful to you.
python $PROTOMSHOME/tools/generate_input.py -s dualtopology -l l11.pdb l11_dummy.pdb -t l11-dummy.tem -lw l11_box.pdb --absolute -o run