Practical: Computational Reaction Rate Theory@work

Bert de Groot

This practical is largely based on a practical by Udo W. Schmitt.

Introduction

In the previous lecture we've learned about the theoretical basis of chemical reaction rate theory. In particular, we looked at Transition State Theory (TST) (or the theory of the activated complex), Kramers Theory, and at the Smoluchowski limit.

In this practical we are going to investigate in detail on these theories. To this end, we will compute molecular dynamics trajectories using the so-called Langevin equation,



which basically describes Newtonian dynamics with an additional friction term with a time-independent friction constant plus a random force that ensures proper thermalization of our system. The random force R(t) is Gaussian white noise with zero mean and obeys the fluctuation-dissipation relation,



Note that the Langevin equation also provides the fundamental basis in Kramers Theory (see previous lecture).

All calculations performed in this practical will be done with a simple molecular dynamics program written in Fortran 77/90 that solves the Langevin equation for single degree of freedom x (i.e. one particle) in a symmetric double minimum potential. In case you are interested in the details you can have a look at the source code langevin.f. To compute the rate after we have generated a trajectory we use a short analyzing program called rate.f. To start the practical download the relevant files, followed by unpacking them with the following commands

wget http://www3.mpibpc.mpg.de/groups/de_groot/compbio/p9/files/rateTheory.tar.gz
tar xzvf rateTheory.tar.gz

Then change into the new directory and compile both programs needed with

cd rateTheory
gfortran -O2 langevin.f -o langevin
gfortran -O2 rate.f -o rate

Now we are ready to run our first computation. Our first task will be to have a look at the potential V(x) that will govern the deterministic part of our Langevin dynamics. To plot the potential V(x) just start the program langevin by typing

./langevin

and then start a Linux plot program, like for example gnuplot

gnuplot
p "POTENTIAL" u 1:2 w l

Hint: Open multiple tabs in your terminal program, one for the editor, one for gnuplot, and one to run commands. This way you don't have to open and close the editor or gnuplot all the time.

(A symmetric double minimum potential should appear on the screen.) The potential is a generic example for all kinds of reactive processes that show no potential (or free energy) difference between the two metastable regions A and B. The dynamical variable x is our order parameter (or reaction coordinate, collective variable or meta-coordinate) that must be able to distinguish between region A and B. Thinking in terms of macromolecules it would be a collective coordinate like e.g. the number of native contacts, the solvent-accesible surface or the radius of gyration. You can also view it as a projection from 3N cartesian coordinates of your macromolecule at hand onto a 1D (collective) coordinate. For example in the animation of a distinct conformational transition in alanine dipeptide shown above one could imagine to use an angle or a hydrogen-bond distance as an order parameter to arrive at a reduced 1D description of the overall reactive process described by the concerted movement of all 22 atoms shown.

The input file md.in of our MD program contains the necessary parameters shown here

2000  2       # nsteps,nout
0.005         # timestep
0.2           # x(t=0)
0.0           # temperature
0.0           # friction coefficient
.F.           # compute TST and Kramers rate

that will be read by the program langevin prior to the computation. The parameters will be explained briefly in the following: nsteps are the total number of molecular dynamics steps performed, nout gives the inverse output frequency (output is written only every n'th step), timestep is the discrete time step used in the numerical integration algorithm, x(t=0) are rather self-explanatory, temperature is the reduced temperature and friction coefficient is the parameter gamma that enters the Langevin equation. Note that the program uses a reduced unit system with k_B = 1, i.e. the temperature is given in units of energy.

Ballistic trajectory

Now we can start our first simulation. Open md.in with your favorite editor and replace the content inside by the above parameters. Typing

./langevin

there should now be a file TRAJECTORY in your directory. We can display the content of this file by typing in your gnuplot tab

p "TRAJECTORY" u 1:2 w l

This should reveal the time evolution (time on the x axis) of the coordinate x (shown on the y axis) that evolves in double minimum potential according to Newton's equation of motion (Note that the friction coefficient and temperature in the input file are both zero).

The signal we obtain with the previous input setting is representative of so-called ballistic dynamics (as opposed to diffusive dynamics we will encouter later on). The particle oscillates between the two potential wells with constant total energy (try in gnuplot the following command to display the time evolution of the potential, kinetic and total energy).

p "TRAJECTORY" u 1:3 w l t 'kinetic',"TRAJECTORY" u 1:4 w l t 'potential',"TRAJECTORY" u 1:5 w l t 'total'

Turning on friction

Now we can turn on the friction term by setting gamma in the md.in input file to a finite value (e.g. 0.75) and starting the previous simulation with otherwise identical parameters. What behaviour do you see now? What is the difference to the previous example? Try to run a trajectory with the initial coordinate x located at the barrier top.

Question: Can you explain the observed behaviour?


Turning on random force (finite temperature)

Finally we are going to also turn on the random force R(t) (see Langevin equation) in order to counteract the damping that is introduced by the friction term.


Question: What different physical mechanism is mimicked by the friction term in contrast to the random force?

Set the temperature and the friction to 0.75, increase the time steps to, e.g., 10000, and rerun ./langevin. You obtain now a different time evolution of x. How many reactive events (crossing from A to B) do you observe for this short trajectory? Again try to run a trajectory with the initial coordinate x located at the barrier top and try to explain the observed behaviour.

Question: What is different compared to the previous example with R(t) = 0?

Rate vs. Temperature - the Arrhenius plot

Being done with the preliminaries we can begin with computing reaction rates from our trajectories. To do so, we have to increase the number of integration steps nsteps in md.in such that we observe a sufficiently large number of reactive events. Set, e.g., nsteps to 300000, and nout to 30. To compute the rate from our trajectory, we run a second program rate that reads in TRAJECTORY file and performs a smart counting of reactive events, i.e. how often does a the dynamical variable x move from region A to region B. We simply start the programs by typing

./langevin
./rate 

and it will give us the rate k which has dimension 1/(time) and units 1/(timestep), where timestep is the specified timestep in md.in

Next we are going to analyze whether our simple 1D model shows the expected Arrhenius behaviour, meaning that the rate is proportional to . In the previous lecture we learned that the reaction rate as a function of temperature follows a phenomenological law (Equation), which means that a plot ln(rate) vs. 1/T should reveal a straight line. To check this result we perform rate computations at four different temperatures (0.25 0.5 0.75 1.0) and from each extract the rate k. Open an editor and put the output in a file called "arrhenius.dat", with the first column the temperature and the second column the rate. Plot the combined data in gnuplot

p "arrhenius.dat" u (1./$1):(log($2)) w lp

We should see the expected Arrhenius behaviour, which means that even a system with one (classical) degree of freedom only is capable of delivering this phenomenological law observed in many complex systems.

Question: When do you expect the Arrhenius law to fail for Langevin dynamics?


Transition State Theory (TST) vs. Kramers Theory

So far in this practical we have computed our reaction rates from long trajectories that show a sufficiently large number of reactive events. While this is the most straightforward approach to compute reaction rates, it is computationally not always feasible to go via this route. If the intrinsic barrier is significantly larger than k_B T (the average thermal energy in each degree of freedom), the system will spend most of its time in region A or region B and will only rarely cross the barrier region. This is why reactive events are also called rare events. That also means we would waste our simulation time to mostly observe uninteresting (unreactive) behaviour. To resolve this issue, scientists have developed so-called rare event techniques or rate theories that allow us to compute the rate without having to run these very long trajectories.

The most simple rate theory is Transition State Theory, which tells us that the rate is given by



This means that the rate is solely determined by the potential energy profile in region A (note the range of integration), the mass of the particle, and the temperature.

We also visited Kramers Theory during the lecture which tells us that in the moderate to high friction region (gamma >> timestep/mass) there is a correction term to the TST rate



which involves gamma and the characteristic frequency omega_B of our potential in the barrier region (see potential plot above).

The program langevin will also compute the TST rate by evaluating the integral in the TST equation via a finite sum if the flag computeRate in md.in is set to .TRUE. . It also computes the frequency-dependent correction factor of Kramers Theory (see KT equation above) and outputs the information to the screen.

To compare the three approaches available for rate computation we first run a long trajectory with the following parameters:

4000000  30    # nsteps,nout
0.005         # timestep
0.2           # x(t=0)
0.75          # temperature
10            # friction coefficient
.T.           # compute TST and Kramers rate

The TST rate and the KT rate will be written to the screen. To better understand the result you can plot the Kramers prefactor as a function of gamma in gnuplot (to do so, type the command below) - this assumes that in the KT expression omega_B = 1)

p [0:10] -x/2.0 + sqrt(x*x*0.25+1)

Question: Why is the KT rate smaller than the TST rate? What is the molecular origin of the rate reduction?

Now compute the rate by counting the rate for the trajectory on the file system

./rate 

The output tells us how many MD steps are used in the analysis, the equilibrium mole fractions of region A and B, respectively, and the rate.

Systematic friction scan

Let us now systematically scan the rate as a function of friction. The script scan-gamma.sh does exactly that. Have a look at it and try to understand it, then run

./scan-gamma.sh

(Have a look at it. If it only contains 0,000 change the printf line to:
echo "$gamma" "$rate" "$kTST" "$kKramers" "$kSm" >> rate.dat
Make sure to delete rate.dat before running ./scan-gamma.sh again.)

The script writes a file rate.dat, with the following columns: gamma, rate by counting, rate by TST, rate by Kramers Theory, and the rate in the Smoluchowski limit.
You can plot it with:

p [:][0:0.2] "rate.dat" u 1:2 w l t 'rate', "rate.dat" u 1:3 w l t 'kTST', "rate.dat" u 1:4 w l t 'kKramers', "rate.dat" u 1:5 w l t 'kSmoluchowski'

Also switch to a logarithmic plot on the gamma-axis using

set logscale x
replot

Explain the finding.

Question: For which friction is TST doing more or less ok? For which frictions do Kramers Theory and the Smoluchowski limit hold? Why does Kramers fail at low friction?

Optional - Systematic temperature scan

As you noticed the last section: scripts are very useful in our business. If time is left, you can modify the script in order to compute the rate systematically as a function of temperature. You can either modify the bash script scan-gamma.sh, the Perl script scan-gamma.pl or the Python script scan-gamma.py. (First, give it a new name, of course: cp scan-gamma.xx scan-temperature.xx.)

A reasonable temperature range would be between 0.2 or 0.3 and 2. For small T, however, check in the trajectory whether you have any transitions, or whether the temperature is too low. Plot (a) the rate versus the temperature, and (b) the rate on a log scale versus 1/T (why?). At which temperature does rate theory break down? Why? (Hint: rare events?)

Within gnuplot you can also fit a straight line to the ln(k)-1/T plot. Assume that the first two columns of rate.dat contain the temperatures and the rates respectively, use

unset logscale x
replot

f(x) = a + b*x
fit f(x) "rate.dat" using (1./$1):(log($2)) via a,b
plot "rate.dat" u (1./$1):(log($2)) w lp, f(x)

Is the slope what you would expect from the barrier height? (The barrier height is written in the output of the program ./langevin.)