##### Document Text Contents

Page 1

RSC Biomolecular Sciences

Innovations in Biomolecular

Modeling and Simulations

Volume 1

Edited by Tamar Schlick

In

n

ovation

s in

B

iom

olecu

lar

M

o

d

elin

g an

d

Sim

u

lation

s

V

olu

m

e 1

Sch

lick

Page 2

Innovations in Biomolecular Modeling and Simulations

Volume 1

Page 190

To pick a re-entry point, one of the boundaries a of cell i is randomly

picked using a probability law obtained from the cell flux and steady-state

probabilities:

pboundary a of cell i ¼

WjNji=TjP

k;k 6¼iWkNki=Tk

ð7:45Þ

where boundary a is the boundary between cell i and j. Although the

implementation is different, this is similar to Warmflash et al. (2007).

1

Based on this, it appears that using two lattices is not necessary and that the

scheme correctly works with a single lattice. Also the idea of lattice is no longer

discussed in a more recent publication.

121

See also Dickson et al. (2011)

122

for

an application of this method to unfolding and refolding of RNA. In this paper

as well, a single lattice is used. Dickson and Dinner (2010)

121

present some

theoretical results regarding non-equilibrium umbrella sampling (an analysis of

the convergence of the weights using the local scheme), a comparison with and

discussion of forward flux sampling, and recent applications of these methods.

7.4.2 Reactive Trajectory Sampling

The second method, which is related in some fashion to the previous class of

techniques, can be attributed to Huber and Kim (1996).

3

In this reference, the

method is developed assuming that an approximate reaction coordinate has

been chosen. However, it is not difficult to extend this approach to a general

decomposition of the conformational space O in a manner similar to, for

example, Vanden-Eijnden and Venturoli (2009a)

63

with Voronoi cells. This

method will be discussed in more details below. It consists in running a large

number of simulations (or ‘‘walkers’’) in parallel in such a way that a given

number of walkers are maintained in each cell or macro-state. Macro-states

that are near an energy barrier will tend to be depleted and therefore a strategy

is applied to duplicate walkers in this macro-state, in a statistically correct way.

This is done by assigning statistical weights to each walker. For example a

walker with weight w can be split into two walkers, starting from the same

location in O, with weights w/2. Conversely, macro-states that are at low energy

will tend to become overcrowded and walkers are then removed. If for example

we have two walkers with weights w1 and w2, we randomly select one with

probabilities (w1/(w1þw2), w2/(w1þw2)) and assign to it the weight w1þw2.

This approach ensures an efficient sampling of phase space.

In order to calculate a reaction rate, the macro-state corresponding to region

B is transformed into a cemetery state, that is any walker that enters this macro-

state is removed from the simulation and re-inserted in region A. In this

fashion, although the simulation is effectively out of equilibrium, the popula-

tion of walkers is kept constant. This method allows computing all the relevant

quantities of interest, such as reaction rates, free energy, metastable states, etc.

We note that contrary to Markov state models, this approach does not suffer

165Computing Reaction Rates in Bio-molecular Systems Using Discrete Macro-states

Page 191

from non-Markovity errors and that in the limit of infinite sampling it provides

an exact answer.

In Zhang et al. (2007),

123

this technique was applied to explore the transition

paths ensemble in a united-residue model of calmodulin.

33,124

In Zhang et al.

(2010),

32

it is shown that the method initially developed in Huber and Kim

(1996)

3

is really applicable to a much wider class of problems and proposes

some generalizations of this procedure.

We mention that a similar technique has been applied to simulated annealing

to find minima of rough (or even fractal) functions (see Huber and McCammon

[1997]).

125

Detailed discussion of reactive trajectory sampling. We now discuss in more

details the method of Huber and Kim (1996),

3

Zhang et al. (2007, 2010)

32,123

which we rename reactive trajectory sampling method (RTS), in the broader

context of macro-state models (e.g. Voronoi cell partitioning). In this

approach, systematic errors arising from non-Markovian effects are avoided by

directly calculating reactive trajectories from A to B and obtaining the prob-

ability flux entering B (or A for the backward rate), Metzner et al. (2006).

55

When the energy barrier is high, this can be very inefficient since very few

trajectories (if any) will make it to B when started from A. However, a simple

trick allows improving the efficiency of the calculation to the extent that the

decay of the statistical errors becomes essentially independent of the energy

barrier height.

As before we split the space of possible configurations into cells. Then a large

number of random ‘‘walkers’’ are initialized and advanced forward in time. The

basic idea is to use a strategy whereby, in cells that get overcrowded (too many

walkers), we merge walkers, thereby reducing their numbers, while in cells that

are depleted (near transition regions), we split walkers to increase their number.

The end goal is to maintain a given target number of walkers in each cell. With

such an approach we are able to observe a constant stream of walkers going

from A to B (and vice versa) irrespective of the height of the energy barrier. We

now explain the details of the method.

Assume we have nw walkers whose position gets updated at each time step. It

is possible to resample from these walkers without introducing any bias in the

calculation using the following procedure. Eachwalker, whose position is denoted

xi, is assigned a probabilistic weight wi, for example initially equal to 1/nw.

A walker can be split into p walkers with weight wi/p. After the split, each walker

can be advanced independently. Averages can then be computed using:

hf i ¼ lim

nw!N

1P

jWj

X

i

wif ðxiÞ ð7:46Þ

This equation is always true, irrespective of how many times the splitting

procedure is applied, or how many steps are performed, as long as the initial

position of the walkers is drawn from the equilibrium distribution. This is

proved from the fact that the equilibrium distribution is by definition invariant

under the dynamics under consideration for x(t).

166 Chapter 7

Page 380

trajectory fragments 120–3

background to 117–20

computing rates

forward flux sampling 123,

126–9, 135–6

milestoning 121, 122, 123–6,

135–6

illustrative 2-D model

system 132–5, 136

kinetics and equilibrium

applications 129–32

transition interface sampling 143,

148–9, 150, 193–4

transition matrix 175–82

transition path sampling 118, 140,

143, 146–8

transition state theory 118, 145

tree code algorithm 93–5

Tsukuba convention 11, 13

tyrosine, UV spectrum of 4

U1 snRNA 169

umbrella sampling 45–6, 140, 144,

163–5, 194

unit cell, computational 323–6

united-residue force field

(UNRES) 6, 231

virtual screening 293–5

viruses 18, 104

HCV IRES 162–3, 165,

169–70

HDV ribozyme 143, 143–4, 146,

147

viral capsids 237–8

see also antiviral drugs; HIV

Voronoi cells 152, 156,

161, 164

Watson-Crick base pairs,

electrostatics and 56–9

weighted ensemble Brownian

dynamics see Brownian

dynamics

XRISM 53, 57

Z-DNA 61–2, 63

355Subject Index

RSC Biomolecular Sciences

Innovations in Biomolecular

Modeling and Simulations

Volume 1

Edited by Tamar Schlick

In

n

ovation

s in

B

iom

olecu

lar

M

o

d

elin

g an

d

Sim

u

lation

s

V

olu

m

e 1

Sch

lick

Page 2

Innovations in Biomolecular Modeling and Simulations

Volume 1

Page 190

To pick a re-entry point, one of the boundaries a of cell i is randomly

picked using a probability law obtained from the cell flux and steady-state

probabilities:

pboundary a of cell i ¼

WjNji=TjP

k;k 6¼iWkNki=Tk

ð7:45Þ

where boundary a is the boundary between cell i and j. Although the

implementation is different, this is similar to Warmflash et al. (2007).

1

Based on this, it appears that using two lattices is not necessary and that the

scheme correctly works with a single lattice. Also the idea of lattice is no longer

discussed in a more recent publication.

121

See also Dickson et al. (2011)

122

for

an application of this method to unfolding and refolding of RNA. In this paper

as well, a single lattice is used. Dickson and Dinner (2010)

121

present some

theoretical results regarding non-equilibrium umbrella sampling (an analysis of

the convergence of the weights using the local scheme), a comparison with and

discussion of forward flux sampling, and recent applications of these methods.

7.4.2 Reactive Trajectory Sampling

The second method, which is related in some fashion to the previous class of

techniques, can be attributed to Huber and Kim (1996).

3

In this reference, the

method is developed assuming that an approximate reaction coordinate has

been chosen. However, it is not difficult to extend this approach to a general

decomposition of the conformational space O in a manner similar to, for

example, Vanden-Eijnden and Venturoli (2009a)

63

with Voronoi cells. This

method will be discussed in more details below. It consists in running a large

number of simulations (or ‘‘walkers’’) in parallel in such a way that a given

number of walkers are maintained in each cell or macro-state. Macro-states

that are near an energy barrier will tend to be depleted and therefore a strategy

is applied to duplicate walkers in this macro-state, in a statistically correct way.

This is done by assigning statistical weights to each walker. For example a

walker with weight w can be split into two walkers, starting from the same

location in O, with weights w/2. Conversely, macro-states that are at low energy

will tend to become overcrowded and walkers are then removed. If for example

we have two walkers with weights w1 and w2, we randomly select one with

probabilities (w1/(w1þw2), w2/(w1þw2)) and assign to it the weight w1þw2.

This approach ensures an efficient sampling of phase space.

In order to calculate a reaction rate, the macro-state corresponding to region

B is transformed into a cemetery state, that is any walker that enters this macro-

state is removed from the simulation and re-inserted in region A. In this

fashion, although the simulation is effectively out of equilibrium, the popula-

tion of walkers is kept constant. This method allows computing all the relevant

quantities of interest, such as reaction rates, free energy, metastable states, etc.

We note that contrary to Markov state models, this approach does not suffer

165Computing Reaction Rates in Bio-molecular Systems Using Discrete Macro-states

Page 191

from non-Markovity errors and that in the limit of infinite sampling it provides

an exact answer.

In Zhang et al. (2007),

123

this technique was applied to explore the transition

paths ensemble in a united-residue model of calmodulin.

33,124

In Zhang et al.

(2010),

32

it is shown that the method initially developed in Huber and Kim

(1996)

3

is really applicable to a much wider class of problems and proposes

some generalizations of this procedure.

We mention that a similar technique has been applied to simulated annealing

to find minima of rough (or even fractal) functions (see Huber and McCammon

[1997]).

125

Detailed discussion of reactive trajectory sampling. We now discuss in more

details the method of Huber and Kim (1996),

3

Zhang et al. (2007, 2010)

32,123

which we rename reactive trajectory sampling method (RTS), in the broader

context of macro-state models (e.g. Voronoi cell partitioning). In this

approach, systematic errors arising from non-Markovian effects are avoided by

directly calculating reactive trajectories from A to B and obtaining the prob-

ability flux entering B (or A for the backward rate), Metzner et al. (2006).

55

When the energy barrier is high, this can be very inefficient since very few

trajectories (if any) will make it to B when started from A. However, a simple

trick allows improving the efficiency of the calculation to the extent that the

decay of the statistical errors becomes essentially independent of the energy

barrier height.

As before we split the space of possible configurations into cells. Then a large

number of random ‘‘walkers’’ are initialized and advanced forward in time. The

basic idea is to use a strategy whereby, in cells that get overcrowded (too many

walkers), we merge walkers, thereby reducing their numbers, while in cells that

are depleted (near transition regions), we split walkers to increase their number.

The end goal is to maintain a given target number of walkers in each cell. With

such an approach we are able to observe a constant stream of walkers going

from A to B (and vice versa) irrespective of the height of the energy barrier. We

now explain the details of the method.

Assume we have nw walkers whose position gets updated at each time step. It

is possible to resample from these walkers without introducing any bias in the

calculation using the following procedure. Eachwalker, whose position is denoted

xi, is assigned a probabilistic weight wi, for example initially equal to 1/nw.

A walker can be split into p walkers with weight wi/p. After the split, each walker

can be advanced independently. Averages can then be computed using:

hf i ¼ lim

nw!N

1P

jWj

X

i

wif ðxiÞ ð7:46Þ

This equation is always true, irrespective of how many times the splitting

procedure is applied, or how many steps are performed, as long as the initial

position of the walkers is drawn from the equilibrium distribution. This is

proved from the fact that the equilibrium distribution is by definition invariant

under the dynamics under consideration for x(t).

166 Chapter 7

Page 380

trajectory fragments 120–3

background to 117–20

computing rates

forward flux sampling 123,

126–9, 135–6

milestoning 121, 122, 123–6,

135–6

illustrative 2-D model

system 132–5, 136

kinetics and equilibrium

applications 129–32

transition interface sampling 143,

148–9, 150, 193–4

transition matrix 175–82

transition path sampling 118, 140,

143, 146–8

transition state theory 118, 145

tree code algorithm 93–5

Tsukuba convention 11, 13

tyrosine, UV spectrum of 4

U1 snRNA 169

umbrella sampling 45–6, 140, 144,

163–5, 194

unit cell, computational 323–6

united-residue force field

(UNRES) 6, 231

virtual screening 293–5

viruses 18, 104

HCV IRES 162–3, 165,

169–70

HDV ribozyme 143, 143–4, 146,

147

viral capsids 237–8

see also antiviral drugs; HIV

Voronoi cells 152, 156,

161, 164

Watson-Crick base pairs,

electrostatics and 56–9

weighted ensemble Brownian

dynamics see Brownian

dynamics

XRISM 53, 57

Z-DNA 61–2, 63

355Subject Index