Extending and validating PyRETIS¶

This page is for developers adding a new Monte Carlo path move or a new engine to PyRETIS, and – just as importantly – checking that the addition reproduces the same physics as the existing code. A new move or a new engine changes how paths are produced, not which path ensemble is sampled, so a correct implementation must give the same crossing probability and rate constant as the established ones, within statistical error.

PyRETIS ships a small reproducibility suite for exactly this check, in examples/validation/. The short test-easy / test-heavy runs in CI cannot afford the cycles needed to drive the rare-event statistics down, so this suite lives in examples/ and is launched manually (on a cluster) when you add or change a move or an engine.

Designing a new path move ¶

The unified path moves live in pyretis/core/moves.py. The existing moves are standard shooting (sh) and the sub-trajectory moves Stone Skipping (ss), Web Throwing (wt) and Wire Fencing (wf), each optionally in a high-acceptance variant. Their input keywords (shooting_moves, n_jumps, interface_sour, interface_cap, high_accept, sigma_v) are documented in the TIS input reference, and their output codes in Path types and rejection reasons.

When adding a move:

Implement it alongside the existing moves, give it a short two-letter move code, and make it selectable through the shooting_moves keyword.
Preserve (super-)detailed balance: the move must leave the path ensemble’s distribution unchanged. This is the correctness criterion, and it is what the validation suite measures – a move that breaks it will not reproduce the reference rate.
Document the new move code in Path types and rejection reasons.
Add a validation case for it (see below) and confirm it agrees with the other moves before relying on it.

Writing a shooting-point selector ¶

The biased shooting move (shooting_move = "bias", see Biased shooting-point selection) delegates the choice of shooting point to a ShootingPointSelector (pyretis/core/shooting_selection.py). To add your own bias – analytic, density-based, or machine-learned – subclass it and implement a single method:

from pyretis.core.shooting_selection import ShootingPointSelector

class MySelector(ShootingPointSelector):
    def weights(self, path):
        # Return a 1-D numpy array of strictly positive, finite
        # weights, ONE PER INTERIOR FRAME (path.phasepoints[1:-1]).
        ...

Contract for weights(path):

Return exactly path.length - 2 values (interior frames only; the two end points are never shooting points).
Every weight must be strictly positive and finite. Non-positive or NaN/inf weights raise rather than being silently clamped – a wrong selection weight silently biases the sampled rate, the worst failure mode for this code. (For exp-type weights, guard against underflow: a too-sharp bias that drives a frame to 0 is an error.)

The base class provides sum_weight(path) (the partition function \(W\)) and select(path, rgen), which draws one interior frame with probability \(w_i/W\) using a single uniform from the per-ensemble random generator. You do not re-implement selection or the acceptance correction; the move applies \(\min(1, W_\mathrm{old}/W_\mathrm{new})\) automatically, so any positive weights is detailed-balance correct.

Load a user selector from the input like any other pluggable class, via a module/class pair in the [shooting-selector] section:

[shooting-selector]
module = "my_selectors.py"
class = "MySelector"
# any further keys are passed to MySelector.__init__

Implementing a new engine ¶

How to subclass the engine base classes – EngineBase / MDEngine for an internal engine, or ExternalMDEngine to drive an external program such as GROMACS, LAMMPS or OpenMM – is covered, with templates, in Creating custom engines.

For correctness, the key point is the same as for moves: an engine only changes how the dynamics are generated. On a shared system (same potential / force field, temperature, timestep and thermostat) a correct engine must reproduce the same rate as the others. Make the dynamics genuinely match before treating a rate disagreement as a bug – the rate depends on the initial flux, so a mismatched thermostat or timestep, not the engine code, is the usual culprit.

Validating reproducibility ¶

The suite in examples/validation/ has two groups, both modelling the same physical system so their rates must agree:

methods/ – move equivalence on the internal engine: the same 1D double-well sampled with sh / ss / wt / wf (and the high-acceptance and infinite-swapping variants). A direct cross-check of the move implementations.
engines/ – engine equivalence with a shared force field: the same dissociation run through GROMACS / LAMMPS / OpenMM.

It is driven by a per-machine config, validation.toml – an ordered list of cases with their target cycles, plus a per-machine seed, a reverse_list toggle, and jobs (internal cases run in parallel). The first positional argument selects the action:

python run_validation.py                 # show this usage help
python run_validation.py status          # show recorded results (read-only)
python run_validation.py analyze         # (re)analyse the runs on disk
python run_validation.py run             # run every case, then analyse
python run_validation.py run sh_mt19937  # run just one case

cycles is a target total: a case that already has output is continued up to it, otherwise started fresh from seed. Use a different seed on each machine so two runs are independent and can be combined for better statistics. Each case is analysed as soon as it finishes, and the suite maintains a per-case results table in previous_results/ (tracked in git – so git diff of the table is the comparison across validation sessions; running one case updates just that case’s row). The analysis prints a rate comparison (ok / CHECK), the PCG64-vs-MT19937 RNG verdict, and writes the convergence plot rate_vs_cycles.png. See examples/validation/README.rst for the full description.

Convergence speed across moves ¶

All moves converge to the same rate – that agreement is the validation – but they do not converge at the same speed. The rate_vs_cycles.png plot shows each case’s running rate estimate against cycle number: a curve that flattens onto the common value sooner has converged faster.

This is where the suite doubles as a showcase. The sub-trajectory moves (Stone Skipping, Web Throwing, Wire Fencing [1] [2]) and the high-acceptance variants are designed to decorrelate successive paths more effectively than plain one-shot shooting, so on this system they typically reach a stable estimate in noticeably fewer cycles – which you can read straight off the plot. Running the methods/ group and looking at the overlaid curves is therefore both a correctness check and a practical guide to which move to reach for.

Adding your move or engine as a case ¶

Copy an existing case directory – e.g. methods/wf_mt19937 for a new move, or one of the engines/ cases for a new engine – keeping the same double-well (or shared force field), temperature, timestep and interfaces, so the rate is required to match.
Point the copy at your new thing (set shooting_moves to your move code, or select your engine).
Register the case in the manifest in examples/validation/run_validation.py and list it in validation.toml.
python run_validation.py run <your_case> and confirm its row in the table agrees within n_sigma once converged. A CHECK usually means not converged yet – raise its cycles and re-analyse before concluding there is a discrepancy.

References

Extending and validating PyRETIS¶

Designing a new path move¶

Writing a shooting-point selector¶

Implementing a new engine¶

Validating reproducibility¶

Convergence speed across moves¶

Adding your move or engine as a case¶