Extending and validating PyRETIS

This page is for developers adding a new Monte Carlo path move or a new engine to PyRETIS, and – just as importantly – checking that the addition reproduces the same physics as the existing code. A new move or a new engine changes how paths are produced, not which path ensemble is sampled, so a correct implementation must give the same crossing probability and rate constant as the established ones, within statistical error.

PyRETIS ships a small reproducibility suite for exactly this check, in examples/validation/. The short test-easy / test-heavy runs in CI cannot afford the cycles needed to drive the rare-event statistics down, so this suite lives in examples/ and is launched manually (on a cluster) when you add or change a move or an engine.

Designing a new path move

The path moves live in pyretis/core/tis.py (the infinite-swapping variant in pyretis/core/_tis_inf.py). The existing moves are standard shooting (sh) and the sub-trajectory moves Stone Skipping (ss), Web Throwing (wt) and Wire Fencing (wf), each optionally in a high-acceptance variant. Their input keywords (shooting_moves, n_jumps, interface_sour, interface_cap, high_accept, sigma_v) are documented in the TIS input reference, and their output codes in Path types and rejection reasons.

When adding a move:

  1. Implement it alongside the existing moves, give it a short two-letter move code, and make it selectable through the shooting_moves keyword.
  2. Preserve (super-)detailed balance: the move must leave the path ensemble’s distribution unchanged. This is the correctness criterion, and it is what the validation suite measures – a move that breaks it will not reproduce the reference rate.
  3. Document the new move code in Path types and rejection reasons.
  4. Add a validation case for it (see below) and confirm it agrees with the other moves before relying on it.

Implementing a new engine

How to subclass the engine base classes – EngineBase / MDEngine for an internal engine, or ExternalMDEngine to drive an external program such as GROMACS, LAMMPS or OpenMM – is covered, with templates, in Creating custom engines.

For correctness, the key point is the same as for moves: an engine only changes how the dynamics are generated. On a shared system (same potential / force field, temperature, timestep and thermostat) a correct engine must reproduce the same rate as the others. Make the dynamics genuinely match before treating a rate disagreement as a bug – the rate depends on the initial flux, so a mismatched thermostat or timestep, not the engine code, is the usual culprit.

Validating reproducibility

The suite in examples/validation/ has two groups, both modelling the same physical system so their rates must agree:

  • methods/move equivalence on the internal engine: the same 1D double-well sampled with sh / ss / wt / wf (and the high-acceptance and infinite-swapping variants). A direct cross-check of the move implementations.
  • engines/engine equivalence with a shared force field: the same dissociation run through GROMACS / LAMMPS / OpenMM.

It is driven by a per-machine config, validation.toml – an ordered list of cases with their target cycles, plus a per-machine seed, a reverse_list toggle, and jobs (internal cases run in parallel). The first positional argument selects the action:

python run_validation.py                 # show this usage help
python run_validation.py status          # show recorded results (read-only)
python run_validation.py analyze         # (re)analyse the runs on disk
python run_validation.py run             # run every case, then analyse
python run_validation.py run native_sh   # run just one case

cycles is a target total: a case that already has output is continued up to it, otherwise started fresh from seed. Use a different seed on each machine so two runs are independent and can be combined for better statistics. Each case is analysed as soon as it finishes, and the suite maintains a per-case results table in previous_results/ (tracked in git – so git diff of the table is the comparison across validation sessions; running one case updates just that case’s row). The analysis prints a rate comparison (ok / CHECK), the PCG64-vs-MT19937 RNG verdict, and writes the convergence plot rate_vs_cycles.png. See examples/validation/README.rst for the full description.

Convergence speed across moves

All moves converge to the same rate – that agreement is the validation – but they do not converge at the same speed. The rate_vs_cycles.png plot shows each case’s running rate estimate against cycle number: a curve that flattens onto the common value sooner has converged faster.

This is where the suite doubles as a showcase. The sub-trajectory moves (Stone Skipping, Web Throwing, Wire Fencing [1] [2]) and the high-acceptance variants are designed to decorrelate successive paths more effectively than plain one-shot shooting, so on this system they typically reach a stable estimate in noticeably fewer cycles – which you can read straight off the plot. Running the methods/ group and looking at the overlaid curves is therefore both a correctness check and a practical guide to which move to reach for.

Adding your move or engine as a case

  1. Copy an existing case directory – e.g. methods/native_wf for a new move, or one of the engines/ cases for a new engine – keeping the same double-well (or shared force field), temperature, timestep and interfaces, so the rate is required to match.
  2. Point the copy at your new thing (set shooting_moves to your move code, or select your engine).
  3. Register the case in the manifest in examples/validation/run_validation.py and list it in validation.toml.
  4. python run_validation.py run <your_case> and confirm its row in the table agrees within n_sigma once converged. A CHECK usually means not converged yet – raise its cycles and re-analyse before concluding there is a discrepancy.

References

[1]E. Riccardi, O. Dahlen and T. S. van Erp, Fast decorrelating Monte Carlo moves for efficient path sampling, J. Phys. Chem. Lett. 8, 4456-4460 (2017), https://doi.org/10.1021/acs.jpclett.7b01617
[2]D. T. Zhang, E. Riccardi and T. S. van Erp, Enhanced path sampling using subtrajectory Monte Carlo moves, J. Chem. Phys. 158, 024113 (2023), https://doi.org/10.1063/5.0127249