.. _developer-guide-extending: Extending and validating PyRETIS ================================ This page is for developers adding a new **Monte Carlo path move** or a new **engine** to |pyretis|, and -- just as importantly -- checking that the addition reproduces the **same physics** as the existing code. A new move or a new engine changes *how* paths are produced, not *which* path ensemble is sampled, so a correct implementation must give the **same crossing probability and rate constant** as the established ones, within statistical error. |pyretis| ships a small reproducibility suite for exactly this check, in ``examples/validation/``. The short ``test-easy`` / ``test-heavy`` runs in CI cannot afford the cycles needed to drive the rare-event statistics down, so this suite lives in ``examples/`` and is launched manually (on a cluster) when you add or change a move or an engine. .. contents:: Table of contents :local: Designing a new path move ------------------------- The path moves live in :file:`pyretis/core/tis.py` (the infinite-swapping variant in :file:`pyretis/core/_tis_inf.py`). The existing moves are standard shooting (``sh``) and the sub-trajectory moves Stone Skipping (``ss``), Web Throwing (``wt``) and Wire Fencing (``wf``), each optionally in a *high-acceptance* variant. Their input keywords (``shooting_moves``, ``n_jumps``, ``interface_sour``, ``interface_cap``, ``high_accept``, ``sigma_v``) are documented in the TIS input reference, and their output codes in :ref:`Path types and rejection reasons `. When adding a move: #. Implement it alongside the existing moves, give it a short two-letter **move code**, and make it selectable through the ``shooting_moves`` keyword. #. Preserve **(super-)detailed balance**: the move must leave the path ensemble's distribution unchanged. This is the correctness criterion, and it is what the validation suite measures -- a move that breaks it will not reproduce the reference rate. #. Document the new move code in :ref:`Path types and rejection reasons `. #. Add a validation case for it (see below) and confirm it agrees with the other moves before relying on it. Implementing a new engine ------------------------- How to subclass the engine base classes -- :py:class:`.EngineBase` / :py:class:`~pyretis.engines.MDEngine` for an internal engine, or :py:class:`~pyretis.engines.ExternalMDEngine` to drive an external program such as GROMACS, LAMMPS or OpenMM -- is covered, with templates, in :ref:`Creating custom engines `. For correctness, the key point is the same as for moves: an engine only changes *how* the dynamics are generated. On a shared system (same potential / force field, temperature, timestep and thermostat) a correct engine must reproduce the **same rate** as the others. Make the dynamics genuinely match before treating a rate disagreement as a bug -- the rate depends on the initial flux, so a mismatched thermostat or timestep, not the engine code, is the usual culprit. Validating reproducibility -------------------------- The suite in ``examples/validation/`` has two groups, both modelling the same physical system so their rates must agree: * ``methods/`` -- **move equivalence** on the internal engine: the same 1D double-well sampled with ``sh`` / ``ss`` / ``wt`` / ``wf`` (and the high-acceptance and infinite-swapping variants). A direct cross-check of the move implementations. * ``engines/`` -- **engine equivalence** with a shared force field: the same dissociation run through GROMACS / LAMMPS / OpenMM. It is driven by a per-machine config, ``validation.toml`` -- an ordered list of cases with their target ``cycles``, plus a per-machine ``seed``, a ``reverse_list`` toggle, and ``jobs`` (internal cases run in parallel). The first positional argument selects the action:: python run_validation.py # show this usage help python run_validation.py status # show recorded results (read-only) python run_validation.py analyze # (re)analyse the runs on disk python run_validation.py run # run every case, then analyse python run_validation.py run native_sh # run just one case ``cycles`` is a **target total**: a case that already has output is *continued* up to it, otherwise started fresh from ``seed``. Use a **different seed on each machine** so two runs are independent and can be combined for better statistics. Each case is analysed as soon as it finishes, and the suite maintains a **per-case results table** in ``previous_results/`` (tracked in git -- so ``git diff`` of the table is the comparison across validation sessions; running one case updates just that case's row). The analysis prints a rate comparison (``ok`` / ``CHECK``), the PCG64-vs-MT19937 RNG verdict, and writes the convergence plot ``rate_vs_cycles.png``. See ``examples/validation/README.rst`` for the full description. Convergence speed across moves ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ All moves converge to the **same** rate -- that agreement is the validation -- but they do not converge at the same **speed**. The ``rate_vs_cycles.png`` plot shows each case's running rate estimate against cycle number: a curve that flattens onto the common value sooner has converged faster. This is where the suite doubles as a showcase. The sub-trajectory moves (Stone Skipping, Web Throwing, Wire Fencing [#ss]_ [#wf]_) and the high-acceptance variants are designed to decorrelate successive paths more effectively than plain one-shot shooting, so on this system they typically reach a stable estimate in noticeably fewer cycles -- which you can read straight off the plot. Running the ``methods/`` group and looking at the overlaid curves is therefore both a correctness check and a practical guide to which move to reach for. Adding your move or engine as a case ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #. Copy an existing case directory -- e.g. ``methods/native_wf`` for a new move, or one of the ``engines/`` cases for a new engine -- keeping the **same** double-well (or shared force field), temperature, timestep and interfaces, so the rate is required to match. #. Point the copy at your new thing (set ``shooting_moves`` to your move code, or select your engine). #. Register the case in the manifest in :file:`examples/validation/run_validation.py` and list it in ``validation.toml``. #. ``python run_validation.py run `` and confirm its row in the table agrees within ``n_sigma`` once converged. A ``CHECK`` usually means *not converged yet* -- raise its ``cycles`` and re-analyse before concluding there is a discrepancy. .. rubric:: References .. [#ss] E. Riccardi, O. Dahlen and T. S. van Erp, *Fast decorrelating Monte Carlo moves for efficient path sampling*, J. Phys. Chem. Lett. **8**, 4456-4460 (2017), https://doi.org/10.1021/acs.jpclett.7b01617 .. [#wf] D. T. Zhang, E. Riccardi and T. S. van Erp, *Enhanced path sampling using subtrajectory Monte Carlo moves*, J. Chem. Phys. **158**, 024113 (2023), https://doi.org/10.1063/5.0127249