pyretis.analysis package¶
This package defines analysis tools for the PyRETIS program.
The analysis tools are intended to be used for analysis of the simulation output from the PyRETIS program. The typical use of this package is in post-processing of the results from a simulation (or several simulations).
Package structure¶
Modules¶
- __init__.py
- This file, imports from the other modules. The method to analyse results from MD flux simulations is defined here since it will make use of analysis tools from energy_analysis.py and order_analysis.py.
- analysis.py (
pyretis.analysis.analysis) - General methods for numerical analysis.
- energy_analysis.py (
pyretis.analysis.energy_analysis) - Defines methods useful for analysing the energy output.
- histogram.py (
pyretis.analysis.histogram) - Defines methods useful for generating histograms.
- order_analysis.py (
pyretis.analysis.order_analysis) - Defines methods useful for analysis of order parameters.
- path_analysis.py (
pyretis.analysis.path_analysis) - Defines methods for analysis of path ensembles.
Important methods defined in this package¶
- analyse_energies (
analyse_energies()) - Analyse energy data from a simulation. It will calculate a running average, a distribution and do a block error analysis.
- analyse_flux (
analyse_flux()) - Analyse flux data from a MD flux simulation. It will calculate a running average, a distribution and do a block error analysis.
- analyse_orderp (
analyse_orderp()) - Analyse order parameter data. It will calculate a running average, a distribution and do a block error analysis. In addition, it will analyse the mean square displacement (if requested).
- analyse_path_ensemble (
analyse_path_ensemble()) - Analyse the results from a single path ensemble. It will calculate a running average of the probabilities, a crossing probability, perform a block error analysis, analyse lengths of paths, type of Monte Carlo moves and calculate an efficiency.
- match_probabilities (
match_probabilities()) - Method to match probabilities from several path simulations. Useful for obtaining the overall crossing probability.
- histogram (
histogram()) - Generates histogram, basically a wrapper around
numpy’s
histogram. - match_all_histograms (
match_all_histograms()) - Method to match histograms from umbrella simulations.
- retis_flux (
retis_flux()) - Method for calculating the initial flux for RETIS simulations.
- retis_rate (
retis_rate()) - Method for calculating the rate constant for RETIS simulations.
- construct_M (
construct_M()) - Build the partial-path (REPPTIS) MSM transition matrix from the local crossing probabilities.
- global_pcross_msm (
global_pcross_msm()) - Global crossing probability from a REPPTIS MSM transition matrix.
- mfpt_to_first_last_state (
mfpt_to_first_last_state()) - Mean first passage time between the boundary states of a REPPTIS MSM (used for the flux and rate of slow permeation processes).
-
pyretis.analysis.analyse_md_flux(crossdata, energydata, orderdata, settings)[source]¶ Analyse the output from a MD-flux simulation.
The obtained results will be returned as a convenient structure for plotting or reporting.
Parameters: - crossdata (numpy.array) – This is the data containing information about crossings.
- energydata (numpy.array) – This is the raw data for the energies.
- orderdata (numpy.array) – This is the raw data for the order parameter.
- settings (dict) – The settings for the analysis (e.g block length for error analysis) and some settings from the simulation (interfaces, time step etc.).
Returns: results (dict) – This dict contains the results from the different analysis as a dictionary. This dict can be used further for plotting or for generating reports.
List of submodules¶
pyretis.analysis.analysis module¶
Module defining functions useful in the analysis of simulation data.
Important methods defined here¶
- running_average (
running_average()) - Method to calculate a running average.
- block_error (
block_error()) - Perform block error analysis.
- block_error_corr (
block_error_corr()) - Method to run a block error analysis and calculate relative errors and correlation length.
-
pyretis.analysis.analysis.analyse_data(data, settings)[source]¶ Analyse the given data and run some common analysis procedures.
Specifically, it will:
- Calculate a running average.
- Obtain a histogram.
- Run a block error analysis.
Parameters: - data (numpy.array, 1D) – This numpy.array contains the data as a function of time.
- settings (dict) – This dictionary contains settings for the analysis.
Returns: result (dict) – This dict contains the results.
-
pyretis.analysis.analysis.block_error(data, maxblock=None, blockskip=1, weights=None)[source]¶ Perform block error analysis.
This function will estimate the standard deviation in the input data by performing a block analysis. The number of blocks to consider can be specified or it will be taken as the half of the length of the input data. Averages and variance are calculated using an on-the-fly algorithm [1].
Parameters: - data (numpy.array (or iterable with data points)) – The data to analyse.
- maxblock (int, optional) – Can be used to set the maximum length of the blocks to consider. Note that the maxblock will never be set longer than half the length in data.
- blockskip (int, optional) – This can be used to skip certain block lengths, i.e. blockskip = 1 will consider all blocks up to maxblock, while blockskip = n will consider every n’th block up to maxblock, i.e. it will use block lengths equal to 1, 1 + n, 1 + 2*n, and so on.
Returns: - blocklen (numpy.array) – These contain the block lengths considered.
- block_avg (numpy.array) – The averages as a function of the block length.
- block_err (numpy.array) – Estimate of errors as a function of the block length.
- block_err_avg (float) – Average of the error estimate using blocks where
length > maxblock//2.
References
[1] Wikipedia, “Algorithms for calculating variance”, http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
-
pyretis.analysis.analysis.block_error_corr(data, maxblock=None, blockskip=1)[source]¶ Run block error analysis on the given data.
This will run the block error analysis and return the relative errors and correlation length.
Parameters: - data (numpy.array) – Data to analyse.
- maxblock (int, optional) – The maximum block length to consider.
- blockskip (int, optional) – This can be used to skip certain block lengths, i.e. blockskip = 1 will consider all blocks up to maxblock, while blockskip = n will consider every n’th block up to maxblock, i.e. it will use block lengths equal to 1, 1 + n, 1 + 2*n, and so on.
Returns: - out[0] (numpy.array) – These contains the block lengths considered (blen).
- out[1] (numpy.array) – Estimate of errors as a function of the block length (berr).
- out[2] (float) – Average of the error estimate for blocks (berr_avg)
with
length > maxblock // 2. - out[3] (numpy.array) – Estimate of relative errors normalised by the overall average as a function of block length (rel_err).
- out[4] (float) – The average relative error (avg_rel_err), for blocks
with
length > maxblock // 2. - out[5] (numpy.array) – The estimated correlation length as a function of the block length (ncor).
- out[6] (float) – The average (for blocks with length > maxblock // 2) estimated correlation length (avg_ncor).
-
pyretis.analysis.analysis.mean_square_displacement(data, ndt=None)[source]¶ Calculate the mean square displacement for the given data.
Parameters: - data (numpy.array, 1D) – This numpy.array contains the data as a function of time.
- ndt (int, optional) – This parameter is the number of time origins. I.e. points up to
ndt will be used as time origins. If not specified the value of
the input
data.size // 5will be used.
Returns: msd (numpy.array, 2D) – The first column is the mean squared displacement and the second column is the corresponding standard deviation.
pyretis.analysis.energy_analysis module¶
Methods for analysing energy data from simulations.
Important methods defined here¶
- analyse_energies (py:func:.analyse_energies)
- Run the analysis for energies (kinetic, potential etc.).
-
pyretis.analysis.energy_analysis.analyse_energies(energies, settings)[source]¶ Run the energy analysis on several energy types.
The function will run the energy analysis on several energy types and collect the energies into a structure which is convenient for plotting the results.
Parameters: - energies (dict) – This dict contains the energies to analyse.
- settings (dict) – This dictionary contains settings for the analysis.
Returns: results (dict) – For each energy key results[key] contains the result from the energy analysis.
See also
pyretis.analysis.flux_analysis module¶
Methods for analysis of crossings for flux data.
Important methods defined here¶
- analyse_flux (
analyse_flux()) - Run analysis for simulation flux data. This will calculate the initial flux for a simulation.
-
pyretis.analysis.flux_analysis._calculate_flux(effective_cross, time_in_state, time_window, time_step)[source]¶ Calculate the flux in different time windows.
Parameters: - effective_cross (list) – The number of effective crossings, obtained from
_effective_crossings. - time_in_state (int) – Time spent in over-all state
A. - time_window (int) – This is the time window we consider for calculating the flux.
- time_step (float) – This is the time-step for the simulation.
Returns: - time (np.array) – The times for which we have calculated the flux.
- ncross (np.array) – The number of crossings within a time window.
- flux (np.array) – The flux within a time window.
- effective_cross (list) – The number of effective crossings, obtained from
-
pyretis.analysis.flux_analysis._effective_crossings(fluxdata, nint, end_step)[source]¶ Analyse flux data and obtain effective crossings.
Parameters: - fluxdata (list of tuples of ints) – The list contains the data obtained from a
md-fluxsimulation. - nint (int) – The number of interfaces used.
- end_step (int) – This is the last step done in the simulation.
Returns: - eff_cross (list of lists) – eff_cross[i] is the effective crossings times for interface i.
- ncross (list of ints) – ncross[i] is the number of crossings for interface i.
- neffcross (list of ints) – neffcross[i] is the number of effective crossings for interface i.
- time_in_state (dict) – The time spent in the different states which are labeled with the keys ‘A’, ‘B’, ‘OA’, ‘OB’. ‘O’ is taken to mean the ‘overall’ state.
Note
We do here intf - 1. This is just to be compatible with the old FORTRAN code where the interfaces are numbered 1, 2, 3 rather than 0, 1, 2. If this is to be changed in the future the -1 can just be removed. Such a change will also require changes to the formatter for flux data.
- fluxdata (list of tuples of ints) – The list contains the data obtained from a
-
pyretis.analysis.flux_analysis.analyse_flux(fluxdata, settings)[source]¶ Run the analysis on the given flux data.
This will run the flux analysis and collect the results into a structure which is convenient for plotting and reporting the results.
Parameters: - fluxdata (list of tuples of integers) – This array contains the data obtained from a MD simulation for the fluxes.
- settings (dict) – This dict contains the settings for the analysis. Note that this dictionary also needs some settings from the simulation, in particular the number of cycles, the interfaces and information about the time step.
Returns: results (dict) – This dict contains the results from the flux analysis. The keys are defined in the results variable.
-
pyretis.analysis.flux_analysis.find_crossings(order, interfaces)[source]¶ Find crossings with interfaces for given order parameter data.
Parameters: - order (numpy.array (1D)) – Order parameters, as a function of time.
- interfaces (list of floats) – The interfaces for which we will investigate crossings.
Returns: out (list of tuple) – Each tuple contains the crossings on the following form: (step, interface-number, direction), where direction = ‘+’ if the interface was crossed while moving to the right and ‘-’ if the movement was towards the left.
pyretis.analysis.histogram module¶
Histogram functions for data analysis.
This module defines some simple functions for histograms.
Important methods defined here¶
- histogram (
histogram()) - Create a histogram from given data.
- match_all_histograms (
match_all_histograms()) - Function to match histograms, for instance from an umbrella sampling simulation.
- histogram_and_avg (
histogram_and_avg()) - Create a histogram and return bins, midpoints and simple statistics.
-
pyretis.analysis.histogram._match_histograms(histo1, histo2, bin_x, overlap)[source]¶ Match two histograms given an overlapping region.
The matching is done so that the integral of the overlapping regions of the two histograms are equal.
Parameters: - histo1 (numpy.array) – The first histogram.
- histo2 (numpy.array) – The second histogram, this is the histogram that will be scaled.
- bin_x (numpy.array) – This is the bin mid-points of the histograms. Note that we assume here that histo1 and histo2 are obtained using the same number of bins and limits.
- overlap (object like list, tuple or numpy.array) – This is the overlapping region.
Returns: - out[0] (numpy.array) – A scaled version of second input histogram histo2.
- out[1] (float) – The calculated scale factor.
-
pyretis.analysis.histogram.histogram(data, bins=10, limits=(-1, 1), density=False, weights=None)[source]¶ Create a histogram of the given data.
Parameters: - data (numpy.array) – The data for making the histogram.
- bins (int, optional) – The number of bins to divide the data into.
- limits (tuple/list, optional) – The max/min values to consider.
- density (boolean, optional) – If True the histogram will be normalized.
- weights (numpy.array, optional) – Weighting factors for data.
Returns: - hist (numpy.array) – The binned counts.
- bins (numpy.array) – The edges of the bins.
- bin_mid (numpy.array) – The midpoint of the bins.
-
pyretis.analysis.histogram.histogram_and_avg(data, bins, density=True, weights=None)[source]¶ Create histogram an return bins, midpoints and simple statistics.
The simple statistics include the mean value and the standard deviation. The return structure is useful for plotting routines. The midpoints returned are the midpoints of the bins.
Parameters: - data (either 1D numpy.array or 2D numpy.array) – This is the data to create the histogram from. The eventual second dimension contains the weights.
- bins (int) – The number of bins to use for the histogram.
- density (boolean, optional) – If density is true, the histogram will be normalized.
- weights (numpy.array, optional) – Weighting factors for data. Not used if data contains them.
Returns: - out[0] (numpy.array) – The histogram (frequency) values.
- out[1] (numpy.array) – The midpoints for the bins.
- out[2] (tuple of floats) – These are some simple statistics, out[2][0] is the average out[2][1] is the standard deviation.
-
pyretis.analysis.histogram.match_all_histograms(histograms, umbrellas)[source]¶ Match several histograms from an umbrella sampling.
Parameters: - histograms (list of numpy.arrays) – The histograms to match.
- umbrellas (list of lists) – The umbrella windows used in the computation.
Returns: - histograms_s (list of numpy.arrays) – The scaled histograms.
- scale_factor (list of floats) – The scale factors.
- matched_count (numpy.array) – Count for overall matched histogram (an “averaged” histogram).
pyretis.analysis.order_analysis module¶
Module defining functions for analysis of order parameters.
Important methods defined here¶
- analyse_orderp (
analyse_orderp()) - Run a simple order parameter analysis.
-
pyretis.analysis.order_analysis.analyse_orderp(orderdata, settings)[source]¶ Run the analysis on several order parameters.
The results are collected into a structure which is convenient for plotting.
Parameters: - orderdata (numpy.arrays) – The data read from the order parameter file.
- settings (dict) – This dictionary contains settings for the analysis.
Returns: results (numpy.array) – For each order parameter key, results[key] contains the result of the analysis.
See also
NoneNote
We here (and in the plotting routines) make certain assumptions about the structure, i.e. the positions are assumed to have a specific meaning: column zero is the time, column one the order parameter and so on.
pyretis.analysis.path_analysis module¶
Methods for analysis of path ensembles.
Important methods defined here¶
- analyse_path_ensemble (
analyse_path_ensemble()) - Method to analyse a path ensemble, it will calculate crossing probabilities and information about moves etc. This method can be applied to files as well as path ensemble objects.
- analyse_path_ensemble_object (
analyse_path_ensemble_object()) - Method to analyse a path ensemble, it will calculate crossing probabilities and information about moves etc. This method is intended to work directly on path ensemble objects.
- match_probabilities (
match_probabilities()) - Match probabilities from several path ensembles and calculate efficiencies and the error for the matched probability.
- retis_flux (
retis_flux()) - Calculate the initial flux with errors for a RETIS simulation.
- retis_rate (
retis_rate()) - Calculate the rate constant with errors for a RETIS simulation.
-
pyretis.analysis.path_analysis._calc_tau(op_data, bin_edges=None, timestep=1)[source]¶ Calculate the value of tau.
This function calculates tau, a measure of time spent within a certain interval (bin) in the [0^-’] ensemble. It does so by counting the number of times the order parameter falls within this interval defined by the bin_edges. It then multiplies this count by the timestep and divides by the width of the interval. If bin_edges is None or not a 2-element list, or if the order parameter data op_data is None, it just returns 0.
Parameters: - op_data (list) – The order parameter data from the [0^-’] ensemble.
- bin_edges (list, optional) – The edges of the bin to calculate tau for. If None, it returns 0.
- timestep (float, optional) – The simulation timestep. Default is 1.
Returns: tau (float) – The value of tau.
-
pyretis.analysis.path_analysis._create_shoot_histograms(shoot_stats, bins)[source]¶ Create histograms and scale for the shoot analysis.
Parameters: - shoot_stats (dict) – This dict contains the results from the analysis of shooting moves. E.g. shoot_stats[key] contain the order parameters for the status key which can be the different statuses defined in pyretis.core.path._STATUS or ‘REJ’ (for rejected).
- bins (int) – The number of bins to use for the histograms.
Returns: - out[0] (dict) – For each possible status (‘ACC, ‘BWI’, etc.) this dict will contain a histogram as returned by the histogram function. It will also contain a ‘REJ’ key which is the concatenation of all rejections and a ‘ALL’ key which is simply all the values.
- out[1] (dict) – For each possible status (‘ACC, ‘BWI’, etc.) this dict will contain the scale factors for the histograms. The scale factors are obtained by dividing with the ‘ALL’ value.
See also
-
pyretis.analysis.path_analysis._get_path_length(path, ensemble_number)[source]¶ Return the path length for different moves.
Different moves may have a different way of obtaining the path length. (Example: time-reversal vs. shooting move).
Parameters: - path (dict) – This is the dict containing the information about the path. It can typically be obtained by iterating over the path ensemble object, e.g. with a for path in path_ensemble.get_paths():.
- ensemble_number (int) – This integer identifies the ensemble. This is used for the swapping moves in [0^-] and [0^+].
Returns: out (int) – The path length
-
pyretis.analysis.path_analysis._pcross_lambda_cumulative(orderparam, ordermin, ordermax, ngrid, weights=None, ha_weights=None)[source]¶ Obtain crossing probability as a function of the order parameter.
It will do the actual calculation of the crossing probability as a function of order parameter. It is split off from pcross_lambda since the analysis is intended to be backward compatible with the output/results from the old
TISMOL FORTRANprogram.Parameters: - orderparam (numpy.array) – Array containing the order parameters.
- ordermin (float) – Minimum allowed order parameter.
- ordermax (float) – Maximum allowed order parameter.
- ngrid (int) – This is the number of grid points.
- weights (numpy.array, optional) – The weight of each order parameter. This is used in order to count a specific order parameter more than once. If not given, the values in orderparam will be weighted equally.
- ha_weights (numpy.array, optional) – The weights of the high acceptance scheme (e.g. Stone Skipping)
Returns: - lamb (numpy.array) – The grid on which pcross is based.
- pcross (numpy.array) – Estimated crossing probability distribution.
-
pyretis.analysis.path_analysis._update_shoot_stats(shoot_stats, path)[source]¶ Update the shooting statistics with the status of the given path.
Parameters: - shoot_stats (dict) – This dict contains the results from the analysis of shooting moves. E.g. shoot_stats[key] contain the order parameters for the status key which can be the different statuses defined in pyretis.core.path._STATUS or ‘REJ’ (for rejected).
- path (dict) – This is the path information, represented as a dictionary.
Returns: out (None) – Returns None but will update shoot_stats for shooting moves.
-
pyretis.analysis.path_analysis.analyse_path_ensemble(path_ensemble, settings)[source]¶ Analyse a path ensemble.
This function will make use of the different analysis functions and analyse a path ensemble. This function is more general than the analyse_path_ensemble_object function in that it should work on both PathEnsemble and PathEnsembleFile objects. The running average is updated on-the-fly, see Wikipedia for details [wikimov].
Parameters: - path_ensemble (object like
PathEnsemble) – This is the path ensemble to analyse. - settings (dict) – This dictionary contains settings for the analysis.
We make use of the following keys:
- ngrid: The number of grid points for calculating the crossing probability as a function of the order parameter.
- maxblock: The max length of the blocks for the block error analysis. Note that this will maximum be equal the half of the length of the data, see block_error in .analysis.
- blockskip: Can be used to skip certain block lengths. A blockskip equal to n will consider every n’th block up to maxblock, i.e. it will use block lengths equal to 1, 1+n, 1+2n, etc.
- bins: The number of bins to use for creating histograms.
Returns: out (dict) – This dictionary contains the main results for the analysis which can be used for plotting or other kinds of output.
References
[wikimov] Wikipedia, “Moving Average”, https://en.wikipedia.org/wiki/Moving_average - path_ensemble (object like
-
pyretis.analysis.path_analysis.analyse_repptis_ensemble(path_ensemble, settings)[source]¶ Analyse a repptis path ensemble.
This function is a modification of analyse_path_ensemble(), which make use of the different analysis functions and analyse a repptis path ensemble.
Parameters: - path_ensemble (object like
PathEnsemble) – This is the path ensemble to analyse. - settings (dict) – This dictionary contains settings for the analysis.
We make use of the following keys:
- ngrid: The number of grid points for calculating the crossing probability as a function of the order parameter.
- maxblock: The max length of the blocks for the block error analysis. Note that this will maximum be equal the half of the length of the data, see block_error in .analysis.
- blockskip: Can be used to skip certain block lengths. A blockskip equal to n will consider every n’th block up to maxblock, i.e. it will use block lengths equal to 1, 1+n, 1+2n, etc.
- bins: The number of bins to use for creating histograms.
Returns: out (dict) – This dictionary contains the main results for the analysis which can be used for plotting or other kinds of output.
- path_ensemble (object like
-
pyretis.analysis.path_analysis.match_probabilities(path_results, detect, settings=None)[source]¶ Match probabilities from several path ensembles.
It calculates the efficiencies and errors for the matched probability.
Parameters: - path_results (list) – These are the results from the path analysis. path_results[i] contains the output from analyse_path_ensemble applied to ensemble no. i. Here we make use of the following keys from path_results[i]: * pcross: The crossing probability. * prun: The running average of the crossing probability. * blockerror: The output from the block error analysis. * efficiency: The output from the efficiency analysis.
- detect (list of floats) – These are the detect interfaces used in the analysis.
- settings (dict, optional) – This dictionary contains settings for the analysis.
Here we make use of the following keys from the
analysis section:
- ngrid: The number of grid points used per ensemble when building the crossing probability as a function of the order parameter (pcross). The grid for ensemble i spans [interfaces[1], min(orderparam.max(), max(interfaces))] with ngrid linearly spaced points; the spacing therefore depends on each ensemble’s range. The total-probability output is the concatenation of the pcross slices lambda <= detect from every ensemble, so its resolution near a given interface is set by the ngrid of the ensemble that owns that region. The default (1001) gives sub-percent resolution for typical interface spacings; increase it for finer interpolation.
- maxblock: The max length of the blocks for the block error analysis. Note that this will maximum be equal the half of the length of the data, see block_error in .analysis.
- blockskip: Can be used to skip certain block lengths. A blockskip equal to n will consider every n’th block up to maxblock, i.e. it will use block lengths equal to 1, 1+n, 1+2n, etc.
- bins: The number of bins to use for creating histograms.
Returns: results (dict) – These are results for the over-all probability and error and also some over-all TIS efficiencies.
-
pyretis.analysis.path_analysis.perm_calculations(results0, pcross, pcross_err)[source]¶ Calculate the permeability.
This function collects the relevant permeability properties from the output data. It then calculates the permeability and its error based on the values of xi, tau, and the crossing probability pcross.
Parameters: - results0 (dict) – Results from the analysis of ensemble [0^-]
- pcross (float) – Estimated crossing probability
- pcross_err (float) – Relative error in crossing probability.
Returns: - xxi (float) – The value of xi
- tau (float) – The value of tau
- perm (float) – The value of the permeability
- xi_err (float) – The error in xi
- tau_err (float) – The error in tau
- perm_err (float) – The error in permeability
-
pyretis.analysis.path_analysis.retis_flux(results0, results1, timestep)[source]¶ Calculate the initial flux for RETIS.
Parameters: - results0 (dict) – Results from the analysis of ensemble [0^-]
- results1 (dict) – Results from the analysis of ensemble [0^+]
- timestep (float) – The simulation timestep.
Returns: - flux (float) – The initial flux.
- flux_error (float) – The relative error in the initial flux.
-
pyretis.analysis.path_analysis.retis_rate(pcross, pcross_relerror, flux, flux_relerror)[source]¶ Calculate the rate constant for RETIS.
Parameters: - pcross (float) – Estimated crossing probability
- pcross_relerror (float) – Relative error in crossing probability.
- flux (float) – The initial flux.
- flux_relerror (float) – Relative error in the initial flux.
Returns: - rate (float) – The rate constant
- rate_error (float) – The relative error in the rate constant.
-
pyretis.analysis.path_analysis.skip_paths(weights, nacc, nskip)[source]¶ Skips not counted paths.
Parameters: - weights (array of integers) – The weights of the paths.
- nacc (integer) – Number of accepted paths.
- nskip (integer) – Number of paths to skip.
Returns: - new_weights (array of integers) – The new weights of the paths once skipped the nskip ones.
- new_acc (integer) – Recomputed number of accepted paths.
pyretis.analysis.repptis_analysis module¶
This module ports the local/global crossing-probability analysis of
REPPTIS partial paths from tistools (see [vervust2026loc]).
Local and global crossing probabilities for partial path TIS (REPPTIS).
A REPPTIS run samples short partial paths, each classified by where it
enters and leaves the ensemble: LML / LMR / RML / RMR
(Left/Right entry, Middle, Left/Right exit). From the
weighted counts of these path types this module computes:
- the local crossing probabilities of one ensemble
(
local_crossing_probabilities()) – the per-ensemblep_mm(LML) /p_mp(LMR) /p_pm(RML) /p_pp(RMR) that feed the REPPTIS Markov state model (pyretis.analysis.repptis_msm), and - the global crossing probability across all ensembles
(
global_crossing_probabilities_from_local()) via the closed-form recursion, an alternative to the MSM that does not need the transition-time vector.
This is a faithful port of the local/global crossing-probability part
of the tistools repptis_analysis module (the analysis of
Vervust, Wils, Safaei, Zhang and Ghysels, J. Chem. Theory Comput.
2026); the path-type weighting is preserved, only adapted to consume
the lmrs / weights arrays that
pyretis.analysis.path_analysis.analyse_repptis_ensemble()
already produces, rather than the tistools reader objects.
References
| [vervust2026loc] | W. Vervust, E. Wils, S. Safaei, D. T. Zhang and A. Ghysels, “Estimating full path lengths and kinetics from partial path transition interface sampling simulations”, J. Chem. Theory Comput. (2026), https://doi.org/10.1021/acs.jctc.5c01498. |
-
pyretis.analysis.repptis_analysis.global_crossing_probabilities_from_local(p_minus_plus, p_minus_minus, p_plus_plus, p_plus_minus)[source]¶ Return global crossing probabilities from the local ones.
Combine the per-ensemble local crossing probabilities into the global crossing probabilities using the closed-form REPPTIS recursion
\[P^+_j = \frac{p^{LMR}_{j-1} P^+_{j-1}} {p^{LMR}_{j-1} + p^{LML}_{j-1} P^-_{j-1}}, \quad P^-_j = \frac{p^{RML}_{j-1} P^-_{j-1}} {p^{LMR}_{j-1} + p^{LML}_{j-1} P^-_{j-1}},\]with \(P^+_0 = P^-_0 = 1\).
P_crossis the corresponding TIS probability of crossing interfacei + 1.Parameters: - p_minus_plus (sequence of float) – Per-ensemble local probability of type
LMR(forward crossing). - p_minus_minus (sequence of float) – Per-ensemble local probability of type
LML(return left). - p_plus_plus (sequence of float) – Per-ensemble local probability of type
RMR(return right). - p_plus_minus (sequence of float) – Per-ensemble local probability of type
RML(cross left).
Returns: - p_min (list of float) – Per ensemble
i, the probability of reaching state A before interfacei + 1given a crossing ofi. - p_plus (list of float) – Per ensemble
i, the probability of reaching interfacei + 1before A given a crossing ofi. - p_cross (list of float) – Per ensemble
i, the TIS probability of crossingi + 1.
Notes
If any input contains
nan, or a denominator vanishes, the function returns[nan, nan, nan]– matching the upstream behaviour, so a degenerate ensemble does not silently produce a spurious finite rate.- p_minus_plus (sequence of float) – Per-ensemble local probability of type
-
pyretis.analysis.repptis_analysis.local_crossing_probabilities(lmrs, weights, tr=False)[source]¶ Return the local crossing probabilities of one REPPTIS ensemble.
The accepted partial paths of an
[i^+-]/[0^+-']ensemble are classified by type (RMR/RML/LMR/LML) and weighted by their Monte-Carlo weights. The local crossing probability of a type is its weight divided by the total weight of paths entering from the same side: paths entering from the left (LMR+LML) and from the right (RMR+RML) are normalised separately. This routine is not for the[0^-']ensemble.Parameters: - lmrs (sequence of string) – One
"LMR"-style type tag per accepted path (the joined left/middle/right interface flags). Tags that are not one ofPATH_TYPES(partial*tags) contribute zero weight. - weights (sequence of int or float) – The Monte-Carlo weight of each accepted path (same length and
order as
lmrs). - tr (boolean, optional) – If True, apply infinite time-reversal reweighting: double the
RMRandLMLweights and symmetriseRML/LMR. Default is False.
Returns: p (dict) – Local crossing probabilities keyed by path type (
RMR/RML/LMR/LML), plus the milestoning arrival probabilities2R/2L. A probability isnanwhen its normalising weight is zero.- lmrs (sequence of string) – One
-
pyretis.analysis.repptis_analysis.path_type_taus(orders_list, lmrs, weights, interfaces)[source]¶ Return the weighted-average transition times per path type.
For every accepted path the total interior time (
tau_total()), the time before the middle crossing (tau_before_middle()) and the time after it (tau_after_middle()) are computed and weight-averaged per path type. The middle (recrossing) timetaumis the remaindertau - tau1 - tau2. These per-type, per-ensemble averages feed the REPPTIS MSM mean-first-passage-time vector (pyretis.analysis.repptis_msm.construct_tau_vector()).Parameters: - orders_list (sequence of numpy.ndarray) – One order-parameter trajectory per accepted path.
- lmrs (sequence of string) – The path type of each accepted path (same length/order).
- weights (sequence of int or float) – The Monte-Carlo weight of each accepted path.
- interfaces (sequence of float) – The ensemble’s
(left, middle, right)interfaces.
Returns: dict – Maps each path type (
LML/LMR/RML/RMR) to a dict with keystau/tau1/tau2/taum(the weighted averages, ornanwhen the type is unsampled).
-
pyretis.analysis.repptis_analysis.tau_after_middle(orders, ptype, interfaces)[source]¶ Return the steps a path takes after its last middle crossing.
The mirror of
tau_before_middle(), measured from the end of the path inward: for a path that exits left (LML/RML) from the left interface, for a path that exits right (LMR/RMR) from the right interface.Parameters: - orders (numpy.ndarray) – The per-phase-point order parameters of the path.
- ptype (string) – The path type (
LMR/LML/RML/RMR). - interfaces (sequence of float) – The ensemble’s
(left, middle, right)interfaces.
Returns: int – The number of steps after the last middle crossing.
Raises: ValueError – If the path type is not recognised.
-
pyretis.analysis.repptis_analysis.tau_before_middle(orders, ptype, interfaces)[source]¶ Return the steps a path takes to first cross the middle interface.
For a left-entering path (
LMR/LML) this is the number of steps between the first crossing of the left interfaceinterfaces[0]and the first crossing of the middle interfaceinterfaces[1]; for a right-entering path (RML/RMR) it is measured from the right interfaceinterfaces[2]inward.Parameters: - orders (numpy.ndarray) – The per-phase-point order parameters of the path; column 0 is the main order parameter.
- ptype (string) – The path type (
LMR/LML/RML/RMR). - interfaces (sequence of float) – The ensemble’s
(left, middle, right)interfaces.
Returns: int – The number of steps before the middle interface is crossed.
Raises: ValueError – If the path type is not recognised.
-
pyretis.analysis.repptis_analysis.tau_total(orders, ptype, interfaces)[source]¶ Return the path length between its first and last boundary crossing.
The total time the path spends inside the ensemble, excluding the leading segment before the entry crossing and the trailing segment after the exit crossing. Equals
tau_before_middle+tau_after_middle+ the middle (recrossing) time.Parameters: - orders (numpy.ndarray) – The per-phase-point order parameters of the path.
- ptype (string) – The path type (
LMR/LML/RML/RMR). - interfaces (sequence of float) – The ensemble’s
(left, middle, right)interfaces.
Returns: int – The number of interior steps of the path.
Raises: ValueError – If the path type is not recognised.
pyretis.analysis.repptis_msm module¶
Markov-state-model kinetics for partial path TIS (REPPTIS).
REPPTIS samples short partial paths, so the crossing probability, mean first passage time (MFPT), flux and rate cannot be read off the path ensembles directly the way they are for full RETIS. This module reconstructs them by building a Markov state model (MSM) over the partial-path ensembles from the local crossing probabilities and the per-ensemble transition times, following the closed-form analysis of Vervust, Wils, Safaei, Zhang and Ghysels [vervust2026] (building on the permeability-from-(RE)TIS formalism of Ghysels, Roet, Davoudi and van Erp [ghysels2021], and the REPPTIS replica-exchange scheme of Vervust, Zhang, van Erp and Ghysels [vervust2023]).
This is a faithful port of the repptis_msm module from the
tistools analysis package (Elias W., May 2025); the numerical
construction is preserved verbatim, only adapted to the PyRETIS coding
conventions.
Note on the index convention¶
In the paper, N + 1 is the number of interfaces, while in this
code N is the number of interfaces directly. Both yield the same
state-space size NS = 4 * N - 5 (paper: 4 * (N - 1) - 1).
Important methods defined here¶
- construct_M (
construct_M()) - Build the PPTIS transition matrix from the local crossing probabilities.
- construct_M_milestoning (
construct_M_milestoning()) - Build the milestoning transition matrix.
- global_pcross_msm (
global_pcross_msm()) - Global crossing probability from the transition matrix.
- mfpt_to_absorbing_states (
mfpt_to_absorbing_states()) - Mean first passage time to a set of absorbing states.
- mfpt_to_first_last_state (
mfpt_to_first_last_state()) - Mean first passage time to the first or last state.
- construct_tau_vector (
construct_tau_vector()) - Flatten the per-path-type transition times into the MSM vector.
References
| [vervust2026] | W. Vervust, E. Wils, S. Safaei, D. T. Zhang and A. Ghysels, “Estimating full path lengths and kinetics from partial path transition interface sampling simulations”, J. Chem. Theory Comput. (2026), https://doi.org/10.1021/acs.jctc.5c01498 (arXiv:2602.12835). |
| [ghysels2021] | A. Ghysels, S. Roet, S. Davoudi and T. S. van Erp, “Exact non-Markovian permeability from rare event simulations”, Phys. Rev. Research 3, 033068 (2021), https://doi.org/10.1103/PhysRevResearch.3.033068. |
| [vervust2023] | W. Vervust, D. T. Zhang, T. S. van Erp and A. Ghysels, “Path sampling with memory reduction and replica exchange to reach long permeation timescales”, Biophys. J. 122, 2960-2972 (2023), https://doi.org/10.1016/j.bpj.2023.02.021. |
-
pyretis.analysis.repptis_msm.check_valid_indices(M, absor, kept)[source]¶ Validate the absorbing and non-absorbing state indices.
Ensure that the indices of absorbing (
absor) and non-absorbing (kept) states are correctly defined, in range, and partition the state space (each state in exactly one of the two sets).Parameters: - M (numpy.ndarray) – The transition matrix of the Markov process.
- absor (array-like) – Indices of absorbing states.
- kept (array-like) – Indices of non-absorbing (nonboundary) states.
Raises: ValueError – If there are duplicate indices in
absororkept, if any index is out of bounds, or if any state is neither inabsornor inkept(or in both).
-
pyretis.analysis.repptis_msm.construct_M(p_mm, p_mp, p_pm, p_pp, N)[source]¶ Construct the PPTIS transition matrix M.
The matrix describes the probabilities of transitioning between states in the PPTIS framework, built from the local crossing probabilities for the different path types (LML, LMR, RML, RMR).
Parameters: - p_mm (list of float) – Local crossing probabilities for the LML (Left-to-Left) path
type. Must have length
N - 1. - p_mp (list of float) – Local crossing probabilities for the LMR (Left-to-Right) path
type. Must have length
N - 1. - p_pm (list of float) – Local crossing probabilities for the RML (Right-to-Left) path
type. Must have length
N - 1. - p_pp (list of float) – Local crossing probabilities for the RMR (Right-to-Right) path
type. Must have length
N - 1. - N (int) – The number of interfaces. Must be at least 3.
Returns: M (numpy.ndarray) – The transition matrix of shape
(NS, NS), whereNS = 4 * N - 5.Raises: ValueError – If any of the input constraints are not met (e.g. invalid lengths or values).
- p_mm (list of float) – Local crossing probabilities for the LML (Left-to-Left) path
type. Must have length
-
pyretis.analysis.repptis_msm.construct_M_N3(p_mm, p_mp, NS)[source]¶ Construct the PPTIS transition matrix M for the case N=3.
Build the transition matrix when there are exactly 3 interfaces.
Parameters: - p_mm (list of float) – Local crossing probabilities for the LML (Left-to-Left) path type. Must have length 2.
- p_mp (list of float) – Local crossing probabilities for the LMR (Left-to-Right) path type. Must have length 2.
- NS (int) – The dimension of the transition matrix. Must satisfy
NS = 4 * N - 5withN = 3.
Returns: M (numpy.ndarray) – The transition matrix of shape
(NS, NS).Raises: ValueError – If the input probability lists do not have the correct length or if
NSis invalid.
-
pyretis.analysis.repptis_msm.construct_M_milestoning(p_min, p_plus, N)[source]¶ Construct the transition matrix M for milestoning PPTIS.
Build a transition matrix for a milestoning framework. The states are lambda0-, lambda0+, lambda1, …, lambda(N-1) = B.
Parameters: - p_min (list of float) – Probabilities of transitioning to the previous milestone. Must
have length
N - 1. - p_plus (list of float) – Probabilities of transitioning to the next milestone. Must
have length
N - 1. - N (int) – The number of interfaces (milestones). Must be at least 3.
Returns: M (numpy.ndarray) – The transition matrix of shape
(NS, NS), whereNS = N + 1.Raises: ValueError – If
Nis less than 3 or if the lengths ofp_minandp_plusare notN - 1.- p_min (list of float) – Probabilities of transitioning to the previous milestone. Must
have length
-
pyretis.analysis.repptis_msm.construct_tau_vector(N, NS, taumm, taump, taupm, taupp)[source]¶ Construct the flattened vector of PPTIS transition times.
The per-path-type transition times (LML, LMR, RML, RMR) are organised into a single flattened vector following the PPTIS ensemble structure.
Parameters: - N (int) – The number of ensembles. Must be at least 3.
- NS (int) – The expected size of the output vector. Must satisfy
NS = 4 * N - 5. - taumm (list of float) – Transition times for the LML (Left-to-Left) path type. Must
have length
N. - taump (list of float) – Transition times for the LMR (Left-to-Right) path type. Must
have length
N. - taupm (list of float) – Transition times for the RML (Right-to-Left) path type. Must
have length
N. - taupp (list of float) – Transition times for the RMR (Right-to-Right) path type. Must
have length
N.
Returns: tau (numpy.ndarray) – A flattened vector of transition times with length
NS.Raises: ValueError – If any of the input constraints are not met (e.g. invalid lengths or values).
-
pyretis.analysis.repptis_msm.create_labels_states(N)[source]¶ Generate labels for absorbing and non-absorbing states.
Create two lists of state labels:
labels1for the two absorbing states (0-andB), andlabels2for theN-2non-absorbing states.Parameters: N (int) – The total number of states. Must be at least 3. Returns: - labels1 (list of str) – Labels for the two absorbing states.
- labels2 (list of str) – Labels for the
N-2non-absorbing states.
Raises: ValueError – If Nis less than 3.
-
pyretis.analysis.repptis_msm.create_labels_states_all(N)[source]¶ Generate labels for all states in sequential order.
Create a single list of state labels: the absorbing state
0-, all non-absorbing states, then the absorbing stateB.Parameters: N (int) – The total number of states. Must be at least 3. Returns: labels (list of str) – Labels for all Nstates in sequential order.Raises: ValueError – If Nis less than 3.
-
pyretis.analysis.repptis_msm.get_pieces_matrix(M, absor, kept)[source]¶ Partition a transition matrix into absorbing/nonboundary blocks.
Split
Minto four submatrices:Mp(nonboundary to nonboundary),D(nonboundary to absorbing),E(absorbing to nonboundary), andM11(absorbing to absorbing).Parameters: - M (numpy.ndarray) – The transition matrix of the Markov process.
- absor (array-like) – Indices of absorbing states.
- kept (array-like) – Indices of non-absorbing (nonboundary) states.
Returns: - Mp (numpy.ndarray) – Submatrix for nonboundary-to-nonboundary transitions.
- D (numpy.ndarray) – Submatrix for nonboundary-to-absorbing transitions.
- E (numpy.ndarray) – Submatrix for absorbing-to-nonboundary transitions.
- M11 (numpy.ndarray) – Submatrix for absorbing-to-absorbing transitions.
Raises: ValueError – If
absorandkeptcontain invalid or overlapping indices.
-
pyretis.analysis.repptis_msm.get_pieces_vector(vec, absor, kept)[source]¶ Partition a vector into absorbing/nonboundary sub-vectors.
Split
vecintov1(elements at absorbing-state indices) andv2(elements at nonboundary-state indices).Parameters: - vec (numpy.ndarray) – A 1D array of state values.
- absor (array-like) – Indices of absorbing states.
- kept (array-like) – Indices of non-absorbing (nonboundary) states.
Returns: - v1 (numpy.ndarray) – A column vector (
len(absor) x 1) of elements atabsor. - v2 (numpy.ndarray) – A column vector (
len(kept) x 1) of elements atkept.
Raises: ValueError – If
absorandkeptcontain invalid or overlapping indices, ifvecis not 1D, or ifvecdoes not have the expected lengthlen(absor) + len(kept).
-
pyretis.analysis.repptis_msm.global_pcross_msm(M, doprint=False)[source]¶ Compute global crossing probabilities in a Markov process.
Calculate the probability of reaching state
-1before state0under different conditions, using the transition matrix. The function solves a linear system rather than inverting matrices, for better numerical stability.Parameters: - M (numpy.ndarray) – A square transition matrix representing the Markov process. Must have at least 3 states.
- doprint (bool, optional) – If True, log intermediate computation details. Default False.
Returns: - z1 (numpy.ndarray) – A 2-element array with crossing probabilities for states 0 and -1.
- z2 (numpy.ndarray) – An
(NS-2)-element array with crossing probabilities for the other states. - y1 (numpy.ndarray) – A 2-element array with adjusted crossing probabilities for
states 0 and -1.
y1[0]is the global crossing probability from state 0 to -1, given that the process starts at 0 and leaves it. - y2 (numpy.ndarray) – An
(NS-2)-element array with adjusted crossing probabilities for the other states.
Raises: ValueError – If the transition matrix has fewer than three states.
-
pyretis.analysis.repptis_msm.mfpt_to_absorbing_states(M, tau1, taum, tau2, absor, kept, doprint=False, remove_initial_m=True)[source]¶ Compute the mean first passage time (MFPT) to absorbing states.
Calculate the MFPT to reach any of the specified absorbing states, both unconditionally (
g) and conditionally on leaving the current state (h). The system is solved without explicit matrix inversion, for numerical stability.Parameters: - M (numpy.ndarray) – The transition matrix of the Markov process.
- tau1 (numpy.ndarray) – Time before the first visit to an absorbing state.
- taum (numpy.ndarray) – Time spent between the first and last visit to an absorbing state.
- tau2 (numpy.ndarray) – Time after the last visit to an absorbing state.
- absor (list or numpy.ndarray) – Indices of absorbing states.
- kept (list or numpy.ndarray) – Indices of nonboundary (non-absorbing) states.
- doprint (bool, optional) – If True, log intermediate computation details. Default False.
- remove_initial_m (bool or str, optional) – If True or
"m", the middle part (taum) is removed from the initial state’s MFPT. Default True.
Returns: - g1 (numpy.ndarray) – Array of size
(n_absorb, 1)with unconditional MFPTs for the absorbing states. - g2 (numpy.ndarray) – Array of size
(n_kept, 1)with unconditional MFPTs for the nonboundary states. - h1 (numpy.ndarray) – Array of size
(n_absorb, 1)with conditional MFPTs for the absorbing states. - h2 (numpy.ndarray) – Array of size
(n_kept, 1)with conditional MFPTs for the nonboundary states.
Raises: ValueError – If the transition matrix has fewer than three states, or if the absorbing and nonboundary states do not partition the state space.
-
pyretis.analysis.repptis_msm.mfpt_to_first_last_state(M, tau1, taum, tau2, doprint=False)[source]¶ Compute the MFPT to reach either state 0 or state -1.
Calculate the MFPT to reach the first (0) or last (
NS-1) state by treating these two states as absorbing. The key result ish1[0], the MFPT from state 0 to either 0 or -1, given that the process leaves state 0. Callsmfpt_to_absorbing_states()withremove_initial_m="m"so the intermediate passage time is excluded.Parameters: - M (numpy.ndarray) – The transition matrix of the Markov process.
- tau1 (numpy.ndarray) – Time before the first visit to an absorbing state.
- taum (numpy.ndarray) – Time spent between the first and last visit to an absorbing state.
- tau2 (numpy.ndarray) – Time after the last visit to an absorbing state.
- doprint (bool, optional) – If True, log intermediate computation details. Default False.
Returns: - g1 (numpy.ndarray) – Unconditional MFPT for the absorbing states (0 and
NS-1). - g2 (numpy.ndarray) – Unconditional MFPT for the nonboundary states.
- h1 (numpy.ndarray) – Conditional MFPT for the absorbing states.
- h2 (numpy.ndarray) – Conditional MFPT for the nonboundary states.
Raises: ValueError – If the transition matrix has fewer than three states.
-
pyretis.analysis.repptis_msm.print_all_tau(pathensembles, taumm, taump, taupm, taupp)[source]¶ Print all tau values for each path ensemble.
Parameters: - pathensembles (list) – List of path ensemble objects, each with a
.nameattribute. - taumm (array-like) – First tau metric for each path.
- taump (array-like) – Second tau metric for each path.
- taupm (array-like) – Third tau metric for each path.
- taupp (array-like) – Fourth tau metric for each path.
Raises: ValueError – If the input arrays do not have the same length as
pathensembles.- pathensembles (list) – List of path ensemble objects, each with a
-
pyretis.analysis.repptis_msm.print_vector(g, states=None, sel=None)[source]¶ Print a vector
gwith the corresponding state labels.Parameters: - g (array-like) – The vector to print.
- states (list of str, optional) – Labels corresponding to each state. If None, indices are used.
- sel (list of int, optional) – Indices of the selected states to print. If None, all states are printed.
Raises: ValueError – If
selis provided but does not match the length ofg.
pyretis.analysis.wham_analysis module¶
WHAM crossing-probability analysis for infinite-swapping RETIS.
This module computes per-interface and total crossing probabilities, the
initial flux, and the rate constant from the path-data file written by
the infinite-swapping (replica-exchange) sampler
(pyretis.simulation.repex.write_path_ensemble_data(); the file is
named infswap_data.txt).
The estimator follows the standard weighted-histogram (WHAM)
crossing-probability procedure for transition interface sampling: the
high-acceptance (HA) weight unweighting, the eta normalisation, the
Q-factor WHAM stitch (Lervik et al., J. Comput. Chem. 2015), a
point-matching cross-check and a block-error analysis. The
implementation returns its results as values rather than writing report
files.
Data-file column layout (after str.split on a data line):
0– path index1– path length2– maximum order parameter (lambda_max)3 .. 3 + nintf - 1–Cxy(fractional sampling occurrence) for ensembles[0-],[0+],[1+], …3 + nintf .. 3 + 2*nintf - 1– the high-acceptance (HA) weights for the same ensembles.
Here nintf is the number of interfaces and i0plus = 4 is the column
of the [0+] ensemble.
-
pyretis.analysis.wham_analysis.INFSWAP_DATA_NAMES= ('infswap_data.txt', 'infretis_data.txt')¶ Names of the infinite-swapping data file, newest first.
infswap_data.txtis the current name;infretis_data.txtis the pre-debrand legacy name, still produced by runs whoserestart.tomlpersists the olddata_file(so a restart started before the rename keeps writing the old name).
-
pyretis.analysis.wham_analysis._default_lamres(interfaces)[source]¶ Pick a default order-parameter resolution from the interfaces.
The WHAM stitch places every interface on a uniform grid of step
lamresviaround((lambda - lambda_A) / lamres). To keep adjacent interfaces on distinct grid points the step must be small relative to the smallest interface spacing – not the first one. Using the first gap (the historical default) silently collapses two interfaces onto the same grid index whenever a later gap is smaller, which corrupts the per-interface crossing probabilities. We therefore take one tenth of the smallest spacing.Parameters: interfaces (list of float) – Strictly increasing interface positions. Returns: lamres (float) – The chosen resolution.
-
pyretis.analysis.wham_analysis._load_run_config(directory='.')[source]¶ Load a run’s
restart.toml(orinfswap.toml) as a dict.Returns the parsed config from the first of
restart.toml/infswap.tomlfound indirectory, orNoneif neither exists.
-
pyretis.analysis.wham_analysis._rec_blocks(reduced)[source]¶ Recover block averages from a reduced running-average array.
-
pyretis.analysis.wham_analysis._running_flux(matrix, interval=1.0)[source]¶ Running-average conventional flux, one entry per path.
The
[0-]/[0+]path lengths are in simulation steps;interval(the time per step,timestep * subcycles) converts the running flux to a flux per unit time. Seewham_crossing_probability()for theintervalconvention.
-
pyretis.analysis.wham_analysis._running_lengths(matrix)[source]¶ Weighted mean path length in the [0-] and [0+] ensembles.
-
pyretis.analysis.wham_analysis._running_pm(matrix, interfaces, lamres)[source]¶ Compute the running point-matching total crossing probability.
Running point-matching total crossing probability: at each path the product over ensembles of
P_A(lambda_{i+1} | lambda_i)evaluated from the local crossing probabilities accumulated so far.
-
pyretis.analysis.wham_analysis._unweight_matrix(matrix, nintf)[source]¶ Unweight the
Cxycolumns by the HA-weights, in place.Standard WHAM unweighting step: each
Cxyvalue is divided by its HA-weight, the (now redundant) HA-weight column is overwritten with the running sum of the pre-unweightingCxy(needed for the running WHAM averages), and finally eachCxycolumn is divided by the average inverse HA-weight so theetavalues are comparable across ensembles that use different moves (e.g.[0+]shooting vs[i+]wire fencing).Parameters: - matrix (list of list of float) – The data matrix from
read_data_matrix(); modified in place. - nintf (int) – Number of interfaces.
- matrix (list of list of float) – The data matrix from
-
pyretis.analysis.wham_analysis._validate_lamres(interfaces, lamres)[source]¶ Check that
lamresresolves every interface separately.Raises if two interfaces round to the same grid index (the stitch would double-count an ensemble) and warns if an interface does not sit exactly on the grid (the per-interface crossing probability is then read at the nearest grid point; the interface is mis-placed by at most
lamres / 2in lambda, and the probability error that induces depends on the local crossing-probability slope). Failing loud here is deliberate: a wrong crossing probability is worse than a refusal to run.Parameters: - interfaces (list of float) – Strictly increasing interface positions.
- lamres (float) – Candidate order-parameter resolution.
-
pyretis.analysis.wham_analysis._wham_pq(n_plus_ens, interfaces, lamres, eta, v_alpha)[source]¶ WHAM crossing probabilities at the interfaces and Q-factors.
WHAM crossing probabilities at the interfaces and Q-factors (Q-factor recursion of Lervik et al., JCTC 2015). Returns
P(crossing probability at each interface, withlambda_Bappended) andQ(the per-ensemble normalisation factors forv_alpha).
-
pyretis.analysis.wham_analysis._wham_ptot_run(interfaces, ploc_matrix, sum_pxy)[source]¶ Running-average total crossing probability via WHAM.
ploc_matrix[j][i]isP_A(lambda_i | lambda_j)from the[j+]ensemble using the data so far.
-
pyretis.analysis.wham_analysis.analyse_wham_output(data_file, interfaces, lamres=None, nskip=0, minblocks=5, interval=1.0)[source]¶ Run the standard WHAM analysis on an infinite-swapping run.
Parameters: - data_file (str) – Path to
infswap_data.txt. - interfaces (sequence of float) – The interface positions
[lambda_0, ..., lambda_B](from the run’srestart.toml/ config). - lamres (float, optional) – Order-parameter resolution (see
wham_crossing_probability()). - nskip (int, optional) – Number of initial records to skip.
- minblocks (int, optional) – Minimum number of blocks for the block-error analysis.
- interval (float, optional) – Simulation time per recorded step (
timestep * subcycles), used to convert the flux to a rate per unit time. Seewham_crossing_probability(); read it from the run config withread_engine_interval(). The default1.0yields a rate per step.
Returns: results (dict) – The dictionary returned by
wham_crossing_probability(), pluspath_lengths(thepath_length_statistics()dict).- data_file (str) – Path to
-
pyretis.analysis.wham_analysis.detect_infswap_output(directory='.')[source]¶ Check if a directory contains infinite-swapping sampler output.
Both the current
infswap_data.txtand the legacyinfretis_data.txtare recognised, so runs created (or restarted) before the data-file rename are still analysed; the current name takes precedence when both are present.Parameters: directory (str) – Path to check. Returns: data_file (str or None) – Path to the infinite-swapping data file if found, else None.
-
pyretis.analysis.wham_analysis.path_length_statistics(matrix)[source]¶ Compute path-length statistics from a data matrix.
Parameters: matrix (list of list of float) – Data matrix from read_data_matrix().Returns: stats (dict) – mean,std,min,max,medianof path lengths.
-
pyretis.analysis.wham_analysis.read_data_matrix(filename, nskip=0)[source]¶ Read an
infswap_data.txtfile into a numeric matrix.Parameters: - filename (str) – Path to the
infswap_data.txtfile. - nskip (int, optional) – Number of initial data rows to discard (loaded paths / equilibration).
Returns: matrix (list of list of float) – One row per path;
"----"entries are mapped to0.0.- filename (str) – Path to the
-
pyretis.analysis.wham_analysis.read_engine_interval(directory='.')[source]¶ Read the time per step (
timestep * subcycles) from a run config.The infinite-swapping data file records path lengths in steps, so the WHAM flux/rate needs this interval to be expressed per unit time (the convention of the native PyRETIS flux analysis). The value is read from the run’s
[engine]section, mirroringpyretis.analysis.flux_analysis.analyse_flux()(timestep * subcycles, withsubcyclesdefaulting to 1).Parameters: directory (str) – Directory containing restart.toml(orinfswap.toml).Returns: interval (float or None) – timestep * subcycles, orNoneif no config is found or it has no enginetimestep(so the caller can decide whether to fail or fall back to a per-step rate rather than silently assuming a timestep).
-
pyretis.analysis.wham_analysis.read_interfaces(directory='.')[source]¶ Read the interface list from a run’s
restart.toml.Parameters: directory (str) – Directory containing restart.toml(orinfswap.toml).Returns: interfaces (list of float or None) – The interface positions, or Noneif no config is found.
-
pyretis.analysis.wham_analysis.rec_block_errors(runav, minblocks)[source]¶ Block-error analysis of a running-average series.
Standard recursive block-error analysis of the running average.
Parameters: - runav (sequence of float) – The running average of an observable, one entry per path.
- minblocks (int) – Minimum number of blocks in the block-error analysis.
Returns: - half_av_err (float) – Average relative error over the second half of the block lengths.
- n_stat_ineff (float) – Statistical inefficiency estimate.
- rel_errors (list of float) – Relative error as a function of block length.
-
pyretis.analysis.wham_analysis.wham_crossing_probability(matrix, interfaces, lamres=None, minblocks=5, interval=1.0)[source]¶ Compute WHAM crossing probabilities and rate from a data matrix.
Core WHAM crossing-probability, flux and rate computation (crossing probability + flux + rate), with the file/plot output replaced by a returned dictionary. The matrix is modified in place by the HA-weight unweighting step.
Parameters: - matrix (list of list of float) – Data matrix from
read_data_matrix()(alreadynskip-ped). - interfaces (sequence of float) – The interface positions
[lambda_0, ..., lambda_B]. - lamres (float, optional) – Order-parameter resolution. Defaults to one tenth of the
smallest interface spacing (see
_default_lamres()); this keeps non-uniform interfaces on distinct grid points. Whatever value is used is validated by_validate_lamres(). - minblocks (int, optional) – Minimum number of blocks for the block-error analysis.
- interval (float, optional) – Simulation time per recorded path step, i.e.
timestep * subcycles. The[0-]/[0+]path lengths are in steps, so the flux is divided by this to give a rate per unit time – matching the native PyRETIS flux analysis (pyretis.analysis.flux_analysis.analyse_flux()). The default1.0reproduces the upstream inftools convention of a rate per step; pass the real interval to obtain a physical rate.
Returns: results (dict) – Keys:
pcross_pm/pcross_wham(total crossing probability, point-matching and WHAM),pcross_pm_relerr/pcross_wham_relerr(relative block errors),pcross_at_intf(WHAM crossing probability at each interface),pcross_at_intf_pm(point-matching version),flux,rate_pm/rate_wham,rate_pm_relerr,length_0minus/length_0plus,n_records,lambda_valuesandpcross_curve_wham/pcross_curve_pm(the fullP_A(lambda)profiles).- matrix (list of list of float) – Data matrix from