pyretis.pyvisa package¶

The sub-package with tools for visualizing simulation results for PyRETIS.

This package compiles the data of a simulation into a compact file format (hdf5) and displays results from file in a custom GUI applet. It includes a compiler of simulation data and a custom-built PyQt5 GUI applet that loads pre-compiled data (or compiles when loading raw simulation data). The applet allows user-friendly, interactive plotting of combinations of order parameter data from different interfaces and cycles of the simulation.

Package structure¶

Modules¶

__init__.py: Imports from the other modules.
common.py (pyretis.pyvisa.common): Common functions and variables for the visualization. These functions are mainly intended for internal use and are not imported here.
orderparam_density.py (pyretis.pyvisa.orderparam_density): A module that handles the compiling of data to a single file.
plotting.py (pyretis.pyvisa.plotting): A module which contains some functions that are used to plot regression lines and interface planes, and generate surface plots.
resources_rc.py (pyretis.pyvisa.resources_rc): A module containing the resources, icons/logos for the PyVISA GUI.
visualize.py (pyretis.pyvisa.visualize): A module that handles the loading and plotting of data from a compiled file or a simulation.

Sub-packages¶

None

Important classes defined in this package¶

CustomFigCanvas (pyretis.pyvisa.visualize.CustomFigCanvas): A class for the custom figure shown in the VisualApp PyQt5 applet.
DataObject (pyretis.pyvisa.visualize.DataObject): A class that reads simulation data, holds the data, and supplies it to VisualApp for plotting.
DataSlave (pyretis.pyvisa.visualize.DataSlave): A QObject class that holds the PathDensity data.
PathDensity (pyretis.pyvisa.orderparam_density.PathDensity): A class for reading, storing, and compiling simulation data.
PathVisualize (pyretis.pyvisa.orderparam_density.PathVisualize): A class for loading data (compiled or not) and generating plots.
VisualApp (pyretis.pyvisa.visualize.VisualApp): A QtWidget class that holds a user-defined figure.
VisualObject (pyretis.pyvisa.visualize.VisualObject): A class that loads data from hdf5, holds it, and supplies VisualApp with data for plotting.

Important methods defined in this package¶

_grid_it_up (pyretis.pyvisa.plotting._grid_it_up()): A function that maps the x, y and z data to an [X, Y] numpy.meshgrid and [Z] grid data, using scipy interpolation at a user-defined resolution.
gen_surface (pyretis.pyvisa.plotting.gen_surface()): A function that generates a user-defined surface plot (2D/3D).
plot_int_plane (pyretis.pyvisa.plotting.plot_int_plane()): A function that generates planes of the simulation interfaces for 3D plots.
plot_regline (pyretis.pyvisa.plotting.plot_regline()): A function that generates a linear regression line of x and y data on a given matplotlib.axes object.
shift_data (pyretis.pyvisa.common.shift_data()): A function that shifts data values of a list by the median value.
try_data_shift (pyretis.pyvisa.common.try_data_shift()): A function that attempts a shift of the data values to increase linear correlation.

pyretis.pyvisa.inf module¶

This module just contains some info for PyRETIS.

Here we define the name of the program and some other relevant info.

pyretis.pyvisa.common module¶

Common functions for the path density.

Functions used to compare and process data, such as matching similar lists or attempting periodic shifts of values.

Important methods defined here¶

find_rst_file (find_rst_file()): Search for an rst-file in a chosen subdirectory.
read_traj_txt_file (read_traj_txt_file()): Read the sequence of files in a trajectory from a traj.txt file.
recalculate_all (recalculate_all()): Recalculate the order parameter and new collective variables by finding all trajectory files from a simulation, either according to the PyRETIS storage scheme or for individual files/folders.
shift_data (shift_data()): Finds the median value of a given list of floats, and shifts the lower half of the data by the median.
try_data_shift (try_data_shift()): Takes in two lists of values, x and y, and calculates a linear regression and R**2-correlation of the data set. Attempts a shift of each data set by its respective median to increase the correlation.
where_from_to (where_from_to()): Check the initial and final steps of a trajectory with respect to the provided interfaces.
get_cv_names (get_cv_names()): Outputs a list of the names of the descriptors in the simulation.
find_data (find_data()): Find suitable frames/trajectories to recompute the order parameter on.
read_single_order_txt (read_single_order_txt()): Parse a standalone order parameter text file and return a DataFrame together with a list of column names suitable for use in PyVisA.
run_user_script (run_user_script()): Execute a user-supplied Python script and capture its stdout output to produce an order.txt file that PyVisA can load.

pyretis.pyvisa.common.find_data(runfolder, ensemble_names=None, data=None)¶

Find the trajectory data used to do post-processing.

find_data returns a dict with a structure resembling that of the simulation.

Parameters:

runfolder (string) – The path of the execution directory.
ensemble_names (list, optional) – List of ensemble names in the simulation to work with.
data (string, optional) – If given, the function will check only the single file or look only in the given directory.

Returns:

trj_dict (dict) – For each key ensemble_name (e.g. 000, 001, etc.), the values are: the last accepted trajectories, given by the accepted key; the generation trajectory or conf files, given by the generation key; and the dictionary stored_traj, given by the traj key. stored_traj is split into the dictionaries traj-acc and traj-rej, which have keys for all the accepted and rejected cycles respectively, where the trajectory files for that cycle are stored.
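The nested layout of the returned dictionary can be pictured with a plain Python dict. This is only a sketch: the key spellings (accepted, generation, traj, traj-acc, traj-rej) are read off the description of trj_dict, while the file names, the cycle numbers and the helper stored_files are hypothetical and not part of PyVisA.

```python
# Hypothetical illustration of the layout described for trj_dict.
# File names and cycle numbers are invented for this example.
trj_dict = {
    "000": {
        "accepted": ["000/accepted/traj_a.xyz"],
        "generation": ["000/generate/conf.xyz"],
        "traj": {
            "traj-acc": {"10": ["000/traj/traj-acc/10/traj_a.xyz"]},
            "traj-rej": {"11": ["000/traj/traj-rej/11/traj_b.xyz"]},
        },
    },
}


def stored_files(trj_dict, ensemble_name, status="traj-acc"):
    """Collect the stored trajectory files of one ensemble (hypothetical helper)."""
    cycles = trj_dict[ensemble_name]["traj"][status]
    return [name for files in cycles.values() for name in files]


print(stored_files(trj_dict, "000"))
```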

pyretis.pyvisa.common.find_rst_file(search_dir)¶

Search for rst-files.

Parameters:

search_dir (string) – Path of the directory to search for the .rst file.

Returns:

out[0] (string) – Path and name of the .rst file.

pyretis.pyvisa.common.get_cv_names(input_settings, num_columns=None)¶

Return labels for the order parameter and collective variables.

The labels follow the same convention for the main order parameter and any extra collective variable: each [orderparameter] / [collective-variable] block may declare name either as a single string or as a list of strings.

A list of strings is used as-is for that block (one label per value returned by the corresponding calculate()).
A single string is expanded with an index suffix ("<name>_1", "<name>_2", ...) to match the number of values the block produces. If the block produces a single value the bare string is used.
When name is missing the labels fall back to "op_<i>" for the main order parameter and "cv_<i>" for each collective variable (using the bare prefix if the block produces a single value).

Parameters:

input_settings (dict) – Dictionary with the settings from the simulations.
num_columns (int, optional) – Total number of order-parameter columns observed (e.g. read from order.txt). When supplied, this is used to allocate unspecified blocks: if there is exactly one section configured and num_columns is greater than one, the single name is expanded with indices to cover all columns. When num_columns is given and cannot be reconciled with the configured names, a ValueError is raised.

Returns:

names (list of str) – Flat list of column labels, in the same order as the concatenated calculate() outputs.

Raises:

ValueError – If a name list does not match the number of values produced by its block, or if num_columns cannot be reconciled with the configured names.
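The labeling rules for a single block can be sketched in a few lines. The helper below, expand_labels, is hypothetical and is not the PyVisA implementation. It assumes that the index suffix counts the values inside one block, and it takes the fallback prefix ("op" or "cv") as an argument.

```python
def expand_labels(name, n_values, fallback):
    """Return n_values labels for one block (hypothetical helper)."""
    if isinstance(name, (list, tuple)):
        # A list of names is used as-is, but it must match the block size.
        if len(name) != n_values:
            raise ValueError(f"Got {len(name)} names for {n_values} values.")
        return list(name)
    # A missing name falls back to the given prefix ("op" or "cv").
    base = name if name is not None else fallback
    if n_values == 1:
        # A single value keeps the bare string.
        return [base]
    return [f"{base}_{i}" for i in range(1, n_values + 1)]


# One main order parameter named "distance" and an unnamed two-value CV block.
labels = expand_labels("distance", 1, "op") + expand_labels(None, 2, "cv")
print(labels)
```

A flat list is built by concatenating the labels of each block in order, which mirrors the ordering described for the returned names.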

pyretis.pyvisa.common.read_single_data_file(filepath, rst_file=None)¶

Parse a standalone txt or csv data file for PyVisA.

The file may contain:

numeric rows separated by whitespace or commas,
an optional commented header line (#, ##, ;, //, etc.),
or an uncommented CSV header row.

When no usable header is present, column titles are inferred from rst_file when possible. Otherwise, PyVisA falls back to time, op1, op2, …

Parameters:

filepath (str) – Path to the text/CSV file.
rst_file (str, optional) – Optional PyRETIS .rst file used to infer missing labels.

Returns:

frames (pandas.DataFrame or None) – Parsed numeric data with every column preserved, including the first time/step column when present.
plot_cols (list of str or None) – Column labels available for plotting.
main_op_label (str or None) – Best-effort main order-parameter label for interface/range logic.
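A rough, standard-library-only sketch of this kind of header detection is given here. It is not the PyVisA parser: the function names parse_table and split_fields are hypothetical, only three comment markers are handled, plain lists are returned instead of a pandas.DataFrame, and the rst_file lookup is left out.

```python
import re

COMMENT_PREFIXES = ("#", ";", "//")  # assumed subset of the accepted markers


def split_fields(line):
    """Split a line on commas and/or whitespace, dropping empty fields."""
    return [field for field in re.split(r"[,\s]+", line) if field]


def parse_table(text):
    """Return (labels, rows) for a small numeric text table."""
    labels, rows = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(COMMENT_PREFIXES):
            # Commented line: use the first one as the header.
            header = split_fields(line.lstrip("#;/"))
            if labels is None and header:
                labels = header
            continue
        fields = split_fields(line)
        try:
            rows.append([float(field) for field in fields])
        except ValueError:
            # Uncommented non-numeric row: treat it as a CSV header.
            if labels is None:
                labels = fields
    if labels is None and rows:
        # No usable header: fall back to time, op1, op2, ...
        labels = ["time"] + [f"op{i}" for i in range(1, len(rows[0]))]
    return labels, rows


print(parse_table("# time dist\n0 1.5\n1, 2.5\n"))
```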

pyretis.pyvisa.common.read_single_order_txt(filepath, rst_file=None)¶: Backward-compatible wrapper for standalone text-table loading.

pyretis.pyvisa.common.read_traj_txt_file(path)¶

Read a traj.txt file.

Function which reads a traj.txt file and returns a dict containing the name of each file in the trajectory and the sign of their velocity.

Parameters:

path (string) – Path to the traj.txt file.

Returns:

files (dict) – Dictionary containing each file in the trajectory and the sign of their velocity.

pyretis.pyvisa.common.recalculate_all(runfolder, iofile, ensemble_names=None, data=None, progress=False, n_workers=1)¶

Recalculate order parameter and collective variables.

Performs post-processing by analyzing trajectories of old simulations to extract data, do new calculations, and write to a new order.txt file.

Parameters:

runfolder (string) – The path of the execution directory.
iofile (string) – The input file where the settings are collected.
ensemble_names (list, optional) – List of ensemble names in the simulation to work with.
data (string, optional) – If given, the function will check only the single file or look only in the given directory.
progress (bool, optional) – If True, display tqdm progress bars.
n_workers (int, optional) – Number of parallel worker processes. Values > 1 process ensembles concurrently via ProcessPoolExecutor.

Returns:

out (boolean) – True if the recomputation was successful, False otherwise.

pyretis.pyvisa.common.run_user_script(script_path)¶

Execute a user script and capture order-parameter data from stdout.

The script must print its result to stdout in one of two formats:

JSON list of lists – [[op1_frame0, op2_frame0, …], …] where each inner list contains the order-parameter values for one simulation frame.
CSV table – a comma- or whitespace-delimited table that pandas.read_csv can parse (one row per frame, optional header).

The captured data are written to order.txt in the same directory as script_path so that PyVisA can subsequently load the file.

Parameters:

script_path (str) – Absolute path to the Python script to execute.

Returns:

order_txt_path (str or None) – Absolute path of the written order.txt file, or None on failure.
error (str or None) – Human-readable error description, or None on success.
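A user script that follows the JSON format could look like the sketch below. The function compute_frames and its values are made up for illustration. The only part taken from the description above is the output shape: one inner list of order-parameter values per frame, printed to stdout.

```python
import json
import math


def compute_frames(n_frames=3):
    """Return one list of order-parameter values per frame (made-up data)."""
    return [[0.1 * i, math.cos(0.1 * i)] for i in range(n_frames)]


if __name__ == "__main__":
    # A JSON list of lists on stdout: [[op1_frame0, op2_frame0], ...]
    print(json.dumps(compute_frames()))
```

Printing nothing but the data keeps stdout parseable, so any diagnostic messages in such a script are better sent to stderr.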

pyretis.pyvisa.common.shift_data(x)¶

Shifts the data under the median.

Function that takes in a list of data, and shifts all values below the median value of the data by the max difference, effectively shifting parts of the data periodically in order to give clusters for visualization.

Parameters:

x (list) – Floats, data values.

Returns:

xnorm (list) – Floats where some values are shifted values of x, and some are left unchanged.
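One possible reading of this shift is sketched here with the standard library. It assumes that "the max difference" means max(x) - min(x). The name shift_below_median is hypothetical and the code is not the PyVisA implementation.

```python
from statistics import median


def shift_below_median(x):
    """Shift every value below the median up by max(x) - min(x)."""
    span = max(x) - min(x)
    mid = median(x)
    return [value + span if value < mid else value for value in x]


print(shift_below_median([1, 2, 3, 4, 5]))  # prints [5, 6, 3, 4, 5]
```

For periodic data such as angles that wrap around, a shift of this kind moves the lower branch next to the upper one, so that points which are physically close also appear close in the plot.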

pyretis.pyvisa.common.try_data_shift(x, y, fixedx)¶

Check if shifting increases correlation.

Function that checks whether the correlation of the data increases by shifting either set of values, x or y, or both. Correlation is checked by doing a simple linear regression on the different combinations of data: x and y, x and yshift, xshift and y, xshift and yshift. If the linear correlation (r-squared value) increases, the data sets are updated.

As a precaution, no shift is performed on x values if they are of the first order parameter ‘op1’.

Parameters:

x, y (list) – Floats, data values
fixedx (bool) – If True, x is main OP and should not be shifted.

Returns:

x, y (list) – Floats, updated (or unchanged) data values (If changed, returns x_temp or y_temp or both)
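The comparison of the four combinations can be sketched as follows. This is an illustration under assumptions, not the PyVisA code: shift_below_median uses max - min as the shift, r_squared is the squared Pearson correlation, and pick_best_shift is a hypothetical name.

```python
from statistics import median


def shift_below_median(values):
    """Shift every value below the median up by max - min (assumed shift)."""
    span = max(values) - min(values)
    mid = median(values)
    return [v + span if v < mid else v for v in values]


def r_squared(x, y):
    """Squared Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy ** 2 / (sxx * syy)


def pick_best_shift(x, y, fixedx):
    """Return the (x, y) combination with the highest r-squared value."""
    x_shift, y_shift = shift_below_median(x), shift_below_median(y)
    candidates = [(x, y_shift)]
    if not fixedx:
        # x may only be shifted when it is not the main order parameter.
        candidates += [(x_shift, y), (x_shift, y_shift)]
    best, best_r2 = (x, y), r_squared(x, y)
    for candidate in candidates:
        r2 = r_squared(*candidate)
        if r2 > best_r2:
            best, best_r2 = candidate, r2
    return best


print(pick_best_shift([1, 2, 3, 4], [5, 6, 3, 4], fixedx=True))
```

In this example the unshifted data give an r-squared value of 0.36, while shifting y gives 0.9, so the shifted y values are returned together with the unchanged x values.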

pyretis.pyvisa.common.where_from_to(trj, int_a, int_b=-inf)¶

Detect L / R starts and L / R / * ends.

Given a list of order parameters (a trj), the function tries to establish where the path started (L, R or *) and where it ended. Note: for the ‘REJ’ paths, the results of this function might differ from PyRETIS.

Parameters:

trj (numpy array) – The order parameters of the trj.
int_a (float) – The interface that defines state A.
int_b (float, optional) – The interface that defines state B. If not given, it is assumed that the 0^- ensemble is in use without the 0^- L interface.

Returns:

start (string) – A single character giving the initial position of the trajectory with respect to the given interfaces (L for left, R for right, or * for neither).
end (string) – A single character giving the final position of the trajectory with respect to the given interfaces (L for left, R for right, or * for neither).
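The classification of the two end points can be sketched as below for the case where both interfaces are supplied. The helper classify_ends is hypothetical. It assumes int_a < int_b and inclusive comparisons, and it does not reproduce how the library treats the default int_b=-inf.

```python
def classify_ends(trj, int_a, int_b):
    """Return (start, end) labels: 'L', 'R' or '*' (hypothetical helper)."""

    def label(value):
        if value <= int_a:
            return "L"  # at or left of the interface defining state A
        if value >= int_b:
            return "R"  # at or right of the interface defining state B
        return "*"  # between the two interfaces

    return label(trj[0]), label(trj[-1])


print(classify_ends([0.1, 0.5, 1.2], int_a=0.2, int_b=1.0))
```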

pyretis.pyvisa.orderparam_density module¶

Compiler of PyRETIS simulation data.

This module is part of the PyRETIS library and can be used for compiling the simulation data into a compressed file and/or for loading the data for later visualization.

Important classes defined here¶

Trajectory (Trajectory): A base class to store trajectories composed only of the order parameter, collective variables and energies.
PathDensity (PathDensity): A base class to assemble the data.
PathVisualize (PathVisualize): A base class to prepare for the visualization.

Important methods defined here¶

pyvisa_zip (:py:func: .pyvisa_zip): Compress the PyVisA output file in a .zip format.
pyvisa_unzip (:py:func: .pyvisa_unzip): Decompress the zip into PyVisA output.
remove_nan (:py:func: .remove_nan): Checks for the presence of nan values and replaces them with the following entry, if available.
pyvisa_compress (:py:func: .pyvisa_compress): Compress PyRETIS outputs to a .hdf5 file.

class pyretis.pyvisa.orderparam_density.PathDensity(basepath='.', iofile=None)¶

Bases: object

Perform the path density analysis.

This class defines the path density analysis for completed simulations with several order parameters.

__init__(basepath='.', iofile=None)¶

Initialize the class.

Parameters:

basepath (string, optional) – The path to the input file.
iofile (string, optional) – The input file.

Variables:

traj_dict (dict) – Values of order params and energy in all ensembles and info about trajectories. To each key, ensemble_name (e.g. 000, 001, etc.) the value is the list of respective Trajectory objects.
infos (dict) –
Information about the simulation, it contains:
- ensemble_names: list List of ensemble names.
- interfaces: list List of interface positions.
- num_op: integer Number of order parameters.
- op_labels: list List of order parameter names.
- energy_labels: list List of energy entry labels.

fill_energy(traj, eframes, ensemble_name, cycle)¶

Fill in energy-data to a trajectory.

Parameters:

traj (Trajectory object) – Trajectory object to be filled in energy-data.
eframes (dict) – Dictionary containing energy-data for the trajectory.
ensemble_name (string) – The name of the ensemble.
cycle (integer) – Cycle number.

Updates:

traj_dict (Trajectory object) – Trajectory object with filled in energy-data.

fill_op(frames, info, ensemble_name, cycle)¶

Fill in OP-data to a trajectory.

Function that fills the dictionary traj_dict containing Trajectory objects with data from the order parameter and the collective variables from the simulation.

Parameters:

frames (dict) – Dictionary containing the order parameter and collective variable data for the trajectory.
info (dict) – Information about the trajectory.
ensemble_name (string) – The name of the ensemble.
cycle (integer) – Cycle number.

Updates:

traj_dict (Trajectory object) – Trajectory object with filled in OP-data.

get_traj_energy(ensemble_name)¶

Read energy.txt files and collect energy-data.

Function that fills the dictionary traj_dict containing Trajectory objects with energy data from the simulation.

Parameters:

ensemble_name (string) – The name of the ensemble.

Updates:

traj_dict (dict) – Dictionary containing Trajectory objects from simulation with filled in energy-data.

get_traj_op(ensemble_name)¶

Read order.txt files and collect OP data.

Parameters:

ensemble_name (string) – The name of the ensemble.

Updates:

traj_dict (dict) – Dictionary containing Trajectory objects from simulation with filled in OP-data.

hdf5_data()¶: Compress the data to a .hdf5 file.

initialize_compressed(input_file)¶

Load PathDensity from a compressed file.

Parameters:

input_file (string) – The input file.

walk_dirs(only_ops=False)¶

Create lists in the acc or rej dictionaries for all order parameters.

First, generate the list of folders/ensembles to iterate through. Then search for the number of order parameters (columns) in a file in one of the folders of the path, and create lists in the acc/rej dictionaries for all order parameters.

Lastly, iterate through all folders and files, filling in the correct data to the lists and dictionaries.

Parameters:

only_ops (boolean, optional) – If True, PathDensity will not collect data from energy files.

class pyretis.pyvisa.orderparam_density.PathVisualize(basepath='.', pfile=None)¶

Bases: object

Class to define the visualization of data with PathDensity.

Class definition of the visualization of data gathered from simulation directory using the PathDensity class.

__init__(basepath='.', pfile=None)¶

Initialize the PathVisualize class.

If a supported compressed input file is present, the pre-compiled data are loaded from it. Otherwise, the specific loading functions must be called explicitly.

Parameters:

basepath (string, optional) – The path of the input file.
pfile (string, optional) – The input file.

load_hdf5()¶

Load precompiled data from a hdf5 file.

Function that loads precompiled data from a .hdf5 file made using pandas.

load_traj(criteria)¶

Load relevant data from Trajectories.

Parameters:

criteria (dict) – Dictionary of the selection criteria for which data to load. It contains:

x: string Name of parameter to plot.
y: string Name of parameter to plot.
z: string Name of parameter to plot.
ensemble_name: string Name of ensemble to loop through.
cycles: tuple Cycles to loop over.
status: string Status of the path: accepted/rejected.
MC-move: string, optional Generation move of the trajectory.
stored: bool, optional True if the trajectory has available trajectory files.
weights: bool, optional Option to apply statistical weights to the trajectories, used in the weighted density plot.

Returns:

x_list (list) – List of data from chosen parameter.
y_list (list) – List of data from chosen parameter.
z_list (list) – List of data from chosen parameter, if required.
data_origin (list) – List of ensemble_name and cycle for each point, if required.
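A criteria dictionary and the kind of filtering it implies might look like the following sketch. The values are hypothetical: the column names, the status strings "ACC" and "REJ", and the reading of cycles as an inclusive (first, last) pair are assumptions for illustration, and matches is not a PyVisA function.

```python
# Hypothetical selection criteria using the keys listed for load_traj.
criteria = {
    "x": "op1",
    "y": "op2",
    "z": "PotE",
    "ensemble_name": "001",
    "cycles": (0, 100),
    "status": "ACC",
}

# Hypothetical per-trajectory info records, mirroring the Trajectory info keys.
infos = [
    {"ensemble_name": "001", "cycle": 5, "status": "ACC"},
    {"ensemble_name": "001", "cycle": 7, "status": "REJ"},
    {"ensemble_name": "002", "cycle": 5, "status": "ACC"},
    {"ensemble_name": "001", "cycle": 150, "status": "ACC"},
]


def matches(info, criteria):
    """Check ensemble, cycle range and status of one trajectory record."""
    first, last = criteria["cycles"]
    return (
        info["ensemble_name"] == criteria["ensemble_name"]
        and first <= info["cycle"] <= last
        and info["status"] == criteria["status"]
    )


selected = [info for info in infos if matches(info, criteria)]
print(selected)
```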

load_whatever()¶

Load all possible supported files.

This function directs traffic towards the real loaders. Essentially, it does almost nothing.

class pyretis.pyvisa.orderparam_density.Trajectory(frames, info)¶

Bases: object

Class representing a simulation trajectory.

This class defines the trajectories from the completed simulations, with all information available. The order parameter and collective variables are labeled either in the fashion of opx, or with the names from the input file if the number of names given equals the number of descriptors in the system.

__init__(frames, info)¶

Initialize the class.

Parameters:

frames (pandas Dataframe or dict) – Dataframe/dict containing order parameter and energy-data for the trajectory. It contains:
- OP1: string Order parameter.
- OP2: string Collective variables, all other CV’s will be named in increasing fashion, OP3, OP4 etc.
- PotE: string Potential energy.
- KinE: string Kinetic energy.
- TotE: string Total energy.
info (dict) – Dictionary containing information about the trajectory. It contains:
- ensemble_name: string The name of the ensemble.
- cycle: integer The cycle number.
- status: string Status of accepted/rejected etc.
- MC-move: string Generation move of the trajectory.
- MC-start: string Generation starting point of the trajectory.
- ordermax: float Max value of OP.
- ordermin: float Min value of OP.
- stored: boolean True if the trajectory has existing trajectory files.

pyretis.pyvisa.orderparam_density.pyvisa_compress(runpath, input_file, pyvisa_dict)¶

Compress simulation data.

Parameters:

runpath (string) – The execution folder where the input files are.
input_file (string) – The input file for compression.
pyvisa_dict (dict) – It determines the section of pyvisa to use.

pyretis.pyvisa.orderparam_density.pyvisa_unzip(origin, destination=None)¶

Unzip a compressed file before loading it in the visualizer.

Parameters:

origin (string) – Zipped file to unzip.
destination (string, optional) – Unzipped file name.

pyretis.pyvisa.orderparam_density.pyvisa_zip(input_file)¶

Zip-compress a file of simulation data.

Parameters:

input_file (string) – The file to compress.

pyretis.pyvisa.orderparam_density.remove_nan(data)¶

Remove nan from data.

The function removes initial nan values, assuming that they originate from incomplete initial conditions (e.g. no energy file). If nan appears in the last cycle, it is not fixed and an error will be raised later in the code.

Parameters:

data (list, dict, object like pandas.DataFrame or pandas.Series) – Input data structure. If np.nan values are present, they are replaced by the following entry. The method accounts for multiple consecutive np.nan occurrences.
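The replacement rule for a flat list can be sketched with the standard library. The helper backfill_nan is hypothetical. It only handles a list of floats, whereas the function documented here also accepts dicts and pandas objects.

```python
import math


def backfill_nan(values):
    """Replace each NaN by the next non-NaN entry, if there is one."""
    result = list(values)
    next_valid = None
    # Walk backwards so runs of consecutive NaN share the same replacement.
    for i in range(len(result) - 1, -1, -1):
        if math.isnan(result[i]):
            if next_valid is not None:
                result[i] = next_valid
        else:
            next_valid = result[i]
    return result


print(backfill_nan([math.nan, math.nan, 1.0, 2.0]))  # prints [1.0, 1.0, 1.0, 2.0]
```

A NaN in the last position has no following entry and is therefore left in place, which matches the caveat about the last cycle.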

pyretis.pyvisa.plotting module¶

This file contains common functions for the visualization.

It contains some functions that are used to plot regression lines and interface planes, and generate surface plots.

Important methods defined here¶

gen_surface (gen_surface()): Generates a user-defined surface/contour/etc plot with colorbar in given matplotlib.figure and -.axes objects.
plot_int_plane (plot_int_plane()): Generates interface planes for the current span of x-values, in a given matplotlib.axes-object.
plot_regline (plot_regline()): Calculates the linear regression and correlation, plots a line for the regression in the given matplotlib.axes-object, with info in legend.
_grid_it_up (_grid_it_up()): Maps the x,y and z data to a numpy.meshgrid using scipy interpolation at a user defined resolution.

pyretis.pyvisa.plotting._add_colorbar(fig, mappable, ax, cbar_ax=None)¶: Add a colorbar with Matplotlib’s current explicit axes API.

pyretis.pyvisa.plotting._grid_it_up(xyz, res_x=200, res_y=200, fill='max')¶

Map x, y and z data values to a numpy meshgrid by interpolation.

Parameters:

xyz (list of list) – Lists of data values.
res_x, res_y (integer, optional) – Resolution (number of points in an axis range).
fill (string, optional) – Criteria to color the un-explored regions.

Returns:

g_x, g_y, g_z (list) – Numpy.arrays of mapped data.

pyretis.pyvisa.plotting.gen_surface(x, y, z, fig, ax, cbar_ax=None, dim=3, method='contour', res_x=400, res_y=400, colormap='viridis')¶

Generate the chosen surface/contour/scatter plot.

Parameters:

x, y, z (list) – Coordinates of data points. (x, y) are the chosen order parameter pairs, and z is the chosen energy value for each pair.
fig (Matplotlib object) – main canvas.
ax (Matplotlib object) – axes of the plot.
cbar_ax (Matplotlib object, optional) – plot color-bar.
dim (int) – dimension of the plot surface.
method (string, optional) – Method used for plotting data, default is contour lines.
res_x, res_y (integer, optional) – Resolution of plot, either as N*N bins in 2D histogram (Density plot) or as grid-points for interpolation of data (Surface and contour plots).
colormap (string, optional) – Name of the colormap/color scheme to use when plotting.

Returns:

surf (Matplotlib object) – The chosen surface/contour/plot object.
cbar (Matplotlib object) – The chosen color-bar.

pyretis.pyvisa.plotting.plot_int_plane(ax, pos, yminmax, zminmax, visible=False)¶

Generate the interface planes for 3D visualization.

Parameters:

ax (matplotlib.axes object) – The axes where the planes will be plotted.
pos (float) – The x-axis position of the interface plane.
yminmax, zminmax (list of float) – The limits of the plane in the 3D canvas.
visible (boolean, optional) – If True, shows interface planes.

Returns:

plane (Matplotlib object) – A 3D surface at x=pos, perpendicular to the x-axis.

pyretis.pyvisa.plotting.plot_regline(ax, x, y)¶

Plot a regression line calculated from input data in the input subplot.

Parameters:

ax (Matplotlib subplot) – The subplot where the regression line is to be plotted.
x, y (list) – Floats, coordinates of the data from which the regression line is calculated.

Updates:

Regression line with values.

pyretis.pyvisa.resources_rc module¶

Fix string for pydocstyle.

pyretis.pyvisa.visualize module¶

GUI application for visualizing simulation data.

This is a PyQt5 file using a custom-made ui layout, with a window that displays different plots created by the path density tool. The window can either display the data of a pre-compiled compressed file generated with the orderparam_density.py module, or compile the simulation in/out files by importing orderparam_density and executing it before displaying the results.

Important methods defined here¶

visualize_main (:py:func: .visualize_main): Method to load the GUI.

Important classes defined here¶

VisualApp (VisualApp): The PyQt5 GUI class that handles all the functionalities.
CustomFigCanvas (CustomFigCanvas): Class to handle the Matplotlib canvas.
DataSlave (DataSlave): The class to handle the data for the plots.
VisualObject (VisualObject): The class that handles the load from files.
DataObject (DataObject): The class that loads the PyRETIS files.

pyretis.pyvisa.statistical_methods module¶

Collection of statistical methods for PyVisA.

Important methods defined here¶

correlation_matrix (correlation_matrix()): Method for displaying the correlations from the simulation variables.
gaussian_mixture (gaussian_mixture()): Generic method for performing gaussian mixture clustering on simulation data.
hierarchical (hierarchical()): Generic method for performing hierarchical clustering on simulation data.
k_means (k_means()): Generic method for performing K-means clustering on simulation data.
pyvisa_pca (pyvisa_pca()): Generic method for performing PCA on simulation data.
spectral (spectral()): Generic method for performing spectral clustering on simulation data.
support_vector_machine (support_vector_machine()): Generic method for performing support vector machine classification on simulation data.

pyretis.pyvisa.statistical_methods._as_2d_cv(cv_values)¶: Convert 1D or multidimensional CV data to a 2D frame array.

pyretis.pyvisa.statistical_methods._bootstrap_wham(path_ensembles, interfaces, lambda_grid, cv_edges, n_bootstrap, random_state)¶: Return bootstrap standard errors by resampling trajectories.

pyretis.pyvisa.statistical_methods._coerce_trajectory(trajectory)¶: Return a normalized trajectory dictionary for WHAM analysis.

pyretis.pyvisa.statistical_methods._cv_bin_index(cv_point, cv_edges)¶: Return the flattened CV-bin index for one CV point.

pyretis.pyvisa.statistical_methods._first_crossing_index(lambda_values, lambda_c)¶

Return the first frame index with lambda >= lambda_c.

Parameters:

lambda_values (array_like, shape (n_frames,)) – Progress-coordinate values along one trajectory.
lambda_c (float) – Crossing interface.

Returns:

index (int or None) – First index crossing lambda_c. None is returned when the trajectory never reaches the interface.
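The documented behaviour amounts to a linear scan, as in this minimal sketch. The function name first_crossing_index (without the leading underscore) marks it as an illustration rather than the library's own code.

```python
def first_crossing_index(lambda_values, lambda_c):
    """Return the first index with lambda >= lambda_c, or None."""
    for index, value in enumerate(lambda_values):
        if value >= lambda_c:
            return index
    # The trajectory never reaches the interface.
    return None


print(first_crossing_index([0.1, 0.4, 0.9, 0.5], 0.4))  # prints 1
```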

pyretis.pyvisa.statistical_methods._interface_probabilities(lambda_maxima, interfaces)¶: Compute WHAM crossing probabilities at the TIS interfaces.

pyretis.pyvisa.statistical_methods._k_index(values, interfaces, n_ensembles)¶: Return K(lambda) from the WHAM expression.

pyretis.pyvisa.statistical_methods._lambda_max(lambda_values)¶: Return the largest progress-coordinate value in one trajectory.

pyretis.pyvisa.statistical_methods._make_cv_edges(path_ensembles, n_cv, cv_bin_widths, cv_ranges=None)¶: Create configurable histogram edges for CV-space bins.

pyretis.pyvisa.statistical_methods._make_lambda_grid(interfaces, lambda_grid=None, lambda_bin_width=None)¶: Return the fine lambda grid used for WHAM output.

pyretis.pyvisa.statistical_methods._marginal_distribution(distribution, cv_axis)¶: Return a 1D marginal distribution along one CV axis.

pyretis.pyvisa.statistical_methods._plot_pca_results(n_pca, principal_df, loadings, pca_info, cmap)¶

Plot the results of a PCA analysis.

Parameters:

n_pca (integer) – Number of principal components.
principal_df (object like pandas.DataFrame) – Principal components data.
loadings (object like pandas.DataFrame) – PCA loadings.
pca_info (dict) – Dict with keys ‘model’ (fitted PCA), ‘cols’ (column labels), ‘features’ (feature labels).
cmap (string) – Matplotlib colormap.

pyretis.pyvisa.statistical_methods._prepare_path_ensembles(path_ensembles)¶: Validate and normalize a nested sequence of path ensembles.

pyretis.pyvisa.statistical_methods._validate_grid(grid, name)¶: Validate a strictly increasing 1D numerical grid.

pyretis.pyvisa.statistical_methods._wham_core(path_ensembles, interfaces, lambda_grid, cv_edges)¶: Implement path-sampling WHAM after input normalization.

pyretis.pyvisa.statistical_methods.correlation_matrix(dataframe)¶

Show correlation matrix of simulation variables.

Function which computes the Pearson correlations between parameters and shows it in a correlation plot.

Parameters:

dataframe (object like pandas.DataFrame) – Dataframe containing simulation data.

pyretis.pyvisa.statistical_methods.decision_tree(xdata, ydata, depth)¶

Create a decision tree.

Parameters:

xdata (object like pandas.DataFrame) – Pandas dataframe containing the op and cv data from selected frames in the simulation.
ydata (object like pandas.DataFrame) – Pandas dataframe containing True/False for each frame in the selected ensembles. True if the frame is reactive, else False.
depth (integer) – Depth of the decision tree.

pyretis.pyvisa.statistical_methods.gaussian_mixture(n_clusters, data, settings, cmap)¶

Perform gaussian mixture clustering.

This function performs gaussian mixture clustering on simulation data with a chosen number of clusters, and plots the results.

Parameters:

n_clusters (integer) – Number of clusters.
data (object like numpy.ndarray) – Simulation data from chosen ensembles.
settings (dict) – Settings from GUI.
cmap (string) – Matplotlib colormap.

pyretis.pyvisa.statistical_methods.hierarchical(n_clusters, data, settings, cmap)¶

Perform hierarchical clustering.

This function performs hierarchical clustering on simulation data with a chosen number of clusters, and plots the results.

Parameters:

n_clusters (integer) – Number of clusters.
data (object like numpy.ndarray) – Simulation data from chosen ensembles.
settings (dict) – Settings from GUI.
cmap (string) – Matplotlib colormap.

pyretis.pyvisa.statistical_methods.k_means(n_clusters, data, settings, cmap)¶

Perform K-means clustering.

This function performs K-means clustering on simulation data with a chosen number of clusters, and plots the results.

Parameters:

n_clusters (integer) – Number of clusters.
data (object like numpy.ndarray) – Simulation data from chosen ensembles.
settings (dict) – Settings from GUI.
cmap (string) – Matplotlib colormap.

pyretis.pyvisa.statistical_methods.plot_wham(result, lambda_c_index=None, lambda_r_index=None, cv_axis=0, cmap='viridis', show=True)¶

Create publication-style PyVisA plots for path-sampling WHAM.

The figure contains heat maps for crossing probability, predictive capacity and enhancement as functions of lambda_c and lambda_r. The fourth panel shows the reactive, unreactive and total first-crossing distributions for one selected (lambda_c, lambda_r) pair, marginalized onto one CV when the CV space is multidimensional.

Parameters:

result (dict) – Result dictionary returned by wham().
lambda_c_index, lambda_r_index (int, optional) – Indices selecting the distribution panel. Defaults to the middle valid lambda-c value and the final lambda-r value.
cv_axis (int, optional) – CV axis used for the marginal distribution panel.
cmap (string, optional) – Matplotlib colormap.
show (bool, optional) – If True, call matplotlib.pyplot.show().

Returns:

figure, axes (tuple) – Matplotlib figure and axes.
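
The marginalization used for the fourth panel can be illustrated with a minimal sketch. It assumes a two-dimensional first-crossing distribution stored as a nested list of bin weights; the helper name marginalize and the example grid are hypothetical and not part of PyVisA. Whether plot_wham additionally scales by the bin width is not stated here, so the sketch only sums the weights.

```python
def marginalize(grid, cv_axis):
    """Sum a 2D grid of bin weights over the axis that is not cv_axis.

    Hypothetical helper for illustration only; not part of PyVisA.
    """
    if cv_axis == 0:
        # Keep axis 0 (rows) and sum over axis 1 (columns).
        return [sum(row) for row in grid]
    if cv_axis == 1:
        # Keep axis 1 (columns) and sum over axis 0 (rows).
        return [sum(column) for column in zip(*grid)]
    raise ValueError("cv_axis must be 0 or 1 for a 2D grid")


# Example grid: 2 bins along the first CV, 3 bins along the second CV.
grid = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]

along_first_cv = marginalize(grid, cv_axis=0)
along_second_cv = marginalize(grid, cv_axis=1)
```

With cv_axis=0 the result has one value per bin of the first CV; with cv_axis=1 it has one value per bin of the second CV.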

pyretis.pyvisa.statistical_methods.pyvisa_pca(n_pca, settings, data, cmap)¶

Perform PCA on ensemble data.

This function performs PCA on data from the selected ensembles and stores the results in an HDF5 file for further analysis. It also plots the first two principal components against each other, the cumulative explained variance, and the loadings. The HDF5 file contains the following keys:

sim_data: pandas DataFrame containing the simulation data.
PC: pandas DataFrame containing the data from the principal components.
loadings: pandas DataFrame containing the loadings.
explained: pandas Series containing the explained variance.
correlation_matrix: pandas DataFrame containing the correlation matrix.

Parameters:

n_pca (integer) – Number of principal components.
settings (dict) – Settings from GUI.
data (object like pandas.DataFrame) – Simulation data from chosen ensembles.
cmap (string) – Matplotlib colormap.

pyretis.pyvisa.statistical_methods.random_forest(xdata, ydata, depth)¶

Create a random forest classification.

Parameters:

xdata (object like pandas.DataFrame) – Pandas dataframe containing the op and cv data from selected frames in the simulation.
ydata (object like pandas.DataFrame) – Pandas dataframe containing True/False for each frame in the selected ensembles. True if the frame is reactive, else False.
depth (integer) – Depth of random forest model.

pyretis.pyvisa.statistical_methods.spectral(n_clusters, data, settings, cmap)¶

Perform spectral clustering.

This function performs spectral clustering on simulation data with a chosen number of clusters and plots the results.

Parameters:

n_clusters (integer) – Number of clusters.
data (object like numpy.ndarray) – Simulation data from chosen ensembles.
settings (dict) – Settings from GUI.
cmap (string) – Matplotlib colormap.

pyretis.pyvisa.statistical_methods.support_vector_machine(xdata, ydata, kernel='rbf', c_value=1.0, gamma='scale', cmap='viridis')¶

Create a support vector machine classification.

Parameters:

xdata (object like pandas.DataFrame) – Pandas dataframe containing the op and cv data from selected frames in the simulation.
ydata (object like pandas.DataFrame) – Pandas dataframe containing True/False for each frame in the selected ensembles. True if the frame is reactive, else False.
kernel (string, optional) – Kernel type to be used in the SVM model.
c_value (float, optional) – Regularization parameter for the SVM model.
gamma (string or float, optional) – Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’ kernels.
cmap (string, optional) – Matplotlib colormap.

Returns:

svm_model (object like sklearn.pipeline.Pipeline) – Fitted support vector machine model.

pyretis.pyvisa.statistical_methods.wham(path_ensembles, interfaces, cv_bin_widths, lambda_grid=None, lambda_bin_width=None, cv_ranges=None, n_bootstrap=0, random_state=None)¶

Path-sampling weighted histogram analysis method (WHAM).

This implements the WHAM reweighting described by van Erp et al. for path-sampling trajectories. It computes WHAM-reweighted crossing probabilities, first-crossing distributions in arbitrary CV space, predictive capacity, enhancement, and optional bootstrap uncertainty estimates over trajectories.

Parameters:

path_ensembles (sequence) – Nested sequence of path ensembles [0+], [1+], .... Each trajectory may be a dictionary with lambda data under one of 'lambda', 'order_parameter' or 'progress_coordinate' and CV data under 'cv'/'cvs'; or a (lambda, cv) pair. Lambda arrays must have shape (n_frames,). CV arrays may have shape (n_frames,) or (n_frames, n_cv).
interfaces (array_like, shape (n_ensembles + 1,)) – TIS/RETIS/FFS interfaces, including the reactant and product interfaces. The number of path ensembles must be one fewer than the number of interfaces.
cv_bin_widths (float or array_like) – Histogram bin width for each CV dimension.
lambda_grid (array_like, optional) – Fine lambda grid for output. It does not need to coincide with the TIS interfaces.
lambda_bin_width (float, optional) – Used to create lambda_grid when lambda_grid is not given.
cv_ranges (array_like, optional) – Explicit CV histogram ranges with shape (n_cv, 2).
n_bootstrap (int, optional) – Number of bootstrap samples. Resampling is done within each path ensemble.
random_state (int, optional) – Seed for bootstrap resampling.
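
The input layout described for path_ensembles and interfaces can be sketched with plain Python lists. This is a minimal illustration only: the helper count_frames and all numerical values are hypothetical, and the checks simply restate the shape rules given above rather than reproduce the validation done inside wham().

```python
def count_frames(trajectory):
    """Return (n_frames, n_cv) for one trajectory in either accepted layout.

    Hypothetical helper for illustration only; not part of PyVisA.
    """
    if isinstance(trajectory, dict):
        # Lambda data may be stored under any one of these three keys.
        for key in ("lambda", "order_parameter", "progress_coordinate"):
            if key in trajectory:
                lam = trajectory[key]
                break
        else:
            raise KeyError("no lambda data found in trajectory")
        # CV data is stored under 'cv' or 'cvs'.
        cv = trajectory["cv"] if "cv" in trajectory else trajectory["cvs"]
    else:
        # Otherwise the trajectory is a (lambda, cv) pair.
        lam, cv = trajectory
    if len(lam) != len(cv):
        raise ValueError("lambda and cv must have the same number of frames")
    first = cv[0]
    n_cv = len(first) if isinstance(first, (list, tuple)) else 1
    return len(lam), n_cv


# Three interfaces (reactant, one intermediate, product) need two ensembles.
interfaces = [0.0, 1.0, 2.0]
path_ensembles = [
    # Ensemble [0+]: one trajectory in the dictionary layout.
    [{"lambda": [-0.1, 0.4, 1.2, -0.2], "cv": [0.3, 0.5, 0.9, 0.2]}],
    # Ensemble [1+]: one trajectory given as a (lambda, cv) pair.
    [([-0.1, 1.1, 2.3], [0.3, 0.8, 1.5])],
]

# The number of path ensembles must be one fewer than the number of interfaces.
n_ensembles_ok = len(path_ensembles) == len(interfaces) - 1
shapes = [[count_frames(traj) for traj in ens] for ens in path_ensembles]

# With PyRETIS installed, the call would then follow the signature above:
# from pyretis.pyvisa.statistical_methods import wham
# result = wham(path_ensembles, interfaces, cv_bin_widths=0.1)
```

Both layouts describe the same information, so they can be mixed between trajectories as long as each one pairs a lambda value with a CV value (or CV vector) per frame.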

Returns:

result (dict) – Dictionary containing crossing_probabilities for \(P_A(\lambda_r | \lambda_c)\), reactive, unreactive and total first-crossing distributions, predictive_capacity, enhancement and optional bootstrap standard errors.

Notes

Empty CV bins are handled by zero-contribution divisions. Values with lambda_r < lambda_c are outside the intended domain and are returned as nan for scalar probability-like outputs.
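
The two conventions in these notes can be illustrated with a short sketch. It is one reading of the notes, not the code used by wham(): the helpers safe_ratio and mask_outside_domain are hypothetical names, and "zero-contribution division" is taken to mean that a bin with an empty denominator yields 0.0 instead of raising an error.

```python
import math


def safe_ratio(numerator, denominator):
    """Divide bin by bin; bins with an empty denominator contribute zero.

    Hypothetical helper for illustration only; not part of PyVisA.
    """
    return [n / d if d > 0 else 0.0 for n, d in zip(numerator, denominator)]


def mask_outside_domain(value, lambda_c, lambda_r):
    """Return nan when lambda_r < lambda_c, otherwise the value unchanged."""
    return math.nan if lambda_r < lambda_c else value


# The middle bin is empty, so it contributes 0.0 rather than failing.
ratios = safe_ratio([1.0, 0.0, 3.0], [2.0, 0.0, 4.0])

# A pair with lambda_r below lambda_c lies outside the intended domain.
inside = mask_outside_domain(0.3, lambda_c=0.5, lambda_r=1.0)
outside = mask_outside_domain(0.3, lambda_c=1.0, lambda_r=0.5)
```

Because nan compares unequal to everything, including itself, code that consumes the scalar outputs should test for it with math.isnan (or numpy.isnan for arrays) rather than with ==.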

pyretis.pyvisa.dialogs module¶

Dialog classes for the PyVisA GUI.

Contains all QDialog subclasses used by VisualApp for loading data.

Important classes defined here¶

LoadHdf5Dialog (LoadHdf5Dialog): File picker for HDF5 or zip-compressed simulation data.
LoadOrderParamDialog (LoadOrderParamDialog): File picker for a standalone order parameter text file.
RecalculateDialog (RecalculateDialog): Dialog to select either a PyRETIS .rst input file or a Python script for order parameter recalculation from snapshot trajectories.