src package

Submodules

src.conditionedSampling module

src.conditionedSampling.clip_and_scale(sample, bounds)

clips and scales the sample to be within the bounds

Parameters:
  • sample (vector of sample)

  • bounds (list of upper and lower bounds)

Return type:

Clipped and scaled sample
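A minimal sketch of what clipping and scaling could look like, assuming that sample holds values nominally in [0, 1] and that bounds is a (lower, upper) pair. The helper name clip_and_scale_sketch is hypothetical and the package's own implementation may differ:

```python
def clip_and_scale_sketch(sample, bounds):
    """Clip each value to [0, 1], then map it linearly onto [lower, upper].

    Hypothetical stand-in for illustration; the real function may differ.
    """
    lower, upper = bounds  # assumption: bounds is ordered (lower, upper)
    clipped = [min(max(x, 0.0), 1.0) for x in sample]
    return [lower + x * (upper - lower) for x in clipped]


scaled = clip_and_scale_sketch([-0.2, 0.0, 0.5, 1.3], (10.0, 20.0))
print(scaled)  # [10.0, 10.0, 15.0, 20.0]
```

Dropping the clipping line leaves a plain linear scaling, which is one plausible reading of the scale function documented further down.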

src.conditionedSampling.generate_lhs_sample(dimension, bounds, rs, method='LHS', n_samp=10)

generates a Latin hypercube sampling (LHS) or LHS with multidimensional uniformity (LHSMDU) sample

Parameters:
  • dimension (integer of dimension of input space)

  • bounds (list of upper and lower bounds)

  • rs (random seed)

  • method (string of method: LHS or LHSMDU)

  • n_samp (number of samples to be generated, default is 10)

Return type:

scaled LHS or LHSMDU samples
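For orientation, plain Latin hypercube sampling can be sketched with the standard library alone. The helper lhs_sketch is hypothetical, assumes that bounds holds one (lower, upper) pair per dimension, and does not cover the LHSMDU variant:

```python
import random


def lhs_sketch(dimension, bounds, rs, n_samp=10):
    """Plain Latin hypercube sample scaled to per-dimension (lower, upper) bounds.

    Hypothetical illustration; not the package's generate_lhs_sample.
    """
    rng = random.Random(rs)
    columns = []
    for d in range(dimension):
        lower, upper = bounds[d]
        # One point per equal-width stratum of [0, 1), then shuffle the strata.
        unit = [(i + rng.random()) / n_samp for i in range(n_samp)]
        rng.shuffle(unit)
        columns.append([lower + u * (upper - lower) for u in unit])
    # Transpose so that each row is one sample.
    return [list(row) for row in zip(*columns)]


samples = lhs_sketch(dimension=2, bounds=[(0.0, 1.0), (5.0, 10.0)], rs=1234, n_samp=5)
for row in samples:
    print(row)
```

Each dimension is split into n_samp strata and every stratum receives exactly one point, which is the defining property of a Latin hypercube design.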

src.conditionedSampling.get_bounds_for_dimension(combi, prev_bounds)

extract bounds for specific dimension

Parameters:
  • combi (combination number of bounds, integer)

  • prev_bounds (list of upper and lower bounds)

Return type:

bounds for this combination number

src.conditionedSampling.handle_dim_1(bounds, method, n_samp, rs, verbose=False)

handles sampling for dimension=1

Parameters:
  • bounds (list of upper and lower bounds)

  • method (string of method: LHS or LHSMDU)

  • n_samp (number of samples to be generated)

  • rs (random seed)

  • verbose (show prints and more info or not, default=False)

Return type:

feasible samples

src.conditionedSampling.handle_dim_2(bounds, method, n_samp, rs, verbose=False)

handles sampling for dimension=2

Parameters:
  • bounds (list of upper and lower bounds)

  • method (string of method: LHS or LHSMDU)

  • n_samp (number of samples to be generated)

  • rs (random seed)

  • verbose (show prints and more info or not, default=False)

Return type:

feasible samples

src.conditionedSampling.handle_dim_greater_than_2(bounds, method, n_samp, max_iter, max_iter_dim2, max_iter_dim3, max_rej, rs, verbose=False)

handles sampling for dimension>2

Parameters:
  • bounds (list of upper and lower bounds)

  • method (string of method: LHS or LHSMDU)

  • n_samp (number of samples to be generated)

  • max_iter (maximum total number of iterations for dimensions 1, 2 and 3 to be feasible; sampling stops if the minimum number of samples has not been found by then)

  • max_iter_dim2 (maximum number of iterations for dimensions 1 and 2 to be feasible)

  • max_iter_dim3 (maximum number of iterations for dimensions 1, 2 and 3 to be feasible, for dimension=4)

  • max_rej (integer, maximum number of samples allowed to be rejected)

  • rs (random seed)

  • verbose (show prints and more info or not, default=False)

Returns:

sample1, sample2, sample3

Return type:

vectors

src.conditionedSampling.handle_dim_greater_than_3(bounds, method, n_samp, max_iter, max_iter_dim3, max_rej, sum_vec, sample1, sample2, rs, verbose=False)

handles sampling for dimension>3

Parameters:
  • bounds (list of upper and lower bounds)

  • method (string of method: LHS or LHSMDU)

  • n_samp (number of samples to be generated)

  • max_iter (maximum total number of iterations for dimensions 1, 2 and 3 to be feasible; sampling stops if the minimum number of samples has not been found by then)

  • max_iter_dim3 (maximum number of iterations for dimensions 1, 2 and 3 to be feasible, for dimension=4)

  • max_rej (integer, maximum number of samples allowed to be rejected)

  • sum_vec (vector of sum of compatible sample1 and sample2)

  • sample1 (vector of feasible sample1)

  • sample2 (vector of feasible sample2)

  • rs (random seed)

  • verbose (show prints and more info or not, default=False)

Returns:

  • infeasible1 (boolean flag)

  • sample1, sample2, sample3 (vectors)

src.conditionedSampling.one_constrained_sampling(n_samp, method='LHS', bounds=None, max_iter=60, max_iter_dim2=60, max_iter_dim3=60, max_rej=None, rs=None, verbose=False)

constrained sampling for up to 4 dimensions, where the components of each sample must add up to 1

Parameters:
  • n_samp (number of samples to be generated)

  • method (string of method: LHS or LHSMDU)

  • bounds (list of upper and lower bounds)

  • max_iter (maximum total number of iterations for dimensions 1, 2 and 3 to be feasible; sampling stops if the minimum number of samples has not been found by then)

  • max_iter_dim2 (maximum number of iterations for dimensions 1 and 2 to be feasible)

  • max_iter_dim3 (maximum number of iterations for dimensions 1, 2 and 3 to be feasible, for dimension=4)

  • max_rej (integer, maximum number of samples allowed to be rejected)

  • rs (random seed, default None)

  • verbose (show prints and more info or not, default=False)

Return type:

feasible samples from routines for appropriate number of dimensions
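The sum-to-one constraint can be illustrated with a small rejection sampler. This is only a sketch under stated assumptions: sum_to_one_sketch is a hypothetical helper, bounds holds one (lower, upper) pair per dimension, components are drawn uniformly rather than by LHS or LHSMDU, and the per-dimension iteration limits are not reproduced:

```python
import random


def sum_to_one_sketch(n_samp, bounds, rs=None, max_rej=1000):
    """Rejection sampling of points whose components sum to 1.

    Hypothetical illustration; the package's iteration logic is not reproduced.
    """
    rng = random.Random(rs)
    free_bounds = bounds[:-1]
    last_lo, last_hi = bounds[-1]
    accepted, rejected = [], 0
    while len(accepted) < n_samp and rejected < max_rej:
        # Draw all but the last component uniformly within their bounds.
        point = [rng.uniform(lo, hi) for lo, hi in free_bounds]
        # The last component is fixed by the constraint sum(point) == 1.
        last = 1.0 - sum(point)
        if last_lo <= last <= last_hi:
            accepted.append(point + [last])
        else:
            rejected += 1
    return accepted


samples = sum_to_one_sketch(5, [(0.0, 0.6), (0.0, 0.6), (0.0, 0.6)], rs=1234)
for row in samples:
    print(row, sum(row))
```

Fixing the last component from the others guarantees the sum, so only its bounds need to be checked; a rejection cap such as max_rej keeps the loop from running forever when the bounds leave little or no feasible region.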

src.conditionedSampling.one_constrained_sampling_wrapper(methodname, dim, n_samp, bounds, max_rej, rs)

wrapper around the one-constrained sampling routines for a given method and dimension

Parameters:
  • methodname (string of method, LHS or LHSMDU)

  • dim (dimension of input space)

  • n_samp (number of samples to be collected)

  • bounds (list of lower and upper bounds)

  • max_rej (maximum samples to be rejected)

  • rs (random seed)

Return type:

one constrained feasible samples from routines for dim 2, 3, 4

src.conditionedSampling.prob_sample_with_bound_permutations(seeds=[42, 123, 7, 99, 56], prev_bounds=None, n_samp=10, tol_norm=0.001, all_select=False, num_select=4, max_rej=None, dim=None, verbose=False)

Perform probabilistic sampling with bound permutations for a list of seeds.

Parameters:
  • seeds (list) – List of random seeds. Defaults to [42, 123, 7, 99, 56].

  • prev_bounds (list) – Previous bounds for sampling.

  • n_samp (int) – Number of samples per seed.

  • tol_norm (float) – Tolerance for the norm.

  • all_select (bool) – Whether to select all points.

  • num_select (int) – Number of selections to make.

  • max_rej (int) – Maximum rejections allowed.

  • dim (int) – Dimensionality of the sampling space.

  • verbose (bool) – Whether to print additional information. Defaults to False.

Returns:

Contains the selected LHS samples, the selected LHSMDU samples, and their means and standard deviations.

Return type:

tuple

src.conditionedSampling.sample_with_bound_permutations(prev_bounds, n_samp, tol_norm=0.001, all_select=False, num_select=4, max_rej=None, dim=None, rs=1234, verbose=False)

Computes samples with bound permutations.

Parameters:
  • prev_bounds (list) – List of lower and upper bounds for sampling.

  • n_samp (int) – Number of samples to generate.

  • tol_norm (float, optional) – Tolerance for minimum distance between samples.

  • all_select (bool, optional) – Whether to select all samples or a fixed number.

  • num_select (int, optional) – Number of samples to select if all_select is False.

  • max_rej (int, optional) – Maximum number of rejections allowed during sampling.

  • dim (int) – Dimensionality of the problem.

  • rs (int, optional) – Random seed for reproducibility.

  • verbose (bool, optional) – Whether to print verbose output.

Returns:

All feasible samples generated using two methods.

Return type:

tuple

src.conditionedSampling.scale(sample, bounds)

scales the sample to be within the bounds

Parameters:
  • sample (vector of sample)

  • bounds (list of upper and lower bounds)

Return type:

scaled sample

src.conditionedSampling.scale_data(data, decimals=3)

scales data via standard scaling

Parameters:
  • data (numpy array or pandas dataframe of data)

  • decimals (decimals to be rounded to, integer)

Return type:

scaled data
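Standard scaling usually means subtracting the column mean and dividing by the column standard deviation. A minimal sketch, assuming the population standard deviation and rounding to decimals places; standard_scale_sketch is a hypothetical helper, not the package's implementation:

```python
from statistics import mean, pstdev


def standard_scale_sketch(data, decimals=3):
    """Column-wise (x - mean) / std with rounding.

    Hypothetical illustration using the population standard deviation.
    A constant column (std of 0) is not handled here.
    """
    columns = list(zip(*data))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [
        [round((x - m) / s, decimals) for x, m, s in zip(row, means, stds)]
        for row in data
    ]


scaled = standard_scale_sketch([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
print(scaled)  # [[-1.225, -1.225], [0.0, 0.0], [1.225, 1.225]]
```

Both columns give the same scaled values because standard scaling removes each column's offset and unit.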

src.conditionedSampling.select_samples(data_scaled, samples, samples_unscaled, tol, tol2, decimals=3, des_n_samp=None)

selects samples based on distance from data set

Parameters:
  • data_scaled (scaled data, numpy array)

  • samples (array of samples)

  • samples_unscaled (array of unscaled samples)

  • tol (tolerance for minimum distance to data, float)

  • tol2 (tolerance for minimum distance to other already selected samples, float)

  • decimals (decimals to round to, integer)

  • des_n_samp (desired number of samples/experiments to be executed, default None)

Returns:

  • selected scaled rounded samples (array)

  • selected unscaled rounded samples (array)

  • selected_ind_list (list of selected indices)
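The two tolerances can be read as a greedy distance filter. The following sketch assumes Euclidean distance and treats tol and tol2 as minimum allowed distances; select_by_distance_sketch is a hypothetical helper that returns only the selected indices, and math.dist requires Python 3.8 or later:

```python
import math


def select_by_distance_sketch(data, samples, tol, tol2, des_n_samp=None):
    """Greedy filter on Euclidean distance; returns the selected indices.

    Hypothetical illustration; thresholds are treated as minimum distances.
    """
    selected = []
    for i, candidate in enumerate(samples):
        if des_n_samp is not None and len(selected) >= des_n_samp:
            break
        # The candidate must be at least tol away from every data point ...
        if any(math.dist(candidate, point) < tol for point in data):
            continue
        # ... and at least tol2 away from every sample selected so far.
        if any(math.dist(candidate, samples[j]) < tol2 for j in selected):
            continue
        selected.append(i)
    return selected


data = [(0.0, 0.0)]
samples = [(0.1, 0.0), (1.0, 0.0), (1.2, 0.0), (0.0, 2.0)]
selected = select_by_distance_sketch(data, samples, tol=0.5, tol2=0.5)
print(selected)  # [1, 3]
```

Sample 0 is dropped because it lies too close to the data point, and sample 2 because it lies too close to the already selected sample 1.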

src.conditionedSampling.select_samples_diff_from_data(exp_data, samples_LHS, samples_LHSMDU, des_n_samp=15, tol=0.5, tol2=0.5, decimals=3)

select samples based on distance from experimental data

Parameters:
  • exp_data (array of experimental data)

  • samples_LHS (LHS samples array)

  • samples_LHSMDU (LHSMDU samples array)

  • des_n_samp (desired number of samples/experiments to be executed)

  • tol (tolerance for minimum distance to data, float)

  • tol2 (tolerance for minimum distance to other already selected samples, float)

  • decimals (decimals to round to, integer)

Returns:

  • tol_samples (selected samples with LHS)

  • tol_samples_LHSMDU (selected samples with LHSMDU)

  • tol_samples_unscaled (selected unscaled samples with LHS)

  • tol_samples_LHSMDU_unscaled (selected unscaled samples with LHSMDU)

src.plot module

src.plot.box_kdeplot_samples(samples, filename_eps='', fixed_ranges=None)

generates distribution box kde subplots of samples

Parameters:
  • samples (np array of samples nsamp x ncomponents)

  • filename_eps (string of path with eps filename)

Return type:

Subplots showing box kde distributions

src.plot.create_pairwise_distribution_plots_seaborn(data, lhs, lhsmdu, markers=None, dim_labels=None, labels=None, filename_eps='')

Create pairwise distribution plots using Seaborn.

Parameters:

  • data (np.array, original dataset)

  • lhs (np.array, LHS samples)

  • lhsmdu (np.array, LHSMDU samples)

  • dim_labels (list of str, labels for each dimension; defaults to 'dim i')

  • labels (list of str, dataset labels; defaults to ['Data', 'LHS', 'LHSMDU'])

  • filename_eps (str, file path for saving the plot as EPS)

src.plot.create_pairwise_scatterplots(data, lhs, lhsmdu, dim_labels=None, colors=None, labels=None, figsize=(15, 10), filename_eps='', plots_per_fig=9)

Create pairwise scatterplots for given datasets, splitting into multiple figures if necessary.

Parameters:

  • data (np array of original data)

  • lhs (np array of LHS samples)

  • lhsmdu (np array of LHSMDU samples)

  • dim_labels (list of str, labels for each dimension; defaults to 'component X')

  • colors (list of str, colors for each dataset; defaults to ['blue', 'orange', 'green'])

  • labels (list of str, labels for each dataset; defaults to ['Data', 'LHS', 'LHSMDU'])

  • figsize (tuple, size of each figure; defaults to (15, 10))

  • filename_eps (str, prefix for filenames when saving figures)

  • plots_per_fig (int, number of plots per figure)

src.plot.distplot_samples(samples, filename_eps='')

generates distribution kde plot of samples

Parameters:
  • samples (np array of samples nsamp x ncomponents)

  • filename_eps (string of path with eps filename)

Return type:

Distplot with distributions for different components in different colors

src.plot.plot_dimred_2dims_both_methods(data_pca, lhs_samples_pca, lhsmdu_samples_pca, filename_eps='')

generates scatter plot of data conditioned LHS and conditioned LHSMDU samples

Parameters:
  • data_pca (np array of dimension-reduced original data)

  • lhs_samples_pca (np array of dimension-reduced LHS samples)

  • lhsmdu_samples_pca (np array of dimension-reduced LHSMDU samples)

  • filename_eps (string of path with eps filename)

Return type:

Scatterplot

src.utils module

src.utils.apply_mixed_synthesis_constraint(all_val_samples)

applies mixed synthesis constraints: the component pairs 0+2, 0+1 and 1+2 are allowed in combination, and each component is allowed on its own as a single component

Parameters:

all_val_samples (np array of samples of dimension number of points x number of components)

Return type:

samples fulfilling specific synthesis constraints

src.utils.apply_single_synthesis_constraint(all_val_samples)

applies the single synthesis constraint: in each sample, the maximum component is set to 1 and the rest to 0

Parameters:

all_val_samples (np array of samples of dimension number of points x number of components)

Return type:

samples containing only the values 0 and 1
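Setting the maximum to 1 and the rest to 0 amounts to a row-wise one-hot encoding of the largest component. A minimal sketch on plain lists; one_hot_max_sketch is a hypothetical helper, and resolving ties in favour of the first maximum is an assumption:

```python
def one_hot_max_sketch(rows):
    """Replace each row by a 0/1 vector marking its largest component.

    Hypothetical illustration; ties go to the first maximum here.
    """
    result = []
    for row in rows:
        winner = max(range(len(row)), key=row.__getitem__)
        result.append([1 if i == winner else 0 for i in range(len(row))])
    return result


encoded = one_hot_max_sketch([[0.2, 0.7, 0.1], [0.5, 0.3, 0.2]])
print(encoded)  # [[0, 1, 0], [1, 0, 0]]
```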

src.utils.save_to_csv(filepath, samples)

writes samples to csv file

Parameters:
  • filepath (file path e.g. Path('Outputs/LHS_with_new_permutations_correct_suggestions_allselected_imp_ext_subprobs_improve.csv'))

  • samples (pandas dataframe with samples nsamp x ncomponents)

Return type:

csv file

src.utils.select_des_n_samp_random_pts(all_val_samples, des_n_samp=15)

choose des_n_samp random points from all samples

Parameters:
  • all_val_samples (np array of samples of dimension number of points x number of components)

  • des_n_samp (number of desired selected random points, default = 15)

Returns:

tol_samples

Return type:

np array of reduced samples of length des_n_samp
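Choosing random points can be sketched as sampling rows without replacement. The helper random_subset_sketch is hypothetical, and its rs argument is an addition for reproducibility that is not part of the documented signature:

```python
import random


def random_subset_sketch(all_val_samples, des_n_samp=15, rs=None):
    """Draw des_n_samp distinct rows uniformly at random, without replacement.

    Hypothetical illustration; rs is not part of the documented signature.
    """
    rng = random.Random(rs)
    return rng.sample(list(all_val_samples), des_n_samp)


pool = [[i / 20, 1 - i / 20] for i in range(20)]
subset = random_subset_sketch(pool, des_n_samp=5, rs=1234)
print(len(subset))  # 5
```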

src.utils.select_most_uniform_samples(samples, num_samples=90)

Select a subset of samples that maximizes uniformity using pairwise distance.

Parameters:

  • samples (np.array) – Array of all samples (shape: n_samples x n_dimensions).

  • num_samples (int) – Number of samples to select.

Returns:

uniform_samples (np.array) – Array of selected samples (shape: num_samples x n_dimensions).
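One common way to pick a well-spread subset from pairwise distances is greedy farthest-point (maximin) selection. The sketch below uses that technique as a stand-in; farthest_point_sketch is a hypothetical helper, the samples are assumed to be distinct, and the package's own uniformity criterion may differ:

```python
import math


def farthest_point_sketch(samples, num_samples):
    """Greedy maximin selection: repeatedly add the sample farthest from the chosen set.

    Hypothetical stand-in; assumes distinct samples.
    """
    chosen = [0]  # start from the first sample (an arbitrary choice)
    # Distance from every sample to its nearest chosen sample.
    nearest = [math.dist(s, samples[0]) for s in samples]
    while len(chosen) < min(num_samples, len(samples)):
        nxt = max(range(len(samples)), key=nearest.__getitem__)
        chosen.append(nxt)
        nearest = [min(d, math.dist(s, samples[nxt])) for d, s in zip(nearest, samples)]
    return [samples[i] for i in chosen]


points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
subset = farthest_point_sketch(points, num_samples=3)
print(subset)  # [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0)]
```

The near-duplicate point (0.1, 0.0) is never picked here, because it always stays close to the already chosen origin.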

Module contents