src package

Submodules

src.conditionedSampling module

src.conditionedSampling.clip_and_scale(sample, bounds)[source]

clips and scales the sample to be within the bounds

Parameters:

sample (vector of sample)
bounds (list of upper and lower bounds)

Return type:

Clipped and scaled sample
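
The clipping and scaling described above can be sketched as follows. This is a minimal illustration, not the package's implementation: it assumes the sample lives in the unit interval, that `bounds` is a list of (lower, upper) pairs, and that clipping happens before scaling. The name `clip_and_scale_sketch` and its helpers are hypothetical.

```python
def clip_to_bounds(sample, bounds):
    """Clamp each coordinate into its (lower, upper) interval."""
    return [min(max(x, lo), hi) for x, (lo, hi) in zip(sample, bounds)]


def scale_unit_sample(sample, bounds):
    """Map each coordinate from [0, 1] onto its (lower, upper) interval."""
    return [lo + x * (hi - lo) for x, (lo, hi) in zip(sample, bounds)]


def clip_and_scale_sketch(sample, bounds):
    """Assumed order: clip to the unit interval first, then scale to the bounds."""
    unit = clip_to_bounds(sample, [(0.0, 1.0)] * len(sample))
    return scale_unit_sample(unit, bounds)


scaled = clip_and_scale_sketch([0.5, 1.2, -0.1], [(0, 10), (2, 4), (1, 3)])
```

Values outside the unit interval (1.2 and -0.1 here) end up on the upper and lower bound of their dimension, respectively.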

src.conditionedSampling.generate_lhs_sample(dimension, bounds, rs, method='LHS', n_samp=10)[source]

generates LHS (Latin hypercube sampling) or LHSMDU (LHS with multi-dimensional uniformity) samples

Parameters:

dimension (integer of dimension of input space)
bounds (list of upper and lower bounds)
rs (random seed, e.g. 1234)
method (string of method: LHS or LHSMDU)
n_samp (number of samples to be generated, default is 10)

Return type:

scaled LHS or LHSMDU samples
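
For readers unfamiliar with Latin hypercube sampling, the plain LHS case can be sketched with the standard library alone. This is an illustrative sketch, not the package's code: the function names are hypothetical, `bounds` is assumed to be a list of (lower, upper) pairs, and the LHSMDU variant is not covered.

```python
import random


def lhs_unit(dimension, n_samp, seed):
    """Plain LHS on the unit cube: one point per stratum in every dimension."""
    rng = random.Random(seed)
    columns = []
    for _ in range(dimension):
        # One draw inside each of the n_samp equal-width strata ...
        column = [(k + rng.random()) / n_samp for k in range(n_samp)]
        # ... then shuffle so strata are paired randomly across dimensions.
        rng.shuffle(column)
        columns.append(column)
    return [list(row) for row in zip(*columns)]


def lhs_scaled(dimension, bounds, seed, n_samp=10):
    """Scale unit-cube LHS points to the given (lower, upper) bounds."""
    return [
        [lo + x * (hi - lo) for x, (lo, hi) in zip(row, bounds)]
        for row in lhs_unit(dimension, n_samp, seed)
    ]


samples = lhs_scaled(2, [(0, 1), (10, 20)], seed=1234, n_samp=5)
```

Fixing the seed makes the draw reproducible, which is the role `rs` plays in the signature above.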

src.conditionedSampling.get_bounds_for_dimension(combi, prev_bounds)[source]

extract bounds for specific dimension

Parameters:

combi (combination number of bounds, integer)
prev_bounds (list of upper and lower bounds)

Return type:

bounds for this combination number

src.conditionedSampling.handle_dim_1(bounds, method, n_samp, rs, verbose=False)[source]

handles sampling for dimension=1

Parameters:

bounds (list of upper and lower bounds)
method (string of method: LHS or LHSMDU)
n_samp (number of samples to be generated)
rs (random seed, e.g. 1234)
verbose (show prints and more info or not, default=False)

Return type:

feasible samples

src.conditionedSampling.handle_dim_2(bounds, method, n_samp, rs, verbose=False)[source]

handles sampling for dimension=2

Parameters:

bounds (list of upper and lower bounds)
method (string of method: LHS or LHSMDU)
n_samp (number of samples to be generated)
rs (random seed, e.g. 1234)
verbose (show prints and more info or not, default=False)

Return type:

feasible samples

src.conditionedSampling.handle_dim_greater_than_2(bounds, method, n_samp, max_iter, max_iter_dim2, max_iter_dim3, max_rej, rs, verbose=False)[source]

handles sampling for dimension>2

Parameters:

bounds (list of upper and lower bounds)
method (string of method: LHS or LHSMDU)
n_samp (number of samples to be generated)
max_iter (maximum total number of iterations for dimensions 1, 2, 3 to be feasible; the search stops if the minimum number of samples has not been found by then)
max_iter_dim2 (maximum number of iterations for dimensions 1, 2 to be feasible)
max_iter_dim3 (maximum number of iterations for dimensions 1, 2, 3 to be feasible (for dimension=4))
max_rej (maximum number of samples allowed to be rejected, integer)
rs (random seed, e.g. 1234)
verbose (show prints and more info or not, default=False)

Returns:

sample1, sample2, sample3

Return type:

vectors

src.conditionedSampling.handle_dim_greater_than_3(bounds, method, n_samp, max_iter, max_iter_dim3, max_rej, sum_vec, sample1, sample2, rs, verbose=False)[source]

handles sampling for dimension>3

Parameters:

bounds (list of upper and lower bounds)
method (string of method: LHS or LHSMDU)
n_samp (number of samples to be generated)
max_iter (maximum total number of iterations for dimensions 1, 2, 3 to be feasible; the search stops if the minimum number of samples has not been found by then)
max_iter_dim3 (maximum number of iterations for dimensions 1, 2, 3 to be feasible (for dimension=4))
max_rej (maximum number of samples allowed to be rejected, integer)
sum_vec (vector of sums of compatible sample1 and sample2)
sample1 (vector of feasible sample1)
sample2 (vector of feasible sample2)
rs (random seed, e.g. 1234)
verbose (show prints and more info or not, default=False)

Returns:

infeasible1 (boolean flag)
sample1, sample2, sample3 (vectors)

src.conditionedSampling.one_constrained_sampling(n_samp, method='LHS', bounds=None, max_iter=60, max_iter_dim2=60, max_iter_dim3=60, max_rej=None, rs=None, verbose=False)[source]

constrained sampling for up to 4 dimensions, where the components of each sample must add up to 1

Parameters:

n_samp (number of samples to be generated)
method (string of method: LHS or LHSMDU)
bounds (list of upper and lower bounds)
max_iter (maximum total number of iterations for dimensions 1, 2, 3 to be feasible; the search stops if the minimum number of samples has not been found by then)
max_iter_dim2 (maximum number of iterations for dimensions 1, 2 to be feasible)
max_iter_dim3 (maximum number of iterations for dimensions 1, 2, 3 to be feasible (for dimension=4))
max_rej (maximum number of samples allowed to be rejected, integer)
rs (random seed, e.g. 1234; the signature default is None)
verbose (show prints and more info or not, default=False)

Return type:

feasible samples from routines for appropriate number of dimensions
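
The sum-to-one constraint can be illustrated with a simple rejection scheme. This sketch swaps in plain uniform draws for the LHS/LHSMDU routines and is not the package's algorithm: all components but the last are drawn within their bounds, the last is set to one minus their sum, and the sample is kept only if that last value also respects its bounds. The function name and the (lower, upper) pair layout of `bounds` are assumptions.

```python
import random


def sum_to_one_samples(n_samp, bounds, seed, max_iter=60):
    """Rejection sampling of points whose components add up to 1."""
    rng = random.Random(seed)
    *free_bounds, (last_lo, last_hi) = bounds
    accepted = []
    for _ in range(max_iter):
        for _ in range(n_samp):
            head = [rng.uniform(lo, hi) for lo, hi in free_bounds]
            last = 1.0 - sum(head)  # the constraint fixes the last component
            if last_lo <= last <= last_hi:
                accepted.append(head + [last])
                if len(accepted) == n_samp:
                    return accepted
    # Fewer than n_samp feasible points were found within max_iter rounds.
    return accepted


bounds = [(0.1, 0.6), (0.1, 0.6), (0.1, 0.8)]
samples = sum_to_one_samples(5, bounds, seed=1234)
```

The iteration cap mirrors the role of `max_iter` above: with tight bounds the feasible region can be small, so the loop has to give up at some point rather than run indefinitely.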

src.conditionedSampling.one_constrained_sampling_wrapper(methodname, dim, n_samp, bounds, max_rej, rs)[source]

select samples based on distance

Parameters:

methodname (string of method, LHS or LHSMDU)
dim (dimension of input space)
n_samp (number of samples to be collected)
bounds (list of lower and upper bounds)
max_rej (maximum samples to be rejected)
rs (random seed, e.g. 1234)

Return type:

one constrained feasible samples from routines for dim 2, 3, 4

src.conditionedSampling.prob_sample_with_bound_permutations(seeds=[42, 123, 7, 99, 56], prev_bounds=None, n_samp=10, tol_norm=0.001, all_select=False, num_select=4, max_rej=None, dim=None, verbose=False)[source]

Perform probabilistic sampling with bound permutations for a list of seeds.

Parameters:

seeds (list) – List of random seeds; defaults to the five seeds [42, 123, 7, 99, 56].
prev_bounds (list) – Previous bounds for sampling.
n_samp (int) – Number of samples per seed.
tol_norm (float) – Tolerance for the norm.
all_select (bool) – Whether to select all points.
num_select (int) – Number of selections to make.
max_rej (int) – Maximum rejections allowed.
dim (int) – Dimensionality of the sampling space.
rs (int) – Random seed, default 1234.
verbose (bool) – Whether to print additional information, default False.

Returns:

Contains the selected LHS samples, the selected LHSMDU samples, and their means and standard deviations.

Return type:

tuple

src.conditionedSampling.sample_with_bound_permutations(prev_bounds, n_samp, tol_norm=0.001, all_select=False, num_select=4, max_rej=None, dim=None, rs=1234, verbose=False)[source]

Computes samples with bound permutations.

Parameters:

prev_bounds (list) – List of lower and upper bounds for sampling.
n_samp (int) – Number of samples to generate.
tol_norm (float, optional) – Tolerance for minimum distance between samples.
all_select (bool, optional) – Whether to select all samples or a fixed number.
num_select (int, optional) – Number of samples to select if all_select is False.
max_rej (int, optional) – Maximum number of rejections allowed during sampling.
dim (int) – Dimensionality of the problem.
rs (int, optional) – Random seed for reproducibility.
verbose (bool, optional) – Whether to print verbose output.

Returns:

All feasible samples generated using two methods.

Return type:

tuple

src.conditionedSampling.scale(sample, bounds)[source]

scales the sample to be within the bounds

Parameters:

sample (vector of sample)
bounds (list of upper and lower bounds)

Return type:

scaled sample

src.conditionedSampling.scale_data(data, decimals=3)[source]

scales data via standard scaling

Parameters:

data (numpy array or pandas dataframe of data)
decimals (decimals to be rounded to, integer)

Return type:

scaled data
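
Standard scaling centres each column on its mean and divides by its standard deviation. The sketch below shows the idea on a list of rows using only the standard library. It is an assumption-laden illustration rather than the package's code: it uses the population standard deviation and leaves constant columns at zero, and the name `standard_scale` is hypothetical.

```python
from statistics import fmean, pstdev


def standard_scale(data, decimals=3):
    """Column-wise (x - mean) / std, rounded to `decimals` places."""
    columns = list(zip(*data))
    means = [fmean(col) for col in columns]
    # A constant column has zero spread; divide by 1 instead to avoid 0/0.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [round((x - m) / s, decimals) for x, m, s in zip(row, means, stds)]
        for row in data
    ]


scaled = standard_scale([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])
# scaled == [[-1.225, 0.0], [0.0, 0.0], [1.225, 0.0]]
```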

src.conditionedSampling.select_samples(data_scaled, samples, samples_unscaled, tol, tol2, decimals=3, des_n_samp=None)[source]

selects samples based on distance from data set

Parameters:

data_scaled (scaled data, numpy array)
samples (array of samples)
samples_unscaled (array of unscaled samples)
tol (tolerance for minimum distance to data, float)
tol2 (tolerance for minimum distance to other already selected samples, float)
decimals (decimals to round to, integer)
des_n_samp (desired number of samples/experiments to be executed, default None)

Returns:

selected scaled rounded samples (array)
selected unscaled rounded samples (array)
selected_ind_list (list of selected indices)
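
The two tolerances can be read as a greedy filter: a candidate is kept only if it is at least `tol` away from every data point and at least `tol2` away from every candidate already kept. The sketch below assumes Euclidean distance and first-come-first-served ordering, neither of which is confirmed by the documentation above; `select_far_samples` is a hypothetical name.

```python
from math import dist


def select_far_samples(data, samples, tol, tol2, des_n_samp=None):
    """Keep samples that are far from the data and from each other."""
    selected, selected_idx = [], []
    for i, candidate in enumerate(samples):
        if des_n_samp is not None and len(selected) >= des_n_samp:
            break
        far_from_data = all(dist(candidate, d) >= tol for d in data)
        far_from_selected = all(dist(candidate, s) >= tol2 for s in selected)
        if far_from_data and far_from_selected:
            selected.append(candidate)
            selected_idx.append(i)
    return selected, selected_idx


data = [(0.0, 0.0)]
candidates = [(0.1, 0.0), (1.0, 0.0), (1.2, 0.0), (0.0, 2.0)]
chosen, chosen_idx = select_far_samples(data, candidates, tol=0.5, tol2=0.5)
# chosen_idx == [1, 3]: candidate 0 is too close to the data,
# candidate 2 is too close to the already selected candidate 1.
```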

src.conditionedSampling.select_samples_diff_from_data(exp_data, samples_LHS, samples_LHSMDU, des_n_samp=15, tol=0.5, tol2=0.5, decimals=3)[source]

select samples based on distance from experimental data

Parameters:

exp_data (array of experimental data)
samples_LHS (LHS samples array)
samples_LHSMDU (LHSMDU samples array)
des_n_samp (desired number of samples/experiments to be executed)
tol (tolerance for minimum distance to data, float)
tol2 (tolerance for minimum distance to other already selected samples, float)
decimals (decimals to round to, integer)

Returns:

tol_samples (selected samples with LHS)
tol_samples_LHSMDU (selected samples with LHSMDU)
tol_samples_unscaled (selected unscaled samples with LHS)
tol_samples_LHSMDU_unscaled (selected unscaled samples with LHSMDU)

src.plot module

src.plot.box_kdeplot_samples(samples, filename_eps='', fixed_ranges=None)[source]

generates distribution box kde subplots of samples

Parameters:

samples (np array of samples nsamp x ncomponents)

filename_eps (string of path with eps filename)

Return type:

Subplots showing box kde distributions

src.plot.create_pairwise_distribution_plots_seaborn(data, lhs, lhsmdu, markers=None, dim_labels=None, labels=None, filename_eps='')[source]

Create pairwise distribution plots using Seaborn.

Parameters:

data: np.array, original dataset.
lhs: np.array, LHS samples.
lhsmdu: np.array, LHSMDU samples.
dim_labels: list of str, labels for each dimension. Defaults to ‘dim i’.
labels: list of str, dataset labels. Defaults to [‘Data’, ‘LHS’, ‘LHSMDU’].
filename_eps: str, file path for saving the plot as EPS.

src.plot.create_pairwise_scatterplots(data, lhs, lhsmdu, dim_labels=None, colors=None, labels=None, figsize=(15, 10), filename_eps='', plots_per_fig=9)[source]

Create pairwise scatterplots for given datasets, splitting into multiple figures if necessary.

Parameters:

data: np array of original data
lhs: np array of LHS samples
lhsmdu: np array of LHSMDU samples
dim_labels: list of str, labels for each dimension. Defaults to ‘component X’.
colors: list of str, colors for each dataset. Defaults to [‘blue’, ‘orange’, ‘green’].
labels: list of str, labels for each dataset. Defaults to [‘Data’, ‘LHS’, ‘LHSMDU’].
figsize: tuple, size of each figure. Defaults to (15, 10).
filename_eps: str, prefix for filenames when saving figures.
plots_per_fig: int, number of plots per figure.

src.plot.distplot_samples(samples, filename_eps='')[source]

generates distribution kde plot of samples

Parameters:

samples (np array of samples nsamp x ncomponents)

filename_eps (string of path with eps filename)

Return type:

Distplot with distributions for different components in different colors

src.plot.plot_dimred_2dims_both_methods(data_pca, lhs_samples_pca, lhsmdu_samples_pca, filename_eps='')[source]

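generates a scatter plot of the data, the conditioned LHS samples, and the conditioned LHSMDU samples

Parameters:

data_pca (np array of original data after dimensionality reduction)

lhs_samples_pca (np array of LHS samples after dimensionality reduction)

lhsmdu_samples_pca (np array of LHSMDU samples after dimensionality reduction)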

filename_eps (string of path with eps filename)

Return type:

Scatterplot

src.utils module

src.utils.apply_mixed_synthesis_constraint(all_val_samples)[source]

apply mixed synthesis constraints: components may be combined in the pairs 0+2, 0+1, and 1+2, and every component is allowed as a single component

Parameters:

all_val_samples (np array of samples of dimension number of points x number of components)

Return type:

samples fulfilling specific synthesis constraints

src.utils.apply_single_synthesis_constraint(all_val_samples)[source]

apply single synthesis constraints, sets maximum=1 and rest=0

Parameters:

all_val_samples (np array of samples of dimension number of points x number of components)

Return type:

samples with entries 0 or 1
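
Setting the maximum to 1 and the rest to 0 amounts to a row-wise one-hot encoding of the largest component. A minimal sketch on a list of rows, assuming that ties are resolved in favour of the first maximum (the documentation does not say how ties are handled); `one_hot_max` is a hypothetical name:

```python
def one_hot_max(samples):
    """Replace each row by a 0/1 row with a single 1 at its largest entry."""
    result = []
    for row in samples:
        # Index of the first maximal entry in this row.
        top = max(range(len(row)), key=row.__getitem__)
        result.append([1 if j == top else 0 for j in range(len(row))])
    return result


one_hot = one_hot_max([[0.2, 0.5, 0.3], [0.7, 0.1, 0.2]])
# one_hot == [[0, 1, 0], [1, 0, 0]]
```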

src.utils.save_to_csv(filepath, samples)[source]

writes samples to csv file

Parameters:

filepath (file path e.g. Path('Outputs/LHS_with_new_permutations_correct_suggestions_allselected_imp_ext_subprobs_improve.csv'))

samples (pandas dataframe with samples nsamp x ncomponents)

Return type:

csv file

src.utils.select_des_n_samp_random_pts(all_val_samples, des_n_samp=15)[source]

choose des_n_samp random points from all samples

Parameters:

all_val_samples (np array of samples of dimension number of points x number of components)

des_n_samp (number of desired selected random points, default = 15)

Returns:

tol_samples

Return type:

np array of reduced samples of length des_n_samp
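
Drawing `des_n_samp` points at random without replacement can be sketched with the standard library. The `seed` parameter and the cap at the number of available points are assumptions added for illustration; they are not part of the documented signature, and `pick_random_points` is a hypothetical name.

```python
import random


def pick_random_points(samples, des_n_samp=15, seed=None):
    """Return up to des_n_samp distinct rows chosen uniformly at random."""
    rng = random.Random(seed)
    rows = list(samples)
    # Never ask for more rows than exist.
    return rng.sample(rows, min(des_n_samp, len(rows)))


points = [(0.1, 0.9), (0.2, 0.8), (0.3, 0.7), (0.4, 0.6), (0.5, 0.5)]
subset = pick_random_points(points, des_n_samp=3, seed=42)
```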

src.utils.select_most_uniform_samples(samples, num_samples=90)[source]

Select a subset of samples that maximizes uniformity using pairwise distance.

Parameters:

samples (np.array) – Array of all samples (shape: n_samples x n_dimensions).

num_samples (int) – Number of samples to select.

Returns:

uniform_samples (np.array) – Array of selected samples (shape: num_samples x n_dimensions).
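
One common way to pick a well-spread subset from pairwise distances is greedy farthest-point (maximin) selection: start from one point and repeatedly add the point whose nearest already-chosen neighbour is farthest away. Whether the package uses this exact strategy is not stated, so the sketch below is only an illustration of the general idea; `greedy_maximin_subset` is a hypothetical name.

```python
from math import dist


def greedy_maximin_subset(samples, num_samples):
    """Greedy farthest-point selection based on Euclidean distance."""
    if not samples or num_samples <= 0:
        return []
    chosen = [0]  # assumption: start from the first sample
    # Distance from every sample to its nearest chosen sample.
    nearest = [dist(s, samples[0]) for s in samples]
    target = min(num_samples, len(samples))
    while len(chosen) < target:
        candidates = [i for i in range(len(samples)) if i not in chosen]
        nxt = max(candidates, key=nearest.__getitem__)
        chosen.append(nxt)
        nearest = [min(d, dist(s, samples[nxt])) for d, s in zip(nearest, samples)]
    return [samples[i] for i in chosen]


points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (0.5, 0.0)]
spread = greedy_maximin_subset(points, 3)
# spread == [(0.0, 0.0), (1.0, 0.0), (0.5, 0.0)]
```

The near-duplicate point (0.1, 0.0) is left out because it adds the least to the spread of the subset.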

Module contents