src package
Submodules
src.conditionedSampling module
- src.conditionedSampling.clip_and_scale(sample, bounds)[source]
clips and scales the sample to be within the bounds
- Parameters:
sample (vector of sample)
bounds (list of upper and lower bounds)
- Return type:
Clipped and scaled sample
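One way to read "clip and scale" is sketched below. It is an illustration, not the package's implementation: it assumes the sample holds unit-interval values, that `bounds` is a single `(lower, upper)` pair, and that clipping happens before the linear rescaling. The name `clip_and_scale_sketch` is hypothetical.

```python
def clip_and_scale_sketch(sample, bounds):
    """Clip values to [0, 1], then map them linearly onto [lower, upper]."""
    lower, upper = bounds  # assumed layout: (lower, upper)
    clipped = [min(max(x, 0.0), 1.0) for x in sample]
    return [lower + x * (upper - lower) for x in clipped]


# Values outside [0, 1] end up exactly on the nearest bound.
scaled = clip_and_scale_sketch([-0.2, 0.0, 0.5, 1.3], (2.0, 4.0))
```

Dropping the clipping step gives a plain linear rescaling, which may be closer to what `scale` does.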
- src.conditionedSampling.generate_lhs_sample(dimension, bounds, rs, method='LHS', n_samp=10)[source]
generates LHS or LHSMDU sample
- Parameters:
dimension (integer of dimension of input space)
bounds (list of upper and lower bounds)
rs (random seed, e.g. 1234; required, the signature defines no default)
method (string of method: LHS or LHSMDU)
n_samp (number of samples to be generated, default is 10)
- Return type:
scaled LHS or LHSMDU samples
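For orientation, plain Latin hypercube sampling can be sketched with the standard library alone. This is not the code behind `generate_lhs_sample`, and it does not cover the LHSMDU variant; the function names and the assumption that `bounds` is a list of `(lower, upper)` pairs, one per dimension, are illustrative.

```python
import random


def lhs_unit_sketch(dimension, n_samp, seed=1234):
    """Latin hypercube sample in the unit cube: one point per stratum and dimension."""
    rng = random.Random(seed)
    columns = []
    for _ in range(dimension):
        # One draw inside each stratum [k/n, (k+1)/n), then shuffle the strata order.
        col = [(k + rng.random()) / n_samp for k in range(n_samp)]
        rng.shuffle(col)
        columns.append(col)
    # Transpose so that each row is one sample point.
    return [list(row) for row in zip(*columns)]


def scale_to_bounds(points, bounds):
    """Map unit-cube points onto per-dimension (lower, upper) bounds (assumed layout)."""
    return [[lo + x * (hi - lo) for x, (lo, hi) in zip(row, bounds)] for row in points]


unit_points = lhs_unit_sketch(dimension=2, n_samp=5)
scaled_points = scale_to_bounds(unit_points, [(0.0, 1.0), (10.0, 20.0)])
```

Fixing the seed makes the draw reproducible, which is the role `rs` plays in the signatures above.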
- src.conditionedSampling.get_bounds_for_dimension(combi, prev_bounds)[source]
extract bounds for specific dimension
- Parameters:
combi (combination number of bounds, integer)
prev_bounds (list of upper and lower bounds)
- Return type:
bounds for this combination number
- src.conditionedSampling.handle_dim_1(bounds, method, n_samp, rs, verbose=False)[source]
handles sampling for dimension=1
- Parameters:
bounds (list of upper and lower bounds)
method (string of method: LHS or LHSMDU)
n_samp (number of samples to be generated)
rs (random seed, e.g. 1234; required, the signature defines no default)
verbose (show prints and more info or not, default=False)
- Return type:
feasible samples
- src.conditionedSampling.handle_dim_2(bounds, method, n_samp, rs, verbose=False)[source]
handles sampling for dimension=2
- Parameters:
bounds (list of upper and lower bounds)
method (string of method: LHS or LHSMDU)
n_samp (number of samples to be generated)
rs (random seed, e.g. 1234; required, the signature defines no default)
verbose (show prints and more info or not, default=False)
- Return type:
feasible samples
- src.conditionedSampling.handle_dim_greater_than_2(bounds, method, n_samp, max_iter, max_iter_dim2, max_iter_dim3, max_rej, rs, verbose=False)[source]
handles sampling for dimension>2
- Parameters:
bounds (list of upper and lower bounds)
method (string of method: LHS or LHSMDU)
n_samp (number of samples to be generated)
max_iter (maximum total number of iterations for dimensions 1, 2, 3 to be feasible; the routine stops if the minimum number of samples has not been found by then)
max_iter_dim2 (maximum number of iterations for dimension 1,2 to be feasible)
max_iter_dim3 (maximum number of iterations for dimension 1,2,3 to be feasible (for dimension=4))
max_rej (integer number of maximum allowed samples to be rejected)
rs (random seed, e.g. 1234; required, the signature defines no default)
verbose (show prints and more info or not, default=False)
- Returns:
sample1, sample2, sample3
- Return type:
vectors
- src.conditionedSampling.handle_dim_greater_than_3(bounds, method, n_samp, max_iter, max_iter_dim3, max_rej, sum_vec, sample1, sample2, rs, verbose=False)[source]
handles sampling for dimension>3
- Parameters:
bounds (list of upper and lower bounds)
method (string of method: LHS or LHSMDU)
n_samp (number of samples to be generated)
max_iter (maximum total number of iterations for dimensions 1, 2, 3 to be feasible; the routine stops if the minimum number of samples has not been found by then)
max_iter_dim3 (maximum number of iterations for dimension 1,2,3 to be feasible (for dimension=4))
max_rej (integer number of maximum allowed samples to be rejected)
sum_vec (vector of sum of compatible sample1 and sample2)
sample1 (vector of feasible sample1)
sample2 (vector of feasible sample2)
rs (random seed, e.g. 1234; required, the signature defines no default)
verbose (show prints and more info or not, default=False)
- Returns:
infeasible1 (boolean flag)
sample1, sample2, sample3 (vectors)
- src.conditionedSampling.one_constrained_sampling(n_samp, method='LHS', bounds=None, max_iter=60, max_iter_dim2=60, max_iter_dim3=60, max_rej=None, rs=None, verbose=False)[source]
constrained sampling for up to 4 dimensions, where the components of each sample must add up to 1
- Parameters:
n_samp (number of samples to be generated)
method (string of method: LHS or LHSMDU)
bounds (list of upper and lower bounds)
max_iter (maximum total number of iterations for dimensions 1, 2, 3 to be feasible; the routine stops if the minimum number of samples has not been found by then)
max_iter_dim2 (maximum number of iterations for dimension 1,2 to be feasible)
max_iter_dim3 (maximum number of iterations for dimension 1,2,3 to be feasible (for dimension=4))
max_rej (integer number of maximum allowed samples to be rejected)
rs (random seed, default 1234)
verbose (show prints and more info or not, default=False)
- Return type:
feasible samples from routines for appropriate number of dimensions
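The sum-to-one constraint can be illustrated with a simple rejection scheme. The sketch below is not the package's algorithm: it draws the free components uniformly at random (not via LHS or LHSMDU), sets the last component to the remainder, and keeps the point only if that remainder lies within its bounds. The function name, the `(lower, upper)` bounds layout, and the way `max_iter` is used are assumptions.

```python
import random


def sum_to_one_rejection_sketch(n_samp, bounds, max_iter=60, seed=1234):
    """Collect up to n_samp points whose components sum to 1 and respect the bounds."""
    rng = random.Random(seed)
    last_lo, last_hi = bounds[-1]
    accepted = []
    for _ in range(max_iter):
        for _ in range(n_samp):
            # Draw every component except the last one inside its own bounds.
            free = [rng.uniform(lo, hi) for lo, hi in bounds[:-1]]
            last = 1.0 - sum(free)  # remainder fixes the last component
            if last_lo <= last <= last_hi:
                accepted.append(free + [last])
                if len(accepted) == n_samp:
                    return accepted
    # May hold fewer than n_samp points if the feasible region is small.
    return accepted


samples = sum_to_one_rejection_sketch(5, [(0.1, 0.6), (0.1, 0.6), (0.0, 0.8)])
```

When the bounds leave only a thin feasible region, most draws are rejected, which is why iteration and rejection limits such as `max_iter` and `max_rej` are useful.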
- src.conditionedSampling.one_constrained_sampling_wrapper(methodname, dim, n_samp, bounds, max_rej, rs)[source]
wrapper that collects one-constrained feasible samples for the given method and dimension
- Parameters:
methodname (string of method, LHS or LHSMDU)
dim (dimension of input space)
n_samp (number of samples to be collected)
bounds (list of lower and upper bounds)
max_rej (maximum samples to be rejected)
rs (random seed, e.g. 1234; required, the signature defines no default)
- Return type:
one constrained feasible samples from routines for dim 2, 3, 4
- src.conditionedSampling.prob_sample_with_bound_permutations(seeds=[42, 123, 7, 99, 56], prev_bounds=None, n_samp=10, tol_norm=0.001, all_select=False, num_select=4, max_rej=None, dim=None, verbose=False)[source]
Perform probabilistic sampling with bound permutations for a list of seeds.
- Parameters:
seeds (list) – List of random seeds. by default list of 5 random seeds.
prev_bounds (list) – Previous bounds for sampling.
n_samp (int) – Number of samples per seed.
tol_norm (float) – Tolerance for the norm.
all_select (bool) – Whether to select all points.
num_select (int) – Number of selections to make.
max_rej (int) – Maximum rejections allowed.
dim (int) – Dimensionality of the sampling space.
rs (int) – Random seed, default 1234.
verbose (bool) – Whether to show prints and more info, default False.
- Returns:
Contains the selected LHS samples, selected MDU samples, and their means and standard deviations.
- Return type:
tuple
- src.conditionedSampling.sample_with_bound_permutations(prev_bounds, n_samp, tol_norm=0.001, all_select=False, num_select=4, max_rej=None, dim=None, rs=1234, verbose=False)[source]
Computes samples with bound permutations.
- Parameters:
prev_bounds (list) – List of lower and upper bounds for sampling.
n_samp (int) – Number of samples to generate.
tol_norm (float, optional) – Tolerance for minimum distance between samples.
all_select (bool, optional) – Whether to select all samples or a fixed number.
num_select (int, optional) – Number of samples to select if all_select is False.
max_rej (int, optional) – Maximum number of rejections allowed during sampling.
dim (int) – Dimensionality of the problem.
rs (int, optional) – Random seed for reproducibility.
verbose (bool, optional) – Whether to print verbose output.
- Returns:
All feasible samples generated using two methods.
- Return type:
tuple
- src.conditionedSampling.scale(sample, bounds)[source]
scales the sample to be within the bounds
- Parameters:
sample (vector of sample)
bounds (list of upper and lower bounds)
- Return type:
scaled sample
- src.conditionedSampling.scale_data(data, decimals=3)[source]
scales data via standard scaling
- Parameters:
data (numpy array or pandas dataframe of data)
decimals (decimals to be rounded to, integer)
- Return type:
scaled data
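Standard scaling subtracts each column's mean and divides by its standard deviation. The sketch below works on a list of rows and assumes the population standard deviation and rounding after scaling; whether `scale_data` makes the same choices is not stated here, and the function name is hypothetical.

```python
import statistics


def standard_scale_sketch(data, decimals=3):
    """Column-wise z-scores, rounded to the given number of decimals."""
    columns = list(zip(*data))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]  # population std (assumption)
    return [
        # Constant columns (std == 0) are mapped to 0.0 to avoid division by zero.
        [round((x - m) / s, decimals) if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in data
    ]


scaled = standard_scale_sketch([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
```

After scaling, both columns are on the same footing, so distance tolerances apply equally to every component.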
- src.conditionedSampling.select_samples(data_scaled, samples, samples_unscaled, tol, tol2, decimals=3, des_n_samp=None)[source]
selects samples based on distance from data set
- Parameters:
data_scaled (scaled data, numpy array)
samples (array of samples)
samples_unscaled (array of unscaled samples)
tol (tolerance for minimum distance to data, float)
tol2 (tolerance for minimum distance to other already selected samples, float)
decimals (decimals to round to, integer)
des_n_samp (desired number of samples/experiments to be executed, default None)
- Returns:
selected scaled rounded samples (array)
selected unscaled rounded samples (array)
selected_ind_list (list of selected indices)
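The two tolerances can be pictured as a greedy filter: a candidate is kept only if it is at least `tol` away from every data point and at least `tol2` away from every candidate already kept. The sketch below assumes Euclidean distance and a first-come order; it omits the rounding and the scaled/unscaled bookkeeping, and the function name is hypothetical.

```python
import math


def select_by_distance_sketch(data, samples, tol, tol2, des_n_samp=None):
    """Greedily keep samples that are far from the data and from each other."""
    selected, selected_ind_list = [], []
    for i, cand in enumerate(samples):
        far_from_data = all(math.dist(cand, d) >= tol for d in data)
        far_from_selected = all(math.dist(cand, s) >= tol2 for s in selected)
        if far_from_data and far_from_selected:
            selected.append(cand)
            selected_ind_list.append(i)
            # Stop early once the desired number of samples is reached.
            if des_n_samp is not None and len(selected) == des_n_samp:
                break
    return selected, selected_ind_list


data = [(0.0, 0.0)]
candidates = [(0.1, 0.0), (1.0, 0.0), (1.05, 0.0), (2.0, 0.0)]
kept, kept_idx = select_by_distance_sketch(data, candidates, tol=0.5, tol2=0.5)
```

Here the first candidate is too close to the data and the third is too close to an already kept candidate, so only the second and fourth survive.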
- src.conditionedSampling.select_samples_diff_from_data(exp_data, samples_LHS, samples_LHSMDU, des_n_samp=15, tol=0.5, tol2=0.5, decimals=3)[source]
select samples based on distance from experimental data
- Parameters:
exp_data (array of experimental data)
samples_LHS (LHS samples array)
samples_LHSMDU (LHSMDU samples array)
des_n_samp (desired number of samples/experiments to be executed)
tol (tolerance for minimum distance to data, float)
tol2 (tolerance for minimum distance to other already selected samples, float)
decimals (decimals to round to, integer)
- Returns:
tol_samples (selected samples with LHS)
tol_samples_LHSMDU (selected samples with LHSMDU)
tol_samples_unscaled (selected unscaled samples with LHS)
tol_samples_LHSMDU_unscaled (selected unscaled samples with LHSMDU)
src.plot module
- src.plot.box_kdeplot_samples(samples, filename_eps='', fixed_ranges=None)[source]
generates distribution box kde subplots of samples
- Parameters:
samples (np array of samples nsamp x ncomponents)
filename_eps (string of path with eps filename)
- Return type:
Subplots showing box kde distributions
- src.plot.create_pairwise_distribution_plots_seaborn(data, lhs, lhsmdu, markers=None, dim_labels=None, labels=None, filename_eps='')[source]
Create pairwise distribution plots using Seaborn.
- Parameters:
data (np.array) – Original dataset.
lhs (np.array) – LHS samples.
lhsmdu (np.array) – LHSMDU samples.
dim_labels (list of str) – Labels for each dimension. Defaults to ‘dim i’.
labels (list of str) – Dataset labels. Defaults to [‘Data’, ‘LHS’, ‘LHSMDU’].
filename_eps (str) – File path for saving the plot as EPS.
- src.plot.create_pairwise_scatterplots(data, lhs, lhsmdu, dim_labels=None, colors=None, labels=None, figsize=(15, 10), filename_eps='', plots_per_fig=9)[source]
Create pairwise scatterplots for given datasets, splitting into multiple figures if necessary.
- Parameters:
data (np.array) – Original data.
lhs (np.array) – LHS samples.
lhsmdu (np.array) – LHSMDU samples.
dim_labels (list of str) – Labels for each dimension. Defaults to ‘component X’.
colors (list of str) – Colors for each dataset. Defaults to [‘blue’, ‘orange’, ‘green’].
labels (list of str) – Labels for each dataset. Defaults to [‘Data’, ‘LHS’, ‘LHSMDU’].
figsize (tuple) – Size of each figure. Defaults to (15, 10).
filename_eps (str) – Prefix for filenames when saving figures.
plots_per_fig (int) – Number of plots per figure.
- src.plot.distplot_samples(samples, filename_eps='')[source]
generates distribution kde plot of samples
- Parameters:
samples (np array of samples nsamp x ncomponents)
filename_eps (string of path with eps filename)
- Return type:
Distplot with distributions for different components in different colors
- src.plot.plot_dimred_2dims_both_methods(data_pca, lhs_samples_pca, lhsmdu_samples_pca, filename_eps='')[source]
generates scatter plot of data conditioned LHS and conditioned LHSMDU samples
- Parameters:
data_pca (np array of dimension-reduced original data)
lhs_samples_pca (np array of dimension-reduced LHS samples)
lhsmdu_samples_pca (np array of dimension-reduced LHSMDU samples)
filename_eps (string of path with eps filename)
- Return type:
Scatterplot
src.utils module
- src.utils.apply_mixed_synthesis_constraint(all_val_samples)[source]
apply mixed synthesis constraints: components may be combined as 0+2, 0+1, or 1+2, and every component is allowed as a single component
- Parameters:
all_val_samples (np array of samples of dimension number of points x number of components)
- Return type:
samples fulfilling specific synthesis constraints
- src.utils.apply_single_synthesis_constraint(all_val_samples)[source]
apply single synthesis constraints, sets maximum=1 and rest=0
- Parameters:
all_val_samples (np array of samples of dimension number of points x number of components)
- Return type:
binary (0/1) samples
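Setting the maximum to 1 and the rest to 0 amounts to a one-hot encoding of each row's largest component. The sketch below assumes that ties go to the first maximal component, which the description does not specify; the function name is hypothetical.

```python
def one_hot_max_sketch(rows):
    """Replace each row by a 0/1 vector marking its largest component."""
    result = []
    for row in rows:
        # Index of the first maximal entry (tie-breaking rule is an assumption).
        k = max(range(len(row)), key=row.__getitem__)
        result.append([1 if j == k else 0 for j in range(len(row))])
    return result


binary = one_hot_max_sketch([[0.2, 0.5, 0.3], [0.7, 0.1, 0.2]])
```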
- src.utils.save_to_csv(filepath, samples)[source]
writes samples to csv file
- Parameters:
filepath (file path e.g. Path('Outputs/LHS_with_new_permutations_correct_suggestions_allselected_imp_ext_subprobs_improve.csv'))
samples (pandas dataframe with samples nsamp x ncomponents)
- Return type:
csv file
- src.utils.select_des_n_samp_random_pts(all_val_samples, des_n_samp=15)[source]
choose des_n_samp random points from all samples
- Parameters:
all_val_samples (np array of samples of dimension number of points x number of components)
des_n_samp (number of desired selected random points, default = 15)
- Returns:
tol_samples
- Return type:
np array of reduced samples of length des_n_samp
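Choosing `des_n_samp` random points is sampling without replacement. A minimal sketch, assuming a seeded generator for reproducibility and that the original row order is preserved (neither is stated above; the function name and the `seed` parameter are hypothetical):

```python
import random


def pick_random_points_sketch(rows, des_n_samp=15, seed=1234):
    """Return des_n_samp distinct rows chosen at random, in their original order."""
    rng = random.Random(seed)
    k = min(des_n_samp, len(rows))  # never ask for more rows than exist
    chosen = sorted(rng.sample(range(len(rows)), k))
    return [rows[i] for i in chosen]


all_rows = [[i / 10, 1 - i / 10] for i in range(10)]
subset = pick_random_points_sketch(all_rows, des_n_samp=4)
```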
- src.utils.select_most_uniform_samples(samples, num_samples=90)[source]
Select a subset of samples that maximizes uniformity using pairwise distance.
- Parameters:
samples (np.array) – Array of all samples (shape: n_samples x n_dimensions).
num_samples (int) – Number of samples to select.
- Returns:
uniform_samples (np.array) – Array of selected samples (shape: num_samples x n_dimensions).
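A common heuristic for spreading a subset evenly is greedy farthest-point (maximin) selection: start from one point and repeatedly add the candidate whose nearest selected neighbor is farthest away. The sketch below uses that heuristic as a stand-in; it is not necessarily the criterion `select_most_uniform_samples` implements, and starting from the first sample is an arbitrary choice.

```python
import math


def farthest_point_subset_sketch(samples, num_samples):
    """Greedy maximin selection of num_samples points using Euclidean distance."""
    if not samples or num_samples <= 0:
        return []
    chosen = [0]  # arbitrary starting point (assumption)
    # Distance from every sample to its nearest chosen point; -1 marks chosen ones.
    min_dist = [math.dist(s, samples[0]) for s in samples]
    min_dist[0] = -1.0
    while len(chosen) < min(num_samples, len(samples)):
        nxt = max(range(len(samples)), key=min_dist.__getitem__)
        chosen.append(nxt)
        min_dist = [min(d, math.dist(s, samples[nxt])) for d, s in zip(min_dist, samples)]
        min_dist[nxt] = -1.0
    return [samples[i] for i in chosen]


points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (0.5, 0.0), (0.9, 0.0)]
spread = farthest_point_subset_sketch(points, 3)
```

The greedy rule is cheap and deterministic, but it only approximates the subset with the best overall pairwise spacing.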