Multi-dimensional gaussian fit

class multineas.multimin.ComposedMultiVariateNormal(Ngauss=0, Nvars=0, params=None, stdcorr=None, weights=None, mus=None, Sigmas=None)[source]

Bases: object

A linear combination of multivariate normal distribution (MND) with special methods for specifying the parameters of the distributions.

Ngauss

Number of composed MND.

Type:

int

Nvars

Number of random variables.

Type:

int

mus

Array with average (mu) of random variables (Ngauss x Nvars).

Type:

numpy.ndarray

weights

Array with weights of each MND (Ngauss). NOTE: These weights are normalized at the end.

Type:

numpy.ndarray

sigmas

Standard deviation of each variable(Ngauss x Nvars).

Type:

numpy.ndarray

rhos

Elements of the upper triangle of the correlation matrix (Ngauss x Nvars x (Nvars-1)/2).

Type:

numpy.ndarray

Sigmas

Array with covariance matrices for each MND (Ngauss x Nvars x Nvars).

Type:

numpy.ndarray

params

Parameters of the distribution in flatten form including symmetric elements of the covariance matrix (Ngauss*(1+Nvars+Nvars*(Nvars+1)/2)).

Type:

numpy.ndarray

stdcorr

Parameters of the distribution in flatten form, including upper triangle of the correlation matrix (Ngauss*(1+Nvars+Nvars*(Nvars+1)/2)).

Type:

numpy.ndarray

Notes

There are several ways of initialize a CMND:

  1. Providing: Ngauss and Nvars In this case the class is instantiated with zero means, unitary dispersion and covariance matrix equal to Ngasus identity matrices Nvars x Nvars.

  2. Providing: params, Nvars In this case you have a flatted version of the parametes (weights, mus, Sigmas) and want to instantiate the system. All parameters are set and no other action is required.

  3. Providing: stdcorr, Nvars In this case you have a flatted version of the parametes (weights, mus, sigmas, rhos) and want to instantiate the system. All parameters are set and no other action is required.

  4. Providing: weights, mus, Sigmas (optional) In this case the basic properties of the CMND are set.

Examples

>>> mus = [[0, 0], [1, 1]]
>>> weights = [0.1, 0]
>>> MND1 = ComposedMultiVariateNormal(mus=mus, weights=weights)
>>> MND1.set_sigmas([[[1, 0.2], [0, 1]], [[1, 0], [0, 1]]])
>>> print(MND1)
>>> params = [0.1, 0.9, 0, 0, 1, 1, 1, 0.2, 0.2, 1, 1, 0, 0, 1]
>>> MND2 = ComposedMultiVariateNormal(params=params, Nvars=2)
>>> print(MND2)
pdf(X)[source]

Compute the PDF.

Parameters:

X (numpy.ndarray) – Point in the Nvar-dimensional space (Nvar).

Returns:

  • p (float) – PDF value at X.

  • Attr. [HC]

plot_sample(data=None, N=10000, props=None, ranges=None, figsize=2, sargs={}, hargs=None)[source]

Plot a sample of the CMND.

Parameters:
  • data (numpy.ndarray, optional) – Data to plot. If None it generate a sample.

  • N (int, optional) – Number of points to generate the sample (default 10000).

  • props (list, optional) – Array with the name of the properties. Ex. props=[“x”,”y”].

  • ranges (list, optional) – Array of ranges of the properties. Ex. ranges=[-3,3],[-5,5].

  • figsize (int, optional) – Size of each axis (default 2).

  • sargs (dict, optional) – Dictionary with options for the scatter plot. Ex. sargs=dict(color=’r’).

  • hargs (dict, optional) – Dictionary with options for the hist2d function. Ex. hargs=dict(bins=50).

Returns:

G – Graphic handle. If Nvars = 2, it is a figure object, otherwise is a CornerPlot instance.

Return type:

matplotlib.figure.Figure or CornerPlot

Examples

>>> G = CMND.plot_sample(N=10000, sargs=dict(s=1, c='r'))
>>> G = CMND.plot_sample(N=1000, sargs=dict(s=1, c='r'), hargs=dict(bins=20))
>>> CMND = ComposedMultiVariateNormal(Ngauss=1, Nvars=2)
>>> fig = CMND.plot_sample(N=1000, hargs=dict(bins=20), sargs=dict(s=1, c='r'))
>>> CMND = ComposedMultiVariateNormal(Ngauss=2, Nvars=3)
>>> print(CMND)
>>> mus = [[0, 0], [1, 1]]
>>> weights = [0.1, 0.9]
>>> Sigmas = [[[1, 0.2], [0, 1]], [[1, 0], [0, 1]]]
>>> MND1 = ComposedMultiVariateNormal(mus=mus, weights=weights, Sigmas=Sigmas)
>>> #MND1=ComposedMultiVariateNormal(mus=mus,weights=weights);MND1.set_sigmas(Sigmas)
>>> print(MND1)
>>> print(MND1.pdf([1, 1]))
>>> params = [0.1, 0.9, 0.0, 0.0, 1.0, 1.0, 1.0, 0.2, 1.0, 1.0, 0.0, 1.0]
>>> MND2 = ComposedMultiVariateNormal(params=params, Nvars=2)
>>> print(MND2)
>>> print(MND2.pdf([1, 1]))

Attr. [HC]

rvs(Nsam=1)[source]

Generate a random sample of points following this Multivariate distribution.

Parameters:

Nsam (int, optional) – Number of samples (default 1).

Returns:

  • rs (numpy.ndarray) – Samples (Nsam x Nvars).

  • Attr. [HC]

sample_cmnd_likelihood(uparams, data=None, pmap=None, tset='stdcorr', scales=[], verbose=0)[source]

Compute the negative value of the logarithm of the likelihood of a sample.

Parameters:
  • uparams (numpy.ndarray) – Minimization parameters (unbound).

  • data (numpy.ndarray, optional) – Data for which log_l is computed.

  • pmap (function, optional) – Routine to map from minparams to params or stdcorr. Example: >>> def pmap(minparams): … stdcorr = np.array([1] + list(minparams)) … stdcorr[-1:] -= 1 … return stdcorr

  • tset (str, optional) – Type of minimization parameters. Values “params”, “stdcorr” (default “stdcorr”).

  • scales (list, optional) – List of scales for transforming uparams (unbound) in minparams (natural scale).

  • verbose (int, optional) – Verbosity level (0, none, 1: input parameters, 2: full definition of the CMND) (default 0).

Returns:

  • log_l (float) – Negative log-likelihood.

  • Attr. [HC]

set_params(params, Nvars)[source]

Set the properties of the CMND from flatten params.

After setting it generate flattend stdcorr and normalize weights.

Parameters:
  • params (list or numpy.ndarray) – Flattened parameters.

  • Nvars (int) – Number of variables.

  • [HC] (Attr.)

set_sigmas(Sigmas)[source]

Set the value of list of covariance matrices.

After setting Sigmas it update params and stdcorr.

Parameters:
  • Sigmas (list or numpy.ndarray) – Array of covariance matrices.

  • [HC] (Attr.)

set_stdcorr(stdcorr, Nvars)[source]

Set the properties of the CMND from flatten stdcorr.

After setting it generate flattened params and normalize weights.

Parameters:
  • stdcorr (list or numpy.ndarray) – Flattened standard deviations and correlations.

  • Nvars (int) – Number of variables.

  • [HC] (Attr.)

class multineas.multimin.FitCMND(objfile=None, Ngauss=1, Nvars=2)[source]

Bases: object

CMND Fitting handler.

Ngauss

Number of fitting MND.

Type:

int

Nvars

Number of variables in each MND.

Type:

int

cmnd

Fitting object of the class CMND. This object will have the result of the fitting procedure.

Type:

ComposedMultiVariateNormal

solution
Once the fitting is completed the solution object is returned. Attributes include:
  • fun: value of the function in the minimum.

  • x: value of the minimization parameters in the minimum.

  • nit: number of iterations of the minimization algorithm.

  • nfev: number of evaluations of the function.

  • success: if True it implies that the minimization fullfills all the conditions.

Type:

scipy.optimize.OptimizeResult

Ndim

Number of mus.

Type:

int

Ncorr

Number of correlations.

Type:

int

minparams

Array of minimization parameters at any stage in minimization process.

Type:

numpy.ndarray

scales

Array of scales to convert minparams to uparams (unbound) and viceversa.

Type:

numpy.ndarray

uparams

Array of unbound minimization parameters.

Type:

numpy.ndarray

Notes

Objects of this class must be always initialized with the number of gaussians and the number of random variables.

Examples

>>> np.random.seed(1)
>>> weights = [[1.0]]
>>> mus = [[1.0, 0.5, -0.5]]
>>> sigmas = [[1, 1.2, 2.3]]
>>> angles = [[10*Angle.Deg, 30*Angle.Deg, 20*Angle.Deg]]
>>> Sigmas = Stats.calc_covariance_from_rotation(sigmas, angles)
>>> CMND = ComposedMultiVariateNormal(mus=mus, weights=weights, Sigmas=Sigmas)
>>> data = CMND.rvs(10000)
>>> F = FitCMND(Ngauss=CMND.Ngauss, Nvars=CMND.Nvars)
>>> F.cmnd._fevfreq = 200
>>> bounds = None
>>> # bounds = F.set_bounds(boundw=(0.1, 0.9))
>>> # bounds = F.set_bounds(boundr=(-0.9, 0.9))
>>> # bounds = F.set_bounds(bounds=(0.1, 0.9*F._sigmax))
>>> # bounds = F.set_bounds(boundsm=((-3, 3), (-2, 2), (-2, 2)), boundw=(0.1, 0.9), bounds=(0.1, 0.9*F._sigmax), boundr=(-0.9, 0.9))
>>> print(bounds)
>>> Util.el_time(0)
>>> # F.fit_data(data, verbose=0, tol=1e-3, options=dict(maxiter=100, disp=True), bounds=bounds)
>>> F.fit_data(data, verbose=0, tol=1e-3, options=dict(maxiter=100, disp=True), method=None, bounds=bounds)
>>> T = Util.el_time()
>>> print(F.cmnd)
>>> G = F.plot_fit(figsize=3, hargs=dict(bins=30, cmap='YlGn'), sargs=dict(s=0.5, edgecolor='None', color='r'))
>>> F.save_fit("/tmp/fit.pkl", useprefix=False)
>>> F._load_fit("/tmp/fit.pkl")
>>> F.save_fit("/tmp/nuevo.pkl", useprefix=True, myprefix="test")
fit_data(data, verbose=0, advance=0, **args)[source]

Minimization procedure.

It updates the solution attribute.

Parameters:
  • data (numpy.ndarray) – Array with data (Nsam x Nvars).

  • verbose (int, optional) – Verbosity level for the sample_cmnd_likelihood routine (default 0).

  • advance (int, optional) – If larger than 0 show advance each “advance” iterations (default 0).

  • **args (dict) –

    Options of the minimize routine (eg. tol=1e-6). A particularly interesting parameter is the minimization method. Available methods:

    • Slow but sure: Powell

    • Fast but unsure: CG, BFGS, COBYLA, SLSQP

Return type:

None

Examples

>>> F = FitCMND(1, 3)
>>> F.fit_data(data, verbose=0, tol=1e-3, options=dict(maxiter=100, disp=True))
log_l(data)[source]

Value of the -log(Likelihood).

Parameters:

data (numpy.ndarray) – Array with data (Nsam x Nvars).

Returns:

log_l – Value of the -log(Likelihood).

Return type:

float

plot_fit(N=10000, figsize=2, props=None, ranges=None, hargs={}, sargs={})[source]

Plot the result of the fitting procedure.

Parameters:
  • N (int, optional) – number of points used to build a representation of the marginal distributions (default 10000).

  • figsize (int, optional) – Size of each axis (default 2).

  • props (list, optional) – Array with the name of the properties. Ex. props=[“x”,”y”].

  • ranges (list, optional) – Array of ranges of the properties. Ex. ranges=[-3,3],[-5,5].

  • hargs (dict, optional) – Dictionary with options for the hist2d function. Ex. hargs=dict(bins=50).

  • sargs (dict, optional) – Dictionary with options for the scatter plot. Ex. sargs=dict(color=’r’).

Returns:

G – Graphic handle.

Return type:

matplotlib.figure.Figure or CornerPlot

Examples

>>> F = FitCMND(1, 3)
>>> F.fit_data(data, verbose=0, tol=1e-3, options=dict(maxiter=100, disp=True))
>>> G = F.plot_fit(figsize=3, hargs=dict(bins=30, cmap='YlGn'), sargs=dict(s=0.5, edgecolor='None', color='r'))
pmap(minparams)[source]

Mapping routine used in sample_cmnd_likelihood.

Mapping may change depending on the complexity of the parameters to be minimized. Here we assume that all parameters in the stdcorr vector is susceptible to be minimized (with the exception of weights in the case of Ngauss=1 when this parameter should not be included).

Parameters:

minparams (numpy.ndarray) – Minimization parameters.

Returns:

stdcorr – Flatten parameters with correlations.

Return type:

numpy.ndarray

save_fit(objfile=None, useprefix=True, myprefix=None)[source]

Pickle the result of a fit.

Parameters:
  • objfile (str, optional) – Name of the file where the fit will be stored. If None, the name is set by the routine as FitCMND.pkl.

  • useprefix (bool, optional) – Use a prefix in the filename of the pickle file (default True). The prefix is normally {Ngauss}cmnd_{hash}.

  • myprefix (str, optional) – Custom prefix.

Examples

If objfile=”fit.pkl”, the final filename will be fit-1mnd_asa33.pkl

set_bounds(boundw=None, bounds=None, boundr=None, boundsm=None)[source]

Set the minimization parameters.

Parameters:
  • boundw (tuple, optional) – Bound of weights (default (-np.inf, np.inf)).

  • bounds (tuple or list, optional) – Bounds of weights, mus, sigmas and rhos of each variable.

  • boundr (tuple, optional) – Bound of rhos (default (-np.inf, np.inf)).

  • boundsm (tuple of tuples, optional) – Bounds of averages (default (-np.inf, np.inf)). Normally the bounds on averages must be expressed in this way: boundsm = ((-min_1, max_1), (-min_2, max_2), …) Example for Nvars = 2: boundsm = ((-2, 1), (-3, 0))

Returns:

bounds – Formatted bounds for minimization.

Return type:

tuple

set_params(mu=0.5, sigma=1.0, rho=0.5)[source]

Set the value of the basic params (minparams, scales, etc.).

It updates minparams, scales and uprams.

Parameters:
  • mu (float, optional) – Value of all initial mus (default 0.5).

  • sigma (float, optional) – Value of all initial sigmas (default 1.0).

  • rho (float, optional) – Value of all initial rhos (default 0.5).

Return type:

None