mgkit.counts.glm module

New in version 0.3.3.

GLM models with metagenomes and metatranscriptomes. Experimental.

mgkit.counts.glm.fit_lowess_interpolate(endog, exog, frac=0.2, it=3, kind='slinear')

Fits a lowess for the passed endog (Y) and exog (X) and returns an interpolated function that describes it. The first 4 arguments are passed to statsmodels.api.nonparametric.lowess(), while the last one is passed to scipy.interpolate.interp1d().

Parameters
  • endog (array) – array of the dependent variable (Y)

  • exog (array) – array of the independent variable (X)

  • frac (float) – fraction of the number of elements to use when fitting (0.0-1.0)

  • it (int) – number of iterations to fit the lowess

  • kind (str) – type of interpolation to use

Returns

interpolated function representing the lowess fitted from the data passed

Return type

func

mgkit.counts.glm.lowess_ci_bootstrap(endog, exog, num=100, frac=0.2, it=3, alpha=0.05, delta=0.0, min_value=0.001, kind='slinear')

Bootstraps a lowess for the dependent (endog) and independent (exog) arguments.

Parameters
  • endog (array) – dependent variable (Y)

  • exog (array) – independent variable (X)

  • num (int) – number of iterations for the bootstrap

  • frac (float) – fraction of the array to use when fitting

  • it (int) – number of iterations used to fit the lowess

  • alpha (float) – significance level for the bootstrap confidence intervals

  • delta (float) – passed to statsmodels.api.nonparametric.lowess()

  • min_value (float) – minimum value for the function, used to avoid out-of-bounds values

  • kind (str) – type of interpolation passed to scipy.interpolate.interp1d()

Returns

the first element is the function describing the lower bound of the confidence interval, the second element the function for the upper bound, and the last one the function for the mean

Return type

tuple

Note

Performance increases with the value of delta.
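The general idea of a percentile bootstrap over fitted curves can be sketched as follows. This is an illustration only: the helper bootstrap_band is hypothetical, it returns arrays on a fixed grid rather than functions, and a least-squares straight line stands in for the lowess to keep the example dependent on numpy alone. A callable that fits a lowess and interpolates it could be passed as fit instead.

```python
import numpy as np


def bootstrap_band(endog, exog, fit, grid, num=100, alpha=0.05, seed=0):
    """Hypothetical helper: percentile bootstrap band for a fitted curve.

    fit(endog, exog) must return a callable that evaluates the fitted curve.
    """
    rng = np.random.default_rng(seed)
    endog = np.asarray(endog, dtype=float)
    exog = np.asarray(exog, dtype=float)
    curves = np.empty((num, len(grid)))
    for i in range(num):
        # resample (X, Y) pairs with replacement
        idx = rng.integers(0, len(exog), size=len(exog))
        curves[i] = fit(endog[idx], exog[idx])(grid)
    # percentile interval across the bootstrapped curves
    lower = np.quantile(curves, alpha / 2, axis=0)
    upper = np.quantile(curves, 1 - alpha / 2, axis=0)
    mean = curves.mean(axis=0)
    return lower, upper, mean


def line_fit(endog, exog):
    # stand-in smoother (NOT a lowess): least-squares straight line
    slope, intercept = np.polyfit(exog, endog, 1)
    return lambda x: slope * np.asarray(x) + intercept


# toy data (assumption): a noisy straight line
rng = np.random.default_rng(1)
x = np.linspace(1.0, 100.0, 200)
y = 2.0 * x + rng.normal(scale=5.0, size=x.size)
grid = np.linspace(1.0, 100.0, 25)

lower, upper, mean = bootstrap_band(y, x, line_fit, grid)
```

The order of the returned values mirrors the documented return tuple: lower bound, upper bound, mean.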

mgkit.counts.glm.optimise_alpha_scipy(formula, data, mean_func, q1_func, q2_func)

New in version 0.4.0.

Used to find an optimal alpha parameter for the Negative Binomial distribution used in statsmodels, using the lowess functions from lowess_ci_bootstrap().

Parameters
Returns

alpha value for the Negative Binomial

Return type

float

mgkit.counts.glm.optimise_alpha_scipy_function(args, formula, data, criterion='aic')

New in version 0.4.0.

mgkit.counts.glm.variance_to_alpha(mu, func, min_alpha=0.001)

Based on the variance defined in the Negative Binomial in statsmodels:

var = mu + alpha * (mu ** 2)

Parameters
  • mu (float) – mean to calculate the alphas for

  • func (func) – function that returns the variance for the mean

  • min_alpha (float) – value of alpha if the func goes out of bounds

Returns

value of alpha for the passed mean

Return type

float
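Solving var = mu + alpha * (mu ** 2) for alpha gives alpha = (var - mu) / (mu ** 2). The sketch below works through that inversion with a hypothetical helper, alpha_from_variance. Falling back to min_alpha whenever the estimate drops below it is an assumption about what "out of bounds" means here.

```python
def alpha_from_variance(mu, var_func, min_alpha=0.001):
    """Hypothetical helper: invert var = mu + alpha * mu ** 2 for alpha."""
    var = var_func(mu)
    alpha = (var - mu) / (mu ** 2)
    # assumption: use min_alpha when the estimate falls below it,
    # for example when the variance is smaller than the mean
    return max(alpha, min_alpha)


def toy_variance(mu):
    # toy variance function built with a known alpha of 0.5
    return mu + 0.5 * mu ** 2


# var = 10 + 0.5 * 100 = 60, so alpha = (60 - 10) / 100 = 0.5
alpha = alpha_from_variance(10.0, toy_variance)

# variance below the mean gives a negative estimate, so min_alpha is returned
under = alpha_from_variance(10.0, lambda mu: 0.5 * mu)
```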