UAH LTG IDL Library

IDL Routines from Phillip Bitzer and UAH Lightning Group


pmb_logistic_regression.pro

ATS606, Statistics, Logistic

includes main-level program

This routine performs logistic regression for a simple linear model. In logistic regression, the dependent ("measured") data is binary, consisting of 0's and 1's (e.g., "dry" and "wet"). The model, $y(x_i)$, yields a probability of an outcome.

The generalized linear model for this regression is $$\ln \left(\frac{y(x_i)}{1-y(x_i)}\right) = A + B x_i$$
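For reference, solving this relation for the probability gives the logistic curve, which always lies between 0 and 1. This is a standard rearrangement rather than anything specific to this routine: $$y(x_i) = \frac{e^{A + B x_i}}{1 + e^{A + B x_i}} = \frac{1}{1 + e^{-(A + B x_i)}}$$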

To find the estimators for $A,B$ that maximize the likelihood given the data, the (log) likelihood is used. The maximum is found with the Newton-Raphson method. See Wilks 7.3.2 (third edition) for more information.
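As a sketch of what one Newton-Raphson update involves for this model, write $p_i = y(x_i)$ for the modeled probability and $y_i$ for the observed 0/1 value (this shorthand is an assumption of this note, not notation taken from the routine). The updated parameters $A^*, B^*$ are then $$\begin{bmatrix} A^* \\ B^* \end{bmatrix} = \begin{bmatrix} A \\ B \end{bmatrix} - \begin{bmatrix} \frac{\partial^2 \ln L}{\partial A^2} & \frac{\partial^2 \ln L}{\partial A \, \partial B} \\ \frac{\partial^2 \ln L}{\partial B \, \partial A} & \frac{\partial^2 \ln L}{\partial B^2} \end{bmatrix}^{-1} \begin{bmatrix} \frac{\partial \ln L}{\partial A} \\ \frac{\partial \ln L}{\partial B} \end{bmatrix}$$

with first derivatives $$\frac{\partial \ln L}{\partial A} = \sum_i (y_i - p_i), \qquad \frac{\partial \ln L}{\partial B} = \sum_i x_i (y_i - p_i)$$

and second derivatives $$\frac{\partial^2 \ln L}{\partial A^2} = -\sum_i p_i (1 - p_i), \qquad \frac{\partial^2 \ln L}{\partial A \, \partial B} = -\sum_i x_i \, p_i (1 - p_i), \qquad \frac{\partial^2 \ln L}{\partial B^2} = -\sum_i x_i^2 \, p_i (1 - p_i)$$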

You can also find the (log) likelihood and associated parameters corresponding to a null hypothesis in which $B$ is held constant, while $A$ is free to vary. See the NULLB keyword.
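One common use of the two log-likelihoods, offered here as a suggestion rather than something the routine computes, is a likelihood ratio test. With $A_0$ the fitted intercept under the null hypothesis and $B_0$ the fixed slope (both symbols are introduced here for illustration), the test statistic is $$\Lambda^* = 2 \left[ \ln L(A, B) - \ln L(A_0, B_0) \right]$$

Because the null hypothesis constrains one parameter, $\Lambda^*$ is usually compared against a chi-square distribution with one degree of freedom; that approximation holds for large samples.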

Examples

See the main-level program for this routine applied to Wilks Example 7.4. The main-level program uses Coyote Graphics.

Author information

Author

Phillip M. Bitzer, University of Alabama in Huntsville, pm.bitzer "AT" uah.edu

History

Modification History:

First written: Mar 20, 2014

Added an output keyword for the log-likelihood; added a helper function for the log-likelihood. 20140331 PMB

Added the ability to find the null hypothesis of this regression, holding the B param fixed. This meant modifying a helper routine and adding a whole host of new keywords. 20140331

Routines

Routines from pmb_logistic_regression.pro

result = pmb_logistic_regression_logL(x, y, param)

This helper routine finds the log likelihood for a given set of parameters.

result = pmb_logistic_regression_nr(x, y, param [, /FIXB])

This helper routine performs one iteration of the Newton-Raphson method to find the maximum likelihood estimators.

result = pmb_logistic_regression(x, y, a, b [, TOL=float] [, /VERBOSE] [, MAXITER=integer] [, LOGL=float] [, NULLB=float] [, NULLLOGL=float] [, NULLPARAM=numeric array])

This routine performs the logistic regression of a simple linear model.

pmb_logistic_regression_logL

result = pmb_logistic_regression_logL(x, y, param)

This helper routine finds the log likelihood for a given set of parameters. Really, this could be its own function...

The log likelihood is given by: $$\ln L = \sum_i \left[y_i (A + B x_i) - \ln \left(1+e^{A+B x_i}\right)\right]$$

Return value

The log likelihood corresponding to the passed parameters.

Parameters

x in required type=numeric array

The independent data.

y in required type=numeric array

The dependent data.

param in required type=two element numeric array

The values of the parameters from which we calculate the log likelihood.
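A minimal sketch of the sum above, written in Python rather than IDL so it can be run stand-alone. It is not the library's implementation: the function name `logistic_log_likelihood` and the toy data are hypothetical, and the rearranged form of $\ln(1+e^{\eta})$ is a choice made here to avoid overflow.

```python
import math


def logistic_log_likelihood(x, y, param):
    """Return ln L = sum_i [ y_i (A + B x_i) - ln(1 + exp(A + B x_i)) ].

    x     : sequence of independent values
    y     : sequence of 0/1 outcomes, same length as x
    param : two-element sequence (A, B)
    """
    a, b = param
    total = 0.0
    for xi, yi in zip(x, y):
        eta = a + b * xi
        # ln(1 + e^eta), rearranged so exp() never sees a large positive argument
        log_one_plus_exp = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
        total += yi * eta - log_one_plus_exp
    return total


# Toy data invented for illustration (not the Wilks example).
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [0, 0, 1, 0, 1]

# With A = B = 0 every term reduces to -ln(2), so the result is -5 ln(2).
print(logistic_log_likelihood(x, y, (0.0, 0.0)))
```

The check with A = B = 0 is a handy sanity test for any implementation, since each observation then contributes exactly $-\ln 2$ regardless of its value.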

pmb_logistic_regression_nr

result = pmb_logistic_regression_nr(x, y, param [, /FIXB])

This helper routine performs one iteration of the Newton-Raphson method to find the maximum likelihood estimators.

Return value

A two-element array containing the "new" parameters.

Parameters

x in required type=numeric array

The independent data.

y in required type=numeric array

The dependent data.

param in required type=two element numeric array

The current values of the parameters.

Keywords

FIXB in optional type=boolean default=0B

If set, the "B" parameter is held fixed.
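To illustrate what a single iteration might look like, here is a minimal Python sketch under stated assumptions. It uses the standard gradient and second derivatives of the logistic log-likelihood and a hand-written 2x2 inverse; when `fix_b` is true it takes a one-dimensional Newton step in A only. The names `sigmoid`, `newton_raphson_step`, and `fix_b` are hypothetical, and the IDL routine may organize the computation differently.

```python
import math


def sigmoid(eta):
    """Logistic function, written to avoid overflow for large |eta|."""
    if eta >= 0.0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def newton_raphson_step(x, y, param, fix_b=False):
    """Return the updated [A, B] after one Newton-Raphson iteration."""
    a, b = param
    g_a = g_b = 0.0            # first derivatives of ln L
    h_aa = h_ab = h_bb = 0.0   # second derivatives of ln L
    for xi, yi in zip(x, y):
        p = sigmoid(a + b * xi)
        w = p * (1.0 - p)
        g_a += yi - p
        g_b += xi * (yi - p)
        h_aa -= w
        h_ab -= xi * w
        h_bb -= xi * xi * w

    if fix_b:
        # B is held fixed: one-dimensional Newton step in A only.
        return [a - g_a / h_aa, b]

    # Full step: param - H^{-1} g, using the closed-form 2x2 inverse.
    det = h_aa * h_bb - h_ab * h_ab
    delta_a = (h_bb * g_a - h_ab * g_b) / det
    delta_b = (h_aa * g_b - h_ab * g_a) / det
    return [a - delta_a, b - delta_b]


# Toy data invented for illustration (not the Wilks example).
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [0, 0, 1, 0, 1]

print(newton_raphson_step(x, y, [0.0, 0.0]))              # approximately [-0.4, 0.8]
print(newton_raphson_step(x, y, [0.0, 0.0], fix_b=True))  # approximately [-0.4, 0.0]
```

The sketch does not guard against a singular matrix of second derivatives, which can occur when the two outcomes are perfectly separated by x.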

pmb_logistic_regression

result = pmb_logistic_regression(x, y, a, b [, TOL=float] [, /VERBOSE] [, MAXITER=integer] [, LOGL=float] [, NULLB=float] [, NULLLOGL=float] [, NULLPARAM=numeric array])

This routine performs the logistic regression of a simple linear model.

Return value

A two-element array containing the maximum likelihood estimators A and B (in that order) of the linear model $$A + Bx$$

Parameters

x in required type=numeric array

The independent data.

y in required type=numeric array

The dependent data. Remember, logistic regression requires this to be a binary array of 0's and 1's.

a in required type=float or double

The initial guess of the intercept parameter.

b in required type=float or double

The initial guess of the slope parameter.

Keywords

TOL in optional type=float default=1e-3

The convergence tolerance. When two successive iterations differ by less than this value, the routine considers the maximum likelihood estimators to have converged.

VERBOSE in optional type=boolean default=0B

If set, print the parameters at each iteration.

MAXITER in optional type=integer default=20

The maximum number of iterations to be performed.

LOGL out optional type=float

The Log-Likelihood corresponding to the fit.

NULLB in optional type=float

If you provide a value, the routine will also find the best-fit value of A while holding B fixed at the value you give.

NULLLOGL out optional type=float

The Log-Likelihood corresponding to the null hypothesis.

NULLPARAM out optional type=numeric array

The parameters corresponding to the null hypothesis. Note that the second element of the array will be the same as the value passed as NULLB.
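The way these keywords fit together can be sketched as a short Python driver. This is an illustration under stated assumptions, not a port of the IDL routine: the names (`fit_logistic`, `newton_step`, and the dictionary keys) are hypothetical, and the convergence test used here (largest absolute parameter change between iterations) is an assumption, since the documentation does not say exactly which quantity is compared against TOL.

```python
import math


def sigmoid(eta):
    """Logistic function, written to avoid overflow for large |eta|."""
    if eta >= 0.0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def log_likelihood(x, y, a, b):
    """ln L = sum_i [ y_i (A + B x_i) - ln(1 + exp(A + B x_i)) ]."""
    total = 0.0
    for xi, yi in zip(x, y):
        eta = a + b * xi
        total += yi * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return total


def newton_step(x, y, a, b, fix_b=False):
    """One Newton-Raphson update of (A, B); B is left alone if fix_b is set."""
    g_a = g_b = h_aa = h_ab = h_bb = 0.0
    for xi, yi in zip(x, y):
        p = sigmoid(a + b * xi)
        w = p * (1.0 - p)
        g_a += yi - p
        g_b += xi * (yi - p)
        h_aa -= w
        h_ab -= xi * w
        h_bb -= xi * xi * w
    if fix_b:
        return a - g_a / h_aa, b
    det = h_aa * h_bb - h_ab * h_ab
    return (a - (h_bb * g_a - h_ab * g_b) / det,
            b - (h_aa * g_b - h_ab * g_a) / det)


def fit_logistic(x, y, a, b, tol=1e-3, max_iter=20, null_b=None, verbose=False):
    """Iterate Newton-Raphson steps; optionally also fit A with B fixed at null_b."""

    def iterate(a, b, fix_b):
        for i in range(max_iter):
            new_a, new_b = newton_step(x, y, a, b, fix_b)
            # Assumed convergence measure: largest parameter change.
            change = max(abs(new_a - a), abs(new_b - b))
            a, b = new_a, new_b
            if verbose:
                print(i + 1, a, b)
            if change < tol:
                break
        return a, b

    fit_a, fit_b = iterate(a, b, False)
    out = {"param": (fit_a, fit_b), "logl": log_likelihood(x, y, fit_a, fit_b)}
    if null_b is not None:
        null_a, _ = iterate(a, null_b, True)
        out["null_param"] = (null_a, null_b)
        out["null_logl"] = log_likelihood(x, y, null_a, null_b)
    return out


# Toy data invented for illustration (not the Wilks example).
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [0, 0, 1, 0, 1]
fit = fit_logistic(x, y, 0.0, 0.0, tol=1e-8, max_iter=50, null_b=0.0)
print(fit["param"], fit["logl"])
print(fit["null_param"], fit["null_logl"])
```

With `null_b=0.0` the null model has no dependence on x, so its fitted intercept is simply the log-odds of the observed fraction of 1's. The fitted log-likelihood can never be lower than the null one, because the null model is a special case of the full model.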


File attributes

Modification date: Thu Oct 19 09:15:08 2017

Lines: 76

Docformat: rst rst