Akaike's Information Criterion (AIC) and Bayesian Information Criterion (BIC)¶
Akaike's Information Criterion (AIC)¶
A measure of how well the selected distribution fits the data.
- The preferred distribution is the one with the minimum AIC value.
AIC = -2L + 2p + (2p*(p+1)) / (n-(p+1))
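where L = log-likelihood of the resulting fit, p = number of parameters in the distribution, and n = sample size. The last term is a small-sample correction (this form is often written AICc); it shrinks toward zero as n grows and requires n > p + 1.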
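As a minimal sketch of the formula above (not Quantum XL's own code), the calculation can be written in a few lines of Python. The function name `aic` and the example numbers are hypothetical and only illustrate the arithmetic.

```python
def aic(loglik: float, p: int, n: int) -> float:
    """AIC with the small-sample correction term, as in the formula above.

    loglik: maximized log-likelihood L of the fit
    p:      number of parameters in the distribution
    n:      sample size
    """
    if n <= p + 1:
        # The correction term divides by n - (p + 1), so it needs n > p + 1.
        raise ValueError("the correction term needs n > p + 1")
    return -2.0 * loglik + 2.0 * p + (2.0 * p * (p + 1)) / (n - (p + 1))


# Hypothetical fit: L = -120.5, 2 parameters, 50 observations.
example_aic = aic(-120.5, p=2, n=50)
print(round(example_aic, 4))  # 245.2553
```

Here 241 comes from -2L, 4 from 2p, and 12/47 from the correction term.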
Bayesian Information Criterion (BIC)¶
A measure of how well the selected distribution fits the data.
- The preferred distribution is the one with the minimum BIC value. BIC penalizes each additional parameter by log(n), which is stronger than the 2p term of AIC once log(n) exceeds 2 (natural log, so n of 8 or more).
BIC = -2L + p*log(n)
where L = log-likelihood of the resulting fit, p = number of parameters in the distribution, and n = sample size.
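The BIC formula can be sketched the same way. This is an illustration only: the function name `bic` and the example numbers are hypothetical, and `log` is assumed to be the natural logarithm. The loop compares the per-parameter penalty log(n) against the constant 2 in AIC's 2p term for a few sample sizes.

```python
import math


def bic(loglik: float, p: int, n: int) -> float:
    """BIC = -2L + p*log(n), using the natural logarithm (an assumption)."""
    return -2.0 * loglik + p * math.log(n)


# Hypothetical fit: L = -120.5, 2 parameters, 50 observations.
example_bic = bic(-120.5, p=2, n=50)
print(round(example_bic, 3))

# Per-parameter penalty: log(n) for BIC versus the constant 2 in AIC's 2p term.
for n in (5, 8, 50, 1000):
    print(n, round(math.log(n), 3), "BIC penalty larger:", math.log(n) > 2.0)
```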
Both AIC and BIC are based on the log-likelihood of the distribution being fitted. Quantum XL uses Maximum Likelihood Estimation to find the parameters for each distribution. This process maximizes the log-likelihood; when it completes, the maximized value of the log-likelihood function is used for both AIC and BIC.
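The workflow described above (fit by maximum likelihood, then score with AIC and BIC) can be sketched as follows. This is not Quantum XL's implementation: it uses only two distributions whose maximum likelihood estimates have closed forms (exponential and normal), the helper names are hypothetical, and the data are simulated with a fixed seed.

```python
import math
import random


def fit_exponential(data):
    """Closed-form MLE for the exponential; returns (log-likelihood, p)."""
    n = len(data)
    mean = sum(data) / n                      # MLE of the mean (scale)
    loglik = -n * math.log(mean) - n          # log-likelihood at the MLE
    return loglik, 1


def fit_normal(data):
    """Closed-form MLE for the normal; returns (log-likelihood, p)."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n    # MLE divides by n, not n - 1
    loglik = -0.5 * n * (math.log(2 * math.pi * var) + 1)
    return loglik, 2


def aic(loglik, p, n):
    return -2.0 * loglik + 2.0 * p + (2.0 * p * (p + 1)) / (n - (p + 1))


def bic(loglik, p, n):
    return -2.0 * loglik + p * math.log(n)


# Simulated, positive, right-skewed data (exponential with mean 2).
rng = random.Random(0)
data = [rng.expovariate(0.5) for _ in range(200)]
n = len(data)

results = {}
for name, fit in (("Exponential", fit_exponential), ("Normal", fit_normal)):
    loglik, p = fit(data)
    results[name] = (aic(loglik, p, n), bic(loglik, p, n))
    print(name, "AIC=%.2f" % results[name][0], "BIC=%.2f" % results[name][1])

# The preferred distribution is the one with the minimum value.
best_by_aic = min(results, key=lambda k: results[k][0])
best_by_bic = min(results, key=lambda k: results[k][1])
print("Preferred by AIC:", best_by_aic, "| preferred by BIC:", best_by_bic)
```

Distributions without closed-form estimates would need a numerical optimizer in place of the two `fit_*` helpers; the scoring step stays the same.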
For information comparing AIC with BIC, see Burnham & Anderson (2002) and Yang (2005).
Advantages and Disadvantages of AIC/BIC vs. Anderson Darling¶
Advantages of AIC/BIC
- AIC/BIC can be calculated for any distribution. An Anderson Darling p-value is not available for three-parameter distributions with an offset (threshold) parameter. Note: The exception is the three-parameter Weibull, which does have an AD p-value.
- Anderson Darling can only reject a distribution's fit. Small datasets often have a large p-value because of the small sample size, not because the distribution fits well.
- Anderson Darling doesn't penalize a distribution for having more parameters. For example, the 3-Parameter Weibull reduces to the 1-Parameter exponential when Beta = 1 and Threshold = 0, so the more flexible distribution can always mimic the simpler one. Without a penalty for the extra parameters, this can lead to overfitting, a well-known problem in statistics.
- Parameter estimation is usually done via Maximum Likelihood, which has the sole goal of maximizing the log-likelihood of the distribution being fitted. The Anderson Darling statistic doesn't use the log-likelihood in its calculations. The statistic was originally derived assuming the distribution's parameters are known. Others, specifically D'Agostino (1986), derived adjustments for when the parameters are estimated from data, yet the statistic is still not tied to the method of parameter estimation (Maximum Likelihood).
- AIC/BIC can be calculated for censored data, whereas p-values for the Anderson Darling cannot.
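The overfitting point in the list above can be illustrated with a nested pair of distributions. The sketch below is an assumption-laden simplification: it uses the 2-Parameter Weibull (threshold fixed at 0) rather than the 3-Parameter form, finds the Weibull shape by bisection on the likelihood equation, and uses simulated exponential data. All function names are hypothetical. Because the exponential is the Weibull with shape 1, the Weibull's maximized log-likelihood can never be lower, so only the penalty terms in AIC/BIC can favor the simpler distribution.

```python
import math
import random


def exponential_loglik(data):
    """Maximized log-likelihood of the exponential (closed-form MLE)."""
    n = len(data)
    mean = sum(data) / n
    return -n * math.log(mean) - n


def weibull_mle(data, lo=0.05, hi=50.0, iters=200):
    """MLE for the 2-Parameter Weibull (threshold fixed at 0).

    The shape k solves an increasing score equation, found here by bisection;
    the bracket [lo, hi] is an assumption that suits typical data.
    """
    n = len(data)
    logs = [math.log(x) for x in data]
    mean_log = sum(logs) / n

    def score(k):
        xk = [x ** k for x in data]
        return sum(w * l for w, l in zip(xk, logs)) / sum(xk) - 1.0 / k - mean_log

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid      # root lies below mid
        else:
            lo = mid      # root lies at or above mid
    k = 0.5 * (lo + hi)
    scale = (sum(x ** k for x in data) / n) ** (1.0 / k)
    loglik = (n * math.log(k) - n * k * math.log(scale)
              + (k - 1) * sum(logs) - sum((x / scale) ** k for x in data))
    return k, scale, loglik


def aic(loglik, p, n):
    return -2.0 * loglik + 2.0 * p + (2.0 * p * (p + 1)) / (n - (p + 1))


rng = random.Random(1)
data = [rng.expovariate(0.5) for _ in range(30)]   # truly exponential data
n = len(data)

ll_exp = exponential_loglik(data)
shape, scale, ll_weib = weibull_mle(data)

print("log-likelihood  exponential=%.3f  Weibull=%.3f" % (ll_exp, ll_weib))
print("AIC             exponential=%.3f  Weibull=%.3f"
      % (aic(ll_exp, 1, n), aic(ll_weib, 2, n)))
```

The first line of output shows the raw log-likelihood improving (or at worst tying) with the extra parameter; the second shows the same comparison after the penalty is applied.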
Disadvantages of AIC/BIC
- AIC/BIC are not hypothesis tests. They only rank the candidate distributions against each other, so the distribution with the minimum AIC may still be a very poor fit.
For more information about AIC, see: Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, AC-19(6), pp. 716-723.
For more information about the Anderson Darling statistic, see:
- R.B. D'Agostino and M.A. Stephens, Eds. (1986). Goodness-of-Fit Techniques. Marcel Dekker.