Parameter profiling#
Likelihood-based inference requires parameter estimation, so it is important to quantify the sensitivity of a chosen model
to each of those parameters. The profiler module includes the Profiler class that allows the linking of a model
to a set of observations by providing some goodness of fit metrics and “profiles” for all parameters. Profiles are
provided under the form of a dictionary of pandas.DataFrame objects. Each key is a parameter to profile, i.e. to fix
and vary while the other distribution parameters are optimized, and each associated data frame contains the values of all
of the distribution parameters as well as this of the score function (usually the opposite log-likelihood) throughout
the partial optimization.
If the distribution includes a trend in one of the parameters, the parameters of the trend will be profiled. If some
parameters were fixed in the distribution provided to the Profiler, the associated profiles are not computed.
Computing the profile likelihood can be done as follows.
>>> from pykelihood.profiler import Profiler
>>> from pykelihood.distributions import GEV
>>> fitted_gev = GEV.fit(data, loc=kernels.linear(np.linspace(-1, 1, len(data))))
>>> profiler = Profiler(fitted_gev, data, inference_confidence=0.99) # level of confidence for tests
>>> profiler.AIC # the standard fit is without trend
{'AIC MLE': -359.73533182968777, 'AIC Standard MLE Fit': 623.9896838880583}
>>> profiler.profiles.keys()
dict_keys(['loc_a', 'loc_b', 'scale', 'shape'])
>>> profiler.profiles["shape"].head(5)
loc_a loc_b scale shape score
0 -0.000122 1.000812 0.002495 -0.866884 1815.022132
1 -0.000196 1.000795 0.001964 -0.662803 1882.043541
2 -0.000283 1.000477 0.001469 -0.458721 1954.283256
3 -0.000439 1.000012 0.000987 -0.254640 2009.740282
4 -0.000555 1.000016 0.000948 -0.050558 1992.812843
A binary search algorithm implemented to compute the parameter confidence intervals allows for very efficient exploration
of the parameter space. It can be provided with a precision argument, defaulted to 10^{-5}.
For example, if the parameter of interest is the location of the GEV distribution, the profile likelihood-based associated
confidence interval is computed using the following syntax:
>>> profiler.confidence_interval("loc", precision=1e-3)
from which the output would be an array containing the lower and upper bound for the corresponding confidence interval (using the level defined as a parameter of the Profiler object).
>>> [-4.160287666875364, 4.7039931595123825]