Description
Problem
When inverting a quantile function Q: $[0,1]\rightarrow\mathbb{R}$, I want to use a bracketing rootfinder to constrain the roots to $[0,1]$ interval.
Derivative-based vs bracketing rootfinders
The existing derivative-based rootfinders (Newton, Powell) attempt to find the root outside of the $[0,1]$ interval leading to numerical issues in the quantile function calculation. The bracketing rootfinders (Brent and its iterations, incl Zhang, Riddlers, Chandrupatla, and TOMS748) require the interval, which contains the root, i.e. the interval $[a,b]$ such that $\text{sign}(\Omega(a))\neq \text{sign}(\Omega(b))$, where $\Omega$ is the target function.
Target function for inverting QFs
For the task of inverting a quantile function (QF) the $\Omega^{-}(u \vert x,\theta)=x-Q(u\vert\theta)$ is monotonic, non-increasing target function, where $x$ is the observation, $u$ is the depth corresponding to the observation $x$ and $Q(u\vert\theta)$ is a valid (non-decreasing) quantile function1. Finding the root of the target function $\Omega(u \vert x,\theta)$ is tantamount to finding the inverse of the quantile function $Q(u\vert\theta)$.
Quantile-based likelihoods
Given the random sample $\underline y$, we can use the quantile function $Q_Y(u \vert \theta)$ to compute $\underline{Q}={Q(u_1), \dots Q(u_n) \vert \theta}$, such that $u_i=F(y_i \vert \theta), i={1,2, \dots n}$. The depths $u_i$ are degenerate random variables because they are fully determined given the values of $\underline{y}$ and the parameter $\theta$. Since $Q_Y(u_i \vert \theta)=y_i$ we can substitute $\underline Q$ for $\underline y$. Then the Bayesian inference formula becomes
$$f(\theta \vert \underline{Q}) \propto \mathcal{L}(\theta;\underline{Q})f(\theta)$$
$$\mathcal{L}(\theta;\underline{Q})=\prod_{i=1}^{n} f(Q(u_i \vert \theta))=\prod_{i=1}^n[q(u_i \vert \theta)]^{-1}$$
where $\underline{Q}={Q(u_1), \dots Q(u_n) \vert \theta}$, such that $u_i=F(y_i \vert \theta), i={1,2, \dots n}$. In case the CDF $F(\underline y \vert \theta)$ is not available, the numeric approximation of $\widehat{Q^{-1}}(\underline{y} \vert \theta)$ has to be used (Perepolkin, Goodrich & Sahlin, 2021). This is where numerical inverting of quantile functions become useful.
Quantile functions are easily extensible using the Gilchrist's transformation rules (Gilchrist, 2000). Modification of quantile functions can help create highly flexible quantile-based likelihoods, (ref Award 2153019), which may not be invertible.
Example
While logistic quantile function $Q(u)=\text{logit(u)}=\ln(u)-\ln(1-u)$ is easily invertible, introducing the weights to the exponential components of the logistic quantile function $Q(u\vert \delta)=(1-\delta)\ln(u)-\delta\ln(1-u)$ creates the skew-logistic (SLD) quantile function, which, unfortunately, is no longer invertible.
Let's assume the data Y is inversely distributed as the Skew-Logistic Distribution2 with location $\mu$, scale $\sigma$ and skewness $\delta$, assigned some diffuse priors.
$$Y\overset{u}{\backsim} SLD(\mu,\sigma,\delta)$$
$$\mu \sim N(0,1)$$
$$\sigma \sim Exp(1)$$
$$\delta \sim Beta(2,2)$$
In Stan it would be implemented as:
parameters {
real mu; // location
real<lower=0> signa; // scale
real<lower=0,upper=1> delta; //skewness
}
model {
vector[N] u;
//Priors
target += normal_lpdf(mu| 0,1);
target += exponential_lpdf(sigma| 1);
target += beta_lpdf(delta| 1,1);
for (i in 1:N){
// numerically inverted quantile function, using bracketing rootfinding algorithm, i.e. on [0,1] without u_guess
u[i] = inv_qf(skewlogistic_qf, y[i], mu, sigma, delta, rel_tol, max_steps);
target += skewlogistic_ldqf_lpdf(u[i] | lambda);
}
}
generic QF inverting function with the scalar/vector signature
real inv_qf(function qf, data real x, data real rel_tol, data int max_steps,...)
vector inv_qf(function qf, data vector x, data real rel_tol, data int max_steps,...)
takes a reference to a quantile function qf(u,...) with the parameters passed through the ellipsis .... The tolerance rel_tol and maximum number of steps max_steps regulate the convergence.
Expected Output
The output of the inv_qf() is the vector(real) values of depths u, which are roots of target function $\Omega^{-}(u \vert x,\theta)=x-Q(u\vert\theta)$ for the observations x given the value of parameter $\theta$ (passed to qf() through ellipsis).
Current Version:
v4.5.0
References
Gilchrist W. 2000. Statistical modelling with quantile functions. Boca Raton: Chapman & Hall/CRC.
Description
Problem
When inverting a quantile function Q:$[0,1]\rightarrow\mathbb{R}$ , I want to use a bracketing rootfinder to constrain the roots to $[0,1]$ interval.
Derivative-based vs bracketing rootfinders
The existing derivative-based rootfinders (Newton, Powell) attempt to find the root outside of the$[0,1]$ interval leading to numerical issues in the quantile function calculation. The bracketing rootfinders (Brent and its iterations, incl Zhang, Riddlers, Chandrupatla, and TOMS748) require the interval, which contains the root, i.e. the interval $[a,b]$ such that $\text{sign}(\Omega(a))\neq \text{sign}(\Omega(b))$ , where $\Omega$ is the target function.
Target function for inverting QFs
For the task of inverting a quantile function (QF) the$\Omega^{-}(u \vert x,\theta)=x-Q(u\vert\theta)$ is monotonic, non-increasing target function, where $x$ is the observation, $u$ is the depth corresponding to the observation $x$ and $Q(u\vert\theta)$ is a valid (non-decreasing) quantile function1. Finding the root of the target function $\Omega(u \vert x,\theta)$ is tantamount to finding the inverse of the quantile function $Q(u\vert\theta)$ .
Quantile-based likelihoods
Given the random sample$\underline y$ , we can use the quantile function $Q_Y(u \vert \theta)$ to compute $\underline{Q}={Q(u_1), \dots Q(u_n) \vert \theta}$ , such that $u_i=F(y_i \vert \theta), i={1,2, \dots n}$ . The depths $u_i$ are degenerate random variables because they are fully determined given the values of $\underline{y}$ and the parameter $\theta$ . Since $Q_Y(u_i \vert \theta)=y_i$ we can substitute $\underline Q$ for $\underline y$ . Then the Bayesian inference formula becomes
where$\underline{Q}={Q(u_1), \dots Q(u_n) \vert \theta}$ , such that $u_i=F(y_i \vert \theta), i={1,2, \dots n}$ . In case the CDF $F(\underline y \vert \theta)$ is not available, the numeric approximation of $\widehat{Q^{-1}}(\underline{y} \vert \theta)$ has to be used (Perepolkin, Goodrich & Sahlin, 2021). This is where numerical inverting of quantile functions become useful.
Quantile functions are easily extensible using the Gilchrist's transformation rules (Gilchrist, 2000). Modification of quantile functions can help create highly flexible quantile-based likelihoods, (ref Award 2153019), which may not be invertible.
Example
While logistic quantile function$Q(u)=\text{logit(u)}=\ln(u)-\ln(1-u)$ is easily invertible, introducing the weights to the exponential components of the logistic quantile function $Q(u\vert \delta)=(1-\delta)\ln(u)-\delta\ln(1-u)$ creates the skew-logistic (SLD) quantile function, which, unfortunately, is no longer invertible.
Let's assume the data Y is inversely distributed as the Skew-Logistic Distribution2 with location$\mu$ , scale $\sigma$ and skewness $\delta$ , assigned some diffuse priors.
In Stan it would be implemented as:
generic QF inverting function with the scalar/vector signature
takes a reference to a quantile function
qf(u,...)with the parameters passed through the ellipsis.... The tolerancerel_toland maximum number of stepsmax_stepsregulate the convergence.Expected Output
The output of the$\Omega^{-}(u \vert x,\theta)=x-Q(u\vert\theta)$ for the observations $\theta$ (passed to
inv_qf()is the vector(real) values of depthsu, which are roots of target functionxgiven the value of parameterqf()through ellipsis).Current Version:
v4.5.0
References
Gilchrist W. 2000. Statistical modelling with quantile functions. Boca Raton: Chapman & Hall/CRC.
Footnotes
An alternative (non-decreasing) formulation of the target function $\Omega^{+}(u \vert x,\theta)=Q(u\vert\theta)-x$ is equally plausible. ↩
We say that the data Y is "inversely distributed as SLD", because SLD lacks the CDF, therefore, the data can not be "distributed as". The depth, corresponding to observations of Y is distributed as SLD (via the quantile function). Therefore Y is said to be "inversely distributed as" (Perepolkin, Goodrich & Sahlin, 2021). ↩