Linear Regression with Skew-Normal Errors
snreg fits a linear regression model where the disturbance term follows
a skew-normal distribution. The function supports multiplicative
heteroskedasticity of the noise variance via a log-linear specification
(ln.var.v) and allows the skewness parameter to vary linearly with
exogenous variables (skew.v).
Arguments
- formula
an object of class formula specifying the regression: typically y ~ x1 + ..., where y is the dependent variable and the x's are regressors.
- data
an optional data.frame containing the variables in formula. If not found in data, variables are taken from environment(formula).
- subset
an optional logical or numeric vector specifying the subset of observations to be used in estimation.
- init.sk
numeric. Initial value for the (global) skewness parameter of the noise; can be NULL if skew.v is supplied with its own coefficients to initialize.
- ln.var.v
optional one-sided formula; e.g. ln.var.v = ~ z1 + z2. Specifies exogenous variables entering the (log) variance of the random noise component. If NULL, the noise variance is homoskedastic.
- skew.v
optional one-sided formula; e.g. skew.v = ~ z3 + z4. Specifies exogenous variables determining the skewness of the noise via a linear index; if NULL, the skewness is constant (scalar).
- start.val
optional numeric vector of starting values for all free parameters (regression coefficients, variance/heteroskedasticity parameters, skewness parameters).
- technique
character vector giving the preferred maximization routine(s) in order of preference. Currently recognized keywords include "nr" (Newton–Raphson), "bhhh", "nm" (Nelder–Mead), "bfgs", "cg". This scaffold does not implement them yet, but records the choice.
- vcetype
character specifying the variance-covariance estimator type: "aim" for the approximated information matrix or "opg" for the outer product of gradients. Default is "aim".
- lmtol
numeric. Convergence tolerance based on the scaled gradient (if applicable). Default is 1e-5.
- reltol
numeric. Relative convergence tolerance for likelihood maximization. Default is 1e-12.
- maxit
integer. Maximum number of iterations for the optimizer. Default is 199.
- optim.report
integer. Verbosity for reporting progress (if implemented). Default is 1.
- optim.trace
integer. If positive, tracing information is printed (if implemented). Default is 1.
- print.level
integer. Printing level for summaries: 1: print estimation results; 2: print optimization details; 3: print compact summary. Default is 3.
- digits
integer. Number of digits for printing. Default is 4.
- only.data
logical. If TRUE, the function returns only the constructed model matrices and design sets (no estimation). Default is FALSE.
- ...
additional arguments reserved for future methods (e.g., box constraints).
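The one-sided formulas passed as ln.var.v and skew.v each describe a linear index, so each corresponds to a design matrix. The sketch below shows one plausible way to build such matrices with base R's model.matrix. It does not reproduce snreg's internal code, and the variable names (z1, z2, z3) and the toy data are hypothetical.

```r
# Toy data; all variable names here are hypothetical
set.seed(1)
d <- data.frame(y = rnorm(6), x1 = rnorm(6),
                z1 = runif(6), z2 = runif(6), z3 = runif(6))

# Design matrix for the regression coefficients (the response is ignored)
X <- model.matrix(y ~ x1, data = d)

# Design matrix for the log-variance index, as in ln.var.v = ~ z1 + z2
W <- model.matrix(~ z1 + z2, data = d)

# Design matrix for the skewness index, as in skew.v = ~ z3
S <- model.matrix(~ z3, data = d)

# Assumed handling of a NULL formula: an intercept-only column,
# i.e. a single constant variance or skewness parameter
W0 <- matrix(1, nrow = nrow(d), ncol = 1,
             dimnames = list(NULL, "(Intercept)"))

colnames(W)
colnames(S)
```

Each one-sided formula gains an intercept column by default, so the constant case is simply the same index with no further covariates.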
Value
An object of class "snreg" containing the maximum-likelihood results and,
depending on the optimization routine, additional diagnostics:
- par
Numeric vector of parameter estimates at the optimum.
- coef
Named numeric vector equal to par.
- vcov
Variance–covariance matrix of the estimates.
- sds
Standard errors, computed as sqrt(diag(vcov)).
- ctab
Coefficient table with columns: Estimate, Std.Err, Z value, Pr(>z).
- RSS
Residual sum of squares.
- esample
Logical vector indicating which observations were used in estimation.
- n
Number of observations used in the estimation sample.
- skewness
Vector of the fitted skewness index.
- hessian
(BFGS only) Observed Hessian at the optimum. If vcetype == "opg", this is set to the negative outer product of the individual gradients; otherwise a numerical Hessian is computed.
- value
(BFGS only) Objective value returned by optim. With control$fnscale = -1, this equals the maximized log-likelihood.
- counts
(BFGS only) Number of iterations / function evaluations returned by optim.
- convergence
(BFGS only) Convergence code from optim.
- message
(BFGS only) Additional optim message, if any.
- ll
Maximized log-likelihood value.
- gradient
(NR only) Gradient at the solution.
- gg
(NR only) Optional gradient-related diagnostic.
- gHg
(NR only) Optional Newton-step diagnostic.
- theta_rel_ch
(NR only) Relative parameter change metric across iterations.
The returned object has class "snreg".
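To make the relation between par, vcov, sds, and ctab concrete, here is a minimal base-R sketch that rebuilds a coefficient table from a parameter vector and its covariance matrix. The numbers and parameter names are made up for illustration, and the two-sided normal p-value is an assumption about how the Pr(>z) column is defined.

```r
# Hypothetical estimates and covariance matrix, for illustration only
par  <- c("(Intercept)" = 1.20, x1 = -0.45, lnVARv = -2.10)
vcov <- diag(c(0.04, 0.01, 0.09))
dimnames(vcov) <- list(names(par), names(par))

# Standard errors as described above
sds <- sqrt(diag(vcov))

# z statistics and (assumed) two-sided p-values from the standard normal
z    <- par / sds
pval <- 2 * pnorm(-abs(z))

# Coefficient table with the column names listed above
ctab <- cbind(Estimate = par, Std.Err = sds, "Z value" = z, "Pr(>z)" = pval)
print(round(ctab, 4))
```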
Details
The model is $$y_i = x_i^\top \beta + \varepsilon_i,\quad \varepsilon_i \sim SN(0, \sigma_i^2, \alpha_i),$$ where \(SN\) denotes the skew-normal distribution (Azzalini).
Heteroskedasticity in the noise variance (if specified via ln.var.v) is modeled as
$$\log(\sigma_i^2) = w_i^\top \gamma_v,$$
and the (optional) covariate-driven skewness (if specified via skew.v) as
$$\alpha_i = s_i^\top \delta.$$
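The three equations above can be turned into a log-likelihood in a few lines. The sketch below assumes Azzalini's direct parameterization with location zero, so the density of the error is (2/σ) φ(e/σ) Φ(α e/σ); whether snreg uses exactly this form, and how it orders the parameters, is an assumption. The function names dsn_log and sn_loglik and the simulated inputs are hypothetical.

```r
# Log-density of SN(0, sigma^2, alpha), direct parameterization (assumed)
dsn_log <- function(e, sigma, alpha) {
  z <- e / sigma
  log(2) - log(sigma) + dnorm(z, log = TRUE) + pnorm(alpha * z, log.p = TRUE)
}

# Log-likelihood for theta = (beta, gamma_v, delta); X, W, S are the
# design matrices of the mean, log-variance, and skewness equations
sn_loglik <- function(theta, y, X, W, S) {
  kb <- ncol(X); kv <- ncol(W); ks <- ncol(S)
  beta  <- theta[seq_len(kb)]
  gamma <- theta[kb + seq_len(kv)]
  delta <- theta[kb + kv + seq_len(ks)]
  sigma <- sqrt(exp(drop(W %*% gamma)))  # log(sigma_i^2) = w_i' gamma_v
  alpha <- drop(S %*% delta)             # alpha_i = s_i' delta
  e     <- y - drop(X %*% beta)          # residual y_i - x_i' beta
  sum(dsn_log(e, sigma, alpha))
}

# Sanity check on toy data: with alpha = 0 the skew-normal is Gaussian
set.seed(2)
n <- 50
x <- rnorm(n); zv <- runif(n)
X <- cbind(1, x); W <- cbind(1, zv); S <- matrix(1, n, 1)
y <- 1 + 0.5 * x + rnorm(n)
theta <- c(1, 0.5, -1, 0.3, 0)  # last element is delta, so alpha_i = 0

ll_sn   <- sn_loglik(theta, y, X, W, S)
ll_norm <- sum(dnorm(y, mean = drop(X %*% theta[1:2]),
                     sd = sqrt(exp(drop(W %*% theta[3:4]))), log = TRUE))
all.equal(ll_sn, ll_norm)
```

Working with log(σ²) keeps the variance positive for any value of γ_v, which is why no constraint is needed on those parameters.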
This function constructs the model frame and design matrices for \(\beta\), \(\gamma_v\), and \(\delta\), and is designed to be paired with a maximum likelihood routine to estimate parameters and (optionally) their asymptotic covariance via either AIM or OPG.
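As an illustration of pairing such a model with a maximum likelihood routine, the sketch below simulates data with homoskedastic, constant-skewness errors, maximizes the log-likelihood with optim (BFGS, control$fnscale = -1), and forms an OPG covariance estimate from numerical per-observation gradients. This is a self-contained toy under stated assumptions (direct parameterization with location zero, central-difference gradients, hypothetical names such as ll_i), not the estimation code used by snreg.

```r
set.seed(42)
n <- 500
x <- rnorm(n)

# Simulate skew-normal errors via the representation
# e = sigma * (d * |u0| + sqrt(1 - d^2) * u1), d = alpha / sqrt(1 + alpha^2)
alpha0 <- 2; sigma0 <- 0.5
d0 <- alpha0 / sqrt(1 + alpha0^2)
e  <- sigma0 * (d0 * abs(rnorm(n)) + sqrt(1 - d0^2) * rnorm(n))
y  <- 1 + 0.5 * x + e
X  <- cbind("(Intercept)" = 1, x = x)

# Per-observation log-likelihood; theta = (beta0, beta1, log sigma^2, alpha)
ll_i <- function(theta) {
  sigma <- sqrt(exp(theta[3]))
  z <- (y - drop(X %*% theta[1:2])) / sigma
  log(2) - log(sigma) + dnorm(z, log = TRUE) + pnorm(theta[4] * z, log.p = TRUE)
}

# Starting values: OLS coefficients, log residual variance, small skewness
ols   <- lm.fit(X, y)
start <- c(ols$coefficients, log(mean(ols$residuals^2)), 0.5)

# fnscale = -1 makes optim maximize, so fit$value is the log-likelihood
fit <- optim(start, function(th) sum(ll_i(th)), method = "BFGS",
             control = list(fnscale = -1, maxit = 199, reltol = 1e-12))

# OPG: n x k matrix of per-observation gradients by central differences
G <- sapply(seq_along(fit$par), function(j) {
  h <- 1e-5
  step <- replace(numeric(length(fit$par)), j, h)
  (ll_i(fit$par + step) - ll_i(fit$par - step)) / (2 * h)
})
vcov_opg <- solve(crossprod(G))  # inverse of the outer product of gradients
sds <- sqrt(diag(vcov_opg))

fit$value
cbind(Estimate = fit$par, Std.Err = sds)
```

Starting the skewness parameter away from zero is a deliberate choice in this sketch, since the skew-normal likelihood has a stationary point at α = 0 that a gradient-based optimizer may fail to leave.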
References
Azzalini, A. (1985). A Class of Distributions Which Includes the Normal Ones. Scandinavian Journal of Statistics, 12(2), 171–178.
Azzalini, A., & Capitanio, A. (2014). The Skew-Normal and Related Families. Cambridge University Press.
Examples
if (FALSE) { # \dontrun{
library(snreg)
data("banks07")
head(banks07)
# Translog cost function
spe.tl <- log(TC) ~ (log(Y1) + log(Y2) + log(W1) + log(W2))^2 +
  I(0.5 * log(Y1)^2) + I(0.5 * log(Y2)^2) +
  I(0.5 * log(W1)^2) + I(0.5 * log(W2)^2)
# -------------------------------------------------------------
# Specification 1: homoskedastic noise with constant (scalar) skewness
# -------------------------------------------------------------
formSV <- NULL # variance equation
formSK <- NULL # skewness equation
m1 <- snreg(
  formula  = spe.tl,
  data     = banks07,
  ln.var.v = formSV,
  skew.v   = formSK
)
coef(m1)
# -------------------------------------------------------------
# Specification 2: heteroskedastic + skewed noise
# -------------------------------------------------------------
formSV <- ~ log(TA) # heteroskedasticity in v
formSK <- ~ ER # skewness driven by equity ratio
m2 <- snreg(
  formula  = spe.tl,
  data     = banks07,
  ln.var.v = formSV,
  skew.v   = formSK
)
coef(m2)
} # }