Summary Utility (su)
su.RdComputes a compact table of summary statistics for each variable in a vector,
matrix, or data frame. The following metrics are returned per variable:
number of observations (Obs), missing values (NAs), mean,
standard deviation (StDev), interquartile range (IQR),
minimum (Min), user-specified quantiles (probs), and maximum (Max).
Usage
su(
x,
mat.var.in.col = TRUE,
digits = 4,
probs = c(0.1, 0.25, 0.5, 0.75, 0.9),
print = FALSE
)Arguments
- x
a numeric vector, matrix, or data frame. For matrices, variables are assumed to be in columns; set
mat.var.in.col = FALSEto treat rows as variables.- mat.var.in.col
logical. If
TRUE(default), a matrix is interpreted as variables in columns. IfFALSE, the matrix is transposed so that rows are treated as variables.- digits
integer. Number of digits to use when printing (only affects printed output when
print = TRUE). Default is4.- probs
numeric vector of probabilities in \([0, 1]\) for which quantiles are computed. Default is
c(0.1, 0.25, 0.5, 0.75, 0.9).logical. If
TRUE, prints the transposed summary table using the specified number of digits. Default isFALSE.
Value
A matrix (coercible to data.frame) where each row corresponds to a variable
and columns contain the summary statistics:
Obs, NAs, Mean, StDev, IQR, Min,
the requested probs quantiles (named), and Max.
The returned object is given class "snreg" for compatibility with
package-specific print/summarization methods.
Details
Compact Summary Statistics for Vectors, Matrices, and Data Frames
Input handling:
If
xis a matrix with a single row or column, it is treated like a vector. Column or row names are used (if available). Otherwise, a default name is created.If
xis a matrix with multiple variables, variables are taken as columns. Usemat.var.in.col = FALSEto transpose and treat rows as variables.If
xis a vector, its deparsed symbol name is used as the variable name.If
xis a data frame, each column is summarized.
Missing values are excluded in all summary computations.
Examples
if (FALSE) { # \dontrun{
# Vector
set.seed(1)
v <- rnorm(100)
su(v, print = TRUE)
# Matrix: variables in columns
M <- cbind(x = rnorm(50), y = runif(50))
su(M)
# Matrix: variables in rows
Mr <- rbind(x = rnorm(50), y = runif(50))
su(Mr, mat.var.in.col = FALSE)
# Data frame
DF <- data.frame(a = rnorm(30), b = rexp(30), c = rbinom(30, 1, 0.3))
out <- su(DF)
head(out)
} # }