Title: | Miscellaneous Tools and Utilities |
---|---|
Description: | Miscellaneous small tools and utilities. Many of them facilitate the work with matrices, e.g. inserting rows or columns, creating symmetric matrices, or checking for semidefiniteness. Other tools facilitate the work with regression models, e.g. extracting the standard errors, obtaining the number of (estimated) parameters, or calculating R-squared values. |
Authors: | Arne Henningsen, Ott Toomet |
Maintainer: | Arne Henningsen |
License: | GPL (>= 2) |
Version: | 0.6-29 |
Built: | 2024-10-24 03:28:47 UTC |
Source: | https://github.com/arne-henningsen/misctools |
Generate Table for Coefficients, Std. Errors, t-values and P-values.
coefTable( coef, stdErr, df = NULL )
coef |
vector that contains the coefficients. |
stdErr |
vector that contains the standard errors of the coefficients. |
df |
degrees of freedom of the t-test used to calculate P-values. |
a matrix with 4 columns: coefficients, standard errors, t-values, and P-values.
If argument df is not provided, the last column (P-values) is filled with NAs.
Arne Henningsen
coefTable( rnorm( 10 ), 0.5 * abs( rnorm( 10 ) ), 20 )
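The table described above can be approximated in base R. The following is a minimal sketch, not the package's implementation: the helper name coefTableSketch and its column names are hypothetical, and it assumes that the P-values come from a two-sided t-test.

```r
# Hypothetical helper (not part of the package); assumes two-sided P-values
coefTableSketch <- function( coef, stdErr, df = NULL ) {
   tVal <- coef / stdErr
   # without degrees of freedom, the P-values cannot be computed
   pVal <- if( is.null( df ) ) {
      rep( NA_real_, length( coef ) )
   } else {
      2 * pt( abs( tVal ), df, lower.tail = FALSE )
   }
   cbind( coef = coef, stdErr = stdErr, tValue = tVal, pValue = pVal )
}

res <- coefTableSketch( c( 1.0, -0.5 ), c( 0.25, 0.5 ), df = 20 )
resNoDf <- coefTableSketch( c( 1.0, -0.5 ), c( 0.25, 0.5 ) )
res
resNoDf
```

The second call omits df, so its last column contains only NA values, which mirrors the behaviour documented for coefTable().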
Compute the sample medians of the columns (non-rows) of a data.frame or array.
colMedians( x, na.rm = FALSE )
x |
a data.frame or array. |
na.rm |
a logical value indicating whether |
A vector or array of the medians of each column (non-row) of x with dimension dim( x )[-1].
Arne Henningsen
data( "Electricity", package = "Ecdat" )
colMedians( Electricity )
a4 <- array( 1:120, dim = c(5,4,3,2),
   dimnames = list( c("a","b","c","d","e"), c("A","B","C","D"),
   c("x","y","z"), c("Y","Z") ) )
colMedians( a4 )
median( a4[ , "B", "x", "Z" ] ) # equal to
colMedians( a4 )[ "B", "x", "Z" ]
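To illustrate what "column (non-row)" medians mean for a higher-dimensional array, the following base R sketch takes medians over the first dimension only and keeps all remaining dimensions. It uses apply() as a stand-in and is not necessarily how colMedians() is implemented.

```r
a4 <- array( 1:120, dim = c( 5, 4, 3, 2 ) )
# medians over the first dimension, keeping dimensions 2 to 4
med <- apply( a4, 2:4, median )
dim( med )   # the same as dim( a4 )[-1]
# a single element agrees with the median of the corresponding slice
med[ 2, 1, 2 ] == median( a4[ , 2, 1, 2 ] )
```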
Plot a scatterplot to compare two variables.
compPlot( x, y, lim = NULL, ... )
x |
values of the first variable (on the X axis). |
y |
values of the second variable (on the Y axis). |
lim |
optional vector of two elements specifying the limits of both axes. |
... |
further arguments are passed to |
Arne Henningsen
set.seed( 123 )
x <- runif( 25 )
y <- 2 + 3 * x + rnorm( 25 )
ols <- lm( y ~ x )
compPlot( y, fitted( ols ) )
compPlot( y, fitted( ols ), lim = c( 0, 10 ) )
compPlot( y, fitted( ols ), pch = 20 )
compPlot( y, fitted( ols ), xlab = "observed", ylab = "fitted" )
compPlot( y, fitted( ols ), log = "xy" )
This function returns the derivative(s) of the density function of the normal (Gaussian) distribution with respect to the quantile, evaluated at the quantile(s), mean(s), and standard deviation(s) specified by arguments x, mean, and sd, respectively.
ddnorm( x, mean = 0, sd = 1 )
x |
quantile or vector of quantiles. |
mean |
mean or vector of means. |
sd |
standard deviation or vector of standard deviations. |
numeric value(s): derivative(s) of the density function of the normal distribution with respect to the quantile
Arne Henningsen
ddnorm( c( -1, 0, 1 ) )
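For a normal density f(x) with mean mu and standard deviation sigma, the derivative with respect to x is -(x - mu) / sigma^2 * f(x). The sketch below implements this formula in base R under the hypothetical name ddnormSketch and cross-checks it against a central finite difference; it is an illustration, not the package's code.

```r
# Hypothetical helper: analytical derivative of the normal density
ddnormSketch <- function( x, mean = 0, sd = 1 ) {
   - ( x - mean ) / sd^2 * dnorm( x, mean = mean, sd = sd )
}

x <- c( -1, 0, 1 )
analytic <- ddnormSketch( x )
# central finite difference as a numerical cross-check
h <- 1e-6
numDeriv <- ( dnorm( x + h ) - dnorm( x - h ) ) / ( 2 * h )
all.equal( analytic, numDeriv, tolerance = 1e-6 )
```

The derivative is positive to the left of the mean, zero at the mean, and negative to the right of it.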
Plot a histogram and add a kernel density line.
histDens( x, breaks = "Sturges", ... )
x |
values of the variable. |
breaks |
passed to |
... |
further arguments are passed to |
Arne Henningsen
set.seed( 123 )
x <- rnorm( 100 )
histDens( x )
histDens( x, 20 )
histDens( x, 20, main = "My Title" )
Insert a new column into a matrix.
insertCol( m, c, v = NA, cName = "" )
m |
matrix. |
c |
column number where the new column should be inserted. |
v |
optional values of the new column. |
cName |
optional character string: the name of the new column. |
a matrix with one more column than the provided matrix m.
Arne Henningsen
m <- matrix( 1:4, 2 )
insertCol( m, 2, 5:6 )
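As a rough base R equivalent of the example above, the following sketch builds the enlarged matrix with cbind(). It assumes that the new column ends up at position c (here: 2) and that the existing columns from that position onwards are shifted to the right; this reading of argument c is an assumption.

```r
m <- matrix( 1:4, 2 )
# place the values 5:6 between the first and the second column of m
mNew <- cbind( m[ , 1, drop = FALSE ], 5:6, m[ , 2, drop = FALSE ] )
mNew
```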
Insert a new row into a matrix.
insertRow( m, r, v = NA, rName = "" )
m |
matrix. |
r |
row number where the new row should be inserted. |
v |
optional values for the new row. |
rName |
optional character string: the name of the new row. |
a matrix with one more row than the provided matrix m.
Arne Henningsen
m <- matrix( 1:4, 2 )
insertRow( m, 2, 5:6 )
Check whether a symmetric matrix is positive or negative semidefinite.
isSemidefinite( m, ... )
## Default S3 method:
isSemidefinite( m, ... )
## S3 method for class 'matrix'
isSemidefinite( m, positive = TRUE,
   tol = 100 * .Machine$double.eps,
   method = ifelse( nrow( m ) < 13, "det", "eigen" ), ... )
## S3 method for class 'list'
isSemidefinite( m, ... )
semidefiniteness( m, ... )
m |
a symmetric square matrix or a list containing symmetric square matrices. |
positive |
logical. Check for positive semidefiniteness
(if |
tol |
tolerance level (values between |
method |
method to test for semidefiniteness, either
checking the signs of the principal minors
(if |
... |
further arguments of |
Function semidefiniteness() passes all its arguments to isSemidefinite(). It is only kept for backward compatibility and may be removed in the future.
If argument positive is set to FALSE, isSemidefinite() checks for negative semidefiniteness by checking for positive semidefiniteness of the negative of argument m, i.e. -m.
If method "det" is used (default for matrices with up to 12 rows/columns), isSemidefinite() checks whether all principal minors (not only the leading principal minors) of the matrix m (or of the matrix -m if argument positive is FALSE) are larger than -tol. Due to rounding errors, which are unavoidable on digital computers, the calculated determinants of singular (sub-)matrices (which should theoretically be zero) can considerably deviate from zero. In order to reduce the probability of incorrect results due to rounding errors, isSemidefinite() does not calculate the determinants of (sub-)matrices with reciprocal condition numbers smaller than argument tol but sets the corresponding principal minors to (exactly) zero.
The number of principal minors of an $N \times N$ matrix is $\sum_{k=1}^{N} \binom{N}{k} = 2^N - 1$, which gets very large for large matrices. Therefore, it is not recommended to use method "det" for matrices with, say, more than 12 rows/columns.
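To make the principal-minor check and its cost more concrete, the following base R sketch enumerates all principal minors of a small matrix. The helper allPrincipalMinors is hypothetical and deliberately omits the safeguard based on reciprocal condition numbers described above, so it is an illustration rather than the package's implementation.

```r
# Hypothetical helper: determinants of all principal sub-matrices of m
allPrincipalMinors <- function( m ) {
   n <- nrow( m )
   minors <- numeric( 0 )
   for( k in 1:n ) {
      idx <- combn( n, k )   # each column holds one set of k row/column indices
      for( i in seq_len( ncol( idx ) ) ) {
         s <- idx[ , i ]
         minors <- c( minors, det( m[ s, s, drop = FALSE ] ) )
      }
   }
   minors
}

m <- matrix( c( 2, -1, 0, -1, 2, -1, 0, -1, 2 ), 3, 3 )
pm <- allPrincipalMinors( m )
length( pm )                             # 2^3 - 1 = 7 principal minors
all( pm > -100 * .Machine$double.eps )   # all are (clearly) positive here
sum( choose( 12, 1:12 ) )                # number of minors for 12 rows/columns
```

For 12 rows/columns, the count is already 4095, which indicates why the default switches to method "eigen" for larger matrices.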
If method "eigen" (default for matrices with 13 or more rows/columns) is used, isSemidefinite() checks whether all eigenvalues of the matrix m (or of the matrix -m if argument positive is FALSE) are larger than -tol. In case of a singular or nearly singular matrix, some eigenvalues that theoretically should be zero can considerably deviate from zero due to rounding errors, which are unavoidable on digital computers. isSemidefinite() uses the following procedure to reduce the probability of incorrectly returning FALSE due to rounding errors in the calculation of eigenvalues of singular or nearly singular matrices:
if the reciprocal condition number of an $N \times N$ matrix is smaller than argument tol and not all of the eigenvalues of this matrix are larger than -tol, isSemidefinite() checks whether all $\binom{N}{N-k}$ $(N-k) \times (N-k)$ submatrices are positive semidefinite, where $k$ with $0 < k < N$ is the number of eigenvalues in the interval between -tol and tol. If necessary, this procedure is done recursively.
Please note that a matrix can be neither positive semidefinite nor negative semidefinite.
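The basic eigenvalue criterion can be sketched in a few lines of base R. The helper isPsdEigenSketch is hypothetical and leaves out the recursive safeguard for singular or nearly singular matrices, so it only illustrates the first step of the procedure described above.

```r
# Hypothetical helper: plain eigenvalue check, no safeguard for singular matrices
isPsdEigenSketch <- function( m, tol = 100 * .Machine$double.eps ) {
   ev <- eigen( m, symmetric = TRUE, only.values = TRUE )$values
   all( ev > -tol )
}

mPos <- matrix( c( 2, -1, 0, -1, 2, -1, 0, -1, 2 ), 3, 3 )
mIndef <- matrix( c( 1, 2, 2, 1 ), 2, 2 )   # eigenvalues 3 and -1

isPsdEigenSketch( mPos )      # positive definite, hence also semidefinite
isPsdEigenSketch( mIndef )    # not positive semidefinite
isPsdEigenSketch( -mIndef )   # not negative semidefinite either
```

The matrix mIndef has one positive and one negative eigenvalue, so it is an instance of a matrix that is neither positive nor negative semidefinite.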
isSemidefinite() and semidefiniteness() return a logical value (if argument m is a matrix) or a logical vector (if argument m is a list) indicating whether the matrix (or each of the matrices) is positive/negative (depending on argument positive) semidefinite.
Arne Henningsen
Chiang, A.C. (1984): Fundamental Methods of Mathematical Economics, 3rd ed., McGraw-Hill.
Gantmacher, F.R. (1959): The Theory of Matrices, Chelsea Publishing.
# a positive semidefinite matrix
isSemidefinite( matrix( 1, 3, 3 ) )
# a negative semidefinite matrix
isSemidefinite( matrix( -1, 3, 3 ), positive = FALSE )
# a matrix that is positive and negative semidefinite
isSemidefinite( matrix( 0, 3, 3 ) )
isSemidefinite( matrix( 0, 3, 3 ), positive = FALSE )
# a matrix that is neither positive nor negative semidefinite
isSemidefinite( symMatrix( 1:6 ) )
isSemidefinite( symMatrix( 1:6 ), positive = FALSE )
# checking a list of matrices
ml <- list( matrix( 1, 3, 3 ), matrix( -1, 3, 3 ), matrix( 0, 3, 3 ) )
isSemidefinite( ml )
isSemidefinite( ml, positive = FALSE )
Currently, this package just defines the generic function margEff so that it can be used to define margEff methods for objects of specific classes in other packages.
margEff( object, ... )
object |
an object of which marginal effects should be calculated. |
... |
further arguments for methods |
Arne Henningsen
Returns the number of observations for statistical models. The default method assumes the presence of a component param$nObs in x.
nObs(x, ...)
## Default S3 method:
nObs(x, ...)
## S3 method for class 'lm'
nObs(x, ...)
x |
a statistical model, such as created by |
... |
further arguments for methods |
This is a generic function. The default method returns the component x$param$nObs. The lm method is based on the QR decomposition, in the same way as summary.lm.
numeric, number of observations
Ott Toomet
# Construct a simple OLS regression:
x1 <- runif(100)
x2 <- runif(100)
y <- 3 + 4*x1 + 5*x2 + rnorm(100)
m <- lm(y~x1+x2)  # estimate it
nObs(m)
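As a point of reference, summary.lm() obtains the number of observations from the number of rows of the QR decomposition stored in the fitted model. The sketch below shows that calculation in base R; whether the lm method of nObs() uses exactly this expression is an assumption.

```r
set.seed( 123 )
x1 <- runif( 100 )
y <- 3 + 4 * x1 + rnorm( 100 )
m <- lm( y ~ x1 )
# number of rows of the QR decomposition of the model matrix
n <- NROW( m$qr$qr )
n
```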
This function returns the number of model parameters. The default method returns the component x$param$nParam.
nParam(x, free=FALSE, ...)
## Default S3 method:
nParam(x, ...)
## S3 method for class 'lm'
nParam(x, ...)
x |
a statistical model |
free |
logical, whether to report only the free parameters or the total number of parameters (default) |
... |
other arguments for methods |
Free parameters are the parameters with no equality restrictions. Some parameters may be restricted (e.g. sum of two probabilities may be restricted to equal unity). In this case the total number of parameters may depend on the normalisation.
Number of parameters in the model
Ott Toomet
nObs for the number of observations.
# Construct a simple OLS regression:
x1 <- runif(100)
x2 <- runif(100)
y <- 3 + 4*x1 + 5*x2 + rnorm(100)
m <- lm(y~x1+x2)  # estimate it
summary(m)
nParam(m) # you get 3
Test whether a function is quasiconcave or quasiconvex. The bordered Hessian of this function is checked by quasiconcavity() or quasiconvexity().
quasiconcavity( m, tol = .Machine$double.eps )
quasiconvexity( m, tol = .Machine$double.eps )
m |
a bordered Hessian matrix or a list containing bordered Hessian matrices |
tol |
tolerance level (values between |
a logical value or a logical vector (if m is a list).
Arne Henningsen
Chiang, A.C. (1984): Fundamental Methods of Mathematical Economics, 3rd ed., McGraw-Hill.
quasiconcavity( matrix( 0, 3, 3 ) )
quasiconvexity( matrix( 0, 3, 3 ) )
m <- list()
m[[1]] <- matrix( c( 0,-1,-1, -1,-2,3, -1,3,5 ), 3, 3 )
m[[2]] <- matrix( c( 0,1,-1, 1,-2,3, -1,3,5 ), 3, 3 )
quasiconcavity( m )
quasiconvexity( m )
Compute the sample medians of the rows of a data.frame or matrix.
rowMedians( x, na.rm = FALSE )
x |
a data.frame or matrix. |
na.rm |
a logical value indicating whether |
A vector of the medians of each row of x.
Arne Henningsen
m <- matrix( 1:12, nrow = 4 )
rowMedians( m )
Calculate the R-squared value.
rSquared( y, resid )
y |
vector of endogenous variables |
resid |
vector of residuals |
Arne Henningsen
data( "Electricity", package = "Ecdat" )
reg <- lm( cost ~ q + pl + pk + pf, Electricity )
rSquared( Electricity$cost, reg$residuals )
summary( reg )$r.squared  # returns the same value
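The usual definition of R-squared is one minus the ratio of the residual sum of squares to the total sum of squares around the mean. The sketch below applies this definition in base R to the built-in cars data (used here only so that no extra package is needed) and compares the result with summary.lm(); it assumes that rSquared() follows this standard definition.

```r
reg <- lm( dist ~ speed, data = cars )
y <- cars$dist
res <- residuals( reg )
# 1 - residual sum of squares / total sum of squares
r2 <- 1 - sum( res^2 ) / sum( ( y - mean( y ) )^2 )
r2
all.equal( r2, summary( reg )$r.squared )
```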
Extract standard errors of the coefficients from estimated models.
stdEr(x, ...)
## Default S3 method:
stdEr(x, ...)
## S3 method for class 'lm'
stdEr(x, ...)
x |
a statistical model, such as created by |
... |
further arguments for methods |
stdEr is a generic function with methods for objects of class "lm". The default method returns the square root of the diagonal of the variance-covariance matrix.
numeric, the estimated standard errors of the coefficients.
Ott Toomet
data(cars)
lmRes <- lm(dist ~ speed, data=cars)
stdEr( lmRes )
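The calculation described for the default method, the square root of the diagonal of the variance-covariance matrix, can be reproduced in base R as follows. This is a sketch of that calculation using vcov(), not a copy of the package's code.

```r
lmRes <- lm( dist ~ speed, data = cars )
# square root of the diagonal of the variance-covariance matrix
se <- sqrt( diag( vcov( lmRes ) ) )
se
# agrees with the standard errors reported by summary.lm()
all.equal( se, coef( summary( lmRes ) )[ , "Std. Error" ] )
```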
This function returns the sum of a numeric array (e.g. vector or matrix) while keeping its attributes.
sumKeepAttr( x, keepNames = FALSE, na.rm = FALSE )
x |
a numeric array (e.g. vector or matrix). |
keepNames |
logical. Should the name(s) of the element(s) of |
na.rm |
logical. Passed to |
the sum (see sum).
Arne Henningsen
a <- 1:10
attr( a, "min" ) <- 1
attr( a, "max" ) <- 10
sum(a)
sumKeepAttr(a)
This function summarizes each variable in a data.frame. It can be used, e.g., in an R script to write summary information about a data.frame into a text file that is kept under version control, so that the version control system shows whether one or more variables in the data frame have changed.
summarizeDF( dat, printValues = TRUE, maxLevel = 20, file = NULL, ... )
dat |
a data.frame. |
printValues |
logical. If |
maxLevel |
integer. If the number of unique values in a variable
is less than or equal to the number specified in this argument
(and argument |
file |
a character string or a writable connection naming the file to write to. |
... |
further arguments forwarded to |
Arne Henningsen
Create a Symmetric Matrix.
symMatrix( data = NA, nrow = NULL, byrow = FALSE, upper = FALSE )
data |
an optional data vector. |
nrow |
the desired number of rows and columns. |
byrow |
logical. If 'FALSE' (the default) the matrix is filled by columns, otherwise the matrix is filled by rows. |
upper |
logical. If 'FALSE' (the default) the lower triangular part of the matrix (including the diagonal) is filled, otherwise the upper triangular part of the matrix is filled. |
a symmetric matrix.
Arne Henningsen
# fill the lower triangular part by columns
symMatrix( 1:10, 4 )
# fill the upper triangular part by columns
symMatrix( 1:10, 4, upper = TRUE )
# fill the lower triangular part by rows
symMatrix( 1:10, 4, byrow = TRUE )
Creates an upper triangular square matrix from a vector.
triang( v, n )
v |
vector |
n |
desired dimension of the returned square matrix |
If the vector has fewer elements than the upper triangular matrix, the last elements are set to zero.
Arne Henningsen
v <- c( 1:5 )
triang( v, 3 )
Returns a vector containing the linearly independent elements of a symmetric matrix (of full rank).
vecli( m )
m |
symmetric matrix |
Arne Henningsen
# a symmetric n x n matrix
m <- cbind(c(11,12,13),c(12,22,23),c(13,23,33))
vecli(m)  # returns: 11 12 13 22 23 33
Converts a vector into a symmetric matrix such that the original vector contains the linearly independent values of the returned symmetric matrix.
vecli2m( v )
v |
a vector. |
Arne Henningsen
v <- c( 11, 12, 13, 22, 23, 33 )
vecli2m( v )
Returns the position that the [i, j]th element of a symmetric $n \times n$ matrix has in a vector of the linearly independent values of the matrix.
veclipos( i, j, n )
i |
row of the element in the matrix. |
j |
column of the element in the matrix. |
n |
dimension of the matrix. |
A symmetric $n \times n$ matrix has n*(n+1)/2 independent values.
The function is: n*(n-1)/2-((n-min(i,j))*(n-min(i,j)+1)/2)+max(i,j)
Arne Henningsen
veclipos( 1, 2, 3 ) # returns: 2
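The position formula stated above can be checked in base R against an explicit enumeration of the independent elements. In the sketch below, the helper vecliposSketch is a hypothetical re-implementation of that formula, and the vector v is built with lower.tri(), which for a symmetric matrix lists the same values as a row-wise walk through the upper triangle.

```r
# Hypothetical helper implementing the formula stated above
vecliposSketch <- function( i, j, n ) {
   n * ( n - 1 ) / 2 - ( ( n - min( i, j ) ) * ( n - min( i, j ) + 1 ) / 2 ) +
      max( i, j )
}

n <- 3
m <- cbind( c( 11, 12, 13 ), c( 12, 22, 23 ), c( 13, 23, 33 ) )
# vector of the n * ( n + 1 ) / 2 independent values: 11 12 13 22 23 33
v <- m[ lower.tri( m, diag = TRUE ) ]

# every element of m is found at the position given by the formula
ok <- TRUE
for( i in 1:n ) for( j in 1:n ) {
   ok <- ok && v[ vecliposSketch( i, j, n ) ] == m[ i, j ]
}
ok
vecliposSketch( 1, 2, 3 )
```

Because the formula only depends on min(i,j) and max(i,j), the elements [i, j] and [j, i] map to the same position.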