Equality Test Between Two Data Tables
all.equal.data.table.Rd
Convenient test of data equality between data.table objects. Performs some factor level stripping.
Usage
# S3 method for data.table
all.equal(target, current, trim.levels=TRUE, check.attributes=TRUE,
ignore.col.order=FALSE, ignore.row.order=FALSE, tolerance=sqrt(.Machine$double.eps),
...)
Arguments
- target, current
  data.tables to compare. If current is not a data.table, but check.attributes is FALSE, it will be coerced to one via as.data.table.
- trim.levels
  A logical indicating whether or not to remove all unused levels in factor columns before running the equality check. It takes effect only when check.attributes is TRUE and ignore.row.order is FALSE.
- check.attributes
  A logical indicating whether or not to check attributes. This applies not only to the attributes of the data.table itself but also to the attributes of its columns. The data.table attributes c("row.names", ".internal.selfref") are skipped.
- ignore.col.order
  A logical indicating whether or not to ignore the order of columns in the data.table.
- ignore.row.order
  A logical indicating whether or not to ignore the order of rows in the data.table. This option requires the datasets to use data types on which a join can be made, so list, complex and raw columns are not supported; integer64 is supported.
- tolerance
  A numeric value used when comparing numeric columns, by default sqrt(.Machine$double.eps). Unless a non-default value is provided, it is forced to 0 when used together with ignore.row.order and either duplicate rows are detected or factor columns are present.
- ...
  Passed down to the internal call of all.equal.
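The effect of tolerance on numeric columns can be sketched as follows. This is a minimal illustration, not part of the original help page; the tables x and y and their column names are made up for the example, and it assumes the data.table package is attached.

```r
library(data.table)

# Two tables whose numeric column differs by a tiny amount (illustrative values).
x <- data.table(id = 1:3, v = c(1, 2, 3))
y <- data.table(id = 1:3, v = c(1, 2, 3 + 1e-10))

# With the default tolerance the small difference is treated as negligible.
default_tol <- all.equal(x, y)

# With tolerance = 0 the numeric comparison is exact, so a difference is reported.
exact_tol <- all.equal(x, y, tolerance = 0)

isTRUE(default_tol)      # whether the tables compare equal under the default
is.character(exact_tol)  # whether a description of differences was returned
```

Wrapping the call in isTRUE(), as the page's own examples do, is the safe way to use the result in a condition, because a failed comparison returns a character vector rather than FALSE.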
Details
For efficiency, the data.table method exits as soon as it detects a difference, unlike most all.equal methods, which continue checking and report every difference they find. It also handles the most time-consuming case, ignore.row.order = TRUE, very efficiently.
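As a rough illustration of the early exit, the sketch below compares two tables that differ in both columns, then repeats the comparison on plain data.frame copies using base R. The tables and column names are hypothetical, and the exact wording of the messages may vary between versions, so only the number of messages is inspected.

```r
library(data.table)

# Both columns differ in their last element (illustrative values).
x <- data.table(A = c(1, 2, 3), B = c("a", "b", "c"))
y <- data.table(A = c(1, 2, 4), B = c("a", "b", "d"))

# data.table method: stops at the first column found to differ.
dt_result <- all.equal(x, y)

# Base R comparison of plain data.frames: keeps going through all columns.
df_result <- all.equal(as.data.frame(x), as.data.frame(y))

length(dt_result)  # number of messages from the data.table method
length(df_result)  # number of messages from the base R comparison
```

A consequence of this design is that fixing the reported difference and re-running the comparison may surface a further difference that was not reported the first time.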
Value
Either TRUE or a vector of mode "character" describing the differences between target and current.
Examples
dt1 <- data.table(A = letters[1:10], X = 1:10, key = "A")
dt2 <- data.table(A = letters[5:14], Y = 1:10, key = "A")
isTRUE(all.equal(dt1, dt1))
#> [1] TRUE
is.character(all.equal(dt1, dt2))
#> [1] TRUE
# ignore.col.order
x <- copy(dt1)
y <- dt1[, .(X, A)]
all.equal(x, y)
#> [1] "Different column order"
all.equal(x, y, ignore.col.order = TRUE)
#> [1] TRUE
# ignore.row.order
x <- setkeyv(copy(dt1), NULL)
y <- dt1[sample(nrow(dt1))]
all.equal(x, y)
#> [1] "Column 'A': 10 string mismatches"
all.equal(x, y, ignore.row.order = TRUE)
#> [1] TRUE
# check.attributes
x = copy(dt1)
y = setkeyv(copy(dt1), NULL)
all.equal(x, y)
#> [1] "Datasets have different keys. 'target': [A]. 'current': has no key."
all.equal(x, y, check.attributes = FALSE)
#> [1] TRUE
x = data.table(1L)
y = 1L
all.equal(x, y)
#> [1] "target is data.table, current is numeric"
all.equal(x, y, check.attributes = FALSE)
#> [1] TRUE
# trim.levels
x <- data.table(A = factor(letters[1:10])[1:4]) # 10 levels
y <- data.table(A = factor(letters[1:5])[1:4]) # 5 levels
all.equal(x, y, trim.levels = FALSE)
#> [1] "Column 'A': Levels not identical. No attempt to refactor because trim.levels is FALSE"
all.equal(x, y, trim.levels = FALSE, check.attributes = FALSE)
#> [1] "Column 'A': Levels not identical. No attempt to refactor because trim.levels is FALSE"
all.equal(x, y)
#> [1] TRUE