Equality Test Between Two Data Tables
all.equal.data.table.Rd
Convenient test of data equality between data.table
objects. Performs some factor level stripping.
Usage
# S3 method for class 'data.table'
all.equal(target, current, trim.levels=TRUE, check.attributes=TRUE,
ignore.col.order=FALSE, ignore.row.order=FALSE, tolerance=sqrt(.Machine$double.eps),
...)
Arguments
- target, current
data.tables to compare. If current is not a data.table, but check.attributes is FALSE, it will be coerced to one via as.data.table.
- trim.levels
A logical indicating whether or not to remove all unused levels in columns that are factors before running the equality check. It takes effect only when check.attributes is TRUE and ignore.row.order is FALSE.
- check.attributes
A logical indicating whether or not to check attributes. Note that this applies not only to the data.tables, but also to the attributes of their columns. "row.names" and any internal data.table attributes are always skipped.
- ignore.col.order
A logical indicating whether or not to ignore column order in the data.tables.
- ignore.row.order
A logical indicating whether or not to ignore row order in the data.tables. This option requires the datasets to use data types on which a join can be made, so list, complex, and raw columns are not supported, but integer64 is.
- tolerance
A numeric value used when comparing numeric columns, by default sqrt(.Machine$double.eps). Unless a non-default value is provided, it will be forced to 0 if used together with ignore.row.order and duplicate rows are detected or factor columns are present.
- ...
Passed down to the internal call of all.equal.
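As a rough illustration of the tolerance argument described above, the sketch below compares two single-column tables whose values differ by far less than the default tolerance. The table and variable names (x, y, default_tol, zero_tol) are made up for this example; it assumes the tolerance is applied to numeric columns in the way the argument description suggests.

```r
library(data.table)

# Two tables whose only numeric value differs by 1e-10 (illustrative data)
x <- data.table(v = 1)
y <- data.table(v = 1 + 1e-10)

# With the default tolerance the tiny difference should be ignored
default_tol <- all.equal(x, y)
isTRUE(default_tol)

# With tolerance = 0 any numeric difference should be reported
zero_tol <- all.equal(x, y, tolerance = 0)
isTRUE(zero_tol)
```

Passing tolerance = 0 is a reasonable choice when the columns are expected to match exactly, for example after a round trip that should not alter values.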
Details
For efficiency, the data.table method exits as soon as it detects a difference, unlike most all.equal methods, which continue checking and report further differences. It also handles the most time-consuming case, ignore.row.order = TRUE, very efficiently.
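The early exit can be seen by comparing the same data as data.tables and as plain data.frames. This is only a sketch: the objects (df_a, df_b, res_dt, res_df) are invented for the example, and the exact wording of the messages is not shown because it may differ between versions.

```r
library(data.table)

# Two data sets that differ in both columns (illustrative data)
df_a <- data.frame(a = 1:3, b = 4:6)
df_b <- data.frame(a = c(1L, 2L, 9L), b = c(4L, 5L, 9L))

# data.table method: expected to stop at the first column that differs
res_dt <- all.equal(as.data.table(df_a), as.data.table(df_b))
length(res_dt)

# base R comparison of the data.frames: expected to report every differing column
res_df <- all.equal(df_a, df_b)
length(res_df)
```

If a full list of differing columns is needed, comparing column by column or falling back to the base method on data.frames is one option, at the cost of the extra work the data.table method avoids.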
Value
Either TRUE or a vector of mode "character" describing the differences between target and current.
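Because the result is either TRUE or a character vector, it should not be used directly as a condition. A minimal sketch of the usual pattern, wrapping the call in isTRUE(), follows; the tables a and b and the variable res are illustrative only.

```r
library(data.table)

# Two small tables that differ in the last value of column v (illustrative data)
a <- data.table(id = 1:3, v = c(1.5, 2.5, 3.5))
b <- data.table(id = 1:3, v = c(1.5, 2.5, 4.5))

res <- all.equal(a, b)

# isTRUE() is FALSE for a character vector, so the branch is always well defined
if (isTRUE(res)) {
  message("tables are equal")
} else {
  message("tables differ: ", paste(res, collapse = "; "))
}
```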
Examples
library(data.table)
dt1 <- data.table(A = letters[1:10], X = 1:10, key = "A")
dt2 <- data.table(A = letters[5:14], Y = 1:10, key = "A")
isTRUE(all.equal(dt1, dt1))
#> [1] TRUE
is.character(all.equal(dt1, dt2))
#> [1] TRUE
# ignore.col.order
x <- copy(dt1)
y <- dt1[, .(X, A)]
all.equal(x, y)
#> [1] "Different column order"
all.equal(x, y, ignore.col.order = TRUE)
#> [1] TRUE
# ignore.row.order
x <- setkeyv(copy(dt1), NULL)
y <- dt1[sample(nrow(dt1))]
all.equal(x, y)
#> [1] "Column 'A': 9 string mismatches"
all.equal(x, y, ignore.row.order = TRUE)
#> [1] TRUE
# check.attributes
x <- copy(dt1)
y <- setkeyv(copy(dt1), NULL)
all.equal(x, y)
#> [1] "Datasets have different keys. 'target': [A]. 'current': has no key."
all.equal(x, y, check.attributes = FALSE)
#> [1] TRUE
x <- data.table(1L)
y <- 1L
all.equal(x, y)
#> [1] "target is data.table, current is numeric"
all.equal(x, y, check.attributes = FALSE)
#> [1] TRUE
# trim.levels
x <- data.table(A = factor(letters[1:10])[1:4]) # 10 levels
y <- data.table(A = factor(letters[1:5])[1:4]) # 5 levels
all.equal(x, y, trim.levels = FALSE)
#> [1] "Column 'A': Levels not identical. No attempt to refactor because trim.levels is FALSE"
all.equal(x, y, trim.levels = FALSE, check.attributes = FALSE)
#> [1] "Column 'A': Levels not identical. No attempt to refactor because trim.levels is FALSE"
all.equal(x, y)
#> [1] TRUE