Set operations for data tables
setops.Rd
Similar to the base R set functions union, intersect, setdiff and setequal, but for data.tables. An additional all argument controls how duplicated rows are handled. The functions fintersect, fsetdiff (MINUS or EXCEPT in SQL) and funion are meant to provide the functionality of the corresponding SQL operators. Unlike SQL, the data.table functions retain row order.
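A minimal sketch of how rows are matched on multi-column tables and how the order of x carries over to the result. The tables a and b and their columns id and grp are hypothetical, made up for illustration.

```r
library(data.table)

# Hypothetical two-column tables; a row matches only if every column matches.
a = data.table(id = c(3L, 1L, 2L, 1L), grp = c("c", "a", "b", "a"))
b = data.table(id = c(2L, 3L, 9L),     grp = c("b", "c", "z"))

# Rows present in both tables (like SQL INTERSECT), duplicates removed.
common = fintersect(a, b)

# Rows of a with no match in b (like SQL EXCEPT / MINUS), duplicates removed.
only_a = fsetdiff(a, b)

print(common)
print(only_a)
```

With these inputs, common should list the row (3, "c") before (2, "b"), following the order in a rather than a sorted order, and only_a should contain the single row (1, "a").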
Usage
fintersect(x, y, all = FALSE)
fsetdiff(x, y, all = FALSE)
funion(x, y, all = FALSE)
fsetequal(x, y, all = TRUE)
Arguments
- x, y
data.tables.
- all
Logical. Default is FALSE (except for fsetequal, whose default is TRUE, as shown in Usage); FALSE removes duplicate rows from the result. When TRUE, if there are xn copies of a particular row in x and yn copies of the same row in y, then:
  - fintersect will return min(xn, yn) copies of that row.
  - fsetdiff will return max(0, xn-yn) copies of that row.
  - funion will return xn+yn copies of that row.
  - fsetequal will return FALSE unless xn == yn.
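The multiplicity rules for all = TRUE can be checked by counting copies by hand. The sketch below is an illustration only: the helper tables x_counts, y_counts and counts, and the columns n_intersect, n_setdiff and n_union, are names introduced here and are not part of data.table.

```r
library(data.table)

x = data.table(V1 = c(1, 2, 2, 2, 3, 4, 4))
y = data.table(V1 = c(2, 3, 4, 4, 4, 5))

# Count the copies of each distinct row in each table.
x_counts = x[, .(xn = .N), by = V1]
y_counts = y[, .(yn = .N), by = V1]

# Full outer join of the counts; a row missing from one table has count 0.
counts = merge(x_counts, y_counts, by = "V1", all = TRUE)
counts[is.na(xn), xn := 0L]
counts[is.na(yn), yn := 0L]

# Multiplicities predicted by the rules for all = TRUE.
counts[, `:=`(
  n_intersect = pmin(xn, yn),
  n_setdiff   = pmax(0L, xn - yn),
  n_union     = xn + yn
)]
print(counts)

# The totals should agree with the row counts the set functions return.
stopifnot(
  nrow(fintersect(x, y, all = TRUE)) == sum(counts$n_intersect),
  nrow(fsetdiff(x, y, all = TRUE))   == sum(counts$n_setdiff),
  nrow(funion(x, y, all = TRUE))     == sum(counts$n_union)
)
```

Because xn and yn differ for at least one row here, fsetequal(x, y) with its default all = TRUE would be expected to return FALSE.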
Details
bit64::integer64 columns are supported, but complex and list columns are not, except for funion.
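A hedged sketch of the list-column exception. The tables p and q and the column payload are hypothetical. It assumes that funion with all = TRUE simply stacks the rows, so list columns pass through; whether all = FALSE also accepts them may depend on the installed version, so that case is not shown. The fintersect call is wrapped in tryCatch so the script continues whether or not it is rejected.

```r
library(data.table)

# Hypothetical tables with a list column.
p = data.table(id = 1:2, payload = list(1:3, "a"))
q = data.table(id = 3:4, payload = list(TRUE, 2.5))

# Assumption: with all = TRUE the rows of p and q are stacked as-is.
stacked = funion(p, q, all = TRUE)
print(nrow(stacked))

# The other set functions are documented as not supporting list columns;
# capture the error message instead of stopping.
res = tryCatch(fintersect(p, q), error = function(e) conditionMessage(e))
print(res)
```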
Examples
library(data.table)
x = data.table(c(1,2,2,2,3,4,4))
x2 = data.table(c(1,2,3,4)) # same set of rows as x
y = data.table(c(2,3,4,4,4,5))
fintersect(x, y) # intersect
#> V1
#> <num>
#> 1: 2
#> 2: 3
#> 3: 4
fintersect(x, y, all=TRUE) # intersect all
#> V1
#> <num>
#> 1: 2
#> 2: 3
#> 3: 4
#> 4: 4
fsetdiff(x, y) # except
#> V1
#> <num>
#> 1: 1
fsetdiff(x, y, all=TRUE) # except all
#> V1
#> <num>
#> 1: 1
#> 2: 2
#> 3: 2
funion(x, y) # union
#> V1
#> <num>
#> 1: 1
#> 2: 2
#> 3: 3
#> 4: 4
#> 5: 5
funion(x, y, all=TRUE) # union all
#> V1
#> <num>
#> 1: 1
#> 2: 2
#> 3: 2
#> 4: 2
#> 5: 3
#> 6: 4
#> 7: 4
#> 8: 2
#> 9: 3
#> 10: 4
#> 11: 4
#> 12: 4
#> 13: 5
fsetequal(x, x2, all=FALSE) # setequal
#> [1] TRUE
fsetequal(x, x2) # setequal all
#> [1] FALSE