Creates a join data.table

Creates a data.table for use as the i argument in a [.data.table join.

Usage

# DT[J(...)]                          # J() only for use inside DT[...]
# DT[.(...)]                          # .() only for use inside DT[...]
# DT[list(...)]                       # same; .(), list() and J() are identical
SJ(...)                             # DT[SJ(...)]
CJ(..., sorted=TRUE, unique=FALSE)  # DT[CJ(...)]

Arguments

...: Each argument is a vector. Generally each vector is the same length, but if they are not then the usual silent recycling is applied.
sorted: logical. Should setkey() be called on all the columns in the order they were passed to CJ?
unique: logical. When TRUE, only the unique values of each vector are used (computed automatically).

Details

SJ and CJ are convenience functions to create a data.table to be used in i when performing a data.table 'query' on x.

x[data.table(id)] is the same as x[J(id)] but the latter is more readable. Identical alternatives are x[list(id)] and x[.(id)].

When using a join table in i, either x must be keyed or the on argument must be used to indicate which columns of x and i should be joined. See [.data.table.
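As a minimal sketch of the two options above, the following joins the same lookup value first through on= against an unkeyed table and then through a key. The table DT and its columns id and val are made up for illustration.

```r
library(data.table)

# Illustrative data; the column names are assumptions for this sketch
DT = data.table(id = c("c", "a", "b"), val = 1:3)

# Unkeyed x: name the join column explicitly with on=
r_on = DT[.("b"), on = "id"]

# Keyed x: the key columns of x are used for the join
setkey(DT, id)
r_key = DT[.("b")]

r_on$val    # 3
r_key$val   # 3
```

Both forms select the same row; on= avoids reordering DT, whereas setkey() sorts it by reference.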

Value

J : the same result as calling list, for which J is a direct alias.

SJ : Sorted Join. The same value as J() but additionally setkey() is called on all columns in the order they were passed to SJ. This is done for efficiency: a keyed i lets the join use a binary merge rather than a repeated full binary search for each row of i.

CJ : Cross Join. A data.table is formed from the cross product of the vectors. For example, CJ on 10 ids and 100 dates returns a 1000-row table containing all dates for all ids. If sorted = TRUE (the default), setkey() is called on all columns in the order they were passed to CJ. If sorted = FALSE, the result is unkeyed and input order is retained.
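The keying behaviour described above can be checked with a short sketch. The inputs (ids, dates and the column names id and date) are hypothetical and chosen only to match the 10 x 100 example.

```r
library(data.table)

# Hypothetical inputs: 10 ids and 100 consecutive dates
ids = 1:10
dates = seq(as.Date("2024-01-01"), by = "day", length.out = 100)

# Cross join; sorted = TRUE by default, so the result is keyed
grid = CJ(id = ids, date = dates)
nrow(grid)       # 1000
key(grid)        # "id" "date"

# sorted = FALSE leaves the result unkeyed
unsorted = CJ(id = ids, date = dates, sorted = FALSE)
key(unsorted)    # NULL

# SJ keys its columns in the order they were passed
sj = SJ(id = c(3L, 1L, 2L))
key(sj)          # "id"
sj$id            # 1 2 3
```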

Examples

DT = data.table(A=5:1, B=letters[5:1])
setkey(DT, B)   # reorders table and marks it sorted
DT[J("b")]      # returns the 2nd row
#> Key: <B>
#>        A      B
#>    <int> <char>
#> 1:     2      b
DT[list("b")]   # same
#> Key: <B>
#>        A      B
#>    <int> <char>
#> 1:     2      b
DT[.("b")]      # same using the dot alias for list
#> Key: <B>
#>        A      B
#>    <int> <char>
#> 1:     2      b

# CJ usage examples
CJ(c(5, NA, 1), c(1, 3, 2))                 # sorted and keyed data.table
#> Key: <V1, V2>
#>       V1    V2
#>    <num> <num>
#> 1:    NA     1
#> 2:    NA     2
#> 3:    NA     3
#> 4:     1     1
#> 5:     1     2
#> 6:     1     3
#> 7:     5     1
#> 8:     5     2
#> 9:     5     3
do.call(CJ, list(c(5, NA, 1), c(1, 3, 2)))  # same as above
#> Key: <V1, V2>
#>       V1    V2
#>    <num> <num>
#> 1:    NA     1
#> 2:    NA     2
#> 3:    NA     3
#> 4:     1     1
#> 5:     1     2
#> 6:     1     3
#> 7:     5     1
#> 8:     5     2
#> 9:     5     3
CJ(c(5, NA, 1), c(1, 3, 2), sorted=FALSE)   # same order as input, unkeyed
#>       V1    V2
#>    <num> <num>
#> 1:     5     1
#> 2:     5     3
#> 3:     5     2
#> 4:    NA     1
#> 5:    NA     3
#> 6:    NA     2
#> 7:     1     1
#> 8:     1     3
#> 9:     1     2
# use of the 'unique=' argument
x = c(1, 1, 2)
y = c(4, 6, 4)
CJ(x, y)              # output columns are automatically named 'x' and 'y'
#> Key: <x, y>
#>        x     y
#>    <num> <num>
#> 1:     1     4
#> 2:     1     4
#> 3:     1     4
#> 4:     1     4
#> 5:     1     6
#> 6:     1     6
#> 7:     2     4
#> 8:     2     4
#> 9:     2     6
CJ(x, y, unique=TRUE) # unique(x) and unique(y) are computed automatically
#> Key: <x, y>
#>        x     y
#>    <num> <num>
#> 1:     1     4
#> 2:     1     6
#> 3:     2     4
#> 4:     2     6

z = 0:1 + (0:1)*1i
CJ(x, z, sorted = FALSE) # support for sorting complex is not yet implemented
#>        x      z
#>    <num> <cplx>
#> 1:     1   0+0i
#> 2:     1   1+1i
#> 3:     1   0+0i
#> 4:     1   1+1i
#> 5:     2   0+0i
#> 6:     2   1+1i
