special-symbols {data.table}    R Documentation

Special symbols

Description

.SD, .BY, .N, .I, .GRP, and .NGRP are read-only symbols for use in j. .N can be used in i as well. .I can be used in by as well. See the vignettes, Details and Examples here and in data.table. .EACHI is a symbol passed to by; i.e. by=.EACHI. .NATURAL is a symbol passed to on; i.e. on=.NATURAL.

Details

The bindings of these variables are locked and attempting to assign to them will generate an error. If you wish to manipulate .SD before returning it, take a copy(.SD) first (see FAQ 4.5). Using := in the j of .SD is reserved for future use as a (tortuously) flexible way to update DT by reference by group (even when groups are not contiguous in an ad hoc by).
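As an illustration of the copy(.SD) advice above, here is a minimal sketch. The table `dt`, its columns `g` and `val`, and the new column `val_centered` are hypothetical names chosen for this example; they do not come from the package.

```r
library(data.table)

# Hypothetical toy table, for illustration only.
dt <- data.table(g = c("a", "a", "b"), val = c(1, 3, 10))

res <- dt[, {
  grp_dt <- copy(.SD)                          # modifiable copy of this group's rows
  grp_dt[, val_centered := val - mean(val)]    # changes the copy, never .SD itself
  grp_dt                                       # return the modified copy for this group
}, by = g]

res
```

Each group returns its own modified copy, and data.table stacks the copies into the result. For a simple column addition like this one, `dt[, val_centered := val - mean(val), by = g]` is the more direct route; the copy is only worth its cost when .SD itself must be reshaped before being returned.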

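These symbols used in j are defined as follows.

.SD is a data.table containing the Subset of x's Data for each group, excluding any columns used in by (or keyby).

.BY is a list containing a length 1 vector for each item in by.

.N is an integer, length 1, containing the number of rows in the group.

.I is an integer vector equal to seq_len(nrow(x)). While grouping, it holds, for each item in the group, its row location in x.

.GRP is an integer, length 1, containing a simple group counter: 1 for the first group, 2 for the second, and so on.

.NGRP is an integer, length 1, containing the number of groups.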

.EACHI is defined as NULL but its value is not used. Its usage is by=.EACHI (or keyby=.EACHI) which invokes grouping-by-each-row-of-i; see data.table's by argument for more details.
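The difference that by=.EACHI makes can be sketched as follows. The tables `sales` and `wanted` and their columns are hypothetical and exist only for this illustration.

```r
library(data.table)

# Hypothetical data, for illustration only.
sales  <- data.table(region = c("n", "n", "s", "s", "s"),
                     amount = c(10, 20, 1, 2, 3))
wanted <- data.table(region = c("s", "n"))

# Without by=.EACHI, j is evaluated once over all matched rows.
overall <- sales[wanted, sum(amount), on = "region"]

# With by=.EACHI, j is evaluated once per row of i, in the order of i.
per_row <- sales[wanted, .(total = sum(amount)), on = "region", by = .EACHI]

overall
per_row
```

The grouped result follows the row order of `wanted`, not the order in which the regions appear in `sales`.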

.NATURAL is defined as NULL but its value is not used. Its usage is on=.NATURAL (an alternative to X[on=Y]), which joins two tables on their common column names, performing a natural join; see data.table's on argument for more details.
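A small sketch of what on=.NATURAL amounts to, assuming two hypothetical tables `A` and `B` that share the columns `id` and `grp`. It compares the natural join with the same join written with an explicit on.

```r
library(data.table)

# Hypothetical tables sharing the columns 'id' and 'grp'.
A <- data.table(id = c(1L, 2L, 3L), grp = c("a", "b", "c"), val = c(10, 20, 30))
B <- data.table(id = c(2L, 3L), grp = c("b", "z"), w = c(0.5, 0.7))

nat      <- A[B, on = .NATURAL]          # join on every column name the tables share
explicit <- A[B, on = c("id", "grp")]    # the same join with the columns spelled out

nat
all.equal(nat, explicit)
```

The second row of `B` matches `A` on `id` but not on `grp`, so its `val` is NA. A natural join requires all shared columns to match.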

Note that .N in i is computed up-front, while that in j applies after filtering in i. That means that even absent grouping, .N in i can be different from .N in j. See Examples.

Note also that you should consider these symbols read-only and of limited scope – internal data.table code might manipulate them in unexpected ways, and as such their bindings are locked. There are subtle ways to wind up with the wrong object, especially when attempting to copy their values outside a grouping context. See Examples; when in doubt, copy() is your friend.
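One way to carry the value of .BY out of a grouping context safely is sketched below. The names `dt` and `seen` are hypothetical, and writing to an outer variable with `<<-` is a choice made for this illustration rather than a recommended pattern.

```r
library(data.table)

# Hypothetical toy table, for illustration only.
dt <- data.table(x = rep(c("b", "a", "c"), each = 3), v = 1:9)

seen <- list()
sums <- dt[, {
  seen[[.GRP]] <<- copy(.BY)   # store a copy of .BY, not the object data.table manages
  sum(v)
}, by = x]

# Extract the group value recorded for each group.
seen_x <- vapply(seen, function(b) b$x, character(1))
seen_x
```

Taking copy(.BY) stores a snapshot of each group's value, in line with the advice above that copy() is the safe choice when in doubt.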

See Also

data.table, :=, set, datatable-optimize

Examples

DT = data.table(x=rep(c("b","a","c"),each=3), v=c(1,1,1,2,2,1,1,2,2), y=c(1,3,6), a=1:9, b=9:1)
DT
X = data.table(x=c("c","b"), v=8:7, foo=c(4,2))
X

DT[.N]                                 # last row, only special symbol allowed in 'i'
DT[, .N]                               # total number of rows in DT
DT[, .N, by=x]                         # number of rows in each group
DT[, .SD, .SDcols=x:y]                 # select columns 'x' through 'y'
DT[, .SD[1]]                           # first row of all columns
DT[, .SD[1], by=x]                     # first row of all columns for each group in 'x'
DT[, c(.N, lapply(.SD, sum)), by=x]    # count rows *and* sum all columns by group
DT[, .I[1], by=x]                      # row number in DT of the first row of each group
DT[, .N, by=rleid(v)]                  # get count of consecutive runs of 'v'
DT[, c(.(y=max(y)), lapply(.SD, min)),
        by=rleid(v), .SDcols=v:b]      # compute 'j' for each consecutive run of 'v'
DT[, grp := .GRP, by=x]                # add a group counter
DT[, grp_pct := .GRP/.NGRP, by=x]      # add a group "progress" counter
X[, DT[.BY, y, on="x"], by=x]          # join within each group
DT[X, on=.NATURAL]                     # natural join of X and DT on their common columns, similar to X[on=Y]

# .N can be different in i and j
DT[{cat(sprintf('in i, .N is %d\n', .N)); a < .N/2},
   {cat(sprintf('in j, .N is %d\n', .N)); mean(a)}]

# .I can be different in j and by, enabling rowwise operations in by
DT[, .(.I, min(.SD[,-1]))]
DT[, .(min(.SD[,-1])), by=.I]

# Do not expect this to correctly append the value of .BY in each group; copy(.BY) will work.
by_tracker = list()
DT[, { by_tracker <<- append(by_tracker, .BY); sum(v) }, by=x]   # '<<-' keeps the appended list instead of discarding it

[Package data.table version 1.16.99 Index]