rleid {data.table} | R Documentation
A convenience function for generating a run-length type id column to be used in grouping operations. It accepts atomic vectors, lists, data.frames or data.tables as input.
rleid(..., prefix=NULL)
rleidv(x, cols=seq_along(x), prefix=NULL)
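x | A vector, list, data.frame or data.table.
... | A sequence of numeric, integer64, character or logical vectors, all of the same length. For interactive use.
cols | Only meaningful for lists, data.frames or data.tables. A character vector of column names (or an integer vector of column numbers) of x.
prefix | Either NULL (the default) or a character string to prepend to the ids, in which case a character vector is returned instead of an integer vector.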
At times aggregation (or grouping) operations need to be performed where consecutive runs of identical values should belong to the same group (see rle). The need for such a function has come up repeatedly on StackOverflow; see the See Also section. This function generates "run-length" groups directly.
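To make the idea of a "run-length" group concrete, here is a minimal base R sketch that derives the same kind of ids from rle. It is illustrative only and is not how data.table implements rleid; the names x, runs and run_id are made up for this example.

```r
# Base R sketch of run-length ids (illustrative; not data.table's implementation).
x <- c("A", "A", "B", "B", "C", "C", "C", "A", "B", "B")

# rle() collapses consecutive equal values into runs and records their lengths.
runs <- rle(x)

# Number the runs 1, 2, ... and repeat each number once per element of its run.
run_id <- rep(seq_along(runs$lengths), times = runs$lengths)

print(run_id)
# [1] 1 1 2 2 3 3 3 4 5 5
```

Note that the second run of "A" gets id 4 rather than 1: ids identify consecutive runs, not distinct values, which is what separates this from ordinary grouping by value.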
rleid is designed for interactive use and accepts a sequence of vectors as arguments. For programming, rleidv might be more useful.
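The difference between the two interfaces can be sketched as follows, assuming the data.table package is installed. The table DT2 and the variable key_cols are hypothetical names chosen for this example; the point is that rleidv takes the columns as data, so they can be decided at run time.

```r
library(data.table)

# Hypothetical example table with two columns that jointly define a run.
DT2 <- data.table(grp  = c("A", "A", "B", "A"),
                  flag = c(TRUE, TRUE, TRUE, FALSE))

# Interactive style: pass the vectors directly.
ids_interactive <- rleid(DT2$grp, DT2$flag)

# Programmatic style: the column names live in a variable.
key_cols <- c("grp", "flag")
ids_programmatic <- rleidv(DT2, cols = key_cols)

# A new id starts whenever any of the chosen columns changes between rows.
print(ids_programmatic)
# [1] 1 1 2 3
```

Both calls should give the same ids here; rleidv is simply easier to call from functions where the grouping columns are not known in advance.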
When prefix = NULL, an integer vector with the same length as NROW(x); otherwise a character vector with the value in prefix prepended to the ids obtained.
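A short sketch of the two return types, assuming the data.table package is installed. The vector v and the prefix "run" are arbitrary choices for illustration.

```r
library(data.table)

v <- c("A", "A", "B", "B", "A")

# Default (prefix = NULL): integer ids, one per element of the input.
ids <- rleid(v)

# With a prefix: the same ids, returned as a character vector.
pids <- rleid(v, prefix = "run")

print(is.integer(ids))    # TRUE
print(is.character(pids)) # TRUE
print(length(ids) == length(v)) # TRUE
```

The character form can be handy when the ids end up as labels in output, while the integer form is the natural choice for further computation.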
data.table, rowid, https://stackoverflow.com/q/21421047/559784
library(data.table)
# grp has five runs: A A | B B | C C C | A | B B
DT = data.table(grp=rep(c("A", "B", "C", "A", "B"), c(2,2,3,1,2)), value=1:10)
rleid(DT$grp) # get run-length ids
rleidv(DT, "grp") # same as above
rleid(DT$grp, prefix="grp") # prefix with 'grp'
# get sum of value over run-length groups
DT[, sum(value), by=.(grp, rleid(grp))]
DT[, sum(value), by=.(grp, rleid(grp, prefix="grp"))]