
In data.table parlance, all set* functions change their input by reference. That is, no copy is made at all, other than temporary working memory, which is as large as one column. The only other data.table operator that modifies input by reference is :=. See the See Also section below for the other set* functions that data.table provides.

copy() copies an entire object.





A data.table.


data.table provides functions that operate on objects by reference and minimise full object copies as much as possible. Still, in some situations it is necessary to work on a copy of an object, which can be made with DT.copy <- copy(DT). Taking a copy can also be useful before := (or set) is used to subassign to a column by reference.
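As a minimal sketch of the difference between plain assignment and copy(), the snippet below adds a column by reference and then inspects both tables. The names DT_alias and DT_copy are illustrative and not part of data.table.

```r
library(data.table)

DT <- data.table(A = 1:3)
DT_alias <- DT          # plain assignment: a second name for the same table
DT_copy  <- copy(DT)    # explicit copy: an independent table

DT[, B := A * 2L]       # add column B to DT by reference

alias_names <- names(DT_alias)  # the alias refers to the modified table
copy_names  <- names(DT_copy)   # the copy was taken before the change

print(alias_names)
print(copy_names)
```

The alias is expected to show the new column because it refers to the same object, while the copy keeps the columns it had when copy() was called.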

A copy() may be required when doing dt_names = names(DT). Due to R's copy-on-modify, dt_names still points to the same location in memory as names(DT). Therefore, modifying DT by reference now, say by adding a new column, also updates dt_names. To avoid this, copy explicitly: dt_names <- copy(names(DT)).


To confirm precisely whether an object is a copy of another, compare their exact memory addresses using address().
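A short sketch of that check, assuming a small throwaway table. The variable names are illustrative only.

```r
library(data.table)

DT <- data.table(A = 1:3)
DT_alias <- DT          # same object under a second name
DT_copy  <- copy(DT)    # independent copy

# address() returns the memory address of an object as a character string
same_as_alias <- address(DT) == address(DT_alias)
same_as_copy  <- address(DT) == address(DT_copy)

print(same_as_alias)
print(same_as_copy)
```

Matching addresses indicate that both names refer to one object, and differing addresses indicate that a copy was made.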


Returns a copy of the object.


# Type 'example(copy)' to run these at prompt and browse output

DT = data.table(A=5:1,B=letters[5:1])
DT2 = copy(DT)        # explicit copy() needed to copy a data.table
setkey(DT2,B)         # now just changes DT2
identical(DT,DT2)     # FALSE. DT and DT2 are now different tables
#> [1] FALSE

DT = data.table(A=5:1, B=letters[5:1])
nm1 = names(DT)
nm2 = copy(names(DT))
DT[, C := 1L]
#>        A      B     C
#>    <int> <char> <int>
#> 1:     5      e     1
#> 2:     4      d     1
#> 3:     3      c     1
#> 4:     2      b     1
#> 5:     1      a     1
identical(nm1, names(DT)) # TRUE, nm1 is also changed by reference
#> [1] TRUE
identical(nm2, names(DT)) # FALSE, nm2 is a copy, different from names(DT)
#> [1] FALSE