The data.table 2023 community survey is now live! Click on https://tinyurl.com/datatable-survey to fill it out. The survey will remain open until December 1st, 2023.
In addition to filling out the survey, it would be great if you could share it with others who might be interested in participating.
data.table provides a high-performance version of base R’s data.frame with syntax and feature enhancements for ease of use, convenience and programming speed.
Why data.table?
File reader: ?fread, see also convenience features for small data
File writer: ?fwrite
Joins: overlapping range joins (similar to IRanges::findOverlaps), non-equi joins (i.e. joins using operators >, >=, <, <=), aggregate on join (by=.EACHI), update on join
Reshaping: ?dcast (pivot/wider/spread) and ?melt (unpivot/longer/gather)
Columns of type list are supported
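The features listed above can be sketched on toy data. This is a minimal illustration, not taken from the package documentation: the tables sales, windows, regions and wide, along with all of their column names, are hypothetical and exist only for this example.

```r
library(data.table)

# Hypothetical toy data, invented for this sketch.
sales = data.table(
  id    = 1:4,
  store = c("a", "a", "b", "b"),
  day   = c(1L, 5L, 2L, 9L),
  amt   = c(10.5, 20.5, 30.5, 40.5)
)

# File writer and reader: round-trip through a temporary CSV file.
path = tempfile(fileext = ".csv")
fwrite(sales, path)
sales2 = fread(path)

# Non-equi join: for each window, keep the sales whose day lies in [from, to].
windows = data.table(store = c("a", "b"), from = c(0L, 0L), to = c(3L, 5L))
hits = sales[windows, on = .(store, day >= from, day <= to)]

# Aggregate on join: evaluate j once for each row of windows.
totals = sales[windows, on = "store", .(total = sum(amt)), by = .EACHI]

# Update on join: add a region column to sales by reference.
regions = data.table(store = c("a", "b"), region = c("north", "south"))
sales[regions, on = "store", region := i.region]

# Reshaping: melt goes wide to long, dcast goes long back to wide.
wide = data.table(store = c("a", "b"), q1 = c(1, 2), q2 = c(3, 4))
long = melt(wide, id.vars = "store", variable.name = "quarter", value.name = "amt")
back = dcast(long, store ~ quarter, value.var = "amt")
```

In the update on join, the i. prefix refers to a column of the table passed as i (here regions), which is how the matched value is copied into sales.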
install.packages("data.table")
# latest development version that has passed all tests:
data.table::update_dev_pkg()
See the Installation wiki for more details.
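Use the data.table subset operator [ the same way you would use the data.frame one, but:
there is no need to prefix each column with DT$ (like subset() and with(), but built-in)
any R expression is allowed in the j argument, not just a list of columns
the extra argument by computes the j expression by group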
library(data.table)
DT = as.data.table(iris)
# FROM[WHERE, SELECT, GROUP BY]
# DT  [i,     j,      by]
DT[Petal.Width > 1.0, mean(Petal.Length), by = Species]
#      Species       V1
#1: versicolor 4.362791
#2:  virginica 5.552000
example(data.table)
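As a further sketch of the three usage points, the same iris data can be queried with both data.frame and data.table syntax. The variable names (DF, df_rows, dt_rows, ratio_range, per_species) and the result column names mean_length and n are chosen here for illustration only.

```r
library(data.table)

DF = iris                   # plain data.frame, for comparison
DT = as.data.table(iris)

# 1. Columns are referenced directly inside [ ], without the DT$ prefix.
df_rows = DF[DF$Petal.Width > 1.0 & DF$Species == "virginica", ]
dt_rows = DT[Petal.Width > 1.0 & Species == "virginica"]

# 2. j accepts any R expression over the columns, not only a list of columns.
ratio_range = DT[, range(Petal.Length / Petal.Width)]

# 3. by evaluates j once per group; naming the results replaces the default V1.
per_species = DT[Petal.Width > 1.0,
                 .(mean_length = mean(Petal.Length), n = .N),
                 by = Species]
print(per_species)
```

Both subsets select the same rows. The data.table form avoids repeating the table name and needs no trailing comma to select rows.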
data.table is widely used by the R community. It is used directly by hundreds of CRAN and Bioconductor packages, and indirectly by thousands. It is one of the most-starred R packages on GitHub, and was highly rated by the Depsy project. If you need help, the data.table community is active on StackOverflow.
Guidelines for filing issues / pull requests: Contribution Guidelines.