We are a consulting company specializing in services related to
data.table, which is an R package that provides a high-performance
database system. Our services include big data analysis, machine
learning, data visualization, and teaching specialized
programming classes related to these subjects.
Who are we? We have 20+ years of experience in statistical
programming and data science, and can show you how the efficiency of
data.table can be a critical part of your big data pipelines, helping to
optimize the use of finite computational resources (time and
memory). In benchmarks, data.table can be 100x faster than other
software that provides similar functionality (base R, tidyverse,
Python pandas, polars, arrow, duckdb, etc.):
- Slides: 3-hour data.table tutorial, including figures comparing timings with other R functions, pandas, duckdb, and polars.
- Blog: code for comparing data.table with pandas, duckdb, and polars, in terms of CSV reading/writing and aggregation.
- Blog: code for comparing data.table with base R, dplyr, collapse, and arrow, in terms of CSV reading/writing and aggregation.
- Blog: code for comparing data.table with duckdb and polars, in terms of functionality and speed for data reshaping operations (SQL PIVOT/UNPIVOT).
- duckdb labs benchmark web page: data.table consistently performs near the top in this database-like ops benchmark.