We are a consulting company specializing in services related to
data.table, which is an R package that provides a high-performance
database system. Our services include big data analysis, machine
learning, data visualization, and teaching specialized
programming classes related to these subjects.
Who are we?

We have 20+ years of experience in statistical
programming and data science, and can show you how the efficiency of
data.table can be a critical part of your big data pipelines,
optimizing the use of finite computational resources (time and
memory). In benchmarks, data.table can be 100x faster than other
software that provides similar functionality (base R, tidyverse,
Python pandas, polars, arrow, duckdb, etc.):
- Slides: 3 hour data.table tutorial, including figures comparing timings with other R functions, pandas, duckdb, and polars.
- Blog: code for comparing data.table with pandas, duckdb, and polars, in terms of CSV reading/writing and aggregation.
- Blog: code for comparing data.table with base R, dplyr, collapse, and arrow, in terms of CSV reading/writing and aggregation.
- Blog: code for comparing data.table with duckdb and polars, in terms of functionality and speed for data reshaping operations (SQL PIVOT/UNPIVOT).
- duckdb labs benchmark web page: data.table consistently performs near the top in this database-like ops benchmark.