
test(benchmarks): add basic, local benchmark suite #173

Open
deepyaman opened this issue Sep 24, 2024 · 0 comments

Is local ML preprocessing with DuckDB faster than ML preprocessing with scikit-learn? How does one-hot encoding with IbisML on Snowflake compare to snowflake.ml.modeling.preprocessing.OneHotEncoder? Can ML preprocessing on the database outperform ML preprocessing in Ray Data? Should we have been training our deep learning models in T-SQL all along?

Let's start by looking at local benchmarks across various data volumes and numbers of preprocessing steps. The purpose is mostly to understand the workflows where IbisML can provide value; it is not, for instance, to argue that people shouldn't use scikit-learn for many local ML pipelines. A rough sketch of what a first pass could look like is below.
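A minimal sketch of a first pass, assuming a synthetic all-categorical dataset and timing a single one-hot-encoding step with scikit-learn versus IbisML on Ibis's default local DuckDB backend. The `ibis_ml` import name and the `Recipe`/`OneHotEncode`/`nominal`/`to_ibis` calls are assumptions about the IbisML API, not something settled in this issue; swap in whatever the actual suite ends up using.

```python
"""Minimal local benchmark sketch: one-hot encoding, scikit-learn vs. IbisML."""

import string
import time

import numpy as np
import pandas as pd


def make_data(n_rows: int, n_cat_cols: int = 4, cardinality: int = 20) -> pd.DataFrame:
    """Generate a synthetic frame of categorical columns to one-hot encode."""
    rng = np.random.default_rng(0)
    categories = list(string.ascii_lowercase)[:cardinality]
    return pd.DataFrame(
        {f"cat_{i}": rng.choice(categories, size=n_rows) for i in range(n_cat_cols)}
    )


def bench_sklearn(df: pd.DataFrame) -> float:
    """Time scikit-learn's OneHotEncoder fit_transform on the full frame."""
    from sklearn.preprocessing import OneHotEncoder

    start = time.perf_counter()
    OneHotEncoder().fit_transform(df)
    return time.perf_counter() - start


def bench_ibis_ml(df: pd.DataFrame) -> float:
    """Time an IbisML one-hot-encoding recipe on DuckDB (IbisML calls are assumed)."""
    import ibis
    import ibis_ml as ml  # assumed import name

    table = ibis.memtable(df)  # executes on Ibis's default local DuckDB backend
    recipe = ml.Recipe(ml.OneHotEncode(ml.nominal()))  # assumed step/selector names

    start = time.perf_counter()
    recipe.fit(table)
    recipe.to_ibis(table).execute()  # assumed method; materializes the result
    return time.perf_counter() - start


if __name__ == "__main__":
    for n_rows in (10_000, 100_000, 1_000_000):
        df = make_data(n_rows)
        print(
            f"{n_rows:>9} rows | sklearn {bench_sklearn(df):6.3f}s | "
            f"ibis-ml {bench_ibis_ml(df):6.3f}s"
        )
```

The same harness could then be extended along the two axes mentioned above: larger data volumes and recipes with more preprocessing steps.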
