
test(benchmarks): add basic, local benchmark suite #173

Open
deepyaman opened this issue Sep 24, 2024 · 0 comments

Is local ML preprocessing with DuckDB faster than ML preprocessing with scikit-learn? How does one-hot encoding with IbisML on Snowflake compare to snowflake.ml.modeling.preprocessing.OneHotEncoder? Can ML preprocessing on the database outperform ML preprocessing in Ray Data? Should we have been training our deep learning models in T-SQL all along?

Let's start by looking at local benchmarks across various data volumes and numbers of preprocessing steps. The purpose is mostly to understand the workflows where IbisML can provide value; it is not, for instance, to argue that people shouldn't use scikit-learn for many local ML pipelines. A rough sketch of what a first pass could look like is below.
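A minimal sketch of a first pass, assuming a synthetic all-categorical dataset and timing a single one-hot-encoding step with scikit-learn versus IbisML on Ibis's default local DuckDB backend. The `ibis_ml` import name and the `Recipe`/`OneHotEncode`/`nominal`/`to_ibis` calls are assumptions about the IbisML API, not something settled in this issue; swap in whatever the actual suite ends up using.

```python
"""Minimal local benchmark sketch: one-hot encoding, scikit-learn vs. IbisML."""

import string
import time

import numpy as np
import pandas as pd


def make_data(n_rows: int, n_cat_cols: int = 4, cardinality: int = 20) -> pd.DataFrame:
    """Generate a synthetic frame of categorical columns to one-hot encode."""
    rng = np.random.default_rng(0)
    categories = list(string.ascii_lowercase)[:cardinality]
    return pd.DataFrame(
        {f"cat_{i}": rng.choice(categories, size=n_rows) for i in range(n_cat_cols)}
    )


def bench_sklearn(df: pd.DataFrame) -> float:
    """Time scikit-learn's OneHotEncoder fit_transform on the full frame."""
    from sklearn.preprocessing import OneHotEncoder

    start = time.perf_counter()
    OneHotEncoder().fit_transform(df)
    return time.perf_counter() - start


def bench_ibis_ml(df: pd.DataFrame) -> float:
    """Time an IbisML one-hot-encoding recipe on DuckDB (IbisML calls are assumed)."""
    import ibis
    import ibis_ml as ml  # assumed import name

    table = ibis.memtable(df)  # executes on Ibis's default local DuckDB backend
    recipe = ml.Recipe(ml.OneHotEncode(ml.nominal()))  # assumed step/selector names

    start = time.perf_counter()
    recipe.fit(table)
    recipe.to_ibis(table).execute()  # assumed method; materializes the result
    return time.perf_counter() - start


if __name__ == "__main__":
    for n_rows in (10_000, 100_000, 1_000_000):
        df = make_data(n_rows)
        print(
            f"{n_rows:>9} rows | sklearn {bench_sklearn(df):6.3f}s | "
            f"ibis-ml {bench_ibis_ml(df):6.3f}s"
        )
```

The same harness could then be extended along the two axes mentioned above: larger data volumes and recipes with more preprocessing steps.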
