This GitHub portfolio gathers some of my school and work projects.
They are written in R and Python.
website
📅 SQL, BASH, PYTHON
- A study showing the potential of Dremio Iceberg with custom data
- Built a NiFi flow to download data
- Created Docker containers for Dremio, Nessie, MinIO, and NiFi
- Wrote a Python script for JSON concatenation
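
A minimal sketch of what a JSON concatenation step like this can look like. It is illustrative only and not the script in the repository: the function name `concat_json_files`, the single input folder, and the choice to merge everything into one JSON array are all assumptions.

```python
import json
from pathlib import Path


def concat_json_files(input_dir, output_file):
    """Merge every *.json file in input_dir into one JSON array (hypothetical helper)."""
    records = []
    for path in sorted(Path(input_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
        # A file may hold a single object or a list of objects.
        if isinstance(data, list):
            records.extend(data)
        else:
            records.append(data)
    with open(output_file, "w", encoding="utf-8") as fh:
        json.dump(records, fh, ensure_ascii=False, indent=2)
    return len(records)  # number of merged records
```

Writing the merged file outside the input folder avoids picking it up again on a second run.
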
🐍 PYTHON
- A follow-up study that evolves from Boolean data clustering1
- Shows the capabilities of the models, without in-depth parameter tuning
- High-level theory is explained in the documentation
- For an easier read, open the Word file and enable the Navigation Pane
🐍 PYTHON
- A real-life project on unsupervised learning with KMeans and other algorithms
- Strictly limited by the client's package versions
- Includes custom-built functions
- Outputs are saved in HDFS, then pass through further layers (PolyBase, Analysis Services) before ending up in Power BI
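
For readers new to KMeans, the core of the algorithm (Lloyd's iterations) fits in a few lines. This is a pure-Python sketch for illustration, not the client code: the function name `kmeans` and its parameters are hypothetical, and a real project would more likely rely on a library implementation.

```python
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Cluster a list of equal-length numeric tuples with Lloyd's algorithm."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k points picked at random
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return labels, centroids
```

The result depends on the random starting centroids, which is why the sketch exposes a `seed`.
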
🐍 PYTHON
- Only the PowerPoint presentation is available due to an NDA
- A small POC, since only a single feature is analyzed (univariate)
- Tried several algorithms for unsupervised and supervised learning (supervised not shown here)
🐍 PYTHON
- A really simple data pipeline
- Taken from a YouTube tutorial, with some improvements
- Created a presentation with PowerPoint
🐍 PYTHON
- Data Manipulation and Analysis project
- Dataset from Kaggle about billion-dollar companies around the world
- Tried Jupyter Notebook's built-in presentation mode (see UnicornCompanies.slides)
📈 R
- Work project on error detection in annual reporting.
- Only the R script is available due to an NDA.
📈 R
- Self-taught exercises for work projects. The link above contains some examples that were shown to clients.
📈 R
- A tool to calculate the probability of fire in Algerian forests.
- Machine learning models used: LDA, SVM (linear and radial kernels), KNN, random forest, and rpart
📈 R
- A study about O3 pollution in the USA (O3 pollution.pdf).
- Kriging method
📈 R
- This project covers three different problems:
  - Created a tool to predict prostate cancer using ridge and LASSO regression (PROBLEMA 1);
  - Created a synthetic dataset to understand how ridge and LASSO work (PROBLEMA 2);
  - Developed a shooting algorithm on another synthetic dataset to see how LASSO works (PROBLEMA 3)
- It's a deep dive into how the ridge and LASSO algorithms work.
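
The shooting algorithm is coordinate descent for LASSO: each coefficient is updated in turn by soft-thresholding while the others stay fixed. Below is a minimal Python sketch of the idea (the project itself is in R, so this is not the project code). It assumes the objective `0.5 * ||y - X b||^2 + lam * ||b||_1` with no intercept, and the names `soft_threshold` and `lasso_shooting` are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; anything within [-lam, lam] becomes 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_shooting(X, y, lam, n_iter=1000, tol=1e-10):
    """Minimize 0.5 * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    X is a list of rows, y a list of targets; no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b while b is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that leaves out beta[j].
            rho = sum(X[i][j] * residual[i] for i in range(n)) + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                # Keep the residual in sync with the updated coefficient.
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break  # no coefficient moved noticeably in this sweep
    return beta
```

With an orthogonal design such as `X = [[1, 0], [0, 1]]`, `y = [3, 0.5]` and `lam = 1`, the sketch returns `[2.0, 0.0]`: the first coefficient is shrunk by 1 and the second is set exactly to zero, which is the variable-selection effect that ridge does not have.
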
🐍 PYTHON
- This was my first time with Python and my first neural network
- For a better view, go to this link: Google Colab
- Everything is explained there, so just enjoy the read!
📈 R
- This is the final project for a first-semester Data Science course. It runs a really simple PCA on economic data.
- I learned new plot types and new libraries for better visualization and analysis.
- This project was useful for learning the basic analytical skills needed for the clustering models covered in other courses
📈 R
- Tested the influence of neighbourhood on real estate value
- Moran, Geary, Getis-Ord, and Jarque-Bera tests
- Spatial autoregressive model (SAM)
- Spatial lag model (SLM)
- Spatial error model (SEM)
- Spatial Durbin model (SDM)
- SARAR
- Spatial Durbin error model (SDEM)
- Spatial lag x (SLX)
- General nesting model (GNS)
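
As a pointer to what the Moran test measures, here is a small Python sketch of the global Moran's I statistic (the project itself is in R, so this is not the project code). The function name `morans_i` and the plain list-of-lists weight matrix are assumptions made for illustration, and the sketch computes only the statistic, not its significance test.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weight matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]  # deviations from the mean
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    # Weighted cross-products of deviations between every pair of locations.
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)
```

On four locations in a row, with a weight of 1 between direct neighbours, the values `[1, 2, 3, 4]` give I = 1/3 (similar values sit next to each other) and `[1, -1, 1, -1]` give I = -1 (neighbours are as different as possible).
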
📈 R
- Created a model to predict purchases from social media ads
- Dummy-variable analysis on gender
- Confusion matrix
📈 R (it was my first time)
- Created a model to estimate revenue for movies
- Dataset from Kaggle
- Selected the best variables for prediction and tested the OLS model with the Jarque-Bera, Breusch-Pagan, and Durbin-Watson tests
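
Two of these diagnostics are simple functions of the OLS residuals. The Python sketch below is illustrative only (the project itself is in R): the function names are hypothetical, the input is assumed to be a plain list of residuals in observation order, and Breusch-Pagan is left out because it needs an auxiliary regression.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: squared successive differences over squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    return num / sum(e * e for e in residuals)


def jarque_bera(residuals):
    """Jarque-Bera statistic built from sample skewness and kurtosis."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Central moments of order 2, 3 and 4 (divided by n, not n - 1).
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
```

A Durbin-Watson value near 2 suggests no first-order autocorrelation, while values toward 0 or 4 suggest positive or negative autocorrelation. Under normality, the Jarque-Bera statistic is asymptotically chi-squared with 2 degrees of freedom, so large values point away from normal residuals.
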
That's all for now.
Bye 👋