November 3, 2022
Sourced data and prepared it for analysis, checking validity, accuracy, completeness, consistency, and uniformity.
Performed feature selection and engineering, including label and one-hot encoding, SMOTE oversampling, normalization, and a train-test split with a 0.25 test size and a random state of 1.
Trained AdaBoost, Random Forest, SVM, Lasso Regression, Naive Bayes, and Logistic Regression models; Random Forest performed best, with a score of 92%.
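
The workflow above can be sketched roughly as follows. This is a minimal illustration, not the original project code: the toy data, column names, stratified split, min-max scaling, and use of accuracy as the "score" are all assumptions, since the source names none of them. The `smote_like` helper is a hypothetical, simplified stand-in for SMOTE (in practice `imblearn.over_sampling.SMOTE` would be the usual choice), and "Lasso Regression" is interpreted here as L1-penalised logistic regression.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
from sklearn.svm import SVC

# Hypothetical toy data; the real dataset is not described in the source.
rng = np.random.default_rng(1)
n = 200
df = pd.DataFrame({
    "age": rng.integers(18, 70, size=n),
    "income": rng.normal(50_000, 12_000, size=n),
    "region": rng.choice(["north", "south", "east"], size=n),
    "label": ["yes"] * 50 + ["no"] * 150,  # imbalanced target
})

# Label-encode the target ("no" -> 0, "yes" -> 1); one-hot encode the categorical feature.
y = LabelEncoder().fit_transform(df["label"])
X = pd.get_dummies(df.drop(columns="label"), columns=["region"], dtype=float)

# 0.25 test size and random state 1, as stated; stratify=y is an added assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1, stratify=y
)

# Normalize with statistics from the training rows only (min-max is assumed).
scaler = MinMaxScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)


def smote_like(X_min, n_new, k=3, seed=1):
    """Simplified SMOTE: interpolate between minority rows and their k nearest neighbours."""
    r = np.random.default_rng(seed)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    neigh = nn.kneighbors(X_min, return_distance=False)[:, 1:]  # drop the self-match
    base = r.integers(0, len(X_min), size=n_new)
    partner = neigh[base, r.integers(0, k, size=n_new)]
    gap = r.random((n_new, 1))
    # Note: one-hot columns are interpolated too, so they may become fractional.
    return X_min[base] + gap * (X_min[partner] - X_min[base])


# Oversample the minority class in the training set only, until classes are balanced.
n_new = int((y_train == 0).sum() - (y_train == 1).sum())
X_bal = np.vstack([X_train_s, smote_like(X_train_s[y_train == 1], n_new)])
y_bal = np.concatenate([y_train, np.ones(n_new, dtype=y_train.dtype)])

models = {
    "AdaBoost": AdaBoostClassifier(random_state=1),
    "Random Forest": RandomForestClassifier(random_state=1),
    "SVM": SVC(random_state=1),
    "Lasso-style (L1) logistic": LogisticRegression(penalty="l1", solver="liblinear"),
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# Fit each model on the balanced training data and report test accuracy.
scores = {name: m.fit(X_bal, y_bal).score(X_test_s, y_test) for name, m in models.items()}
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.2%}")
```

Because the toy labels are unrelated to the features, the printed scores say nothing about the 92% reported above. In this sketch the scaler is fitted and the oversampling applied after the split, so no information from the held-out rows leaks into training; the source does not say which order the original project used.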