All about my experience at the Metis Data Science Bootcamp
Weeks four through six took us through all of the classification models and a whole grab bag of topics related to working in the cloud and web development.
We covered most of the main classification algorithms in use today. For each algorithm we took an in-depth look at its mathematical underpinnings and talked in detail about its strengths and weaknesses. A common review exercise was comparing and contrasting when it would be appropriate to use one algorithm over another, and how to determine which performed best. I've created a summary 'cheat sheet' and shared it below.
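On that last point, deciding which model performs best: the standard tool is k-fold cross-validation. Here is a minimal standard-library sketch (the function names are my own, not anything from the course), where `fit` and `predict` are whatever callables your model of choice provides:

```python
import random

def k_fold_indices(n, k, seed=0):
    """Split indices 0..n-1 into k roughly equal, shuffled folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_accuracy(fit, predict, X, y, k=5):
    """Mean accuracy of a classifier over k train/test splits."""
    scores = []
    for fold in k_fold_indices(len(X), k):
        test = set(fold)
        X_tr = [x for i, x in enumerate(X) if i not in test]
        y_tr = [t for i, t in enumerate(y) if i not in test]
        model = fit(X_tr, y_tr)
        correct = sum(predict(model, X[i]) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / k
```

Running this loop for each candidate model on the same folds gives a like-for-like comparison of their out-of-sample accuracy.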
| **K-Nearest Neighbors (KNN)** | |
| --- | --- |
| Key Concept | Classify a point by majority vote among its k closest training points. |
| Assumptions | Scale variant; data should be standardized. |
| Hyperparameters | k (the number of neighbors) and the distance metric. |
| Interpretability | Highly interpretable; a prediction can be traced directly to the neighbors that voted for it. |
| Computation Time | Quick to fit, slow to predict. Requires holding all of the data in memory to predict, so it may run slowly with very large datasets. |
| Accuracy | Varies. |
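The nearest-neighbors voting scheme described above is simple enough to sketch in a few lines. This is a toy version (function name is mine), assuming the features have already been standardized:

```python
import math
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points.
    There is no real 'fit' step: all the work happens at predict time,
    which is why KNN must keep the whole training set in memory."""
    dists = sorted(
        (math.dist(p, x), label) for p, label in zip(X_train, y_train)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]
```

The full sort makes the laziness obvious: every prediction touches every training point, which is exactly the computation-time caveat in the table.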
| **Logistic Regression** | |
| --- | --- |
| Key Concept | Model the log-odds of class membership as a linear function of the features, then squash through the sigmoid to get a probability. |
| Assumptions | Scale variant; data should be standardized. |
| Hyperparameters | Regularization type (L1/L2) and strength. |
| Interpretability | Highly interpretable; returns probabilities and coefficients. Loses interpretability with multiple classes (one-vs-rest probabilities do not add up to one unless using the multinomial formulation). |
| Computation Time | Linear and relatively fast; scales well. Multi-class is much slower. |
| Accuracy | Varies. |
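As a rough illustration of where those probabilities and coefficients come from, here is a toy batch-gradient-descent fit for the binary case (function names are mine; a real library uses better optimizers and adds regularization):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=1000):
    """Fit binary logistic regression by batch gradient descent.
    Assumes y in {0, 1} and standardized features."""
    n_feat = len(X[0])
    w = [0.0] * n_feat
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_feat
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # error = predicted probability minus the true label
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += err
            for j in range(n_feat):
                grad_w[j] += err * xi[j]
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict_proba(w, b, x):
    """Probability of class 1 -- the interpretable output noted above."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
```

The learned `w` are the coefficients you would read off to interpret the model: each one is the change in log-odds per unit change in that (standardized) feature.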
| **Support Vector Machine (SVM)** | |
| --- | --- |
| Key Concept | Find the separating hyperplane that maximizes the margin between classes; kernels allow nonlinear boundaries. |
| Assumptions | Scale variant; data should be standardized, especially with nonlinear kernels. |
| Hyperparameters | Kernel choice, the regularization parameter C, and gamma for the RBF kernel. |
| Interpretability | Linear classification is easily interpretable; regression and other kernels much less so, since the complex data transformations and the resulting boundary plane are very difficult to interpret. |
| Computation Time | Depends on the kernel; linear is much faster than polynomial or RBF. |
| Accuracy | Varies. |
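The kernels mentioned above are just similarity functions between two points; the "kernel trick" is that the SVM only ever needs these pairwise values, never the (possibly huge) transformed feature vectors. A minimal sketch of the three common ones (my own function names):

```python
import math

def linear_kernel(u, v):
    """Plain dot product -- the fast, interpretable case."""
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(u, v, degree=3, c=1.0):
    """Polynomial kernel: implicit feature map of all degree-d interaction terms."""
    return (linear_kernel(u, v) + c) ** degree

def rbf_kernel(u, v, gamma=1.0):
    """RBF (Gaussian) kernel: similarity that decays with squared distance,
    corresponding to an infinite-dimensional feature space."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))
```

The RBF kernel's distance term is also why scaling matters: a feature measured in large units dominates the squared distance and drowns out the others.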
| **Naive Bayes** | |
| --- | --- |
| Key Concept | Applies Bayes' Theorem to each feature independently, then multiplies the per-feature likelihoods together. |
| Assumptions | Features are conditionally independent given the class. |
| Hyperparameters | The smoothing constant (e.g. Laplace/add-one smoothing). |
| Interpretability | Highly interpretable; the calculations consist of simple counting and multiplication. |
| Computation Time | Very fast; fitting is essentially a single counting pass over the data. |
| Accuracy | Varies. |
| **Decision Tree** | |
| --- | --- |
| Key Concept | Greedily split the data on the feature and threshold that most reduce impurity, repeating until the leaves are (nearly) pure. |
| Assumptions | Non-parametric; scale invariant. |
| Hyperparameters | Split criterion (regression uses MSE; classification uses entropy or Gini), plus maximum depth and minimum samples per leaf. |
| Interpretability | Highly interpretable. |
| Computation Time | Scales well (parallelizable, depending on implementation). |
| Accuracy | Prone to overfitting. |
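The entropy/Gini criteria above are simple to compute. Here is a sketch of the impurity measures a classification tree minimizes at each split (helper names are my own):

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity: chance of mislabeling a random draw from this node."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits; 0 for a pure node."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def split_impurity(left, right, criterion=gini):
    """Weighted impurity of a candidate split -- the tree greedily
    picks the feature/threshold that minimizes this."""
    n = len(left) + len(right)
    return len(left) / n * criterion(left) + len(right) / n * criterion(right)
```

Growing the tree is then just trying every candidate split, keeping the one with the lowest `split_impurity`, and recursing, which is also why an unconstrained tree overfits: it will happily split until every leaf is pure.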
| **Random Forest** | |
| --- | --- |
| Key Concept | An ensemble of decision trees, each fit on a bootstrap sample with a random subset of features; predict by majority vote. |
| Assumptions | Non-parametric; scale invariant. |
| Hyperparameters | Number of trees, features considered per split, and the individual tree settings. |
| Interpretability | Less interpretable than a single tree, though feature importances help. |
| Computation Time | Scales well (highly parallelizable). |
| Accuracy | Varies; ensembling makes it much less prone to overfitting than a single tree. |
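The two ingredients that make the ensemble so parallelizable, bootstrap sampling and majority voting, can be sketched as follows (names are mine; the tree-fitting itself is omitted, and each `tree` is just any callable mapping a row to a label):

```python
import random
from collections import Counter

def bootstrap_sample(X, y, rng):
    """Sample len(X) rows with replacement -- each tree in the forest
    is trained on a different resampled view of the data."""
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    return [X[i] for i in idx], [y[i] for i in idx]

def forest_predict(trees, x):
    """Majority vote over the ensemble's independent predictions.
    Because the trees never talk to each other, both fitting and
    predicting can be farmed out across cores trivially."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]
```

The independence of the trees is the whole point: each one overfits its own bootstrap sample, and the vote averages those errors away.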
The big event of week 6 at Metis was the completion of our third project, McNulty. This project focused on the classification models we'd covered over the past weeks. I again learned the hard way that real-world problems involving messy data gathered from multiple sources are not a good fit for a two-week learning project.
My topic, the factors that help high schools successfully prepare kids for college, seemed like a straightforward enough project. Alas, it was not. Although the data I collected did not come together the way I had hoped, I did get some practice working with a variety of classification models.