Franco Arda

Save and Load a Machine Learning Model: Not so easy ...

Finding an accurate Machine Learning model is often not the end of the project. If the model goes into "production", we also have to save and load it. In this case, I'm using Python's pickle. This part is rarely taught in Data Science classes, and it's not easy. One tricky part: if you need to normalize your data (e.g. with a Min Max Normalizer), you must make sure that you use the very same scale for new data!
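To make the "very same scale" point concrete, here is a minimal sketch in plain Python. It uses a hand-rolled min-max scaling rather than a library scaler, and the helper names (fit_min_max, apply_min_max) and the numbers are made up for illustration. The idea is that the minimum and maximum learned from the training data are what gets pickled and reused. Re-fitting the scale on the new data gives different, misleading values.

```python
import pickle


def fit_min_max(values):
    """Learn the scale (min and max) from the training values."""
    return min(values), max(values)


def apply_min_max(values, params):
    """Scale values with a previously learned (min, max) pair."""
    lo, hi = params
    return [(v - lo) / (hi - lo) for v in values]


# Made-up numbers, purely for illustration.
train_ages = [20.0, 30.0, 40.0, 60.0]
new_ages = [30.0, 50.0]

# Learn the scale once, on the training data, and persist it.
params = fit_min_max(train_ages)
restored = pickle.loads(pickle.dumps(params))

# Correct: new data is scaled with the training min and max.
correct = apply_min_max(new_ages, restored)

# Wrong: a fresh scale fitted on the new data alone.
wrong = apply_min_max(new_ages, fit_min_max(new_ages))

print(correct)  # [0.25, 0.75]
print(wrong)    # [0.0, 1.0]
```

The same raw values end up at different positions depending on which scale is used, so a model trained on the first scale would be fed inputs it was never trained to interpret.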

The following Jupyter Notebook is the shortest possible Logistic Regression I could write. It is deliberately short, because the focus is on how to use Python's pickle module.

The notebook has four steps:
(1) Normalize the data with the Min Max Scaler.
(2) Save the model as a pickle.
(3) Provide the new data we want a prediction for.
(4) Open the pickled model in order to predict.

With the new data, the model predicts 72.78%. Obviously, we do not need to normalize the data at the source. In fact, normalizing it there would be a mistake, similar to not normalizing at all.
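For readers without the notebook at hand, the four steps can be sketched roughly as follows with scikit-learn. This is not the original notebook: the training data, the file name logreg_bundle.pkl, and the choice to pickle the scaler together with the model in one dictionary are all assumptions made for illustration. The printed probability will therefore not match the 72.78% above.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import MinMaxScaler

# Made-up training data: two features on very different scales.
X_train = np.array(
    [[20, 20000], [25, 30000], [35, 60000],
     [45, 80000], [50, 90000], [60, 120000]],
    dtype=float,
)
y_train = np.array([0, 0, 0, 1, 1, 1])

# (1) Normalize the training data and fit the model on the scaled values.
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X_train)
model = LogisticRegression()
model.fit(X_scaled, y_train)

# (2) Save the model as a pickle. Assumption: the fitted scaler is stored
# alongside it, so the training scale travels with the model.
path = os.path.join(tempfile.gettempdir(), "logreg_bundle.pkl")  # hypothetical file name
with open(path, "wb") as f:
    pickle.dump({"scaler": scaler, "model": model}, f)

# (3) New data arrives raw, i.e. not normalized at the source.
X_new = np.array([[40, 70000]], dtype=float)

# (4) Load the pickle, apply the saved scale with transform (not
# fit_transform), and predict.
with open(path, "rb") as f:
    bundle = pickle.load(f)

X_new_scaled = bundle["scaler"].transform(X_new)
proba = bundle["model"].predict_proba(X_new_scaled)[0, 1]
print(f"Predicted probability of class 1: {proba:.2%}")
```

Calling transform on the loaded scaler reuses the minimum and maximum learned in step (1). Calling fit_transform on the new data instead would silently create a new scale.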

If we wanted to run this model in production in Tableau, we would have to add a few steps. Those can be seen in my other case study posts.


Franco Arda, Frankfurt am Main (Germany)