Mathematics behind Machine Learning: What You Need to Become a Better Data Scientist

3 min read · Oct 25, 2020

You do not have to understand the mathematics behind a model to make it work, but understanding it is crucial for success. Without it, you are just experimenting with multiple black boxes and keeping whichever one works best.
So you can become a Data Scientist without the maths, but you will rely heavily on luck and libraries 😓. To be successful, you need to understand the maths so that you can fine-tune your model and, more importantly, explain it with confidence 😎. It is an intrinsic part of a data scientist’s role, and recruiters and experienced machine learning professionals will vouch for this.

I was first introduced to the world of Data Analytics while working as a software developer in the IT sector for a Fortune 500 company. At that time, the term Data Science wasn’t as well known as it is today. Nonetheless, I understood that every industry generates a lot of data, and that not tapping into it wastes valuable insights. I was working as an iOS developer, but I was always more excited about using automation and app data to generate insights.

I started my journey with the famous Machine Learning A-Z course on Udemy. No doubt it is an amazing course for beginners who want an overview of ML, as it covers the breadth of the field. As I advanced, my interest grew, mainly because the field involved some mathematics 👨 ➡️ ❤️.

😪 My struggle was to find resources with an in-depth mathematical explanation of at least the most commonly used ML algorithms, like decision trees and SVMs. Most importantly, I needed a place where I could get clarifications and have my questions answered.

🥳 Finally, I decided to create a resource of my own to help myself and my colleagues.

👀 Why Worry About The Maths?[1]

There are many reasons why the mathematics of Machine Learning is important. I will highlight some of them below:

Selecting the right algorithm, which includes considering accuracy, training time, model complexity, number of parameters, and number of features.
Choosing parameter settings and validation strategies.
Identifying underfitting and overfitting by understanding the Bias-Variance tradeoff.
Estimating the right confidence interval and uncertainty.
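
To make the underfitting vs. overfitting point a bit more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption chosen for illustration and is not taken from the series itself: the synthetic noisy sine data, the helper names (`make_data`, `knn_predict`, `mse`), and the choice of a k-nearest-neighbours regressor. The single parameter `k` controls model complexity, and comparing the training error with the error on a held-out validation set is one simple way to see which regime a model is in.

```python
import math
import random


def make_data(n, rng):
    """Noisy samples of sin(x) on [0, 2*pi]; a synthetic stand-in for real data."""
    xs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0.0, 0.3) for x in xs]
    return xs, ys


def knn_predict(train_x, train_y, x, k):
    """Predict by averaging the targets of the k training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k


def mse(train_x, train_y, xs, ys, k):
    """Mean squared error of the k-NN predictions on the points (xs, ys)."""
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


rng = random.Random(0)  # fixed seed so the run is repeatable
train_x, train_y = make_data(60, rng)
val_x, val_y = make_data(60, rng)  # held-out data, never used for fitting

results = {}
for k in (1, 5, 60):  # very flexible, moderate, very rigid
    train_err = mse(train_x, train_y, train_x, train_y, k)
    val_err = mse(train_x, train_y, val_x, val_y, k)
    results[k] = (train_err, val_err)
    print(f"k={k:>2}  train MSE={train_err:.3f}  validation MSE={val_err:.3f}")
```

With `k=1` the model memorises the training set: its training error is zero, but the validation error is not, which is the signature of high variance (overfitting). With `k=60` every prediction is simply the mean of all training targets, so both errors stay high, which is the signature of high bias (underfitting). A moderate `k` usually lands somewhere in between, and finding it is the kind of parameter and validation choice mentioned in the list above.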

📈 In this ever-evolving series, I will walk through how widely used ML algorithms work, the mathematical equations behind them and how those equations are formulated, and how to apply the algorithms with the scikit-learn library.

⚠️ This series will require some basic understanding of probability, calculus, and statistics. I will try my best to keep it as simple as possible.

Below is the list:

ML00: The very basics for Machine Learning
ML01: Linear Regression
ML02: Logistic Regression
ML03: Clustering and K-means
ML04: Kernel Density Estimation
ML05: Support Vector Machines (SVM)
ML06: Multi-Class Classification
ML07: Loss functions, Cross-validation
ML08: Dimensionality Reduction — PCA, AutoEncoders
ML09: Generative Models, Naive Bayes

More coming soon…

I will add links to the articles as soon as they are ready. Most importantly, this list will grow regularly, so bookmark 🔖 it and you won’t miss a new update 😃 🙌🏻

I hope this series helps you in your career. Happy Learning ✌🏻

For comments and recommendations you can reach me at:

Vaibhav Malhotra | Portfolio

I love writing code !! 🖥 !! My specialties include quickly learning new skills and programming languages, problem…

malhotravaibhav.com

Vaibhav Malhotra - Concordia University - Montreal, Quebec, Canada | LinkedIn

I am in my final semester of Masters in Software Engineering at Concordia University, Montreal. I'm currently seeking…

linkedin.com

Vaibhav3M - Overview

Hey! I'm Vaibhav! 👋🏻 I'm an aspiring data scientist dedicated to finding insights in data as well as educating others…

github.com

References:

[1] https://towardsdatascience.com/the-mathematics-of-machine-learning-894f046c568