Mathematics behind Machine Learning: What You Need to Become a Better Data Scientist

I was first introduced to the world of Data Analytics while working as a software developer in the IT sector for a Fortune 500 company. At that time, the term "data science" was not yet this well known. Nonetheless, I understood that every industry generates a lot of data, and that not tapping into it wastes valuable insights. I was working as an iOS developer, but I was always more excited about using automation and app data to generate insights.

I started my journey with the well-known Machine Learning A-Z course on Udemy. It is an amazing course for beginners because it covers the breadth of ML. As I advanced, my interest in the field grew, mainly because it involved some mathematics 👨 ➡️ ❤️.

😪 My struggle was finding resources that gave an in-depth mathematical explanation of at least the most commonly used ML algorithms, such as decision trees and SVMs. Just as importantly, I wanted a place where I could get my questions answered.

🥳 Finally, I decided to create a resource of my own to help myself and my colleagues.

👀 Why Worry About The Maths?[1]

There are many reasons why the mathematics of Machine Learning is important and I will highlight some of them below:

  1. Selecting the right algorithm, which means weighing accuracy, training time, model complexity, number of parameters, and number of features.
  2. Choosing parameter settings and validation strategies.
  3. Identifying underfitting and overfitting by understanding the Bias-Variance tradeoff.
  4. Estimating the right confidence interval and uncertainty.
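
To make point 3 a little more concrete, here is a minimal sketch of how underfitting and overfitting can show up in scikit-learn. Everything in it is an assumption chosen for illustration, not something taken from the later posts: the noisy sine-wave data, the polynomial degrees (1, 4, 15), and the helper name `train_and_cv_mse`. It fits polynomial regressions of increasing complexity and compares the training error with the 5-fold cross-validated error.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (an assumption for illustration): a noisy sine wave.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(60, 1))
y = np.sin(2 * np.pi * X.ravel()) + rng.normal(0.0, 0.2, size=60)

# Shuffled 5-fold split so every fold covers the whole input range.
cv = KFold(n_splits=5, shuffle=True, random_state=0)


def train_and_cv_mse(degree):
    """Return (training MSE, cross-validated MSE) for a polynomial fit."""
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    # scikit-learn reports negated MSE so that higher is always better.
    scores = cross_val_score(model, X, y, cv=cv, scoring="neg_mean_squared_error")
    cv_mse = float(-scores.mean())
    model.fit(X, y)
    train_mse = float(np.mean((model.predict(X) - y) ** 2))
    return train_mse, cv_mse


# Low, moderate, and high model complexity (hypothetical choices).
results = {degree: train_and_cv_mse(degree) for degree in (1, 4, 15)}
for degree, (train_mse, cv_mse) in results.items():
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  CV MSE={cv_mse:.3f}")
```

A straight line (degree 1) cannot follow the sine curve, so both errors stay high, which is the signature of underfitting (high bias). As the degree grows, the training error keeps falling. When the cross-validated error stops following it down, the model is starting to fit noise, which is overfitting (high variance).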

📈 In this ever-evolving series, I will go through how widely used ML algorithms work, how the mathematical equations behind them are formulated, and how to apply them with the scikit-learn library.

⚠️ This series assumes a basic understanding of probability, calculus, and statistics. I will try my best to keep it as simple as possible.

Below is the list of posts in the series:

  • ML00: The very basics for Machine Learning
  • ML01: Linear Regression
  • ML02: Logistic Regression
  • ML03: Clustering and K-means
  • ML04: Kernel Density Estimation
  • ML05: Support Vector Machines (SVM)
  • ML06: Multi-Class Classification
  • ML07: Loss functions, Cross-validation
  • ML08: Dimensionality Reduction — PCA, AutoEncoders
  • ML09: Generative Models, Naive Bayes

More coming soon…


