Posts

Showing posts with the label ML

Tackling Imbalanced Datasets in Classification Problems

When I started working with imbalanced data, I quickly realized there was a gap between theory and what actually happens in practice. This post is about how I handle imbalanced datasets in classification problems. I'll walk you through what I learned, what tripped me up, and the lessons that stuck with me. No fluff, just honest notes from someone who went through it.

Introduction to Imbalanced Datasets

I still remember the first time I encountered an imbalanced dataset in a classification problem. I was working on a fraud detection model, and my initial results showed a whopping 99 percent accuracy. Sounds great, right? But as I dug deeper, I realized that my model was predicting every single instance as non-fraud. It was essentially useless, because it could not detect a single fraudulent case. This experience taught me a valuable lesson: accuracy can be a misleading metric when one class vastly outnumbers the other.

The Problem with Imbalanced Datasets

Imbalance...
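The 99 percent accuracy trap from the fraud detection story can be reproduced with a few lines of arithmetic. This is only a minimal sketch: the class ratio (10 fraud cases in 1,000 transactions) is an assumption chosen to match the 99 percent figure, and the `accuracy` and `recall` helpers are illustrative names, not code from the post.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positive cases that the model caught."""
    true_pos = sum(
        1 for t, p in zip(y_true, y_pred) if t == positive and p == positive
    )
    actual_pos = sum(1 for t in y_true if t == positive)
    return true_pos / actual_pos if actual_pos else 0.0


# Assumed data: 1,000 transactions, 10 of them fraudulent (label 1).
y_true = [1] * 10 + [0] * 990

# A "model" that labels every transaction as non-fraud (label 0).
y_pred = [0] * 1000

acc = accuracy(y_true, y_pred)  # 990 correct out of 1,000
rec = recall(y_true, y_pred)    # 0 of the 10 fraud cases caught

print(f"accuracy: {acc:.2%}")
print(f"fraud recall: {rec:.2%}")
```

Accuracy looks excellent while recall on the fraud class is zero, which is why a per-class metric such as recall is worth checking alongside accuracy.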

Demystifying Model Predictions with SHAP Values

When I started working with SHAP, I quickly realized there was a gap between theory and what actually happens in practice. This post is about how I used SHAP values to understand what my model was actually doing. I'll walk you through what I learned, what tripped me up, and the lessons that stuck with me. No fluff, just honest notes from someone who went through it.

Introduction to SHAP Values

As a machine learning engineer, I've often found myself wondering what's driving my model's predictions. Are the features I've carefully selected truly influencing the outcomes, or is something else at play? I found an answer when I started using SHAP values, a technique that has changed the way I understand and debug my models. In this article, I'll share my experience with SHAP values, the lessons I learned, and the mistakes I made along the way.

What are SHAP Values?

SHAP (SHapley Additive exPlanations) values are a technique used to ...

Structuring a Real-World Machine Learning Project from Scratch

When I started working on project structure, I quickly realized there was a gap between theory and what actually happens in practice. This post is about how I structured a real ML project from scratch. I'll walk you through what I learned, what tripped me up, and the lessons that stuck with me. No fluff, just honest notes from someone who went through it.

Introduction to Machine Learning Project Structure

I'll never forget my first machine learning project. I was excited to dive in and start building, but I made a critical mistake: I put all my code in a single, massive script. It wasn't long before I realized that this approach wouldn't scale, especially once a second teammate joined the project. The script was cumbersome, difficult to navigate, and prone to errors. I learned the hard way that a well-structured project is essential for success in machine learning. As I worked through the challenges of building a machine learning project from scratch, I disco...

Streamlining ML Experiment Tracking with MLflow

When I started working with MLflow, I quickly realized there was a gap between theory and what actually happens in practice. This post is about how MLflow changed the way I track ML experiments. I'll walk you through what I learned, what tripped me up, and the lessons that stuck with me. No fluff, just honest notes from someone who went through it.

Introduction to MLflow

As I delved into the world of machine learning, I quickly realized that tracking experiments was a crucial part of the development process. However, I was doing it the hard way: recording metrics through print statements and in notebooks. It wasn't until I discovered MLflow that I was able to streamline my workflow and make the most of my experiments. In this article, I'll share my experience with MLflow, the lessons I learned, and how it changed the way I approach ML development.

The Struggle is Real

Before MLflow, I was struggling to keep track of my experiments. I would run multiple iterations, tweaking ...