Look at all this math and code you need to understand to learn machine learning. It can be very hard. For me, at least, it was, until I learned these 5 secrets, which honestly aren’t even secrets, but no one really teaches them, although everyone should know them. I mean, I spent the last 3 and a half years studying machine learning, and it took me way too long to learn these secrets on my own.
So let me reveal them to you so that you don’t have to struggle for that long. You’re thinking of math the wrong way around, which already sets you up for failure at the very beginning, but it isn’t your fault. Back when I started learning machine learning, some of my professors would simply throw a formula on screen and tell us, “This is the loss function for a decision tree.” And that was it. Most of my peers and I were confused and simply stared at the formula, waiting for a magical aha moment where the formula made sense. I was always asking myself how those smart scientists could understand math so well that they could develop new algorithms and just think in the language of math. Then I realized I was thinking the wrong way around. I was focusing too much on the actual mathematical formulas in the realm of math instead of taking a step back and thinking like a scientist. I mean, I was literally looking at a formula and trying to understand the formula as a whole, which for me now, after learning the secret, just doesn’t make any sense anymore. Don’t think of math as something abstract; make it human-interpretable. You need to think the other way around. Think of the idea a human had, understand it, and then think of how to translate it into the language of math.
This may sound very confusing, but math is not a standalone language in which people think. Scientists think just like you and me, in natural language. They just know how to translate their ideas into the formalisms of math, which then allows those ideas to be implemented and further developed using the rules of math. As mentioned, I was always looking at a formula as a whole, but each component of a formula is just a component of this human idea. For example, a sum or a product is literally just a for loop, and any conditions on it are literally equivalent to an if-else statement in code. Of course, this is easier said than done, and understanding a mathematical concept through human ideas requires someone to properly teach you these human ideas and how to translate them step by step. But in my experience, there are two scenarios. One, the teacher does that already, but you don’t understand why they do it because you were never explicitly taught to think that way. Or two, the teacher really only looks at the formulas and derivations. In that case, you need to figure out the original human idea yourself, for example by looking it up online. The good thing is that you now know it is not your fault and that there is an intuitive understanding of the math that you can find. Math is just the formalization of a human idea. Very few people actually think in the language of math; it’s just a tool. But when it comes to the intermediate steps of a derivation, you often do have to think in the language of math, which is very difficult unless you know the next secret. This secret literally changed the way I look at scary math derivations like this one.
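To make the point about sums and products being for loops a bit more concrete, here is a minimal Python sketch. The two formulas (classification accuracy and a Bernoulli likelihood) and the function names `accuracy` and `bernoulli_likelihood` are just examples I picked for illustration; they are not from any specific lecture or library.

```python
# Formula 1 (a sum with a condition):
#   accuracy = (1/n) * sum over i of [y_i == y_hat_i]
def accuracy(y_true, y_pred):
    correct = 0
    for y, y_hat in zip(y_true, y_pred):  # the sum sign is the for loop
        if y == y_hat:                    # the condition is an if-else
            correct += 1
        else:
            correct += 0
    return correct / len(y_true)


# Formula 2 (a product with a case distinction):
#   likelihood = product over i of (p if y_i == 1, else 1 - p)
def bernoulli_likelihood(labels, p):
    result = 1.0
    for y in labels:          # the product sign is also a for loop
        if y == 1:            # first case of the formula
            result *= p
        else:                 # second case of the formula
            result *= 1.0 - p
    return result


print(accuracy([1, 0, 1, 1], [1, 1, 1, 0]))   # 2 of 4 predictions match -> 0.5
print(bernoulli_likelihood([1, 0, 1], 0.5))   # 0.5 * 0.5 * 0.5 -> 0.125
```

The human idea behind the first formula is simply "count how often the prediction was right and divide by the number of examples." Once you read it that way, the sum sign and the condition stop looking scary.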
Again, jumping back in time to when I watched a lecture at college, my professor would explain the intuitive idea of an ML algorithm, show the translation into the language of math, and then show us where we want the formula to end up, either to make it more efficient or simply to make it work as an algorithm that can be implemented.
But at some point he would go off on a derivation spree, writing out one step after the other and expecting us to understand why he did what he did. Everyone was confused and, of course, scared and annoyed. But these derivations are simpler than you might think. Not easy, but much simpler to execute after learning this one secret. I realized each step was simply applying one specific rule or definition. I realized that, up to a certain degree, these mathematical derivations or transformations just require you to collect a list of rules and tricks that you can then simply apply. During the lectures, for each step I saw, I would explicitly look for the rule or definition that was used and write it down on my list. When solving or reading math derivations on my own, for homework assignments, or during an exam, in most cases, I would literally just do some sort of pattern matching. I would look at where I currently am, go down my list of rules and definitions, and apply what fits the pattern. Of course, some patterns, rules, and definitions are harder to spot than others, but after doing this for long enough, you just start to memorize certain patterns. In general, for most ML math, this secret technique does work wonders. You need to collect your mathematical toolkit and learn to recognize when you can apply a rule. Which means practice, practice, practice.
But math is, of course, far from all there is to ML. Coding is also a very challenging skill. Learning the basics of Python and then an ML library like PyTorch is really cool and fun. You simply follow a tutorial online and make progress really quickly. You follow a recipe of steps and implement a lot of code. You really see and feel the progress you are making. But then, when you want to go further and learn to implement actual algorithms or ML pipelines on your own, you hit a wall.
All of a sudden, you sit on one annoying problem for several hours, have written perhaps five lines of code, and you think you are not making any progress. This is very, very frustrating and is the point where a lot of people decide that coding is really hard and that they will never be able to really learn to code. Writing five lines of code in three hours feels pathetic. I mean, I always thought I was so stupid for writing code that never worked until I debugged it for hours. This can get so bad that you don’t even want to start coding because you know it will fail. But that is simply the wrong way to think. Actually, writing code for one hour will very likely mean, let’s say, three hours of debugging. Once I learned that this is what it really means to be coding, I suddenly felt so relieved and not stupid anymore.
Coding ML models didn’t feel impossible or hard anymore because I was doing exactly what was normal and expected. And nowadays, there are amazing tools that I can’t live without, like GitHub Copilot, that can generate code for you and explain code for you. But there is so much that you simply learn through your own experience or the experience of others. That’s why I have a completely free weekly newsletter where I share my experience as a machine learning researcher, including actionable tips, AI news, and more. I’ll just pin a comment below with the link to sign up. But anyway, you have to realize that writing code is not actually coding; debugging is coding. This realization really helps you with implementing things on your own, step by step. But when you have to work with an existing code base where you have everything at once, you will probably still be overwhelmed. So let’s look at the next secret. There are two cases where you will need to understand complex code. The first one is when building on top of an existing repository. I remember when I started working on my first larger project, where I built on top of an existing code base. It was so much code. I literally had no idea where to start. Just like in the previous secret, I again felt like learning to read code was as impossible as writing code. All those tutorials and smaller personal projects didn’t prepare me for this much code. I tried reading each source file and started to write code as soon as possible in places I thought made sense, but that unsurprisingly led to a lot of headaches and time wasted writing code that was destined to fail. I had to learn the hard way that there is a very simple strategy for understanding large code bases. I wish someone had simply told me once how to approach a challenge like this. Most large ML repositories have a train.py and an eval.py file. Those should always be the starting points.
I find those files, set a breakpoint at the beginning, and start stepping through the code with the debugger. I cannot emphasize enough how simple yet insanely effective this technique is. It’s literally like cheating. You can step through the data preprocessing, the training loop, the actual model, the evaluation metrics, and every other detail. Depending on the codebase and your experience, this takes just a few hours, and you will have an amazing overview of the codebase and a much better feeling for where to add the new code for your own idea. That said, you might not always want to build on top of an existing highly optimized codebase but simply want to understand an algorithm better. For example, when you want to understand PPO, a famous reinforcement learning algorithm, I would not recommend looking at an optimized implementation; that’s overkill and way too complex. Luckily, for many important models, there are minimal educational implementations that just implement the main idea so that people can understand the model. And here, yet again, the best way is to set a breakpoint at the beginning of the main function and then just start debugging.
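Here is a rough sketch of what that breakpoint-first workflow can look like, using Python’s built-in `breakpoint()`. Everything in it is a toy stand-in that I made up for illustration: the tiny one-weight model, the function names `load_data`, `train`, and `evaluate`, and the `DEBUG_TRAIN` environment variable are assumptions, not part of any real repository.

```python
import os


def load_data():
    # Toy dataset following y = 2x; stands in for real data preprocessing.
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.0 * x for x in xs]
    return xs, ys


def train(xs, ys, lr=0.01, epochs=200):
    # Fit a single weight w so that w * x approximates y (gradient descent on MSE).
    w = 0.0
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w


def evaluate(w, xs, ys):
    # Mean squared error of the fitted weight on the given data.
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def main():
    # Hypothetical switch: only drop into the debugger when DEBUG_TRAIN=1 is set,
    # so the script still runs normally otherwise.
    if os.environ.get("DEBUG_TRAIN") == "1":
        breakpoint()  # execution pauses here; step through everything below
    xs, ys = load_data()
    w = train(xs, ys)
    mse = evaluate(w, xs, ys)
    return w, mse


if __name__ == "__main__":
    print(main())
```

If you saved this as, say, train.py, you could run it with `DEBUG_TRAIN=1 python train.py` and then use the standard pdb commands: `n` to go to the next line, `s` to step into a function like `train`, `p w` to print a variable, and `c` to continue. In a real repository you usually don’t need to edit the code at all; setting the breakpoint in your IDE or running `python -m pdb train.py` gets you to the same place.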
Finally, there is one fundamental secret to mastering machine learning that you need to know. This final secret ties everything we just discussed together and is the one reason that will determine your success or failure in mastering ML. 34% of organizations consider poor AI skills, expertise, or knowledge as the top reason blocking successful AI adoption, according to an IBM study from 2022. Why do you think people fail to learn machine learning and fall into the category of people with poor AI skills? Is it because it is hard? Yes, but it was also hard for every person who has now mastered it. People fail to master ML because they stop learning machine learning too early and give up. And why do people give up? They have false expectations and don’t enjoy the process of learning. They think mastering ML is too hard for them because they didn’t learn it in a few weeks. Or they think that because they didn’t understand a video explaining an ML concept the first time, they will never understand it. I took my first introductory AI college course about 3 and a half years ago. After that semester, I took my first real ML course, along with working on my first ML projects. After that semester, I continued with my first deep learning courses and continued working on projects and reading a lot of papers. I definitely didn’t understand everything the first time, but I knew that was normal. Over time, I learned the secrets mentioned before, and I learned that even somewhat mastering ML takes time. I failed my first ML interviews for internships at Amazon, Neuro, and Google DeepMind, but now I am working with an ex-Meta professor and collaborating with a Google DeepMind researcher. It takes time. Period. This is not a skill you learn over a weekend. The 10,000-hour rule applies here as well: the idea is that if you spend 10,000 hours on a specific skill, you will master it. I’m absolutely not trying to discourage you, but rather the opposite.
I don’t want to be some weird influencer who wants to sell you a dream of mastering ML in a few weeks. I want to encourage you to really learn machine learning, to really learn the theory, and to really gather practical experience. And the mastery of machine learning comes after learning the fundamentals, by really working on projects, encountering real-world problems, and reading real state-of-the-art papers or blog posts.
By having this expectation that it will take time, you relax way more, and the learning process becomes much easier, more enjoyable, and more successful in the end. All these secrets are universally true, no matter how you decide to learn machine learning. And I say that because there are mainly three ways to do so.