Recently, I had the fortune to host a world-class Machine Learning practitioner, who has not only built a wide range of Machine Learning systems but also helped many people to make a career in Machine Learning.
He is Santiago Valdarrama, someone you might know if you are in the Twitter ML community. As Director of Computer Vision at Levatas, he leads a team of software developers and machine learning engineers in the development of Levatas’ flagship product. On his mission to help people, he regularly creates learning content such as courses, articles, and newsletters, not to mention his tweets that distill complex ML concepts.
We talked about a range of topics, covering everything that anyone would want to know in the early days of an ML career. While this article is a summary of the talk, here is a recording of the whole discussion.
This summary is arranged as bullet points. Nearly 100% of these bullet points come from Santiago’s experience in doing and learning Machine Learning.
Getting started: Prerequisites, resources, tools, good and bad practices
- The single most important prerequisite to doing Machine Learning is learning how to talk to computers, in other words, using a programming language.
- Having software development experience is very helpful. Programming is a fundamental skill to succeed as an ML Engineer.
- Statistics, data analysis, and probability are different skills. Programming is a different skill too.
- A lot of people recommend learning maths first. ‘I like to start by finding problems that I can tackle by acquiring new knowledge.’
- Learning by solving problems doesn’t mean skipping fundamentals or ignoring theory. It means that a problem-first mentality can be very beneficial: start with a problem, do what it takes to solve it, and in that process you’re going to start acquiring knowledge… If solving the problem means using maths or probability, you can go deep and learn these.
- I call the above approach problem-oriented learning.
- The top 3 learning resources that leverage the above approach are the fast.ai Deep Learning course, the Google Machine Learning course, and Andrew Ng’s Machine Learning course.
- General programming setup for Machine Learning: Visual Studio Code, Jupyter Notebooks, and Colab notebooks for leveraging free GPU. VS Code supports notebooks too.
- Machine Learning specific tools: TensorFlow (or PyTorch) for deep learning, Scikit-Learn for classical ML, Pandas for data analysis and manipulation, NumPy for numerical computation, Matplotlib/Seaborn for data visualization.
- ML is a long journey. Stay motivated and learn consistently.
- Skipping course assignments or book exercises is like leaving great food on the table.
- If I had to start learning all over again, I would learn things that matter, things that are useful.
Maths and ML: The right balance
- Math is a fundamental part of Machine Learning. Nobody can deny that.
- The right amount of maths you need depends on what you want to do with your life. If it’s research or creating state-of-the-art algorithms, you’re definitely going to have to spend way more time understanding the underpinnings of the existing machine learning field than somebody that’s trying to use what others create. Same thing if you want to build a framework.
- You can use a library without necessarily understanding how it works behind the scenes. You need general knowledge, but you don’t have to get down to the nitty-gritty details.
- Math is a fundamental part of machine learning, nobody can deny that. But telling people that they cannot use a technique or an algorithm unless they understand how it works is, I think, disingenuous, and it puts too much emphasis on something that might not be needed.
- You can choose your own path and learn math whenever you think it’s necessary.
- We’ve all been searching lists and using search functionality without necessarily understanding how search works. It’s the same thing with any algorithm or any framework. **I’m not saying that math is not important, I’m saying that the order that you follow has to be productive and to produce value.**
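The point about using search without knowing how it works can be illustrated with a small Python sketch. It relies on the standard library's `bisect.bisect_left` to look up a value in a sorted list, without implementing binary search by hand. The `user_ids` data and the `contains` helper are hypothetical names made up for this illustration, not something from the talk.

```python
import bisect

# Hypothetical data for illustration: a list of IDs sorted in ascending order.
user_ids = [3, 8, 15, 23, 42, 57, 91]


def contains(sorted_items, target):
    """Return True if target is in sorted_items (must be sorted ascending)."""
    # bisect_left returns the position where target would be inserted.
    # The library does the searching; we only interpret the result.
    index = bisect.bisect_left(sorted_items, target)
    return index < len(sorted_items) and sorted_items[index] == target


print(contains(user_ids, 42))
print(contains(user_ids, 40))
```

The helper is useful right away, and the details of how `bisect` narrows down the position can be learned later, when a problem actually calls for it.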
Software Developers to Machine Learning
- Many developers are afraid of ML. The reason is that we are usually afraid of what we don’t know. Change is hard.
- Machine learning is pretty new when you compare it to general software development.
- Most books, courses, and videos have been created by researchers and academics, so they put a lot of emphasis on math and theory. There is not a lot of content created by practitioners who look at machine learning from a more pragmatic point of view.
- The reality is very different: you can have a huge impact by using machine learning techniques without needing a Ph.D. or a master’s degree.
- Instead of switching from software development, incorporate ML into your development career.
Ideas on Building Projects for a Portfolio and Creating Content
- Creating great content or building a compelling portfolio doesn’t happen overnight. It is about practice and repetition.
- In the early days of creating content, it can feel that nobody cares, and hence it’s easy to give up.
- If you are going to create content, believe in yourself, understand the path, and understand that in the first days, it will seem that nobody cares about your content. Keep creating and sharing, and your content will become visible and more people will start to engage with it.
- It is practice that makes the content good.
- Without a reason why you are creating content, it is easy to give up. Santiago creates content as a way to learn new stuff actively. He calls it a win-win situation: he is both learning and teaching others.
- Networking is important. The people you meet on social media can recommend job opportunities in the future.
- To Santiago, his Twitter account is his portfolio for any potential employer in the future.
- A resume is not a place for an entire life story, GPA, and college coursework. It is a place to show how you are a good candidate for the job.
- The single most important purpose of a resume is to get you a phone call. A resume is not going to get you a job, nor make your compensation better. It is only going to get your foot in the door of opportunities. Optimize the resume for the job position.
- You can use the same resume for 500 jobs, but if you want to maximize the probability of getting a phone call, make sure your resume speaks to the specific job position.
- Cut down the resume to one page. Get rid of all distracting information and remove all noise. A typical hiring manager spends an average of 6 seconds on each resume. Get rid of any information that does not help you get a phone call.
- Creativity plays a role in differentiating yourself from others. It can be a cool analysis or an interesting project that is far different from other boring projects that most developers do.
Other unlabelled points:
- Machine Learning is hard, but it can be extremely rewarding if the work you do provides value to others or helps them learn something new.
- Santiago’s favorite books: Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron, Deep Learning with Python by François Chollet (the creator of Keras), and James Clear’s Atomic Habits.
- Favorite Python books: Python Crash Course, Automate the Boring Stuff with Python.
- Santiago’s last 3 pieces of advice: Stay consistent, learn to learn effectively, and build things.
Toward the end of the event, in the answers to audience questions, you will find other useful ideas about introducing machine learning in existing industries, machine learning applications in medicine, and building highly engaging tech communities.
The conversation with Santiago was definitely a whole learning experience for me, and I hope that if you watch it, you will find it useful, especially if you are beginning your machine learning career.
Thanks for reading!
Each week, with a probability of 80%, I write one article about machine learning techniques, ideas, or best practices.