Introduction to ML Coding Interviews
The ML coding interview assesses your technical problem-solving skills, knowledge of ML frameworks, and experience with the team’s sub-field.
This course includes an interview framework, a rubric that explains how you’re graded, mock interviews, and practice questions. In this lesson, we give an overview of the interview round, what to expect, and how to prepare.
ML coding vs. traditional coding interviews
In a software engineering interview, the questions will most likely focus on data structures and algorithms in a LeetCode-style format. In an ML coding interview, the coding prompt generally appears in one of three ways:
- Write a common algorithm from scratch (e.g. k-means, k-NN).
You’re expected to:
- Build the algorithm using NumPy
- Remember basic algorithms
- Implement them from scratch, typically using dummy data.
These questions are similar to what HackerRank sends for data science positions, where you code linear or logistic regression from scratch using NumPy and linear algebra, and sometimes also fit the same models using scikit-learn.
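As an illustration of this first prompt type, here is a minimal sketch of logistic regression trained with batch gradient descent in NumPy. The function names, learning rate, iteration count, and the two-blob dummy data are assumptions chosen for this example, not a reference solution.

```python
import numpy as np


def sigmoid(z):
    """Map real-valued scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic_regression(X, y, lr=0.1, n_iters=1000):
    """Fit weights and bias by batch gradient descent on mean binary cross-entropy."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iters):
        p = sigmoid(X @ w + b)            # predicted probabilities, shape (n_samples,)
        error = p - y                     # gradient of the loss w.r.t. the logits
        w -= lr * (X.T @ error) / n_samples
        b -= lr * error.mean()
    return w, b


def predict(X, w, b, threshold=0.5):
    """Return hard 0/1 labels by thresholding the predicted probability."""
    return (sigmoid(X @ w + b) >= threshold).astype(int)


# Dummy data (an assumption for this sketch): two Gaussian blobs, one per class.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(-2.0, 1.0, size=(50, 2)),
    rng.normal(2.0, 1.0, size=(50, 2)),
])
y = np.concatenate([np.zeros(50), np.ones(50)])

w, b = fit_logistic_regression(X, y)
accuracy = (predict(X, w, b) == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

In an interview, it helps to state the loss and its gradient out loud before coding the update step, since the loop itself is only a few lines.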
- Given some data, provide an end-to-end (e2e) solution and present metrics and reasoning.
You're expected to:
- Transform data
- Choose one or more models and metrics
- Show some hyperparameter tuning
- Explain how to search the hyperparameter space (e.g. random search vs. grid search)
You’ll also typically visualize the data. For example, in a classification problem, you might see imbalanced labels. You’d discuss this observation and explain how it affects your choice of metrics, sampling strategy, loss function, and so on. You may also be expected to perform exploratory data analysis (EDA), which typically relies on Matplotlib and Jupyter notebooks.
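To make the imbalanced-label point concrete, the sketch below computes accuracy, precision, recall, and F1 by hand in NumPy. The 95/5 class split, the helper name `binary_metrics`, and the two hypothetical sets of predictions are assumptions made up for illustration.

```python
import numpy as np


def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 from 0/1 labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical imbalanced labels: 95 negatives followed by 5 positives.
y_true = np.array([0] * 95 + [1] * 5)

# A trivial baseline that always predicts the majority class.
baseline = binary_metrics(y_true, np.zeros_like(y_true))

# A hypothetical model: 10 false positives, 1 missed positive.
y_pred = y_true.copy()
y_pred[:10] = 1    # flag 10 negatives by mistake
y_pred[-1] = 0     # miss one true positive
model = binary_metrics(y_true, y_pred)

print("baseline:", baseline)  # accuracy 0.95, recall 0.0
print("model:   ", model)     # accuracy 0.89, recall 0.8
```

Here the baseline wins on accuracy while catching no positives at all, which is the kind of observation that would motivate choosing recall, precision, or F1 as the headline metric instead.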
- Perform a common ML operation (e.g. 2D convolution, self-attention, batch norm).
These questions test your knowledge of these operations and your ability to implement them clearly in NumPy. They can also be posed as LeetCode-style questions on 2D matrix manipulations.
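For this third prompt type, a single-channel 2D convolution might be sketched as follows. The function name `conv2d` and its `stride` and `padding` parameters are assumptions for this example. It computes cross-correlation (no kernel flip), which is what deep-learning frameworks usually mean by "convolution".

```python
import numpy as np


def conv2d(image, kernel, stride=1, padding=0):
    """Slide `kernel` over a 2D `image` and sum the elementwise products.

    This is cross-correlation: the kernel is not flipped.
    """
    if padding > 0:
        # Zero-pad every side of the image by `padding` pixels.
        image = np.pad(image, padding, mode="constant")
    height, width = image.shape
    k_h, k_w = kernel.shape
    out_h = (height - k_h) // stride + 1
    out_w = (width - k_w) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            patch = image[top:top + k_h, left:left + k_w]
            out[i, j] = np.sum(patch * kernel)
    return out


image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((2, 2))
result = conv2d(image, kernel)
print(result.shape)   # (3, 3)
print(result[0, 0])   # 0 + 1 + 4 + 5 = 10.0
```

If the interviewer asks for a true mathematical convolution, passing `np.flip(kernel)` to the same function is one way to get it. Stating the output-size formula before writing the loops also makes off-by-one errors easier to catch.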
Unlike other coding interviews, almost all current ML coding interviews are conducted in Python, so you will likely be at a disadvantage if you are not well-versed in it.
What to expect
Example questions include:
- Given a table of data with features (e.g. user time on app, number of interactions, and target of whether or not the user deletes the app), create an ML solution to predict the likelihood that users will delete the app.
- Implement the k-nearest neighbors (k-NN) algorithm.
- Implement a 2D convolutional filter.
- Implement the K-means algorithm.
- Given some text data and labels on whether it is harmful, create an ML solution that predicts harmful text.
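As one example of how a "from scratch" question from this list might be answered, here is a minimal k-nearest neighbors classifier in NumPy. The function name `knn_predict`, the use of squared Euclidean distance, majority voting, and the toy data are assumptions for this sketch. It also assumes labels are non-negative integers.

```python
import numpy as np


def knn_predict(X_train, y_train, X_query, k=3):
    """Classify each query point by majority vote among its k nearest training points."""
    # Pairwise squared Euclidean distances via broadcasting: (n_query, n_train).
    diffs = X_query[:, None, :] - X_train[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    # Indices of the k closest training points for each query row.
    nearest = np.argsort(dists, axis=1)[:, :k]
    neighbor_labels = y_train[nearest]            # shape (n_query, k)
    # Majority vote per row; np.bincount requires non-negative integer labels.
    return np.array([np.bincount(row).argmax() for row in neighbor_labels])


# Toy data: two small, well-separated groups.
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
y_train = np.array([0, 0, 0, 1, 1, 1])
X_query = np.array([[0.2, 0.1], [5.5, 5.5]])

preds = knn_predict(X_train, y_train, X_query, k=3)
print(preds)  # [0 1]
```

Ties in the vote resolve to the smallest label here because `argmax` returns the first maximum. That is worth mentioning to the interviewer, along with the memory cost of materializing the full distance matrix.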
What interviewers look for
In the ML coding interview, you’re assessed on how well you:
- Understand and solve the given problem
- Understand the chosen ML framework and the team’s particular sub-field
- Implement organized and accurate code
- Communicate your logic
- Display comfort and skill with ML algorithms
How to prepare
Brush up on the fundamentals of your ML framework of choice. Most ML start-ups and large companies use Python and PyTorch. Some helpful resources include:
- PyTorch tutorials: cover the fundamentals of data loading, training loops, neural network architecture implementations, and even reinforcement learning. In practice, most companies also use a wrapper on top of PyTorch, such as HuggingFace transformers.
- HuggingFace courses: cover the essentials of how to use their transformers, datasets, and metrics libraries. Looking at some of their examples can be helpful too, so that you know how to implement real-world ML applications with their frameworks.
After you’ve reviewed the fundamentals, practice implementing common algorithms (e.g. logistic regression, K-means) under a time limit, and practice working with NumPy arrays.
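For timed practice, K-means is a good candidate because it exercises NumPy broadcasting and boolean indexing in a few lines. The sketch below uses Lloyd's algorithm with random initialization from the data points. The function name `kmeans`, its parameters, and the toy data are assumptions for illustration.

```python
import numpy as np


def kmeans(X, k, n_iters=100, seed=0):
    """Cluster rows of X into k groups; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids as k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: distance from every point to every centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points,
        # keeping the old centroid if a cluster ends up empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, labels


# Toy data: two tight groups far apart.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
centroids, labels = kmeans(X, k=2)
print(labels)
print(centroids)
```

Cluster indices are arbitrary, so compare groupings rather than raw label values when checking results. Timing yourself on writing this from memory, then explaining the empty-cluster handling, is a reasonable practice drill.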
Lastly, check out the following resources to build both high-level and implementation knowledge:
- Rubric signals to identify opportunities for improvement.
- Mock interviews on real-world ML coding interview questions.
- 150+ practice questions with feedback from other users.