Do you have an upcoming machine learning interview?
This post is intended for ML engineers or research scientists who are being hired onto specific teams. Its goal is to help you understand the core machine learning concepts you should know before going into interviews and what to expect during the interview process.
Learn how to navigate the machine learning system design interview process through scenarios and role-plays with Exponent's upcoming machine learning interview course.
Who wrote this guide?
This guide was written with the help of Angelica Chen.
Angelica Chen is a Ph.D. candidate at NYU's Center for Data Science. Her research interests broadly revolve around training language models (LMs) for code generation, particularly with human feedback, and improving the evaluation of LMs. This includes long-context question answering and addressing social biases.
Previously, she worked as a researcher at Google Brain and as a student researcher at Google Research, where she contributed to training streaming models for disfluency detection. Additionally, she served as a software engineer at Google Search, where she developed neural models for semantic parsing.
Her skill set includes designing and running research experiments and implementing model training pipelines using PyTorch, TensorFlow, JAX, and Haiku.
She has interviewed for Google Brain, Meta AI Research, MosaicML, and Microsoft Research.
Top Machine Learning Interview Questions and Concepts
This is a broad list of questions you can expect in an ML interview. Each question is followed by the concepts it tests.
- Implement an attention mechanism using PyTorch. : Neural network architectures, tensor operations, PyTorch/TF knowledge
- Implement a convolutional filter using PyTorch/TensorFlow. : Neural network architectures, tensor operations, PyTorch/TF knowledge
- Explain what a tokenizer is, why they are needed, and the common types of tokenizers. : Tokenizers, pre-processing, internationalization
- What is a BERT model? : Neural network architectures, masked language modeling, encoders
- What are some differences between BERT-style and GPT-style models? : Encoder versus decoder models, causal versus bidirectional training objectives, auto-regressive decoding
- What is in-context learning? : Gradient-free learning, few-shot inference, prompting
- Explain the bias-variance tradeoff. : Bias-variance decomposition, regularization
- Explain how you would evaluate an LM before deploying it to a product. : Evaluation metrics, precision vs. recall, in-distribution (ID) vs. out-of-distribution (OOD) evaluation, dataset splits, fairness/bias
- What are the differences between stochastic gradient descent, mini-batch gradient descent, and gradient descent? If/when is each guaranteed to converge? : Gradient descent, batching, optimization
- What are some types of adaptive optimizers? : Optimization, Adam, Adagrad, Adadelta, Adabound, etc.
- Describe some metrics you would track while pre-training a large LM, and explain what each metric tells you. : Evaluation metrics, underfitting vs. overfitting
- Design a machine-learning system for classifying emails as spam or ham. : Classification models, evaluating classification models
- What distinguishes a Transformer from a recurrent neural network (RNN)? : Attention, recurrence, encoder/decoder models
- Describe how you would evaluate a trained model's out-of-distribution (OOD) generalization. : Generalization, evaluation
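To make the gradient descent question above concrete, here is a minimal, dependency-free sketch in which a single `batch_size` parameter switches between the three variants. The function name `fit_line`, the toy data, and the hyperparameters are illustrative assumptions, not a reference implementation.

```python
import random


def fit_line(xs, ys, batch_size, lr=0.05, epochs=500, seed=0):
    """Fit y = w * x + b by minimizing mean squared error.

    batch_size == len(xs) -> (full-batch) gradient descent
    batch_size == 1       -> stochastic gradient descent
    anything in between   -> mini-batch gradient descent
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # new example order every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient of the batch MSE with respect to w and b.
            grad_w = grad_b = 0.0
            for i in batch:
                error = (w * xs[i] + b) - ys[i]
                grad_w += 2 * error * xs[i] / len(batch)
                grad_b += 2 * error / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Toy, noise-free data generated from y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]

for name, size in [("gradient descent", len(xs)), ("mini-batch", 2), ("SGD", 1)]:
    w, b = fit_line(xs, ys, batch_size=size)
    print(f"{name}: w={w:.3f}, b={b:.3f}")
```

The only difference between the variants is how many examples contribute to each update. On noisy data, smaller batches give cheaper but noisier steps, which is usually the trade-off interviewers want you to talk through.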
Preparing for Natural Language Processing or Computer Vision Interviews
Break down your prep into these focus areas to prepare for an upcoming interview.
Part 1: Machine Learning Fundamentals
Before practicing for your interviews, review the fundamentals of machine learning.
- Classification, regression, generation.
- Optimization: Types of loss functions (such as MSE, ranking loss, hinge loss, and cross-entropy), regularization, gradient and subgradient descent, and back-propagation.
- Linear Models: Linear regression, generalized additive models (GAMs), and support vector machines (SVMs) are essential topics in machine learning. They help build an intuition for other models.
- Tree methods: Bagging, boosting, forward stagewise additive modeling, Adaboost, decision trees, and random forests.
- Unsupervised Learning: Clustering techniques such as k-means, hierarchical clustering, and mixture models, as well as expectation-maximization (EM) algorithms, and dimensionality reduction methods such as principal component analysis, non-negative matrix factorization, and singular value decomposition (SVD).
- Recommendation Systems: Content-based filtering and collaborative filtering are two popular recommendation techniques. Matrix factorization is a commonly used method for collaborative filtering. Another approach to recommendation is retrieval, followed by ranking.
- Deep learning: Basic neural network architectures such as MLPs, RNNs, and convolutional nets, as well as best practices for training. This includes using a training/validation/testing split, tracking multiple metrics on the validation dataset, using dropout, warming up the learning rate, applying weight decay, and monitoring for convergence, overfitting, or double descent.
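As a quick refresher on the loss functions named in the optimization bullet, the sketch below writes three of them out in plain Python, plus an L2 penalty as one example of regularization. The function names and label conventions (for example, -1/+1 labels for hinge loss) are assumptions chosen for illustration.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error for regression targets."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def hinge(y_true, scores):
    """Mean hinge loss; labels are -1 or +1, scores are raw margins."""
    return sum(max(0.0, 1.0 - t * s) for t, s in zip(y_true, scores)) / len(y_true)


def cross_entropy(y_true, probs, eps=1e-12):
    """Mean binary cross-entropy; labels are 0 or 1, probs are P(y = 1)."""
    total = 0.0
    for t, p in zip(y_true, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)


def l2_penalty(weights, strength):
    """L2 regularization term added to a loss to discourage large weights."""
    return strength * sum(w * w for w in weights)
```

Being able to write these from memory, and to say which task each one suits, covers a good share of the fundamentals questions.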
Part 2: Machine Learning for NLP
Next, spend time with natural language processing to understand how to build and fine-tune models.
- Fundamentals of NLP: Distributional hypothesis, word vector representations (such as GloVe and word2vec), basic NLP models (such as n-gram models and bag-of-words), and types of classic NLP tasks (such as semantic parsing, dependency parsing, translation, understanding, and natural language inference).
- Basics of Sequence Modeling: Mapping language to continuous representations/embeddings, recursive and recurrent neural networks (such as LSTMs, GRUs, and RNNs), and evaluation metrics.
- Attention and Transformers: The basics of the self-attention mechanisms and the Transformer architecture, encoder versus decoder transformers, and auto-regressive decoding strategies (such as greedy decoding, beam search, temperature sampling, and nucleus sampling).
- Pre-Training and Fine-Tuning: Training objectives (such as MLM, CLM, denoising, NSP, and contrastive) and best practices for each stage.
- Prompting/In-Context Learning: Few-shot learning, prompting (including standard techniques such as chain-of-thought, self-consistency, and least-to-most prompting), prompt tuning, scaling laws, and common LMs that are used with prompting (such as GPT-3/4, Claude, GPT-J, and OPT).
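The decoding strategies listed under attention and Transformers are easier to discuss with code in hand. The following is a minimal sketch of greedy decoding, temperature sampling, and nucleus (top-p) sampling over a single vector of next-token logits. The function names are hypothetical, and a real decoder would apply this step repeatedly to a model's outputs.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy(logits):
    """Greedy decoding: always pick the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


def nucleus_filter(probs, top_p):
    """Keep the smallest set of most-likely tokens whose mass reaches top_p, renormalized."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return {i: probs[i] / mass for i in kept}


def sample_next_token(logits, temperature=1.0, top_p=1.0, rng=random):
    """Temperature sampling, optionally restricted to the nucleus."""
    candidates = nucleus_filter(softmax(logits, temperature), top_p)
    tokens = list(candidates)
    weights = [candidates[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]
```

Beam search is omitted here because it tracks several partial sequences at once rather than choosing a single next token.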
Part 3: Machine Learning for Computer Vision
- Introduction to CV Tasks: Image classification and image/video understanding.
- Perception: Convolutional nets (pooling, convolutions, image features, down-/up-sampling), object detection, and image/semantic segmentation.
- Multimodal understanding and generation: GANs, VAEs, and diffusion models (such as Stable Diffusion).
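For the perception topics, it helps to have written a convolution and a pooling step by hand at least once. Below is a small sketch using nested Python lists, assuming no padding and a stride of 1 for the convolution and non-overlapping windows for pooling. The image and kernel are made up for illustration, and in an interview you would normally express the same loops as tensor operations in PyTorch or TensorFlow.

```python
def conv2d(image, kernel):
    """Valid (no padding), stride-1 2D cross-correlation, as used in conv layers."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0.0
            for i in range(k_h):
                for j in range(k_w):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# A made-up 4x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Horizontal-difference kernel: responds where intensity increases left to right.
edge_kernel = [[-1, 1]]

edges = conv2d(image, edge_kernel)
pooled = max_pool2d(edges, size=2)
```

Note that deep learning frameworks compute cross-correlation (no kernel flip) under the name "convolution", which is a common follow-up question.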
Part 4: Machine Learning System Design
Some sample questions that you might be asked during an ML system design interview include:
- Design a recommendation system for music streaming software.
- Design a system for filtering out offensive content from online comments.
- Design a system for predicting changes in engagement resulting from a feature change.
- Design a system for responding to customer support messages.
- Design a system for automatically tagging faces in a photo.
Structure of Machine Learning Interviews
Different steps are involved in the ML research scientist/engineer interview loop.
- Recruiter screen (30 minutes): In this initial step, the recruiter briefly discusses the job expectations and assesses the candidate's potential fit for the role.
- ML coding interview (45 minutes): In this step, the candidate is usually asked 1-2 questions that assess their knowledge of an ML framework (e.g., TensorFlow, PyTorch) and of a core ML concept in the team's particular sub-field (e.g., Transformers, convolutional nets, data processing, ML evaluation systems, distributed ML). The candidate is expected to implement the solution correctly and explain how that component works and how it interacts with the larger system. If the candidate finishes early, there might be a follow-up question about extending the solution to a more complex or more general scenario.
- ML concepts interview (45 minutes): In this step, an ML engineer or research scientist will ask the candidate questions about fundamental ML concepts such as bias-variance tradeoff, underfitting vs. overfitting, differences between different ML algorithms, and how a specific ML model works. They may also ask about the candidate's current research interests. For the latter, they're looking more for the candidate's ability to rigorously design an experimental set-up, execute a research plan, and interpret results. Some companies might also ask about more niche topics explicitly related to their work area (e.g., information retrieval, recommendation systems, diffusion models).
- ML system design (45 minutes): In this step, the candidate is asked to design an ML system from end-to-end, including pre-processing the data, training + evaluating the model, and deploying the model. They will be expected to know some of the more practical real-world aspects of productionizing an ML model, particularly concerning efficiency, monitoring, preventing harmful model outputs, and building inference infrastructure.
- Research job talk (60 minutes): This interview is rarer and is usually only conducted for the Research Scientist role, not for ML engineering roles. In this step, the candidate gives a one-hour presentation of their past research. They'll need to tell a cohesive story of the motivations behind and impact of their past research and explain complex technical concepts simply and intuitively. They'll also be expected to answer questions about their research, defend it against other approaches, and relate it to relevant work in the field. In some sense, this interview is similar to a less-intense dissertation defense.
- Hiring manager (30 minutes): In this final step, the hiring manager usually has a short discussion with the candidate to assess whether their skills and working style are a good fit for the team. ML research teams tend to hire candidates who closely match the team's needs and have specialized expertise in its particular niche.
It's important to note that the interview loop might vary depending on the company, and not all companies might include all the above-mentioned steps. However, this list provides a general idea of what to expect during an ML research scientist/engineer interview loop.
Preparing for Machine Learning Interviews
Machine learning interviews tend to be area-specific, meaning that ML engineers/research scientists are often hired for specific teams rather than as generalists.
To prepare for such interviews, research the team you are interviewing for. Look at their recent products/features and published research papers/blog posts. Skim through a few top papers in the current ML literature in their sub-field to have a good idea of the current state-of-the-art.
To prepare for coding interviews, brush up on fundamentals in your ML framework of choice. Most ML start-ups and large companies outside of Google use Python and PyTorch nowadays, so you should aim to be proficient in that.
PyTorch has excellent tutorials that cover data loading, training loops, neural network architecture implementations, and reinforcement learning. Many companies also use a library built on top of PyTorch, such as Hugging Face Transformers. Hugging Face offers useful online courses covering the essentials of its transformers, datasets, and metrics libraries.
Looking at their examples can be helpful, too, so you know how to implement real-life ML applications with their frameworks.
For ML systems design interviews, look at multiple examples of different ML problems. Online courses such as Stanford's CS 329S and Chip Huyen's Machine Learning Systems Design cover essential topics for ML system design, including data collection/pre-processing, training/inference infrastructure, monitoring, and evaluation.
Once you have covered the fundamentals, read the recent papers that top industry labs have released for their applied ML systems, such as the YouTube ranking system or the TikTok recommendation algorithm.
It is a lot of information to cover, so focus on getting a high-level intuitive understanding before focusing on implementation details. Understand why certain design decisions were made and how to generalize the same problem-solving techniques to other problems.
For example, how would you generalize a 2-class SVM to a multi-class SVM? Or, how might you apply a collaborative filtering recommendation system, classically used for music on a service like Spotify, to a book recommendation website instead?
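One common answer to the multi-class SVM question is the one-vs-rest scheme: train one binary SVM per class and predict the class with the highest score. The sketch below illustrates that idea with a simple linear SVM trained by subgradient descent. The data, hyperparameters, and function names are assumptions for illustration, and other schemes (such as one-vs-one) are equally valid answers.

```python
def train_binary_svm(points, labels, lr=0.1, reg=0.01, epochs=500):
    """Linear SVM via subgradient descent on L2-regularized hinge loss.

    labels must be -1 or +1. Returns (weights, bias).
    """
    dim = len(points[0])
    w, b = [0.0] * dim, 0.0
    n = len(points)
    for _ in range(epochs):
        grad_w = [reg * wi for wi in w]  # gradient of (reg / 2) * ||w||^2
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # only margin violators contribute a subgradient
                for d in range(dim):
                    grad_w[d] -= y * x[d] / n
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def train_one_vs_rest(points, labels):
    """Train one binary SVM per class: that class (+1) against all others (-1)."""
    models = {}
    for cls in sorted(set(labels)):
        binary = [1 if y == cls else -1 for y in labels]
        models[cls] = train_binary_svm(points, binary)
    return models


def predict(models, x):
    """Pick the class whose SVM gives the largest raw score."""
    def score(model):
        w, b = model
        return sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(models, key=lambda cls: score(models[cls]))


# Made-up 2D points in three well-separated clusters.
points = [(0.0, 0.0), (0.5, 0.2), (4.0, 0.0), (4.2, 0.5), (0.0, 4.0), (0.3, 4.4)]
labels = ["a", "a", "b", "b", "c", "c"]
models = train_one_vs_rest(points, labels)
print([predict(models, p) for p in points])
```

The binary learner is untouched; only the wrapper changes. That separation is the kind of generalization this exercise is meant to practice.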