How to Answer ML Model Questions

Overview

Model selection and optimization questions assess your knowledge of the modeling techniques used in production machine learning systems. The interviewer may also ask about algorithms you’ve used in past projects. Some sample questions include:

  1. Describe your favorite machine learning model to a non-technical stakeholder.
  2. Describe how splits in a decision tree occur.
  3. Why did you choose a deep learning model rather than a different model in project X?
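For question 2, it can help to have a concrete split criterion in mind. The sketch below assumes a CART-style tree that scores candidate thresholds on a single numeric feature by weighted Gini impurity. Other criteria (such as entropy) exist, and the function names and toy data are hypothetical, chosen only for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_impurity) for the best split on one feature."""
    best = (None, gini(labels))  # baseline: impurity with no split at all
    candidates = sorted(set(values))
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2  # midpoint between adjacent distinct values
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


# Hypothetical toy data: minutes on site and whether the user converted.
minutes = [1, 2, 3, 10, 12, 15]
converted = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(minutes, converted)
print(threshold, impurity)
```

In an interview, the point to convey is that the tree tries each candidate threshold and keeps the one that leaves the two child groups as pure as possible.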

To prepare for these types of questions, review the terms under the “Model selection and optimization” section of our ML Interview Glossary.

How to answer

You should know how to apply modeling techniques to real-world problems. For example, say you’re given customer data for an app and asked how you’d generate new product ideas based on customer similarities. To answer effectively, first explain why this can’t be framed as supervised learning: there is no labeled outcome to predict. Then, discuss the unsupervised algorithms you’d use, such as clustering. Finally, explain how you would use the results of your algorithm to make product decisions.
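For the customer-similarity example, one unsupervised option is k-means clustering. The sketch below is a minimal pure-Python illustration rather than a production approach: the two features, the toy data, and the naive "first k points" initialization are all assumptions made for brevity.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    centroids = [points[i] for i in range(k)]  # naive init (assumption, for brevity)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids


# Hypothetical customers: (sessions per week, average minutes per session).
customers = [(1, 5), (2, 6), (1, 4), (10, 30), (11, 28), (9, 32)]
assignments, centroids = kmeans(customers, k=2)
print(assignments)
print(centroids)
```

Each resulting cluster is a group of similar customers. The product discussion then starts from what distinguishes one group from another, for example light users versus heavy users.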

Additionally, know your audience. Is your interviewer a technical or non-technical stakeholder? Practice explaining these topics in both technical and non-technical ways. Non-technical explanations can have a higher-level view, fewer details, and a conceptual explanation of how a process works. Technical explanations include more depth, mathematical details, and a nuanced understanding of the chosen algorithm. The example below breaks down a technical vs. non-technical answer for the given prompt.

If you have the relevant background and an opportunity arises, try to incorporate an unsupervised method where it genuinely helps solve your problem. It will likely stand out to your interviewer as an unexpected, creative solution.

Let’s say your interviewer gives you the following prompt:

Imagine your company sends an email to convert customers from free to paying. You build a logistic regression model with the following output:

[Figure: ML Model Graphic]

Explain the results to a non-technical and technical stakeholder.

You could say,

“For the non-technical stakeholder, I would say,

The email works. We have evidence that both receiving the email and spending time on the website positively impact whether the customer converts from free to paying. Customers who receive the email have more than four times the odds of converting, and each additional minute spent on the website increases the odds of conversion by about 11%.

For the technical stakeholder, I would say,

exp(1.48) ≈ 4.39

exp(0.103) ≈ 1.11

Receiving the email vs. not receiving the email multiplies the odds of converting to paying by about 4.39, an increase of roughly 339%, holding time on the website constant.

For each additional minute spent on the website, we see an increase in the odds of converting of about 11%, holding email receipt constant.

I would also collect metrics about the model fit, clarify how the data was collected, and assess whether including an interaction in the model would be useful.”
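The arithmetic behind the technical answer can be checked in a few lines. This sketch uses only the two coefficients quoted in the answer (1.48 and 0.103). The variable names and the 10% baseline conversion rate are hypothetical, added purely for illustration.

```python
import math

# Coefficients on the log-odds scale, as quoted in the answer; names are hypothetical.
coefficients = {"received_email": 1.48, "minutes_on_site": 0.103}


def odds_ratio(beta):
    """Exponentiating a logistic regression coefficient gives an odds ratio."""
    return math.exp(beta)


def percent_change_in_odds(beta):
    """An odds ratio of r corresponds to a (r - 1) * 100 percent change in odds."""
    return (math.exp(beta) - 1.0) * 100.0


def shift_probability(baseline_prob, beta):
    """Apply the odds ratio to a baseline probability and convert back."""
    odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = odds * math.exp(beta)
    return new_odds / (1.0 + new_odds)


for name, beta in coefficients.items():
    print(f"{name}: odds ratio {odds_ratio(beta):.2f}, "
          f"{percent_change_in_odds(beta):+.0f}% change in odds")
# received_email: odds ratio 4.39, +339% change in odds
# minutes_on_site: odds ratio 1.11, +11% change in odds

# Assumed 10% baseline conversion rate, for illustration only.
new_prob = shift_probability(0.10, coefficients["received_email"])
print(f"conversion probability with email: {new_prob:.2f}")
```

Two distinctions are worth keeping straight here. An odds ratio of 4.39 means 4.39 times the odds, which is a 339% increase rather than a 439% one. A change in odds is also not the same as a change in probability: under the assumed baseline, the probability of converting rises far less than fourfold.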

Common pitfalls

  • Choosing a complex algorithm and being unable to explain it fully.
  • Being unable to tailor your communication appropriately to technical vs. non-technical audiences.
  • Giving an incorrect interpretation of coefficients or p-values. We recommend memorizing these definitions verbatim rather than improvising your own.
  • Neglecting data quality, which matters more than any choice of algorithm.
  • Overlooking data leakage between the training and test sets (especially around data transformations).
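The last pitfall is easiest to see with a transformation such as standardization. In the minimal sketch below, the scaling statistics are computed from the training split only and then reused on the test split. The data values and function names are hypothetical.

```python
import statistics


def fit_standardizer(train_values):
    """Learn the mean and standard deviation from training data only."""
    mean = statistics.fmean(train_values)
    std = statistics.pstdev(train_values)
    return mean, std


def standardize(values, mean, std):
    """Apply previously learned statistics without recomputing them."""
    return [(v - mean) / std for v in values]


# Hypothetical feature values, already split into train and test.
train = [2.0, 4.0, 6.0, 8.0]
test = [10.0, 12.0]

# Correct: statistics come from the training split and are reused on test.
mean, std = fit_standardizer(train)
train_scaled = standardize(train, mean, std)
test_scaled = standardize(test, mean, std)

# Leaky, shown for contrast: the test rows influence the learned statistics.
leaky_mean, leaky_std = fit_standardizer(train + test)

print(mean, leaky_mean)
```

The same rule applies to any fitted preprocessing step, such as imputation, encoding, or feature selection: fit on the training data, then apply to the test data.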

Senior candidates

Expectations are somewhat higher for senior candidates. The more senior the role, the more you’re expected to demonstrate your ability to:

  1. Build the model infrastructure from end to end
  2. Gauge the pros and cons of using a particular algorithm
  3. Integrate your domain knowledge from previous roles
  4. Describe your experience productionizing ML models in previous roles
  5. Work cross-functionally with both technical and non-technical stakeholders