Machine Learning Interview Questions

Review this list of 106 Machine Learning interview questions and answers verified by hiring managers and candidates.
  • Asked at Google
    5 answers

    Louie Z. - "DNNs can learn hierarchical features, with each layer learning progressively more abstract features, and tend to generalize better. SNNs are better suited to simpler problems involving smaller datasets, or when low latency is required."

    Software Engineer
    Machine Learning
    +2 more
  • Asked at Anthropic
    Machine Learning Engineer
    Machine Learning
    +3 more
  • Asked at Meta
    1 answer
    Video answer for 'Design a fake news detection system.'

    Scott S. - "Functional Requirements
    Content Ingestion: Ingest news articles from various sources (websites, social media, etc.). Handle different types of content (text, images, videos).
    Content Analysis: Extract and preprocess text from articles. Analyze the content for potential indicators of fake news.
    Model Training and Prediction: Use machine learning models to classify content as fake or real. Continuously improve models with new data and f..."

    Technical Program Manager
    Machine Learning
    +3 more
  • Asked at Atlassian
    1 answer

    Clayton P. - "The interviewer hinted that a two-tower recommender system might be a suitable approach, using user history to embed users and pages separately and train on view or interaction data. Instead, I proposed a different approach that I felt was more aligned with how knowledge is structured in Confluence: I designed a system using a graph database to model the relationships between Confluence pages. Each page is a node, and edges represent content-based references. For example, when one article..."

    Machine Learning Engineer
    Machine Learning
    +2 more
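The graph model described in the Atlassian answer above (pages as nodes, content references as edges) could be sketched roughly as follows. The excerpt ends before it says how recommendations are ranked, so the `PageGraph` class, its method names, and the shared-neighbour ranking are all assumptions made for illustration, not the candidate's actual design.

```python
from collections import defaultdict


class PageGraph:
    """Hypothetical page graph: nodes are pages, edges are content references."""

    def __init__(self):
        self.references = defaultdict(set)  # page -> pages it links to

    def add_reference(self, source, target):
        self.references[source].add(target)

    def neighbours(self, page):
        """Pages connected to `page` by a reference in either direction."""
        outgoing = set(self.references.get(page, ()))
        incoming = {p for p, targets in self.references.items() if page in targets}
        return (outgoing | incoming) - {page}

    def recommend(self, page, limit=3):
        """Assumed ranking: neighbours that share the most neighbours with `page` come first."""
        own = self.neighbours(page)
        scored = [(len(own & self.neighbours(n)), n) for n in own]
        scored.sort(key=lambda item: (-item[0], item[1]))  # ties broken alphabetically
        return [name for _, name in scored[:limit]]


# Made-up page names and links, for illustration only.
graph = PageGraph()
for source, target in [("A", "B"), ("A", "C"), ("B", "C"), ("D", "A")]:
    graph.add_reference(source, target)
recommendations = graph.recommend("A")
```

A production system would more likely keep the edges in a graph database, as the answer suggests; the in-memory dictionary here only stands in for that store.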
  • Asked at Amazon
    4 answers

    Anonymous Aardvark - "In simple terms, linear regression predicts a continuous value, whereas logistic regression predicts a binary class. Let's talk through some examples. Linear regression model: an e-commerce pricing recommendation engine can be built on a linear regression model with variables such as competitor price, internal economics, and consumer demand; trained as a supervised learning model, it predicts prices. Logistic regression model..."

    Machine Learning Engineer
    Machine Learning
    +2 more
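The contrast drawn in the answer above can be made concrete with a minimal sketch. Both models compute the same weighted sum; linear regression returns it directly, while logistic regression passes it through a sigmoid and thresholds the resulting probability. The function names, weights, and feature values below are hypothetical and hand-picked, not fitted to any data.

```python
import math


def linear_predict(weights, bias, features):
    """Linear regression: the prediction is the raw weighted sum (any real number)."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def logistic_predict_proba(weights, bias, features):
    """Logistic regression: the same weighted sum, squashed into (0, 1) by a sigmoid."""
    score = linear_predict(weights, bias, features)
    return 1.0 / (1.0 + math.exp(-score))


def logistic_predict(weights, bias, features, threshold=0.5):
    """Turn the probability into a binary class label."""
    return int(logistic_predict_proba(weights, bias, features) >= threshold)


# Hypothetical parameters and inputs, chosen only to illustrate the two output types.
weights, bias = [0.8, -0.5], 1.0
features = [2.0, 1.0]

value = linear_predict(weights, bias, features)                # a continuous value
probability = logistic_predict_proba(weights, bias, features)  # a probability in (0, 1)
label = logistic_predict(weights, bias, features)              # a class, 0 or 1
```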
  • Asked at OpenAI
    Video answer for 'How do you select input for modeling if there are features highly correlated with each other?'
    Machine Learning Engineer
    Machine Learning
    +2 more
  • Asked at Anthropic
    Machine Learning Engineer
    Machine Learning
    +4 more
  • Asked at Amazon
    2 answers
    Video answer for 'Implement k-means clustering.'

    Taheia S. - "First, I want to know the number of clusters. If I don't know it, I will start with an arbitrary number and use a method such as the elbow method, silhouette score, gap statistic, or Davies–Bouldin index to find the best number of clusters. I will then use the scikit-learn library:
        from sklearn.cluster import KMeans
        kmeans = KMeans(n_clusters=2, random_state=0)
        kmeans.fit(X)
    where X is my data."

    Machine Learning Engineer
    Machine Learning
    +5 more
  • Machine Learning Engineer
    Machine Learning
  • Asked at Amazon
    14 answers
    Video answer for 'Implement a k-nearest neighbors algorithm.'

    Dinar M. - "An even faster, vectorized version: use np.linalg.norm to avoid the loop and np.argpartition to select the k smallest distances. We don't need to sort the whole array; we only need the first k elements to be no larger than the rest.
        import numpy as np

        def knn(X_train, y_train, X_new, k):
            # Assumes binary 0/1 labels; returns the majority label among the k nearest points.
            distances = np.linalg.norm(X_train - X_new, axis=1)
            # kth = k - 1 places the k smallest distances first and also works when k == N.
            k_indices = np.argpartition(distances, k - 1)[:k]  # O(N) selection instead of O(N log N) sort
            return int(np.sum(y_train[k_indices]) > k / 2.0)"

    Machine Learning Engineer
    Machine Learning
    +2 more
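The same "select the k smallest without a full sort" idea from the answer above also works without NumPy. The sketch below is an assumed variant, not the answer's code: it swaps `np.argpartition` for `heapq.nsmallest` (O(N log k) rather than O(N)) and uses a majority vote over arbitrary labels instead of the binary 0/1 sum. The name `knn_predict` and the sample data are hypothetical.

```python
import heapq
import math
from collections import Counter


def knn_predict(X_train, y_train, x_new, k):
    """Predict the label of x_new by majority vote among its k nearest training points."""
    if not 1 <= k <= len(X_train):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each training point's Euclidean distance to x_new with its label.
    distances = ((math.dist(x, x_new), label) for x, label in zip(X_train, y_train))
    # nsmallest keeps only k candidates at a time, so the full list is never sorted.
    nearest = heapq.nsmallest(k, distances, key=lambda pair: pair[0])
    votes = Counter(label for _, label in nearest)
    # On a tie, the label seen first among the nearest neighbours wins.
    return votes.most_common(1)[0][0]


# Made-up training data: two well-separated groups.
X_train = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y_train = ["a", "a", "a", "b", "b", "b"]
near_origin = knn_predict(X_train, y_train, (0.5, 0.5), k=3)
near_far_group = knn_predict(X_train, y_train, (5.5, 5.5), k=3)
```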
  • Asked at TikTok
    Machine Learning Engineer
    Machine Learning
    +1 more
  • Asked at Nvidia
    1 answer
    Software Engineer
    Machine Learning
    +2 more
  • Asked at Sierra AI
    Software Engineer
    Machine Learning
    +2 more
  • Asked at Microsoft
    5 answers
    Video answer for 'How do you select the value of 'k' in the k-means algorithm?'

    Michael F. - "As an interviewer, I have asked this question to candidates in the past. Here are the major topics I am looking for in an interview:
      - The candidate should understand that there are ways of measuring the loss of a particular clustering. For example, we can take the average distance of each point to its cluster center.
      - The candidate should understand that this loss will always decrease as the number of clusters increases. For that reason, we can't just pick the value of K that minimizes the l..."

    Machine Learning Engineer
    Machine Learning
    +1 more
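The point made in the answer above, that the loss keeps shrinking as K grows so minimizing it is not a usable rule, can be illustrated with a small sketch. It picks the K where the improvement drops off most sharply (an "elbow"), scored here by the second difference of the loss curve. This is one assumed heuristic among several; the function name `elbow_k` and the loss values are made up for illustration and do not come from a real clustering run.

```python
def elbow_k(losses_by_k):
    """Return the k at the sharpest bend of a decreasing loss curve.

    losses_by_k maps k -> clustering loss for evenly spaced values of k.
    The bend at k is the gain from reaching k minus the gain from going one step further.
    """
    ks = sorted(losses_by_k)
    if len(ks) < 3:
        raise ValueError("need losses for at least three values of k")
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        gain_before = losses_by_k[prev_k] - losses_by_k[k]
        gain_after = losses_by_k[k] - losses_by_k[next_k]
        bend = gain_before - gain_after
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k


# Made-up loss values: always decreasing, with little gain after k = 3.
losses = {1: 1000.0, 2: 600.0, 3: 150.0, 4: 130.0, 5: 118.0, 6: 110.0}

k_by_min_loss = min(losses, key=losses.get)  # always the largest k tried
k_by_elbow = elbow_k(losses)
```

On this curve, minimizing the loss simply returns the largest K that was tried, while the elbow heuristic stops where extra clusters no longer buy much.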
  • Asked at Amazon
    Machine Learning Engineer
    Machine Learning
    +1 more
  • Video answer for 'Design a System to Predict Netflix Watch Times'
    Machine Learning
    System Design
  • 4 answers
    Video answer for 'Implement K-Means Clustering'

    Dinesh G. - "import numpy as np

        class Centroid:
            def __init__(self, location, vectors):
                self.location = location  # (D,)
                self.vectors = vectors    # (N_i, D)

        class KMeans:
            def __init__(self, n_features, k):
                self.n_features = n_features
                self.centroids = [
                    Centroid(
                        location=np.random.randn(n_features),
                        vectors=np.empty((0, n_features)),
                    )
                    for _ in range(k)
                ]

            def distance(self, x,..."

    Machine Learning
    Coding
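Since the answer above is cut off, here is a separate, minimal sketch of the standard k-means loop (Lloyd's algorithm) for reference. It is not a completion of that class: it uses plain tuples instead of NumPy arrays, initialises centroids from randomly sampled data points rather than `np.random.randn`, and the function name and parameters (`kmeans`, `n_iters`, `seed`) are assumptions.

```python
import math
import random


def kmeans(points, k, n_iters=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update until assignments stop changing."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    assignments = None
    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments


# Made-up data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, assignments = kmeans(points, k=2)
```

Results depend on the initial centroids in general, which is why library implementations typically run several initialisations and keep the best one.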
  • Machine Learning Engineer
    Machine Learning
  • Asked at Perplexity AI
    Machine Learning Engineer
    Machine Learning
    +3 more
  • Asked at Nvidia
    4 answers

    Jyoti V. - "Overfitting occurs when a model fails to generalize to new data and has high variance, whereas an underfit model isn't able to uncover the underlying pattern in the training data and has high bias. Tree-based models like decision trees and random forests are likely to overfit, whereas linear models like linear regression and logistic regression tend to underfit. There are many reasons why a random forest can overfit easily: 1. Model has grown to its full depth a..."

    Machine Learning Engineer
    Machine Learning
    +2 more
Showing 21-40 of 106
Exponent © 2026