Interview Questions

Review this list of 4,071 interview questions and answers verified by hiring managers and candidates.

Asked at Pinterest • 2 years ago
Design spam detection on Pinterest.
Machine Learning Engineer
System Design
Asked at Pinterest • 2 years ago
Explain the differences between nonconvex and convex functions.
Machine Learning Engineer
Concept
Alan T. - "The difference between convex and nonconvex functions lies in their mathematical properties and the implications for optimization problems. Convex functions: A convex function has a shape where any line segment connecting two points on its graph lies entirely above or on the graph. This property ensures that any local minimum is also a global minimum, making optimization straightforward and reliable. Convex functions are critical in machine learning and optimization tasks because of th…"
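The chord property in the answer above can be checked numerically. The following is a minimal sketch, not part of the original answer: the helper name `chord_violations`, the sample grid, and the tolerance are all illustrative assumptions. It counts sampled points where the function rises above the line segment between two points on its graph, which a convex function should never do.

```python
import math

def chord_violations(f, xs, steps=9, tol=1e-9):
    """Count sampled (a, b, t) triples where f lies above its chord,
    i.e. f(t*a + (1-t)*b) > t*f(a) + (1-t)*f(b)."""
    violations = 0
    for a in xs:
        for b in xs:
            for i in range(1, steps + 1):
                t = i / (steps + 1)
                point = t * a + (1 - t) * b
                chord = t * f(a) + (1 - t) * f(b)
                if f(point) > chord + tol:  # tol absorbs float rounding
                    violations += 1
    return violations

# Sample grid from -4.0 to 4.0 in steps of 0.5 (an arbitrary choice).
grid = [i / 2 for i in range(-8, 9)]

print("x^2 violations:", chord_violations(lambda x: x * x, grid))
print("sin violations:", chord_violations(math.sin, grid))
```

A sampled check like this can only disprove convexity on the grid it sees; it is not a proof that a function is convex.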
Asked at Pinterest • 2 years ago
Explain the differences between ReLU and Sigmoid.
Machine Learning Engineer
Concept
William M. - "ReLU outputs 0 if x is below a threshold (0 in the standard definition), else x. Sigmoid squashes its input into the range 0 to 1, approaching both ends asymptotically."
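A minimal sketch of the two activations, assuming the standard definitions (ReLU thresholds at 0, sigmoid is the logistic function). The function names are illustrative and not taken from the answer.

```python
import math

def relu(x: float) -> float:
    # Zero for negative inputs, identity otherwise.
    return max(0.0, x)

def sigmoid(x: float) -> float:
    # Logistic function, written to avoid overflow in exp() for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"x={x:+.1f}  relu={relu(x):.3f}  sigmoid={sigmoid(x):.3f}")
```

ReLU is unbounded above and passes positive inputs through unchanged, while sigmoid flattens toward 0 and 1 at the extremes.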
Asked at Pinterest • 2 years ago
Where do vanishing gradients occur in a neural network?
Machine Learning Engineer
Concept
Asked at Pinterest • 2 years ago
Explain data drifting.
Machine Learning Engineer
Concept
Asked at Pinterest • 2 years ago
Explain overfitting.
Machine Learning Engineer
Concept
Bhavya V. - "Overfitting is the condition where your model shows unexpectedly high accuracy because it was trained on a small dataset and was never exposed to any different kind of data during testing."
Asked at Pinterest • 2 years ago
Explain regularization.
Machine Learning Engineer
Concept
Asked at Amazon, Pinterest • 16 days ago
Implement a k-nearest neighbors algorithm.
IDE
Easy
Machine Learning Engineer
Coding
Dinar M. - "An even faster, vectorized version: np.linalg.norm avoids the loop, and np.argpartition selects the k smallest distances. We don't need to sort the whole array; we only need the first k elements to be no larger than the rest."

    import numpy as np

    def knn(X_train, y_train, X_new, k):
        # Euclidean distance from X_new to every training row
        distances = np.linalg.norm(X_train - X_new, axis=1)
        # O(N) selection instead of an O(N log N) sort;
        # kth = k - 1 also works when k == len(distances)
        k_indices = np.argpartition(distances, k - 1)[:k]
        # Majority vote, assuming binary 0/1 labels
        return int(np.sum(y_train[k_indices]) > k / 2.0)
Asked at Microsoft • 2 years ago
Tell me about a machine learning research topic you find interesting.
Machine Learning Engineer
Machine Learning
Asked at Microsoft • 2 years ago
What is a perceptron?
Machine Learning Engineer
Concept
Lash - "A perceptron is the most basic building block of a neural network and represents a single-layer binary classifier."
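To make the "single-layer binary classifier" description concrete, here is a minimal sketch of a perceptron trained with the classic perceptron update rule on the logical AND function. The function names, learning rate, epoch count, and toy dataset are assumptions chosen for illustration; they do not come from the answer.

```python
def predict(weights, bias, x):
    # Weighted sum followed by a hard threshold at 0.
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0

def train_perceptron(samples, labels, epochs=20, lr=1.0):
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)  # -1, 0, or +1
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y_and = [0, 0, 0, 1]  # logical AND is linearly separable

w, b = train_perceptron(X, y_and)
print([predict(w, b, x) for x in X])  # prints [0, 0, 0, 1]
```

A single perceptron can only separate classes with a straight line (or hyperplane), so it learns AND but cannot learn XOR.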
Asked at Microsoft • 2 years ago
How do you select the value of 'k' in the k-means algorithm?
Machine Learning Engineer
Concept
Michael F. - "As an interviewer, I have asked candidates this question in the past. Here are the major topics I look for in an answer. The candidate should understand that there are ways of measuring the loss of a particular clustering. For example, we can take the average distance of each point to its cluster center. The candidate should understand that this loss will always decrease as the number of clusters increases. For that reason, we can't just pick the value of K that minimizes the l…"
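The point about loss shrinking as K grows can be seen on a toy example. This is a minimal sketch under stated assumptions: a 1-D dataset with three obvious groups, a hand-rolled Lloyd-style k-means with a deterministic initialization (evenly spaced picks from the sorted data, chosen only for reproducibility), and the average-distance loss mentioned in the answer. The helper names are hypothetical, and looking for the K where the loss stops dropping sharply (often called the elbow heuristic) is one common approach, not necessarily the one the truncated answer goes on to describe.

```python
def kmeans_1d(points, k, iters=20):
    pts = sorted(points)
    # Deterministic init: evenly spaced picks from the sorted data.
    centers = [pts[(2 * i + 1) * len(pts) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in pts:
            nearest = min(range(k), key=lambda j: abs(p - centers[j]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return centers

def avg_distance(points, centers):
    # Average distance of each point to its nearest cluster center.
    return sum(min(abs(p - c) for c in centers) for p in points) / len(points)

data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.2, 8.8]
losses = [avg_distance(data, kmeans_1d(data, k)) for k in range(1, 5)]
for k, loss in enumerate(losses, start=1):
    print(f"k={k}  loss={loss:.3f}")
```

On this data the loss keeps falling for every extra cluster, but the drop from K=2 to K=3 is far larger than the drop from K=3 to K=4, which is the kind of signal used to pick K instead of minimizing the loss outright.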
Asked at Microsoft • 2 years ago
Explain BERT.
Machine Learning Engineer
Concept
Bhavya V. - "BERT stands for Bidirectional Encoder Representations from Transformers. It takes an entire sentence as input at once and computes the relationship of each word to every other word, regardless of their positions, so that the meaning of a word is inferred from its neighboring words. BERT is a pre-trained transformer model that can be fine-tuned for our purposes. It is used for tasks such as sentiment analysis, question answ…"
Asked at Microsoft • 2 years ago
Explain the differences between L1 and L2 regression.
Machine Learning Engineer
Concept
Asked at Apple, Meta (Facebook), Oracle • 10 months ago
Implement Trie
IDE
Medium
Data Engineer
Data Structures & Algorithms
Tiago R. -

    class TrieNode {
      constructor() {
        this.children = {};
        this.isEndOfWord = false;
      }
    }

    class Trie {
      constructor() {
        this.root = new TrieNode();
      }

      insert(word) {
        let node = this.root;
        for (const char of word) {
          if (!node.children[char]) {
            node.children[char] = new TrieNode();
          }
          node = node.children[char];
        }
        node.isEndOfWord = true;
      }

      search(word) {
        l…
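The quoted JavaScript answer is cut off partway through `search`, so its remaining methods are not shown. As a separate illustration, here is a minimal Python sketch of the same data structure. The `search` and `starts_with` methods and the `_walk` helper are assumptions about a typical trie interface, not a reconstruction of the truncated answer.

```python
class TrieNode:
    def __init__(self):
        self.children = {}          # maps a character to the next TrieNode
        self.is_end_of_word = False


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_end_of_word = True

    def _walk(self, prefix):
        # Follow prefix from the root; return the final node or None.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def search(self, word):
        # True only if the exact word was inserted.
        node = self._walk(word)
        return node is not None and node.is_end_of_word

    def starts_with(self, prefix):
        # True if any inserted word begins with prefix.
        return self._walk(prefix) is not None


trie = Trie()
for w in ("pin", "pine", "board"):
    trie.insert(w)
print(trie.search("pin"), trie.search("pi"), trie.starts_with("pi"))
```

Insert and lookup both take time proportional to the length of the word, independent of how many words are stored.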
Asked at Meta (Facebook) • 2 years ago
Determine the minimum number of parentheses needed to balance a given string.
Machine Learning Engineer
Data Structures & Algorithms
Asked at Meta (Facebook) • 2 years ago
Determine if two words are sorted in lexicographic order.
Machine Learning Engineer
Data Structures & Algorithms
Asked at Meta (Facebook) • 2 years ago
Find the dot product of two sparse vectors.
Machine Learning Engineer
Data Structures & Algorithms
Asked at Meta (Facebook) • 2 years ago
Design an auto-complete feature.
Machine Learning Engineer
System Design
Sandeep Y. - "Functional requirements: partial-match search across users, products, and any keywords; additional keywords as filters; handling of blacklisted words in the search. Non-functional requirements: low latency; search across 2 billion records; recent searches should be cached. Design: the workload is read-heavy, so caching should be enabled in front of the primary DB storage; cache clusters can be added as search load increases; read ahead; check the cache first (with periodic cache refresh), using LFU or LRU eviction."
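The "check the cache first, fall back to the primary store" idea from the design above can be sketched in a few lines. Everything here is a hypothetical stand-in: the `LRUCache` and `Autocomplete` classes, the in-memory sorted list playing the role of the primary database, the `store_reads` counter, and the choice to drop blacklisted terms at load time are all assumptions made for illustration, not details from the answer.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)         # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used


class Autocomplete:
    def __init__(self, terms, blocked=(), cache_size=1000, limit=5):
        blocked = set(blocked)
        # Stand-in for the primary store: a sorted in-memory list.
        self._terms = sorted(t for t in terms if t not in blocked)
        self._cache = LRUCache(cache_size)
        self._limit = limit
        self.store_reads = 0                 # counts cache misses

    def suggest(self, prefix):
        cached = self._cache.get(prefix)
        if cached is not None:
            return cached
        self.store_reads += 1
        results = [t for t in self._terms if t.startswith(prefix)][: self._limit]
        self._cache.put(prefix, results)
        return results


ac = Autocomplete(["pin", "pinterest", "pine", "spam"], blocked={"spam"})
print(ac.suggest("pi"))
print(ac.suggest("pi"), ac.store_reads)
```

At the scale the answer mentions, the linear scan would be replaced by an index such as a trie or a search service; the sketch only shows the read path through the cache.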
Asked at Meta (Facebook) • 2 years ago
Given a Directed Acyclic Graph (DAG), write a function to return the length of the longest path.
Machine Learning Engineer
Data Structures & Algorithms
Asked at Meta (Facebook) • 2 years ago
How do you test your machine learning models for production?
Machine Learning Engineer
Concept