Data Engineer Interview Questions

Review this list of 105 data engineer interview questions and answers verified by hiring managers and candidates.


Asked at Amazon • 3 months ago
Create geographic and demographic dashboards for weekly, monthly, and yearly analytics using order data (100M daily records for 5 years) and customer data (1B customers).
Data Engineer
Data Modeling
1 answer
Hayatu H. - "What do all data scientists need to know about how to work with very large datasets?"
Asked at Adobe, Apple, BlackRock + 10 more • 7 months ago
Group anagrams
Data Engineer
Data Structures & Algorithms
+4 more
1 answer
Gaston B. - "Use a representative key for each group, e.g. sort the letters of each word and use the sorted string as a hashmap key. The hashmap value is the list of words that share that key, so all words that are anagrams of each other end up together."
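The sorted-key idea in the answer above can be sketched in a few lines of Python. This is a minimal illustration, not the answerer's own code; the function name `group_anagrams` is an assumption made for the example.

```python
from collections import defaultdict
from typing import Dict, List


def group_anagrams(words: List[str]) -> List[List[str]]:
    """Group words that are anagrams of each other (hypothetical helper)."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for word in words:
        # Anagrams contain the same letters, so their sorted forms are equal.
        key = "".join(sorted(word))
        groups[key].append(word)
    # Dicts preserve insertion order, so groups appear in first-seen order.
    return list(groups.values())
```

Sorting each word costs O(L log L) for a word of length L. A letter-count tuple would work as the key too if the alphabet is small and fixed.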
Asked at Adobe, Apple, Goldman Sachs + 6 more • 7 months ago
Given an integer array nums and an integer k, return true if nums has a subarray of at least two elements whose sum is a multiple of k.
IDE
Hard
Data Engineer
Data Structures & Algorithms
+4 more
13 answers
Anonymous Prawn - "Would be better to adjust resolution in the video player directly."
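The answer shown for this question does not describe an algorithm, so here is one common approach as a hedged sketch: track prefix sums modulo k and remember the first index at which each remainder appears. Two prefixes with the same remainder bound a subarray whose sum is a multiple of k. The function name and the handling of k = 0 (treated as "sum equals 0") are assumptions made for this example.

```python
from typing import Dict, List


def has_multiple_of_k_subarray(nums: List[int], k: int) -> bool:
    """Return True if some subarray of length >= 2 sums to a multiple of k."""
    # Remainder 0 is "seen" before the array starts, at index -1.
    first_seen: Dict[int, int] = {0: -1}
    prefix = 0
    for i, value in enumerate(nums):
        prefix += value
        # Assumption: for k == 0, only a sum of exactly 0 counts.
        key = prefix % k if k != 0 else prefix
        if key in first_seen:
            # Same remainder earlier: the elements in between sum to a multiple of k.
            if i - first_seen[key] >= 2:
                return True
        else:
            first_seen[key] = i
    return False
```

Only the earliest index per remainder is stored, because the earliest index gives the longest candidate subarray.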
Asked at DoorDash • 5 months ago
On DoorDash, there are missing item and wrong item issues for deliveries. How would you analyze each of them?
Data Engineer
Statistics & Experimentation
+1 more
1 answer
Saurabh K. - "Missing item: the user ordered multiple items and a few are missing. Wrong item: the entire order is wrong, or the order contains items that were never ordered. How is this measured? CSAT, missing items, wrong items. Step 1: Collect data on orders that reported missing or wrong items. Dive deep to understand whether the problem is isolated to a specific metro, zip code, or restaurant type (say fast food vs. fine dining), time of day (lunch vs. dinner), tenure of the courier on th"
Asked at OpenAI • 7 months ago
Why do you want to work at OpenAI?
Data Engineer
Behavioral
+5 more

Asked at Apple, Intuit, JP Morgan Chase + 6 more • 7 months ago
Find the longest substring without repeating characters.
IDE
Medium
Data Engineer
Data Structures & Algorithms
+4 more
12 answers
Kishor J. - "We can use two pointers plus a set: maintain a window [i, j] and insert the j-th character into the set. While the set size equals the window length j - i + 1, the window has no repeated characters, so update the maximum and advance j until the last index."
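A sliding-window version of the two-pointer-plus-set idea might look like the sketch below. The answer above does not say what to do when a repeat is found; shrinking the window from the left is the usual choice and is an assumption here, as is the function name.

```python
def longest_unique_substring(s: str) -> int:
    """Length of the longest substring of s with no repeated characters."""
    seen = set()   # characters currently inside the window s[left:right + 1]
    left = 0
    best = 0
    for right, ch in enumerate(s):
        # Shrink from the left until ch is no longer in the window.
        while ch in seen:
            seen.remove(s[left])
            left += 1
        seen.add(ch)
        best = max(best, right - left + 1)
    return best
```

Each character is added and removed at most once, so the loop runs in O(n) time overall.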
Asked at Adobe, Apple, Google + 3 more • 7 months ago
Reverse a Sentence
IDE
Easy
Data Engineer
Data Structures & Algorithms
+4 more
25 answers
Anonymous Possum - "
# In-place reversal without built-in functions
def reverseString(s):
    chars = list(s)
    l, r = 0, len(s) - 1
    while l < r:
        chars[l], chars[r] = chars[r], chars[l]
        l += 1
        r -= 1
    return "".join(chars)
"
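The answer above reverses the characters of a string. If the question is instead read as reversing the order of the words, which is an assumption since the prompt here is only a title, a short sketch could be:

```python
def reverse_sentence(sentence: str) -> str:
    """Return the sentence with its words in reverse order.

    Assumption: words are separated by whitespace, and runs of
    whitespace collapse to a single space in the result.
    """
    return " ".join(reversed(sentence.split()))
```

For example, `reverse_sentence("the quick brown fox")` gives `"fox brown quick the"`.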
Asked at Meta (Facebook), Google, IBM + 8 more • 2 months ago
Merge Intervals
IDE
Medium
Data Engineer
Data Structures & Algorithms
+6 more
41 answers
Kofi N. - "const mergeIntervals = (intervals) => { const compare = (a, b) => { if(a[0] < b[0]) return -1 else if(a[0] > b[0]) return 1 else if(a[0] === b[0]) { return a[1] - b[1] } } let current = [] const result = [] const sorted = intervals.sort(compare) for(let i = 0; i = b[0]) current[1] = b[1] els"
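Because the answer above is cut off, here is a hedged Python sketch of the same sort-then-sweep approach: sort by start, then either extend the last merged interval or start a new one. Treating intervals that only touch (such as [1, 4] and [4, 5]) as overlapping is an assumption.

```python
from typing import List


def merge_intervals(intervals: List[List[int]]) -> List[List[int]]:
    """Merge overlapping [start, end] intervals (closed intervals assumed)."""
    merged: List[List[int]] = []
    # sorted() orders by start, then by end, and leaves the input untouched.
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the last merged interval: extend its end if needed.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged
```

Sorting dominates the cost at O(n log n), and the sweep itself is linear.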
Asked at Databricks • 7 months ago
What's the difference between a data lakehouse and a data warehouse?
Data Engineer
Data Pipeline Design
3 answers
Kshitij I. - "A data lake and a data warehouse are both places where an organization can store large amounts of data. When swimming in a lake, you come across all sorts of things: floating twigs, fish, stones, chemicals, and sometimes maybe even a snake. Similarly, a data lake stores all forms of the company's data without any indexing. The data is available at any time, but it must first be cleaned up and reorganized before it can be used for any type of analysis. A"
Asked at DoorDash • a month ago
You're a PM at a food delivery app where conversion rates have declined over the past week. How would you investigate the causes? (Conversion: From users browsing to placing orders.)
Data Engineer
Behavioral
+2 more
Asked at DoorDash • 5 months ago
Given an array of task durations (in minutes), return the pairs of tasks that can be completed within 60 minutes. For example, for [1, 43, 20, 59, 30, 30], return [[0, 3], [4, 5]].
Data Engineer
Coding
+1 more
1 answer
Anzhe M. - "It's a 2Sum question with duplicate array elements."
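A Two Sum variant that tolerates duplicates can be sketched as follows. Two assumptions are read off the example in the question rather than stated in it: a pair must sum to exactly 60 (1 + 43 is under 60 but is not in the expected output), and each task index is used in at most one pair. The function name and the `target` parameter are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, List


def pairs_summing_to(durations: List[int], target: int = 60) -> List[List[int]]:
    """Return index pairs [i, j], i < j, whose durations sum to target."""
    # duration -> indices of earlier tasks still waiting for a partner
    unmatched: DefaultDict[int, List[int]] = defaultdict(list)
    pairs: List[List[int]] = []
    for j, duration in enumerate(durations):
        need = target - duration
        if unmatched[need]:
            # Pair with the earliest waiting task of the needed duration.
            i = unmatched[need].pop(0)
            pairs.append([i, j])
        else:
            unmatched[duration].append(j)
    return pairs
```

Keeping a list of indices per duration, rather than a single index, is what lets repeated values such as the two 30s pair with each other.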
Asked at Amazon, Apple, Oracle + 3 more • 7 months ago
Course Schedule
IDE
Medium
Data Engineer
Data Structures & Algorithms
+4 more
7 answers
Jeremy D. - "Any cycle would cause the prerequisite to be greater than the course. This passes all the tests:
function canFinish(_numCourses, prerequisites) {
  for (const [a, b] of prerequisites) {
    if (b > a) return false
  }
  return true
}
"
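The index comparison in the answer above depends on how the test inputs happen to be numbered. A general check has to look for a cycle in the prerequisite graph. The sketch below uses Kahn's topological sort and assumes the usual formulation of this problem, in which each pair `[course, prereq]` means `prereq` must be taken before `course`; the function name is hypothetical.

```python
from collections import deque
from typing import List


def can_finish(num_courses: int, prerequisites: List[List[int]]) -> bool:
    """Return True if all courses can be taken, i.e. the graph has no cycle."""
    graph: List[List[int]] = [[] for _ in range(num_courses)]
    indegree = [0] * num_courses
    for course, prereq in prerequisites:
        graph[prereq].append(course)
        indegree[course] += 1

    # Start with every course that has no outstanding prerequisites.
    queue = deque(c for c in range(num_courses) if indegree[c] == 0)
    taken = 0
    while queue:
        current = queue.popleft()
        taken += 1
        for nxt in graph[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    # Courses on a cycle never reach indegree 0, so they are never taken.
    return taken == num_courses
```

An input such as `can_finish(3, [[0, 2], [2, 1]])` is finishable even though one prerequisite has a larger index than its course, which is the case the shortcut rejects.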
Asked at Apple, Booking.com, Meta (Facebook) + 9 more • 7 months ago
Valid Parentheses
IDE
Easy
Data Engineer
Data Structures & Algorithms
+4 more
18 answers
Tiago R. - "
function isValid(s) {
  const stack = [];
  for (let i = 0; i < s.length; i++) {
    const char = s.charAt(i);
    if (['(', '{', '['].includes(char)) {
      stack.push(char);
    } else {
      // Closing bracket: it must match the most recent opening bracket.
      const top = stack.pop();
      if ((char === ')' && top !== '(') ||
          (char === '}' && top !== '{') ||
          (char === ']' && top !== '[')) {
        return false;
      }
    }
  }
  return stack.length === 0;
}
"
Asked at Visa • 7 months ago
Why do you want to work at Visa?
Data Engineer
Behavioral
+4 more
1 answer
Nidhi S. - "There are a couple of reasons. Kind of role: it's a product manager role loaded with analytical work, and working with data under stringent regulatory guidelines makes it more exciting and thrilling. Location and industry are the cherry on the cake: Bangalore's weather, and BFI is at its all-time peak as people's spending behavior changes continuously, so it will be interesting to see how big giants like Visa are managing it."
Asked at Adobe, Apple, Goldman Sachs + 4 more • 7 months ago
Find the median of two sorted arrays.
Data Engineer
Data Structures & Algorithms
+4 more
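No answer is listed for this question yet. As a baseline only, the sketch below merges the two sorted inputs and reads off the middle, which takes O(m + n) time and space. Interviewers often expect the O(log(min(m, n))) binary-search partition instead, and this sketch does not implement that. The function name is hypothetical.

```python
import heapq
from typing import List


def median_of_two_sorted(a: List[float], b: List[float]) -> float:
    """Median of the combined values of two individually sorted lists."""
    # heapq.merge lazily merges already-sorted inputs into sorted order.
    merged = list(heapq.merge(a, b))
    n = len(merged)
    if n == 0:
        raise ValueError("median of two empty arrays is undefined")
    mid = n // 2
    if n % 2 == 1:
        return merged[mid]
    return (merged[mid - 1] + merged[mid]) / 2
```

For example, `median_of_two_sorted([1, 2], [3, 4])` returns `2.5`.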
Asked at TikTok • 7 months ago
Split an array into equal sum subarrays
Data Engineer
Data Structures & Algorithms
+1 more
Asked at Discord, Meta (Facebook), Walmart Labs • 7 months ago
Tell me about a mistake you made and what you learned from it.
Data Engineer
Behavioral
+2 more
2 answers
Scott S. - "A couple of years ago, we were working on a project to integrate a new third-party data feed into our existing data processing pipeline. This data feed was critical for enhancing our trading algorithms with more comprehensive market data. Given the tight timeline and high stakes, I decided to push for a rapid implementation. In my eagerness to meet the deadline, I underestimated the complexity of integrating this new data feed. I did not allocate sufficient time for thorough testing and valida"
Asked at Adobe, Goldman Sachs, Google • 7 months ago
Climbing Stairs
IDE
Easy
Data Engineer
Data Structures & Algorithms
+3 more
11 answers
Matthew K. - "
function climbStairs(n) {
  // 4 iterations of Dynamic Programming solutions:
  // Step 1: Recursive:
  // if (n <= 2) return n
  // return climbStairs(n-1) + climbStairs(n-2)
  // Step 2: Top-down Memoization
  // const memo = {0:0, 1:1, 2:2}
  // function f(x) {
  //   if (x in memo) return memo[x]
  //   memo[x] = f(x-1) + f(x-2)
  //   return memo[x]
  // }
  // return f(n)
  // Step 3: Bottom-up Tabulation
  // const tab = [0,1,2]
  // f"
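The answer above is cut off before its last steps. A plausible end point of that progression, offered as a sketch and not as the answerer's code, is a bottom-up loop that keeps only the last two counts. It assumes the standard rules implied by the recurrence f(n-1) + f(n-2): each move climbs 1 or 2 steps, and n is at least 1.

```python
def climb_stairs(n: int) -> int:
    """Number of distinct ways to climb n steps taking 1 or 2 at a time."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # prev = ways to reach the step below, curr = ways to reach this step.
    prev, curr = 1, 1
    for _ in range(n - 1):
        prev, curr = curr, prev + curr
    return curr
```

This uses O(1) extra space, compared with O(n) for the memo or table versions.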
Asked at Adobe, Apple, Meta (Facebook) + 2 more • 7 months ago
Move all zeros to the end of an array.
IDE
Easy
Data Engineer
Data Structures & Algorithms
+4 more
44 answers
Devesh K. - "This solution is much faster than the Exponent reference solution. It is also far more concise and easier to understand:
from typing import List

def moveZerosToEnd(arr: List[int]) -> List[int]:
    left = 0  # next position for a non-zero value
    for right in range(len(arr)):
        if arr[right] != 0:
            if left != right:
                arr[left], arr[right] = arr[right], arr[left]
            left += 1
    return arr
"
Asked at Adobe, Apple • 7 months ago
Find the Duplicates
IDE
Easy
Data Engineer
Data Structures & Algorithms
+2 more
28 answers
Rick E. - "
from typing import List

# One pass, O(n + m); assumes both arrays are sorted in ascending order
def find_duplicates(arr1: List[int], arr2: List[int]) -> List[int]:
    duplicates = []
    i1 = i2 = 0
    while i1 < len(arr1) and i2 < len(arr2):
        if arr1[i1] == arr2[i2]:
            duplicates.append(arr1[i1])
            i1 += 1
            i2 += 1
        elif arr1[i1] < arr2[i2]:
            i1 += 1  # advance whichever pointer holds the smaller value
        else:
            i2 += 1
    return duplicates

# debug your code below
print(find_duplicates([1, 2, 3, 5, 6, 7], [3, 6, 7, 8, 20]))
"