Data Engineer Interview Questions

Review this list of 154 data engineer interview questions and answers verified by hiring managers and candidates.
  • +2

    "-- Write your query here select u.userid as userid, IFNULL(sum(purchase_value), 0) AS LTV FROM user_sessions u JOIN attribution a ON u.sessionid = a.sessionid group by user_id order by LTV desc ; Needs a full join. Wondering why cant we do a left outer join here. All the sessions should have complete data."

    Aneesha K. - "-- Write your query here select u.userid as userid, IFNULL(sum(purchase_value), 0) AS LTV FROM user_sessions u JOIN attribution a ON u.sessionid = a.sessionid group by user_id order by LTV desc ; Needs a full join. Wondering why cant we do a left outer join here. All the sessions should have complete data."See full answer

    Data Engineer
    Coding
    +3 more
  • Apple logoAsked at Apple 

    "I was able to provide the optimal approach and coded it up"

    Anonymous Wasp - "I was able to provide the optimal approach and coded it up"See full answer

    Data Engineer
    Data Structures & Algorithms
    +2 more
  • Adobe logoAsked at Adobe 
    +5

    "bool isValidBST(TreeNode* root, long min = LONGMIN, long max = LONGMAX){ if (root == NULL) return true; if (root->val val >= max) return false; return isValidBST(root->left, min, root->val) && isValidBST(root->right, root->val, max); } `"

    Alvaro R. - "bool isValidBST(TreeNode* root, long min = LONGMIN, long max = LONGMAX){ if (root == NULL) return true; if (root->val val >= max) return false; return isValidBST(root->left, min, root->val) && isValidBST(root->right, root->val, max); } `"See full answer

    Data Engineer
    Coding
    +4 more
  • Adobe logoAsked at Adobe 
    +19

    " O(n) time, O(1) space from typing import List def maxsubarraysum(nums: List[int]) -> int: if len(nums) == 0: return 0 maxsum = currsum = nums[0] for i in range(1, len(nums)): currsum = max(currsum + nums[i], nums[i]) maxsum = max(currsum, max_sum) return max_sum debug your code below print(maxsubarraysum([-1, 2, -3, 4])) `"

    Rick E. - " O(n) time, O(1) space from typing import List def maxsubarraysum(nums: List[int]) -> int: if len(nums) == 0: return 0 maxsum = currsum = nums[0] for i in range(1, len(nums)): currsum = max(currsum + nums[i], nums[i]) maxsum = max(currsum, max_sum) return max_sum debug your code below print(maxsubarraysum([-1, 2, -3, 4])) `"See full answer

    Data Engineer
    Data Structures & Algorithms
    +4 more
  • Adobe logoAsked at Adobe 
    +3

    "function main(){ const v1=[2,3, 4, 10] const v2= [3,4 ,5,20, 23] return merge(v1,v2); } function merge(left, right){ const result=[]; while(left.length>0&& right.length>0){ if(left[0]0){ result=result.concat(left) } if(right.length>0){ result=result.concat(right) } return result; }"

    Samuel M. - "function main(){ const v1=[2,3, 4, 10] const v2= [3,4 ,5,20, 23] return merge(v1,v2); } function merge(left, right){ const result=[]; while(left.length>0&& right.length>0){ if(left[0]0){ result=result.concat(left) } if(right.length>0){ result=result.concat(right) } return result; }"See full answer

    Data Engineer
    Data Structures & Algorithms
    +4 more
  • 🧠 Want an expert answer to a question? Saving questions lets us know what content to make next.

  • Discord logoAsked at Discord 
    Data Engineer
    Behavioral
    +4 more
  • Apple logoAsked at Apple 
    +2

    "This could be done using two-pointer approach assuming array is sorted: left and right pointers. We need track two sums (left and right) as we move pointers. For moving pointers we will move left to right by 1 (increment) when right sum is greater. We will move right pointer to left by 1 (decrement) when left sum is greater. at some point we will either get the sum same and that's when we exit from the loop. 0-left will be one array and right-(n-1) will be another array. We are not going to mo"

    Bhaskar B. - "This could be done using two-pointer approach assuming array is sorted: left and right pointers. We need track two sums (left and right) as we move pointers. For moving pointers we will move left to right by 1 (increment) when right sum is greater. We will move right pointer to left by 1 (decrement) when left sum is greater. at some point we will either get the sum same and that's when we exit from the loop. 0-left will be one array and right-(n-1) will be another array. We are not going to mo"See full answer

    Data Engineer
    Data Structures & Algorithms
    +2 more
  • Uber logoAsked at Uber 

    "Not my answer, but rather the details of this question. It should include the following functions: int insertNewCustomer(double revenue) -> returns a customer ID (assume auto-incremented & 0-based) int insertNewCustomer(double revenue, int referrerID) -> returns a customer ID (assume auto-incremented & 0-based) Set getLowestKCustomersByMinTotalRevenue(int k, double minTotalRevenue) -> returns customer IDs Note: The total revenue consists of the revenue that this customer bring"

    Anzhe M. - "Not my answer, but rather the details of this question. It should include the following functions: int insertNewCustomer(double revenue) -> returns a customer ID (assume auto-incremented & 0-based) int insertNewCustomer(double revenue, int referrerID) -> returns a customer ID (assume auto-incremented & 0-based) Set getLowestKCustomersByMinTotalRevenue(int k, double minTotalRevenue) -> returns customer IDs Note: The total revenue consists of the revenue that this customer bring"See full answer

    Data Engineer
    Coding
  • "There are 2 questions popping into my mind: Should the 2nd job have to kick off at 12:30AM? Are there others depending on the 2nd job? If both answers are no, we may simply postpone the second job to allow sufficient time for the first one to complete. If they are yeses, we could let the 2nd job retry to a certain amount of times. Make sure that even reaching the maximum of retries won't delay or fail the following jobs."

    Anzhe M. - "There are 2 questions popping into my mind: Should the 2nd job have to kick off at 12:30AM? Are there others depending on the 2nd job? If both answers are no, we may simply postpone the second job to allow sufficient time for the first one to complete. If they are yeses, we could let the 2nd job retry to a certain amount of times. Make sure that even reaching the maximum of retries won't delay or fail the following jobs."See full answer

    Data Engineer
    Data Pipeline Design
  • Adobe logoAsked at Adobe 
    Video answer for 'Solve John Conway's "Game of Life".'
    Data Engineer
    Data Structures & Algorithms
    +2 more
  • Walmart Labs logoAsked at Walmart Labs 
    Data Engineer
    Behavioral
    +5 more
  • Google logoAsked at Google 

    "Clarification questions What is the purpose of connecting the DB? Do we expect high-volumes of traffic to hit the DB Do we have scalability or reliability concerns? Format Code -> DB Code -> Cache -> DB API -> Cache -> DB - APIs are built for a purpose and have a specified protocol (GET, POST, DELETE) to speak to the DB. APIs can also use a contract to retrieve information from a DB much faster than code. Load balanced APIs -> Cache -> DB **Aut"

    Aaron W. - "Clarification questions What is the purpose of connecting the DB? Do we expect high-volumes of traffic to hit the DB Do we have scalability or reliability concerns? Format Code -> DB Code -> Cache -> DB API -> Cache -> DB - APIs are built for a purpose and have a specified protocol (GET, POST, DELETE) to speak to the DB. APIs can also use a contract to retrieve information from a DB much faster than code. Load balanced APIs -> Cache -> DB **Aut"See full answer

    Data Engineer
    Concept
    +5 more
  • "1) Have a common goal 2) Have a clear and fair accountability between teams 3) Ensure conflicts are resolved in time on common issues 4) Promote common Brain-storming , problem solving sessions 5) Most important , Have clear and effective communication established and practised"

    Saurabh N. - "1) Have a common goal 2) Have a clear and fair accountability between teams 3) Ensure conflicts are resolved in time on common issues 4) Promote common Brain-storming , problem solving sessions 5) Most important , Have clear and effective communication established and practised"See full answer

    Data Engineer
    Behavioral
    +4 more
  • Apple logoAsked at Apple 

    "class TrieNode { constructor() { this.children = {}; this.isEndOfWord = false; } } class Trie { constructor() { this.root = new TrieNode(); } insert(word) { let node = this.root; for (const char of word) { if (!node.children[char]) { node.children[char] = new TrieNode(); } node = node.children[char]; } node.isEndOfWord = true; } search(word) { l"

    Tiago R. - "class TrieNode { constructor() { this.children = {}; this.isEndOfWord = false; } } class Trie { constructor() { this.root = new TrieNode(); } insert(word) { let node = this.root; for (const char of word) { if (!node.children[char]) { node.children[char] = new TrieNode(); } node = node.children[char]; } node.isEndOfWord = true; } search(word) { l"See full answer

    Data Engineer
    Data Structures & Algorithms
    +3 more
  • +1

    "WITH suspicious_transactions AS ( SELECT c.first_name, c.last_name, t.receipt_number, COUNT(t.receiptnumber) OVER (PARTITION BY c.customerid) AS noofoffences FROM customers c JOIN transactions t ON c.customerid = t.customerid WHERE t.receipt_number LIKE '%999%' OR t.receipt_number LIKE '%1234%' OR t.receipt_number LIKE '%XYZ%' ) SELECT first_name, last_name, receipt_number, noofoffences FROM suspicious_transactions WHERE noofoffences >= 2;"

    Jayveer S. - "WITH suspicious_transactions AS ( SELECT c.first_name, c.last_name, t.receipt_number, COUNT(t.receiptnumber) OVER (PARTITION BY c.customerid) AS noofoffences FROM customers c JOIN transactions t ON c.customerid = t.customerid WHERE t.receipt_number LIKE '%999%' OR t.receipt_number LIKE '%1234%' OR t.receipt_number LIKE '%XYZ%' ) SELECT first_name, last_name, receipt_number, noofoffences FROM suspicious_transactions WHERE noofoffences >= 2;"See full answer

    Data Engineer
    Coding
    +3 more
  • +1

    "SELECT upsellcampaignid, COUNT(DISTINCT trans.userid) AS eligibleusers FROM campaign JOIN "transaction" AS trans ON transactiondate BETWEEN datestart AND date_end JOIN user ON trans.userid = user.userid WHERE iseligibleforupsellcampaign = 1 GROUP BY upsellcampaignid `"

    Alina G. - "SELECT upsellcampaignid, COUNT(DISTINCT trans.userid) AS eligibleusers FROM campaign JOIN "transaction" AS trans ON transactiondate BETWEEN datestart AND date_end JOIN user ON trans.userid = user.userid WHERE iseligibleforupsellcampaign = 1 GROUP BY upsellcampaignid `"See full answer

    Data Engineer
    Coding
    +3 more
  • Adobe logoAsked at Adobe 

    "static boolean sudokuSolve(char board) { return sudokuSolve(board, 0, 0); } static boolean sudokuSolve(char board, int r, int c) { if(c>=board[0].length) { r=r+1; c=0; } if(r>=board.length) return true; if(boardr=='.') { for(int num=1; num<=9; num++) { boardr=(char)('0' + num); if(isValidPosition(board, r, c)) { if(sudokuSolve(board, r, c+1)) return true; } boardr='.'; } } else { return sudokuSolve(board, r, c+1); } return false; } static boolean isValidPosition(char b"

    Divya R. - "static boolean sudokuSolve(char board) { return sudokuSolve(board, 0, 0); } static boolean sudokuSolve(char board, int r, int c) { if(c>=board[0].length) { r=r+1; c=0; } if(r>=board.length) return true; if(boardr=='.') { for(int num=1; num<=9; num++) { boardr=(char)('0' + num); if(isValidPosition(board, r, c)) { if(sudokuSolve(board, r, c+1)) return true; } boardr='.'; } } else { return sudokuSolve(board, r, c+1); } return false; } static boolean isValidPosition(char b"See full answer

    Data Engineer
    Data Structures & Algorithms
    +4 more
  • Apple logoAsked at Apple 
    Data Engineer
    Data Structures & Algorithms
    +2 more
  • Adobe logoAsked at Adobe 

    Permutations

    IDE
    Medium

    "function permute(nums) { if (nums.length <= 1) { return [nums]; } const prevPermutations = permute(nums.slice(0, nums.length-1)); const currentNum = nums[nums.length-1]; const permutations = new Set(); for (let prev of prevPermutations) { for (let i=0; i < prev.length; i++) { permutations.add([...prev.slice(0, i), currentNum, ...prev.slice(i)]); } permutations.add([...prev, currentNum]); } return [...permutations]"

    Tiago R. - "function permute(nums) { if (nums.length <= 1) { return [nums]; } const prevPermutations = permute(nums.slice(0, nums.length-1)); const currentNum = nums[nums.length-1]; const permutations = new Set(); for (let prev of prevPermutations) { for (let i=0; i < prev.length; i++) { permutations.add([...prev.slice(0, i), currentNum, ...prev.slice(i)]); } permutations.add([...prev, currentNum]); } return [...permutations]"See full answer

    Data Engineer
    Data Structures & Algorithms
    +3 more
  • Walmart Labs logoAsked at Walmart Labs 

    "I’ve spent over 6 years building and scaling e-commerce products across EMEA and APAC. At Jumia, I led product initiatives on the checkout and payments side. For example, I launched gamified promotions on PDP and checkout that improved engagement and delivered a 2.3x uplift in conversion. I also introduced automated installment payments and order cancellation flows, which not only improved user trust but also reduced complaints by 30% and lowered operational costs. Before that, at Lazada, I work"

    Rajeev K. - "I’ve spent over 6 years building and scaling e-commerce products across EMEA and APAC. At Jumia, I led product initiatives on the checkout and payments side. For example, I launched gamified promotions on PDP and checkout that improved engagement and delivered a 2.3x uplift in conversion. I also introduced automated installment payments and order cancellation flows, which not only improved user trust but also reduced complaints by 30% and lowered operational costs. Before that, at Lazada, I work"See full answer

    Data Engineer
    Behavioral
    +2 more
Showing 101-120 of 154